Adapting code from an Oddsportal URL to the current and next page
I have code that scrapes odds and event data from
https://www.oddsportal.com/soccer/england/premier-league/
I want to adapt the same code to get all of today's data, which is shown at
https://www.oddsportal.com/matches/soccer/
while "tomorrow" is
https://www.oddsportal.com/matches/soccer/20210309/
which is essentially the same URL with today's date plus one day appended.
My current working code is:
import pandas as pd
from selenium import webdriver

browser = webdriver.Chrome()
browser.get("https://www.oddsportal.com/soccer/england/premier-league/")
df = pd.read_html(browser.page_source, header=0)[0]

dateList = []
gameList = []
home_odds = []
draw_odds = []
away_odds = []

for row in df.itertuples():
    if not isinstance(row[1], str):
        continue
    elif ':' not in row[1]:
        # rows without a kick-off time are date headers; keep the date part
        date = row[1].split('-')[0]
        continue
    time = row[1]
    dateList.append(date)
    gameList.append(row[2])
    home_odds.append(row[4])
    draw_odds.append(row[5])
    away_odds.append(row[6])

result = pd.DataFrame({'date': dateList,
                       'game': gameList,
                       'Home': home_odds,
                       'Draw': draw_odds,
                       'Away': away_odds})
Because the first column on the matches page is different (it holds league links such as Algeria»Ligue 1 rather than the date headers my loop expects), date is never assigned and I get an error:

NameError: name 'date' is not defined
How can I adapt the same code to
https://www.oddsportal.com/matches/soccer/
and to the corresponding page for tomorrow?
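For the date arithmetic alone, a minimal sketch of how tomorrow's URL could be built with the standard library (assuming the YYYYMMDD format visible in the example URL above):

    from datetime import date, timedelta

    base_url = "https://www.oddsportal.com/matches/soccer/"
    # tomorrow's page is the base URL plus today's date + 1 in YYYYMMDD form
    tomorrow = date.today() + timedelta(days=1)
    tomorrow_url = base_url + tomorrow.strftime("%Y%m%d") + "/"
    print(tomorrow_url)  # e.g. https://www.oddsportal.com/matches/soccer/20210310/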
See also questions close to this topic
-
How to search for a selector whose class contains spaces with BeautifulSoup?
from bs4 import BeautifulSoup
import requests

def getPage(url):
    try:
        req = requests.get(url)
    except requests.exceptions.RequestException:
        return None
    return BeautifulSoup(req.text, 'html.parser')

bs = getPage('https://ssearch.oreilly.com/?q=python')
searchResults = bs.select('article.result')
searchResults[0].select('p.note')
[<p class="note">By Gabriele Lanaro</p>, <p class="note date2">Publish Date: August 04, 2016 </p>]
I want to obtain the paragraph with the class "note date2", but when I put that into the select method it returns an empty list. I've also tried variations such as "note_date2" and "note-date2", but unfortunately I get the same result.
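For what it's worth, in HTML a class attribute containing spaces holds several separate classes, and a CSS selector can chain them with dots; a minimal standalone sketch:

    from bs4 import BeautifulSoup

    html = '<p class="note date2">Publish Date: August 04, 2016</p>'
    soup = BeautifulSoup(html, 'html.parser')

    # "note date2" is two classes, "note" and "date2"; chain them with dots
    print(soup.select('p.note.date2'))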
-
How do I print dicts across multiple files?
So, I need to print a dict. Here's what my file2.py looks like:
import json

dict = {}
file1 = open("dictfile.txt", "w")
file1.write(json.dumps(dict))
file1.close()
Then in my main file, where I use it:
from api import dict

dict["smthng"] = "hi"
print(dict["smthng"])
It prints hi, but my dictfile.txt looks like:
{}
That is normal, but not what I was expecting. I want the value assigned in main.py to be saved, but from file2.py.
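One way this is commonly handled is to re-serialize the dict after it has been mutated, rather than only once at import time; a minimal sketch, where shared and save are hypothetical names:

    # api.py -- a sketch; "shared" and "save" are hypothetical names
    import json

    shared = {}

    def save(path="dictfile.txt"):
        # rewrite the file with the dict's current contents
        with open(path, "w") as f:
            json.dump(shared, f)

    # main.py
    from api import shared, save

    shared["smthng"] = "hi"
    save()  # dictfile.txt now contains {"smthng": "hi"}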
-
How to sign with secp256k1 ECDSA in Python with a personal private and public key
I have a private and public key that I would like to use for signing a str. I'm using Stark Bank's ECDSA. It works, but I need to be able to use my own keys for it to be accepted in streamr, and I don't know any alternatives. If you have an easy-to-understand way, please share it with me. My code:
privateKey = PrivateKey()
publicKey = privateKey.publicKey()
signature = Ecdsa.sign(c, privateKey)
print(Ecdsa.verify(c, signature, publicKey))
c is the variable that gets signed; it is hashed beforehand. I need to know how to use my own keys, starkbank does not give any more info on their GitHub, and I'm not at the level where I can read the source files directly. Thanks in advance!
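For reference, a minimal sketch of loading a personal key with the same library; it assumes starkbank-ecdsa's PrivateKey.fromPem helper and a PEM-encoded key file, both of which should be checked against the installed version:

    from ellipticcurve.ecdsa import Ecdsa
    from ellipticcurve.privateKey import PrivateKey

    # load an existing key instead of generating a fresh one
    # (fromPem is assumed here; verify against the installed version)
    with open("my_private_key.pem") as f:
        privateKey = PrivateKey.fromPem(f.read())

    publicKey = privateKey.publicKey()

    message = "some message"
    signature = Ecdsa.sign(message, privateKey)
    print(Ecdsa.verify(message, signature, publicKey))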
-
NameError: name 'score' is not defined
if score < driver.find_element_by_xpath("/html/body/div[@id='body']/div[@id='inner']/blockquote[@class='success']/strong"):

NameError: name 'score' is not defined
How to avoid this error?
while True:
    driver.find_element_by_xpath("/html/body/div[@id='body']/div[@id='inner']/form[1]/blockquote[@class='success']/p[@class='center'][2]/a").click()
    Score = 8,363
    if score < driver.find_element_by_xpath("/html/body/div[@id='body']/div[@id='inner']/blockquote[@class='success']/strong"):
        break
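As a side note, Python names are case-sensitive and a comma in a literal builds a tuple; a tiny standalone sketch of both pitfalls visible in the snippet above:

    # Python names are case-sensitive: Score and score are different variables
    Score = 8,363          # the comma makes this the tuple (8, 363), not a number
    score = 8363           # a plain int

    print(type(Score))     # <class 'tuple'>
    print(score < 9000)    # True -- numeric comparison works with consistent naming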
-
Screenshot is not displayed in Extent report in Jenkins due to the prefix "http://localhost:8080/"
Condition: the screenshot files and the report HTML file are in the same folder, and the report displays the screenshots properly when opened locally.
Jenkins plug-in: HTML Publisher plugin
When I look at the report served from Jenkins and hover over the img src, it has the "http://localhost:8080" prefix, but in the HTML code this prefix is not present.
For example: http://localhost:8080/project_PATH/target/surefire-reports/html/Screenshot.jpg. If I remove this prefix, the screenshot is accessible.
Could you please suggest how to remove the "http://localhost:8080" prefix?
I've searched for answers and tried a lot, but nothing helps.
I'm on a Mac, running Java Selenium and Jenkins (http://localhost:8080) locally.
-
Python Selenium Firefox driver crashes after closing a popup window
I have opened multiple popup windows using a Python Selenium script. When I close one of the popups, the driver crashes and I can no longer access the driver object. Can you please help me fix this issue?
Code Snippet:
driver.switch_to.window(driver.window_handles[2])
driver.close()
time.sleep(5)
driver.switch_to.window(driver.window_handles[1])
Error Output:
2021-04-21 09:55:47 ERROR The Exception in Virtual Media Testcases: Message: Failed to decode response from marionette <class 'selenium.common.exceptions.WebDriverException'> test_selenium.py 560 2
geckodriver.log:
1619015645365 webdriver::server DEBUG -> DELETE /session/c1318c5d-796b-4564-bdc9-68a95bb7cac4/window
1619015645366 Marionette TRACE 0 -> [0,638,"WebDriver:CloseWindow",{}]
1619015645391 Marionette DEBUG Received observer notification message-manager-disconnect
###!!! [Child][DispatchAsyncMessage] Error: PFilePicker::Msg___delete__ Route error: message sent to unknown actor ID
1619015645414 Marionette TRACE 0 <- [1,638,null,["2147483649","2147483658"]]
1619015645416 webdriver::server DEBUG <- 200 OK {"value":["2147483649","2147483658"]}
1619015645419 webdriver::server DEBUG -> GET /session/c1318c5d-796b-4564-bdc9-68a95bb7cac4/window/handles
1619015645419 Marionette TRACE 0 -> [0,639,"WebDriver:GetWindowHandles",{}]
1619015645420 Marionette TRACE 0 <- [1,639,null,["2147483649","2147483658"]]
1619015645420 webdriver::server DEBUG <- 200 OK {"value":["2147483649","2147483658"]}
1619015645421 webdriver::server DEBUG -> POST /session/c1318c5d-796b-4564-bdc9-68a95bb7cac4/window {"handle": "2147483649"}
1619015645422 Marionette TRACE 0 -> [0,640,"WebDriver:SwitchToWindow",{"handle":"2147483649","name":"2147483649"}]
1619015645422 Marionette TRACE 0 <- [1,640,null,{}]
1619015645422 webdriver::server DEBUG <- 200 OK {"value":null}
1619015645424 webdriver::server DEBUG -> GET /session/c1318c5d-796b-4564-bdc9-68a95bb7cac4/screenshot
1619015645424 Marionette TRACE 0 -> [0,641,"WebDriver:TakeScreenshot",{"full":false,"highlights":[],"id":null}]
[Parent 77210, Gecko_IOThread] WARNING: pipe error (55): Connection reset by peer: file /builddir/build/BUILD/firefox-60.1.0/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 353
###!!! [Parent][MessageChannel] Error: (msgtype=0x15007F,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
###!!! [Parent][MessageChannel] Error: (msgtype=0x15007F,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
1619015645445 Marionette DEBUG Register listener.js for window 32
1619015645477 Marionette DEBUG Register listener.js for window 34
A content process crashed and MOZ_CRASHREPORTER_SHUTDOWN is set, shutting down
1619015645539 Marionette DEBUG Received DOM event unload for [object XULDocument]
1619015645546 Marionette DEBUG Received observer notification message-manager-disconnect
1619015645548 Marionette TRACE 0 <- [1,641,null,{}]
1619015645565 webdriver::server DEBUG <- 500 Internal Server Error {"value":{"error":"unknown error","message":"Failed to find value field","stacktrace":""}}
1619015645567 webdriver::server DEBUG -> DELETE /session/c1318c5d-796b-4564-bdc9-68a95bb7cac4
1619015645569 webdriver::server DEBUG Deleting session
1619015645574 Marionette DEBUG Closed connection 0
1619015645657 Marionette DEBUG Received observer notification xpcom-will-shutdown
1619015645657 Marionette DEBUG New connections will no longer be accepted
Driver Information:
Selenium Version: 3.11.0

{
  "rotatable": false,
  "browserVersion": "60.1.0",
  "timeouts": {"pageLoad": 300000, "implicit": 0, "script": 30000},
  "acceptInsecureCerts": true,
  "moz:headless": false,
  "moz:geckodriverVersion": "0.26.0",
  "moz:webdriverClick": true,
  "moz:profile": "/tmp/rust_mozprofileEdeK7L",
  "moz:accessibilityChecks": false,
  "browserName": "firefox",
  "moz:useNonSpecCompliantPointerOrigin": false,
  "platformVersion": "3.10.0-862.6.3.el7.x86_64",
  "moz:processID": 359676,
  "pageLoadStrategy": "normal",
  "platformName": "linux"
}
-
Taking data from iteration loops in C
Essentially I'm trying to perform 100 iterations in which I compare a static int to an int that is randomised on every iteration, keeping a count of how many times out of 100 the static int was larger than the random one.
My code looks something like this:
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <math.h> // used math for a different piece of code

int main(void) {
    int rounds, random_int, comparison_int, power, strength;
    float speed;

    printf("Enter a speed: ");
    scanf("%f", &speed);

    strength = 20;
    rounds = 0;
    srand(time(NULL));

    if (speed > 60) {
        random_int = (rand() % (5 - (-5) + 1)) - 5;
        power = 60 + (strength / 100) * 40 + random_int;
        while (rounds != 100) {
            rounds = rounds + 1;
            comparison_int = rand() % (100 - 0 + 1);
        }
        printf("Out of 100 battles, you won %d.", rounds);
    } else if (speed < 60) {
        random_int = (rand() % (5 - (-5) + 1)) - 5;
        power = speed + (strength / 100) * 40 + random_int;
        while (rounds != 100) {
            rounds = rounds + 1;
            comparison_int = rand() % (100 - 0 + 1);
        }
        printf("Out of 100 battles, you won %d.", rounds);
    }
}
and I can't figure out where to go from here.
-
To print a square with * by editing this given source code, without making any major changes
I want to print it the way given below (4 stars on each of 4 lines):
****
****
****
****
public class OperatorDemo {
    public static void main(String[] args) {
        for(int i=1;i<=4;i++);
        {
            for(int j=1;j<=4;j++);
            {
                System.out.print("* ");
            }
            System.out.println(" ");
        }
    }
}
-
C++: why are loops with more than 3 levels skipped?
Consider the following simple example program: why can't the part beyond 3 levels of nested loops be executed? Is there a better solution?
for(int i;i<a;i++){
    for(int j;j<b;j++){
        for(int k;k<c;k++){
            //The code here can be executed
            for(int z;z<d;z++){
                .....//The code here cannot be executed
            }
        }
    }
}
Thank you very much.
-
Beautiful Soup nested loops
My XML has nested structure similar to this:
<xml>
  <top>
    <main_record attr1="val1" attr2="val2" attr3="val3">
      <sub_record attrx="valx" attry="valy" />
    </main_record>
    <main_record attr1="val4" attr2="val5" attr3="val6">
      <sub_record attrx="valx2" attry="valy2" />
    </main_record>
    <main_record attr1="val7" attr2="val8" attr3="val9">
      <sub_record attrx="valx3" attry="valy3" />
    </main_record>
  </top>
</xml>
I'm trying to use beautiful soup to extract the data for each "main_record" along with its "sub_record" attributes so I can work with it in rows in a CSV file.
I can get a loop working to print out all the attr1, attr2 and attr3 values in the file, but when I try to add a sub-loop inside to get the attrx and attry, it does not work correctly.
from bs4 import BeautifulSoup

f = open("C:\\tracker.log", "r")
x = f.read()
soup = BeautifulSoup(x, 'html.parser')

for entity in soup.find_all('main_record'):
    print(entity.get('attr1'))
    print(entity.get('attr2'))
    print(entity.get('attr3'))
    for positions in soup.find('sub_record'):
        print(positions.get('attrx'))
        print(positions.get('attry'))
Any help/pointers appreciated.
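For comparison, a minimal standalone sketch that scopes the inner search to each main_record by calling find_all on the entity rather than on the whole soup:

    from bs4 import BeautifulSoup

    xml = '''
    <xml><top>
      <main_record attr1="val1" attr2="val2" attr3="val3">
        <sub_record attrx="valx" attry="valy" />
      </main_record>
      <main_record attr1="val4" attr2="val5" attr3="val6">
        <sub_record attrx="valx2" attry="valy2" />
      </main_record>
    </top></xml>
    '''

    soup = BeautifulSoup(xml, 'html.parser')
    for entity in soup.find_all('main_record'):
        row = [entity.get('attr1'), entity.get('attr2'), entity.get('attr3')]
        # search only within this main_record, not the whole document
        for sub in entity.find_all('sub_record'):
            row.extend([sub.get('attrx'), sub.get('attry')])
        print(row)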
-
BeautifulSoup parsed document different from the original HTML page's code
I am using BeautifulSoup4 to do some scraping from Spotify Charts.
I have had the code up and running for some weeks, but suddenly today it started to fail, giving NaN values for all entries.
I believe the problem is in the parsed HTML page: the resulting HTML is different from the original webpage's HTML.
I have tried 'html.parser', 'lxml' and 'html5lib'. I also updated BeautifulSoup and all the parser packages, but nothing changed.
What could be the problem? I have no idea what the root cause could be. Yesterday my Windows 10 was updated; could it be related?
Here is the part of the code that matters:
from bs4 import BeautifulSoup as bs
import requests

u = 'https://spotifycharts.com/regional/us/daily/2021-04-18'
x = requests.get(u)
a = bs(x.content, 'html.parser')
tracks = a.find_all('td', class_='chart-table-position')
tracks is always empty, because the element does not exist in a. But it should, because it exists in the webpage's HTML and it existed some days ago.
Thanks in advance for the help.
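When a parsed document differs from what the browser shows, a useful first step is to inspect exactly what requests received, since a site may serve different markup (or a block page) to non-browser clients; a minimal diagnostic sketch:

    import requests

    u = 'https://spotifycharts.com/regional/us/daily/2021-04-18'
    # a browser-like User-Agent, purely as an experiment
    x = requests.get(u, headers={'User-Agent': 'Mozilla/5.0'})

    print(x.status_code)
    print('chart-table-position' in x.text)  # is the expected markup even there?

    # save the raw response to compare with the browser's "view source"
    with open('fetched.html', 'w', encoding='utf-8') as f:
        f.write(x.text)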