Web scraping with Python's requests module requires a submit button
I'm trying to web scrape my grades off of this website.
I have this which enters the login info (I think):
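import requests

def main():
    login = {'username': 'USERNAME', 'password': 'PASSWORD'}
    url = 'https://parent-portland.cascadetech.org/portland/Login_Parent_PXP.aspx?regenerateSessionId=True'
    # POST the credentials as form data
    r = requests.post(url, login)
    # attempt to press the login button (a Response has no submit() method, so this fails with AttributeError)
    r.submit()
    print(r.text)

main()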
It looks like you have to press a button.
<input name="Submit1" id="Submit1" value="Login" cssclass="btn" text="Login" onclick="this.disabled=true;this.form.submit();" type="submit">
However, I just learned this today and I have no idea how to do this with the requests module.
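A browser does not really "press" anything on the wire: the submit button is sent as one more name/value pair in the form data. The following is a minimal sketch of that idea, not a verified login for this site. It assumes the text fields are actually named `username` and `password` (taken from the code above, not checked against the page) and that the page may carry hidden inputs that have to be echoed back, which is common on `.aspx` pages but not confirmed here. The names `FormFieldParser`, `build_login_payload`, and `login` are hypothetical helpers.

```python
from html.parser import HTMLParser


class FormFieldParser(HTMLParser):
    """Collect the name/value pair of every <input> on the page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Inputs without a value attribute are sent as empty strings.
            self.fields[name] = attributes.get("value") or ""


def build_login_payload(html, username, password):
    """Return the form data a browser would send, with credentials filled in."""
    parser = FormFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)    # includes hidden inputs and Submit1=Login
    payload["username"] = username   # assumed field name
    payload["password"] = password   # assumed field name
    return payload


def login(url, username, password):
    """Fetch the login page, then post the completed form in one session."""
    import requests  # third-party; imported here so the helpers above need only the stdlib

    with requests.Session() as session:  # the session keeps cookies between requests
        page = session.get(url)
        payload = build_login_payload(page.text, username, password)
        return session.post(url, data=payload)
```

Calling `login(url, 'USERNAME', 'PASSWORD')` would then return the response to the form post. Inspecting the real form in the browser's developer tools is the way to confirm the actual field names.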
See also questions close to this topic
-
I have an issue with an automatic basket-grabbing script in Python
I found a script on GitHub, but it doesn't seem to work anymore. Does anyone know a possible solution?
It is at https://github.com/MartijnDevNull/ticketnak
-
Symmetric Difference of Two CSV Files in Python
I want to write a Python script that compares two specified columns of two separate CSV files and outputs a third CSV file containing the complete rows whose values are unique to CSV 1 or CSV 2. So, for example, if both CSVs have a column ID, I want to see which rows of CSV 1 and CSV 2 have ID values that do not appear in the other file, and output those rows as a third CSV file.
I was thinking of using sets built from the two CSV files, but how do I specify the shared column?
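One way to specify the shared column is to read each file with `csv.DictReader` and build the sets from that column only. The sketch below assumes both files have the same header and uses made-up sample data held in memory in place of real files; the function names are hypothetical.

```python
import csv
import io


def symmetric_difference_rows(rows_a, rows_b, key):
    """Return the rows whose `key` value appears in exactly one of the inputs."""
    keys_a = {row[key] for row in rows_a}
    keys_b = {row[key] for row in rows_b}
    only_in_one = keys_a ^ keys_b  # set symmetric difference
    return [row for row in rows_a + rows_b if row[key] in only_in_one]


def write_rows(handle, rows, fieldnames):
    """Write the selected rows, with a header, to an open file handle."""
    writer = csv.DictWriter(handle, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(rows)


# In-memory stand-ins for CSV 1 and CSV 2 (made-up sample data).
csv_1 = io.StringIO("ID,name\n1,ann\n2,bob\n")
csv_2 = io.StringIO("ID,name\n2,bob\n3,cat\n")

rows_1 = list(csv.DictReader(csv_1))
rows_2 = list(csv.DictReader(csv_2))
unique_rows = symmetric_difference_rows(rows_1, rows_2, key="ID")

# Stand-in for the third CSV file.
output = io.StringIO()
write_rows(output, unique_rows, fieldnames=["ID", "name"])
```

With real files, the `io.StringIO` objects would be replaced by `open(path, newline="")` handles.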
-
How to separate a data frame using pandas?
-
Selenium is not selecting an option from the dropdown
I am trying to scrape some information from this website http://www.dubaitrade.ae/ja-terminal-1
I have to go to the terminal code column and select 'General Cargo'
Here is the HTML.
<select name="terminal" id="terminal1" style="width:52%" tabindex="3">
  <option value="GC">General Cargo</option>
  <option value="T1" selected="selected">Terminal One</option>
  <option value="T2">Terminal Two</option>
  <option value="T3">Terminal Three</option>
</select>
I am unsure whether the difficulty arises because the select is nested within this tag, meaning it is part of some kind of form.
<form name="vesselform" id="vesselLinkFormID" method="post" action="/pmisc/vessel.do;jsessionid=1da4f04d50b1a09ff55171820810948e8430f1ad26af4f30e765658bb25b3ffa.e34NaxuKaxmOaO0OaxmKc34Sa3j0">
I have tried
Select(driver.find_element_by_xpath('//*[@id="terminal1"]')).select_by_visible_text('General Cargo')
Select(driver.find_element_by_id('terminal1')).select_by_visible_text('General Cargo')
Select(driver.find_element_by_css_selector("terminal1"))
Select(driver.find_element_by_name('terminal1')).select_by_visible_text('General Cargo')
I always get the error message "unable to locate element".
-
Python/Selenium Can't find searchbox
I am trying to select the first searchbox on this website: https://www.ris.bka.gv.at/Bundesrecht/
This is my code:
for ii in testList2:
    varTitel = ii
    searchBox = driver.find_element_by_id('MainContent_SuchworteField')
    searchBox = driver.find_element_by_xpath('//*[@id="MainContent_SuchworteField"]/span')
    searchBox = driver.find_element_by_name('MainContent_SuchworteField_Value')
    searchBox.send_keys(varTitel)
    searchBox.send_keys(Keys.ENTER)
    time.sleep(1)
    print("Query link: " + driver.current_url)
    driver.back()
driver.quit()
As you can see, I tried three ways of selecting the searchbox. Every time I get a NoSuchElementException.
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"id","selector":"MainContent_SuchworteField"}
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"xpath","selector":"//*[@id="MainContent_SuchworteField"]/span"}
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"name","selector":"MainContent_SuchworteField_Value"}
Here is a snippet of what is being shown after inspecting the site.
I hope you can help me,
Cheers
-
Get href property for each row in a table using rvest
I am trying to extract all links for a table that looks similar to the following:
<!DOCTYPE html>
<html>
<body>
<table>
  <tr>
    <td>
      <a href="https://www.r-project.org/">R</a><br>
      <a href="https://www.rstudio.com/">RStudio</a>
    </td>
  </tr>
  <tr>
    <td>
      <a href="https://community.rstudio.com/">Rstudio Community</a>
    </td>
  </tr>
</table>
</body>
</html>
What I would like to get at the end is a list of data frames (or vectors), where each element contains all the links for one row of the HTML table. For example, in this case the first vector would be
c("https://www.r-project.org/","https://www.rstudio.com/")
and the second vector would be
c("https://community.rstudio.com/")
The main problem I am having right now is that I am not able to keep the relationship between each href and its row when I do the following:
library(rvest)
web <- read_html("table.html") %>%
  html_nodes("table") %>%
  html_nodes("tr") %>%
  html_nodes("a") %>%
  html_attr("href")
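The grouping being asked for amounts to visiting each row first and only then collecting the links inside it, instead of flattening all `a` nodes at once. The block below sketches that idea in Python with the standard library. It is an illustration of the per-row approach, not rvest code, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class RowLinkParser(HTMLParser):
    """Group href values by the <tr> they appear in."""

    def __init__(self):
        super().__init__()
        self.rows = []        # one list of hrefs per table row
        self._current = None  # hrefs of the row being parsed, or None outside a row

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._current = []
        elif tag == "a" and self._current is not None:
            href = dict(attrs).get("href")
            if href:
                self._current.append(href)

    def handle_endtag(self, tag):
        if tag == "tr" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def links_per_row(html):
    """Return a list with one list of links for each table row."""
    parser = RowLinkParser()
    parser.feed(html)
    return parser.rows


table_html = """
<table>
  <tr><td>
    <a href="https://www.r-project.org/">R</a><br>
    <a href="https://www.rstudio.com/">RStudio</a>
  </td></tr>
  <tr><td>
    <a href="https://community.rstudio.com/">Rstudio Community</a>
  </td></tr>
</table>
"""

rows = links_per_row(table_html)
```

Here `rows` holds one list per table row, so the link-to-row relationship is preserved.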
-
RemoteDisconnected Error when using requests.patch on Eve endpoint
I'm trying to track down a frustrating error with my Eve application. I am trying to update a record, but whenever I send the patch request, I am greeted with the following error:
  File "/home/undivided/.local/lib/python3.6/site-packages/requests/api.py", line 140, in patch
    return request('patch', url, data=data, **kwargs)
  File "/home/undivided/.local/lib/python3.6/site-packages/requests/api.py", line 58, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/undivided/.local/lib/python3.6/site-packages/requests/sessions.py", line 508, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/undivided/.local/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/home/undivided/.local/lib/python3.6/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response',))
The request is being made like this:
res = requests.patch(
    url,
    json=mydata,
    headers={
        "If-Match": _etag,
        "Authorization": 'Bearer {}'.format(eve_token)
    }
)
The strange thing is that all other request types to the same instance, including a GET to the exact same record, return properly. There are no errors that I can see in the Eve logs either.
The only possible complicating factor I can think of is that the Eve application is deployed on a Kubernetes cluster, but, as I said, all of the GETs/POSTs I have tried work as expected.
EDIT:
To add a further complicating factor, when I exec into my pod and try the request there, it succeeds, so it seems the problem is related to Kubernetes? Do I need to do something to enable HTTP PATCH on Kubernetes?
-
python HTTP batch request made by requests
I am trying to make a batch request to the ChannelAdvisor API based on this example:
My request looks like this:
POST https://api.channeladvisor.com/v1/$batch?access_token=xxxxxx
Content-Length: 1307
Content-Type: multipart/mixed; boundary=938fabdd69a541beb904b703514f8bc3

--batch
Content-Type: multipart/mixed; boundary=938fabdd69a541beb904b703514f8bc3

--938fabdd69a541beb904b703514f8bc3
Content-Type: application/http
Content-Transfer-Encoding: binary
Content-ID: 1

POST https://api.channeladvisor.com/V1/Products(227657)/UpdateQuantity HTTP/HTTPS 1.1
Content-Type: application/json

{"Value": {"UpdateType": "UnShipped", "Updates": [{"DistributionCenterID": 1, "Quantity": 2}]}}
--938fabdd69a541beb904b703514f8bc3
Content-Type: application/http
Content-Transfer-Encoding: binary
Content-ID: 2

POST https://api.channeladvisor.com/V1/Products(227658)/UpdateQuantity HTTP/HTTPS 1.1
Content-Type: application/json

{"Value": {"UpdateType": "UnShipped", "Updates": [{"DistributionCenterID": 1, "Quantity": 2}]}}
--938fabdd69a541beb904b703514f8bc3
Content-Type: application/http
Content-Transfer-Encoding: binary
Content-ID: 3

POST https://api.channeladvisor.com/V1/Products(227659)/UpdateQuantity HTTP/HTTPS 1.1
Content-Type: application/json

{"Value": {"UpdateType": "UnShipped", "Updates": [{"DistributionCenterID": 1, "Quantity": 2}]}}
--938fabdd69a541beb904b703514f8bc3--
--batch
I'm using the Python requests library and a little hack to achieve this. When I send the request with Postman it works fine and I get a 204 status.
When I try it with Python:
data = {
    'Value': {
        'UpdateType': 'UnShipped',
        'Updates': [{
            'DistributionCenterID': 1,
            'Quantity': 2
        }]
    }
}
headers = {
    'Content-Type': 'application/json'
}
commands = []
commands.append(requests.Request('POST', 'https://api.channeladvisor.com/V1/Products(227657)/UpdateQuantity HTTP/1.1', json=data, headers=headers))
commands.append(requests.Request('POST', 'https://api.channeladvisor.com/V1/Products(227658)/UpdateQuantity HTTP/1.1', json=data, headers=headers))
commands.append(requests.Request('POST', 'https://api.channeladvisor.com/V1/Products(227659)/UpdateQuantity HTTP/1.1', json=data, headers=headers))
#-----------------------------------------------#
batch = BatchRequest()
files = batch.prepare_requests(commands)
r = requests.Request('POST', 'https://api.channeladvisor.com/v1/$batch?access_token='+self.access_token, files=files)
prepared = r.prepare()
prepared = batch.finalize_request(prepared)
batch.pretty_print_POST(prepared)
s = requests.Session()
resp = s.send(prepared)
I'm getting this:
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "", line 2, in raise_from
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 258, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/usr/local/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py", line 285, in recv_into
    raise SocketError(str(e))
OSError: (54, 'ECONNRESET')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 440, in send
    timeout=timeout
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.6/site-packages/urllib3/util/retry.py", line 357, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.6/site-packages/urllib3/packages/six.py", line 685, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 601, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 387, in _make_request
    six.raise_from(e, None)
  File "", line 2, in raise_from
  File "/usr/local/lib/python3.6/site-packages/urllib3/connectionpool.py", line 383, in _make_request
    httplib_response = conn.getresponse()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 1331, in getresponse
    response.begin()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 297, in begin
    version, status, reason = self._read_status()
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/http/client.py", line 258, in _read_status
    line = str(self.fp.readline(_MAXLINE + 1), "iso-8859-1")
  File "/usr/local/Cellar/python/3.6.5/Frameworks/Python.framework/Versions/3.6/lib/python3.6/socket.py", line 586, in readinto
    return self._sock.recv_into(b)
  File "/usr/local/lib/python3.6/site-packages/urllib3/contrib/pyopenssl.py", line 285, in recv_into
    raise SocketError(str(e))
urllib3.exceptions.ProtocolError: ('Connection aborted.', OSError("(54, 'ECONNRESET')",))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "channel.py", line 157, in
    products.batchStockUpload()
  File "channel.py", line 146, in batchStockUpload
    resp = s.send(prepared, verify=True) #verify='/usr/local/Cellar/openssl/1.0.2o_1'
  File "/usr/local/lib/python3.6/site-packages/requests/sessions.py", line 618, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/requests/adapters.py", line 490, in send
    raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', OSError("(54, 'ECONNRESET')",))
I have tried everything I can think of and am asking here out of desperation. Thanks for any help.
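For comparison with the hack above, a multipart/mixed body like the one in the request can also be assembled by hand as bytes, which makes it easy to inspect exactly what is sent. This is only a sketch of that technique: `build_batch_body` is a hypothetical helper, the inner request line uses `HTTP/1.1` as in the Python code, and nothing in the question confirms that ChannelAdvisor accepts this exact layout or that it avoids the connection reset.

```python
import json
import uuid

CRLF = "\r\n"


def build_batch_body(operations, boundary=None):
    """Build headers and a multipart/mixed body from (method, url, payload) tuples."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for content_id, (method, url, payload) in enumerate(operations, start=1):
        parts.append(CRLF.join([
            "--" + boundary,
            "Content-Type: application/http",
            "Content-Transfer-Encoding: binary",
            "Content-ID: {}".format(content_id),
            "",  # blank line between part headers and the embedded request
            "{} {} HTTP/1.1".format(method, url),
            "Content-Type: application/json",
            "",  # blank line between embedded headers and the JSON body
            json.dumps(payload),
        ]))
    parts.append("--" + boundary + "--")  # closing delimiter
    body = (CRLF.join(parts) + CRLF).encode("utf-8")
    headers = {"Content-Type": "multipart/mixed; boundary=" + boundary}
    return headers, body


data = {"Value": {"UpdateType": "UnShipped",
                  "Updates": [{"DistributionCenterID": 1, "Quantity": 2}]}}
operations = [
    ("POST", "https://api.channeladvisor.com/V1/Products(227657)/UpdateQuantity", data),
    ("POST", "https://api.channeladvisor.com/V1/Products(227658)/UpdateQuantity", data),
]
headers, body = build_batch_body(operations, boundary="938fabdd69a541beb904b703514f8bc3")
```

The result could then be sent with `requests.post(batch_url, data=body, headers=headers)`, where `batch_url` stands for the `$batch` endpoint with the access token. Printing `body.decode()` next to the request Postman sends may show where the two differ.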
-
iterating through 6 values in python
I have data that arrives continuously and is stored in a list. I want to iterate through it 6 values at a time and compute the average of each group of 6 values. Can anyone please help with this? Thanks in advance.
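One way to do this is to buffer incoming values and emit an average each time the buffer holds 6 of them. A minimal sketch, assuming non-overlapping groups and that a trailing group with fewer than 6 values should be ignored; the function name `averages_of_every` is hypothetical.

```python
def averages_of_every(stream, size=6):
    """Yield the average of each consecutive, non-overlapping group of `size` values."""
    chunk = []
    for value in stream:
        chunk.append(value)
        if len(chunk) == size:
            yield sum(chunk) / size
            chunk = []  # start collecting the next group
    # Any leftover values (fewer than `size`) are dropped.


readings = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]  # made-up sample data
averages = list(averages_of_every(readings))
```

Because it is a generator, it also works on a source that keeps producing values, yielding each average as soon as its 6 values have arrived.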