Having trouble extracting a dynamic div list while scrolling down using WebDriver (Selenium & Python)
I'm having a hard time figuring out how to get a refreshed dynamic list while scrolling down a page using WebDriver in Selenium with Python 3. The website I'm trying to scrape is https://www.ubereats.com/stores/. If the site redirects you to the homepage, type any city and click on it, which will show you a list of restaurants in divs.
The interesting thing is that if you inspect the elements, the list of <div class="base_ ue-ff ...">..</div> elements changes as I scroll down the page. Even when I scroll the page down using WebDriver, my script still retrieves the old data that was extracted in the first place. Below is my sample code. I also added sleep calls to let the data load, but that made no difference to what was extracted.

    import re
    import sys
    import time
    from importlib import reload
    from urllib.request import urlopen

    import numpy as np
    from bs4 import BeautifulSoup
    from selenium import webdriver
    from selenium.common.exceptions import NoSuchElementException
    from selenium.webdriver.common.by import By
    from selenium.webdriver.common.keys import Keys

    # path_chrome_driver holds the path to the chromedriver binary (defined elsewhere)
    driver = webdriver.Chrome(path_chrome_driver)
    driver.get('https://www.ubereats.com')

    # Random pauses between actions
    wait_time_for_search_complete = np.random.uniform(1, 2)
    time.sleep(wait_time_for_search_complete)

    input_city_name = driver.find_element(By.XPATH, "//input[@placeholder='Enter your delivery address']")
    time_to_wait_to_enter_city_name = np.random.uniform(1, 2)
    time.sleep(time_to_wait_to_enter_city_name)

    input_city_name.send_keys('Sydney')
    time_to_wait_to_write_city = np.random.uniform(2, 3)
    time.sleep(time_to_wait_to_write_city)

    # Click the first suggestion in the address dropdown
    select_first_in_dropdown = driver.find_element(By.XPATH, '//*[@id="app-content"]/div/div/div/div/div/div/div/div/div/div/div/div/div/div/div/button')
    select_first_in_dropdown.click()
    time_to_wait_to_load_restaurants = np.random.uniform(2, 3)
    time.sleep(time_to_wait_to_load_restaurants)

    # The page source is captured and parsed once, before any scrolling
    current_page = driver.page_source
    soup = BeautifulSoup(current_page, 'html.parser')

    height = 0
    restaurant_site = []  # initial value was lost in the post; an empty list is assumed

    while True:
        restaurant_information = ''
        restaurant_information = soup.find_all('a', ['base_', 'ue-kl', 'ue-km', 'ue-kn', 'ue-ko'])
        time.sleep(5)

        for restaurant in restaurant_information:
            print(restaurant['href'])

        # Scroll down another 1000 pixels
        height += 1000
        driver.execute_script("window.scrollTo(0," + str(height) + ")")
        driver.implicitly_wait(3)

I'm really having a hard time figuring out how to retrieve the restaurant list as I scroll down the page, since the div is dynamic. I believe it has something to do with an AJAX call, but if you have any alternative solution, please let me know. I'd really like to solve this issue as soon as possible.
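
One thing visible in the code above is that driver.page_source is read and parsed a single time before the loop, so every pass searches the same snapshot no matter how far the page has scrolled. A minimal sketch of the alternative, re-reading the page source after each scroll and accumulating links (because the page replaces its divs as you scroll), might look like the following. The names collect_links_while_scrolling, LinkCollector, marker_class, step, pause and max_scrolls are hypothetical, the "base_" class is taken from the selector in the question and may not match the live site, and the standard-library HTMLParser stands in for BeautifulSoup only to keep the sketch dependency-free.

```python
import time
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Hypothetical helper: gathers hrefs of <a> tags carrying a given class."""

    def __init__(self, marker_class):
        super().__init__()
        self.marker_class = marker_class
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        classes = (attributes.get("class") or "").split()
        href = attributes.get("href")
        if href and self.marker_class in classes:
            self.links.append(href)


def collect_links_while_scrolling(driver, marker_class="base_", step=1000,
                                  pause=2.0, max_scrolls=50):
    """Scroll in steps, re-parsing the live page source after every scroll.

    `driver` is assumed to be a Selenium WebDriver (anything exposing
    `page_source` and `execute_script` works). All parameter defaults are
    illustrative assumptions, not values confirmed for the target site.
    """
    seen = set()
    ordered_links = []
    height = 0
    last_offset = 0

    for _ in range(max_scrolls):
        # Take a fresh snapshot each pass instead of reusing the first one.
        parser = LinkCollector(marker_class)
        parser.feed(driver.page_source)
        for href in parser.links:
            if href not in seen:  # earlier items may reappear in later snapshots
                seen.add(href)
                ordered_links.append(href)

        height += step
        driver.execute_script("window.scrollTo(0, arguments[0]);", height)
        time.sleep(pause)  # give the page time to load the next batch

        # If the scroll position did not move, the bottom has been reached.
        offset = driver.execute_script("return window.pageYOffset;")
        if offset == last_offset:
            break
        last_offset = offset

    return ordered_links
```

With a real browser session this would be called as collect_links_while_scrolling(driver) once the restaurant list has loaded. A fixed pause is the simplest choice here; waiting for a specific element to appear would be more robust if a reliable selector is available.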