Use Selenium to get information out of a table with changing XPaths

I am trying to loop through a list of companies and scrape their environmental ratings from CSRhub. I would post the link as an example, but the site is login-only. My scraper has not been getting accurate numbers because the location of each rating changes depending on how many rows the table on the page has.

For example, here we see that Target has 5 rows in the table, and the XPath for 73 (the Energy & Climate Change rating) is:

//*[@id="rating-section"]/div/div[2]/div/div/table/tbody/tr[23]/td[5]/div/table/tbody/tr/td[2]/div/div/span[1]/span


[Screenshot of the ratings table for Target; image not preserved]


But companies vary in their number of rows. Here are the XPaths for the different elements I am trying to gather:



[Screenshot of the XPaths for the elements to gather; image not preserved]

The table and other page elements do not have ids or well-labeled classes. I am fairly new to front-end development. How can I select the correct element regardless of the number of rows a company has?

1 answer

  • answered 2020-11-25 20:11 BCR

    Since you can't rely on the row numbering, identify what you can rely on: in this case, the text label of the value you are looking for. Use the XPath contains() function to match on that text. I can't read the HTML in your screenshot, so it's hard to give exact code, but it will look something like this:

    If the element is <span class="something useless">I am a label!</span>

    use "//*[@id='rating-section']//table//span[contains(text(),'I am a label')]"
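With Selenium's Python bindings, that could look roughly like the sketch below. It assumes the label really is a span inside the element with id rating-section (the id comes from the XPath in the question) and that the driver is already logged in and on the company's page. The helper names label_xpath and find_label are made up for illustration.

```python
def label_xpath(label: str) -> str:
    """Build an XPath for the span whose text contains `label`, whatever row it is in."""
    # Assumption: the label contains no single quote, because it is wrapped in '...'.
    return f"//*[@id='rating-section']//table//span[contains(text(), '{label}')]"


def find_label(driver, label: str):
    """Return the label element on the page the driver currently has loaded."""
    # Imported here so label_xpath() can be tried out without Selenium installed.
    from selenium.webdriver.common.by import By

    return driver.find_element(By.XPATH, label_xpath(label))
```

Calling find_label(driver, "Energy & Climate Change") should then locate the label in whichever row it lands, provided that is the exact text shown on the page.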

    BTW, a handy trick is to use "//" anywhere there is a lot of non-specific markup. It matches descendants at any depth, so you don't need all the /div/span/div cruft in your XPath.

    Also look at navigating between child and parent nodes. Identify an element near the one you want that is reliably static, then use a child or parent expression (plus additional XPath if needed) to reach the element you need.
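As a rough sketch of that idea: anchor on the label text, climb to the nearest enclosing table row with the ancestor axis, then descend into the cell holding the number. The td[2] step is borrowed from the tail of the XPath in the question and is only a guess that the label and the rating share the same inner row; adjust it to the real markup. The names rating_xpath and get_rating are hypothetical.

```python
def rating_xpath(label: str) -> str:
    """Build an XPath that goes from a row label to the rating in the same row."""
    # Assumption: the label contains no single quote, because it is wrapped in '...'.
    return (
        "//*[@id='rating-section']"
        f"//*[contains(text(), '{label}')]"  # anchor: the element showing the label
        "/ancestor::tr[1]"                   # climb to the nearest enclosing row
        "/td[2]//span"                       # descend into the (assumed) rating cell
    )


def get_rating(driver, label: str) -> str:
    """Return the visible text of the first matching rating element."""
    # Imported here so rating_xpath() can be tried out without Selenium installed.
    from selenium.webdriver.common.by import By

    return driver.find_element(By.XPATH, rating_xpath(label)).text
```

Because nothing in the expression counts rows, the same call should work for a company with 5 rows or 50, as long as the label text itself stays the same.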

    XPath is daunting when you first start out, but I encourage you to keep trying and learning. It's really powerful in cases like this.