Scrapy: How do I select by rowspan?

Here is my code:

      <td height="34" class="normal">4893</td>
      <td class="normal">Public Utilities Commission </td>
      <td class="normal">Investigation to Examine </td>
      <td height="34" rowspan="2" class="normal"><a 
      <td class="normal"><p>RI Distribution Genration 
      <td class="normal">2019 Renewable Energy </td>
      <td class="normal">The Narragansett Ele</td>
      <td class="normal">2018 Renewable Energy </td>
      <td height="34" class="normal"><a 
      <td class="normal">Kearsarge Uxbridge, LLC </td>
      <td class="normal">Renewable Energy</td>

In the second <tr>, the first <td> has rowspan="2". I want to apply the content of that first <td> (i.e. 4892) to the next <tr> as well, which contains only two <td> elements. I have tried the following, but it does not work:

        item['id'] = row.xpath('.//tr//td[1]//text()').extract()

        if not item['id']:
            item['id'] = row.xpath('.//[preceding- 

1 answer

  • answered 2018-11-08 08:32 starrify

    So instead of "select rowspan" you're actually looking to "select by rowspan".

    There are several approaches you could try.

    Select a cell whenever it has a rowspan attribute:

    # CSS
    row.css('tr td[rowspan]::text')
    # XPath
    row.xpath('.//tr//td[@rowspan]/text()')

    Select a cell only when its rowspan has a specific value ("2" here):

    # CSS
    row.css('tr td[rowspan="2"]::text')
    # XPath
    row.xpath('.//tr//td[@rowspan="2"]/text()')
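
    Selecting the cell is only half of what the question asks for: the id also has to be repeated on the rows that the rowspan covers. Below is a minimal sketch of that carry-forward step. It makes several assumptions: the table is a small made-up sample (not the page from the question), the helper name rows_with_ids is hypothetical, and it uses the standard library's xml.etree.ElementTree instead of Scrapy selectors so that it runs without a crawler. The same attribute tests should translate to row.xpath(...) inside a spider.

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample table (an assumption, not the original page).
SAMPLE = """
<table>
  <tr><td>4893</td><td>Commission</td><td>Investigation</td></tr>
  <tr><td rowspan="2">4892</td><td>Program A</td><td>2019</td></tr>
  <tr><td>Program B</td><td>2018</td></tr>
  <tr><td>4891</td><td>Kearsarge</td><td>Renewable</td></tr>
</table>
"""


def rows_with_ids(table):
    """Yield (id, other_cells) per row, repeating an id across its rowspan."""
    carried_id = None  # id inherited from an earlier rowspan cell
    remaining = 0      # how many following rows still inherit carried_id
    for tr in table.findall(".//tr"):
        texts = [(td.text or "").strip() for td in tr.findall("td")]
        if not texts:
            continue  # skip rows without <td> cells (e.g. header rows)
        if remaining > 0:
            # This row has no id cell of its own, so reuse the carried one.
            remaining -= 1
            yield carried_id, texts
            continue
        span = int(tr.find("td").get("rowspan", "1"))
        if span > 1:
            carried_id, remaining = texts[0], span - 1
        yield texts[0], texts[1:]


table = ET.fromstring(SAMPLE)
result = list(rows_with_ids(table))

# The same attribute predicate as in the selectors above.
spanning = [td.text for td in table.findall(".//td[@rowspan='2']")]
```

    Tracking the remaining row count, rather than checking only the very next row, keeps the sketch working for rowspan values larger than 2.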

    See also: