[SOLVED] Why does Scrapy return a different date from a time HTML tag than the developer tools? – Stack Overflow

Issue

This Content is from Stack Overflow. Question asked by Henrik

I queried the HTML node that stores an article's publication date. When scraping the site, I noticed that the datetime attribute holds a different date than the text inside the node. In Google Chrome's developer tools, the datetime attribute matches the displayed text.
My question is: why does Scrapy get a different datetime attribute than the developer tools? And can I somehow get the correct date from the datetime attribute?

This is the code and the return value:

response.xpath("//*[@class='a20-news-date']/time").getall()
['<time datetime="2021-11-15T08:17:20+01:00">Sonntag, 08.03.2020 // 17:20 Uhr</time>']

Chrome's developer tools display the node as:

<div class="a20-news-date">
    <time datetime="2020-03-08T17:20:16+01:00">8. März 2020</time>
</div>



Solution

If you check the page's HTML source (Ctrl+U), you'll find that the page contains several <time> elements. What you see in the developer tools is the resulting DOM after JavaScript has run, whereas Scrapy receives the raw HTML and does not execute JavaScript. In the source HTML, your target element is located inside the <article> tag:

response.xpath("//article//time/text()").get()
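To answer the second part of the question, the datetime attribute can be read from the same scoped element and parsed. The following is only a minimal sketch: RAW_HTML is a hypothetical stand-in for the page source (the real markup inside <article> is not shown in the question), and the standard library's ElementTree is used in place of Scrapy's selectors so the example runs without a crawl. In a spider, the equivalent lookup would be response.xpath("//article//time/@datetime").get().

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical page source (an assumption, not the real site): it contains
# two <time> elements, and only the one inside <article> is the target.
RAW_HTML = """
<html><body>
  <div class="a20-news-date">
    <time datetime="2021-11-15T08:17:20+01:00">Sonntag, 08.03.2020 // 17:20 Uhr</time>
  </div>
  <article>
    <time datetime="2020-03-08T17:20:16+01:00">8. März 2020</time>
  </article>
</body></html>
"""

root = ET.fromstring(RAW_HTML.strip())

# An unscoped query matches every <time> element in document order,
# so the first hit is not necessarily the article's date.
all_times = root.findall(".//time")
first_match = all_times[0]

# Scoping the query to <article> picks the intended element.
article_time = root.find(".//article//time")

# The datetime attribute is an ISO 8601 string with a UTC offset,
# which datetime.fromisoformat can parse directly.
published = datetime.fromisoformat(article_time.get("datetime"))

print(len(all_times))             # 2
print(first_match.get("datetime"))
print(published.isoformat())      # 2020-03-08T17:20:16+01:00
```

Parsing the attribute rather than the visible text avoids depending on the localized display format ("8. März 2020"), which would otherwise need locale-aware parsing.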


This question was asked on Stack Overflow by Henrik and answered by gangabass. It is licensed under the terms of CC BY-SA 2.5, CC BY-SA 3.0, and CC BY-SA 4.0.
