I queried the HTML node that stores an article's date. When scraping the site, I noticed that the date in the datetime attribute differs from the text inside the node. In Chrome's developer tools, however, the datetime attribute matches the displayed text.
My question is: why does Scrapy get a different datetime attribute than the one the developer tools show? And can I somehow get the correct date from the datetime attribute?
This is the code and the return value:
response.xpath("//*[@class='a20-news-date']/time").getall()
['<time datetime="2021-11-15T08:17:20+01:00">Sonntag, 08.03.2020 // 17:20 Uhr</time>']
Chrome's developer tools display the node as:
<div class="a20-news-date">
  <time datetime="2020-03-08T17:20:16+01:00">8. März 2020</time>
</div>
Because if you check the HTML source (Ctrl+U), you'll find that it contains several <article> tags:
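To see how this can happen, here is a minimal sketch using only Python's standard library. The markup in `RAW_HTML` is hypothetical: only the two `<time>` values come from the question, and the way they are wrapped in `<article>` tags is an assumption, not a copy of the real page. The helper class `TimeCollector` is also made up for this illustration. It lists every `<time>` element in the raw source together with the index of the `<article>` it sits in, so you can compare that with what the browser shows.

```python
from html.parser import HTMLParser

# Hypothetical raw source: two <article> blocks, each with its own <time>.
# Only the two <time> values are taken from the question; the surrounding
# structure is an assumption made for illustration.
RAW_HTML = """
<article>
  <div class="a20-news-date">
    <time datetime="2021-11-15T08:17:20+01:00">Sonntag, 08.03.2020 // 17:20 Uhr</time>
  </div>
</article>
<article>
  <div class="a20-news-date">
    <time datetime="2020-03-08T17:20:16+01:00">8. März 2020</time>
  </div>
</article>
"""


class TimeCollector(HTMLParser):
    """Collect [article_index, datetime_attr, text] for every <time> tag."""

    def __init__(self):
        super().__init__()
        self.article_index = -1  # index of the most recently opened <article>
        self.in_time = False     # True while we are inside a <time> element
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.article_index += 1
        elif tag == "time":
            self.in_time = True
            self.found.append([self.article_index, dict(attrs).get("datetime"), ""])

    def handle_data(self, data):
        if self.in_time:
            self.found[-1][2] += data  # accumulate the visible text

    def handle_endtag(self, tag):
        if tag == "time":
            self.in_time = False


collector = TimeCollector()
collector.feed(RAW_HTML)
collector.close()

for article_index, datetime_attr, text in collector.found:
    print(article_index, datetime_attr, text.strip())
```

If the real source looks similar, one option in Scrapy is to scope the XPath to a single article, for example `response.xpath("(//article)[1]//time/@datetime").get()`. Which index (if any) is the right one depends on the actual page, so treat that expression as a starting point rather than a fix.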