gofeed
RSS "pubDate", DST timezone identifiers not handled
DST timezone identifiers are not currently being handled properly in RFC 822 formatted dates.
Take the CBC RSS feeds as an example. Mon, 21 Apr 2025 06:00:00 EDT is wrongly parsed as UTC, so the EDT offset of -4 hours is dropped and item timestamps end up 4 hours off.
RFC 822 does allow the use of defined offsets as timezone identifiers:
    zone = "UT" / "GMT"              ; Universal Time
                                     ; North American : UT
         / "EST" / "EDT"             ; Eastern:  - 5/ - 4
         / "CST" / "CDT"             ; Central:  - 6/ - 5
         / "MST" / "MDT"             ; Mountain: - 7/ - 6
         / "PST" / "PDT"             ; Pacific:  - 8/ - 7
         / 1ALPHA                    ; Military: Z = UT;
                                     ;  A:-1; (J not used)
                                     ;  M:-12; N:+1; Y:+12
         / ( ("+" / "-") 4DIGIT )    ; Local differential
                                     ;  hours+min. (HHMM)
Now, one could argue that using RFC 822 in 2025 is a bit stupid, and I'd agree, but I don't work at CBC unfortunately. The issue has been reported to them and their reply was that their RSS feeds are not a priority.
Would you be open to adding a fix to dateparser.go to handle this edge case? I can contribute a PR.