gofeed

RSS "pubDate", DST timezone identifiers not handled

Open ghost opened this issue 8 months ago • 0 comments

DST timezone identifiers are not currently being handled properly in RFC 822 formatted dates.

Take the CBC RSS feeds as an example. Mon, 21 Apr 2025 06:00:00 EDT is wrongly parsed as 06:00:00 UTC when it is actually 10:00:00 UTC, so items end up with timestamps that are off by 4 hours.
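To illustrate with the standard library alone: the sketch below shows how Go's time package treats an abbreviation it cannot resolve. Whether gofeed's date parsing goes through exactly this path is an assumption on my part, and parsedOffsetSeconds is a hypothetical helper, not something from gofeed. ParseInLocation with time.UTC is used so the result does not depend on the machine's local zone; plain time.Parse behaves the same way when the local zone does not define EDT.

```go
package main

import (
	"fmt"
	"time"
)

// parsedOffsetSeconds parses an RFC 822-style date using the RFC 1123 layout
// and returns the UTC offset (in seconds) that Go attached to the result.
// Hypothetical helper for illustration only.
func parsedOffsetSeconds(value string) (int, error) {
	// Resolving abbreviations against UTC means "EDT" is unknown, so Go
	// records a fabricated zone named "EDT" with a zero offset.
	parsed, err := time.ParseInLocation(time.RFC1123, value, time.UTC)
	if err != nil {
		return 0, err
	}
	_, offset := parsed.Zone()
	return offset, nil
}

func main() {
	offset, err := parsedOffsetSeconds("Mon, 21 Apr 2025 06:00:00 EDT")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	// EDT should be -14400 seconds; the fabricated zone reports 0.
	fmt.Println("offset assigned to EDT, in seconds:", offset)
}
```

The zone name survives, but the offset does not, which is why the time ends up being treated as UTC.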

RFC 822 does allow named zones with defined offsets as timezone identifiers:

 zone        =  "UT"  / "GMT"                ; Universal Time
                                                 ; North American : UT
                 /  "EST" / "EDT"                ;  Eastern:  - 5/ - 4
                 /  "CST" / "CDT"                ;  Central:  - 6/ - 5
                 /  "MST" / "MDT"                ;  Mountain: - 7/ - 6
                 /  "PST" / "PDT"                ;  Pacific:  - 8/ - 7
                 /  1ALPHA                       ; Military: Z = UT;
                                                 ;  A:-1; (J not used)
                                                 ;  M:-12; N:+1; Y:+12
                 / ( ("+" / "-") 4DIGIT )        ; Local differential
                                                 ;  hours+min. (HHMM)
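A rough sketch of how the named zones in this grammar could be mapped to fixed offsets in Go. This is not gofeed's code: rfc822Zones and parseWithNamedZone are hypothetical names, only the single layout from the CBC example is handled, and the one-letter military zones and numeric offsets are left out.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// rfc822Zones maps the named zones from the RFC 822 grammar to their
// offsets in seconds east of UTC. Hypothetical table for illustration.
var rfc822Zones = map[string]int{
	"UT": 0, "GMT": 0,
	"EST": -5 * 3600, "EDT": -4 * 3600,
	"CST": -6 * 3600, "CDT": -5 * 3600,
	"MST": -7 * 3600, "MDT": -6 * 3600,
	"PST": -8 * 3600, "PDT": -7 * 3600,
}

// parseWithNamedZone splits off the trailing zone name, looks up its offset,
// and interprets the remaining date and time fields in that fixed zone.
func parseWithNamedZone(value string) (time.Time, error) {
	value = strings.TrimSpace(value)
	idx := strings.LastIndexByte(value, ' ')
	if idx < 0 {
		return time.Time{}, fmt.Errorf("no zone field in %q", value)
	}
	name := value[idx+1:]
	offset, ok := rfc822Zones[name]
	if !ok {
		return time.Time{}, fmt.Errorf("unknown zone %q", name)
	}
	loc := time.FixedZone(name, offset)
	// The layout has no zone element, so the fields are read as local to loc.
	return time.ParseInLocation("Mon, 02 Jan 2006 15:04:05", value[:idx], loc)
}

func main() {
	parsed, err := parseWithNamedZone("Mon, 21 Apr 2025 06:00:00 EDT")
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(parsed.UTC().Format(time.RFC3339)) // 2025-04-21T10:00:00Z
}
```

Using a fixed table keeps the result independent of the local zone and of the installed tz database, at the cost of only covering the abbreviations the RFC itself defines.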

Now, one could argue that using RFC 822 in 2025 is a bit stupid, and I'd agree, but I don't work at CBC unfortunately. The issue has been reported to them and their reply was that their RSS feeds are not a priority.

Would you be open to adding a fix to dateparser.go to handle this edge case? I can contribute a PR.

ghost, Apr 21 '25 12:04