gofeed icon indicating copy to clipboard operation
gofeed copied to clipboard

Missing DateTime Parser Tests

Open mmcdole opened this issue 6 years ago • 3 comments

We are missing comprehensive tests for the date/time parser.

Ideally, these would cover a myriad of date formats, including those with / without timezone, etc.

mmcdole avatar Feb 14 '19 07:02 mmcdole

You know - I just added a new format, and put in a PR. And I went looking for the test. But didn't find one. I paused and thought about it. You could run all the formats through the parser. Then back out as formatted time - and you should get back the original format string.

But what does it prove? It's more of a test of the time package - which is well tested. I went delving into the time package test routines. To find a 'Format String' verifier. But they really didn't have one. And it's a complex format. Let me scratching my head.

rodmcelrath avatar Mar 24 '23 17:03 rodmcelrath

I've been considering making date parsing optional, and perhaps having two degrees:

  1. Parse the formats specified by the specs (strict)
  2. Attempt to parse the myriad of date formats I have listed today.

I'm a bit concerned about the date parsing penalty people pay on parsing, when they may not even need/want it.

mmcdole avatar Mar 24 '23 23:03 mmcdole

I for one lean heavily into what you have provided. I am dealing with nearly 700 feeds now.

The format I put in today, was the first failure I have seen on the date parsing. Might be others I have missed, but that is in place is pretty darned good.

Expensive to be sure, but worth it for me. As long as there is a way to engage it.
Another option might be to order the lists. Or further break them up and sample the incoming date string, and select a “group” of formats to test.

Given the other processing I do, those date formats are the least of my worries, if I get a valid date out the other side.

Here is a thought, throw a mutex in there, and perform a move to the front, on matching format string. Even in a multi threaded env, the hot matches will stay on top, and over the course of parsing one or more feeds simultaneously, you will only pay for the deep dive infrequently.

rodmcelrath avatar Mar 25 '23 02:03 rodmcelrath