OpenDirectoryDownloader

DateTime parsing / support

Open KoalaBear84 opened this issue 6 years ago • 12 comments

Add support for DateTime parsing, add field to WebFile object, add latest date to Reddit stats

KoalaBear84 avatar Feb 10 '19 18:02 KoalaBear84

Help is needed for a library which can automagically parse a lot of date formats.

KoalaBear84 avatar Apr 12 '19 08:04 KoalaBear84

This is still a very hard problem. I have tried to find a good date/time string parsing library, but haven't found anything.

KoalaBear84 avatar Aug 15 '20 13:08 KoalaBear84

Could you share a few examples of dates you need to parse? :)

Chaphasilor avatar Jan 27 '21 22:01 Chaphasilor

All of the formats in open directories.

I would've thought there would be at least one library which could do that. But couldn't really find one, tried multiple times.

Of course it can be written by hand, but I'd prefer a library 😅

KoalaBear84 avatar Jan 27 '21 23:01 KoalaBear84

Maybe we could start a list of formats we encounter here in this thread? :)

So we can use the list to test different libraries or our custom code?

Chaphasilor avatar Jan 27 '21 23:01 Chaphasilor

I'll start (you should be able to edit my comment to add new formats):

  • 2019-12-08 01:19
  • 2020-08-06 14:43:00
  • 2020-08-6 14:43:00
  • 26-Apr-2020 17:31
  • 2019-Mar-17 03:43:25
  • 2 months ago (with tooltip 2020-08-6 14:43:00)
  • March 28 2013 at 11:21

Went through the last month and top posts on /r/OpenDirectories...
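A list like this can double as a test set. Here is a minimal sketch of the "try a fixed list of formats" approach, written in Python rather than C# just to keep it short. The format list and the name `parse_listing_date` are assumptions made up for illustration, not anything from OpenDirectoryDownloader, and month names are matched with the default (English) locale.

```python
from datetime import datetime

# Candidate formats, tried in order. This list is an assumption derived
# from the samples collected in this thread; it is not exhaustive.
CANDIDATE_FORMATS = [
    "%Y-%m-%d %H:%M:%S",   # 2020-08-06 14:43:00 and 2020-08-6 14:43:00
    "%Y-%m-%d %H:%M",      # 2019-12-08 01:19
    "%d-%b-%Y %H:%M",      # 26-Apr-2020 17:31
    "%Y-%b-%d %H:%M:%S",   # 2019-Mar-17 03:43:25
    "%B %d %Y at %H:%M",   # March 28 2013 at 11:21
]


def parse_listing_date(text):
    """Return a datetime for the first format that matches, or None."""
    cleaned = text.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt)
        except ValueError:
            continue  # this format did not match, try the next one
    return None


SAMPLES = [
    "2019-12-08 01:19",
    "2020-08-06 14:43:00",
    "2020-08-6 14:43:00",
    "26-Apr-2020 17:31",
    "2019-Mar-17 03:43:25",
    "2 months ago",
    "March 28 2013 at 11:21",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(f"{sample!r:30} -> {parse_listing_date(sample)}")
```

With this list, the relative "2 months ago" entry is the only sample that comes back as `None`; it needs separate handling (or the tooltip value).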

Chaphasilor avatar Jan 29 '21 20:01 Chaphasilor

Wow. Well done. Looks like these can all be parsed quite well. The problem starts with the American dates 😅

Looks like I prefer Japanese dates 👍👍👍 They work very well on file systems.

image

Maybe later I'll create an option like the current --exact-file-size one, which also takes the date from the response, and probably also the ETag, and both can go into #20 😇
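For reference, a rough sketch of what "take the date from the response" could look like, in Python rather than C#. It assumes the response headers are available as a plain dict with exactly these header names (real HTTP clients usually offer case-insensitive lookup), and `response_metadata` is a hypothetical helper name, not an existing option or function in the project.

```python
from email.utils import parsedate_to_datetime


def response_metadata(headers):
    """Extract the Last-Modified date and ETag from response headers.

    `headers` is assumed to be a plain dict of header name to value.
    Missing or unparseable values come back as None.
    """
    last_modified = None
    raw = headers.get("Last-Modified")
    if raw is not None:
        try:
            # HTTP dates look like "Wed, 21 Oct 2015 07:28:00 GMT"
            last_modified = parsedate_to_datetime(raw)
        except (TypeError, ValueError):
            # Older Python versions raise TypeError, newer ones ValueError
            last_modified = None
    return {"last_modified": last_modified, "etag": headers.get("ETag")}
```

The HTTP date format is fixed by the standard, so unlike listing dates it does not depend on how the server renders its index page.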

KoalaBear84 avatar Jan 29 '21 21:01 KoalaBear84

That sounds good! I'll keep looking for obscure date formats, but I believe these cover >90% of all submissions...
About parsing: I don't know about C#/.NET, but running all of the above dates through JavaScript's new Date() works for all except 2 months ago and March 28 2013 at 11:21 (although simply removing the "at" did the trick). American dates (using / as separator) were recognized correctly as well.
So if a plain built-in parser can handle this, it shouldn't be a huge problem :)
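The two problem cases could be handled with a small pre-processing step before the built-in parser runs. A minimal sketch in Python (not C# or JavaScript): the helper names are made up for illustration, and treating a month as 30 days and a year as 365 days is a deliberate approximation, so relative dates only give a rough timestamp.

```python
import re
from datetime import datetime, timedelta

# Approximate unit lengths; month and year are assumptions (30 and 365 days).
_UNIT_LENGTH = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
    "year": timedelta(days=365),
}
_RELATIVE = re.compile(
    r"^(\d+)\s+(minute|hour|day|week|month|year)s?\s+ago$", re.IGNORECASE
)


def strip_filler(text):
    """Drop the word 'at' between date and time: 'March 28 2013 at 11:21'."""
    return re.sub(r"\s+at\s+", " ", text.strip())


def approximate_relative(text, now):
    """Turn 'N units ago' into an approximate datetime, or return None."""
    match = _RELATIVE.match(text.strip())
    if match is None:
        return None
    count = int(match.group(1))
    unit = match.group(2).lower()
    return now - count * _UNIT_LENGTH[unit]
```

When the page offers a tooltip with the exact timestamp, as in the "2 months ago" sample, parsing the tooltip is the better source; the approximation is only a fallback.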


It's called Japanese logic? Never knew that! I know it's the superior date format for everything and I try to use it wherever I can, but good to know there are actually people using this in day-to-day life :D

Chaphasilor avatar Jan 29 '21 23:01 Chaphasilor

https://docs.microsoft.com/en-us/dotnet/api/system.datetime.tryparse?view=net-5.0#System_DateTime_TryParse_System_String_System_DateTime__

Seems like .NET should be able to parse most of these formats.
We should use a static "culture" format though, so the app behaves consistently...
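To illustrate why a fixed convention matters, here is a small Python sketch (not .NET) in which explicit format strings stand in for a fixed culture. The sample string and the `parse_slash_date` helper are made up for illustration.

```python
from datetime import datetime


def parse_slash_date(text, day_first):
    """Parse 'xx/yy/YYYY HH:MM' with an explicitly chosen field order."""
    fmt = "%d/%m/%Y %H:%M" if day_first else "%m/%d/%Y %H:%M"
    return datetime.strptime(text, fmt)


AMBIGUOUS = "03/04/2021 10:15"

# The same string yields two different dates depending on the convention.
as_american = parse_slash_date(AMBIGUOUS, day_first=False)  # March 4th
as_european = parse_slash_date(AMBIGUOUS, day_first=True)   # April 3rd
```

If the parser follows the machine's own locale instead, the same listing could produce different dates on different users' machines, which is the inconsistency a static setting avoids.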

Chaphasilor avatar Feb 05 '21 13:02 Chaphasilor

You are right. The CultureInfo (language) could maybe even be read from the website's headers, which would make it even better.

See fiddle for example of the parsing.

https://dotnetfiddle.net/HmZvn9

KoalaBear84 avatar Feb 05 '21 13:02 KoalaBear84

Looks good 👍

I also like the idea of using the website metadata for choosing the "culture". Not sure if all servers provide info about the locale though...
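A rough sketch of how that could work with a fallback, in Python rather than C#. It assumes the server sends a `Content-Language` header and that headers arrive as a plain dict; the rule "only en-US is month-first" and the name `slash_date_format` are simplifications invented for this example, not a complete locale table.

```python
# Hypothetical rule: only US English is treated as month-first here.
MONTH_FIRST_TAGS = {"en-us"}


def slash_date_format(headers, default_day_first=True):
    """Pick a format for slash-separated dates from Content-Language.

    Falls back to `default_day_first` when the header is missing or empty.
    """
    raw = headers.get("Content-Language", "")
    # The header may list several tags ("de-DE, en"); use the first one.
    first_tag = raw.split(",")[0].strip().lower()
    if not first_tag:
        day_first = default_day_first
    else:
        day_first = first_tag not in MONTH_FIRST_TAGS
    return "%d/%m/%Y" if day_first else "%m/%d/%Y"
```

Since many servers will not send the header at all, the default decides most cases, so it is worth making that default a fixed, documented choice.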

Chaphasilor avatar Feb 12 '21 21:02 Chaphasilor

https://code-maze.com/csharp-convert-string-to-datetime/

KoalaBear84 avatar Jul 06 '23 06:07 KoalaBear84