csvs-to-sqlite

Figure out a mechanism for interpreting dates as always in the 1900s

Open simonw opened this issue 7 years ago • 4 comments

I created https://csvs-to-sqlite-date-demo.now.sh/antiquities-317d506/actions.under.antiquities.act like this:

$ csvs-to-sqlite actions.under.antiquities.act.csv antiquities.db -d date

Using the CSV from here: https://github.com/fivethirtyeight/data/blob/master/antiquities-act/actions_under_antiquities_act.csv

Just one problem:

[Screenshot, 2018-04-24 at 9:02 am]

It would be nice if there was a way to tell csvs-to-sqlite "if a year is two digits, treat it as being in the 1900s".

simonw, Apr 24 '18 16:04

I filed this with dateparser: https://github.com/scrapinghub/dateparser/issues/410 - and they opened a new feature request ticket against dateutil about it: https://github.com/dateutil/dateutil/issues/703

simonw, Apr 24 '18 16:04

It looks like I could solve this in csvs-to-sqlite by monkey-patching the dateutil.parser class: https://github.com/scrapinghub/dateparser/issues/410#issuecomment-383968432

simonw, Apr 24 '18 16:04

Maybe our command-line option for this could be --date-year-century=1900

simonw, Apr 24 '18 16:04
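
For illustration, the proposed option could be validated along these lines. This is only a sketch using the standard library's argparse, not the tool's actual command-line code; the `century` helper, the `build_parser` function and the `csvs-to-sqlite-sketch` program name are all hypothetical, and `--date-year-century` is the proposal from this thread, not an existing flag.

```python
import argparse


def century(value):
    """Parse a century start such as 1900 (hypothetical validator)."""
    year = int(value)  # a non-numeric value raises ValueError, which argparse reports
    if year < 0 or year % 100 != 0:
        raise argparse.ArgumentTypeError(
            f"{value!r} is not a century start such as 1900"
        )
    return year


def build_parser():
    """Build a stand-in parser; this is NOT the real csvs-to-sqlite CLI."""
    parser = argparse.ArgumentParser(prog="csvs-to-sqlite-sketch")
    parser.add_argument("csv_path")
    parser.add_argument("db_path")
    # Mirrors the -d option used earlier in this issue: columns to parse as dates.
    parser.add_argument("-d", "--date", action="append", default=[])
    # The proposed option: the century assumed for two-digit years.
    parser.add_argument("--date-year-century", type=century, default=None)
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args.date_year_century)
```

Leaving the default as `None` would keep the current behaviour unless the option is passed explicitly.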

From https://github.com/dateutil/dateutil/issues/703#issuecomment-383995842:

"Subclassing parserinfo is the supported, official way to do this for the moment. I don't think that will change at any point in the future."

simonw, Apr 29 '18 00:04
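
A minimal sketch of what that subclassing approach might look like, assuming python-dateutil is installed. It overrides `parserinfo.convertyear`, the hook dateutil calls to expand a parsed year. The class name `FixedCenturyParserInfo`, the helper `parse_with_century` and the sample date strings are hypothetical, and this is not necessarily how csvs-to-sqlite would end up implementing it.

```python
from datetime import datetime

from dateutil import parser
from dateutil.parser import parserinfo


class FixedCenturyParserInfo(parserinfo):
    """Place every two-digit year in one fixed century (hypothetical name)."""

    def __init__(self, century=1900, dayfirst=False, yearfirst=False):
        super().__init__(dayfirst=dayfirst, yearfirst=yearfirst)
        # Stored under its own name so dateutil's internal attributes are untouched.
        self._fixed_century = century

    def convertyear(self, year, century_specified=False):
        # dateutil passes century_specified=True for years written in full,
        # such as 1906 or 2001; those are returned unchanged.
        if year < 100 and not century_specified:
            return self._fixed_century + year
        return year


def parse_with_century(text, century=1900):
    """Parse a date string, pinning two-digit years to the given century."""
    return parser.parse(text, parserinfo=FixedCenturyParserInfo(century))


if __name__ == "__main__":
    print(parse_with_century("9/24/06"))
    print(parse_with_century("9/24/2006"))
```

With this sketch a two-digit year such as `06` resolves to 1906 rather than 2006, while four-digit years pass through untouched, so no monkey-patching of dateutil is needed.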