dateparser
[WIP] Updating CLDR data
- Updating CLDR data to 39.0.0.
- Fixing CLDR download error.
- Updating CLDR data URL: https://github.com/unicode-cldr/cldr-dates-full (archived) -> https://github.com/unicode-org/cldr-json.
TODO :
- Fixing tests
Fixes issue #940
Many tests seem to be wrong. For example, in tests/test_languages.py:805 for Zulu, the current data translates son 23 umasingana 1996 to sunday 23 january 1996, but according to Google Translate, sunday 23 january 1996 is isonto 23 Januwari 1996.
Additionally, some languages, such as the "as" locale, are poorly translated.
This PR fixes those issues but currently, the tests are not updated.
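To illustrate why the expected values in the tests depend on the locale data, here is a toy sketch of token-by-token translation. It is not dateparser's implementation: the `translate` helper and both tables are hypothetical, the "old" entries are taken from the Zulu example above, and the "new" entries are the Google Translate forms quoted above (whether CLDR 39.0.0 contains exactly these forms is an assumption).

```python
# Hypothetical, hand-written token tables for illustration only.
# The real tables are generated from CLDR data by dateparser's scripts.
OLD_ZU = {"son": "sunday", "umasingana": "january"}
NEW_ZU = {"isonto": "sunday", "januwari": "january"}  # assumed forms, see lead-in


def translate(text, table):
    """Lowercase each token and replace it with its English equivalent if known."""
    tokens = (token.lower() for token in text.split())
    return " ".join(table.get(token, token) for token in tokens)


# The old test input only translates with the old table.
print(translate("son 23 umasingana 1996", OLD_ZU))
# With the new table, the old input passes through untranslated.
print(translate("son 23 umasingana 1996", NEW_ZU))
# The new spelling translates with the new table.
print(translate("isonto 23 Januwari 1996", NEW_ZU))
```

A test written against the old table therefore fails once the data changes, even if the new data is the more accurate one.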
@noviluni, please advise whether I should update the tests accordingly.
A review will be helpful.
Thanks
Note: This PR breaks 39 tests.
Hi @gavishpoddar, I created a "guide" to handle this (CLDR updates), but we never started doing it. It would be nice if you read it to see if you missed anything: https://github.com/scrapinghub/dateparser/issues/826
My initial idea was to update version by version, but it's OK if we update directly to the last version as you did. After that we will need to check file by file to see if we are removing things that could generate "breaking changes" (and possibly adding them to our own data), but before starting the review I would like to understand why you removed the "version".
It is really important to point to a specific version and not directly to master, so that we can easily tell which version we are pointing to and update easily in the future (master could be "incomplete" or "wrong"). In the past we had no way to know which version we were using or how outdated we were, so I would like you to reconsider adding back the cldr_version and the repo.git.co(cldr_version) statements. We need to keep this. If it doesn't work because they are tags instead of branches, etc., maybe you need to change the step, but as I mentioned, we need to point to a specific version.
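One possible way to keep the version pinned even when releases are tags rather than branches is sketched below. It assumes the cldr-json repository publishes each release as a git tag named after the version (for example 39.0.0) and that the git command-line client is installed. The helper names `build_clone_command` and `clone_cldr` are hypothetical and not part of dateparser's scripts, and the sketch calls the git CLI through `subprocess` instead of the repo.git.co(cldr_version) statement mentioned above.

```python
import subprocess

CLDR_JSON_URL = "https://github.com/unicode-org/cldr-json"  # URL from the PR description
CLDR_VERSION = "39.0.0"  # version from the PR description; assumed to match the tag name


def build_clone_command(url, version, dest):
    """Build a shallow-clone command pinned to one version.

    `git clone --branch` accepts a tag as well as a branch name,
    so the same command works in both cases.
    """
    return ["git", "clone", "--depth", "1", "--branch", version, url, dest]


def clone_cldr(dest, url=CLDR_JSON_URL, version=CLDR_VERSION):
    """Clone the pinned CLDR version into `dest`; raises if git fails."""
    subprocess.run(build_clone_command(url, version, dest), check=True)
```

Keeping the version in a single constant such as `CLDR_VERSION` also records which release the generated data came from.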
thanks! :)
At this point, 7 tests are failing in tests/test_freshness_date_parser.py. I am unable to fix them; please help, @noviluni.
@gavishpoddar the builds for this PR were not enabled (it's a newish github feature), sorry about that - just enabled them.
Codecov Report
Merging #941 (4580337) into master (507dc6d) will increase coverage by 0.00%. The diff coverage is 100.00%.
@@ Coverage Diff @@
## master #941 +/- ##
=======================================
Coverage 98.29% 98.29%
=======================================
Files 234 234
Lines 2694 2700 +6
=======================================
+ Hits 2648 2654 +6
Misses 46 46
Impacted Files | Coverage Δ | |
---|---|---|
dateparser/languages/locale.py | 98.71% <100.00%> (+0.02%) | :arrow_up: |
Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 507dc6d...4580337.