csv-detective
refactor: use frformat package
Copy of #83, opened from a branch of this repository rather than from a fork, in order to trigger the CI tests.
Context
The fr-format library was developed to share validation functions between validata and csv-detective, and to provide a standard library for validating typical French formats.
The aim of this PR is to replace csv-detective's custom validation with the fr-format implementations.
Refactorings
- code postal and code commune Insee
- code canton, numero departement, code region
- code fantoir
- canton, departement, commune, region, pays
- Latitude_l93, Longitude_l93
- code RNA
Behavior changes
- After noticing Fantoir codes whose prefix starts with two letters, we slightly changed the regex to accept this format (e.g. ZB12A).
- CodeRNA no longer accepts a lowercase 'w' as the first letter; only uppercase 'W' is valid.
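The two behavior changes can be illustrated with a minimal sketch. The patterns below are hypothetical and written only for this illustration; the actual regexes live in fr-format and may differ. The sketch assumes a Fantoir code is a 4-character identifier followed by one key letter, and an RNA code is 'W' followed by 9 alphanumeric characters.

```python
import re

# Hypothetical patterns for illustration only; not the fr-format regexes.

# Assumed "before" shape: one digit or letter, three digits, one key letter.
FANTOIR_OLD = re.compile(r"[0-9A-Z][0-9]{3}[A-Z]")

# Assumed "after" shape: additionally accepts a two-letter prefix
# followed by two digits (e.g. ZB12A).
FANTOIR_NEW = re.compile(r"(?:[0-9A-Z][0-9]{3}|[A-Z]{2}[0-9]{2})[A-Z]")

# Assumed RNA shape: uppercase 'W' only, then 9 alphanumeric characters.
RNA = re.compile(r"W[0-9A-Z]{9}")


def is_fantoir(value: str) -> bool:
    """Return True if `value` matches the relaxed (hypothetical) Fantoir pattern."""
    return FANTOIR_NEW.fullmatch(value) is not None


def is_rna(value: str) -> bool:
    """Return True if `value` matches the case-sensitive (hypothetical) RNA pattern."""
    return RNA.fullmatch(value) is not None
```

Under these assumptions, `is_fantoir("ZB12A")` is accepted although the stricter pattern rejects it, and `is_rna("w123456789")` is rejected because the match is case-sensitive.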
Performance
Performance report ─=≡Σ((( つ•̀ω•́)つ
Testing table with 100000000 rows
                  without fr-format   with fr-format
"8730"
  code_postal     10.27 s             10.62 s
  code_fantoir    10.22 s             10.66 s
  code_commune    10.27 s             10.47 s
"ABCDE"
  code_postal     12.32 s             11.40 s
  code_fantoir    12.24 s             11.37 s
  code_commune    11.73 s             11.33 s
"12345"
  code_postal     11.63 s             11.38 s
  code_fantoir    11.23 s             11.05 s
  code_commune    11.31 s             10.97 s
The differences do not appear to be statistically significant, given the run-to-run variability observed between the two executions.
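One way to make the variability argument explicit is to repeat each measurement and compare the means against their spread. The following is a rough sketch, not the script used for the report above: `bench`, `differs` and `is_code_postal` are hypothetical names, the validator is a stand-in rather than the fr-format implementation, and the "sum of standard deviations" rule is a crude heuristic rather than a formal significance test.

```python
import re
import statistics
import timeit

# Hypothetical stand-in validator; not the fr-format implementation.
_POSTAL = re.compile(r"[0-9]{5}")


def is_code_postal(value: str) -> bool:
    return _POSTAL.fullmatch(value) is not None


def bench(validator, value, n_rows=100_000, repeats=5):
    """Return (mean, stdev) in seconds for validating n_rows copies of value."""
    rows = [value] * n_rows
    timings = timeit.repeat(
        lambda: [validator(v) for v in rows], number=1, repeat=repeats
    )
    return statistics.mean(timings), statistics.stdev(timings)


def differs(a, b):
    """Crude check: the means differ by more than the sum of the stdevs."""
    return abs(a[0] - b[0]) > a[1] + b[1]
```

Calling `bench` once per implementation and passing both results to `differs` gives a quick indication of whether a gap between two timings exceeds the noise of repeated runs.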
Edited on 29 May 2024