introduce alphanumerical sorting policy; fixes #81
This PR fixes #81 by introducing a natural language sorting policy that will be appended to the policy list if applicable. I did not implement a reverse version of it as it would have needed a bit different approach and is not needed probably.
In case there is no sorter found for the turtle-based language code, the default, portable 'C' sorter will sort values alphanumerically (instead of user language sorter). This may or may not be wanted as empty literal tags will be sorted based on this.
Any thoughts, @osma ?
Codecov Report
Merging #82 (bc0d7f8) into master (6dca4ae) will increase coverage by
0.52%. The diff coverage is100.00%.
@@ Coverage Diff @@
## master #82 +/- ##
==========================================
+ Coverage 59.23% 59.76% +0.52%
==========================================
Files 10 10
Lines 920 932 +12
==========================================
+ Hits 545 557 +12
Misses 375 375
| Impacted Files | Coverage Δ | |
|---|---|---|
| skosify/check.py | 96.52% <100.00%> (+0.40%) |
:arrow_up: |
Continue to review full report at Codecov.
Legend - Click here to learn more
Δ = absolute <relative> (impact),ø = not affected,? = missing dataPowered by Codecov. Last update 6dca4ae...bc0d7f8. Read the comment docs.
I'm nervous about the code that changes the locale many times over with setlocale(). The documentation for the locale module advises against that:
The C standard defines the locale as a program-wide property that may be relatively expensive to change. On top of that, some implementation are broken in such a way that frequent locale changes may cause core dumps. This makes the locale somewhat painful to use correctly. [...] It is generally a bad idea to call setlocale() in some library routine, since as a side effect it affects the entire program. Saving and restoring it is almost as bad: it is expensive and affects other threads that happen to run before the settings have been restored.
Mostly Skosify is used as an independent CLI program, but there has been some effort to organize the code so that it can also be used as a library from other Python code. So I think the above is a valid argument against changing the locale many times during operation.
The question on how to do locale dependent sorting is also covered in this StackOverflow question. None of the proposed solutions seem ideal to me, but I would prefer using the icu module. That would introduce a new dependency, though.