python-pdfbox
hint: alternative version parsing implementation
The current version parser has some problems. For example, the sorting of version objects mishandles some edge cases: RC releases are ranked higher than alpha releases, but for pdfbox v3 the order seems to be the other way round.
There are also some minor problems, e.g. pkg_resources.parse_version is called twice (see #29), and pkg_resources is deprecated.
I've written a recipe with a (I think) smarter and more robust implementation that groups releases by major version and then sorts them by date: https://gist.github.com/mara004/881d0c5a99b8444fd5d1d21a333b70f8 It uses a regex directly instead of an HTML parser, which seemed easier and more flexible here.
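
To illustrate the idea (this is not the code from the gist), here is a minimal sketch of grouping by major version and picking the newest entry by date. The listing excerpt, its layout, the dates, and the names `SAMPLE_LISTING`, `ENTRY_RE` and `latest_per_major` are all made up for the example; the real download page may look different and would need an adapted regex.

```python
import re
from collections import defaultdict
from datetime import datetime

# Hypothetical excerpt of a directory listing (layout and dates are invented
# for illustration; the real page may be formatted differently).
SAMPLE_LISTING = """
<a href="pdfbox-app-2.0.26.jar">pdfbox-app-2.0.26.jar</a>  2022-04-20 10:00  8.4M
<a href="pdfbox-app-2.0.27.jar">pdfbox-app-2.0.27.jar</a>  2022-09-29 10:21  8.5M
<a href="pdfbox-app-3.0.0-RC1.jar">pdfbox-app-3.0.0-RC1.jar</a>  2021-03-15 09:00  11M
<a href="pdfbox-app-3.0.0-alpha3.jar">pdfbox-app-3.0.0-alpha3.jar</a>  2022-05-02 12:30  11M
"""

# Capture the full version string, its major component, and the timestamp
# that follows the link on the same line.
ENTRY_RE = re.compile(
    r'href="pdfbox-app-(?P<version>(?P<major>\d+)\.[^"]+?)\.jar"'
    r'.*?(?P<date>\d{4}-\d{2}-\d{2} \d{2}:\d{2})'
)

def latest_per_major(listing):
    """Return {major: version string of the most recently dated release}."""
    groups = defaultdict(list)
    for m in ENTRY_RE.finditer(listing):
        date = datetime.strptime(m["date"], "%Y-%m-%d %H:%M")
        groups[int(m["major"])].append((date, m["version"]))
    # Sorting by date avoids any assumption about alpha/RC precedence.
    return {major: max(entries)[1] for major, entries in groups.items()}
```

Calling `latest_per_major(SAMPLE_LISTING)` selects the alpha release for major version 3 in this sample because it carries the later date, regardless of how a version parser would rank "alpha" against "RC".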