htmldate

Fast and robust date extraction from web pages, with Python or on the command-line

Results 17 htmldate issues

I have mostly tested `htmldate` on a set of English, German and French web pages I had run into by surfing or during web crawls. There are definitely further web...

good first issue
up for grabs

By default, dates before 1995 are considered implausible; however, changing the minimum date does not fix the issue. CLI: `htmldate -u "https://web.archive.org/web/20201205182452/https://www.lesechos.fr/1991/01/saddam-hussein-menace-larabie-saoudite-939083" -vv -min "1990-01-01"` Python: Here is the debugging...

bug
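
To make the expected behaviour concrete, here is a minimal standard-library sketch of a plausibility check with a configurable lower bound. It is not htmldate's implementation: the function name `is_plausible`, the constant `DEFAULT_MIN_DATE`, and the sample article date are assumptions made for illustration, and only the 1995 default comes from the issue text.

```python
from datetime import datetime
from typing import Optional

# Assumed default lower bound, taken from the issue text ("dates before 1995").
DEFAULT_MIN_DATE = datetime(1995, 1, 1)


def is_plausible(
    candidate: datetime,
    min_date: datetime = DEFAULT_MIN_DATE,
    max_date: Optional[datetime] = None,
) -> bool:
    """Return True if candidate lies within [min_date, max_date]."""
    # Fall back to the current time when no upper bound is given.
    upper = max_date if max_date is not None else datetime.now()
    return min_date <= candidate <= upper


# Hypothetical publication date for an article from January 1991.
article_date = datetime(1991, 1, 15)

print(is_plausible(article_date))                                 # False
print(is_plausible(article_date, min_date=datetime(1990, 1, 1)))  # True
```

With the bound lowered to 1990, the 1991 date passes the check, which is the outcome the reporter expected from the `-min` option.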

Is there a way to force htmldate to look for a datetime and not a date, or to prioritise specific extractors over others, e.g. opengraph over url-extraction? Let me give you an example:...

enhancement
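
The request can be pictured as an ordered chain of extractors in which the caller decides the order. The sketch below is purely illustrative and uses only the standard library; `from_opengraph`, `from_url`, `EXTRACTORS`, and `extract_date` are hypothetical names, not part of htmldate, and the sample HTML and URL are invented.

```python
import re
from typing import Callable, Dict, List, Optional

Extractor = Callable[[str, str], Optional[str]]


def from_opengraph(html: str, url: str) -> Optional[str]:
    """Read the article:published_time meta tag, keeping the full timestamp."""
    match = re.search(
        r'<meta[^>]+property="article:published_time"[^>]+content="([^"]+)"',
        html,
    )
    return match.group(1) if match else None


def from_url(html: str, url: str) -> Optional[str]:
    """Read a /YYYY/MM/DD/ pattern from the URL (date only, no time)."""
    match = re.search(r"/(\d{4})/(\d{2})/(\d{2})/", url)
    return "-".join(match.groups()) if match else None


# Hypothetical registry mapping extractor names to functions.
EXTRACTORS: Dict[str, Extractor] = {"opengraph": from_opengraph, "url": from_url}


def extract_date(html: str, url: str, priority: List[str]) -> Optional[str]:
    """Try extractors in the caller's order and return the first hit."""
    for name in priority:
        result = EXTRACTORS[name](html, url)
        if result is not None:
            return result
    return None


html = '<meta property="article:published_time" content="2020-03-02T10:15:00+00:00">'
url = "https://example.org/2020/03/01/post/"

print(extract_date(html, url, ["opengraph", "url"]))  # 2020-03-02T10:15:00+00:00
print(extract_date(html, url, ["url", "opengraph"]))  # 2020-03-01
```

Putting the opengraph extractor first also returns the time component, which touches on the datetime-versus-date part of the question.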

Configuration arguments are available for the Python functions; it would be nice to make them available as command-line arguments as well: - outputformat

enhancement
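
As a rough idea of what exposing such an option could look like, here is a small `argparse` sketch. It does not reflect htmldate's actual command-line interface: the `--outputformat` flag, the `build_parser` and `format_date` helpers, and the `%Y-%m-%d` default are assumptions chosen for illustration.

```python
import argparse
from datetime import datetime


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for a hypothetical date-extraction CLI."""
    parser = argparse.ArgumentParser(description="Hypothetical date extraction CLI")
    parser.add_argument(
        "--outputformat",
        default="%Y-%m-%d",  # assumed default, mirroring an ISO-style date
        help="strftime pattern used to print the extracted date",
    )
    return parser


def format_date(date: datetime, outputformat: str) -> str:
    """Render the date with the user-supplied strftime pattern."""
    return date.strftime(outputformat)


# Simulate a command line instead of reading sys.argv.
args = build_parser().parse_args(["--outputformat", "%d/%m/%Y"])
print(format_date(datetime(2020, 12, 5), args.outputformat))  # 05/12/2020
```

Passing the pattern straight through to the formatting step keeps the command-line option and the Python argument in sync.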

In our testing, the current code produces unreliable results on Wikipedia articles. Sometimes it returns a date, sometimes it doesn't. Wikipedia articles are constantly updated, so @coreydockser and...

question

A short version of the documentation is available straight from GitHub ([README.rst](https://github.com/adbar/htmldate/blob/master/README.rst)), while a more exhaustive one is present in the `docs` folder and online at [htmldate.readthedocs.io](https://htmldate.readthedocs.io). Several problems could...

good first issue
up for grabs

In order to help new contributors it would be nice to add [pre-commit](https://pre-commit.com/) hooks to the repository with the following checks: - black - flake8 - isort - ...? The...

up for grabs
documentation

Dear all, htmldate is now widely used, and it has become apparent that the GPL license is not prevalent among Python packages; its potential implications are also not easily understood...

maintenance

So far only the logs provide info on this. It would be nicer to be able to pinpoint the type (header, element, or text) or even the exact location of...

enhancement