Adrien Barbaresi comments

Results: 412 comments of Adrien Barbaresi

return datetime instead of date

Hi @kvasilopoulos, the example above now works with the additional parameter `deferred_url_extractor`:

```
>>> from htmldate import find_date
>>> url = "https://www.ot.gr/2022/03/23/apopseis/daimonopoiisi/"
>>> find_date(url, outputformat='%Y-%m-%d %H:%M:%S', verbose=True, deferred_url_extractor=True)
'2022-03-23 06:15:58'
```

...
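Since the call above returns a formatted string rather than a `datetime` object, one way to get an actual `datetime` is to parse the result with the same format string. The following is a minimal sketch using only the standard library; the helper name `to_datetime` is hypothetical (not part of htmldate), and it assumes the result may also be `None` when no date is found.

```python
from datetime import datetime

# Same format string as the `outputformat` argument in the call above.
OUTPUT_FORMAT = "%Y-%m-%d %H:%M:%S"


def to_datetime(date_string, fmt=OUTPUT_FORMAT):
    """Parse a formatted date string into a datetime (hypothetical helper).

    None is passed through unchanged, on the assumption that the
    extraction step may not find any date.
    """
    if date_string is None:
        return None
    return datetime.strptime(date_string, fmt)


# The value is copied from the example output above, so no network access is needed.
parsed = to_datetime("2022-03-23 06:15:58")
print(parsed.isoformat())
```

The format string has to match the one passed as `outputformat`, otherwise `strptime` raises a `ValueError`.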

Extend test coverage for json_metadata functions

For further reference, here is a new URL to follow the coverage: https://app.codecov.io/gh/adbar/trafilatura/blobs/master/trafilatura/json_metadata.py

Incompatible input type error parsing urls

@vprelovac I may work further on the `fetch_url()` function in the future. In the meantime, I chose to document how to perform the operation manually; it's in the PR above.

anchor issue

@chakravir Trafilatura tries to work in a generic way, so there is little potential for customization.

Added Coinbase article annotation

@swetepete are you still working on this or should we consider merging/aborting?

CLI: run as server

Note: see the API as described in the corresponding [doc page](https://trafilatura.readthedocs.io/en/latest/usage-api.html).

Additional inflection data for RU & UK

Hi @1over137, thanks for listing the differences. I like your approach, but I'm not sure how to modify the software to improve performance. Judging from your results, most differences come...

Additional inflection data for RU & UK

Thanks for the file. It's not particularly long and I'm not sure I could use it directly (what's the source?), but it shows the problem with the current approach. I've...

Additional inflection data for RU & UK

No worries, thanks for providing additional information. I get your point: although the EN Wiktionary is going to get better over time, using the RU Wiktionary would make sense. As...

Additional inflection data for RU & UK

@1over137 Yes, `greedy` works because it can take into account affixes of up to 2 characters in an unsupervised way. I decided to make it the default for languages for...
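To illustrate the general idea of looking past short affixes, here is a toy sketch. It is not the library's actual implementation: the function `greedy_lookup`, the lemma table, and the minimum stem length are all hypothetical and serve only to show how removing up to 2 characters at either end of a token can recover a known form without language-specific rules.

```python
def greedy_lookup(token, lemma_table, max_affix=2):
    """Return a lemma for `token`, trying short affix removals (toy example).

    The token is first looked up as-is. If that fails, 1 to `max_affix`
    characters are stripped from the end (suffix) or the start (prefix)
    and the remainder is looked up again.
    """
    if token in lemma_table:
        return lemma_table[token]
    for size in range(1, max_affix + 1):
        # Keep a remaining stem of at least 3 characters (arbitrary choice).
        if len(token) <= size + 2:
            break
        for candidate in (token[:-size], token[size:]):
            if candidate in lemma_table:
                return lemma_table[candidate]
    # Nothing matched: fall back to the token itself.
    return token


# Hypothetical lemma table with a single known form.
TABLE = {"walk": "walk"}

print(greedy_lookup("walked", TABLE))
print(greedy_lookup("unknown", TABLE))
```

Being unsupervised, this kind of search can also produce wrong matches, which is one reason to limit the affix length.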