Adrien Barbaresi
In the end `fetch_url()` would remain (as easy mode, without args). The advanced mode would be available as `fetch_response()`. In a second step we could also change that but users...
I confirm that all lists are absent for this page: standard extraction fails on a structure of the type `ul > li > span`. Here is an example: ``` You...
Ongoing work is in #479.
Steps 1 to 3 are now implemented. Feel free to provide feedback or additional functionality with a PR. I'll leave this thread open at least until the next release (and...
Feel free to draft a PR yes, otherwise I'll see when I have time to tackle this.
A rule like `len_algo > 2 * len_text` is brittle but according to the benchmark it's mostly reliable. Edge cases like this one are an open question: How do we...
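To make the rule above concrete, here is a minimal sketch of a length-ratio check. The function names `prefer_fallback` and `pick_extraction` and the `ratio` parameter are hypothetical and only serve the illustration; this is not the library's actual code, just one way such a comparison could look.

```python
def prefer_fallback(own_text: str, fallback_text: str, ratio: float = 2.0) -> bool:
    """Return True when the fallback output is more than `ratio` times longer.

    Hypothetical helper: mirrors a rule of the form `len_algo > 2 * len_text`.
    """
    len_text = len(own_text)       # length of the primary extraction
    len_algo = len(fallback_text)  # length of the fallback algorithm's output
    return len_algo > ratio * len_text


def pick_extraction(own_text: str, fallback_text: str, ratio: float = 2.0) -> str:
    """Choose between the two candidate texts using the length-ratio rule."""
    if prefer_fallback(own_text, fallback_text, ratio):
        return fallback_text
    return own_text
```

The strict inequality means that a fallback exactly twice as long does not win, which is one of the places where a fixed threshold like this gets brittle.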
So far the strategies are standard, "favor_recall" and "favor_precision", all offering a relatively good balance according to the benchmark. I don't plan to tweak it further but feel free...
Hi @hugoobauer, thanks for the PR and your feedback. You're welcome to try out the evaluation script and make changes. The same goes for the data, your example is telling,...
Then we probably need to think twice about the change because it doesn't bring much on the evaluation dataset. Maybe another heuristic is needed, or we could expose your changes...
- Yes, just create another branch, you can comment out the readabilipy function. I'll test it and see if I can make it work, if not we'll remove it. -...