Adrien Barbaresi
Adrien Barbaresi
Fixes #13.
Current state: `ValueError`. The `decode` argument can be removed altogether. Note: use `fetch_response()` instead.
- [ ] Check if all elements in this list are accounted for: `https://developer.mozilla.org/en-US/docs/Web/HTML/Element` - [ ] Check conversion to XML
Use explicit re-exports to prevent linter warnings.
So far part of the information is present in `CONTRIBUTING.md` and in the `tests/` folder. It would be best to regroup all necessary instructions in a documentation page on how...
The package now has support for an optional setting where all requests are routed through a proxy. As described in [this comment](https://github.com/adbar/trafilatura/pull/682#issuecomment-2335199996) it is not currently possible to switch between...
Hi, in cases like "3. Minute" the sentence wrongly ends at "3." according to Punkt. I see you have an effective list of words (notably months) in `packages/tokenizers/punkt_tab/german/collocations.tab` but it...
You might want to revise the sections "Corpus Linguistics" and "Data Analysis", I tried to add the tools accordingly but the distinction is not always clear-cut. **Short pitch** I posted...