Michele Dolfi
Thanks for posting the issue. Indeed, this is something we have to look into. The current problem is that those files are automatically generated from the OpenAPI specs of...
Thanks @davidberenstein1957, we will have a look. I don't remember the exact reason, but if it is minor we will certainly relax it. We only have to be careful...
I just looked it up: we already have `typer=^0.9.0`, see https://github.com/DS4SD/deepsearch-toolkit/blob/main/pyproject.toml#L43
@davidberenstein1957 is this still an issue?
This was actually released just this week, but we still have to propagate the feature to the Toolkit.
Could it be that you are downloading the OCR models? In your code, `do_ocr` is a parameter. Can you please try once with it explicitly set to `False`?
> @dolfim-ibm Like layout and tableformer is there a way to prefetch the OCR models as well?

It depends on the specific OCR engine. As a non-exhaustive summary:
- Tesseract...
I'm a bit confused by the conversation in this issue. I see two possible interpretations: 1. The character read by the HTML backend is wrong, i.e. "https://www.compart.com/en/unicode/U+E157 is...
Thanks @gaspardpetit for this fix. Here are a few notes: 1. Please review the test failures. I think you are missing the pre-commit styling, which could be fixed with `pre-commit run...
I think we both want to achieve the same thing, but we interpret the `tempfile` library documentation in different ways 😉 Let me elaborate on my understanding of it. - When the...
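
For readers following the `tempfile` discussion, here is a minimal sketch of the cleanup behavior of the standard-library helpers. It is not taken from the pull request under review. The file suffix, payload, and variable names are illustrative assumptions, and it only demonstrates when files and directories are removed.

```python
import os
import tempfile

# Default behavior (delete=True): the file exists while the handle is open
# and is removed as soon as the handle is closed.
with tempfile.NamedTemporaryFile(suffix=".txt") as tmp:
    auto_path = tmp.name
    exists_while_open = os.path.exists(auto_path)
auto_exists_after = os.path.exists(auto_path)

# With delete=False the file survives the context manager, so the caller
# becomes responsible for removing it.
with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as tmp:
    tmp.write(b"payload")  # illustrative content only
    kept_path = tmp.name
kept_exists_after = os.path.exists(kept_path)
os.remove(kept_path)  # explicit cleanup by the caller

# TemporaryDirectory removes the directory and everything inside it on exit.
with tempfile.TemporaryDirectory() as tmp_dir:
    inner_path = os.path.join(tmp_dir, "data.bin")
    with open(inner_path, "wb") as fh:
        fh.write(b"x")
dir_exists_after = os.path.exists(tmp_dir)
```

In short, `delete=False` trades automatic cleanup for a path that remains valid after the handle is closed, which matters when another process or library needs to reopen the file by name.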