John

Results: 11 issues by John

Minor/partial refactor of interfaces.py and add tests

Per discussion here (https://github.com/Unstructured-IO/unstructured/pull/1259/files#r1312235977), `add_pytesseract_bbox_to_elements` can be improved by using `pytesseract.image_to_data` and vector math to find the coordinates of elements.
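
As a rough illustration of the idea, the sketch below takes the dictionary that `pytesseract.image_to_data(image, output_type=pytesseract.Output.DICT)` returns (parallel lists under `text`, `left`, `top`, `width`, `height`) and computes the box enclosing the words of one element. The helper name `words_bbox`, the exact-match word search, and the sample data are assumptions made for this example; this is not the code in `add_pytesseract_bbox_to_elements`.

```python
from typing import Dict, Optional, Tuple

BBox = Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def words_bbox(ocr_data: Dict[str, list], element_text: str) -> Optional[BBox]:
    """Return the box enclosing the first run of OCR words equal to element_text.

    `ocr_data` is assumed to have the shape produced by
    pytesseract.image_to_data(..., output_type=pytesseract.Output.DICT).
    Hypothetical helper: matching is exact and whitespace-tokenized.
    """
    target = element_text.split()
    if not target:
        return None

    # Keep only rows that carry a word; block/line rows have empty text.
    words = [(i, w) for i, w in enumerate(ocr_data["text"]) if w.strip()]
    tokens = [w for _, w in words]

    for start in range(len(tokens) - len(target) + 1):
        if tokens[start:start + len(target)] == target:
            idxs = [i for i, _ in words[start:start + len(target)]]
            # Union of the word boxes: min of top-left, max of bottom-right.
            x1 = min(ocr_data["left"][i] for i in idxs)
            y1 = min(ocr_data["top"][i] for i in idxs)
            x2 = max(ocr_data["left"][i] + ocr_data["width"][i] for i in idxs)
            y2 = max(ocr_data["top"][i] + ocr_data["height"][i] for i in idxs)
            return (x1, y1, x2, y2)
    return None


# Hand-written sample standing in for real OCR output.
sample = {
    "text": ["", "Hello", "world", "foo"],
    "left": [0, 10, 60, 10],
    "top": [0, 5, 7, 40],
    "width": [200, 40, 50, 30],
    "height": [100, 12, 10, 12],
}
print(words_bbox(sample, "Hello world"))
```

A real implementation would likely need fuzzy matching, since OCR tokens rarely line up exactly with element text.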

Currently `partition_json` is intended only for deserializing the unstructured JSON outputs/elements and is not included as a file format we accept for partitioning ([see here](https://unstructured-io.github.io/unstructured/bricks.html)). The goal of this issue...

enhancement
json

The pinned version of unstructured-client was changed from `>=0.15.1` to `

bug

Stemming from conversations [here](https://github.com/Unstructured-IO/unstructured/pull/1627#discussion_r1346498859) and [here](https://github.com/Unstructured-IO/unstructured/pull/1652), it would be worthwhile to do our own comparison of language detection packages to see which is best for detecting the language of short...

Best to review commit by commit. This PR is the first for cleaning up the partition params. Fixes in this PR:
- Move non-partitioner modules to `unstructured/partition/utils/`
- Note that...

The ability to set which OCR Agent should be used was added [here](https://github.com/Unstructured-IO/unstructured/pull/2462/files), but no documentation describes it (see [a user asking whether this can be done](https://github.com/Unstructured-IO/unstructured/issues/2958)).

documentation

**Is your feature request related to a problem? Please describe.**
`partition_via_api` does not accept arguments for defining the retry logic to be used by the python client. This means `partition_via_api`...
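
To make the request concrete, here is a minimal sketch of the kind of retry behaviour a caller might want to configure. It is a generic wrapper with exponential backoff, not the client's actual retry mechanism; `call_with_retries` and all of its parameters are hypothetical names chosen for this example.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(
    func: Callable[[], T],
    max_attempts: int = 3,
    initial_delay: float = 0.5,
    backoff: float = 2.0,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying on the given exceptions with exponential backoff.

    Hypothetical helper for illustration only.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= backoff
    raise AssertionError("unreachable")
```

Until such arguments exist, a caller could wrap the call themselves, for example by passing a zero-argument lambda that invokes `partition_via_api` with their own arguments.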

enhancement
ingest

I installed `pyonenote` via `pip install pyonenote` and tried running `pyonenote -f example-docs/QuickNotes.one` and got the following error:
```
Traceback (most recent call last):
  File ".../pyonenote", line 8, in <module>
    sys.exit(main())
...
```

bug