unstructured
'hi_res' and 'fast' strategies taking more time than expected for larger files
- Large files (such as 'covid19treatmentguidelines2.pdf', attached below) cannot be processed in a reasonable amount of time. This file takes around 20 minutes to process.
from unstructured.partition.pdf import partition_pdf
elements = partition_pdf(file_path, strategy="hi_res")
- The 'yolox_quantized' model does not run faster as expected (or as described in the documentation).
from unstructured.partition.auto import partition
elements = partition(filename=filename,
                     strategy="hi_res",
                     hi_res_model_name="yolox")
Versions used for the above scenario: unstructured-inference==0.7.24, unstructured==0.12.4, pillow-heif==0.15.0
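To make the slowdown easier to reproduce and compare, a small timing harness along these lines might help. It is only a sketch: `time_call` and `compare_strategies` are hypothetical helper names, not part of unstructured, and the partition function is passed in as an argument so the helper makes no assumptions about the library beyond the `filename` and `strategy` keyword arguments already used above.

```python
import time
from typing import Any, Callable, Dict, Sequence, Tuple


def time_call(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Run fn once and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def compare_strategies(
    partition_fn: Callable[..., Sequence[Any]],
    file_path: str,
    strategies: Sequence[str] = ("fast", "hi_res"),
    **extra_kwargs: Any,
) -> Dict[str, Tuple[int, float]]:
    """Time partition_fn once per strategy.

    Returns {strategy: (number_of_elements, seconds)}.
    Any extra keyword arguments are forwarded unchanged to partition_fn.
    """
    timings: Dict[str, Tuple[int, float]] = {}
    for strategy in strategies:
        elements, seconds = time_call(
            partition_fn, filename=file_path, strategy=strategy, **extra_kwargs
        )
        timings[strategy] = (len(elements), seconds)
    return timings


# Example usage (requires unstructured to be installed):
# from unstructured.partition.pdf import partition_pdf
# for name, (count, secs) in compare_strategies(
#         partition_pdf, "covid19treatmentguidelines2.pdf").items():
#     print(f"{name}: {count} elements in {secs:.1f}s")
```

The same helper could be run with `strategies=("hi_res",)` and different `hi_res_model_name` values passed as extra keyword arguments, to compare 'yolox' against 'yolox_quantized' on the same file.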