
bug/application/octet-stream not supported

Open jeremydiba opened this issue 4 months ago • 0 comments

Describe the bug When calling the API with a .tif file, the request fails with the error "detail": "File type application/octet-stream is not supported."
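
A quick local check can help narrow this down. The sketch below assumes (this is not confirmed anywhere in the report) that the server infers the file type from the file name or the leading bytes and falls back to application/octet-stream when neither is recognized. It only uses the Python standard library; `looks_like_tiff` and `describe_upload` are hypothetical helper names, not part of the client library.

```python
import mimetypes

# Magic numbers that open a classic TIFF file (little- and big-endian).
# BigTIFF uses a different version number and is not covered here.
TIFF_SIGNATURES = (b"II*\x00", b"MM\x00*")


def looks_like_tiff(data: bytes) -> bool:
    """Return True if the payload starts with a classic TIFF signature."""
    return data[:4] in TIFF_SIGNATURES


def describe_upload(data: bytes, filename: str) -> dict:
    """Summarize what a type detector could infer from the name and the bytes."""
    guessed_type, _ = mimetypes.guess_type(filename)
    return {
        "file_name": filename,
        "type_from_name": guessed_type,  # None when the extension is unknown
        "tiff_signature": looks_like_tiff(data),
        "size_bytes": len(data),
    }
```

Calling `describe_upload(data, filename)` right before building the request shows whether the bytes in `data` really are a TIFF and whether `filename` still carries its extension. If both look correct, the rejection is more likely on the server side.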

To Reproduce My code snippet (to show params being passed)

    if "strategy" not in kwargs:
        kwargs["strategy"] = "auto"
    if "chunking_strategy" not in kwargs:
        kwargs["chunking_strategy"] = "by_title"
    if "combine_under_n_chars" not in kwargs and kwargs["chunking_strategy"] == "by_title":
        kwargs["combine_under_n_chars"] = 500
    if "coordinates" not in kwargs:
        kwargs["coordinates"] = True
    if "languages" not in kwargs:
        kwargs["languages"] = ["eng"]
    if "max_characters" not in kwargs:
        kwargs["max_characters"] = 4000
    if "unique_element_ids" not in kwargs:
        kwargs["unique_element_ids"] = False
    if "pdf" in filename and "split_pdf_page" not in kwargs:
        kwargs["split_pdf_page"] = True
    if "pdf" in filename and "split_pdf_concurrency_level" not in kwargs:
        kwargs["split_pdf_concurrency_level"] = 10
    if "include_orig_elements" not in kwargs:
        kwargs["include_orig_elements"] = True
        
    req = operations.PartitionRequest(
        partition_parameters=shared.PartitionParameters(
            files=shared.Files(
                content=data,
                file_name=filename,
            ),
            **kwargs)
    )

    res = client.general.partition(request=req)
    return res
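
To see exactly which parameters end up in the request for a .tif file, the defaulting logic above can be isolated from the API call. This is a minimal sketch that mirrors the same defaults with `dict.setdefault`; `apply_defaults` is a hypothetical name, and the PDF check is kept as the same substring test used in the snippet.

```python
def apply_defaults(filename: str, **kwargs) -> dict:
    """Return a copy of kwargs with the same defaults as the snippet above."""
    params = dict(kwargs)
    params.setdefault("strategy", "auto")
    params.setdefault("chunking_strategy", "by_title")
    # combine_under_n_chars is only defaulted for by_title chunking.
    if params["chunking_strategy"] == "by_title":
        params.setdefault("combine_under_n_chars", 500)
    params.setdefault("coordinates", True)
    params.setdefault("languages", ["eng"])
    params.setdefault("max_characters", 4000)
    params.setdefault("unique_element_ids", False)
    # Same substring test as the original snippet, so .tif files skip these.
    if "pdf" in filename:
        params.setdefault("split_pdf_page", True)
        params.setdefault("split_pdf_concurrency_level", 10)
    params.setdefault("include_orig_elements", True)
    return params
```

Printing `apply_defaults("scan.tif")` gives the full parameter set sent with the failing request, with no `split_pdf_*` keys present.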

Expected behavior The .tif document is parsed successfully.


Environment Info Python 3.11 runtime


jeremydiba, Sep 30 '24 20:09