unstructured
feat: when requests is used, a kwargs parameter may be needed to support requests-specific params like verify
Is your feature request related to a problem? Please describe.
I have been digging into the LangChain code. The UNSTRUCTURED_API_URL path appears to use partition_multiple_via_api from unstructured.partition.api.
Details: https://github.com/langchain-ai/langchain/issues/21488
The user is calling a private URL with a private SSL certificate, so the request needs to skip SSL verification.
Currently, partition_multiple_via_api offers no way to do that.
Describe the solution you'd like
It seems to me that we might consider accepting **kwargs to support this scenario. requests supports these parameters: https://github.com/psf/requests/blob/2d5f54779ad174035c5437b3b3c1146b0eaf60fe/src/requests/api.py#L14
Describe alternatives you've considered
Perhaps we could reuse **request_kwargs, which already carries the request data. We could pull the requests-specific parameters out of **request_kwargs and pass them to requests.
partition_multiple_via_api: https://github.com/Unstructured-IO/unstructured/blob/e4c895923d6f8d1bbfe7baa8abc47dbe833aaacc/unstructured/partition/api.py#L105
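To make the alternative concrete, here is a minimal sketch of splitting **request_kwargs into transport options and form data. The helper name `split_request_kwargs` and the option list are hypothetical, not part of unstructured; `verify`, `cert`, `proxies`, and `timeout` are real keyword arguments of `requests.post`.

```python
from typing import Any, Dict, Tuple

# Assumption: the requests options worth forwarding. This list is illustrative,
# not something unstructured defines.
REQUESTS_OPTION_NAMES = ("verify", "cert", "proxies", "timeout")


def split_request_kwargs(
    request_kwargs: Dict[str, Any],
) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Separate requests transport options from the API form data."""
    transport = {k: v for k, v in request_kwargs.items() if k in REQUESTS_OPTION_NAMES}
    data = {k: v for k, v in request_kwargs.items() if k not in REQUESTS_OPTION_NAMES}
    return transport, data


transport_options, form_data = split_request_kwargs(
    {"strategy": "hi_res", "verify": False, "timeout": 30}
)
# Inside the partition function, the call could then look like:
#   requests.post(api_url, headers=headers, data=form_data, files=files, **transport_options)
```

One drawback of this approach is that a form field named like a requests option would be silently diverted, which is an argument for a separate, explicit parameter instead.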
Additional context Add any other context or screenshots about the feature request here.
The unstructured-python-client is the preferred way to call the API now. @awalker4 - do you know if the client supports turning off SSL verification?
Separately, I added an issue internally to update the LangChain loaders to use the API client instead of partition_via_api.
We can pass a custom requests.Session to the unstructured-client like this. We can certainly add a flag to the Loader to set this up before the partition call.
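For readers without the link, a rough sketch of the idea, assuming a release of unstructured-client whose constructor accepts a requests.Session through `client=` (newer releases may expect a different HTTP client type, so check the version you have installed). The function names and the placeholder arguments are hypothetical.

```python
import requests


def build_unverified_session() -> requests.Session:
    """Return a session that skips TLS certificate verification."""
    session = requests.Session()
    session.verify = False  # applies to every request made through this session
    return session


def build_client(server_url: str, api_key: str):
    # Assumption: this unstructured-client version takes `client`, `server_url`,
    # and `api_key_auth` keyword arguments. Imported lazily so the session
    # helper above works without the package installed.
    from unstructured_client import UnstructuredClient

    return UnstructuredClient(
        client=build_unverified_session(),
        server_url=server_url,
        api_key_auth=api_key,
    )


http_client = build_unverified_session()
```

If the private CA certificate is available, setting `session.verify` to the path of a CA bundle file keeps verification on while trusting the private certificate, which is safer than disabling it.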
Thanks! Please reference the link @awalker4 provided above if you need to use the API without SSL verification.
If I use partition_multiple_via_api or partition_via_api from unstructured.partition.api, there is no way to pass client=http_client as in @awalker4's solution.
So what should I do?
Hi @qingdengyue - we recommend using the client library directly, and you can use elements_from_json from the unstructured library to convert the JSON output to unstructured elements if you need to.