
ValueError: Invalid file union\Book1.csv. The FileType.UNK file type is not supported in partition.

Open ragvendra3898 opened this issue 1 year ago • 1 comments

Hi, I am using DirectoryLoader as the document loader, and for some CSV files I get the error below:

ValueError: Invalid file union\Book1.csv. The FileType.UNK file type is not supported in partition.

Can anyone please suggest how to fix this? I would be very thankful.

Thank You

ragvendra3898 avatar Apr 17 '23 06:04 ragvendra3898

This seems to be similar to this issue: https://github.com/imClumsyPanda/langchain-ChatGLM/issues/46

dsx1986 avatar Apr 24 '23 17:04 dsx1986

Thanks dsx1986

ragvendra3898 avatar Apr 25 '23 17:04 ragvendra3898

I think my virtual environment was not created successfully, and that is why I was getting this error. After I created a fresh environment with updated packages, the issue was fixed.

Thanks

ragvendra3898 avatar Apr 25 '23 17:04 ragvendra3898

LangChain uses unstructured to determine file types.

For unstructured to detect file types correctly, libmagic must be installed on the system:

https://github.com/Unstructured-IO/unstructured/blob/3c3c59a726582cbf1d1bd5bbfe5ad015d4a3c1f6/unstructured/file_utils/filetype.py#L187-L195

On Debian, you need the libmagic-mgc and libmagic1 packages for this to work correctly.
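
To check whether libmagic is visible from your Python environment, something like the sketch below may help. It uses only the standard library. It assumes that unstructured reaches libmagic through the python-magic package (module name `magic`), and the function name `libmagic_status` is just made up for illustration. It does not tell you for certain what unstructured will do, only whether the pieces it probably needs can be found.

```python
import ctypes.util
import importlib.util


def libmagic_status():
    """Report whether the libmagic shared library and the `magic` module can be found.

    Hypothetical helper for illustration; not part of LangChain or unstructured.
    """
    return {
        # Path or name of the shared library if the system linker can locate it, else None.
        "libmagic_shared_library": ctypes.util.find_library("magic"),
        # True if a module named `magic` is importable (assumed here to be python-magic).
        "python_magic_module": importlib.util.find_spec("magic") is not None,
    }


if __name__ == "__main__":
    for name, value in libmagic_status().items():
        print(f"{name}: {value}")
```

If `libmagic_shared_library` comes back as `None`, installing the system packages mentioned above is probably the first thing to try.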

zioproto avatar May 02 '23 19:05 zioproto