
Make DirectoryLoader read files in parallel to reduce file reading time

Open talhaanwarch opened this issue 2 years ago • 2 comments

How can I read the files in parallel to speed up the process? The relevant code is here: https://github.com/hwchase17/langchain/blob/f3ec6d2449f3fe0660e4452bd4ce98c694dc0638/langchain/document_loaders/directory.py#L74

talhaanwarch avatar May 03 '23 12:05 talhaanwarch

How big are your files?

N-E-W-T-O-N avatar May 03 '23 13:05 N-E-W-T-O-N

I will try to add multithreading for it.
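
For illustration only, here is a minimal sketch of what multithreaded directory reading could look like in plain Python. It does not use or modify `DirectoryLoader`; the function names `read_text_file` and `load_directory_parallel`, the default glob pattern, and the worker count are all hypothetical choices made for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def read_text_file(path: Path) -> tuple[str, str]:
    """Read a single file and return (path as string, file contents)."""
    # errors="replace" keeps one badly encoded file from aborting the whole run.
    return str(path), path.read_text(encoding="utf-8", errors="replace")


def load_directory_parallel(
    directory: str,
    glob: str = "**/*.txt",  # hypothetical default, chosen for this sketch
    max_workers: int = 8,    # hypothetical default, tune for your disk and files
) -> dict[str, str]:
    """Read every file matching `glob` under `directory` using a thread pool."""
    # Collect the matching files first so the work can be split across threads.
    paths = sorted(p for p in Path(directory).glob(glob) if p.is_file())
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `paths`.
        return dict(pool.map(read_text_file, paths))
```

Threads tend to help here because reading files is mostly I/O-bound. If the per-file work is dominated by CPU-heavy parsing instead, a process pool may be the better fit.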

PawelFaron avatar May 04 '23 11:05 PawelFaron

Hi, @talhaanwarch! I'm Dosu, and I'm helping the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you opened this issue to improve the speed of file reading by implementing parallel file reading in the DirectoryLoader class. There was some discussion in the comments, with N-E-W-T-O-N asking about the size of the files and PawelFaron offering to try adding multithreading to address the issue.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain repository!

dosubot[bot] avatar Sep 12 '23 16:09 dosubot[bot]