trt-llm-rag-windows

Large number of files gets stuck before loading

Open jeffersoncgo opened this issue 1 year ago • 1 comments

I'm using Chat With RTX, and I'm trying to load my whole disk X, which is a mounted network disk.

It has approximately 200-300 GB of content, mostly scripts.

I left it loading the data for about 3 days, and the terminal showed no change. So I modified the SimpleDirectoryLoader function to log each file it finds, and realized that it just gets stuck after some time while it is still looking for files: not in the load_data function, but in _add_files.

I added the line:

print(f"Added {ref} to the list of files to process.", flush=True)

After running for a while, it just stops logging and doesn't do anything, and it always stops at the same files.

It doesn't give any error, close the terminal, or anything; it just stops.
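
One way to check whether the directory traversal itself stalls on the network disk, independent of Chat With RTX, is to walk the same path with a standalone script. The following is only a minimal sketch: `walk_with_progress` and its `report_every` parameter are hypothetical names for illustration and are not part of Chat With RTX or its loader.

```python
import os
import time


def walk_with_progress(root, report_every=1000):
    """Walk `root` and print periodic progress so a stall is visible.

    Hypothetical helper for diagnosis only; it reads no file contents.
    Returns the number of files seen and the list of directory errors.
    """
    count = 0
    errors = []
    start = time.monotonic()

    def on_error(err):
        # os.walk silently skips unreadable directories unless onerror is set.
        errors.append(err)
        print(f"Could not read {err.filename}: {err}", flush=True)

    for dirpath, _dirnames, filenames in os.walk(root, onerror=on_error):
        for _name in filenames:
            count += 1
            if count % report_every == 0:
                elapsed = time.monotonic() - start
                print(
                    f"{count} files seen after {elapsed:.1f}s, "
                    f"currently in {dirpath}",
                    flush=True,
                )
    return count, errors
```

If this script also stops at roughly the same point, the problem is more likely in the network share or a specific directory than in the loader; if it completes, the stall is more likely inside the loading code.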


Is there any limit to it?

It always stops at the same file count; if I remove that file, it stops at the next one, at the same number.

Also, if there is a limit, is it possible for me to create multiple embeddings, so that I can build them manually while staying within the limit?
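
To illustrate what building them manually could look like, the file list can be split into fixed-size batches that are each processed separately. This is a minimal sketch under assumptions: `iter_files` and `batched_files` are hypothetical helpers, the batch size is arbitrary, and whether Chat With RTX can build or query several separate embedding sets is not confirmed anywhere in this thread.

```python
import os
from itertools import islice


def iter_files(root):
    """Yield every file path under `root`, one at a time (hypothetical helper)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            yield os.path.join(dirpath, name)


def batched_files(root, batch_size):
    """Yield lists of at most `batch_size` file paths found under `root`."""
    files = iter_files(root)
    while True:
        batch = list(islice(files, batch_size))
        if not batch:
            return
        yield batch
```

Each yielded batch stays under the chosen file count, so it could be handed to whatever step builds the embeddings, one batch at a time.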

jeffersoncgo avatar Feb 21 '24 17:02 jeffersoncgo

We have not verified it with 200-300 GB of data.

anujj avatar May 23 '24 09:05 anujj