private-gpt

ingest.py has been running for a couple of hours on MBP M1 Max, 64 GB. Ingestion size ~15 MByte

arviaja opened this issue 2 years ago · 6 comments

Operating System: macOS 13.3.1 (a)
Hardware: M1 Max chip, 64 GByte RAM
Python: 3.11.3

Question: Is the behavior described below (ingesting for 4 hours and still going) normal? Or is it just caught in an endless loop?

More Details:

Number of files: 13, each ~700 KB. File source: Alphabet quarterly reports as PDFs, as can be found here: https://abc.xyz/investor/

Total size ~15 MByte.

I installed according to the README.

.env:

PERSIST_DIRECTORY=db
LLAMA_EMBEDDINGS_MODEL=models/ggml-model-q4_0.bin
MODEL_TYPE=GPT4All
MODEL_PATH=/[redacted]/privateGPT/models/ggml-gpt4all-j-v1.3-groovy.bin
MODEL_N_CTX=1000
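
For context, here is a minimal sketch of how a .env file like this is typically consumed. privateGPT loads it with python-dotenv, but the exact variable handling inside ingest.py may differ, so treat this as illustrative only:

```python
# Illustrative only: reading the .env above with python-dotenv.
import os
from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from ./.env into the environment

persist_directory = os.environ.get("PERSIST_DIRECTORY", "db")
embeddings_model = os.environ.get("LLAMA_EMBEDDINGS_MODEL")
model_type = os.environ.get("MODEL_TYPE", "GPT4All")
model_path = os.environ.get("MODEL_PATH")
model_n_ctx = int(os.environ.get("MODEL_N_CTX", "1000"))
```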

I started ingestion this morning and it's still going. The fan is spinning, which my MacBook rarely does. Sample output:

llama_print_timings:        load time =   416.01 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =  6033.13 ms /   144 tokens (   41.90 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =  6063.73 ms

llama_print_timings:        load time =   416.01 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =  6978.30 ms /   164 tokens (   42.55 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =  7013.74 ms

llama_print_timings:        load time =   416.01 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time =   529.93 ms /    13 tokens (   40.76 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time =   533.35 ms

Is it normal that ingestion takes this long? Or is it just caught in an endless loop?
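
For a rough sanity check, the per-token timings above can be extrapolated to the whole corpus. A back-of-the-envelope sketch: the ~42 ms/token figure comes from the logs, while the ~4 characters per token and the amount of text extracted per PDF are rough assumptions:

```python
# Back-of-the-envelope ingestion-time estimate from the logged embedding speed.

def estimate_hours(text_bytes: int,
                   ms_per_token: float = 42.0,     # from llama_print_timings above
                   chars_per_token: float = 4.0):  # common rule of thumb (assumed)
    tokens = text_bytes / chars_per_token
    return tokens * ms_per_token / 1000 / 3600

# Assume each ~700 KB PDF yields roughly 200 KB of extracted text, 13 files:
print(f"~{estimate_hours(13 * 200_000):.1f} hours")  # prints ~7.6 hours
```

On those assumptions, several hours for this corpus is slow but plausible, rather than a sign of an endless loop.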

arviaja · May 15 '23 10:05

Yes, ingestion can take quite a while. I ingested some SEC filings as well (~250k each, give or take), and two of them can take around an hour. Oddly enough, I tried simple text (Alice in Wonderland and The Time Machine) and they took about 20 minutes. Same setup: Mac M1, 64 GB memory.

dennydream · May 15 '23 15:05

I've been trying the provided sample file and my numbers are terrible! I'm running on a MacBook Air M1 with just 8 GB RAM, though.

llama_print_timings:        load time =  3844.66 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 56510.32 ms /   114 tokens (  495.70 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 56687.82 ms

initd1 · May 15 '23 15:05

> Yes, ingestion can take quite a while. I ingested some SEC filings as well (~250k each, give or take), and two of them can take around an hour. Oddly enough, I tried simple text (Alice in Wonderland and The Time Machine) and they took about 20 minutes. Same setup: Mac M1, 64 GB memory.

@dennydream After ingestion, how long do the queries take? Have you noticed any limitations post-ingestion?

initd1 · May 15 '23 15:05

Thanks, guys. I also have a semi-OK gaming machine with an 8 GB NVIDIA RTX 3070. Do I just need a GPU-optimized model instead of the CPU models for that? I can't find a binary in the GPT4All repo.

arviaja · May 15 '23 16:05

I ingested one of the 10-Ks from your link. It took about 18 minutes. I think these models are CPU-only, by the way. I haven't figured out a way to use GPUs for this.

dennydream · May 15 '23 17:05
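
On the GPU question above: the GGML models used here run on the CPU, but llama.cpp-based stacks do support offloading layers to a GPU. A hedged sketch of what that looks like with llama-cpp-python, assuming a wheel built with CUDA (or Metal) support; this is not privateGPT's stock path, and the model path is illustrative:

```python
# Hedged sketch: GPU layer offload via llama-cpp-python. Requires a build
# compiled with CUDA/Metal support; on a CPU-only build, n_gpu_layers has
# no effect. The model path below is illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="models/ggml-model-q4_0.bin",  # any llama.cpp-format model
    n_gpu_layers=32,  # number of transformer layers to push to the GPU
    n_ctx=1000,
)
out = llm("Q: What does a 10-K filing contain? A:", max_tokens=64)
print(out["choices"][0]["text"])
```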

> I ingested one of the 10-Ks from your link. It took about 18 minutes. I think these models are CPU-only, by the way. I haven't figured out a way to use GPUs for this.

Thanks for the update; that's what I was thinking. I guess I'll kill the job; it's still running.

arviaja · May 15 '23 17:05

Ingested 534 KByte in 38 minutes (MacBook Pro M1 Max, 64 GByte RAM).

arviaja · May 15 '23 19:05

Closing this, since it works as intended. It just needs some computing capacity...

arviaja · May 15 '23 20:05

> > Yes, ingestion can take quite a while. I ingested some SEC filings as well (~250k each, give or take), and two of them can take around an hour. Oddly enough, I tried simple text (Alice in Wonderland and The Time Machine) and they took about 20 minutes. Same setup: Mac M1, 64 GB memory.

> @dennydream After ingestion, how long do the queries take? Have you noticed any limitations post-ingestion?

Queries seem to take anywhere from 20-ish seconds to 40+. I just asked: "what is the value of Alphabet's gross profits in 2022". It took ~40 seconds and gave me an answer about gross revenue of 242 billion. The answers it comes back with are not all that useful, even when I explicitly use text from the 10-K. So I was wondering if anybody got useful answers back.

dennydream · May 16 '23 21:05

No good answers for me, either. I don't know enough to say whether it just needs much more training. Let's not forget that GPT-4 was trained for years on big machines.

arviaja · May 16 '23 21:05

> No good answers for me, either. I don't know enough to say whether it just needs much more training. Let's not forget that GPT-4 was trained for years on big machines.

I did similar things with ChatGPT, and it was hit and miss. I used the HTML version of the SEC filings, which still had some XBRL in it, plus probably some unprintable Unicode characters. But it did come back with some decent responses, and embedding was pretty fast; so were queries.

dennydream · May 16 '23 21:05

> > No good answers for me, either. I don't know enough to say whether it just needs much more training. Let's not forget that GPT-4 was trained for years on big machines.

> I did similar things with ChatGPT, and it was hit and miss. I used the HTML version of the SEC filings, which still had some XBRL in it, plus probably some unprintable Unicode characters. But it did come back with some decent responses, and embedding was pretty fast; so were queries.

I guess you should use the MPT-7B model from the gpt4all.io website for better results; the default model is dumb af.

ProxyAyush · May 30 '23 12:05