Colab?
Anyone got an ipynb set up to run this? I don't have enough RAM on my laptop to run it. I tried installing on Colab but had no luck; it wouldn't load the model file.
Seems to work for me. Before running !python ingest.py, rename example.env to .env. Let me know if you encounter issues.
https://colab.research.google.com/drive/1y5bYVCsvNSNjX3LrLi0REFitUqOloSJN?usp=sharing
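For reference, a minimal sketch of those setup steps as Colab cells (the repo URL and file names here are assumptions based on the standard private-gpt layout; adjust paths if your copy differs):

```
# clone the repo and install dependencies (assumed standard setup)
!git clone https://github.com/imartinez/privateGPT.git
%cd privateGPT
!pip install -r requirements.txt

# rename example.env to .env before ingesting, as noted above
!cp example.env .env

# ingest the source documents, then start the chat loop
!python ingest.py
!python privateGPT.py
```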
I'm not sure Colab will handle it well even after ingesting the data, because running PrivateGPT seems more resource-intensive than ingestion.
I used my own computer because it's about twice as fast as Colab at ingesting data (I have an i7-7700K CPU and 32GB RAM). I then ran PrivateGPT, gave it a prompt, and have been waiting five minutes with no answer; my CPU is at 50% and RAM keeps growing (it's at 14GB right now).
So I think using it in Colab might not work out so well. I'm hoping someone will optimize PrivateGPT in the near future to be less resource-intensive.
UPDATE: It answered after I finished writing this. So about 5 minutes to get an answer, and the answer itself also takes a while to generate (3+ minutes so far).
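If you want to watch that RAM growth yourself (in Colab or locally), a quick check with psutil (which ships with Colab) looks something like this; it's a monitoring sketch, not part of PrivateGPT:

```
import psutil

# report system-wide memory pressure while PrivateGPT is running
mem = psutil.virtual_memory()
print(f"RAM used: {mem.used / 2**30:.1f} GiB of {mem.total / 2**30:.1f} GiB ({mem.percent}%)")
```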
re: [voidxd] (thanks, man!) Yep, just to verify: !python ingest.py ran OK (took over 2.5 hours), but !python privateGPT.py crapped out after the prompt. Output:
```
llama.cpp: loading model from models/ggml-model-q4_0.bin
llama.cpp: can't use mmap because tensors are not aligned; convert to new format to avoid this
llama_model_load_internal: format     = 'ggml' (old version with low tokenizer quality and no mmap support)
llama_model_load_internal: n_vocab    = 32000
llama_model_load_internal: n_ctx      = 1000
llama_model_load_internal: n_embd     = 4096
llama_model_load_internal: n_mult     = 256
llama_model_load_internal: n_head     = 32
llama_model_load_internal: n_layer    = 32
llama_model_load_internal: n_rot      = 128
llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama_model_load_internal: n_ff       = 11008
llama_model_load_internal: n_parts    = 1
llama_model_load_internal: model size = 7B
llama_model_load_internal: ggml ctx size = 4113748.20 KB
llama_model_load_internal: mem required  = 5809.33 MB (+ 2052.00 MB per state)
...................................................................................................
llama_init_from_file: kv self size  = 1000.00 MB
AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | VSX = 0 |
Using embedded DuckDB with persistence: data will be stored in: db
gptj_model_load: loading model from 'models/ggml-gpt4all-j-v1.3-groovy.bin' - please wait ...
gptj_model_load: n_vocab = 50400
gptj_model_load: n_ctx   = 2048
gptj_model_load: n_embd  = 4096
gptj_model_load: n_head  = 16
gptj_model_load: n_layer = 28
gptj_model_load: n_rot   = 64
gptj_model_load: f16     = 2
gptj_model_load: ggml ctx size = 4505.45 MB
gptj_model_load: memory_size = 896.00 MB, n_mem = 57344
gptj_model_load: ................................... done
gptj_model_load: model size = 3609.38 MB / num tensors = 285

llama_print_timings:        load time =  9417.93 ms
llama_print_timings:      sample time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings: prompt eval time = 10960.65 ms /    10 tokens ( 1096.06 ms per token)
llama_print_timings:        eval time =     0.00 ms /     1 runs   (    0.00 ms per run)
llama_print_timings:       total time = 10970.86 ms
gpt_tokenize: unknown token '�'
gpt_tokenize: unknown token '�'
gpt_tokenize: unknown token '�'
^C
```
The 12GB of RAM Colab allows peaked, then it crashed after about 5 minutes.
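Those numbers roughly explain the crash. Summing just the figures the log reports gives a back-of-the-envelope estimate that already exceeds Colab's free-tier RAM (this ignores Python, DuckDB, and embedding overhead, so the real footprint is higher):

```
# rough memory budget from the log output above (all values in MB)
llama_mem = 5809.33 + 2052.00  # "mem required" plus one state
llama_kv  = 1000.00            # "kv self size"
gptj_ctx  = 4505.45            # gptj "ggml ctx size"

total_mb = llama_mem + llama_kv + gptj_ctx
print(f"~{total_mb / 1024:.1f} GB before any Python/DuckDB overhead")
# ~13.1 GB, already over the ~12 GB the free Colab tier provides
```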
Glad it worked well enough for you to test it out. Thanks for posting the results; this way we all know the free version of Colab won't work.
Perhaps the paid version is a viable option, since I think it has more RAM, and you wouldn't even use up GPU credits, since you're only using the CPU and just need the RAM.
Yeah, you always want to see what you can get away with for free with this bleeding-edge stuff. So we all 'need bigger boats': to legitimately run this on consumer-grade hardware, it looks like you need at minimum an i7/i9 (or the AMD equivalent) with at least 32GB of RAM. Anyway, it's amazing we're this close to private LLM apps. This project is also worth a look: https://colab.research.google.com/drive/16QMQePkONNlDpgiltOi7oRQgmB8dU5fl?usp=sharing
Running privateGPT.py on Colab's free tier takes a lot of time; it's better to quantize the model.
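For anyone wanting to try that, llama.cpp ships a quantize tool. A rough sketch of the usual invocation as Colab cells follows; the file paths are assumptions, and note the log above shows the LLaMA model is already Q4_0, so this mainly applies if you start from an f16 checkpoint:

```
# build llama.cpp and quantize an f16 GGML checkpoint down to 4-bit (q4_0)
!git clone https://github.com/ggerganov/llama.cpp
%cd llama.cpp
!make
!./quantize ./models/ggml-model-f16.bin ./models/ggml-model-q4_0.bin q4_0
```

Quantizing to 4-bit roughly quarters the model's disk and memory footprint compared to f16, which is what makes squeezing these models into a 12GB Colab instance plausible at all.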