
Semaphore leak crashes the program

Open anuj142003 opened this issue 1 year ago • 5 comments

Note: if you'd like to ask a question or open a discussion, head over to the Discussions section and post it there.

Describe the bug and how to reproduce it
On asking for a summary of the document, the program crashes with the error below:

ggml_new_tensor_impl: not enough space in the context's memory pool (needed 8640345568, available 8615054800)
Segmentation fault: 11
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Expected behavior
The program should provide a summary of the document.

Environment (please complete the following information):

  • OS / hardware: macOS (Apple M1)
  • Python version: 3.11.3

anuj142003 avatar May 22 '23 14:05 anuj142003

Same issue. It always fails with this at the second question.

kkkkkxiaofei avatar May 22 '23 15:05 kkkkkxiaofei

Same issue here, and I found that it is related to both chunk_size and chunk_overlap in ingest.py. Basically:

chunk_size: This parameter determines the maximum number of tokens in each chunk. A larger chunk_size will result in fewer, larger chunks, while a smaller chunk_size will result in more, smaller chunks. If you're encountering memory issues, reducing the chunk_size could potentially help. However, making the chunk_size too small could result in the context being lost between chunks. As a starting point, you might try a chunk_size of around 200-300 tokens, but you should adjust this based on your specific needs and the results you observe.

chunk_overlap: This parameter determines the number of tokens that consecutive chunks share. A larger chunk_overlap helps ensure that context is preserved between chunks, but it also increases the amount of redundancy in your data. If your documents contain a lot of important context that spans multiple chunks, a larger chunk_overlap might be beneficial. However, if your documents are relatively self-contained and don't need much cross-chunk context, a smaller chunk_overlap might be sufficient. Again, you should adjust this based on your specific needs and the results you observe.

However, a smaller chunk_size can make the answers much less accurate: if the chunks are too small, the context necessary for generating accurate responses might be lost.
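For reference, here is a minimal sketch of where these parameters can be adjusted, assuming ingest.py splits documents with LangChain's RecursiveCharacterTextSplitter (the splitter class, the loader, and the values below are illustrative assumptions, not the project's exact code):

from langchain.docstore.document import Document
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Illustrative input; in ingest.py the documents come from the file loaders.
documents = [Document(page_content=open("source_documents/sample.txt").read())]

# Hypothetical values: a smaller chunk_size lowers peak memory per prompt,
# while a larger chunk_overlap preserves more context across chunk boundaries
# at the cost of redundancy. Tune both for your documents and hardware.
text_splitter = RecursiveCharacterTextSplitter(chunk_size=300, chunk_overlap=30)
texts = text_splitter.split_documents(documents)
print(f"Split {len(documents)} document(s) into {len(texts)} chunks")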

flyersworder avatar May 22 '23 17:05 flyersworder

I have the same issue. None of the chunk size or overlap changes made a difference.

gpt_tokenize: unknown token 'n'
gpt_tokenize: unknown token 's'
gpt_tokenize: unknown token 'w'
gpt_tokenize: unknown token 'e'
gpt_tokenize: unknown token 'r'
gpt_tokenize: unknown token ':'
[1] 69430 segmentation fault /usr/local/bin/python3 privateGPT.py
/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/resource_tracker.py:224: UserWarning: resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown
  warnings.warn('resource_tracker: There appear to be %d '

Pawel-Zygler avatar May 22 '23 18:05 Pawel-Zygler

Same issue, M1 macOS.

StableInquest avatar May 23 '23 01:05 StableInquest

Check out this discussion. With it I have been able to run most of the models that would not run before, and they are not crashing:

https://huggingface.co/TheBloke/MPT-7B-Instruct-GGML/discussions/2

llama-cpp-python==0.1.53
ctransformers==0.2.0
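If you want to try the same combination, pinning both packages in a single pip command should work (assuming a standard pip setup; adjust for your environment):

pip install llama-cpp-python==0.1.53 ctransformers==0.2.0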

gaurav-cointab avatar May 23 '23 05:05 gaurav-cointab