Any way to change size and amount of embeddings?

Open colrobloxkid69420 opened this issue 2 years ago • 4 comments

Right now each embedding is a maximum of 500 tokens and there are at most 4 embeds. What if I want each embed to include fewer tokens, but the AI to use more of them?

colrobloxkid69420, May 29 '23 19:05

The values for embedding size and overlap are in both ingest.py and privategpt.py:

chunk_size = 500
chunk_overlap = 50

These values are measured in characters (bytes for plain ASCII text), not tokens.
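
To make the effect of those two numbers concrete, here is a minimal sketch of character-based chunking with overlap. The `split_into_chunks` helper is hypothetical and written only for illustration; it is not the splitter the project uses, which may also break on separators such as newlines. It only shows how `chunk_size` and `chunk_overlap` interact when both are counted in characters.

```python
from typing import List


def split_into_chunks(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> List[str]:
    """Hypothetical helper: cut text into fixed-size character windows.

    Each window starts (chunk_size - chunk_overlap) characters after the
    previous one, so consecutive chunks share chunk_overlap characters.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    return [text[start:start + chunk_size] for start in range(0, len(text), step)]


# 1200 characters of sample text made of repeating digits.
sample = "".join(str(i % 10) for i in range(1200))
chunks = split_into_chunks(sample)
print([len(chunk) for chunk in chunks])
```

Lowering `chunk_size` in this sketch produces more, shorter chunks from the same text, which is the "fewer tokens per embed" half of the question.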

In privategpt.py, you can change the 4 in target_source_chunks = int(os.environ.get('TARGET_SOURCE_CHUNKS',4)) to the value of your choice, or define a different value for TARGET_SOURCE_CHUNKS in your .env file.
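
As a rough sketch of how that lookup behaves, the snippet below reads `TARGET_SOURCE_CHUNKS` from the environment and falls back to 4 when it is unset. The `get_target_source_chunks` wrapper is a hypothetical name used for illustration, and the snippet assumes the `.env` file has already been loaded into the process environment (for example, a line such as `TARGET_SOURCE_CHUNKS=8`).

```python
import os


def get_target_source_chunks(default: int = 4) -> int:
    """Hypothetical wrapper around the lookup quoted above.

    Returns the integer value of TARGET_SOURCE_CHUNKS, or `default`
    when the variable is not set.
    """
    return int(os.environ.get("TARGET_SOURCE_CHUNKS", default))


# Simulate what a .env entry of TARGET_SOURCE_CHUNKS=8 would provide.
os.environ["TARGET_SOURCE_CHUNKS"] = "8"
print(get_target_source_chunks())
```

For example, halving chunk_size to 250 while setting TARGET_SOURCE_CHUNKS to 8 would hand the model roughly the same amount of text as the defaults (8 × 250 = 4 × 500 characters), just drawn from more, smaller pieces.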

johnbrisbin, May 30 '23 22:05

Thanks

colrobloxkid69420, May 31 '23 11:05

It seems the code has changed a lot. Is there any way now to change the amount of embeddings?

colrobloxkid69420, Jan 09 '24 14:01

> It seems the code has changed a lot. Is there any way now to change the amount of embeddings?

I'm also interested in an answer to this! I need to analyse some logs in a JSON file. The file upload works, but then I get the "Maximum number of input tokens" message.

clab60917, Mar 20 '24 13:03