
Initial token count exceeds token limit in PrivateGPT

Open beaconai opened this issue 1 year ago • 4 comments

I'm uploading a CSV file of company data, and I get "Initial token count exceeds token limit" even when I type only "hi". Before this I uploaded 4 Harry Potter books, around 1600 pages, and it worked fine.

beaconai · Apr 01 '24 08:04

How can I increase the token limit?

beaconai · Apr 01 '24 08:04

The current chunking mechanism (which splits documents into sentences) is not optimal for CSVs. CSVs contain characters such as commas that are token-hungry. Most probably that chunking mechanism is creating long chunks containing lots of CSV values, making the retrieved context too long for the context window. I'd advise switching the current SentenceWindowNodeParser to a simple TokenTextSplitter, for example.

imartinez · Apr 01 '24 16:04
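To illustrate the suggested swap, here is a minimal pure-Python sketch of token-window chunking, the idea behind llama-index's TokenTextSplitter: fixed-size token windows with overlap, ignoring sentence boundaries. The function name, chunk sizes, and whitespace "tokens" are illustrative assumptions; PrivateGPT itself would wire in the real TokenTextSplitter, which uses the model's tokenizer.

```python
# Sketch of token-based chunking (the idea behind TokenTextSplitter).
# "Tokens" here are whitespace-separated words for illustration only;
# a real splitter would count tokens with the model's tokenizer.

def token_split(text, chunk_size=64, chunk_overlap=16):
    tokens = text.split()
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(tokens), step):
        window = tokens[start:start + chunk_size]
        chunks.append(" ".join(window))
        if start + chunk_size >= len(tokens):
            break  # last window already reaches the end of the text
    return chunks
```

Because every chunk is capped at `chunk_size` tokens regardless of punctuation, a comma-dense CSV row can no longer inflate a single "sentence" chunk past the context window.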

I want a powerful CSV platform that reads a CSV file and replies accordingly. In my company I have operator skill-matrix data, and my query would be, for example: "Create a team of 10 members that have a good grade in the sleeve-attaching process." Can you suggest a model or platform, even a paid AI tool?

beaconai · Apr 02 '24 03:04