
Implement function summarize(text)

Open emcf opened this issue 1 year ago • 4 comments

Engshell fails when text data exceeds the maximum prompt size. We need a non-GPT-based summarize(text) function for processing large text data.

emcf avatar Apr 04 '23 01:04 emcf

Why must it be non-GPT-based? Wouldn't recursively summarizing chunks of the conversation using the GPT-3.5 chat endpoint be sufficient?

AzureDominus avatar Apr 04 '23 03:04 AzureDominus

> Why must it be non-GPT-based? Wouldn't recursively summarizing chunks of the conversation using the GPT-3.5 chat endpoint be sufficient?

It's much better to vectorize large texts using the ADA model and then process them from there. That is much cheaper and faster.

"This model's maximum context length is 4097 tokens. However, your messages resulted in 685229 tokens. Please reduce the length of the messages."

tondeaf avatar Apr 04 '23 04:04 tondeaf

Getting the embeddings of the large text doesn't help to summarize it, since you can't usefully convert the embeddings back to text. What do you mean by "process it from there"?

AzureDominus avatar Apr 04 '23 05:04 AzureDominus

> Why must it be non-GPT-based? Wouldn't recursively summarizing chunks of the conversation using the GPT-3.5 chat endpoint be sufficient?

I believe chunking the text and then doing this would work as well, yes.

emcf avatar Apr 04 '23 05:04 emcf
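
For reference, the chunk-then-summarize idea discussed in this thread could look roughly like the sketch below. It is not engshell's actual implementation. The `summarize_chunk` callback is a hypothetical placeholder for whatever produces a summary of one small piece of text (for example, a wrapper around a chat completion call), and `max_chars` is an assumed character budget standing in for the model's token limit. The names `chunk_text` and `max_chars` are also made up for illustration.

```python
from typing import Callable, List


def chunk_text(text: str, max_chars: int) -> List[str]:
    """Split text into pieces of at most max_chars, preferring to cut at spaces."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: List[str] = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Back up to the last space inside the window, if there is one.
            cut = text.rfind(" ", start, end)
            if cut > start:
                end = cut
        piece = text[start:end].strip()
        if piece:
            chunks.append(piece)
        start = end
    return chunks


def summarize(
    text: str,
    summarize_chunk: Callable[[str], str],  # hypothetical: summarizes one small chunk
    max_chars: int = 8000,                  # assumed stand-in for the token limit
) -> str:
    """Recursively summarize text that is too large for a single prompt."""
    if len(text) <= max_chars:
        return summarize_chunk(text)
    partials = [summarize_chunk(c) for c in chunk_text(text, max_chars)]
    combined = "\n".join(partials)
    if len(combined) >= len(text):
        # Guard against a callback that does not actually shorten its input.
        raise RuntimeError("summaries are not getting shorter; aborting")
    return summarize(combined, summarize_chunk, max_chars)
```

Because each round must produce strictly shorter text, the recursion either converges to a single chunk or stops with an error. A real version would likely measure length in tokens rather than characters.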