Take advantage of increased LLM context window
RAG is dying as context windows keep growing. With 1M-token context windows, who needs RAG? Caching course material in the model's context instead of running vector search seems like the way to go.
We will need to implement concepts from this guide:
https://ai.google.dev/gemini-api/docs/long-context
https://github.com/google-gemini/cookbook/blob/main/examples/Apollo_11.ipynb
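As a starting point, here is a minimal sketch of the caching approach described in the docs above, assuming the google-genai Python SDK with explicit context caching; the file name, system instruction, prompt, and TTL are placeholders, not our actual setup:

```python
# Sketch: explicit context caching with the google-genai SDK.
# Assumptions: course_material.pdf is a placeholder file, and the cached
# content meets the minimum token count required for caching.
from google import genai
from google.genai import types

client = genai.Client()  # reads the API key from the environment

# Upload the course material once so it can be cached server-side.
course_pdf = client.files.upload(file="course_material.pdf")  # hypothetical file

# Create a cache holding the course material; later requests reuse it
# instead of resending the full document tokens every time.
cache = client.caches.create(
    model="gemini-2.5-flash",
    config=types.CreateCachedContentConfig(
        system_instruction="You are a tutor answering questions about this course.",
        contents=[course_pdf],
        ttl="3600s",  # keep the cache alive for an hour
    ),
)

# Query against the cached material instead of doing vector search.
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Generate a study guide for week 3.",
    config=types.GenerateContentConfig(cached_content=cache.name),
)
print(response.text)
```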
This needs attention for long-form content generation.
We have shifted to Gemini 2.5 Flash, where long-context support is inherently provided and needs no extra instrumentation on our part. I still need to test, but quality suffers a lot with longer content. Needs more experimentation.
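To make "quality suffers with longer content" measurable, a rough sketch of an experiment we could run, again assuming the google-genai SDK; the prompt and target lengths are made up for illustration:

```python
# Rough experiment sketch: generate content at increasing target lengths and
# record where quality starts to degrade. Lengths and prompt are assumptions,
# not our real evaluation setup.
from google import genai

client = genai.Client()

TARGET_WORDS = [500, 2000, 8000]  # hypothetical lengths to probe

for words in TARGET_WORDS:
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=f"Write a ~{words}-word lesson on photosynthesis for high schoolers.",
    )
    text = response.text or ""
    print(f"target={words} words, got={len(text.split())} words")
    # TODO: score coherence/structure (manually or with a rubric prompt)
```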