Support for Gemma Embedding Models and a Local Vector Database
Feature Request: Support for Gemma Embedding Model (308M) & Local Vector Database
Is your feature request related to a problem? Please describe.
Currently, the Google AI Edge Gallery app only supports generative Gemma models (chat, multimodal, prompt lab).
While this is great for on-device creativity and conversation, it lacks embedding support, which prevents building local retrieval-augmented generation (RAG) or semantic search workflows.
Without embeddings + vector database support, developers like me cannot:
- Create offline knowledge bases that are searchable by meaning (not just keywords).
- Build personal/local assistants that can reference custom documents.
- Experiment with hybrid workflows where generative models query retrieved context.
Describe the solution you'd like
I would love to see the app extended with:
- Gemma Embedding Model (308M) Integration
  - Add the recently launched Gemma 308M embedding model as an option in the app.
  - Provide an interface to generate embeddings from text (single or batch); a rough sketch follows this list.
  - Optimized for mobile CPU/GPU, just like the current Gemma generative models.
- Local Vector Database Support
  - Lightweight, pluggable integration with one or more local vector DBs:
    - FAISS (proven, efficient, local-only).
    - Qdrant Lite (mobile-ready, Rust core).
    - Or a SQLite extension (pgvector-style) for embedding storage.
  - Provide APIs (see the second sketch below) for:
    - Inserting embeddings into an index.
    - Querying nearest neighbors (similarity search).
    - Managing collections locally.
- RAG-Style Pipeline Example
  - End-to-end demo (see the third sketch below) where:
    - The user adds custom documents.
    - The app generates embeddings and indexes them in the vector DB.
    - The generative Gemma model queries the DB for context, then produces grounded responses.
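
To make the first item concrete, here is a minimal sketch of an embedding interface, assuming the Gemma 308M embedding model can be exported in a format that MediaPipe's TextEmbedder task loads; the class, the asset name `gemma_embedding.tflite`, and the batch helper are placeholders I made up, not the Gallery's actual API:

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder.TextEmbedderOptions

// Hypothetical wrapper: loads an on-device embedding model and turns text
// into a float vector. "gemma_embedding.tflite" is a placeholder asset name.
class OnDeviceEmbedder(context: Context) {
    private val embedder: TextEmbedder = TextEmbedder.createFromOptions(
        context,
        TextEmbedderOptions.builder()
            .setBaseOptions(
                BaseOptions.builder()
                    .setModelAssetPath("gemma_embedding.tflite")
                    .build()
            )
            .build()
    )

    // Embedding vector for a single piece of text.
    fun embed(text: String): FloatArray =
        embedder.embed(text).embeddingResult().embeddings()[0].floatEmbedding()

    // Batch helper: embeds each text independently.
    fun embedAll(texts: List<String>): List<FloatArray> = texts.map(::embed)
}
```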
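For the second item, a sketch of the insert/query surface such an index could expose. This is a brute-force, in-memory illustration only; a real backend (FAISS, Qdrant Lite, or a SQLite extension) would persist the vectors and use an approximate-nearest-neighbor structure:

```kotlin
import kotlin.math.sqrt

// Hypothetical in-memory vector index with brute-force cosine similarity.
class LocalVectorIndex {
    private class Entry(val id: String, val vector: FloatArray, val text: String)
    private val entries = mutableListOf<Entry>()

    // Insert an embedding together with its source text.
    fun insert(id: String, vector: FloatArray, text: String) {
        entries.add(Entry(id, vector, text))
    }

    // Return the top-k most similar stored texts, highest similarity first.
    fun query(queryVector: FloatArray, topK: Int = 5): List<Pair<String, Float>> =
        entries
            .map { it.text to cosine(queryVector, it.vector) }
            .sortedByDescending { it.second }
            .take(topK)

    private fun cosine(a: FloatArray, b: FloatArray): Float {
        var dot = 0f; var na = 0f; var nb = 0f
        for (i in a.indices) {
            dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]
        }
        return dot / (sqrt(na) * sqrt(nb) + 1e-8f)
    }
}
```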
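The third item is then mostly glue between the two sketches above and the existing generative model. Roughly, with `generate` standing in for whatever LLM inference call the Gallery exposes:

```kotlin
// Hypothetical end-to-end flow for the demo: embed the question, retrieve
// the closest document chunks, and ground the generative model's prompt
// with them. `generate` is a placeholder for the Gallery's LLM call.
fun answerWithContext(
    question: String,
    embedder: OnDeviceEmbedder,
    index: LocalVectorIndex,
    generate: (String) -> String
): String {
    val hits = index.query(embedder.embed(question), topK = 3)
    val context = hits.joinToString("\n") { (text, _) -> "- $text" }
    val prompt = """
        Answer the question using only the context below.

        Context:
        $context

        Question: $question
    """.trimIndent()
    return generate(prompt)
}
```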
Describe alternatives you've considered
- Running embeddings on external APIs → breaks the local-first promise of the app.
- Using only keyword search on-device → not effective for semantic tasks.
- Manual integration of embeddings + DB outside the app → harder to maintain and not beginner-friendly.
Additional context
This feature would transform the AI Edge Gallery app from a generative demo into a full-fledged local AI lab:
- Developers could prototype offline assistants, RAG apps, and semantic search engines directly on-device.
- It aligns with Google's mission of making AI accessible, private, and edge-ready.
- With Gemma models already optimized for edge, embeddings + vector DB are a natural next step.
Hi @tejasm-189,
Thank you for this detailed and well-structured feature request.
We appreciate you taking the time to outline the proposal for integrating Gemma embedding models with a local vector database.
We have logged this suggestion and passed it along to the team for their review.
Thank you again for your contribution.
Check out my app https://github.com/timmyy123/LLM-Hub; it supports embeddings.