Support for Gemma Embedding Models and a Local Vector Database
Feature Request: Support for Gemma Embedding Model (308M) & Local Vector Database
Is your feature request related to a problem? Please describe.
Currently, the Google AI Edge Gallery app only supports generative Gemma models (chat, multimodal, prompt lab).
While this is great for on-device creativity and conversation, it lacks embedding support, which prevents building local retrieval-augmented generation (RAG) or semantic search workflows.
Without embeddings + vector database support, developers like me cannot:
- Create offline knowledge bases that are searchable by meaning (not just keywords).
- Build personal/local assistants that can reference custom documents.
- Experiment with hybrid workflows where generative models query retrieved context.
Describe the solution you'd like
I would love to see the app extended with:
- Gemma Embedding Model (308M) Integration
  - Add the recently launched Gemma 308M embedding model as an option in the app.
  - Provide an interface to generate embeddings from text (single or batch); a rough sketch follows this list.
  - Optimized for mobile CPU/GPU, just like the current Gemma generative models.
- Local Vector Database Support
  - Lightweight, pluggable integration with one or more local vector DBs:
    - FAISS (proven, efficient, local-only).
    - Qdrant Lite (mobile-ready, Rust core).
    - Or a SQLite extension (pgvector-style) for embedding storage.
  - Provide APIs (see the second sketch below) for:
    - Inserting embeddings into an index.
    - Querying nearest neighbors (similarity search).
    - Managing collections locally.
- RAG-Style Pipeline Example
  - End-to-end demo (see the third sketch below) where:
    - The user adds custom documents.
    - The app generates embeddings and indexes them in the vector DB.
    - The generative Gemma model queries the DB for context, then produces grounded responses.
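
To make the first item concrete, here is a minimal sketch of an embedding interface, assuming the Gemma 308M embedding model can be exported in a format that MediaPipe's TextEmbedder task loads; the class, the asset name `gemma_embedding.tflite`, and the batch helper are placeholders I made up, not the Gallery's actual API:

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder.TextEmbedderOptions

// Hypothetical wrapper: loads an on-device embedding model and turns text
// into a float vector. "gemma_embedding.tflite" is a placeholder asset name.
class OnDeviceEmbedder(context: Context) {
    private val embedder: TextEmbedder = TextEmbedder.createFromOptions(
        context,
        TextEmbedderOptions.builder()
            .setBaseOptions(
                BaseOptions.builder()
                    .setModelAssetPath("gemma_embedding.tflite")
                    .build()
            )
            .build()
    )

    // Embedding vector for a single piece of text.
    fun embed(text: String): FloatArray =
        embedder.embed(text).embeddingResult().embeddings()[0].floatEmbedding()

    // Batch helper: embeds each text independently.
    fun embedAll(texts: List<String>): List<FloatArray> = texts.map(::embed)
}
```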
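For the second item, a sketch of the insert/query surface such an index could expose. This is a brute-force, in-memory illustration only; a real backend (FAISS, Qdrant Lite, or a SQLite extension) would persist the vectors and use an approximate-nearest-neighbor structure:

```kotlin
import kotlin.math.sqrt

// Hypothetical in-memory vector index with brute-force cosine similarity.
class LocalVectorIndex {
    private class Entry(val id: String, val vector: FloatArray, val text: String)
    private val entries = mutableListOf<Entry>()

    // Insert an embedding together with its source text.
    fun insert(id: String, vector: FloatArray, text: String) {
        entries.add(Entry(id, vector, text))
    }

    // Return the top-k most similar stored texts, highest similarity first.
    fun query(queryVector: FloatArray, topK: Int = 5): List<Pair<String, Float>> =
        entries
            .map { it.text to cosine(queryVector, it.vector) }
            .sortedByDescending { it.second }
            .take(topK)

    private fun cosine(a: FloatArray, b: FloatArray): Float {
        var dot = 0f; var na = 0f; var nb = 0f
        for (i in a.indices) {
            dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]
        }
        return dot / (sqrt(na) * sqrt(nb) + 1e-8f)
    }
}
```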
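The third item is then mostly glue between the two sketches above and the existing generative model. Roughly, with `generate` standing in for whatever LLM inference call the Gallery exposes:

```kotlin
// Hypothetical end-to-end flow for the demo: embed the question, retrieve
// the closest document chunks, and ground the generative model's prompt
// with them. `generate` is a placeholder for the Gallery's LLM call.
fun answerWithContext(
    question: String,
    embedder: OnDeviceEmbedder,
    index: LocalVectorIndex,
    generate: (String) -> String
): String {
    val hits = index.query(embedder.embed(question), topK = 3)
    val context = hits.joinToString("\n") { (text, _) -> "- $text" }
    val prompt = """
        Answer the question using only the context below.

        Context:
        $context

        Question: $question
    """.trimIndent()
    return generate(prompt)
}
```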
Describe alternatives you've considered
- Running embeddings on external APIs → breaks the local-first promise of the app.
- Using only keyword search on-device → not effective for semantic tasks.
- Manual integration of embeddings + DB outside the app → harder to maintain and not beginner-friendly.
Additional context
This feature would transform the AI Edge Gallery app from a generative demo into a full-fledged local AI lab:
- Developers could prototype offline assistants, RAG apps, and semantic search engines directly on-device.
- It aligns with Google's mission of making AI accessible, private, and edge-ready.
- With Gemma models already optimized for edge, embeddings + vector DB are a natural next step.
Hi @tejasm-189,
Thank you for this detailed and well-structured feature request.
We appreciate you taking the time to outline the proposal for integrating Gemma embedding models with a local vector database.
We have logged this suggestion and passed it along to the team for their review.
Thank you again for your contribution.
Check out my app https://github.com/timmyy123/LLM-Hub; it supports embeddings.