
Support for Gemma Embedding Models and a Local Vector Database

Open · tejasm-189 opened this issue 4 months ago · 2 comments

🚀 Feature Request: Support for Gemma Embedding Model (308M) & Local Vector Database

📌 Is your feature request related to a problem? Please describe.

Currently, the Google AI Edge Gallery app only supports generative Gemma models (chat, multimodal, prompt lab).
While this is great for on-device creativity and conversation, it lacks embedding support, which prevents building local retrieval-augmented generation (RAG) or semantic search workflows.

Without embeddings + vector database support, developers like me cannot:

  • Create offline knowledge bases that are searchable by meaning (not just keywords).
  • Build personal/local assistants that can reference custom documents.
  • Experiment with hybrid workflows where generative models query retrieved context.

✅ Describe the solution you'd like

I would love to see the app extended with:

  1. Gemma Embedding Model (308M) Integration

    • Add the recently launched Gemma 308M embedding model as an option in the app.
    • Provide an interface to generate embeddings from text (single or batch).
    • Optimize it for mobile CPU/GPU, just like the current Gemma generative models (a minimal embedding sketch follows this list).
  2. Local Vector Database Support

    • Lightweight, pluggable integration with one or more local vector DBs:
      • FAISS (proven, efficient, local-only).
      • Qdrant Lite (mobile-ready, Rust core).
      • Or a SQLite extension (pgvector-style) for embedding storage.
    • Provide APIs (see the in-memory index sketch after this list) for:
      • Inserting embeddings into an index.
      • Querying nearest neighbors (similarity search).
      • Managing collections locally.
  3. RAG-Style Pipeline Example

    • End-to-end demo (a glue sketch follows this list) where:
      • User adds custom documents.
      • The app generates embeddings + indexes them in the vector DB.
      • Generative Gemma model queries the DB for context, then produces grounded responses.
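
To make item 1 concrete, here is a minimal sketch of what an on-device embedding call could look like, assuming the 308M model ships as a .tflite asset that can be loaded through MediaPipe Tasks' TextEmbedder. Whether EmbeddingGemma actually runs through this API or through a different LiteRT path is an open question, and the asset file name below is a placeholder:

```kotlin
import android.content.Context
import com.google.mediapipe.tasks.core.BaseOptions
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder.TextEmbedderOptions

// Load a bundled embedding model once and reuse it for all requests.
// "embedding_gemma.tflite" is a placeholder asset name, not a real file.
fun createEmbedder(context: Context): TextEmbedder {
    val options = TextEmbedderOptions.builder()
        .setBaseOptions(
            BaseOptions.builder()
                .setModelAssetPath("embedding_gemma.tflite")
                .build()
        )
        .build()
    return TextEmbedder.createFromOptions(context, options)
}

// Turn a piece of text into a raw float vector, ready for indexing.
fun embedText(embedder: TextEmbedder, text: String): FloatArray =
    embedder.embed(text)
        .embeddingResult()
        .embeddings()[0]
        .floatEmbedding()
```

Batch embedding would just map embedText over a list of strings; the delegate choice (CPU/GPU) could follow however the Gallery already configures its generative models.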
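For item 2, the API surface could be prototyped before committing to FAISS, Qdrant, or a SQLite extension. The class below is only an illustrative in-memory, brute-force sketch, not a proposal for the final storage layer:

```kotlin
// An in-memory, brute-force nearest-neighbor index over float vectors.
// Purely illustrative: a real integration would persist vectors in FAISS,
// Qdrant, or a SQLite extension instead of a list.
class VectorIndex(private val dimension: Int) {
    data class Hit(val id: String, val text: String, val score: Float)
    private data class Entry(val id: String, val vector: FloatArray, val text: String)

    private val entries = mutableListOf<Entry>()

    fun insert(id: String, vector: FloatArray, text: String) {
        require(vector.size == dimension) { "expected $dimension dims, got ${vector.size}" }
        entries += Entry(id, vector, text)
    }

    fun query(queryVector: FloatArray, topK: Int = 5): List<Hit> =
        entries
            .map { Hit(it.id, it.text, cosineSimilarity(queryVector, it.vector)) }
            .sortedByDescending { it.score }
            .take(topK)

    // "Managing collections" reduced to its simplest form.
    fun clear() = entries.clear()

    private fun cosineSimilarity(a: FloatArray, b: FloatArray): Float {
        var dot = 0f; var normA = 0f; var normB = 0f
        for (i in a.indices) {
            dot += a[i] * b[i]
            normA += a[i] * a[i]
            normB += b[i] * b[i]
        }
        return dot / (kotlin.math.sqrt(normA) * kotlin.math.sqrt(normB) + 1e-8f)
    }
}
```

Brute-force search is O(n) per query, which is fine for the few hundred to few thousand chunks a personal document set typically produces; an approximate-nearest-neighbor index only becomes necessary well beyond that.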
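And for item 3, the demo pipeline is mostly glue between the two sketches above and the generative side. Here LlmInference stands in for whatever session object the Gallery already uses for Gemma generation; that handle and the prompt format are assumptions on my part:

```kotlin
import com.google.mediapipe.tasks.genai.llminference.LlmInference
import com.google.mediapipe.tasks.text.textembedder.TextEmbedder

// Hypothetical RAG glue: embed the question, retrieve the closest document
// chunks from the local index, and ask the generative model to answer
// using only that retrieved context.
fun answerWithRag(
    question: String,
    embedder: TextEmbedder,   // from the embedding sketch above
    index: VectorIndex,       // from the vector-index sketch above
    llm: LlmInference         // assumed handle to the existing Gemma session
): String {
    val queryVector = embedText(embedder, question)
    val context = index.query(queryVector, topK = 3)
        .joinToString("\n\n") { hit -> "[${hit.id}]\n${hit.text}" }

    val prompt = "Answer the question using only the context below.\n\n" +
        "Context:\n$context\n\n" +
        "Question: $question"

    return llm.generateResponse(prompt)
}
```

In the Gallery UI this would slot in between the chat input and the existing generation call, with the index populated from user-added documents at import time.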

🔄 Describe alternatives you've considered

  • Running embeddings on external APIs → breaks the local-first promise of the app.
  • Using only keyword search on-device → not effective for semantic tasks.
  • Manual integration of embeddings + DB outside the app → harder to maintain and not beginner-friendly.

📂 Additional context

This feature would transform the AI Edge Gallery app from a generative demo into a full-fledged local AI lab:

  • Developers could prototype offline assistants, RAG apps, and semantic search engines directly on-device.
  • It aligns with Google’s mission of making AI accessible, private, and edge-ready.
  • With Gemma models already optimized for edge, embeddings + vector DB are a natural next step.

tejasm-189 · Sep 06 '25, 05:09

Hi @tejasm-189,

Thank you for this detailed and well-structured feature request.

We appreciate you taking the time to outline the proposal for integrating Gemma embedding models with a local vector database.

We have logged this suggestion and passed it along to the team for their review.

Thank you again for your contribution.

dpknag · Sep 10 '25, 22:09

Check out my app https://github.com/timmyy123/LLM-Hub, it supports embeddings.

timmyy123 · Sep 14 '25, 13:09