gcp-redis-llm-stack
Reference architecture for LLM-based applications on Google Cloud Platform with Redis Enterprise as a high-performance data layer.
Scalable LLM Architectures with Redis & GCP Vertex AI
☁️ Generative AI with Google Vertex AI comes with a specialized in-console studio experience, a dedicated API for Gemini, and an easy-to-use Python SDK for deploying and managing instances of Google's powerful language models.
⚡ Redis Enterprise offers fast, scalable vector search, with an API for index creation and management, blazing-fast queries, and hybrid filtering. Coupled with its versatile data structures, Redis Enterprise shines as a high-performance data layer for building high-quality Large Language Model (LLM) apps.
This repo serves as a foundational architecture for building LLM applications with Redis and GCP services.
Reference architecture

- Primary Data Sources
- Data Extraction and Loading
- Large Language Models
  - `text-embedding-gecko@003` for embeddings
  - `gemini-1.0-pro-001` for LLM generation and chat
- High-Performance Data Layer (Redis)
- Semantic caching to improve LLM performance and reduce associated costs
- Vector search for context retrieval from knowledge base
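The semantic-caching idea above can be illustrated with a minimal in-memory sketch. This is a toy stand-in, not the real data layer: in this stack the cache entries would live behind a Redis vector index (e.g. via RedisVL's semantic cache), and the embeddings would come from `text-embedding-gecko@003` rather than being hard-coded vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Toy semantic cache: return a stored LLM answer when a new prompt's
    embedding is close enough to a previously cached prompt's embedding."""

    def __init__(self, threshold=0.9):
        self.threshold = threshold
        self.entries = []  # list of (embedding, answer) pairs

    def get(self, embedding):
        if not self.entries:
            return None
        best = max(self.entries, key=lambda e: cosine_similarity(e[0], embedding))
        if cosine_similarity(best[0], embedding) >= self.threshold:
            return best[1]  # cache hit: skip the expensive LLM call
        return None  # cache miss: call the LLM, then put() the answer

    def put(self, embedding, answer):
        self.entries.append((embedding, answer))


cache = SemanticCache(threshold=0.9)
cache.put([1.0, 0.0], "Redis is an in-memory data store.")
print(cache.get([0.99, 0.05]))  # near-duplicate prompt -> cached answer
print(cache.get([0.0, 1.0]))    # unrelated prompt -> None (call the LLM)
```

The point of the sketch: semantically similar prompts (high cosine similarity between embeddings) reuse an earlier answer, so repeated or rephrased questions never hit the model.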
RAG demo
Open the code tutorial in the Colab notebook to get your hands dirty with Redis and Vertex AI on GCP. It's a step-by-step walkthrough of setting up the required data, generating embeddings, and building RAG from scratch, highlighting Redis vector search and semantic caching along the way.
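The retrieval step of the RAG flow can be sketched in a few lines. Everything here is a hypothetical stand-in to show the shape of the pipeline: `embed` is a bag-of-words stub in place of `text-embedding-gecko@003`, the brute-force similarity scan stands in for a Redis vector index query, and the assembled prompt would then go to `gemini-1.0-pro-001` for grounded generation.

```python
import math
import re
from collections import Counter


def embed(text):
    """Hypothetical embedding stub: a bag-of-words count vector.
    The real pipeline would call text-embedding-gecko@003 instead."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
        math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Toy knowledge base; in the real stack each chunk and its embedding
# live in Redis behind a vector index.
docs = [
    "Redis Enterprise provides vector search and semantic caching.",
    "Vertex AI hosts Google's Gemini family of language models.",
    "Streamlit makes it easy to build data apps in Python.",
]
index = [(doc, embed(doc)) for doc in docs]


def retrieve(question, k=1):
    """Return the k chunks most similar to the question (the RAG retrieval step)."""
    q = embed(question)
    ranked = sorted(index, key=lambda d: cosine(d[1], q), reverse=True)
    return [doc for doc, _ in ranked[:k]]


question = "Which models does Vertex AI host?"
context = retrieve(question)[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: {question}"
# `prompt` would now be sent to gemini-1.0-pro-001 for grounded generation.
print(context)
```

Swapping the stub pieces for real embeddings and a Redis vector query is exactly what the notebook walks through step by step.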
Additional resources
- Streamlit PDF chatbot example app
- Redis vector search documentation
- Get started with RedisVL
- Google VertexAI resources
- More Redis AI resources