Project: PhotoRAG - Image search application

Open • dubscode opened this issue 1 year ago • 2 comments

Project Name

PhotoRAG

Description

PhotoRAG is a full-stack Next.js image search application that uses Azure AI services and infrastructure to implement a Retrieval-Augmented Generation (RAG) system for photographs. It combines vector embeddings, similarity search, and large language models to retrieve images from natural language queries.

Data Sources

We seeded the database with a collection of our own photographs. To view all of the images available in the database, go to https://rag.photomuse.ai/gallery and click the Load Gallery button.

Data Ingestion

When we uploaded the images, we used the Azure AI Computer Vision API.

The computervision/imageanalysis:analyze operation was used to generate a caption and tags from the supplied image.
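
As a rough illustration of the analysis step, the sketch below builds a request for the imageanalysis:analyze operation and pulls the caption and tags out of a response. It is not taken from the PhotoRAG repository: the helper names, the api-version value, the confidence threshold, and the endpoint and key placeholders are all assumptions made for this example.

```typescript
// Sketch only, not the project's code. Assumed for illustration:
// the api-version value, the 0.5 confidence threshold, and all names below.

interface AnalyzeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Subset of the response fields this sketch reads.
interface AnalyzeResponse {
  captionResult?: { text: string; confidence: number };
  tagsResult?: { values: { name: string; confidence: number }[] };
}

// Describes the HTTP call; the result can be handed to any HTTP client.
function buildAnalyzeRequest(
  endpoint: string,
  apiKey: string,
  imageUrl: string
): AnalyzeRequest {
  const base = endpoint.replace(/\/+$/, ""); // drop trailing slashes
  const query = "api-version=2023-10-01&features=caption,tags";
  return {
    url: `${base}/computervision/imageanalysis:analyze?${query}`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Ocp-Apim-Subscription-Key": apiKey,
    },
    body: JSON.stringify({ url: imageUrl }),
  };
}

// Keeps the caption text and the names of tags at or above the threshold.
function extractCaptionAndTags(res: AnalyzeResponse, minConfidence = 0.5) {
  const caption = res.captionResult?.text ?? "";
  const tags = (res.tagsResult?.values ?? [])
    .filter((t) => t.confidence >= minConfidence)
    .map((t) => t.name);
  return { caption, tags };
}
```

The request object can be passed to fetch as fetch(req.url, { method: req.method, headers: req.headers, body: req.body }), and the parsed JSON response given to extractCaptionAndTags.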

We then provided the tags and computer vision caption to GPT-4o, and prompted it to create a more thorough image description that would improve search result accuracy.
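
One way to hand the caption and tags to a chat model is to assemble a system and a user message, as sketched below. The prompt wording and the buildDescriptionMessages helper are hypothetical; the actual prompt used by PhotoRAG is not shown in this post.

```typescript
// Sketch only. The prompt text and helper name are hypothetical,
// not the prompt PhotoRAG actually sends.

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Builds the messages for a chat completion that expands a short
// caption and a tag list into a fuller, search-friendly description.
function buildDescriptionMessages(caption: string, tags: string[]): ChatMessage[] {
  const tagList = tags.length > 0 ? tags.join(", ") : "(none)";
  return [
    {
      role: "system",
      content:
        "You write detailed, factual descriptions of photographs for a search index. " +
        "Use only the supplied caption and tags; do not invent details.",
    },
    {
      role: "user",
      content: `Caption: ${caption}\nTags: ${tagList}\n\nWrite a thorough description of this photograph.`,
    },
  ];
}
```

The returned array has the shape expected in the messages field of a chat completions request, so it could be sent to a GPT-4o deployment as-is.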

The tags were stored as an array in our Azure PostgreSQL server database.

The image description was stored as a string. We then generated embeddings with

  • computervision/retrieval:vectorizeImage for the image, and
  • computervision/retrieval:vectorizeText for the description,

and stored both vectors in the same images Postgres table.
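
The two vectorize calls differ only in the operation name and the request body, so a single helper can describe both. The sketch below is an illustration under assumptions: the api-version and model-version values, the helper names, and the placeholders are not from the original post. It also shows how a vector might be formatted as a pgvector text literal before being written to the table.

```typescript
// Sketch only. Assumed for illustration: the api-version and
// model-version values and every name defined here.

type VectorizeKind = "vectorizeImage" | "vectorizeText";

interface VectorizeRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// vectorizeImage takes { url }, vectorizeText takes { text }.
function buildVectorizeRequest(
  endpoint: string,
  apiKey: string,
  kind: VectorizeKind,
  input: string
): VectorizeRequest {
  const base = endpoint.replace(/\/+$/, "");
  const query = "api-version=2024-02-01&model-version=2023-04-15";
  const payload = kind === "vectorizeImage" ? { url: input } : { text: input };
  return {
    url: `${base}/computervision/retrieval:${kind}?${query}`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      "Ocp-Apim-Subscription-Key": apiKey,
    },
    body: JSON.stringify(payload),
  };
}

// Formats a vector in pgvector's text input form, e.g. "[0.5,-1,2]".
function toPgVectorLiteral(vector: number[]): string {
  if (vector.some((x) => !Number.isFinite(x))) {
    throw new Error("vector contains a non-finite value");
  }
  return `[${vector.join(",")}]`;
}
```

Because both operations return vectors in the same embedding space, the image vector and the description vector can be compared against the same text query vector.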

How It Works

  1. Image Upload: When an image is uploaded, it's stored in Azure Blob Storage.
  2. Image Analysis: The image is analyzed using Azure Computer Vision to generate tags and captions.
  3. Description Generation: GPT-4 is used to generate a detailed description based on the tags and captions.
  4. Vector Embedding: The image and its description are converted into vector embeddings using the Azure Computer Vision vectorizeImage and vectorizeText operations.
  5. Search: When a user performs a search:
    • The query is refined using GPT-4 to extract relevant tags and improve the search terms.
    • The refined query is converted to a vector embedding.
    • A similarity search is performed using cosine similarity between the query embedding and the stored image embeddings.
    • Results are ranked based on similarity and tag matches.
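
The ranking in the search step can be pictured with the in-memory sketch below: cosine similarity between the query embedding and each stored embedding, plus a small bonus per matching tag. The post does not say how similarity and tag matches are combined, so the additive tagBoost weighting, the types, and the function names are assumptions for this example.

```typescript
// Sketch only. How PhotoRAG combines similarity and tag matches is not
// stated in the post; the additive tagBoost below is an assumption.

interface StoredImage {
  id: string;
  tags: string[];
  embedding: number[];
}

interface RankedImage {
  id: string;
  similarity: number;
  tagMatches: number;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|); 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores each image as similarity + tagBoost * (number of matching tags)
// and returns the images sorted from best to worst.
function rankImages(
  queryEmbedding: number[],
  queryTags: string[],
  images: StoredImage[],
  tagBoost = 0.05
): RankedImage[] {
  const wanted = new Set(queryTags.map((t) => t.toLowerCase()));
  return images
    .map((img) => {
      const similarity = cosineSimilarity(queryEmbedding, img.embedding);
      const tagMatches = img.tags.filter((t) => wanted.has(t.toLowerCase())).length;
      return { id: img.id, similarity, tagMatches, score: similarity + tagBoost * tagMatches };
    })
    .sort((x, y) => y.score - x.score);
}
```

With pgvector, the similarity part would more likely run inside the database: the <=> operator returns cosine distance, and similarity is 1 minus that distance.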

Features

  • Image upload and automatic description generation using Azure Computer Vision and GPT-4
  • Automatic tagging of images
  • Vector embedding of image descriptions for efficient similarity search
  • Natural language querying of the image database
  • Refined search queries using AI
  • Confidence scoring and explanations for search results

Technology Stack

  • Next.js (App Router)
  • TypeScript
  • PostgreSQL with pgvector extension
  • Drizzle ORM
  • Azure OpenAI API (for GPT-4)
  • Azure Computer Vision API (for analysis and embeddings)
  • Azure Blob Storage

Technology & Languages

  • [X] JavaScript
  • [ ] Java
  • [ ] .NET
  • [ ] Python
  • [ ] AI Studio
  • [ ] AI Search
  • [X] PostgreSQL
  • [ ] Cosmos DB
  • [ ] Azure SQL

Project Repository URL

https://github.com/dubscode/photorag

Deployed Endpoint URL

https://rag.photomuse.ai/

Project Video

https://youtu.be/JvHKW363nwo?si=oLjJcrkQbMuQOVXe

Team Members

dubscode

dubscode, Sep 15 '24 21:09

Hello @dubscode, thank you for participating in RAG Hack!

The team is working hard to distribute badges. Please have each team member fill out this form: aka.ms/raghack/badge-dist

Thank you!

multispark, Oct 23 '24 01:10

Thank you very much!

dubscode, Oct 23 '24 02:10