ai-devices
AI Device Template Featuring Whisper, TTS, Groq, Llama3, OpenAI and more
AI Device Template
YouTube Tutorial
This project is an AI-powered voice assistant that uses several AI models and services to answer user queries. It supports voice input, transcription, text-to-speech, image processing, and function calling with conditionally rendered UI components. It was inspired by the recent trend of AI devices such as the Humane AI Pin and the Rabbit R1.
Features
- Voice input and transcription: Using Whisper models from Groq or OpenAI
- Text-to-speech output: Using OpenAI's TTS models
- Image processing: Using OpenAI's GPT-4 Vision or Fal.ai's LLaVA-NeXT models
- Function calling and conditionally rendered UI components: Using OpenAI's GPT-3.5-Turbo model
- Customizable UI settings: Includes response times, settings toggle, text-to-speech toggle, internet results toggle, and photo upload toggle
- (Optional) Rate limiting: Using Upstash
- (Optional) Tracing: With LangChain's LangSmith for function execution
Setup
1. Clone the repository
git clone https://github.com/developersdigest/ai-devices.git
2. Install dependencies
npm install
# or
bun install
3. Add API Keys
To use this AI-powered voice assistant, you need to provide the necessary API keys for the selected AI models and services.
Required for core functionality
- Groq API key for Llama and Whisper
- OpenAI API key for TTS, Vision, and Whisper
- Serper API key for internet results
Optional for advanced configuration
- LangChain tracing (LangSmith) for function execution tracing
- Upstash Redis for IP-based rate limiting
- Spotify for Spotify API interactions
- Fal.ai (LLaVA image model) as an alternative vision model to GPT-4 Vision
Replace 'API_KEY_GOES_HERE' with your actual API keys for each service.
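A forgotten placeholder usually only surfaces as a failed request at runtime, so a small startup check can help. The following is a minimal sketch, not part of the project: the helper `findMissingKeys` and the variable names `GROQ_API_KEY`, `OPENAI_API_KEY`, and `SERPER_API_KEY` are assumptions, so match them to the names used in your own environment file.

```typescript
// Hypothetical variable names; adjust them to the names in your env file.
const REQUIRED_KEYS = ["GROQ_API_KEY", "OPENAI_API_KEY", "SERPER_API_KEY"] as const;

// Placeholder value mentioned in the setup instructions.
const PLACEHOLDER = "API_KEY_GOES_HERE";

type Env = Record<string, string | undefined>;

// Returns the names of required keys that are unset, empty,
// or still set to the placeholder value.
function findMissingKeys(
  env: Env,
  required: readonly string[] = REQUIRED_KEYS
): string[] {
  return required.filter((name) => {
    const value = env[name]?.trim();
    return !value || value === PLACEHOLDER;
  });
}
```

In a Node.js environment you could call `findMissingKeys(process.env)` once at startup and log or throw if the returned list is not empty.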
4. Start the development server
npm run dev
# or
bun dev
Access the application at http://localhost:3000 or through the provided URL.
5. Deployment
Configuration
Modify app/config.tsx to adjust settings and configurations for the AI-powered voice assistant. Here’s an overview of the available options:
export const config = {
  // Inference settings
  inferenceModelProvider: 'groq', // 'groq' or 'openai'
  inferenceModel: 'llama3-8b-8192', // Groq: 'llama3-70b-8192' or 'llama3-8b-8192'; OpenAI: 'gpt-4-turbo', etc.
  // The settings below are optional
  // Whisper settings
  whisperModelProvider: 'openai', // 'groq' or 'openai'
  whisperModel: 'whisper-1', // Groq: 'whisper-large-v3'; OpenAI: 'whisper-1'
  // TTS settings
  ttsModelProvider: 'openai', // only openai supported for now
  ttsModel: 'tts-1', // only openai supported for now
  ttsvoice: 'alloy', // only openai supported for now: alloy, echo, fable, onyx, nova, or shimmer
  // OPTIONAL: Vision settings
  visionModelProvider: 'openai', // 'openai' or 'fal.ai'
  visionModel: 'gpt-4-vision', // OpenAI: 'gpt-4-vision'; Fal.ai: 'llava-next'
  // Function calling + conditionally rendered UI
  functionCallingModelProvider: 'openai', // only 'openai' supported currently
  functionCallingModel: 'gpt-3.5-turbo', // OpenAI: 'gpt-3.5-turbo'
  // UI settings
  enableResponseTimes: false, // Display response times for each message
  enableSettingsUIToggle: true, // Display the settings UI toggle
  enableTextToSpeechUIToggle: true, // Display the text to speech UI toggle
  enableInternetResultsUIToggle: true, // Display the internet results UI toggle
  enableUsePhotUIToggle: true, // Display the use photo UI toggle
  useAttributionComponent: false, // Use the attribution component to display the attribution of the AI models/services used
  // Rate limiting settings
  useRateLimiting: false, // Use Upstash rate limiting to limit the number of requests per user
  // Tracing with LangChain
  useLangSmith: true, // Set to true to trace function execution with LangSmith by LangChain
};
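Each provider only accepts certain models, so changing a provider without changing its model is an easy mistake to make. The following is a minimal sketch of a check for the Whisper and vision settings. The helper `findModelMismatches` and the `ALLOWED_MODELS` table are hypothetical, and the allowed pairs are taken only from the comments in the config overview, so they may not cover every model the services offer.

```typescript
// Provider/model pairs copied from the comments in the config overview.
// Assumption: this table is illustrative and not exhaustive.
const ALLOWED_MODELS: Record<string, Record<string, string[]>> = {
  whisper: { groq: ["whisper-large-v3"], openai: ["whisper-1"] },
  vision: { openai: ["gpt-4-vision"], "fal.ai": ["llava-next"] },
};

// Only the fields this sketch checks; the real config has more.
interface ModelSettings {
  whisperModelProvider: string;
  whisperModel: string;
  visionModelProvider: string;
  visionModel: string;
}

// Hypothetical helper: returns one message per mismatched provider/model pair.
function findModelMismatches(settings: ModelSettings): string[] {
  const pairs: Array<[string, string, string]> = [
    ["whisper", settings.whisperModelProvider, settings.whisperModel],
    ["vision", settings.visionModelProvider, settings.visionModel],
  ];
  const problems: string[] = [];
  for (const [kind, provider, model] of pairs) {
    const allowed = ALLOWED_MODELS[kind]?.[provider];
    if (!allowed) {
      problems.push(`${kind}: unknown provider '${provider}'`);
    } else if (allowed.indexOf(model) === -1) {
      problems.push(`${kind}: '${model}' is not listed for provider '${provider}'`);
    }
  }
  return problems;
}
```

Because TypeScript types are structural, the exported `config` object can be passed to `findModelMismatches` directly. An empty result means the Whisper and vision pairs match the table.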
Contributing
Contributions are welcome! If you find any issues or have suggestions for improvements, please open an issue or submit a pull request.
I'm the developer behind Developers Digest. If you find my work helpful or enjoy what I do, consider supporting me. Here are a few ways you can do that:
- Patreon: Support me on Patreon at patreon.com/DevelopersDigest
- Buy Me A Coffee: You can buy me a coffee at buymeacoffee.com/developersdigest
- Website: Check out my website at developersdigest.tech
- GitHub: Follow me on GitHub at github.com/developersdigest
- Twitter: Follow me on Twitter at twitter.com/dev__digest