SpeakBuild
An AI-powered voice assistant that generates and modifies React Native components through natural language commands. Built with Expo and OpenRouter API (Claude).
Features
- 🎙️ Voice commands to generate UI components
- ⚡ Real-time component generation and preview
- 🔄 Component modification through natural language
- 📱 Cross-platform (iOS/Android) support
- 💾 Persistent component storage
- 🧪 Debug generation interface for testing
Installation
Prerequisites
- Node.js
- Yarn or npm
- Expo CLI
- Bun (for evaluation scripts)
- iOS/Android development environment
Setup
- Clone the repository:
git clone https://github.com/Strawberry-Computer/speakbuild.git
cd speakbuild
- Install dependencies:
yarn install
- Configure environment:
- Set up OpenRouter API key in app settings
- Configure required permissions:
- Microphone access
- Speech recognition
- Internet access
Running the App
This app requires a native build due to dependencies on native modules (speech recognition, etc). It cannot run in Expo Go.
# Build and run on iOS simulator/device
yarn ios
# Build and run on Android emulator/device
yarn android
# Clean build cache if needed
yarn clean
Note: Web platform support is limited due to native module dependencies.
How It Works
1. User Interaction Flow
The system follows this interaction flow:
- Voice/Text Input: User provides input via voice button or keyboard
- Transcription: Audio is converted to text (for voice input)
- Analysis: Input is analyzed to determine intent and widget specification
- Component Generation: A React Native component is generated based on the specification
- Rendering: The component is rendered and displayed to the user
- History Management: The interaction is saved in conversation history
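The steps above can be sketched as a simple pipeline. This is a hypothetical outline for illustration; the actual service names and signatures live in `src/services` and may differ:

```javascript
// Hypothetical sketch of the interaction flow described above;
// the real app's APIs may differ.
async function handleUserInput(input, deps) {
  // 1. Transcription: voice input is converted to text; text input passes through.
  const text = input.type === 'voice'
    ? await deps.transcribe(input.audio)
    : input.text;

  // 2. Analysis: determine intent (new/modify) and a widget specification URL.
  const analysis = await deps.analyze(text, deps.history);

  // 3. Component generation: build a React Native component from the spec.
  const component = await deps.generateComponent(analysis.widgetUrl, analysis.params);

  // 4. History management: record the interaction for later navigation.
  deps.history.push({ text, analysis, component });

  // 5. Rendering is left to the caller (the UI layer).
  return component;
}
```

Each `deps` entry stands in for one of the services described under Technical Architecture below.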
2. Widget Specification System
The system uses a structured URL-based widget specification system:
category/type/style/theme?with_feature=yes&params=name:type
Categories:
- `display`: Information display (clock, weather, calendar)
- `input`: User data entry (lists, notes, forms)
- `interactive`: User actions (timer, player, calculator)
- `feedback`: System responses (progress, loading, alerts)
- `media`: Rich content (images, video, audio)
- `content`: Educational/informational content (articles, explanations, facts)
Feature flags (`with_*`):
- `with_controls`: Play/pause/reset controls
- `with_dates`: Date/time handling
- `with_progress`: Progress tracking
- `with_checkboxes`: Checkbox toggles
- `with_hourly`: Hourly breakdown
- `with_daily`: Daily breakdown
- `with_alarm`: Alarm functionality
- `with_sections`: Content sections
Example:
display/weather/forecast/light?with_daily=yes&params=location:caption,unit:caption,date:string,days:integer
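A widget URL of this shape can be split into its parts with plain string handling. The sketch below is illustrative only, not the app's actual parser:

```javascript
// Illustrative parser for the widget specification URL format;
// not the app's real implementation.
function parseWidgetUrl(spec) {
  const [path, query = ''] = spec.split('?');
  const [category, type, style, theme] = path.split('/');

  const features = {};
  const params = {};
  for (const pair of query.split('&').filter(Boolean)) {
    const [key, value] = pair.split('=');
    if (key.startsWith('with_')) {
      features[key] = value === 'yes';
    } else if (key === 'params') {
      // params=name:type,name:type
      for (const p of value.split(',')) {
        const [name, paramType] = p.split(':');
        params[name] = paramType;
      }
    }
  }
  return { category, type, style, theme, features, params };
}
```

Applied to the example above, this yields category `display`, the `with_daily` flag set, and a typed parameter map such as `{ days: 'integer' }`.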
3. Parameter Types
The system supports strongly-typed parameters:
Text Types:
- `caption`: Short labels (1-3 words)
- `title`: Headings with context (3-7 words)
- `sentence`: Single complete thought
- `paragraph`: Multiple sentences
- `story`: Long-form content
- `url`: Web URLs
Number Types:
- Basic: `integer`, `decimal`
- Semantic: `size`, `duration`, `count`, `percentage`, `interval`, `goal`, `currency`
Arrays:
- `caption[]`: Lists of short items
- `sentence[]`: Lists of tasks/notes
- `{text:string,done:boolean}[]`: Basic todo items
- `{text:string,done:boolean,time:string}[]`: Scheduled todo items
- `{text:string,selected:boolean,value:string}[]`: Selection list items
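A minimal validator for a few of these types could look like the sketch below. This is an assumption about how such checks might work, not the app's actual validation logic:

```javascript
// Illustrative type check for a subset of the parameter types above;
// not the app's real validation code.
function checkParam(type, value) {
  // Array types like caption[] or {text:string,done:boolean}[]
  if (type.endsWith('[]')) {
    return Array.isArray(value) && value.every((v) => checkParam(type.slice(0, -2), v));
  }
  switch (type) {
    case 'integer':
      return Number.isInteger(value);
    case 'decimal':
      return typeof value === 'number';
    case 'caption':
    case 'title':
    case 'sentence':
    case 'paragraph':
    case 'story':
    case 'url':
    case 'string':
      return typeof value === 'string';
    default:
      // Structured types like {text:string,done:boolean} reduce to object checks here.
      return typeof value === 'object' && value !== null;
  }
}
```

For example, `checkParam('integer', 48)` and `checkParam('caption[]', ['milk', 'eggs'])` pass, while `checkParam('integer', 'x')` fails.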
Technical Architecture
Core Services
assistantService.js:
- Central service that manages the voice assistant state
- Handles audio recording, transcription, and component generation
- Maintains status (IDLE, LISTENING, THINKING, PROCESSING, ERROR)
- Supports different interaction modes (PTT, CALL)
- Emits events for UI updates
audioSession.js:
- Manages WebSocket connections for audio streaming
- Handles microphone access and audio processing
- Provides volume level monitoring
- Supports push-to-talk and call modes
analysis.js:
- Analyzes user requests using Claude
- Determines intent (new/modify)
- Generates widget URLs and parameters
- Maintains request history context
api.js:
- Handles OpenRouter API communication
- Supports both streaming and non-streaming completions
- Includes detailed request/response logging
- Handles SSE for real-time responses
componentGeneration.js:
- Creates React Native components from widget specifications
- Supports streaming generation with progress callbacks
- Handles component validation and error handling
- Provides abort capability for in-progress generations
componentUtils.js (in /src/utils/):
- Provides utilities for creating and rendering components
- Handles component sandboxing and error boundaries
- Manages React and React Native dependencies injection
- Supports dynamic component rendering with props
widgetStorage.js:
- Manages persistent storage of generated components
- Stores components by widget URL
- Maintains version history with timestamps
- Provides retrieval and update capabilities
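The storage scheme described above (components keyed by widget URL, with timestamped versions) can be modeled roughly as follows. This is an in-memory sketch with hypothetical names; the app persists this data with MMKV:

```javascript
// In-memory model of the versioned, URL-keyed storage described above.
// Names are hypothetical; the app uses MMKV for actual persistence.
class WidgetStore {
  constructor() {
    this.byUrl = new Map();
  }

  // Append a new version of the generated component code for a widget URL.
  save(widgetUrl, code) {
    const versions = this.byUrl.get(widgetUrl) ?? [];
    versions.push({ code, timestamp: Date.now() });
    this.byUrl.set(widgetUrl, versions);
  }

  // Retrieve the most recent version, or null if nothing is stored.
  latest(widgetUrl) {
    const versions = this.byUrl.get(widgetUrl);
    return versions ? versions[versions.length - 1] : null;
  }
}
```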
componentHistoryService.js:
- Manages conversation and component history
- Supports navigation through previous components
- Maintains current component state
- Provides event-based state updates
Platform-Specific Configuration
iOS
- Bundle Identifier: `ai.speakbuild`
- Required Permissions:
  - Microphone Usage
  - Speech Recognition
  - Background Audio Mode
Android
- Package: `ai.speakbuild`
- Required Permissions:
  - RECORD_AUDIO
  - INTERNET
- Build Configuration:
- Kotlin Version: 1.9.24
- Compile SDK: 35
- Target SDK: 34
- Build Tools: 34.0.0
Evaluation System
The app includes two evaluation scripts for testing the AI components:
Analysis Evaluation
Tests the system's ability to understand user requests and convert them to structured widget specifications:
bun scripts/evaluate-analysis.js [model]
# Default: anthropic/claude-3.5-sonnet
Example test case:
{
"request": "What time is it?",
"expected": {
"intent": "new",
"widgetUrl": "display/clock/digital/light?params=format:caption,size:integer",
"params": {
"format": "HH:mm",
"size": 48
}
}
}
Component Generation Evaluation
Tests the system's ability to generate functional React Native components from widget specifications:
bun scripts/evaluate-generation.js [model]
# Default: anthropic/claude-3.5-sonnet
Each evaluation generates a detailed report with:
- Success rate percentage
- Average response time
- Detailed per-test results
- Error analysis
Reports are saved in the evaluations/ directory with filenames:
- `analysis-[date]-[model].md`
- `generation-[date]-[model].md`
Dependencies
Key packages:
- `expo` ~52.0.36
- `@expo/vector-icons` ^14.0.4
- `@react-navigation/drawer` ^7.1.1
- `expo-av` ~15.0.2
- `expo-build-properties` ~0.13.2
- `expo-clipboard` ~7.0.1
- `expo-constants` ~17.0.0
- `expo-file-system` ~18.0.7
- `expo-haptics` ~14.0.1
- `expo-image-picker` ~16.0.4
- `expo-linking` ~7.0.0
- `expo-location` ~18.0.5
- `expo-media-library` ~17.0.5
- `expo-notifications` ~0.29.12
- `expo-router` ~4.0.16
- `expo-sensors` ~14.0.2
- `expo-sharing` ~13.0.1
- `expo-splash-screen` ~0.29.0
- `expo-status-bar` ~2.0.0
- `expo-system-ui` ~4.0.0
- `react` 18.3.1
- `react-dom` 18.3.1
- `react-native` 0.76.6
- `react-native-audio-record` ^0.2.2
- `react-native-gesture-handler` ~2.20.2
- `react-native-mmkv` ^3.2.0
- `react-native-permissions` ^5.2.5
- `react-native-reanimated` ~3.16.1
- `react-native-safe-area-context` 4.12.0
- `react-native-screens` ~4.4.0
- `react-native-svg` ^15.11.1
- `react-native-web` ~0.19.6
- `partial-json` ^0.1.7
For the full list of dependencies, see `package.json`.
Development Notes
- Uses Expo Router for navigation
- Supports TypeScript
- Includes custom Expo plugins for speech recognition
- Configured for both light and dark mode support
- Uses EventEmitter pattern for state management
- Implements custom hooks for component state (useAssistantState)
- Uses MMKV for high-performance storage
- Supports both voice and keyboard input methods
- Implements WebSocket-based audio streaming
License
MIT License
Test API Keys for Builds
Local Development with Test Keys
For local development, you can include test API keys so you don't need to enter them in the app:
- Create a `.env` file with your test API keys:
# Copy the example file
cp .env.example .env
# Edit the .env file with your test keys
# EXPO_PUBLIC_TEST_ULTRAVOX_KEY=your-ultravox-test-key
# EXPO_PUBLIC_TEST_OPENROUTER_KEY=your-openrouter-test-key
- Run the app with the environment variables loaded:
# For iOS
yarn ios
# For Android
yarn android
CI/CD Builds with Test Keys
For automated builds via GitHub Actions, test API keys are injected at build time:
- Store your API keys as GitHub repository secrets:
  - `EXPO_PUBLIC_TEST_ULTRAVOX_KEY`
  - `EXPO_PUBLIC_TEST_OPENROUTER_KEY`
- The CI workflow automatically passes these secrets to EAS Build:
# From .github/workflows/release.yml
- name: Build and submit iOS app
run: eas build --platform ios --profile production --non-interactive --auto-submit
env:
EXPO_PUBLIC_TEST_ULTRAVOX_KEY: ${{ secrets.EXPO_PUBLIC_TEST_ULTRAVOX_KEY }}
EXPO_PUBLIC_TEST_OPENROUTER_KEY: ${{ secrets.EXPO_PUBLIC_TEST_OPENROUTER_KEY }}
This allows TestFlight and Play Store testers to use the app without needing to enter API keys.
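The key-resolution order implied by this setup (a key entered in app settings wins, otherwise fall back to the build-time test key) might be sketched as follows. The function name and fallback logic here are assumptions, not the app's actual code:

```javascript
// Hypothetical key resolution implied by the setup above: prefer a
// user-entered key, then fall back to the build-time test key, else null.
function resolveApiKey(userKey, env = process.env) {
  return userKey || env.EXPO_PUBLIC_TEST_OPENROUTER_KEY || null;
}
```

Because Expo inlines `EXPO_PUBLIC_*` variables at build time, testers get a working key without entering one manually.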
Contributing
- Fork the repository
- Create your feature branch
- Commit your changes
- Push to the branch
- Create a new Pull Request
Development Workflow
1. Setup Environment:
   - Install dependencies: `yarn install`
   - Configure API keys in `.env` file
2. Run Tests:
   - Evaluate analysis: `yarn evaluate-analysis`
   - Evaluate generation: `yarn evaluate-generation`
3. Build and Test:
   - Development build: `yarn ios` or `yarn android`
   - Production build: `yarn build:ios` or `yarn build:android`
4. Submit Changes:
   - Ensure all tests pass
   - Follow the existing code style
   - Include documentation updates