Architecture Overview
The predictive buffering system uses the Gemini API to anticipate audio requirements and optimize playback performance before audio chunks arrive from Lyria RealTime. It operates on two levels: predictive analysis and real-time audio processing.

Core Components
BufferManager Class
The main BufferManager handles both predictive analysis and audio playback.

Initialization
The system initializes with fallback support for basic buffering.

Predictive Processing
Pre-Audio Analysis
The system processes interpretations before audio arrives to prepare optimal buffering.

Buffer Parameter Optimization
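As a rough illustration of buffer parameter optimization, the sketch below derives a buffer length from two musical characteristics. Everything in it is an assumption made for illustration: the `MusicalCharacteristics` shape, the `computeBufferParams` name, and the heuristic itself (denser and slower material gets a deeper buffer) are hypothetical and are not taken from the actual BufferManager.

```typescript
// Hypothetical output of the predictive analysis step.
interface MusicalCharacteristics {
  tempoBpm: number; // estimated tempo in beats per minute
  density: number;  // assumed normalized measure: 0 (sparse) to 1 (dense)
}

interface BufferParams {
  bufferSeconds: number; // audio to keep queued ahead of playback
  bufferSamples: number; // the same length in frames at 48 kHz
}

const SAMPLE_RATE = 48000;

function clamp(x: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, x));
}

// Assumed heuristic, not the documented one: start from 0.5 s, add up to 1 s
// for dense material, and add more headroom at slow tempos.
function computeBufferParams(c: MusicalCharacteristics): BufferParams {
  const tempo = clamp(c.tempoBpm, 40, 240);
  const density = clamp(c.density, 0, 1);
  const bufferSeconds = clamp(0.5 + density + 0.5 * (120 / tempo), 0.5, 3);
  return {
    bufferSeconds,
    bufferSamples: Math.round(bufferSeconds * SAMPLE_RATE),
  };
}
```

Clamping the inputs first keeps an out-of-range prediction from producing an unusable buffer size.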
The system calculates optimal buffer sizes based on musical characteristics.

Audio Processing Pipeline
Base64 to Float32 Conversion
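A minimal sketch of this conversion follows. It assumes the PCM samples are little-endian, which the text does not state, and the function names `base64ToBytes` and `pcm16ToFloat32` are illustrative. In a browser, `atob()` would normally handle the base64 step; a small manual decoder is used here only so that the sketch runs in any environment.

```typescript
const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Decode standard base64 into raw bytes.
function base64ToBytes(b64: string): Uint8Array {
  const clean = b64.replace(/[^A-Za-z0-9+/]/g, ""); // drop padding and whitespace
  const out = new Uint8Array(Math.floor((clean.length * 6) / 8));
  let acc = 0;  // bit accumulator
  let bits = 0; // number of valid bits currently in acc
  let o = 0;
  for (let i = 0; i < clean.length; i++) {
    acc = ((acc << 6) | B64_ALPHABET.indexOf(clean.charAt(i))) & 0xffff;
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      out[o++] = (acc >> bits) & 0xff;
    }
  }
  return out;
}

// Convert 16-bit signed PCM (little-endian assumed) to floats in [-1, 1).
function pcm16ToFloat32(bytes: Uint8Array): Float32Array {
  const sampleCount = Math.floor(bytes.length / 2);
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float32Array(sampleCount);
  for (let i = 0; i < sampleCount; i++) {
    out[i] = view.getInt16(i * 2, true) / 32768;
  }
  return out;
}
```

Dividing by 32768 maps the full signed 16-bit range onto [-1, 1), the range the Web Audio API expects for Float32 sample data.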
Lyria audio chunks arrive as base64-encoded 16-bit PCM and must be converted to Float32 samples for the Web Audio API.

Seamless Audio Playback
The system implements seamless audio playback with crossfading.

Stereo Channel Processing
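The channel split can be sketched as a plain de-interleaving loop. The sketch assumes the samples are ordered left, right, left, right, and the `deinterleaveStereo` name is illustrative.

```typescript
interface StereoChannels {
  left: Float32Array;
  right: Float32Array;
}

// Split interleaved stereo samples (L, R, L, R, ... assumed) into two channels.
// A trailing unpaired sample, if any, is ignored.
function deinterleaveStereo(interleaved: Float32Array): StereoChannels {
  const frames = Math.floor(interleaved.length / 2);
  const left = new Float32Array(frames);
  const right = new Float32Array(frames);
  for (let i = 0; i < frames; i++) {
    left[i] = interleaved[2 * i];
    right[i] = interleaved[2 * i + 1];
  }
  return { left, right };
}
```

In the browser, the two arrays could then be written into a two-channel `AudioBuffer` created with `audioContext.createBuffer(2, frames, 48000)`, using `copyToChannel(left, 0)` and `copyToChannel(right, 1)`.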
Lyria provides interleaved stereo data that must be split into separate channels for the Web Audio API.

Crossfading System
Predictive Crossfade Preparation
The system prepares crossfade parameters based on musical characteristics.

Real-time Crossfade Implementation
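One common way to crossfade between consecutive chunks is an equal-power curve, sketched below on raw sample arrays. This is an assumed technique, not necessarily the one the system uses; a browser implementation would more likely schedule gain ramps on `GainNode` parameters. The function names are illustrative, and the arrays hold one channel each.

```typescript
// Equal-power gains at position t in [0, 1]: fadeOut^2 + fadeIn^2 = 1,
// so perceived loudness stays roughly constant through the transition.
function equalPowerGains(t: number): { fadeOut: number; fadeIn: number } {
  const x = Math.min(1, Math.max(0, t));
  return {
    fadeOut: Math.cos((x * Math.PI) / 2),
    fadeIn: Math.sin((x * Math.PI) / 2),
  };
}

// Mix the tail of the outgoing chunk with the head of the incoming chunk.
// The overlap length is the shorter of the two inputs.
function crossfade(outgoingTail: Float32Array, incomingHead: Float32Array): Float32Array {
  const n = Math.min(outgoingTail.length, incomingHead.length);
  const mixed = new Float32Array(n);
  for (let i = 0; i < n; i++) {
    const t = n > 1 ? i / (n - 1) : 1;
    const g = equalPowerGains(t);
    mixed[i] = outgoingTail[i] * g.fadeOut + incomingHead[i] * g.fadeIn;
  }
  return mixed;
}
```

For stereo audio the same function would be applied to the left and right channels separately.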
Each audio chunk is played with automatic crossfading for seamless transitions.

Layer Addition System
Predictive Layer Preparation
For additive layering (non-crossfade transitions), the system optimizes for longer overlaps.

Performance Monitoring
Buffer Status Tracking
The system provides real-time buffer status for monitoring.

Rate Limiting
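A simple form of rate limiting enforces a minimum interval between calls, as in the sketch below. The `MinIntervalLimiter` class is hypothetical, and the interval is left as a parameter because the source does not state the actual limit. The clock is injectable so the behavior can be checked without waiting.

```typescript
// Allows at most one call per minIntervalMs; later calls inside the window are refused.
class MinIntervalLimiter {
  private lastCallMs = Number.NEGATIVE_INFINITY;
  private readonly minIntervalMs: number;
  private readonly now: () => number;

  constructor(minIntervalMs: number, now: () => number = () => Date.now()) {
    this.minIntervalMs = minIntervalMs;
    this.now = now;
  }

  // Returns true and records the call if enough time has passed, otherwise false.
  tryAcquire(): boolean {
    const t = this.now();
    if (t - this.lastCallMs < this.minIntervalMs) {
      return false;
    }
    this.lastCallMs = t;
    return true;
  }
}
```

A caller would check `tryAcquire()` before each Gemini request and, when it returns false, keep using the most recent buffer parameters instead of issuing a new call.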
Gemini API calls are rate-limited to prevent quota exhaustion.

Integration with Lyria
Audio Chunk Handling
The system processes Lyria audio chunks for immediate playback.

Resource Management
Cleanup System
Proper cleanup prevents memory leaks and audio artifacts.

Factory Function
The system provides a factory function for easy instantiation.

Configuration
Environment Variables
Audio Specifications
The system is configured for Lyria’s audio format:
- Sample Rate: 48 kHz
- Channels: 2 (stereo)
- Format: 16-bit PCM
- Encoding: Base64 (from Lyria)
- Processing: Float32 (for Web Audio API)
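
The specifications above can be captured as constants, which also makes it easy to work out how much audio a decoded chunk contains. The names `LYRIA_AUDIO_FORMAT` and `chunkDurationSeconds` are illustrative and are not taken from the system's code.

```typescript
// Audio format values from the list above.
const LYRIA_AUDIO_FORMAT = {
  sampleRate: 48000, // Hz
  channels: 2,       // stereo
  bitsPerSample: 16, // signed PCM
} as const;

// Duration in seconds of a chunk, given its decoded (post-base64) size in bytes.
function chunkDurationSeconds(byteLength: number): number {
  const bytesPerFrame =
    (LYRIA_AUDIO_FORMAT.channels * LYRIA_AUDIO_FORMAT.bitsPerSample) / 8;
  return byteLength / bytesPerFrame / LYRIA_AUDIO_FORMAT.sampleRate;
}
```

At this format, one second of audio is 48000 × 2 × 2 = 192,000 bytes of raw PCM, so `chunkDurationSeconds(192000)` returns 1.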