Features
- Sub-second latency voice interactions using WebRTC peer connections
- 6 specialized agent personas: customer service, order tracking, agricultural support, food safety lab, veterinary health, retail assistance
- 30+ callable enterprise tools for order lookup, inventory, dosage calculations, recipes
- Bilingual support with native Arabic and English responses including Saudi dialect
- Live transcription with Whisper and interim/final transcription states
- Tool result visualization alongside agent controls
- Multiple TTS voice options (ballad, ash, sage, nova)
- Real-time token usage monitoring
Challenge
Building a voice AI system capable of handling complex multi-domain customer interactions in real-time for a large agribusiness, with native Arabic support and integration with enterprise systems for order tracking, inventory, and specialized calculations.
Solution
Implemented OpenAI’s Realtime API with WebRTC for sub-second latency. Created 6 specialized agent personas each with domain-specific tool sets. Built a sophisticated tool-calling architecture allowing real-time function execution during conversations while maintaining conversation state.
Highlights
- Sub-second voice latency with WebRTC
- 30+ enterprise tools callable during live conversations
- Native Arabic support with Saudi dialect