Overview
VonageFrameSerializer enables integration with the Vonage Video API Audio Connector WebSocket protocol, allowing Pipecat applications to process real-time audio streams from active Vonage video sessions.
Vonage Serializer API Reference
Pipecat’s API methods for Vonage Audio Connector Streams integration
Example Implementation
End-to-end Pipecat example using Vonage Audio Connector
Vonage Audio Connector Documentation
Official Vonage Video API Audio Connector documentation
Vonage Video API Console
Manage Vonage Video API projects
Installation
The VonageFrameSerializer does not require any additional dependencies beyond the core Pipecat library.
Prerequisites
Vonage Video API Account Setup
Before using VonageFrameSerializer, you need:
- Vonage (TokBox) Account: Sign up at Vonage Video API Console
- Vonage Video API Project: Create a project to obtain Project API Key and Project Secret
- Existing Vonage Video Session: A Vonage session must already exist. Sessions can be created using TokBox Playground or Vonage Video API SDKs
Required Environment Variables
- VONAGE_API_KEY: Your Vonage Video API project key
- VONAGE_API_SECRET: Your Vonage Video API project secret
- VONAGE_SESSION_ID: The existing routed session ID
- WS_URI: Public WebSocket endpoint URI of the server application running Pipecat (e.g. via ngrok)
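A server application can check these variables at startup so that a missing value fails early rather than at connection time. The following is a minimal sketch using only the standard library; the helper name load_vonage_settings is hypothetical and not part of Pipecat.

```python
import os

# The four variables listed above.
REQUIRED_VARS = ("VONAGE_API_KEY", "VONAGE_API_SECRET", "VONAGE_SESSION_ID", "WS_URI")


def load_vonage_settings(env=None):
    """Return the required settings as a dict, or raise if any are missing.

    Hypothetical helper for illustration; not part of Pipecat.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Passing a plain dict as env makes the check easy to exercise in tests without touching the real environment.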
Required Configuration
- WebSocket Endpoint (/ws): A WebSocket server application (e.g. FastAPI) running Pipecat that accepts raw PCM audio frames.
- Audio Connector /connect Request: Triggers Vonage to open a WebSocket connection to your server and begin streaming audio from the active session.
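To make the /connect step more concrete, the sketch below assembles the URL and JSON body for such a request. The endpoint path and the field names (sessionId, token, websocket.uri, websocket.audioRate, websocket.bidirectional) are assumptions based on the Vonage Video REST API and should be checked against the official Audio Connector documentation. Authentication headers and the HTTP call itself are deliberately left out, and build_connect_request is a hypothetical helper.

```python
import json


def build_connect_request(api_key, session_id, token, ws_uri, audio_rate=16000):
    """Build the URL and JSON body for an Audio Connector /connect call.

    Assumed shape; verify the endpoint and field names against the Vonage docs.
    """
    url = f"https://api.opentok.com/v2/project/{api_key}/connect"
    body = {
        "sessionId": session_id,
        "token": token,  # a valid token for the existing session
        "websocket": {
            "uri": ws_uri,  # public wss:// URI of the Pipecat /ws endpoint
            "audioRate": audio_rate,  # should match vonage_sample_rate
            "bidirectional": True,  # assumed flag for sending audio back
        },
    }
    return url, json.dumps(body)
```

Keeping audio_rate aligned with the serializer's vonage_sample_rate avoids a mismatch between what Vonage sends and what the serializer expects.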
Key Features
- Bidirectional Audio: Convert between Pipecat and Vonage Audio Connector formats
- Real-Time AI Pipelines: Stream live audio into Pipecat and process it through any real-time pipeline configuration supported by the framework
- Session Control Events: Handle Vonage Audio Connector JSON events
- Linear PCM Audio: Handle raw 16-bit linear PCM audio streams used by the Vonage Video API Audio Connector
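Because the audio is raw 16-bit linear PCM, buffer sizes follow directly from the sample rate. The sketch below works through that arithmetic; the 20 ms frame length and mono channel count are assumptions chosen for illustration, not values taken from this page.

```python
BYTES_PER_SAMPLE = 2  # 16-bit linear PCM


def pcm_frame_bytes(sample_rate_hz, frame_ms=20, channels=1):
    """Size in bytes of one PCM frame of the given duration."""
    samples = sample_rate_hz * frame_ms // 1000
    return samples * BYTES_PER_SAMPLE * channels


def pcm_duration_ms(num_bytes, sample_rate_hz, channels=1):
    """Playback duration in milliseconds of a raw PCM buffer."""
    samples = num_bytes // (BYTES_PER_SAMPLE * channels)
    return samples * 1000 / sample_rate_hz
```

Under these assumptions, a 20 ms frame is 320 bytes at 8000 Hz, 640 bytes at 16000 Hz, and 960 bytes at 24000 Hz.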
Configuration
Configuration parameters for audio settings. See InputParams below.
InputParams
| Parameter | Type | Default | Description |
|---|---|---|---|
| vonage_sample_rate | int | 16000 | Sample rate used by Vonage (Hz). Common values: 8000, 16000, 24000. |
| sample_rate | int | None | Optional override for pipeline input sample rate. When None, uses the pipeline’s configured rate. |
| ignore_rtvi_messages | bool | True | Whether to ignore RTVI protocol messages during serialization. |
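The defaults and the None fallback in the table can be illustrated with a small stand-in. InputParamsSketch is a hypothetical dataclass that only mirrors the documented fields; it is not the real InputParams class, and the two helper functions are illustrative assumptions about how the values relate.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InputParamsSketch:
    """Hypothetical stand-in mirroring the documented fields and defaults."""

    vonage_sample_rate: int = 16000
    sample_rate: Optional[int] = None
    ignore_rtvi_messages: bool = True


def resolve_pipeline_rate(params, pipeline_rate):
    """Use the override when set, otherwise the pipeline's configured rate."""
    return params.sample_rate if params.sample_rate is not None else pipeline_rate


def rates_differ(params, pipeline_rate):
    """True when the Vonage rate and the effective pipeline rate do not match."""
    return resolve_pipeline_rate(params, pipeline_rate) != params.vonage_sample_rate
```

For example, with the defaults and a pipeline configured at 16000 Hz the two rates match, while setting vonage_sample_rate=8000 against the same pipeline means audio has to be converted between rates.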
Usage
Basic Setup
With Custom Sample Rate
Notes
- Linear PCM audio: Unlike Twilio and Plivo, Vonage uses raw 16-bit linear PCM audio (not mu-law encoded). Audio data is sent as binary WebSocket messages rather than base64-encoded JSON.
- No auto hang-up: The Vonage serializer does not include automatic call termination. Session lifecycle is managed through the Vonage Video API.
- Event handling: The serializer handles Vonage-specific WebSocket events including websocket:connected, websocket:cleared, websocket:notify, and websocket:dtmf.
- DTMF support: Touch-tone digit events are converted to InputDTMFFrame objects.
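The notes above imply two kinds of WebSocket traffic: binary messages carrying PCM audio and JSON text messages carrying events. The sketch below shows one way such messages could be told apart. It is not the serializer's actual implementation, and the "event" and "digit" JSON keys are assumptions about the payload shape.

```python
import json

# Event names listed in the notes above.
KNOWN_EVENTS = {
    "websocket:connected",
    "websocket:cleared",
    "websocket:notify",
    "websocket:dtmf",
}


def classify_message(message):
    """Return ("audio", bytes), ("dtmf", digit), ("event", name), or ("unknown", None)."""
    if isinstance(message, (bytes, bytearray)):
        return "audio", bytes(message)  # binary message: raw 16-bit PCM
    try:
        payload = json.loads(message)
    except (TypeError, ValueError):
        return "unknown", None
    if not isinstance(payload, dict):
        return "unknown", None
    event = payload.get("event")  # assumed key name
    if event == "websocket:dtmf":
        return "dtmf", payload.get("digit")  # assumed key name
    if event in KNOWN_EVENTS:
        return "event", event
    return "unknown", None
```

In a real pipeline the serializer performs this routing and emits Pipecat frames, such as InputDTMFFrame for DTMF digits, so application code normally does not need its own dispatcher.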