Overview
SambaNovaSTTService provides speech-to-text capabilities using SambaNova's hosted Whisper API with Voice Activity Detection (VAD) for optimized processing. It processes complete speech segments efficiently to deliver accurate transcription on SambaNova's high-performance inference platform.
- SambaNova STT API Reference: Pipecat's API methods for SambaNova STT integration
- Example Implementation: Complete example with function calling
- SambaNova Documentation: Official SambaNova API documentation and features
- SambaNova Cloud: Access API keys and Whisper models
Installation
To use SambaNova services, install the required dependency.

Prerequisites
SambaNova Account Setup
Before using SambaNova STT services, you need:
- SambaNova Account: Sign up at SambaNova Cloud
- API Key: Generate an API key from your account dashboard
- Model Access: Ensure access to Whisper transcription models
Required Environment Variables
SAMBANOVA_API_KEY: Your SambaNova API key for authentication
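The installation and environment setup described above can be sketched as follows. The extras name `sambanova` follows Pipecat's usual packaging convention for service dependencies and is an assumption, not confirmed by this page:

```shell
# Install Pipecat with the SambaNova extra (extras name assumed)
pip install "pipecat-ai[sambanova]"

# Export the API key so the service can read it from the environment
export SAMBANOVA_API_KEY=your-api-key-here
```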
Configuration
SambaNovaSTTService
- model: Whisper model to use for transcription.
- api_key: SambaNova API key. Falls back to the SAMBANOVA_API_KEY environment variable.
- base_url: API base URL.
- language: Language of the audio input.
- prompt: Optional text to guide the model's style or continue a previous segment.
- temperature: Sampling temperature between 0 and 1. Lower values produce more deterministic results.
Usage
Basic Setup
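A minimal construction might look like the following. The module path `pipecat.services.sambanova.stt` is assumed from Pipecat's usual service layout and should be checked against the installed version:

```python
import os

# Module path assumed; verify against your Pipecat version
from pipecat.services.sambanova.stt import SambaNovaSTTService

# With no explicit api_key, the service falls back to the
# SAMBANOVA_API_KEY environment variable.
stt = SambaNovaSTTService(api_key=os.getenv("SAMBANOVA_API_KEY"))
```

The resulting `stt` processor is then placed in a Pipecat pipeline between the transport input and the LLM stage, like any other STT service.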
With Custom Configuration
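A sketch passing the configuration parameters listed above. The model name `Whisper-Large-v3` is illustrative (check SambaNova Cloud for the models available to your account), and the import paths are assumptions:

```python
# Module paths assumed; verify against your Pipecat version
from pipecat.services.sambanova.stt import SambaNovaSTTService
from pipecat.transcriptions.language import Language

stt = SambaNovaSTTService(
    api_key="your-api-key",        # or rely on SAMBANOVA_API_KEY
    model="Whisper-Large-v3",      # illustrative model name
    language=Language.EN,          # language of the audio input
    prompt="Transcribe technical terms accurately.",
    temperature=0.0,               # lower values are more deterministic
)
```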
Notes
- Segmented transcription: SambaNovaSTTService extends SegmentedSTTService (via BaseWhisperSTTService), processing complete audio segments after VAD detects that the user has stopped speaking.
- Whisper API compatible: Uses the OpenAI-compatible Whisper API interface hosted on SambaNova's infrastructure.
- Probability metrics not supported: SambaNova's Whisper API does not support probability metrics. The include_prob_metrics parameter has no effect.