Overview
FalSTTService provides speech-to-text using Fal’s Wizper API. It relies on Voice Activity Detection (VAD) to send only speech segments for transcription, which reduces API usage and improves response time.
Fal STT API Reference
Pipecat’s API methods for Fal Wizper integration
Example Implementation
Complete example with VAD integration
Fal Documentation
Official Fal Wizper documentation and features
Fal Platform
Access API keys and Wizper models
Installation
To use Fal services, install the required dependency.
Prerequisites
Fal Account Setup
Before using Fal STT services, you need:
- Fal Account: Sign up at Fal Platform
- API Key: Generate an API key from your account dashboard
- Model Access: Ensure access to the Wizper transcription model
Required Environment Variables
FAL_KEY: Your Fal API key for authentication
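A minimal sketch of reading the key at startup and failing early when it is missing. The helper name `require_fal_key` is hypothetical and not part of Pipecat or Fal; only the `FAL_KEY` variable name comes from this page.

```python
import os


def require_fal_key(env=os.environ):
    """Return the Fal API key from FAL_KEY, or raise if it is unset or blank.

    Hypothetical helper for illustration; not a Pipecat or Fal function.
    """
    key = env.get("FAL_KEY", "").strip()
    if not key:
        raise RuntimeError("FAL_KEY is not set; generate a key in your Fal dashboard.")
    return key
```

Checking the variable once at startup surfaces a missing key before any audio is processed, rather than on the first transcription request.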
Configuration
FalSTTService
- Fal API key. If not provided, the FAL_KEY environment variable is used.
- Audio sample rate in Hz. When None, the pipeline’s configured sample rate is used.
- Configuration parameters for the Wizper API. See InputParams below.
- P99 latency from speech end to final transcript, in seconds. Override for your deployment.
InputParams
Parameters passed via the params constructor argument.
| Parameter | Type | Default | Description |
|---|---|---|---|
| language | Language | Language.EN | Language of the audio input. |
| task | str | "transcribe" | Task to perform: "transcribe" or "translate". |
| chunk_level | str | "segment" | Level of chunking for the response. |
| version | str | "3" | Version of the Wizper model to use. |
Usage
Basic Setup
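A minimal sketch of creating the service with default settings. The import path `pipecat.services.fal.stt` and the `api_key` keyword are assumptions about the installed Pipecat version; check them against the API reference. The wrapper function `build_fal_stt` is hypothetical.

```python
import os


def build_fal_stt():
    """Create a FalSTTService that authenticates with the FAL_KEY variable."""
    # Imported lazily so this module still loads where Pipecat is not installed.
    # Assumption: this import path matches your Pipecat version.
    from pipecat.services.fal.stt import FalSTTService

    # Assumption: the API key is accepted as the `api_key` keyword argument.
    return FalSTTService(api_key=os.getenv("FAL_KEY"))
```

The returned service would then be placed in a pipeline after the transport input, with VAD enabled so that speech segments can be detected.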
With Custom Parameters
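A sketch of passing explicit Wizper settings through the params argument, using the fields from the InputParams table. The import paths, the nested `FalSTTService.InputParams` class, and `Language.ES` are assumptions about the installed Pipecat version; the wrapper `build_custom_fal_stt` is hypothetical.

```python
import os


def build_custom_fal_stt():
    """Create a FalSTTService configured for Spanish transcription."""
    # Assumption: these import paths match your Pipecat version.
    from pipecat.services.fal.stt import FalSTTService
    from pipecat.transcriptions.language import Language

    # Field names and values follow the InputParams table on this page.
    params = FalSTTService.InputParams(
        language=Language.ES,      # language of the incoming audio
        task="transcribe",         # keep the text in the source language
        chunk_level="segment",     # chunking level of the response
        version="3",               # Wizper model version
    )
    return FalSTTService(api_key=os.getenv("FAL_KEY"), params=params)
```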
Translation Mode
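A sketch of switching the service to translation by setting task to "translate" and leaving the other parameters at their defaults. As in the other examples, the import path and the nested `FalSTTService.InputParams` class are assumptions about the installed Pipecat version, and `build_translating_fal_stt` is a hypothetical wrapper.

```python
import os


def build_translating_fal_stt():
    """Create a FalSTTService that returns English text for spoken input."""
    # Assumption: this import path matches your Pipecat version.
    from pipecat.services.fal.stt import FalSTTService

    # "translate" asks Wizper for English output instead of a transcript
    # in the source language; all other parameters keep their defaults.
    params = FalSTTService.InputParams(task="translate")
    return FalSTTService(api_key=os.getenv("FAL_KEY"), params=params)
```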
Notes
- Segmented processing: FalSTTService inherits from SegmentedSTTService, which buffers audio during speech (detected by VAD) and sends complete segments for transcription. This means it does not provide interim results, only final transcriptions after each speech segment.
- Translation support: Set task="translate" to translate audio into English, regardless of the input language.
- Wizper versions: The version parameter selects the underlying Whisper model version. Version "3" is the default and is recommended for best accuracy.
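The segmented behavior described in the notes can be pictured with a toy buffer. This is a simplified illustration, not Pipecat's SegmentedSTTService implementation; the class `SegmentBuffer` and its method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SegmentBuffer:
    """Toy model of segmented STT buffering (not Pipecat's implementation)."""

    chunks: List[bytes] = field(default_factory=list)
    speaking: bool = False

    def speech_started(self) -> None:
        # VAD reports speech: start a fresh segment.
        self.chunks.clear()
        self.speaking = True

    def add_audio(self, chunk: bytes) -> None:
        # In this toy model, audio outside a speech segment is dropped.
        if self.speaking:
            self.chunks.append(chunk)

    def speech_stopped(self) -> Optional[bytes]:
        # VAD reports silence: hand back the whole segment in one piece.
        self.speaking = False
        if not self.chunks:
            return None
        segment = b"".join(self.chunks)
        self.chunks.clear()
        return segment
```

Each returned segment would be sent to the transcription API as a single request, which is why no interim results appear while the user is still speaking.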