Overview

AssemblyAISTTService provides real-time speech recognition using AssemblyAI’s WebSocket API with support for interim results, end-of-turn detection, and configurable audio processing parameters for accurate transcription in conversational AI applications.

Installation

To use AssemblyAI services, install the required dependency:
pip install "pipecat-ai[assemblyai]"

Prerequisites

AssemblyAI Account Setup

Before using AssemblyAI STT services, you need:
  1. AssemblyAI Account: Sign up at AssemblyAI Console
  2. API Key: Generate an API key from your dashboard
  3. Model Selection: Choose from available transcription models and features

Required Environment Variables

  • ASSEMBLYAI_API_KEY: Your AssemblyAI API key for authentication
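A missing key otherwise only surfaces when the WebSocket connection is rejected, so it can help to check for it at startup. The following is a minimal sketch; the helper name load_assemblyai_api_key is hypothetical and not part of Pipecat.

```python
import os


def load_assemblyai_api_key(env=None):
    """Return the key from ASSEMBLYAI_API_KEY, or raise a clear error.

    Hypothetical helper for illustration; `env` defaults to os.environ
    and can be replaced with a plain dict in tests.
    """
    env = os.environ if env is None else env
    key = env.get("ASSEMBLYAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ASSEMBLYAI_API_KEY is not set; export it before starting the bot."
        )
    return key
```

The returned value can then be passed as the api_key argument of AssemblyAISTTService.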

Configuration

AssemblyAISTTService

  • api_key (str, required): AssemblyAI API key for authentication.
  • language (Language, default: Language.EN): Language code for transcription. The default speech model supports English only; set speech_model="universal-streaming-multilingual" in connection_params for other languages.
  • api_endpoint_base_url (str, default: "wss://streaming.assemblyai.com/v3/ws"): WebSocket endpoint URL. Override for custom or proxied deployments.
  • connection_params (AssemblyAIConnectionParams, default: AssemblyAIConnectionParams()): Connection configuration parameters. See AssemblyAIConnectionParams below.
  • vad_force_turn_endpoint (bool, default: True): Whether to force a turn endpoint when the VAD reports that speech has stopped. When True, AssemblyAI’s model-based turn detection is disabled and external VAD triggers turn endpoints. This automatically sets end_of_turn_confidence_threshold=1.0 and max_turn_silence=2000 unless you override them explicitly.
  • ttfs_p99_latency (float, default: ASSEMBLYAI_TTFS_P99): P99 latency, in seconds, from the end of speech to the final transcript. Override for your deployment.
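The interaction between vad_force_turn_endpoint and the two turn-detection settings can be sketched as a small pure function. This is an illustration of the rule described above, not the service's actual code; the function name resolve_turn_settings is hypothetical.

```python
from typing import Optional, Tuple


def resolve_turn_settings(
    vad_force_turn_endpoint: bool,
    end_of_turn_confidence_threshold: Optional[float] = None,
    max_turn_silence: Optional[int] = None,
) -> Tuple[Optional[float], Optional[int]]:
    """Illustrative only: apply the documented VAD-mode defaults.

    Values the caller set explicitly are kept; only unset (None)
    values are filled in when VAD-forced endpoints are enabled.
    """
    if not vad_force_turn_endpoint:
        return end_of_turn_confidence_threshold, max_turn_silence
    threshold = (
        1.0
        if end_of_turn_confidence_threshold is None
        else end_of_turn_confidence_threshold
    )
    silence = 2000 if max_turn_silence is None else max_turn_silence
    return threshold, silence
```

For example, resolve_turn_settings(True, max_turn_silence=800) keeps the explicit 800 ms silence limit and fills in only the confidence threshold.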

AssemblyAIConnectionParams

Connection-level parameters passed via the connection_params constructor argument.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| sample_rate | int | 16000 | Audio sample rate in Hz. |
| encoding | Literal | "pcm_s16le" | Audio encoding format. Options: "pcm_s16le", "pcm_mulaw". |
| formatted_finals | bool | True | Whether to enable transcript formatting. |
| word_finalization_max_wait_time | int | None | Maximum time to wait for word finalization in milliseconds. |
| end_of_turn_confidence_threshold | float | None | Confidence threshold for end-of-turn detection. |
| min_end_of_turn_silence_when_confident | int | None | Minimum silence duration (ms) when confident about end-of-turn. |
| max_turn_silence | int | None | Maximum silence duration (ms) before forcing end-of-turn. |
| keyterms_prompt | List[str] | None | List of key terms to guide transcription. |
| speech_model | Literal | "universal-streaming-english" | Speech model. Options: "universal-streaming-english", "universal-streaming-multilingual". |
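As a rough sketch of how such parameters might reach the server, the function below turns a parameter dict into a query string on the WebSocket URL. That the parameters travel as query parameters, and the way booleans and lists are encoded here, are assumptions for illustration; build_streaming_url is a hypothetical helper, not a Pipecat function.

```python
import json
from urllib.parse import urlencode


def build_streaming_url(base_url: str, params: dict) -> str:
    """Hypothetical helper: encode connection parameters as a query string."""
    query = {}
    for name, value in params.items():
        if value is None:
            continue  # unset options are omitted so the server default applies
        if isinstance(value, bool):
            query[name] = str(value).lower()  # assumed encoding: true/false
        elif isinstance(value, list):
            query[name] = json.dumps(value)  # assumed encoding: JSON array
        else:
            query[name] = value
    return f"{base_url}?{urlencode(query)}" if query else base_url
```

Parameters left at None never appear in the URL, which matches the table above: a None default means the option is simply not sent.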

Usage

Basic Setup

import os

from pipecat.services.assemblyai import AssemblyAISTTService

# Read the API key from the environment rather than hard-coding it.
stt = AssemblyAISTTService(
    api_key=os.getenv("ASSEMBLYAI_API_KEY"),
)

With Custom Connection Parameters

import os

from pipecat.services.assemblyai import AssemblyAISTTService
from pipecat.services.assemblyai.models import AssemblyAIConnectionParams

stt = AssemblyAISTTService(
    api_key=os.getenv("ASSEMBLYAI_API_KEY"),
    connection_params=AssemblyAIConnectionParams(
        sample_rate=16000,
        formatted_finals=True,
        # Bias recognition toward domain-specific terms.
        keyterms_prompt=["Pipecat", "AssemblyAI"],
        # Switch from the English-only default to the multilingual model.
        speech_model="universal-streaming-multilingual",
    ),
    # Default value, shown explicitly: external VAD decides when a turn ends.
    vad_force_turn_endpoint=True,
)

Notes

  • English only by default: AssemblyAI’s default model supports English. Use speech_model="universal-streaming-multilingual" in connection_params for multilingual support.
  • VAD turn endpoint mode: When vad_force_turn_endpoint=True (the default), AssemblyAI’s model-based turn detection is disabled in favor of external VAD. This sends a ForceEndpoint message when the VAD detects the user has stopped speaking.
  • Formatted finals: When formatted_finals=True, the service waits for formatted transcripts before emitting final TranscriptionFrames. This provides properly formatted text but may introduce a small delay.
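The last two notes can be sketched as two small helpers: one builds the ForceEndpoint message, the other decides whether a turn update should be treated as final. The message shape and the field names end_of_turn and turn_is_formatted are assumptions about the streaming protocol, and both function names are hypothetical; this is not the service's implementation.

```python
import json


def force_endpoint_message() -> str:
    """Assumed wire format of the message sent when the VAD reports silence."""
    return json.dumps({"type": "ForceEndpoint"})


def is_final_transcript(turn: dict, formatted_finals: bool) -> bool:
    """Decide whether a turn update should be emitted as a final transcript.

    Assumes each update carries `end_of_turn` and `turn_is_formatted` flags.
    """
    if not turn.get("end_of_turn", False):
        return False  # still mid-turn: treat as an interim result
    if formatted_finals:
        # Wait for the formatted version, which arrives slightly later.
        return bool(turn.get("turn_is_formatted", False))
    return True
```

With formatted_finals=True, an unformatted end-of-turn update is held back, which is where the small delay mentioned above comes from.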

Event Handlers

AssemblyAI STT supports the standard service connection events:
| Event | Description |
| --- | --- |
| on_connected | Connected to AssemblyAI WebSocket |
| on_disconnected | Disconnected from AssemblyAI WebSocket |

@stt.event_handler("on_connected")
async def on_connected(service):
    print("Connected to AssemblyAI")