Overview

FishAudioTTSService provides real-time text-to-speech synthesis through Fish Audio’s WebSocket-based streaming API. It supports custom voice models, prosody controls, and multiple audio output formats, and is intended for low-latency conversational AI applications.

Installation

To use Fish Audio services, install the required dependencies:
pip install "pipecat-ai[fish]"

Prerequisites

Fish Audio Account Setup

Before using Fish Audio TTS services, you need:
  1. Fish Audio Account: Sign up at Fish Audio Console
  2. API Key: Generate an API key from your account dashboard
  3. Voice Models: Create or select custom voice models for synthesis

Required Environment Variables

  • FISH_API_KEY: Your Fish Audio API key for authentication
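
A minimal sketch of reading the key at startup and failing early when it is missing. The helper name load_fish_api_key is hypothetical and not part of Pipecat; only the FISH_API_KEY variable name comes from this page.

```python
import os


def load_fish_api_key(env=None):
    """Return the Fish Audio API key, or raise a clear error if it is unset.

    Illustrative helper only; this is not a Pipecat function.
    """
    # Allow a plain dict to be passed in for testing; default to the real environment.
    env = os.environ if env is None else env
    key = env.get("FISH_API_KEY", "").strip()
    if not key:
        raise RuntimeError("FISH_API_KEY is not set; export it before starting the bot.")
    return key
```

Checking once at startup gives a readable error message instead of an authentication failure later, when the WebSocket connection is opened.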

Configuration

FishAudioTTSService

  • api_key (str, required): Fish Audio API key for authentication.
  • reference_id (str, default: None): Reference ID of the voice model to use for synthesis.
  • model_id (str, default: "s1"): Fish Audio TTS model to use.
  • output_format (str, default: "pcm"): Audio output format. Options: "pcm", "opus", "mp3", "wav".
  • sample_rate (int, default: None): Output audio sample rate in Hz. When None, uses the pipeline’s configured sample rate.
  • params (InputParams, default: None): Runtime-configurable voice settings. See InputParams below.

InputParams

Voice and generation settings that can be set at initialization via the params constructor argument.
| Parameter | Type | Default | Description |
|---|---|---|---|
| language | Language | Language.EN | Language for synthesis. |
| latency | str | "normal" | Latency mode: "normal" or "balanced". |
| normalize | bool | True | Whether to normalize audio output. |
| prosody_speed | float | 1.0 | Speech speed multiplier (0.5-2.0). |
| prosody_volume | int | 0 | Volume adjustment in dB. |
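
The documented ranges can be checked before the settings reach the service. The sketch below is one way to do that; check_prosody is a hypothetical helper, and whether the service performs its own validation is not stated here, so treat this as an optional guard on the caller's side.

```python
def check_prosody(prosody_speed=1.0, prosody_volume=0, latency="normal"):
    """Validate prosody settings against the ranges documented in the table above.

    Hypothetical helper, not part of Pipecat. Returns the settings as a dict.
    """
    # Speed multiplier is documented as 0.5-2.0.
    if not 0.5 <= prosody_speed <= 2.0:
        raise ValueError(f"prosody_speed must be within 0.5-2.0, got {prosody_speed}")
    # Latency mode is documented as "normal" or "balanced".
    if latency not in ("normal", "balanced"):
        raise ValueError(f"latency must be 'normal' or 'balanced', got {latency!r}")
    return {
        "prosody_speed": prosody_speed,
        "prosody_volume": prosody_volume,
        "latency": latency,
    }
```

The returned dict uses the same field names as InputParams, so it could be unpacked as FishAudioTTSService.InputParams(**settings).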

Usage

Basic Setup

import os

from pipecat.services.fish import FishAudioTTSService

# Read the API key from the environment rather than hard-coding it.
tts = FishAudioTTSService(
    api_key=os.getenv("FISH_API_KEY"),
    reference_id="your-voice-reference-id",
)

With Prosody Controls

tts = FishAudioTTSService(
    api_key=os.getenv("FISH_API_KEY"),
    reference_id="your-voice-reference-id",
    model_id="s1",
    params=FishAudioTTSService.InputParams(
        prosody_speed=1.2,
        prosody_volume=3,
        latency="balanced",
    ),
)

Notes

  • Voice selection: Specify exactly one of reference_id (preferred) or the deprecated model parameter. Passing both raises a ValueError.
  • Model switching: Changing the model via set_model() automatically disconnects and reconnects the WebSocket with the new model configuration.
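
The voice-selection rule in the first note can be illustrated with a small standalone function. This is a sketch of the rule as described, not the service's actual implementation: resolve_voice_id is a hypothetical name, and raising ValueError when neither argument is given is an assumption (the note only states the error type for the case where both are passed).

```python
def resolve_voice_id(reference_id=None, model=None):
    """Return the voice ID to use, enforcing 'exactly one of reference_id / model'.

    Hypothetical illustration of the rule; the real constructor may differ.
    """
    if reference_id and model:
        # Documented behavior: passing both raises a ValueError.
        raise ValueError("Pass either reference_id or model, not both.")
    if not reference_id and not model:
        # Assumption: the error type for this case is not documented on this page.
        raise ValueError("A voice is required: pass reference_id.")
    return reference_id or model
```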

Event Handlers

Fish Audio TTS supports the standard service connection events:
| Event | Description |
|---|---|
| on_connected | Connected to Fish Audio WebSocket |
| on_disconnected | Disconnected from Fish Audio WebSocket |
| on_connection_error | WebSocket connection error occurred |

@tts.event_handler("on_connected")
async def on_connected(service):
    print("Connected to Fish Audio")
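
All three events can be handled the same way. The sketch below registers a logging handler for each one on any object that exposes the event_handler decorator shown above. The function name register_connection_logging is hypothetical, and the handlers accept extra positional arguments because the exact arguments passed to each event (for example an error message) are an assumption here.

```python
import logging

logger = logging.getLogger("fish_tts")


def register_connection_logging(tts):
    """Attach logging handlers for the three connection events listed above."""

    @tts.event_handler("on_connected")
    async def on_connected(service, *args):
        logger.info("Connected to Fish Audio")

    @tts.event_handler("on_disconnected")
    async def on_disconnected(service, *args):
        logger.warning("Disconnected from Fish Audio")

    @tts.event_handler("on_connection_error")
    async def on_connection_error(service, *args):
        # Any extra arguments (such as an error description) are logged as-is.
        logger.error("Fish Audio connection error: %s", args)
```

Calling register_connection_logging(tts) once, right after constructing the service, keeps connection diagnostics in one place.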