Overview

OpenAILLMService provides chat completion capabilities using OpenAI's API. It supports streaming responses, function calling, vision input, and advanced context management for conversational AI applications.

Installation

To use OpenAI services, install the required dependencies:
pip install "pipecat-ai[openai]"

Prerequisites

OpenAI Account Setup

Before using OpenAI LLM services, you need:
  1. OpenAI Account: Sign up at OpenAI Platform
  2. API Key: Generate an API key from your account dashboard
  3. Model Selection: Choose from available models (GPT-4.1, GPT-4o, GPT-4o-mini, etc.)
  4. Usage Limits: Set up billing and usage limits as needed

Required Environment Variables

  • OPENAI_API_KEY: Your OpenAI API key for authentication
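For example, you can export the key in your shell before starting your app (the key value below is a placeholder):

```shell
export OPENAI_API_KEY="sk-your-key-here"
```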

Configuration

model
str
default:"gpt-4.1"
OpenAI model name to use (e.g., "gpt-4.1", "gpt-4o", "gpt-4o-mini").
api_key
str
default:"None"
OpenAI API key. If None, uses the OPENAI_API_KEY environment variable.
base_url
str
default:"None"
Custom base URL for the OpenAI API. Override for proxied or self-hosted deployments.
organization
str
default:"None"
OpenAI organization ID.
project
str
default:"None"
OpenAI project ID.
default_headers
Mapping[str, str]
default:"None"
Additional HTTP headers to include in every request.
params
InputParams
default:"None"
Runtime-configurable model settings. See InputParams below.
retry_timeout_secs
float
default:"5.0"
Request timeout in seconds. Used when retry_on_timeout is enabled to determine when to retry.
retry_on_timeout
bool
default:"False"
Whether to retry the request once if it times out. The retry attempt has no timeout limit.
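The timeout-and-retry behavior described by these two settings can be sketched generically. The helper below is illustrative only, not Pipecat's internal implementation:

```python
import asyncio


async def request_with_retry(make_request, retry_timeout_secs=5.0, retry_on_timeout=False):
    """Run make_request() under a timeout; optionally retry once with no limit.

    make_request is a factory that returns a fresh awaitable on each call,
    since the first attempt's coroutine is cancelled on timeout.
    """
    try:
        return await asyncio.wait_for(make_request(), timeout=retry_timeout_secs)
    except asyncio.TimeoutError:
        if not retry_on_timeout:
            raise
        # Second attempt runs without a timeout limit.
        return await make_request()
```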

InputParams

Model inference settings that can be set at initialization via the params constructor argument, or changed at runtime via UpdateSettingsFrame.
| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `frequency_penalty` | float | NOT_GIVEN | Penalty for frequent tokens (-2.0 to 2.0). Positive values discourage repetition. |
| `presence_penalty` | float | NOT_GIVEN | Penalty for new topics (-2.0 to 2.0). Positive values encourage the model to talk about new topics. |
| `seed` | int | NOT_GIVEN | Random seed for deterministic outputs. |
| `temperature` | float | NOT_GIVEN | Sampling temperature (0.0 to 2.0). Lower values are more focused, higher values are more creative. |
| `top_k` | int | None | Top-k sampling parameter. Currently ignored by the OpenAI client library. |
| `top_p` | float | NOT_GIVEN | Top-p (nucleus) sampling (0.0 to 1.0). Controls diversity of output. |
| `max_tokens` | int | NOT_GIVEN | Maximum tokens in response. Deprecated — use `max_completion_tokens` instead. |
| `max_completion_tokens` | int | NOT_GIVEN | Maximum completion tokens to generate. |
| `service_tier` | str | NOT_GIVEN | Service tier (e.g., "auto", "flex", "priority"). Controls latency and cost tradeoffs. |
| `extra` | dict | {} | Additional model-specific parameters passed directly to the API. |
NOT_GIVEN values are omitted from the API request entirely, letting the OpenAI API use its own defaults. This is different from None, which would be sent explicitly.
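The distinction can be illustrated with a small sketch. The sentinel below is a local stand-in for the OpenAI SDK's `NOT_GIVEN`, and `build_request_kwargs` is a hypothetical helper, not part of Pipecat:

```python
NOT_GIVEN = object()  # stand-in sentinel; the real SDK exports openai.NOT_GIVEN


def build_request_kwargs(**params):
    """Drop NOT_GIVEN entries entirely; keep explicit None values."""
    return {k: v for k, v in params.items() if v is not NOT_GIVEN}


kwargs = build_request_kwargs(temperature=0.7, seed=NOT_GIVEN, max_completion_tokens=None)
# temperature is kept, seed is omitted from the request,
# max_completion_tokens is sent explicitly as null.
```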

Usage

Basic Setup

import os

from pipecat.services.openai import OpenAILLMService

llm = OpenAILLMService(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4o",
)

With Custom Parameters

import os

from pipecat.services.openai import OpenAILLMService

llm = OpenAILLMService(
    api_key=os.getenv("OPENAI_API_KEY"),
    model="gpt-4.1",
    params=OpenAILLMService.InputParams(
        temperature=0.7,
        max_completion_tokens=1000,
        frequency_penalty=0.5,
    ),
)

Updating Settings at Runtime

Model settings can be changed mid-conversation using UpdateSettingsFrame:
from pipecat.frames.frames import UpdateSettingsFrame

await task.queue_frame(
    UpdateSettingsFrame(
        settings={
            "llm": {
                "temperature": 0.3,
                "max_completion_tokens": 500,
            }
        }
    )
)

Notes

  • OpenAI-compatible providers: Many third-party LLM providers offer OpenAI-compatible APIs. You can use OpenAILLMService with them by setting base_url to the provider’s endpoint.
  • Retry behavior: When retry_on_timeout=True, the first attempt uses the retry_timeout_secs timeout. If it times out, a second attempt is made with no timeout limit.
  • Function calling: Supports OpenAI’s tool/function calling format. Register function handlers on the pipeline task to handle tool calls automatically.
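As a sketch of the first point, pointing the service at an OpenAI-compatible endpoint only requires `base_url`. The endpoint URL, environment variable, and model name below are placeholders — substitute your provider's actual values:

```python
import os

from pipecat.services.openai import OpenAILLMService

# Placeholder endpoint and model name for an OpenAI-compatible provider.
llm = OpenAILLMService(
    api_key=os.getenv("PROVIDER_API_KEY"),
    base_url="https://api.example-provider.com/v1",
    model="provider-model-name",
)
```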

Event Handlers

OpenAILLMService supports the following event handlers, inherited from LLMService:
| Event | Description |
| --- | --- |
| `on_completion_timeout` | Called when an LLM completion request times out |
| `on_function_calls_started` | Called when function calls are received and execution is about to start |
@llm.event_handler("on_completion_timeout")
async def on_completion_timeout(service):
    print("LLM completion timed out")