Overview
OpenAILLMService provides chat completion capabilities using OpenAI’s API, supporting streaming responses, function calling, vision input, and advanced context management for conversational AI applications with state-of-the-art language models.
- OpenAI LLM API Reference: Pipecat's API methods for OpenAI integration
- Example Implementation: function calling example with weather API
- OpenAI Documentation: official OpenAI API documentation
- OpenAI Platform: access models and manage API keys
Installation
To use OpenAI services, install Pipecat with the OpenAI extra: `pip install "pipecat-ai[openai]"`.

Prerequisites
OpenAI Account Setup
Before using OpenAI LLM services, you need:
- OpenAI Account: Sign up at OpenAI Platform
- API Key: Generate an API key from your account dashboard
- Model Selection: Choose from available models (GPT-4.1, GPT-4o, GPT-4o-mini, etc.)
- Usage Limits: Set up billing and usage limits as needed
Required Environment Variables
OPENAI_API_KEY: Your OpenAI API key for authentication
Configuration
Constructor parameters:

- `model`: OpenAI model name to use (e.g., "gpt-4.1", "gpt-4o", "gpt-4o-mini").
- `api_key`: OpenAI API key. If None, uses the OPENAI_API_KEY environment variable.
- `base_url`: Custom base URL for the OpenAI API. Override for proxied or self-hosted deployments.
- `organization`: OpenAI organization ID.
- `project`: OpenAI project ID.
- `default_headers`: Additional HTTP headers to include in every request.
- `params`: Runtime-configurable model settings. See InputParams below.
- `retry_timeout_secs`: Request timeout in seconds. Used when `retry_on_timeout` is enabled to determine when to retry.
- `retry_on_timeout`: Whether to retry the request once if it times out. The retry attempt has no timeout limit.
InputParams
Model inference settings that can be set at initialization via the `params` constructor argument, or changed at runtime via an `UpdateSettingsFrame`.
| Parameter | Type | Default | Description |
|---|---|---|---|
| `frequency_penalty` | float | NOT_GIVEN | Penalty for frequent tokens (-2.0 to 2.0). Positive values discourage repetition. |
| `presence_penalty` | float | NOT_GIVEN | Penalty for new topics (-2.0 to 2.0). Positive values encourage the model to talk about new topics. |
| `seed` | int | NOT_GIVEN | Random seed for deterministic outputs. |
| `temperature` | float | NOT_GIVEN | Sampling temperature (0.0 to 2.0). Lower values are more focused, higher values are more creative. |
| `top_k` | int | None | Top-k sampling parameter. Currently ignored by the OpenAI client library. |
| `top_p` | float | NOT_GIVEN | Top-p (nucleus) sampling (0.0 to 1.0). Controls diversity of output. |
| `max_tokens` | int | NOT_GIVEN | Maximum tokens in response. Deprecated; use `max_completion_tokens` instead. |
| `max_completion_tokens` | int | NOT_GIVEN | Maximum completion tokens to generate. |
| `service_tier` | str | NOT_GIVEN | Service tier (e.g., "auto", "flex", "priority"). Controls latency and cost tradeoffs. |
| `extra` | dict | {} | Additional model-specific parameters passed directly to the API. |
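The NOT_GIVEN defaults in the table above can be illustrated with a small, self-contained sketch; the `NotGiven` sentinel here is a stand-in for the one exported by the `openai` Python package, not the real class:

```python
# Sketch of NOT_GIVEN semantics: parameters left as the sentinel are
# dropped from the request payload entirely, so the API applies its own
# defaults, while an explicit None would be serialized and sent.

class NotGiven:
    """Sentinel distinct from None, mirroring openai.NOT_GIVEN."""
    def __repr__(self):
        return "NOT_GIVEN"

NOT_GIVEN = NotGiven()

def build_request(model, temperature=NOT_GIVEN, top_p=NOT_GIVEN, seed=NOT_GIVEN):
    params = {"model": model, "temperature": temperature, "top_p": top_p, "seed": seed}
    # Omit sentinel values; keep everything else, including explicit None.
    return {k: v for k, v in params.items() if not isinstance(v, NotGiven)}

print(build_request("gpt-4o", temperature=0.7))
# {'model': 'gpt-4o', 'temperature': 0.7}
print(build_request("gpt-4o", seed=None))
# {'model': 'gpt-4o', 'seed': None}
```

Note that `seed=None` survives the filter: only the sentinel is omitted, which is exactly the distinction described below.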
NOT_GIVEN values are omitted from the API request entirely, letting the OpenAI API use its own defaults. This is different from None, which would be sent explicitly.

Usage
Basic Setup
With Custom Parameters
Updating Settings at Runtime
Model settings can be changed mid-conversation using an `UpdateSettingsFrame`:
Notes
- OpenAI-compatible providers: Many third-party LLM providers offer OpenAI-compatible APIs. You can use `OpenAILLMService` with them by setting `base_url` to the provider's endpoint.
- Retry behavior: When `retry_on_timeout=True`, the first attempt uses the `retry_timeout_secs` timeout. If it times out, a second attempt is made with no timeout limit.
- Function calling: Supports OpenAI's tool/function calling format. Register function handlers on the LLM service to handle tool calls automatically.
Event Handlers
OpenAILLMService supports the following event handlers, inherited from LLMService:
| Event | Description |
|---|---|
| `on_completion_timeout` | Called when an LLM completion request times out |
| `on_function_calls_started` | Called when function calls are received and execution is about to start |