> For a complete documentation index, fetch https://docs.voximplant.ai/llms.txt

# Overview

## Benefits

The native OpenAI module gives VoxEngine direct access to OpenAI's Realtime, Responses, and Chat Completions APIs.
It also works with OpenAI-compatible APIs for text-based pipelines, so you can keep the same VoxEngine integration pattern while swapping the LLM backend.

Feature highlights:

* Bridge PSTN, SIP, WebRTC, or WhatsApp calls into OpenAI with a single VoxEngine scenario.
* Use the API surface that fits your architecture: Realtime, Responses API, or Chat Completions.
* Run direct realtime speech-to-speech or build half-cascade and full-cascade pipelines.
* Use OpenAI-compatible APIs for text-first pipelines through the same OpenAI module surface.
* Add barge-in, VAD, and turn detection for natural turn-taking in cascade pipelines.

## Demo video

Watch a general demo of the OpenAI Realtime integration: [OpenAI Realtime API demo](https://www.youtube.com/watch?v=ryJAcb1RAFs)

## Architecture

```mermaid
graph TD
  Caller["PSTN / SIP / WebRTC / WhatsApp"] --> Vox["VoxEngine scenario"]
  Vox --> RT["OpenAI Realtime API"]
  Vox --> RESP["OpenAI Responses API<br />or compatible endpoint"]
  Vox --> CHAT["OpenAI Chat Completions API<br />or compatible endpoint"]
  Vox --> STT["Optional STT"]
  Vox --> TTS["Optional TTS"]
```

## Prerequisites

* OpenAI account with API access at [platform.openai.com](https://platform.openai.com/).
* OpenAI API key from [OpenAI API keys](https://platform.openai.com/api-keys).
* Access to the models used in these guides (for example `gpt-realtime-1.5`, `gpt-4o-mini`, or another compatible text model).

## Supported API surfaces

The VoxEngine OpenAI module currently supports three API surfaces, each sketched briefly after this list:

* **Realtime API** for direct speech-to-speech sessions with native input audio, output audio, server-side VAD, and low-latency streaming.
* **Responses API client** for text-first and cascade pipelines, including OpenAI-compatible backends exposed through a custom `baseUrl`.
* **Chat Completions API client** for simpler request/response or streaming text workflows when you do not need the newer Responses API surface.
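
For orientation, here is a minimal sketch of creating each client in a VoxEngine scenario. The factory names and the `apiKey`, `model`, `baseUrl`, and `storeContext` options come from the Development notes below; everything else (option semantics, whether the factories return promises) is an assumption, so check the module API reference.

```javascript
require(Modules.OpenAI);

// Factory names as listed under Development notes below; option shapes
// beyond apiKey, model, baseUrl, and storeContext are assumptions, and
// the factories are shown as synchronous for brevity.
const realtime = OpenAI.createRealtimeAPIClient({
  apiKey: "YOUR_OPENAI_API_KEY",
  model: "gpt-realtime-1.5",
});

const responses = OpenAI.createResponsesAPIClient({
  apiKey: "YOUR_OPENAI_API_KEY",
  storeContext: true, // exact semantics are in the module reference
});

const chat = OpenAI.createChatCompletionsClient({
  apiKey: "YOUR_OPENAI_API_KEY",
  baseUrl: "https://api.openai.com/v1", // or any OpenAI-compatible endpoint
});
```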

## Pipeline options

### Direct realtime

Use the Realtime API when you want the model to handle speech input and speech output directly.
This is the lowest-friction path for native OpenAI voice sessions.
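
A hedged inbound sketch: the factory and `sessionUpdate()` names come from the Development notes below, the call handling is standard VoxEngine, and the media wiring assumes the Realtime client behaves as a regular VoxEngine media unit.

```javascript
require(Modules.OpenAI);

VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
  const call = e.call;

  call.addEventListener(CallEvents.Connected, async () => {
    // Awaiting in case the factory is asynchronous (an assumption).
    const client = await OpenAI.createRealtimeAPIClient({
      apiKey: "YOUR_OPENAI_API_KEY",
      model: "gpt-realtime-1.5",
    });

    // sessionUpdate() is named in the Development notes; the payload
    // fields mirroring the OpenAI Realtime session object is an assumption.
    client.sessionUpdate({ instructions: "You are a helpful voice assistant." });

    // Assumption: the client is a VoxEngine media unit, so two-way
    // media routing gives direct speech-to-speech.
    call.sendMediaTo(client);
    client.sendMediaTo(call);
  });

  call.answer();
});
```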

### Full cascade

Use a full-cascade pipeline when you want to choose separate STT, LLM, and TTS providers.
This is where VAD, turn detection, and helper logic such as `VoxTurnTaking` matter most.
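
A hedged sketch of the wiring, omitting VAD and turn-taking for brevity. The ASR and TTS pieces are standard VoxEngine (the profile and voice are only examples); the Responses-client factory and `storeContext` come from the Development notes below, while the `createResponse` method and `outputText` field are hypothetical placeholders for the real request API.

```javascript
require(Modules.ASR);
require(Modules.OpenAI);

VoxEngine.addEventListener(AppEvents.CallAlerting, (e) => {
  const call = e.call;

  call.addEventListener(CallEvents.Connected, async () => {
    const llm = await OpenAI.createResponsesAPIClient({
      apiKey: "YOUR_OPENAI_API_KEY",
      storeContext: true, // parameter named in the Development notes below
    });

    // Standard VoxEngine ASR; the profile is only an example.
    const asr = VoxEngine.createASR({ profile: ASRProfileList.Google.en_US });
    call.sendMediaTo(asr);

    asr.addEventListener(ASREvents.Result, async (ev) => {
      // HYPOTHETICAL request method and response field — the module
      // reference defines the real OpenAI.ResponsesAPIClient API.
      const reply = await llm.createResponse({ model: "gpt-4o-mini", input: ev.text });

      // Speak the reply with any VoxEngine TTS voice (example voice).
      const player = VoxEngine.createTTSPlayer(reply.outputText, {
        language: VoiceList.Google.en_US_Wavenet_D,
      });
      player.sendMediaTo(call);
    });
  });

  call.answer();
});
```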

### Half cascade

Use a half-cascade pipeline when OpenAI still handles the reasoning but another provider produces the speech output.
This is useful when you want a different voice, broader language coverage, or a different TTS pricing model.
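
The wiring mirrors the full-cascade sketch above; only the speech-output leg changes. A hedged drop-in replacement for that sketch's player block, using an Amazon voice as a stand-in for the non-OpenAI TTS provider (the examples linked below use ElevenLabs, Inworld, and Cartesia instead):

```javascript
// Same scenario as the full-cascade sketch; only the TTS leg differs.
// The Amazon voice is an example stand-in — pick whichever non-OpenAI
// provider you want for speech output.
const player = VoxEngine.createTTSPlayer(replyText, {
  language: VoiceList.Amazon.en_US_Joanna,
});
player.sendMediaTo(call);
```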

## Development notes

* **Realtime API**: create `OpenAI.RealtimeAPIClient` with `OpenAI.createRealtimeAPIClient({ apiKey, model })` and configure the session with `sessionUpdate()`.
* **Responses API client**: create `OpenAI.ResponsesAPIClient` with `OpenAI.createResponsesAPIClient({ apiKey, baseUrl?, storeContext })` for full-cascade or text-first agent flows.
* **Chat Completions API client**: create `OpenAI.ChatCompletionsClient` with `OpenAI.createChatCompletionsClient({ apiKey, baseUrl?, storeContext })` for simpler text workflows.
* **OpenAI-compatible APIs**: Responses and Chat Completions clients can target compatible endpoints via `baseUrl` (see the sketch after this list), but compatibility is vendor-specific and some providers do not support the full stored-context feature set.
* **Barge-in and turn control**: Realtime has built-in speech events. Full-cascade flows usually combine STT with [Voice Activity Detection](/capabilities/speech-flow-control/voice-activity-detection), [Turn Detection](/capabilities/speech-flow-control/turn-detection), and the [Turn Taking Helper Library](/capabilities/speech-flow-control/turn-taking-helper-library).
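
The `baseUrl` override mentioned above is how compatible backends plug in. A minimal sketch using Groq's OpenAI-compatible base URL as the example backend; the `createChatCompletion` method name and the response shape (mirroring the OpenAI Chat Completions schema) are hypothetical stand-ins for the real client API.

```javascript
require(Modules.OpenAI);

VoxEngine.addEventListener(AppEvents.Started, async () => {
  // Point the Chat Completions client at an OpenAI-compatible backend.
  // Groq's base URL is one example; per the note above, some providers
  // do not support the full stored-context feature set.
  const llm = OpenAI.createChatCompletionsClient({
    apiKey: "YOUR_PROVIDER_API_KEY",
    baseUrl: "https://api.groq.com/openai/v1",
  });

  // HYPOTHETICAL method name and response shape — check the module
  // reference for the real call signature.
  const completion = await llm.createChatCompletion({
    model: "llama-3.3-70b-versatile", // example model id on the provider side
    messages: [{ role: "user", content: "Say hello in one sentence." }],
  });

  Logger.write(completion.choices[0].message.content);
});
```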

See the OpenAI API references for full details:

* [Realtime API reference](https://platform.openai.com/docs/api-reference/realtime)
* [Responses API reference](https://platform.openai.com/docs/api-reference/responses)
* [Chat Completions API reference](https://platform.openai.com/docs/api-reference/chat)

## Examples

* [Example: Answering an incoming call](inbound)
* [Example: Placing an outbound call](outbound)
* [Example: Function calling](function-calling)
* [Example: Full-cascade with Groq](full-cascade-groq)
* [Example: Half-cascade with ElevenLabs](half-cascade-elevenlabs)
* [Example: Half-cascade with Inworld](half-cascade-inworld)
* [Example: Half-cascade with Cartesia](half-cascade-cartesia)

## Links

### Voximplant

* OpenAI Voice AI connector: [https://voximplant.com/docs/voice-ai/openai](https://voximplant.com/docs/voice-ai/openai)
* OpenAI module API reference: [https://voximplant.com/docs/references/voxengine/openai](https://voximplant.com/docs/references/voxengine/openai)
* OpenAI product page: [https://voximplant.com/products/openai-client](https://voximplant.com/products/openai-client)
* Voice AI product overview: [https://voximplant.ai/](https://voximplant.ai/)

### OpenAI

* Realtime API reference: [https://platform.openai.com/docs/api-reference/realtime](https://platform.openai.com/docs/api-reference/realtime)
* Responses API reference: [https://platform.openai.com/docs/api-reference/responses](https://platform.openai.com/docs/api-reference/responses)
* Chat Completions API reference: [https://platform.openai.com/docs/api-reference/chat](https://platform.openai.com/docs/api-reference/chat)
* Realtime events (client/server): [https://platform.openai.com/docs/api-reference/realtime-client-events](https://platform.openai.com/docs/api-reference/realtime-client-events)
* Realtime guide: [https://platform.openai.com/docs/guides/realtime](https://platform.openai.com/docs/guides/realtime)