# Grok

Grok provides a VoxEngine client for connecting a call or media unit to the xAI Voice Agent API over WebSocket.

Use `Grok.createVoiceAgentAPIClient(...)` to create a `VoiceAgentAPIClient` for the current scenario. After the client is created, call methods such as `sendMediaTo`, `responseCreate`, and `addEventListener` on that client instance.

## Contents

* [Usage](#usage): required module import and basic flow.
* [Factory functions](#factory-functions): create the Grok client.
* [VoiceAgentAPIClientParameters](#factory-functions): API key, tracing, privacy, and WebSocket options.
* [VoiceAgentAPIClient](#voiceagentapiclient): runtime client object returned by the factory.
* [Methods](#methods): media, response, and connection control methods.
* [Events](#events): WebSocket media start and end payloads.
* [VoiceAgentAPIEvents](#voiceagentapievents): Grok Voice Agent API event names and payload fields.

## Usage

Add the module before using the namespace:

```js
require(Modules.Grok);
```

Create the client, bridge media, and listen for both WebSocket media events and Voice Agent API events.
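
The flow above can be sketched as follows. This is a minimal illustration, not a complete scenario: `connectCallToGrok` is a hypothetical helper, `grok` stands for the `Grok` namespace, `log` is any logging function, and passing the client to `call.sendMediaTo(...)` assumes the call accepts the client as a media unit.

```javascript
// Hypothetical helper: creates a Grok client and wires it to a call.
// `grok` is the Grok namespace (available after require(Modules.Grok)),
// `call` is a VoxEngine call, and `log` is any logging function.
async function connectCallToGrok(grok, call, xAIApiKey, log) {
  // The factory returns a Promise, so wait for the client instance.
  const client = await grok.createVoiceAgentAPIClient({ xAIApiKey });

  // Bridge media in both directions.
  // Assumption: the call accepts the client as a media unit.
  call.sendMediaTo(client); // caller audio -> Grok
  client.sendMediaTo(call); // Grok audio -> caller

  // WebSocket media events (Grok.Events).
  client.addEventListener(grok.Events.WebSocketMediaStarted, () => log('Grok audio started'));
  client.addEventListener(grok.Events.WebSocketMediaEnded, () => log('Grok audio ended'));

  // Voice Agent API events (Grok.VoiceAgentAPIEvents).
  client.addEventListener(grok.VoiceAgentAPIEvents.ConversationCreated, (event) => {
    log('Conversation created: ' + event.data.conversation.id);
  });

  return client;
}

// Usage inside a VoxEngine scenario (not executed here):
// VoxEngine.addEventListener(AppEvents.CallAlerting, async ({ call }) => {
//   call.answer();
//   await connectCallToGrok(Grok, call, 'YOUR_XAI_API_KEY', (m) => Logger.write(m));
// });
```

Passing the namespace and logger as arguments keeps the helper easy to exercise outside a live scenario; inside a scenario you can reference `Grok` and `Logger` directly.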

## Factory functions

### createVoiceAgentAPIClient

Creates a new [Grok.VoiceAgentAPIClient](/api-reference/voxengine/grok#voiceagentapiclient) instance.

```ts
createVoiceAgentAPIClient(parameters: {
  statistics?: boolean;
  trace?: boolean;
  privacy?: boolean;
  onWebSocketClose?: (event: object) => void;
  xAIApiKey: string;
  model?: string;
}): Promise<Grok.VoiceAgentAPIClient>
```

The required `parameters` object is typed as <code>Grok.VoiceAgent<wbr />APIClient<wbr />Parameters</code>.

**Parameters**

| Parameter                              | Type                                                     | Req. | Description                                                                                                                                                                                                                                                                                                                                                                                           |
| -------------------------------------- | -------------------------------------------------------- | ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `parameters`                           | <code>VoiceAgent<wbr />APIClient<wbr />Parameters</code> | ✓    | [Grok.VoiceAgentAPIClient](/api-reference/voxengine/grok#voiceagentapiclient) parameters. Can be passed as arguments to the `Grok.createVoiceAgentAPIClient` method.                                                                                                                                                                                                                                  |
| ↳ `statistics`                         | `boolean`                                                | ✗    | Enables statistics functionality.                                                                                                                                                                                                                                                                                                                                                                     |
| ↳ `trace`                              | `boolean`                                                | ✗    | Whether to enable tracing. If tracing is enabled, a URL to the trace file appears in the `websocket.created` message. The file contains all sent and received WebSocket messages in plain text and is uploaded to S3 storage. NOTE: enable this only for diagnostic purposes. You can provide the trace file to our support team to help investigate issues. |
| ↳ `privacy`                            | `boolean`                                                | ✗    | Whether to enable the privacy functionality. If privacy is enabled, the logging for the WebSocket connection is disabled. NOTE: the default value is **false**.                                                                                                                                                                                                                                       |
| ↳ <code>onWebSocket<wbr />Close</code> | <code>(event: object) => void</code>                     | ✗    | A callback function that is called when the [WebSocket](/api-reference/voxengine/web-socket) connection is closed.                                                                                                                                                                                                                                                                                    |
| ↳ `xAIApiKey`                          | `string`                                                 | ✓    | The xAI API key for the Grok Voice Agent API. |
| ↳ `model`                              | `string`                                                 | ✗    | The model to use for the Grok Voice Agent API. See [model selection](https://docs.x.ai/developers/model-capabilities/audio/voice-agent#model-selection). NOTE: the default value is **grok-voice-fast-1.0**. |

**Returns**

| Type                                            | Description                                                                  |
| ----------------------------------------------- | ---------------------------------------------------------------------------- |
| <code>Promise\<Grok.VoiceAgentAPIClient></code> | Resolves to the [`Grok.VoiceAgentAPIClient`](#voiceagentapiclient) instance. |
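
As an illustration of the parameters table, the sketch below assembles a parameters object for the factory. `buildGrokClientParameters` and its `debug`, `privacy`, and `onClose` options are hypothetical names chosen for this example; only the keys of the returned object come from the table above. `model` is left out so the documented default applies.

```javascript
// Hypothetical helper: builds the object passed to Grok.createVoiceAgentAPIClient.
function buildGrokClientParameters(xAIApiKey, { debug = false, privacy = false, onClose } = {}) {
  const parameters = {
    xAIApiKey,      // required
    trace: debug,   // enable only for diagnostics
    privacy,        // true disables logging for the WebSocket connection
  };
  // Attach the close callback only when a function was supplied.
  if (typeof onClose === 'function') {
    parameters.onWebSocketClose = onClose;
  }
  return parameters;
}

// Usage inside a scenario (not executed here):
// const client = await Grok.createVoiceAgentAPIClient(
//   buildGrokClientParameters('YOUR_XAI_API_KEY', {
//     debug: true,
//     onClose: () => Logger.write('Grok WebSocket closed'),
//   })
// );
```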

## VoiceAgentAPIClient

## Methods

### addEventListener

Adds a handler for the specified [Grok.VoiceAgentAPIEvents](/api-reference/voxengine/grok#voiceagentapievents) or [Grok.Events](/api-reference/voxengine/grok#events) event. Use only functions as handlers; passing anything other than a function causes an error and terminates the scenario when the handler is called.

```ts
addEventListener(event: Grok.Events | Grok.VoiceAgentAPIEvents | string, callback: (event: object) => any): void
```

**Parameters**

| Parameter  | Type                                                                         | Req. | Description                                   |
| ---------- | ---------------------------------------------------------------------------- | ---- | --------------------------------------------- |
| `event`    | <code>Grok.Events \| Grok.<wbr />VoiceAgent<wbr />APIEvents \| string</code> | ✓    | Event constant or event name to subscribe to. |
| `callback` | <code>(event: object) => any</code>                                          | ✓    | Function called when the event is emitted.    |

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |

### clearMediaBuffer

Clears the Grok WebSocket media buffer.

```ts
clearMediaBuffer(parameters?: ClearMediaBufferParameters): void
```

**Parameters**

| Parameter    | Type                                                  | Req. | Description |
| ------------ | ----------------------------------------------------- | ---- | ----------- |
| `parameters` | <code>ClearMedia<wbr />Buffer<wbr />Parameters</code> | ✗    |             |

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |

### close

Closes the Grok WebSocket connection or cancels a pending connection attempt.

```ts
close(): void
```

**Parameters**

This method does not accept parameters.

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |

### conversationItemCreate

Creates a new conversation item, typically a user message. See the [xAI documentation](https://docs.x.ai/docs/guides/voice/agent#client).

```ts
conversationItemCreate(parameters: Object): void
```

**Parameters**

| Parameter    | Type     | Req. | Description                                                                                                                                                                                                                                                                                                      |
| ------------ | -------- | ---- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `parameters` | `Object` | ✓    | xAI conversation.item.create client message. Common fields include type, event\_id, and item; item can be a user message, assistant message, function call, or function call output.  See the [partner API reference](https://docs.x.ai/developers/rest-api-reference/inference/voice#conversation.item.create). |

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |

Example `parameters`:

```json
{
  "type": "conversation.item.create",
  "event_id": "event_345",
  "item": {
    "type": "message",
    "role": "user",
    "content": [
      {
        "type": "input_text",
        "text": "Hello"
      }
    ]
  }
}
```
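
A short sketch of how this message might be sent from a scenario. `sendUserText` is a hypothetical helper, and the follow-up `responseCreate` call rests on the assumption that adding a text item does not trigger a reply on its own.

```javascript
// Hypothetical helper: adds a text user message and asks the agent to reply.
function sendUserText(client, text) {
  client.conversationItemCreate({
    type: 'conversation.item.create',
    item: {
      type: 'message',
      role: 'user',
      content: [{ type: 'input_text', text }],
    },
  });
  // Assumption: a text item does not start a response by itself,
  // so request one explicitly.
  client.responseCreate({ type: 'response.create' });
}

// Usage (not executed here): sendUserText(client, 'Hello');
```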

### id

Returns the `VoiceAgentAPIClient` ID.

```ts
id(): string
```

**Parameters**

This method does not accept parameters.

**Returns**

| Type     | Description                 |
| -------- | --------------------------- |
| `string` | The client ID.              |

### inputAudioBufferClear

Clears the input audio buffer. See the [xAI documentation](https://docs.x.ai/docs/guides/voice/agent#client-1).

```ts
inputAudioBufferClear(parameters: Object): void
```

**Parameters**

| Parameter    | Type     | Req. | Description                                                                                                                                                                                                                   |
| ------------ | -------- | ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `parameters` | `Object` | ✓    | xAI input\_audio\_buffer.clear client message. Common fields include type and optional event\_id.  See the [partner API reference](https://docs.x.ai/developers/rest-api-reference/inference/voice#input_audio_buffer.clear). |

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |

Example `parameters`:

```json
{
  "type": "input_audio_buffer.clear"
}
```

### removeEventListener

Removes a handler for the specified [Grok.VoiceAgentAPIEvents](/api-reference/voxengine/grok#voiceagentapievents) or [Grok.Events](/api-reference/voxengine/grok#events) event.

```ts
removeEventListener(event: Grok.Events | Grok.VoiceAgentAPIEvents | string, callback?: (event: object) => any): void
```

**Parameters**

| Parameter  | Type                                                                         | Req. | Description                                       |
| ---------- | ---------------------------------------------------------------------------- | ---- | ------------------------------------------------- |
| `event`    | <code>Grok.Events \| Grok.<wbr />VoiceAgent<wbr />APIEvents \| string</code> | ✓    | Event constant or event name to unsubscribe from. |
| `callback` | <code>(event: object) => any</code>                                          | ✗    | Handler function to remove.                       |

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |
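
One common use of `removeEventListener` is a handler that should run only once. The sketch below shows that pattern; `addOneShotListener` is a hypothetical helper, not part of the module.

```javascript
// Hypothetical helper: runs `handler` for the first matching event only.
function addOneShotListener(client, eventName, handler) {
  function wrapper(event) {
    // Remove the same function reference that was registered.
    client.removeEventListener(eventName, wrapper);
    handler(event);
  }
  client.addEventListener(eventName, wrapper);
}

// Usage (not executed here):
// addOneShotListener(client, Grok.VoiceAgentAPIEvents.SessionUpdated, () => {
//   Logger.write('Session configured');
// });
```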

### responseCreate

Requests that the server create a new assistant response when using client-side VAD. With server-side VAD, this is handled automatically. See the [xAI documentation](https://docs.x.ai/docs/guides/voice/agent#client-2).

```ts
responseCreate(parameters: Object): void
```

**Parameters**

| Parameter    | Type     | Req. | Description                                                                                                                                                                                                                                 |
| ------------ | -------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `parameters` | `Object` | ✓    | xAI response.create client message. Common fields include type, optional event\_id, and optional response configuration.  See the [partner API reference](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.create). |

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |

Example `parameters`:

```json
{
  "type": "response.create"
}
```

### sendMediaTo

Starts sending media from Grok (via WebSocket) to the media unit. Grok works in real time.

```ts
sendMediaTo(mediaUnit: VoxMediaUnit, parameters?: SendMediaParameters): void
```

**Parameters**

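| Parameter    | Type                                    | Req. | Description                                   |
| ------------ | --------------------------------------- | ---- | --------------------------------------------- |
| `mediaUnit`  | `VoxMediaUnit`                          | ✓    | Media unit that receives the media from Grok. |
| `parameters` | <code>SendMedia<wbr />Parameters</code> | ✗    | Optional parameters for sending media.        |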

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |

### sessionUpdate

Updates the session configuration. See the [xAI documentation](https://docs.x.ai/docs/guides/voice/agent#client-events-1).

```ts
sessionUpdate(parameters: Object): void
```

**Parameters**

| Parameter    | Type     | Req. | Description                                                                                                                                                                                                                                                                                    |
| ------------ | -------- | ---- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `parameters` | `Object` | ✓    | xAI session.update client message. Common fields include type, optional event\_id, and session, which can configure prompt, voice, audio formats, turn detection, and tools.  See the [partner API reference](https://docs.x.ai/developers/rest-api-reference/inference/voice#session.update). |

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |

Example `parameters`:

```json
{
  "type": "session.update",
  "session": {
    "voice": "Ara",
    "instructions": "You are a helpful assistant",
    "turn_detection": {
      "type": "server_vad"
    }
  }
}
```
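
To avoid repeating the message shape, the payload can be built by a small function. This is only a sketch: `buildSessionUpdate` and its option names are hypothetical, and it covers just the three session fields shown in the example above.

```javascript
// Hypothetical helper: builds a session.update client message.
function buildSessionUpdate({ instructions, voice, turnDetection = 'server_vad' }) {
  return {
    type: 'session.update',
    session: {
      instructions,
      voice,
      turn_detection: { type: turnDetection },
    },
  };
}

// Usage (not executed here):
// client.sessionUpdate(buildSessionUpdate({
//   instructions: 'You are a helpful assistant.',
//   voice: 'Eve',
// }));
```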

### stopMediaTo

Stops sending media from Grok (via WebSocket) to the media unit.

```ts
stopMediaTo(mediaUnit: VoxMediaUnit): void
```

**Parameters**

| Parameter   | Type           | Req. | Description                                          |
| ----------- | -------------- | ---- | ---------------------------------------------------- |
| `mediaUnit` | `VoxMediaUnit` | ✓    | Media unit that stops receiving the media from Grok. |

**Returns**

| Type   | Description              |
| ------ | ------------------------ |
| `void` | Does not return a value. |
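
`sendMediaTo` and `stopMediaTo` can be paired to pause and resume the agent's audio, for example while a caller is on hold. The sketch below is one possible arrangement; `createAgentAudioSwitch` is a hypothetical helper that simply tracks whether media is currently being sent.

```javascript
// Hypothetical helper: toggles Grok audio towards one media unit.
function createAgentAudioSwitch(client, mediaUnit) {
  let sending = false;
  return {
    start() {
      if (!sending) {
        client.sendMediaTo(mediaUnit);
        sending = true;
      }
    },
    stop() {
      if (sending) {
        client.stopMediaTo(mediaUnit);
        sending = false;
      }
    },
    isSending() {
      return sending;
    },
  };
}

// Usage (not executed here):
// const agentAudio = createAgentAudioSwitch(client, call);
// agentAudio.start(); // agent is audible
// agentAudio.stop();  // agent audio is no longer sent to the call
```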

### webSocketId

Returns the Grok WebSocket ID.

```ts
webSocketId(): string
```

**Parameters**

This method does not accept parameters.

**Returns**

| Type     | Description                 |
| -------- | --------------------------- |
| `string` | The Grok WebSocket ID.      |

## Events

These events describe audio received through the Grok WebSocket media bridge.

### WebSocketMediaStarted

Triggered when the audio stream sent by a third party through a Grok WebSocket starts playing.

Event constant: `Events.WebSocketMediaStarted`

**Payload**

| Field                                | Type                                     | Req. | Description                                                                                                                                       |
| ------------------------------------ | ---------------------------------------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| `client`                             | <code>VoiceAgent<wbr />APIClient</code>  | ✓    | The [Grok.VoiceAgentAPIClient](/api-reference/voxengine/grok#voiceagentapiclient) instance.                                                       |
| `tag`                                | `string`                                 | ✗    | Special tag that names audio streams sent over one WebSocket connection. It lets you send two audio streams to two different media units at the same time. |
| `encoding`                           | `string`                                 | ✗    | Audio encoding format. |
| <code>custom<wbr />Parameters</code> | <code>\{ \[key: string]: string }</code> | ✗    | Custom parameters.                                                                                                                                |

### WebSocketMediaEnded

Triggered after the audio stream sent by a third party through a Grok WebSocket ends (**1 second of silence**).

Event constant: `Events.WebSocketMediaEnded`

**Payload**

| Field       | Type                                    | Req. | Description                                                                                                                                       |
| ----------- | --------------------------------------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| `client`    | <code>VoiceAgent<wbr />APIClient</code> | ✓    | The [Grok.VoiceAgentAPIClient](/api-reference/voxengine/grok#voiceagentapiclient) instance.                                                       |
| `tag`       | `string`                                | ✗    | Special tag that names audio streams sent over one WebSocket connection. It lets you send two audio streams to two different media units at the same time. |
| `mediaInfo` | <code>WebSocket<wbr />MediaInfo</code>  | ✗    | Information about the audio stream that can be obtained after the stream stops or pauses (**1 second of silence**).                               |
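
These two events can be combined to track whether agent audio is currently playing. The following is a sketch under that reading of the events; `trackAgentSpeech` is a hypothetical helper and `events` stands for `Grok.Events`. Because the end event fires after one second of silence, the flag lags the real end of speech.

```javascript
// Hypothetical helper: tracks whether Grok audio is currently playing.
function trackAgentSpeech(client, events, onChange) {
  let speaking = false;
  client.addEventListener(events.WebSocketMediaStarted, () => {
    speaking = true;
    onChange(true);
  });
  client.addEventListener(events.WebSocketMediaEnded, () => {
    speaking = false;
    onChange(false);
  });
  // Returns a getter for the current state.
  return () => speaking;
}

// Usage (not executed here):
// const isAgentSpeaking = trackAgentSpeech(client, Grok.Events, (on) => {
//   Logger.write(on ? 'Agent speaking' : 'Agent silent');
// });
```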

## VoiceAgentAPIEvents

These events mirror server messages from the Grok Voice Agent API. The `data` field contains the provider event payload.

**All VoiceAgentAPIEvents callbacks receive these common fields:**

| Field    | Type                                                | Description                            |
| -------- | --------------------------------------------------- | -------------------------------------- |
| `client` | <code>Grok.<wbr />VoiceAgent<wbr />APIClient</code> | The Grok.VoiceAgentAPIClient instance. |
| `data`   | `Object`                                            | Pass-through xAI server event payload. |

Per-event payload tables below show only event-specific fields (and any provider payload enrichment for `data`).

<a id="-unknown" />

### Unknown

The unknown event.

Event constant: `VoiceAgentAPIEvents.Unknown`

**Payload**

*No event-specific payload columns are listed here; this callback still receives the common `client` and `data` fields. For `data`, see the partner documentation for the exact JSON shape.*

<a id="-conversationcreated" />

### ConversationCreated

The first message on connection. Notifies the client that a conversation session has been created. See the [xAI documentation](https://docs.x.ai/docs/guides/voice/agent#server-events-2).

Event constant: `VoiceAgentAPIEvents.ConversationCreated`

**Payload**

| Field               | Type     | Req. | Description                                                                                                                                                                                                                     |
| ------------------- | -------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`              | `Object` | ✗    | The first message on connection. Notifies the client that a conversation session has been created. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#conversation.created). |
| `data.event_id`     | `string` | ✓    | Unique event identifier.                                                                                                                                                                                                        |
| `data.type`         | `string` | ✓    | Always `conversation.created`.                                                                                                                                                                                                  |
| `data.conversation` | `object` | ✓    | The conversation object.                                                                                                                                                                                                        |

Example `data`:

```json
{
  "event_id": "event_9101",
  "type": "conversation.created",
  "conversation": {
    "id": "conv_001",
    "object": "realtime.conversation"
  }
}
```

<a id="-sessionupdated" />

### SessionUpdated

Acknowledges the client's `session.update` message and confirms that the session has been updated. See the [xAI documentation](https://docs.x.ai/docs/guides/voice/agent#server-events-1).

Event constant: `VoiceAgentAPIEvents.SessionUpdated`

**Payload**

| Field           | Type     | Req. | Description                                                                                                                                                                                                    |
| --------------- | -------- | ---- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`          | `Object` | ✗    | Acknowledges the client's `session.update` message and confirms that the session has been configured. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#session.updated). |
| `data.event_id` | `string` | ✓    | Unique event identifier.                                                                                                                                                                                       |
| `data.type`     | `string` | ✓    | Always `session.updated`.                                                                                                                                                                                      |
| `data.session`  | `object` | ✓    | The updated session configuration.                                                                                                                                                                             |

Example `data`:

```json
{
  "event_id": "event_123",
  "type": "session.updated",
  "session": {
    "model": "grok-voice-fast-1.0",
    "instructions": "You are a helpful assistant.",
    "voice": "Eve",
    "turn_detection": {
      "type": "server_vad"
    }
  }
}
```

<a id="-conversationitemadded" />

### ConversationItemAdded

Notifies the client that a new user message or assistant response has been added to the conversation history. See the [xAI documentation](https://docs.x.ai/docs/guides/voice/agent#server).

Event constant: `VoiceAgentAPIEvents.ConversationItemAdded`

**Payload**

| Field                   | Type     | Req. | Description                                                                                                                                                                                                 |
| ----------------------- | -------- | ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`                  | `Object` | ✗    | A new user or assistant message has been added to the conversation history. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#conversation.item.added). |
| `data.event_id`         | `string` | ✓    | Unique event identifier.                                                                                                                                                                                    |
| `data.type`             | `string` | ✓    | Always `conversation.item.added`.                                                                                                                                                                           |
| `data.previous_item_id` | `string` | ✓    | ID of the preceding item in conversation history.                                                                                                                                                           |
| `data.item`             | `object` | ✓    | The conversation item that was added.                                                                                                                                                                       |

Example `data`:

```json
{
  "event_id": "event_1920",
  "type": "conversation.item.added",
  "previous_item_id": "msg_002",
  "item": {
    "id": "msg_003",
    "object": "realtime.item",
    "type": "message",
    "status": "completed",
    "role": "user",
    "content": [
      {
        "type": "input_audio",
        "transcript": "hello how are you"
      }
    ]
  }
}
```
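
A handler for this event usually needs only the role and the text of the added item. The sketch below reads them from a payload shaped like the example above; `describeAddedItem` is a hypothetical helper, and looking for a `text` field on typed input parts is an assumption based on the `input_text` content type.

```javascript
// Hypothetical helper: summarizes the item from a conversation.item.added payload.
function describeAddedItem(data) {
  const item = data.item || {};
  const parts = Array.isArray(item.content) ? item.content : [];
  // Assumption: audio parts carry `transcript`, typed parts carry `text`.
  const text = parts
    .map((part) => part.transcript || part.text || '')
    .filter(Boolean)
    .join(' ');
  return { id: item.id, role: item.role, text };
}

// Usage (not executed here):
// client.addEventListener(Grok.VoiceAgentAPIEvents.ConversationItemAdded, (event) => {
//   const { role, text } = describeAddedItem(event.data);
//   Logger.write(role + ': ' + text);
// });
```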

<a id="-conversationiteminputaudiotranscriptioncompleted" />

### ConversationItemInputAudioTranscriptionCompleted

Notifies the client that the input audio transcription has been completed. See the [xAI documentation](https://docs.x.ai/docs/guides/voice/agent#server).

Event constant: `VoiceAgentAPIEvents.ConversationItemInputAudioTranscriptionCompleted`

**Payload**

| Field             | Type                                                                                        | Req. | Description                                                                                                                                                                                                                |
| ----------------- | ------------------------------------------------------------------------------------------- | ---- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`            | `Object`                                                                                    | ✗    | Audio transcription for the user's input has been completed. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#conversation.item.input_audio_transcription.completed). |
| `data.event_id`   | `string`                                                                                    | ✓    | Unique event identifier.                                                                                                                                                                                                   |
| `data.type`       | <code>"conversation.<wbr />item.<wbr />input\_audio\_transcription.<wbr />completed"</code> | ✓    | Event type.                                                                                                                                                                                                                |
| `data.item_id`    | `string`                                                                                    | ✓    | ID of the conversation item whose audio was transcribed.                                                                                                                                                                   |
| `data.transcript` | `string`                                                                                    | ✓    | The transcribed text.                                                                                                                                                                                                      |

Example `data`:

```json
{
  "event_id": "event_2122",
  "type": "conversation.item.input_audio_transcription.completed",
  "item_id": "msg_003",
  "transcript": "Hello, how are you?"
}
```
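
These events can be collected into a running log of what the caller said. This is a minimal sketch; `createTranscriptLog` is a hypothetical helper that stores the `item_id` and `transcript` fields documented above.

```javascript
// Hypothetical helper: collects completed input transcriptions in arrival order.
function createTranscriptLog(client, eventName) {
  const entries = [];
  client.addEventListener(eventName, (event) => {
    entries.push({ itemId: event.data.item_id, text: event.data.transcript });
  });
  return entries;
}

// Usage (not executed here):
// const userTurns = createTranscriptLog(
//   client,
//   Grok.VoiceAgentAPIEvents.ConversationItemInputAudioTranscriptionCompleted
// );
```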

<a id="-inputaudiobuffercommitted" />

### InputAudioBufferCommitted

The input audio buffer has been committed. See the [xAI documentation](https://docs.x.ai/docs/guides/voice/agent#server-1).

Event constant: `VoiceAgentAPIEvents.InputAudioBufferCommitted`

**Payload**

| Field                   | Type     | Req. | Description                                                                                                                                                                                   |
| ----------------------- | -------- | ---- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`                  | `Object` | ✗    | Input audio buffer has been committed as a user message. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#input_audio_buffer.committed). |
| `data.event_id`         | `string` | ✓    | Unique event identifier.                                                                                                                                                                      |
| `data.type`             | `string` | ✓    | Always `input_audio_buffer.committed`.                                                                                                                                                        |
| `data.previous_item_id` | `string` | ✓    | ID of the preceding conversation item.                                                                                                                                                        |
| `data.item_id`          | `string` | ✓    | ID of the newly created user message item.                                                                                                                                                    |

Example `data`:

```json
{
  "event_id": "event_1121",
  "type": "input_audio_buffer.committed",
  "previous_item_id": "msg_001",
  "item_id": "msg_002"
}
```
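The `previous_item_id` and `item_id` fields make it possible to keep user turns in conversation order and attach transcripts as they arrive. The following is a minimal sketch in plain JavaScript, not part of the Grok module: `createUserTurnLog` and its methods are hypothetical helper names, and the commented wiring assumes that listeners receive an object whose `data` field matches the payload tables above and that the constants are exposed as `Grok.VoiceAgentAPIEvents`.

```js
// Hypothetical helper: keeps committed user items in order and attaches transcripts.
function createUserTurnLog() {
  const order = [];              // user item IDs in conversation order
  const transcripts = new Map(); // item_id -> transcript text

  return {
    // Call with `data` from an input_audio_buffer.committed event.
    onCommitted(data) {
      const prev = order.indexOf(data.previous_item_id);
      // Insert right after the preceding item when it is a known user item, else append.
      if (prev === -1) order.push(data.item_id);
      else order.splice(prev + 1, 0, data.item_id);
    },
    // Call with `data` from a conversation.item.input_audio_transcription.completed event.
    onTranscript(data) {
      transcripts.set(data.item_id, data.transcript);
    },
    // Returns [{ itemId, transcript }] in order; transcript stays null until it arrives.
    turns() {
      return order.map((itemId) => ({
        itemId,
        transcript: transcripts.has(itemId) ? transcripts.get(itemId) : null,
      }));
    },
  };
}

// Possible wiring inside a scenario (assumption: `client` was returned by
// Grok.createVoiceAgentAPIClient and listeners receive an object with a `data` field):
// const turnLog = createUserTurnLog();
// client.addEventListener(Grok.VoiceAgentAPIEvents.InputAudioBufferCommitted,
//   (event) => turnLog.onCommitted(event.data));
```

Transcription usually completes after the buffer is committed, so the sketch stores the two pieces separately and joins them only when `turns()` is read.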

<a id="-inputaudiobuffercleared" />

Input audio buffer has been cleared. [https://docs.x.ai/docs/guides/voice/agent#server-1](https://docs.x.ai/docs/guides/voice/agent#server-1)

Event constant: `VoiceAgentAPIEvents.InputAudioBufferCleared`

**Payload**

| Field           | Type     | Req. | Description                                                                                                                                                                          |
| --------------- | -------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `data`          | `Object` | ✗    | Confirms the input audio buffer has been cleared. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#input_audio_buffer.cleared). |
| `data.event_id` | `string` | ✓    | Unique event identifier.                                                                                                                                                             |
| `data.type`     | `string` | ✓    | Always `input_audio_buffer.cleared`.                                                                                                                                                 |

Example `data`:

```json
{
  "event_id": "event_1122",
  "type": "input_audio_buffer.cleared"
}
```

<a id="-inputaudiobufferspeechstarted" />

Notifies the client that the server's VAD has detected the start of speech. [https://docs.x.ai/docs/guides/voice/agent#server-1](https://docs.x.ai/docs/guides/voice/agent#server-1)

Event constant: `VoiceAgentAPIEvents.InputAudioBufferSpeechStarted`

**Payload**

| Field                 | Type      | Req. | Description                                                                                                                                                                                                                                            |
| --------------------- | --------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| `data`                | `Object`  | ✗    | Notifies that the server's VAD detected the start of speech. Only available with server\_vad turn detection. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#input_audio_buffer.speech_started). |
| `data.event_id`       | `string`  | ✓    | Unique event identifier.                                                                                                                                                                                                                               |
| `data.type`           | `string`  | ✓    | Always `input_audio_buffer.speech_started`.                                                                                                                                                                                                            |
| `data.item_id`        | `string`  | ✓    | ID of the associated message item.                                                                                                                                                                                                                     |
| `data.audio_start_ms` | `integer` | ✓    | Millisecond offset in the audio buffer where speech was detected.                                                                                                                                                                                      |

Example `data`:

```json
{
  "event_id": "event_1516",
  "type": "input_audio_buffer.speech_started",
  "item_id": "msg_003"
}
```

<a id="-inputaudiobufferspeechstopped" />

Notifies the client that the server's VAD has detected the end of speech. [https://docs.x.ai/docs/guides/voice/agent#server-1](https://docs.x.ai/docs/guides/voice/agent#server-1)

Event constant: `VoiceAgentAPIEvents.InputAudioBufferSpeechStopped`

**Payload**

| Field               | Type      | Req. | Description                                                                                                                                                                                                                                          |
| ------------------- | --------- | ---- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`              | `Object`  | ✗    | Notifies that the server's VAD detected the end of speech. Only available with server\_vad turn detection. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#input_audio_buffer.speech_stopped). |
| `data.event_id`     | `string`  | ✓    | Unique event identifier.                                                                                                                                                                                                                             |
| `data.type`         | `string`  | ✓    | Always `input_audio_buffer.speech_stopped`.                                                                                                                                                                                                          |
| `data.item_id`      | `string`  | ✓    | ID of the associated message item.                                                                                                                                                                                                                   |
| `data.audio_end_ms` | `integer` | ✓    | Millisecond offset in the audio buffer where speech ended.                                                                                                                                                                                           |

Example `data`:

```json
{
  "event_id": "event_1516",
  "type": "input_audio_buffer.speech_stopped",
  "item_id": "msg_003"
}
```
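The `audio_start_ms` and `audio_end_ms` offsets from the two VAD events can be paired by `item_id` to estimate how long a speech segment lasted. The sketch below is plain JavaScript with a hypothetical helper name (`createSpeechTimer`). It returns `null` when either offset is absent, since the example payloads above omit them.

```js
// Hypothetical helper: measures the length of each speech segment detected by server VAD.
function createSpeechTimer() {
  const startedAt = new Map(); // item_id -> audio_start_ms

  return {
    // Call with `data` from an input_audio_buffer.speech_started event.
    onSpeechStarted(data) {
      startedAt.set(data.item_id, data.audio_start_ms);
    },
    // Call with `data` from an input_audio_buffer.speech_stopped event.
    // Returns the segment length in milliseconds, or null when an offset is missing.
    onSpeechStopped(data) {
      const start = startedAt.get(data.item_id);
      startedAt.delete(data.item_id);
      if (typeof start !== "number" || typeof data.audio_end_ms !== "number") {
        return null;
      }
      return data.audio_end_ms - start;
    },
  };
}

// Possible wiring (assumption: listeners receive an object with a `data` field):
// const speechTimer = createSpeechTimer();
// client.addEventListener(Grok.VoiceAgentAPIEvents.InputAudioBufferSpeechStarted,
//   (event) => speechTimer.onSpeechStarted(event.data));
// client.addEventListener(Grok.VoiceAgentAPIEvents.InputAudioBufferSpeechStopped,
//   (event) => Logger.write(`speech ms: ${speechTimer.onSpeechStopped(event.data)}`));
```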

<a id="-responsecreated" />

A new assistant response turn is in progress. Audio deltas created during this assistant turn share the same response ID. [https://docs.x.ai/docs/guides/voice/agent#server-2](https://docs.x.ai/docs/guides/voice/agent#server-2)

Event constant: `VoiceAgentAPIEvents.ResponseCreated`

**Payload**

| Field           | Type     | Req. | Description                                                                                                                                                                                                                     |
| --------------- | -------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`          | `Object` | ✗    | A new assistant response turn is in progress. Audio deltas from this turn share the same response\_id. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.created). |
| `data.event_id` | `string` | ✓    | Unique event identifier.                                                                                                                                                                                                        |
| `data.type`     | `string` | ✓    | Always `response.created`.                                                                                                                                                                                                      |
| `data.response` | `object` | ✓    | The response object.                                                                                                                                                                                                            |

Example `data`:

```json
{
  "event_id": "event_2930",
  "type": "response.created",
  "response": {
    "id": "resp_001",
    "object": "realtime.response",
    "status": "in_progress",
    "output": []
  }
}
```

<a id="-responsedone" />

The assistant's response is completed. [https://docs.x.ai/docs/guides/voice/agent#server-2](https://docs.x.ai/docs/guides/voice/agent#server-2)

Event constant: `VoiceAgentAPIEvents.ResponseDone`

**Payload**

| Field           | Type     | Req. | Description                                                                                                                                                                                                                                                   |
| --------------- | -------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`          | `Object` | ✗    | The assistant's response is completed. Sent after all audio and transcript deltas. Ready for the client to add a new conversation item. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.done). |
| `data.event_id` | `string` | ✓    | Unique event identifier.                                                                                                                                                                                                                                      |
| `data.type`     | `string` | ✓    | Always `response.done`.                                                                                                                                                                                                                                       |
| `data.response` | `object` | ✓    | The completed response object.                                                                                                                                                                                                                                |

Example `data`:

```json
{
  "event_id": "event_3132",
  "type": "response.done",
  "response": {
    "id": "resp_001",
    "object": "realtime.response",
    "status": "completed"
  }
}
```
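Because `response.created` and `response.done` carry the same `response.id`, a scenario can track whether the assistant is currently mid-response. This is a minimal sketch in plain JavaScript. `createResponseTracker` is a hypothetical helper, and the commented wiring assumes the listener receives an object with a `data` field shaped like the tables above.

```js
// Hypothetical helper: tracks which assistant responses are still in progress.
function createResponseTracker() {
  const active = new Set(); // IDs of responses that were created but not yet done

  return {
    // Call with `data` from a response.created event.
    onCreated(data) {
      active.add(data.response.id);
    },
    // Call with `data` from a response.done event; returns the final status string.
    onDone(data) {
      active.delete(data.response.id);
      return data.response.status;
    },
    // True while at least one response is in progress.
    isResponding() {
      return active.size > 0;
    },
  };
}

// Possible wiring (assumption: `client` was returned by Grok.createVoiceAgentAPIClient):
// const responses = createResponseTracker();
// client.addEventListener(Grok.VoiceAgentAPIEvents.ResponseCreated,
//   (event) => responses.onCreated(event.data));
// client.addEventListener(Grok.VoiceAgentAPIEvents.ResponseDone,
//   (event) => responses.onDone(event.data));
```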

<a id="-responseoutputitemadded" />

A new assistant response item is added to the message history. [https://docs.x.ai/docs/guides/voice/agent#server-2](https://docs.x.ai/docs/guides/voice/agent#server-2)

Event constant: `VoiceAgentAPIEvents.ResponseOutputItemAdded`

**Payload**

| Field               | Type      | Req. | Description                                                                                                                                                                                       |
| ------------------- | --------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`              | `Object`  | ✗    | A new assistant response item is added to the message history. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.output_item.added). |
| `data.event_id`     | `string`  | ✓    | Unique event identifier.                                                                                                                                                                          |
| `data.type`         | `string`  | ✓    | Always `response.output_item.added`.                                                                                                                                                              |
| `data.response_id`  | `string`  | ✓    | ID of the response this item belongs to.                                                                                                                                                          |
| `data.output_index` | `integer` | ✓    | Index of the output item in the response.                                                                                                                                                         |
| `data.item`         | `object`  | ✓    | The output item that was added.                                                                                                                                                                   |

Example `data`:

```json
{
  "event_id": "event_3334",
  "type": "response.output_item.added",
  "response_id": "resp_001",
  "output_index": 0,
  "item": {
    "id": "msg_007",
    "object": "realtime.item",
    "type": "message",
    "status": "in_progress",
    "role": "assistant",
    "content": []
  }
}
```

<a id="-responseoutputitemdone" />

An assistant response output item is complete.

Event constant: `VoiceAgentAPIEvents.ResponseOutputItemDone`

**Payload**

| Field               | Type      | Req. | Description                                                                                                                                                   |
| ------------------- | --------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`              | `Object`  | ✗    | An output item is complete. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.output_item.done). |
| `data.event_id`     | `string`  | ✓    | Unique event identifier.                                                                                                                                      |
| `data.type`         | `string`  | ✓    | Always `response.output_item.done`.                                                                                                                           |
| `data.response_id`  | `string`  | ✓    | ID of the response this item belongs to.                                                                                                                      |
| `data.output_index` | `integer` | ✓    | Index of the output item in the response.                                                                                                                     |
| `data.item`         | `object`  | ✓    | The completed output item.                                                                                                                                    |

Example `data`:

```json
{
  "event_id": "event_3335",
  "type": "response.output_item.done",
  "response_id": "resp_001",
  "output_index": 0,
  "item": {
    "id": "msg_007",
    "object": "realtime.item",
    "type": "message",
    "status": "completed",
    "role": "assistant",
    "content": []
  }
}
```
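The `response_id` and `output_index` fields shared by `response.output_item.added` and `response.output_item.done` allow output items to be grouped per response, with the later snapshot replacing the earlier one. The sketch below is plain JavaScript, and `createOutputItemIndex` is a hypothetical helper name.

```js
// Hypothetical helper: keeps the latest snapshot of each output item, grouped by response.
function createOutputItemIndex() {
  const byResponse = new Map(); // response_id -> Map(output_index -> item)

  return {
    // Call with `data` from response.output_item.added or response.output_item.done.
    onItem(data) {
      if (!byResponse.has(data.response_id)) {
        byResponse.set(data.response_id, new Map());
      }
      byResponse.get(data.response_id).set(data.output_index, data.item);
    },
    // Returns the items of one response, sorted by output_index.
    itemsFor(responseId) {
      const items = byResponse.get(responseId) || new Map();
      return [...items.entries()]
        .sort((a, b) => a[0] - b[0])
        .map(([, item]) => item);
    },
  };
}

// Possible wiring (assumption: listeners receive an object with a `data` field):
// const outputItems = createOutputItemIndex();
// client.addEventListener(Grok.VoiceAgentAPIEvents.ResponseOutputItemAdded,
//   (event) => outputItems.onItem(event.data));
// client.addEventListener(Grok.VoiceAgentAPIEvents.ResponseOutputItemDone,
//   (event) => outputItems.onItem(event.data));
```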

<a id="-responseoutputaudiotranscriptdelta" />

Audio transcript delta of the assistant response. [https://docs.x.ai/docs/guides/voice/agent#server-3](https://docs.x.ai/docs/guides/voice/agent#server-3)

Event constant: `VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDelta`

**Payload**

| Field                | Type      | Req. | Description                                                                                                                                                                                                       |
| -------------------- | --------- | ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`               | `Object`  | ✗    | Streaming text transcript delta of the assistant's audio response. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.output_audio_transcript.delta). |
| `data.event_id`      | `string`  | ✓    | Unique event identifier.                                                                                                                                                                                          |
| `data.type`          | `string`  | ✓    | Always `response.output_audio_transcript.delta`.                                                                                                                                                                  |
| `data.response_id`   | `string`  | ✓    | ID of the response.                                                                                                                                                                                               |
| `data.item_id`       | `string`  | ✓    | ID of the output item.                                                                                                                                                                                            |
| `data.output_index`  | `integer` | ✓    | Index of the output item in the response.                                                                                                                                                                         |
| `data.content_index` | `integer` | ✓    | Index of the content part within the item.                                                                                                                                                                        |
| `data.delta`         | `string`  | ✓    | Text transcript fragment.                                                                                                                                                                                         |

Example `data`:

```json
{
  "event_id": "event_4950",
  "type": "response.output_audio_transcript.delta",
  "response_id": "resp_001",
  "item_id": "msg_008",
  "delta": "Hello! I'm doing"
}
```

<a id="-responseoutputaudiotranscriptdone" />

The audio transcript of the assistant response has finished generating. [https://docs.x.ai/docs/guides/voice/agent#server-3](https://docs.x.ai/docs/guides/voice/agent#server-3)

Event constant: `VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDone`

**Payload**

| Field                | Type      | Req. | Description                                                                                                                                                                                                         |
| -------------------- | --------- | ---- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`               | `Object`  | ✗    | The audio transcript for this assistant turn has finished generating. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.output_audio_transcript.done). |
| `data.event_id`      | `string`  | ✓    | Unique event identifier.                                                                                                                                                                                            |
| `data.type`          | `string`  | ✓    | Always `response.output_audio_transcript.done`.                                                                                                                                                                     |
| `data.response_id`   | `string`  | ✓    | ID of the response.                                                                                                                                                                                                 |
| `data.item_id`       | `string`  | ✓    | ID of the output item.                                                                                                                                                                                              |
| `data.output_index`  | `integer` | ✓    | Index of the output item in the response.                                                                                                                                                                           |
| `data.content_index` | `integer` | ✓    | Index of the content part within the item.                                                                                                                                                                          |
| `data.transcript`    | `string`  | ✓    | The complete transcript text.                                                                                                                                                                                       |

Example `data`:

```json
{
  "event_id": "event_5152",
  "type": "response.output_audio_transcript.done",
  "response_id": "resp_001",
  "item_id": "msg_008"
}
```
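Transcript deltas arrive as fragments, so a scenario that needs the full assistant text typically concatenates them per `item_id` until the done event arrives. The sketch below is plain JavaScript with a hypothetical helper name (`createTranscriptAssembler`). It prefers `data.transcript` from the done event when present and falls back to the joined deltas otherwise.

```js
// Hypothetical helper: assembles assistant transcripts from streamed deltas.
function createTranscriptAssembler() {
  const parts = new Map(); // item_id -> array of delta strings

  return {
    // Call with `data` from a response.output_audio_transcript.delta event.
    onDelta(data) {
      if (!parts.has(data.item_id)) parts.set(data.item_id, []);
      parts.get(data.item_id).push(data.delta);
    },
    // Call with `data` from a response.output_audio_transcript.done event.
    // Returns the full transcript for the item and forgets its buffered deltas.
    onDone(data) {
      const joined = (parts.get(data.item_id) || []).join("");
      parts.delete(data.item_id);
      return typeof data.transcript === "string" ? data.transcript : joined;
    },
  };
}

// Possible wiring (assumption: listeners receive an object with a `data` field):
// const assembler = createTranscriptAssembler();
// client.addEventListener(Grok.VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDelta,
//   (event) => assembler.onDelta(event.data));
// client.addEventListener(Grok.VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDone,
//   (event) => Logger.write(assembler.onDone(event.data)));
```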

<a id="-responseoutputaudiodone" />

Notifies the client that the audio for this turn has finished generating. [https://docs.x.ai/docs/guides/voice/agent#server-3](https://docs.x.ai/docs/guides/voice/agent#server-3)

Event constant: `VoiceAgentAPIEvents.ResponseOutputAudioDone`

**Payload**

| Field                | Type      | Req. | Description                                                                                                                                                                               |
| -------------------- | --------- | ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`               | `Object`  | ✗    | Audio generation for this assistant turn has finished. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.output_audio.done). |
| `data.event_id`      | `string`  | ✓    | Unique event identifier.                                                                                                                                                                  |
| `data.type`          | `string`  | ✓    | Always `response.output_audio.done`.                                                                                                                                                      |
| `data.response_id`   | `string`  | ✓    | ID of the response.                                                                                                                                                                       |
| `data.item_id`       | `string`  | ✓    | ID of the output item.                                                                                                                                                                    |
| `data.output_index`  | `integer` | ✓    | Index of the output item in the response.                                                                                                                                                 |
| `data.content_index` | `integer` | ✓    | Index of the content part within the item.                                                                                                                                                |

Example `data`:

```json
{
  "event_id": "event_5152",
  "type": "response.output_audio.done",
  "response_id": "resp_001",
  "item_id": "msg_008"
}
```

<a id="-responsecontentpartadded" />

Notifies the client that a content part has been added to an output item.

Event constant: `VoiceAgentAPIEvents.ResponseContentPartAdded`

**Payload**

| Field                | Type      | Req. | Description                                                                                                                                                                      |
| -------------------- | --------- | ---- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`               | `Object`  | ✗    | A content part starts within an output item. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.content_part.added). |
| `data.event_id`      | `string`  | ✓    | Unique event identifier.                                                                                                                                                         |
| `data.type`          | `string`  | ✓    | Always `response.content_part.added`.                                                                                                                                            |
| `data.response_id`   | `string`  | ✓    | ID of the response.                                                                                                                                                              |
| `data.item_id`       | `string`  | ✓    | ID of the output item.                                                                                                                                                           |
| `data.output_index`  | `integer` | ✓    | Index of the output item in the response.                                                                                                                                        |
| `data.content_index` | `integer` | ✓    | Index of the content part within the item.                                                                                                                                       |
| `data.part`          | `object`  | ✓    | The content part.                                                                                                                                                                |

Example `data`:

```json
{
  "event_id": "event_3336",
  "type": "response.content_part.added",
  "response_id": "resp_001",
  "item_id": "msg_007",
  "output_index": 0,
  "content_index": 0,
  "part": {
    "type": "audio"
  }
}
```

<a id="-responsecontentpartdone" />

Notifies the client that a content part is complete.

Event constant: `VoiceAgentAPIEvents.ResponseContentPartDone`

**Payload**

| Field                | Type      | Req. | Description                                                                                                                                                 |
| -------------------- | --------- | ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`               | `Object`  | ✗    | A content part finishes. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.content_part.done). |
| `data.event_id`      | `string`  | ✓    | Unique event identifier.                                                                                                                                    |
| `data.type`          | `string`  | ✓    | Always `response.content_part.done`.                                                                                                                        |
| `data.response_id`   | `string`  | ✓    | ID of the response.                                                                                                                                         |
| `data.item_id`       | `string`  | ✓    | ID of the output item.                                                                                                                                      |
| `data.output_index`  | `integer` | ✓    | Index of the output item in the response.                                                                                                                   |
| `data.content_index` | `integer` | ✓    | Index of the content part within the item.                                                                                                                  |
| `data.part`          | `object`  | ✓    | The completed content part.                                                                                                                                 |

Example `data`:

```json
{
  "event_id": "event_3337",
  "type": "response.content_part.done",
  "response_id": "resp_001",
  "item_id": "msg_007",
  "output_index": 0,
  "content_index": 0,
  "part": {
    "type": "audio"
  }
}
```

<a id="-responsefunctioncallargumentsdone" />

Function call triggered with complete arguments. [https://docs.x.ai/docs/guides/voice/agent#handling-function-call-responses](https://docs.x.ai/docs/guides/voice/agent#handling-function-call-responses)

Event constant: `VoiceAgentAPIEvents.ResponseFunctionCallArgumentsDone`

**Payload**

| Field               | Type      | Req. | Description                                                                                                                                                                                                                                                                                                                         |
| ------------------- | --------- | ---- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `data`              | `Object`  | ✗    | A function call has been triggered with complete arguments. Your code should execute the function and return results via `conversation.item.create` with type `function_call_output`. See the [partner event documentation](https://docs.x.ai/developers/rest-api-reference/inference/voice#response.function_call_arguments.done). |
| `data.event_id`     | `string`  | ✓    | Unique event identifier.                                                                                                                                                                                                                                                                                                            |
| `data.type`         | `string`  | ✓    | Always `response.function_call_arguments.done`.                                                                                                                                                                                                                                                                                     |
| `data.response_id`  | `string`  | ✓    | ID of the response.                                                                                                                                                                                                                                                                                                                 |
| `data.item_id`      | `string`  | ✓    | ID of the function call item.                                                                                                                                                                                                                                                                                                       |
| `data.output_index` | `integer` | ✓    | Index of the output item in the response.                                                                                                                                                                                                                                                                                           |
| `data.call_id`      | `string`  | ✓    | Unique ID for this function call. Pass this as `call_id` in the `conversation.item.create` event with type `function_call_output`.                                                                                                                                                                                                  |
| `data.name`         | `string`  | ✓    | Name of the function to call.                                                                                                                                                                                                                                                                                                       |
| `data.arguments`    | `string`  | ✓    | JSON string of the function arguments.                                                                                                                                                                                                                                                                                              |

Example `data`:

```json
{
  "event_id": "event_fc01",
  "type": "response.function_call_arguments.done",
  "response_id": "resp_001",
  "item_id": "msg_009",
  "output_index": 0,
  "call_id": "call_001",
  "name": "get_weather",
  "arguments": "{\"location\": \"San Francisco\"}"
}
```
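The table above says the handler should execute the function and reply with a `conversation.item.create` event of type `function_call_output` that carries the same `call_id`. A minimal sketch of that step follows. The `tools` registry, the `buildFunctionCallOutput` helper, and the error shape are hypothetical names introduced here for illustration. The `output` field holding a JSON string is an assumption about the reply format and should be checked against the partner documentation.

```js
// Hypothetical local tool registry; names and return values are illustrative only.
const tools = {
  get_weather: ({ location }) => ({ location, forecast: "sunny" }),
};

// Builds a `conversation.item.create` payload from the `data` object of a
// `response.function_call_arguments.done` event.
function buildFunctionCallOutput(data, registry) {
  let result;
  try {
    // `data.arguments` arrives as a JSON string, so parse it first.
    const args = JSON.parse(data.arguments);
    const known = Object.prototype.hasOwnProperty.call(registry, data.name);
    if (!known || typeof registry[data.name] !== "function") {
      throw new Error(`Unknown function: ${data.name}`);
    }
    result = registry[data.name](args);
  } catch (err) {
    // Report the failure back to the agent instead of dropping the call.
    result = { error: String(err.message || err) };
  }
  return {
    type: "conversation.item.create",
    item: {
      type: "function_call_output",
      call_id: data.call_id, // must match the `call_id` from the event
      output: JSON.stringify(result), // assumption: results travel as a JSON string
    },
  };
}
```

In a scenario, this helper would typically be called from a listener registered with `addEventListener` for `VoiceAgentAPIEvents.ResponseFunctionCallArgumentsDone`, with the returned object sent through the client method that emits `conversation.item.create` (see [Methods](#methods)), usually followed by `responseCreate` so the agent can continue with the function result.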

<a id="-websocketerror" />

The WebSocket error response event.

Event constant: `VoiceAgentAPIEvents.WebSocketError`

**Payload**

*No event-specific payload columns are listed here; this callback still receives the common `client` and `data` fields. For `data`, see the partner documentation for the exact JSON shape.*

<a id="-connectorinformation" />

Contains information about the connector.

Event constant: `VoiceAgentAPIEvents.ConnectorInformation`

**Payload**

*No event-specific payload columns are listed here; this callback still receives the common `client` and `data` fields. For `data`, see the partner documentation for the exact JSON shape.*