Grok
Voice agent API client for xAI Grok real-time voice scenarios.
Grok provides a VoxEngine client for connecting a call or media unit to the xAI Voice Agent API over WebSocket.
Use Grok.createVoiceAgentAPIClient(...) to create a VoiceAgentAPIClient for the current scenario. After the client is created, call methods such as sendMediaTo, responseCreate, and addEventListener on that client instance.
Related guides
- Learn how Grok Voice Agent fits into a VoxEngine call flow.
- Start from a complete inbound Grok Voice Agent scenario.
- See tool calls, context, and event-driven interaction patterns.
- Start from a complete outbound Grok Voice Agent scenario.
Contents
- Usage: required module import and basic flow.
- Factory functions: create the Grok client.
- VoiceAgentAPIClientParameters: API key, tracing, privacy, and WebSocket options.
- VoiceAgentAPIClient: runtime client object returned by the factory.
- Methods: media, response, and connection control methods.
- Events: WebSocket media start and end payloads.
- VoiceAgentAPIEvents: Grok Voice Agent API event names and payload fields.
Usage
Add the module before using the namespace:
Create the client, bridge media, and listen for both WebSocket media events and Voice Agent API events.
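The flow above might look like the following sketch. It rests on several assumptions that this page does not confirm: that the module is loaded with `require(Modules.Grok)`, that the parameters object carries an `apiKey` field, and that the media unit exposes its own `sendMediaTo`. The `grok` and `call` arguments stand in for the `Grok` namespace and a call object so the function can also be exercised outside VoxEngine.

```javascript
// Assumed module import for a real scenario (not confirmed by this page):
// require(Modules.Grok);

// Sketch: create the client, bridge media both ways, and subscribe to events.
// `grok` stands in for the Grok namespace; `call` is any media unit with sendMediaTo.
async function connectCallToGrok(grok, call, params, log) {
  // `await` works whether the factory returns the client or a Promise of it.
  const client = await grok.createVoiceAgentAPIClient(params);

  // Caller audio goes to Grok, Grok audio goes back to the caller.
  call.sendMediaTo(client);
  client.sendMediaTo(call);

  // WebSocket media event: Grok audio started playing.
  client.addEventListener(grok.Events.WebSocketMediaStarted, () => {
    log("Grok audio started");
  });

  // Voice Agent API event: the assistant finished a response.
  client.addEventListener(grok.VoiceAgentAPIEvents.ResponseDone, (event) => {
    log("Response done: " + JSON.stringify(event.data));
  });

  return client;
}
```

In a scenario, this function would typically be called from the handler for a new call, with the real `Grok` namespace passed as `grok` and a hypothetical `{ apiKey: "..." }` object as `params`.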
Factory functions
createVoiceAgentAPIClient
Creates a new Grok.VoiceAgentAPIClient instance.
The required parameters object is typed as Grok.VoiceAgent.
Parameters
Returns
VoiceAgentAPIClient
Methods
addEventListener
Adds a handler for the specified Grok.VoiceAgentAPIEvents or Grok.Events event. Handlers must be functions; passing anything else causes an error and terminates the scenario when the handler is called.
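Because a non-function handler only fails when the event fires, it can help to validate handlers at registration time. The helper below is a hypothetical wrapper, not part of the Grok module; it assumes only that the client exposes `addEventListener(eventName, handler)`.

```javascript
// Sketch: validate the handler before registering it, so a mistake surfaces
// at registration time rather than when the event fires.
function addListenerSafely(client, eventName, handler) {
  if (typeof handler !== "function") {
    throw new TypeError("Handler for " + String(eventName) + " must be a function");
  }
  client.addEventListener(eventName, handler);
}
```

For example, `addListenerSafely(client, Grok.VoiceAgentAPIEvents.ResponseDone, onResponseDone)` registers the handler, while passing a string or an undefined variable throws immediately.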
Parameters
Returns
clearMediaBuffer
Clears the Grok WebSocket media buffer.
Parameters
Returns
close
Closes the Grok WebSocket connection or cancels a connection attempt that is still in progress.
Parameters
This method does not accept parameters.
Returns
conversationItemCreate
Creates a new user message. https://docs.x.ai/docs/guides/voice/agent#client
Parameters
Returns
Example parameters:
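As a rough illustration, the parameters for a user text message might be built as follows. The `item` shape mirrors the xAI `conversation.item.create` client event; whether conversationItemCreate expects exactly this object is an assumption, and `buildUserTextItem` is a hypothetical helper.

```javascript
// Sketch: build the parameters for a user text message.
// The item shape follows the xAI "conversation.item.create" client event and
// is an assumption about what conversationItemCreate expects.
function buildUserTextItem(text) {
  return {
    item: {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text: text }],
    },
  };
}

// Usage inside a scenario (client created earlier):
// client.conversationItemCreate(buildUserTextItem("What are your opening hours?"));
```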
id
Returns the VoiceAgentAPIClient id.
Parameters
This method does not accept parameters.
Returns
inputAudioBufferClear
Clears the input audio buffer. https://docs.x.ai/docs/guides/voice/agent#client-1
Parameters
Returns
Example parameters:
removeEventListener
Removes a handler for the specified Grok.VoiceAgentAPIEvents or Grok.Events event.
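One pattern that relies on removing handlers is a one-shot listener. The sketch below assumes that `removeEventListener(eventName, handler)` detaches the exact function that was registered; `listenOnce` is a hypothetical helper, not part of the module.

```javascript
// Sketch: run a handler for the first occurrence of an event, then detach it.
// Assumes removeEventListener(eventName, handler) removes that exact function.
function listenOnce(client, eventName, handler) {
  const wrapper = (event) => {
    client.removeEventListener(eventName, wrapper);
    handler(event);
  };
  client.addEventListener(eventName, wrapper);
}
```

This suits events that matter only once per session, such as waiting for ConversationCreated before sending the first session update.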
Parameters
Returns
responseCreate
Requests the server to create a new assistant response when using client-side VAD. (With server-side VAD, this is handled automatically.) https://docs.x.ai/docs/guides/voice/agent#client-2
Parameters
Returns
Example parameters:
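With client-side VAD, a text turn could be sent and answered as sketched below. Both parameter shapes are assumptions modeled on the xAI `conversation.item.create` and `response.create` client events, and `sendUserTextAndRespond` is a hypothetical helper.

```javascript
// Sketch: with client-side VAD, add the user's text and then explicitly ask
// for a reply. The parameter shapes are assumptions based on the xAI events.
function sendUserTextAndRespond(client, text) {
  client.conversationItemCreate({
    item: {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text: text }],
    },
  });
  // Assumed shape: request both text and audio output for this response.
  client.responseCreate({ response: { modalities: ["text", "audio"] } });
}
```

The order matters: the message is added to the conversation first, so the response request that follows can take it into account.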
sendMediaTo
Starts sending media from Grok (via WebSocket) to the media unit. Grok works in real time.
Parameters
Returns
sessionUpdate
Updates the session’s configuration. https://docs.x.ai/docs/guides/voice/agent#client-events-1
Parameters
Returns
Example parameters:
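A session update might be assembled as in the sketch below. The `session` wrapper and the `instructions`, `voice`, and `turn_detection` fields follow the xAI `session.update` event as best understood here; treat every field name as an assumption, and `buildSessionUpdate` as a hypothetical helper.

```javascript
// Sketch: build sessionUpdate parameters from a few options.
// Field names are assumptions modeled on the xAI "session.update" event.
function buildSessionUpdate(options) {
  return {
    session: {
      instructions: options.instructions,
      voice: options.voice,
      // server_vad lets the server detect turns; null leaves turn handling
      // to the client, which then calls responseCreate itself.
      turn_detection: options.serverVad ? { type: "server_vad" } : null,
    },
  };
}

// Usage inside a scenario (client created earlier; values are illustrative):
// client.sessionUpdate(buildSessionUpdate({
//   instructions: "You are a concise phone assistant.",
//   voice: "Ara",
//   serverVad: true,
// }));
```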
stopMediaTo
Stops sending media from Grok (via WebSocket) to the media unit.
Parameters
Returns
webSocketId
Returns the Grok WebSocket id.
Parameters
This method does not accept parameters.
Returns
Events
These events describe audio received through the Grok WebSocket media bridge.
WebSocketMediaStarted
Triggered when the audio stream sent by a third party through a Grok WebSocket starts playing.
Event constant: Events.WebSocketMediaStarted
Payload
WebSocketMediaEnded
Triggered after the audio stream sent by a third party through a Grok WebSocket ends (1 second of silence).
Event constant: Events.WebSocketMediaEnded
Payload
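The two media events can be combined to track whether Grok audio is currently playing. In the sketch below, `events` stands in for the `Grok.Events` namespace and `trackAgentAudio` is a hypothetical helper; the payloads are not inspected, so no payload fields are assumed.

```javascript
// Sketch: keep a flag that reflects whether Grok audio is playing right now.
// `events` stands in for Grok.Events.
function trackAgentAudio(client, events) {
  let playing = false;
  client.addEventListener(events.WebSocketMediaStarted, () => {
    playing = true;
  });
  client.addEventListener(events.WebSocketMediaEnded, () => {
    playing = false;
  });
  return { isPlaying: () => playing };
}
```

Since WebSocketMediaEnded fires only after one second of silence, the flag stays true briefly after the last audio chunk has played.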
VoiceAgentAPIEvents
These events mirror server messages from the Grok Voice Agent API. The data field contains the provider event payload.
All VoiceAgentAPIEvents callbacks receive these common fields:
Per-event payload tables below show only event-specific fields (and any provider payload enrichment for data).
Unknown
The unknown event.
Event constant: VoiceAgentAPIEvents.Unknown
Payload
No event-specific payload columns are listed here; this callback still receives the common client and data fields. For data, see the partner documentation for the exact JSON shape.
ConversationCreated
The first message at connection. Notifies the client that a conversation session has been created. https://docs.x.ai/docs/guides/voice/agent#server-events-2
Event constant: VoiceAgentAPIEvents.ConversationCreated
Payload
Example data:
SessionUpdated
Acknowledges the client’s “session.update” message and confirms that the session has been updated. https://docs.x.ai/docs/guides/voice/agent#server-events-1
Event constant: VoiceAgentAPIEvents.SessionUpdated
Payload
Example data:
ConversationItemAdded
Notifies the client that a new user message or an assistant response has been added to the conversation history. https://docs.x.ai/docs/guides/voice/agent#server
Event constant: VoiceAgentAPIEvents.ConversationItemAdded
Payload
Example data:
ConversationItemInputAudioTranscriptionCompleted
Notifies the client that transcription of the input audio has been completed. https://docs.x.ai/docs/guides/voice/agent#server
Event constant: VoiceAgentAPIEvents.ConversationItemInputAudioTranscriptionCompleted
Payload
Example data:
InputAudioBufferCommitted
Input audio buffer has been committed. https://docs.x.ai/docs/guides/voice/agent#server-1
Event constant: VoiceAgentAPIEvents.InputAudioBufferCommitted
Payload
Example data:
InputAudioBufferCleared
Input audio buffer has been cleared. https://docs.x.ai/docs/guides/voice/agent#server-1
Event constant: VoiceAgentAPIEvents.InputAudioBufferCleared
Payload
Example data:
InputAudioBufferSpeechStarted
Notifies the client that the server’s VAD has detected the start of speech. https://docs.x.ai/docs/guides/voice/agent#server-1
Event constant: VoiceAgentAPIEvents.InputAudioBufferSpeechStarted
Payload
Example data:
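A common use of this event is barge-in: when the caller starts speaking, any assistant audio still queued for playback is dropped. The sketch below assumes that `clearMediaBuffer` can be called without arguments; `apiEvents` stands in for `Grok.VoiceAgentAPIEvents`, and `enableBargeIn` is a hypothetical helper.

```javascript
// Sketch: interrupt assistant playback as soon as the caller starts speaking.
// `apiEvents` stands in for Grok.VoiceAgentAPIEvents.
function enableBargeIn(client, apiEvents) {
  client.addEventListener(apiEvents.InputAudioBufferSpeechStarted, () => {
    // Drop assistant audio that is still queued for playback.
    // Calling clearMediaBuffer with no arguments is an assumption.
    client.clearMediaBuffer();
  });
}
```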
InputAudioBufferSpeechStopped
Notifies the client that the server’s VAD has detected the end of speech. https://docs.x.ai/docs/guides/voice/agent#server-1
Event constant: VoiceAgentAPIEvents.InputAudioBufferSpeechStopped
Payload
Example data:
ResponseCreated
A new assistant response turn is in progress. Audio deltas created during this assistant turn share the same response id. https://docs.x.ai/docs/guides/voice/agent#server-2
Event constant: VoiceAgentAPIEvents.ResponseCreated
Payload
Example data:
ResponseDone
The assistant’s response is completed. https://docs.x.ai/docs/guides/voice/agent#server-2
Event constant: VoiceAgentAPIEvents.ResponseDone
Payload
Example data:
ResponseOutputItemAdded
A new assistant response is added to message history. https://docs.x.ai/docs/guides/voice/agent#server-2
Event constant: VoiceAgentAPIEvents.ResponseOutputItemAdded
Payload
Example data:
ResponseOutputItemDone
An assistant response output item is done.
Event constant: VoiceAgentAPIEvents.ResponseOutputItemDone
Payload
Example data:
ResponseOutputAudioTranscriptDelta
Audio transcript delta of the assistant response. https://docs.x.ai/docs/guides/voice/agent#server-3
Event constant: VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDelta
Payload
Example data:
ResponseOutputAudioTranscriptDone
The audio transcript of the assistant response has finished generating. https://docs.x.ai/docs/guides/voice/agent#server-3
Event constant: VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDone
Payload
Example data:
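Transcript deltas can be accumulated until the matching done event arrives. The sketch below assumes that each delta event carries its text in `data.delta`, which follows the xAI event naming but is not confirmed by this page; `createTranscriptCollector` is a hypothetical helper.

```javascript
// Sketch: accumulate assistant transcript deltas and emit the full text when
// the transcript is done. Assumes the delta text is in event.data.delta.
function createTranscriptCollector(onComplete) {
  let text = "";
  return {
    onDelta(event) {
      text += (event && event.data && event.data.delta) || "";
    },
    onDone() {
      const full = text;
      text = ""; // reset for the next assistant turn
      onComplete(full);
    },
  };
}

// Usage inside a scenario (client created earlier):
// const collector = createTranscriptCollector((t) => Logger.write("Agent: " + t));
// client.addEventListener(Grok.VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDelta, collector.onDelta);
// client.addEventListener(Grok.VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDone, collector.onDone);
```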
ResponseOutputAudioDone
Notifies the client that the audio for this turn has finished generating. https://docs.x.ai/docs/guides/voice/agent#server-3
Event constant: VoiceAgentAPIEvents.ResponseOutputAudioDone
Payload
Example data:
ResponseContentPartAdded
Notifies the client that a content part has been added.
Event constant: VoiceAgentAPIEvents.ResponseContentPartAdded
Payload
Example data:
ResponseContentPartDone
Notifies the client that a content part is done.
Event constant: VoiceAgentAPIEvents.ResponseContentPartDone
Payload
Example data:
ResponseFunctionCallArgumentsDone
Function call triggered with complete arguments. https://docs.x.ai/docs/guides/voice/agent#handling-function-call-responses
Event constant: VoiceAgentAPIEvents.ResponseFunctionCallArgumentsDone
Payload
Example data:
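Handling a function call usually means running the named tool, returning its result to the conversation, and asking for a follow-up response. The sketch below assumes that `data` carries `name`, `call_id`, and a JSON string in `arguments`, and that the result is returned as a `function_call_output` item, as in the xAI guide linked above. Calling `responseCreate` with an empty object is also an assumption. `handleFunctionCall` and the `tools` map are hypothetical.

```javascript
// Sketch: run a local tool for a function call and send the result back.
// Assumes event.data has name, call_id, and arguments (a JSON string).
function handleFunctionCall(client, event, tools) {
  const { name, call_id, arguments: rawArgs } = event.data;
  let output;
  try {
    // Only accept tools defined directly on the map, not inherited properties.
    const known = Object.prototype.hasOwnProperty.call(tools, name);
    if (!known || typeof tools[name] !== "function") {
      throw new Error("Unknown tool: " + name);
    }
    output = tools[name](JSON.parse(rawArgs || "{}"));
  } catch (err) {
    // Report the failure to the model instead of crashing the scenario.
    output = { error: String(err.message) };
  }
  client.conversationItemCreate({
    item: {
      type: "function_call_output",
      call_id: call_id,
      output: JSON.stringify(output),
    },
  });
  // Ask the assistant to continue now that the tool result is available.
  client.responseCreate({});
}
```

Errors are returned as tool output so the assistant can explain the problem to the caller rather than going silent.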
WebSocketError
The WebSocket error response event.
Event constant: VoiceAgentAPIEvents.WebSocketError
Payload
No event-specific payload columns are listed here; this callback still receives the common client and data fields. For data, see the partner documentation for the exact JSON shape.
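One way to react to this event is to close the client and hand control back to the scenario. In the sketch below, `apiEvents` stands in for `Grok.VoiceAgentAPIEvents`, and `closeOnWebSocketError` and its `onClosed` callback are hypothetical; the error payload is passed through untouched because its shape is not documented here.

```javascript
// Sketch: close the Grok connection when a WebSocket error is reported,
// then let the caller decide what to do next.
function closeOnWebSocketError(client, apiEvents, onClosed) {
  client.addEventListener(apiEvents.WebSocketError, (event) => {
    client.close();
    onClosed(event);
  });
}
```

In a scenario, `onClosed` might log the event and play a fallback message or end the call.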
ConnectorInformation
Contains information about the connector.
Event constant: VoiceAgentAPIEvents.ConnectorInformation
Payload
No event-specific payload columns are listed here; this callback still receives the common client and data fields. For data, see the partner documentation for the exact JSON shape.