ElevenLabs Agents in VoxEngine
Benefits

The native ElevenLabs module connects Voximplant calls to ElevenLabs Agents for real-time, speech-to-speech conversations. The connector streams audio between Voximplant and ElevenLabs while keeping call control in VoxEngine.

Capability and feature highlights:

  • Connect inbound and outbound calls to an ElevenLabs agent, selected by its agent ID.
  • Bi-directional audio streaming with low latency and built-in media conversion.
  • Barge-in support using interruption events and media buffer control.
  • Real-time events for transcripts, responses, tool calls, and diagnostics.
  • Client-side tool execution with ClientToolCall and clientToolResult.

Architecture

ElevenLabs Agents is a stateful WebSocket service. VoxEngine opens a session and streams caller audio to the agent while receiving agent audio, transcripts, and events on the same connection.
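
The session flow can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the module's reference usage: bridgeCallToAgent is a hypothetical helper, the VoxEngine and ElevenLabs namespaces are passed in as parameters so the function can be exercised outside the platform, and it assumes that createAgentsClient may return a promise and that the resulting client can be handed to VoxEngine.sendMediaBetween as a media unit.

```javascript
// Hypothetical helper: opens an agent session for a call and bridges audio.
// `voxEngine` and `elevenLabs` stand in for the VoxEngine and ElevenLabs
// globals available in a scenario after require(Modules.ElevenLabs).
async function bridgeCallToAgent({ voxEngine, elevenLabs, call, xiApiKey, agentId }) {
  if (!xiApiKey || !agentId) {
    throw new Error("xiApiKey and agentId are both required");
  }

  // `await` works whether createAgentsClient returns a client or a promise.
  const agentsClient = await elevenLabs.createAgentsClient({ xiApiKey, agentId });

  // Assumption: the agents client is a media unit, so one call sets up
  // both directions (caller audio to the agent, agent audio to the caller).
  voxEngine.sendMediaBetween(call, agentsClient);

  return agentsClient;
}
```

Inside a scenario, the helper would be called with the real globals, for example bridgeCallToAgent({ voxEngine: VoxEngine, elevenLabs: ElevenLabs, call, xiApiKey, agentId }). Injecting the namespaces is a design choice for testability only; calling the globals directly is equally valid.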

Prerequisites

  • ElevenLabs API key
  • ID of an agent configured in the ElevenLabs console

Development notes

  • Native VoxEngine module: load with require(Modules.ElevenLabs) and create an ElevenLabs.AgentsClient via ElevenLabs.createAgentsClient({ xiApiKey, agentId }).
  • Agent configuration: prompts, voices, and tools are configured in the ElevenLabs Agent console. Use agentId to select the agent to run.
  • Barge-in: listen for ElevenLabs.AgentsEvents.Interruption and call agentsClient.clearMediaBuffer() to stop the agent audio that is currently playing.
  • Context and user text: optionally call conversationInitiationClientData, contextualUpdate, or userMessage to inject metadata or text.
  • Function calling: handle ElevenLabs.AgentsEvents.ClientToolCall and respond with clientToolResult.
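
The barge-in and function-calling notes above could be wired up along these lines. This is a sketch, not the module's documented usage: runClientTool, wireAgentEvents, and the get_order_status tool are hypothetical names. The payload field names (tool_name, tool_call_id, parameters, result, is_error) and the location of the tool call inside the event (event.data.client_tool_call) are assumptions modelled on the ElevenLabs WebSocket messages. It also assumes the client exposes addEventListener. Check the module API reference for the actual shapes.

```javascript
// Hypothetical registry of client-side tools, keyed by tool name.
const toolHandlers = new Map([
  ["get_order_status", ({ orderId }) => ({ orderId, status: "shipped" })],
]);

// Runs the handler for one tool call and builds a result payload.
// Assumption: field names mirror the ElevenLabs WebSocket messages.
function runClientTool(toolCall, handlers) {
  const { tool_name, tool_call_id, parameters } = toolCall;
  const handler = handlers.get(tool_name);
  if (typeof handler !== "function") {
    return { tool_call_id, result: `Unknown tool: ${tool_name}`, is_error: true };
  }
  try {
    const value = handler(parameters || {});
    return { tool_call_id, result: JSON.stringify(value), is_error: false };
  } catch (err) {
    // Report the failure to the agent instead of leaving the call pending.
    return { tool_call_id, result: String(err && err.message ? err.message : err), is_error: true };
  }
}

// Attaches barge-in and tool handling to an agents client.
// `events` stands in for ElevenLabs.AgentsEvents.
function wireAgentEvents(agentsClient, events, handlers) {
  agentsClient.addEventListener(events.Interruption, () => {
    // Drop buffered agent audio so the caller is not talked over.
    agentsClient.clearMediaBuffer();
  });

  agentsClient.addEventListener(events.ClientToolCall, (event) => {
    // Assumption: the tool call sits under event.data.client_tool_call.
    const toolCall = event?.data?.client_tool_call ?? event?.data;
    agentsClient.clientToolResult(runClientTool(toolCall, handlers));
  });
}
```

In a scenario this would be called once after the client is created, for example wireAgentEvents(agentsClient, ElevenLabs.AgentsEvents, toolHandlers). Keeping runClientTool free of platform calls means the tool logic can be unit-tested without a live call.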

See the ElevenLabs module API reference for full details on methods, events, and types.

Examples

Voximplant

ElevenLabs