---
title: 'Example: Function calling'
---

This example answers an inbound call, connects it to Cartesia Line Agents, and uses an HTTP callback for tool actions (`end_call`, `call_transfer`).

**Jump to the [Full VoxEngine scenario](#full-voxengine-scenario).**

## Prerequisites

* Deploy a Cartesia Line Agent script that sends HTTP callbacks to VoxEngine for tool actions (see the [Cartesia agent example](#cartesia-agent-example-python)).
* Store your [Cartesia API key](https://play.cartesia.ai/keys) in Voximplant `ApplicationStorage` under `CARTESIA_API_KEY`.
* Store your [Cartesia Agent ID](https://play.cartesia.ai/agents) in Voximplant `ApplicationStorage` under `CARTESIA_AGENT_ID`.
* Ensure your scenario can place outbound PSTN calls with a verified caller ID (used by transfer). See [How can I enable Caller ID for outbound calls](https://voximplant.com/help/faq/how-can-i-enable-caller-id-for-outbound-calls).
* Store a valid outbound caller ID in Voximplant `ApplicationStorage` under `PSTN_CALLER_ID`. The transfer uses it when dialing the PSTN.

## Session setup

This setup is similar to the inbound example, with one extra control-plane step for tools:

1. `AppEvents.Started`: capture the per-call `accessSecureURL` as `sessionControlUrl`.
2. `AppEvents.CallAlerting`: answer the inbound call and (optionally) start recording.
3. Create a `Cartesia.AgentsClient` using the credentials stored in `ApplicationStorage`.
4. Bridge media between the caller leg and Cartesia.
5. Start the Cartesia session with any metadata to pass to the agent, including `vox_session_control_url`.

## Function call execution

### Architecture

Cartesia Line Agents currently expose limited call controls over their WebSocket interface. To work around this, the scenario uses VoxEngine's `AppEvents.HttpRequest` as a callback endpoint through which the Cartesia agent requests telephony actions (call transfer, hang up). The Line Agent can then orchestrate complex call control flows while VoxEngine retains ownership of telephony actions and routing decisions.

![Cartesia Line Agents architecture](https://files.buildwithfern.com/voximplant.docs.buildwithfern.com/6e3e6086d7343023896cf1283a3372db19e71922b33485632dccdcf240f035e3/docs/assets/diagrams/cartesia-line-agent-architecture.jpg)

### HTTP control model

This scenario uses the following control-plane pattern:

1. VoxEngine gets a unique `accessSecureURL` in `AppEvents.Started`.
2. VoxEngine passes that URL to Cartesia in `metadata.vox_session_control_url`.
3. Cartesia Python tool handlers `POST` actions such as `{"action":"end_call"}` or `{"action":"call_transfer"}`.
4. VoxEngine receives those requests in `AppEvents.HttpRequest` and executes telephony actions.

## How `AppEvents.HttpRequest` is used

`AppEvents.HttpRequest` is the handoff point where Cartesia tool decisions become telephony actions:

1. Cartesia sends an HTTPS `POST` to the per-call `vox_session_control_url`.
2. VoxEngine receives request content in `appEvent.content`.
3. Scenario parses JSON and switches on `cmd.action`.
4. Scenario executes telephony action (`hangup`, `callTransfer`, bridge).
5. Scenario returns a JSON string response (`{"ok": true}`) to acknowledge receipt.

Expected request payloads:

* `{"action":"end_call"}`
* `{"action":"call_transfer","summary":"optional short summary"}`

Minimal response payload:

* `{"ok": true}`

### Passing the control URL and metadata to Cartesia

Here we pass information such as the control URL to Cartesia when we start the session.

```js title="Session control URL + start metadata"
VoxEngine.addEventListener(AppEvents.Started, (appEvent) => {
  sessionControlUrl = appEvent.accessSecureURL;
});

voiceAIClient.start({
  metadata: {
    from: call.callerid(),
    to: call.number(),
    vox_session_control_url: sessionControlUrl
  }
});
```

### Handling control actions in `AppEvents.HttpRequest`

```js title="HTTP control handler"
VoxEngine.addEventListener(AppEvents.HttpRequest, (appEvent) => {
  const cmd = JSON.parse(appEvent.content);
  if (cmd.action === "end_call") {
    callerLeg.hangup();
    voiceAIClient?.close();
    VoxEngine.terminate();
  } else if (cmd.action === "call_transfer") {
    callTransfer(cmd);
  }
  return JSON.stringify({ ok: true });
});
```

## High-level flows

### Hang-up flow (`end_call`)

1. Caller asks to end the call.
2. Cartesia Python agent runs `end_call()` and posts `{"action":"end_call"}`.
3. VoxEngine `AppEvents.HttpRequest` receives the action.
4. Scenario hangs up caller leg, closes Cartesia websocket, and terminates the session.

### Transfer flow (`call_transfer`)

1. Caller asks to transfer to a human.
2. Cartesia Python agent sends transfer confirmation, then posts `{"action":"call_transfer","summary":"..."}`.
3. VoxEngine `AppEvents.HttpRequest` calls `callTransfer(...)`.
4. Scenario detaches caller from Cartesia (`stopMediaBetween` + websocket close), which makes this a blind transfer handoff pattern.
5. Scenario dials consult PSTN call with `VoxEngine.callPSTN(...)` using `PSTN_CALLER_ID`.
6. On consult answer, scenario bridges caller to consult leg with `VoxEngine.sendMediaBetween(callerLeg, consultLeg)`.
7. If consult call fails, scenario logs failure and hangs up caller leg.

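Both flows begin with a JSON body arriving in `AppEvents.HttpRequest`, and a direct `JSON.parse` throws if that body is malformed. The following is a minimal sketch of a defensive parser that could run before the action switch. `parseControlCommand`, `KNOWN_ACTIONS`, and the error strings are hypothetical names chosen for illustration; they are not part of VoxEngine or the Cartesia SDK.

```javascript
// Hypothetical helper (not a VoxEngine API): validate a control request body
// before acting on it. Plain JavaScript, so it can be tested outside a scenario.
const KNOWN_ACTIONS = new Set(["end_call", "call_transfer"]);

function parseControlCommand(content) {
  let cmd;
  try {
    cmd = JSON.parse(content);
  } catch (err) {
    // Malformed or missing body: report it instead of throwing in the handler.
    return { ok: false, error: "invalid_json" };
  }
  // Only plain JSON objects are valid commands.
  if (cmd === null || typeof cmd !== "object" || Array.isArray(cmd)) {
    return { ok: false, error: "invalid_payload" };
  }
  if (!KNOWN_ACTIONS.has(cmd.action)) {
    return { ok: false, error: "unknown_action" };
  }
  // `summary` is optional; normalize it to a string.
  const summary = typeof cmd.summary === "string" ? cmd.summary : "";
  return { ok: true, action: cmd.action, summary };
}

// Example usage with the payloads described above.
console.log(parseControlCommand('{"action":"end_call"}'));
console.log(parseControlCommand('{"action":"call_transfer","summary":"billing question"}'));
console.log(parseControlCommand("not json"));
```

In a scenario, the handler could call this helper on `appEvent.content` and return the result as the JSON response, so the agent side can tell an acknowledged command from a rejected one. Whether to reject unknown actions or only log them, as the full scenario does, is a design choice.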
## Cartesia agent example (Python)

The full Cartesia Line Python agent example used with this callback pattern:

```python title={"cartesia-line-tools-agent.py"} maxLines={0}
import os
import asyncio
from typing import Annotated, Optional

import httpx

from line.llm_agent import LlmAgent
from line.llm_agent.config import LlmConfig
from line.events import AgentEndCall, AgentSendText
from line.llm_agent.tools.decorators import passthrough_tool
from line.llm_agent.tools.utils import ToolEnv
from line.voice_agent_app import AgentEnv, CallRequest, VoiceAgentApp

DEFAULT_SYSTEM_PROMPT = """\
You are a helpful voice agent running on Cartesia Line, connected to a phone call via Voximplant.

This demo has a very simple call transfer:
1) If the caller asks for Voxy or a human, call call_transfer(summary=...).
2) The call_transfer tool will speak the transfer confirmation to the caller.
3) Do not continue the conversation after calling call_transfer.

Ending the call:
- If the caller asks to hang up or says goodbye, say a short goodbye and then call end_call().

Be concise, polite, and ask one question at a time.
"""

DEFAULT_INTRODUCTION = "Hi, this is Voximplant Voice AI powered by Cartesia Line. How can I help?"


async def _post_vox_control(
    url: str,
    payload: dict,
    *,
    timeout_s: float = 3.0,
) -> dict:
    """Send a POST request to the given URL with the provided payload."""
    try:
        async with httpx.AsyncClient(timeout=timeout_s) as client:
            response = await client.post(url, json=payload)
            try:
                return response.json()
            except Exception:
                return {"status_code": response.status_code, "text": response.text}
    except Exception:
        return {"error": "request_failed"}


async def get_agent(env: AgentEnv, request: CallRequest):
    # Extract the Voximplant control URL from the request metadata, if present and valid.
    vox_control_url: Optional[str] = None
    if request.metadata and isinstance(request.metadata, dict):
        raw = request.metadata.get("vox_session_control_url")
        if isinstance(raw, str) and raw.startswith("https://"):
            vox_control_url = raw

    if not vox_control_url:
        return {
            "error": "missing_vox_control_url",
            "message": "The call request is missing a valid 'vox_session_control_url' in its metadata.",
        }

    @passthrough_tool
    async def call_transfer(
        ctx: ToolEnv,
        summary: Annotated[
            str,
            "Optional short transfer summary (for logs / analytics). Voximplant will receive it.",
        ] = "",
    ):
        """Request Voximplant to transfer the call to a human (Voximplant performs the telephony actions).

        Speak first so the caller reliably hears the transfer confirmation before Voximplant detaches
        the agent audio and bridges to PSTN.
        """
        yield AgentSendText(text="Sure. One moment, I'm transferring you to a human now.")

        async def _do_transfer():
            # Give TTS time to play before Voximplant starts tearing down the agent bridge.
            # use line.events.AgentTurnEnded to be more precise
            await asyncio.sleep(5.0)
            await _post_vox_control(
                vox_control_url,
                {"action": "call_transfer", "summary": summary},
                timeout_s=10.0,
            )

        asyncio.create_task(_do_transfer())

    @passthrough_tool
    async def end_call(ctx: ToolEnv):
        """End the call."""
        yield AgentEndCall()
        # Give TTS time to play before Voximplant starts tearing down the agent bridge.
        # use line.events.AgentTurnEnded to be more precise
        await asyncio.sleep(3.0)
        await _post_vox_control(vox_control_url, {"action": "end_call"})

    agent = LlmAgent(
        model="gpt-5-nano",
        api_key=os.getenv("OPENAI_API_KEY", ""),
        tools=[end_call, call_transfer],
        config=LlmConfig(
            system_prompt=DEFAULT_SYSTEM_PROMPT,  # use request.agent.system_prompt for the GUI version
            introduction=DEFAULT_INTRODUCTION,  # use request.agent.introduction for the GUI version
        ),
    )

    return agent


voice_agent_app = VoiceAgentApp(get_agent=get_agent)
app = voice_agent_app.fastapi_app

if __name__ == "__main__":
    # Local dev only. Cartesia runs this as a web service in the cloud.
    voice_agent_app.run()
```

This Python sample expects `metadata.vox_session_control_url` and uses it to call back into VoxEngine.

## Notes

**Voximplant**

* Voximplant Cartesia module API reference: [https://voximplant.com/docs/references/voxengine/cartesia](https://voximplant.com/docs/references/voxengine/cartesia)
* Working with API requests guide: [https://voximplant.com/docs/guides/voxengine/api](https://voximplant.com/docs/guides/voxengine/api)
* HTTP Callback guide: [https://voximplant.com/docs/guides/management-api/callbacks](https://voximplant.com/docs/guides/management-api/callbacks)

**Cartesia**

* Cartesia Line Agents documentation: [https://docs.cartesia.ai/line-agents/overview](https://docs.cartesia.ai/line-agents/overview)
* Cartesia Line web calls event model: [https://docs.cartesia.ai/line/integrations/web-calls](https://docs.cartesia.ai/line/integrations/web-calls)

## Full VoxEngine scenario

```javascript title={"voxengine-cartesia-tools.js"} maxLines={0}
// Voximplant VoxEngine scenario:
// - Streams caller audio <-> Cartesia Line agent (Agents connector)
// - Supports:
//   - end_call: hang up the caller leg
//   - call_transfer: place an outbound PSTN consult call and then bridge caller -> consult leg
//
// Configure these keys in Voximplant Application Storage:
// - CARTESIA_API_KEY
// - CARTESIA_AGENT_ID
// - PSTN_CALLER_ID (required for callPSTN; must be a real E.164 number in your Voximplant account)

require(Modules.Cartesia);
require(Modules.ApplicationStorage);
require(Modules.ASR);

const CALL_TRANSFER_NUMBER = "+18339906144";

// Per-session control URL. Any HTTPS request to this URL triggers AppEvents.HttpRequest in this session.
// We'll pass it into Cartesia call metadata so the agent runtime can request telephony actions via HTTP.
let sessionControlUrl = null;

// Current call session state (single-call demo scenario).
let callerLeg = null;
let voiceAIClient = null;
let consultLeg = null;
let transferInProgress = false;
let transferred = false;

VoxEngine.addEventListener(AppEvents.Started, (appEvent) => {
  sessionControlUrl = appEvent.accessSecureURL;
  Logger.write(`===SESSION_CONTROL_URL_READY===>${JSON.stringify({accessSecureURL: sessionControlUrl}) || ""}`);
});

VoxEngine.addEventListener(AppEvents.HttpRequest, async (appEvent) => {
  Logger.write(`===HTTP_CONTROL_REQUEST===>${JSON.stringify({method: appEvent.method, path: appEvent.path}) || ""}`);

  // Check for and handle control commands
  const cmd = JSON.parse(appEvent?.content);
  if (cmd.action === "end_call") {
    Logger.write(`===CONTROL_END_CALL===>${JSON.stringify(cmd) || ""}`);
    callerLeg.hangup();
    voiceAIClient?.close();
    VoxEngine.terminate();
  } else if (cmd.action === "call_transfer") {
    Logger.write(`===CONTROL_CALL_TRANSFER===>${JSON.stringify(cmd) || ""}`);
    await callTransfer(cmd);
  } else {
    Logger.write(`===CONTROL_COMMAND_UNKNOWN===>${JSON.stringify(cmd) || ""}`);
  }

  return JSON.stringify({ok: true});
});

async function callTransfer(cmd) {
  if (transferInProgress || transferred) return;
  if (!callerLeg) return;

  transferInProgress = true;
  Logger.write(`===CALL_TRANSFER_REQUESTED===>${JSON.stringify(cmd || {}) || ""}`);

  // Detach the agent now for a blind transfer. Delay or conference for a warm transfer
  if (voiceAIClient) {
    VoxEngine.stopMediaBetween(callerLeg, voiceAIClient);
    voiceAIClient.close();
    voiceAIClient = null;
  }

  // Transfer the call to a new number
  const currentPstnCallerId = (await ApplicationStorage.get("PSTN_CALLER_ID")).value;
  consultLeg = VoxEngine.callPSTN(CALL_TRANSFER_NUMBER, currentPstnCallerId, {followDiversion: true});

  consultLeg.addEventListener(CallEvents.Failed, () => {
    transferInProgress = false;
    Logger.write(`===CONSULT_CALL_FAILED===>${JSON.stringify({}) || ""}`);
    callerLeg.hangup();
  });

  consultLeg.addEventListener(CallEvents.Disconnected, (event) => {
    Logger.write(`===CONSULT_CALL_DISCONNECTED===>${JSON.stringify(event) || ""}`);
  });

  consultLeg.addEventListener(CallEvents.Connected, () => {
    Logger.write(`===CONSULT_CALL_CONNECTED===>${JSON.stringify({}) || ""}`);
    transferInProgress = false;
    transferred = true;
    VoxEngine.sendMediaBetween(callerLeg, consultLeg);
  });
}

function onWebSocketClose(event) {
  Logger.write(`===ON_WEB_SOCKET_CLOSE===>${JSON.stringify(event) || ""}`);
  // Ignore expected close during transfer
  if (transferInProgress || transferred || event.code === 1000) return;
  // otherwise end the call
  callerLeg.hangup();
  VoxEngine.terminate();
}

VoxEngine.addEventListener(AppEvents.CallAlerting, async ({call}) => {
  callerLeg = call;

  // Termination functions - add cleanup and logging as needed
  call.addEventListener(CallEvents.Disconnected, () => VoxEngine.terminate());
  call.addEventListener(CallEvents.Failed, () => VoxEngine.terminate());

  try {
    call.answer();
    call.record({hd_audio: true, stereo: true}); // Optional: record the call

    voiceAIClient = await Cartesia.createAgentsClient({
      apiKey: (await ApplicationStorage.get("CARTESIA_API_KEY")).value,
      agentId: (await ApplicationStorage.get("CARTESIA_AGENT_ID")).value,
      cartesiaVersion: "2025-04-16",
      onWebSocketClose,
    });

    VoxEngine.sendMediaBetween(call, voiceAIClient);

    voiceAIClient.start({
      // Optional metadata passed into the Cartesia agent
      metadata: {
        mode: "tools",
        from: call.callerid(),
        to: call.number(),
        vox_session_control_url: sessionControlUrl, // Control plane from Cartesia via HTTPS callback
      },
    });

    // "log only" handlers for debugging.
    [
      Cartesia.AgentsEvents.ACK,
      Cartesia.AgentsEvents.Clear,
      Cartesia.AgentsEvents.ConnectorInformation,
      Cartesia.AgentsEvents.DTMF,
      Cartesia.AgentsEvents.Unknown,
      Cartesia.AgentsEvents.WebSocketError,
      Cartesia.Events.WebSocketMediaStarted,
      Cartesia.Events.WebSocketMediaEnded,
    ].forEach((eventName) => {
      voiceAIClient.addEventListener(eventName, (event) => {
        Logger.write(`===${event.name}===>${JSON.stringify(event.data) || ""}`);
      });
    });
  } catch (error) {
    Logger.write(`===SOMETHING_WENT_WRONG===>${JSON.stringify(error) || String(error)}`);
    VoxEngine.terminate();
  }
});
```
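
The `transferInProgress` and `transferred` flags in the scenario behave like a small state machine: they make repeated `call_transfer` requests harmless and tell `onWebSocketClose` when a socket close is expected. The sketch below restates that logic as a standalone object so it can be reasoned about or tested in isolation. `createTransferGuard` and its method names are hypothetical and are not part of VoxEngine.

```javascript
// Hypothetical restatement of the scenario's transfer flags as one state value.
// States: "idle" -> "in_progress" -> "transferred", or back to "idle" on failure.
function createTransferGuard() {
  let state = "idle";
  return {
    // True only for the first request while idle; duplicate requests return false.
    tryBegin() {
      if (state !== "idle") return false;
      state = "in_progress";
      return true;
    },
    // Call when the consult leg connects.
    markConnected() {
      if (state === "in_progress") state = "transferred";
    },
    // Call when the consult leg fails (the scenario resets its flag, then hangs up).
    markFailed() {
      if (state === "in_progress") state = "idle";
    },
    // Mirrors onWebSocketClose: a close during or after a transfer,
    // or a normal close (code 1000), should not end the call.
    isCloseExpected(code) {
      return state !== "idle" || code === 1000;
    },
    current() {
      return state;
    },
  };
}

// Example usage: the second request is ignored while the first is in progress.
const guard = createTransferGuard();
console.log(guard.tryBegin(), guard.tryBegin(), guard.current());
```

A single state value avoids combinations the two booleans allow but the scenario never intends, such as both flags being true at once.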