---
title: 'Example: Answering an incoming call'
subtitle: Connect an inbound call to a Grok voice agent
---

## Overview

This minimal example answers an inbound call and connects it to the Grok Voice Agent API with only core settings and barge-in support, without extra tools or function calling.

**⬇️ Jump to the [Full VoxEngine scenario](#full-voxengine-scenario).**

## Prerequisites

* Set up an inbound entrypoint for the caller:
  * Phone number: [https://voximplant.com/docs/getting-started/basic-concepts/phone-numbers](https://voximplant.com/docs/getting-started/basic-concepts/phone-numbers)
  * WhatsApp: [https://voximplant.com/docs/guides/integrations/whatsapp](https://voximplant.com/docs/guides/integrations/whatsapp)
  * SIP user / SIP registration: [https://voximplant.com/docs/guides/calls/sip](https://voximplant.com/docs/guides/calls/sip)
  * Voximplant user: [https://voximplant.com/docs/getting-started/basic-concepts/users](https://voximplant.com/docs/getting-started/basic-concepts/users) (see also [https://voximplant.com/docs/guides/calls/scenarios#how-to-call-a-voximplant-user](https://voximplant.com/docs/guides/calls/scenarios#how-to-call-a-voximplant-user))
* Create a routing rule that points the destination to this scenario: [https://voximplant.com/docs/getting-started/basic-concepts/routing-rules](https://voximplant.com/docs/getting-started/basic-concepts/routing-rules)
* Store your Grok API key in `ApplicationStorage` under `XAI_API_KEY`.

## Architecture

```mermaid
graph LR
  Caller[PSTN/SIP/WhatsApp/WebRTC] --> VoxEngine[VoxEngine Scenario]
  VoxEngine -->|WebSocket Media| Grok[Grok Voice Agent API]
  Grok --> VoxEngine
  VoxEngine --> Caller
```

## Usage highlights

* Create a `VoiceAgentAPIClient` with `Grok.createVoiceAgentAPIClient(...)`.
* Configure the session with `voice`, `turn_detection`, and a short `instructions` prompt.
* Bridge audio with `VoxEngine.sendMediaBetween(call, client)`.
* Enable barge-in by clearing the media buffer when the caller starts speaking.

### Turn detection & barge-in

When `InputAudioBufferSpeechStarted` fires, clear the media buffer so the caller can interrupt the agent:

```js
voiceAgentAPIClient.addEventListener(
  Grok.VoiceAgentAPIEvents.InputAudioBufferSpeechStarted,
  () => voiceAgentAPIClient.clearMediaBuffer()
);
```

## Configure before you run

* Set `XAI_API_KEY` in `ApplicationStorage`.
* Adjust the `SYSTEM_PROMPT` in the example to match your brand voice and guardrails.

## Try it

Suggested test prompts:

* "Hello"
* "What can you help me with?"
* "Goodbye."

## Notes

[See the VoxEngine API Reference for more details](https://voximplant.com/docs/references/voxengine/grok).

## Full VoxEngine scenario

```javascript title={"voxengine-grok-answer-incoming-call.js"} maxLines={0}
require(Modules.Grok);
require(Modules.ApplicationStorage);

const SYSTEM_PROMPT = `
You are Voxi, a concise phone agent for Voximplant callers.
Keep answers brief and helpful.
If you do not know, say so and offer to connect them to a human.
`;

VoxEngine.addEventListener(AppEvents.CallAlerting, async ({call}) => {
  let voiceAIClient;

  // Termination functions - add cleanup and logging as needed
  call.addEventListener(CallEvents.Disconnected, () => VoxEngine.terminate());
  call.addEventListener(CallEvents.Failed, () => VoxEngine.terminate());

  call.answer();
  // call.record({ hd_audio: true, stereo: true }); // Optional: record the call

  try {
    voiceAIClient = await Grok.createVoiceAgentAPIClient({
      xAIApiKey: (await ApplicationStorage.get("XAI_API_KEY")).value,
      onWebSocketClose: (event) => {
        Logger.write("===Grok.WebSocket.Close===");
        if (event) Logger.write(JSON.stringify(event));
        VoxEngine.terminate();
      },
    });

    // Set up the session once created
    voiceAIClient.addEventListener(
      Grok.VoiceAgentAPIEvents.ConversationCreated,
      () => {
        voiceAIClient.sessionUpdate({
          session: {
            voice: "Ara",
            turn_detection: {type: "server_vad"},
            instructions: SYSTEM_PROMPT,
          },
        });
      },
    );

    // Wait for Grok setup, then bridge audio and trigger the greeting
    voiceAIClient.addEventListener(
      Grok.VoiceAgentAPIEvents.SessionUpdated,
      () => {
        VoxEngine.sendMediaBetween(call, voiceAIClient);
        voiceAIClient.responseCreate({instructions: "Hello! How can I help today?"});
      },
    );

    // Simple barge-in: clear buffered audio when caller starts speaking
    voiceAIClient.addEventListener(
      Grok.VoiceAgentAPIEvents.InputAudioBufferSpeechStarted,
      () => voiceAIClient.clearMediaBuffer(),
    );

    // -------------------- Log Other Events --------------------
    [
      CallEvents.FirstAudioPacketReceived,
      Grok.Events.WebSocketMediaStarted,
      Grok.Events.WebSocketMediaEnded,
      Grok.VoiceAgentAPIEvents.ConnectorInformation,
      Grok.VoiceAgentAPIEvents.ResponseCreated,
      Grok.VoiceAgentAPIEvents.ResponseOutputItemAdded,
      Grok.VoiceAgentAPIEvents.ResponseOutputItemDone,
      Grok.VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDelta,
      Grok.VoiceAgentAPIEvents.ResponseOutputAudioTranscriptDone,
      Grok.VoiceAgentAPIEvents.ResponseOutputAudioDone,
      Grok.VoiceAgentAPIEvents.ResponseDone,
      Grok.VoiceAgentAPIEvents.InputAudioBufferSpeechStopped,
      Grok.VoiceAgentAPIEvents.InputAudioBufferCommitted,
      Grok.VoiceAgentAPIEvents.ConversationItemAdded,
      Grok.VoiceAgentAPIEvents.ConversationItemInputAudioTranscriptionCompleted,
      Grok.VoiceAgentAPIEvents.WebSocketError,
      Grok.VoiceAgentAPIEvents.Unknown,
    ].forEach((evtName) => {
      voiceAIClient.addEventListener(evtName, (e) => {
        Logger.write(`===${e.name}===>${JSON.stringify(e)}`);
      });
    });
  } catch (error) {
    Logger.write("===SOMETHING_WENT_WRONG===");
    Logger.write(error);
    VoxEngine.terminate();
  }
});
```
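
The order in which the scenario's handlers fire (configure the session, wait for confirmation, bridge audio, greet, then clear the buffer on barge-in) can be reasoned about outside VoxEngine with a small stand-in. The sketch below is only an illustration: `FakeVoiceClient` and `wireAgent` are hypothetical names, the event names are plain strings rather than `Grok.VoiceAgentAPIEvents` members, and nothing here talks to Voximplant or xAI.

```javascript
// Hypothetical stand-in for the Grok client: it only records calls and replays
// events. It is NOT the VoxEngine API and does not open any connection.
class FakeVoiceClient {
  constructor() {
    this.handlers = {};
    this.calls = [];
    this.session = null;
  }
  addEventListener(name, handler) {
    (this.handlers[name] = this.handlers[name] || []).push(handler);
  }
  emit(name) {
    (this.handlers[name] || []).forEach((handler) => handler({ name }));
  }
  sessionUpdate({ session }) {
    this.calls.push("sessionUpdate");
    this.session = session;
    // Mirrors the scenario's expectation that SessionUpdated follows the update.
    this.emit("SessionUpdated");
  }
  responseCreate() {
    this.calls.push("responseCreate");
  }
  clearMediaBuffer() {
    this.calls.push("clearMediaBuffer");
  }
}

// Same wiring as the scenario, with the audio bridge passed in as a callback
// so it can be replaced by a recorder in this sketch.
function wireAgent(client, prompt, bridgeAudio) {
  client.addEventListener("ConversationCreated", () => {
    client.sessionUpdate({
      session: {
        voice: "Ara",
        turn_detection: { type: "server_vad" },
        instructions: prompt,
      },
    });
  });
  client.addEventListener("SessionUpdated", () => {
    bridgeAudio();
    client.responseCreate({ instructions: "Hello! How can I help today?" });
  });
  client.addEventListener("InputAudioBufferSpeechStarted", () =>
    client.clearMediaBuffer()
  );
}

const client = new FakeVoiceClient();
wireAgent(client, "Keep answers brief and helpful.", () =>
  client.calls.push("sendMediaBetween")
);

client.emit("ConversationCreated");           // session setup and greeting
client.emit("InputAudioBufferSpeechStarted"); // caller interrupts the agent

console.log(client.calls.join(" -> "));
// prints: sessionUpdate -> sendMediaBetween -> responseCreate -> clearMediaBuffer
```

Passing the bridge step in as a callback keeps the stand-in free of any VoxEngine globals; in the real scenario that step is `VoxEngine.sendMediaBetween(call, voiceAIClient)`.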