---
title: 'Example: Answering an incoming call'
---

This example answers an inbound Voximplant call and bridges its audio to the Gemini Live API for a real-time speech-to-speech conversation.

**⬇️ Jump to the [Full VoxEngine scenario](#full-voxengine-scenario).**

## Prerequisites

* Set up an inbound entrypoint for the caller:
  * Phone number: [https://voximplant.com/docs/getting-started/basic-concepts/phone-numbers](https://voximplant.com/docs/getting-started/basic-concepts/phone-numbers)
  * WhatsApp: [https://voximplant.com/docs/guides/integrations/whatsapp](https://voximplant.com/docs/guides/integrations/whatsapp)
  * SIP user / SIP registration: [https://voximplant.com/docs/guides/calls/sip](https://voximplant.com/docs/guides/calls/sip)
  * Voximplant user: [https://voximplant.com/docs/getting-started/basic-concepts/users](https://voximplant.com/docs/getting-started/basic-concepts/users) (see also [https://voximplant.com/docs/guides/calls/scenarios#how-to-call-a-voximplant-user](https://voximplant.com/docs/guides/calls/scenarios#how-to-call-a-voximplant-user))
* Create a routing rule that points the destination (number / WhatsApp / SIP username) to this scenario: [https://voximplant.com/docs/getting-started/basic-concepts/routing-rules](https://voximplant.com/docs/getting-started/basic-concepts/routing-rules)
* Store your Gemini API key in Voximplant `ApplicationStorage` under the `GEMINI_API_KEY` key.

## Session setup

The Gemini Live API session is configured via `connectConfig`, passed into `Gemini.createLiveAPIClient(...)`. In the full scenario, see `CONNECT_CONFIG`:

* `systemInstruction` maps directly to `SYSTEM_PROMPT`, defining the agent's behavior.
* `responseModalities: ["AUDIO"]` asks Gemini to speak back over the call.
* `inputAudioTranscription` and `outputAudioTranscription` are enabled so `ServerContent` events include both user and agent text. If you don't need transcript logs, you can remove these two fields.
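Putting the prerequisites and the session config together, the sketch below shows how the full scenario creates the client: the API key is read from `ApplicationStorage`, and `CONNECT_CONFIG` is passed as `connectConfig`. Only the `onWebSocketClose` handler is trimmed here; everything else matches the full scenario.

```js title="Create the Live API client"
// Runs inside the async AppEvents.CallAlerting handler (see the full scenario)
const voiceAIClient = await Gemini.createLiveAPIClient({
  // API key stored in ApplicationStorage under GEMINI_API_KEY (see Prerequisites)
  apiKey: (await ApplicationStorage.get("GEMINI_API_KEY")).value,
  model: "gemini-2.5-flash-native-audio-preview-12-2025",
  backend: Gemini.Backend.GEMINI_API,
  connectConfig: CONNECT_CONFIG,
  // Simplified here; the full scenario also logs the close event
  onWebSocketClose: () => VoxEngine.terminate(),
});
```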
## Connect call audio

Once the Gemini Live API session is ready, bridge audio between the call and Gemini:

```js title="Connect call audio"
VoxEngine.sendMediaBetween(call, voiceAIClient);
```

In the example, this happens in the `Gemini.LiveAPIEvents.SetupComplete` handler, **after** the Gemini session is ready. The same handler also sends a starter message to trigger the greeting:

```js title="Trigger the greeting"
voiceAIClient.sendClientContent({
  turns: [{role: "user", parts: [{text: "Say hello and ask how you can help."}]}],
  turnComplete: true,
});
```

## Barge-in

Gemini includes an `interrupted` flag in `ServerContent` when the caller starts speaking while the agent is talking. The example clears the media buffer so the agent stops speaking immediately:

```js title="Barge-in handling"
if (payload.interrupted) {
  voiceAIClient.clearMediaBuffer();
}
```

## Events

The scenario listens for `Gemini.LiveAPIEvents.ServerContent` to capture transcript text:

```js title="Transcripts"
voiceAIClient.addEventListener(Gemini.LiveAPIEvents.ServerContent, (event) => {
  const payload = event?.data?.payload || {};
  if (payload.inputTranscription?.text) Logger.write(payload.inputTranscription.text);
  if (payload.outputTranscription?.text) Logger.write(payload.outputTranscription.text);
});
```

For illustration, the example also logs **all Gemini events** (see the sketch after this list):

* `Gemini.LiveAPIEvents`: `SetupComplete`, `ServerContent`, `ToolCall`, `ToolCallCancellation`, `ConnectorInformation`, `Unknown`
* `Gemini.Events`: `WebSocketMediaStarted`, `WebSocketMediaEnded`
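This is done with a single loop over the event names, taken verbatim from the full scenario:

```js title="Log all Gemini events"
[
  Gemini.LiveAPIEvents.SetupComplete,
  Gemini.LiveAPIEvents.ServerContent,
  Gemini.LiveAPIEvents.ToolCall,
  Gemini.LiveAPIEvents.ToolCallCancellation,
  Gemini.LiveAPIEvents.ConnectorInformation,
  Gemini.LiveAPIEvents.Unknown,
  Gemini.Events.WebSocketMediaStarted,
  Gemini.Events.WebSocketMediaEnded,
].forEach((eventName) => {
  voiceAIClient.addEventListener(eventName, (event) => {
    Logger.write(`===${event.name}===`);
    if (event?.data) Logger.write(JSON.stringify(event.data));
  });
});
```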
## Notes

* The example uses the Gemini Developer API (`Gemini.Backend.GEMINI_API`), not Vertex AI.
* `inputAudioTranscription` and `outputAudioTranscription` are enabled so you can log user and agent text from `ServerContent` events.

[See the VoxEngine API Reference for more details](https://voximplant.com/docs/references/voxengine/gemini).

## Full VoxEngine scenario

```javascript title={"voxengine-gemini-answer-incoming-call.js"} maxLines={0}
/**
 * Voximplant + Gemini Live API connector demo
 * Scenario: answer an incoming call and bridge it to Gemini Live API.
 */
require(Modules.Gemini);
require(Modules.ApplicationStorage);

const SYSTEM_PROMPT = `You are Voxi, a helpful voice assistant for phone callers.
Keep responses short and telephony-friendly (usually 1-2 sentences).`;

// -------------------- Gemini Live API settings --------------------
const CONNECT_CONFIG = {
  responseModalities: ["AUDIO"],
  speechConfig: {
    voiceConfig: {
      prebuiltVoiceConfig: {voiceName: "Aoede"},
    },
  },
  systemInstruction: {
    parts: [{text: SYSTEM_PROMPT}],
  },
  inputAudioTranscription: {},
  outputAudioTranscription: {},
};

VoxEngine.addEventListener(AppEvents.CallAlerting, async ({call}) => {
  let voiceAIClient;

  // Termination handlers - add cleanup and logging as needed
  call.addEventListener(CallEvents.Disconnected, () => VoxEngine.terminate());
  call.addEventListener(CallEvents.Failed, () => VoxEngine.terminate());

  try {
    call.answer();
    // call.record({ hd_audio: true, stereo: true }); // Optional: record the call

    // Create client and connect to Gemini Live API
    voiceAIClient = await Gemini.createLiveAPIClient({
      apiKey: (await ApplicationStorage.get("GEMINI_API_KEY")).value,
      model: "gemini-2.5-flash-native-audio-preview-12-2025",
      backend: Gemini.Backend.GEMINI_API,
      connectConfig: CONNECT_CONFIG,
      onWebSocketClose: (event) => {
        Logger.write("===Gemini.WebSocket.Close===");
        if (event) Logger.write(JSON.stringify(event));
        VoxEngine.terminate();
      },
    });

    // ---------------------- Event handlers -----------------------

    // Wait for Gemini setup, then bridge audio and trigger the greeting
    voiceAIClient.addEventListener(Gemini.LiveAPIEvents.SetupComplete, () => {
      VoxEngine.sendMediaBetween(call, voiceAIClient);
      voiceAIClient.sendClientContent({
        turns: [{role: "user", parts: [{text: "Say hello and ask how you can help."}]}],
        turnComplete: true,
      });
    });

    // Capture transcripts + handle barge-in
    voiceAIClient.addEventListener(Gemini.LiveAPIEvents.ServerContent, (event) => {
      const payload = event?.data?.payload || {};
      if (payload.inputTranscription?.text) {
        Logger.write(`===USER=== ${payload.inputTranscription.text}`);
      }
      if (payload.outputTranscription?.text) {
        Logger.write(`===AGENT=== ${payload.outputTranscription.text}`);
      }
      if (payload.interrupted) {
        Logger.write("===BARGE-IN=== Gemini.LiveAPIEvents.ServerContent");
        voiceAIClient.clearMediaBuffer();
      }
    });

    // Log all Gemini events for illustration/debugging
    [
      Gemini.LiveAPIEvents.SetupComplete,
      Gemini.LiveAPIEvents.ServerContent,
      Gemini.LiveAPIEvents.ToolCall,
      Gemini.LiveAPIEvents.ToolCallCancellation,
      Gemini.LiveAPIEvents.ConnectorInformation,
      Gemini.LiveAPIEvents.Unknown,
      Gemini.Events.WebSocketMediaStarted,
      Gemini.Events.WebSocketMediaEnded,
    ].forEach((eventName) => {
      voiceAIClient.addEventListener(eventName, (event) => {
        Logger.write(`===${event.name}===`);
        if (event?.data) Logger.write(JSON.stringify(event.data));
      });
    });
  } catch (error) {
    Logger.write("===SOMETHING_WENT_WRONG===");
    Logger.write(String(error));
    voiceAIClient?.close();
    VoxEngine.terminate();
  }
});
```