Example: Half-cascade with Inworld
Overview
This half-cascade example uses OpenAI Realtime for speech-to-text and reasoning, then sends OpenAI text responses to Inworld Realtime TTS.
Jump to the Full VoxEngine scenario.
Prerequisites
- Store your OpenAI API key in Voximplant ApplicationStorage under OPENAI_API_KEY.
- Set a voiceId in the Inworld request (createContextParameters.create.voiceId) to choose the TTS voice used in this scenario.
- (Optional) Store your Inworld API key in ApplicationStorage as INWORLD_API_KEY if you want to use your own Inworld account.
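The key lookup described above could be sketched roughly as follows. This is an illustration, not the scenario's actual code: loadApiKeys is a hypothetical helper, and it assumes the storage object passed in behaves like ApplicationStorage, with get(key) resolving to an object that has a value field, or to null when the key is not set.

```javascript
// Sketch only: `storage` stands in for Voximplant's ApplicationStorage.
// Assumption: storage.get(key) resolves to an object with a `value` field,
// or to null when the key has not been stored.
async function loadApiKeys(storage) {
  // The OpenAI key is required for this scenario.
  const openai = await storage.get("OPENAI_API_KEY");
  if (!openai || !openai.value) {
    throw new Error("OPENAI_API_KEY is not set in ApplicationStorage");
  }

  // The Inworld key is optional; null here means "no own key stored",
  // in which case Voximplant's default account and billing apply.
  const inworld = await storage.get("INWORLD_API_KEY");
  const inworldApiKey = inworld && inworld.value ? inworld.value : null;

  return { openaiApiKey: openai.value, inworldApiKey };
}
```

Passing the storage object in as a parameter keeps the helper easy to exercise outside VoxEngine; inside a scenario you would presumably pass ApplicationStorage itself.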
How it works
- OpenAI runs in text mode (output_modalities: ["text"]).
- Caller audio is sent to OpenAI: call.sendMediaTo(voiceAIClient).
- Inworld generates speech from OpenAI text and streams it to the call.
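A minimal sketch of the media routing in the steps above. Only output_modalities: ["text"] and call.sendMediaTo(voiceAIClient) come from this page. The routeMedia helper and the ttsPlayer name are hypothetical, and ttsPlayer.sendMediaTo(call) assumes the Inworld player exposes sendMediaTo like other VoxEngine media units.

```javascript
// Session fragment that keeps OpenAI in text mode, so it returns text
// responses instead of synthesized audio.
const openAiSession = { output_modalities: ["text"] };

// Hypothetical helper: connect the media legs of the half-cascade.
function routeMedia(call, voiceAIClient, ttsPlayer) {
  // Caller audio goes to OpenAI for speech-to-text and reasoning.
  call.sendMediaTo(voiceAIClient);

  // Speech generated by Inworld is streamed back to the caller.
  // Assumption: the TTS player supports sendMediaTo(call).
  ttsPlayer.sendMediaTo(call);
}
```

OpenAI audio output is never routed to the call in this layout; the only audio the caller hears comes from the TTS side.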
Notes
- The example sets voiceId: "Ashley" and modelId: "inworld-tts-1.5-mini" in createContextParameters.create. Change these to any supported Inworld voice/model.
- Do not set audio format parameters in half-cascade connector requests. VoxEngine's WebSocket gateway handles media format negotiation automatically.
- If no Inworld API key is provided, Voximplant's default account and billing are used.
- Custom / cloned voices are only available when using your own API key.
- Generate speech using send({ send_text: { text } }).
- Flush the context after every turn with send({ flush_context: {} }).
- Clear buffered speech in barge-in handler with clearBuffer() so interruptions stay natural.
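The notes above can be put together in a short sketch. The voiceId and modelId values and the send and clearBuffer calls are the ones listed on this page. speakTurn, handleBargeIn, and the ttsPlayer parameter are hypothetical names for illustration, and how these handlers are attached to OpenAI events is left out because it depends on the scenario.

```javascript
// Values used by the example; change them to any supported Inworld voice/model.
// No audio format parameters are set here, since the gateway negotiates them.
const createContextParameters = {
  create: { voiceId: "Ashley", modelId: "inworld-tts-1.5-mini" },
};

// Hypothetical helper: speak one completed OpenAI text response.
// Returns false (and sends nothing) for empty or non-string text.
function speakTurn(ttsPlayer, text) {
  if (typeof text !== "string" || text.trim() === "") return false;
  ttsPlayer.send({ send_text: { text } });
  // Flush after every turn so the queued text is synthesized right away.
  ttsPlayer.send({ flush_context: {} });
  return true;
}

// Hypothetical barge-in handler: drop speech that is still buffered
// so the caller is not talked over after interrupting.
function handleBargeIn(ttsPlayer) {
  ttsPlayer.clearBuffer();
}
```

Keeping the flush inside speakTurn makes it hard to forget at the end of a turn; a scenario that streams partial text might instead send several send_text messages and flush once.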
More info
- OpenAI module API: https://voximplant.com/docs/references/voxengine/openai
- OpenAI Realtime guide: https://voximplant.com/docs/guides/ai/openai-realtime
- Inworld module API: https://voximplant.com/docs/references/voxengine/inworld
- Realtime TTS guide: https://voximplant.com/docs/guides/speech/realtime-tts
Full VoxEngine scenario
voxeengine-openai-half-cascade-inworld.js