WhatsApp

Connect WhatsApp Business calling to Voximplant scenarios

Overview

The WhatsApp integration lets you process WhatsApp calls and messages with VoxEngine logic and route them to your Voice AI scenarios. The integration flow has three parts: connect your Meta credentials, complete verification, and attach your numbers.

WhatsApp integration overview

Prerequisites

  • Meta developer account with a WhatsApp Business app at developers.facebook.com.
  • WhatsApp Business phone number added in WhatsApp Manager and ready for verification.
  • Meta Cloud API credentials: Temporary Access Token and Phone Number ID.

Inbound

Full inbound setup walkthrough video

Use this step-by-step video to see the full setup in Meta and Voximplant.

Video link: WhatsApp Business Calling setup overview

Setup flow (Inbound)

1

Prepare WhatsApp Cloud API in Meta

In developers.facebook.com/apps, create a Business app, add WhatsApp, and open WhatsApp > API Setup. Keep your Temporary Access Token and Phone Number ID available.

2

Verify and register the WhatsApp number on the Meta side

In WhatsApp Manager, complete verification and registration on the WhatsApp Cloud API side:

  1. In the Profile tab, click Send verification code.

  2. Choose how to receive the code (SMS or phone call).

  3. Enter the received code and keep it for the registration step.

  4. Open the Certificate tab and wait until the Display Name status becomes Approved.

  5. After approval, register the phone number with the Meta Graph API request shown in the guide or modal.

  6. Refresh the phone numbers page and confirm the number status is Connected.
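
Step 5 above relies on the request shown in the Meta guide or the Voximplant modal, and that request remains the authoritative version. As a rough sketch, assuming the Cloud API `register` endpoint (`POST /{phone-number-id}/register` with `messaging_product` and a 6-digit `pin`), the request could be assembled as follows. The helper name `buildRegisterRequest`, the Graph API version `v21.0`, and all sample values are placeholders rather than values from this guide.

```javascript
// Hypothetical helper: builds (but does not send) a Cloud API phone registration request.
// GRAPH_API_VERSION is an assumption; use the version shown in your Meta app or the modal.
const GRAPH_API_VERSION = "v21.0";

function buildRegisterRequest(phoneNumberId, accessToken, pin) {
  // The register endpoint expects a 6-digit PIN passed as a string
  if (!/^\d{6}$/.test(pin)) {
    throw new Error("pin must be a 6-digit string");
  }
  return {
    url: `https://graph.facebook.com/${GRAPH_API_VERSION}/${phoneNumberId}/register`,
    options: {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${accessToken}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ messaging_product: "whatsapp", pin }),
    },
  };
}

// Placeholder values; substitute your own Phone Number ID and access token.
const registerRequest = buildRegisterRequest("123456789012345", "EAAG-placeholder-token", "123456");
console.log(registerRequest.url);
```

With Node.js 18 or later, `fetch(registerRequest.url, registerRequest.options)` would send the request; an equivalent cURL call uses the same URL, headers, and JSON body.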

3

Create Voximplant application

In manage.voximplant.com, create or open an application for this WhatsApp inbound flow.

4

Create inbound scenario

Create your inbound scenario (for example, whatsapp-inbound) and add the call handling logic.

Minimal inbound WhatsApp scenario
VoxEngine.addEventListener(AppEvents.CallAlerting, (event) => {
  const call = event.call;
  // Answer the incoming WhatsApp call, then play a greeting
  call.answer();
  call.say("Hello, this WhatsApp call is connected to Voximplant.", { voice: VoiceList.Amazon.en_US_Joanna });
  // End the session when the caller hangs up
  call.addEventListener(CallEvents.Disconnected, () => VoxEngine.terminate());
});

5

Connect the WhatsApp number in Voximplant

Open WhatsApp numbers in your application and click Add WhatsApp number.

Follow the instructions in the modal. You will need to execute the Meta Graph API requests via cURL.

Then copy the returned password into SIP password, set the WhatsApp number, and click Save.

6

Create routing rule

Create a routing rule and attach the inbound scenario. Open your application and go to Routing.

Click Create / New rule. The default mask .* matches every number, so it is fine for processing all inbound calls.
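
To illustrate why `.*` catches every inbound call, the sketch below treats a rule mask as a regular expression checked against the dialed number. It assumes the mask must match the whole number, which approximates the check rather than describing Voximplant's exact matching logic; `maskMatches` is a hypothetical helper and the numbers are sample values.

```javascript
// Hypothetical helper: approximates how a routing rule mask is compared with a dialed number.
// Assumption: the mask is a regular expression that must match the entire number.
function maskMatches(mask, dialedNumber) {
  return new RegExp(`^(?:${mask})$`).test(dialedNumber);
}

console.log(maskMatches(".*", "15557654321"));      // true: the default mask accepts any number
console.log(maskMatches("1555.*", "15557654321"));  // true: prefix-style mask
console.log(maskMatches("1555.*", "442071234567")); // false: different prefix
```

A narrower mask is only worth using when several rules in the same application must send different numbers to different scenarios.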

7

Attach the WhatsApp number to your Application

Select the available WhatsApp number and attach it to the current application.

8

Test inbound calling

Place a call to your WhatsApp Business number and confirm that it reaches your VoxEngine scenario.

Outbound specifics

Outbound calling requires the same setup as inbound. Once that setup is complete, call VoxEngine.callWhatsappUser() from a scenario to place a call to a WhatsApp user from one of your WhatsApp Business phone numbers.

Outbound WhatsApp
const call = VoxEngine.callWhatsappUser({
  number: "+15551234567",
  callerid: "15557654321"
});
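
An outbound call usually needs handlers for the connected, failed, and disconnected states. The sketch below assumes the object returned by `VoxEngine.callWhatsappUser()` emits the standard `CallEvents` like other VoxEngine calls. The function name `placeWhatsAppCall` and its dependency-injected signature are illustrative choices, made so the flow can be exercised outside the platform.

```javascript
// Sketch only. Inside a scenario it would be invoked as:
//   placeWhatsAppCall(VoxEngine, CallEvents, Logger, "+15551234567", "15557654321");
function placeWhatsAppCall(voxEngine, callEvents, logger, number, callerid) {
  // number: the WhatsApp user to call; callerid: one of your WhatsApp Business numbers
  const call = voxEngine.callWhatsappUser({ number, callerid });

  call.addEventListener(callEvents.Connected, () => {
    logger.write("WhatsApp user answered");
  });

  // Stop the session if the call cannot be established
  call.addEventListener(callEvents.Failed, (event) => {
    logger.write(`WhatsApp call failed: ${event.code} ${event.reason}`);
    voxEngine.terminate();
  });

  // Stop the session when either side hangs up
  call.addEventListener(callEvents.Disconnected, () => voxEngine.terminate());

  return call;
}
```

Passing the platform objects as arguments keeps the handler wiring testable with simple stubs; in a real scenario the globals can be used directly instead.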

Outbound video walkthrough:

Video link: Outbound WhatsApp calling walkthrough

Multi-modal simultaneous voice & messaging support

The WhatsApp integration supports both calls and messages, so you can create multi-modal scenarios that handle voice and text in the same flow. For example, you can answer the call, send a welcome message, then continue with voice prompts and responses.

Architecture

The setup follows the same procedure as above, but it requires an additional server to proxy messages.

WhatsApp multi-modal architecture

Walkthrough and demo

For a quick walkthrough and demo of this capability, see the video. Video link: Inbound WhatsApp calling demo

Example code

VoxEngine code sample for the WhatsApp multi-modal voice + text flow:

voxengine_openai_ga.js
require(Modules.ApplicationStorage);
require(Modules.OpenAI);

let sessionUrl = null, connected = false, cid = null, realtimeAPIClient = undefined;

const MODEL = "gpt-realtime-1.5";
const WA_PROXY_URL = "https://waproxy.ngrok.app/webhook";

const onWebSocketClose = (event) => {
  Logger.write('===ON_WEB_SOCKET_CLOSE==');
  Logger.write(JSON.stringify(event));
  VoxEngine.terminate();
};

// Keep the session URL; it is stored per caller below so inbound texts can reach this session
VoxEngine.addEventListener(AppEvents.Started, (appEvent) => {
  sessionUrl = appEvent.accessSecureURL;
});

// Text messages arrive as HTTP requests to the session URL
VoxEngine.addEventListener(AppEvents.HttpRequest, (appEvent) => {
  Logger.write("Inbound Http request");
  try {
    const data = JSON.parse(appEvent.content);
    // Ignore requests without text, or ones that arrive before the Realtime API client exists
    if (data.text?.body != undefined && realtimeAPIClient) {
      const item = {
        "item": {
          "type": "message",
          "role": "user",
          "content": [
            {
              "type": "input_text",
              "text": data.text.body
            }
          ]
        }
      };

      // Request a response once the text item has been added to the conversation.
      // The listener is registered before the item is created so the event cannot be missed.
      const onItemAdded = () => {
        realtimeAPIClient.removeEventListener(OpenAI.RealtimeAPIEvents.ConversationItemAdded, onItemAdded);
        realtimeAPIClient.responseCreate({});
      };
      realtimeAPIClient.addEventListener(OpenAI.RealtimeAPIEvents.ConversationItemAdded, onItemAdded);
      realtimeAPIClient.conversationItemCreate(item);
    }
  } catch (err) {
    Logger.write("HTTP request handling error: " + err);
  }
  return "OK";
});

VoxEngine.addEventListener(AppEvents.CallAlerting, async ({ callerid, call }) => {
  cid = callerid;

  call.answer();
  try {
    // Load the OpenAI API key; await the lookup and unwrap the stored value if the result is a record
    const storedKey = await ApplicationStorage.get("OPENAI_API_KEY");
    const realtimeAPIClientParameters = {
      model: MODEL,
      apiKey: storedKey?.value ?? storedKey,
      type: OpenAI.RealtimeAPIClientType.REALTIME,
      onWebSocketClose
    };

    realtimeAPIClient = await OpenAI.createRealtimeAPIClient(realtimeAPIClientParameters);
    const session_update = {
      "session": {
        "type": "realtime",
        "instructions": `Your name is Voxy, you're a friendly and fun guy. You speak English only. You have to collect person's name, company he/she works at and his/her email. Call the 'createProfile' function whenever you learn all information including name, company and email address. You MUST NEVER mention the tools/functions to the user. You speak English ONLY, don't switch to any other language. Always continue the conversation after the user answers.`,
        "audio": {
          "input": {
            "transcription": {
              "model": "gpt-4o-transcribe",
              "language": "en"
            }
          },
          "output": {
            "voice": "cedar"
          }
        },
        "tools": [
          {
            "type": "function",
            "name": "createProfile",
            "description": "Save contact information of a user for the purpose of creating/updating profile information.",
            "parameters": {
              "type": "object",
              "properties": {
                "name": {
                  "type": "string",
                  "description": "The user's name"
                },
                "emailAddress": {
                  "type": "string",
                  "description": "The user's work/business email address."
                },
                "organization": {
                  "type": "string",
                  "description": "The name of the company/organization where the user works."
                }
              },
              "required": ["name", "organization", "emailAddress"]
            }
          }
        ],
        "tool_choice": "auto"
      }
    };
    realtimeAPIClient.sessionUpdate(session_update);
    VoxEngine.sendMediaBetween(call, realtimeAPIClient);
    connected = true;
    realtimeAPIClient.responseCreate({});

    // Interruptions support: clear the media buffer when OpenAI's VAD detects speech input
    realtimeAPIClient.addEventListener(OpenAI.RealtimeAPIEvents.InputAudioBufferSpeechStarted, (event) => {
      if (realtimeAPIClient) realtimeAPIClient.clearMediaBuffer();
    });

    realtimeAPIClient.addEventListener(OpenAI.RealtimeAPIEvents.ResponseDone, async (event) => {
      // Act only when the first output item is a call to createProfile
      const output = event.data.payload?.response?.output?.[0];
      if (output?.type == "function_call" && output?.name == "createProfile") {
        try {
          const args = JSON.parse(output.arguments);
          if (args.name == "" || args.emailAddress == "" || args.organization == "") return;
          Logger.write("Profile created, sending info to WhatsApp");
          // Webhook-style payload posted to the proxy server
          const obj = {
            entry: [
              {
                changes: [
                  {
                    value: {
                      messages: [
                        {
                          from: cid,
                          type: "voiceai",
                          text: {
                            body: "Name: " + args.name + ", Email: " + args.emailAddress + ", Company: " + args.organization
                          }
                        }
                      ]
                    },
                    field: "messages"
                  }
                ]
              }
            ]
          };
          Logger.write(JSON.stringify(obj));
          await Net.httpRequestAsync(WA_PROXY_URL, {
            method: "POST",
            postData: JSON.stringify(obj),
            enableSystemLog: true,
            headers: [
              "Content-Type: application/json"
            ]
          });
          // Let the model continue the conversation after the profile is sent
          realtimeAPIClient.responseCreate({});
        } catch (err) {
          Logger.write("createProfile handling error: " + err);
        }
      }
    });

  } catch (error) {
    Logger.write('===SOMETHING_WENT_WRONG===');
    Logger.write("" + error);
    VoxEngine.terminate();
  }

  call.record({ hd_audio: true, stereo: true });
  try {
    // Assumes the call session will not last longer than 1.5 hours
    ApplicationStorage.put("WAB_" + callerid, sessionUrl, 60 * 90);
  } catch (e) {
    Logger.write("ApplicationStorage error: " + JSON.stringify(e));
  }

  call.addEventListener(CallEvents.Disconnected, (callEvent) => {
    if (realtimeAPIClient) realtimeAPIClient.close();
    connected = false;
    try {
      ApplicationStorage.remove("WAB_" + callerid);
    } catch (e) {
      Logger.write("ApplicationStorage error: " + JSON.stringify(e));
    }
    VoxEngine.terminate();
  });
});

// Clean up the stored session URL if the session ends while the call is still connected
VoxEngine.addEventListener(AppEvents.Terminating, (appEvent) => {
  if (connected) {
    try {
      ApplicationStorage.remove("WAB_" + cid);
    } catch (e) {
      Logger.write("ApplicationStorage error: " + JSON.stringify(e));
    }
  }
});
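
Two pieces of the sample are pure data transformations and can be checked outside VoxEngine: turning an inbound request body into a Realtime API conversation item, and wrapping a text in the webhook-style payload posted to the proxy. The sketch below extracts them into hypothetical helpers (`toUserTextItem`, `toProxyPayload`). The object shapes mirror the sample above; the helper names and sample values are assumptions, and nothing here is part of the VoxEngine or OpenAI module APIs.

```javascript
// Hypothetical helper: returns a conversation item for an inbound text, or null if none is present.
function toUserTextItem(rawContent) {
  let data;
  try {
    data = JSON.parse(rawContent);
  } catch (err) {
    return null; // not JSON: ignore the request
  }
  const body = data?.text?.body;
  if (typeof body !== "string" || body === "") return null;
  return {
    item: {
      type: "message",
      role: "user",
      content: [{ type: "input_text", text: body }],
    },
  };
}

// Hypothetical helper: wraps a text in the payload shape the sample posts to the proxy.
function toProxyPayload(from, text) {
  return {
    entry: [
      {
        changes: [
          {
            value: { messages: [{ from, type: "voiceai", text: { body: text } }] },
            field: "messages",
          },
        ],
      },
    ],
  };
}

const item = toUserTextItem('{"text":{"body":"My email is ann@example.com"}}');
console.log(item.item.content[0].text); // My email is ann@example.com
console.log(toUserTextItem("not json")); // null
console.log(JSON.stringify(toProxyPayload("15551234567", "Name: Ann")));
```

Keeping these transformations separate from the event handlers makes malformed requests easier to reject before they reach the Realtime API client.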

Node.js proxy server code: