
DialogflowModel

Add the following line to your scenario code to use the enum: require(Modules.AI);

Values

VIDEO
'video'

Use this model for transcribing audio from video clips or audio that includes multiple speakers. For best results, provide audio recorded at a sampling rate of 16,000 Hz or greater. NOTE: this is a premium model that costs more than the standard rate.

PHONE_CALL
'phone_call'

Use this model for transcribing audio from a phone call. Typically, phone audio is recorded at an 8,000 Hz sampling rate. NOTE: the enhanced phone model is a premium model that costs more than the standard rate.

COMMAND_AND_SEARCH
'command_and_search'

Use this model for transcribing shorter audio clips. Some examples include voice commands or voice search.

DEFAULT
'default'

Use this model if your audio does not fit one of the previously described models. For example, you can use this for long-form audio recordings that feature a single speaker only. Ideally, the audio is high-fidelity, recorded at a sampling rate of 16,000 Hz or greater.
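
The guidance above can be condensed into a small selection helper. The following is a minimal sketch, not part of the VoxEngine API: `DialogflowModelValues` is a local stand-in that mirrors the string values listed on this page (inside a scenario you would use the real `DialogflowModel` enum after `require(Modules.AI)`), and `pickDialogflowModel` with its option names `source`, `shortUtterance`, and `multipleSpeakers` is hypothetical, introduced only for illustration.

```javascript
// Local stand-in mirroring the string values documented above.
// Assumption: in a real scenario, use DialogflowModel from Modules.AI instead.
const DialogflowModelValues = Object.freeze({
  VIDEO: 'video',
  PHONE_CALL: 'phone_call',
  COMMAND_AND_SEARCH: 'command_and_search',
  DEFAULT: 'default',
});

// Hypothetical helper: maps a rough description of the audio to a model value.
// The option names and the precedence of the checks are illustrative choices.
function pickDialogflowModel(options = {}) {
  const { source, shortUtterance = false, multipleSpeakers = false } = options;

  // Phone audio (typically 8,000 Hz) maps to the phone call model.
  if (source === 'phone') {
    return DialogflowModelValues.PHONE_CALL;
  }
  // Video clips or recordings with several speakers map to the video model.
  if (source === 'video' || multipleSpeakers) {
    return DialogflowModelValues.VIDEO;
  }
  // Short clips such as voice commands or voice search.
  if (shortUtterance) {
    return DialogflowModelValues.COMMAND_AND_SEARCH;
  }
  // Anything else, for example long-form single-speaker audio.
  return DialogflowModelValues.DEFAULT;
}
```

The sketch checks the phone source first, so a short voice command received over a phone call still resolves to `'phone_call'`; reorder the checks if your use case calls for a different priority. Keep in mind that the video and enhanced phone models are premium models, as noted in their descriptions.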