PATCH /update-agent/{agent_id}

Authorizations

Authorization
string
header
required

Authentication header containing your API key, which you can find in the dashboard. The format is "Bearer YOUR_API_KEY".

Path Parameters

agent_id
string
required

Unique id of the agent to be updated.
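A minimal sketch of how the authorization header and path parameter above could be combined into a request, using only Python's standard library. The base URL `https://api.example.com`, the helper name `build_update_request`, and the sample agent id are placeholders assumed for illustration, since the host is not stated here. The request is only constructed, not sent.

```python
import json
import urllib.request

# Hypothetical base URL: this page does not state the API host.
BASE_URL = "https://api.example.com"

def build_update_request(api_key: str, agent_id: str, changes: dict) -> urllib.request.Request:
    """Build (but do not send) a PATCH /update-agent/{agent_id} request."""
    return urllib.request.Request(
        url=f"{BASE_URL}/update-agent/{agent_id}",
        data=json.dumps(changes).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )

# "agent_123" is a made-up id used only for this example.
req = build_update_request("YOUR_API_KEY", "agent_123", {"agent_name": "Support line"})
```

Passing the resulting object to `urllib.request.urlopen` would send it once the placeholder host is replaced with the real one.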

Body

application/json
response_engine
object

The response engine to use for the agent.

agent_name
string | null

The name of the agent. Only used for your own reference.

voice_id
string

Unique voice id used for the agent. Find the list of available voices and their previews in the Dashboard.

voice_model
enum<string> | null

Optionally set the voice model used for the selected voice. Currently only ElevenLabs voices have voice model selections. Set to null to remove the voice model selection, in which case the default model applies. Supported voice models are:

  • eleven_turbo_v2: Fast English-only model that supports pronunciation tags.

  • eleven_turbo_v2_5: Multilingual model with the lowest latency.

  • eleven_multilingual_v2: Multilingual model with rich emotion and a nice accent.

Available options:
eleven_turbo_v2,
eleven_turbo_v2_5,
eleven_multilingual_v2
fallback_voice_ids
string[] | null

When the TTS provider for the selected voice is experiencing an outage, the fallback voices listed here are used for the agent. The voice id and the fallback voice ids must be from different TTS providers. The system goes through the list in order: if the first fallback is also experiencing an outage, it uses the next one. Set to null to remove voice fallback for the agent.
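The in-order fallback described above can be sketched as follows. This is an illustration of the selection order only, not the service's implementation. The function name `pick_voice`, the `provider_is_down` callback, and the voice ids are hypothetical, and returning `None` when every voice is affected is an assumption, since that case is not described here.

```python
from typing import Callable, Optional, Sequence

def pick_voice(
    voice_id: str,
    fallback_voice_ids: Optional[Sequence[str]],
    provider_is_down: Callable[[str], bool],
) -> Optional[str]:
    """Return the first usable voice, trying the primary voice before the fallbacks."""
    candidates = [voice_id, *(fallback_voice_ids or [])]
    for candidate in candidates:
        if not provider_is_down(candidate):
            return candidate
    return None  # assumption: no usable voice is left

# Made-up voice ids; "voice-a" and "voice-b" simulate providers with an outage.
down = {"voice-a", "voice-b"}
chosen = pick_voice("voice-a", ["voice-b", "voice-c"], lambda v: v in down)
```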

voice_temperature
number

Controls how stable the voice is. Value ranging from [0,2]. Lower value means more stable, and higher value means more varied speech generation. Currently this setting only applies to ElevenLabs voices. If unset, default value 1 will apply.

voice_speed
number

Controls speed of voice. Value ranging from [0.5,2]. Lower value means slower speech, while higher value means faster speech rate. If unset, default value 1 will apply.

volume
number

If set, will control the volume of the agent. Value ranging from [0,2]. Lower value means quieter agent speech, while higher value means louder agent speech. If unset, default value 1 will apply.

responsiveness
number

Controls how responsive the agent is. Value ranging from [0,1]. Lower value means a less responsive agent (waits longer, responds slower), while higher value means faster exchanges (responds when it can). If unset, default value 1 will apply.

interruption_sensitivity
number

Controls how sensitive the agent is to user interruptions. Value ranging from [0,1]. Lower value means it will take longer / more words for user to interrupt agent, while higher value means it's easier for user to interrupt agent. If unset, default value 1 will apply. When this is set to 0, agent would never be interrupted.
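Since several of the numeric fields above have documented inclusive ranges, a client can check values before sending an update. The sketch below is one possible client-side check. The helper name `out_of_range` and the error message wording are assumptions, and the server's own validation behavior is not described here.

```python
# Inclusive ranges as documented for each field above.
RANGES = {
    "voice_temperature": (0.0, 2.0),
    "voice_speed": (0.5, 2.0),
    "volume": (0.0, 2.0),
    "responsiveness": (0.0, 1.0),
    "interruption_sensitivity": (0.0, 1.0),
}

def out_of_range(changes: dict) -> list[str]:
    """Return one message per field whose value falls outside its documented range."""
    errors = []
    for field, (low, high) in RANGES.items():
        if field in changes and not (low <= changes[field] <= high):
            errors.append(f"{field} must be in [{low}, {high}], got {changes[field]}")
    return errors

problems = out_of_range({"voice_speed": 0.25, "volume": 1.5, "responsiveness": 1.2})
```

Fields that are absent from `changes` are skipped, which matches the documented behavior that unset fields take their default values.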

enable_backchannel
boolean

Controls whether the agent backchannels, that is, interjects with phrases like "yeah" or "uh-huh" while the user is speaking to signal interest and engagement. When enabled, backchannels tend to show up more in longer user utterances. If not set, the agent will not backchannel.

backchannel_frequency
number

Only applicable when enable_backchannel is true. Controls how often the agent would backchannel when a backchannel is possible. Value ranging from [0,1]. Lower value means less frequent backchannel, while higher value means more frequent backchannel. If unset, default value 0.8 will apply.

backchannel_words
string[] | null

Only applicable when enable_backchannel is true. A list of words that the agent would use as backchannel. If not set, default backchannel words will apply. Check out backchannel default words for more details. Note that certain voices do not work too well with certain words, so it's recommended to experiment before adding any words.
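Because backchannel_frequency and backchannel_words only take effect when enable_backchannel is true, a client might assemble the three fields together. The helper below is a hypothetical sketch. Its name and its choice to omit the dependent fields when backchannel is disabled are assumptions, not documented behavior.

```python
from typing import Iterable, Optional

def backchannel_settings(
    enabled: bool,
    frequency: float = 0.8,  # documented default
    words: Optional[Iterable[str]] = None,
) -> dict:
    """Build the backchannel portion of an update body."""
    if not enabled:
        # The dependent fields would have no effect, so this sketch leaves them out.
        return {"enable_backchannel": False}
    if not 0 <= frequency <= 1:
        raise ValueError("backchannel_frequency must be in [0, 1]")
    settings = {"enable_backchannel": True, "backchannel_frequency": frequency}
    if words is not None:
        settings["backchannel_words"] = list(words)
    return settings

settings = backchannel_settings(True, frequency=0.5, words=["yeah", "uh-huh"])
```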

reminder_trigger_ms
number

If set (in milliseconds), will trigger a reminder to the agent to speak if the user has been silent for the specified duration after some agent speech. Must be a positive number. If unset, default value of 10000 ms (10 s) will apply.

reminder_max_count
integer

If set, controls how many times the agent reminds the user when the user is unresponsive. Must be a non-negative integer. If unset, default value of 1 will apply (remind once). Set to 0 to disable reminders.

ambient_sound
enum<string> | null

If set, will add ambient environment sound to the call to make the experience more realistic. The currently supported options are listed under "Available options" below. Set to null to remove ambient sound from this agent.

Available options:
coffee-shop,
convention-hall,
summer-outdoor,
mountain-outdoor,
static-noise,
call-center
ambient_sound_volume
number

If set, will control the volume of the ambient sound. Value ranging from [0,2]. Lower value means quieter ambient sound, while higher value means louder ambient sound. If unset, default value 1 will apply.
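Several fields, such as ambient_sound, voice_model, and webhook_url, accept null to remove a setting, which is different from leaving the field out of the request. The sketch below shows one way to keep the two cases apart in Python, where `None` serializes to JSON `null`. The `UNSET` sentinel and `build_body` helper are hypothetical, and the sketch assumes, as is conventional for PATCH but not stated explicitly here, that omitted fields are left unchanged.

```python
import json

UNSET = object()  # sentinel meaning "leave this field out of the request"

def build_body(**fields) -> str:
    """Serialize only the fields that were explicitly provided."""
    return json.dumps({key: value for key, value in fields.items() if value is not UNSET})

# ambient_sound is sent as null (remove it); voice_model is omitted entirely.
body = build_body(ambient_sound=None, ambient_sound_volume=0.5, voice_model=UNSET)
```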

language
enum<string>

Specifies what language (and dialect) the speech recognition will operate in. For instance, selecting en-GB optimizes speech recognition for British English. If unset, the default value en-US is used. Select multi for multilingual support, which currently covers Spanish and English.

Available options:
en-US,
en-IN,
en-GB,
de-DE,
es-ES,
es-419,
hi-IN,
ja-JP,
pt-PT,
pt-BR,
fr-FR,
zh-CN,
ru-RU,
it-IT,
ko-KR,
nl-NL,
pl-PL,
tr-TR,
vi-VN,
multi
webhook_url
string | null

The webhook for the agent to listen to call events. See the webhook doc for the events it would get. If set, will bind webhook events for this agent to the specified URL, and will ignore the account-level webhook for this agent. Set to null to remove the webhook URL from this agent.
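The precedence described above, where an agent-level webhook replaces the account-level one, can be illustrated with a small function. The function name is hypothetical, and falling back to the account-level webhook when the agent-level URL is null is an inference from the description rather than a documented guarantee.

```python
from typing import Optional

def effective_webhook(
    agent_webhook_url: Optional[str],
    account_webhook_url: Optional[str],
) -> Optional[str]:
    """Agent-level URL wins when set; otherwise assume the account-level URL applies."""
    if agent_webhook_url is not None:
        return agent_webhook_url
    return account_webhook_url

# Placeholder URLs for illustration only.
target = effective_webhook("https://example.com/agent-hook", "https://example.com/account-hook")
```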

boosted_keywords
string[] | null

Provide a customized list of keywords to bias the transcriber model, so that these words are more likely to get transcribed. Commonly used for names, brands, streets, etc.

enable_transcription_formatting
boolean

If set to true, will format transcription to number, date, email, etc. If set to false, will return transcripts in raw words. If not set, default value of true will apply.

opt_out_sensitive_data_storage
boolean

Whether this agent opts out of sensitive data storage like transcript, recording, logging. These data can still be accessed securely via webhooks. If not set, default value of false will apply.

pronunciation_dictionary
object[] | null

A list of words / phrases and their pronunciations, used to guide audio synthesis for consistent pronunciation. Currently only supported for English and ElevenLabs voices. Set to null to remove the pronunciation dictionary from this agent.

normalize_for_speech
boolean

If set to true, will normalize some parts of the text (numbers, currency, dates, etc.) to their spoken form for more consistent speech synthesis (the voice synthesis system itself sometimes reads these incorrectly from the raw text). For example, it will convert "Call my number 2137112342 on Jul 5th, 2024 for the $24.12 payment" to "Call my number two one three seven one one two three four two on july fifth, twenty twenty four for the twenty four dollars twelve cents payment" before starting audio generation.
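To make the idea of a spoken form concrete, the sketch below reproduces only the digit-by-digit reading of the phone number from the example above. It is an illustration, not the service's normalizer, which also handles dates and currency. The function name `spell_digits` is hypothetical.

```python
DIGIT_WORDS = dict(zip(
    "0123456789",
    ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"],
))

def spell_digits(number: str) -> str:
    """Read a digit string one digit at a time, skipping any non-digit characters."""
    return " ".join(DIGIT_WORDS[ch] for ch in number if ch in DIGIT_WORDS)

spoken = spell_digits("2137112342")
```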

end_call_after_silence_ms
integer

If the user stays silent for this period after agent speech, the call ends. The minimum value allowed is 10,000 ms (10 s). By default, this is set to 600,000 ms (10 min).

max_call_duration_ms
integer

Maximum allowed length for the call. The call is forcibly ended once this duration is reached. The minimum value allowed is 60,000 ms (1 min), and the maximum value allowed is 7,200,000 ms (2 hours). By default, this is set to 3,600,000 ms (1 hour).

enable_voicemail_detection
boolean

If set to true, will detect whether the call enters a voicemail. Note that this feature is only available for phone calls.

voicemail_message
string

The message to be played when the call enters a voicemail. Note that this feature is only available for phone calls. If you want to hang up after hitting voicemail, set this to an empty string.

voicemail_detection_timeout_ms
integer

Configures when to stop running voicemail detection. It becomes unlikely to hit voicemail after a couple of minutes, and continuing to run detection beyond that point only has a negative impact. The minimum value allowed is 5,000 ms (5 s), and the maximum value allowed is 180,000 ms (3 minutes). By default, this is set to 30,000 ms (30 s).
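The millisecond-valued call settings above each have documented bounds. A client-side check along the following lines could catch invalid values early. The names `MS_BOUNDS` and `ms_value_ok` are hypothetical, and treating end_call_after_silence_ms as having no upper bound is an assumption, since only its minimum is stated.

```python
# (minimum, maximum) in milliseconds; None means no maximum is documented.
MS_BOUNDS = {
    "end_call_after_silence_ms": (10_000, None),
    "max_call_duration_ms": (60_000, 7_200_000),
    "voicemail_detection_timeout_ms": (5_000, 180_000),
}

def ms_value_ok(field: str, value: int) -> bool:
    """Check an integer millisecond value against the documented bounds for its field."""
    low, high = MS_BOUNDS[field]
    if value < low:
        return False
    return high is None or value <= high

ok = ms_value_ok("max_call_duration_ms", 3_600_000)  # the documented default
```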

post_call_analysis_data
object[] | null

Post call analysis data to extract from the call. This data will augment the pre-defined variables extracted in the call analysis. This will be available after the call ends.

Response

200 - application/json
agent_id
string
required

Unique id of agent.

last_modification_timestamp
integer
required

Last modification timestamp (milliseconds since epoch). Either the time of last update or creation if no updates available.
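Because last_modification_timestamp is given in milliseconds since the epoch, it has to be divided by 1000 before most date libraries will accept it. A minimal sketch of the conversion follows. The function name `to_datetime` and the sample value are illustrative only.

```python
from datetime import datetime, timezone

def to_datetime(timestamp_ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)

# Sample value, not taken from a real response.
modified = to_datetime(1_700_000_000_000)
```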

response_engine
object

The response engine to use for the agent.

agent_name
string | null

The name of the agent. Only used for your own reference.

voice_id
string

Unique voice id used for the agent. Find the list of available voices and their previews in the Dashboard.

voice_model
enum<string> | null

Optionally set the voice model used for the selected voice. Currently only ElevenLabs voices have voice model selections. Set to null to remove the voice model selection, in which case the default model applies. Supported voice models are:

  • eleven_turbo_v2: Fast English-only model that supports pronunciation tags.

  • eleven_turbo_v2_5: Multilingual model with the lowest latency.

  • eleven_multilingual_v2: Multilingual model with rich emotion and a nice accent.

Available options:
eleven_turbo_v2,
eleven_turbo_v2_5,
eleven_multilingual_v2
fallback_voice_ids
string[] | null

When the TTS provider for the selected voice is experiencing an outage, the fallback voices listed here are used for the agent. The voice id and the fallback voice ids must be from different TTS providers. The system goes through the list in order: if the first fallback is also experiencing an outage, it uses the next one. Set to null to remove voice fallback for the agent.

voice_temperature
number

Controls how stable the voice is. Value ranging from [0,2]. Lower value means more stable, and higher value means more varied speech generation. Currently this setting only applies to ElevenLabs voices. If unset, default value 1 will apply.

voice_speed
number

Controls speed of voice. Value ranging from [0.5,2]. Lower value means slower speech, while higher value means faster speech rate. If unset, default value 1 will apply.

volume
number

If set, will control the volume of the agent. Value ranging from [0,2]. Lower value means quieter agent speech, while higher value means louder agent speech. If unset, default value 1 will apply.

responsiveness
number

Controls how responsive the agent is. Value ranging from [0,1]. Lower value means a less responsive agent (waits longer, responds slower), while higher value means faster exchanges (responds when it can). If unset, default value 1 will apply.

interruption_sensitivity
number

Controls how sensitive the agent is to user interruptions. Value ranging from [0,1]. Lower value means it will take longer / more words for user to interrupt agent, while higher value means it's easier for user to interrupt agent. If unset, default value 1 will apply. When this is set to 0, agent would never be interrupted.

enable_backchannel
boolean

Controls whether the agent backchannels, that is, interjects with phrases like "yeah" or "uh-huh" while the user is speaking to signal interest and engagement. When enabled, backchannels tend to show up more in longer user utterances. If not set, the agent will not backchannel.

backchannel_frequency
number

Only applicable when enable_backchannel is true. Controls how often the agent would backchannel when a backchannel is possible. Value ranging from [0,1]. Lower value means less frequent backchannel, while higher value means more frequent backchannel. If unset, default value 0.8 will apply.

backchannel_words
string[] | null

Only applicable when enable_backchannel is true. A list of words that the agent would use as backchannel. If not set, default backchannel words will apply. Check out backchannel default words for more details. Note that certain voices do not work too well with certain words, so it's recommended to experiment before adding any words.

reminder_trigger_ms
number

If set (in milliseconds), will trigger a reminder to the agent to speak if the user has been silent for the specified duration after some agent speech. Must be a positive number. If unset, default value of 10000 ms (10 s) will apply.

reminder_max_count
integer

If set, controls how many times the agent reminds the user when the user is unresponsive. Must be a non-negative integer. If unset, default value of 1 will apply (remind once). Set to 0 to disable reminders.

ambient_sound
enum<string> | null

If set, will add ambient environment sound to the call to make the experience more realistic. The currently supported options are listed under "Available options" below. Set to null to remove ambient sound from this agent.

Available options:
coffee-shop,
convention-hall,
summer-outdoor,
mountain-outdoor,
static-noise,
call-center
ambient_sound_volume
number

If set, will control the volume of the ambient sound. Value ranging from [0,2]. Lower value means quieter ambient sound, while higher value means louder ambient sound. If unset, default value 1 will apply.

language
enum<string>

Specifies what language (and dialect) the speech recognition will operate in. For instance, selecting en-GB optimizes speech recognition for British English. If unset, the default value en-US is used. Select multi for multilingual support, which currently covers Spanish and English.

Available options:
en-US,
en-IN,
en-GB,
de-DE,
es-ES,
es-419,
hi-IN,
ja-JP,
pt-PT,
pt-BR,
fr-FR,
zh-CN,
ru-RU,
it-IT,
ko-KR,
nl-NL,
pl-PL,
tr-TR,
vi-VN,
multi
webhook_url
string | null

The webhook for the agent to listen to call events. See the webhook doc for the events it would get. If set, will bind webhook events for this agent to the specified URL, and will ignore the account-level webhook for this agent. Set to null to remove the webhook URL from this agent.

boosted_keywords
string[] | null

Provide a customized list of keywords to bias the transcriber model, so that these words are more likely to get transcribed. Commonly used for names, brands, streets, etc.

enable_transcription_formatting
boolean

If set to true, will format transcription to number, date, email, etc. If set to false, will return transcripts in raw words. If not set, default value of true will apply.

opt_out_sensitive_data_storage
boolean

Whether this agent opts out of sensitive data storage like transcript, recording, logging. These data can still be accessed securely via webhooks. If not set, default value of false will apply.

pronunciation_dictionary
object[] | null

A list of words / phrases and their pronunciations, used to guide audio synthesis for consistent pronunciation. Currently only supported for English and ElevenLabs voices. Set to null to remove the pronunciation dictionary from this agent.

normalize_for_speech
boolean

If set to true, will normalize some parts of the text (numbers, currency, dates, etc.) to their spoken form for more consistent speech synthesis (the voice synthesis system itself sometimes reads these incorrectly from the raw text). For example, it will convert "Call my number 2137112342 on Jul 5th, 2024 for the $24.12 payment" to "Call my number two one three seven one one two three four two on july fifth, twenty twenty four for the twenty four dollars twelve cents payment" before starting audio generation.

end_call_after_silence_ms
integer

If the user stays silent for this period after agent speech, the call ends. The minimum value allowed is 10,000 ms (10 s). By default, this is set to 600,000 ms (10 min).

max_call_duration_ms
integer

Maximum allowed length for the call. The call is forcibly ended once this duration is reached. The minimum value allowed is 60,000 ms (1 min), and the maximum value allowed is 7,200,000 ms (2 hours). By default, this is set to 3,600,000 ms (1 hour).

enable_voicemail_detection
boolean

If set to true, will detect whether the call enters a voicemail. Note that this feature is only available for phone calls.

voicemail_message
string

The message to be played when the call enters a voicemail. Note that this feature is only available for phone calls. If you want to hang up after hitting voicemail, set this to an empty string.

voicemail_detection_timeout_ms
integer

Configures when to stop running voicemail detection. It becomes unlikely to hit voicemail after a couple of minutes, and continuing to run detection beyond that point only has a negative impact. The minimum value allowed is 5,000 ms (5 s), and the maximum value allowed is 180,000 ms (3 minutes). By default, this is set to 30,000 ms (30 s).

post_call_analysis_data
object[] | null

Post call analysis data to extract from the call. This data will augment the pre-defined variables extracted in the call analysis. This will be available after the call ends.