Get Voice Agent

Authorizations

Authorization

string

header

required

Authentication header containing your API key, which you can find in the dashboard. The format is "Bearer YOUR_API_KEY".

Path Parameters

agent_id

string

required

Unique id of the agent to be retrieved.

Minimum string length: 1

Example:

"16b980523634a6dc504898cda492e939"

Query Parameters

version

integer

Optional version of the API to use for this request. If not provided, defaults to the latest version.

Example:

1
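
A minimal sketch of how the request described above might be assembled in Python. The base URL and the /get-agent/{agent_id} path are assumptions, since this section does not state them, so confirm both against the endpoint definition before use. The helper name build_get_agent_request is hypothetical. The sketch only builds the request and does not send it.

```python
import urllib.parse
import urllib.request
from typing import Optional

# Assumed base URL; it is not given in this section.
BASE_URL = "https://api.retellai.com"


def build_get_agent_request(api_key: str, agent_id: str,
                            version: Optional[int] = None) -> urllib.request.Request:
    """Build (but do not send) a GET request for a single voice agent."""
    if len(agent_id) < 1:
        raise ValueError("agent_id must be at least 1 character long")
    # Assumed path; agent_id is passed as a path parameter.
    url = f"{BASE_URL}/get-agent/{urllib.parse.quote(agent_id, safe='')}"
    if version is not None:
        # version is an optional query parameter.
        url += "?" + urllib.parse.urlencode({"version": version})
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


req = build_get_agent_request("YOUR_API_KEY", "16b980523634a6dc504898cda492e939", version=1)
print(req.get_method(), req.full_url)
```

To send it, pass the request to urllib.request.urlopen and decode the JSON body of the response.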

Response

Successfully retrieved an agent.

agent_id

string

required

Unique id of agent.

Example:

"oBeDLoLOeuAbiuaMFXRtDOLriTJ5tSxD"

version

integer

required

Version of the agent.

Example:

0

response_engine

object

required

The Response Engine attached to the agent, used to generate the agent's responses. A Response Engine must be created before it can be attached to an agent.

Example:

{
  "type": "retell-llm",
  "llm_id": "llm_234sdertfsdsfsdf",
  "version": 0
}

voice_id

string

required

Unique voice id used for the agent. The list of available voices and their previews is in the dashboard.

Example:

"retell-Cimo"

last_modification_timestamp

integer

required

Last modification timestamp (milliseconds since epoch). This is the time of the last update, or the creation time if the agent has never been updated.

Example:

1703413636133
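
Because the timestamp is in milliseconds rather than seconds, it needs scaling before most date libraries can read it. The following is a small sketch of one way to convert it in Python; the helper name ms_to_datetime is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def ms_to_datetime(ms: int) -> datetime:
    """Convert milliseconds since the Unix epoch to an aware UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Adding a timedelta keeps the arithmetic exact (no float rounding).
    return epoch + timedelta(milliseconds=ms)


modified = ms_to_datetime(1703413636133)
print(modified.isoformat())
```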

is_published

boolean

Whether the agent is published.

Example:

false

agent_name

string | null

The name of the agent. Only used for your own reference.

Example:

"Jarvis"

version_description

string | null

Optional description of the agent version. Used for your own reference and documentation.

Example:

"Customer support agent for handling product inquiries"

voice_model

enum<string> | null

The voice model used for the selected voice. Each provider has its own set of available voice models. Set to null to remove the voice model selection, in which case the default model applies. See the dashboard for details on each voice model.

Available options:

eleven_turbo_v2,

eleven_flash_v2,

eleven_turbo_v2_5,

eleven_flash_v2_5,

eleven_multilingual_v2,

eleven_v3,

sonic-2,

sonic-3,

sonic-3-latest,

sonic-turbo,

tts-1,

gpt-4o-mini-tts,

speech-02-turbo,

speech-2.8-turbo,

s1,

null

fallback_voice_ids

string[] | null

When the TTS provider for the selected voice is experiencing an outage, the fallback voices listed here are used for the agent. The voice id and the fallback voice ids must be from different TTS providers. The system goes through the list in order: if the first fallback is also experiencing an outage, it uses the next one. Set to null to remove voice fallback for the agent.

Example:

["cartesia-Cimo", "minimax-Cimo"]
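
The in-order fallback described above can be illustrated with a short sketch. This is not Retell's implementation; the function pick_voice and the is_available callback are hypothetical stand-ins for whatever outage detection the platform uses.

```python
from typing import Callable, Iterable, Optional


def pick_voice(voice_id: str,
               fallback_voice_ids: Optional[Iterable[str]],
               is_available: Callable[[str], bool]) -> Optional[str]:
    """Return the first usable voice: the primary one, then fallbacks in order."""
    candidates = [voice_id, *(fallback_voice_ids or [])]
    for candidate in candidates:
        if is_available(candidate):
            return candidate
    return None  # every listed voice is affected by an outage


# Hypothetical outage: the primary voice and the first fallback are both down.
down = {"retell-Cimo", "cartesia-Cimo"}
chosen = pick_voice("retell-Cimo", ["cartesia-Cimo", "minimax-Cimo"],
                    lambda v: v not in down)
print(chosen)
```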

voice_temperature

number

Controls how stable the voice is. Value ranges over [0,2]. A lower value means more stable speech, and a higher value means more varied speech generation. Check the dashboard to see which providers support this feature. If unset, the default value 1 applies.

Example:

1

voice_speed

number

Controls speed of voice. Value ranging from [0.5,2]. Lower value means slower speech, while higher value means faster speech rate. If unset, default value 1 will apply.

Required range: 0.5 <= x <= 2

Example:

1

enable_dynamic_voice_speed

boolean

If set to true, will enable dynamic voice speed adjustment based on the user's speech rate and conversation context. If unset, default value false will apply.

Example:

true

enable_dynamic_responsiveness

boolean

If set to true, the agent will dynamically adjust how quickly it responds based on the user's speech rate and past turn-taking behavior in the call. If unset, default value false will apply.

Example:

true

volume

number

If set, will control the volume of the agent. Value ranging from [0,2]. Lower value means quieter agent speech, while higher value means louder agent speech. If unset, default value 1 will apply.

Example:

1

voice_emotion

enum<string> | null

Controls the emotional tone of the agent's voice. Currently supported for Cartesia and Minimax TTS providers. If unset, no emotion will be used.

Available options:

calm,

sympathetic,

happy,

sad,

angry,

fearful,

surprised,

null

Example:

"calm"

responsiveness

number

Controls how responsive the agent is. Value ranges over [0,1]. A lower value means a less responsive agent (waits longer, responds slower), while a higher value means faster exchanges (responds as soon as it can). If unset, the default value 1 applies.

Required range: 0 <= x <= 1

Example:

1

interruption_sensitivity

number

Controls how sensitive the agent is to user interruptions. Value ranges over [0,1]. A lower value means it takes longer, or more words, for the user to interrupt the agent, while a higher value makes it easier for the user to interrupt. When set to 0, the agent is never interrupted. If unset, the default value 1 applies.

Required range: 0 <= x <= 1

Example:

1

enable_backchannel

boolean

Controls whether the agent backchannels, that is, interjects while the user is speaking with phrases like "yeah" or "uh-huh" to signal interest and engagement. When enabled, backchannel tends to show up more during longer user utterances. If not set, the agent will not backchannel.

Example:

true

backchannel_frequency

number

Only applicable when enable_backchannel is true. Controls how often the agent would backchannel when a backchannel is possible. Value ranging from [0,1]. Lower value means less frequent backchannel, while higher value means more frequent backchannel. If unset, default value 0.8 will apply.

Example:

0.9

backchannel_words

string[] | null

Only applicable when enable_backchannel is true. A list of words that the agent uses as backchannel. If not set, the default backchannel words apply; see the list of default backchannel words for details. Certain voices do not work well with certain words, so experiment before adding any.

Example:

["yeah", "uh-huh"]

reminder_trigger_ms

number

If set (in milliseconds), will trigger a reminder to the agent to speak if the user has been silent for the specified duration after some agent speech. Must be a positive number. If unset, default value of 10000 ms (10 s) will apply.

Example:

10000

reminder_max_count

integer

If set, controls how many times the agent reminds the user when the user is unresponsive. Must be a non-negative integer. If unset, the default value of 1 applies (remind once). Set to 0 to disable reminders.

Example:

2

ambient_sound

enum<string> | null

If set, adds ambient environment sound to the call to make the experience more realistic. Set to null to remove ambient sound from this agent. Currently supports the following options:

coffee-shop: Coffee shop ambience with people chatting in the background.
convention-hall: Convention hall ambience, with some echo and people chatting in the background.
summer-outdoor: Summer outdoor ambience with cicadas chirping.
mountain-outdoor: Mountain outdoor ambience with birds singing.
static-noise: Constant static noise.
call-center: Call center work noise.

Available options:

coffee-shop,

convention-hall,

summer-outdoor,

mountain-outdoor,

static-noise,

call-center,

null

ambient_sound_volume

number

If set, will control the volume of the ambient sound. Value ranging from [0,2]. Lower value means quieter ambient sound, while higher value means louder ambient sound. If unset, default value 1 will apply.

Example:

1
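
Several of the numeric settings above share the same pattern: an inclusive range, with unset values falling back to a default. The sketch below collects the ranges quoted in the field descriptions and reports any value that falls outside them. The function name out_of_range is hypothetical, and this is a client-side convenience check rather than the API's own validation.

```python
from typing import Any, Dict, List, Tuple

# Inclusive ranges as stated in the field descriptions above.
RANGES: Dict[str, Tuple[float, float]] = {
    "voice_temperature": (0, 2),
    "voice_speed": (0.5, 2),
    "volume": (0, 2),
    "responsiveness": (0, 1),
    "interruption_sensitivity": (0, 1),
    "backchannel_frequency": (0, 1),
    "ambient_sound_volume": (0, 2),
}


def out_of_range(agent: Dict[str, Any]) -> List[str]:
    """Return the names of numeric fields whose values fall outside their range."""
    problems = []
    for field, (low, high) in RANGES.items():
        value = agent.get(field)
        if value is None:
            continue  # unset fields fall back to their defaults
        if not low <= value <= high:
            problems.append(field)
    return problems


print(out_of_range({"voice_speed": 0.25, "volume": 1, "responsiveness": 1.5}))
```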

language

enum<string>

Specifies what language (and dialect) the speech recognition will operate in. For instance, selecting en-GB optimizes speech recognition for British English. If unset, will use default value en-US. Select multi for multilingual support.

Available options:

en-US,

en-IN,

en-GB,

en-AU,

en-NZ,

de-DE,

es-ES,

es-419,

hi-IN,

fr-FR,

fr-CA,

ja-JP,

pt-PT,

pt-BR,

zh-CN,

ru-RU,

it-IT,

ko-KR,

nl-NL,

nl-BE,

pl-PL,

tr-TR,

vi-VN,

ro-RO,

bg-BG,

ca-ES,

th-TH,

da-DK,

fi-FI,

el-GR,

hu-HU,

id-ID,

no-NO,

sk-SK,

sv-SE,

lt-LT,

lv-LV,

cs-CZ,

ms-MY,

af-ZA,

ar-SA,

az-AZ,

bs-BA,

cy-GB,

fa-IR,

fil-PH,

gl-ES,

he-IL,

hr-HR,

hy-AM,

is-IS,

kk-KZ,

kn-IN,

mk-MK,

mr-IN,

ne-NP,

sl-SI,

sr-RS,

sw-KE,

ta-IN,

ur-IN,

yue-CN,

uk-UA,

multi

Example:

"en-US"

webhook_url

string | null

The webhook URL at which this agent's call events are delivered. See the webhook documentation for the events it receives. If set, webhook events for this agent are sent to the specified URL and the account-level webhook is ignored for this agent. Set to null to remove the webhook URL from this agent.

Example:

"https://webhook-url-here"

webhook_events

enum<string>[] | null

Which webhook events this agent should receive. If not set, defaults to call_started, call_ended, call_analyzed.

Available options:

call_started,

call_ended,

call_analyzed,

transcript_updated,

transfer_started,

transfer_bridged,

transfer_cancelled,

transfer_ended
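
The interaction between webhook_url and webhook_events can be sketched as follows. The function resolve_webhook and the account_webhook_url argument are hypothetical. Falling back to the account-level URL when the agent has none is an inference from the description of webhook_url, not documented behavior, and the sketch treats only a missing or null webhook_events as "not set".

```python
from typing import Any, Dict, List, Optional, Tuple

# Default events quoted from the webhook_events description.
DEFAULT_WEBHOOK_EVENTS = ["call_started", "call_ended", "call_analyzed"]


def resolve_webhook(agent: Dict[str, Any],
                    account_webhook_url: Optional[str]) -> Tuple[Optional[str], List[str]]:
    """Return the (url, events) pair that would apply to this agent."""
    # An agent-level URL takes precedence over the account-level one (assumed fallback).
    url = agent.get("webhook_url") or account_webhook_url
    events = agent.get("webhook_events")
    if events is None:
        events = DEFAULT_WEBHOOK_EVENTS
    return url, list(events)


print(resolve_webhook({"webhook_url": "https://webhook-url-here"}, None))
```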

webhook_timeout_ms

integer

The timeout for the webhook in milliseconds. If not set, default value of 10000 will apply.

Example:

10000

boosted_keywords

string[] | null

A custom list of keywords used to bias the transcription model so that these words are more likely to be transcribed. Commonly used for names, brands, street names, etc.

Example:

["retell", "kroger"]

data_storage_setting

enum<string>

Granular setting to manage how Retell stores sensitive data (transcripts, recordings, logs, etc.). This replaces the deprecated opt_out_sensitive_data_storage field. If not set, the default value "everything" applies.

everything: Store all data including transcripts, recordings, and logs.
everything_except_pii: Store data without PII when PII is detected.
basic_attributes_only: Store only basic attributes; no transcripts, recordings, or logs.

Available options:

everything,

everything_except_pii,

basic_attributes_only

Example:

"everything"

data_storage_retention_days

integer | null

Number of days to retain call/chat data before automatic deletion. Must be between 1 and 730 days. If not set, data is retained forever (no automatic deletion).

Required range: 1 <= x <= 730

Example:

30

opt_in_signed_url

boolean

Whether this agent opts in for signed URLs for public logs and recordings. When enabled, the generated URLs will include security signatures that restrict access and automatically expire after 24 hours.

Example:

true

signed_url_expiration_ms

integer | null

The expiration time for the signed url in milliseconds. Only applicable when opt_in_signed_url is true. If not set, default value of 86400000 (24 hours) will apply.

Example:

86400000
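
As a sketch of how opt_in_signed_url and signed_url_expiration_ms combine, the helper below returns the lifetime that would apply to a signed URL. The name signed_url_lifetime_ms is hypothetical, and the logic simply restates the two field descriptions above.

```python
from typing import Any, Dict, Optional

DEFAULT_SIGNED_URL_EXPIRATION_MS = 86_400_000  # 24 hours, the documented default


def signed_url_lifetime_ms(agent: Dict[str, Any]) -> Optional[int]:
    """Return the signed-URL lifetime in ms, or None when signing is not enabled."""
    if not agent.get("opt_in_signed_url"):
        return None  # signed_url_expiration_ms only applies when opted in
    configured = agent.get("signed_url_expiration_ms")
    return DEFAULT_SIGNED_URL_EXPIRATION_MS if configured is None else configured


print(signed_url_lifetime_ms({"opt_in_signed_url": True}))
```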

pronunciation_dictionary

object[] | null

A list of words or phrases and their pronunciations, used to guide audio synthesis toward consistent pronunciation. Check the dashboard to see which providers support this feature. Set to null to remove the pronunciation dictionary from this agent.

normalize_for_speech

boolean

If set to true, normalizes parts of the text (numbers, currency, dates, etc.) to their spoken form for more consistent speech synthesis, since the voice synthesis system sometimes misreads these from the raw text. For example, it converts "Call my number 2137112342 on Jul 5th, 2024 for the $24.12 payment" to "Call my number two one three seven one one two three four two on july fifth, twenty twenty four for the twenty four dollars twelve cents payment" before starting audio generation.

Example:

true

end_call_after_silence_ms

integer

If the user stays silent for this long after agent speech, the call ends. The minimum value allowed is 10,000 ms (10 s). The default is 600,000 ms (10 min).

Example:

600000

max_call_duration_ms

integer

Maximum allowed length of the call; the call is forcibly ended when this limit is reached. The minimum value allowed is 60,000 ms (1 min), and the maximum is 7,200,000 ms (2 hours). The default is 3,600,000 ms (1 hour).

Example:

3600000

enable_voicemail_detection

boolean

If set to true, will detect whether the call enters a voicemail. Note that this feature is only available for phone calls.

Example:

true

voicemail_message

string

The message to be played when the call enters a voicemail. This feature is only available for phone calls. To hang up after hitting voicemail, set this to an empty string.

Example:

"Hi, please give us a callback."

voicemail_detection_timeout_ms

integer

Configures when to stop running voicemail detection. Hitting voicemail becomes unlikely after a couple of minutes, and continuing to run detection only has a negative impact. The minimum value allowed is 5,000 ms (5 s), and the maximum is 180,000 ms (3 min). The default is 30,000 ms (30 s).

Example:

30000

voicemail_option

object

If this option is set, the call will try to detect voicemail in the first 3 minutes of the call. Actions defined (hangup, or leave a message) will be applied when the voicemail is detected. Set this to null to disable voicemail detection.

Example:

{
  "action": {
    "type": "static_text",
    "text": "Please give us a callback tomorrow at 10am."
  }
}

ivr_option

object

If this option is set, the call will try to detect IVR in the first 3 minutes of the call. Actions defined will be applied when the IVR is detected. Set this to null to disable IVR detection.

Example:

{ "action": { "type": "hangup" } }

post_call_analysis_data

object[] | null

Post call analysis data to extract from the call. This data will augment the pre-defined variables extracted in the call analysis. This will be available after the call ends.

post_call_analysis_model

enum<string> | null

The model to use for post call analysis. Defaults to gpt-4.1-mini.

Available options:

gpt-4.1,

gpt-4.1-mini,

gpt-4.1-nano,

gpt-5,

gpt-5.1,

gpt-5.2,

gpt-5-mini,

gpt-5-nano,

claude-4.5-sonnet,

claude-4.6-sonnet,

claude-4.5-haiku,

gemini-2.5-flash,

gemini-2.5-flash-lite,

gemini-3.0-flash,

null

Example:

"gpt-4.1-mini"

analysis_successful_prompt

string | null

Prompt to determine whether the post call or chat analysis should mark the interaction as successful. Set to null to use the default prompt.

Maximum string length: 2000

Example:

"The agent finished the task and the call was complete without being cutoff."

analysis_summary_prompt

string | null

Prompt to guide how the post call or chat analysis summary should be generated. When unset or set to null, the default prompt is used.

Maximum string length: 2000

Example:

"Summarize the outcome of the conversation in two sentences."

analysis_user_sentiment_prompt

string | null

Prompt to guide how the post call or chat analysis should evaluate user sentiment. When unset or set to null, the default prompt is used.

Maximum string length: 2000

Example:

"Evaluate the user's sentiment based on their tone and satisfaction level."

begin_message_delay_ms

integer

If set, delays the first message by the specified number of milliseconds, giving the user more time to prepare to take the call. Valid range is [0, 5000]. If not set or set to 0, the agent speaks immediately. Only applicable when the agent speaks first.

Example:

1000

ring_duration_ms

integer

If set, the phone rings for the specified number of milliseconds. This applies to both outbound call ring time and call transfer ring time. Defaults to 30000 (30 s). Valid range is [5000, 300000].

Required range: 5000 <= x <= 300000

Example:

30000

stt_mode

enum<string>

If set, determines whether speech to text should prioritize latency or accuracy. Defaults to fast. When set to custom, custom_stt_config must be provided.

Available options:

fast,

accurate,

custom

Example:

"fast"

custom_stt_config

object

Custom STT configuration. Only used when stt_mode is set to custom.
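
The dependency between stt_mode and custom_stt_config can be checked before a configuration is submitted. The following is a minimal client-side sketch; the function name check_stt_settings is hypothetical, and the contents of custom_stt_config are not inspected because its child attributes are not listed here.

```python
from typing import Any, Dict

# Options listed for stt_mode above.
STT_MODES = {"fast", "accurate", "custom"}


def check_stt_settings(agent: Dict[str, Any]) -> None:
    """Raise ValueError if the STT settings are inconsistent."""
    mode = agent.get("stt_mode", "fast")  # documented default
    if mode not in STT_MODES:
        raise ValueError(f"unknown stt_mode: {mode!r}")
    if mode == "custom" and not agent.get("custom_stt_config"):
        raise ValueError("stt_mode 'custom' requires custom_stt_config")


check_stt_settings({"stt_mode": "accurate"})  # passes silently
```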

vocab_specialization

enum<string>

If set, determines the vocabulary set to use for transcription. This setting only applies to English agents; for non-English agents it is a no-op. Defaults to general.

Available options:

general,

medical

Example:

"general"

allow_user_dtmf

boolean

If set to true, DTMF input is accepted and processed. If false, any DTMF input is ignored. Defaults to true.

Example:

true

user_dtmf_options

object

denoising_mode

enum<string>

If set, determines which denoising mode to use. Use "no-denoise" to bypass all audio denoising. Defaults to noise-cancellation.

Available options:

no-denoise,

noise-cancellation,

noise-and-background-speech-cancellation

Example:

"noise-cancellation"

pii_config

object

Configuration for PII scrubbing from transcripts and recordings.

guardrail_config

object

Configuration for guardrail checks to detect and prevent prohibited topics in agent output and user input.

is_public

boolean | null

Whether the agent is public. When set to true, the agent is available through a public agent preview link.

Example:

false
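
Many optional fields in this response are documented with a default that applies when the field is unset. The sketch below gathers a subset of those documented defaults and fills them into a response dictionary so that client code can read effective values. The name with_defaults is hypothetical, and treating a null value the same as an unset one is an assumption that may not hold for every field.

```python
from typing import Any, Dict

# Defaults quoted from the field descriptions above (a subset, not exhaustive).
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "voice_temperature": 1,
    "voice_speed": 1,
    "volume": 1,
    "responsiveness": 1,
    "interruption_sensitivity": 1,
    "backchannel_frequency": 0.8,
    "reminder_trigger_ms": 10000,
    "reminder_max_count": 1,
    "ambient_sound_volume": 1,
    "language": "en-US",
    "webhook_timeout_ms": 10000,
    "data_storage_setting": "everything",
    "end_call_after_silence_ms": 600000,
    "max_call_duration_ms": 3600000,
    "ring_duration_ms": 30000,
    "stt_mode": "fast",
    "vocab_specialization": "general",
    "allow_user_dtmf": True,
    "denoising_mode": "noise-cancellation",
}


def with_defaults(agent: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the agent with unset (or null) fields filled from the defaults."""
    merged = dict(agent)
    for field, default in DOCUMENTED_DEFAULTS.items():
        if merged.get(field) is None:  # assumption: null is treated as unset
            merged[field] = default
    return merged


effective = with_defaults({"agent_id": "oBeDLoLOeuAbiuaMFXRtDOLriTJ5tSxD", "voice_speed": 1.2})
print(effective["voice_speed"], effective["language"])
```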
