Follow these steps to configure the fundamental settings for your agent, optimizing it for your specific business requirements.

1

Select a Language Model

We recommend starting with GPT-4o, which offers an optimal balance of:

  • Response quality
  • Latency
  • Cost-effectiveness
2

Configure Voice Settings

  1. Open the voice selection dropdown menu.
  2. Listen to the available voice samples and note the voice ID of your preferred option.

Custom Voices: You can also add voices from the ElevenLabs community by clicking “Add custom voice”. Learn more in our voice configuration guide.

3

Configure Conversation Initiation

Define how your agent starts conversations:

  • User-First: Agent waits for user input
  • Agent-First: Agent initiates the conversation
    • Set a fixed welcome message
    • Use prompts to guide the agent’s opening message
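
The two start modes above can be sketched as a small decision function. This is only an illustration of the logic, not the platform's API: the class `ConversationStart`, its fields, and the returned strings are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConversationStart:
    """Hypothetical model of the conversation-initiation choice (not a platform API)."""
    agent_speaks_first: bool
    welcome_message: Optional[str] = None  # fixed opening line, if any
    opening_prompt: Optional[str] = None   # guidance for a generated opening

    def first_turn(self) -> str:
        """Describe what happens at the start of the conversation."""
        if not self.agent_speaks_first:
            # User-First: the agent stays silent until the user speaks.
            return "wait_for_user"
        if self.welcome_message:
            # Agent-First with a fixed welcome message.
            return f"say:{self.welcome_message}"
        if self.opening_prompt:
            # Agent-First with a prompt that guides the opening message.
            return f"generate:{self.opening_prompt}"
        raise ValueError("Agent-First needs a welcome message or an opening prompt")


user_first = ConversationStart(agent_speaks_first=False)
agent_first = ConversationStart(agent_speaks_first=True,
                                welcome_message="Hi, how can I help?")
```

In this sketch a fixed welcome message takes priority over an opening prompt when both are set; that ordering is a design choice of the example, not something the settings page specifies.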

More Settings

You can further customize your agent with the following settings:

1

Write Global Prompt

Specify the agent’s persona, identity, and guardrails here. This text is available in every node and influences all response generation.

2

Configure Knowledge Base

Supply context to the agent through documents, URLs, and text. Read more in the Knowledge Base Guide.

3

Configure Speech Settings

These options let you fine-tune how your agent interacts with the user.

  • Background sound: Select a background sound that plays throughout the call to mimic an environment such as a call center, making the conversation feel more humanlike and engaging.
  • Responsiveness: How quickly the agent responds. Set it lower if you want the agent to respond more slowly, which can be useful when talking to people who need more time, such as elderly users.
  • Interruption Sensitivity: How easily the user can interrupt the agent. Set it lower if you want the agent to be more resilient to background speech.
  • Backchanneling: How often the agent acknowledges the user, and which words it uses.
  • Boosted Keywords: Biases recognition toward certain words so they are picked up more easily. Common choices are brand names and people’s names.
  • Speech Normalization: Converts entities such as dates, currency, and numbers into plain words, which can help when the generated audio does not pronounce them correctly.
  • Disable Transcript Formatting: Returns the transcript with entities in plain words rather than formatted as timestamps, numbers, etc. This can prevent issues caused by incorrect transcript formatting.
  • Reminder frequency: How often the agent reminds the user when the user is inactive.
  • Pronunciation: Set up a pronunciation guide for specific words.
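
To make the options above concrete, here is a minimal sketch of how they might be grouped and sanity-checked before being entered in the dashboard. Everything in it is an assumption made for illustration: the class `SpeechSettings`, the field names, the 0.0 to 1.0 ranges, and the sample values (including the brand name "Acme") do not come from the platform.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class SpeechSettings:
    """Hypothetical container mirroring the options above; names and ranges are assumptions."""
    background_sound: str = "none"
    responsiveness: float = 1.0            # assumed scale: 0.0 (slowest) to 1.0 (fastest)
    interruption_sensitivity: float = 1.0  # assumed scale: 0.0 (hard to interrupt) to 1.0
    backchannel_words: List[str] = field(default_factory=list)
    boosted_keywords: List[str] = field(default_factory=list)
    normalize_speech: bool = False
    disable_transcript_formatting: bool = False
    reminder_interval_s: float = 10.0      # seconds of user inactivity between reminders
    pronunciations: Dict[str, str] = field(default_factory=dict)

    def validate(self) -> List[str]:
        """Return a list of human-readable problems; an empty list means the values look sane."""
        problems = []
        for name in ("responsiveness", "interruption_sensitivity"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                problems.append(f"{name} must be between 0.0 and 1.0, got {value}")
        if self.reminder_interval_s <= 0:
            problems.append("reminder_interval_s must be positive")
        return problems


# A slower, harder-to-interrupt agent for a noisy environment.
patient_agent = SpeechSettings(
    background_sound="call-center",
    responsiveness=0.4,
    interruption_sensitivity=0.3,
    boosted_keywords=["Acme"],
    normalize_speech=True,
)
```

Calling `patient_agent.validate()` returns an empty list, while a value outside the assumed range, such as `SpeechSettings(responsiveness=1.5)`, produces one problem message.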
4

Configure Call Settings

These settings relate to call operations.

  • Voicemail related settings: Set up voicemail detection and what to do when voicemail is detected. See more at Handle Voicemail.
  • End call on silence: End the call if the user is silent for a configured amount of time.
  • Call duration: Set the maximum duration of the call.
  • Pause before speaking: If the agent speaks first, it waits for the configured duration at the beginning of the call before speaking. This is useful when the user is still picking up the phone.
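
The silence and duration limits above interact in a simple way, which the following sketch tries to show. It is a hedged illustration only: `CallSettings`, `should_end_call`, the field names, and the default values are hypothetical and are not taken from the platform.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CallSettings:
    """Hypothetical call settings; field names and defaults are assumptions."""
    end_call_after_silence_s: float = 30.0  # hang up after this much user silence
    max_call_duration_s: float = 3600.0     # hard cap on total call length
    pause_before_speaking_s: float = 0.0    # initial delay when the agent speaks first


def should_end_call(settings: CallSettings,
                    elapsed_s: float,
                    silent_for_s: float) -> Optional[str]:
    """Return the reason the call should end, or None if it should continue."""
    if elapsed_s >= settings.max_call_duration_s:
        return "max_duration"
    if silent_for_s >= settings.end_call_after_silence_s:
        return "silence"
    return None


settings = CallSettings(end_call_after_silence_s=20.0, max_call_duration_s=600.0)
```

With these values, `should_end_call(settings, elapsed_s=120.0, silent_for_s=25.0)` returns `"silence"`, and a call that reaches 600 seconds ends with `"max_duration"` regardless of how recently the user spoke.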
5

Configure Post Call Analysis

You will probably set this up later. Read more in the Post Call Analysis Guide.

6

Configure Privacy & Webhook

Choose whether to opt out of sensitive data storage, and configure webhook settings for receiving call-related events.
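
On the receiving side, a webhook endpoint typically parses the request body and decides what to do with each event. The sketch below shows that step in isolation using only the Python standard library. The payload shape is an assumption: the `event` and `call_id` fields, the function name `handle_webhook`, and the sample event name are hypothetical, so check the platform's webhook documentation for the real schema.

```python
import json


def handle_webhook(raw_body: bytes) -> str:
    """Parse a call-event payload and return a short status string.

    The 'event' and 'call_id' fields are assumed for illustration only.
    """
    try:
        payload = json.loads(raw_body)
    except ValueError:
        # Covers both malformed JSON and undecodable bytes.
        return "rejected: body is not valid JSON"
    if not isinstance(payload, dict):
        return "rejected: payload is not a JSON object"

    event = payload.get("event")
    call_id = payload.get("call_id")
    if not event or not call_id:
        return "rejected: missing event or call_id"
    return f"accepted: {event} for call {call_id}"


status = handle_webhook(b'{"event": "call_ended", "call_id": "abc123"}')
```

A real endpoint would also verify that the request genuinely came from the platform before trusting it; how to do that depends on the platform and is not covered here.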

Next Steps

Once you’ve configured these basic settings, your agent is ready for basic interactions. To extend it, add capabilities with function calling.
