We enforce a few constraints and limits to keep your agents running smoothly and to prevent misuse of the service. These limits can be adjusted on a case-by-case basis to fit your operational needs.

Concurrency

Concurrency is the number of voice calls that are active at the same moment. For example, if 15 users are on calls with your agents at the same time, that counts as 15 concurrent calls. Pay-As-You-Go users are allocated a quota of 20 concurrent calls. If you need more concurrency, go to the “Billing” page to upgrade your plan.
You can check your current number of concurrent calls in the dashboard.
  • Handling Multiple Calls per Agent: You don’t need to create multiple agents to handle multiple calls concurrently. Each agent in your plan can handle an unlimited number of calls, provided that your total concurrency stays within your quota.

Concurrency Burst

Concurrency Burst allows you to temporarily exceed your standard concurrency limit during peak demand periods. When enabled, calls that would normally be rejected due to hitting your concurrency limit will instead be allowed to proceed with an additional surcharge.

How It Works

When concurrency burst is enabled:
  1. Normal calls: Calls within your standard concurrency limit proceed as usual with no additional charges
  2. Burst calls: Calls that exceed your normal limit (but stay within the burst limit) will proceed with an additional $0.10/min surcharge applied to the entire call duration

Burst Limit Calculation

Your burst limit is calculated as the lower of:
  • 3× your concurrency limit, OR
  • Your concurrency limit + 300
For example:
  • If your limit is 50, burst allows up to 150 concurrent calls (3 × 50 = 150)
  • If your limit is 200, burst allows up to 500 concurrent calls (200 + 300 = 500, which is less than 3 × 200 = 600)
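
The rule above can be sketched in a few lines of Python. This is only an illustration, not part of the service's API: the function names `burst_limit` and `classify_new_call` are hypothetical, and the boundary handling (a call counts as normal while the number of active calls is still below the limit) is an assumption based on the description above.

```python
def burst_limit(concurrency_limit: int) -> int:
    """Return the burst ceiling: the lower of 3x the limit and the limit + 300."""
    if concurrency_limit < 0:
        raise ValueError("concurrency_limit must be non-negative")
    return min(3 * concurrency_limit, concurrency_limit + 300)


def classify_new_call(active_calls: int, concurrency_limit: int, burst_enabled: bool) -> str:
    """Classify a call that arrives while `active_calls` calls are in progress.

    Assumption: the new call is "normal" if it fits within the standard limit,
    "burst" if burst is enabled and it fits within the burst limit,
    and "rejected" otherwise.
    """
    if active_calls < concurrency_limit:
        return "normal"
    if burst_enabled and active_calls < burst_limit(concurrency_limit):
        return "burst"
    return "rejected"


if __name__ == "__main__":
    print(burst_limit(50))    # 3 x 50 is the lower value
    print(burst_limit(200))   # 200 + 300 is the lower value
    print(classify_new_call(active_calls=20, concurrency_limit=20, burst_enabled=True))
```

The two options meet at a limit of 150 (3 × 150 = 150 + 300 = 450). Below that, the 3× rule sets the burst limit; above it, the +300 rule does.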

Enabling Concurrency Burst

You can enable or disable concurrency burst from the Settings > Limits page in your dashboard.

Pricing

| Call Type | Additional Cost |
| --- | --- |
| Normal (within standard limit) | No additional charge |
| Burst (above standard limit) | $0.10/min for the entire call duration |
The burst surcharge applies to the entire duration of any call that started while in burst mode, not just the portion of time spent above the normal limit.
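
As a rough illustration of the pricing rule, the sketch below estimates the surcharge for a single call. The function name `burst_surcharge` is hypothetical, and it assumes fractional minutes are billed proportionally. The actual billing granularity is not stated here, so treat the result as an estimate only.

```python
from decimal import Decimal

# Surcharge rate from the pricing table above.
BURST_SURCHARGE_PER_MIN = Decimal("0.10")


def burst_surcharge(duration_minutes: Decimal, started_in_burst: bool) -> Decimal:
    """Estimate the burst surcharge for one call.

    The surcharge covers the entire call duration if the call started in
    burst mode, and is zero otherwise. Proportional billing of fractional
    minutes is an assumption of this sketch.
    """
    if duration_minutes < 0:
        raise ValueError("duration_minutes must be non-negative")
    if not started_in_burst:
        return Decimal("0")
    return BURST_SURCHARGE_PER_MIN * duration_minutes


if __name__ == "__main__":
    # A 12.5-minute call that started in burst mode.
    print(burst_surcharge(Decimal("12.5"), started_in_burst=True))
    # The same call started within the standard limit.
    print(burst_surcharge(Decimal("12.5"), started_in_burst=False))
```

`Decimal` is used instead of `float` to avoid binary rounding errors in currency amounts.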

Use Cases

Concurrency burst is ideal for:
  • Unpredictable traffic spikes: Handle sudden increases in call volume without rejected calls
  • Campaign launches: Support higher-than-normal call volumes during marketing campaigns
  • Seasonal peaks: Manage increased demand during busy periods without permanently upgrading your concurrency limit
While concurrency burst provides flexibility, consistent high usage above your normal limit may indicate a need to increase your base concurrency allocation for cost efficiency.

Max Call Duration

The maximum duration of a call is 1 hour by default, and the call ends automatically once it reaches that limit. If you need longer calls, please reach out to our team at [email protected] to discuss options.

Max Prompt Token Length

The maximum prompt length when using the Retell LLM framework is 32,768 tokens by default. Longer prompts are rejected when you create or update the LLM. Note that prompts over 3,500 tokens are charged extra; read more at Billing Exceptions. If you need a longer context, please reach out to our team at [email protected] to discuss options.
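
A client-side pre-check against these two thresholds might look like the sketch below. The function name `check_prompt_length` is hypothetical, and the caller is assumed to supply a token count. How the service itself counts tokens is not specified here, so a local count may differ from the one used for enforcement and billing.

```python
# Default thresholds from the text above.
MAX_PROMPT_TOKENS = 32768
EXTRA_CHARGE_THRESHOLD = 3500


def check_prompt_length(token_count: int) -> str:
    """Classify a prompt by its token count.

    `token_count` must be computed by the caller; the tokenizer used by the
    service is not specified, so this is only an approximate pre-check.
    """
    if token_count < 0:
        raise ValueError("token_count must be non-negative")
    if token_count > MAX_PROMPT_TOKENS:
        return "rejected"
    if token_count > EXTRA_CHARGE_THRESHOLD:
        return "accepted_with_extra_charge"
    return "accepted"


if __name__ == "__main__":
    for count in (1200, 5000, 40000):
        print(count, check_prompt_length(count))
```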