Interaction Overview Diagram
The upper part of the diagram shows the interaction between your backend response-generating server and the Retell server.
1. A phone or web call is made with the AI agent. Our server establishes the audio WebSocket.
2. Our server connects to the `llm_websocket_url` you provided in the agent.
3. Your LLM server needs to send the first message as soon as the WebSocket connection is ready. If you want the agent to speak first, set the content; otherwise, set the content to an empty string. (A minimal server sketch follows this list.)
4. The user says, "My name is Mike".
5. Our model detects a high chance of turn-taking, or the user pauses, and we request a response from your LLM.
6. Your server checks `interaction_type` in our JSON. If it is `response_required`, you need to send a response. After receiving your response, our model checks whether the AI should speak.
7. The user continues and says, "My name is Mike Trump".
8. Same as steps 5 and 6.
9. Our server receives the response from your LLM and decides to speak.
10. We send the AI voice over the audio WebSocket. Meanwhile, we send you JSON with `interaction_type` set to `update_only`. You don't need to respond to it, but you can read the live transcript from the JSON body.
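To make the steps concrete, here is a minimal sketch of such an LLM WebSocket server in Python, assuming the `websockets` package. Only `interaction_type`, `response_required`, `update_only`, and `content` come from the flow above; the `response_id`, `content_complete`, and `transcript` fields, the port, and the `generate_reply` helper are illustrative assumptions, so check the protocol reference for the exact message schema.

```python
# A minimal sketch of a custom LLM WebSocket server (not the exact
# Retell schema): it sends a first message on connect, answers
# response_required events, and ignores update_only events.
import asyncio
import json

import websockets


def generate_reply(transcript) -> str:
    """Hypothetical stand-in for your actual LLM call."""
    return "Thanks, noted!"


async def handle_call(websocket):
    # Step 3: speak first by sending content once the connection is
    # ready; use an empty string if the user should speak first.
    await websocket.send(json.dumps({
        "response_id": 0,          # assumed field name
        "content": "Hi, how can I help you?",
        "content_complete": True,  # assumed field name
    }))

    async for raw in websocket:
        message = json.loads(raw)
        interaction_type = message.get("interaction_type")

        if interaction_type == "update_only":
            # Step 10: transcript update; no response is expected.
            continue

        if interaction_type == "response_required":
            # Steps 5-6: Retell detected turn-taking and wants a reply.
            reply = generate_reply(message.get("transcript", []))
            await websocket.send(json.dumps({
                "response_id": message.get("response_id"),  # assumed
                "content": reply,
                "content_complete": True,
            }))


async def main():
    async with websockets.serve(handle_call, "0.0.0.0", 8080):
        await asyncio.Future()  # serve until cancelled


if __name__ == "__main__":
    asyncio.run(main())
```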
Example Custom LLM Demo Repositories
Fork the complete code used in the following guides to follow along and integrate your own custom LLM solution. These demo repos show how to build an LLM solution with `openai` / Azure OpenAI, how to start an LLM WebSocket server, and how to use Twilio to make phone calls with Retell agents programmatically; the sketches below give a rough flavor of the OpenAI and Twilio pieces.
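For the response-generation side, here is a minimal sketch of an OpenAI-backed `generate_reply` (the helper stubbed out in the server sketch above), using the official `openai` Python SDK. The model name, system prompt, and the assumed shape of transcript entries (`role`/`content` keys) are illustrative placeholders, and the demo repos stream partial responses, which this sketch skips for brevity.

```python
# A minimal sketch of generating a reply with the OpenAI Python SDK.
# Model name and prompt are placeholders; transcript entry shape is assumed.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def generate_reply(transcript: list[dict]) -> str:
    # Flatten the conversation so far into a single user message;
    # each entry is assumed to carry "role" and "content" keys.
    history = "\n".join(
        f"{turn.get('role', '?')}: {turn.get('content', '')}" for turn in transcript
    )
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model
        messages=[
            {"role": "system", "content": "You are a friendly voice agent. Reply briefly."},
            {"role": "user", "content": history},
        ],
    )
    return completion.choices[0].message.content or ""
```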
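For the telephony side, here is a minimal sketch of placing an outbound call with the Twilio Python SDK. The phone numbers and the TwiML URL are placeholders, and wiring that URL up to a Retell agent is exactly what the demo repo covers; this only shows the programmatic dial-out.

```python
# A minimal sketch of dialing an outbound call with the Twilio Python SDK.
# All numbers and URLs are placeholders.
import os

from twilio.rest import Client

client = Client(os.environ["TWILIO_ACCOUNT_SID"], os.environ["TWILIO_AUTH_TOKEN"])

call = client.calls.create(
    to="+15551234567",     # callee (placeholder)
    from_="+15557654321",  # your Twilio number (placeholder)
    # The TwiML served at this URL decides what happens when the call
    # connects; the demo repo shows how to route it to a Retell agent.
    url="https://your-server.example.com/twiml",
)
print("Started call:", call.sid)
```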
- Backend Server: YouTube Guide (this video may already be outdated)