Integrate Function Calling
Let your voice agent take actions.
What is Function Calling?
Often, you will want your voice agent to take actions beyond talking (for example, calling an API to book an appointment, transferring or ending the call, or fetching external knowledge). With certain LLMs, you can achieve this by describing functions and having the model intelligently choose to output a JSON object containing the arguments to call one or more of them. This is also known as tool use, and the two terms are interchangeable.
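For example, instead of a text reply, the model may return a tool call like the one below. This is a hand-written illustration of the OpenAI chat completions shape, shown as a Python dict; the id and message text are placeholders.

```python
# One parsed tool call as the model might return it.
tool_call = {
    "id": "call_abc123",  # placeholder id
    "type": "function",
    "function": {
        "name": "end_call",
        # The model returns arguments as a JSON-encoded string,
        # which your code must parse before use.
        "arguments": '{"message": "Thanks for calling, goodbye!"}',
    },
}
```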
We recommend reading this OpenAI documentation to understand what function calling is. This guide uses OpenAI function calling as an example, but feel free to take the idea and apply it to other models like Claude.
We will first dive into an easy use case of function calling (ending the call), and then cover a more advanced appointment booking use case.
Case Study: End the Call Intelligently
YouTube tutorial to follow along:
The following steps take code from the Node.js Demo Repo / Python Demo Repo; it is modified based on the LLM client class you created in the last guide.
Step 1
Define the Function
Note that for OpenAI, the model returns either a tool call or a text response, but not both. We define a message parameter in the tool call so that when the LLM decides to call this function, we also have something to say to the user.
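A minimal sketch of such a definition, using the OpenAI tools format (the description strings and schema here are illustrative; the demo repos may word them differently):

```python
# Tool definition sent to the LLM. The `message` parameter carries the
# closing line the agent should speak before hanging up.
end_call_tool = {
    "type": "function",
    "function": {
        "name": "end_call",
        "description": "End the call. Use only when the conversation is clearly over, e.g. the user says goodbye.",
        "parameters": {
            "type": "object",
            "properties": {
                "message": {
                    "type": "string",
                    "description": "A short closing message to say to the user before ending the call.",
                }
            },
            "required": ["message"],
        },
    },
}
```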
Step 2
Add your function into the chat request and call it
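As a sketch, assuming the openai Python SDK and the end_call_tool dict from Step 1 (the model name and client setup are placeholders, not the demo repos' exact configuration):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def draft_response(messages: list[dict]):
    """Send the conversation plus the tool definition; stream the reply."""
    return client.chat.completions.create(
        model="gpt-4o",         # placeholder model name
        messages=messages,
        tools=[end_call_tool],  # defined in Step 1
        stream=True,
    )

messages = [{"role": "system", "content": "You are a friendly voice agent."}]
stream = draft_response(messages)
```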
Step 3
Extract the function call arguments (if any) from the streaming response.
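A sketch of the accumulation loop, assuming stream is the streaming response from Step 2; speak() is a hypothetical helper standing in for however your agent sends a spoken response back to the user:

```python
# Tool-call name and arguments arrive incrementally across chunks,
# so accumulate them as the stream comes in.
func_name = None
func_args = ""

for chunk in stream:
    if not chunk.choices:
        continue
    delta = chunk.choices[0].delta
    if delta.tool_calls:
        tool_call = delta.tool_calls[0]
        if tool_call.function.name:
            func_name = tool_call.function.name
        if tool_call.function.arguments:
            func_args += tool_call.function.arguments  # partial JSON string
    elif delta.content:
        speak(delta.content)  # normal text response: say it to the user
```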
Step 4
End the call when the LLM suggests this function call.
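Once the stream finishes, check whether the model asked for the end_call function and act on it. hang_up() is a placeholder for however your telephony layer actually terminates the call:

```python
import json

if func_name == "end_call":
    args = json.loads(func_args)  # arguments are complete once the stream ends
    speak(args["message"])        # say the closing line first
    hang_up()                     # hypothetical helper that ends the call
```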
Case Study: Make an Appointment
Ending the call is a simple use case for function calling. In most cases, you would like your agent to say something while calling the function and something else after the function returns.
Check out the Node Demo Code and the YouTube tutorial below for a simplified example of booking appointments.
Please note that this is not production ready: in production, you need to make sure you don't make duplicate function calls, that users can still interact and have their interruptions handled, and so on. For a more practical setting, a good practice is to keep some internal state to decide and track which function to run, and to influence how the LLM responds (see the sketch below).
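As a rough sketch of that idea (the state fields and book_appointment call are illustrative, not taken from the demo repos), you might track whether a booking has already started so that a repeated tool call cannot run it twice:

```python
# Minimal per-call state to guard against duplicate function executions.
class CallState:
    def __init__(self):
        self.booking_in_progress = False
        self.booking_confirmed = False

def handle_tool_call(state: CallState, name: str, args: dict):
    if name == "book_appointment":
        if state.booking_in_progress or state.booking_confirmed:
            return  # ignore duplicates from retried or overlapping turns
        state.booking_in_progress = True
        try:
            book_appointment(args)  # hypothetical booking API call
            state.booking_confirmed = True
        finally:
            state.booking_in_progress = False
```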
We are working on adding a more practical example of function calling to our open source demo repos.