If you have not worked with audio bytes before, we strongly suggest reading the audio basics guide, which can help you choose the best configuration here.

Set up the SDK

Step 1: Install the Client JS SDK

npm install retell-client-js-sdk

Step 2: Set up the SDK class

import { RetellWebClient } from "retell-client-js-sdk";

const sdk = new RetellWebClient();

Call register-call to get call id

Your client code should call an endpoint on your own server, which in turn calls register-call to get the call ID. register-call requires your API key, so the request must be made from the server rather than the client to keep the key from being exposed.
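The client side of this flow might look like the sketch below. It assumes your server exposes a route (hypothetically named /api/register-call here) that forwards the request to register-call and returns JSON containing call_id and sample_rate, the two fields used when starting the conversation. The route name, the POST method, and the helper function names are illustrative assumptions, not part of the SDK.

```typescript
// Fields the client needs from your server. The names follow the
// registerCallResponse object passed to startConversation.
interface RegisterCallResponse {
  call_id: string;
  sample_rate: number;
}

// Validate the JSON returned by your own server before using it.
function parseRegisterCallResponse(data: unknown): RegisterCallResponse {
  if (typeof data !== "object" || data === null) {
    throw new Error("register-call response is not an object");
  }
  const { call_id, sample_rate } = data as Record<string, unknown>;
  if (typeof call_id !== "string" || call_id.length === 0) {
    throw new Error("register-call response is missing call_id");
  }
  if (typeof sample_rate !== "number" || !Number.isFinite(sample_rate)) {
    throw new Error("register-call response is missing sample_rate");
  }
  return { call_id, sample_rate };
}

// Hypothetical: "/api/register-call" is a route on YOUR server, not a Retell URL.
// The API key stays on the server and never reaches this code.
async function fetchRegisterCallResponse(
  endpoint: string = "/api/register-call",
): Promise<RegisterCallResponse> {
  const response = await fetch(endpoint, { method: "POST" });
  if (!response.ok) {
    throw new Error(`register-call request failed with status ${response.status}`);
  }
  return parseRegisterCallResponse(await response.json());
}
```

Validating the response up front means a misconfigured server surfaces as a clear error before the conversation is started, rather than as an undefined call ID.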

Start the conversation

sdk.startConversation({
  callId: registerCallResponse.call_id,
  sampleRate: registerCallResponse.sample_rate,
  enableUpdate: true, // (Optional) You want to receive the update event such as transcript
  customStream: yourStream, // (Optional) You can use your own MediaStream which might use a different mic
});

Stop the conversation

To end a web call with the agent, call:

sdk.stopConversation();

Listen to events

// Setup event listeners
// When the whole agent and user conversation starts
sdk.on("conversationStarted", () => {
    console.log("Conversation started");
});

// When the whole agent and user conversation ends
sdk.on("conversationEnded", () => {
    console.log("Conversation ended");
});

// When an error occurs
sdk.on("error", (error) => {
    console.error("An error occurred:", error);
});

// Update messages, such as transcript and turn-taking information
sdk.on("update", (update) => {
    // Print live transcript as needed
    console.log("update", update);
});

// Metadata passed from custom LLM server
sdk.on("metadata", (metadata) => {
    console.log("metadata", metadata);
});

// Agent audio in real time, can be used for animation
sdk.on("audio", (audio: Uint8Array) => {
    console.log("There is audio");
});

// Signals that agent audio has started playing back. Does not work when ambient sound is used.
// Useful for animation
sdk.on("agentStartTalking", () => {
    console.log("agentStartTalking");
});

// Signals that all agent audio in the buffer has been played back. Does not work when ambient sound is used.
// Useful for animation
sdk.on("agentStopTalking", () => {
    console.log("agentStopTalking");
});
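As one way to drive an animation from these events, the sketch below keeps a single "is the agent talking" flag in sync with agentStartTalking and agentStopTalking. The helper name trackAgentTalking and the TalkingEventSource interface are illustrative, not part of the SDK; the code only assumes an object with an on(event, handler) method like the one used above. Resetting the flag on conversationEnded is a design choice of this sketch.

```typescript
// Minimal structural type: anything with an `on(event, handler)` method,
// which matches how the listeners above are registered.
interface TalkingEventSource {
  on(event: string, handler: () => void): unknown;
}

// Tracks whether the agent is currently talking and calls `onChange`
// only when that state actually flips. Returns a getter for the state.
function trackAgentTalking(
  source: TalkingEventSource,
  onChange: (isTalking: boolean) => void,
): () => boolean {
  let isTalking = false;
  const set = (next: boolean): void => {
    if (next === isTalking) return; // ignore duplicate events
    isTalking = next;
    onChange(next);
  };
  source.on("agentStartTalking", () => set(true));
  source.on("agentStopTalking", () => set(false));
  // Reset when the call ends so an animation does not stay stuck on.
  source.on("conversationEnded", () => set(false));
  return () => isTalking;
}
```

The onChange callback is where you would start or stop the animation, for example by toggling a CSS class on an avatar element.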

Transcript Update

If you would like to show an animation based on user or agent speech, you can use the update event. When you call startConversation, set enableUpdate to true.

The update event delivers updates such as the transcript. The transcript covers both the user and the agent, and it grows incrementally as the conversation proceeds.

The transcript includes word-level timestamps for the user's speech. You can use these to tell whether the user is speaking quickly or slowly. For example, during the conversation it will print:

{
  "transcript": [
    {
      "role": "user",
      "content": "Hey",
      "words": [
        {
          "word": "Hey.",
          "start": 4.375,
          "end": 4.615
        }
      ]
    }
  ]
}
{
  "transcript": [
    {
      "role": "user",
      "content": "Hey there",
      "words": [
        {
          "word": "Hey.",
          "start": 4.375,
          "end": 4.615
        },
        {
          "word": "there,",
          "start": 4.615,
          "end": 4.855
        }
      ]
    }
  ]
}
{
  "transcript": [
    {
      "role": "user",
      "content": "Hey there",
      "words": [
        {
          "word": "Hey.",
          "start": 4.375,
          "end": 4.615
        },
        {
          "word": "there,",
          "start": 4.615,
          "end": 4.855
        }
      ]
    },
    {
      "role": "agent",
      "content": "Hey, I'm"
    }
    }
  ]
}
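To make the word-level timestamps concrete, the sketch below estimates how fast the user is speaking from a single update. The type names and the function latestUserSpeakingRate are illustrative, and the code assumes that start and end are expressed in seconds, which the example values suggest but the text above does not state.

```typescript
// Shapes inferred from the example updates above.
interface Word {
  word: string;
  start: number; // assumed to be seconds
  end: number;   // assumed to be seconds
}

interface Utterance {
  role: "user" | "agent";
  content: string;
  words?: Word[]; // the examples only show this for the user
}

interface TranscriptUpdate {
  transcript: Utterance[];
}

// Words per second for the most recent user utterance that has timing
// data, or null when no rate can be computed.
function latestUserSpeakingRate(update: TranscriptUpdate): number | null {
  const timedUserTurns = update.transcript.filter(
    (u) => u.role === "user" && u.words !== undefined && u.words.length > 0,
  );
  const last = timedUserTurns[timedUserTurns.length - 1];
  if (!last || !last.words) return null;
  const words = last.words;
  const duration = words[words.length - 1].end - words[0].start;
  if (duration <= 0) return null; // avoid dividing by zero
  return words.length / duration;
}
```

For the second example update above (two words spanning 4.375 to 4.855), this works out to roughly 4.2 words per second. You could call it inside the update listener and compare the result against a threshold of your choosing.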