Voice Agents
Voice agents enable real-time, bidirectional voice conversations powered by Amazon Nova Sonic. Users speak naturally and the agent responds with lifelike speech — no text typing needed.
What Are Voice Agents?
Voice agents (also called "bidi agents") are a special type of agent on Universal API that use WebSocket connections for real-time audio streaming. Unlike text agents that use HTTP request/response, voice agents maintain a persistent connection for continuous, natural conversation.
Key capabilities:
- 🎙️ Real-time speech-to-speech conversations
- 🔧 Tool use during voice conversations (check availability, book appointments, etc.)
- 🌐 Embeddable on any website with one script tag
- 📞 Phone integration via Twilio (inbound and outbound calls)
- 🗣️ Multiple voice personalities (tiffany, matthew, amy)
- ⚡ Barge-in support (users can interrupt the agent mid-sentence)
Quick Start
1. Write the Agent Source Code
Voice agents define a create_bidi_agent() function (instead of create_agent() for text agents):
python
from strands.experimental.bidi.agent import BidiAgent
from strands.experimental.bidi.models.nova_sonic import BidiNovaSonicModel


def create_bidi_agent():
    # Nova Sonic model: 16 kHz audio in, 24 kHz audio out
    model = BidiNovaSonicModel(
        region="us-east-1",
        model_id="amazon.nova-sonic-v1:0",
        provider_config={
            "audio": {
                "input_sample_rate": 16000,
                "output_sample_rate": 24000,
                "voice": "tiffany"
            }
        }
    )
    # Short turns and read-back confirmations suit spoken conversation
    system_prompt = """You are a friendly voice assistant.
    Keep responses concise (1-3 sentences) for natural conversation.
    Confirm important details by repeating them back."""
    return BidiAgent(model=model, system_prompt=system_prompt)

2. Deploy via API
Create the agent with agentType: "bidi":
bash
curl -X POST https://api.universalapi.co/agent/create \
  -H "Authorization: Bearer uapi_ut_your_token" \
  -H "Content-Type: application/json" \
  -d '{
    "agentName": "my-voice-assistant",
    "description": "A friendly voice assistant",
    "agentType": "bidi",
    "sourceCode": "...",
    "visibility": "public"
  }'

Or use the create_agent MCP tool with agentType="bidi".
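The sourceCode field carries the whole agent file as a single JSON string, which is awkward to escape by hand in a shell. As a minimal sketch, the same request could be built in Python with the standard library, letting `json.dumps` handle the escaping. The helper name `build_create_request` is an illustrative assumption; the endpoint and field names are the ones from the curl example above.

```python
import json
import urllib.request

API_URL = "https://api.universalapi.co/agent/create"


def build_create_request(token: str, agent_name: str, description: str,
                         source_code: str,
                         visibility: str = "public") -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a bidi agent."""
    payload = {
        "agentName": agent_name,
        "description": description,
        "agentType": "bidi",        # marks the agent as a voice agent
        "sourceCode": source_code,  # json.dumps escapes quotes and newlines
        "visibility": visibility,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, read your agent file into a string, pass it as `source_code`, and hand the returned request to `urllib.request.urlopen`.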
3. Connect Users
Choose one or more delivery methods:
- Embed Widget — One script tag on any website
- WebSocket — Custom browser integration
- Twilio Phone — Inbound/outbound phone calls
Available Voices
| Voice | Style | Best For |
|---|---|---|
| tiffany | Warm, professional female | Customer service, receptionists |
| matthew | Clear, friendly male | General assistants, support |
| amy | British English female | International audiences |
Set the voice in the provider_config:
python
provider_config={
    "audio": {
        "input_sample_rate": 16000,
        "output_sample_rate": 24000,
        "voice": "tiffany"  # or "matthew" or "amy"
    }
}

Adding Tools
Voice agents can use tools just like text agents. Define tools with the @tool decorator:
python
from strands.experimental.bidi.agent import BidiAgent
from strands.experimental.bidi.models.nova_sonic import BidiNovaSonicModel
from strands import tool


@tool
def check_availability(date: str, time_preference: str) -> dict:
    """Check appointment availability for a given date and time preference."""
    # Your implementation: call your booking API, database, etc.
    return {"available": True, "slots": ["9:00 AM", "2:00 PM", "4:30 PM"]}


@tool
def book_appointment(patient_name: str, date: str, time: str, service: str) -> dict:
    """Book an appointment for a patient."""
    return {"confirmed": True, "confirmation_number": "APT-12345"}


@tool
def get_office_info(question: str) -> dict:
    """Answer questions about the office (hours, location, services)."""
    return {"answer": "We're open Monday-Friday, 8 AM to 6 PM."}


def create_bidi_agent():
    model = BidiNovaSonicModel(
        region="us-east-1",
        model_id="amazon.nova-sonic-v1:0",
        provider_config={
            "audio": {
                "input_sample_rate": 16000,
                "output_sample_rate": 24000,
                "voice": "tiffany"
            }
        }
    )
    system_prompt = """You are a friendly AI receptionist for Bright Smile Dental.
    You can check appointment availability, book appointments, and answer office questions.
    Keep responses concise and confirm details by repeating them back."""
    return BidiAgent(
        model=model,
        system_prompt=system_prompt,
        tools=[check_availability, book_appointment, get_office_info]
    )

Embed Widget
The easiest way to add a voice agent to your website. One script tag creates a floating voice chat button:
html
<script src="https://cdn.universalapi.co/voice/v1.js"
  data-agent="{agentId}"
  data-token="{emb_pk_live_xxx}"
  data-color="#6366f1"
  data-greeting="Hi! How can I help you today?"
  data-position="bottom-right">
</script>

Widget Parameters
| Attribute | Required | Default | Description |
|---|---|---|---|
| data-agent | ✅ | (none) | Agent UUID |
| data-token | ✅ | (none) | Embed public key (emb_pk_live_xxx) |
| data-position | ❌ | bottom-right | bottom-right or bottom-left |
| data-color | ❌ | #6366f1 | Brand color (hex) |
| data-greeting | ❌ | (none) | Custom greeting text shown before conversation starts |
| data-trigger | ❌ | (none) | CSS selector for a custom trigger element (hides the default floating button) |
How It Works
- User clicks the floating button (or your custom trigger)
- Browser requests microphone permission
- WebSocket connection opens to voice.api.universalapi.co
- Audio streams bidirectionally: the user speaks and the agent responds in real time
- Widget uses Shadow DOM for complete style isolation from your site
WebSocket Integration
For custom browser integrations with full control over the UI:
Endpoint:
wss://voice.api.universalapi.co/ws/{agentId}?token=uapi_ut_xxx

Audio Format:
- Send: PCM 16kHz mono, 16-bit signed little-endian
- Receive: PCM 24kHz mono, 16-bit signed little-endian
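To make the send format concrete, here is a rough sketch that encodes floating-point samples as 16-bit signed little-endian PCM and computes the byte size of one frame. The 20 ms frame duration and the function names are assumptions for illustration; this page specifies only the sample rates and the sample encoding, not a frame size.

```python
import math
import struct

SEND_RATE_HZ = 16_000     # microphone audio sent to the agent
RECEIVE_RATE_HZ = 24_000  # agent speech received from the server
BYTES_PER_SAMPLE = 2      # 16-bit mono


def frame_bytes(sample_rate_hz: int, frame_ms: int) -> int:
    """Size in bytes of one mono 16-bit PCM frame of the given duration."""
    return sample_rate_hz * frame_ms // 1000 * BYTES_PER_SAMPLE


def floats_to_pcm16le(samples: list[float]) -> bytes:
    """Encode samples in [-1.0, 1.0] as 16-bit signed little-endian PCM."""
    clipped = (max(-1.0, min(1.0, s)) for s in samples)
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def sine_frame(freq_hz: float, frame_ms: int = 20) -> bytes:
    """Generate one frame of a test tone at the send sample rate."""
    count = SEND_RATE_HZ * frame_ms // 1000
    samples = [math.sin(2 * math.pi * freq_hz * n / SEND_RATE_HZ)
               for n in range(count)]
    return floats_to_pcm16le(samples)
```

At these rates a 20 ms frame is 640 bytes to send and 960 bytes to receive, which is a handy sanity check when debugging a custom client.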
JavaScript Example:
javascript
// agentId and token are supplied by your application.
// Request microphone access. The browser may not honor sampleRate,
// so resample to 16 kHz before sending if needed.
const stream = await navigator.mediaDevices.getUserMedia({
  audio: {
    sampleRate: 16000,
    channelCount: 1,
    echoCancellation: true,
    noiseSuppression: true
  }
});

// Connect to voice agent
const ws = new WebSocket(
  `wss://voice.api.universalapi.co/ws/${agentId}?token=${token}`
);
ws.binaryType = "arraybuffer";

// Send audio frames from AudioWorklet or MediaRecorder
// Receive audio frames and play via AudioContext
ws.onmessage = (event) => {
  if (event.data instanceof ArrayBuffer) {
    // Play received audio through speakers
    // (playAudioBuffer is a helper you implement)
    playAudioBuffer(event.data);
  }
};

Twilio Phone Integration
Connect your voice agent to a phone number for real phone calls.
Inbound Calls (Customers Call You)
- Create a Twilio voice channel:
bash
curl -X POST https://api.universalapi.co/channels \
  -H "Authorization: Bearer uapi_ut_your_token" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "office-phone",
    "platform": "twilio-voice",
    "agentId": "{your-voice-agent-id}",
    "platformConfig": {
      "accountSid": "ACxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx",
      "authToken": "your_twilio_auth_token",
      "phoneNumber": "+12125551234"
    }
  }'

- Configure Twilio: In the Twilio Console, set your phone number's Voice webhook URL to the channel's webhookUrl returned in the response.
- Test it: Call your Twilio phone number and the voice agent answers.
Outbound Calls (Agent Calls Customers)
bash
curl -X POST https://api.universalapi.co/channels/{channelId}/call \
  -H "Authorization: Bearer uapi_ut_your_token" \
  -H "Content-Type: application/json" \
  -d '{"to": "+12125559876"}'

The agent initiates the call and begins the voice conversation when the recipient answers.
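The "to" number in the example is written in E.164 form (a plus sign, country code, then subscriber number). As a hedged sketch, a client might check the number before building the request. The helper name `build_call_request` is hypothetical, and whether the API itself requires or validates E.164 is an assumption this page does not confirm.

```python
import json
import re
import urllib.request

# E.164: "+" followed by 2 to 15 digits, the first of which is non-zero.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def build_call_request(token: str, channel_id: str,
                       to_number: str) -> urllib.request.Request:
    """Build (but do not send) the outbound-call request for a channel."""
    if not E164_PATTERN.fullmatch(to_number):
        raise ValueError(f"not an E.164 phone number: {to_number!r}")
    return urllib.request.Request(
        f"https://api.universalapi.co/channels/{channel_id}/call",
        data=json.dumps({"to": to_number}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Rejecting a malformed number locally avoids spending a request on a call that cannot connect.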
Billing
- Voice agents cost approximately 50 credits/minute while connected
- Billing is per-second (no rounding up to full minutes)
- 2 credit minimum per session
- Check your balance: GET /user/credits
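The billing rules above can be turned into a quick estimate. This is only a sketch under stated assumptions: it uses the approximate 50 credits/minute rate, bills by the second, and applies the 2-credit minimum. How fractional credits are rounded is not documented here, so the function returns an unrounded figure. The function name is hypothetical.

```python
CREDITS_PER_MINUTE = 50  # approximate rate quoted above
MINIMUM_CREDITS = 2.0    # per-session minimum


def estimate_session_credits(duration_seconds: float) -> float:
    """Estimate credits for one session with per-second billing."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    usage = duration_seconds * CREDITS_PER_MINUTE / 60
    return max(MINIMUM_CREDITS, usage)
```

Under these assumptions a 90-second call comes to about 75 credits, and any session shorter than 2.4 seconds is charged the 2-credit minimum.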
Best Practices
System Prompt Tips for Voice
- Keep it concise — Voice users expect quick responses. Aim for 1-3 sentences per turn.
- Use confirmation patterns — Repeat back important details (dates, names, phone numbers).
- Handle interruptions gracefully — Nova Sonic supports barge-in. Design prompts that work when cut short.
- Avoid long lists — Limit to 3-4 items and offer to continue.
- Use natural filler phrases — "Let me check on that" or "One moment" while processing tool calls.
- Be conversational — Avoid robotic or overly formal language.
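The "avoid long lists" tip can also be applied inside a tool, so the model never receives more options than it should read aloud. A minimal sketch, assuming a hypothetical helper `limit_for_voice` that a tool such as check_availability could call before returning its result:

```python
def limit_for_voice(items: list[str], max_items: int = 3) -> dict:
    """Trim a list for spoken output and report how many items were held back."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return {
        "items": items[:max_items],
        "remaining": max(0, len(items) - max_items),
    }
```

The agent can read out "items" and, when "remaining" is greater than zero, offer to continue with more options.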
Example System Prompt
You are a friendly AI receptionist for Bright Smile Dental in Denver.
Your capabilities:
- Check appointment availability
- Book appointments
- Answer questions about services, hours, and location
Guidelines:
- Keep responses to 1-3 sentences
- Always confirm appointment details before booking
- If unsure about something, offer to transfer to a human
- Be warm and professional

Embed Token Management
The embed widget requires an embed token (emb_pk_live_xxx) to authenticate. Embed tokens are domain-restricted publishable keys — safe to include in client-side HTML.
Create an Embed Token
bash
curl -X POST https://api.universalapi.co/voice/embed/create \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "agentId": "YOUR_AGENT_ID",
    "allowedDomains": ["yourdomain.com", "localhost"],
    "rateLimitPerDay": 1000,
    "rateLimitPerIp": 10,
    "greeting": "Hi! Click to start a voice conversation.",
    "color": "#6366f1",
    "position": "bottom-right"
  }'

Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| agentId | string | ✅ | Voice agent UUID |
| allowedDomains | string[] | No | Domains where widget can load (default: ["localhost"]). Use ["*"] for any domain. |
| rateLimitPerDay | number | No | Max calls/day across all users (default: 1000) |
| rateLimitPerIp | number | No | Max calls per IP per day (default: 10) |
| greeting | string | No | Initial greeting shown in widget |
| color | string | No | Hex color for widget button (default: #6366f1) |
| position | string | No | bottom-right or bottom-left |
| minutesLimit | number | No | Max minutes per call |
| creditLimit | number | No | Max credits this token can consume |
List Embed Tokens
bash
curl https://api.universalapi.co/voice/embed/list \
  -H "Authorization: Bearer YOUR_TOKEN"

Update an Embed Token
bash
curl -X PUT https://api.universalapi.co/voice/embed/EMBED_TOKEN_ID \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"allowedDomains": ["newdomain.com"], "rateLimitPerIp": 20}'

Revoke an Embed Token
bash
curl -X DELETE https://api.universalapi.co/voice/embed/EMBED_TOKEN_ID \
  -H "Authorization: Bearer YOUR_TOKEN"

Get Widget Config (Public)
This endpoint is called by the widget script itself — no user auth needed, just the embed token:
bash
curl "https://api.universalapi.co/voice/embed/config?token=emb_pk_live_xxx"

Returns the widget configuration (agentId, greeting, color, position) if the Origin header matches an allowed domain.
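The domain check can be pictured with a small sketch. This is an illustrative guess at the matching rule, not the service's actual implementation: it compares the hostname of the Origin header against allowedDomains exactly and treats "*" as allow-all, as described in the parameters table. Whether subdomains or ports are taken into account is not documented here, and the function name `is_origin_allowed` is hypothetical.

```python
from urllib.parse import urlsplit


def is_origin_allowed(origin: str, allowed_domains: list[str]) -> bool:
    """Return True if the Origin header's hostname is on the allow-list."""
    if "*" in allowed_domains:
        return True
    hostname = urlsplit(origin).hostname  # None when the origin is malformed
    if hostname is None:
        return False
    return hostname.lower() in {d.lower() for d in allowed_domains}
```

With the example token above, `https://yourdomain.com` and `http://localhost:3000` would pass this check, while a page on any other host would not.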
API Reference
| Endpoint | Method | Description |
|---|---|---|
| /agent/create | POST | Create voice agent (set agentType: "bidi") |
| wss://voice.api.universalapi.co/ws/{agentId} | WebSocket | Browser voice chat |
| wss://voice.api.universalapi.co/ws/twilio/{agentId} | WebSocket | Twilio phone integration |
| /channels | POST | Create Twilio voice channel |
| /channels/{channelId}/call | POST | Make outbound phone call |
| /voice/embed/create | POST | Create embed token |
| /voice/embed/list | GET | List your embed tokens |
| /voice/embed/{id} | PUT | Update embed token |
| /voice/embed/{id} | DELETE | Revoke embed token |
| /voice/embed/config | GET | Get widget config (public) |
| https://voice.api.universalapi.co/health | GET | Voice runtime health check |
Related Resources
- Creating Agents — General agent creation guide
- Streaming — Text agent streaming responses
- Blog: AI Agent Outbound Phone Calls with Twilio — Step-by-step Twilio setup tutorial