
Streaming Responses

Agent responses are streamed in real time via a dedicated streaming API.

Streaming Endpoint

POST https://stream.api.universalapi.co/agent/{agentId}/chat

INFO

Note the different domain: stream.api.universalapi.co (not api.universalapi.co). This is a dedicated API Gateway optimized for streaming.

Request Format

bash
curl -s -X POST https://stream.api.universalapi.co/agent/AGENT_ID/chat \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain quantum computing in simple terms",
    "conversationId": "conv-xxx"
  }'
Field           Type    Required  Description
prompt          string  Yes       The user's message
conversationId  string  No        Include to continue an existing conversation

Response Format

The response streams as plain text, followed by two metadata lines:

[Agent's streamed text response here...]

__META__{"conversationId":"conv-abc123","agentId":"agent-xxx","bedrockProvider":"platform"}
__METRICS__{"totalCycles":2,"totalTokens":1500,"toolsUsed":["google_search"]}

Metadata Fields (__META__)

Field            Description
conversationId   Use this to continue the conversation
agentId          The agent that responded
bedrockProvider  "platform" (UAPI pays) or "user" (user's AWS keys)
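To continue a conversation, feed the conversationId from the __META__ line back into your next request body. A minimal sketch (the parse_meta helper and the follow-up payload structure are illustrative, not part of any SDK):

```python
import json

META_PREFIX = "__META__"

def parse_meta(line: str):
    """Return the __META__ JSON payload from a streamed line, or None."""
    if line.startswith(META_PREFIX):
        return json.loads(line[len(META_PREFIX):])
    return None

meta = parse_meta(
    '__META__{"conversationId":"conv-abc123","agentId":"agent-xxx","bedrockProvider":"platform"}'
)

# Reuse the conversationId in the next request body to continue the thread:
follow_up = {
    "prompt": "Tell me more",
    "conversationId": meta["conversationId"],
}
```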

Metrics Fields (__METRICS__)

Field        Description
totalCycles  Number of reasoning cycles (tool calls + responses)
totalTokens  Total input + output tokens used
toolsUsed    Array of tool names the agent called

Parsing Streamed Responses

Python

python
import json
import requests

response = requests.post(
    "https://stream.api.universalapi.co/agent/AGENT_ID/chat",
    headers={
        "Authorization": "Bearer YOUR_TOKEN",
        "Content-Type": "application/json",
    },
    json={"prompt": "Hello!"},
    stream=True,
)
response.raise_for_status()

text = ""
meta = None
metrics = None

for line in response.iter_lines(decode_unicode=True):
    if line.startswith("__META__"):
        meta = json.loads(line[len("__META__"):])
    elif line.startswith("__METRICS__"):
        metrics = json.loads(line[len("__METRICS__"):])
    else:
        text += line + "\n"
        print(line)  # Print each line as it streams

if meta:
    print(f"\nConversation ID: {meta['conversationId']}")
if metrics:
    print(f"Tokens used: {metrics['totalTokens']}")

JavaScript

javascript
const response = await fetch(
  "https://stream.api.universalapi.co/agent/AGENT_ID/chat",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer YOUR_TOKEN",
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ prompt: "Hello!" })
  }
);

// Read the body incrementally instead of buffering the whole response,
// so the agent's text can be displayed as it streams.
const reader = response.body.getReader();
const decoder = new TextDecoder();

let buffer = "";
let agentText = "";
let meta = null;
let metrics = null;

const handleLine = (line) => {
  if (line.startsWith("__META__")) {
    meta = JSON.parse(line.slice("__META__".length));
  } else if (line.startsWith("__METRICS__")) {
    metrics = JSON.parse(line.slice("__METRICS__".length));
  } else {
    agentText += line + "\n";
  }
};

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  let newline;
  while ((newline = buffer.indexOf("\n")) !== -1) {
    handleLine(buffer.slice(0, newline));
    buffer = buffer.slice(newline + 1);
  }
}
if (buffer) handleLine(buffer); // Flush a trailing line with no newline

Tool Call Display

When an agent uses tools, the response includes tool call information inline:

I'll search for that information...

[Calling tool: google_search with {"query": "latest AI news"}]

Based on my search results, here are the latest developments in AI...

__META__{"conversationId":"conv-xxx",...}
__METRICS__{"totalCycles":2,"totalTokens":1500,"toolsUsed":["google_search"]}
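If you want to render tool calls differently from ordinary text, you can match these marker lines while parsing. A sketch that assumes the bracketed format shown above is stable (the parse_tool_call helper and pattern are illustrative, not part of the API):

```python
import json
import re

# Matches the inline tool-call marker, e.g.
# [Calling tool: google_search with {"query": "latest AI news"}]
TOOL_CALL = re.compile(r'^\[Calling tool: (\w+) with (\{.*\})\]$')

def parse_tool_call(line: str):
    """Return {"tool", "args"} for a tool-call line, or None for ordinary text."""
    m = TOOL_CALL.match(line)
    if m is None:
        return None
    return {"tool": m.group(1), "args": json.loads(m.group(2))}
```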

Error Handling

If an error occurs during streaming, the stream includes an __ERROR__ line with a JSON payload:

__ERROR__{"error":"Insufficient credits","code":"INSUFFICIENT_CREDITS"}

Common errors:

  • INSUFFICIENT_CREDITS — User doesn't have enough credits
  • AGENT_NOT_FOUND — Invalid agent ID
  • AGENT_EXECUTION_ERROR — Error in agent source code
  • TIMEOUT — Agent exceeded the execution time limit
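A parsing loop can surface these as exceptions by checking each line for the __ERROR__ prefix. A minimal sketch (StreamError and check_error are illustrative names, not part of any SDK):

```python
import json

ERROR_PREFIX = "__ERROR__"

class StreamError(Exception):
    """Raised when the stream carries an __ERROR__ payload."""
    def __init__(self, code, message):
        super().__init__(message)
        self.code = code

def check_error(line: str) -> None:
    """Raise StreamError if the streamed line is an __ERROR__ line; else no-op."""
    if line.startswith(ERROR_PREFIX):
        payload = json.loads(line[len(ERROR_PREFIX):])
        raise StreamError(payload.get("code"), payload.get("error"))
```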

Universal API - The agentic entry point to the universe of APIs