
Streaming Responses

Agent responses are streamed in real time via a dedicated streaming API.

Streaming Endpoint

POST https://stream.api.universalapi.co/agent/{agentId}/chat

INFO

Note the different domain: stream.api.universalapi.co (not api.universalapi.co). This is a dedicated API Gateway optimized for streaming.

Request Format

bash
curl -s -X POST https://stream.api.universalapi.co/agent/AGENT_ID/chat \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Explain quantum computing in simple terms",
    "conversationId": "conv-xxx"
  }'
Field           Type    Required  Description
prompt          string  Yes       The user's message
conversationId  string  No        Include to continue an existing conversation

Response Format

The response streams as plain text, followed by two metadata lines:

[Agent's streamed text response here...]

__META__{"conversationId":"conv-abc123","agentId":"agent-xxx","bedrockProvider":"platform"}
__METRICS__{"totalCycles":2,"totalTokens":1500,"toolsUsed":["google_search"]}

Metadata Fields (__META__)

Field            Description
conversationId   Use this to continue the conversation
agentId          The agent that responded
bedrockProvider  "platform" (UAPI pays) or "user" (user's AWS keys)
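To continue a conversation, feed the conversationId from the __META__ line back into your next request body. A minimal sketch (the parse_meta helper and the follow-up payload structure are illustrative, not part of any SDK):

```python
import json

META_PREFIX = "__META__"

def parse_meta(line: str):
    """Return the __META__ JSON payload from a streamed line, or None."""
    if line.startswith(META_PREFIX):
        return json.loads(line[len(META_PREFIX):])
    return None

meta = parse_meta(
    '__META__{"conversationId":"conv-abc123","agentId":"agent-xxx","bedrockProvider":"platform"}'
)

# Reuse the conversationId in the next request body to continue the thread:
follow_up = {
    "prompt": "Tell me more",
    "conversationId": meta["conversationId"],
}
```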

Metrics Fields (__METRICS__)

Field        Description
totalCycles  Number of reasoning cycles (tool calls + responses)
totalTokens  Total input + output tokens used
toolsUsed    Array of tool names the agent called

Parsing Streamed Responses

Python

python
import json
import requests

response = requests.post(
    "https://stream.api.universalapi.co/agent/AGENT_ID/chat",
    headers={
        "Authorization": "Bearer YOUR_TOKEN",
        "Content-Type": "application/json",
    },
    json={"prompt": "Hello!"},
    stream=True,
)
response.raise_for_status()

text = ""
meta = None
metrics = None

for line in response.iter_lines(decode_unicode=True):
    if line.startswith("__META__"):
        meta = json.loads(line[len("__META__"):])
    elif line.startswith("__METRICS__"):
        metrics = json.loads(line[len("__METRICS__"):])
    else:
        text += line + "\n"
        print(line)  # Print each line as it streams

if meta:
    print(f"\nConversation ID: {meta['conversationId']}")
if metrics:
    print(f"Tokens used: {metrics['totalTokens']}")

JavaScript

javascript
const response = await fetch(
  "https://stream.api.universalapi.co/agent/AGENT_ID/chat",
  {
    method: "POST",
    headers: {
      "Authorization": "Bearer YOUR_TOKEN",
      "Content-Type": "application/json"
    },
    body: JSON.stringify({ prompt: "Hello!" })
  }
);

// Read the body incrementally instead of buffering the whole response,
// so the agent's text can be displayed as it streams.
const reader = response.body.getReader();
const decoder = new TextDecoder();

let buffer = "";
let agentText = "";
let meta = null;
let metrics = null;

const handleLine = (line) => {
  if (line.startsWith("__META__")) {
    meta = JSON.parse(line.slice("__META__".length));
  } else if (line.startsWith("__METRICS__")) {
    metrics = JSON.parse(line.slice("__METRICS__".length));
  } else {
    agentText += line + "\n";
  }
};

while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  buffer += decoder.decode(value, { stream: true });
  let newline;
  while ((newline = buffer.indexOf("\n")) !== -1) {
    handleLine(buffer.slice(0, newline));
    buffer = buffer.slice(newline + 1);
  }
}
if (buffer) handleLine(buffer); // Flush a trailing line with no newline

Tool Call Display

When an agent uses tools, the response includes tool call information inline:

I'll search for that information...

[Calling tool: google_search with {"query": "latest AI news"}]

Based on my search results, here are the latest developments in AI...

__META__{"conversationId":"conv-xxx",...}
__METRICS__{"totalCycles":2,"totalTokens":1500,"toolsUsed":["google_search"]}
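If you want to render tool calls differently from ordinary text, you can match these marker lines while parsing. A sketch that assumes the bracketed format shown above is stable (the parse_tool_call helper and pattern are illustrative, not part of the API):

```python
import json
import re

# Matches the inline tool-call marker, e.g.
# [Calling tool: google_search with {"query": "latest AI news"}]
TOOL_CALL = re.compile(r'^\[Calling tool: (\w+) with (\{.*\})\]$')

def parse_tool_call(line: str):
    """Return {"tool", "args"} for a tool-call line, or None for ordinary text."""
    m = TOOL_CALL.match(line)
    if m is None:
        return None
    return {"tool": m.group(1), "args": json.loads(m.group(2))}
```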

Error Handling

If an error occurs during streaming, the stream includes an __ERROR__ line with a JSON payload:

__ERROR__{"error":"Insufficient credits","code":"INSUFFICIENT_CREDITS"}

Common errors:

  • INSUFFICIENT_CREDITS — User doesn't have enough credits
  • AGENT_NOT_FOUND — Invalid agent ID
  • AGENT_EXECUTION_ERROR — Error in agent source code
  • TIMEOUT — Agent exceeded the execution time limit
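A parsing loop can surface these as exceptions by checking each line for the __ERROR__ prefix. A minimal sketch (StreamError and check_error are illustrative names, not part of any SDK):

```python
import json

ERROR_PREFIX = "__ERROR__"

class StreamError(Exception):
    """Raised when the stream carries an __ERROR__ payload."""
    def __init__(self, code, message):
        super().__init__(message)
        self.code = code

def check_error(line: str) -> None:
    """Raise StreamError if the streamed line is an __ERROR__ line; else no-op."""
    if line.startswith(ERROR_PREFIX):
        payload = json.loads(line[len(ERROR_PREFIX):])
        raise StreamError(payload.get("code"), payload.get("error"))
```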

Universal API - The agentic entry point to the universe of APIs