Architecture overview

This page is the mental model the rest of the docs assume. Read it once and the API reference makes sense.

The boxes

┌──────────────────────────────────────────────────────────────────┐
│  CLIENT                                                          │
│    Dashboard (web) · Telegram bot · Slack · API consumer         │
│    Auth: session cookie (web) or rc_… token (API)                │
└──────────┬──────────────────────────────────────────────────────┘
           │ HTTPS  /  WSS
           ▼
┌──────────────────────────────────────────────────────────────────┐
│  VONZIO PLATFORM                                                 │
│    REST endpoints (/v1/*)                                        │
│    WebSocket (/v1/stream)                                        │
│    MCP endpoints (/mcp/memory · /mcp/notify · /mcp/gmail ·       │
│                   /mcp/teller · /mcp/platform)                   │
│    Orchestrator: schedules tasks, owns sessions, mints MCP tokens│
└──────────┬─────────────────────────────────────┬────────────────┘
           │ container lifecycle                 │ outbound HTTPS
           ▼                                     ▼
┌────────────────────────────────────────┐   ┌──────────────────┐
│  AGENT CONTAINER (per workspace)       │   │ Third parties     │
│    Claude Agent SDK                    │   │  Anthropic API    │
│    Tools: Bash, Read, Write, custom    │   │  Telegram API     │
│    MCP clients → platform's MCP        │   │  Google OAuth     │
│    Secrets: injected as env vars       │   │  Teller (mTLS)    │
└────────────────────────────────────────┘   └──────────────────┘

The agent never holds the credentials for your integrations. It calls the platform’s MCP servers over HTTP using a short-lived, per-task bearer token that the orchestrator mints for each MCP server. The platform translates those MCP calls into outbound third-party requests on your behalf.
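As a rough illustration of the call described above, the sketch below builds (but does not send) the kind of HTTP request an MCP client inside the container might issue. It assumes the platform’s MCP endpoints accept the JSON-RPC 2.0 `tools/call` shape from the MCP specification. The base URL, the tool name, and the function name `build_mcp_request` are hypothetical placeholders, not part of the documented API.

```python
import json
import urllib.request

# Hypothetical base URL. In practice the URL and token would come from the
# MCP server config that the orchestrator injects into the container.
PLATFORM_BASE_URL = "https://platform.example.invalid"


def build_mcp_request(server: str, token: str, tool: str, arguments: dict) -> urllib.request.Request:
    """Build (without sending) an HTTP request for a single MCP tool call."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",  # assumed: standard MCP JSON-RPC method
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        url=f"{PLATFORM_BASE_URL}/mcp/{server}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The per-task bearer token is the only credential the agent presents.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# "search" is a made-up tool name used only for illustration.
request = build_mcp_request("memory", "example-token", "search", {"query": "invoices"})
```

The point of the design is visible in the headers: the request carries the task-scoped token and nothing else, so any third-party credential stays on the platform side of the `/mcp/*` boundary.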

A task, end to end

Submit. A task arrives in one of four ways: POST /v1/tasks (REST), a task.submit WebSocket message, an inbound Telegram message, or a scheduled playbook firing internally.
Validate. The caller’s key is authenticated, profile ownership is confirmed, and scope is checked for any granted integrations.
Resolve profile. The profile config (model, tools, MCP servers, skills, subagents, claude_md) is loaded from Postgres.
Mint MCP tokens. The orchestrator generates a short-lived bearer token per MCP server (memory, notify, gmail, teller, platform), bound to the task’s session and user. These tokens never leave the platform: they’re injected into the container as part of the MCP server config.
Spawn container. Docker starts a fresh container from the agent image with the profile’s environment: Claude API key, secrets vault contents (filtered by scope), tool files, skill files, MCP server URLs + tokens, and the claude_md system prompt.
Stream. The agent runs. Token-by-token output is relayed back to the WebSocket session and persisted to the event log. Tool calls fire over Docker’s internal network; MCP calls hit the platform’s /mcp/* endpoints, which verify the bearer token against an in-memory map of minted tokens.
Finalize. A turn.done event fires. An “ephemeral” session (no claimed thread, no dashboard handle) is reaped; a “persistent” one (workspace flag or thread-claimed) has its container paused so it can be resumed.
Deliver. Final text is broadcast over WebSocket (dashboard) and/or pushed to whichever channel the task was bound to (Telegram session, Slack thread, etc.).
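The “Mint MCP tokens” and “Stream” steps above imply a small token lifecycle: mint one token per MCP server, verify it on each /mcp/* call, and drop it when the task ends. The sketch below shows one way that could work with a plain in-memory dictionary. It is not the platform’s implementation: the class and method names, the 15-minute TTL, and the use of a task id as the revocation key are all assumptions made for illustration.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Dict, Optional

# Server names as listed in the "Mint MCP tokens" step.
MCP_SERVERS = ("memory", "notify", "gmail", "teller", "platform")


@dataclass(frozen=True)
class TokenBinding:
    """What a token is bound to (field names are hypothetical)."""
    task_id: str
    session_id: str
    user_id: str
    server: str
    expires_at: float


class TokenRegistry:
    """In-memory map from bearer token to the task it was minted for."""

    def __init__(self, ttl_seconds: float = 900.0) -> None:
        # Assumed TTL; the docs only say the tokens are short-lived.
        self._ttl = ttl_seconds
        self._tokens: Dict[str, TokenBinding] = {}

    def mint_for_task(self, task_id: str, session_id: str, user_id: str) -> Dict[str, str]:
        """Mint one random token per MCP server and return {server: token}."""
        expires_at = time.monotonic() + self._ttl
        minted: Dict[str, str] = {}
        for server in MCP_SERVERS:
            token = secrets.token_urlsafe(32)
            self._tokens[token] = TokenBinding(task_id, session_id, user_id, server, expires_at)
            minted[server] = token
        return minted

    def verify(self, token: str, server: str) -> Optional[TokenBinding]:
        """Return the binding if the token is known, unexpired, and for this server."""
        binding = self._tokens.get(token)
        if binding is None or binding.server != server:
            return None
        if time.monotonic() >= binding.expires_at:
            del self._tokens[token]
            return None
        return binding

    def revoke_task(self, task_id: str) -> None:
        """Drop every token minted for a finished task."""
        self._tokens = {t: b for t, b in self._tokens.items() if b.task_id != task_id}
```

In this sketch a token minted for gmail is rejected at /mcp/teller, and calling `revoke_task` at the Finalize step makes every token from that task unusable, which matches the per-task lifetime the docs describe.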

Things that aren’t obvious

Sessions ≠ workspaces, even though they share the same session_id. “Session” is the conversation; “workspace” is the persistent record plus the container handle. The code uses the two terms interchangeably; the docs use “workspace” for the dashboard surface and “session” for the runtime/API surface.
Profiles are templates, workspaces are instances. One profile, many workspaces.
MCP tokens are per-task. They die when the task finishes. An agent can’t reuse them or leak them across tasks.
Integrations are scoped to the user, then optionally to specific profiles. See Integrations & scope.
Playbooks have their own workspace IDs, prefixed pb-. They are hidden from the chat list unless thread-claimed (see Telegram integration).
Memory is namespaced per (user, profile). Two profiles owned by the same user have separate memory.
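Two of the rules above are easy to picture as code: memory keyed by the (user, profile) pair, and playbook workspaces identified by the pb- prefix. The sketch below is a toy model of those rules only. `MemoryStore`, `is_playbook_workspace`, and `visible_in_chat_list` are hypothetical names, and the real storage backend is not described in this page.

```python
from collections import defaultdict
from typing import Any, Dict, Tuple


class MemoryStore:
    """Toy key-value memory, namespaced per (user_id, profile_id)."""

    def __init__(self) -> None:
        self._data: Dict[Tuple[str, str], Dict[str, Any]] = defaultdict(dict)

    def put(self, user_id: str, profile_id: str, key: str, value: Any) -> None:
        self._data[(user_id, profile_id)][key] = value

    def get(self, user_id: str, profile_id: str, key: str, default: Any = None) -> Any:
        # .get() on the outer dict avoids creating empty namespaces on reads.
        return self._data.get((user_id, profile_id), {}).get(key, default)


def is_playbook_workspace(workspace_id: str) -> bool:
    """Playbook workspace ids carry the pb- prefix."""
    return workspace_id.startswith("pb-")


def visible_in_chat_list(workspace_id: str, thread_claimed: bool) -> bool:
    """Playbook workspaces stay hidden unless a thread has claimed them."""
    return thread_claimed or not is_playbook_workspace(workspace_id)


store = MemoryStore()
store.put("user-1", "profile-a", "timezone", "UTC")
# Same user, different profile: the value is not visible.
other_profile_value = store.get("user-1", "profile-b", "timezone")
```

Here `other_profile_value` is `None`, because the second profile has its own namespace even though the user is the same.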