Architecture overview
This page is the mental model the rest of the docs assume. Read it once and the API reference makes sense.
The boxes
┌──────────────────────────────────────────────────────────────────┐
│  CLIENT                                                          │
│    Dashboard (web) · Telegram bot · Slack · API consumer         │
│    Auth: session cookie (web) or rc_… token (API)                │
└──────────┬───────────────────────────────────────────────────────┘
           │ HTTPS / WSS
           ▼
┌──────────────────────────────────────────────────────────────────┐
│  VONZIO PLATFORM                                                 │
│    REST endpoints (/v1/*)                                        │
│    WebSocket (/v1/stream)                                        │
│    MCP endpoints (/mcp/memory · /mcp/notify · /mcp/gmail ·       │
│                   /mcp/teller · /mcp/platform)                   │
│    Orchestrator: schedules tasks, owns sessions, mints MCP tokens│
└──────────┬─────────────────────────────────────┬─────────────────┘
           │ container lifecycle                 │ outbound HTTPS
           ▼                                     ▼
┌────────────────────────────────────────┐      ┌──────────────────┐
│  AGENT CONTAINER (per workspace)       │      │  Third parties   │
│    Claude Agent SDK                    │      │    Anthropic API │
│    Tools: Bash, Read, Write, custom    │      │    Telegram API  │
│    MCP clients → platform's MCP        │      │    Google OAuth  │
│    Secrets: injected as env vars       │      │    Teller (mTLS) │
└────────────────────────────────────────┘      └──────────────────┘
The agent never holds your API keys. It calls the platform’s MCP servers over HTTP using a one-shot bearer token the orchestrator mints per task. The platform translates those MCP calls into outbound third-party requests on your behalf.
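To make the agent-to-platform hop concrete, here is a minimal sketch of what an MCP call from inside the container might look like. It only builds the HTTP request; it does not send it. The `tools/call` JSON-RPC method comes from the MCP specification, but the environment variable names (`MCP_BASE_URL`, `MCP_TOKEN_<SERVER>`), the fallback host, and the `search` tool name are assumptions made for illustration, not the platform's documented config. A real MCP client also performs an initialization handshake that is omitted here.

```python
import json
import os
import urllib.request


def build_mcp_request(server: str, tool: str, arguments: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC tools/call request for one MCP server."""
    # Hypothetical env var names; the real injected MCP config may look different.
    base_url = os.environ.get("MCP_BASE_URL", "https://platform.example")
    token = os.environ.get(f"MCP_TOKEN_{server.upper()}", "")

    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        url=f"{base_url}/mcp/{server}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The per-task bearer token is the only credential the agent presents.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Calling `build_mcp_request("memory", "search", {"query": "last invoice"})` yields a POST aimed at `/mcp/memory`. The point of the sketch is that no third-party key appears anywhere in the request: the platform is the one holding those.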
A task, end to end
- Submit. A task arrives as one of: POST /v1/tasks (REST), a task.submit WebSocket message, an inbound Telegram message, or a scheduled playbook firing internally.
- Validate. Caller key authenticated, profile ownership confirmed, scope checked for any granted integrations.
- Resolve profile. Profile config (model, tools, MCP servers, skills, subagents, claude_md) loaded from Postgres.
- Mint MCP tokens. The orchestrator generates a short-lived bearer token per MCP server (memory, notify, gmail, teller, platform), bound to the task’s session and user. These tokens never leave the platform: they’re injected into the container as part of the MCP server config.
- Spawn container. Docker starts a fresh container from the agent image with the profile’s environment: Claude API key, secrets vault contents (filtered by scope), tool files, skill files, MCP server URLs + tokens, claude_md system prompt.
- Stream. The agent runs. Token-by-token output is relayed back to the WebSocket session and persisted to the event log. Tool calls fire over Docker’s internal network; MCP calls hit the platform’s /mcp/* endpoints, which verify the bearer token against the in-memory map.
- Finalize. The turn.done event fires. If the session is “ephemeral” (no claimed thread, no dashboard handle) it gets reaped; if it’s “persistent” (workspace flag or thread-claimed) the container is paused for resume.
- Deliver. Final text is broadcast over WebSocket (dashboard) and/or pushed to whichever channel the task was bound to (Telegram session, Slack thread, etc.).
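The token-related steps above (mint per MCP server, verify against an in-memory map, discard when the task ends) can be sketched as follows. This is an illustrative model, not the platform's code: the class name, method names, and the idea of keying revocation by a `task_id` are all assumptions. Only the five server names and the binding to session and user come from this page.

```python
import secrets
from dataclasses import dataclass
from typing import Optional

MCP_SERVERS = ("memory", "notify", "gmail", "teller", "platform")


@dataclass(frozen=True)
class TokenBinding:
    server: str
    session_id: str
    user_id: str


class McpTokenRegistry:
    """Illustrative in-memory token map; all names here are hypothetical."""

    def __init__(self) -> None:
        self._bindings: dict[str, TokenBinding] = {}
        self._by_task: dict[str, list[str]] = {}

    def mint_for_task(self, task_id: str, session_id: str, user_id: str) -> dict[str, str]:
        """Create one random bearer token per MCP server for this task."""
        tokens = {}
        for server in MCP_SERVERS:
            token = secrets.token_urlsafe(32)
            self._bindings[token] = TokenBinding(server, session_id, user_id)
            tokens[server] = token
        self._by_task[task_id] = list(tokens.values())
        return tokens

    def verify(self, server: str, token: str) -> Optional[TokenBinding]:
        """Return the binding if the token is live and was minted for this server."""
        binding = self._bindings.get(token)
        if binding is None or binding.server != server:
            return None
        return binding

    def revoke_task(self, task_id: str) -> None:
        """Drop every token minted for a finished task."""
        for token in self._by_task.pop(task_id, []):
            self._bindings.pop(token, None)
```

In this model a token minted for `memory` is rejected at `/mcp/gmail`, and every token stops verifying once `revoke_task` runs, which is one way to get the per-task lifetime described on this page.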
Things that aren’t obvious
- Sessions ≠ workspaces. They share the same session_id, but “session” is the conversation and “workspace” is the persistent record + container handle. We use them interchangeably in code; the docs use “workspace” for the dashboard surface and “session” for the runtime/API surface.
- Profiles are templates, workspaces are instances. One profile, many workspaces.
- MCP tokens are per-task. They die when the task finishes. An agent can’t reuse them or leak them across tasks.
- Integrations are scoped to the user, then optionally to specific profiles. See Integrations & scope.
- Playbooks have their own workspace ids prefixed pb-. Hidden from the chat list unless thread-claimed (see Telegram integration).
- Memory is namespaced per (user, profile). Two profiles owned by the same user have separate memory.
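Two of the points above lend themselves to a small model: memory keyed by (user, profile), and playbook workspaces hidden from the chat list unless thread-claimed. The sketch below is a toy under stated assumptions; `MemoryStore`, `Workspace`, `thread_claimed`, and `visible_in_chat_list` are hypothetical names, and the real storage layer is not described on this page.

```python
from dataclasses import dataclass
from typing import Optional


class MemoryStore:
    """Toy key-value memory keyed by (user_id, profile_id); names are hypothetical."""

    def __init__(self) -> None:
        self._spaces: dict[tuple[str, str], dict[str, str]] = {}

    def put(self, user_id: str, profile_id: str, key: str, value: str) -> None:
        # Each (user, profile) pair gets its own dictionary, so keys never collide.
        self._spaces.setdefault((user_id, profile_id), {})[key] = value

    def get(self, user_id: str, profile_id: str, key: str) -> Optional[str]:
        return self._spaces.get((user_id, profile_id), {}).get(key)


@dataclass
class Workspace:
    id: str
    thread_claimed: bool = False


def visible_in_chat_list(ws: Workspace) -> bool:
    """Playbook workspaces (ids starting with pb-) stay hidden unless thread-claimed."""
    is_playbook = ws.id.startswith("pb-")
    return (not is_playbook) or ws.thread_claimed
```

With this model, a value written under one profile is invisible to another profile of the same user, and a `pb-` workspace only shows up in the chat list once it has been thread-claimed.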