Architecture overview

This page is the mental model the rest of the docs assume. Read it once and the API reference makes sense.

┌──────────────────────────────────────────────────────────────────┐
│  CLIENT                                                          │
│  Dashboard (web) · Telegram bot · Slack · API consumer           │
│  Auth: session cookie (web) or rc_… token (API)                  │
└──────────┬───────────────────────────────────────────────────────┘
           │ HTTPS / WSS
┌──────────────────────────────────────────────────────────────────┐
│  VONZIO PLATFORM                                                 │
│  REST endpoints (/v1/*)                                          │
│  WebSocket (/v1/stream)                                          │
│  MCP endpoints (/mcp/memory · /mcp/notify · /mcp/gmail ·         │
│                 /mcp/teller · /mcp/platform)                     │
│  Orchestrator: schedules tasks, owns sessions, mints MCP tokens  │
└──────────┬─────────────────────────────────────┬─────────────────┘
           │ container lifecycle                 │ outbound HTTPS
           ▼                                     ▼
┌────────────────────────────────────────┐ ┌──────────────────┐
│  AGENT CONTAINER (per workspace)       │ │  Third parties   │
│  Claude Agent SDK                      │ │  Anthropic API   │
│  Tools: Bash, Read, Write, custom      │ │  Telegram API    │
│  MCP clients → platform's MCP          │ │  Google OAuth    │
│  Secrets: injected as env vars         │ │  Teller (mTLS)   │
└────────────────────────────────────────┘ └──────────────────┘

The agent never holds your third-party integration credentials. It calls the platform’s MCP servers over HTTP using short-lived bearer tokens that the orchestrator mints per task, one per MCP server. The platform translates those MCP calls into outbound third-party requests on your behalf.
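As a rough illustration of what such a call could look like from inside the container, the sketch below builds (but does not send) an HTTP request to one of the /mcp/* endpoints. The base URL, the token value, and the tool name `send_message` are placeholders invented for this example; only the /mcp/notify path and the bearer-token scheme come from this page. The JSON-RPC `tools/call` body follows the general Model Context Protocol convention and is an assumption about what the platform accepts.

```python
import json
import urllib.request

# Hypothetical base URL; the real host is not specified on this page.
PLATFORM_BASE_URL = "https://platform.example"


def build_mcp_request(server: str, token: str, tool: str, arguments: dict) -> urllib.request.Request:
    """Build (but do not send) an MCP tool call addressed to /mcp/<server>."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return urllib.request.Request(
        url=f"{PLATFORM_BASE_URL}/mcp/{server}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The per-task token is the only credential the agent presents.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder token and tool name, for illustration only.
req = build_mcp_request("notify", "example-task-token", "send_message", {"text": "done"})
```

Note that no Telegram, Google, or Teller credential appears anywhere in the request; in this model the platform attaches those on the outbound side.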

  1. Submit. A task arrives in one of four ways: POST /v1/tasks (REST), a task.submit WebSocket message, an inbound Telegram message, or a scheduled playbook firing internally.
  2. Validate. Caller key authenticated, profile ownership confirmed, scope checked for any granted integrations.
  3. Resolve profile. Profile config (model, tools, MCP servers, skills, subagents, claude_md) loaded from Postgres.
  4. Mint MCP tokens. The orchestrator generates a short-lived bearer token per MCP server (memory, notify, gmail, teller, platform), bound to the task’s session and user. These tokens never leave the platform; they reach the agent only as part of the MCP server config injected into its container.
  5. Spawn container. Docker starts a fresh container from the agent image with the profile’s environment: Claude API key, secrets vault contents (filtered by scope), tool files, skill files, MCP server URLs + tokens, claude_md system prompt.
  6. Stream. The agent runs. Token-by-token output is relayed back to the WebSocket session and persisted to the event log. Tool calls fire over Docker’s internal network; MCP calls hit the platform’s /mcp/* endpoints, which verify the bearer token against an in-memory token map.
  7. Finalize. A turn.done event fires. If the session is “ephemeral” (no claimed thread, no dashboard handle), it is reaped; if it is “persistent” (workspace flag or thread-claimed), the container is paused so it can be resumed.
  8. Deliver. Final text is broadcast over WebSocket (dashboard) and/or pushed to whichever channel the task was bound to (Telegram session, Slack thread, etc.).
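
The token handling in steps 4, 6, and 7 can be sketched as a small in-memory registry. This is a minimal illustration under stated assumptions, not the platform's implementation: the class and method names, the 15-minute TTL, and the choice to revoke on task completion are all hypothetical. Only the five server names, the binding to session and user, and the in-memory lookup come from the steps above.

```python
from __future__ import annotations

import secrets
import time
from dataclasses import dataclass
from typing import Optional

MCP_SERVERS = ("memory", "notify", "gmail", "teller", "platform")


@dataclass(frozen=True)
class TokenRecord:
    server: str
    session_id: str
    user_id: str
    task_id: str
    expires_at: float


class TokenRegistry:
    """Hypothetical in-memory map from bearer token to the task it was minted for."""

    def __init__(self, ttl_seconds: float = 900.0) -> None:  # TTL value is an assumption
        self._ttl = ttl_seconds
        self._tokens: dict[str, TokenRecord] = {}

    def mint_for_task(self, task_id: str, session_id: str, user_id: str) -> dict[str, str]:
        """Step 4: mint one random token per MCP server, bound to session and user."""
        expires_at = time.monotonic() + self._ttl
        minted: dict[str, str] = {}
        for server in MCP_SERVERS:
            token = secrets.token_urlsafe(32)
            self._tokens[token] = TokenRecord(server, session_id, user_id, task_id, expires_at)
            minted[server] = token
        return minted

    def verify(self, token: str, server: str) -> Optional[TokenRecord]:
        """Step 6: accept a token only on the server it was minted for, before expiry."""
        record = self._tokens.get(token)
        if record is None or record.server != server:
            return None
        if time.monotonic() >= record.expires_at:
            del self._tokens[token]
            return None
        return record

    def revoke_task(self, task_id: str) -> None:
        """Step 7: drop every token that belongs to a finished task."""
        self._tokens = {t: r for t, r in self._tokens.items() if r.task_id != task_id}


registry = TokenRegistry()
tokens = registry.mint_for_task(task_id="task-1", session_id="sess-1", user_id="user-1")
```

Binding each token to a single server means a token lifted from the gmail config is useless against /mcp/teller, and revoking by task id keeps tokens from outliving the task they were minted for.
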
  • Sessions ≠ workspaces, even though they share the same session_id. “Session” is the conversation; “workspace” is the persistent record + container handle. The code uses the two terms interchangeably; the docs use “workspace” for the dashboard surface and “session” for the runtime/API surface.
  • Profiles are templates, workspaces are instances. One profile, many workspaces.
  • MCP tokens are per-task. They die when the task finishes. An agent can’t reuse them or leak them across tasks.
  • Integrations are scoped to the user, then optionally to specific profiles. See Integrations & scope.
  • Playbooks have their own workspace IDs, prefixed with pb-. They are hidden from the chat list unless thread-claimed (see Telegram integration).
  • Memory is namespaced per (user, profile). Two profiles owned by the same user have separate memory.
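
Two of the rules above, memory namespacing and playbook visibility, are easy to picture in code. The sketch below is illustrative only: `MemoryStore`, `is_playbook_workspace`, and `visible_in_chat_list` are hypothetical names, and a plain dictionary stands in for whatever storage the platform actually uses. Only the (user, profile) key and the pb- prefix rule come from the bullets above.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Hypothetical store keyed by (user_id, profile_id), mirroring the namespacing rule."""

    _data: dict = field(default_factory=dict)

    def put(self, user_id: str, profile_id: str, key: str, value: str) -> None:
        self._data.setdefault((user_id, profile_id), {})[key] = value

    def get(self, user_id: str, profile_id: str, key: str):
        # A lookup never crosses namespaces, even for the same user.
        return self._data.get((user_id, profile_id), {}).get(key)


def is_playbook_workspace(workspace_id: str) -> bool:
    """Playbook workspaces are identified by the pb- prefix."""
    return workspace_id.startswith("pb-")


def visible_in_chat_list(workspace_id: str, thread_claimed: bool) -> bool:
    """Playbook workspaces stay hidden unless a thread has claimed them."""
    return thread_claimed or not is_playbook_workspace(workspace_id)


memory = MemoryStore()
memory.put("user-1", "profile-a", "timezone", "UTC")
```

With this layout, `memory.get("user-1", "profile-b", "timezone")` returns `None`: the second profile sees nothing the first one stored.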