Skip to main content

Architecture

If you're building on top of Osaurus — writing plugins, scripts, or integrations — this page is the orientation. It maps the user-facing surfaces (chat overlay, management window, HTTP API) to the components underneath, and points to the deeper pages for each layer.

The harness

Osaurus presents three entry points:

  • The chat overlay (⌘;) — the daily driver
  • The Management window (⌘ ⇧ M) — settings, agents, models, plugins, tools, memory, themes, automation
  • The HTTP API (on :1337) — OpenAI / Anthropic / Open Responses / Ollama / MCP

All three funnel into the same agent loop, which talks to your memory, skills/methods, and the automation surface (schedules, watchers). Inference goes out to local MLX, Apple Foundation, or any cloud provider you've connected. Tools span native plugins (v1/v2 ABI), remote MCP servers, and the Linux sandbox. Underneath everything: identity (signed requests, access keys), encrypted storage (SQLCipher), and relay (public tunnels).

How the pieces fit together

flowchart TB
User[You]
Chat[Chat Overlay - ⌘;]
Mgmt[Management Window - ⌘ ⇧ M]
HTTP[HTTP API on :1337]

User --> Chat
User --> Mgmt
User --> HTTP

subgraph harness [The Harness]
Loop[Agent Loop]
Mem[Memory]
Skills[Skills and Methods]
Auto[Schedules and Watchers]
end

Chat --> Loop
Mgmt --> Loop
HTTP --> Loop

Loop --> Mem
Loop --> Skills
Auto --> Loop

subgraph providers [Inference]
MLX[MLX Local]
Foundation[Apple Foundation]
Cloud[Cloud Providers]
end

Loop --> MLX
Loop --> Foundation
Loop --> Cloud

subgraph plugins [Tools]
Native[Native Plugins]
MCP[Remote MCP]
Sandbox[Linux Sandbox]
end

Loop --> Native
Loop --> MCP
Loop --> Sandbox

subgraph foundationLayer [Foundations]
Identity[Identity and Access]
Storage[Encrypted Storage]
Relay[Relay Tunnels]
end

harness --> foundationLayer
plugins --> foundationLayer

Layers

LayerWhat it doesReference
Entry pointsChat overlay (⌘;), Management window (⌘ ⇧ M), HTTP API on :1337Chat, HTTP API, CLI
HarnessTasks, Memory, Skills/Methods, Schedules/Watchers — the continuity layerTasks, Memory, Skills, Methods, Schedules, Watchers
InferenceMLX local models, Apple Foundation Models, cloud providers — all behind the same pickerModels, Apple Intelligence, Inference Runtime
Tools20+ native plugins (Mail, Calendar, Browser, Git, …), remote MCP aggregation, the Linux SandboxTools & Plugins, Plugin Authoring, Sandbox Internals, Remote MCP Providers
FoundationsIdentity (signed requests, osk-v1 keys), encrypted storage (SQLCipher), Public Links (public tunnels)Identity Cryptography, Storage & Encryption, Public Links

Entry points

Chat overlay

A glass-style overlay summoned with ⌘; from anywhere on macOS. Holds zero, one, or many chat windows. Each window has its own active agent, working folder / Sandbox state, model selection, and conversation history. Multi-window mode lets you run several agents side by side.

The overlay is also where voice input lives: the microphone in the input bar, plus VAD wake-word activation and global Transcription Mode.

Management window

⌘ ⇧ M. Tabs for everything that isn't a single chat: Models, Providers, Agents, Plugins, Sandbox, Tools, Skills, Commands, Memory, Schedules, Watchers, Voice, Themes, Insights, Server, Permissions, Identity, Storage, Settings.

HTTP API

A local server on port 1337 (configurable). Speaks OpenAI Chat Completions, Anthropic Messages, Open Responses, and Ollama Chat APIs side by side, plus MCP server endpoints (/mcp/health, /mcp/tools, /mcp/call) and Osaurus-specific routes (/agents/{id}/run, /memory/ingest, /agents, /pair).

Harness

The harness is what makes Osaurus more than a thin SDK shim:

  • Agent Loop — every chat is an agent loop. The model writes a markdown todo list, calls tools, iterates, and ends with a verified summary or pauses to ask one critical question.
  • Memory — persistent on-device memory with three layers (identity, pinned facts, episodes) plus a transcript fallback. Distillation runs once per session, gated on a configured Core Model.
  • Skills & Methods — reusable capabilities. Skills are markdown packages of expertise; Methods are scored YAML workflows the agent saved from past runs. Both are auto-selected via RAG preflight.
  • Schedules & Watchers — automation. Schedules run on a clock; watchers react to file system changes via FSEvents.

Plugins, schedules, watchers, and the HTTP API all dispatch through the same agent loop — same engine, same loop tools, same intercepts. Sessions are tagged with their source (chat / plugin / http / schedule / watcher) so you can audit what spawned each conversation in the chat sidebar.

Inference

Three local options and a cloud surface, all behind the same model picker:

  • MLX — local transformer / SSM models, optimized for Apple Silicon via vmlx-swift-lm's BatchEngine (continuous batching, content-addressed prefix caching). Inference Runtime →
  • Apple Foundation Models — Apple's on-device system model (model: "foundation") on macOS 26+. Zero downloads, zero config.
  • Liquid Foundation Models — non-transformer architecture optimized for edge.
  • Cloud providers — OpenAI, Anthropic, xAI, OpenRouter, Venice, Ollama, LM Studio. API keys in macOS Keychain.

Memory and agent context persist across all of them — switching from local Gemma to Claude 4 doesn't lose what your agent has learned about you.

Tools

Two ABIs for native plugins:

  • v1 — tools only
  • v2 — full host API: HTTP routes, SQLite-backed config, web app serving, agent dispatch, inference, events

Plus remote MCP providers to aggregate tools from external MCP servers, and the Linux Sandbox (macOS 26+) for safe code execution. The sandbox itself accepts JSON-recipe plugins so users can extend an agent's capabilities without compiling anything.

Every tool — built-in, folder, sandbox, plugin, MCP-aggregated — returns the same canonical Tool Contract envelope.

Foundations

The trust layer underneath everything:

  • Identity — secp256k1 master key in iCloud Keychain (biometric-gated), deterministic per-agent child keys, Apple App Attest device assertion, osk-v1 access keys for external callers (scoped, expirable, revocable).
  • Encrypted Storage — SQLCipher across chat history, memory, methods, tool index, and plugin databases. Large attachments spilled to AES-GCM .osec blobs. Key in macOS Keychain, device-bound.
  • Public Links — secure WebSocket tunnels through agent.osaurus.ai per agent. The agent's cryptographic address is the routing key. No port forwarding.

These are the boundaries. See Security & Privacy for the user-facing summary, Identity Cryptography and Storage & Encryption for the specs.

Where to go next

Build a thing:

Understand a piece: