Architecture
If you're building on top of Osaurus — writing plugins, scripts, or integrations — this page is the orientation. It maps the user-facing surfaces (chat overlay, management window, HTTP API) to the components underneath, and points to the deeper pages for each layer.
The harness
Osaurus presents three entry points:
- The chat overlay (
⌘;) — the daily driver - The Management window (
⌘ ⇧ M) — settings, agents, models, plugins, tools, memory, themes, automation - The HTTP API (on
:1337) — OpenAI / Anthropic / Open Responses / Ollama / MCP
All three funnel into the same agent loop, which talks to your memory, skills/methods, and the automation surface (schedules, watchers). Inference goes out to local MLX, Apple Foundation, or any cloud provider you've connected. Tools span native plugins (v1/v2 ABI), remote MCP servers, and the Linux sandbox. Underneath everything: identity (signed requests, access keys), encrypted storage (SQLCipher), and relay (public tunnels).
How the pieces fit together
flowchart TB
User[You]
Chat[Chat Overlay - ⌘;]
Mgmt[Management Window - ⌘ ⇧ M]
HTTP[HTTP API on :1337]
User --> Chat
User --> Mgmt
User --> HTTP
subgraph harness [The Harness]
Loop[Agent Loop]
Mem[Memory]
Skills[Skills and Methods]
Auto[Schedules and Watchers]
end
Chat --> Loop
Mgmt --> Loop
HTTP --> Loop
Loop --> Mem
Loop --> Skills
Auto --> Loop
subgraph providers [Inference]
MLX[MLX Local]
Foundation[Apple Foundation]
Cloud[Cloud Providers]
end
Loop --> MLX
Loop --> Foundation
Loop --> Cloud
subgraph plugins [Tools]
Native[Native Plugins]
MCP[Remote MCP]
Sandbox[Linux Sandbox]
end
Loop --> Native
Loop --> MCP
Loop --> Sandbox
subgraph foundationLayer [Foundations]
Identity[Identity and Access]
Storage[Encrypted Storage]
Relay[Relay Tunnels]
end
harness --> foundationLayer
plugins --> foundationLayer
Layers
| Layer | What it does | Reference |
|---|---|---|
| Entry points | Chat overlay (⌘;), Management window (⌘ ⇧ M), HTTP API on :1337 | Chat, HTTP API, CLI |
| Harness | Tasks, Memory, Skills/Methods, Schedules/Watchers — the continuity layer | Tasks, Memory, Skills, Methods, Schedules, Watchers |
| Inference | MLX local models, Apple Foundation Models, cloud providers — all behind the same picker | Models, Apple Intelligence, Inference Runtime |
| Tools | 20+ native plugins (Mail, Calendar, Browser, Git, …), remote MCP aggregation, the Linux Sandbox | Tools & Plugins, Plugin Authoring, Sandbox Internals, Remote MCP Providers |
| Foundations | Identity (signed requests, osk-v1 keys), encrypted storage (SQLCipher), Public Links (public tunnels) | Identity Cryptography, Storage & Encryption, Public Links |
Entry points
Chat overlay
A glass-style overlay summoned with ⌘; from anywhere on macOS. Holds zero, one, or many chat windows. Each window has its own active agent, working folder / Sandbox state, model selection, and conversation history. Multi-window mode lets you run several agents side by side.
The overlay is also where voice input lives: the microphone in the input bar, plus VAD wake-word activation and global Transcription Mode.
Management window
⌘ ⇧ M. Tabs for everything that isn't a single chat: Models, Providers, Agents, Plugins, Sandbox, Tools, Skills, Commands, Memory, Schedules, Watchers, Voice, Themes, Insights, Server, Permissions, Identity, Storage, Settings.
HTTP API
A local server on port 1337 (configurable). Speaks OpenAI Chat Completions, Anthropic Messages, Open Responses, and Ollama Chat APIs side by side, plus MCP server endpoints (/mcp/health, /mcp/tools, /mcp/call) and Osaurus-specific routes (/agents/{id}/run, /memory/ingest, /agents, /pair).
Harness
The harness is what makes Osaurus more than a thin SDK shim:
- Agent Loop — every chat is an agent loop. The model writes a markdown todo list, calls tools, iterates, and ends with a verified summary or pauses to ask one critical question.
- Memory — persistent on-device memory with three layers (identity, pinned facts, episodes) plus a transcript fallback. Distillation runs once per session, gated on a configured Core Model.
- Skills & Methods — reusable capabilities. Skills are markdown packages of expertise; Methods are scored YAML workflows the agent saved from past runs. Both are auto-selected via RAG preflight.
- Schedules & Watchers — automation. Schedules run on a clock; watchers react to file system changes via FSEvents.
Plugins, schedules, watchers, and the HTTP API all dispatch through the same agent loop — same engine, same loop tools, same intercepts. Sessions are tagged with their source (chat / plugin / http / schedule / watcher) so you can audit what spawned each conversation in the chat sidebar.
Inference
Three local options and a cloud surface, all behind the same model picker:
- MLX — local transformer / SSM models, optimized for Apple Silicon via vmlx-swift-lm's
BatchEngine(continuous batching, content-addressed prefix caching). Inference Runtime → - Apple Foundation Models — Apple's on-device system model (
model: "foundation") on macOS 26+. Zero downloads, zero config. - Liquid Foundation Models — non-transformer architecture optimized for edge.
- Cloud providers — OpenAI, Anthropic, xAI, OpenRouter, Venice, Ollama, LM Studio. API keys in macOS Keychain.
Memory and agent context persist across all of them — switching from local Gemma to Claude 4 doesn't lose what your agent has learned about you.
Tools
Two ABIs for native plugins:
- v1 — tools only
- v2 — full host API: HTTP routes, SQLite-backed config, web app serving, agent dispatch, inference, events
Plus remote MCP providers to aggregate tools from external MCP servers, and the Linux Sandbox (macOS 26+) for safe code execution. The sandbox itself accepts JSON-recipe plugins so users can extend an agent's capabilities without compiling anything.
Every tool — built-in, folder, sandbox, plugin, MCP-aggregated — returns the same canonical Tool Contract envelope.
Foundations
The trust layer underneath everything:
- Identity — secp256k1 master key in iCloud Keychain (biometric-gated), deterministic per-agent child keys, Apple App Attest device assertion,
osk-v1access keys for external callers (scoped, expirable, revocable). - Encrypted Storage — SQLCipher across chat history, memory, methods, tool index, and plugin databases. Large attachments spilled to AES-GCM
.osecblobs. Key in macOS Keychain, device-bound. - Public Links — secure WebSocket tunnels through
agent.osaurus.aiper agent. The agent's cryptographic address is the routing key. No port forwarding.
These are the boundaries. See Security & Privacy for the user-facing summary, Identity Cryptography and Storage & Encryption for the specs.
Where to go next
Build a thing:
- HTTP API — endpoint reference, streaming, function calling
- SDK Examples — Python, JavaScript, Anthropic SDK, Open Responses
- CLI —
osaurus serve / mcp / tools / run - Tools & Plugins → Plugin Authoring → Tool Contract
- Sandbox Internals — VM, vsock bridge, plugin recipes
Understand a piece:
- Inference Runtime —
BatchEngine, KV cache, model leases - Identity Cryptography — full crypto spec
- Storage & Encryption — SQLCipher migration, key rotation, recovery
- Developer Tools — Insights and Server Explorer in the Management window
- Building from Source — clone, build, test, contribute