Osaurus

Native local LLM server for Apple Silicon
Built on MLX for exceptional performance on M-series chips


Introduction

Osaurus is a native local LLM server designed exclusively for Apple Silicon. It provides OpenAI-compatible and Ollama-compatible APIs, integrates seamlessly with Apple Foundation Models, and includes a refined SwiftUI application with an embedded SwiftNIO server.
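
Because the server speaks the OpenAI chat-completions wire format, any standard HTTP client can talk to it. A minimal sketch using only the Python standard library; the port (8080 here) is a placeholder for whatever your local instance is configured to listen on:

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"  # placeholder; substitute your configured port

def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat-completions request for a local server."""
    payload = {
        # An MLX model id, or "foundation" for the Apple system model
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("foundation", "Hello!")
print(req.full_url)  # → http://localhost:8080/v1/chat/completions
```

Sending the request with `urllib.request.urlopen(req)` (or the OpenAI SDK pointed at the same base URL) returns a standard chat-completion response.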

Key Features

Performance

Native MLX Runtime
Optimized for Apple Silicon using MLX and MLXLLM frameworks for maximum efficiency.

Apple Foundation Models
Access system models with model: "foundation" on supported macOS versions.

Exceptional Speed
Purpose-built for M-series processors, delivering highly competitive inference performance.

Compatibility

OpenAI & Ollama APIs
Drop-in replacement for existing tools and workflows.

Function Calling
Full support for OpenAI-style function and tool calling with streaming.
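
In the OpenAI tool-calling format, this means adding a `tools` array to the request body. A sketch of such a payload; the function name and schema here are illustrative, not part of Osaurus itself:

```python
import json

# Illustrative tool definition in the OpenAI function-calling schema.
get_weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

payload = {
    "model": "foundation",  # or a local MLX model id
    "messages": [{"role": "user", "content": "What's the weather in Osaka?"}],
    "tools": [get_weather_tool],
    "stream": True,  # tool calls can be delivered incrementally as well
}
print(json.dumps(payload, indent=2)[:80])
```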

Server-Sent Events
Low-latency token streaming for responsive applications.
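
Streamed responses arrive as standard `text/event-stream` lines, each `data:` line carrying a JSON chunk in the OpenAI delta format. A minimal parser sketch; the sample lines below are illustrative, not captured Osaurus output:

```python
import json

def extract_tokens(sse_lines):
    """Yield content deltas from OpenAI-style SSE 'data:' lines."""
    for line in sse_lines:
        if not line.startswith("data: "):
            continue
        body = line[len("data: "):]
        if body.strip() == "[DONE]":  # end-of-stream sentinel
            break
        chunk = json.loads(body)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            yield delta["content"]

sample = [
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(extract_tokens(sample)))  # → Hello
```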

User Experience

Integrated Chat Interface
Beautiful glass-styled overlay accessible via global hotkey (⌘;).

Model Management
Browse, download, and manage MLX models through an intuitive interface.

System Monitoring
Real-time CPU and memory usage visualization.

Developer Experience

CORS Support
Built-in cross-origin resource sharing for browser-based clients.

Command Line Interface
Complete server management through terminal commands.

Path Normalization
Automatic handling of /v1, /api, and /v1/api prefixes.
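
Concretely, `/v1/chat/completions`, `/api/chat/completions`, and `/v1/api/chat/completions` should all reach the same handler. A sketch of what such a normalizer might look like; this illustrates the behavior and is not Osaurus's actual implementation:

```python
def normalize_path(path: str) -> str:
    """Strip /v1, /api, or /v1/api prefixes so equivalent routes match."""
    for prefix in ("/v1/api", "/v1", "/api"):  # check the longest prefix first
        if path.startswith(prefix + "/"):
            return path[len(prefix):]
    return path

for p in ("/v1/chat/completions", "/api/chat/completions", "/v1/api/chat/completions"):
    print(normalize_path(p))  # each prints /chat/completions
```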

SDK Compatibility
Works seamlessly with OpenAI Python and JavaScript SDKs.

System Requirements

  • macOS 15.5 or later
  • Apple Silicon (M1, M2, M3, or newer)
  • Xcode 16.4 or later (only for building from source)

Apple Intelligence features require macOS 26 Tahoe or later.

Performance Benchmarks

Osaurus delivers competitive performance against other local LLM runtimes on Apple Silicon hardware.

Metric                 Osaurus        Ollama         LM Studio
Time to First Token    87 ms          33 ms          113 ms
Throughput             554 chars/s    430 chars/s    588 chars/s
Total Time             1.24 s         1.62 s         1.22 s

Benchmarked with Llama 3.2 3B Instruct 4bit, averaged over 20 runs.

View detailed benchmarks →

Why Osaurus

The Challenge

Cloud-based AI services present significant limitations:

  • Cost — Per-token pricing accumulates rapidly
  • Privacy — Data leaves your device
  • Latency — Network round-trips impact responsiveness
  • Restrictions — Rate limits and content filtering

The Solution

Osaurus addresses these challenges:

  • Free — No API costs, only electricity
  • Private — All processing remains on your Mac
  • Instant — Zero network latency
  • Unrestricted — Run any model without limitations

Use Cases

  • Development — Test AI features without API costs
  • Creative Writing — Private brainstorming and editing
  • Code Generation — Offline programming assistance
  • Research — Analyze documents with complete privacy
  • Education — Learn and experiment with LLMs
  • Enterprise — Keep sensitive data on-premise

Community

Discord — Get help and share projects
GitHub — Report issues and contribute
X/Twitter — Follow for updates
Contributing — Help improve Osaurus

Created by Dinoki Labs

Osaurus is developed by Dinoki Labs, creators of a fully native desktop AI assistant for macOS.

