Integration Guide

Osaurus provides multiple integration paths: Remote Providers for cloud AI, MCP Server for AI agents, OpenAI-compatible APIs for existing tools, and Ollama-compatible APIs for additional flexibility.

MCP Server

Osaurus is a full Model Context Protocol (MCP) server. Connect any MCP client to give AI agents access to your installed tools.

Cursor

Add to your Cursor MCP settings (Settings → Features → MCP Servers):

{
  "mcpServers": {
    "osaurus": {
      "command": "osaurus",
      "args": ["mcp"]
    }
  }
}

Or edit ~/.cursor/mcp.json directly.

Claude Desktop

Add to your Claude Desktop configuration at ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "osaurus": {
      "command": "osaurus",
      "args": ["mcp"]
    }
  }
}

Other MCP Clients

Any MCP client that supports stdio transport can connect:

{
  "mcpServers": {
    "osaurus": {
      "command": "osaurus",
      "args": ["mcp"]
    }
  }
}

The osaurus mcp command:

  • Proxies MCP protocol over stdio to the running server
  • Auto-launches Osaurus if it isn't running
  • Exposes all installed tools to the connected client
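For a sense of what the proxy speaks, here is a minimal sketch of an MCP session over stdio, assuming the standard newline-delimited JSON-RPC framing of the MCP stdio transport (the client name and version are placeholders):

import json
import subprocess

# Spawn the stdio proxy; it auto-launches Osaurus if it isn't running
proc = subprocess.Popen(
    ["osaurus", "mcp"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

def request(msg):
    # Each JSON-RPC message is a single newline-terminated JSON object
    proc.stdin.write(json.dumps(msg) + "\n")
    proc.stdin.flush()
    return json.loads(proc.stdout.readline())

# Handshake: initialize, acknowledge, then list the installed tools
request({"jsonrpc": "2.0", "id": 1, "method": "initialize", "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "sketch-client", "version": "0.1"},
}})
proc.stdin.write(json.dumps({"jsonrpc": "2.0", "method": "notifications/initialized"}) + "\n")
proc.stdin.flush()
print(request({"jsonrpc": "2.0", "id": 2, "method": "tools/list"}))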

HTTP MCP Endpoints

MCP is also available over HTTP:

Endpoint      Method   Description
/mcp/health   GET      Check MCP availability
/mcp/tools    GET      List active tools
/mcp/call     POST     Execute a tool

Example: List tools

curl http://127.0.0.1:1337/mcp/tools | jq

Example: Call a tool

curl -X POST http://127.0.0.1:1337/mcp/call \
  -H "Content-Type: application/json" \
  -d '{"name": "current_time", "arguments": {}}'

Remote Providers

Osaurus can connect to cloud AI providers, giving you access to remote models alongside your local ones through a unified API.

Adding a Provider

  1. Open the Management window (⌘⇧M)
  2. Navigate to Providers
  3. Click Add Provider
  4. Choose a preset or configure a custom endpoint

Supported Presets

Provider     Description
OpenAI       GPT-4o, o1, and other OpenAI models
OpenRouter   Access multiple providers (Anthropic, Google, etc.) through one API
Ollama       Connect to a local or remote Ollama instance
LM Studio    Use LM Studio as a backend
Custom       Any OpenAI-compatible endpoint

Features

  • Secure API Key Storage — Keys are stored in macOS Keychain, never in plain text
  • Custom Headers — Add authentication headers for custom endpoints
  • Auto-Connect on Launch — Automatically reconnect to providers when Osaurus starts
  • Connection Health Monitoring — Real-time status indicators show provider availability

Configuration Options

When adding a custom provider:

Field            Description
Name             Display name for the provider
Base URL         API endpoint (e.g., https://api.openai.com/v1)
API Key          Authentication key (stored securely in Keychain)
Custom Headers   Additional headers for authentication
Auto-Connect     Connect automatically on launch

Using Remote Models

Once connected, remote models appear alongside local models in the Model Manager and Chat UI. Use them via the API:

# Use a remote OpenAI model
curl http://127.0.0.1:1337/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

Remote models are available to all connected MCP clients and integrate seamlessly with tool calling.
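As a rough sketch of what tool calling looks like over the unified API: the request carries a standard OpenAI tools array, and the model answers with a tool_calls entry when it decides to use one. This example advertises the current_time tool shown earlier; whether a given model chooses to call it is up to the model:

import requests

response = requests.post(
    "http://127.0.0.1:1337/v1/chat/completions",
    json={
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": "What time is it?"}],
        # Standard OpenAI function-calling schema
        "tools": [{
            "type": "function",
            "function": {
                "name": "current_time",
                "description": "Get the current time",
                "parameters": {"type": "object", "properties": {}},
            },
        }],
    },
)

# The message contains either plain content or a tool_calls array
print(response.json()["choices"][0]["message"])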

OpenAI SDK Integration

Python

from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:1337/v1",
    api_key="osaurus"  # Any value works
)

response = client.chat.completions.create(
    model="llama-3.2-3b-instruct-4bit",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)

Streaming

stream = client.chat.completions.create(
    model="llama-3.2-3b-instruct-4bit",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

JavaScript/TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "http://127.0.0.1:1337/v1",
  apiKey: "osaurus",
});

const response = await client.chat.completions.create({
  model: "llama-3.2-3b-instruct-4bit",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(response.choices[0].message.content);

LangChain Integration

from langchain_openai import ChatOpenAI
from langchain_core.messages import HumanMessage

llm = ChatOpenAI(
    base_url="http://127.0.0.1:1337/v1",
    api_key="osaurus",
    model="llama-3.2-3b-instruct-4bit"
)

response = llm.invoke([HumanMessage(content="Hello!")])
print(response.content)

With Chains

from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful translator."),
    ("human", "Translate to French: {text}")
])

chain = prompt | llm | StrOutputParser()
result = chain.invoke({"text": "Hello, world!"})

LlamaIndex Integration

from llama_index.llms.openai_like import OpenAILike
from llama_index.core import Settings

# OpenAILike accepts arbitrary model names; the plain OpenAI class
# validates against known OpenAI models and would reject this one
Settings.llm = OpenAILike(
    model="llama-3.2-3b-instruct-4bit",
    api_base="http://127.0.0.1:1337/v1",
    api_key="osaurus",
    is_chat_model=True
)
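
A quick way to confirm the configuration works, using llama-index's own chat interface:

from llama_index.core.llms import ChatMessage

response = Settings.llm.chat([ChatMessage(role="user", content="Hello!")])
print(response)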

IDE Integrations

Continue.dev (VS Code / JetBrains)

Add to ~/.continue/config.json:

{
  "models": [
    {
      "title": "Osaurus Local",
      "provider": "openai",
      "model": "llama-3.2-3b-instruct-4bit",
      "apiBase": "http://127.0.0.1:1337/v1",
      "apiKey": "osaurus"
    }
  ]
}

Cursor (as OpenAI Provider)

In Cursor settings, add a custom OpenAI-compatible model:

  1. Open Settings → Models
  2. Add OpenAI-compatible endpoint
  3. Base URL: http://127.0.0.1:1337/v1
  4. Model: llama-3.2-3b-instruct-4bit
  5. API Key: osaurus

Native App Integration

Swift/macOS

Use the shared configuration to discover running Osaurus instances:

import Foundation

// Discover Osaurus
let instance = try OsaurusDiscoveryService.discoverLatestRunningInstance()

// Make API request
let url = instance.url.appendingPathComponent("v1/chat/completions")
var request = URLRequest(url: url)
request.httpMethod = "POST"
request.setValue("application/json", forHTTPHeaderField: "Content-Type")

let body: [String: Any] = [
    "model": "llama-3.2-3b-instruct-4bit",
    "messages": [["role": "user", "content": "Hello!"]]
]
request.httpBody = try JSONSerialization.data(withJSONObject: body)

let (data, _) = try await URLSession.shared.data(for: request)

See the Shared Configuration guide for the complete discovery implementation.

Electron

// main/index.js
const { ipcMain } = require("electron");
const { discoverLatestRunningInstance } = require("./osaurus-discovery");

ipcMain.handle("osaurus:chat", async (event, message) => {
  const instance = await discoverLatestRunningInstance();

  const response = await fetch(`${instance.url}/v1/chat/completions`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama-3.2-3b-instruct-4bit",
      messages: [{ role: "user", content: message }],
    }),
  });

  return response.json();
});

Ollama-Compatible Clients

OllamaKit (Swift)

import OllamaKit

let client = OllamaKit(baseURL: URL(string: "http://127.0.0.1:1337")!)

let response = try await client.chat(
    model: "llama-3.2-3b-instruct-4bit",
    messages: [.user("Hello!")]
)

Python (Ollama Format)

import requests

response = requests.post(
    "http://127.0.0.1:1337/api/chat",
    json={
        "model": "llama-3.2-3b-instruct-4bit",
        "messages": [{"role": "user", "content": "Hello!"}],
        "stream": False
    }
)

print(response.json()["message"]["content"])
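
With "stream": True, Ollama-format responses arrive as newline-delimited JSON chunks. A sketch of consuming them, assuming Osaurus mirrors Ollama's streaming format:

import json
import requests

with requests.post(
    "http://127.0.0.1:1337/api/chat",
    json={
        "model": "llama-3.2-3b-instruct-4bit",
        "messages": [{"role": "user", "content": "Tell me a story"}],
        "stream": True,
    },
    stream=True,
) as response:
    for line in response.iter_lines():
        if not line:
            continue
        chunk = json.loads(line)
        # Each chunk carries a text fragment; the final chunk sets "done": true
        print(chunk["message"]["content"], end="", flush=True)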

Web Applications

React

async function chat(message) {
  const response = await fetch("http://127.0.0.1:1337/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama-3.2-3b-instruct-4bit",
      messages: [{ role: "user", content: message }],
    }),
  });

  const data = await response.json();
  return data.choices[0].message.content;
}

Next.js API Route

// app/api/chat/route.js
export async function POST(request) {
  const { message } = await request.json();

  const response = await fetch("http://127.0.0.1:1337/v1/chat/completions", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: "llama-3.2-3b-instruct-4bit",
      messages: [{ role: "user", content: message }],
    }),
  });

  return Response.json(await response.json());
}

Integration Checklist

  • Choose integration method — MCP, OpenAI SDK, or direct API
  • Configure base URL — http://127.0.0.1:1337 (or custom port)
  • Set model name — Use lowercase with hyphens
  • Handle streaming — Enable for better UX
  • Install tools — If using MCP, install needed plugins
  • Test error handling — Handle connection and model errors
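
For the last item, a minimal sketch of defensive request handling; the timeout values are illustrative:

import requests

try:
    response = requests.post(
        "http://127.0.0.1:1337/v1/chat/completions",
        json={
            "model": "llama-3.2-3b-instruct-4bit",
            "messages": [{"role": "user", "content": "Hello!"}],
        },
        timeout=(5, 120),  # (connect, read) timeouts in seconds, illustrative
    )
    response.raise_for_status()  # surfaces 4xx/5xx, e.g. an unknown model name
except requests.ConnectionError:
    print("Osaurus is not reachable; is the server running?")
except requests.HTTPError as err:
    print(f"API error: {err.response.status_code} {err.response.text}")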

Troubleshooting

Connection Refused

  • Verify Osaurus is running: osaurus status
  • Check port matches configuration
  • Ensure firewall allows localhost connections

MCP Tools Not Appearing

  • Verify tools are installed: osaurus tools list
  • Restart your MCP client after installing tools
  • Check that osaurus mcp runs without errors

Model Not Found

  • List available models: curl http://127.0.0.1:1337/v1/models (or use the SDK, as sketched below)
  • Use exact model name from list
  • Download model if needed via Model Manager
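
The same check via the OpenAI SDK, printing the exact IDs to copy into requests:

from openai import OpenAI

client = OpenAI(base_url="http://127.0.0.1:1337/v1", api_key="osaurus")
for model in client.models.list():
    print(model.id)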

CORS Errors (Browser)

  • CORS is enabled by default for all origins
  • Verify request includes correct headers
  • Check browser console for specific error

For integration help, check our SDK Examples or join Discord.