How Ostler is built.

For developers and the technically curious. Everything runs on a Mac Mini. Nothing leaves your network.

Three-store architecture

Ostler uses three specialised databases, each optimised for a different type of query:

Store | Technology | Purpose
Vector store | Qdrant | Semantic search. “Find people similar to this description.” Stores your embeddings (nomic-embed-text), scaling to hundreds of thousands of vectors.
Knowledge graph | Oxigraph | Structured relationships. SPARQL queries over your knowledge-graph triples. “Who knows whom? What happened when?”
Cache + message bus | Valkey | Fast lookups, real-time message routing between services, session state. (Linux Foundation fork of Redis 7.2.)
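As an illustration of the “Who knows whom?” style of query, here is a minimal sketch of a SPARQL SELECT sent over the standard SPARQL HTTP protocol. The endpoint URL (Oxigraph server's usual local port), the FOAF vocabulary, and the function names are assumptions made for this example; Ostler's actual graph schema is not described here.

```python
import json
import urllib.parse
import urllib.request

# Assumption: an Oxigraph server listening locally on its usual port.
OXIGRAPH_QUERY_URL = "http://localhost:7878/query"


def who_knows_query(person_iri: str) -> str:
    """Build a SELECT listing everyone a person knows (FOAF terms are an assumption)."""
    unsafe = set('<>"{}|^`\\')
    if any(ch in unsafe or ch.isspace() for ch in person_iri):
        raise ValueError("not a safe IRI")
    return (
        "PREFIX foaf: <http://xmlns.com/foaf/0.1/>\n"
        "SELECT ?other ?name WHERE {\n"
        f"  <{person_iri}> foaf:knows ?other .\n"
        "  OPTIONAL { ?other foaf:name ?name }\n"
        "}"
    )


def run_select(query: str, endpoint: str = OXIGRAPH_QUERY_URL) -> list:
    """POST the query as a form field and return the JSON result bindings."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    request = urllib.request.Request(
        endpoint,
        data=body,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_select(who_knows_query("http://example.org/person/1"))` would return one binding per known person, provided a server is running and the graph uses these predicates.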

All three run as launchd services. On a Mac Mini M4, the databases together use under 2 GB of RAM, leaving the rest for Ollama and the AI models (which need 6–12 GB depending on model size).
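For readers unfamiliar with launchd, a service is described by a property list. The sketch below generates one with Python's standard `plistlib`. The label, binary path, and log path are hypothetical placeholders, not Ostler's real configuration; the keys themselves (`Label`, `ProgramArguments`, `RunAtLoad`, `KeepAlive`) are standard launchd keys.

```python
import plistlib


def launchd_plist(label: str, program_args: list, log_path: str) -> bytes:
    """Return an XML property list describing a keep-alive launchd job."""
    job = {
        "Label": label,                    # unique job name
        "ProgramArguments": program_args,  # argv of the service
        "RunAtLoad": True,                 # start as soon as the job is loaded
        "KeepAlive": True,                 # restart the service if it exits
        "StandardOutPath": log_path,
        "StandardErrorPath": log_path,
    }
    return plistlib.dumps(job)


# Hypothetical example: label and paths are placeholders.
plist_bytes = launchd_plist(
    "com.example.ostler.valkey",
    ["/opt/homebrew/bin/valkey-server", "--port", "6379"],
    "/tmp/ostler-valkey.log",
)
```

Written to `~/Library/LaunchAgents/` and loaded with `launchctl`, a file like this would keep the service running across logins.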

Local LLM inference

All AI inference runs locally via Ollama. No cloud API calls. No usage billing. No data exfiltration.
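To make “runs locally via Ollama” concrete, here is a minimal sketch of a call to Ollama's local HTTP API using only the standard library. The `/api/generate` endpoint and port 11434 are Ollama's defaults; the helper names are ours, and the model tag is left as a parameter because the tag Ostler uses is not stated here.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_generate_request(prompt: str, model: str) -> urllib.request.Request:
    """Build a non-streaming generate request for a locally installed model."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    request = build_generate_request(prompt, model)
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.load(response)["response"]
```

Because the request only ever targets `localhost`, nothing in this path touches an external network.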

Model | Use | Performance
Qwen 3.5 9B | AI assistant, conversation processing, fact extraction | ~30 tok/s on M4
nomic-embed-text | Vector embeddings for semantic search | ~200 embeddings/s

The system is hardware-adaptive. Settings profiles configure model selection and batch sizes based on available hardware. A Mac Mini M1 runs smaller models; a Mac Studio M2 Ultra runs larger ones.
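One way such settings profiles could work is a simple lookup keyed on available memory. The sketch below is illustrative only: the profile names, RAM thresholds, and batch sizes are invented for the example and are not Ostler's real values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    min_ram_gb: int        # smallest machine this profile targets
    model_tier: str        # hypothetical label for the model size to load
    embed_batch_size: int  # how many texts to embed per batch


# Hypothetical profiles, ordered from most to least demanding.
PROFILES = [
    Profile("large", 64, "larger model", 256),
    Profile("standard", 16, "9B model", 64),
    Profile("small", 8, "smaller model", 16),
]


def select_profile(ram_gb: int) -> Profile:
    """Pick the most capable profile the machine's RAM can support."""
    for profile in PROFILES:
        if ram_gb >= profile.min_ram_gb:
            return profile
    return PROFILES[-1]  # fall back to the smallest profile
```

With these made-up thresholds, a 16 GB machine gets the "standard" profile and a 128 GB machine gets "large".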

Instant onboarding (macOS data)

The moment you install, Ostler reads data directly from your Mac’s built-in apps. No exports needed. No waiting.

Source | What we read | Permission
Safari | Browsing history, bookmarks, reading list | Full Disk Access
iMessage | Conversations, participants, timestamps | Full Disk Access
Apple Notes | Note titles, text content, folders | Full Disk Access
Calendar | Events, attendees, locations | Full Disk Access
Photos | Face labels, GPS locations, dates (not image content) | Full Disk Access
Reminders | Tasks, due dates, lists | Full Disk Access
Apple Mail | Subjects, senders, dates (not email body) | Full Disk Access

All databases are opened read-only to prevent corruption. Each extractor handles schema differences across macOS versions (Ventura, Sonoma, Sequoia). Full Disk Access is optional – you can skip it and still use GDPR imports.
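A minimal sketch of the read-only pattern, assuming the source is a SQLite file: SQLite's URI syntax with `mode=ro` makes any write fail at the database layer. The helper for schema differences simply picks the first column name that exists; the function names, and any table or column names passed to them, are hypothetical.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path: str) -> sqlite3.Connection:
    """Open a SQLite file so that any write attempt raises an error."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def first_existing_column(conn: sqlite3.Connection, table: str, candidates: list):
    """Return the first candidate column present in the table, or None.

    Useful when a column was renamed between OS versions.
    """
    rows = conn.execute("SELECT name FROM pragma_table_info(?)", (table,))
    present = {row[0] for row in rows}
    for name in candidates:
        if name in present:
            return name
    return None
```

An extractor built this way can try the newest column name first and fall back to older ones, without ever being able to modify the source database.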

GDPR import pipeline

For deeper historical data, 20 parsers read from GDPR data exports, including:

Platform | Data imported | Format
LinkedIn | Connections, career, endorsements, messages (metadata) | CSV
Facebook | Friends, events, timeline | JSON
Instagram | Followers, following, close friends | JSON
WhatsApp | Phone cross-references | JSON
Twitter / X | Synced contacts (phone cross-ref) | JS
Google Calendar | Events, attendees, locations | ICS
iCloud | Contacts (via CardDAV) | vCard
Email | Signature mining, header analysis | MBOX
Browser | History URLs, page titles | Safari / Chrome

Identity resolution

The same person appears differently across platforms. “John Smith” on LinkedIn, “johnnyboy” on Instagram, “+44 7XXX XXXXXX” on WhatsApp. The identity resolver matches these using:

  • Exact matching: LinkedIn URL, email address, phone number (last 8 digits).
  • Fuzzy matching: Jaro-Winkler string distance on names, corroborated by shared organisation, email domain, or platform overlap.
  • Manual review queue: Uncertain matches go to a review queue, where the user approves or rejects each one. Nothing is merged automatically unless the match is high-confidence.
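The three strategies above can be sketched as follows. This is an illustrative implementation, not Ostler's resolver: the record fields, the 0.90 threshold, and the rule that a corroborated fuzzy match merges while an uncorroborated one goes to review are all assumptions. The Jaro-Winkler function follows the standard definition (prefix scale 0.1, prefix capped at 4 characters).

```python
import re


def phone_key(raw: str):
    """Compare phone numbers on their last 8 digits, ignoring formatting."""
    digits = re.sub(r"\D", "", raw)
    return digits[-8:] if len(digits) >= 8 else None


def jaro(s1: str, s2: str) -> float:
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    hit1, hit2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not hit2[j] and s2[j] == ch:
                hit1[i] = hit2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    out_of_order, k = 0, 0
    for i, ch in enumerate(s1):
        if hit1[i]:
            while not hit2[k]:
                k += 1
            if ch != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order / 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


def decide(a: dict, b: dict, threshold: float = 0.90) -> str:
    """Return 'merge', 'review', or 'distinct' for two records (hypothetical fields)."""
    for key in ("linkedin_url", "email"):
        if a.get(key) and a.get(key) == b.get(key):
            return "merge"
    pa, pb = phone_key(a.get("phone") or ""), phone_key(b.get("phone") or "")
    if pa and pa == pb:
        return "merge"
    score = jaro_winkler(a["name"].lower(), b["name"].lower())
    if score < threshold:
        return "distinct"
    same_org = bool(a.get("organisation")) and a.get("organisation") == b.get("organisation")
    return "merge" if same_org else "review"
```

Comparing only the last 8 digits lets "+44 7700 900123" and "07700 900123" match despite the different country-code formatting.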

The resolver has 38 automated tests covering exact, fuzzy, phone, email, and name-subset matching strategies.

Conversation processing

When a conversation is recorded (via the Ostler RemoteCapture app on your Mac or manual import), it passes through a multi-step pipeline:

  • Classification – setting (work/social/family), shape (meeting/1:1/group), stakes (high/medium/low).
  • Fact extraction – 12.6 facts per conversation on average, with quality gates.
  • Relationship signals – warmth, reciprocity, energy, power dynamics.
  • Coaching observations – longitudinal patterns in how the user communicates.
  • Cross-conversation linking – semantic similarity between conversation summaries.

Each step is idempotent (re-runnable without duplicates), has exponential backoff on failure, and records the prompt version that generated it.
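A minimal sketch of how a step runner could combine these three properties. The in-memory `store` dict, the function name, and the retry defaults are hypothetical; a real implementation would persist results in one of the databases.

```python
import time


def run_step(store: dict, conversation_id: str, step: str, fn, prompt_version: str,
             max_attempts: int = 5, base_delay: float = 1.0, sleep=time.sleep):
    """Run one pipeline step at most once per conversation, retrying on failure."""
    key = (conversation_id, step)
    if key in store:
        # Idempotent: a re-run returns the stored result instead of duplicating it.
        return store[key]["result"]
    for attempt in range(max_attempts):
        try:
            result = fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of retries, surface the error
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
        else:
            # Record which prompt version produced this output.
            store[key] = {"result": result, "prompt_version": prompt_version}
            return result
```

Storing the prompt version next to each result makes it possible to find and regenerate outputs produced by an older prompt.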

Stack summary

Capture Layer
  macOS databases (instant)  ·  GDPR imports  ·  macOS app  ·  iOS app  ·  Browser extension

Processing Layer
  Conversation pipeline  ·  Identity resolver  ·  Fact extraction  ·  Relationship signals

Intelligence Layer
  Ollama (Qwen 3.5 9B)  ·  nomic-embed-text  ·  SPARQL queries  ·  Vector search

Storage Layer
  Qdrant (vectors)  ·  Oxigraph (RDF graph)  ·  Valkey (cache + bus)  ·  SQLite (coaching)

Interface Layer
  Assistant (iMessage · Email)  ·  Personal Wiki  ·  iOS app

Total dependencies: Python 3.11+, Ollama. No cloud accounts required. No API keys. No subscriptions.

Built to run at home.

Local  ·  Verifiable  ·  Yours