Architecture – Ostler

Three-store architecture

Ostler uses three specialised databases, each optimised for a different type of query:

Store	Technology	Purpose
Vector store	Qdrant	Semantic search. “Find people similar to this description.” Stores your embeddings (nomic-embed-text), scaling to hundreds of thousands of vectors.
Knowledge graph	Oxigraph	Structured relationships. SPARQL queries over your knowledge-graph triples. “Who knows whom? What happened when?”
Cache + message bus	Valkey	Fast lookups, real-time message routing between services, session state. (Linux Foundation fork of Redis 7.2.)

All three run as launchd services. On a Mac Mini M4, the databases use under 2 GB of RAM, leaving the rest for Ollama and the AI models (which need 6–12 GB depending on model size).
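To make the vector store's role concrete, here is a minimal in-memory sketch of semantic search by cosine similarity. It is a stand-in for illustration only, not Qdrant's API: the `top_k` helper, the toy three-dimensional vectors, and the choice of cosine as the similarity metric are all assumptions.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], vectors: dict[str, list[float]], k: int = 2):
    """Return the k (name, score) pairs most similar to the query."""
    scored = [(name, cosine(query, vec)) for name, vec in vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy embeddings; real ones would come from the embedding model.
people = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.0, 1.0, 0.0],
    "carol": [0.8, 0.2, 0.1],
}
results = top_k([1.0, 0.0, 0.0], people, k=2)
```

A dedicated vector store does the same ranking with an index, so it does not need to scan every vector on each query.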

Local LLM inference

All AI inference runs locally via Ollama. No cloud API calls. No usage billing. No data exfiltration.

Model	Use	Performance
Qwen3 9B	AI assistant, conversation processing, fact extraction	~30 tok/s on M4
nomic-embed-text	Vector embeddings for semantic search	~200 embeddings/s

The system is hardware-adaptive. Settings profiles configure model selection and batch sizes based on available hardware. A Mac Mini M1 runs smaller models; a Mac Studio M2 Ultra runs larger ones.
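One way such a settings profile could be selected is sketched below. The profile names, RAM thresholds, model labels, and batch sizes are hypothetical placeholders, not Ostler's actual configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    min_ram_gb: int        # smallest machine this profile targets
    chat_model: str        # placeholder label, not a real model tag
    embed_batch_size: int


# Hypothetical profiles, ordered from most to least demanding.
PROFILES = [
    Profile("large", 64, "large-model", 256),
    Profile("medium", 16, "medium-model", 64),
    Profile("small", 0, "small-model", 16),
]


def select_profile(ram_gb: int) -> Profile:
    """Pick the most demanding profile the machine can satisfy."""
    for profile in PROFILES:
        if ram_gb >= profile.min_ram_gb:
            return profile
    return PROFILES[-1]
```

Ordering the list from largest to smallest keeps the selection logic to a single pass, and the final profile acts as a fallback for any machine.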

Instant onboarding (macOS data)

The moment you install, Ostler reads data directly from your Mac’s built-in apps. No exports needed. No waiting.

Source	What we read	Permission
Safari	Browsing history, bookmarks, reading list	Full Disk Access
iMessage	Conversations, participants, timestamps	Full Disk Access
Apple Notes	Note titles, text content, folders	Full Disk Access
Calendar	Events, attendees, locations	Full Disk Access
Photos	Face labels, GPS locations, dates (not image content)	Full Disk Access
Reminders	Tasks, due dates, lists	Full Disk Access
Apple Mail	Subjects, senders, dates (not email body)	Full Disk Access

All databases are opened read-only to prevent corruption. Each extractor handles schema differences across macOS versions (Ventura, Sonoma, Sequoia). Full Disk Access is optional – you can skip it and still use GDPR imports.

GDPR import pipeline

For deeper historical data, 20 parsers read from GDPR data exports:

Platform	Data imported	Format
LinkedIn	Connections, career, endorsements, messages (metadata)	CSV
Facebook	Friends, events, timeline	JSON
Instagram	Followers, following, close friends	JSON
WhatsApp	Phone cross-references	JSON
Twitter / X	Synced contacts (phone cross-ref)	JS
Google Calendar	Events, attendees, locations	ICS
iCloud	Contacts (via CardDAV)	vCard
Email	Signature mining, header analysis	MBOX
Browser	History URLs, page titles	Safari / Chrome

Identity resolution

The same person appears differently across platforms: “John Smith” on LinkedIn, “johnnyboy” on Instagram, “+44 7XXX XXXXXX” on WhatsApp. The identity resolver matches these records using:

Exact matching: LinkedIn URL, email address, phone number (last 8 digits).
Fuzzy matching: Jaro-Winkler string distance on names, corroborated by shared organisation, email domain, or platform overlap.
Manual review queue: Uncertain matches go to a review queue, where the user approves or rejects them. Records are never merged automatically unless the match is confident.

The resolver has 38 automated tests covering exact, fuzzy, phone, email, and name-subset matching strategies.
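The matching strategies above can be sketched as follows. The Jaro-Winkler functions implement the standard definition; the 0.90 threshold, the `classify` decision rule, and the function names are assumptions for illustration, not the resolver's actual logic.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to four characters."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)


def phone_key(phone: str) -> str:
    """Exact-match key: the last 8 digits, ignoring formatting and prefixes."""
    digits = "".join(ch for ch in phone if ch.isdigit())
    return digits[-8:]


def classify(name_a: str, name_b: str, corroborated: bool,
             threshold: float = 0.90) -> str:
    """Assumed rule: similar names merge only with corroborating evidence."""
    score = jaro_winkler(name_a.lower(), name_b.lower())
    if score >= threshold and corroborated:
        return "match"
    if score >= threshold:
        return "review"  # similar name but no supporting evidence
    return "distinct"
```

Comparing only the last 8 digits lets "+44 7700 900123" and "07700 900123" produce the same key despite the different country-code prefix.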

Conversation processing

When a conversation is recorded (via the Ostler RemoteCapture app on your Mac or manual import), it passes through a multi-step pipeline:

Classification – setting (work/social/family), shape (meeting/1:1/group), stakes (high/medium/low).
Fact extraction – 12.6 facts per conversation on average, with quality gates.
Relationship signals – warmth, reciprocity, energy, power dynamics.
Coaching observations – longitudinal patterns in how the user communicates.
Cross-conversation linking – semantic similarity between conversation summaries.

Each step is idempotent (re-runnable without duplicates), has exponential backoff on failure, and records the prompt version that generated it.
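A minimal sketch of a step runner with these three properties is shown below. The `run_step` function, the in-memory `store` dictionary, the retry count, and the base delay are hypothetical; a real pipeline would persist results in a database.

```python
import time
from typing import Callable


def run_step(store: dict, conversation_id: str, step: str, prompt_version: str,
             fn: Callable[[], object], max_attempts: int = 4,
             base_delay: float = 0.5,
             sleep: Callable[[float], None] = time.sleep) -> dict:
    """Run one pipeline step idempotently, retrying with exponential backoff."""
    key = (conversation_id, step)
    existing = store.get(key)
    if existing is not None and existing["prompt_version"] == prompt_version:
        return existing  # already done with this prompt: no duplicate work
    for attempt in range(max_attempts):
        try:
            output = fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
        else:
            # One record per (conversation, step); a re-run overwrites it.
            store[key] = {"output": output, "prompt_version": prompt_version}
            return store[key]
    raise RuntimeError("max_attempts must be at least 1")


# Demo: a step that fails twice before succeeding.
calls = {"n": 0}


def flaky_extract() -> list[str]:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("model timeout")
    return ["fact A", "fact B"]


delays: list[float] = []
store: dict = {}
first = run_step(store, "conv-1", "fact_extraction", "v3", flaky_extract,
                 sleep=delays.append)
second = run_step(store, "conv-1", "fact_extraction", "v3", flaky_extract,
                  sleep=delays.append)
```

Keying results on the conversation and step means a re-run replaces the earlier record rather than adding a second one, while the stored prompt version shows which prompt produced each result.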

Stack summary

Capture Layer
  macOS databases (instant)  ·  GDPR imports  ·  macOS app  ·  iOS app  ·  Browser extension

Processing Layer
  Conversation pipeline  ·  Identity resolver  ·  Fact extraction  ·  Relationship signals

Intelligence Layer
  Ollama (Qwen3 9B)  ·  nomic-embed-text  ·  SPARQL queries  ·  Vector search

Storage Layer
  Qdrant (vectors)  ·  Oxigraph (RDF graph)  ·  Valkey (cache + bus)  ·  SQLite (coaching)

Interface Layer
  Assistant (iMessage · Email)  ·  Personal Wiki  ·  iOS app

Total dependencies: Python 3.11+, Ollama. No cloud accounts required. No API keys. No subscriptions.
