a16z’s next thesis is ‘observe to act’. We built it local-first.

Kenan Saleh, Investing Partner at a16z Speedrun, wrapped up the SR007 cohort this week with a short video laying out what excites him about the next wave of AI products. The wrap-up post is at speedrun.substack.com/p/sr007-apps-wrapped; the bit that mattered, transcribed:

“Today’s AI products are reactive. You give the model a prompt, it responds with an answer. These are useful, but I’m excited about products that take this further and shift the paradigm from ask-to-answer to observe-to-act. These agents will continuously monitor context in the background across all of your connected tools and data, predict what matters, and take action before being asked to do so, much like a human does. So instead of you prompting the model, the model will start prompting you. Examples here could include agents that remind you about tasks you forgot to complete, resolve customer issues before support tickets are filed, or debug and ship code fixes automatically. This represents a new paradigm where AI products behave more like humans and less like tools. We’re already starting to see this dynamic with products like OpenClaw, Poke, and more, and we’ve only scratched the surface of capabilities here.”

That paragraph is the cleanest articulation I’ve seen of why Ostler exists. Every clause maps to something in the product I shipped this week. I’m going to walk through them line by line, because once you map them against a real product the differences between approaches start to matter a lot.

“Continuously monitor context in the background across all of your connected tools and data”

Ostler ingests email, iMessage, WhatsApp, calendar, meeting transcripts, and browsing history into one personal knowledge graph. The graph runs on the customer’s own Mac and refreshes continuously. Within a few weeks of use it had indexed, in my case, four-and-a-half thousand people, a hundred and forty-eight thousand preferences, and one-point-eight-seven million knowledge-graph facts. That is the “monitor context in the background” sentence as a concrete deliverable.

The “across all of your connected tools and data” clause is the one most products fall short on. A lot of agents observe their own narrow surface and call that “context”. Ostler observes email, WhatsApp, and calendar in one place because the actual content of your life is spread across all of them, and a model that sees only one of them is missing most of the picture. That breadth is also why we run locally; piping your iMessage history to a cloud vendor for the sake of a side-feature is not a trade most customers will make once they think about it.

“Predict what matters”

The Ostler Hub generates a morning brief and an evening wrap every day. Both are produced by the local assistant scanning everything that has come in since the previous brief and choosing what is worth surfacing. The morning brief is calibrated for action; the evening wrap is calibrated for reflection. There’s also a suggestion endpoint that produces “you might care about this person right now” pings based on what the graph has recently learnt, weighted by your historical attention to that person.
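To make the “recently learnt, weighted by historical attention” idea concrete, here is a minimal sketch of one way such a ranking could work. It is an illustration under stated assumptions, not Ostler’s actual scoring: the PersonSignal type, the 0-to-1 attention weight, and the 24-hour half-life are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class PersonSignal:
    """Hypothetical per-person input to the ranking."""
    name: str
    new_fact_times: list[datetime]  # when each recent fact was learnt
    attention: float                # assumed 0..1 historical attention weight


def suggestion_score(signal: PersonSignal, now: datetime,
                     half_life_hours: float = 24.0) -> float:
    """Sum a recency weight per new fact, then scale by attention.

    Each fact counts 1.0 when brand new and halves every
    `half_life_hours` (an assumed decay, chosen for illustration).
    """
    recency = 0.0
    for learnt_at in signal.new_fact_times:
        age_hours = (now - learnt_at).total_seconds() / 3600
        recency += 0.5 ** (age_hours / half_life_hours)
    return signal.attention * recency


def top_suggestions(signals: list[PersonSignal], now: datetime,
                    limit: int = 3) -> list[str]:
    """Return names with a positive score, highest first (ties by name)."""
    scored = [(suggestion_score(s, now), s.name) for s in signals]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:limit]]
```

The point of the design is that neither input wins alone: a person you rarely engage with needs a lot of fresh activity to surface, while a close contact surfaces on a single new fact.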

The mechanism for prediction is interesting. We’re not running giant transformer reasoning over the whole graph for every query; we’re using small specialist local models for things they are good at (the OpenAI Privacy Filter for PII, a small embedding model for retrieval, a 9B Qwen model for synthesis) and stitching them with a deterministic pipeline. The result is fast, cheap, and explainable. The locality is not a nice-to-have. The graph contains your therapist’s name, your salary, things your partner has texted you. The prediction must happen inside the boundary that already holds those facts.
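The shape of that pipeline can be sketched in a few lines. This is a toy under loud assumptions: each stage below is a trivial stand-in (a regex instead of a PII model, keyword overlap instead of an embedding model, string joining instead of a language model), and every function name is hypothetical. What it shows is the deterministic stitching: a fixed order of stages, each a plain function, with a trace that records exactly what ran.

```python
import re
from typing import Callable

Stage = Callable[[dict], dict]

# Stand-in only: masks email addresses, nothing else.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_pii(state: dict) -> dict:
    """Placeholder for a specialist PII model."""
    state = dict(state)
    state["facts"] = [EMAIL_RE.sub("[EMAIL]", f) for f in state["facts"]]
    return state


def retrieve(state: dict) -> dict:
    """Placeholder for embedding retrieval: rank facts by word overlap."""
    query_words = set(state["query"].lower().split())

    def overlap(fact: str) -> int:
        return len(query_words & set(fact.lower().split()))

    ranked = sorted(state["facts"], key=overlap, reverse=True)
    state = dict(state)
    state["context"] = [f for f in ranked if overlap(f) > 0][: state.get("k", 2)]
    return state


def synthesise(state: dict) -> dict:
    """Placeholder for a local synthesis model: join the retrieved facts."""
    state = dict(state)
    state["answer"] = "; ".join(state["context"]) or "nothing to surface"
    return state


def run_pipeline(state: dict, stages: list[Stage]) -> tuple[dict, list[str]]:
    """Run stages in a fixed order and record which ones ran."""
    trace = []
    for stage in stages:
        state = stage(state)
        trace.append(stage.__name__)
    return state, trace
```

Running run_pipeline(state, [redact_pii, retrieve, synthesise]) returns both the final state and the ordered list of stage names, which is where the “explainable” property comes from in a design like this: every output can be traced to a fixed sequence of small steps.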

“Take action before being asked to do so”

This is the part that’s landing in v1.0 right now. Ostler writes proactively to Apple Reminders on your iPhone for commitments it has detected you made to other people (“I’ll send you that contract on Friday”). It surfaces birthdays it has learnt about. It can summarise the meeting you just had into a transcript filed under the right project. The iOS app on your phone shows the Live Activity for whatever the Hub is doing right now – recording a call, generating a brief, or going offline.
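As a rough illustration of the commitment-to-reminder step, the sketch below turns a message like the example above into a dated reminder. It is deliberately simplistic and not how Ostler detects commitments: the regex, the Reminder type, and the “next occurrence of that weekday” rule are all assumptions, and the step of writing to Apple Reminders is left out entirely.

```python
import re
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

WEEKDAYS = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]

# Matches "I'll <task> on|by <weekday>", with a straight or curly apostrophe.
COMMITMENT_RE = re.compile(
    r"\bI(?:'|’)ll\s+(?P<task>.+?)\s+(?:on|by)\s+(?P<day>"
    + "|".join(WEEKDAYS) + r")\b",
    re.IGNORECASE,
)


@dataclass
class Reminder:
    """Hypothetical reminder record; not an Apple Reminders type."""
    title: str
    due: date


def next_weekday(sent_on: date, day_name: str) -> date:
    """Next occurrence of the weekday strictly after `sent_on`.

    Assumption: a weekday equal to the send date means next week.
    """
    target = WEEKDAYS.index(day_name.lower())
    delta = (target - sent_on.weekday()) % 7
    return sent_on + timedelta(days=delta or 7)


def detect_commitment(message: str, sent_on: date) -> Optional[Reminder]:
    """Return a Reminder if the message contains a dated commitment."""
    match = COMMITMENT_RE.search(message)
    if match is None:
        return None
    task = match.group("task")
    return Reminder(title=task[0].upper() + task[1:],
                    due=next_weekday(sent_on, match.group("day")))
```

A pattern match like this would miss most real phrasing (“will do by end of week”, “let me get that to you”), which is the argument for doing the detection with a model rather than rules; the sketch only shows the input and output of the step.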

That last one matters. Most cloud assistants are designed for the moment when the user is paying attention. Ostler is mostly running when you are not. Kenan’s “the model will start prompting you” line is the right framing; the assistant is supposed to know about your life in enough detail to be the one who initiates the conversation. That requires sustained, granular, multi-source observation. A cloud architecture is poorly placed to deliver it, because customers will not give cloud vendors what they will give a piece of software on their own machine.

“Already starting to see this dynamic with products like OpenClaw, Poke, and more”

OpenClaw and Poke are the two products Kenan names. Both are cloud. Both are good. Both are honest about the paradigm. And both are looking at the same set of customer-shaped problems Ostler is. We have vs-openclaw and vs-poke comparison pages because the question “which observe-to-act product do I use” is the actual question a thoughtful prospective customer asks.

The answer comes down to one architectural choice. Observe-to-act requires the system to see a lot. The amount it has to see grows with how good you want it to be. Either the data stays on your machine, or it doesn’t. There isn’t a third option that’s as good as either of the first two. Ostler picks the first answer. OpenClaw and Poke pick the second. Both can be right answers for different customers; we’ve seen real value flow from both directions. But the customer who is uncomfortable handing their entire digital life to a cloud vendor has, until now, mostly had nothing to choose. We’re what they get to choose.

What this is, and what it isn’t

I’m not writing this to claim Kenan was talking about Ostler. He wasn’t. The video was a wrap-up of a cohort that doesn’t include us; I’m an SR007 applicant whose application is currently under review. The thesis paragraph is interesting precisely because it lands on the product spec independently. When a Tier-1 venture firm starts articulating a thesis, that’s a signal that the customer demand for a particular shape of product is becoming legible to capital, not just to the engineers building it. Convergent independent description is usually the moment a category starts crystallising.

Apple is doing the same thing. The invite for WWDC26, three weeks from now, hints at a Siri overhaul wrapped in a glowing ring; the symbology, the “coming bright up” tagline, and the Swift-the-bird-and-language pun together suggest a coherent push around App Intents, FoundationModels, and Live Activities as a lifestyle-assistant pillar. The convergence between a16z and Apple at this moment, on this shape of product, isn’t a coincidence. It’s the same demand reaching the same surface from two different directions.

What we’re going to do about it

Ship. Two days to v1.0 launch. The Hub installer is signed, notarised, stapled, and live. The iOS app enters TestFlight this week. The website is live. The pricing is $99 one-time for the Hub plus an Ostler Pro subscription at $9.99/month (first 30 days free) for ongoing memory capture and the new Ostler features we ship.

If you’re reading this and the “observe-to-act” framing matches what you want from a digital assistant, the question for you is no longer whether the category exists. It does. The question is which side of the cloud-or-local choice you want for the most observed device of your life. We made our choice. It’s here.

Thoughts, pushback, or applications to work with us – [email protected].

Source: Kenan Saleh, Investing Partner at a16z Speedrun, in the SR007 wrap-up video published at speedrun.substack.com/p/sr007-apps-wrapped. The transcript above is my (Andy’s) manual transcription from the video; any errors are mine.