Mastermind · May 24, 2026
Buckets for Rainwater

Yesterday I wrote about why data is the moat. Tom Blomfield wants mics in every room. The point was simple: the conversation is the data, and the data is the company.
The next question is who is actually doing this well, and what shape it takes.
The pattern is the same one the hyperscalers figured out a decade ago. Google did not win search by being smarter than Yahoo. Google won by building a bigger bucket. The crawler runs forever, the index gets fatter, and every query trains the next ranking pass. Amazon did not win e-commerce by selling more books. Amazon built S3, then EC2, then the whole catalogue of buckets that everyone else's product now sits inside. The product is the bucket. Not the spout.
So the way I think about the next layer of software is the same. Forget features. Ask what bucket you are building, and what rainwater it catches.
The rainwater right now is human intelligence. Spoken thought, written thought, half-formed ideas, decisions made in meetings, voice memos, drafts, the WhatsApp note you sent yourself at 2am. There is more of it being generated per day than at any point in history, and almost none of it lands anywhere permanent. It evaporates. It scrolls off. It gets transcribed once and then trapped inside a Slack channel nobody reads.
But here is the part most people miss. Not all rainwater is equal. Where you place the bucket decides what kind of water you catch.
Put a bucket under a storm drain and you collect the runoff. Trash, oil, the same water everyone else's drain is already catching. You have data, technically. You have nothing that everyone else does not also have. Train a model on it and you get the same model the rest of the web already has.
Put a bucket on the mountaintop, under the melting snowpack, and you catch something else entirely. Cold, clear, asymmetric. That is the water the city pays for. The bucket has not changed. The placement has.
This is the real game. Asymmetric data is data that, by virtue of where you sit in the flow, only you can collect. Public web text is the storm drain. Everyone is downstream of it. The transcript of a private meeting you facilitated, the agent traces from a hackathon you ran, the complete prompt history of a workflow that only your product sits inside: that is snowmelt. There is one bucket in position. Yours.
Once you have that flow, the mining is where the compounding starts. You take the raw stream, you build a model on top of it, you generate insights nobody else can generate because nobody else has the underlying material. That model gets used. The usage produces more data. The data sharpens the model. The model attracts more usage. The bucket fills faster every week because it is the only one in that exact spot under that exact glacier. The moat is not the model. The moat is that you are upstream.
The companies that are quietly becoming unicorns right now understood this early.
Notion was the first wave. The bucket was the doc, the page, the database — a place to put the kind of half-structured thought work that used to live across twenty different SaaS tools. Granola is the same playbook for meetings. You take a call, Granola is silently catching every word, and the next morning your transcripts plus your notes are sitting in one place. Linear is doing it for product work. Cursor is doing it for code conversations. Each of them looks like a tool. Each of them is a bucket placed somewhere very specific in the flow of work, where the water that runs through is fresh and theirs alone.
The most interesting wave right now is voice to text. Superwhisper, Wispr Flow, the rest of the dictation pack — they all look like a productivity feature on the surface. Press a key, talk, get text. That framing misses what is actually happening. They are racing to be the default catchment for spoken thought. WeChat just shipped its own version. Apple is rumoured to be folding it into the OS. The reason every well-capitalised player is moving on this at the same time is that whoever owns spoken-thought capture owns the upstream of an enormous amount of soon-to-be-valuable training data — not the commodity web text everyone already has, but the new stream, fresh at the source.
Talk to most of these companies about their roadmap and you will hear about new languages or lower latency. That is the feature view. The bucket view is different. Their roadmap is really about whose mountainside the bucket ends up on. Once the pipe is theirs, the surface area for product is enormous. Coaching. Summaries. Memory. Calendar inference. Whatever model you run later, you can run on a corpus no one else has.
This is also why I do not lose sleep over wrappers being commoditised. The model layer commoditises. The bucket does not, as long as the bucket is in a place nobody else can stand. Anthropic and OpenAI compete on the same underlying physics. Notion and Granola and Wispr Flow do not, because each one is collecting a stream of data nobody else has access to, and the model is just the consumer of that stream. The wrapper is fine. The wrapper sitting under the same storm drain as everyone else is the one in trouble.
The frame I keep coming back to as a builder is brutally simple. Pick the rainwater you want to catch. Pick the spot where it is freshest. Build the bucket so well that the rainwater wants to go there. Then mine it forever.
For us, the rainwater is the hackathon build process. The prompts, the false starts, the agent traces, the conversations between teammates, the judges' questions at the demo. That stream is upstream of almost every interesting question about how builders actually build with AI in 2026, and we are one of very few teams with the bucket in position. We have been treating it like exhaust instead of like product. That stops this quarter. Every event we run now has to answer one question: did the bucket get deeper and the water stay clean, or did the rainwater just run off the roof again?
The hyperscalers showed everyone the move. Build the bucket in the right place. Let the rain do the rest.