From Hoarding to Harnessing: Why Data Moats Are Dead (and Orchestration Moats Matter)
For years, VCs, consultants, and healthtech founders have treated “we have the data” as if it were a magical incantation to keep competitors at bay. And to be fair, I was guilty of buying into this myself.
The cracks started to show after I read my buddy Ernests Štāls’ essay “Building Moats In The Age Of AI” earlier this year. That was the first nudge. But over the past few months, I’ve been testing my own assumptions, replaying deal conversations in healthtech, and wondering why something felt… off. The pitches kept saying “we own the data,” but in practice, owning data wasn’t translating into defensibility or adoption.
Then came the tipping point. Less than two weeks ago, Emad Mostaque released The Last Economy, and it smacked me right between the eyes. His framing of “cognitive abundance,” with intelligence as the first post-scarcity resource, hit the nerve I’d been circling for months. If intelligence itself is abundant and effectively free, then what on earth is the point of hoarding static data?
That was the lightbulb moment: it’s not about the moat of having (data), it’s about the moat of doing (orchestrating). The real defensibility in the age of AI isn’t who owns the biggest dusty warehouse of medical records, but who can orchestrate intelligence — connecting models, hypotheses, and workflows into the messy, regulated, deeply human real world.
Now, I should probably insert a disclaimer here before the healthtech crowd starts sharpening pitchforks: I can already hear the murmurs of “but data is everything in healthcare” echoing across conference panels. Yes, it was. But not anymore. And sure, I’ll get stick for saying this — but here’s the uncomfortable truth: in the age of cognitive abundance, data moats are illusions. Orchestration moats are where the real edge lies.
The Illusion of Data Moats
For the better part of the last decade, “data moats” were the gold standard for defensibility. Investors drooled over startups that had exclusive access to proprietary data — EHRs, lab reports, behavioural signals, financial logs, you name it. The logic was simple: whoever owns the most data trains the best models, and whoever trains the best models wins.
Except… that’s no longer true.
Models Have Outpaced the Moats
We now live in a world where foundation models exist. Massive, general-purpose models trained on internet-scale data — text, images, code, biology, you name it — are the new infrastructure layer.
They’re not perfect. But they’re absurdly competent out of the box. And they’ve changed the calculus.
The truth is: in many domains, including healthcare, the marginal value of your proprietary dataset is collapsing. You can fine-tune a base model with a few hundred well-curated examples and get within striking distance of a solution that would’ve taken years and millions to build from scratch.
So that exclusive partnership for 40,000 unlabeled, partially duplicated radiology scans from 2018? Yeah, it’s not the moat you think it is.
Synthetic Data Broke the Scarcity Game
Once dismissed as glorified guesswork, synthetic data is now a serious tool.
Teams can generate high-quality, domain-specific data on demand:
Diffusion models simulating rare pathologies,
RL agents playing out endless patient journeys,
LLMs constructing plausible edge-case notes for underrepresented demographics.
In short: you’re not constrained by what’s been collected anymore. You’re constrained by your imagination and your orchestration layer.
And before anyone yells “but synthetic data isn’t real,” I agree — it’s not real.
It’s better.
It’s controlled, diverse, clean, and instantly scalable. Try getting that from your local hospital’s IT department.
Owning Data ≠ Using Data
Just because you own data doesn’t mean you’re doing anything useful with it.
Most real-world data is a hot mess:
Locked behind regulations,
Full of noise, gaps, and outdated taxonomies,
Biased as hell,
And rarely structured for actual downstream use.
Startups brag about their dataset size like it’s a sign of success. But most of the time, it’s just a liability waiting to happen — ethically, financially, or both.
Hoarding petabytes of messy data without a feedback loop is like owning a library where no one reads. In my view, it’s not a moat. It’s a mausoleum.
The New Edge: Better Hypotheses, Not Bigger Warehouses
We’ve shifted from a world of data scarcity to a world of cognitive orchestration.
The winners aren’t those who have the most data. They’re the ones who know what questions to ask, where to embed the answers, and how to learn faster than anyone else.
You don’t need 10 million data points. You need 10 good ones, embedded into the right moment of care, triggering the right next step, and closing the loop.
This is the Orchestration Moat.
It’s not built on data. It’s built on:
Judgment – Knowing what actually matters.
Context – Embedding insight where decisions happen.
Reflexivity – Learning and evolving in real time, from real-world use.
That’s how startups compound insight. Not by hoarding, but by harmonizing.
Orchestration Moats in the Wild
In healthtech, I’ve seen this firsthand.
Some founders pitch access to millions of records. Yet when you ask how this actually improves care, reduces costs, or makes clinicians’ lives easier… the answer is usually hand-wavy at best.
Then there are the quiet operators.
The ones who embed AI into a clinician’s workflow so well that the doctor barely notices it’s there — yet starts making faster, better, more confident decisions. The ones who close the loop between patient action and clinical insight. The ones who get better not because they have more data — but because they’ve built systems that learn from reality.
These are the orchestration moat builders. And they’re going to be very hard to copy.
TL;DR — Your Dataset Is Not a Moat Anymore
I’m not saying data doesn’t matter. It absolutely does. But it’s no longer the final boss. It’s just the starting material.
The real edge now comes from how well you orchestrate that data:
Into workflows.
Into insights.
Into real-world outcomes.
Those who master cognitive compounding — not just in AI, but in product, design, feedback, and delivery — are the ones who build defensibility that actually lasts.
And yes, I know… this is going to ruffle some feathers. Somewhere out there, a healthtech founder is already angrily highlighting this essay to send to their advisor under the subject line “THIS IS EXACTLY WHY WE NEED OUR OWN DATA.”
But for those who’ve felt the real bottleneck, who’ve seen the endless decks about exclusive datasets go nowhere fast, who’ve tried — and failed — to integrate AI into actual care delivery? You know this is true. The moat isn’t the vault.
It’s the velocity of meaningful insight.
It’s the feedback loop with the real world.
It’s the cognitive choreography across systems.
Or, in simpler terms:
It’s not about who hoards the most data.
It’s about who makes it sing.
S.