AGI Won’t Fix Your Dirty Data. Neither Did AI, GenAI, or Agentic AI. Here’s Why.

Posted on December 3, 2025

Stephany Lapierre recently posted that AGI may be 2-5 years away — and that enterprises aren’t ready because their data foundations are a mess.

She’s right about the mess. But she’s solving the wrong problem.

Her argument: AGI will need clean, structured, continuously updated supplier data. Enterprises don’t have it. Therefore, enterprises need platforms that provide persistent legal entity resolution, standardized supplier schemas, verified external data sources, and hierarchical relationship mapping.

That’s TealBook’s business model. And it’s not wrong — it’s just incomplete.

Here’s the question nobody’s asking:

Why is data still dirty in 2025?

I’ve been writing about data quality since 1998. The Procurement Insights archives go back to 2007, and I was raising this issue from the start:

  • In 2008, I wrote about solving the parent-child SKU nightmare for Canada’s Department of National Defence, not with better technology but by asking, “What time of day do your orders come in?”
  • In 2009, I distinguished between “spend analysis” and “spend intelligence” — arguing that the problem wasn’t the tools, but the understanding underneath them.

Twenty-seven years later, we’re still having the same conversation. Only now it’s dressed up in AGI language.

The Technology Progression:

  • AI (rule-based, expert systems)
  • Generative AI (content creation, pattern recognition)
  • Agentic AI (autonomous task execution)
  • AGI (human-level general reasoning)

Each wave arrives with the same implicit promise: “This one will finally fix your data.”

Each wave fails for the same reason: dirty data is not a technology problem.

The DND Proof Point

When I solved the parent-child SKU compression problem for the Department of National Defence, I didn’t need advanced technology. I needed to ask:

  • What time of day do orders come in?
  • Who is placing them?
  • What behavioral patterns create duplicates?
  • How do field technicians actually work?

That’s agent-based thinking. That’s strand commonality. That’s understanding the Metaprise — the ecosystem of people, processes, and behaviors that generate data in the first place.

The technology came after I understood the human and process physics that created the mess.
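
To make those questions concrete, here is a minimal sketch of what asking them of an order log might look like. It is not the method used at DND. The records, field names, and the rule for spotting a duplicate (same normalized description, different SKU) are all assumptions invented for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical order-entry log: (timestamp, entered_by, description, sku).
# Every record and field name here is invented for illustration.
ORDERS = [
    ("2008-03-03 08:15", "tech_a", "Hydraulic pump seal", "SKU-1001"),
    ("2008-03-03 16:40", "tech_b", "hydraulic pump seal ", "SKU-2417"),
    ("2008-03-04 16:55", "tech_b", "HYDRAULIC PUMP SEAL", "SKU-2533"),
    ("2008-03-04 09:05", "tech_a", "Brake line fitting", "SKU-1002"),
    ("2008-03-05 16:30", "tech_c", "brake line fitting", "SKU-2610"),
]


def normalize(description):
    """Lowercase and collapse whitespace so trivially different entries match."""
    return " ".join(description.lower().split())


def profile_duplicates(orders):
    """Count duplicate entries by hour of day and by the person entering them.

    Assumed rule: a record is a duplicate when its normalized description
    was already seen earlier under a different SKU.
    """
    first_sku = {}
    by_hour = Counter()
    by_user = Counter()
    # Timestamps are in a sortable format, so sorting gives chronological order.
    for stamp, user, description, sku in sorted(orders):
        key = normalize(description)
        if key not in first_sku:
            first_sku[key] = sku
        elif sku != first_sku[key]:
            hour = datetime.strptime(stamp, "%Y-%m-%d %H:%M").hour
            by_hour[hour] += 1
            by_user[user] += 1
    return by_hour, by_user


by_hour, by_user = profile_duplicates(ORDERS)
print("Duplicates by hour of day:", dict(by_hour))
print("Duplicates by person:", dict(by_user))
```

In this invented sample, every duplicate lands in the late-afternoon hour. A pattern like that would be a prompt to go and ask the people placing those orders what is happening at that time of day, which is a question about behavior rather than about tooling.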

Mopping the Floor While the Faucet Runs

Every vendor — including TealBook — sells a platform to clean data without addressing why it gets dirty in the first place.

Stephany’s post says AGI will need:

  • Persistent legal entity resolution
  • Standardized supplier schemas
  • Verified external data sources
  • Hierarchical relationships and lineage
  • Provenance tracking

All true. All necessary. All insufficient.

Because none of those address:

  • Why do enterprises have multiple conflicting vendor masters?
  • What process created the inconsistency?
  • Which agents (people) are entering data, when, and under what pressures?
  • What behavioral incentives reward speed over accuracy?

You can standardize schemas all day. If the same fragmented processes and misaligned behaviors persist, the data will be dirty again tomorrow.

The Hansen Method Answer

Dirty data is a symptom — of process fragmentation, agent misalignment, and governance absence.

Phase Zero asks the questions that no platform can answer:

  • Who touches this data, and why?
  • What pressures drive the behaviors that create inconsistency?
  • Where do processes fragment across departments, systems, and geographies?
  • What governance structures exist — or don’t — to enforce quality?

Until you answer those questions, every technology layer is a temporary patch on a permanent problem.

The 27-Year Pattern

The fact that we’re still having this conversation in 2025 — now with AGI as the promised savior — proves the industry still hasn’t learned.

AI didn’t fix dirty data. Generative AI didn’t fix dirty data. Agentic AI won’t fix dirty data. AGI won’t fix dirty data.

Because dirty data was never a technology problem.

It’s a people problem. A process problem. A readiness problem.

And until the industry stops skipping Phase Zero to chase the next technology wave, the mess will remain — no matter how intelligent the machines become.

-30-

BONUS SECTION

I’ve been exploring this question with leading academics and practitioners for years. The answer has always been the same — technology amplifies whatever foundation exists. It doesn’t create one.

Dr. Rob Handfield, Bank of America University Distinguished Professor of Supply Chain Management at NC State, has been tracking this problem through annual data governance surveys with IBM. His findings are consistent — and challenging for the technology-first thesis:

On what creates clean data: Handfield has consistently argued that clean data is the result of having a proper governance model in place — the discipline of information input at the point of sourcing, SKU creation, and supplier database management.

On the reality: He’s noted that clean data remains a mirage for most organizations because, even with reasonably good governance built around a stable taxonomy, the capture, interpretation, and entry of data at the point of origin vary.

On progress (or lack thereof): In a 2022 podcast, Handfield observed that data quality in most organizations remains poor, with managers spending hours each day just finding the data they need, then cleaning it in Excel before presenting it. His 2020 predictions noted that poor data quality remains one of the biggest limits to digital transformation, and that the organization and categorization of data do not appear to be advancing.

A notable silence:

Handfield has explicitly called for AI-based approaches to help clean and standardize organizational data sets. TealBook positions itself as exactly this kind of solution. Stephany Lapierre has referenced Handfield’s research, they’ve served on the same advisory board (IADQGA), and she has engaged him directly on LinkedIn.

Yet I’ve not been able to find any public endorsement from Handfield of TealBook — or of a technology-first approach to solving the dirty data problem — to date. Given the alignment between what he called for and what TealBook claims to offer, the absence of endorsement is, at minimum, a data point worth noticing.

The foundation was never technology. It was always governance, process, and people.
