The Difference Between Longitudinal Depth and Dataset Size in Predicting Successful Initiative Outcomes

Posted on March 5, 2026



Procurement Insights — Jon Hansen | March 2026


A conversation unfolded on LinkedIn this week that gets to the heart of a question the procurement technology industry has avoided for decades: Does having more data actually help you predict whether an implementation will succeed?

It started when Canda Rozier, a Senior Advisory Board member and procurement transformation leader, responded to a post by Iain Campbell McKenna — a procurement podcaster and SRM enthusiast — with a simple but important observation:

“Readiness — organizational, governance, adoption capacity, process — has to be a prerequisite to transformation.”

That comment opened a thread that became one of the most substantive public exchanges I’ve participated in on LinkedIn. What followed wasn’t argument for its own sake. It was a genuine, respectful debate about what kind of evidence actually predicts implementation outcomes — and who should be providing it.


The Challenge

Iain raised a thoughtful point. He acknowledged the value of the Hansen Fit Score™ and organizational readiness, but pushed back on what he characterized as the framework’s reliance on “niche case studies” like the 1998 Department of National Defence engagement and the Procurement Insights archive. His argument: traditional frameworks from Gartner and McKinsey rely on massive global datasets that validate scalable, tech-centric outcomes — and that scale of validation matters when making the case across a global enterprise.

It’s a fair question. And it deserves a direct answer.


Projected Outcomes vs. Actual Outcomes

The distinction I drew in the thread is this: those globally validated benchmarks measure projected outcomes from technology capability. The 70–80% failure rate that has persisted across seven technology eras measures actual outcomes from implementation.

The breadth of a dataset isn’t the issue. It’s what the dataset measures.

Gartner’s ITScore and McKinsey’s analytics are excellent at answering the question: “What can this technology do?” They don’t answer: “Can this organization absorb it?”

That’s the gap the Hansen Fit Score™ was built to close — not to replace those frameworks, but to complete the equation they leave open.


Martec’s Law Enters the Conversation

Marc Smith, a Managing Consultant at Tata Consultancy Services, connected the thread to a broader structural principle. He asked whether this comes back to Martec’s Law — the observation that technology evolves exponentially while organizations change logarithmically — and whether that gap is what drives the transformation failures we observe.

The answer is yes. Martec’s Law describes the gap. The Hansen Fit Score™ measures it. That’s the missing step: most organizations acknowledge the gap exists but have no instrument to quantify it before the implementation begins. The 70–80% failure rate is what happens when that gap is felt but not measured.
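To make the shape of that gap concrete, here is a minimal Python sketch. The growth rates are invented for illustration; Martec's Law specifies the shapes of the two curves (exponential versus logarithmic), not their parameters, and nothing below is drawn from the Hansen Fit Score™ methodology.

```python
import math

# Illustrative curves for Martec's Law. Both rates are assumptions
# chosen only to show the shape of the gap, not measured values.
TECH_GROWTH_RATE = 0.25   # hypothetical compounding rate per year
ORG_CHANGE_SCALE = 0.8    # hypothetical logarithmic scaling factor

def tech_capability(years: float) -> float:
    """Exponential: technology capability compounds year over year."""
    return math.exp(TECH_GROWTH_RATE * years)

def org_capacity(years: float) -> float:
    """Logarithmic: organizational absorption grows ever more slowly."""
    return 1.0 + ORG_CHANGE_SCALE * math.log1p(years)

for year in (1, 5, 10, 20):
    gap = tech_capability(year) - org_capacity(year)
    print(f"year {year:>2}: tech={tech_capability(year):7.2f}  "
          f"org={org_capacity(year):5.2f}  gap={gap:7.2f}")
```

Whatever parameters you choose, the arithmetic tells the same story: the distance between what the technology can do and what the organization can absorb widens every year, which is exactly why it has to be measured before the implementation rather than discovered during it.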


The “Niche” Question

Iain pressed the point further, respectfully reiterating that the Hansen model “still relies heavily on niche case studies” and will need the kind of massive, broad data validation that Gartner or McKinsey provides to make the case universally across global enterprises.

This is where the conversation reached its most important distinction — and where the word “niche” needed to be challenged directly.

The DND and Virginia eVA examples are the most visible proof points. But the Hansen Fit Score™ isn’t derived from a handful of case studies.

It draws on a living, continuously updated archive of over 3,300 published documents spanning nearly two decades — tracking vendor behavior, implementation outcomes, and organizational readiness patterns across every major ProcureTech platform in real time. That archive isn’t a periodic snapshot. It evolves as the market does. And it’s stress-tested through a multimodel AI validation framework — RAM 2025™ — that challenges every score against that living evidence base.

The case studies anchor the methodology. The archive is the dataset. And it never stops recording.

For context: a single white paper from the 2008 archive — the SAP Procurement for Public Sector study — contains over 20 cross-referenced case studies spanning both public and private sector implementations, including Hershey, FoxMeyer Drug, Hewlett-Packard, Cadbury, Whirlpool, Dow Chemical, Boeing, Dell, Waste Management, King County, Washington, City of Houston, Arapahoe County, Seattle Public Schools, Erie County, Virginia eVA, Canada Post, and the London Borough of Waltham Forest. That paper is one document out of more than 3,300 in the living archive.

That is not niche. That is the deepest continuous record of implementation outcomes in the procurement technology space.
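For readers who want a sense of how multi-model stress-testing can work in principle, here is a hypothetical sketch. This is not the RAM 2025™ methodology; the model names, the stand-in scoring function, and the agreement threshold are all invented for illustration.

```python
from statistics import mean

def score_with_model(model: str, evidence: list[str]) -> float:
    """Stand-in for a real model call; returns a score from 0 to 100."""
    # Hard-coded results so the sketch runs without any AI service.
    return {"model_a": 72.0, "model_b": 68.0, "model_c": 75.0}[model]

def validate(evidence: list[str], models: list[str],
             max_spread: float = 10.0) -> tuple[float, bool]:
    """Score with every model and flag the result if they disagree widely."""
    scores = [score_with_model(m, evidence) for m in models]
    spread = max(scores) - min(scores)
    return mean(scores), spread <= max_spread

consensus, agreed = validate(["doc_0001", "doc_3300"],
                             ["model_a", "model_b", "model_c"])
print(f"consensus {consensus:.1f}, models agree: {agreed}")
```

The design point is simple: a score that survives independent challenge from multiple models against the same evidence base carries more weight than a score produced once.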


The Real Distinction: Longitudinal Depth vs. Dataset Size

This conversation clarified something that the industry conflates routinely: dataset size and predictive depth are not the same thing.

A large dataset that measures technology capability across 10,000 organizations tells you what the technology can do. It does not tell you whether any specific organization is ready to absorb it. The projected ROI figures — the 13x returns, the 20% profit margins — are derived from capability measurement. They describe what should happen under ideal conditions.

The 70–80% failure rate describes what actually happens under real conditions.

Longitudinal depth — tracking the same vendors, the same implementation patterns, the same organizational behaviors across multiple technology eras over nearly two decades — tells you something a large cross-sectional dataset cannot: why the failure rate persists despite continuous improvements in technology capability.

The answer is not technology. It has never been technology. The answer is organizational readiness — governance, data maturity, behavioral alignment, decision rights, change absorption capacity — measured before the implementation begins.

Gartner’s datasets are broad. The Procurement Insights archive is deep. Broad tells you what exists. Deep tells you what works — and what doesn’t, and why.

Ideally, the two should complement each other rather than compete.
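One way to see the broad-versus-deep distinction is with a toy dataset. Everything below is invented for illustration; these are not records from the Procurement Insights archive, and the vendor names are placeholders.

```python
from statistics import mean

# Hypothetical implementation records: (vendor, year, capability, succeeded).
records = [
    ("VendorA", 2008, 0.90, False), ("VendorA", 2015, 0.94, False),
    ("VendorA", 2024, 0.97, True),
    ("VendorB", 2008, 0.85, True),  ("VendorB", 2015, 0.91, True),
    ("VendorB", 2024, 0.96, False),
    ("VendorC", 2008, 0.80, False), ("VendorC", 2015, 0.88, False),
    ("VendorC", 2024, 0.95, False),
]

# Cross-sectional (broad): one snapshot, one dimension -- capability.
latest = [r for r in records if r[1] == 2024]
print(f"2024 mean capability: {mean(r[2] for r in latest):.2f}")

# Longitudinal (deep): the same vendors tracked across eras, measured
# on the dimension that determines the outcome -- did it succeed?
for year in (2008, 2015, 2024):
    cohort = [r for r in records if r[1] == year]
    failure_rate = sum(1 for r in cohort if not r[3]) / len(cohort)
    print(f"{year}: capability {mean(r[2] for r in cohort):.2f}, "
          f"failure rate {failure_rate:.0%}")
```

The snapshot says the technology keeps improving. Only the longitudinal view shows that the failure rate barely moves while capability climbs, which is the pattern the persistent 70–80% figure describes.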


The Challenger Becomes the Validator

Iain’s final reply in the thread said it better than I could:

“The DND and Virginia eVA case studies are so visible that it’s easy for academic reviewers to write them off as edge cases. But when you consider that the score is being actively stress-tested against a 3,300-document living archive validated across 12 AI models (RAM 2025), that’s not a niche methodology anymore — that’s serious empirical weight.”

He continued:

“You need establishment benchmarks like Gartner to confirm that the technology itself is sound and secure; that’s your foundation. But you also need practitioner-focused models like the Hansen Fit Score to answer the harder question: are the people and processes actually ready for it? Measuring that gap is one thing. But when you combine both lenses and use targeted, human-in-the-loop pilots to close it, that’s when companies finally start breaking through the 70–80% failure rate.”

That’s the answer. Not from me — from the person who challenged the framework in the first place.


What This Means for Practitioners

If you are evaluating a ProcureTech platform, an ERP reimplementation, or an AI deployment, two questions matter:

The first is the question the traditional analyst model answers well: What can this technology do? Gartner, McKinsey, Forrester, and IDC provide genuine value here. Their datasets are broad, their methodologies are established, and their technology capability assessments are credible.

The second is the question those models leave open: Can your organization absorb what it’s about to buy — and how do you know?

That question requires a different kind of evidence. Not a larger dataset measuring the same dimension. A different dataset measuring the dimension that determines the outcome.
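To make the second question concrete, here is a purely hypothetical readiness screen. The five dimension names are taken from this article; the weights, scale, and threshold are invented, and this is not the Hansen Fit Score™ methodology.

```python
# Hypothetical weighted readiness screen. Dimension names echo the
# article; weights and the 0.7 threshold are invented for illustration.
READINESS_WEIGHTS = {
    "governance": 0.25,
    "data_maturity": 0.20,
    "behavioral_alignment": 0.20,
    "decision_rights": 0.15,
    "change_absorption_capacity": 0.20,
}

def readiness_score(ratings: dict[str, float]) -> float:
    """Weighted score on a 0-1 scale from per-dimension ratings (0-1)."""
    return sum(w * ratings[dim] for dim, w in READINESS_WEIGHTS.items())

# Example organization: strong governance, weak change absorption.
org = {
    "governance": 0.8,
    "data_maturity": 0.6,
    "behavioral_alignment": 0.5,
    "decision_rights": 0.7,
    "change_absorption_capacity": 0.3,
}

score = readiness_score(org)
print(f"readiness: {score:.2f}")   # about 0.58 for these ratings
if score < 0.7:                    # invented threshold
    print("readiness gap detected: address it before signing")
```

The mechanics matter less than the sequence: the measurement happens before the purchase decision, not after the go-live.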

The Hansen Fit Score™ was built to answer the second question. Phase 0™ is the engagement that closes the gap before the contract is signed. And the living archive — 3,300+ documents, nearly two decades, continuously updated — is the evidence base that makes both the score and the diagnostic possible.

The talent was never the problem. The technology was never the problem. The missing measurement was the problem.

Now it exists.


With thanks to Iain Campbell McKenna for the thoughtful challenge that made this conversation possible — and for the intellectual honesty to update his assessment when the evidence warranted it. That’s exactly how rigorous discourse should work. Thanks also to Marc Smith at TCS for the Martec’s Law connection, Canda Rozier for anchoring the conversation in readiness, and Grant Oliff for the parallel discussion on where judgment lives in the system.

The Hansen Fit Score™ framework, Phase 0™ readiness diagnostic, and RAM 2025™ validation methodology are available through Procurement Insights. No vendor sponsorship. No referral fees. No paid placements.


-30-

Posted in: Commentary