ProcureTech Providers and Analysts Are Recommending the Wrong AI Model. Here Is What Procurement Practitioners Are Not Being Told.

Posted on March 27, 2026



Technology is built to be right. RAM was built to get it right.

That single distinction explains why the ProcureTech implementation failure rate has not moved across seven consecutive technology eras — and why it will not move until the industry stops confusing confident outputs with correct ones.

Every major ProcureTech provider is now embedding AI into its platform. Every major analyst firm is evaluating those platforms and producing assessments of their AI capability. And almost without exception, what they are selling, and what they are assessing, operates under the same single-answer, closed-frame optimization logic found in Mixture of Experts and similar architectures: a system designed to produce the most confident, coherent answer from within its own data environment. That is precisely where the problem begins.


What a closed-frame AI model actually does

A Mixture of Experts is a single-model architecture in which different “expert” sub-networks activate for different inputs via a routing mechanism. The experts are trained together, optimized toward the same objective, and their outputs are combined into one unified response. Agentic layers built on top simply orchestrate tasks — they do not introduce genuine independence.

There is no disagreement between experts. There is no mechanism that challenges the system’s own epistemological foundations. There is no adversarial layer that asks whether the confident answer the platform is producing is the right answer for this specific organization operating under these specific conditions.
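
To make that concrete, here is a minimal, hypothetical Python sketch of the routing-and-combination logic described above. The expert functions, signal values, and gating are invented for illustration; the point is structural. Notice what is absent: no step compares the experts against each other, and no step questions the frame they all share.

```python
import math

# Hypothetical "experts": sub-networks trained together toward the same objective.
# Each returns a confident answer for the same input; none was trained to dissent.
def expert_cost_focus(x):      return 0.72 * x["spend_signal"] + 0.10
def expert_risk_focus(x):      return 0.65 * x["risk_signal"] + 0.20
def expert_logistics_focus(x): return 0.80 * x["lead_time_signal"]

EXPERTS = [expert_cost_focus, expert_risk_focus, expert_logistics_focus]

def gate(x):
    """Routing mechanism: scores each expert for this input, then softmax-normalizes."""
    raw = [x["spend_signal"], x["risk_signal"], x["lead_time_signal"]]  # illustrative gating scores
    z = [math.exp(r) for r in raw]
    return [v / sum(z) for v in z]

def moe_answer(x):
    """One unified response: a weighted blend of expert outputs.

    There is no step that surfaces disagreement between experts,
    and no step that asks whether the shared frame fits this organization.
    """
    weights = gate(x)
    return sum(w * e(x) for w, e in zip(weights, EXPERTS))

signal = {"spend_signal": 0.9, "risk_signal": 0.3, "lead_time_signal": 0.6}
print(f"Single confident output: {moe_answer(signal):.3f}")
```

Every run ends the same way: one blended, confident number.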

For a practitioner, that means one thing: the system is architected to optimize its own answers, not to question them.


How this architecture powers the platforms procurement practitioners use every day

ProcureTech leaders including SAP Ariba (with its generative AI-powered Category Management and Joule co-pilot), Coupa (positioning itself as an agentic procurement orchestration platform), GEP, JAGGAER, and others are embedding AI that relies on modern foundation models and agentic systems — many of which leverage closed-frame optimization architectures internally for efficient scaling. These systems route tasks through specialized sub-networks to deliver fast, confident outputs: risk scores, contract clauses, forecasts, and workflow decisions.

The architecture is commercially optimal — it demos well, scales efficiently, and aligns with vendor incentives to show capability. What it does not do is challenge its own frame or diagnose whether the receiving organization has the governance architecture, pre-authorized decision authority, and process integrity to convert those signals into sustained outcomes under compounding stress.

This surfaces in four scenarios practitioners encounter every day.

Supplier risk assessment. The platform scores a supplier as moderate risk based on financials, news sentiment, and geographic exposure. The practitioner acts on that score. What the platform cannot tell them is whether their own governance architecture — approval thresholds, alternative supplier qualification, emergency sourcing authority — is capable of acting on that signal inside the 30-day window that separates a decision from a consequence. The signal is accurate. The organization cannot convert it into action. The platform has no mechanism to surface that gap.

Contract negotiation support. The platform recommends benchmark pricing and clause language. The practitioner secures a strong contract on paper. What the platform cannot tell them is whether the vendor’s historical behavioral arc — across independent, longitudinal, unconflicted evidence — supports actual performance under real-world stress. The platform is optimized toward a confident recommendation. It is not designed to tell the practitioner that the vendor’s track record does not support the performance the contract assumes.

Demand forecasting and inventory decisions. The platform produces a high-confidence forecast. The practitioner commits inventory accordingly. What the platform cannot tell them is that the forecast was built on a closed frame that assumed stable conditions — exactly the assumption that collapsed with the 2026 Hormuz disruption, and will collapse again with the next corridor closure, regulatory shock, or supplier insolvency.

Technology selection and implementation. The platform — or the analyst report evaluating it — produces a capability assessment. The practitioner builds the business case. What neither tells them is whether the organization has the process integrity and decision authority to sustain the outcome the technology promises. The implementation proceeds. The failure arrives. The practitioner is left holding a well-documented capability assessment that said nothing about the organizational readiness gap that determined the outcome from day one.

In each case the practitioner did not make a bad decision. They made a well-supported decision using the best available tool. The tool was optimized to produce confident answers from within its own frame. The frame did not include the question that determined success or failure. It is not that the answers are wrong. It is that the system is not designed to tell them what it does not know — and in procurement, what the platform does not know is almost always the thing that determines the outcome.

If you are currently inside one of these four scenarios, there is a 30-day window where what happens next determines the outcome. Most organizations do not know where they sit until it has already been decided for them.


What the analyst community adds — and what it misses

When analyst firms evaluate AI-enabled ProcureTech platforms, they assess capability inside a framework funded and shaped by the vendors they review. The result is a reinforcing loop: the platform produces confident answers; the analyst produces a confident assessment of those answers; the practitioner receives no independent signal about where platform capability ends and organizational readiness begins.

The analyst layer does not introduce independence. It reinforces the same closed-frame optimization logic from a different vantage point. Its reports measure capability; they do not measure whether the receiving organization has the governance integrity to absorb and act on what the platform recommends. That is the fiduciary gap no vendor-funded assessment will ever close, because closing it sometimes requires telling the client to stop.

The Procurement Insights archive has documented this pattern across seven consecutive technology eras. The consistent finding is not that the technology failed. It is that governance architecture, process integrity, and organizational readiness were never independently diagnosed before deployment. The 60-85% implementation failure rate did not change across those seven eras. The technology changed. The failure rate did not.


The technology providers are not going to change. That is precisely why Phase 0™ is essential.

Closed-frame optimization is commercially optimal for technology providers. It produces confident outputs, it demos well, it closes sales. The incentive to introduce adversarial independence — the mechanism that sometimes produces a Stop signal — does not exist within a commercial model that depends on deployment.

Phase 0™ is the only diagnostic with the independence to tell a client to stop. No vendor-funded platform can afford that signal. No analyst firm funded by vendors can produce it. Phase 0™ does — because it was designed to find the boundary between what the technology claims and what the organization can actually execute, before the commitment is made. That diagnostic applies equally whether an organization is evaluating new procurement technology or stress-testing the stack it already has in place.


What this has looked like in practice — the Genomadix proof case

We recently ran a live demonstration of exactly this difference. A RAM 2025™ multi-model assessment of an Ottawa-based pharmacogenomics diagnostics company, already under tariff-driven stress when the 2026 Hormuz disruption hit, deployed six genuinely independent models against the same brief in isolation. The models surfaced five interconnected exposure vectors, produced materially divergent accuracy assessments ranging from 6/10 to 9.5/10, and generated a two-version segmentation strategy: an Operator version calibrated for conviction language and an Executive version calibrated for defensible conditional framing. No single closed-frame platform would have produced that output, because no single closed-frame platform is designed to disagree with itself.

The divergence was not a system error. It was the most valuable output the assessment produced. It revealed exactly where confident claims ended and inference began — and gave the organization a calibrated, three-tier framework for what the archive fully endorses, what it strongly supports, and what requires a diagnostic to confirm.
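
The contrast in logic is simple enough to sketch. The Python below is illustrative only: the per-model scores are invented stand-ins spanning the published 6 to 9.5 range, and the tier thresholds are assumptions, not RAM 2025™ parameters. The point is the treatment of divergence: instead of averaging independent assessments into one confident number, the spread itself is surfaced and used to decide whether a claim can be acted on or needs a diagnostic first.

```python
from statistics import mean, pstdev

# Hypothetical accuracy scores from six independent models run in isolation
# against the same brief (illustrative values only, within the published 6-9.5 range).
independent_scores = {"model_a": 6.0, "model_b": 7.0, "model_c": 8.0,
                      "model_d": 8.5, "model_e": 9.0, "model_f": 9.5}

def triage(scores, endorse_spread=0.5, support_spread=1.0):
    """Surface divergence instead of averaging it away.

    A tight spread suggests the evidence fully supports the claim; a wide spread
    marks the boundary where confidence ends and inference begins, i.e. where a
    further diagnostic is required before acting. Thresholds here are assumed.
    """
    values = list(scores.values())
    spread = pstdev(values)
    if spread <= endorse_spread:
        tier = "fully endorsed"
    elif spread <= support_spread:
        tier = "strongly supported"
    else:
        tier = "requires a diagnostic to confirm"
    return mean(values), spread, tier

avg, spread, tier = triage(independent_scores)
print(f"mean={avg:.2f}, spread={spread:.2f} -> {tier}")
```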


What this connects back to — and why 1998 matters

The principles behind RAM 2025™ are not new. They are grounded in a system that was built, tested, and validated in a live operational environment in 1998.

SR&ED-funded government research produced an agent-based, self-learning procurement system for Canada’s Department of National Defence that improved delivery performance from 51% to 97.3% in three months and sustained that performance for seven years — without adding headcount. Administrative overhead dropped from 23 to 3 FTEs. The system was later sold for $12 million.

What made it work was not the algorithm. It was the alignment between the algorithm, the organization, and the decision authority to act. That alignment rested on three architectural decisions that most current AI platforms still treat as optional (a minimal sketch of the first two appears after the list):

Multi-factor decisioning — supplier performance weighted dynamically across delivery, quality, price, and location simultaneously, not compressed into a single confidence score.

Feedback-driven recalibration — the system compared its outputs against real-world outcomes and adjusted accordingly. It did not produce confident answers and move on. It measured whether its confident answers were correct.

Human judgment retained at the point of action — decision logic was embedded inside actual operating conditions, with human override available at the point where consequences occurred.
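
As a rough illustration of the first two decisions (the factor list, weights, and update rule below are invented, not the 1998 DND system's actual logic), the pattern looks something like this: supplier scores stay multi-factor rather than collapsing into one confidence number, and the weights shift whenever a scored prediction diverges from the observed outcome.

```python
# Illustrative only: invented weights and update rule, not the 1998 system's actual logic.

FACTORS = ["delivery", "quality", "price", "location"]
weights = {"delivery": 0.40, "quality": 0.30, "price": 0.20, "location": 0.10}

def score_supplier(performance, weights):
    """Multi-factor decisioning: each factor stays visible and is weighted separately,
    rather than being compressed into a single opaque confidence score."""
    return {f: performance[f] * weights[f] for f in FACTORS}

def recalibrate(weights, performance, observed_outcome, rate=0.05):
    """Feedback-driven recalibration: compare the predicted score against the
    real-world outcome and nudge weight toward the factors that explain the gap."""
    predicted = sum(score_supplier(performance, weights).values())
    error = observed_outcome - predicted
    adjusted = {f: max(0.01, weights[f] + rate * error * performance[f]) for f in FACTORS}
    total = sum(adjusted.values())
    return {f: w / total for f, w in adjusted.items()}  # renormalize to sum to 1

supplier = {"delivery": 0.51, "quality": 0.90, "price": 0.70, "location": 0.80}
print(score_supplier(supplier, weights))                      # factor-level, human-readable view
weights = recalibrate(weights, supplier, observed_outcome=0.60)
print(weights)                                                # weights adjusted after the outcome
```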

That 1998 system was not RAM 2025™. The important distinction is this: the 1998 system was self-learning operational intelligence, solving the question “how do we improve live decisions under changing conditions?” RAM 2025™ is multimodel epistemic validation, solving a different question: “how do we verify whether a conclusion should be trusted before an organization acts on it?”

1998 answered “how do we decide better in real time?” RAM 2025™ answers “which conclusions can we trust before we act?”

The continuity is real. The layers are different.

And the Hansen Fit Score™ sits at the intersection of both. It is not a theoretical model. It does not predict outcomes. It identifies the conditions under which outcomes have consistently succeeded or failed — grounded in principles first implemented in a live operational environment in 1998, and refined across 18 years of longitudinal observation spanning multiple technology eras, 3,300+ published documents, and zero vendor sponsorships.

The 1998 work proved the principle. The archive proved the recurrence. RAM 2025™ is the formalized diagnostic that applies both.


The one-line truth

Technology is built to be right. RAM was built to get it right.

The models changed. The failure pattern did not — because the diagnostic layer never arrived.

That principle was embedded in the 1998 system. It is embedded in RAM 2025™. And between the 1998 system and the archive, there are 28 years of evidence showing what happens when it is absent.


The Procurement Insights archive contains 3,300+ published documents spanning 18 years of independently produced, timestamped procurement and supply chain research — zero vendor sponsorships, zero paid analyst relationships. RAM 2025™ is the multimodel validation framework that cross-validates all major Hansen Models™ assessments before publication. Phase 0™ is the organizational readiness diagnostic that precedes all technology and supply chain commitments. The Hansen Fit Score™ identifies the conditions under which procurement technology outcomes have consistently succeeded or failed across 18 years and seven technology eras.


Ready to close your Authority Gap?

Whether you are evaluating a new platform or trying to get more from the one you already have, the same question determines the outcome: can your organization execute what the system is producing?

Book a 30-Minute Readiness Conversation with Jon Hansen — a preliminary diagnostic discussion to identify whether a Phase 0™ assessment is the right next step for your initiative. No sales pitch. Just an honest conversation about where your organization sits on the readiness spectrum.

Book your 30-Minute Readiness Conversation

hansenprocurement.com

-30-

Posted in: Commentary