Essay

When "we test everything" became marketing, the test stopped meaning anything

Goodhart's Law says a metric stops being a good metric the moment it becomes a target. The certificate of analysis crossed that threshold somewhere in 2024. The fix is not a better certificate.

Charles Goodhart formulated the law that bears his name in 1975, in a paper about monetary policy. The statement is short. "Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes." A working translation: the moment a metric becomes a target, the people being measured reorganize their behavior around the metric, and the metric stops tracking the underlying quality it was originally chosen to track.

The certificate of analysis (COA) is now a worked example of Goodhart's Law in a small consumer-products vertical. The COA was a credible signal in 2018. By 2020 it was useful with some skepticism. By 2024 it had crossed the threshold. Sometime in the second half of that year, "lab-tested" became a standard phrase in promotional copy, new COA-style documents appeared on supplier sites faster than any plausible set of independent labs could have produced them, and the marginal information a buyer gained from the presence of a COA approached zero.

The hard observation: once a signal crosses the Goodhart threshold, more of it does not help. A more elaborate certificate, a longer assay list, a more reputable laboratory name on the header: all of these get copied. The infrastructure for copying them is cheap, and the marketing value of doing so is high. The arms race favors the forger.

This essay argues for a different response. The fix is not a more sophisticated certificate. It is a different category of artifact, produced by a different category of actor, under different incentives. The certificate is doing the work the certificate can do. The buyer-side signal needs a witness that the seller cannot pay.

What is Goodhart's Law actually doing in this market?

Two stages.

Stage one is the pre-target equilibrium. A few suppliers volunteer to publish lab tests because they can use the test as a differentiator against suppliers who do not. The signal carries information at the margin: a buyer who reads it can update their belief about supplier quality, because the publication itself is costly enough to filter out operators who are not investing in real testing. The fraction of suppliers publishing COAs is small. The fraction of those COAs that correspond to a real test is high. Buyer-side information improves.

Stage two is the post-target equilibrium. The differentiation works. The COA-publishing suppliers grow faster than the non-publishing ones. Non-publishing suppliers either start publishing or exit. Publishing becomes standard. At this point, two things happen simultaneously. First, the act of publishing no longer differentiates, because everyone does it. Second, the act of forging a COA becomes worthwhile, because the population of buyers reading them has grown to the point where a fake COA produces enough revenue to cover the labor of producing it.

The two effects compound. Buyers learn that the presence of a COA no longer differentiates, so they look more closely. The closer look catches a small number of fakes. But detection capacity is finite, and new fakes appear faster than old ones are caught. The buyer's optimal strategy, against a population in which forgery is cheap and detection is slow, is to discount the signal entirely.

That is the second equilibrium. It is where the peptide COA sits as of late 2025.
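
The two stages can be put in rough numbers with Bayes' rule. What follows is a minimal sketch, not a model of the actual market: the function name, the prior, and every probability are illustrative assumptions chosen only to show the shape of the effect.

```python
def posterior_good_given_coa(share_good: float,
                             p_coa_if_good: float,
                             p_coa_if_bad: float) -> float:
    """P(supplier really tests | supplier shows a COA), by Bayes' rule."""
    shown_by_good = share_good * p_coa_if_good
    shown_by_bad = (1.0 - share_good) * p_coa_if_bad
    return shown_by_good / (shown_by_good + shown_by_bad)


# Assumed share of suppliers that really test (illustrative, not measured).
PRIOR = 0.30

# Stage one: publishing is costly, so few non-testing suppliers show a COA.
stage_one = posterior_good_given_coa(PRIOR, p_coa_if_good=0.80, p_coa_if_bad=0.05)

# Stage two: publishing is standard and forgery is cheap, so almost
# every supplier shows a COA whether or not a real test sits behind it.
stage_two = posterior_good_given_coa(PRIOR, p_coa_if_good=1.00, p_coa_if_bad=0.95)

print(f"stage one: {stage_one:.2f}")
print(f"stage two: {stage_two:.2f}")
```

With these assumed inputs, seeing a COA moves the buyer from a 0.30 prior to about 0.87 in stage one, and to about 0.31 in stage two. The document is still there in stage two. It has simply stopped telling the buyer anything.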

Why doesn't a better certificate fix the problem?

Goodhart's Law is a statement about incentives, not about technology. A more elaborate certificate is a technical change. The incentive structure that produces the gaming is unchanged.

A cryptographically signed COA, for instance, raises the cost of forgery from "lifting an image" to "compromising a signing key." The marginal cost of fakery goes up. But the marginal revenue of fakery also goes up, because a cryptographically signed COA is a more powerful signal than an image-format one. The arms race continues at a higher level of sophistication, and the equilibrium share of fakes in the population is set by the ratio of fakery revenue to fakery cost. That share is bounded, but it is not zero.

Worse, every additional layer of technical defense itself becomes a new target for gaming. A registry of authorized laboratories becomes a target for laboratory-name forgery. A QR code linking back to a lab portal becomes a target for portal-spoofing. The structural problem is that the artifact lives inside the seller's marketing surface and produces seller revenue. As long as both of those facts hold, the artifact will get gamed at whatever the equilibrium rate is for the prevailing arms-race intensity.

The interesting question is not how to make the COA harder to fake. It is how to design a signal that does not produce seller revenue when forged.

What does a Goodhart-resistant signal actually look like?

Three properties matter. A signal that has all three resists Goodhart-style decay over a useful timescale.

The signal must be produced by a party with no commercial relationship to the seller. The party can be a regulator, a research institution, a notarial registry, or a cross-vendor witness funded by some structurally independent source. What it cannot be is a service that the seller pays for and that benefits from the seller's revenue. A paid-for "verified" badge has the same incentive problem the original COA had, one level up.

The signal must live in a venue the seller does not control. A claim published on the seller's own site is rhetorical. The same claim published into a registry the seller cannot edit is evidentiary. The shift from rhetorical to evidentiary is the structural change that closes the Goodhart loop, because the seller can no longer adjust the signal to match the marketing.

The signal must be free at the point of use, and append-only over time. Free means a buyer can read it without an account or a paywall, so the signal reaches the buyer's decision moment with no friction. Append-only means a seller cannot pay to delete an unflattering record, which is what closes the loop in the other direction.

A signal that has all three properties cannot be Goodharted in the same way the COA was, because none of the three avenues of gaming are available. The seller cannot pay the verifier, cannot edit the venue, and cannot suppress the lookup. They can still attempt to fake the underlying credential, but the fake will not appear inside the third-party record without the third party agreeing to enter it, and the third party has no incentive to do so.
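
The "append-only" and "free to look up" properties can be sketched as a small data structure. This is a hedged illustration under stated assumptions, not a description of any existing registry: `Record`, `WitnessRegistry`, the seller names, and the findings are all hypothetical. Each record stores the hash of the record before it, so removing or rewriting an entry that has later entries after it becomes detectable.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Record:
    seller: str
    finding: str
    prev_hash: str  # digest of the previous record, chaining the log together

    def digest(self) -> str:
        payload = json.dumps(
            {"seller": self.seller, "finding": self.finding, "prev": self.prev_hash},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class WitnessRegistry:
    """Hypothetical log kept by the third-party witness, not by the seller."""
    _records: list = field(default_factory=list)

    def append(self, seller: str, finding: str) -> Record:
        prev = self._records[-1].digest() if self._records else "genesis"
        record = Record(seller, finding, prev)
        self._records.append(record)
        return record

    def lookup(self, seller: str) -> list:
        # No account or payment check: anyone can read.
        return [r for r in self._records if r.seller == seller]

    def is_intact(self) -> bool:
        # A record that was edited or removed breaks the link held by
        # the record that follows it.
        prev = "genesis"
        for record in self._records:
            if record.prev_hash != prev:
                return False
            prev = record.digest()
        return True


registry = WitnessRegistry()
registry.append("seller-a", "batch 1: identity confirmed")
registry.append("seller-a", "batch 2: purity below label claim")
registry.append("seller-b", "batch 1: identity confirmed")

print(len(registry.lookup("seller-a")))
print(registry.is_intact())
```

The sketch has a known gap: dropping or rewriting the most recent record is not caught by the chain alone. A real witness would presumably also publish the latest digest somewhere it cannot retract, which is the same "venue the seller does not control" idea applied to the witness itself.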

What follows operationally?

For the buyer: the next-generation signal is not on the supplier's product page. It is a separate lookup against an independent registry. The presence or absence of a COA on the supplier site, in late 2025, is no longer a usable signal on its own.

For the seller: the asset that carries information is no longer the certificate the seller publishes. It is the third-party record the seller has earned. The cost of earning that record is real and unavoidable, but the asset persists across price cycles and survives the next Goodhart wave, because it is not gameable by the seller in the same way.

For the third-party witness: the operational obligation is clear. Take no seller money for placement. Publish lookups without an account. Refuse to delete records. Time-bound the attestations. The work is not glamorous and the revenue model has to come from somewhere other than the parties being attested. That is the entire architectural commitment, and it is the only one that resists the law.

If a reader takes one thing from the Goodhart paper, it is that good signals decay when they are placed under commercial load. The fix is not a better signal under the same load. It is a different load.

Frequently asked questions

What is Goodhart's Law, in one sentence?

Once a measure becomes a target for policy or marketing, it stops being a useful measure, because the parties subject to it will reorganize their behavior around the measure rather than around the underlying quality the measure was supposed to track. The original formulation was about monetary policy; the principle generalizes.

How did the COA cross the Goodhart threshold for peptides?

It happened when "lab-tested" moved from a niche credential to a standard line in promotional copy across the category. The moment the phrase started carrying revenue, the population of certificates expanded faster than the population of legitimate tests. Fake COAs and recycled COAs appeared within months. The fraction of certificates that were doing their original work fell quickly enough that buyers could no longer treat the mere presence of a COA as a signal.

Why can't the answer be a better, harder-to-fake certificate?

Because the failure mode is structural, not technical. Any artifact that lives inside the seller's marketing surface and benefits the seller's revenue will, eventually, be gamed. A more cryptographically rigorous certificate raises the cost of fakery but does not change the incentive structure that produces it. The fix has to move the artifact out of the seller's surface entirely.

What does a Goodhart-resistant signal look like in this domain?

A record of a check performed by a party with no commercial relationship to the seller, published in a venue the seller does not control, written into a registry the seller cannot edit. Free to look up. Append-only. Time-bound. The verifying party is paid by some funding source structurally disconnected from any individual seller's sales volume.