Prediction Markets · Reading the Price Part 4 · Newcomers

A price is a probability

If a market says 73%, what does that even mean — and how would you ever know if it's right? Here's how the pros grade a forecast: calibration, and the Brier score.


By now the core idea of this series is familiar: a market price is a probability. A contract trading at 73¢ is the crowd saying 73%.[1] But a probability is a slippery kind of claim. If I tell you there's a 73% chance of rain and the sky stays dry, was I wrong? Not really: unlikely things are supposed to happen sometimes, and a 27% outcome should turn up about 27% of the time. You cannot grade a single probabilistic call on a single outcome.

So how do you tell a real forecaster from a lucky one — or a calibrated expert from a confident fraud? You stop staring at any one call and look at all of them. Two tools do the work: calibration and the Brier score. They're how you hold a number accountable.

A single prediction can't be graded. A thousand of them can.

The only way to grade a probability

A 73% forecast is not a promise that the thing will happen. It's a claim about frequency: of all the times I say 73%, the event should happen about 73 times in 100. That reframing is the whole trick. One call tells you nothing; a track record tells you everything. So gather every market that closed near 73¢ and count how many resolved YES. Then do it for every price level. What you get is a calibration curve.
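The counting exercise described above can be sketched in a few lines of Python. This is a minimal illustration, not a production tool: the function name `calibration_table`, the ten equal-width price bins, and the sample prices and outcomes are all assumptions made up for the example.

```python
from collections import defaultdict


def calibration_table(prices, outcomes, n_bins=10):
    """Group forecasts into equal-width price bins and compare the
    average stated probability with the observed YES frequency.

    prices   -- closing prices as probabilities in [0, 1]
    outcomes -- 1 if the market resolved YES, 0 if it resolved NO
    Returns a list of (mean_price, yes_rate, count), one per non-empty bin.
    """
    bins = defaultdict(list)
    for p, o in zip(prices, outcomes):
        # Clamp so that a price of exactly 1.0 lands in the top bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, o))

    rows = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_price = sum(p for p, _ in pairs) / len(pairs)
        yes_rate = sum(o for _, o in pairs) / len(pairs)
        rows.append((mean_price, yes_rate, len(pairs)))
    return rows


# Hypothetical closing prices and resolutions (1 = YES, 0 = NO).
prices = [0.72, 0.74, 0.73, 0.71, 0.22, 0.25, 0.27, 0.28]
outcomes = [1, 1, 0, 1, 0, 1, 0, 0]

for mean_price, yes_rate, n in calibration_table(prices, outcomes):
    print(f"said {mean_price:.1%}  happened {yes_rate:.1%}  (n={n})")
```

Each row is one point on the curve: what the market said against how often it happened. Eight markets are far too few to conclude anything; with real data you would want enough resolved markets per bin for the frequencies to settle down.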

The calibration curve

Plot what the market said on the horizontal axis and how often the event actually happened on the vertical axis. A perfect forecaster lands on the 45° line: when it says 30%, the event happens 30% of the time; when it says 90%, 90%. Real markets hug that line remarkably well, far better than pundits, who are reliably overconfident and pay nothing for it.[2]

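[Figure: reliability diagram. Horizontal axis: what the market said (0% to 100%). Vertical axis: how often it happened (0% to 100%). Lines: perfect calibration (the diagonal) and the market.]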
A reliability diagram — illustrative. Closer to the diagonal means better calibrated.

Miscalibration is just distance from the line. A forecaster whose "90%" calls only happen 70% of the time is overconfident — their curve sags below the diagonal at the high end. The picture tells you not only that someone is off, but exactly how.

One number — the Brier score

A curve is a diagnosis; sometimes you want a single grade. The Brier score is the standard one. Write each outcome as \(o_t = 1\) if the event happened and \(o_t = 0\) if it didn't, and let \(p_t\) be the price as a probability. The score is the average squared miss [3]:

$$ \text{BS} \;=\; \frac{1}{N}\sum_{t=1}^{N}\left(p_t - o_t\right)^2 $$

It runs from 0 (you said 100% and were right, every time) to 1 (you said 100% and were wrong, every time). The number that matters is 0.25: what you score by saying "50%" to everything, since \((0.5 - o_t)^2 = 0.25\) whichever way the event goes. That is the forecast of someone who knows nothing. Beat 0.25 and you're adding information; liquid markets beat it comfortably.

The squaring is the soul of it. Say 80% on something that happens and you eat \((0.8-1)^2 = 0.04\). Say 80% on something that doesn't, and you eat \((0.8-0)^2 = 0.64\) — sixteen times the penalty. Confidence is cheap only when you're right; the Brier score makes you pay for bluster.
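The formula, the 0.25 benchmark, and the lopsided penalty can all be checked directly. Below is a small sketch; the function name `brier_score` and the sample forecasts are invented for illustration and are not real market data.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Hypothetical resolutions (1 = happened, 0 = didn't) and two forecasters.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
market = [0.9, 0.2, 0.7, 0.8, 0.1, 0.3, 0.6, 0.4]  # made-up prices
coin = [0.5] * len(outcomes)                        # "50%" on everything

print(f"market:     {brier_score(market, outcomes):.3f}")
print(f"always 50%: {brier_score(coin, outcomes):.3f}")

# The asymmetry from the text: the same 80% call, right versus wrong.
print(f"80% and it happened: {brier_score([0.8], [1]):.2f}")
print(f"80% and it didn't:   {brier_score([0.8], [0]):.2f}")
```

The constant-50% forecaster scores 0.25 no matter how the events resolve, which is why it works as a no-skill reference for binary questions.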

[Figure: Brier score scale, lower is better. 0.00 = perfect; ~0.10 = a sharp market; 0.25 = always say 50% (no skill).]
The Brier score, illustrated — markets land well left of the no-skill line.

Why the market stays honest

Why are markets so well-calibrated? The same reason everything in this series comes back to: money. If a market reliably said 70% for things that happen half the time, that gap is free money. You sell at 70¢, keep the 70¢ every time, and pay out $1 only half the time, for an expected gain of about 20¢ per contract. Your selling also drags the price back toward the truth. Miscalibration is a profit opportunity, and a liquid market competes it away. A pundit can be overconfident for a whole career; a trader is overconfident only until they're broke.
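The arithmetic behind that "free money" is short enough to write down. The sketch below assumes a $1 binary contract and ignores fees, spreads, and the cost of tying up capital, all of which shrink the edge in practice; the function name `expected_profit` is hypothetical.

```python
def expected_profit(price, true_prob, side="sell"):
    """Expected profit per $1 binary contract, ignoring fees and spreads.

    price     -- market price as a probability (0.70 means 70 cents)
    true_prob -- how often the event actually happens
    side      -- "sell" (take the NO side) or "buy" (take the YES side)
    """
    if side == "sell":
        # The seller receives `price` up front and pays $1 if the event happens.
        return price - true_prob
    if side == "buy":
        # The buyer pays `price` up front and receives $1 if the event happens.
        return true_prob - price
    raise ValueError("side must be 'sell' or 'buy'")


# The example from the text: priced at 70%, happens half the time.
print(f"sell: {expected_profit(0.70, 0.50, 'sell'):+.2f} per contract")
print(f"buy:  {expected_profit(0.70, 0.50, 'buy'):+.2f} per contract")
```

The edge is simply the gap between the price and the true frequency, so it disappears as selling pushes the price down toward 50¢.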

What calibration won't tell you

Calibration is necessary, not sufficient. A forecaster who just parrots the base rate (if 8% of startups become unicorns, then "8%" for every startup) is perfectly calibrated and perfectly useless. Good forecasting needs calibration and sharpness: confident, differentiated calls that still land on the line. The Brier score actually splits along those lines: a calibration term, a sharpness term, and a third term for how uncertain the outcomes are to begin with.[4]
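The split can be computed directly. The sketch below follows the standard form of Murphy's decomposition, Brier score = reliability − resolution + uncertainty, grouping forecasts by their exact stated value so the identity holds exactly. The function name and the two toy forecasters are invented for illustration, and a 50% base rate is used instead of 8% to keep the arithmetic clean.

```python
from collections import defaultdict


def murphy_decomposition(probs, outcomes):
    """Return (reliability, resolution, uncertainty) for binary forecasts.

    reliability -- calibration error, lower is better
    resolution  -- how far group frequencies move from the base rate, higher is better
    uncertainty -- base_rate * (1 - base_rate), a property of the events alone
    """
    n = len(probs)
    base_rate = sum(outcomes) / n

    # Group outcomes by the exact probability that was stated.
    groups = defaultdict(list)
    for p, o in zip(probs, outcomes):
        groups[p].append(o)

    reliability = 0.0
    resolution = 0.0
    for p, group in groups.items():
        freq = sum(group) / len(group)  # observed frequency for this forecast value
        reliability += len(group) * (p - freq) ** 2
        resolution += len(group) * (freq - base_rate) ** 2
    return reliability / n, resolution / n, base_rate * (1 - base_rate)


# Hypothetical events: half of them happen.
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]
parrot = [0.5] * 8                                        # base rate every time
sharp = [0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25, 0.25]  # differentiated calls

for name, probs in [("parrot", parrot), ("sharp", sharp)]:
    rel, res, unc = murphy_decomposition(probs, outcomes)
    print(f"{name}: reliability={rel:.4f} resolution={res:.4f} "
          f"uncertainty={unc:.4f} brier={rel - res + unc:.4f}")
```

Both toy forecasters are perfectly calibrated (reliability 0), but only the second has any resolution, and that is the entire difference in their Brier scores: 0.25 against 0.1875.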

Markets have one well-known wrinkle, too: the favorite–longshot bias. Longshots tend to be priced a touch too high and heavy favorites a touch too low. Buy a basket of 5¢ lottery-ticket contracts and they will pay off slightly less often than their 5% price implies.[5] The bias is small in deep markets, but it's real: a reminder that "the price is a probability" is an excellent approximation, not a law of physics. And it holds for the sports contracts that make up most of the volume just as much as for elections: a calibrated number is a forecast even when people are trading it for the thrill.

So "a price is a probability" was never a slogan. It's a claim you can test — and the market passes it better than almost any expert, pundit, or poll, because it's the only forecaster that goes broke when it's wrong. Next time a number flashes on a screen — 73% — you know what it means, and you know exactly how you'd check it. That is what separates a market from an opinion: the number is accountable.

Notes
  1. The price-equals-probability identity is developed in Parts 1–3 of this series — What is a prediction market?, Prediction markets, explained, and Why prediction markets matter.
  2. On the gap between expert confidence and actual accuracy, and the relative calibration of aggregated/market forecasts: Philip Tetlock, Expert Political Judgment (2005); and reliability studies of the Iowa Electronic Markets.
  3. The Brier score: Glenn W. Brier, "Verification of Forecasts Expressed in Terms of Probability," Monthly Weather Review (1950). The 0.25 figure is the score of a constant 50% forecast on binary events.
  4. The Brier score decomposes into reliability (calibration), resolution (sharpness), and uncertainty: Allan H. Murphy, "A New Vector Partition of the Probability Score" (1973).
  5. Favorite–longshot bias — a long-documented regularity in betting and prediction markets; see Snowberg & Wolfers and the broader market-microstructure literature.
Seeker Labs
The research desk at Seeker — theses, trends, and where we see the next bets across markets, AI, and the technologies in between. By Viet Ho (Managing Partner) & John Nguyen (Research Partner).
hi@vietho.me · @congviet