muchomota
app/wiki/01-foundations/03-why-bayesian-beats-point-estimates

Why Bayesian Beats Point Estimates

Bob had started collapsing posteriors to their means in his head to make decisions faster. Tau caught him doing it and took a pen off a neighbouring table.


Back of a napkin, third iteration. The luchador on the velvet painting looked unimpressed. The combo waiter had given up for the evening.

Bob: Look, I get priors, I get shrinkage, I get the width. But in practice I'm a grown man and I can't run ten-thousand-sample posterior integrals in my head before I register for a tournament. So I look at the mean. It's the centre of the curve. It's the best guess. What's the problem?

Uncle Tau: The problem is that your bankroll doesn't follow your best guess. Your bankroll follows the whole distribution. Acting on the mean is fine if the thing you're doing is linear. It is catastrophic if the thing you're doing is multiplicative — and everything in this industry is multiplicative.

Bob: I feel like we're about to get to Bernoulli.

Uncle Tau: We're already there. Hand me the napkin.


The coin that kills you

Uncle Tau: Here's a coin. I flip it, heads I give you +80% of whatever you wager, tails you lose 50%. Fifty-fifty coin. Do you take it?

Bob: I check the EV. 0.5 times plus-80 plus 0.5 times minus-50 equals plus-15. Positive expectation. Yes I take it.

Uncle Tau: Now I say: good. Put your entire bankroll on it. And I'll offer you the same coin tomorrow, and the day after, and every day for a year.

Bob: …wait.

Uncle Tau: Do the multiplication, not the addition. Start with €10,000. Heads takes you to 18,000. Tails takes you to 9,000. That's one heads and one tails — the most likely sequence if you get your fifty-fifty. You ran exactly to expectation. You are now at 9,000. You have lost ten percent on a strategy with fifteen percent positive expectation.

Bob: That can't be right.

Uncle Tau: Run it ten flips. Five and five. 10,000 times 1.8 to the fifth times 0.5 to the fifth. You work it out. I'll wait.

Bob: [does the math on the napkin] …5,905.

Uncle Tau: Run it a hundred flips. Fifty and fifty. Same expectation. 1.8 to the fiftieth times 0.5 to the fiftieth.

Bob: That's basically zero.

Uncle Tau: About fifty euros, out of your ten thousand. The arithmetic mean of your returns is +15%. The geometric mean — the thing that actually multiplies your bankroll — is the square root of 1.8 times 0.5, which is the square root of 0.9, which is 0.949. Less than one. Your bankroll compounds at −5.1% per flip even though the "expected return" is +15%.

Bob: So the EV is lying.

Uncle Tau: The EV is the ensemble average. It's what happens if ten thousand copies of you each take this bet once. On average, across all of them, wealth goes up 15%. But you are not ten thousand copies of yourself. You are one guy with one bankroll taking the same bet over and over. That's the time average, and the time average is the geometric mean, and the geometric mean is less than one.

Bob: This has a name.

Uncle Tau: Non-ergodicity. Ole Peters, Nature Physics, 2019. The observation is Daniel Bernoulli's from 1738, so don't overpraise Peters — he gave the modern treatment, but the Swiss guy had it before America existed. The ensemble average and the time average are the same number only when the process is ergodic. Multiplicative wealth processes with an absorbing state at zero are not ergodic. Your bankroll is one.
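Tau's napkin can be re-run in a few lines. A sketch, assuming nothing beyond the coin as stated (+80% on heads, −50% on tails, fifty-fifty); none of this is app code:

```python
import random

random.seed(7)

# Tau's coin: heads multiplies the wagered bankroll by 1.8, tails by 0.5.
def one_path(n_flips):
    """Time average: one bankroll taking the bet n_flips times in a row."""
    wealth = 10_000.0
    for _ in range(n_flips):
        wealth *= 1.8 if random.random() < 0.5 else 0.5
    return wealth

# Ensemble average: 100,000 copies of Bob each flipping once.
copies = [10_000 * (1.8 if random.random() < 0.5 else 0.5)
          for _ in range(100_000)]
print(sum(copies) / len(copies))     # hovers near 11,500: the +15% EV is real

# Time average: one Bob flipping 100 times. Typically collapses toward zero,
# with rare lucky runs.
print(one_path(100))

# Deterministic napkin version: exactly 50 heads and 50 tails.
print(10_000 * 1.8**50 * 0.5**50)    # ~51.5: basically zero, from 10,000
```

The ensemble number and the time-average number disagree because the process is not ergodic; that disagreement is the whole lesson.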


Where this kicks in for you

Bob: Fine, I won't bet the whole bankroll on one coin. I'm not an idiot.

Uncle Tau: You're not, but the error survives at smaller sizes. It just gets quieter. And it doesn't survive because of coin-flip weirdness — it survives because every compounding decision in your professional life is structurally the same. You enter a tournament: you multiply your bankroll by some random number. You sell a piece of action: you change the shape of that random number. You pick stakes for next weekend: you set the scale of the multiplier. You mark up a package: you're pricing a random variable someone else is going to multiply by. Stakes, markups, session plans, staking deals, bond sizes — all of it compounds. None of it is addition.

Bob: And the mean is still lying at tournament sizes.

Uncle Tau: The mean is lying at every size that isn't infinitesimal. Kelly-sized fractions exist specifically because full exposure is never the right answer. Your optimal fraction is on the order of $\mu/\sigma^2$, not $\mu$. The sigma squared in the denominator is what the mean cannot see on its own.
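The $\mu/\sigma^2$ claim can be checked against the same coin. A minimal sketch (illustrative only, not the app's optimizer): find the growth-optimal fraction by brute force and compare it with the shortcut.

```python
import math

# Expected log growth per flip of Tau's coin (+0.8 / -0.5, fifty-fifty)
# when wagering a fraction f of the bankroll.
def growth_rate(f):
    return 0.5 * math.log(1 + 0.8 * f) + 0.5 * math.log(1 - 0.5 * f)

# Brute-force grid search; calculus gives f* = 0.375 exactly for this coin.
f_star = max((i / 1000 for i in range(1000)), key=growth_rate)
print(f_star)                    # 0.375: wager 37.5%, never 100%

# The mu/sigma^2 rule of thumb lands nearby:
mu = 0.5 * 0.8 + 0.5 * (-0.5)                 # +0.15
var = 0.5 * 0.8**2 + 0.5 * 0.5**2 - mu**2     # 0.4225
print(mu / var)                  # ~0.355

print(growth_rate(1.0))          # ~ -0.0527: full exposure shrinks the roll
print(growth_rate(f_star))       # ~ +0.0274: the fraction grows it
```

Full exposure compounds negatively on a +15% EV bet; a little over a third of the bankroll compounds positively. The denominator is doing the work.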


What the posterior gives you that the mean doesn't

Bob: So the posterior is not just the mean plus some error bars for decoration.

Uncle Tau: The posterior is a menu of stories, each weighted by how much the evidence likes it. The mean is one story — the story everyone agrees is the least surprising on average. But when you do something multiplicative, the stories don't average. They compound, and they compound asymmetrically. High-variance worlds punish you more than low-variance worlds reward you, because that's what $\log$ does to a number below one versus a number above one.

Bob: Jensen?

Uncle Tau: Jensen's inequality, 1906, and this is the villain of the whole lesson. For a concave function like $\log$:

$$\mathbb{E}[\log X] \leq \log \mathbb{E}[X]$$

The expected log of a random variable is less than or equal to the log of the expected value. Strict inequality whenever the variable has any variance. Which means: the growth rate of your bankroll under the posterior — which is $\mathbb{E}_\pi[\log(1 + fX)]$, integrating over the menu — is strictly less than the growth rate you get from plugging in the mean and pretending the variance isn't there.

Bob: So the mean overstates.

Uncle Tau: The mean systematically overstates any concave function of a random variable. Including the growth rate of your bankroll. Including the value of your stake. Including the fair markup on a package. Any decision that routes through a log or a square root or anything concave, the mean lies upward, and the lie gets louder as the posterior gets wider.

Bob: And a wider posterior means a bigger lie.

Uncle Tau: The gap between the truth and the mean is proportional to the variance of the posterior. Narrow posterior, small gap, the mean is a reasonable shortcut. Wide posterior — your first forty tournaments, a new format, a player you haven't staked before — the gap is enormous. That's when the point estimate is doing its worst lying, and that's exactly when your instinct is to reach for it because the posterior looks intimidating.
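The widening gap can be watched directly with a toy distribution. A hedged sketch, assuming a payout multiplier that takes the values 1.15 ± spread with equal probability, so its mean never moves while its variance grows:

```python
import math

# Jensen's gap: log E[X] minus E[log X] for X = 1.15 +/- spread, 50/50.
# The mean of X is pinned at 1.15; only the width changes.
def gap(spread):
    e_log = 0.5 * (math.log(1.15 + spread) + math.log(1.15 - spread))
    return math.log(1.15) - e_log    # >= 0 by Jensen, 0 only at zero spread

for s in (0.05, 0.25, 0.50):
    print(s, round(gap(s), 4))
# 0.05 -> 0.0009, 0.25 -> 0.0242, 0.50 -> 0.1048: doubling the width
# roughly quadruples the gap, i.e. the size of the lie tracks the variance.
```

The plug-in number, log E[X], is the same in every row; only the honest number falls as the distribution widens.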


Defensive sizing is just this, in action

Uncle Tau: Here's where it becomes concrete for a staking operation. Take the Bayesian Kelly fraction. For known parameters it's $f^* = \mu / \sigma^2$. Under a posterior:

$$f^*_{\text{Bayes}} = \frac{\bar{\mu}}{\overline{\sigma^2} + \bar{\mu}^2 + \text{Var}_\pi(\mu)}$$

Bob: The extra term in the denominator.

Uncle Tau: The variance of your posterior on $\mu$. That term is zero only when you have infinite data. Any time you have less than infinity, the defensive sizing is strictly smaller than the plug-in sizing. The guy who acts on the posterior sizes smaller than the guy who acts on the mean — not because he's timid, but because he's correct. Jensen is doing it to him whether he likes it or not. The only question is whether he gets the benefit or eats the cost.

Bob: And the one who acts on the mean thinks he's being rational.

Uncle Tau: He's running the ensemble average on a single-player process. He's a hypocrite who calls himself sharp. Which is fine — everybody's a hypocrite — but he's going to compound at a lower rate than the guy next to him who respects the width. Over a career the gap is enormous.
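The two sizings can be put side by side. The numbers below are invented for illustration (a 10% posterior-mean edge, a tournament-scale payout variance, a wide posterior on $\mu$); this is a sketch of the inequality, not app code:

```python
mu_bar = 0.10        # posterior-mean edge (invented)
sigma2_bar = 9.0     # posterior-mean payout variance (invented, MTT-scale)
var_mu = 0.04        # posterior variance on mu: wide after few tournaments

f_plugin = mu_bar / sigma2_bar                          # acts on the mean
f_bayes = mu_bar / (sigma2_bar + mu_bar**2 + var_mu)    # acts on the posterior

print(round(f_plugin, 5))    # 0.01111
print(round(f_bayes, 5))     # 0.01105: strictly smaller whenever var_mu > 0

# As evidence accumulates, Var(mu) shrinks and the two sizes converge:
for v in (0.04, 0.01, 0.001, 0.0):
    print(round(mu_bar / (sigma2_bar + mu_bar**2 + v), 6))
```

The defensive fraction is smaller than the plug-in fraction at every finite sample size, and the shortfall is exactly the posterior width term in the denominator.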


Kelly and Breiman

Bob: So this isn't just "play safe, bro." There's a proof.

Uncle Tau: Kelly in 1956, at Bell Labs. He wrote a paper ostensibly about information theory and buried the result in it. Maximize the expected log of wealth, and your long-run growth rate dominates every other strategy. Breiman in 1961 made it stronger — not just dominates on average, dominates almost surely. With probability one, in the long run, any non-log-maximizing strategy falls behind. That's not an opinion. It's a theorem.

And here's the load-bearing part: the log-maximizer is the posterior-integrating agent. He doesn't plug in means. He takes the whole menu of stories, weights each one by the posterior, and picks the action that maximizes expected log across the menu. The plug-in agent — the guy acting on the mean — is running a restricted version of the same strategy. It coincides with the posterior-optimal one only when the posterior is a point mass. Which it never is.

Bob: So the mean agent is the posterior agent in the degenerate limit.

Uncle Tau: And only in the degenerate limit. Everywhere else, he's leaving growth rate on the table. Breiman will find him.
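Breiman's verdict can be watched in miniature. A toy sketch with invented numbers (not the app's optimizer): the posterior menu holds two equally weighted candidate realities, edge $\mu = -0.5$ or $\mu = +0.8$ (posterior mean $+0.15$), and each event returns $\mu \pm 1$. The plug-in agent collapses the menu to the mean world before sizing; the posterior agent sizes over the whole menu.

```python
import math

def growth(f, menu):
    """Expected log growth of wagering fraction f over a weighted return menu."""
    return sum(w * math.log(1 + f * x) for x, w in menu)

# Posterior menu: mu is -0.5 or +0.8 (weight 1/2 each), returns mu +/- 1,
# giving four outcomes with weight 1/4 each.
posterior_menu = [(0.5, 0.25), (-1.5, 0.25), (1.8, 0.25), (-0.2, 0.25)]
# Plug-in menu: the mean world mu = 0.15 only.
plug_menu = [(1.15, 0.5), (-0.85, 0.5)]

grid = [i / 10_000 for i in range(6_000)]       # keeps 1 - 1.5f > 0

f_plug = max(grid, key=lambda f: growth(f, plug_menu))
f_post = max(grid, key=lambda f: growth(f, posterior_menu))
print(f_plug, f_post)       # roughly 0.153 vs 0.106: the plug-in oversizes

# Both agents live in the mixed world; score both fractions there:
print(growth(f_post, posterior_menu) > growth(f_plug, posterior_menu))  # True
```

The posterior agent's fraction wins on the very objective Breiman's theorem cares about, and the gap between the two fractions widens as the posterior does.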


The menu interpretation

Bob: Give me the intuition one more time.

Uncle Tau: Your posterior is a menu. Each entry on the menu is a candidate reality — "maybe you're a 10% ROI player, maybe 15, maybe 8, maybe 22" — with a weight attached. The right action averages the action's consequence over the menu, not the menu over a single action.

If you act on the mean, you pick the action that's best for the averaged-out story, and then you pretend every other story doesn't exist. The stories don't care that you ignored them. They still happen with their posterior probability, and when they happen, they multiply your bankroll by whatever their multiplier is. The bad ones hurt more than the good ones help, because $\log$ is concave, because Jensen.

If you act on the posterior, you pick the action whose consequences, weighted across the whole menu, compound best. You're not being pessimistic. You're being arithmetic. The math punishes the mean-agent for ignoring half his own beliefs.

Bob: And every tab in this app is doing the posterior version by default.

Uncle Tau: Every SALSA run samples from your posterior and simulates each sample. Every Package Builder output is a markup band, not a point. Every Scout readout is a confidence range, not a number. The app refuses to collapse the menu on your behalf. If you want to collapse it in your head before making a decision, you're welcome to, but understand that you are re-introducing the error the entire product was designed to remove.


Bob: Alright. So the one-sentence lesson is: never act on the centre of a curve you could have acted on the whole curve of.

Uncle Tau: That's the sentence. Add to it: the wider the curve, the worse the error, and for your first thousand tournaments in any format the curve is wide. Act accordingly.

Bob: Thanks, Uncle Tau.

Uncle Tau: Go estimate your shapes, kid. Next time we'll unpack where the posterior shows up on the Strategy tab — SALSA, samples, the growth-rate distribution — so you can see Jensen's tax being paid out in range plots you'll actually trust.


What's next

  • Reading the Strategy tab — where posteriors become SALSA samples, and how the output range is literally Jensen's inequality being plotted.
  • Kelly sizing under uncertainty — the defensive premium that falls out of this lesson, and why a wider posterior maps directly to a smaller optimal stake.
  • Markup and posterior width — how a wide posterior forces a wider markup band in the Package Builder, and why that's a feature.

Further reading

  • Why Tournament Poker Bankroll Management Is a Solved Problem (And Why You're Probably Thinking About It Wrong) on the muchomota Substack — the coin-flip example and the full non-ergodicity walkthrough.
  • Bob and Uncle Tau: How a Bumhunter Who Read Too Much Built MUCHO MOTA on the muchomota Substack — the origin conversation this voice is calibrated to.
  • Nested Bellman–Bayes Optimization for Tournament Poker Bankroll Management, Theorem 1 — the defensive sizing inequality formalized.
  • Bernoulli, 1738, Specimen theoriae novae de mensura sortis, St. Petersburg Academy.
  • Kelly, 1956, A new interpretation of information rate, Bell System Technical Journal.
  • Breiman, 1961, Optimal gambling systems for favorable games, Fourth Berkeley Symposium.
  • Jensen, 1906, Sur les fonctions convexes et les inégalités entre les valeurs moyennes, Acta Mathematica.
  • Peters, 2019, The ergodicity problem in economics, Nature Physics.