
Sample Size — How Many Tournaments Is "Enough"?

Bob wanted to stake his friend Danny. Danny had 180 tournaments on his Sharkscope, an ROI number that looked pretty, and a confident opinion about his own heads-up game. Bob asked Tau whether 180 was enough to know anything. Tau laughed.


The taquería again. The waiter had given up on the combo pitch by now and was just dropping two horchatas on the table without asking. Tau had a napkin covered in scrawl next to his phone.

Bob: Danny wants me to buy 25% of him for the Sunday grind. 180 tournaments, 22% ROI on Sharkscope. Is that enough to know what I'm buying?

Uncle Tau: Depends what you think you're buying. "Enough" isn't a single number. It depends which parameter.

Bob: What do you mean which parameter. ROI is ROI.

Uncle Tau: ROI is a downstream quantity. It falls out of two things — cash frequency and heads-up winrate. Those are the two knobs. The whole SALSA model runs on them. And they converge at wildly different speeds.

Bob: How different.

Uncle Tau: Cash frequency settles in about five hundred tournaments. HU winrate takes six to ten thousand. Order of magnitude gap.

Bob: That can't be right. It's the same player, same sample.

Uncle Tau: Same sample, totally different information content per tournament. Every tournament tells you whether he cashed. That's a binary observation, happens every time. Cash rate accumulates fast. But almost no tournament tells you anything about his heads-up game. Most nights he busts in the middle, or on the bubble, or at the final table in 6th. He gets one HU situation per tournament at most — and usually zero. The information about HU winrate accumulates like a trickle.

Bob: So when Danny says his HU game is great —

Uncle Tau: — he's talking about a parameter that even he doesn't have enough data on to know. Not yet. At 180 tournaments he's probably had maybe a dozen HU situations. Maybe fewer. That's not a sample, that's a rumour.


Why binary is so much cheaper than rank

Bob: Walk me through why the rates are that different.

Uncle Tau: Every parameter you want to estimate has an information floor — the Cramér-Rao bound. It says no unbiased estimator, parametric or not, neural network or not, can converge faster than a rate set by the Fisher information of the parameter. The Fisher information is basically "how sharply does the likelihood change when the parameter changes." High Fisher information means each observation tells you a lot. Low Fisher information means each observation tells you almost nothing.

Bob: And cash frequency has a lot of Fisher information per tournament.

Uncle Tau: Every tournament is a direct observation on it. A binary Bernoulli variable — you cashed or you didn't. Fisher information on a Bernoulli parameter near $p \approx 0.15$ is roughly $1/[p(1-p)] \approx 7.8$ per observation. Standard deviation of the estimator shrinks like $1/\sqrt{n}$. Do the arithmetic for $\pm 3\%$ at 95% confidence and you land around $n = 500$.
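Tau's arithmetic can be checked directly with the standard normal-approximation sample-size formula. The 15% cash frequency is the assumed value from the conversation, not an app constant:

```python
import math

def required_n(p, half_width, z=1.96):
    """Sample size so a 95% CI on a Bernoulli rate p has the given half-width.

    Normal-approximation formula: n = z^2 * p(1-p) / half_width^2.
    """
    return math.ceil(z**2 * p * (1 - p) / half_width**2)

p = 0.15                          # assumed cash frequency
fisher_info = 1 / (p * (1 - p))   # Fisher information per tournament, ~7.8
n = required_n(p, half_width=0.03)
print(f"Fisher information per observation: {fisher_info:.1f}")
print(f"Tournaments for ±3% at 95% confidence: {n}")
```

The exact answer is 545 tournaments, which rounds to Tau's "around 500."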

Bob: And HU winrate.

Uncle Tau: Here's the asymmetry. Most of what HU winrate controls — the tilt of the payout distribution toward higher finishes — only gets observed when those high finishes actually happen. But most finishes are busts; most tournaments end with you not in the money. Your payout observations are almost all zeros. A zero tells you something, but it tells you very little about the top-end structure. The Fisher information on $\lambda_2$ — the payout-tilt parameter — scales like $\text{Var}(w_k)$ under your finishing distribution. And $\text{Var}(w_k)$ is dominated by the rare high-payout finishes that almost never happen.

Bob: Translate.

Uncle Tau: Per tournament, your sample carries roughly an order of magnitude more information about cash frequency than about HU winrate. So your required $n$ for similar precision is an order of magnitude larger. That's where the 500 versus 6,000-10,000 numbers come from.
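A quick simulation makes the asymmetry concrete. All three rates below (15% cash frequency, a 2% chance a given tournament reaches heads-up, 55% HU winrate) are illustrative assumptions, not SALSA parameters:

```python
import random

random.seed(1)

def simulate(n, p_cash=0.15, p_reach_hu=0.02, p_win_hu=0.55):
    """Simulate n tournaments; return cash outcomes and HU outcomes.

    Every tournament yields a cash/no-cash observation, but only the
    rare tournaments that reach heads-up yield an HU observation.
    """
    cashes, hu_results = [], []
    for _ in range(n):
        cashes.append(random.random() < p_cash)
        if random.random() < p_reach_hu:
            hu_results.append(random.random() < p_win_hu)
    return cashes, hu_results

def ci_half_width(obs, z=1.96):
    """Normal-approximation 95% CI half-width for a Bernoulli rate,
    with Laplace smoothing so a tiny one-sided sample isn't width zero."""
    n = len(obs)
    if n == 0:
        return float("inf")
    p = (sum(obs) + 1) / (n + 2)
    return z * (p * (1 - p) / n) ** 0.5

cashes, hu = simulate(500)
print(f"observations: {len(cashes)} on cash frequency, {len(hu)} on HU winrate")
print(f"cash-frequency CI: ±{ci_half_width(cashes):.3f}")
print(f"HU-winrate CI:     ±{ci_half_width(hu):.3f}")
```

Same 500 tournaments, but the HU estimate is built from only a handful of observations, so its interval comes out several times wider.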

Bob: So if I want $\pm 3\%$ precision at 95% confidence on —

Uncle Tau: — cash frequency: about 500 tournaments. HU winrate: somewhere between 6,000 and 10,000 depending on the format. Tight structures where first place is a big fraction of the pool converge a bit faster. WSOP-style ladders where first is 1000x the buy-in and the rest is smeared across hundreds of positions converge slower.
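Tau's structure claim can be sketched numerically via $\text{Var}(w_k)$. The two payout tables below are invented for illustration: same 100 buy-in pool, same 15 paid spots, and a uniform 1% chance of each paid finish (85% bust):

```python
def payout_variance(payouts, finish_probs):
    """Var(w_k): variance of the payout (in buy-ins) under the finishing
    distribution. Higher variance means more Fisher information per
    tournament on the payout-tilt parameter, hence faster convergence."""
    mean = sum(w * p for w, p in zip(payouts, finish_probs))
    return sum((w - mean) ** 2 * p for w, p in zip(payouts, finish_probs))

# 15 paid positions plus a bust outcome; both structures pay 100 buy-ins.
probs = [0.01] * 15 + [0.85]
top_heavy = [50, 25, 10, 5, 2, 2, 1, 1, 1, 0.6, 0.6, 0.6, 0.4, 0.4, 0.4] + [0.0]
flat = [10, 9, 8.5, 8, 7.5, 7, 6.5, 6, 6, 6, 5.5, 5.5, 5, 5, 4.5] + [0.0]

print(f"top-heavy Var(w_k): {payout_variance(top_heavy, probs):.1f}")
print(f"flat Var(w_k):      {payout_variance(flat, probs):.1f}")
```

The concentrated structure comes out with several times the payout variance of the smeared one, which is the sense in which tight structures let the tilt parameter converge faster.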


What this means for staking

Bob: So Danny's 180 tournaments.

Uncle Tau: Tells you roughly nothing about either parameter yet, but the two "roughly nothings" are different sizes. At 180 you can make a directional guess about cash frequency — he's maybe in the 90%-110% band of random, which brackets "slightly losing" to "decent grinder." His 22% ROI at that cash rate would require an HU winrate near the ceiling, which is unlikely. So the most plausible story is he ran hot in his cash conversions. Not a lie — just variance.

Bob: What do I actually see on the app when I look him up?

Uncle Tau: A posterior on ROI. Wide as a barn. The app pulls his estimate toward the prior hard because 180 tournaments can't outvote the population of regs in his format. His posterior on cash frequency is starting to narrow a little. His posterior on HU winrate is essentially just the prior — the app isn't pretending to know anything about his HU game yet.

Bob: And the app is honest about that.

Uncle Tau: That's what the posterior width is for. When his HU posterior is as wide as the prior, the app is telling you "I have no signal on this parameter." When his CF posterior is tighter than his HU posterior — which it always is at small $n$ — the app is telling you "I can say something about how often he cashes, but don't ask me about final-table conversion yet."
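The "posterior width as honesty" behaviour can be sketched with conjugate Beta updates. The priors and Danny's per-parameter counts below are invented for illustration; the app's actual priors and model are richer than a pair of independent Betas:

```python
# Invented Beta priors, centred roughly where the conversation puts them.
prior_cf = (15, 85)   # cash frequency, centred near 0.15
prior_hu = (11, 9)    # HU winrate, centred near 0.55

def posterior_sd(prior, successes, trials):
    """Posterior standard deviation after a Beta-Bernoulli update."""
    a = prior[0] + successes
    b = prior[1] + (trials - successes)
    return (a * b / ((a + b) ** 2 * (a + b + 1))) ** 0.5

# Danny at 180 tournaments: assume ~27 cashes, and 12 HU spots with 7 won.
print(f"CF: prior sd {posterior_sd(prior_cf, 0, 0):.3f} -> "
      f"posterior sd {posterior_sd(prior_cf, 27, 180):.3f}")
print(f"HU: prior sd {posterior_sd(prior_hu, 0, 0):.3f} -> "
      f"posterior sd {posterior_sd(prior_hu, 7, 12):.3f}")
```

The cash-frequency posterior tightens noticeably on 180 observations, while the HU posterior barely moves from the prior, which is exactly the "no signal on this parameter yet" behaviour Tau describes.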

Bob: So in practical terms, what can I bet on?

Uncle Tau: You can stake him on cash-frequency-denominated exposure — buy-ins, mid-volume formats, stuff where the edge comes from getting into the money more often than random. You cannot yet stake him on "he's a final-table monster." That claim is not supported by his sample. Not even close. If he gets to 3,000 tournaments and still has 22% ROI, that starts to be real evidence. At 180 it isn't.

Bob: And his own confidence in his HU game is —

Uncle Tau: — noise dressed up as conviction. His sample isn't large enough for him to know either. Neither of you can see the parameter. The difference is you admitted it.


The staking asymmetry

Bob: You're saying there's a long window where I can price his cash rate fairly but I genuinely can't price his deep-run edge.

Uncle Tau: Right. And that window is most of a career. Five hundred tournaments is maybe six months of serious volume for an online grinder. Six to ten thousand is three to five years. For almost every player you will ever consider staking, you know their cash rate long before you know their HU winrate. That asymmetry is a feature of the product, not a bug — most markup negotiations happen before the HU parameter is estimable, and the product handles that by leaning on the prior for the parameter nobody has enough data on yet.

Bob: So Scout and Package Builder are honest about this too.

Uncle Tau: Every tool in the app that gives you a number. Scout shows you a wider band on a short-sample player. Package Builder widens its markup band when the posterior is uncertain. Strategy samples from a wide posterior and gives you a wider output range. None of these are tuning choices. They're all downstream of one thing — how much information the player's sample actually contains about the parameter you're asking about.

Bob: And the parameters converge at different rates.

Uncle Tau: Always. Forever. Cash frequency first, HU winrate much later. You'll know someone's cash rate in six months and their HU rate in four years.


The one-line math

Bob: Give me the formula.

Uncle Tau:

$$\text{Var}(\hat{\theta}) \geq \frac{1}{n \cdot \mathscr{I}(\theta)}$$

The variance of any unbiased estimator of a parameter $\theta$ is at least $1/(n \cdot \mathscr{I})$ where $\mathscr{I}$ is the Fisher information per observation. Cramér, 1946. Rao, 1945. $\mathscr{I}$ is high for cash frequency, low for HU winrate, and the ratio between them is why you need an order of magnitude more tournaments for the second one.
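One way to see that the bound is sharp here: the Bernoulli sample mean is the efficient estimator, so a Monte Carlo estimate of its variance should land right on $p(1-p)/n$. A quick sketch:

```python
import random

random.seed(7)

def mle_variance(p, n, trials=10000):
    """Monte Carlo variance of the sample mean of n Bernoulli(p) draws."""
    estimates = [
        sum(random.random() < p for _ in range(n)) / n for _ in range(trials)
    ]
    mean = sum(estimates) / trials
    return sum((e - mean) ** 2 for e in estimates) / trials

p, n = 0.15, 200
fisher = 1 / (p * (1 - p))   # Fisher information per observation
bound = 1 / (n * fisher)     # Cramér-Rao lower bound = p(1-p)/n
print(f"Cramér-Rao bound:  {bound:.6f}")
print(f"MLE variance (MC): {mle_variance(p, n):.6f}")
```

The two numbers agree to within Monte Carlo noise, which is what makes the bound usable as a required-sample-size calculator rather than just a floor.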

Bob: That's it?

Uncle Tau: That's it. Every required-sample-size number in the product is a downstream consequence of that one inequality.


What to do with this

Bob: Three things. Go.

Uncle Tau: One: stop conflating "I have 200 tournaments of data" with "I know my ROI." You have 200 tournaments of noisy information about two parameters that converge at different rates. Translate the sample into the parameter you care about before you quote anything.

Two: for staking decisions, price the parameter you can see. Cash frequency converges first. If a player's edge theory is "I cash a lot," you can validate that at 500 tournaments. If their edge theory is "I'm a deep-run assassin," you need thousands. Adjust the markup you accept for the HU claim until the sample earns it.

Three: respect the asymmetry in other directions too. A player who's been grinding for six months and claims a 20% ROI is making a mathematically defensible claim about cash frequency and a mathematically indefensible claim about HU winrate. Don't conflate the two.

Bob: So when do I get to buy Danny at a real markup?

Uncle Tau: When his posterior on the parameter you're pricing is tight enough that the pull from the prior is minor. For cash frequency, a year or so of full-time volume. For HU winrate, come back in three years.

Bob: Or we could just stake him at the prior and treat the ride as a learning experience.

Uncle Tau: Welcome to staking. That's exactly what you're doing when you bet at small $n$, whether you admit it or not. Pretending otherwise is just priced-in self-deception.

Bob: Got it. Thanks, Uncle Tau.

Uncle Tau: Go estimate your shapes, kid.


What's next

  • Reading the confidence band — the posterior width on every screen in this app is directly downstream of the Fisher information in your sample. Here's how to triage decisions by band width.
  • Kelly under uncertainty — why the Bayesian Kelly fraction is strictly smaller than the plug-in version, and why the math rewards honesty about what you don't know.
  • The prior as a default — what the app is quoting when you haven't earned the right to override it yet.

Further reading

  • Priors and Posteriors — the basic mechanics of the posterior update that every required-$n$ result in this article sits on top of.
  • Shrinkage and Empirical Bayes — why small-sample estimates get pulled toward the prior, and why the pull shrinks at the rate Cramér-Rao permits.
  • The ROI Ceiling Nobody Knows About — the payout-structure argument that sets the box the sample is trying to locate the player inside.
  • Cramér, H., 1946, Mathematical Methods of Statistics, Princeton University Press — the information inequality in its original clean form.
  • Rao, C.R., 1945, Information and accuracy attainable in the estimation of statistical parameters, Bulletin of the Calcutta Mathematical Society 37:81-91.
  • Dahlke, 2026, Nested Bellman-Bayes Optimization for Tournament Poker Bankroll Management, §2.3 for the Cramér-Rao measurement floor and §4.4 for the convergence-over-career application.
  • van der Vaart, A.W., 1998, Asymptotic Statistics, Cambridge University Press — the modern treatment of estimator efficiency and information bounds.