Appendix B: Statistical Tables
This appendix provides the statistical tables most frequently referenced in prediction market analysis. While software makes table lookups largely unnecessary for computation, having these values at hand builds intuition about the magnitudes involved and serves as a quick sanity check on computed results.
B.1 Standard Normal Distribution Table (Z-Table)
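The table gives $\Phi(z) = P(Z \leq z)$ for the standard normal distribution $Z \sim \mathcal{N}(0, 1)$. The row label gives $z$ to one decimal place and the column header gives the second decimal. To find $P(Z > z)$, compute $1 - \Phi(z)$. The table covers only $z \geq 0$; for negative arguments, use the symmetry $\Phi(-z) = 1 - \Phi(z)$.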
| $z$ | 0.00 | 0.01 | 0.02 | 0.03 | 0.04 | 0.05 | 0.06 | 0.07 | 0.08 | 0.09 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0.0 | 0.5000 | 0.5040 | 0.5080 | 0.5120 | 0.5160 | 0.5199 | 0.5239 | 0.5279 | 0.5319 | 0.5359 |
| 0.1 | 0.5398 | 0.5438 | 0.5478 | 0.5517 | 0.5557 | 0.5596 | 0.5636 | 0.5675 | 0.5714 | 0.5753 |
| 0.2 | 0.5793 | 0.5832 | 0.5871 | 0.5910 | 0.5948 | 0.5987 | 0.6026 | 0.6064 | 0.6103 | 0.6141 |
| 0.3 | 0.6179 | 0.6217 | 0.6255 | 0.6293 | 0.6331 | 0.6368 | 0.6406 | 0.6443 | 0.6480 | 0.6517 |
| 0.4 | 0.6554 | 0.6591 | 0.6628 | 0.6664 | 0.6700 | 0.6736 | 0.6772 | 0.6808 | 0.6844 | 0.6879 |
| 0.5 | 0.6915 | 0.6950 | 0.6985 | 0.7019 | 0.7054 | 0.7088 | 0.7123 | 0.7157 | 0.7190 | 0.7224 |
| 0.6 | 0.7257 | 0.7291 | 0.7324 | 0.7357 | 0.7389 | 0.7422 | 0.7454 | 0.7486 | 0.7517 | 0.7549 |
| 0.7 | 0.7580 | 0.7611 | 0.7642 | 0.7673 | 0.7704 | 0.7734 | 0.7764 | 0.7794 | 0.7823 | 0.7852 |
| 0.8 | 0.7881 | 0.7910 | 0.7939 | 0.7967 | 0.7995 | 0.8023 | 0.8051 | 0.8078 | 0.8106 | 0.8133 |
| 0.9 | 0.8159 | 0.8186 | 0.8212 | 0.8238 | 0.8264 | 0.8289 | 0.8315 | 0.8340 | 0.8365 | 0.8389 |
| 1.0 | 0.8413 | 0.8438 | 0.8461 | 0.8485 | 0.8508 | 0.8531 | 0.8554 | 0.8577 | 0.8599 | 0.8621 |
| 1.1 | 0.8643 | 0.8665 | 0.8686 | 0.8708 | 0.8729 | 0.8749 | 0.8770 | 0.8790 | 0.8810 | 0.8830 |
| 1.2 | 0.8849 | 0.8869 | 0.8888 | 0.8907 | 0.8925 | 0.8944 | 0.8962 | 0.8980 | 0.8997 | 0.9015 |
| 1.3 | 0.9032 | 0.9049 | 0.9066 | 0.9082 | 0.9099 | 0.9115 | 0.9131 | 0.9147 | 0.9162 | 0.9177 |
| 1.4 | 0.9192 | 0.9207 | 0.9222 | 0.9236 | 0.9251 | 0.9265 | 0.9279 | 0.9292 | 0.9306 | 0.9319 |
| 1.5 | 0.9332 | 0.9345 | 0.9357 | 0.9370 | 0.9382 | 0.9394 | 0.9406 | 0.9418 | 0.9429 | 0.9441 |
| 1.6 | 0.9452 | 0.9463 | 0.9474 | 0.9484 | 0.9495 | 0.9505 | 0.9515 | 0.9525 | 0.9535 | 0.9545 |
| 1.7 | 0.9554 | 0.9564 | 0.9573 | 0.9582 | 0.9591 | 0.9599 | 0.9608 | 0.9616 | 0.9625 | 0.9633 |
| 1.8 | 0.9641 | 0.9649 | 0.9656 | 0.9664 | 0.9671 | 0.9678 | 0.9686 | 0.9693 | 0.9699 | 0.9706 |
| 1.9 | 0.9713 | 0.9719 | 0.9726 | 0.9732 | 0.9738 | 0.9744 | 0.9750 | 0.9756 | 0.9761 | 0.9767 |
| 2.0 | 0.9772 | 0.9778 | 0.9783 | 0.9788 | 0.9793 | 0.9798 | 0.9803 | 0.9808 | 0.9812 | 0.9817 |
| 2.1 | 0.9821 | 0.9826 | 0.9830 | 0.9834 | 0.9838 | 0.9842 | 0.9846 | 0.9850 | 0.9854 | 0.9857 |
| 2.2 | 0.9861 | 0.9864 | 0.9868 | 0.9871 | 0.9875 | 0.9878 | 0.9881 | 0.9884 | 0.9887 | 0.9890 |
| 2.3 | 0.9893 | 0.9896 | 0.9898 | 0.9901 | 0.9904 | 0.9906 | 0.9909 | 0.9911 | 0.9913 | 0.9916 |
| 2.4 | 0.9918 | 0.9920 | 0.9922 | 0.9925 | 0.9927 | 0.9929 | 0.9931 | 0.9932 | 0.9934 | 0.9936 |
| 2.5 | 0.9938 | 0.9940 | 0.9941 | 0.9943 | 0.9945 | 0.9946 | 0.9948 | 0.9949 | 0.9951 | 0.9952 |
| 2.6 | 0.9953 | 0.9955 | 0.9956 | 0.9957 | 0.9959 | 0.9960 | 0.9961 | 0.9962 | 0.9963 | 0.9964 |
| 2.7 | 0.9965 | 0.9966 | 0.9967 | 0.9968 | 0.9969 | 0.9970 | 0.9971 | 0.9972 | 0.9973 | 0.9974 |
| 2.8 | 0.9974 | 0.9975 | 0.9976 | 0.9977 | 0.9977 | 0.9978 | 0.9979 | 0.9979 | 0.9980 | 0.9981 |
| 2.9 | 0.9981 | 0.9982 | 0.9982 | 0.9983 | 0.9984 | 0.9984 | 0.9985 | 0.9985 | 0.9986 | 0.9986 |
| 3.0 | 0.9987 | 0.9987 | 0.9987 | 0.9988 | 0.9988 | 0.9989 | 0.9989 | 0.9989 | 0.9990 | 0.9990 |
Commonly used critical values:
| Confidence Level | $\alpha$ (two-tailed) | $z_{\alpha/2}$ |
|---|---|---|
| 90% | 0.10 | 1.645 |
| 95% | 0.05 | 1.960 |
| 99% | 0.01 | 2.576 |
| 99.9% | 0.001 | 3.291 |
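The table entries and critical values can be reproduced in software. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the helper names `phi` and `two_tailed_critical_value` are illustrative and not taken from the book.

```python
from statistics import NormalDist

# Standard normal distribution Z ~ N(0, 1)
STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def phi(z: float) -> float:
    """Return the cumulative probability P(Z <= z)."""
    return STD_NORMAL.cdf(z)


def two_tailed_critical_value(confidence: float) -> float:
    """Return z_{alpha/2} for a two-tailed interval at the given confidence level."""
    alpha = 1.0 - confidence
    return STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)


print(f"{phi(1.96):.4f}")                        # 0.9750 (row 1.9, column 0.06)
print(f"{1.0 - phi(1.0):.4f}")                   # 0.1587, the upper tail P(Z > 1)
print(f"{two_tailed_critical_value(0.95):.3f}")  # 1.960
```

Comparing a computed value with the nearest table entry is a quick way to catch a one-tailed versus two-tailed mix-up.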
B.2 t-Distribution Critical Values
The table gives $t_{\alpha, \nu}$ such that $P(T > t_{\alpha, \nu}) = \alpha$ for a t-distribution with $\nu$ degrees of freedom. For two-tailed tests at significance $\alpha$, use the column $\alpha/2$.
| df ($\nu$) | $\alpha = 0.10$ | $\alpha = 0.05$ | $\alpha = 0.025$ | $\alpha = 0.01$ | $\alpha = 0.005$ |
|---|---|---|---|---|---|
| 1 | 3.078 | 6.314 | 12.706 | 31.821 | 63.657 |
| 2 | 1.886 | 2.920 | 4.303 | 6.965 | 9.925 |
| 3 | 1.638 | 2.353 | 3.182 | 4.541 | 5.841 |
| 4 | 1.533 | 2.132 | 2.776 | 3.747 | 4.604 |
| 5 | 1.476 | 2.015 | 2.571 | 3.365 | 4.032 |
| 6 | 1.440 | 1.943 | 2.447 | 3.143 | 3.707 |
| 7 | 1.415 | 1.895 | 2.365 | 2.998 | 3.499 |
| 8 | 1.397 | 1.860 | 2.306 | 2.896 | 3.355 |
| 9 | 1.383 | 1.833 | 2.262 | 2.821 | 3.250 |
| 10 | 1.372 | 1.812 | 2.228 | 2.764 | 3.169 |
| 12 | 1.356 | 1.782 | 2.179 | 2.681 | 3.055 |
| 15 | 1.341 | 1.753 | 2.131 | 2.602 | 2.947 |
| 20 | 1.325 | 1.725 | 2.086 | 2.528 | 2.845 |
| 25 | 1.316 | 1.708 | 2.060 | 2.485 | 2.787 |
| 30 | 1.310 | 1.697 | 2.042 | 2.457 | 2.750 |
| 40 | 1.303 | 1.684 | 2.021 | 2.423 | 2.704 |
| 60 | 1.296 | 1.671 | 2.000 | 2.390 | 2.660 |
| 120 | 1.289 | 1.658 | 1.980 | 2.358 | 2.617 |
| $\infty$ | 1.282 | 1.645 | 1.960 | 2.326 | 2.576 |
Usage note: When evaluating whether a prediction market strategy produces statistically significant returns, use the t-distribution with $n - 1$ degrees of freedom when the sample size $n$ is small (fewer than approximately 30 trades). For larger samples, the t-distribution closely approximates the normal.
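As a sketch of how the table might be used in practice, the snippet below computes a one-sample t-statistic for a list of per-trade returns and compares it with a critical value read from the table. The return figures are hypothetical, and `t_statistic` is an illustrative helper rather than anything defined in the book.

```python
import math
from statistics import mean, stdev


def t_statistic(returns: list[float]) -> tuple[float, int]:
    """Return (t, degrees of freedom) for testing a mean return of zero."""
    n = len(returns)
    if n < 2:
        raise ValueError("need at least two observations")
    # Standard error of the mean uses the sample standard deviation (n - 1 divisor).
    standard_error = stdev(returns) / math.sqrt(n)
    if standard_error == 0.0:
        raise ValueError("returns have zero variance")
    return mean(returns) / standard_error, n - 1


# Hypothetical per-trade returns for a strategy with 10 trades
sample_returns = [0.04, -0.02, 0.05, 0.01, -0.01, 0.03, 0.02, -0.03, 0.06, 0.01]
t_value, df = t_statistic(sample_returns)

# Critical value from the table: df = 9, column alpha = 0.025 (two-tailed test at 5%)
critical_value = 2.262
print(f"t = {t_value:.3f}, df = {df}, significant at 5%: {abs(t_value) > critical_value}")
```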
B.3 Chi-Square Critical Values
The table gives $\chi^2_{\alpha, \nu}$ such that $P(\chi^2 > \chi^2_{\alpha, \nu}) = \alpha$ for a chi-square distribution with $\nu$ degrees of freedom.
| df ($\nu$) | $\alpha = 0.10$ | $\alpha = 0.05$ | $\alpha = 0.025$ | $\alpha = 0.01$ | $\alpha = 0.005$ |
|---|---|---|---|---|---|
| 1 | 2.706 | 3.841 | 5.024 | 6.635 | 7.879 |
| 2 | 4.605 | 5.991 | 7.378 | 9.210 | 10.597 |
| 3 | 6.251 | 7.815 | 9.348 | 11.345 | 12.838 |
| 4 | 7.779 | 9.488 | 11.143 | 13.277 | 14.860 |
| 5 | 9.236 | 11.070 | 12.833 | 15.086 | 16.750 |
| 6 | 10.645 | 12.592 | 14.449 | 16.812 | 18.548 |
| 7 | 12.017 | 14.067 | 16.013 | 18.475 | 20.278 |
| 8 | 13.362 | 15.507 | 17.535 | 20.090 | 21.955 |
| 9 | 14.684 | 16.919 | 19.023 | 21.666 | 23.589 |
| 10 | 15.987 | 18.307 | 20.483 | 23.209 | 25.188 |
| 15 | 22.307 | 24.996 | 27.488 | 30.578 | 32.801 |
| 20 | 28.412 | 31.410 | 34.170 | 37.566 | 39.997 |
| 25 | 34.382 | 37.652 | 40.646 | 44.314 | 46.928 |
| 30 | 40.256 | 43.773 | 46.979 | 50.892 | 53.672 |
Usage note: Chi-square tests are commonly used in prediction markets for calibration testing. Given $k$ probability bins, the calibration chi-square statistic compares the observed frequency of outcomes in each bin to the expected frequency. The test uses $k - 1$ degrees of freedom.
B.4 Common Probability Values
This quick-reference table converts between the various probability representations commonly encountered in prediction markets and sports betting.
| Probability | Decimal Odds | Fractional Odds | American Odds | Implied Prob (with vig) |
|---|---|---|---|---|
| 0.01 | 100.00 | 99/1 | +9900 | ~0.010 |
| 0.05 | 20.00 | 19/1 | +1900 | ~0.052 |
| 0.10 | 10.00 | 9/1 | +900 | ~0.104 |
| 0.15 | 6.67 | 17/3 | +567 | ~0.156 |
| 0.20 | 5.00 | 4/1 | +400 | ~0.208 |
| 0.25 | 4.00 | 3/1 | +300 | ~0.260 |
| 0.30 | 3.33 | 7/3 | +233 | ~0.313 |
| 0.333 | 3.00 | 2/1 | +200 | ~0.345 |
| 0.40 | 2.50 | 3/2 | +150 | ~0.417 |
| 0.50 | 2.00 | 1/1 | +100 / -100 | ~0.524 |
| 0.60 | 1.67 | 2/3 | -150 | ~0.625 |
| 0.667 | 1.50 | 1/2 | -200 | ~0.694 |
| 0.70 | 1.43 | 3/7 | -233 | ~0.727 |
| 0.75 | 1.33 | 1/3 | -300 | ~0.781 |
| 0.80 | 1.25 | 1/4 | -400 | ~0.833 |
| 0.85 | 1.18 | 3/17 | -567 | ~0.880 |
| 0.90 | 1.11 | 1/9 | -900 | ~0.930 |
| 0.95 | 1.05 | 1/19 | -1900 | ~0.968 |
| 0.99 | 1.01 | 1/99 | -9900 | ~0.995 |
Conversion formulas:
- Probability to decimal odds: $d = 1 / p$
- Decimal odds to probability: $p = 1 / d$
- Probability to American odds: If $p \geq 0.5$: $A = -100p / (1-p)$; if $p < 0.5$: $A = +100(1-p) / p$
- The "implied prob with vig" column shows approximate values under a typical overround (vigorish) of roughly 4.5%, illustrating how bookmakers shade probabilities upward to ensure a profit margin.
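The conversion formulas above can be sketched as a few small functions. The function names are illustrative. The fractional-odds helper relies on the standard identity that fractional odds equal $(1 - p) / p$, which is not listed among the formulas above.

```python
from fractions import Fraction


def _check_probability(p: float) -> None:
    """Reject probabilities outside the open interval (0, 1)."""
    if not 0.0 < p < 1.0:
        raise ValueError("probability must be strictly between 0 and 1")


def prob_to_decimal(p: float) -> float:
    """Decimal odds d = 1 / p."""
    _check_probability(p)
    return 1.0 / p


def decimal_to_prob(d: float) -> float:
    """Probability p = 1 / d for decimal odds d > 1."""
    if d <= 1.0:
        raise ValueError("decimal odds must exceed 1")
    return 1.0 / d


def prob_to_american(p: float) -> float:
    """American odds: negative for favourites (p >= 0.5), positive otherwise."""
    _check_probability(p)
    if p >= 0.5:
        return -100.0 * p / (1.0 - p)
    return 100.0 * (1.0 - p) / p


def prob_to_fractional(p: float, max_denominator: int = 100) -> Fraction:
    """Fractional odds (1 - p) / p, reduced to a small-denominator fraction."""
    _check_probability(p)
    return Fraction((1.0 - p) / p).limit_denominator(max_denominator)


p = 0.25
frac = prob_to_fractional(p)
print(prob_to_decimal(p))                     # 4.0
print(round(prob_to_american(p)))             # 300
print(f"{frac.numerator}/{frac.denominator}")  # 3/1
```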
B.5 Kelly Criterion Quick Reference
Optimal Kelly fraction $f^* = (bp - q) / b$ where $p$ is the true win probability, $q = 1 - p$, and $b$ is the net odds (profit per unit wagered on a winning bet, i.e. decimal odds minus 1). The tables show $f^*$ as a percentage of bankroll.
Even Money Bets ($b = 1$)
| True Prob ($p$) | Edge ($p - 0.5$) | Kelly $f^*$ | Half Kelly | Quarter Kelly |
|---|---|---|---|---|
| 0.51 | 0.01 | 2.0% | 1.0% | 0.5% |
| 0.52 | 0.02 | 4.0% | 2.0% | 1.0% |
| 0.53 | 0.03 | 6.0% | 3.0% | 1.5% |
| 0.55 | 0.05 | 10.0% | 5.0% | 2.5% |
| 0.57 | 0.07 | 14.0% | 7.0% | 3.5% |
| 0.60 | 0.10 | 20.0% | 10.0% | 5.0% |
| 0.65 | 0.15 | 30.0% | 15.0% | 7.5% |
| 0.70 | 0.20 | 40.0% | 20.0% | 10.0% |
| 0.75 | 0.25 | 50.0% | 25.0% | 12.5% |
| 0.80 | 0.30 | 60.0% | 30.0% | 15.0% |
Various Odds and Probabilities
| Net Odds ($b$) | True Prob ($p$) | Break-even Prob | Edge | Kelly $f^*$ |
|---|---|---|---|---|
| 0.5 | 0.70 | 0.667 | 0.033 | 10.0% |
| 0.5 | 0.75 | 0.667 | 0.083 | 25.0% |
| 1.0 | 0.55 | 0.500 | 0.050 | 10.0% |
| 1.0 | 0.60 | 0.500 | 0.100 | 20.0% |
| 2.0 | 0.40 | 0.333 | 0.067 | 10.0% |
| 2.0 | 0.45 | 0.333 | 0.117 | 17.5% |
| 3.0 | 0.30 | 0.250 | 0.050 | 6.7% |
| 3.0 | 0.35 | 0.250 | 0.100 | 13.3% |
| 5.0 | 0.22 | 0.167 | 0.053 | 6.4% |
| 5.0 | 0.25 | 0.167 | 0.083 | 10.0% |
| 9.0 | 0.12 | 0.100 | 0.020 | 2.2% |
| 9.0 | 0.15 | 0.100 | 0.050 | 5.6% |
Practical guidelines:
- If $f^* \leq 0$, do not bet (no edge).
- Half Kelly ($f^*/2$) achieves approximately 75% of the growth rate with substantially lower variance and drawdown.
- Quarter Kelly ($f^*/4$) is recommended when probability estimates are uncertain.
- Never exceed full Kelly; "over-betting" reduces long-run growth rate and can be catastrophic.
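The formula $f^* = (bp - q) / b$ and the guidelines above can be sketched in a single function. The name `kelly_fraction` and the `multiplier` parameter (1.0 for full Kelly, 0.5 for half Kelly, 0.25 for quarter Kelly) are illustrative choices, not part of the original text.

```python
def kelly_fraction(p: float, b: float, multiplier: float = 1.0) -> float:
    """Return the fraction of bankroll to stake.

    p: true win probability, b: net odds (profit per unit staked on a win),
    multiplier: scaling applied to full Kelly (e.g. 0.5 for half Kelly).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if b <= 0.0:
        raise ValueError("net odds b must be positive")
    q = 1.0 - p
    full_kelly = (b * p - q) / b
    # A non-positive f* means there is no edge, so the stake is zero.
    return max(0.0, full_kelly) * multiplier


print(f"{kelly_fraction(0.55, 1.0):.1%}")       # 10.0% (even money, 5-point edge)
print(f"{kelly_fraction(0.40, 2.0, 0.5):.1%}")  # 5.0% (half Kelly at 2-to-1)
print(f"{kelly_fraction(0.45, 1.0):.1%}")       # 0.0% (no edge, do not bet)
```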
B.6 Brier Score Reference
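The Brier score for a single binary forecast is $BS = (p - o)^2$, where $p$ is the predicted probability and $o \in \{0, 1\}$ is the outcome. A forecaster's overall score is the mean of this quantity over all forecasts. Lower is better. The tables show Brier scores for various prediction-outcome combinations.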
When Event Occurs ($o = 1$)
| Predicted $p$ | Brier Score | Interpretation |
|---|---|---|
| 0.00 | 1.000 | Maximally wrong: confident it would not happen |
| 0.05 | 0.903 | Nearly certain it would not happen |
| 0.10 | 0.810 | Very poor prediction |
| 0.20 | 0.640 | Poor prediction |
| 0.30 | 0.490 | Below average |
| 0.40 | 0.360 | Slightly below average |
| 0.50 | 0.250 | Coin-flip prediction (baseline for binary events) |
| 0.60 | 0.160 | Slightly above average |
| 0.70 | 0.090 | Good prediction |
| 0.80 | 0.040 | Very good prediction |
| 0.90 | 0.010 | Excellent prediction |
| 0.95 | 0.003 | Near-perfect prediction |
| 1.00 | 0.000 | Perfect prediction |
When Event Does Not Occur ($o = 0$)
| Predicted $p$ | Brier Score | Interpretation |
|---|---|---|
| 0.00 | 0.000 | Perfect prediction |
| 0.05 | 0.003 | Near-perfect prediction |
| 0.10 | 0.010 | Excellent prediction |
| 0.20 | 0.040 | Very good prediction |
| 0.30 | 0.090 | Good prediction |
| 0.40 | 0.160 | Slightly above average |
| 0.50 | 0.250 | Coin-flip prediction (baseline) |
| 0.60 | 0.360 | Slightly below average |
| 0.70 | 0.490 | Below average |
| 0.80 | 0.640 | Poor prediction |
| 0.90 | 0.810 | Very poor prediction |
| 1.00 | 1.000 | Maximally wrong: confident it would happen |
Benchmark Brier Scores
| Forecaster Type | Typical Mean Brier Score | Notes |
|---|---|---|
| Always predict 0.50 | 0.250 | Uninformed baseline for binary events |
| Climatological base rate | 0.200 - 0.240 | Always predict the historical frequency |
| Typical poll-based model | 0.150 - 0.200 | Simple aggregation of public data |
| Good prediction market | 0.100 - 0.170 | Efficient aggregation of diverse information |
| Expert superforecaster | 0.080 - 0.150 | Trained, calibrated individual forecasters |
| Perfect foresight | 0.000 | Theoretical lower bound |
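For reference, a minimal sketch of the per-forecast and mean Brier score computations follows. The function names and the four example forecasts are hypothetical.

```python
def brier_score(p: float, outcome: int) -> float:
    """Brier score (p - o)^2 for one forecast p and binary outcome o."""
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return (p - outcome) ** 2


def mean_brier_score(forecasts: list[float], outcomes: list[int]) -> float:
    """Mean Brier score over paired forecasts and outcomes."""
    if not forecasts or len(forecasts) != len(outcomes):
        raise ValueError("forecasts and outcomes must be non-empty and equal in length")
    return sum(brier_score(p, o) for p, o in zip(forecasts, outcomes)) / len(forecasts)


# Hypothetical forecasts and resolved outcomes
forecasts = [0.9, 0.7, 0.2, 0.5]
outcomes = [1, 1, 0, 0]
print(f"{mean_brier_score(forecasts, outcomes):.4f}")  # 0.0975
```

The individual terms here are 0.01, 0.09, 0.04, and 0.25, matching the corresponding rows of the two tables above.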
B.7 Sample Size Requirements
The following tables give the minimum number of observations $n$ needed to detect an effect of a given size at standard significance levels. These are essential for determining how many trades or predictions are needed before strategy performance can be evaluated with statistical confidence.
Comparing a Proportion to a Known Value (One-Sample Z-Test)
Minimum $n$ to detect a difference $\delta = |p - p_0|$ from a null proportion $p_0 = 0.50$ with a two-sided test at power $= 0.80$, using the normal approximation $n = (z_{\alpha/2} + z_{0.20})^2 \, p_0 (1 - p_0) / \delta^2$ with $z_{0.20} = 0.842$, rounded up.
| Detectable Difference | $\alpha = 0.10$ | $\alpha = 0.05$ | $\alpha = 0.01$ |
|---|---|---|---|
| 0.01 | 15,457 | 19,623 | 29,198 |
| 0.02 | 3,865 | 4,906 | 7,300 |
| 0.03 | 1,718 | 2,181 | 3,245 |
| 0.05 | 619 | 785 | 1,168 |
| 0.07 | 316 | 401 | 596 |
| 0.10 | 155 | 197 | 292 |
| 0.15 | 69 | 88 | 130 |
| 0.20 | 39 | 50 | 73 |
Interpretation: If your trading strategy has a true win rate of 55% (a 5-percentage-point edge over 50%), you need approximately 785 trades to have an 80% chance of detecting this edge at the 5% significance level. With only about 385 trades, the point at which an observed 55% win rate is just barely significant, the power is roughly 50%.
Comparing Two Proportions (Two-Sample Z-Test)
Minimum $n$ per group to detect a difference $\delta = |p_1 - p_2|$ with a two-sided test at power $= 0.80$, assuming $p_1 \approx p_2 \approx 0.50$ and using the normal approximation $n = 2 (z_{\alpha/2} + z_{0.20})^2 \, p (1 - p) / \delta^2$ with $p = 0.50$, rounded up.
| Detectable Difference | $\alpha = 0.10$ | $\alpha = 0.05$ | $\alpha = 0.01$ |
|---|---|---|---|
| 0.02 | 7,729 | 9,812 | 14,599 |
| 0.05 | 1,237 | 1,570 | 2,336 |
| 0.10 | 310 | 393 | 584 |
| 0.15 | 138 | 175 | 260 |
| 0.20 | 78 | 99 | 146 |
Detecting a Non-Zero Mean Return (One-Sample t-Test)
Minimum $n$ to detect a mean return $\mu$ when returns have standard deviation $\sigma$ (expressed as effect size $d = \mu / \sigma$) with a two-sided test at power $= 0.80$. Values use the normal approximation $n = (z_{\alpha/2} + z_{0.20})^2 / d^2$, rounded up; the exact t-test requirement is a few observations higher, which matters only for the smallest sample sizes in the table.
| Effect Size ($d$) | Description | $\alpha = 0.10$ | $\alpha = 0.05$ | $\alpha = 0.01$ |
|---|---|---|---|---|
| 0.05 | Very small edge | 2,474 | 3,140 | 4,672 |
| 0.10 | Small edge | 619 | 785 | 1,168 |
| 0.20 | Small-medium edge | 155 | 197 | 292 |
| 0.30 | Medium edge | 69 | 88 | 130 |
| 0.50 | Large edge | 25 | 32 | 47 |
| 0.80 | Very large edge | 10 | 13 | 19 |
Practical implications for prediction market trading:
- Edges in prediction markets are typically small ($d$ between 0.05 and 0.20). This means hundreds or thousands of trades may be needed to confirm that a strategy genuinely produces positive returns.
- Strategies that produce large per-trade returns ($d > 0.50$) are rare and usually involve illiquid or niche markets.
- When backtesting over limited historical data, be cautious about claiming statistical significance. These sample size requirements explain why many apparently profitable backtests fail to replicate out of sample.
- For calibration testing, a minimum of approximately 100 forecasts per probability bin is recommended for reliable assessment.
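A standard normal-approximation formula for this kind of requirement is $n = \left( (z_{\alpha/2} + z_{\beta}) \, \sigma / \delta \right)^2$, where $\delta$ is the effect to detect, $\sigma$ is the per-observation standard deviation, and $z_{\beta}$ is the normal quantile for the desired power. The sketch below implements it with the standard library. The function name and defaults are illustrative, and the formula assumes a two-sided test and a sample large enough for the normal approximation to hold.

```python
import math
from statistics import NormalDist


def required_sample_size(
    effect: float, sigma: float = 1.0, alpha: float = 0.05, power: float = 0.80
) -> int:
    """Approximate n for a two-sided one-sample test (normal approximation).

    effect: difference to detect, sigma: standard deviation of one observation.
    With sigma = 1.0, `effect` is the standardized effect size d.
    """
    if effect <= 0.0 or sigma <= 0.0:
        raise ValueError("effect and sigma must be positive")
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)               # quantile for the target power
    return math.ceil(((z_alpha + z_beta) * sigma / effect) ** 2)


# Win rate of 55% versus a null of 50%: sigma = sqrt(0.5 * 0.5) = 0.5
print(required_sample_size(effect=0.05, sigma=0.5))  # 785
# Standardized mean return d = 0.5
print(required_sample_size(effect=0.5))              # 32
```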
Summary. The tables in this appendix support the quantitative analyses presented throughout the book. The z-table and t-table underpin hypothesis testing for strategy evaluation, and the chi-square table supports calibration testing. The probability conversion table is essential for moving between prediction market prices and traditional odds formats. The Kelly table provides quick sizing decisions. The Brier score reference supports forecast evaluation, and the sample size tables provide realistic expectations about how much data is required to draw confident conclusions about trading performance.