Quiz: Nonparametric Methods
Test your understanding of nonparametric tests, ranking procedures, the sign test, Wilcoxon tests, the Kruskal-Wallis test, and the decision framework for choosing between parametric and nonparametric methods. Try to answer each question before revealing the answer.
1. The main advantage of nonparametric tests over parametric tests is:
(a) They are more powerful in all situations (b) They have no assumptions whatsoever (c) They do not require the normality assumption (d) They always produce smaller p-values
Answer
**(c) They do not require the normality assumption.** Nonparametric tests do have assumptions (independence, similar distribution shapes, rankable data), so (b) is incorrect. They are not always more powerful — in fact, they are slightly less powerful when normality holds — so (a) and (d) are incorrect. The key advantage is that they work without assuming a specific distribution shape for the population.

2. When data values are converted to ranks, the following property is gained:
(a) The data becomes normally distributed (b) Extreme outliers have the same influence as moderate values (c) The sample size increases (d) The degrees of freedom increase
Answer
**(b) Extreme outliers have the same influence as moderate values.** Whether an extreme value is 100 or 1,000,000, it receives the same rank (the highest one). This makes rank-based tests robust to outliers. Ranking does not make the data normal (a), does not change the sample size (c), and does not affect the degrees of freedom (d).

3. How are tied values handled in rank-based nonparametric tests?
(a) They are dropped from the analysis (b) Each tied value receives the rank of the first occurrence (c) Each tied value receives the average of the ranks they would have occupied (d) Ties make nonparametric tests invalid
Answer
**(c) Each tied value receives the average of the ranks they would have occupied.** This is called midrank assignment. For example, if two values are tied at positions 3 and 4, each receives rank 3.5. Ties are not dropped (a), are not assigned first-occurrence ranks (b), and do not invalidate the tests (d), though many tied values can reduce the test's power.

4. Which of the following data types is the strongest reason to use a nonparametric test?
(a) Continuous measurements from a normal population (b) Ordinal ratings on a 1-5 scale (c) Large sample sizes ($n > 100$) (d) Data with equal variances across groups
Answer
**(b) Ordinal ratings on a 1-5 scale.** Ordinal data has a meaningful order but unequal intervals between levels. Computing a mean of ordinal data is technically questionable, making parametric tests (which compare means) inappropriate. This is the clearest case for nonparametric methods. Options (a), (c), and (d) all favor parametric tests.

5. The sign test for paired data works by:
(a) Computing the mean of the differences and testing whether it equals zero (b) Ranking the absolute differences and analyzing the signed ranks (c) Counting the number of positive and negative differences and comparing to a binomial distribution (d) Computing the U statistic from the combined ranks
Answer
**(c) Counting the number of positive and negative differences and comparing to a binomial distribution.** Under $H_0$, each non-zero difference is equally likely to be positive or negative, so the count of positive differences follows $\text{Binomial}(n, 0.5)$. Option (a) describes the paired t-test, (b) describes the Wilcoxon signed-rank test, and (d) describes the Mann-Whitney U test.

6. Under the null hypothesis of the sign test, the number of positive differences follows:
(a) A normal distribution (b) A t-distribution (c) A binomial distribution with $p = 0.5$ (d) A chi-square distribution
Answer
**(c) A binomial distribution with $p = 0.5$.** If the treatment has no effect, each difference is equally likely to be positive or negative — like a fair coin flip. The count of positive differences among $n$ non-zero differences follows $\text{Binomial}(n, 0.5)$.

7. The Wilcoxon rank-sum test is the nonparametric alternative to:
(a) The one-sample t-test (b) The paired t-test (c) The two-sample t-test for independent groups (d) One-way ANOVA
Answer
**(c) The two-sample t-test for independent groups.** The Wilcoxon rank-sum (Mann-Whitney U) test compares two independent groups, just like the two-sample t-test from Chapter 16. The nonparametric alternative to the paired t-test is the Wilcoxon signed-rank test (b). The nonparametric alternative to ANOVA is the Kruskal-Wallis test (d).

8. A useful check when computing rank sums for two groups is that $W_1 + W_2$ should equal:
(a) $n_1 + n_2$ (b) $n_1 \times n_2$ (c) $N(N+1)/2$ where $N = n_1 + n_2$ (d) $(n_1 + n_2)^2$
Answer
**(c) $N(N+1)/2$ where $N = n_1 + n_2$.** The sum of all ranks from 1 to $N$ is always $N(N+1)/2$. Since every observation gets exactly one rank, and the ranks for Group 1 and Group 2 partition the complete set of ranks, their sums must add to this total. This is a valuable error-checking tool.

9. In the Mann-Whitney U test, the U statistic for Group 1 is computed as:
(a) $U_1 = W_1 + n_1(n_1+1)/2$ (b) $U_1 = W_1 - n_1(n_1+1)/2$ (c) $U_1 = W_1 / n_1$ (d) $U_1 = W_1 \times W_2$
Answer
**(b) $U_1 = W_1 - n_1(n_1+1)/2$.** The U statistic adjusts the rank sum by subtracting the minimum possible rank sum for that group. If Group 1 had all the smallest values, its rank sum would be $1 + 2 + \cdots + n_1 = n_1(n_1+1)/2$, giving $U_1 = 0$. The U statistic counts how many times a Group 1 observation exceeds a Group 2 observation.

10. The Wilcoxon signed-rank test differs from the sign test in that it:
(a) Can be used for independent samples (b) Uses the magnitude of differences (via ranks), not just their signs (c) Requires normally distributed data (d) Can compare more than two groups
Answer
**(b) Uses the magnitude of differences (via ranks), not just their signs.** The sign test only uses the direction (positive or negative) of each difference. The Wilcoxon signed-rank test also uses the relative magnitude by ranking the absolute differences. A large improvement gets a higher rank than a small improvement, giving the test more information and therefore more power. Both tests are for paired data (not independent), don't require normality, and only compare two conditions.

11. In the Wilcoxon signed-rank test, if a paired difference equals zero, you should:
(a) Assign it rank 1 (b) Count it as a positive difference (c) Drop it from the analysis (d) Assign it the average rank of all zero differences
Answer
**(c) Drop it from the analysis.** Zero differences provide no information about which direction is more common. They are excluded, and the sample size for the test is reduced accordingly. The non-zero differences are then ranked by their absolute values.

12. The Kruskal-Wallis test is the nonparametric alternative to:
(a) The two-sample t-test (b) The paired t-test (c) One-way ANOVA (d) The chi-square test of independence
Answer
**(c) One-way ANOVA.** Just as the Wilcoxon rank-sum test is the rank-based replacement for the two-sample t-test, the Kruskal-Wallis test extends rank-based comparison to three or more independent groups, the same job one-way ANOVA does. Under $H_0$, the Kruskal-Wallis H statistic follows an approximate chi-square distribution with $df = k - 1$.

13. Under the null hypothesis, the Kruskal-Wallis H statistic approximately follows:
(a) A normal distribution (b) A t-distribution (c) An F-distribution (d) A chi-square distribution with $k - 1$ degrees of freedom
Answer
**(d) A chi-square distribution with $k - 1$ degrees of freedom.** This parallels the F-distribution in ANOVA, but the chi-square distribution arises because the H statistic is constructed from squared deviations of group mean ranks from the overall mean rank. With $k$ groups, the test has $k - 1$ degrees of freedom.

14. After a significant Kruskal-Wallis test, the appropriate post-hoc procedure is:
(a) Tukey's HSD (b) Pairwise Mann-Whitney U tests with Bonferroni correction (c) Pairwise sign tests (d) No post-hoc tests are available for nonparametric methods
Answer
**(b) Pairwise Mann-Whitney U tests with Bonferroni correction.** Tukey's HSD (a) is a parametric post-hoc test designed for ANOVA. The appropriate nonparametric follow-up is to run Mann-Whitney U tests for each pair of groups and adjust for multiple comparisons using Bonferroni correction (divide $\alpha$ by the number of comparisons). Dunn's test is another common option. Post-hoc tests are certainly available for nonparametric methods (d is false).

15. The asymptotic relative efficiency (ARE) of the Mann-Whitney U test compared to the two-sample t-test for normal data is approximately:
(a) 0.50 (50% as efficient) (b) 0.75 (75% as efficient) (c) 0.955 (95.5% as efficient) (d) 1.50 (50% more efficient)
Answer
**(c) 0.955 (95.5% as efficient).** This means that when data are truly normally distributed, the Mann-Whitney U test needs about $1/0.955 \approx 1.047$ times as many observations to achieve the same power as the t-test — roughly 5% more data. This is a remarkably small penalty, and for non-normal data, the ARE can exceed 1.0, meaning the nonparametric test is actually *more* efficient.

16. In which scenario would a nonparametric test likely be MORE powerful than its parametric counterpart?
(a) Large samples from a normal population (b) Small samples from a heavily right-skewed population with outliers (c) Moderate samples from a symmetric, bell-shaped population (d) Very large samples from any population
Answer
**(b) Small samples from a heavily right-skewed population with outliers.** Outliers inflate the standard deviation in parametric tests, increasing the standard error and making it harder to detect real differences. Nonparametric tests, working with ranks, are unaffected by the outliers' exact values. In scenario (b), the rank-based test preserves the genuine signal while the t-test is distracted by inflated variance. For normal or symmetric data (a, c, d), the parametric test retains its slight power advantage.

17. A researcher has ordinal data (satisfaction ratings, 1-5) and runs a t-test, finding $p = 0.03$. A colleague points out the data is ordinal and suggests a Mann-Whitney U test, which gives $p = 0.04$. Both are significant at $\alpha = 0.05$. Which result should be reported?
(a) The t-test result, because it has a smaller p-value (b) The Mann-Whitney U result, because it's appropriate for ordinal data (c) Both results, letting the reader decide (d) It doesn't matter since both are significant
Answer
**(b) The Mann-Whitney U result, because it's appropriate for ordinal data.** The choice of statistical test should be based on the data type and assumptions, not on which p-value is more favorable. Ordinal data doesn't have equal intervals between levels, so computing a mean (which the t-test compares) is technically inappropriate. The Mann-Whitney U test, which works with ranks, is the correct choice for ordinal data regardless of what the t-test happens to show.

18. Excel's Data Analysis ToolPak:
(a) Includes all major nonparametric tests (b) Includes Mann-Whitney but not Kruskal-Wallis (c) Does not include any nonparametric tests (d) Includes Kruskal-Wallis but not Mann-Whitney
Answer
**(c) Does not include any nonparametric tests.** This is a well-known limitation of Excel's built-in Data Analysis ToolPak. For nonparametric tests in Excel, you need to either calculate manually using RANK.AVG() and formulas, install add-in packages (like the Real Statistics Resource Pack), or use Python/R. This is one area where the investment in learning Python particularly pays off.

19. A researcher runs a Kruskal-Wallis test with $k = 4$ groups and gets $H = 2.89$, $p = 0.41$. She then runs pairwise Mann-Whitney U tests and finds one pair with $p = 0.03$. What is the correct interpretation?
(a) One pair truly differs; the Kruskal-Wallis test missed it (b) The pairwise result is likely a false positive; do not conduct post-hoc tests after a non-significant omnibus test (c) The Kruskal-Wallis test must be wrong (d) The pairwise result should be reported instead of the Kruskal-Wallis result
Answer
**(b) The pairwise result is likely a false positive; do not conduct post-hoc tests after a non-significant omnibus test.** This is the same principle as with ANOVA (Chapter 20): a non-significant omnibus test means no post-hoc pairwise comparisons should be conducted. With $\binom{4}{2} = 6$ pairwise comparisons, the probability of at least one false positive at $\alpha = 0.05$ is $1 - 0.95^6 = 0.265$ — a 26.5% chance. The $p = 0.03$ pairwise result is likely due to this inflated false positive rate.

20. The best general strategy for choosing between parametric and nonparametric tests is:
(a) Always use nonparametric tests to be safe (b) Always use parametric tests because they're more powerful (c) Decide based on data type, sample size, and assumption checks — and consider running both as a robustness check (d) Run both and always report whichever gives the smaller p-value

Answer

**(c) Decide based on data type, sample size, and assumption checks — and consider running both as a robustness check.** Choose the test that matches the data: parametric tests when their assumptions are plausible (they are slightly more powerful there), nonparametric tests for ordinal data, small skewed samples, or heavy outliers. Always choosing nonparametric (a) gives up power unnecessarily; always choosing parametric (b) risks invalid results when assumptions fail; and picking whichever p-value is smaller (d) is the p-hacking warned against in question 17. When both tests agree, the conclusion is robust to the choice of method.
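To make the sign-test logic from questions 5 and 6 concrete, here is a minimal pure-Python sketch. The function name and the list of differences are invented for illustration.

```python
from math import comb

def sign_test_pvalue(diffs):
    """Two-sided sign test: count positive differences among the
    non-zero ones and compare to Binomial(n, 0.5)."""
    nonzero = [d for d in diffs if d != 0]   # zeros carry no direction
    n = len(nonzero)
    k = sum(d > 0 for d in nonzero)          # number of positive differences
    tail = min(k, n - k)                     # the smaller tail
    p_tail = sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, 2 * p_tail)              # double the smaller tail, cap at 1

# 8 positive and 2 negative differences out of 10:
p = sign_test_pvalue([3, 1, 4, 2, 5, 1, 2, 6, -1, -2])
print(round(p, 6))  # 0.109375
```

Because the null distribution with $p = 0.5$ is symmetric, doubling the smaller tail gives the exact two-sided p-value here.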
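The midrank rule (question 3), the rank-sum identity (question 8), and the U formula (question 9) can all be verified in a few lines. This sketch assumes scipy is available; the two samples are invented.

```python
from scipy.stats import rankdata

# Midranks (question 3): the tied 15s occupy positions 2 and 3, so each gets 2.5
assert list(rankdata([12, 15, 15, 18, 20])) == [1.0, 2.5, 2.5, 4.0, 5.0]

group1 = [3.1, 4.2, 5.6]           # invented data
group2 = [2.0, 4.9, 6.3, 7.1]
n1, N = len(group1), len(group1) + len(group2)

ranks = rankdata(group1 + group2)  # rank the pooled sample
W1, W2 = ranks[:n1].sum(), ranks[n1:].sum()
assert W1 + W2 == N * (N + 1) / 2  # question 8's check: 7 * 8 / 2 = 28

U1 = W1 - n1 * (n1 + 1) / 2        # question 9's formula
# U1 equals the number of (Group 1, Group 2) pairs where Group 1 is larger
wins = sum((x > y) + 0.5 * (x == y) for x in group1 for y in group2)
assert U1 == wins                  # both equal 4.0 for this data
```

The last assertion is the counting interpretation of U from question 9's answer: subtracting the minimum possible rank sum leaves exactly the number of pairwise "wins" for Group 1.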
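For the zero-difference rule in questions 10 and 11, a short scipy sketch with invented paired scores:

```python
from scipy.stats import wilcoxon

before = [85, 90, 78, 92, 88]      # invented paired scores
after = [88, 90, 85, 95, 91]
# Differences are 3, 0, 7, 3, 3; zero_method='wilcox' drops the zero pair,
# leaving n = 4 non-zero differences ranked by absolute value.
stat, p = wilcoxon(after, before, zero_method='wilcox')
# All remaining differences are positive, so the smaller (negative) rank
# sum, which scipy reports as the two-sided statistic, is 0.
print(stat)  # 0.0
```

Note that scipy reports the smaller of the two signed-rank sums as the statistic for a two-sided test, so a value of 0 is the most extreme possible result for the given sample size.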
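Questions 12 through 14 and 19 fit together in one sketch: run the Kruskal-Wallis omnibus test, follow up with Bonferroni-corrected pairwise Mann-Whitney U tests only if it is significant, and check question 19's false-positive arithmetic. The data and group names are invented; scipy is assumed.

```python
from itertools import combinations
from scipy.stats import kruskal, mannwhitneyu

groups = {                          # invented data for three groups
    'a': [12, 15, 14, 16, 13],
    'b': [18, 21, 17, 20, 19],
    'c': [11, 13, 12, 14, 10],
}

# H is approximately chi-square with k - 1 = 2 df under the null
H, p = kruskal(*groups.values())

if p < 0.05:                        # post-hoc only after a significant omnibus test
    pairs = list(combinations(groups, 2))
    alpha_adj = 0.05 / len(pairs)   # Bonferroni: 0.05 / 3
    for g1, g2 in pairs:
        _, p_pair = mannwhitneyu(groups[g1], groups[g2],
                                 alternative='two-sided')
        print(g1, 'vs', g2, 'differs:', p_pair < alpha_adj)

# Question 19: with 6 uncorrected comparisons at alpha = 0.05, the chance
# of at least one false positive is 1 - 0.95**6
print(round(1 - 0.95 ** 6, 4))  # 0.2649
```

Guarding the pairwise loop behind `p < 0.05` encodes the rule from question 19: no post-hoc comparisons after a non-significant omnibus test.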