Chapter 26 Exercises: A/B Testing Content and Offer Strategy
Exercise 26.1 — Hypothesis Formation Workshop
Objective: Practice the discipline of forming clear, testable hypotheses before running any test.
Instructions: A valid test hypothesis has three components:
1. Variable: What exactly is being changed?
2. Expected outcome: What do you predict will happen, and why?
3. Metric: How will you measure success?
Write a formal hypothesis for each of the following test scenarios:
Scenario A: You currently use neutral, informational thumbnail designs. You want to test whether thumbnails featuring your face with a strong emotion perform better.
Scenario B: Your email open rate averages 18%. You want to test whether subject lines that include a specific actionable number perform better than your current approach.
Scenario C: Your course landing page currently has the CTA button ("Enroll Now") at the bottom of the page. You want to test whether placing a second CTA button above the fold increases conversion.
Template for each hypothesis: "If I [change variable], then [metric] will [increase/decrease/change] because [reasoning based on audience understanding]. I will measure [specific metric] over [time period] with a minimum of [n] observations per variant."
Reflection: What makes a hypothesis weak or untestable? What distinguishes "my audience might like this" from a real hypothesis?
Exercise 26.2 — Sample Size Calculation Before Testing
Objective: Learn to calculate required sample sizes before committing to a test, preventing premature or underpowered conclusions.
Instructions:
Use the ab_test_analysis.py sample size calculator (function calculate_required_sample_size) to answer the following questions:
- Your email newsletter has a 22% open rate. You want to detect an improvement of at least 4 percentage points (to 26%). With 80% statistical power and 5% significance level, how many subscribers per variant do you need? How many total emails does this require?
- Your YouTube thumbnails have a 5.5% average CTR. You want to detect an improvement of at least 1 percentage point (to 6.5%). What sample size per variant is needed? At 400 views per day, how many days does this test require?
- Your landing page converts at 2.8%. You want to detect a 20% relative improvement (to 3.36%). How many visitors per variant do you need? If you get 80 visitors per day, is this test feasible within 60 days?
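If you do not have ab_test_analysis.py in front of you, the standard per-variant sample-size formula for a two-sided, two-proportion z-test can be sketched in a few lines of Python using only the standard library. This is a minimal implementation of the textbook formula, not the book's calculate_required_sample_size, so treat its exact outputs as approximate:

```python
from math import ceil, sqrt
from statistics import NormalDist

def required_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Per-variant sample size to detect a change from rate p1 to rate p2.

    Uses the standard normal-approximation formula for comparing
    two proportions with a two-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Scenario 1: 22% baseline open rate, hoping to detect a lift to 26%
n = required_sample_size(0.22, 0.26)
print(n, "subscribers per variant,", 2 * n, "emails total")
```

The same function answers the duration questions: divide the total required observations (both variants combined) by your daily traffic to estimate days to completion.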
Deliverable: A table showing required sample sizes, estimated test duration, and a feasibility assessment (feasible / borderline / not feasible) for each scenario.
Reflection: What does sample size calculation tell you about the practical limits of testing at small audience sizes? What alternatives exist when tests are not feasible?
Exercise 26.3 — Run a Real Email Subject Line Test
Objective: Execute your first real A/B test and analyze the results.
Instructions:
1. Identify your next planned email newsletter send.
2. Write two subject line variants that test a single variable (e.g., question vs. statement, with vs. without a number, or with vs. without an emoji).
3. Set up the test in your email platform (most platforms have A/B testing built in).
4. Send to at least 200 subscribers per variant (or your full list if smaller, split 50/50).
5. Wait at least 48 hours before checking results.
6. Export the results: number sent, number opened, open rate per variant.
7. Enter the results into ab_test_analysis.py using the run_proportion_z_test function.
8. Interpret the output: Is the result statistically significant? What is the p-value? What action will you take?
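For reference, the pooled two-proportion z-test that run_proportion_z_test performs can be reproduced in a few lines. The sketch below is a standalone implementation of the standard test, not the script's actual source, so the argument names and return values are assumptions:

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(opens_a, sent_a, opens_b, sent_b):
    """Two-sided z-test for a difference between two proportions (pooled SE)."""
    p_a, p_b = opens_a / sent_a, opens_b / sent_b
    p_pool = (opens_a + opens_b) / (sent_a + sent_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / sent_a + 1 / sent_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed
    return z, p_value

# Hypothetical results: variant A opened 44/200, variant B opened 58/200
z, p = proportion_z_test(44, 200, 58, 200)
print(f"z = {z:.2f}, p = {p:.3f}")
```

In this hypothetical example, variant B's open rate (29% vs. 22%) looks clearly better, yet the p-value comes out above 0.10 at these sample sizes; that gap between "looks better" and "is statistically distinguishable" is exactly what step 8 asks you to confront.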
Deliverable: A test report including your hypothesis, the two subject lines, raw results data, statistical analysis output, and your conclusion.
Note: If you do not yet have an email list, complete this exercise using the sample data in the script and write hypothetical subject lines based on your content niche.
Exercise 26.4 — Analyze the Meridian Collective's Bundle Test
Objective: Practice interpreting A/B test results and drawing actionable conclusions.
Background data: The Meridian Collective ran a sequential bundle test on their Destiny 2 Starter Pack. Here are the results:
- Period A (30 days, October): "Community Discord + Beginner Raid Guide"
- Visitors: 2,340 | Conversions: 82 | Conversion rate: 3.51%
- Price: $15
- Revenue: $1,230
- Period B (30 days, November): "Private Coaching Discord + Beginner Raid Guide"
- Visitors: 2,180 | Conversions: 89 | Conversion rate: 4.08%
- Price: $15
- Revenue: $1,335
Instructions:
1. Run the proportion z-test using ab_test_analysis.py with these numbers.
2. Is the difference statistically significant? What is the p-value?
3. Calculate the relative improvement in conversion rate from A to B.
4. Note that traffic differs between periods (2,340 vs. 2,180). Does this affect your interpretation? How would you account for seasonal effects (October vs. November)?
5. The revenue difference is $105. If this test were annualized, what is the cumulative revenue impact of implementing Version B permanently?
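Steps 3 and 5 reduce to simple arithmetic on the figures above. A quick sketch follows; note that the annualization naively multiplies one 30-day period's revenue gap by 12, which ignores seasonality, a limitation worth flagging in your write-up:

```python
# Meridian Collective bundle test figures from the background data
rate_a = 82 / 2340   # Period A conversion rate (~3.51%)
rate_b = 89 / 2180   # Period B conversion rate (~4.08%)

relative_improvement = (rate_b / rate_a - 1) * 100  # percent lift from A to B
monthly_revenue_gap = 1335 - 1230                   # dollars per 30-day period
annualized_gap = monthly_revenue_gap * 12           # naive 12-month projection

print(f"Relative improvement: {relative_improvement:.1f}%")
print(f"Annualized revenue impact: ${annualized_gap}")
```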
Deliverable: A written test analysis with statistical output, interpretation, and business recommendation.
Reflection: When you run the numbers, the test does not reach statistical significance at p < 0.05. Should the Meridian Collective implement Version B anyway? What additional information would help you make that call?
Exercise 26.5 — Design a Landing Page Test Protocol
Objective: Design a complete, methodologically sound test protocol for a landing page change.
Instructions: Choose a landing page you have (or one you plan to create). Design a test protocol for the single most impactful change you could make. Your protocol must include:
- Current state description: What does the page currently look like? What is the baseline conversion rate (or your best estimate)?
- Variable being tested: Exactly what is changing between Version A and Version B? (One thing only.)
- Hypothesis: Full hypothesis statement (see Exercise 26.1 template).
- Success metric: Primary metric (conversion rate? Revenue per visitor?) and secondary metrics to watch.
- Sample size requirement: Calculate using ab_test_analysis.py.
- Test duration estimate: Given your current traffic, how many days until you hit the required sample size?
- Stopping rules: Minimum run time AND minimum sample size must both be met before declaring a winner.
- Potential confounds: What external factors could interfere with this test (seasonal events, planned promotions, traffic source changes)?
- Implementation plan: If Version B wins, how will you implement it? If Version B loses, what is your next test hypothesis?
Deliverable: A one-page test protocol document you could hand to a collaborator or revisit in 60 days.
Exercise 26.6 — Build Your Iteration Log
Objective: Create a structured system for tracking testing knowledge over time.
Instructions:
1. Set up an iteration log spreadsheet (Google Sheets) or Airtable base with the following fields:
   - Test ID (sequential number)
   - Test Name (brief description)
   - Variable Tested
   - Date Started
   - Date Ended
   - Variant A Description
   - Variant B Description
   - Primary Metric
   - Variant A Result (rate + count)
   - Variant B Result (rate + count)
   - Sample Size per Variant
   - P-Value
   - Statistically Significant? (Yes / No / Borderline)
   - Winner
   - Relative Improvement %
   - Action Taken
   - Notes/Context
2. Populate at least 3 rows with tests you have run (or, if you have not run tests yet, with informal comparisons you have noticed — videos that performed better or worse than expected, emails that had unusually high or low open rates, etc.).
3. Add a "Lessons" tab where you write summary principles that emerge from multiple tests (e.g., "Subject lines with numbers outperform vague subjects for this audience — tested 3x").
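If you would rather start from a plain CSV and import it into Google Sheets or Airtable, the header row (plus one sample entry) can be generated with Python's standard library. The filename and the placeholder row below are illustrative choices, not part of the book's materials:

```python
import csv

# Column headers matching the iteration log fields from step 1
FIELDS = [
    "Test ID", "Test Name", "Variable Tested", "Date Started", "Date Ended",
    "Variant A Description", "Variant B Description", "Primary Metric",
    "Variant A Result", "Variant B Result", "Sample Size per Variant",
    "P-Value", "Statistically Significant?", "Winner",
    "Relative Improvement %", "Action Taken", "Notes/Context",
]

with open("iteration_log.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(FIELDS)
    # Placeholder first entry; replace with a real test you have run.
    writer.writerow([1, "Subject line: number vs. none", "Subject line",
                     "2024-03-01", "2024-03-03", "No number", "Includes '5'",
                     "Open rate", "18.2% (91/500)", "21.4% (107/500)", 500,
                     0.20, "No", "Inconclusive", "+17.6%",
                     "Re-test with larger list", ""])
```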
Deliverable: A populated iteration log with at least 3 entries and at least 1 summary principle.
Reflection: How would having 50 entries in this log, built over 18 months, change how you approach content and offer decisions? What is the compounding value of systematic testing knowledge?