Chapter 26 Exercises: A/B Testing Content and Offer Strategy
Exercise 26.1 — Hypothesis Formation Workshop
Objective: Practice the discipline of forming clear, testable hypotheses before running any test.
Instructions: A valid test hypothesis has three components:
1. Variable: What exactly is being changed?
2. Expected outcome: What do you predict will happen, and why?
3. Metric: How will you measure success?
Write a formal hypothesis for each of the following test scenarios:
Scenario A: You currently use neutral, informational thumbnail designs. You want to test whether thumbnails featuring your face with a strong emotion perform better.
Scenario B: Your email open rate averages 18%. You want to test whether subject lines that include a specific actionable number perform better than your current approach.
Scenario C: Your course landing page currently has the CTA button ("Enroll Now") at the bottom of the page. You want to test whether placing a second CTA button above the fold increases conversion.
Template for each hypothesis: "If I [change variable], then [metric] will [increase/decrease/change] because [reasoning based on audience understanding]. I will measure [specific metric] over [time period] with a minimum of [n] observations per variant."
Reflection: What makes a hypothesis weak or untestable? What distinguishes "my audience might like this" from a real hypothesis?
Exercise 26.2 — Sample Size Calculation Before Testing
Objective: Learn to calculate required sample sizes before committing to a test, preventing premature or underpowered conclusions.
Instructions:
Use the ab_test_analysis.py sample size calculator (function calculate_required_sample_size) to answer the following questions:
- Your email newsletter has a 22% open rate. You want to detect an improvement of at least 4 percentage points (to 26%). With 80% statistical power and 5% significance level, how many subscribers per variant do you need? How many total emails does this require?
- Your YouTube thumbnails have a 5.5% average CTR. You want to detect an improvement of at least 1 percentage point (to 6.5%). What sample size per variant is needed? At 400 views per day, how many days does this test require?
- Your landing page converts at 2.8%. You want to detect a 20% relative improvement (to 3.36%). How many visitors per variant do you need? If you get 80 visitors per day, is this test feasible within 60 days?
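If you do not have ab_test_analysis.py in front of you, the standard per-variant sample-size formula for a two-sided, two-proportion z-test can be sketched in a few lines of Python using only the standard library. This is a minimal implementation of the textbook formula, not the book's calculate_required_sample_size, so treat its exact outputs as approximate:

```python
from math import ceil, sqrt
from statistics import NormalDist

def required_sample_size(p1, p2, alpha=0.05, power=0.80):
    """Per-variant sample size to detect a change from rate p1 to rate p2.

    Uses the standard normal-approximation formula for comparing
    two proportions with a two-sided test.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    p_bar = (p1 + p2) / 2
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# Scenario 1: 22% baseline open rate, hoping to detect a lift to 26%
n = required_sample_size(0.22, 0.26)
print(n, "subscribers per variant,", 2 * n, "emails total")
```

The same function answers the duration questions: divide the total required observations (both variants combined) by your daily traffic to estimate days to completion.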
Deliverable: A table showing required sample sizes, estimated test duration, and a feasibility assessment (feasible / borderline / not feasible) for each scenario.
Reflection: What does sample size calculation tell you about the practical limits of testing at small audience sizes? What alternatives exist when tests are not feasible?
Exercise 26.3 — Run a Real Email Subject Line Test
Objective: Execute your first real A/B test and analyze the results.
Instructions:
1. Identify your next planned email newsletter send.
2. Write two subject line variants that test a single variable (e.g., question vs. statement, with vs. without a number, or with vs. without an emoji).
3. Set up the test in your email platform (most platforms have A/B testing built in).
4. Send to at least 200 subscribers per variant (or your full list if smaller, split 50/50).
5. Wait at least 48 hours before checking results.
6. Export the results: number sent, number opened, open rate per variant.
7. Enter the results into ab_test_analysis.py using the run_proportion_z_test function.
8. Interpret the output: Is the result statistically significant? What is the p-value? What action will you take?
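For reference, the pooled two-proportion z-test that run_proportion_z_test performs can be reproduced in a few lines. The sketch below is a standalone implementation of the standard test, not the script's actual source, so the argument names and return values are assumptions:

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(opens_a, sent_a, opens_b, sent_b):
    """Two-sided z-test for a difference between two proportions (pooled SE)."""
    p_a, p_b = opens_a / sent_a, opens_b / sent_b
    p_pool = (opens_a + opens_b) / (sent_a + sent_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / sent_a + 1 / sent_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-tailed
    return z, p_value

# Hypothetical results: variant A opened 44/200, variant B opened 58/200
z, p = proportion_z_test(44, 200, 58, 200)
print(f"z = {z:.2f}, p = {p:.3f}")
```

In this hypothetical example, variant B's open rate (29% vs. 22%) looks clearly better, yet the p-value comes out above 0.10 at these sample sizes; that gap between "looks better" and "is statistically distinguishable" is exactly what step 8 asks you to confront.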
Deliverable: A test report including your hypothesis, the two subject lines, raw results data, statistical analysis output, and your conclusion.
Note: If you do not yet have an email list, complete this exercise using the sample data in the script and write hypothetical subject lines based on your content niche.
Exercise 26.4 — Analyze the Meridian Collective's Bundle Test
Objective: Practice interpreting A/B test results and drawing actionable conclusions.
Background data: The Meridian Collective ran a sequential bundle test on their Destiny 2 Starter Pack. Here are the results:
- Period A (30 days, October): "Community Discord + Beginner Raid Guide"
- Visitors: 2,340 | Conversions: 82 | Conversion rate: 3.51%
- Price: $15
- Revenue: $1,230
- Period B (30 days, November): "Private Coaching Discord + Beginner Raid Guide"
- Visitors: 2,180 | Conversions: 89 | Conversion rate: 4.08%
- Price: $15
- Revenue: $1,335
Instructions:
1. Run the proportion z-test using ab_test_analysis.py with these numbers.
2. Is the difference statistically significant? What is the p-value?
3. Calculate the relative improvement in conversion rate from A to B.
4. Note that traffic differs between periods (2,340 vs. 2,180). Does this affect your interpretation? How would you account for seasonal effects (October vs. November)?
5. The revenue difference is $105. If this test were annualized, what is the cumulative revenue impact of implementing Version B permanently?
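Steps 3 and 5 reduce to simple arithmetic on the figures above. A quick sketch follows; note that the annualization naively multiplies one 30-day period's revenue gap by 12, which ignores seasonality, a limitation worth flagging in your write-up:

```python
# Meridian Collective bundle test figures from the background data
rate_a = 82 / 2340   # Period A conversion rate (~3.51%)
rate_b = 89 / 2180   # Period B conversion rate (~4.08%)

relative_improvement = (rate_b / rate_a - 1) * 100  # percent lift from A to B
monthly_revenue_gap = 1335 - 1230                   # dollars per 30-day period
annualized_gap = monthly_revenue_gap * 12           # naive 12-month projection

print(f"Relative improvement: {relative_improvement:.1f}%")
print(f"Annualized revenue impact: ${annualized_gap}")
```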
Deliverable: A written test analysis with statistical output, interpretation, and business recommendation.
Reflection: When you run the numbers, the test does not reach statistical significance at p < 0.05. Should the Meridian Collective implement Version B anyway? What additional information would help you make that call?
Exercise 26.5 — Design a Landing Page Test Protocol
Objective: Design a complete, methodologically sound test protocol for a landing page change.
Instructions: Choose a landing page you have (or one you plan to create). Design a test protocol for the single most impactful change you could make. Your protocol must include:
- Current state description: What does the page currently look like? What is the baseline conversion rate (or your best estimate)?
- Variable being tested: Exactly what is changing between Version A and Version B? (One thing only.)
- Hypothesis: Full hypothesis statement (see Exercise 26.1 template).
- Success metric: Primary metric (conversion rate? Revenue per visitor?) and secondary metrics to watch.
- Sample size requirement: Calculate using ab_test_analysis.py.
- Test duration estimate: Given your current traffic, how many days until you hit the required sample size?
- Stopping rules: Minimum run time AND minimum sample size must both be met before declaring a winner.
- Potential confounds: What external factors could interfere with this test (seasonal events, planned promotions, traffic source changes)?
- Implementation plan: If Version B wins, how will you implement it? If Version B loses, what is your next test hypothesis?
Deliverable: A one-page test protocol document you could hand to a collaborator or revisit in 60 days.
Exercise 26.6 — Build Your Iteration Log
Objective: Create a structured system for tracking testing knowledge over time.
Instructions:
1. Set up an iteration log spreadsheet (Google Sheets) or Airtable base with the following fields:
   - Test ID (sequential number)
   - Test Name (brief description)
   - Variable Tested
   - Date Started
   - Date Ended
   - Variant A Description
   - Variant B Description
   - Primary Metric
   - Variant A Result (rate + count)
   - Variant B Result (rate + count)
   - Sample Size per Variant
   - P-Value
   - Statistically Significant? (Yes / No / Borderline)
   - Winner
   - Relative Improvement %
   - Action Taken
   - Notes/Context
2. Populate at least 3 rows with tests you have run (or, if you have not run tests yet, with informal comparisons you have noticed — videos that performed better or worse than expected, emails that had unusually high or low open rates, etc.).
3. Add a "Lessons" tab where you write summary principles that emerge from multiple tests (e.g., "Subject lines with numbers outperform vague subjects for this audience — tested 3x").
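If you would rather start from a plain CSV and import it into Google Sheets or Airtable, the header row (plus one sample entry) can be generated with Python's standard library. The filename and the placeholder row below are illustrative choices, not part of the book's materials:

```python
import csv

# Column headers matching the iteration log fields from step 1
FIELDS = [
    "Test ID", "Test Name", "Variable Tested", "Date Started", "Date Ended",
    "Variant A Description", "Variant B Description", "Primary Metric",
    "Variant A Result", "Variant B Result", "Sample Size per Variant",
    "P-Value", "Statistically Significant?", "Winner",
    "Relative Improvement %", "Action Taken", "Notes/Context",
]

with open("iteration_log.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(FIELDS)
    # Placeholder first entry; replace with a real test you have run.
    writer.writerow([1, "Subject line: number vs. none", "Subject line",
                     "2024-03-01", "2024-03-03", "No number", "Includes '5'",
                     "Open rate", "18.2% (91/500)", "21.4% (107/500)", 500,
                     0.20, "No", "Inconclusive", "+17.6%",
                     "Re-test with larger list", ""])
```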
Deliverable: A populated iteration log with at least 3 entries and at least 1 summary principle.
Reflection: How would having 50 entries in this log, built over 18 months, change how you approach content and offer decisions? What is the compounding value of systematic testing knowledge?