Case Study 2: ShopSmart Conversion Funnel Analysis — SQL for A/B Test Feature Extraction

The Situation

ShopSmart is a mid-size e-commerce marketplace with 14 million monthly active users. The product team has run an A/B test on a redesigned checkout flow: the treatment group sees a streamlined single-page checkout; the control group sees the existing multi-step checkout. The test ran for 21 days with a 50/50 split across 2.8 million users.

The product team declared the test a success based on overall conversion rate. But the data science team suspects the results are more nuanced — the new checkout may be helping some user segments while hurting others. To investigate, they need to extract detailed funnel metrics from the event database and slice them by user behavior features.

The data lives in two tables:

-- User behavior events
CREATE TABLE page_events (
    event_id       BIGINT PRIMARY KEY,
    user_id        BIGINT NOT NULL,
    event_type     VARCHAR(30),     -- 'page_view', 'search', 'product_view',
                                    -- 'add_to_cart', 'checkout_start', 'purchase'
    event_time     TIMESTAMP NOT NULL,
    page_url       VARCHAR(255),
    session_id     VARCHAR(50) NOT NULL,
    device_type    VARCHAR(20),     -- 'desktop', 'mobile', 'tablet'
    referral_source VARCHAR(30)     -- 'organic', 'paid_search', 'social', 'email', 'direct'
);

-- A/B test assignment
CREATE TABLE ab_assignments (
    user_id        BIGINT PRIMARY KEY,
    experiment_id  VARCHAR(50),     -- 'checkout_redesign_2025q1'
    variant        VARCHAR(20),     -- 'control', 'treatment'
    assigned_date  DATE NOT NULL
);

The page_events table has 340 million rows for the 21-day test period. The ab_assignments table has 2.8 million rows.

Phase 1: The Funnel Query

The first task is to compute the conversion funnel for each variant. The funnel has five stages: page view, product view, add to cart, checkout start, purchase. Each user either makes it to a stage or does not.

-- Conversion funnel by A/B variant
WITH user_funnel AS (
    SELECT
        pe.user_id,
        ab.variant,
        -- Did this user reach each funnel stage? (at least once during the test)
        BOOL_OR(pe.event_type = 'page_view') AS reached_page_view,
        BOOL_OR(pe.event_type = 'product_view') AS reached_product_view,
        BOOL_OR(pe.event_type = 'add_to_cart') AS reached_add_to_cart,
        BOOL_OR(pe.event_type = 'checkout_start') AS reached_checkout_start,
        BOOL_OR(pe.event_type = 'purchase') AS reached_purchase,
        -- Count of events at each stage
        COUNT(*) FILTER (WHERE pe.event_type = 'page_view') AS page_views,
        COUNT(*) FILTER (WHERE pe.event_type = 'product_view') AS product_views,
        COUNT(*) FILTER (WHERE pe.event_type = 'add_to_cart') AS cart_adds,
        COUNT(*) FILTER (WHERE pe.event_type = 'checkout_start') AS checkout_starts,
        COUNT(*) FILTER (WHERE pe.event_type = 'purchase') AS purchases
    FROM page_events pe
    INNER JOIN ab_assignments ab
        ON pe.user_id = ab.user_id
    WHERE ab.experiment_id = 'checkout_redesign_2025q1'
      AND pe.event_time >= ab.assigned_date::TIMESTAMP
      AND pe.event_time < (ab.assigned_date + INTERVAL '21 days')::TIMESTAMP
    GROUP BY pe.user_id, ab.variant
)
SELECT
    variant,
    COUNT(*) AS users,
    -- Stage reach rates
    ROUND(COUNT(*) FILTER (WHERE reached_product_view)::NUMERIC / COUNT(*), 4)
        AS product_view_rate,
    ROUND(COUNT(*) FILTER (WHERE reached_add_to_cart)::NUMERIC / COUNT(*), 4)
        AS add_to_cart_rate,
    ROUND(COUNT(*) FILTER (WHERE reached_checkout_start)::NUMERIC / COUNT(*), 4)
        AS checkout_start_rate,
    ROUND(COUNT(*) FILTER (WHERE reached_purchase)::NUMERIC / COUNT(*), 4)
        AS purchase_rate,
    -- Stage-to-stage conversion rates
    ROUND(COUNT(*) FILTER (WHERE reached_add_to_cart)::NUMERIC /
          NULLIF(COUNT(*) FILTER (WHERE reached_product_view), 0), 4)
        AS product_to_cart_rate,
    ROUND(COUNT(*) FILTER (WHERE reached_checkout_start)::NUMERIC /
          NULLIF(COUNT(*) FILTER (WHERE reached_add_to_cart), 0), 4)
        AS cart_to_checkout_rate,
    ROUND(COUNT(*) FILTER (WHERE reached_purchase)::NUMERIC /
          NULLIF(COUNT(*) FILTER (WHERE reached_checkout_start), 0), 4)
        AS checkout_to_purchase_rate
FROM user_funnel
GROUP BY variant
ORDER BY variant;
 variant   | users    | product_view_rate | add_to_cart_rate | checkout_start_rate | purchase_rate | product_to_cart_rate | cart_to_checkout_rate | checkout_to_purchase_rate
-----------+----------+-------------------+------------------+---------------------+---------------+----------------------+-----------------------+-------------------------
 control   | 1,400,000| 0.6821            | 0.2847           | 0.1423              | 0.0834        | 0.4174               | 0.4998                | 0.5862
 treatment | 1,400,000| 0.6834            | 0.2861           | 0.1498              | 0.0912        | 0.4185               | 0.5236                | 0.6087

The treatment group has a higher overall purchase rate (9.12% vs. 8.34% — a 9.4% relative lift). But look at where the improvement happens: the cart_to_checkout_rate and checkout_to_purchase_rate both improved. The product-to-cart rate barely moved. The redesigned checkout is working as intended — it is not getting more people to add items to their cart, but it is getting more cart-adders to complete the purchase.
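As a sanity check on the lift arithmetic, the relative changes can be recomputed from the table above. A minimal Python sketch; the rates are copied from the query output, not re-derived from the database:

```python
# Funnel rates copied from the query output above (control vs. treatment).
control   = {"purchase_rate": 0.0834, "cart_to_checkout": 0.4998, "checkout_to_purchase": 0.5862}
treatment = {"purchase_rate": 0.0912, "cart_to_checkout": 0.5236, "checkout_to_purchase": 0.6087}

def relative_lift(t, c):
    """Relative lift of treatment over control."""
    return (t - c) / c

for metric in control:
    print(f"{metric}: {relative_lift(treatment[metric], control[metric]):+.1%}")
```

The overall purchase-rate lift compounds the two stage-level improvements, which is why it exceeds either stage lift on its own.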

Phase 2: Segmented Analysis with Window Functions

The product team's overall result masks important variation. The data science team wants to know: does the treatment help all user segments equally?

-- User behavior features for segmented analysis
WITH user_behavior AS (
    SELECT
        pe.user_id,
        ab.variant,
        -- Purchase outcome (binary target for the A/B test)
        MAX(CASE WHEN pe.event_type = 'purchase' THEN 1 ELSE 0 END) AS purchased,
        -- Behavioral features
        COUNT(DISTINCT pe.session_id) AS total_sessions,
        COUNT(*) FILTER (WHERE pe.event_type = 'page_view') AS total_page_views,
        COUNT(DISTINCT DATE(pe.event_time)) AS active_days,
        -- Device usage
        MODE() WITHIN GROUP (ORDER BY pe.device_type) AS primary_device,
        COUNT(DISTINCT pe.device_type) AS device_count,
        -- Referral source
        MODE() WITHIN GROUP (ORDER BY pe.referral_source) AS primary_referral,
        -- Session depth: avg page views per session
        COUNT(*) FILTER (WHERE pe.event_type = 'page_view')::NUMERIC /
            NULLIF(COUNT(DISTINCT pe.session_id), 0) AS avg_pages_per_session,
        -- Cart abandonment: added to cart but didn't purchase in that session
        COUNT(DISTINCT CASE
            WHEN pe.event_type = 'add_to_cart'
                 AND pe.session_id NOT IN (
                     SELECT session_id FROM page_events
                     WHERE event_type = 'purchase' AND user_id = pe.user_id
                 )
            THEN pe.session_id
        END) AS abandoned_cart_sessions
    FROM page_events pe
    INNER JOIN ab_assignments ab ON pe.user_id = ab.user_id
    WHERE ab.experiment_id = 'checkout_redesign_2025q1'
      AND pe.event_time >= ab.assigned_date::TIMESTAMP
      AND pe.event_time < (ab.assigned_date + INTERVAL '21 days')::TIMESTAMP
    GROUP BY pe.user_id, ab.variant
),
user_segments AS (
    SELECT
        *,
        -- Engagement segment
        NTILE(4) OVER (ORDER BY total_sessions) AS engagement_quartile,
        -- Browsing depth segment
        NTILE(4) OVER (ORDER BY avg_pages_per_session) AS depth_quartile,
        -- Cart abandoner flag
        CASE WHEN abandoned_cart_sessions > 0 THEN 'abandoner' ELSE 'non_abandoner' END
            AS abandoner_segment
    FROM user_behavior
)
SELECT
    primary_device,
    engagement_quartile,
    variant,
    COUNT(*) AS users,
    ROUND(AVG(purchased::NUMERIC), 4) AS purchase_rate,
    ROUND(AVG(total_sessions::NUMERIC), 1) AS avg_sessions,
    ROUND(AVG(avg_pages_per_session), 1) AS avg_depth
FROM user_segments
GROUP BY primary_device, engagement_quartile, variant
ORDER BY primary_device, engagement_quartile, variant;
 primary_device | engagement_quartile | variant   | users   | purchase_rate | avg_sessions | avg_depth
----------------+---------------------+-----------+---------+---------------+--------------+----------
 desktop        | 1                   | control   |  87,500 | 0.0312        | 1.2          | 2.4
 desktop        | 1                   | treatment |  87,500 | 0.0341        | 1.2          | 2.4
 desktop        | 4                   | control   |  87,500 | 0.2187        | 8.7          | 6.1
 desktop        | 4                   | treatment |  87,500 | 0.2234        | 8.8          | 6.2
 mobile         | 1                   | control   | 131,250 | 0.0198        | 1.1          | 1.8
 mobile         | 1                   | treatment | 131,250 | 0.0287        | 1.1          | 1.8
 mobile         | 4                   | control   | 131,250 | 0.1456        | 7.2          | 4.3
 mobile         | 4                   | treatment | 131,250 | 0.1789        | 7.3          | 4.4

The segmented results reveal the story. The new checkout helps mobile users substantially more than desktop users. The biggest lift is in the mobile/low-engagement segment: 0.0198 to 0.0287 is a 45% relative increase. For high-engagement desktop users, the lift is only about 2%. The streamlined single-page checkout eliminates friction that disproportionately affected mobile users navigating a multi-step flow on small screens.

This is the kind of insight that a single "overall conversion rate" metric hides.
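The per-segment lifts quoted above can be recomputed directly from the printed table. A sketch keyed by (device, engagement quartile), using the rates as given:

```python
# (control_rate, treatment_rate) per segment, copied from the output above.
segments = {
    ("desktop", 1): (0.0312, 0.0341),
    ("desktop", 4): (0.2187, 0.2234),
    ("mobile", 1):  (0.0198, 0.0287),
    ("mobile", 4):  (0.1456, 0.1789),
}
for (device, quartile), (ctrl, treat) in sorted(segments.items()):
    lift = (treat - ctrl) / ctrl
    print(f"{device}/Q{quartile}: {lift:+.1%}")
```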

Phase 3: Time-to-Purchase Analysis

The product team also wants to know: does the new checkout reduce the time from first visit to purchase? A pair of CTEs with conditional aggregates over session boundaries makes this straightforward.

-- Time from first event in session to purchase
WITH session_bounds AS (
    SELECT
        pe.user_id,
        pe.session_id,
        ab.variant,
        MIN(pe.event_time) AS session_start,
        MAX(CASE WHEN pe.event_type = 'purchase' THEN pe.event_time END) AS purchase_time,
        MIN(CASE WHEN pe.event_type = 'add_to_cart' THEN pe.event_time END) AS first_cart_add
    FROM page_events pe
    INNER JOIN ab_assignments ab ON pe.user_id = ab.user_id
    WHERE ab.experiment_id = 'checkout_redesign_2025q1'
      AND pe.event_time >= ab.assigned_date::TIMESTAMP
      AND pe.event_time < (ab.assigned_date + INTERVAL '21 days')::TIMESTAMP
    GROUP BY pe.user_id, pe.session_id, ab.variant
),
purchase_sessions AS (
    SELECT
        user_id,
        session_id,
        variant,
        EXTRACT(EPOCH FROM (purchase_time - session_start)) / 60.0
            AS session_to_purchase_minutes,
        EXTRACT(EPOCH FROM (purchase_time - first_cart_add)) / 60.0
            AS cart_to_purchase_minutes
    FROM session_bounds
    WHERE purchase_time IS NOT NULL
)
SELECT
    variant,
    COUNT(*) AS purchase_sessions,
    ROUND(AVG(session_to_purchase_minutes)::NUMERIC, 1) AS avg_session_to_purchase_min,
    ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY session_to_purchase_minutes)::NUMERIC, 1)
        AS median_session_to_purchase_min,
    ROUND(AVG(cart_to_purchase_minutes)::NUMERIC, 1) AS avg_cart_to_purchase_min,
    ROUND(PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY cart_to_purchase_minutes)::NUMERIC, 1)
        AS median_cart_to_purchase_min
FROM purchase_sessions
GROUP BY variant;
 variant   | purchase_sessions | avg_session_to_purchase_min | median_session_to_purchase_min | avg_cart_to_purchase_min | median_cart_to_purchase_min
-----------+-------------------+----------------------------+-------------------------------+-------------------------+----------------------------
 control   |          116,760  |                       14.3 |                          11.2 |                     6.8 |                        4.9
 treatment |          127,680  |                       12.7 |                           9.8 |                     4.2 |                        2.8

The treatment group completes purchases faster: the median time from session start to purchase drops by 1.4 minutes (9.8 vs. 11.2), and the median time from cart add to purchase drops by 2.1 minutes (2.8 vs. 4.9). The streamlined checkout is not just converting more users; it is converting them faster, which reduces the chance of distraction or abandonment during the checkout process.
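The gap between mean and median in both columns is the signature of a right-skewed distribution: a minority of very slow purchases pulls the mean up. A tiny simulation with a hypothetical lognormal distribution (not fitted to ShopSmart data) reproduces the pattern:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical right-skewed time-to-purchase distribution (lognormal).
minutes = rng.lognormal(mean=2.0, sigma=0.8, size=100_000)
print(f"mean:   {np.mean(minutes):5.1f} min")
print(f"median: {np.median(minutes):5.1f} min")
```

For a lognormal, the mean always exceeds the median, which is why the median is usually the headline number for time-to-event metrics.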

Phase 4: Building the A/B Test Feature Table

For the final analysis, the data science team wants a complete per-user feature table that can be loaded into Python for statistical testing and heterogeneous treatment effect analysis.

-- Complete A/B test feature table for Python analysis
WITH user_events AS (
    SELECT
        pe.user_id,
        ab.variant,
        pe.event_type,
        pe.event_time,
        pe.session_id,
        pe.device_type,
        pe.referral_source
    FROM page_events pe
    INNER JOIN ab_assignments ab ON pe.user_id = ab.user_id
    WHERE ab.experiment_id = 'checkout_redesign_2025q1'
      AND pe.event_time >= ab.assigned_date::TIMESTAMP
      AND pe.event_time < (ab.assigned_date + INTERVAL '21 days')::TIMESTAMP
),

user_features AS (
    SELECT
        user_id,
        variant,
        -- Outcome
        MAX(CASE WHEN event_type = 'purchase' THEN 1 ELSE 0 END) AS purchased,
        COUNT(*) FILTER (WHERE event_type = 'purchase') AS purchase_count,

        -- Engagement features
        COUNT(DISTINCT session_id) AS total_sessions,
        COUNT(DISTINCT DATE(event_time)) AS active_days,
        COUNT(*) FILTER (WHERE event_type = 'page_view') AS page_views,
        COUNT(*) FILTER (WHERE event_type = 'product_view') AS product_views,
        COUNT(*) FILTER (WHERE event_type = 'add_to_cart') AS cart_adds,
        COUNT(*) FILTER (WHERE event_type = 'checkout_start') AS checkout_starts,

        -- Device and referral
        MODE() WITHIN GROUP (ORDER BY device_type) AS primary_device,
        COUNT(DISTINCT device_type) AS device_types_used,
        MODE() WITHIN GROUP (ORDER BY referral_source) AS primary_referral,

        -- Session characteristics
        COUNT(*) FILTER (WHERE event_type = 'page_view')::NUMERIC /
            NULLIF(COUNT(DISTINCT session_id), 0) AS avg_pages_per_session,

        -- Temporal features
        MIN(event_time) AS first_event_time,
        MAX(event_time) AS last_event_time,
        EXTRACT(EPOCH FROM (MAX(event_time) - MIN(event_time))) / 86400.0
            AS active_span_days
    FROM user_events
    GROUP BY user_id, variant
),

session_level AS (
    -- Per-session purchase flag for cart abandonment calculation
    SELECT
        user_id,
        session_id,
        BOOL_OR(event_type = 'add_to_cart') AS had_cart_add,
        BOOL_OR(event_type = 'purchase') AS had_purchase
    FROM user_events
    GROUP BY user_id, session_id
),

abandonment AS (
    SELECT
        user_id,
        COUNT(*) FILTER (WHERE had_cart_add AND NOT had_purchase) AS abandoned_sessions,
        COUNT(*) FILTER (WHERE had_cart_add) AS cart_sessions,
        ROUND(
            COUNT(*) FILTER (WHERE had_cart_add AND NOT had_purchase)::NUMERIC /
            NULLIF(COUNT(*) FILTER (WHERE had_cart_add), 0), 4
        ) AS cart_abandonment_rate
    FROM session_level
    GROUP BY user_id
)

SELECT
    uf.user_id,
    uf.variant,
    uf.purchased,
    uf.purchase_count,
    uf.total_sessions,
    uf.active_days,
    uf.page_views,
    uf.product_views,
    uf.cart_adds,
    uf.checkout_starts,
    uf.primary_device,
    uf.device_types_used,
    uf.primary_referral,
    ROUND(uf.avg_pages_per_session::NUMERIC, 2) AS avg_pages_per_session,
    ROUND(uf.active_span_days::NUMERIC, 1) AS active_span_days,
    COALESCE(ab.abandoned_sessions, 0) AS abandoned_sessions,
    COALESCE(ab.cart_abandonment_rate, 0) AS cart_abandonment_rate
FROM user_features uf
LEFT JOIN abandonment ab ON uf.user_id = ab.user_id;

The result: 2.8 million rows, 17 columns. Materialized as a table (the Python below assumes CREATE TABLE ab_test_features AS wrapped around the query above), it is ready for pd.read_sql(). From here, the data science team can run statistical tests (Chapter 3 revisited), build heterogeneous treatment effect models, and segment the analysis by any combination of behavioral features.

import pandas as pd
from sqlalchemy import create_engine, text

engine = create_engine('postgresql://analyst:***@warehouse:5432/shopsmart')
df = pd.read_sql(text("SELECT * FROM ab_test_features"), engine)

# Quick statistical test: overall purchase rate difference
from scipy.stats import chi2_contingency

contingency = pd.crosstab(df['variant'], df['purchased'])
chi2, p_value, dof, expected = chi2_contingency(contingency)
print(f"Chi-squared: {chi2:.1f}, p-value: {p_value:.2e}")
print(f"Control purchase rate: {df[df.variant == 'control']['purchased'].mean():.4f}")
print(f"Treatment purchase rate: {df[df.variant == 'treatment']['purchased'].mean():.4f}")
Chi-squared: 487.3, p-value: 2.14e-107
Control purchase rate: 0.0834
Treatment purchase rate: 0.0912
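The chi-squared test covers the overall effect; for per-segment effects, a two-proportion z-test on each segment's counts is a common follow-up. A standard-library sketch, with counts reconstructed from the mobile/low-engagement row of the segment table (so the exact counts are approximations):

```python
import math

def two_proportion_ztest(x1, n1, x2, n2):
    """Two-sided z-test for a difference in proportions (pooled standard error)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p2 - p1) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided, normal approximation
    return z, p_value

# Mobile / lowest-engagement quartile: rates x 131,250 users per arm.
n = 131_250
z, p = two_proportion_ztest(round(0.0198 * n), n, round(0.0287 * n), n)
print(f"z = {z:.1f}, p = {p:.1e}")
```

Testing many segments this way inflates the false-positive rate, so a multiple-comparisons correction (or a single interaction model) belongs in the Python analysis.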

Phase 5: Optimization for Production

The A/B test analysis is a one-time effort, but ShopSmart runs experiments continuously. The data science team builds a reusable pattern:

-- Indexes for experiment analysis queries
CREATE INDEX idx_page_events_user_time
    ON page_events (user_id, event_time);

CREATE INDEX idx_page_events_session_type
    ON page_events (session_id, event_type);

CREATE INDEX idx_ab_assignments_experiment
    ON ab_assignments (experiment_id, variant);

The indexes cut the funnel query runtime from 3 minutes 40 seconds to 52 seconds. On the 340M-row page_events table, that is the difference between a query you run once and wait for and a query you can iterate on interactively.

Production Tip — For A/B test analysis, consider creating a pre-computed "session summary" table that aggregates page_events at the session level. This reduces 340M rows to approximately 28M sessions, and most A/B test queries operate at the session or user level, not the event level. The materialized-session-summary pattern saves time on every subsequent analysis.
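The session-summary pattern itself is simple to illustrate. A toy pandas version with a hypothetical five-event log (in production this would be a materialized table maintained in the warehouse, not an in-memory DataFrame):

```python
import pandas as pd

# Toy event log with the same columns the text uses for page_events.
events = pd.DataFrame({
    "user_id":    [1, 1, 1, 2, 2],
    "session_id": ["s1", "s1", "s1", "s2", "s2"],
    "event_type": ["page_view", "add_to_cart", "purchase", "page_view", "add_to_cart"],
})

# One row per session: event count plus cart-add and purchase flags,
# mirroring the session_level CTE in Phase 4.
sessions = (
    events.groupby(["user_id", "session_id"])
    .agg(
        events=("event_type", "size"),
        had_cart_add=("event_type", lambda s: bool((s == "add_to_cart").any())),
        had_purchase=("event_type", lambda s: bool((s == "purchase").any())),
    )
    .reset_index()
)
print(sessions)
```

Most funnel, abandonment, and time-to-purchase queries need only these session-level flags, which is why the pre-aggregation pays for itself quickly.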

Analysis Questions

  1. The overall purchase rate for the treatment group was 9.12% vs. 8.34% for control — a 9.4% relative lift. But the mobile/low-engagement segment saw a 45% relative lift, while the desktop/high-engagement segment saw only a 2% lift. How should this information change the product team's rollout decision? Would you recommend rolling out to all users, or only to mobile users?

  2. The funnel analysis shows that the treatment improved cart_to_checkout_rate and checkout_to_purchase_rate but barely moved product_to_cart_rate. This makes sense — the checkout redesign should not affect browsing behavior. But what if the treatment had improved the product-to-cart rate? What would that tell you about the experiment, and should you be concerned?

  3. The time-to-purchase analysis used PERCENTILE_CONT(0.5) (the median) alongside the mean. The median was consistently lower than the mean, suggesting a right-skewed distribution. Why is the median a better summary statistic than the mean for time-to-purchase? When would you prefer the mean?

  4. The cart abandonment rate was computed per-session: "sessions where the user added to cart but did not purchase." An alternative definition is per-user: "users who added to cart in any session but never purchased in any session." Write the SQL for the per-user definition. How might the two definitions lead to different conclusions?

  5. The MODE() WITHIN GROUP (ORDER BY device_type) aggregate assigns a single primary device to each user. But a user who makes 10 visits on mobile and 1 on desktop might have different behavior on each device. Propose a SQL feature that captures multi-device behavior better than a single mode. Hint: consider a ratio or a count.

  6. This analysis joins page_events to ab_assignments with an INNER JOIN. This means users who were assigned to the experiment but never visited the site are excluded. How does this affect the purchase rate calculation? Is an INNER JOIN or a LEFT JOIN the correct choice for measuring the experiment's effect?