Case Study 1: Automating Grade Categories — Jordan's First Script

Contributors to Introduction to Data Science

Tier 3 — Illustrative/Composite Example: Jordan is a fictional composite character. This case study illustrates a common data science scenario — automating a repetitive categorization task — using a realistic but invented dataset. The university, courses, and grade data described here are fictional. The frustrations, mistakes, and "aha" moments are drawn from typical beginner experiences in learning to program.

The Problem

It's a Wednesday evening in the library, and Jordan has a problem. For a sociology research project on grading equity, they've been given a spreadsheet containing 200 exam scores from four sections of Introduction to Sociology. Their advisor wants a simple breakdown: how many A's, B's, C's, D's, and F's in each section? Are the distributions roughly the same, or do some sections show suspiciously different patterns?

Jordan's first instinct is to do this in a spreadsheet. And honestly, for 200 grades split across 4 sections, that would work. They could sort, filter, use COUNTIF formulas, maybe build a pivot table. They've done it before.

But there's a catch. This is just the pilot study. If the initial findings are interesting, Jordan's advisor wants to expand the analysis to every section of every introductory course across the entire university — potentially thousands of grades spanning multiple semesters. Doing that in a spreadsheet would be a nightmare of copy-paste, manual formula adjustments, and the ever-present risk of accidentally sorting one column without sorting the others (every spreadsheet user's worst memory).

Jordan decides this is the moment to use Python. Not because it's required — the spreadsheet would work fine for 200 grades — but because if the analysis needs to scale, they want code that works for 200 grades and 20,000 grades without modification.

This is a key data science instinct: build for the general case, even when you're starting with a specific one. Functions make this possible.

Setting Up the Data

Jordan starts a new Jupyter notebook and creates sample data for the first section. (In a later chapter, they'll learn to load real data from CSV files. For now, typing in a small sample is fine for building and testing the logic.)

# Section A grades (Professor Martinez)
section_a = [88, 92, 75, 67, 95, 81, 73, 58,
             84, 90, 62, 71, 87, 93, 79, 55,
             86, 77, 69, 91]

Twenty grades. Enough to test with, small enough to verify by hand.

Step 1: The Naive Approach (and Why It Hurts)

Jordan's first attempt is straightforward — just count directly:

a_count = 0
b_count = 0
c_count = 0
d_count = 0
f_count = 0

for grade in section_a:
    if grade >= 90:
        a_count = a_count + 1
    elif grade >= 80:
        b_count = b_count + 1
    elif grade >= 70:
        c_count = c_count + 1
    elif grade >= 60:
        d_count = d_count + 1
    else:
        f_count = f_count + 1

print("Section A Grade Distribution:")
print(f"  A: {a_count}")
print(f"  B: {b_count}")
print(f"  C: {c_count}")
print(f"  D: {d_count}")
print(f"  F: {f_count}")

Output:

Section A Grade Distribution:
  A: 5
  B: 5
  C: 5
  D: 3
  F: 2

It works. Jordan verifies by hand: grades 90+ are 92, 95, 90, 93, 91 — that's 5 A's. Checks out.
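The hand check can also be written as code. The following is a minimal sketch, assuming the section_a list defined earlier; the names a_grades and total_counted are introduced here only for illustration.

```python
# Section A grades, repeated here so the snippet runs on its own
section_a = [88, 92, 75, 67, 95, 81, 73, 58,
             84, 90, 62, 71, 87, 93, 79, 55,
             86, 77, 69, 91]

# Collect the A-range grades so they can be inspected directly
a_grades = []
for grade in section_a:
    if grade >= 90:
        a_grades.append(grade)

print(a_grades)        # [92, 95, 90, 93, 91]
print(len(a_grades))   # 5

# The five category counts printed above should account for every grade
total_counted = 5 + 5 + 5 + 3 + 2
assert total_counted == len(section_a)
```

Checking that the category counts add up to the length of the list is a cheap way to catch a grade that was counted twice or not at all.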

But then Jordan adds Section B:

section_b = [72, 85, 91, 63, 78, 88, 94, 70,
             82, 76, 59, 87, 68, 81, 74, 96,
             65, 83, 77, 89]

And they realize they need to copy the entire counting block, rename all the counter variables to avoid conflicts (a_count_b, b_count_b...), and update the print statements. For four sections, that's the same 20 lines repeated four times with slight variations.

Jordan remembers the DRY principle from the chapter: Don't Repeat Yourself. Time for a function.

Step 2: Writing the First Function

def count_grades(grades):
    """Count grades by letter category and return the counts."""
    a_count = 0
    b_count = 0
    c_count = 0
    d_count = 0
    f_count = 0

    for grade in grades:
        if grade >= 90:
            a_count = a_count + 1
        elif grade >= 80:
            b_count = b_count + 1
        elif grade >= 70:
            c_count = c_count + 1
        elif grade >= 60:
            d_count = d_count + 1
        else:
            f_count = f_count + 1

    return a_count, b_count, c_count, d_count, f_count

Now analyzing any section takes a single function call:

a, b, c, d, f = count_grades(section_a)
print(f"Section A — A:{a} B:{b} C:{c} D:{d} F:{f}")

a, b, c, d, f = count_grades(section_b)
print(f"Section B — A:{a} B:{b} C:{c} D:{d} F:{f}")

Output:

Section A — A:5 B:5 C:5 D:3 F:2
Section B — A:3 B:7 C:6 D:3 F:1

Two lines per section instead of roughly twenty. And it works for any list of grades: 20 items, 200 items, 20,000 items.

Step 3: The First Bug

Jordan adds Section C's data and, as a comparison, runs it through the original copy-paste code from Step 1. The results look odd: 25 grades are categorized, but Section C only has 20. After some panicked staring, Jordan realizes the bug: they forgot to re-initialize all of the counter variables between sections in their original approach, so counts left over from an earlier section leaked into Section C's totals. The function version doesn't have this problem because each function call creates fresh local variables. Scope, which seemed like an abstract concept in the chapter, just saved Jordan from a real data error.

This is a lesson worth internalizing: functions don't just save typing — they prevent bugs. Each call to count_grades starts with fresh counters. There's no risk of leftover data from a previous section leaking into the next one.
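The difference can be reproduced in a few lines. The sketch below is an assumption about how such a bug might arise, not Jordan's actual notebook: the lists first_section and second_section are hypothetical, and the simplified function count_a_grades tracks only the A count.

```python
first_section = [95, 91, 72]    # hypothetical data: two A's
second_section = [93, 64, 58]   # hypothetical data: one A

# Copy-paste style: a global counter that is never reset between sections
a_count = 0
for grade in first_section:
    if grade >= 90:
        a_count = a_count + 1

# (the line "a_count = 0" was forgotten here)
for grade in second_section:
    if grade >= 90:
        a_count = a_count + 1

stale_result = a_count
print(stale_result)   # 3, although the second section has only one A


def count_a_grades(grades):
    """Count A grades using a counter that is local to this call."""
    a_count = 0   # a fresh local variable on every call
    for grade in grades:
        if grade >= 90:
            a_count = a_count + 1
    return a_count


print(count_a_grades(first_section))    # 2
print(count_a_grades(second_section))   # 1
```

The local a_count inside the function is a different variable from the global one, so nothing carries over between calls.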

Step 4: Adding Validation

Jordan's advisor mentions that the full dataset might have errors — grades recorded as 105 (probably meant 100), or -1 (a placeholder for "not yet graded"), or even text like "incomplete." Jordan decides to add validation:

def is_valid_grade(value):
    """Check if a value is a valid numeric grade between 0 and 100."""
    if not isinstance(value, (int, float)):
        return False
    if value < 0 or value > 100:
        return False
    return True

def count_grades(grades):
    """Count grades by letter category, skipping invalid values."""
    a_count = 0
    b_count = 0
    c_count = 0
    d_count = 0
    f_count = 0
    invalid_count = 0

    for grade in grades:
        if not is_valid_grade(grade):
            invalid_count = invalid_count + 1
        elif grade >= 90:
            a_count = a_count + 1
        elif grade >= 80:
            b_count = b_count + 1
        elif grade >= 70:
            c_count = c_count + 1
        elif grade >= 60:
            d_count = d_count + 1
        else:
            f_count = f_count + 1

    return a_count, b_count, c_count, d_count, f_count, invalid_count

Now Jordan can test with messy data:

messy_section = [88, 92, -1, 75, "incomplete", 105, 67, 81]
a, b, c, d, f, invalid = count_grades(messy_section)
print(f"Valid grades: {a+b+c+d+f}")
print(f"Invalid entries: {invalid}")

Output:

Valid grades: 5
Invalid entries: 3

The three invalid values (-1, "incomplete", 105) are caught and counted separately rather than silently corrupting the results.
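Two edge cases are worth knowing about, because is_valid_grade as written accepts them. Python treats True and False as integers, so True would count as a grade of 1, and float("nan") fails both range comparisons, so it is not rejected. A stricter variant might look like the sketch below. The name is_strictly_valid_grade is hypothetical, and whether these values should count as invalid is an assumption about the dataset.

```python
import math


def is_strictly_valid_grade(value):
    """Like is_valid_grade, but also rejects booleans and NaN."""
    # bool is a subclass of int, so it has to be excluded explicitly
    if isinstance(value, bool):
        return False
    if not isinstance(value, (int, float)):
        return False
    # NaN compares False against everything, so check for it directly
    if isinstance(value, float) and math.isnan(value):
        return False
    return 0 <= value <= 100


print(is_strictly_valid_grade(88))            # True
print(is_strictly_valid_grade(True))          # False
print(is_strictly_valid_grade(float("nan")))  # False
print(is_strictly_valid_grade("incomplete"))  # False
```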

Step 5: Building the Report

Jordan writes one more function to tie everything together:

def section_report(section_name, grades):
    """Generate a complete grade report for one section."""
    a, b, c, d, f, invalid = count_grades(grades)
    total_valid = a + b + c + d + f

    print(f"\n{'='*40}")
    print(f"Grade Report: {section_name}")
    print(f"{'='*40}")
    print(f"Total entries: {len(grades)}")

    if invalid > 0:
        print(f"Invalid entries removed: {invalid}")

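    print(f"Valid grades: {total_valid}")
    if total_valid == 0:
        # Avoid dividing by zero when a section has no usable grades
        print("  No valid grades to summarize.")
        return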
    print(f"  A (90-100): {a:3d}  ({a/total_valid*100:.0f}%)")
    print(f"  B (80-89):  {b:3d}  ({b/total_valid*100:.0f}%)")
    print(f"  C (70-79):  {c:3d}  ({c/total_valid*100:.0f}%)")
    print(f"  D (60-69):  {d:3d}  ({d/total_valid*100:.0f}%)")
    print(f"  F (0-59):   {f:3d}  ({f/total_valid*100:.0f}%)")

    # Compute average
    total_sum = 0
    for grade in grades:
        if is_valid_grade(grade):
            total_sum = total_sum + grade
    average = total_sum / total_valid
    print(f"  Average:  {average:.1f}")

Now generating a full report for all four sections is clean (section_c and section_d are lists of scores defined the same way as section_a and section_b):

sections = {
    "Section A (Prof. Martinez)": section_a,
    "Section B (Prof. Chen)": section_b,
    "Section C (Prof. Williams)": section_c,
    "Section D (Prof. Patel)": section_d,
}

for name, grades in sections.items():
    section_report(name, grades)

(We're using a dictionary and .items() here — concepts from Chapter 5. For now, just notice the pattern: a loop that processes each section by calling the same function.)
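To see what the loop receives on each pass, here is a minimal sketch with a shortened, hypothetical dictionary. The names small_sections and summary and the scores themselves are invented for illustration.

```python
# Hypothetical two-section dictionary, shortened for illustration
small_sections = {
    "Section X": [91, 78, 85],
    "Section Y": [66, 94],
}

# .items() yields one (key, value) pair per entry; the loop unpacks each pair
# into a section name and that section's list of grades
summary = []
for name, grades in small_sections.items():
    line = f"{name}: {len(grades)} grades, highest {max(grades)}"
    summary.append(line)
    print(line)

# Prints:
# Section X: 3 grades, highest 91
# Section Y: 2 grades, highest 94
```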

What Jordan Learned

Jordan's experience illustrates several key lessons from Chapter 4:

1. Start specific, then generalize. Jordan's first version was hardcoded for one section. By wrapping it in a function, the same logic works for any section, any course, any semester.

2. Functions prevent bugs. The scope issue — leftover counter variables from one section contaminating the next — disappeared the moment Jordan used functions. Each function call gets a clean slate.

3. Validation is not optional. Real data has errors. Building validation into the pipeline from the beginning (instead of discovering the errors after the analysis) saves hours of frustration.

4. Decomposition makes code readable. Jordan's final program has three functions (is_valid_grade, count_grades, and section_report) plus the main loop that ties them together. Each function has a clear name and a clear purpose. Someone reading the code, including future Jordan, can understand what's happening without tracing through every line.

5. The same pattern scales. The code that works for 20 grades in one section will work for 2,000 grades in a hundred sections. Jordan didn't need to change a single line of the functions — just feed them more data.
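The fifth lesson can be tried out directly. The sketch below repeats the counting logic from Step 2 so that it runs on its own, then feeds it 20,000 simulated scores. The simulated data is an assumption made purely for the demonstration (uniformly random integers, which real grades are not), and the names simulated and counts are hypothetical.

```python
import random


def count_grades(grades):
    """Count grades by letter category (same logic as Step 2)."""
    a_count = b_count = c_count = d_count = f_count = 0
    for grade in grades:
        if grade >= 90:
            a_count += 1
        elif grade >= 80:
            b_count += 1
        elif grade >= 70:
            c_count += 1
        elif grade >= 60:
            d_count += 1
        else:
            f_count += 1
    return a_count, b_count, c_count, d_count, f_count


random.seed(4)  # fixed seed so the simulated data is repeatable
simulated = [random.randint(0, 100) for _ in range(20000)]  # invented scores

counts = count_grades(simulated)
print(sum(counts))  # 20000: every simulated grade lands in exactly one category
```

The function body is unchanged from the 20-grade version; only the input list is larger.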

Looking Ahead

Jordan's grade analysis is just beginning. In Chapter 5, they'll learn to store grade data in dictionaries instead of separate lists. In Chapter 6, they'll load real grade data from a CSV file. In later chapters, they'll visualize the distributions, run statistical tests to determine whether the differences between sections are significant, and present findings to the sociology department.

But the foundation — clean functions that validate, categorize, and report — was built right here in Chapter 4.

Discussion Questions:

1. Jordan chose cutoffs of 90/80/70/60 for letter grades. What would need to change in the code if their university used different cutoffs (say, 93/85/77/70)? How many lines would they need to edit?

2. Why did Jordan write is_valid_grade as a separate function instead of putting the validation logic directly inside count_grades? What advantage does the separate function provide?

3. In the "naive approach," Jordan's counter variables existed in the global scope. In the function version, they exist in the function's local scope. Explain how this difference prevents bugs when analyzing multiple sections.