Chapter 7 Exercises: Data Structures

These exercises progress through five tiers of difficulty. Work through them in order — each tier builds on the skills developed in the previous one. All exercises use realistic business scenarios.

Tier 1: Foundations (Exercises 1–4)

Practice creating and accessing Python data structures. No functions required.

Exercise 1: Build Your First Product Catalog

Context: You've just joined the e-commerce team at Meridian Office Supplies. Your first task is to represent the product catalog in Python.

Task: Create a Python list called product_catalog containing at least 6 product names as strings. Then:

Print the first product using indexing.
Print the last product using negative indexing.
Print the total number of products using len().
Print the second through fourth products (inclusive) using slicing.
Check whether "Ergonomic Chair" is in your catalog and print a message saying so.

Expected Output Format:

First product: [your product]
Last product: [your product]
Total products: 6
Products 2-4: [your slice]
'Ergonomic Chair' in catalog: True/False

Hint: Remember that indexing starts at 0.

Exercise 2: Create a Customer Record

Context: Carla in the sales team needs a customer record for her newest account, Pinnacle Group.

Task: Create a dictionary called customer representing a single customer with these fields: - customer_id: "CUST-1042" - company_name: "Pinnacle Group" - contact_name: "Derek Huang" - tier: "silver" - annual_spend_usd: 28500.00 - region: "West" - active: True

Then: 1. Print the customer's company name and tier. 2. Print their annual spend formatted as currency: $28,500.00. 3. Add a new key "account_manager" with the value "Carla Nguyen". 4. Update the tier to "gold". 5. Use .get() to retrieve the "preferred_currency" key, defaulting to "USD" if absent. 6. Print all keys and values using a for loop with .items().

Exercise 3: Working with Tuples

Context: Acme Corp stores its office locations as fixed coordinate records.

Task: 1. Create a named tuple called OfficeLocation with fields: city, state, latitude, longitude. 2. Create three office location instances for: New York (40.7128, -74.0060), Chicago (41.8781, -87.6298), and San Francisco (37.7749, -122.4194). 3. Store all three in a list called offices. 4. Use a for loop to print each office's city, state, and coordinates. 5. Try to assign a new latitude to one of the named tuples and observe the error (you can comment out this line after testing).

Bonus: Use tuple unpacking in the loop instead of dot notation.

Exercise 4: Sets for Market Segment Analysis

Context: Sandra is evaluating which market segments two new product lines cover.

Task: Create two sets: - product_line_a_segments: {"retail", "wholesale", "e-commerce", "enterprise", "education"} - product_line_b_segments: {"enterprise", "government", "healthcare", "e-commerce", "nonprofit"}

Then compute and print: 1. All segments covered by either line (union) 2. Segments covered by BOTH lines (intersection) 3. Segments covered only by Line A (difference) 4. Segments covered only by Line B (difference) 5. Segments that are unique to one line or the other (symmetric difference) 6. Whether Line A is a subset of the union 7. Whether {"enterprise", "e-commerce"} is a subset of Line A

Tier 2: Core Skills (Exercises 5–8)

Write functions and perform aggregations. Apply the patterns from the chapter.

Exercise 5: Sales Figures Analysis

Context: Priya has collected 20 days of daily sales figures for Q1.

Task: Start with this data:

daily_sales = [
    14200, 9800, 16500, 11300, 13700,
    8950, 17200, 12400, 15800, 10200,
    18400, 9100, 14600, 11900, 16200,
    13000, 10800, 19100, 8500, 15300
]

Write a function analyze_sales(sales: list[float]) -> dict that returns a dict containing: - total: sum of all values - average: mean value - maximum: highest value - minimum: lowest value - above_average_count: count of days where sales exceeded the average - best_day_index: index (0-based) of the highest day

Call the function and print each result in a formatted report.

Bonus: Add a weekly_totals key — a list of four weekly totals (5 days per week).

Exercise 6: Building a Sales Aggregation

Context: Marcus needs a summary of order totals by sales region.

Task: Start with this transaction list:

transactions = [
    ("ORD-2001", "Northeast", 4800.00),
    ("ORD-2002", "West",      9200.00),
    ("ORD-2003", "Southeast", 3100.50),
    ("ORD-2004", "Midwest",   6700.00),
    ("ORD-2005", "Northeast", 2250.75),
    ("ORD-2006", "West",      11400.00),
    ("ORD-2007", "Southeast", 5900.00),
    ("ORD-2008", "Midwest",   4200.25),
    ("ORD-2009", "Northeast", 7600.00),
    ("ORD-2010", "West",      3850.50),
]

Write a function aggregate_by_region(transactions: list[tuple]) -> dict[str, float] that returns a dict mapping each region to its total sales.

Then write a second function print_region_report(totals: dict[str, float]) -> None that prints a formatted table showing region, total, and percentage of grand total, sorted by total descending.

Exercise 7: List Comprehension Transformations

Context: Priya needs to clean and transform raw product data before loading it into the reporting system.

Task: Start with this raw product list:

raw_products = [
    {"sku": "P001", "name": " wireless headset ", "price": 149.99, "category": "audio",       "in_stock": True},
    {"sku": "P002", "name": "Standing Desk",       "price": 249.00, "category": "furniture",   "in_stock": True},
    {"sku": "P003", "name": "ergonomic mouse  ",   "price":  59.95, "category": "peripherals", "in_stock": False},
    {"sku": "P004", "name": " USB Hub ",           "price":  39.99, "category": "peripherals", "in_stock": True},
    {"sku": "P005", "name": "4K Webcam",           "price": 129.00, "category": "video",       "in_stock": True},
    {"sku": "P006", "name": "Monitor Arm",         "price":  89.00, "category": "furniture",   "in_stock": False},
]

Using list comprehensions (one per task), create: 1. A list of product names with leading/trailing whitespace stripped and .title() applied. 2. A list of SKUs for in-stock products only. 3. A list of (sku, price) tuples for products priced at $100 or more. 4. A list of product dicts with a new key "discounted_price" set to 90% of the original price, for out-of-stock products only. 5. A dict comprehension: {sku: name} for all in-stock products.

Exercise 8: Nested Structure — Table of Records

Context: Sandra wants a summary table showing each sales rep's name, region, number of deals closed, and total revenue for the month.

Task: Create a list of dicts called rep_performance with at least 6 entries, each containing: - rep_name, region, deals_closed, revenue_usd

Then write these functions (no pre-existing functions to start from — design them yourself):

top_rep_by_revenue(reps: list[dict]) -> dict — returns the rep with highest revenue
filter_by_region(reps: list[dict], region: str) -> list[dict] — returns reps in the given region
average_deal_size(reps: list[dict]) -> float — computes average revenue per deal across all reps
reps_above_target(reps: list[dict], target: float) -> list[str] — returns a list of rep names whose revenue exceeds the target

Print results from all four functions with formatted output.

Tier 3: Applied Patterns (Exercises 9–12)

Build complete, multi-function programs that solve a coherent business problem.

Exercise 9: Inventory Management System

Context: You are building a simple inventory tracker for Acme Corp's warehouse.

Task: Design and implement an inventory system using a dict-of-dicts structure where the outer key is SKU. Your system must include:

An initial inventory of at least 8 products (each with: name, category, quantity_on_hand, reorder_threshold, unit_cost_usd).
A function reorder_needed(inventory: dict) -> list[str] that returns a list of SKUs where quantity_on_hand <= reorder_threshold.
A function total_inventory_value(inventory: dict) -> float that returns the sum of quantity_on_hand * unit_cost_usd for all items.
A function adjust_stock(inventory: dict, sku: str, quantity_change: int) -> bool that adds or subtracts from quantity_on_hand (negative for a sale, positive for a restock). Prevent quantity from going below 0.
A function category_summary(inventory: dict) -> dict[str, dict] that returns, for each category, the number of SKUs and total inventory value.

Run the system: load inventory, check reorder list, adjust a few stock levels, print the category summary and total value.

Exercise 10: Employee Directory with Org Chart

Context: Marcus is building an employee directory for Acme Corp.

Task: Design a data structure that can represent employees AND their reporting relationships (who reports to whom). Each employee should have at minimum: employee_id, name, department, title, manager_id (the employee_id of their manager, or None for the CEO).

Create at least 12 employees across at least 3 departments.

Write these functions: 1. find_employee(directory: list[dict], employee_id: str) -> dict | None 2. direct_reports(directory: list[dict], manager_id: str) -> list[dict] — returns all employees who report directly to this manager 3. employees_by_department(directory: list[dict]) -> dict[str, list[dict]] — returns a dict of dept -> list of employees 4. org_chart(directory: list[dict], root_id: str, indent: int = 0) -> None — recursively prints an org chart starting from the given root employee

Exercise 11: Invoice Tracker

Context: Maya Reyes needs an invoice tracking system for her consulting practice.

Task: Create a list of invoice dicts. Each invoice should have: invoice_id, client_name, issue_date (string "YYYY-MM-DD"), due_date, amount_usd, status (one of: "draft", "sent", "paid", "overdue").

Create at least 10 invoices across at least 4 clients with mixed statuses.

Write these functions: 1. total_outstanding(invoices: list[dict]) -> float — sum of amounts with status "sent" or "overdue" 2. overdue_invoices(invoices: list[dict], reference_date: str) -> list[dict] — sent invoices past their due date 3. revenue_by_client(invoices: list[dict]) -> dict[str, float] — sum of paid invoices per client 4. mark_paid(invoices: list[dict], invoice_id: str) -> bool — updates status to "paid", returns False if not found 5. aging_report(invoices: list[dict], reference_date: str) -> None — prints outstanding invoices sorted by days overdue

Exercise 12: Sales Territory Rebalancing

Context: Sandra wants to rebalance Acme's sales territories based on current customer density and revenue.

Task: Start with a dataset of customer records, each including: customer_id, company_name, current_territory, annual_revenue_usd, city, state.

Create at least 15 customers spread across 4 territories.

Write these functions: 1. territory_stats(customers: list[dict]) -> dict[str, dict] — for each territory, compute: customer count, total revenue, average revenue 2. imbalance_score(stats: dict) -> float — computes the standard deviation of revenue across territories (a measure of imbalance) 3. customers_to_move(customers: list[dict], from_territory: str, to_territory: str, n: int) -> list[str] — returns the IDs of the N smallest customers to reassign (by revenue), to help rebalance 4. reassign_territory(customers: list[dict], customer_ids: list[str], new_territory: str) -> int — reassigns territory for the given IDs, returns count of changed records

Print the territory stats before and after a rebalancing operation.

Tier 4: Challenge Exercises (Exercises 13–15)

More complex problems requiring careful design and multiple data structures working together.

Exercise 13: Product Recommendation Engine (Basic)

Context: Priya wants to build a simple product recommendation system for Acme's e-commerce site.

Problem: Given a customer's purchase history (a list of SKUs), recommend products that similar customers have bought.

Task: Create a dataset of at least 8 "customers" with their purchase histories as lists of SKUs. Create a product catalog dict mapping SKU to name.

Then implement: 1. find_similar_customers(all_customers: list[dict], target_customer_id: str) -> list[dict] — find customers who share at least one purchased SKU with the target customer. Use set intersection. 2. recommend_products(all_customers: list[dict], target_customer_id: str, n: int = 3) -> list[str] — find SKUs that similar customers have bought that the target customer has NOT yet bought. Return the top N by frequency.

Test your recommender with at least two different target customers.

Hint: This is a great use case for set operations. A customer's purchase history as a set lets you quickly compute overlap.

Exercise 14: Budget Variance Tracker

Context: Marcus needs a budget variance tool for Acme's departmental budgets.

Problem: Track budgeted vs. actual spending by department and by expense category. Flag anything that is over 10% over budget.

Task: Design a nested dict structure: {department: {category: {"budget": float, "actual": float}}}.

Create data for at least 4 departments and 5 expense categories.

Implement: 1. variance_report(budget_data: dict) -> list[dict] — returns a flat list of records: {department, category, budget, actual, variance_usd, variance_pct, over_budget: bool} 2. department_summary(budget_data: dict) -> dict[str, dict] — total budget, total actual, and total variance per department 3. worst_overruns(budget_data: dict, n: int = 5) -> list[dict] — returns the top N over-budget items by percentage overrun 4. Print a formatted variance report sorted by variance percentage descending.

Exercise 15: Multi-Week Sales Leaderboard

Context: Sandra tracks a rolling 4-week sales leaderboard for the rep team.

Problem: You have 4 weeks of weekly sales data per rep. For each rep: show their weekly totals, 4-week total, rank, and trend (whether week 4 > week 1).

Task: Create a dataset with at least 8 sales reps, each with a list of 4 weekly revenue totals.

Example structure:

rep_weekly_sales = {
    "Carla Nguyen":   [18400, 21200, 19800, 23100],
    "Tom Delacroix":  [15300, 16900, 18200, 17400],
    # ...
}

Implement: 1. leaderboard(rep_data: dict) -> list[dict] — returns a list of records sorted by 4-week total descending, each including: rank, name, weekly_totals, four_week_total, avg_weekly, trend ("+", "-", or "=") 2. week_winners(rep_data: dict) -> dict[int, str] — returns a dict of {week_number: rep_name_with_highest_sales_that_week} 3. most_consistent_rep(rep_data: dict) -> str — returns the name of the rep with the lowest standard deviation in weekly sales (most consistent) 4. Print a formatted leaderboard table, including trend indicators.

Bonus: Add a function hot_streak_reps(rep_data: dict) -> list[str] that returns names of reps whose sales have increased every week (week 1 < week 2 < week 3 < week 4).

Tier 5: Open-Ended Projects (Exercises 16–18)

Design the data structure yourself and build a complete mini-application.

Exercise 16: Design Your Own CRM

Specification: Build a minimal CRM (Customer Relationship Management) system entirely in Python using data structures from this chapter.

Your CRM must: - Store at least 15 customer records with meaningful fields - Support: add, update, delete, and search operations - Track each customer's interaction history (list of interaction dicts — date, type, note) - Produce a report: customers sorted by relationship health score (a metric you define — perhaps based on recency of last contact and number of interactions)

Document your design decisions: why did you choose each data structure for each part of the system? Where did you use nested structures, and why?

Exercise 17: Financial Dashboard

Specification: Build a personal or small-business financial dashboard.

Your dashboard must track: - Income by category and month (at least 3 income categories, 6 months of data) - Expenses by category and month (at least 6 expense categories) - Savings rate per month

Produce: 1. A monthly P&L summary (income, expenses, net) 2. Category breakdown by percentage of total income/expense 3. Month-over-month trend for net income 4. Best and worst months

Exercise 18: Event Registration System

Specification: Build an event registration system for a business conference.

Design data structures to represent: - Events (session title, speaker, time slot, room, capacity) - Attendees (name, company, email, registered sessions)

Implement: - Register an attendee for a session (check capacity limits) - Cancel a registration - Waitlist logic when a session is full - Report: sessions sorted by fill rate (registered / capacity) - Report: attendees with the most sessions registered (VIP list) - Conflict detection: flag attendees registered for two sessions in the same time slot

Answer Hints

These are not full solutions — they are hints to unblock you if you get stuck. Try each exercise yourself first.

Exercise 1: Use catalog[0], catalog[-1], len(catalog), catalog[1:4], and the in operator.

Exercise 2: Direct bracket access for creation, customer.get("preferred_currency", "USD") for safe access, customer.items() for iteration.

Exercise 3: from collections import namedtuple then OfficeLocation = namedtuple("OfficeLocation", ["city", "state", "latitude", "longitude"]).

Exercise 4: |, &, -, ^, .issubset() operators and methods.

Exercise 5: sum(), max(), min(), daily_sales.index(max(...)) for best day. Use slicing for weekly buckets.

Exercise 6: region_totals[region] = region_totals.get(region, 0.0) + amount is the core pattern.

Exercise 7: List comprehension syntax: [expr for item in iterable if condition]. Dict comprehension: {k: v for ...}.

Exercise 8: Design your own records. max(reps, key=lambda r: r["revenue_usd"]) for top rep.

Exercise 9: Start with a dict of dicts. Use .get() everywhere. Use quantity + quantity_change and max(0, result) to prevent negative stock.

Exercise 10: The manager_id field is a foreign key to another record. direct_reports() scans the list for matching manager_id. Recursion in org_chart() is the key challenge.

Exercise 11: Use datetime.strptime() to parse date strings for comparison.

Exercise 12: Compute stats with sum(), len(), and arithmetic. Standard deviation requires import statistics or manual calculation.

Exercise 13: customer_a_skus.intersection(customer_b_skus) to find overlap. customer_a_skus.difference(recommended_skus) to find what's new.

Exercise 14: Variance = actual - budget. Percentage = (actual - budget) / budget * 100.

Exercise 15: Standard deviation: import statistics; statistics.stdev(weekly_list). Week winners: max(rep_data, key=lambda r: rep_data[r][week_index]).

Exercise 16–18: There are no wrong answers — the goal is to make a design decision and defend it with code.