Chapter 7 Exercises: Data Structures
These exercises progress through five tiers of difficulty. Work through them in order — each tier builds on the skills developed in the previous one. All exercises use realistic business scenarios.
Tier 1: Foundations (Exercises 1–4)
Practice creating and accessing Python data structures. No functions required.
Exercise 1: Build Your First Product Catalog
Context: You've just joined the e-commerce team at Meridian Office Supplies. Your first task is to represent the product catalog in Python.
Task: Create a Python list called product_catalog containing at least 6 product names as strings. Then:
- Print the first product using indexing.
- Print the last product using negative indexing.
- Print the total number of products using
len(). - Print the second through fourth products (inclusive) using slicing.
- Check whether
"Ergonomic Chair"is in your catalog and print a message saying so.
Expected Output Format:
First product: [your product]
Last product: [your product]
Total products: 6
Products 2-4: [your slice]
'Ergonomic Chair' in catalog: True/False
Hint: Remember that indexing starts at 0.
Exercise 2: Create a Customer Record
Context: Carla in the sales team needs a customer record for her newest account, Pinnacle Group.
Task: Create a dictionary called customer representing a single customer with these fields:
- customer_id: "CUST-1042"
- company_name: "Pinnacle Group"
- contact_name: "Derek Huang"
- tier: "silver"
- annual_spend_usd: 28500.00
- region: "West"
- active: True
Then:
1. Print the customer's company name and tier.
2. Print their annual spend formatted as currency: $28,500.00.
3. Add a new key "account_manager" with the value "Carla Nguyen".
4. Update the tier to "gold".
5. Use .get() to retrieve the "preferred_currency" key, defaulting to "USD" if absent.
6. Print all keys and values using a for loop with .items().
Exercise 3: Working with Tuples
Context: Acme Corp stores its office locations as fixed coordinate records.
Task:
1. Create a named tuple called OfficeLocation with fields: city, state, latitude, longitude.
2. Create three office location instances for: New York (40.7128, -74.0060), Chicago (41.8781, -87.6298), and San Francisco (37.7749, -122.4194).
3. Store all three in a list called offices.
4. Use a for loop to print each office's city, state, and coordinates.
5. Try to assign a new latitude to one of the named tuples and observe the error (you can comment out this line after testing).
Bonus: Use tuple unpacking in the loop instead of dot notation.
Exercise 4: Sets for Market Segment Analysis
Context: Sandra is evaluating which market segments two new product lines cover.
Task: Create two sets:
- product_line_a_segments: {"retail", "wholesale", "e-commerce", "enterprise", "education"}
- product_line_b_segments: {"enterprise", "government", "healthcare", "e-commerce", "nonprofit"}
Then compute and print:
1. All segments covered by either line (union)
2. Segments covered by BOTH lines (intersection)
3. Segments covered only by Line A (difference)
4. Segments covered only by Line B (difference)
5. Segments that are unique to one line or the other (symmetric difference)
6. Whether Line A is a subset of the union
7. Whether {"enterprise", "e-commerce"} is a subset of Line A
Tier 2: Core Skills (Exercises 5–8)
Write functions and perform aggregations. Apply the patterns from the chapter.
Exercise 5: Sales Figures Analysis
Context: Priya has collected 20 days of daily sales figures for Q1.
Task: Start with this data:
daily_sales = [
14200, 9800, 16500, 11300, 13700,
8950, 17200, 12400, 15800, 10200,
18400, 9100, 14600, 11900, 16200,
13000, 10800, 19100, 8500, 15300
]
Write a function analyze_sales(sales: list[float]) -> dict that returns a dict containing:
- total: sum of all values
- average: mean value
- maximum: highest value
- minimum: lowest value
- above_average_count: count of days where sales exceeded the average
- best_day_index: index (0-based) of the highest day
Call the function and print each result in a formatted report.
Bonus: Add a weekly_totals key — a list of four weekly totals (5 days per week).
Exercise 6: Building a Sales Aggregation
Context: Marcus needs a summary of order totals by sales region.
Task: Start with this transaction list:
transactions = [
("ORD-2001", "Northeast", 4800.00),
("ORD-2002", "West", 9200.00),
("ORD-2003", "Southeast", 3100.50),
("ORD-2004", "Midwest", 6700.00),
("ORD-2005", "Northeast", 2250.75),
("ORD-2006", "West", 11400.00),
("ORD-2007", "Southeast", 5900.00),
("ORD-2008", "Midwest", 4200.25),
("ORD-2009", "Northeast", 7600.00),
("ORD-2010", "West", 3850.50),
]
Write a function aggregate_by_region(transactions: list[tuple]) -> dict[str, float] that returns a dict mapping each region to its total sales.
Then write a second function print_region_report(totals: dict[str, float]) -> None that prints a formatted table showing region, total, and percentage of grand total, sorted by total descending.
Exercise 7: List Comprehension Transformations
Context: Priya needs to clean and transform raw product data before loading it into the reporting system.
Task: Start with this raw product list:
raw_products = [
{"sku": "P001", "name": " wireless headset ", "price": 149.99, "category": "audio", "in_stock": True},
{"sku": "P002", "name": "Standing Desk", "price": 249.00, "category": "furniture", "in_stock": True},
{"sku": "P003", "name": "ergonomic mouse ", "price": 59.95, "category": "peripherals", "in_stock": False},
{"sku": "P004", "name": " USB Hub ", "price": 39.99, "category": "peripherals", "in_stock": True},
{"sku": "P005", "name": "4K Webcam", "price": 129.00, "category": "video", "in_stock": True},
{"sku": "P006", "name": "Monitor Arm", "price": 89.00, "category": "furniture", "in_stock": False},
]
Using list comprehensions (one per task), create:
1. A list of product names with leading/trailing whitespace stripped and .title() applied.
2. A list of SKUs for in-stock products only.
3. A list of (sku, price) tuples for products priced at $100 or more.
4. A list of product dicts with a new key "discounted_price" set to 90% of the original price, for out-of-stock products only.
5. A dict comprehension: {sku: name} for all in-stock products.
Exercise 8: Nested Structure — Table of Records
Context: Sandra wants a summary table showing each sales rep's name, region, number of deals closed, and total revenue for the month.
Task: Create a list of dicts called rep_performance with at least 6 entries, each containing:
- rep_name, region, deals_closed, revenue_usd
Then write these functions (no pre-existing functions to start from — design them yourself):
top_rep_by_revenue(reps: list[dict]) -> dict— returns the rep with highest revenuefilter_by_region(reps: list[dict], region: str) -> list[dict]— returns reps in the given regionaverage_deal_size(reps: list[dict]) -> float— computes average revenue per deal across all repsreps_above_target(reps: list[dict], target: float) -> list[str]— returns a list of rep names whose revenue exceeds the target
Print results from all four functions with formatted output.
Tier 3: Applied Patterns (Exercises 9–12)
Build complete, multi-function programs that solve a coherent business problem.
Exercise 9: Inventory Management System
Context: You are building a simple inventory tracker for Acme Corp's warehouse.
Task: Design and implement an inventory system using a dict-of-dicts structure where the outer key is SKU. Your system must include:
-
An initial inventory of at least 8 products (each with:
name,category,quantity_on_hand,reorder_threshold,unit_cost_usd). -
A function
reorder_needed(inventory: dict) -> list[str]that returns a list of SKUs wherequantity_on_hand <= reorder_threshold. -
A function
total_inventory_value(inventory: dict) -> floatthat returns the sum ofquantity_on_hand * unit_cost_usdfor all items. -
A function
adjust_stock(inventory: dict, sku: str, quantity_change: int) -> boolthat adds or subtracts fromquantity_on_hand(negative for a sale, positive for a restock). Prevent quantity from going below 0. -
A function
category_summary(inventory: dict) -> dict[str, dict]that returns, for each category, the number of SKUs and total inventory value.
Run the system: load inventory, check reorder list, adjust a few stock levels, print the category summary and total value.
Exercise 10: Employee Directory with Org Chart
Context: Marcus is building an employee directory for Acme Corp.
Task: Design a data structure that can represent employees AND their reporting relationships (who reports to whom). Each employee should have at minimum: employee_id, name, department, title, manager_id (the employee_id of their manager, or None for the CEO).
Create at least 12 employees across at least 3 departments.
Write these functions:
1. find_employee(directory: list[dict], employee_id: str) -> dict | None
2. direct_reports(directory: list[dict], manager_id: str) -> list[dict] — returns all employees who report directly to this manager
3. employees_by_department(directory: list[dict]) -> dict[str, list[dict]] — returns a dict of dept -> list of employees
4. org_chart(directory: list[dict], root_id: str, indent: int = 0) -> None — recursively prints an org chart starting from the given root employee
Exercise 11: Invoice Tracker
Context: Maya Reyes needs an invoice tracking system for her consulting practice.
Task: Create a list of invoice dicts. Each invoice should have: invoice_id, client_name, issue_date (string "YYYY-MM-DD"), due_date, amount_usd, status (one of: "draft", "sent", "paid", "overdue").
Create at least 10 invoices across at least 4 clients with mixed statuses.
Write these functions:
1. total_outstanding(invoices: list[dict]) -> float — sum of amounts with status "sent" or "overdue"
2. overdue_invoices(invoices: list[dict], reference_date: str) -> list[dict] — sent invoices past their due date
3. revenue_by_client(invoices: list[dict]) -> dict[str, float] — sum of paid invoices per client
4. mark_paid(invoices: list[dict], invoice_id: str) -> bool — updates status to "paid", returns False if not found
5. aging_report(invoices: list[dict], reference_date: str) -> None — prints outstanding invoices sorted by days overdue
Exercise 12: Sales Territory Rebalancing
Context: Sandra wants to rebalance Acme's sales territories based on current customer density and revenue.
Task: Start with a dataset of customer records, each including: customer_id, company_name, current_territory, annual_revenue_usd, city, state.
Create at least 15 customers spread across 4 territories.
Write these functions:
1. territory_stats(customers: list[dict]) -> dict[str, dict] — for each territory, compute: customer count, total revenue, average revenue
2. imbalance_score(stats: dict) -> float — computes the standard deviation of revenue across territories (a measure of imbalance)
3. customers_to_move(customers: list[dict], from_territory: str, to_territory: str, n: int) -> list[str] — returns the IDs of the N smallest customers to reassign (by revenue), to help rebalance
4. reassign_territory(customers: list[dict], customer_ids: list[str], new_territory: str) -> int — reassigns territory for the given IDs, returns count of changed records
Print the territory stats before and after a rebalancing operation.
Tier 4: Challenge Exercises (Exercises 13–15)
More complex problems requiring careful design and multiple data structures working together.
Exercise 13: Product Recommendation Engine (Basic)
Context: Priya wants to build a simple product recommendation system for Acme's e-commerce site.
Problem: Given a customer's purchase history (a list of SKUs), recommend products that similar customers have bought.
Task: Create a dataset of at least 8 "customers" with their purchase histories as lists of SKUs. Create a product catalog dict mapping SKU to name.
Then implement:
1. find_similar_customers(all_customers: list[dict], target_customer_id: str) -> list[dict] — find customers who share at least one purchased SKU with the target customer. Use set intersection.
2. recommend_products(all_customers: list[dict], target_customer_id: str, n: int = 3) -> list[str] — find SKUs that similar customers have bought that the target customer has NOT yet bought. Return the top N by frequency.
Test your recommender with at least two different target customers.
Hint: This is a great use case for set operations. A customer's purchase history as a set lets you quickly compute overlap.
Exercise 14: Budget Variance Tracker
Context: Marcus needs a budget variance tool for Acme's departmental budgets.
Problem: Track budgeted vs. actual spending by department and by expense category. Flag anything that is over 10% over budget.
Task: Design a nested dict structure: {department: {category: {"budget": float, "actual": float}}}.
Create data for at least 4 departments and 5 expense categories.
Implement:
1. variance_report(budget_data: dict) -> list[dict] — returns a flat list of records: {department, category, budget, actual, variance_usd, variance_pct, over_budget: bool}
2. department_summary(budget_data: dict) -> dict[str, dict] — total budget, total actual, and total variance per department
3. worst_overruns(budget_data: dict, n: int = 5) -> list[dict] — returns the top N over-budget items by percentage overrun
4. Print a formatted variance report sorted by variance percentage descending.
Exercise 15: Multi-Week Sales Leaderboard
Context: Sandra tracks a rolling 4-week sales leaderboard for the rep team.
Problem: You have 4 weeks of weekly sales data per rep. For each rep: show their weekly totals, 4-week total, rank, and trend (whether week 4 > week 1).
Task: Create a dataset with at least 8 sales reps, each with a list of 4 weekly revenue totals.
Example structure:
rep_weekly_sales = {
"Carla Nguyen": [18400, 21200, 19800, 23100],
"Tom Delacroix": [15300, 16900, 18200, 17400],
# ...
}
Implement:
1. leaderboard(rep_data: dict) -> list[dict] — returns a list of records sorted by 4-week total descending, each including: rank, name, weekly_totals, four_week_total, avg_weekly, trend ("+", "-", or "=")
2. week_winners(rep_data: dict) -> dict[int, str] — returns a dict of {week_number: rep_name_with_highest_sales_that_week}
3. most_consistent_rep(rep_data: dict) -> str — returns the name of the rep with the lowest standard deviation in weekly sales (most consistent)
4. Print a formatted leaderboard table, including trend indicators.
Bonus: Add a function hot_streak_reps(rep_data: dict) -> list[str] that returns names of reps whose sales have increased every week (week 1 < week 2 < week 3 < week 4).
Tier 5: Open-Ended Projects (Exercises 16–18)
Design the data structure yourself and build a complete mini-application.
Exercise 16: Design Your Own CRM
Specification: Build a minimal CRM (Customer Relationship Management) system entirely in Python using data structures from this chapter.
Your CRM must: - Store at least 15 customer records with meaningful fields - Support: add, update, delete, and search operations - Track each customer's interaction history (list of interaction dicts — date, type, note) - Produce a report: customers sorted by relationship health score (a metric you define — perhaps based on recency of last contact and number of interactions)
Document your design decisions: why did you choose each data structure for each part of the system? Where did you use nested structures, and why?
Exercise 17: Financial Dashboard
Specification: Build a personal or small-business financial dashboard.
Your dashboard must track: - Income by category and month (at least 3 income categories, 6 months of data) - Expenses by category and month (at least 6 expense categories) - Savings rate per month
Produce: 1. A monthly P&L summary (income, expenses, net) 2. Category breakdown by percentage of total income/expense 3. Month-over-month trend for net income 4. Best and worst months
Exercise 18: Event Registration System
Specification: Build an event registration system for a business conference.
Design data structures to represent: - Events (session title, speaker, time slot, room, capacity) - Attendees (name, company, email, registered sessions)
Implement: - Register an attendee for a session (check capacity limits) - Cancel a registration - Waitlist logic when a session is full - Report: sessions sorted by fill rate (registered / capacity) - Report: attendees with the most sessions registered (VIP list) - Conflict detection: flag attendees registered for two sessions in the same time slot
Answer Hints
These are not full solutions — they are hints to unblock you if you get stuck. Try each exercise yourself first.
Exercise 1: Use catalog[0], catalog[-1], len(catalog), catalog[1:4], and the in operator.
Exercise 2: Direct bracket access for creation, customer.get("preferred_currency", "USD") for safe access, customer.items() for iteration.
Exercise 3: from collections import namedtuple then OfficeLocation = namedtuple("OfficeLocation", ["city", "state", "latitude", "longitude"]).
Exercise 4: |, &, -, ^, .issubset() operators and methods.
Exercise 5: sum(), max(), min(), daily_sales.index(max(...)) for best day. Use slicing for weekly buckets.
Exercise 6: region_totals[region] = region_totals.get(region, 0.0) + amount is the core pattern.
Exercise 7: List comprehension syntax: [expr for item in iterable if condition]. Dict comprehension: {k: v for ...}.
Exercise 8: Design your own records. max(reps, key=lambda r: r["revenue_usd"]) for top rep.
Exercise 9: Start with a dict of dicts. Use .get() everywhere. Use quantity + quantity_change and max(0, result) to prevent negative stock.
Exercise 10: The manager_id field is a foreign key to another record. direct_reports() scans the list for matching manager_id. Recursion in org_chart() is the key challenge.
Exercise 11: Use datetime.strptime() to parse date strings for comparison.
Exercise 12: Compute stats with sum(), len(), and arithmetic. Standard deviation requires import statistics or manual calculation.
Exercise 13: customer_a_skus.intersection(customer_b_skus) to find overlap. customer_a_skus.difference(recommended_skus) to find what's new.
Exercise 14: Variance = actual - budget. Percentage = (actual - budget) / budget * 100.
Exercise 15: Standard deviation: import statistics; statistics.stdev(weekly_list). Week winners: max(rep_data, key=lambda r: rep_data[r][week_index]).
Exercise 16–18: There are no wrong answers — the goal is to make a design decision and defend it with code.