Chapter 3 Exercises: Python for the Business Professional
Section A: Python Fundamentals (Exercises 1–10)
Exercise 1: Variables and Data Types
Create variables for the following business metrics and print each one with an appropriate label using f-strings:
- Company name (string): "Meridian Healthcare"
- Number of clinics (integer): 23
- Annual revenue (float): 8,750,000.00
- Year-over-year growth rate (float): 0.12
- Publicly traded (boolean): False
Then print a summary sentence that incorporates all five variables.
Exercise 2: Type Conversion
A dataset arrives with revenue stored as strings (e.g., "$1,250,000"). Write code that:
- Starts with the string
revenue_str = "$1,250,000" - Removes the dollar sign and commas
- Converts the result to a float
- Prints the float value
Hint: Use the .replace() method on strings. You can chain multiple .replace() calls.
Exercise 3: Arithmetic Operators
A retail store has the following quarterly data:
- Q1 Revenue: $320,000
- Q2 Revenue: $415,000
- Q3 Revenue: $380,000
- Q4 Revenue: $510,000
- Annual Costs: $1,200,000
Write code that calculates and prints:
- Total annual revenue
- Average quarterly revenue
- Annual profit (revenue minus costs)
- Profit margin as a percentage
- Projected next-year revenue assuming 8% growth
Exercise 4: String Operations
Given the following variables:
first_name = "Nkechi"
last_name = "Adeyemi"
department = "marketing"
employee_id = 4821
Write code that produces the following outputs:
"ADEYEMI, Nkechi"(last name uppercase, comma, first name)"Marketing"(department with first letter capitalized)"Employee ID: 004821"(zero-padded to 6 digits)"nadeyemi@company.com"(first initial + last name, all lowercase)
Exercise 5: Conditional Logic — Discount Tiers
Write an if/elif/else statement that assigns a discount rate based on order value:
| Order Value | Discount |
|---|---|
| $500 or more | 20% |
| $200 – $499.99 | 10% |
| $100 – $199.99 | 5% |
| Below $100 | 0% |
Test your code with order values of $75, $150, $350, and $600. For each, print the original price, discount rate, and final price.
Exercise 6: For Loops — Monthly Report
You have a list of monthly sales figures:
monthly_sales = [42000, 38500, 51000, 47200, 39800, 55100,
60200, 58400, 45000, 52300, 61800, 74500]
month_names = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
"Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]
Write a for loop that prints a line for each month showing: - Month name - Sales figure (formatted with dollar sign and commas) - Whether the month was above or below the annual average
At the end, print the annual total, annual average, best month, and worst month.
Exercise 7: While Loop — Break-Even Analysis
A startup has: - Initial investment: $500,000 - Monthly revenue: $45,000 - Monthly costs: $32,000
Write a while loop that determines how many months it takes to break even (i.e., when cumulative profit equals or exceeds the initial investment). Print the month number and cumulative profit at each step.
Exercise 8: List Comprehensions
Given a list of product prices:
prices = [29.99, 49.99, 15.00, 89.99, 12.50, 199.99, 34.99, 7.99]
Use list comprehensions to create:
- A list of prices with 10% tax added
- A list of only prices above $30
- A list of price labels (e.g.,
"$29.99"formatted as strings) - A list of tuples pairing each price with its category: "Premium" if above $50, "Standard" if $20–$50, "Budget" if below $20
Exercise 9: Dictionaries — Product Catalog
Create a list of three dictionaries, each representing a product with keys: name, sku, price, category, in_stock.
Then write code that:
- Prints all products that are in stock
- Calculates the average price of all products
- Finds the most expensive product and prints its name
- Adds a new key
discounted_price(15% off) to each dictionary
Exercise 10: Functions — Business Calculator
Write three functions:
-
compound_growth(principal, rate, years)— Returns the future value given compound growth. Formula:principal * (1 + rate) ** years -
payback_period(investment, monthly_profit)— Returns the number of months to recoup an investment (rounded up to the nearest whole month). -
format_currency(amount)— Returns a string formatted as currency (e.g.,1234567.89becomes"$1,234,567.89").
Test each function with at least two different inputs.
Section B: pandas Exercises (Exercises 11–25)
Exercise 11: Creating a DataFrame
Create a DataFrame from the following data about five regional offices:
| Office | City | Employees | Revenue | Year Opened |
|---|---|---|---|---|
| HQ | New York | 120 | 4500000 | 2010 |
| West | San Francisco | 85 | 3200000 | 2015 |
| South | Austin | 45 | 1800000 | 2018 |
| Midwest | Chicago | 60 | 2100000 | 2016 |
| Northeast | Boston | 35 | 1200000 | 2020 |
After creating the DataFrame, display: - The shape - Data types of each column - Descriptive statistics
Exercise 12: Reading and Exploring Data
Using the Athena sales dataset created in the chapter (or a CSV of your choice), write code to answer:
- How many rows and columns does the dataset have?
- What are the unique values in the
regioncolumn? - What is the date range of the data?
- Are there any missing values?
- What are the mean, median, and standard deviation of the
revenuecolumn?
Exercise 13: Selecting and Filtering
Using the Athena sales dataset:
- Select only the
store,category, andrevenuecolumns. - Filter for all rows where
categoryis "Electronics." - Filter for rows where
revenueis above $60,000. - Filter for rows in the "West" region with revenue above $40,000.
- Filter for rows where the category is either "Electronics" or "Apparel."
Exercise 14: Sorting
Using the Athena sales dataset:
- Sort by revenue in descending order and display the top 10 rows.
- Sort by region (alphabetically) and then by revenue (descending) within each region.
- Find the single highest-revenue record in the dataset. What store, category, and month does it represent?
Exercise 15: Creating New Columns
Using the Athena sales dataset:
- Create a column called
avg_pricecalculated asrevenue / units_sold. - Create a column called
quarterbased on the month (Q1 = Jan–Mar, Q2 = Apr–Jun, etc.). - Create a column called
revenue_tierthat classifies each row as "High" (above $60,000), "Medium" ($30,000–$60,000), or "Low" (below $30,000).
Exercise 16: GroupBy Basics
Using the Athena sales dataset:
- Calculate total revenue by region.
- Calculate average revenue by category.
- Calculate total units sold by store.
- Find the region with the highest total revenue.
- Find the category with the lowest average revenue.
Exercise 17: GroupBy Advanced
Using the Athena sales dataset:
- Calculate total revenue by region and category (two-level groupby).
- For each store, calculate the total revenue, average revenue, and number of months of data.
- Find the top 3 stores by total revenue.
- Calculate what percentage of total company revenue each region contributes.
Exercise 18: Pivot Tables
Using the Athena sales dataset, create a pivot table showing:
- Rows: regions
- Columns: categories
- Values: total revenue
Then create a second pivot table showing average revenue instead of total.
Hint: Use pd.pivot_table() or df.pivot_table().
Exercise 19: Handling Missing Data
Create a DataFrame with some intentionally missing values:
import numpy as np
data = {
"product": ["Widget A", "Widget B", "Widget C", "Widget D", "Widget E"],
"price": [25.00, np.nan, 15.00, 42.00, np.nan],
"units_sold": [100, 250, np.nan, 175, 300],
"category": ["Electronics", "Electronics", "Home", np.nan, "Home"]
}
df = pd.DataFrame(data)
Write code that:
- Identifies which columns have missing values and how many
- Fills missing prices with the median price
- Fills the missing category with "Unknown"
- Drops any rows where
units_soldis still missing - Verifies that no missing values remain
Exercise 20: Merging DataFrames
You have two DataFrames:
stores = pd.DataFrame({
"store_id": [1, 2, 3, 4, 5],
"store_name": ["Downtown", "Airport", "Mall", "Suburban", "University"],
"region": ["NE", "NE", "SE", "MW", "W"]
})
performance = pd.DataFrame({
"store_id": [1, 2, 3, 4, 5],
"q1_revenue": [320000, 280000, 410000, 150000, 90000],
"q1_target": [300000, 300000, 400000, 200000, 100000]
})
- Merge these DataFrames on
store_id. - Add a column
met_targetthat isTrueifq1_revenue >= q1_target. - Add a column
pct_of_targetshowing revenue as a percentage of target. - Which stores met their target? Which did not?
Section C: Business Analysis Exercises (Exercises 21–30)
Exercise 21: Sales Trend Analysis
Using the Athena sales dataset, write code to:
- Calculate total revenue by month.
- Identify the month with the highest revenue and the month with the lowest.
- Calculate the month-over-month growth rate.
- Create a line chart showing the monthly revenue trend.
- Write a 2–3 sentence business summary of the trend.
Exercise 22: Store Performance Scorecard
Write a function called store_scorecard(df, store_name) that takes the Athena sales DataFrame and a store name, and prints:
- Total annual revenue for that store
- Revenue by category
- Best-performing category
- Worst-performing category
- Monthly trend (list of monthly totals)
- Whether the store is above or below the company average
Test it with at least three different stores.
Exercise 23: Category Analysis
Write code that answers the following questions about the Athena dataset:
- Which product category generates the most revenue overall?
- Which category has the highest average transaction value (revenue / units)?
- Is the category mix consistent across regions, or do some regions lean heavily toward certain categories?
- Which category shows the most seasonal variation?
Present your findings as printed output with clear labels.
Exercise 24: Underperformer Deep Dive
Extend the underperformer analysis from the chapter:
- Instead of using 80% of the category average as the threshold, try 70%, 80%, and 90%. How does the number of underperformers change?
- Are there any stores that appear as underperformers across multiple categories?
- For the most underperforming store, write a brief analysis (3–5 sentences) explaining what the data suggests and what additional information you would want.
Exercise 25: Automated Report Function
Write a function called monthly_report(df, month) that takes the Athena sales DataFrame and a month string (e.g., "2025-06") and produces a complete printed report including:
- Total revenue for the month
- Revenue by region
- Revenue by category
- Top-performing store
- Lowest-performing store
- Comparison to the previous month (if available)
Test it with at least two different months.
Exercise 26: Python vs. Excel Decision Framework
For each of the following tasks, decide whether Python or Excel would be the better tool (or whether both are needed). Explain your reasoning in 2–3 sentences per task.
- Calculating the sum of 50 numbers from a meeting
- Analyzing 500,000 rows of transaction data
- Building a one-time financial model for a board presentation
- Running the same sales report every Monday morning
- Exploring a new dataset you have never seen before
- Sharing results with a non-technical executive
- Connecting to an API to pull social media metrics
Exercise 27: Data Quality Assessment
Write a function called data_quality_check(df) that takes any DataFrame and prints:
- Shape (rows x columns)
- Columns with missing values and the percentage missing
- Columns with only one unique value (potential constants that add no information)
- Numeric columns with suspicious values (e.g., negative revenue)
- Duplicate rows (if any)
Test it on the Athena sales dataset and on a DataFrame you create with intentional quality issues.
Exercise 28: Year-over-Year Comparison
Create a second year of synthetic Athena data (2024) using the same structure as the 2025 data. Combine both years into a single DataFrame. Then:
- Calculate total revenue by year.
- Calculate the year-over-year growth rate by region.
- Identify which regions grew and which declined.
- Create a grouped bar chart comparing 2024 and 2025 revenue by region.
Exercise 29: Function Library
Create a Python file (or a notebook section) called "My Business Analysis Toolkit" containing the following reusable functions:
calculate_cagr(start_value, end_value, years)— Compound annual growth rateclassify_by_quartile(series)— Returns a Series with labels "Q1", "Q2", "Q3", "Q4" based on quartile positionformat_summary_table(df)— Takes a DataFrame and returns a string with formatted numbers (commas, dollar signs for revenue columns)detect_outliers(series, threshold=1.5)— Returns values that are more thanthresholdIQR away from Q1/Q3
Test each function.
Exercise 30: Capstone — Complete Business Analysis
Choose one of the following scenarios and write a complete analysis in a Jupyter notebook:
Option A: You are a new analyst at a chain of 12 coffee shops. Create a synthetic dataset of daily sales (date, shop name, drink category, revenue, cups sold) for 3 months. Analyze which shops are underperforming, which drink categories are most popular by location, and whether weekday vs. weekend patterns differ. Present your findings with at least two visualizations and a written recommendation.
Option B: You are a marketing analyst reviewing campaign performance. Create a synthetic dataset of 20 email campaigns (campaign name, date, audience size, open rate, click rate, conversions, spend). Analyze which campaigns were most cost-effective, whether audience size affects conversion rates, and what the optimal campaign profile looks like. Present findings with visualizations and recommendations.
Section D: Conceptual Questions (Exercises 31–35)
Exercise 31
Explain in your own words why reproducibility matters in business analysis. Give an example of a situation where a non-reproducible analysis could cause a business problem.
Exercise 32
A colleague says: "I don't need to learn Python — AI can write all my code for me." Write a 3–5 sentence response explaining why understanding Python is still valuable even in the age of AI coding assistants.
Exercise 33
Describe three situations where a Jupyter notebook would be a better deliverable than a PowerPoint presentation, and three situations where PowerPoint would be better.
Exercise 34
You have just shown your Python analysis to a VP who responds: "This is great, but can you put it in Excel so I can play with the numbers?" How do you handle this request, and what does it tell you about communication in data work?
Exercise 35
Professor Okonkwo says: "You are not learning to be software engineers. You are learning to ask better questions." In 3–5 sentences, explain what you think she means by this, and give an example of how Python enables "better questions" compared to manual analysis.
Solutions to selected exercises are available in Appendix B: Answers to Selected Exercises.