Chapter 3 Exercises: Python Fundamentals I — Variables, Data Types, and Expressions
How to use these exercises: Work through the sections in order. Parts A and B check your understanding of concepts and basic skills. Part C is all about debugging — finding and fixing broken code. Part D applies your skills to realistic data scenarios. Part E pushes you to combine ideas. Part M mixes in review questions from Chapters 1 and 2.
For every "predict the output" question: Write your prediction before running the code. The learning happens in the predicting, not the running.
Difficulty key: ⭐ Foundational | ⭐⭐ Intermediate | ⭐⭐⭐ Advanced | ⭐⭐⭐⭐ Extension
Part A: Conceptual Understanding ⭐
These questions check whether you've internalized the core ideas. Try to answer from memory before checking the chapter.
Exercise 3.1 — Variables as labels
The chapter introduced a threshold concept: variables are labels pointing to values, not boxes containing values. In your own words, explain the difference. Then explain what happens in memory when you execute these two lines:
a = 100
b = a
How many copies of the number 100 exist? How many labels point to it?
Guidance
One copy of `100` exists in memory. Two labels (`a` and `b`) point to it. If variables were boxes, `b = a` would create a copy of 100 and put it in a second box. But since variables are labels, `b = a` just sticks a second label on the same value. For simple types like integers, this distinction doesn't have practical consequences yet — but it becomes critical when you work with lists and dictionaries in Chapter 5, where multiple labels pointing to the same object means changes through one label are visible through the other.
Exercise 3.2 — Type identification
Without running any code, state the type (int, float, str, or bool) of each value:
| Value | Type |
|---|---|
| 42 | |
| 42.0 | |
| "42" | |
| True | |
| 0 | |
| "" | |
| 3.14 | |
| "False" | |
Answers
| Value | Type |
|-------|------|
| `42` | `int` |
| `42.0` | `float` |
| `"42"` | `str` |
| `True` | `bool` |
| `0` | `int` |
| `""` | `str` (empty string, but still a string) |
| `3.14` | `float` |
| `"False"` | `str` (it's in quotes — a string that happens to spell a boolean keyword) |
The tricky ones: `42.0` is a float (the decimal point makes it so, even though it's a whole number). `"False"` is a string, not a boolean — the quotes make it text. And `""` is a string (an empty one), not nothing.
Exercise 3.3 — Operator precedence
Write the result of each expression. Show your work by indicating which operation Python performs first.
1. 2 + 3 * 4
2. (2 + 3) * 4
3. 10 - 6 / 2
4. 2 ** 3 + 1
5. 15 // 4 + 15 % 4
6. 10 > 5 and 3 + 2 == 5
Answers
1. `2 + (3 * 4)` = `2 + 12` = `14` — multiplication before addition
2. `(2 + 3) * 4` = `5 * 4` = `20` — parentheses first
3. `10 - (6 / 2)` = `10 - 3.0` = `7.0` — division before subtraction; note the result is a float because `/` returns float
4. `(2 ** 3) + 1` = `8 + 1` = `9` — exponentiation before addition
5. `(15 // 4) + (15 % 4)` = `3 + 3` = `6` — floor division and modulo have the same precedence as multiplication/division, evaluated left to right, but they're independent here
6. `(10 > 5) and ((3 + 2) == 5)` = `True and (5 == 5)` = `True and True` = `True` — arithmetic first, then comparisons, then `and`
Exercise 3.4 — Assignment vs. comparison
Explain the difference between = and == in Python. For each of the following, state whether it's assignment or comparison, and what the result is:
1. x = 10
2. x == 10
3. y = x
4. y == x
Guidance
1. **Assignment.** Gives the name `x` the value `10`. No output.
2. **Comparison.** Returns `True` if `x` equals `10`, `False` otherwise. (After line 1, this would return `True`.)
3. **Assignment.** Gives the name `y` the same value that `x` points to.
4. **Comparison.** Returns `True` if `y` and `x` have the same value.
The key: `=` stores a value. `==` asks a question ("are these equal?").
Exercise 3.5 — Truthiness
Without running code, predict what bool() returns for each value:
1. bool(1)
2. bool(0)
3. bool(-1)
4. bool("")
5. bool(" ")
6. bool("0")
7. bool(0.0)
8. bool(None)
Answers
1. `True` — any nonzero number is truthy
2. `False` — zero is falsy
3. `True` — negative numbers are nonzero, therefore truthy
4. `False` — empty string is falsy
5. `True` — a space is a character, so the string isn't empty
6. `True` — the string "0" is not empty (it contains one character)
7. `False` — zero as a float is still falsy
8. `False` — `None` is falsy
Exercise 3.6 — Immutability of strings
What is wrong with the following code? What does the programmer probably intend, and how would you fix it?
name = "elena"
name.upper()
print(name)
Guidance
The programmer expects `name` to be `"ELENA"` after calling `.upper()`. But string methods return a *new* string — they don't modify the original. `name.upper()` creates `"ELENA"` and then it's immediately discarded because it isn't saved to any variable. The fix:
name = "elena"
name = name.upper()
print(name) # ELENA
Or, if you want to keep the original: `upper_name = name.upper()`.
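A short sketch of the same idea, showing that the original string survives a method call and that methods can be chained because each one returns a new string. The names `shouted` and `tidy` are illustrative and do not come from the chapter.

```python
name = "elena"

# .upper() builds a new string; the original is left untouched.
shouted = name.upper()
print(name)     # elena
print(shouted)  # ELENA

# Because each method returns a new string, calls can be chained.
tidy = "  elena  ".strip().title()
print(tidy)     # Elena
```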
Part B: Applied Skills ⭐⭐
These exercises ask you to write code. Type every answer into a Jupyter notebook and run it.
Exercise 3.7 — Variable creation
Create variables to store the following information about a dataset. Use descriptive snake_case names. Then print a formatted summary using an f-string.
- The dataset is called "Global Health Observatory"
- It was last updated in 2023
- It has 1,284 rows
- It covers 195 countries
- The average life expectancy across all countries is 73.4 years
Your f-string output should look something like:
Dataset: Global Health Observatory (updated 2023)
Rows: 1,284 | Countries: 195
Average life expectancy: 73.4 years
Guidance
dataset_name = "Global Health Observatory"
last_updated = 2023
row_count = 1284
country_count = 195
avg_life_expectancy = 73.4
print(f"Dataset: {dataset_name} (updated {last_updated})")
print(f"Rows: {row_count:,} | Countries: {country_count}")
print(f"Average life expectancy: {avg_life_expectancy} years")
Note the `:,` format specifier in `{row_count:,}` to add the comma separator.
Exercise 3.8 — Arithmetic with data
A basketball player attempted 82 three-point shots and made 31 of them. Write code to:
- Store the attempts and makes in variables
- Calculate the three-point shooting percentage (makes / attempts)
- Print the result as a percentage with one decimal place using an f-string
Expected output: Three-point percentage: 37.8%
Guidance
three_pt_attempts = 82
three_pt_makes = 31
three_pt_pct = three_pt_makes / three_pt_attempts * 100
print(f"Three-point percentage: {three_pt_pct:.1f}%")
The `:.1f` format specifier means "one decimal place, float format."
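As an alternative sketch, the `%` format type multiplies the value by 100 and appends the percent sign itself, so the raw ratio can stay as a fraction. The variable names `three_pt_ratio` and `summary` are illustrative.

```python
three_pt_attempts = 82
three_pt_makes = 31

# Keep the raw ratio (about 0.378) and let the format spec scale it.
three_pt_ratio = three_pt_makes / three_pt_attempts
summary = f"Three-point percentage: {three_pt_ratio:.1%}"
print(summary)  # Three-point percentage: 37.8%
```

Storing the ratio rather than the percentage can be convenient later, since most calculations expect a value between 0 and 1.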
Exercise 3.9 — String methods practice
Given the following messy data values (simulating what you might read from a file), clean each one using string methods:
country_raw = " United States "
temp_raw = "98.6 degrees"
code_raw = "us"
- Remove the extra whitespace from country_raw
- Extract just the number part from temp_raw (hint: use .split() and indexing)
- Convert code_raw to uppercase
Guidance
country_raw = " United States "
temp_raw = "98.6 degrees"
code_raw = "us"
country_clean = country_raw.strip()
temp_number = temp_raw.split(" ")[0] # Splits into ["98.6", "degrees"], takes first
code_upper = code_raw.upper()
print(f"Country: '{country_clean}'")
print(f"Temperature: {temp_number}")
print(f"Code: {code_upper}")
Output:
Country: 'United States'
Temperature: 98.6
Code: US
Note: `temp_number` is still a string (`"98.6"`). If you wanted to do math with it, you'd need `float(temp_number)`.
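A minimal sketch of that conversion step. The threshold of 100.4 and the name `is_fever` are assumptions made up for illustration; they are not part of the exercise.

```python
temp_raw = "98.6 degrees"
temp_number = temp_raw.split(" ")[0]   # still the string "98.6"

temp_value = float(temp_number)        # now a float
is_fever = temp_value >= 100.4         # numeric comparison (illustrative threshold)

print(type(temp_number).__name__)      # str
print(type(temp_value).__name__)       # float
print(is_fever)                        # False
```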
Exercise 3.10 — String slicing
A dataset uses patient IDs in the format "HOSP-YYYY-NNNNN" where HOSP is a hospital code, YYYY is the year, and NNNNN is a sequence number. Given:
patient_id = "MGH-2024-00142"
Use slicing to extract:
1. The hospital code ("MGH")
2. The year ("2024")
3. The sequence number ("00142")
4. Convert the year to an integer and add 1 to it
Guidance
patient_id = "MGH-2024-00142"
hospital = patient_id[:3]
year_str = patient_id[4:8]
sequence = patient_id[9:]
print(f"Hospital: {hospital}")
print(f"Year: {year_str}")
print(f"Sequence: {sequence}")
year_int = int(year_str) + 1
print(f"Next year: {year_int}")
Alternative approach using `.split("-")`:
parts = patient_id.split("-")
hospital = parts[0] # "MGH"
year_str = parts[1] # "2024"
sequence = parts[2] # "00142"
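The split approach can also be written with tuple unpacking, sketched below. The second ID, `other_id`, is a hypothetical value invented to show why splitting on the dash may be more robust than fixed slice positions when hospital codes differ in length.

```python
patient_id = "MGH-2024-00142"

# Tuple unpacking assigns the three pieces in one statement.
hospital, year_str, sequence = patient_id.split("-")
next_year = int(year_str) + 1

print(hospital, year_str, sequence)  # MGH 2024 00142
print(next_year)                     # 2025

# Hypothetical ID with a longer hospital code: split still works,
# while the fixed slice [:3] cuts the code short.
other_id = "STJUDE-2024-00007"
other_hospital = other_id.split("-")[0]
print(other_hospital)                # STJUDE
print(other_id[:3])                  # STJ
```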
Exercise 3.11 — Type conversion chain
Start with the string "3.14159" and perform the following conversions, printing the result and type at each step:
- Convert to a float
- Convert the float to an int
- Convert the int back to a string
- Convert the string to a bool
What value and type do you have at each step?
Guidance
step0 = "3.14159"
print(f"Step 0: {step0} ({type(step0).__name__})")
step1 = float(step0)
print(f"Step 1: {step1} ({type(step1).__name__})")
step2 = int(step1)
print(f"Step 2: {step2} ({type(step2).__name__})")
step3 = str(step2)
print(f"Step 3: {step3} ({type(step3).__name__})")
step4 = bool(step3)
print(f"Step 4: {step4} ({type(step4).__name__})")
Output:
Step 0: 3.14159 (str)
Step 1: 3.14159 (float)
Step 2: 3 (int) ← truncated, not rounded!
Step 3: 3 (str)
Step 4: True (bool) ← "3" is a non-empty string, so it's truthy
Key insights: `int()` truncates rather than rounds (3.14159 becomes 3, and 3.99 would also become 3, not 4). And `bool("3")` is `True` because any non-empty string is truthy. Even `bool("0")` and `bool("False")` would be `True`!
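To see truncation side by side with rounding, here is a small sketch using values chosen for illustration. It also shows why the exercise goes through `float()` first: calling `int("3.14159")` directly raises a `ValueError`, because `int()` only accepts strings that look like whole numbers.

```python
print(int(3.99))    # 3   (fraction dropped, not rounded up)
print(int(-3.99))   # -3  (truncates toward zero, not down to -4)
print(round(3.99))  # 4   (round() goes to the nearest integer)

# A decimal string has to pass through float() before int().
whole = int(float("3.14159"))
print(whole)        # 3
```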
Exercise 3.12 — Comparison expressions
Given the following variables, predict whether each comparison returns True or False. Then verify in Python.
a = 10
b = 3.0
c = "10"
d = True
1. a == 10
2. a == c
3. a == int(c)
4. b > 2 and b < 4
5. a != b
6. d == 1
7. type(a) == type(c)
Answers
1. `True` — `a` is 10, and 10 equals 10
2. `False` — `a` is an int, `c` is a string. `10 == "10"` is `False` in Python (no automatic type coercion)
3. `True` — `int("10")` is 10, and `10 == 10` is `True`
4. `True` — 3.0 is greater than 2 and less than 4
5. `True` — `10 != 3.0` is `True` (they're different numbers)
6. `True` — `True` is equal to `1` in Python (booleans are a subtype of integers: `True == 1`, `False == 0`)
7. `False` — `type(a)` is `<class 'int'>` and `type(c)` is `<class 'str'>`, so the two types are not equal
Exercise 3.13 — f-string formatting
Write f-strings that produce the following outputs, given the variables below:
population = 8045311
growth_rate = 0.02847
city = "new york"
pi = 3.14159265358979
Target outputs:
1. Population: 8,045,311
2. Growth rate: 2.85%
3. City: New York
4. Pi to 4 decimals: 3.1416
Guidance
print(f"Population: {population:,}")
print(f"Growth rate: {growth_rate * 100:.2f}%")
print(f"City: {city.title()}")
print(f"Pi to 4 decimals: {pi:.4f}")
Notes:
- `:,` adds comma separators
- `:.2f` formats as float with 2 decimal places
- `.title()` capitalizes the first letter of each word
- `:.4f` formats with 4 decimal places (and rounds the last digit)
Exercise 3.14 — Augmented assignment
What is the value of x after each line executes? Track the value step by step.
x = 10
x += 5
x *= 2
x -= 7
x //= 4
x %= 3
Answer
x = 10 → x is 10
x += 5 → x is 15 (10 + 5)
x *= 2 → x is 30 (15 * 2)
x -= 7 → x is 23 (30 - 7)
x //= 4 → x is 5 (23 // 4 = 5, remainder discarded)
x %= 3 → x is 2 (5 % 3 = 2, the remainder)
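The `//` and `%` steps are two halves of the same division, which the sketch below checks for the value 23 from the trace. The names `quotient` and `remainder` are illustrative.

```python
x = 23

quotient = x // 4    # how many whole 4s fit into 23
remainder = x % 4    # what is left over

print(quotient, remainder)            # 5 3
print(4 * quotient + remainder == x)  # True

# divmod() returns both values at once as a pair.
print(divmod(x, 4))                   # (5, 3)
```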
Part C: Debugging ⭐⭐
Every exercise in this section contains buggy code. Find the error, identify the error type (NameError, TypeError, SyntaxError, ValueError, or ZeroDivisionError), and fix it.
Exercise 3.15 — Debug this
city_name = "Chicago"
print(City_name)
Answer
**Error:** `NameError: name 'City_name' is not defined`
**Cause:** Python is case-sensitive. The variable was defined as `city_name` (lowercase c) but referenced as `City_name` (uppercase C).
**Fix:** `print(city_name)`
Exercise 3.16 — Debug this
score = "95"
curved_score = score + 5
print(curved_score)
Answer
**Error:** `TypeError: can only concatenate str (not "int") to str`
**Cause:** `score` is a string (`"95"`), not a number. You can't add an integer to a string.
**Fix:** `curved_score = int(score) + 5`
Exercise 3.17 — Debug this
print("The temperature is 72 degrees)
Answer
**Error:** `SyntaxError: EOL while scanning string literal` (Python 3.10 and later word this as `SyntaxError: unterminated string literal`)
**Cause:** Missing closing quotation mark before the closing parenthesis.
**Fix:** `print("The temperature is 72 degrees")`
Exercise 3.18 — Debug this
vaccination rate = 0.73
Answer
**Error:** `SyntaxError: invalid syntax`
**Cause:** Variable names cannot contain spaces. Python sees `vaccination` as a variable and then doesn't know what to do with `rate = 0.73`.
**Fix:** `vaccination_rate = 0.73`
Exercise 3.19 — Debug this
total = 100
average = total / 0
Answer
**Error:** `ZeroDivisionError: division by zero`
**Cause:** You can't divide by zero — not in Python, not in math. This often happens when a count variable that's supposed to be the denominator hasn't been properly populated.
**Fix:** This depends on the context. You might add a check: `if denominator != 0: average = total / denominator`. Or you might need to trace back to figure out why the denominator is zero.
Exercise 3.20 — Debug this: multiple errors
This code has three separate errors. Find and fix all of them.
Patient_count = 450
vacc_rate = "0.82"
city = seattle
result = Patient_count * vacc_rate
print(f"In {city}, approximately {result} patients were vaccinated")
Answer
**Error 1 (line 3):** `NameError: name 'seattle' is not defined`. `seattle` needs quotes: `city = "Seattle"`
**Error 2 (line 4):** A silent logic error that appears once the NameError is fixed. `vacc_rate` is a string, and multiplying a string by an integer repeats the string rather than raising an error, so `result` becomes `"0.82"` repeated 450 times instead of `369.0`. Fix: `vacc_rate = 0.82` (remove the quotes) or convert: `float(vacc_rate)`
**Error 3 (minor):** The variable naming is inconsistent. `Patient_count` uses different casing than the other variables. While not a Python error, convention says use `patient_count`.
Fixed code:
patient_count = 450
vacc_rate = 0.82
city = "Seattle"
result = patient_count * vacc_rate
print(f"In {city}, approximately {result:.0f} patients were vaccinated")
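A quick way to check what multiplying an integer by a string actually does is to try it with a small count. The sketch below uses 3 instead of 450 so the output stays readable; the names `vacc_rate_text`, `repeated`, and `vaccinated` are illustrative.

```python
vacc_rate_text = "0.82"

# int * str repeats the string; no error is raised.
repeated = 3 * vacc_rate_text
print(repeated)              # 0.820.820.82

# Converting to float first gives the intended arithmetic.
patient_count = 450
vaccinated = patient_count * float(vacc_rate_text)
print(f"{vaccinated:.0f}")   # 369
```

Bugs like this are harder to catch than the ones that raise an exception, which is one reason to print intermediate values while debugging.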
Part D: Real-World Application ⭐⭐⭐
These exercises simulate tasks you'd encounter in actual data work.
Exercise 3.21 — BMI calculator
Body Mass Index (BMI) is calculated as weight in kilograms divided by height in meters squared. Write code to:
- Store a weight of 70 kg and a height of 1.75 m in variables
- Calculate the BMI
- Print the result formatted to one decimal place
- Create a boolean variable indicating whether the BMI is in the "normal" range (18.5 to 24.9)
Guidance
weight_kg = 70
height_m = 1.75
bmi = weight_kg / (height_m ** 2)
print(f"BMI: {bmi:.1f}")
is_normal = bmi >= 18.5 and bmi <= 24.9
print(f"Normal range: {is_normal}")
Output:
BMI: 22.9
Normal range: True
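Python also allows comparisons to be chained, which some readers find closer to the way the range is written in math. This is an optional alternative to the `and` version, sketched with the same values.

```python
weight_kg = 70
height_m = 1.75
bmi = weight_kg / (height_m ** 2)

# Chained comparison: equivalent to bmi >= 18.5 and bmi <= 24.9.
is_normal = 18.5 <= bmi <= 24.9
print(is_normal)  # True
```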
Exercise 3.22 — Temperature conversion
Write code that converts a temperature from Fahrenheit to Celsius using the formula: C = (F - 32) * 5/9. Use the temperature 98.6 F. Print the result with two decimal places. Then verify your answer by converting back to Fahrenheit: F = C * 9/5 + 32.
Guidance
temp_f = 98.6
temp_c = (temp_f - 32) * 5 / 9
print(f"{temp_f}°F = {temp_c:.2f}°C")
# Verify by converting back
verify_f = temp_c * 9 / 5 + 32
print(f"Verification: {verify_f:.2f}°F")
Output:
98.6°F = 37.00°C
Verification: 98.60°F
Exercise 3.23 — Data summary report
You have the following raw data about a survey. Write code that stores each value, performs calculations, and prints a formatted report.
- Survey name: "Public Transit Satisfaction Survey"
- Total respondents: 2,847
- Satisfied: 1,891
- Unsatisfied: 814
- No response: 142
- Survey start date: "2024-01-15"
- Survey end date: "2024-02-28"
Your report should calculate and display:
- The satisfaction rate as a percentage
- The response rate (respondents who gave an answer / total)
- The start year and month extracted from the date string
Guidance
survey_name = "Public Transit Satisfaction Survey"
total = 2847
satisfied = 1891
unsatisfied = 814
no_response = 142
start_date = "2024-01-15"
end_date = "2024-02-28"
responded = satisfied + unsatisfied
satisfaction_rate = satisfied / responded * 100
response_rate = responded / total * 100
start_year = start_date[:4]
start_month = start_date[5:7]
print(f"=== {survey_name} ===")
print(f"Total respondents: {total:,}")
print(f"Satisfaction rate: {satisfaction_rate:.1f}%")
print(f"Response rate: {response_rate:.1f}%")
print(f"Survey period: {start_year}, month {start_month}")
Exercise 3.24 — Course grade calculation
A student's grade is computed as: homework 30%, midterm 30%, final 40%. Given scores of homework=88, midterm=76, final=91, compute the weighted grade. Then determine whether the student passed (grade >= 60) and whether they earned honors (grade >= 90).
Guidance
homework = 88
midterm = 76
final = 91
grade = homework * 0.30 + midterm * 0.30 + final * 0.40
passed = grade >= 60
honors = grade >= 90
print(f"Weighted grade: {grade:.1f}")
print(f"Passed: {passed}")
print(f"Honors: {honors}")
Output:
Weighted grade: 85.6
Passed: True
Honors: False
Exercise 3.25 — Cleaning messy strings
Imagine you've read these values from a badly formatted spreadsheet. Use string methods to clean each one:
name = " dr. elena RODRIGUEZ "
email = "Elena.Rodriguez@Hospital.ORG"
phone = "555-867-5309"
department = "infectious diseases"
Transform them to produce:
- Name as title case with no extra spaces: "Dr. Elena Rodriguez"
- Email as all lowercase: "elena.rodriguez@hospital.org"
- Phone with no dashes: "5558675309"
- Department capitalized: "Infectious Diseases"
Guidance
name = " dr. elena RODRIGUEZ "
email = "Elena.Rodriguez@Hospital.ORG"
phone = "555-867-5309"
department = "infectious diseases"
name_clean = name.strip().title()
email_clean = email.lower()
phone_clean = phone.replace("-", "")
dept_clean = department.title()
print(f"Name: {name_clean}")
print(f"Email: {email_clean}")
print(f"Phone: {phone_clean}")
print(f"Department: {dept_clean}")
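One caveat: `.strip()` only trims the ends of a string. If a raw value also had runs of spaces between words (an assumption here; the sample `name` below is made up), a common idiom is to split on whitespace and rejoin with single spaces. Note that `.split()` returns a list, which Chapter 5 covers properly.

```python
name = "   dr.   elena    RODRIGUEZ  "

# .split() with no argument splits on any run of whitespace and drops
# empty pieces; " ".join() glues the words back with single spaces.
name_clean = " ".join(name.split()).title()
print(name_clean)  # Dr. Elena Rodriguez
```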
Part E: Synthesis and Extension ⭐⭐⭐⭐
These problems require combining multiple concepts.
Exercise 3.26 — Data type detective
Without using type(), write expressions that test whether a variable contains a specific type. For example, to check if x is an integer, you could use x == int(x) — but be careful, that doesn't always work.
For each of the following variables, write a boolean expression that evaluates to True:
a = 42
b = 42.0
c = "42"
d = True
Hint: use isinstance() — Python's built-in function for type checking. Look up how it works, or try isinstance(a, int).
Guidance
print(isinstance(a, int)) # True
print(isinstance(b, float)) # True
print(isinstance(c, str)) # True
print(isinstance(d, bool)) # True
# Interesting edge case:
print(isinstance(d, int)) # Also True! bool is a subclass of int
The fact that `isinstance(True, int)` returns `True` is a Python quirk — booleans are technically integers (`True == 1`, `False == 0`).
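A brief sketch of what that quirk means in practice, and one possible way to test for "an integer that is not a boolean". The variable names `is_plain_int_d` and `is_plain_int_a` are illustrative.

```python
d = True
a = 42

# Booleans behave like the integers 1 and 0 in arithmetic.
print(True + True)     # 2

# To accept plain integers only, rule out bool explicitly.
is_plain_int_d = isinstance(d, int) and not isinstance(d, bool)
is_plain_int_a = isinstance(a, int) and not isinstance(a, bool)
print(is_plain_int_d)  # False
print(is_plain_int_a)  # True
```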
Exercise 3.27 — Building a data dictionary
A "data dictionary" is a description of every column in a dataset. Using only variables and f-strings (no lists or dictionaries yet — those come in Chapter 5), create a printed data dictionary for a small dataset with three columns. For each column, store and display:
- Column name
- Data type (as a descriptive string like "numeric" or "text")
- Description
- Example value
Format the output neatly. This is practice for the kind of documentation you'll write alongside every data science project.
Guidance
col1_name = "country"
col1_type = "text"
col1_desc = "Full country name"
col1_example = "Brazil"
col2_name = "year"
col2_type = "numeric (integer)"
col2_desc = "Year of observation"
col2_example = "2023"
col3_name = "vaccination_rate"
col3_type = "numeric (float)"
col3_desc = "Percentage of population vaccinated"
col3_example = "0.73"
print("=== DATA DICTIONARY ===")
print(f"\n{'Column':<20} {'Type':<20} {'Description':<35} {'Example'}")
print("-" * 85)
print(f"{col1_name:<20} {col1_type:<20} {col1_desc:<35} {col1_example}")
print(f"{col2_name:<20} {col2_type:<20} {col2_desc:<35} {col2_example}")
print(f"{col3_name:<20} {col3_type:<20} {col3_desc:<35} {col3_example}")
The `:<20` format specifier left-aligns text in a field 20 characters wide.
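The same mini-language supports right and center alignment as well. The sketch below wraps each field in square brackets so the padding is visible; the names `left`, `right`, and `center` are illustrative.

```python
col_name = "year"

left = f"[{col_name:<10}]"     # pad on the right
right = f"[{col_name:>10}]"    # pad on the left
center = f"[{col_name:^10}]"   # pad on both sides

print(left)    # [year      ]
print(right)   # [      year]
print(center)  # [   year   ]
```

Right alignment is often preferred for numeric columns so that digits line up.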
Exercise 3.28 — Floating-point exploration
Investigate floating-point precision by answering these questions with code:
- What does 0.1 + 0.2 equal in Python? Is it exactly 0.3?
- What does 0.1 + 0.2 == 0.3 return?
- What does round(0.1 + 0.2, 1) == round(0.3, 1) return?
- Try 0.1 + 0.1 + 0.1 - 0.3. Is the result exactly zero?
- In one or two sentences, explain why this happens and whether it matters for data science.
Guidance
print(0.1 + 0.2) # 0.30000000000000004
print(0.1 + 0.2 == 0.3) # False
print(round(0.1 + 0.2, 1) == round(0.3, 1)) # True
print(0.1 + 0.1 + 0.1 - 0.3) # 5.551115123125783e-17
This happens because computers store floats in binary (base 2), and some decimal fractions (like 0.1) can't be represented exactly in binary — similar to how 1/3 can't be represented exactly in decimal. For data science, this rarely matters because real-world measurements already have far more uncertainty than one quadrillionth. But it's a gotcha when comparing floats with `==`.
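Besides rounding both sides, the standard library offers `math.isclose()` for tolerance-based comparison. The sketch below is one way to use it; the tolerance `1e-9` passed as `abs_tol` is an arbitrary illustrative choice, not a recommended value.

```python
import math

total = 0.1 + 0.2

print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True

# abs_tol is needed when comparing against zero, because the default
# relative tolerance scales with the size of the numbers.
residue = 0.1 + 0.1 + 0.1 - 0.3
print(math.isclose(residue, 0.0))                # False
print(math.isclose(residue, 0.0, abs_tol=1e-9))  # True
```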
Part M: Mixed Review (Chapters 1-2) ⭐
These questions revisit earlier chapters. If you struggle with any of them, revisit the relevant chapter section.
Exercise 3.29 — Data science lifecycle revisited (from Chapter 1)
For each of the Python operations below, identify which stage of the data science lifecycle it most closely corresponds to:
1. vaccination_rate = vaccinated / total_population
2. source_url = "https://data.who.int/"
3. print(f"Vaccination rates range from {min_rate} to {max_rate}")
4. country = country_raw.strip().lower()
Lifecycle stages: Ask, Acquire, Clean, Explore, Model, Communicate
Answers
1. **Explore** (or Model, depending on context) — computing a summary statistic from data
2. **Acquire** — recording the source of data
3. **Communicate** — presenting findings in a readable format
4. **Clean** — standardizing text data by removing whitespace and converting to consistent case
Exercise 3.30 — Jupyter workflow (from Chapter 2)
You're working in a Jupyter notebook and encounter this situation: you defined patient_count = 4521 in cell 3, used it in cell 7, and then accidentally deleted cell 3.
- Does cell 7 still work if you run it right now?
- What happens if you restart the kernel and try to run cell 7?
- How would you prevent this kind of problem in the future?