Chapter 9 Quiz: File I/O — Reading and Writing Business Data

DataField.Dev

Instructions: Choose the best answer for each multiple-choice question. For short-answer questions, write your answer before checking the key. There are 20 questions covering all major topics from Chapter 9.

Multiple Choice (Questions 1–14)

Question 1

Which of the following correctly constructs a file path using pathlib that will work on both Windows and Mac/Linux?

A. "data\\sales\\report.csv"

B. "data/sales/report.csv"

C. Path("data") / "sales" / "report.csv"

D. Path("data\\sales\\report.csv")

Question 2

Priya writes the following code:

file_handle = open("report.csv", mode="r", encoding="utf-8")
content = file_handle.read()

What is the most significant problem with this code?

A. The mode parameter should be "read" not "r"

B. file_handle.read() will only read the first line

C. The file is never closed, which wastes operating system resources and may leave data unflushed

D. encoding="utf-8" is not a valid parameter for open()

Question 3

Maya wants to open a log file and add new entries without overwriting the existing content. Which file mode should she use?

A. "w" — write mode

B. "r+" — read-write mode

C. "x" — exclusive creation mode

D. "a" — append mode

Question 4

What does the following code print?

from pathlib import Path

p = Path("data") / "sales" / "q1_report.csv"
print(p.stem)

A. "data/sales/q1_report.csv"

B. "q1_report.csv"

C. "q1_report"

D. ".csv"

Question 5

When reading a CSV file with csv.DictReader, what Python type does the revenue field contain if the CSV row is "North,Enterprise,189000.00"?

A. float

B. int

C. str

D. Decimal

Question 6

Which of the following correctly opens a CSV file for reading in a way that prevents newline translation issues?

A. open("sales.csv", mode="r", encoding="utf-8")

B. open("sales.csv", mode="r", newline="", encoding="utf-8")

C. open("sales.csv", mode="r", newline="\n", encoding="utf-8")

D. open("sales.csv", mode="rb", encoding="utf-8")

Question 7

Priya runs this code:

with open("data/north_q1.csv", mode="w", encoding="utf-8") as f:
    f.write("region,revenue\n")
    f.write("North,189000\n")

with open("data/north_q1.csv", mode="w", encoding="utf-8") as f:
    f.write("region,revenue\n")
    f.write("South,130500\n")

What are the final contents of north_q1.csv?

A. Both sets of rows (4 lines total)

B. Only the North row (the second open() has no effect on an existing file)

C. Only the South row (the second open() in "w" mode overwrites the file)

D. The file is empty because it was opened twice

Question 8

What is the purpose of the indent=2 argument in json.dump(data, f, indent=2)?

A. It limits the JSON output to 2 levels of nesting

B. It makes the JSON output human-readable with 2-space indentation

C. It compresses the JSON output by removing 2 spaces per line

D. It is required when writing JSON to a file (without it, json.dump() raises an error)

Question 9

Which pathlib method would Priya use to find all files ending in .csv within a directory and all of its subdirectories?

A. directory.glob("*.csv")

B. directory.iterdir("*.csv")

C. directory.rglob("*.csv")

D. directory.find("*.csv")

Question 10

Maya writes this code to check if her project log exists before reading it:

log_path = Path("data") / "project_log.csv"
if log_path.is_file():
    with open(log_path, ...) as f:
        ...

Which pathlib method could she use instead of .is_file() if she only cared whether the path existed at all (file or directory)?

A. .exists()

B. .stat()

C. .resolve()

D. .name()

Question 11

What does extrasaction="ignore" do when passed to csv.DictWriter?

A. It ignores rows where any field is an empty string

B. It silently drops any dictionary keys that are not in the fieldnames list

C. It skips writing the header row

D. It prevents writing duplicate rows to the output file

Question 12

Marcus writes this code to create a directory for output files:

output_dir = Path("output") / "reports" / "2024"
output_dir.mkdir()

Under which condition will this code raise an error?

A. When output_dir already exists

B. When the string "output" contains a forward slash

C. When the parent directory (output/reports/) does not yet exist

D. Both A and C

Question 13

Priya needs to read a large CSV file (2 million rows) as memory-efficiently as possible. Which approach is best?

A.

with open(path, newline="", encoding="utf-8") as f:
    all_rows = f.readlines()
for row in all_rows:
    process(row)

B.

with open(path, newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    all_records = list(reader)
for record in all_records:
    process(record)

C.

with open(path, newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    for record in reader:
        process(record)

D.

content = open(path, encoding="utf-8").read()
for line in content.split("\n"):
    process(line)

Question 14

What is the key difference between json.load() and json.loads()?

A. json.load() reads arrays; json.loads() reads objects

B. json.load() reads from a file object; json.loads() reads from a string

C. json.load() is for Python 3; json.loads() is for Python 2

D. json.load() returns a dict; json.loads() returns a list

Short Answer (Questions 15–20)

Write a brief answer or small code snippet for each question.

Question 15

Write a single line of code that creates the directory output/reports/2024/ and all necessary parent directories, without raising an error if the directory already exists.

Question 16

The following code has two bugs. Identify both and explain how to fix them.

import csv

sales_records = [
    {"rep": "Marcus Webb", "revenue": 247500, "quota": 220000},
    {"rep": "Tom Reilly",  "revenue": 130500, "quota": 160000},
]

with open("reps.csv", mode="w", encoding="utf-8") as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=["rep", "revenue", "quota"])
    writer.writerows(sales_records)

Question 17

Maya uses this code to check whether a project is over budget:

ratio = project["actual_hours"] / project["estimated_hours"]
if ratio > 1.10:
    print("Over budget!")

What error will occur if project["estimated_hours"] is 0, and how should Maya guard against it?

Question 18

Write a code snippet that reads a JSON file at config/settings.json into a variable called app_config, changes the value of app_config["last_run"] to the current date as an ISO-format string (YYYY-MM-DD), and writes the updated dict back to the same file with 2-space indentation.

Question 19

Explain why you should always pass newline="" when opening a CSV file, and what can go wrong if you omit it on Windows.

Question 20

Priya has a list of 500 sales records as dicts. She wants to write them to a CSV file sorted by region (ascending) and then by revenue (descending within each region). Write the sorting expression she should use before calling writer.writerows().

Answer Key

Question 1 — Answer: C

Path("data") / "sales" / "report.csv" uses pathlib's / operator to build a path in a cross-platform way. Option A uses Windows-only backslashes. Option B is a hardcoded forward-slash string: Python generally accepts forward slashes on Windows too, but it is a plain str rather than a Path object, so it does not use pathlib at all. Option D passes a Windows-style string into Path(). This works on Windows, but on Mac/Linux the backslashes are treated as ordinary characters inside a single filename.
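The difference between options C and D can be checked on any machine with pathlib's "pure" path classes, which apply one platform's parsing rules without touching the filesystem. This is a minimal sketch; the variable names are illustrative only.

```python
from pathlib import Path, PurePosixPath, PureWindowsPath

# Option C: joining with / lets pathlib choose the separator for the platform.
report_path = Path("data") / "sales" / "report.csv"
print(report_path.parts)  # ('data', 'sales', 'report.csv')

# Option D: the same backslash string parsed under each platform's rules.
as_windows = PureWindowsPath("data\\sales\\report.csv")
as_posix = PurePosixPath("data\\sales\\report.csv")

print(len(as_windows.parts))  # 3: three separate path components
print(len(as_posix.parts))    # 1: one filename that contains backslashes
```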

Question 2 — Answer: C

The file is opened but never closed. Without a with statement, the programmer must call file_handle.close() manually. If any exception occurs between open() and the manual close(), the file remains open indefinitely. The fix is to use a with statement: with open("report.csv", mode="r", encoding="utf-8") as file_handle:.
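A small sketch of what the with statement guarantees, assuming a throwaway file created in a temporary directory (the file contents are made up for illustration):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical sample file so the example is self-contained.
    report_path = Path(tmp_dir) / "report.csv"
    report_path.write_text("region,revenue\nNorth,189000\n", encoding="utf-8")

    with open(report_path, mode="r", encoding="utf-8") as file_handle:
        content = file_handle.read()
        closed_inside = file_handle.closed  # still open inside the block

    # Leaving the block closes the file, even if an exception had been raised.
    closed_after = file_handle.closed

print(closed_inside, closed_after)  # False True
```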

Question 3 — Answer: D

Mode "a" (append) positions the write cursor at the end of the file before writing anything, leaving existing content intact. If the file does not exist, "a" creates it. Mode "w" would overwrite everything. Mode "r+" requires the file to exist and starts at the beginning. Mode "x" creates a new file and raises FileExistsError if the file already exists.
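The behavior of the three writing modes can be seen side by side in a short sketch. It assumes a scratch file named activity.log in a temporary directory; the file name and entries are hypothetical.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = Path(tmp_dir) / "activity.log"

    # "a" creates the file if needed, then always writes at the end.
    with open(log_path, mode="a", encoding="utf-8") as f:
        f.write("first entry\n")
    with open(log_path, mode="a", encoding="utf-8") as f:
        f.write("second entry\n")
    after_append = log_path.read_text(encoding="utf-8")

    # "w" truncates the file before anything is written.
    with open(log_path, mode="w", encoding="utf-8") as f:
        f.write("only entry\n")
    after_write = log_path.read_text(encoding="utf-8")

    # "x" refuses to open a file that already exists.
    try:
        open(log_path, mode="x", encoding="utf-8")
        exclusive_error = None
    except FileExistsError as exc:
        exclusive_error = type(exc).__name__

print(repr(after_append))  # 'first entry\nsecond entry\n'
print(repr(after_write))   # 'only entry\n'
print(exclusive_error)     # FileExistsError
```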

Question 4 — Answer: C

.stem returns the final component of the path with its suffix removed, that is, the filename without the extension. .name would return "q1_report.csv", and .suffix would return ".csv". The whole path as a string comes from str(p); .resolve() returns an absolute Path object, not a string.
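For comparison, here are the related attributes on the same path from the question:

```python
from pathlib import Path

p = Path("data") / "sales" / "q1_report.csv"

print(p.name)         # q1_report.csv
print(p.stem)         # q1_report
print(p.suffix)       # .csv
print(p.parent.name)  # sales
```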

Question 5 — Answer: C

csv.DictReader always returns strings. Every value, including what looks like "189000.00", is a Python str until you explicitly convert it. This is one of the most common sources of bugs when working with CSV data — forgetting to convert numeric strings before arithmetic.
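A minimal sketch of the conversion step, using io.StringIO in place of a real file. The header names region, tier, and revenue are assumptions, since the question shows only the data row.

```python
import csv
import io

# Assumed header row; the question only names the revenue field.
csv_text = "region,tier,revenue\nNorth,Enterprise,189000.00\n"

reader = csv.DictReader(io.StringIO(csv_text, newline=""))
row = next(reader)

raw_revenue = row["revenue"]
print(type(raw_revenue).__name__)  # str

# Convert explicitly before doing arithmetic.
revenue = float(raw_revenue)
print(revenue * 2)  # 378000.0
```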

Question 6 — Answer: B

newline="" tells open() not to apply Python's universal newline translation, leaving line-ending handling to the csv module. Without it, newlines embedded inside quoted fields may not be interpreted correctly when reading, and on Windows an extra \r is added to each line ending when writing, which shows up as blank rows between records. Option C is wrong because newline should be "" (empty string), not "\n". Option D ("rb" binary mode) is incompatible with the encoding parameter and raises a ValueError.

Question 7 — Answer: C

Opening a file with mode "w" truncates (empties) the file immediately, before any writing occurs. The second open() call in "w" mode deletes all content written by the first block. The file ends up containing only the South row. To append to existing content, use mode "a".

Question 8 — Answer: B

indent=2 makes json.dump() write the JSON in "pretty-printed" format with 2-space indentation, making it readable by humans. Without it, the entire JSON structure is written on a single line (compact format). Neither usage is required — both produce valid JSON. The indent parameter has no effect on the depth of nesting allowed.
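The two output styles can be compared with json.dumps(), which accepts the same indent argument as json.dump() but returns a string. The sample dict is hypothetical.

```python
import json

data = {"region": "North", "revenue": 189000}

compact = json.dumps(data)
pretty = json.dumps(data, indent=2)

print(compact)
# {"region": "North", "revenue": 189000}

print(pretty)
# {
#   "region": "North",
#   "revenue": 189000
# }

# Both strings parse back to the same Python object.
same_data = json.loads(compact) == json.loads(pretty)
```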

Question 9 — Answer: C

directory.rglob("*.csv") searches recursively through all subdirectories. directory.glob("*.csv") only searches the immediate directory, not subdirectories. .iterdir() does not accept a glob pattern. .find() is not a pathlib.Path method (it is a Unix command-line tool).
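A quick sketch of the difference, assuming a temporary directory with one CSV at the top level and one in a subdirectory (all file names are made up):

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    directory = Path(tmp_dir)
    (directory / "archive").mkdir()
    (directory / "q1.csv").touch()
    (directory / "notes.txt").touch()
    (directory / "archive" / "q4.csv").touch()

    # glob() looks only in the directory itself.
    top_level = sorted(p.name for p in directory.glob("*.csv"))

    # rglob() also descends into every subdirectory.
    recursive = sorted(p.name for p in directory.rglob("*.csv"))

print(top_level)  # ['q1.csv']
print(recursive)  # ['q1.csv', 'q4.csv']
```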

Question 10 — Answer: A

.exists() returns True if the path refers to any existing filesystem entry, whether a file or a directory. Symbolic links are followed, so a link whose target is missing reports False. .is_file() is more specific and returns True only for regular files. .stat() raises an exception if the path does not exist (it does not return a boolean). .resolve() returns the absolute path and, by default, does not check existence.
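The two checks can be compared on a directory, a file, and a missing path. This sketch assumes a temporary directory; missing.csv is a hypothetical name that is never created.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    data_dir = Path(tmp_dir) / "data"
    data_dir.mkdir()
    log_path = data_dir / "project_log.csv"
    log_path.touch()
    missing_path = data_dir / "missing.csv"

    # Each tuple is (exists(), is_file()).
    checks = {
        "directory": (data_dir.exists(), data_dir.is_file()),
        "file": (log_path.exists(), log_path.is_file()),
        "missing": (missing_path.exists(), missing_path.is_file()),
    }

print(checks["directory"])  # (True, False)
print(checks["file"])       # (True, True)
print(checks["missing"])    # (False, False)
```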

Question 11 — Answer: B

extrasaction="ignore" tells csv.DictWriter to silently discard any keys in the row dictionary that are not listed in fieldnames. Without it, any extra key raises a ValueError. This is useful when your dicts may contain computed or intermediate fields that you do not want in the output CSV.
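A short sketch of both behaviors, writing to in-memory buffers instead of files. The extra attainment key is a hypothetical computed field.

```python
import csv
import io

# "attainment" is an extra key that is not listed in fieldnames.
row = {"rep": "Marcus Webb", "revenue": 247500, "attainment": 1.125}
fieldnames = ["rep", "revenue"]

# Default behavior (extrasaction="raise"): the extra key is an error.
strict_writer = csv.DictWriter(io.StringIO(newline=""), fieldnames=fieldnames)
try:
    strict_writer.writerow(row)
    strict_error = None
except ValueError as exc:
    strict_error = str(exc)

# extrasaction="ignore": the extra key is dropped silently.
lenient_buffer = io.StringIO(newline="")
lenient_writer = csv.DictWriter(
    lenient_buffer, fieldnames=fieldnames, extrasaction="ignore"
)
lenient_writer.writeheader()
lenient_writer.writerow(row)
output = lenient_buffer.getvalue()

print(strict_error is not None)  # True
print(repr(output))              # 'rep,revenue\r\nMarcus Webb,247500\r\n'
```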

Question 12 — Answer: D

Both A and C will cause errors. .mkdir() without arguments raises FileExistsError if the directory already exists, and raises FileNotFoundError if any parent directory in the path does not exist. The correct call is .mkdir(parents=True, exist_ok=True): parents=True creates all missing intermediate directories, and exist_ok=True suppresses the error if the directory is already there.
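Both failure cases, and the safe call, can be reproduced in a temporary directory. This is a sketch; the flag names are illustrative.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp_dir:
    output_dir = Path(tmp_dir) / "output" / "reports" / "2024"

    # Case C: the parent directories do not exist yet.
    try:
        output_dir.mkdir()
        missing_parent_error = False
    except FileNotFoundError:
        missing_parent_error = True

    output_dir.mkdir(parents=True)  # creates the whole chain

    # Case A: the directory now exists.
    try:
        output_dir.mkdir()
        already_exists_error = False
    except FileExistsError:
        already_exists_error = True

    # The safe form succeeds in either situation.
    output_dir.mkdir(parents=True, exist_ok=True)
    created = output_dir.is_dir()

print(missing_parent_error, already_exists_error, created)  # True True True
```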

Question 13 — Answer: C

Option C is the most memory-efficient because it iterates over the csv.DictReader directly, processing one record at a time without ever holding the entire file in memory. Options A and B both load the entire file into memory (as a list) before processing — fine for small files, but a problem for 2 million rows. Option D misuses open() without a with statement, never closes the file, and would also load the entire file into memory.
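The streaming pattern from option C might look like this when computing a running total. The sketch uses io.StringIO with 1,000 generated rows as a stand-in for a large file, and the column names are assumptions.

```python
import csv
import io

# Stand-in for a large file: 1,000 generated rows (hypothetical data).
csv_text = "region,revenue\n" + "".join(f"North,{n}\n" for n in range(1, 1001))

total = 0.0
count = 0

reader = csv.DictReader(io.StringIO(csv_text, newline=""))
for record in reader:
    # Only the current record is held at a time; nothing accumulates in a list.
    total += float(record["revenue"])
    count += 1

print(count, total)  # 1000 500500.0
```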

Question 14 — Answer: B

json.load(f) reads from a file object (an open file handle). json.loads(s) parses a JSON string that is already in memory (the s stands for "string"). Both return the same Python types — the only difference is the input source. A common use case for json.loads() is parsing an API response body that arrives as a string.
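A minimal sketch showing that the two functions differ only in their input. io.StringIO stands in for an open file, and the payload is a made-up example.

```python
import io
import json

payload = '{"region": "North", "revenue": 189000}'

# json.loads() parses a string that is already in memory.
from_string = json.loads(payload)

# json.load() reads from a file-like object.
from_file = json.load(io.StringIO(payload))

print(from_string == from_file)     # True
print(type(from_string).__name__)   # dict
```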

Question 15 — Answer:

(Path("output") / "reports" / "2024").mkdir(parents=True, exist_ok=True)

Or equivalently:

output_path = Path("output") / "reports" / "2024"
output_path.mkdir(parents=True, exist_ok=True)

parents=True creates output/, then output/reports/, then output/reports/2024/ if they do not exist. exist_ok=True prevents a FileExistsError if the directory is already there.

Question 16 — Answer:

Bug 1: The open() call is missing newline="". When writing CSV files, open() must be called with newline="" to prevent Python's universal newline translation from interfering with the csv module's line termination. Fix: open("reps.csv", mode="w", newline="", encoding="utf-8").

Bug 2: writer.writerows(sales_records) is called without first calling writer.writeheader(). Without the header, the CSV file will contain only data rows with no column names, making the file nearly unusable. Fix: add writer.writeheader() before writer.writerows(sales_records).
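With both fixes applied, the code might look like the sketch below. It writes to a temporary directory rather than the working directory, which is an assumption made so the example cleans up after itself, and it reads the file back to confirm the header.

```python
import csv
import tempfile
from pathlib import Path

sales_records = [
    {"rep": "Marcus Webb", "revenue": 247500, "quota": 220000},
    {"rep": "Tom Reilly", "revenue": 130500, "quota": 160000},
]

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "reps.csv"

    # Fix 1: newline="" leaves line endings to the csv module.
    with open(csv_path, mode="w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=["rep", "revenue", "quota"])
        writer.writeheader()  # Fix 2: write the column names first.
        writer.writerows(sales_records)

    # Read the file back; DictReader uses the header row for the keys.
    with open(csv_path, mode="r", newline="", encoding="utf-8") as csv_file:
        rows = list(csv.DictReader(csv_file))

print(rows[0]["rep"], rows[0]["revenue"])  # Marcus Webb 247500
```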

Question 17 — Answer:

A ZeroDivisionError will occur because Python cannot divide by zero. Maya should guard against it with a conditional check before computing the ratio:

if project["estimated_hours"] == 0:
    print(f"Warning: {project['project_name']} has no estimated hours.")
else:
    ratio = project["actual_hours"] / project["estimated_hours"]
    if ratio > 1.10:
        print("Over budget!")

Alternatively, using an inline conditional expression (which treats a zero or negative estimate as a ratio of 0.0):

ratio = (
    project["actual_hours"] / project["estimated_hours"]
    if project["estimated_hours"] > 0
    else 0.0
)

Question 18 — Answer:

import json
from datetime import date
from pathlib import Path

config_path = Path("config") / "settings.json"

with open(config_path, mode="r", encoding="utf-8") as json_file:
    app_config = json.load(json_file)

app_config["last_run"] = date.today().isoformat()

with open(config_path, mode="w", encoding="utf-8") as json_file:
    json.dump(app_config, json_file, indent=2)

Note: two separate with blocks are needed — one to read, one to write. You cannot read and then immediately overwrite in the same with block opened in "r" mode.

Question 19 — Answer:

The newline="" parameter tells Python's open() function to pass line endings through to the csv module without translating them. The csv module does its own newline processing and needs to see the original \r\n (carriage return + newline) sequences that Windows uses.

Without newline="" on Windows:
- When reading: universal newline translation converts every \r\n to \n before the csv module sees it, so newlines embedded inside quoted fields may not be interpreted correctly.
- When writing: the csv module already ends each row with \r\n, and the default newline translation then turns that \n into \r\n, producing \r\r\n line endings that appear as extra blank rows when the file is opened.

The symptom — blank rows appearing between every data row in a CSV — is one of the most confusing CSV bugs beginners encounter, and newline="" prevents it entirely.
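The doubled carriage return can be reproduced on any operating system. In this sketch, opening the file with newline="\r\n" forces the same \n to \r\n translation that Windows applies by default when newline is omitted; the file names and rows are hypothetical.

```python
import csv
import tempfile
from pathlib import Path

rows = [["region", "revenue"], ["North", "189000"]]

with tempfile.TemporaryDirectory() as tmp_dir:
    good_path = Path(tmp_dir) / "good.csv"
    bad_path = Path(tmp_dir) / "bad.csv"

    # Correct: no translation, so the csv module's \r\n is written as-is.
    with open(good_path, mode="w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)

    # Simulates the Windows default: every \n written becomes \r\n.
    with open(bad_path, mode="w", newline="\r\n", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)

    good_bytes = good_path.read_bytes()
    bad_bytes = bad_path.read_bytes()

print(good_bytes)  # b'region,revenue\r\nNorth,189000\r\n'
print(bad_bytes)   # b'region,revenue\r\r\nNorth,189000\r\r\n'
```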

Question 20 — Answer:

sorted_records = sorted(
    sales_records,
    key=lambda record: (record["region"], -record["revenue"])
)

The tuple (record["region"], -record["revenue"]) sorts first by region in ascending alphabetical order, then by revenue in descending order within each region. Negating a numeric value is the standard Python idiom for reversing the sort order of one key in a multi-key sort without using reverse=True (which would reverse all keys).

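For secondary keys that cannot be negated (strings, for example), rely on the stability of Python's sort and sort twice: first by the secondary key in descending order, then by the primary key in ascending order:

# Pass 1: secondary key, descending. Pass 2: primary key, ascending.
# sorted() is stable, so the revenue order survives within each region.
by_revenue_desc = sorted(sales_records, key=lambda r: r["revenue"], reverse=True)
sorted_records = sorted(by_revenue_desc, key=lambda r: r["region"])

When the descending key is numeric, the negation trick (-record["revenue"]) is the cleaner, single-pass solution for mixed ascending/descending order on multiple keys.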
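A worked example of the negated-key sort on a handful of records. The figures are made up, and the revenue values are assumed to be numbers already; values read from a CSV file would need converting with float() or int() first.

```python
# Hypothetical sample records with numeric revenue values.
sales_records = [
    {"region": "South", "revenue": 130500},
    {"region": "North", "revenue": 189000},
    {"region": "North", "revenue": 247500},
    {"region": "South", "revenue": 98000},
]

sorted_records = sorted(
    sales_records,
    key=lambda record: (record["region"], -record["revenue"]),
)

ordered = [(r["region"], r["revenue"]) for r in sorted_records]
print(ordered)
# [('North', 247500), ('North', 189000), ('South', 130500), ('South', 98000)]
```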

End of Quiz