Learning Objectives
- Access individual characters and substrings using indexing and slicing
- Explain why strings are immutable and work effectively within that constraint
- Use essential string methods (split, join, strip, replace, find, upper, lower, startswith, endswith, count) to process text
- Iterate over strings character by character, word by word, and line by line
- Validate user input using string inspection methods (isdigit, isalpha, isalnum)
- Format output precisely using f-string format specifications
- Use escape characters and raw strings for special text handling
In This Chapter
- Chapter Overview
- 7.1 Strings Are Everywhere
- 7.2 String Indexing
- 7.3 String Slicing
- 7.4 Strings Are Immutable
- 7.5 Essential String Methods
- 7.6 Processing Text
- 7.7 Input Validation with Strings
- 7.8 Advanced Formatting with f-Strings
- 7.9 Escape Characters and Raw Strings
- 7.10 Project Checkpoint: TaskFlow v0.6
- 🐛 Debugging Walkthrough: Immutability TypeError
- Chapter Summary
Chapter 7: Strings: Text Processing and Manipulation
"The string is the duct tape of programming — it holds everything together." — Practical programmer wisdom
Chapter Overview
Think about the last application you used. Maybe you searched for something on Google, sent a text message, filled out a form, or scrolled through social media. Every single one of those interactions involved text — strings of characters being parsed, validated, transformed, compared, and displayed.
Strings are the most common data type in real-world software. Not integers. Not floating-point numbers. Strings. Web servers parse URL strings. Databases store and query text fields. Machine learning pipelines clean messy text data before analysis. Bioinformaticians process DNA sequences that are just long strings of A, C, G, and T. Every command-line tool you've ever used takes string arguments and produces string output.
You've been using strings since Chapter 2, of course — every print() call, every input() prompt, every f-string you formatted. But we've been treating strings like simple containers for text. In this chapter, we crack them open and learn what they can really do.
This chapter introduces one of Python's most important concepts: immutability. Understanding that strings cannot be changed in place — only replaced with new strings — is a threshold concept that reshapes how you think about data. It trips up nearly every beginner, and it's the foundation for understanding how Python handles data more broadly.
In this chapter, you will learn to:
- Access individual characters and substrings using indexing and slicing
- Explain why strings are immutable and work effectively within that constraint
- Use essential string methods to process text like a professional
- Iterate over strings in multiple ways for different tasks
- Validate user input before your program tries to use it
- Format output with precise alignment, decimal places, and separators
- Handle special characters with escape sequences and raw strings
🔄 Spaced Review: This chapter builds directly on Chapter 5 (loops: you'll iterate over strings) and Chapter 3 (type casting: remember str()?). You'll also use the functions you learned to write in Chapter 6 throughout every example.
🏃 Fast Track: If you're comfortable with basic string indexing and just need methods and formatting, skim 7.2-7.3 and jump to 7.5.
🔬 Deep Dive: The case studies for this chapter explore FASTA file parsing (bioinformatics) and how everyday apps like autocorrect, search engines, and spam filters rely on string processing.
7.1 Strings Are Everywhere
Let's start with a question: what percentage of code in a typical web application deals with strings?
The answer varies, but it's shockingly high — often 60-70% of the logic involves string operations. Parsing HTTP headers, validating email addresses, sanitizing user input, constructing SQL queries, formatting dates, generating HTML, processing JSON. Strings are the universal interface between systems, between users and programs, and between different parts of the same program.
Here's a quick tour of strings in the wild:
# Bioinformatics: a DNA sequence is just a string
dna = "ATCGATCGATCG"
# Web development: URLs are strings that encode routing information
url = "https://example.com/users/42/profile?tab=settings"
# Data science: CSV files are strings with structure
csv_row = "Patel,Anika,Biology,University of Michigan"
# System administration: log files are strings with timestamps
log_entry = "2025-03-14 08:23:17 ERROR Database connection timeout"
# Natural language processing: all text starts as a string
tweet = "Just finished my CS homework! #python #coding"
Every one of these examples requires different string operations. By the end of this chapter, you'll know how to handle all of them.
Dr. Anika Patel — the biology researcher you met back in Chapter 1 — works with DNA sequence files in a format called FASTA. Each sequence has a header line starting with > followed by lines of nucleotide characters. Her daily work is essentially string processing: parsing headers, counting nucleotides, searching for patterns. We'll use her work as a running example throughout this chapter.
7.2 String Indexing
A string is a sequence of characters. Each character sits at a numbered position called an index. Python uses zero-based indexing, which means the first character is at index 0, not index 1.
gene = "ATCGATCG"
#       01234567
print(gene[0]) # Output: A
print(gene[1]) # Output: T
print(gene[4]) # Output: A
print(gene[7]) # Output: G
Why zero-based? It's a convention inherited from C and the way memory addresses work — the index represents the offset from the start of the string. The first character has zero offset. You'll get used to it, and eventually it'll feel natural.
Negative Indexing
Python offers a slick shortcut for counting from the end: negative indices. Index -1 is the last character, -2 is second-to-last, and so on.
gene = "ATCGATCG"
#       01234567
#      -8     -1
print(gene[-1]) # Output: G (last character)
print(gene[-2]) # Output: C (second to last)
print(gene[-8]) # Output: A (same as gene[0])
This is genuinely useful. When you need the last character of a string and you don't know (or don't care) how long it is, my_string[-1] is cleaner than my_string[len(my_string) - 1].
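To make the equivalence concrete, here is a small sketch (the variable names are illustrative) showing that a negative index -k refers to the same character as index len(s) - k:

```python
gene = "ATCGATCG"

# For any k from 1 to len(gene), gene[-k] is the same character
# as gene[len(gene) - k].
for k in range(1, len(gene) + 1):
    assert gene[-k] == gene[len(gene) - k]

last = gene[-1]               # last character
second_to_last = gene[-2]     # one before the last
print(last, second_to_last)   # G C
```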
IndexError: Going Out of Bounds
What happens when you try to access an index that doesn't exist?
gene = "ATCGATCG" # 8 characters, indices 0–7
print(gene[8]) # IndexError: string index out of range
Python raises an IndexError. This is a common mistake, especially when you forget that a string of length n has valid indices 0 through n-1. If gene has 8 characters, the last valid index is 7, not 8.
🐛 Debugging Tip: When you see IndexError: string index out of range, check two things: (1) Is your index off by one? (2) Is the string shorter than you expected? Print len(your_string) to verify.
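One way to apply that tip is to compare the index against len() before using it. The sketch below uses a hypothetical helper, char_at(), written only for illustration; it returns None instead of raising an error when the index is out of range:

```python
def char_at(text, index):
    """Return text[index], or None if the index is out of range.

    Hypothetical helper for illustration only.
    """
    # Valid indices run from -len(text) through len(text) - 1.
    if -len(text) <= index < len(text):
        return text[index]
    return None


gene = "ATCGATCG"
print(len(gene))          # 8, so the last valid index is 7
print(char_at(gene, 7))   # G
print(char_at(gene, 8))   # None
```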
🔄 Check Your Understanding #1
What does the following code print?
message = "Hello, World!"
print(message[7])
print(message[-6])
Answer
`W` and `W`. Index 7 counts from the start (H=0, e=1, l=2, l=3, o=4, ,=5, space=6, W=7). Index -6 counts from the end (!=−1, d=−2, l=−3, r=−4, o=−5, W=−6). They're the same character.
7.3 String Slicing
Indexing gets you one character. Slicing gets you a substring — a piece of the original string. The syntax is string[start:stop:step].
greeting = "Hello, World!"
print(greeting[0:5]) # Output: Hello
print(greeting[7:12]) # Output: World
print(greeting[:5]) # Output: Hello (start defaults to 0)
print(greeting[7:]) # Output: World! (stop defaults to end)
print(greeting[:]) # Output: Hello, World! (copy entire string)
The critical rule: the start index is inclusive, but the stop index is exclusive. greeting[0:5] gives you characters at indices 0, 1, 2, 3, 4, which is five characters total. This feels odd at first, but it has a nice property: when both indices fall within the string, the length of the slice equals stop - start.
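A quick way to convince yourself of the inclusive/exclusive rule is to check it in code. This is a minimal sketch; the variable names start, stop, left, and right are chosen only for illustration:

```python
greeting = "Hello, World!"

start, stop = 7, 12
piece = greeting[start:stop]

print(piece)                       # World
print(len(piece) == stop - start)  # True

# Adjacent slices that share a boundary split the string with no
# overlap and no gap, because stop is exclusive.
left, right = greeting[:5], greeting[5:]
print(left + right == greeting)    # True
```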
The Step Parameter
The optional third parameter controls the step size — how many positions to advance between each character:
alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
print(alphabet[0:10:2]) # Output: ACEGI (every other character)
print(alphabet[::3]) # Output: ADGJMPSVY (every third character)
print(alphabet[::-1]) # Output: ZYXWVUTSRQPONMLKJIHGFEDCBA (reversed!)
That last one — [::-1] — is the classic Python idiom for reversing a string. You'll see it everywhere.
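One common use of the reversal idiom is a palindrome check. The function below is a hypothetical example, not part of the chapter's projects, and it compares raw characters only:

```python
def is_palindrome(text):
    """Return True if text reads the same forwards and backwards.

    Hypothetical helper: it compares raw characters only, so case and
    spaces matter.
    """
    return text == text[::-1]


print(is_palindrome("racecar"))  # True
print(is_palindrome("python"))   # False
```

Because case matters here, "Racecar" would not count as a palindrome unless you lowercase it first.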
Common Slicing Patterns
Here are patterns you'll use constantly:
filename = "experiment_results_2025.csv"
# Get file extension
extension = filename[-4:] # ".csv"
# Get filename without extension
name_only = filename[:-4] # "experiment_results_2025"
# Get first n characters
first_ten = filename[:10] # "experiment"
# Get last n characters
last_eight = filename[-8:] # "2025.csv"
Slicing Never Raises IndexError
Here's something that surprises most beginners: slicing with out-of-range indices doesn't crash.
short = "Hi"
print(short[0:100]) # Output: Hi (no error!)
print(short[50:100]) # Output: (empty string, no error)
Python simply returns whatever characters fall within the requested range. This is different from indexing — short[100] would raise an IndexError, but short[0:100] quietly returns all available characters. This is by design; it makes slicing more forgiving when you don't know the exact length of your data.
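This forgiving behavior means a "first n characters" helper needs no length check. A minimal sketch, assuming a hypothetical preview() function:

```python
def preview(text, limit=10):
    """Return at most `limit` characters of text (hypothetical helper)."""
    # No length check needed: a slice stops quietly at the end of the string.
    return text[:limit]


print(preview("Hi"))                           # Hi
print(preview("experiment_results_2025.csv"))  # experiment
```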
🐛 Debugging Walkthrough: Off-by-One in Slicing
A student writes code to extract the area code from a phone number:
phone = "(555) 867-5309"
area_code = phone[1:3]
print(area_code) # Output: 55 (wrong! expected 555)
The bug: they wanted characters at indices 1, 2, and 3, but phone[1:3] only gives indices 1 and 2. The stop index is exclusive. The fix: phone[1:4].
This off-by-one error is the most common slicing mistake. When you're not getting the characters you expect, count the indices on your fingers and remember: stop is exclusive.
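Counting indices by hand is error-prone, so one alternative is to let Python locate the boundaries. The sketch below previews find() from Section 7.5 and assumes the phone number has the "(555) 867-5309" shape; it is one possible approach, not the only fix:

```python
phone = "(555) 867-5309"

open_paren = phone.find("(")    # index of "(" or -1 if missing
close_paren = phone.find(")")   # index of ")" or -1 if missing

if open_paren != -1 and close_paren > open_paren:
    # Start one past "(" (inclusive); stop at ")" (exclusive).
    area_code = phone[open_paren + 1:close_paren]
else:
    area_code = ""

print(area_code)  # 555
```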
7.4 Strings Are Immutable
Here's the moment that trips up every beginner. You have a string, and you want to change one character:
name = "Jython"
name[0] = "P" # TypeError: 'str' object does not support item assignment
Python raises a TypeError. You cannot change a string in place. This is because strings are immutable: once created, their contents cannot be modified. Not a single character. Not a slice. Not at all.
🚪 Threshold Concept: Immutability
- Before: "I can change the third character of a string."
- After: "Strings are immutable — I create a NEW string with the changes I want."
This is a fundamental shift in how you think about data. Instead of modifying existing strings, you build new strings from old ones. The original string remains unchanged (and Python's garbage collector eventually cleans it up if nothing references it anymore).
🧩 Productive Struggle
Before reading on, try to fix this code. The goal is to change "Jython" to "Python":
name = "Jython"
# How do you make name equal "Python"?
Spend a minute thinking about it. What tools do you already have?
Solution
You create a new string by combining pieces:
name = "Jython"
name = "P" + name[1:] # Concatenate "P" with "ython"
print(name) # Output: Python
Or use the `replace()` method (which we'll learn in Section 7.5):
name = "Jython"
name = name.replace("J", "P")
print(name) # Output: Python
In both cases, you're not modifying the original string — you're creating a brand-new string and reassigning the variable `name` to point to it. The old `"Jython"` string still exists briefly in memory until Python cleans it up.
Why Immutability?
You might wonder: why would Python be designed this way? Isn't it inconvenient?
There are real engineering reasons:
- Safety. When you pass a string to a function, you know the function can't alter your original string. This eliminates an entire category of bugs.
- Efficiency. Python can optimize memory by reusing identical strings. If two variables hold "hello", Python can point them both to the same object in memory. This is only safe because neither can be changed.
- Hashability. Strings can be used as dictionary keys and in sets (you'll learn about these in Chapter 9) precisely because they're immutable. Mutable objects can't be safely used as keys because their hash might change.
- Thread safety. In concurrent programs, immutable objects can be shared between threads without locks. You won't need this for a while, but it matters in production systems.
For now, the practical takeaway is simple: every "change" to a string creates a new string. Get comfortable with patterns like text = text.upper() and result = text[:5] + "new_part" + text[8:].
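The slice-and-concatenate pattern can be wrapped in a small function. This is a sketch with a hypothetical helper name, replace_at(), and it assumes the index is within range:

```python
def replace_at(text, index, new_char):
    """Return a copy of text with the character at index replaced.

    Hypothetical helper; assumes 0 <= index < len(text).
    """
    return text[:index] + new_char + text[index + 1:]


original = "Jython"
fixed = replace_at(original, 0, "P")

print(fixed)     # Python
print(original)  # Jython (unchanged)
```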
🔄 Spaced Review (Ch 3): Remember from Chapter 3 that variables are name tags, not boxes? This is where that mental model pays off. When you write name = name.replace("J", "P"), you're not changing the object; you're pointing the name tag name at a different object.
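You can watch the name tag move by keeping a second name on the original object. This sketch uses the is operator, which tests whether two names refer to the very same object; the variable alias is introduced only for illustration:

```python
name = "Jython"
alias = name                   # a second name tag on the same object
print(alias is name)           # True

name = name.replace("J", "P")  # name now points at a new string

print(name)                    # Python
print(alias)                   # Jython: the original object is untouched
print(alias is name)           # False
```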
7.5 Essential String Methods
Strings come loaded with methods — functions that belong to the string object and operate on its contents. Python strings have over 40 built-in methods. We'll focus on the ones you'll use most.
Remember: because strings are immutable, none of these methods change the original string. Each one returns a new value, usually a new string.
Changing Case
title = "the great gatsby"
print(title.upper()) # Output: THE GREAT GATSBY
print(title.lower()) # Output: the great gatsby
print(title.title()) # Output: The Great Gatsby
print(title.capitalize()) # Output: The great gatsby
# The original is unchanged
print(title) # Output: the great gatsby
Case conversion is essential for case-insensitive comparisons — a pattern you'll use all the time:
user_input = input("Continue? (yes/no): ")  # User might type "YES", "Yes", "yes"
if user_input.lower() == "yes":
    print("Continuing...")
Searching
text = "Dr. Patel studies DNA sequences in her laboratory"
# find() returns the index of the first occurrence, or -1 if not found
print(text.find("DNA")) # Output: 18
print(text.find("RNA")) # Output: -1
# count() returns how many times a substring appears
print(text.count("a")) # Output: 3
# startswith() and endswith() return True/False
print(text.startswith("Dr.")) # Output: True
print(text.endswith("lab")) # Output: False
print(text.endswith("laboratory")) # Output: True
# in operator (not a method, but essential for searching)
print("DNA" in text) # Output: True
print("RNA" in text) # Output: False
💡 Tip: Use find() when you need the position of a substring. Use the in operator when you just need to know if it's present. There's also index(), which works like find() but raises a ValueError instead of returning -1. Prefer find() unless you want the exception.
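The three search tools behave differently when the substring is missing. The sketch below contrasts them; it uses try/except, which is covered properly in Chapter 11, only to show that index() raises an exception:

```python
text = "Dr. Patel studies DNA sequences in her laboratory"

print("RNA" in text)     # False
print(text.find("RNA"))  # -1

try:
    text.index("RNA")
except ValueError:
    print("index() raised ValueError")

# A common pitfall: -1 is a valid (negative) index, so always check
# the result of find() before using it to index or slice.
position = text.find("RNA")
if position != -1:
    print(text[position:position + 3])
else:
    print("not found")
```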
Splitting and Joining
split() and join() are arguably the most powerful string methods. They convert between strings and lists.
# split() breaks a string into a list of substrings
csv_line = "Patel,Anika,Biology,University of Michigan"
fields = csv_line.split(",")
print(fields) # Output: ['Patel', 'Anika', 'Biology', 'University of Michigan']
print(fields[2]) # Output: Biology
# split() with no argument splits on whitespace (spaces, tabs, newlines)
sentence = "The quick brown fox"
words = sentence.split()
print(words) # Output: ['The', 'quick', 'brown', 'fox']
# Note: multiple spaces are treated as one separator
# join() does the opposite — combines a list into a string
words = ["Hello", "World"]
result = " ".join(words)
print(result) # Output: Hello World
# join() with different separators
print(", ".join(words)) # Output: Hello, World
print("-".join(words)) # Output: Hello-World
print("".join(words)) # Output: HelloWorld
The join() syntax feels backwards at first — separator.join(list) instead of list.join(separator). Think of it as: "I'm a separator, and I'm joining these pieces together."
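The difference between split() and split(" ") only shows up with irregular spacing. In this sketch the spacing is built explicitly so it is easy to see; the variable names are illustrative:

```python
# Three spaces, then two spaces, then one space between words.
messy = "The" + " " * 3 + "quick" + " " * 2 + "brown fox"

print(messy.split())      # ['The', 'quick', 'brown', 'fox']
print(messy.split(" "))   # ['The', '', '', 'quick', '', 'brown', 'fox']

# split() followed by " ".join() collapses runs of whitespace.
normalized = " ".join(messy.split())
print(normalized)         # The quick brown fox
```

With an explicit separator, every single space counts as a boundary, so consecutive spaces produce empty strings in the result.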
Stripping Whitespace
messy = " Hello, World! \n"
print(messy.strip()) # Output: Hello, World! (both sides)
print(messy.lstrip()) # Output: Hello, World! \n (left only)
print(messy.rstrip()) # Output: Hello, World! (right only)
# strip() is essential when reading user input or file data
user_input = input("Enter your name: ") # User types " Alice "
clean_name = user_input.strip() # "Alice"
Replacing
text = "I love Java. Java is great!"
new_text = text.replace("Java", "Python")
print(new_text) # Output: I love Python. Python is great!
# Replace with a limit (third argument)
new_text = text.replace("Java", "Python", 1)
print(new_text) # Output: I love Python. Java is great!
# Remove characters by replacing with empty string
phone = "(555) 867-5309"
digits_only = phone.replace("(", "").replace(")", "").replace(" ", "").replace("-", "")
print(digits_only) # Output: 5558675309
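Chaining replace() works when you know every character to remove. An alternative sketch, using the isdigit() method covered in Section 7.7, keeps only the characters you want instead:

```python
phone = "(555) 867-5309"

# Collect only the digit characters, then join them back together.
digits = []
for ch in phone:
    if ch.isdigit():
        digits.append(ch)

digits_only = "".join(digits)
print(digits_only)  # 5558675309
```

This version also copes with separators you did not anticipate, such as dots or extra spaces.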
Text Adventure: Processing Player Commands
Let's see these methods in action. In Crypts of Pythonia, the text adventure game, we need to process whatever the player types into a standardized format:
def process_command(raw_input):
    """Clean and parse a player command."""
    # Strip whitespace and convert to lowercase
    cleaned = raw_input.strip().lower()
    # Split into words
    words = cleaned.split()
    if not words:
        return None, None
    # First word is the action, rest is the target
    action = words[0]
    target = " ".join(words[1:]) if len(words) > 1 else None
    return action, target

# Test with messy player input
commands = [
    " GO north ",
    "TAKE rusty sword",
    " look ",
    "use health potion",
]
for cmd in commands:
    action, target = process_command(cmd)
    print(f"Action: {action!r:12s} Target: {target!r}")
Output:
Action: 'go'         Target: 'north'
Action: 'take'       Target: 'rusty sword'
Action: 'look'       Target: None
Action: 'use'        Target: 'health potion'
Notice how strip(), lower(), and split() work together to normalize wildly inconsistent input into a clean, predictable format. This is real-world string processing in miniature.
Quick Reference: Common String Methods
| Method | What It Does | Returns | Example |
|---|---|---|---|
| `upper()` | All uppercase | new str | `"hi".upper()` → `"HI"` |
| `lower()` | All lowercase | new str | `"HI".lower()` → `"hi"` |
| `strip()` | Remove leading/trailing whitespace | new str | `" hi ".strip()` → `"hi"` |
| `split(sep)` | Break into list | list | `"a,b".split(",")` → `["a","b"]` |
| `join(list)` | Combine list into string | new str | `",".join(["a","b"])` → `"a,b"` |
| `replace(old, new)` | Replace occurrences | new str | `"ab".replace("a","x")` → `"xb"` |
| `find(sub)` | Index of first match, or -1 | int | `"abc".find("b")` → `1` |
| `count(sub)` | Number of occurrences | int | `"aaba".count("a")` → `3` |
| `startswith(s)` | Starts with prefix? | bool | `"abc".startswith("ab")` → `True` |
| `endswith(s)` | Ends with suffix? | bool | `"abc".endswith("bc")` → `True` |
| `title()` | Title Case | new str | `"hi there".title()` → `"Hi There"` |
| `capitalize()` | First char uppercase | new str | `"hi there".capitalize()` → `"Hi there"` |
7.6 Processing Text
Now that you know the individual methods, let's combine them with the loops from Chapter 5 to process text in three different ways.
Iterating Character by Character
A for loop over a string gives you one character at a time:
def count_nucleotides(sequence):
    """Count each nucleotide in a DNA sequence."""
    counts = {"A": 0, "T": 0, "C": 0, "G": 0}
    for char in sequence.upper():
        if char in counts:
            counts[char] += 1
    return counts

dna = "ATCGATCGATCG"
result = count_nucleotides(dna)
print(result) # Output: {'A': 3, 'T': 3, 'C': 3, 'G': 3}
This is Dr. Patel's bread-and-butter operation — counting nucleotides is the "Hello, World!" of bioinformatics.
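When you only need a couple of totals, the count() method from Section 7.5 can replace the explicit loop. The sketch below computes GC content (the fraction of G and C bases); gc_content() is a hypothetical helper, not part of Dr. Patel's code as described in the chapter:

```python
def gc_content(sequence):
    """Return the fraction of G and C nucleotides in a DNA sequence.

    Hypothetical helper; returns 0.0 for an empty sequence.
    """
    sequence = sequence.upper()
    if not sequence:
        return 0.0
    gc = sequence.count("G") + sequence.count("C")
    return gc / len(sequence)


print(gc_content("ATCGATCGATCG"))  # 0.5
```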
Iterating Word by Word
Split the string first, then iterate:
def word_frequency(text):
    """Count how often each word appears."""
    words = text.lower().split()
    freq = {}
    for word in words:
        # Strip punctuation from each word
        clean_word = word.strip(".,!?;:'\"()")
        if clean_word:
            freq[clean_word] = freq.get(clean_word, 0) + 1
    return freq

sample = "To be or not to be, that is the question."
result = word_frequency(sample)
for word, count in sorted(result.items()):
    print(f" {word}: {count}")
Output:
be: 2
is: 1
not: 1
or: 1
question: 1
that: 1
the: 1
to: 2
🔄 Spaced Review (Ch 5): This pattern — looping over a sequence to build up a result — is exactly the accumulator pattern from Chapter 5. Here the accumulator is a dictionary instead of a number, but the structure is the same.
Iterating Line by Line
Multi-line strings (using triple quotes) or strings read from files contain newline characters. Use splitlines() or split("\n") to process them:
def parse_fasta_header(fasta_text):
    """Extract sequence headers from FASTA-formatted text."""
    headers = []
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Remove the '>' and strip whitespace
            header = line[1:].strip()
            headers.append(header)
    return headers

fasta_data = """>gi|12345|ref|NM_001.2| BRCA1 gene
ATCGATCGATCGATCGATCG
GCTAGCTAGCTAGCTAGCTA
>gi|67890|ref|NM_002.1| TP53 gene
TTTTAAAACCCCGGGG
>gi|11111|ref|NM_003.3| MYC gene
AAAGGGCCCTTTTAAA"""
headers = parse_fasta_header(fasta_data)
for h in headers:
    print(h)
Output:
gi|12345|ref|NM_001.2| BRCA1 gene
gi|67890|ref|NM_002.1| TP53 gene
gi|11111|ref|NM_003.3| MYC gene
This is a simplified version of what Dr. Patel does every day. Real FASTA files can contain millions of sequences, but the parsing logic is exactly this.
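Headers are only half of a FASTA record. As a hedged extension of the same idea, the sketch below also collects the sequence lines under each header. The function name parse_fasta() and the sample data are assumptions made for illustration, and the code assumes headers are unique:

```python
def parse_fasta(fasta_text):
    """Return a dict mapping each header to its full sequence.

    A simplified sketch: assumes every sequence line follows a '>'
    header line, and that headers are unique.
    """
    records = {}
    current = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            current = line[1:].strip()    # start a new record
            records[current] = ""
        elif current is not None:
            records[current] += line      # append to the open record
    return records


sample = ">seq1 demo\nATCG\nGGCC\n>seq2 demo\nTTAA"
for header, sequence in parse_fasta(sample).items():
    print(f"{header}: {sequence} ({len(sequence)} bases)")
```

Sequence lines that belong to the same record are concatenated, which matches how multi-line sequences are laid out in the example data above.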
7.7 Input Validation with Strings
Here's a scenario you've encountered if you've built any interactive programs: you ask the user for a number, and they type "forty-two". Your program calls int("forty-two") and crashes with a ValueError.
The fix is to validate first, convert second. Python strings have built-in methods for checking what kind of characters they contain:
test_strings = ["42", "3.14", "hello", "Hello42", " ", ""]
for s in test_strings:
    print(f"{s!r:10s} isdigit={str(s.isdigit()):5s} "
          f"isalpha={str(s.isalpha()):5s} "
          f"isalnum={str(s.isalnum()):5s}")
Output:
'42'       isdigit=True  isalpha=False isalnum=True
'3.14'     isdigit=False isalpha=False isalnum=False
'hello'    isdigit=False isalpha=True  isalnum=True
'Hello42'  isdigit=False isalpha=False isalnum=True
' '        isdigit=False isalpha=False isalnum=False
''         isdigit=False isalpha=False isalnum=False
Key things to notice:
- isdigit() returns True only if every character is a digit. It doesn't handle decimals, negatives, or spaces.
- isalpha() returns True only if every character is a letter. Spaces and numbers disqualify it.
- isalnum() returns True if every character is a letter or a digit.
- Empty strings return False for all three.
Grade Calculator: Validating Score Input
Let's apply this to the grade calculator running example. We need to make sure the user enters an actual number before we try to compute with it:
def get_valid_score(prompt):
    """Keep asking until the user enters a valid integer score (0-100)."""
    while True:
        raw = input(prompt).strip()
        if not raw:
            print(" Please enter a score — don't leave it blank.")
            continue
        if not raw.isdigit():
            print(f" '{raw}' is not a valid number. Enter digits only (0-100).")
            continue
        score = int(raw)  # Safe to convert now: the string is all digits
        if score < 0 or score > 100:
            print(f" {score} is out of range. Enter a score between 0 and 100.")
            continue
        return score

# Usage
score = get_valid_score("Enter test score: ")
print(f"Score recorded: {score}")
Example session:
Enter test score:
Please enter a score — don't leave it blank.
Enter test score: forty-two
'forty-two' is not a valid number. Enter digits only (0-100).
Enter test score: -5
'-5' is not a valid number. Enter digits only (0-100).
Enter test score: 85
Score recorded: 85
⚠️ Caveat: isdigit() doesn't handle negative numbers (the minus sign isn't a digit) or decimal points. It also accepts a few unusual Unicode digit characters, such as the superscript ², that int() rejects. For more sophisticated numeric validation, you'll learn about try/except in Chapter 11, which is the Pythonic way to handle this. For now, isdigit() handles the most common case, non-negative integers, cleanly.
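Until try/except arrives, signed integers can still be checked with the string tools from this chapter. The sketch below uses a hypothetical helper, looks_like_int(), that removes one optional sign before calling isdigit(). The isascii() check (available in Python 3.7 and later) is an extra guard against digit-like Unicode characters such as ², which isdigit() accepts but int() rejects:

```python
def looks_like_int(text):
    """Return True for strings like '42', '-5', or '+7' (hypothetical helper)."""
    text = text.strip()
    # Drop one optional leading sign, then require ASCII digits only.
    if text.startswith(("-", "+")):
        text = text[1:]
    return text.isascii() and text.isdigit()


for candidate in ["42", "-5", "+7", "3.14", "--5", "", "-"]:
    print(f"{candidate!r:8s} {looks_like_int(candidate)}")
```

Only the first three candidates pass; decimals, doubled signs, empty strings, and a lone sign are all rejected.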
🔄 Check Your Understanding #2
What does " 123 ".strip().isdigit() return? What about "12.5".isdigit()?
Answer
`" 123 ".strip().isdigit()` returns `True`. The `strip()` removes whitespace, leaving `"123"`, and `isdigit()` confirms all characters are digits. `"12.5".isdigit()` returns `False`. The decimal point `.` is not a digit.
7.8 Advanced Formatting with f-Strings
You've been using basic f-strings since Chapter 3: f"Hello, {name}!". But f-strings have a powerful formatting mini-language that lets you control exactly how values are displayed.
The syntax is {value:format_spec}, where the format spec comes after a colon.
Width and Alignment
# Right-aligned in a field of 10 characters (default for numbers)
print(f"{'Price':>10s}: {'Amount':>10s}")
print(f"{'='*10:s}: {'='*10:s}")
print(f"{9.99:>10.2f}: {3:>10d}")
print(f"{149.50:>10.2f}: {1:>10d}")
print(f"{1099.00:>10.2f}: {2:>10d}")
Output:
     Price:     Amount
==========: ==========
      9.99:          3
    149.50:          1
   1099.00:          2
The alignment characters:
- < left-align (default for strings)
- > right-align (default for numbers)
- ^ center
# Alignment examples
name = "Python"
print(f"|{name:<20s}|") # Left-aligned
print(f"|{name:>20s}|") # Right-aligned
print(f"|{name:^20s}|") # Centered
print(f"|{name:*^20s}|") # Centered with fill character
Output:
|Python              |
|              Python|
|       Python       |
|*******Python*******|
Decimal Places
pi = 3.141592653589793
print(f"Default: {pi}") # 3.141592653589793
print(f"2 decimals: {pi:.2f}") # 3.14
print(f"4 decimals: {pi:.4f}") # 3.1416 (rounds!)
print(f"0 decimals: {pi:.0f}") # 3
Thousands Separator
population = 8045311
print(f"Population: {population:,}") # 8,045,311
print(f"Population: {population:_}") # 8_045_311
print(f"Budget: ${2_500_000.75:,.2f}") # $2,500,000.75
Percentage
ratio = 0.8567
print(f"Pass rate: {ratio:.1%}") # 85.7%
print(f"Pass rate: {ratio:.0%}") # 86%
Combining Format Specs
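Format specs can be combined. The most common parts appear in this order: {value:fill_char alignment width .precision type}. A thousands separator, when used, goes between the width and the precision, as in {x:>12,.2f}.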
# Practical example: formatted report
students = [
("Alice Chen", 92.567, 0.945),
("Bob Martinez", 87.333, 0.892),
("Carol Washington", 95.100, 0.971),
]
print(f"{'Student':<20s} {'Average':>8s} {'Attendance':>11s}")
print("-" * 41)
for name, avg, attend in students:
print(f"{name:<20s} {avg:>8.1f} {attend:>10.1%}")
Output:
Student Average Attendance
-----------------------------------------
Alice Chen 92.6 94.5%
Bob Martinez 87.3 89.2%
Carol Washington 95.1 97.1%
This is what professional output looks like — clean columns, consistent alignment, appropriate precision. You'll use this pattern in the TaskFlow project below.
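Column widths don't have to be hard-coded. A format spec can itself contain a replacement field, so the width can come from a variable. The sketch below sizes the name column to the longest name; name_width and rows are illustrative names, not part of TaskFlow:

```python
students = [
    ("Alice Chen", 92.567),
    ("Bob Martinez", 87.333),
    ("Carol Washington", 95.100),
]

# Size the name column to fit the longest name instead of hard-coding 20.
name_width = 0
for name, _ in students:
    name_width = max(name_width, len(name))

rows = []
for name, avg in students:
    # {name_width} is evaluated first and becomes the field width.
    rows.append(f"{name:<{name_width}s} {avg:>6.1f}")

for row in rows:
    print(row)
```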
7.9 Escape Characters and Raw Strings
Some characters can't be typed directly into a string. You need escape characters — special sequences that start with a backslash (\).
Common Escape Characters
# Newline: \n
print("Line one\nLine two")
# Output:
# Line one
# Line two
# Tab: \t
print("Name\tAge\tCity")
print("Alice\t30\tNew York")
# Output:
# Name Age City
# Alice 30 New York
# Backslash: \\
print("C:\\Users\\Documents\\file.txt")
# Output: C:\Users\Documents\file.txt
# Quote inside a string: \" or \'
print("She said, \"Hello!\"")
# Output: She said, "Hello!"
print('It\'s a beautiful day')
# Output: It's a beautiful day
Escape Character Reference
| Escape | Meaning | Example Output |
|---|---|---|
| `\n` | Newline | Line break |
| `\t` | Tab | Horizontal tab |
| `\\` | Literal backslash | `\` |
| `\"` | Double quote | `"` |
| `\'` | Single quote | `'` |
| `\0` | Null character | (empty) |
Raw Strings
Sometimes you need a string with lots of backslashes — file paths on Windows, regular expressions (Chapter 22), or LaTeX formulas. Escaping every backslash gets tedious:
# Without raw string — need to double every backslash
path = "C:\\Users\\Patel\\Documents\\sequences\\data.fasta"
# With raw string — backslashes are literal
path = r"C:\Users\Patel\Documents\sequences\data.fasta"
print(path) # Output: C:\Users\Patel\Documents\sequences\data.fasta
A raw string is created by prefixing the string with r or R. Inside a raw string, backslashes are treated as literal characters — no escape processing happens.
# Regular string: \n is a newline
print("Hello\nWorld")
# Output:
# Hello
# World
# Raw string: \n is literally backslash-n
print(r"Hello\nWorld")
# Output: Hello\nWorld
💡 Tip: You'll use raw strings extensively in Chapter 22 when we cover regular expressions. For now, just know they exist and that they're useful for Windows file paths and any string with literal backslashes.
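Comparing lengths is a simple way to see what the r prefix changes. This minimal sketch also shows that a raw string is just another way of writing a literal; the resulting value is an ordinary str:

```python
regular = "a\tb"    # three characters: a, tab, b
raw = r"a\tb"       # four characters: a, backslash, t, b

print(len(regular))  # 3
print(len(raw))      # 4

# A raw string is only a different way to write a literal; the result
# is an ordinary str. These two spell the same value:
print(r"C:\Users\Patel" == "C:\\Users\\Patel")  # True
```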
🔄 Check Your Understanding #3
What does the following code print?
print("A\tB\tC")
print(r"A\tB\tC")
print("Line1\nLine2")
print(len("Hello\n"))
Answer
A B C
A\tB\tC
Line1
Line2
6
The first `print` outputs tab-separated characters. The second, being a raw string, outputs the literal backslash-t characters. The third outputs two lines. The `len("Hello\n")` is 6 because `\n` is a single character (newline), so the string contains H-e-l-l-o-newline = 6 characters.
7.10 Project Checkpoint: TaskFlow v0.6
Time to put everything together. In Chapter 6, we refactored TaskFlow into functions. Now we'll add two new features that use string processing:
- Search tasks by keyword: case-insensitive using lower()
- Formatted task display: aligned columns using f-string formatting
Here's the updated version with the new features highlighted:
"""
TaskFlow v0.6 — Task manager with search and formatted display.
New in v0.6:
- search_tasks(): case-insensitive keyword search
- Formatted display with aligned columns
"""
# --- Task storage ---
tasks = []


def add_task():
    """Prompt for a task description and priority, then add to the list."""
    description = input("Task description: ").strip()
    if not description:
        print(" Task description cannot be empty.")
        return
    priority = input("Priority (high/medium/low): ").strip().lower()
    if priority not in ("high", "medium", "low"):
        print(f" '{priority}' is not valid. Using 'medium'.")
        priority = "medium"
    tasks.append({"description": description, "priority": priority})
    print(f" Added: '{description}' [{priority}]")


def list_tasks():
    """Display all tasks in a formatted table."""
    if not tasks:
        print(" No tasks yet.")
        return
    # Header
    print(f"\n {'#':<4s} {'Description':<35s} {'Priority':>10s}")
    print(f" {'-'*4} {'-'*35} {'-'*10}")
    # Task rows
    for i, task in enumerate(tasks, start=1):
        desc = task["description"]
        pri = task["priority"]
        # Truncate long descriptions
        if len(desc) > 33:
            desc = desc[:30] + "..."
        print(f" {i:<4d} {desc:<35s} {pri:>10s}")
    print(f"\n Total: {len(tasks)} task(s)")


def delete_task():
    """Delete a task by its number."""
    list_tasks()
    if not tasks:
        return
    raw = input("Delete task number: ").strip()
    if not raw.isdigit():
        print(f" '{raw}' is not a valid number.")
        return
    num = int(raw)
    if num < 1 or num > len(tasks):
        print(f" No task #{num}. Enter 1-{len(tasks)}.")
        return
    removed = tasks.pop(num - 1)
    print(f" Deleted: '{removed['description']}'")


def search_tasks():
    """Search tasks by keyword (case-insensitive)."""
    keyword = input("Search keyword: ").strip().lower()
    if not keyword:
        print(" Please enter a search term.")
        return
    matches = []
    for i, task in enumerate(tasks, start=1):
        if keyword in task["description"].lower():
            matches.append((i, task))
    if not matches:
        print(f" No tasks matching '{keyword}'.")
        return
    print(f"\n Found {len(matches)} match(es) for '{keyword}':")
    print(f" {'#':<4s} {'Description':<35s} {'Priority':>10s}")
    print(f" {'-'*4} {'-'*35} {'-'*10}")
    for num, task in matches:
        desc = task["description"]
        if len(desc) > 33:
            desc = desc[:30] + "..."
        print(f" {num:<4d} {desc:<35s} {task['priority']:>10s}")


def show_menu():
    """Display the main menu."""
    print("\n--- TaskFlow v0.6 ---")
    print("1. Add task")
    print("2. List tasks")
    print("3. Delete task")
    print("4. Search tasks")
    print("5. Quit")


def main():
    """Main loop for the TaskFlow application."""
    print("Welcome to TaskFlow v0.6!")
    print("Now with search and formatted display.\n")
    while True:
        show_menu()
        choice = input("\nChoose (1-5): ").strip()
        if choice == "1":
            add_task()
        elif choice == "2":
            list_tasks()
        elif choice == "3":
            delete_task()
        elif choice == "4":
            search_tasks()
        elif choice == "5":
            print("Goodbye!")
            break
        else:
            print(f" '{choice}' is not a valid option.")


if __name__ == "__main__":
    main()
What's New in v0.6
Let's break down the string techniques used:
- strip() on every input() call: cleans up accidental whitespace
- lower() for case-insensitive priority and search: "BUY groceries".lower() becomes "buy groceries", matching a search for "buy" or "groceries"
- isdigit() to validate the delete number before calling int()
- f-string alignment (:<4d, :<35s, :>10s) for clean column output
- String truncation with slicing (desc[:30] + "...") for long descriptions
- in operator for substring search (keyword in task["description"].lower())
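The truncation logic appears in both list_tasks() and search_tasks(). One possible refactoring, sketched here with hypothetical helpers truncate() and format_row() that are not part of TaskFlow v0.6 as listed, puts the slicing and the row layout in one place:

```python
def truncate(text, width=33):
    """Shorten text to at most `width` characters, ending in '...'.

    Hypothetical helper, not part of TaskFlow v0.6 as listed; it
    assumes width is at least 3.
    """
    if len(text) <= width:
        return text
    return text[:width - 3] + "..."


def format_row(number, task):
    """Build one table row using the same column widths as list_tasks()."""
    desc = truncate(task["description"])
    return f" {number:<4d} {desc:<35s} {task['priority']:>10s}"


task = {"description": "Write the chapter 7 lab report on FASTA parsing",
        "priority": "high"}
print(format_row(1, task))
```

Keeping the widths in a single function means a later change to the table layout only has to be made once.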
💡 Looking Ahead: In Chapter 8, we'll convert tasks from dictionaries to tuples and add sorting by priority. In Chapter 9, we'll use dictionaries more fully for category filtering. And in Chapter 10, we'll save tasks to a file — making TaskFlow persistent.
🐛 Debugging Walkthrough: Immutability TypeError
Here's a common debugging scenario. A student writes this code to censor a word in a sentence:
def censor(text, word):
    """Replace a word with asterisks."""
    position = text.find(word)
    if position != -1:
        for i in range(position, position + len(word)):
            text[i] = "*" # TypeError!
    return text

result = censor("The password is secret123", "secret123")
The error: TypeError: 'str' object does not support item assignment
The fix: Use replace() instead:
def censor(text, word):
    """Replace a word with asterisks."""
    replacement = "*" * len(word)
    return text.replace(word, replacement)

result = censor("The password is secret123", "secret123")
print(result) # Output: The password is *********
Or, using slicing and concatenation:
def censor(text, word):
    """Replace a word with asterisks."""
    position = text.find(word)
    if position == -1:
        return text
    stars = "*" * len(word)
    return text[:position] + stars + text[position + len(word):]

result = censor("The password is secret123", "secret123")
print(result) # Output: The password is *********
Both approaches create a new string rather than trying to modify the original. The replace() version is cleaner and handles multiple occurrences automatically.
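The difference in how the two fixes treat repeated words is easy to see side by side. In this sketch the two versions are given distinct, illustrative names (censor_all and censor_first) so they can be compared on the same input:

```python
def censor_all(text, word):
    """Replace every occurrence of word (replace-based version)."""
    return text.replace(word, "*" * len(word))


def censor_first(text, word):
    """Replace only the first occurrence (find-and-slice version)."""
    position = text.find(word)
    if position == -1:
        return text
    return text[:position] + "*" * len(word) + text[position + len(word):]


line = "secret in, secret out"
print(censor_all(line, "secret"))    # ****** in, ****** out
print(censor_first(line, "secret"))  # ****** in, secret out
```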
Chapter Summary
Strings are sequences of characters, and they're the most common data type in real-world software. Here's what you've learned:
- Indexing accesses individual characters: s[0] (first), s[-1] (last)
- Slicing extracts substrings: s[start:stop:step], with stop being exclusive
- Strings are immutable: you can't change them in place, only create new ones
- String methods like split(), join(), strip(), replace(), find(), upper(), and lower() are your daily tools for text processing
- Inspection methods like isdigit(), isalpha(), and isalnum() validate input
- f-string format specs control width, alignment, decimal places, and separators
- Escape characters (\n, \t, \\) represent special characters; raw strings (r"...") disable escape processing
The threshold concept of this chapter, immutability, will come up again when you learn about tuples in Chapter 8, and it becomes even more important when you study mutability and aliasing (the threshold concept of Chapter 8). The contrast between immutable strings and mutable lists is one of the most important distinctions in Python.
What's next: Chapter 8 introduces lists and tuples — mutable and immutable sequences. You'll see how the indexing and slicing you learned here apply to all sequence types, and you'll encounter the flip side of immutability: what happens when objects can be changed in place.