Chapter 35 Exercises: Natural Language Processing for Business Text
These exercises are organized into five tiers, from foundational to advanced. Complete them in order — each tier builds on skills from the previous one.
Tier 1: Foundations (No Prior NLP Knowledge Required)
Exercise 1.1 — Text Preprocessing Pipeline
Write a function clean_and_tokenize(text) that takes a raw string and returns a list of processed tokens. The function should:
- Convert to lowercase
- Remove punctuation and numbers
- Split into individual words
- Remove words shorter than 3 characters
Test it on the following strings:
"Our Q4 revenue grew by 23.7%! Fantastic results for the team."
"URGENT: Package #98342 has NOT been delivered. Contact us ASAP!!!"
"The customer's satisfaction score was 4.5/5.0 — excellent performance."
Expected output for the first string: ['our', 'revenue', 'grew', 'fantastic', 'results', 'for', 'the', 'team']
Exercise 1.2 — Stopword Removal
Using NLTK, extend your clean_and_tokenize function to remove English stopwords. Compare the output with and without stopword removal for a five-sentence paragraph about a business meeting.
Questions to answer:
- How many words were removed as stopwords?
- What percentage of the original word count remains?
- Are any words removed that you think should have been kept?
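One way to sketch the extension. Note that a small inline stopword set stands in here for NLTK's `stopwords.words('english')`, which requires a one-time `nltk.download('stopwords')`:

```python
import re

# Small inline stopword set standing in for NLTK's full English list.
STOPWORDS = {"the", "and", "was", "for", "our", "are", "with", "that", "this", "has"}

def clean_and_tokenize(text, remove_stopwords=False):
    text = re.sub(r'[^\w\s]', '', text.lower())
    text = re.sub(r'\d+', '', text)
    tokens = [t for t in text.split() if len(t) >= 3]
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens

sentence = "Our Q4 revenue grew by 23.7%! Fantastic results for the team."
with_stops = clean_and_tokenize(sentence)
without = clean_and_tokenize(sentence, remove_stopwords=True)
print(len(with_stops), len(without))  # word counts before and after removal
```

Comparing the two counts answers the first two questions directly; inspect the removed words to answer the third.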
Exercise 1.3 — TextBlob Quick Sentiment
Score the sentiment of these five customer messages using TextBlob and print the polarity and a human-readable label (POSITIVE/NEUTRAL/NEGATIVE):
```python
messages = [
    "Thank you so much for the quick resolution. You've been amazing.",
    "The package arrived but the box was slightly dented.",
    "I've called three times and nobody has helped me. This is ridiculous.",
    "Item received. Works as expected.",
    "Honestly surprised by the quality — much better than I expected!",
]
```
For each message, print: the first 50 characters, the polarity score, and the label.
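A minimal sketch of the polarity-to-label mapping. The ±0.1 threshold is an illustrative choice, not a fixed rule, and the TextBlob call is shown in a comment:

```python
def polarity_label(polarity, threshold=0.1):
    """Map a polarity score in [-1.0, +1.0] to a readable label."""
    if polarity > threshold:
        return "POSITIVE"
    if polarity < -threshold:
        return "NEGATIVE"
    return "NEUTRAL"

# With TextBlob installed, scoring one message looks like:
#   from textblob import TextBlob
#   polarity = TextBlob(message).sentiment.polarity
#   print(message[:50], round(polarity, 2), polarity_label(polarity))
print(polarity_label(0.65), polarity_label(0.0), polarity_label(-0.5))
```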
Tier 2: Core Skills
Exercise 2.1 — Batch Sentiment Analysis
Create a DataFrame with 15 rows representing customer reviews. Include columns: review_id, product_category, review_text, star_rating. Add reviews across three product categories (5 per category).
Apply TextBlob sentiment analysis to create polarity and sentiment_label columns. Then:
- Calculate the average polarity per product category
- Identify the product category with the lowest average sentiment
- Check: does your NLP-based sentiment ranking match the star-rating ranking?
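The per-category aggregation can be sketched with a toy frame. The category names are made up, and the polarity values are hard-coded; in the exercise they would come from applying `TextBlob(text).sentiment.polarity` with `.apply()`:

```python
import pandas as pd

df = pd.DataFrame({
    "product_category": ["shoes", "shoes", "mugs", "mugs", "lamps", "lamps"],
    "polarity": [0.6, 0.2, -0.3, -0.1, 0.1, 0.0],
})
# Average polarity per category, lowest first
avg = df.groupby("product_category")["polarity"].mean().sort_values()
worst = avg.index[0]  # category with the lowest average sentiment
print(avg)
print("Lowest average sentiment:", worst)
```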
Exercise 2.2 — Keyword Frequency Counter
You have been given 20 customer support ticket texts (create your own realistic samples for a fictional e-commerce company). Build a function that:
1. Preprocesses all ticket texts
2. Counts the frequency of each word across all tickets
3. Returns the top 15 keywords as a formatted table with rank, keyword, and count
Which keywords appear most frequently? Do they suggest any business problems?
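A minimal sketch of the counting core using `collections.Counter`; the sample tickets are invented placeholders:

```python
from collections import Counter
import re

def top_keywords(texts, k=15):
    """Count word frequencies across all texts; return (rank, word, count) rows."""
    counts = Counter()
    for text in texts:
        cleaned = re.sub(r'[^\w\s]', '', text.lower())
        counts.update(w for w in cleaned.split() if len(w) >= 3)
    return [(rank, word, n) for rank, (word, n) in enumerate(counts.most_common(k), start=1)]

tickets = ["Refund not received", "Where is my refund?", "Package lost, need refund"]
for rank, word, n in top_keywords(tickets, k=3):
    print(f"{rank:>2}  {word:<12} {n}")
```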
Exercise 2.3 — Simple Text Classifier
Write a function classify_inquiry(text) that classifies a customer inquiry into one of five categories:
- billing
- shipping
- returns
- technical_support
- general
Use keyword matching. Define at least 8 keywords per category. Test your classifier on 10 sample inquiries and report the classification result for each.
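A skeleton of the keyword-matching approach. The keyword lists below are abbreviated illustrations; the exercise asks for at least 8 per category:

```python
CATEGORY_KEYWORDS = {
    "billing": ["invoice", "charge", "charged", "payment", "refund", "bill"],
    "shipping": ["delivery", "shipped", "tracking", "package", "arrive"],
    "returns": ["return", "exchange", "send back", "rma"],
    "technical_support": ["error", "crash", "login", "password", "bug"],
}

def classify_inquiry(text):
    """Return the category whose keywords match most often; 'general' if none match."""
    text = text.lower()
    scores = {cat: sum(kw in text for kw in kws) for cat, kws in CATEGORY_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"

print(classify_inquiry("I was charged twice on my invoice"))
print(classify_inquiry("Hello, what are your opening hours?"))
```

Simple substring matching like this will over-match (e.g. "charge" inside "charger"); word-boundary matching is a worthwhile refinement.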
Exercise 2.4 — N-gram Extraction
Using NLTK's ngrams function, extract all bigrams (2-word phrases) from a collection of product reviews. Find the 10 most common bigrams and the 10 most common trigrams.
How do the results differ from single-word frequency analysis? Give one example of a bigram that provides more business insight than its individual words would.
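The counting step can be sketched without the library: the `zip` expression below produces the same tuples as `nltk.ngrams(tokens, n)`:

```python
from collections import Counter

def ngram_counts(tokens, n):
    """Count n-grams; equivalent to Counter(nltk.ngrams(tokens, n))."""
    return Counter(zip(*(tokens[i:] for i in range(n))))

tokens = "battery life is great but battery life is short".split()
bigrams = ngram_counts(tokens, 2)
print(bigrams.most_common(2))
```

Here "battery life" surfaces as a unit, which is exactly the kind of bigram that carries more business meaning than "battery" and "life" counted separately.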
Tier 3: Applied Business Analysis
Exercise 3.1 — Support Ticket Triage System
Build a complete support ticket triage function that takes a DataFrame of support tickets and returns the same DataFrame sorted by urgency. Urgency should be calculated from:
- Sentiment polarity (more negative = more urgent)
- Ticket age in hours (older = more urgent)
- Presence of specific escalation keywords ("urgent," "manager," "lawyer," "refund demand")
The function should add an urgency_score column (0 to 10) and a priority_label column (HIGH/MEDIUM/LOW). Test it on a DataFrame of at least 15 sample tickets.
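One possible weighting of the three factors; the exercise leaves the exact formula and thresholds to you, so treat the numbers below as assumptions:

```python
ESCALATION_KEYWORDS = ["urgent", "manager", "lawyer", "refund demand"]

def urgency_score(polarity, age_hours, text, max_age=72):
    """Combine sentiment, age, and escalation keywords into a 0-10 score."""
    sentiment_part = (1 - (polarity + 1) / 2) * 4       # 0-4: more negative -> higher
    age_part = min(age_hours, max_age) / max_age * 3    # 0-3: older -> higher (capped)
    keyword_part = 3 if any(k in text.lower() for k in ESCALATION_KEYWORDS) else 0
    return round(sentiment_part + age_part + keyword_part, 2)

def priority_label(score):
    return "HIGH" if score >= 7 else "MEDIUM" if score >= 4 else "LOW"

s = urgency_score(-0.8, 48, "I need a manager NOW")
print(s, priority_label(s))
```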
Exercise 3.2 — Product Review Dashboard
Given a DataFrame of 50+ product reviews (create realistic synthetic data), build an analysis function that produces:
1. Overall sentiment summary (counts and percentages)
2. Average polarity by star rating (1-5) — are 4-star reviews actually more positive in language than 3-star reviews?
3. Top 10 most common words in 5-star reviews
4. Top 10 most common words in 1-star reviews
5. Three words that appear proportionally more in negative reviews than positive reviews
Present the output in a clean, readable format.
Exercise 3.3 — spaCy Named Entity Extraction
Write a function extract_business_entities(text) using spaCy that returns a structured dictionary containing:
- organizations: list of mentioned company names
- people: list of person names
- dates: list of date expressions
- money: list of monetary amounts
- locations: list of cities/countries/states
Test it on:
1. A sample contract excerpt you write yourself (100-200 words)
2. A sample business email (75-150 words)
3. A news article excerpt about a business acquisition (100-200 words)
Which document type yielded the most entities? Which entity type was most reliable?
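The grouping logic can be sketched independently of the model. The sample entity pairs below are invented; with spaCy installed they would come from `[(ent.text, ent.label_) for ent in nlp(text).ents]` after `nlp = spacy.load("en_core_web_sm")`:

```python
# Maps spaCy entity labels to the exercise's dictionary keys.
LABEL_MAP = {"ORG": "organizations", "PERSON": "people", "DATE": "dates",
             "MONEY": "money", "GPE": "locations"}

def group_entities(ent_pairs):
    """Group (entity_text, entity_label) pairs into the required dictionary,
    dropping labels the exercise does not ask for (e.g. CARDINAL)."""
    result = {key: [] for key in LABEL_MAP.values()}
    for text, label in ent_pairs:
        if label in LABEL_MAP:
            result[LABEL_MAP[label]].append(text)
    return result

pairs = [("Acme Corp", "ORG"), ("Jane Doe", "PERSON"), ("$4.2 million", "MONEY"),
         ("March 2024", "DATE"), ("Chicago", "GPE"), ("three", "CARDINAL")]
print(group_entities(pairs))
```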
Exercise 3.4 — Sentiment Trend Over Time
Create a DataFrame of 60 customer reviews with realistic created_date values spanning 6 months and realistic review_text values (some positive, some negative, with a pattern — perhaps reviews get worse in the final two months due to a fictional product defect).
- Calculate average sentiment by month
- Plot a line chart of average polarity over time
- Add a horizontal reference line at y=0 (neutral sentiment)
- Write a two-sentence interpretation of the chart
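The monthly aggregation and chart can be sketched as follows. The dates and polarity values are synthetic stand-ins (in the exercise the polarity column comes from TextBlob), and the output filename is an arbitrary choice:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "created_date": pd.to_datetime(["2024-01-05", "2024-01-20", "2024-02-10",
                                    "2024-02-25", "2024-03-08", "2024-03-30"]),
    "polarity": [0.5, 0.3, 0.2, 0.1, -0.2, -0.4],
})
# Average polarity per calendar month
monthly = df.groupby(df["created_date"].dt.to_period("M"))["polarity"].mean()

ax = monthly.plot(marker="o")
ax.axhline(0, color="gray", linestyle="--")  # neutral reference line
ax.set_ylabel("Average polarity")
plt.savefig("sentiment_trend.png")
print(monthly)
```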
Tier 4: Integration and Advanced Application
Exercise 4.1 — ML Text Classifier
Using scikit-learn, build a machine learning text classifier for customer support tickets:
1. Create a labeled dataset of at least 60 tickets across 4 categories (15 per category). Write the ticket texts yourself — make them realistic.
2. Split 80/20 into train/test sets
3. Build a Pipeline with TfidfVectorizer and MultinomialNB
4. Train and evaluate the model
5. Print the classification report
Questions:
- Which category has the lowest F1 score? Why might that be?
- What would you need to do to improve accuracy?
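The Pipeline wiring looks like this. The six tickets below are invented samples far smaller than the exercise's 60-ticket requirement, so there is no train/test split here, only the mechanics:

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = [
    "I was charged twice for my subscription",
    "Please refund the duplicate payment on my invoice",
    "My package never arrived at the delivery address",
    "Tracking says the shipment is stuck in transit",
    "The app crashes every time I try to log in",
    "I get an error message when resetting my password",
]
labels = ["billing", "billing", "shipping", "shipping",
          "technical_support", "technical_support"]

# TF-IDF features feed directly into the Naive Bayes classifier.
model = Pipeline([("tfidf", TfidfVectorizer()), ("nb", MultinomialNB())])
model.fit(texts, labels)
pred = model.predict(["refund my payment please"])[0]
print(pred)
```

At full dataset size, add `train_test_split` and `classification_report` from scikit-learn for steps 2 and 5.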
Exercise 4.2 — LDA Topic Discovery
Collect or create a dataset of at least 40 short text documents on a business theme (customer reviews, employee feedback, product descriptions — your choice). Build an LDA topic model with 4 topics.
Then:
1. Display the top 8 words per topic
2. Label each topic with your interpretation
3. Assign the dominant topic to each document
4. Build a bar chart showing how many documents belong to each topic
Bonus: Test with 3 topics and 5 topics. Which number produces the most interpretable results? Explain your reasoning.
Exercise 4.3 — Comparative Language Analysis
You have two sets of survey responses: one from clients who renewed contracts and one from clients who did not renew. Analyze whether the language in these two groups differs.
- What words appear more frequently in non-renewal responses?
- Do non-renewal responses mention different topics?
- Is there a measurable sentiment difference?
(You will need to create synthetic data for this exercise — aim for 25 responses per group.)
Exercise 4.4 — Entity Co-occurrence Network
Using spaCy, process 10 news article excerpts about business mergers and acquisitions (write or adapt them). Extract all ORG and PERSON entities. Build a co-occurrence table showing which organizations are mentioned together in the same document.
Which organization appears in the most documents? Which pair of organizations co-occurs most frequently?
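The counting step reduces to combinations over per-document entity lists. The organization names below are invented; in practice each inner list would be the ORG entities spaCy extracted from one article:

```python
from collections import Counter
from itertools import combinations

doc_orgs = [
    ["Acme Corp", "Globex"],
    ["Acme Corp", "Initech"],
    ["Globex", "Acme Corp", "Initech"],
]

# Document frequency per org (set() so repeats within a doc count once)
doc_counts = Counter(org for orgs in doc_orgs for org in set(orgs))

# Co-occurrence counts for each unordered pair within a document
pair_counts = Counter()
for orgs in doc_orgs:
    pair_counts.update(combinations(sorted(set(orgs)), 2))

print("Most-mentioned org:", doc_counts.most_common(1)[0])
print("Top co-occurring pairs:", pair_counts.most_common(2))
```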
Tier 5: Capstone
Exercise 5.1 — Complete NLP Analysis Pipeline
Build a self-contained Python script analyze_feedback.py that performs a complete NLP analysis on a business text dataset. The script should:
- Accept a CSV file path as a command-line argument
- Detect which column contains the main text (or accept a --text-col argument)
- Run sentiment analysis and produce a distribution report
- Perform keyword frequency analysis and display the top 20 terms
- If a --group-col argument is provided, break down sentiment by that column
- Save a summary to analysis_output.csv
- Generate and save two charts: sentiment distribution and (if group column provided) sentiment by group
The script should handle missing data, short responses, and encoding issues gracefully. Include a --help message. Test it on at least two different datasets.
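The command-line interface can be sketched with `argparse`; the flag names follow the exercise spec, and `parse_args` is called with an explicit list here only so the sketch is testable without a real shell:

```python
import argparse

def build_parser():
    """CLI skeleton for analyze_feedback.py (provides --help for free)."""
    parser = argparse.ArgumentParser(
        description="Run sentiment and keyword analysis on a CSV of text feedback.")
    parser.add_argument("csv_path", help="Path to the input CSV file")
    parser.add_argument("--text-col",
                        help="Column containing the main text (auto-detected if omitted)")
    parser.add_argument("--group-col",
                        help="Optional column to break down sentiment by")
    return parser

args = build_parser().parse_args(["feedback.csv", "--text-col", "review_text"])
print(args.csv_path, args.text_col, args.group_col)
```

In the finished script you would call `parse_args()` with no arguments so it reads `sys.argv`.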
Exercise 5.2 — Multi-Dataset Comparison Study
Choose a business domain (e-commerce, healthcare, hospitality, SaaS — your choice). Find or create three different text datasets from that domain:
- Customer reviews
- Social media mentions (can be synthetic)
- Support/complaint text
Apply the full NLP pipeline to each dataset separately. Write a 400-500 word analysis comparing:
- Sentiment profiles across the three sources
- Whether the same topics appear across all three
- Which dataset provides the most actionable business intelligence and why
- What NLP limitations are most apparent in your analysis
Exercise 5.3 — NLP-Powered Weekly Report Generator
Build a Python script that reads a CSV of support tickets filed in the last 7 days and generates a plain-text weekly summary report containing:
- Total ticket count and comparison to previous week (if prior week data is available)
- Sentiment breakdown with trend indicator (↑↓→)
- Top 5 keywords this week
- Most urgent ticket (lowest polarity) with ticket ID and truncated text
- Category breakdown table
- Any categories where sentiment dropped by more than 0.05 vs. prior week
The report should be formatted to be readable when pasted into an email. Include a function send_summary_email(report_text, recipients) stub that prints "Email would be sent to: [recipients]" for now.
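The stub can be as small as this; returning the message (in addition to printing it) is an optional assumption that makes the stub easy to unit-test:

```python
def send_summary_email(report_text, recipients):
    """Stub: prints instead of sending. Swap in smtplib or an email API later."""
    message = f"Email would be sent to: {', '.join(recipients)}"
    print(message)
    return message

report = "WEEKLY SUPPORT SUMMARY\nTotal tickets: 42\n..."
send_summary_email(report, ["ops@example.com", "support-leads@example.com"])
```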
Answer Guidance
Tier 1 Solutions
1.1 Solution sketch:
```python
import re

def clean_and_tokenize(text: str) -> list[str]:
    text = text.lower()
    text = re.sub(r'[^\w\s]', '', text)
    text = re.sub(r'\d+', '', text)
    tokens = text.split()
    return [t for t in tokens if len(t) >= 3]
```
1.3 Polarity expectations: Message 1 ≈ +0.5 to +0.7, Message 3 ≈ -0.4 to -0.6, Message 4 ≈ 0 (neutral). If your scores differ significantly, check that TextBlob corpora are installed correctly.
Common Mistakes to Watch For
- Stopword removal before sentiment analysis: Removing "not" or "never" will flip the meaning of negative sentences. Apply stopword removal only for keyword extraction and topic modeling, not for TextBlob sentiment scoring.
- Forgetting to handle NaN values: Real datasets always have missing text. Always use .fillna('') or check isinstance(text, str) before processing.
- Over-trusting individual polarity scores: A single score of +0.1 does not mean a review is meaningfully positive. Look at distributions and aggregates.
- LDA with too few documents: LDA needs sufficient data to find stable topic patterns. With fewer than 30 documents, topics will be unstable across runs. For small datasets, stick to keyword frequency analysis.