Chapter 14 Quiz: NLP for Business


Multiple Choice

Question 1. Which of the following best describes the purpose of tokenization in an NLP pipeline?

  • (a) Converting text to lowercase to ensure consistency
  • (b) Splitting text into individual units (words, subwords, or characters) for processing
  • (c) Removing common words that carry little meaning
  • (d) Reducing words to their dictionary base forms
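For readers who want to see the operation concretely, here is a minimal word-level tokenization sketch (the regex and function name are illustrative only; production tokenizers, including subword tokenizers such as BPE, split text more finely):

```python
import re

def tokenize(text):
    # Split text into word-level tokens: lowercase, then pull out
    # runs of letters, digits, and apostrophes. A crude stand-in
    # for real tokenizers, which also handle subwords and punctuation.
    return re.findall(r"[a-z0-9']+", text.lower())

tokens = tokenize("The fabric shrank after one wash!")
# tokens -> ['the', 'fabric', 'shrank', 'after', 'one', 'wash']
```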

Question 2. A sentiment analysis model classifies the review "Great, another product that falls apart after one wash" as positive. What is the most likely cause of this error?

  • (a) The model was not trained on enough data.
  • (b) The preprocessing pipeline removed the word "great."
  • (c) The model cannot detect sarcasm and treats "great" as a positive signal without understanding context.
  • (d) The model was trained on a different language.

Question 3. What does the "IDF" in TF-IDF measure?

  • (a) How frequently a term appears in a single document
  • (b) How frequently a term appears across all documents in the corpus
  • (c) How rare a term is across the corpus — terms appearing in fewer documents receive higher IDF scores
  • (d) The sentiment polarity of a term
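The IDF computation can be sketched in a few lines of plain Python (the function and sample corpus are invented for illustration; real libraries apply smoothing, so exact values differ):

```python
import math

def idf(term, docs):
    # Inverse document frequency: log of (corpus size / number of
    # documents containing the term). Rarer terms score higher.
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df)

docs = [{"cotton", "soft"}, {"cotton", "shrank"}, {"cotton", "price"}]
idf("cotton", docs)  # appears in every document -> log(3/3) = 0.0
idf("shrank", docs)  # appears in one document -> log(3/1), higher
```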

Question 4. The vector arithmetic "king - man + woman = queen" demonstrates which property of word embeddings?

  • (a) Word embeddings assign higher values to more important words.
  • (b) Word embeddings capture semantic and relational similarities between words in vector space.
  • (c) Word embeddings can translate between languages.
  • (d) Word embeddings always produce exact mathematical relationships.
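The "king - man + woman" arithmetic can be demonstrated with toy vectors (these 3-dimensional values are invented for illustration; real embeddings are learned, have hundreds of dimensions, and hold such analogies only approximately):

```python
# Hand-made "embeddings" whose dimensions loosely encode
# royalty, maleness, and femaleness -- for illustration only.
vecs = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

def nearest(target, vocab):
    # Return the vocabulary word whose vector has the highest
    # cosine similarity to the target vector.
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = lambda v: sum(x * x for x in v) ** 0.5
        return dot / (norm(a) * norm(b))
    return max(vocab, key=lambda w: cos(vocab[w], target))

# king - man + woman, element-wise
q = [k - m + w for k, m, w in zip(vecs["king"], vecs["man"], vecs["woman"])]
result = nearest(q, {w: v for w, v in vecs.items() if w != "king"})
# result -> 'queen'
```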

Question 5. A customer review states: "The material is soft and comfortable, but the return process was a complete nightmare." What type of NLP analysis would be most appropriate to capture the distinct sentiments about material and returns?

  • (a) Document-level sentiment analysis
  • (b) Topic modeling
  • (c) Aspect-based sentiment analysis
  • (d) Named entity recognition

Question 6. Which of the following is NOT an advantage of transformer models over previous NLP architectures?

  • (a) Transformers process all words in a sequence in parallel rather than sequentially.
  • (b) The attention mechanism captures long-range dependencies between words.
  • (c) Transformers require less compute and are cheaper to run than simpler models.
  • (d) Transformers can be pre-trained on large corpora and fine-tuned for specific tasks.

Question 7. Athena Retail Group's ReviewAnalyzer discovered a 340 percent increase in reviews mentioning sustainability over 18 months. This insight was generated by which NLP technique?

  • (a) Sentiment analysis — the system detected increasingly positive sentiment
  • (b) Named entity recognition — the system extracted "sustainability" as an organization name
  • (c) Topic modeling and aspect extraction — the system identified sustainability as an emerging theme in the review corpus
  • (d) Text classification — the system categorized reviews into a predefined "sustainability" category

Question 8. In transfer learning for NLP, what is the primary benefit of using a pre-trained model like BERT?

  • (a) BERT eliminates the need for any task-specific training data.
  • (b) BERT achieves high accuracy on domain-specific tasks with far fewer labeled examples than training from scratch.
  • (c) BERT is faster at inference than simpler models like logistic regression.
  • (d) BERT automatically detects and corrects sarcasm in text.

Question 9. A company trains a sentiment classifier on product reviews and deploys it in production. After six months, the model's accuracy drops from 92 percent to 81 percent, even though the product has not changed. What is the most likely explanation?

  • (a) The model's parameters have degraded over time due to computational wear.
  • (b) Language drift — customer vocabulary, slang, and sentiment expression patterns have evolved since the model was trained.
  • (c) The TF-IDF vectorizer has run out of memory.
  • (d) The training data was corrupted during deployment.

Question 10. Which of the following business applications would benefit LEAST from NLP?

  • (a) Analyzing 50,000 customer support emails to identify the top five complaint categories
  • (b) Predicting next quarter's revenue based on historical sales figures in a spreadsheet
  • (c) Extracting key terms and obligations from a portfolio of 2,000 legal contracts
  • (d) Monitoring social media for real-time brand sentiment during a product launch

Question 11. In the context of NLP preprocessing, what is the difference between stemming and lemmatization?

  • (a) Stemming uses a dictionary to find base forms; lemmatization uses rule-based suffix removal.
  • (b) Stemming removes prefixes; lemmatization removes suffixes.
  • (c) Stemming applies crude rule-based suffix removal (often producing non-words); lemmatization uses vocabulary and morphological analysis to return proper dictionary base forms.
  • (d) Stemming is more accurate than lemmatization but slower.
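The contrast can be made concrete with a toy suffix-stripper and a dictionary lookup (both are simplified sketches; real stemmers like Porter's use ordered rule sets, and real lemmatizers use full morphological analysis):

```python
def crude_stem(word):
    # Rule-based suffix removal in the spirit of a stemmer;
    # often yields non-words ("studies" -> "stud").
    for suffix in ("ies", "ing", "ed", "s"):
        if word.endswith(suffix):
            return word[: -len(suffix)]
    return word

# A lemmatizer consults a vocabulary to return real dictionary forms.
lemma_lookup = {"studies": "study", "ran": "run", "better": "good"}

crude_stem("studies")          # 'stud' -- not a dictionary word
lemma_lookup.get("studies")    # 'study' -- a proper base form
```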

Question 12. A data scientist proposes using a bag-of-words model for a task that requires understanding the difference between "the dog bit the man" and "the man bit the dog." Why would this approach fail?

  • (a) Bag of words cannot handle documents with more than 100 words.
  • (b) Bag of words ignores word order and treats both sentences identically.
  • (c) Bag of words requires too much computational power for short sentences.
  • (d) Bag of words only works with numerical data, not text.
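The failure is easy to verify directly: under bag of words, the two sentences from the question produce identical representations (a minimal sketch using token counts, assuming whitespace tokenization):

```python
from collections import Counter

def bag_of_words(sentence):
    # Bag of words: unordered token counts; word order is discarded.
    return Counter(sentence.lower().split())

a = bag_of_words("the dog bit the man")
b = bag_of_words("the man bit the dog")
a == b  # True: identical vectors despite opposite meanings
```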

Question 13. Named Entity Recognition (NER) would be most valuable for which of the following business tasks?

  • (a) Determining whether a product review is positive or negative
  • (b) Extracting the names of companies, people, and monetary amounts from a collection of news articles for competitive intelligence
  • (c) Grouping 100,000 documents into clusters of similar content
  • (d) Generating automated responses to customer inquiries

Question 14. Latent Dirichlet Allocation (LDA) is an unsupervised topic modeling algorithm. What does "unsupervised" mean in this context?

  • (a) The algorithm does not require any input data.
  • (b) The algorithm discovers topics without predefined category labels — the themes emerge from the data itself.
  • (c) The algorithm does not require any computational resources.
  • (d) The algorithm runs without human configuration or parameter tuning.

Question 15. According to the NLP decision framework in the chapter, when should a business choose a simple TF-IDF + logistic regression approach over a transformer-based model?

  • (a) When the task requires state-of-the-art accuracy and a nuanced understanding of context
  • (b) When the task is well-defined (like ticket routing), speed and cost matter, and 85-90 percent accuracy is sufficient
  • (c) When labeled training data is limited to fewer than 200 examples
  • (d) When the text data contains sarcasm and domain-specific jargon

Short Answer

Question 16. Explain why the word "not" should typically be excluded from stopword lists in sentiment analysis, even though it is a very common English word. Provide a specific example to illustrate.


Question 17. A retail executive says: "We have 2 million customer reviews. Let's just run sentiment analysis and give the results to the product team." Identify two reasons why this approach, while a good start, is insufficient for actionable product insights. What additional NLP techniques from this chapter would you recommend, and why?


Question 18. The chapter describes three discoveries from Athena's ReviewAnalyzer: the sustainability surge, the returns pain point, and early defect detection. For each discovery, explain how it illustrates one of the textbook's five recurring themes (The Hype-Reality Gap, Human-in-the-Loop, Data as a Strategic Asset, The Build-vs-Buy Decision, Responsible Innovation).


Question 19. Tom observes that preprocessing strips emojis from reviews, potentially losing sentiment information. A fire emoji likely signals enthusiasm; a thumbs-down emoji signals dissatisfaction. Propose a preprocessing strategy that preserves emoji sentiment while still converting text to a format suitable for TF-IDF vectorization. Describe the tradeoffs of your approach.


Question 20. Explain the connection between Chapter 9 (Unsupervised Learning / clustering) and topic modeling as discussed in this chapter. How are LDA topics similar to clusters? How do they differ?


Applied Scenario

Question 21. You are the VP of Customer Experience at a mid-sized e-commerce company. Your team currently reads a random sample of 200 customer reviews per week (out of 5,000 received). The CEO asks you to propose an NLP initiative.

  • (a) In three to four sentences, write the business case for NLP-powered review analysis, focusing on what the company is currently missing by sampling only 4 percent of reviews.
  • (b) Which two NLP techniques from this chapter would you deploy first, and why?
  • (c) How would you measure the ROI of this NLP initiative after six months?
  • (d) Identify one risk of deploying NLP for customer review analysis and propose a mitigation strategy.

Selected answers appear in Appendix B. Questions 16-21 are designed for written responses of 100-300 words each.