Chapter 13 Exercises: Neural Networks Demystified


Section A: Recall and Comprehension

Exercise 13.1 Describe the four steps an artificial neuron performs, using non-technical language. Then explain why the biological neuron analogy, while useful, is misleading if taken too literally.

Exercise 13.2 Define the following terms in your own words, using no more than two sentences each: (a) weight, (b) bias, (c) activation function, (d) loss function, (e) gradient descent, (f) backpropagation, (g) epoch, (h) batch size.
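Several of the terms above (weight, loss function, gradient descent, epoch) can be seen working together in a minimal sketch. This example is illustrative only, not from the chapter: it fits a single-weight model `y_hat = w * x` to one data point using squared-error loss, with an arbitrary starting weight and learning rate.

```python
# A minimal sketch of gradient descent on a single weight, assuming the
# model y_hat = w * x and squared-error loss L = (w*x - y)**2.
# The data point (x=2.0, y=4.0) and the learning rate are illustrative.
x, y = 2.0, 4.0
w = 0.0                       # weight: starts at an arbitrary value
learning_rate = 0.1

for epoch in range(20):       # epoch: one full pass over the (tiny) dataset
    y_hat = w * x             # the model's prediction
    loss = (y_hat - y) ** 2   # loss function: how wrong the prediction is
    grad = 2 * (y_hat - y) * x   # gradient of the loss with respect to w
    w -= learning_rate * grad    # gradient descent: step downhill

print(round(w, 4))  # w converges toward 2.0, since 2.0 * x == y
```

Backpropagation is the procedure that computes gradients like `grad` above automatically for every weight in a multi-layer network, rather than by hand for one weight.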

Exercise 13.3 Explain the difference between the input layer, hidden layers, and output layer of a neural network. What does the word "hidden" mean in this context?

Exercise 13.4 Using Professor Okonkwo's restaurant decision analogy, explain how an artificial neuron makes a decision. Then extend the analogy: what would it mean for the neuron to "learn"?

Exercise 13.5 What is the universal approximation theorem? Why is it both a powerful promise and a potentially misleading one? Explain using the analogy from the chapter.

Exercise 13.6 Describe each of the following activation functions using the analogy from the chapter, and identify when each is typically used:

  • (a) Sigmoid
  • (b) ReLU
  • (c) Softmax
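For reference while answering, the three activation functions named above can be written in a few lines of plain Python. These are the standard textbook formulas, not code from the chapter:

```python
# Standard forms of the three activation functions, in plain Python.
import math

def sigmoid(z):
    # Squashes any number into (0, 1) -- useful for probabilities.
    return 1.0 / (1.0 + math.exp(-z))

def relu(z):
    # Passes positive values through unchanged, zeroes out negatives.
    return max(0.0, z)

def softmax(zs):
    # Turns a list of scores into probabilities that sum to 1.
    # Subtracting the max first is a standard numerical-stability trick.
    m = max(zs)
    exps = [math.exp(z - m) for z in zs]
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))              # 0.5 -- the midpoint of the S-curve
print(relu(-3.0), relu(3.0))     # 0.0 3.0
print(softmax([1.0, 2.0, 3.0]))  # three probabilities summing to 1
```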

Exercise 13.7 Explain overfitting using the "straight-A student" analogy. Then describe three techniques for preventing overfitting and explain how each works.


Section B: Application

Exercise 13.8: Architecture Matching For each of the following business problems, identify the most appropriate neural network architecture (feedforward, CNN, RNN/LSTM, or transformer) and explain your reasoning:

  • (a) Predicting which customers will default on their credit card payments based on demographic and transaction data
  • (b) Automatically categorizing product photos into 500 product categories for an e-commerce catalog
  • (c) Translating customer service emails from Spanish to English
  • (d) Detecting anomalies in manufacturing quality by analyzing photos of finished products
  • (e) Predicting daily stock trading volume based on 30 days of historical market data
  • (f) Generating product descriptions from structured product attributes (size, color, material)
  • (g) Classifying customer sentiment in social media posts
  • (h) Predicting insurance claim amounts based on policyholder attributes and claim details

Exercise 13.9: The Deep Learning Decision Framework Athena Retail Group's data science team has proposed three new deep learning projects. For each one, apply Ravi's five-question decision framework (data type, data volume, accuracy gain, interpretability, total cost of ownership) and recommend whether to proceed with deep learning, use traditional ML, or decline the project entirely. Justify each decision.

  • Project A: Using a neural network to predict weekly demand for each of Athena's 12,000 SKUs, replacing the current XGBoost model. The team estimates a 2 percent improvement in forecast accuracy. Data consists of 3 years of weekly sales data in tabular format. Demand forecasts directly influence purchasing decisions worth $200 million annually.

  • Project B: Building a visual search feature for Athena's mobile app — customers take a photo of a product they see in the real world, and the app finds similar products in Athena's catalog. No existing model can do this; it requires processing images. Athena has 50,000 product images. Customer research indicates this feature would increase mobile conversion by 8-12 percent.

  • Project C: Using deep learning to analyze 2 million customer reviews, extracting specific product complaints (e.g., "zipper broke after two washes," "color faded quickly") and routing them to the relevant product team. The current approach uses keyword matching and captures roughly 40 percent of actionable complaints.

Exercise 13.10: Transfer Learning Cost-Benefit Analysis A mid-size healthcare company wants to build an AI system that identifies skin conditions from photographs. They have 5,000 labeled images of skin conditions collected from their dermatology clinics.

  • (a) Explain why training a CNN from scratch would be impractical with only 5,000 images.
  • (b) Describe how transfer learning could make this project feasible. What kind of pre-trained model would be appropriate, and what data was it likely trained on?
  • (c) Estimate the rough cost difference between training from scratch and using transfer learning (order of magnitude is sufficient).
  • (d) What risks or limitations should the company be aware of when using transfer learning for medical imaging?

Exercise 13.11: GPU Budget Planning You are the VP of Data Science at a company planning three deep learning initiatives for the next fiscal year. Estimate the GPU compute tier (experimentation, production training, or frontier) for each initiative, and explain the factors that drive your estimate:

  • (a) Fine-tuning a pre-trained language model on 100,000 customer service transcripts for intent classification
  • (b) Training a custom object detection model from scratch on 2 million annotated warehouse images
  • (c) Running 50 experiments to determine the optimal hyperparameters for a recommendation model

Exercise 13.12: Vendor Evaluation A vendor pitches the following to your executive team: "Our proprietary deep neural network analyzes your structured customer data — demographics, purchase history, and engagement scores — and predicts churn with 97 percent accuracy. Our model uses 47 hidden layers and was trained on 500 million parameters."

Using the concepts from this chapter, identify at least four red flags or questions you would raise about this pitch. For each, explain what the concern is and what additional information you would request.


Section C: Analysis and Evaluation

Exercise 13.13: The "Start Simple" Debate Professor Okonkwo advocates starting with the simplest model that could work and adding complexity only when the data proves it necessary. Tom argues that this principle is too conservative — that in a fast-moving competitive landscape, companies that wait to prove the need for deep learning will be outpaced by competitors who deploy it aggressively.

  • (a) Present three arguments in favor of the "start simple" principle.
  • (b) Present three arguments against it (i.e., in favor of a "go deep first" approach).
  • (c) Under what conditions might the "go deep first" approach be strategically justified? Give a specific business example.
  • (d) How does the "start simple" principle relate to the broader theme of the Hype-Reality Gap discussed in Chapter 1?

Exercise 13.14: The Interpretability-Accuracy Tradeoff A bank is considering replacing its current logistic regression credit scoring model (which regulators have approved and which loan officers understand) with a deep neural network that improves prediction accuracy by 4 percentage points — reducing defaults by an estimated $12 million per year.

  • (a) What are the business arguments for making the switch?
  • (b) What are the regulatory and ethical arguments against it?
  • (c) What questions would you ask the data science team before making a decision?
  • (d) Propose a compromise approach that might capture some of the accuracy gain while preserving interpretability.
  • (e) How does this scenario connect to the themes of Chapters 25-26 (Bias and Fairness) and Chapter 28 (AI Regulation)?

Exercise 13.15: Explaining Neural Networks to Non-Technical Stakeholders You need to explain to your company's board of directors why the AI team is recommending a deep learning approach for a new product feature. The board has no technical background.

Write a one-page (approximately 400-word) memo that:

  • (a) Explains what a neural network is using at least one original analogy (not one from the chapter)
  • (b) Explains why deep learning is necessary for this particular problem (you may choose any business problem that genuinely requires deep learning)
  • (c) Addresses the cost, including why GPUs are needed
  • (d) Addresses interpretability concerns
  • (e) Provides a clear recommendation with the key tradeoffs

Exercise 13.16: Competitive Intelligence — GPU Economics Research the current pricing for GPU cloud computing from at least two major providers (AWS, Google Cloud, Azure, or a specialized provider like Lambda Labs or CoreWeave).

  • (a) What is the cost per hour for a training-grade GPU instance?
  • (b) What is the cost per hour for an inference-grade GPU instance?
  • (c) If your company needed to train a model for 100 GPU-hours per month, what would the annual cloud GPU cost be?
  • (d) At what point (in terms of GPU-hours per month) might it be more cost-effective to purchase dedicated GPU hardware rather than renting from the cloud? What factors besides hourly cost should influence this decision?
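As a starting point for parts (c) and (d), the arithmetic is simple enough to sketch. The $3.00/hour rate below is a hypothetical placeholder, not a figure from the chapter or from any provider; substitute the current pricing you find in your research:

```python
# Back-of-the-envelope annual cloud GPU cost for part (c).
# The hourly rate is an assumed placeholder -- real rates vary by
# provider, GPU model, region, and commitment terms.
hourly_rate = 3.00           # USD per GPU-hour (hypothetical)
gpu_hours_per_month = 100
months = 12

annual_cost = hourly_rate * gpu_hours_per_month * months
print(f"${annual_cost:,.2f} per year at this rate")
```

For part (d), the same arithmetic inverts: compare the annual rental cost at your expected usage against the purchase price of equivalent hardware, amortized over its useful life, before layering in the non-price factors the question asks about.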

Exercise 13.17: The Forward Pass — A Walk-Through Consider a simple neural network with:

  • 3 input neurons (representing customer age, annual spending, and number of purchases)
  • 1 hidden layer with 2 neurons (using ReLU activation)
  • 1 output neuron (using sigmoid activation, predicting churn probability)

Without performing any calculations, describe in plain English what happens during a single forward pass through this network. Trace the journey of a single customer record from input to output, explaining what each layer does at each step. Use at least one analogy.
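Although the exercise asks for a plain-English account, it may help to see the same 3-2-1 network as code. The weights, biases, and input values below are arbitrary illustrative numbers (none come from the chapter), and the inputs are assumed to be already scaled to comparable ranges:

```python
# A sketch of the 3-2-1 network described above, in plain Python.
# All weights, biases, and inputs are hypothetical illustrative values.
import math

def relu(z):
    return max(0.0, z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# One customer record: [age, annual spending, number of purchases], scaled.
x = [0.4, 0.7, 0.2]

# Hidden layer: 2 neurons, each with 3 weights and a bias.
hidden_weights = [[0.5, -0.3, 0.8], [-0.6, 0.9, 0.1]]
hidden_biases = [0.1, -0.2]
hidden = [
    relu(sum(w * xi for w, xi in zip(ws, x)) + b)
    for ws, b in zip(hidden_weights, hidden_biases)
]

# Output layer: 1 neuron combining the 2 hidden activations.
out_weights = [1.2, -0.7]
out_bias = 0.05
churn_prob = sigmoid(sum(w * h for w, h in zip(out_weights, hidden)) + out_bias)

print(round(churn_prob, 3))  # a single churn probability between 0 and 1
```

Your written answer should narrate what each of these steps means in business terms: what a weighted sum represents, why the hidden layer re-describes the customer, and why the final sigmoid turns a raw score into a probability.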

Exercise 13.18: Deep Learning in Your Industry Choose an industry you know well or plan to work in after your MBA.

  • (a) Identify two applications where deep learning is already creating significant business value in that industry. For each, identify the data type, the architecture likely being used, and the business outcome.
  • (b) Identify one application where deep learning is being attempted but has not yet delivered clear value. What is the likely reason?
  • (c) Identify one application where deep learning could be valuable but is not yet widely attempted. What barriers prevent adoption?

Section D: Integrative Exercises

Exercise 13.19: From Part 2 to Part 3 — Connecting the Chapters Review the machine learning concepts from Chapters 7-12. For each of the following topics, explain how the Part 2 concept relates to or extends into the deep learning concepts covered in Chapter 13:

  • (a) Overfitting (Chapter 11) → Overfitting in deep learning
  • (b) Feature engineering (Chapter 7) → Learned representations in neural networks
  • (c) Model evaluation (Chapter 11) → Evaluating neural networks
  • (d) Build vs. buy (Chapter 6) → Transfer learning
  • (e) MLOps (Chapter 12) → Serving deep learning models in production

Exercise 13.20: Preparing for Part 3 Based on what you learned in Chapter 13, write three specific questions you have about each of the following upcoming chapters. These questions should demonstrate that you understand the foundational concepts from Chapter 13 well enough to anticipate how they will be applied:

  • (a) Chapter 14: NLP for Business
  • (b) Chapter 15: Computer Vision for Business
  • (c) Chapter 17: Generative AI — Large Language Models

Selected answers are available in Appendix: Answers to Selected Exercises.