Chapter 7 Quiz: Self-Assessment
Instructions: Answer each question without looking back at the chapter. After completing all questions, check your answers against the key at the bottom. If you score below 70%, revisit the relevant sections before moving on to Chapter 8.
Multiple Choice
Q1. In gradient descent, the "gradient" refers to:
a) The overall shape of the optimization landscape
b) The difference between the local optimum and the global optimum
c) The rate and direction of change in a quantity at a specific point
d) The total distance from the starting point to the solution
Q2. A local optimum is best defined as:
a) The worst solution in the immediate neighborhood
b) A solution that is better than all neighboring solutions but not necessarily the best overall
c) The best possible solution to an optimization problem
d) A saddle point where the gradient is zero in all directions
Q3. In the landscape metaphor, the "height" at a given point represents:
a) The difficulty of reaching that point from the starting position
b) The number of steps required to find that solution
c) The quality of the solution at that point (fitness, cost, loss, etc.)
d) The dimensionality of the problem at that point
Q4. Why does water sometimes get trapped in a closed basin (like the Great Salt Lake) instead of reaching the ocean?
a) Water does not actually perform gradient descent
b) The basin is a local minimum -- every direction leads uphill, but it is not the global minimum
c) The water's step size is too large
d) The landscape is convex, which prevents further movement
Q5. Sewall Wright's adaptive landscape is:
a) A physical terrain used to study water flow
b) An abstract space where each point represents a possible genetic configuration, with height representing fitness
c) A model of market prices over time
d) A neural network architecture designed for optimization
Q6. The chapter argues that the difficulty of an optimization problem is determined primarily by:
a) The sophistication of the algorithm being used
b) The number of dimensions in the search space
c) The shape (topology) of the landscape being navigated
d) The step size of the gradient descent process
Q7. In neural network training, the loss function measures:
a) The number of training examples the network has processed
b) How far the network's predictions are from the correct answers
c) The total number of parameters in the network
d) The learning rate of the gradient descent algorithm
Q8. Ant colonies find efficient foraging paths by:
a) Each ant memorizing the full map of the territory
b) A queen ant directing foragers to food sources
c) Individual ants following pheromone concentration gradients, with positive feedback reinforcing successful trails
d) Ants randomly exploring until they stumble upon food, with no gradient following
Q9. The QWERTY keyboard persists despite potentially superior alternatives. In the language of this chapter, this is an example of:
a) A global optimum
b) A saddle point
c) A local optimum maintained by lock-in effects
d) A convex landscape with a single minimum
Q10. The chapter suggests that local optima may be less of a problem in very high-dimensional spaces because:
a) High-dimensional landscapes are always convex
b) True local minima (where every dimension curves upward) become exponentially rare; saddle points dominate instead
c) Gradient descent is more accurate in higher dimensions
d) The step size automatically increases with dimensionality
Q11. Which of the following is NOT a strategy for escaping local optima, as discussed in the chapter?
a) Adding random noise to the gradient (stochastic gradient descent)
b) Genetic drift in small populations
c) Reducing the step size to zero
d) Pheromone evaporation in ant colonies
Q12. The concept of convergence in gradient descent refers to:
a) The merging of two separate optimization processes
b) The system actually reaching an optimum rather than wandering indefinitely
c) The landscape becoming smoother over time
d) Multiple agents arriving at the same solution simultaneously
Q13. In the market example, what plays the role of the "gradient" that drives prices toward equilibrium?
a) Government regulations
b) The difference between supply and demand at the current price
c) The total volume of goods being traded
d) Historical price trends
Q14. According to the chapter, which of the following statements about gradient descent is TRUE?
a) Gradient descent always finds the global optimum
b) Gradient descent requires global knowledge of the landscape
c) Gradient descent operates on local information and may get stuck in local optima
d) Gradient descent works only in continuous, physical systems
Q15. The vanishing gradient problem occurs when:
a) The gradient becomes so large that the system overshoots the minimum
b) The gradient becomes so small that the system effectively stops making progress
c) Multiple gradients point in contradictory directions
d) The landscape changes faster than the system can follow
True or False
Q16. Gradient descent and gradient ascent are fundamentally different algorithms that apply to different kinds of problems.
Q17. The starting point of a gradient descent process can determine which optimum the system converges to.
Q18. A rugged landscape with many local optima is harder to optimize than a smooth landscape with a single minimum.
Q19. Momentum in gradient descent means the system ignores the gradient and moves in a random direction.
Q20. The fitness landscape metaphor applies only to biological evolution and cannot be meaningfully extended to other domains.
Short Answer
Q21. In two or three sentences, explain why the fitness landscape is considered the threshold concept of this chapter. What changes in your thinking once you grasp it?
Q22. Give one example of a local optimum trap from biology and one from economics. For each, explain why gradient descent cannot escape it.
Q23. The chapter states that "the difficulty of an optimization problem is determined primarily by the shape of the landscape, not by the cleverness of the algorithm." Explain this claim using a concrete example.
Answer Key
Q1. c) The rate and direction of change in a quantity at a specific point.
Q2. b) A solution that is better than all neighboring solutions but not necessarily the best overall.
Q3. c) The quality of the solution at that point (fitness, cost, loss, etc.).
Q4. b) The basin is a local minimum -- every direction leads uphill, but it is not the global minimum.
Q5. b) An abstract space where each point represents a possible genetic configuration, with height representing fitness.
Q6. c) The shape (topology) of the landscape being navigated.
Q7. b) How far the network's predictions are from the correct answers.
Q8. c) Individual ants following pheromone concentration gradients, with positive feedback reinforcing successful trails.
Q9. c) A local optimum maintained by lock-in effects.
Q10. b) True local minima (where every dimension curves upward) become exponentially rare; saddle points dominate instead.
Q11. c) Reducing the step size to zero. (This would stop all movement, making escape impossible.)
Q12. b) The system actually reaching an optimum rather than wandering indefinitely.
Q13. b) The difference between supply and demand at the current price.
Q14. c) Gradient descent operates on local information and may get stuck in local optima.
Q15. b) The gradient becomes so small that the system effectively stops making progress.
Q16. False. They are the same algorithm with opposite signs -- descent minimizes, ascent maximizes.
Q17. True. Different starting points can lead to different local optima (path dependence).
Q18. True. Rugged landscapes have many local optima that can trap gradient descent.
Q19. False. Momentum means the system carries forward velocity from previous gradient steps, creating inertia that helps cross small obstacles.
Q20. False. The landscape metaphor is explicitly cross-domain, applying to markets, engineering, drug design, career planning, and many other fields.
Q21. Sample answer: The fitness landscape is the threshold concept because it reveals that evolution, market pricing, neural network training, and many other processes are all navigating the same kind of abstract space. Once grasped, every optimization problem becomes a landscape with peaks, valleys, and basins of attraction, and the local optimum trap becomes visible everywhere. It transforms "how does this system find solutions?" into "what does this system's landscape look like?"
Q22. Sample answer: Biology -- the vertebrate eye's backward wiring is a local optimum because rewiring would require nonfunctional intermediate forms that selection would eliminate. Economics -- QWERTY keyboard lock-in is a local optimum because switching costs for millions of trained typists exceed the benefit for any individual, even though a better layout might exist.
Q23. Sample answer: Consider finding the lowest point in a smooth bowl versus finding the lowest point in a mountain range with thousands of valleys. In the bowl, even a crude algorithm will find the bottom. In the mountain range, even the most sophisticated algorithm may get stuck in the wrong valley. The landscape's shape (bowl vs. mountain range) matters more than the algorithm's cleverness.
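For readers who want to see the path-dependence answer (Q17) and the local-optimum trap (Q14) concretely, here is a minimal sketch in Python. The double-well function below is a made-up illustration, not one from the chapter: it has two valleys of different depths, and plain gradient descent lands in whichever valley the starting point belongs to.

```python
# A minimal sketch (hypothetical example, not from the chapter):
# gradient descent on a 1-D double-well function with two minima.

def f(x):
    # Two valleys: a shallow one near x ~ -1.2, a deeper one near x ~ +1.3.
    return x**4 - 3 * x**2 - 0.5 * x

def grad(x):
    # Derivative of f: the local "slope" at x.
    return 4 * x**3 - 6 * x - 0.5

def descend(x, step=0.01, iters=2000):
    # Follow the negative gradient; only local information is used.
    for _ in range(iters):
        x -= step * grad(x)
    return x

left = descend(-2.0)       # starts in the left basin
right = descend(2.0)       # starts in the right basin
print(left, right)         # different starting points, different minima
print(f(left) > f(right))  # True: the left valley is the shallower (local) optimum
```

Note that `descend` never "sees" the other valley: each update uses only the slope at the current point, which is exactly why the run started at -2.0 stays trapped in the shallower basin (Q14, Q17).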
Scoring Guide
- Multiple Choice (Q1-Q15): 4 points each = 60 points
- True/False (Q16-Q20): 4 points each = 20 points
- Short Answer (Q21-Q23): 6-7 points each (7 + 7 + 6) = 20 points
- Total: 100 points
- 70% threshold to proceed: 70 points
If you scored below 70%, revisit the following sections based on which questions you missed:
- Q1-Q3: Section 7.2 (What Is a Gradient?)
- Q4-Q6: Sections 7.3 and 7.8 (Water Flowing Downhill, The Landscape Metaphor)
- Q7-Q8: Sections 7.6 and 7.7 (Neural Networks, Ant Foraging)
- Q9-Q11: Sections 7.9-7.10 (Local Optima Traps, Escaping the Trap)
- Q12-Q15: Sections 7.11-7.14 (Convergence, Limitations)
- Q16-Q20: Review the full chapter for conceptual clarity
- Q21-Q23: Focus on Sections 7.8 and 7.9 (The Landscape Metaphor, Local Optima)