Chapter 15 Exercises: Computer Vision for Business

DataField.Dev

Section A: Recall and Comprehension

Exercise 15.1 Define the following terms in your own words, using no more than two sentences each: (a) pixel, (b) convolutional filter, (c) feature map, (d) pooling, (e) bounding box, (f) transfer learning.

Exercise 15.2 Explain why images are considered "high-dimensional data." Calculate the number of individual data points in a 1280 x 720 color image, and compare it to the typical number of fields in a structured customer database record.
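
As a way to check the arithmetic in Exercise 15.2, the count can be sketched in a few lines of Python. This is a minimal sketch that assumes a standard three-channel (RGB) image with one value per pixel per channel; the function name `image_data_points` is hypothetical and not from the chapter.

```python
def image_data_points(width: int, height: int, channels: int = 3) -> int:
    """Return the number of individual values stored for an image.

    ASSUMPTION: one numeric value per pixel per channel (3 channels for RGB).
    """
    return width * height * channels


# Resolution taken from Exercise 15.2.
points = image_data_points(1280, 720)
print(f"1280 x 720 color image: {points:,} data points")
```

Changing `channels` to 1 gives the count for a grayscale image, which makes the contribution of color explicit when comparing against a database record.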

Exercise 15.3 Describe the hierarchical feature detection process in a CNN. What types of patterns are detected at early layers vs. deep layers, and why does this hierarchy emerge through training rather than being manually programmed?

Exercise 15.4 Compare and contrast image classification, object detection, and image segmentation. For each task, state: (a) what the model outputs, (b) the relative data labeling cost, and (c) one business application where that task is the appropriate choice.

Exercise 15.5 Explain the difference between semantic segmentation and instance segmentation. Provide a business scenario where the distinction matters — where semantic segmentation would be insufficient and instance segmentation would be required.

Exercise 15.6 What is the ImageNet dataset, and why was it significant for the development of computer vision? How did it enable transfer learning to become the standard approach for business CV applications?

Exercise 15.7 List the seven provisions of Athena's computer vision governance policy. For each provision, explain the specific risk it mitigates.

Section B: Application

Exercise 15.8: CV Task Selection
For each of the following business scenarios, identify the most appropriate computer vision task (classification, object detection, semantic segmentation, or instance segmentation) and justify your choice. For each, also indicate whether a cloud API or custom model would be more appropriate.

(a) A logistics company wants to automatically sort packages by size category (small, medium, large, oversized) on a conveyor belt.
(b) An insurance company wants to estimate vehicle damage from photographs submitted with claims.
(c) A real estate platform wants to automatically tag listing photos (kitchen, bathroom, living room, exterior, etc.).
(d) A construction company wants to monitor a building site and count the number of workers wearing hard hats vs. not wearing hard hats.
(e) A dermatology clinic wants to precisely outline suspicious skin lesions in photographs to track changes over time.
(f) A fashion retailer wants to automatically remove backgrounds from product photographs for its website.

Exercise 15.9: Transfer Learning Decision
A mid-sized manufacturing company produces custom metal brackets. They have identified a quality inspection problem: approximately 3% of brackets have surface defects (scratches, pitting, or discoloration) that are currently caught by human inspectors at a rate of about 88%. The company has:

- 2,000 photographs of brackets (both good and defective)
- A budget of $75,000 for the CV project
- No in-house machine learning expertise
- A production line speed of 60 brackets per minute

(a) Design a transfer learning approach for this problem. Which pre-trained model would you recommend and why?
(b) Estimate the data labeling requirements and costs.
(c) Evaluate whether edge or cloud deployment is more appropriate given the production line speed.
(d) Calculate the potential ROI, assuming each missed defect costs $1,200 in warranty and return costs.
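
For parts (c) and (d) of Exercise 15.9, the following Python sketch may help structure the arithmetic. The defect rate, human catch rate, line speed, and cost per missed defect come from the exercise. The exercise does not state annual production volume or the CV model's catch rate, so `units_per_year` and `model_catch_rate` are hypothetical placeholders to be replaced with your own stated assumptions.

```python
# Figures stated in Exercise 15.9.
BRACKETS_PER_MINUTE = 60
DEFECT_RATE = 0.03
HUMAN_CATCH_RATE = 0.88
COST_PER_MISSED_DEFECT = 1_200


def seconds_per_bracket(brackets_per_minute: float) -> float:
    """Time available per part for image capture, inference, and the reject decision."""
    return 60.0 / brackets_per_minute


def annual_missed_defect_cost(units_per_year: float, defect_rate: float,
                              catch_rate: float, cost_per_miss: float) -> float:
    """Expected yearly cost of defects that slip past inspection."""
    missed_defects = units_per_year * defect_rate * (1.0 - catch_rate)
    return missed_defects * cost_per_miss


# ASSUMPTIONS (not given in the exercise): annual volume and model catch rate.
units_per_year = 1_000_000
model_catch_rate = 0.97

baseline = annual_missed_defect_cost(
    units_per_year, DEFECT_RATE, HUMAN_CATCH_RATE, COST_PER_MISSED_DEFECT)
with_cv = annual_missed_defect_cost(
    units_per_year, DEFECT_RATE, model_catch_rate, COST_PER_MISSED_DEFECT)

print(f"Time budget per bracket: {seconds_per_bracket(BRACKETS_PER_MINUTE):.2f} s")
print(f"Baseline missed-defect cost: ${baseline:,.0f}")
print(f"Missed-defect cost with CV:  ${with_cv:,.0f}")
```

The per-bracket time budget is the figure to compare against cloud round-trip latency in part (c). The difference between the two cost lines, set against the $75,000 budget, gives one possible ROI estimate for part (d).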

Exercise 15.10: Cloud API Comparison
Visit the online demos for at least two of the following cloud vision APIs: Google Cloud Vision, AWS Rekognition, and Azure Computer Vision. Upload the same five images (include at least one retail shelf, one document, and one outdoor scene). For each image:

(a) Compare the labels, detected objects, and confidence scores returned by each service.
(b) Identify cases where one service outperformed the other.
(c) Note any cases where the results were incorrect or misleading.
(d) Based on your comparison, under what circumstances would you choose each service?

Exercise 15.11: Shelf Analytics Business Case
You are the VP of Operations at a grocery chain with 120 stores, each with approximately 150 shelf sections. Current shelf auditing is performed manually: each store has two employees who spend 3 hours per day conducting visual shelf checks, at an average labor cost of $22 per hour. Their detection accuracy for out-of-stock positions is estimated at 82%.

A computer vision vendor proposes an automated shelf analytics system with the following costs:

- Hardware (cameras + edge devices): $8,500 per store
- Software license: $350 per store per month
- Implementation and integration: $180,000 (one-time)
- Training and change management: $45,000 (one-time)

The vendor claims 95% detection accuracy and estimates a 10% reduction in lost sales from out-of-stock items. Your current estimated annual lost sales from OOS across all stores is $18 million.

(a) Calculate the total first-year cost of the CV system.
(b) Calculate the estimated first-year benefit.
(c) Determine the payback period.
(d) Identify three risks that could cause actual results to fall short of the vendor's claims.
(e) What pilot structure would you propose before committing to full deployment?
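
The cost model in Exercise 15.11 can be laid out as a small Python sketch. All dollar figures and the store count are taken from the exercise. The payback formula is only one possible interpretation (one-time costs divided by monthly benefit net of the license fee), and it ignores any labor savings. The exercise does not state how many days per year audits run, so `days_per_year` is left as a parameter you must supply.

```python
# Figures stated in Exercise 15.11.
STORES = 120
HARDWARE_PER_STORE = 8_500
LICENSE_PER_STORE_PER_MONTH = 350
IMPLEMENTATION = 180_000
TRAINING = 45_000
ANNUAL_OOS_LOST_SALES = 18_000_000
CLAIMED_LOST_SALES_REDUCTION = 0.10
AUDITORS_PER_STORE = 2
AUDIT_HOURS_PER_DAY = 3
LABOR_COST_PER_HOUR = 22


def one_time_cost() -> int:
    """Hardware for every store plus the two one-time fees."""
    return STORES * HARDWARE_PER_STORE + IMPLEMENTATION + TRAINING


def first_year_cost() -> int:
    """One-time costs plus twelve months of software licenses."""
    return one_time_cost() + STORES * LICENSE_PER_STORE_PER_MONTH * 12


def first_year_benefit() -> float:
    """Recovered sales if the vendor's claimed reduction holds."""
    return ANNUAL_OOS_LOST_SALES * CLAIMED_LOST_SALES_REDUCTION


def payback_months() -> float:
    """ASSUMPTION: simple payback, benefit accrues evenly, no labor savings."""
    monthly_net = first_year_benefit() / 12 - STORES * LICENSE_PER_STORE_PER_MONTH
    return one_time_cost() / monthly_net


def annual_manual_audit_labor(days_per_year: int) -> int:
    """Current manual audit labor cost; days_per_year is your own assumption."""
    daily = STORES * AUDITORS_PER_STORE * AUDIT_HOURS_PER_DAY * LABOR_COST_PER_HOUR
    return daily * days_per_year


print(f"First-year cost:    ${first_year_cost():,}")
print(f"First-year benefit: ${first_year_benefit():,.0f}")
print(f"Payback period:     {payback_months():.1f} months")
```

Whether the manual audit labor counts as a benefit depends on whether those hours are actually eliminated or redeployed, which is worth stating explicitly in your answer to part (b).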

Exercise 15.12: Visual Search Strategy for NK
NK is developing a proposal for Athena's "Snap & Shop" visual search feature on the mobile app. Draft a one-page strategy document that includes:

(a) The target customer use case (who uses this feature and why)
(b) Technical approach (build custom model vs. use API vs. hybrid)
(c) Success metrics (how will you measure whether this feature creates value)
(d) Competitive analysis (which retailers already offer visual search and how does Athena differentiate)
(e) Risks and mitigation strategies
(f) Estimated development timeline and budget

Section C: Analysis and Evaluation

Exercise 15.13: The Surveillance Spectrum
The chapter argues that "every camera system you deploy is a surveillance system." Consider the following continuum of retail CV applications, ordered from least to most ethically concerning:

1. Cameras photograph shelves (no people) twice daily for inventory analysis
2. Overhead cameras count anonymous foot traffic patterns (no identification)
3. Cameras detect queue lengths and trigger additional register openings
4. Cameras analyze customer demographics (estimated age, gender) for marketing analytics
5. Cameras use facial recognition to identify known shoplifters
6. Cameras track individual customer journeys through the store to optimize product placement

(a) For each application on the spectrum, identify the specific ethical concern that distinguishes it from the previous level.
(b) Where on this spectrum would you draw the line as a business leader? Justify your position with reference to ethical principles, regulatory requirements, and business risk.
(c) How does the concept of consent apply differently at each level?
(d) Athena's policy currently positions the company at level 1-2 on this spectrum. Under what business conditions, if any, would you recommend Athena move to level 3 or 4? What governance controls would be required?

Exercise 15.14: Medical CV: Promise and Peril
A healthcare startup has developed a computer vision model that detects early-stage melanoma from smartphone photographs of skin lesions with 91% sensitivity and 89% specificity. They want to release it as a consumer app.

(a) Calculate the positive predictive value (PPV) assuming a melanoma prevalence of 0.5% in the app's user population. (Recall: PPV = true positives / (true positives + false positives).)
(b) Interpret the PPV you calculated. For every 100 users the app tells "you may have melanoma," how many actually do?
(c) What are the potential harms of false positives? False negatives?
(d) How does the base rate (prevalence) affect the practical utility of the app?
(e) What regulatory approvals would be required before this app could be marketed in the US and EU?
(f) How would you redesign the app to mitigate the risks you identified?
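
The PPV formula recalled in part (a) can be expressed in terms of sensitivity, specificity, and prevalence. The Python sketch below does that and shows how PPV moves with the base rate, which is the point of part (d). Sensitivity, specificity, and the 0.5% prevalence come from the exercise. The two higher prevalence values are hypothetical, included only to illustrate the trend.

```python
def positive_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """PPV = true positives / (true positives + false positives).

    Both terms are expressed as fractions of the whole user population.
    """
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)


SENSITIVITY = 0.91   # from Exercise 15.14
SPECIFICITY = 0.89   # from Exercise 15.14

# 0.005 is the exercise's prevalence; 0.05 and 0.20 are hypothetical
# values (for example, a pre-screened referral population).
for prevalence in (0.005, 0.05, 0.20):
    ppv = positive_predictive_value(SENSITIVITY, SPECIFICITY, prevalence)
    print(f"prevalence {prevalence:.1%} -> PPV {ppv:.1%}")
```

Multiplying the PPV by 100 gives the "out of every 100 flagged users" figure asked for in part (b).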

Exercise 15.15: Edge vs. Cloud Decision Matrix
For each of the following scenarios, evaluate whether edge deployment, cloud deployment, or a hybrid approach is most appropriate. Consider latency, bandwidth, privacy, cost, and model update requirements in your analysis.

(a) A chain of 500 convenience stores deploying theft detection (product in hand but no checkout event)
(b) A single large manufacturing plant with 40 quality inspection cameras on a high-speed production line
(c) A real estate company processing 10,000 listing photographs per week to generate automated property descriptions
(d) A fleet of 200 delivery vehicles using dashcams for driver safety monitoring
(e) A hospital using computer vision to analyze pathology slides

Exercise 15.16: Bias Audit Design
You have been hired to audit a retail visual search system for potential bias. The system allows customers to photograph clothing items and find similar products in the retailer's catalog.

(a) Identify three dimensions along which the system might exhibit bias (e.g., skin tone, clothing style, cultural context).
(b) For each dimension, design a specific test you would conduct to detect bias. Describe the test images you would use, the metrics you would measure, and the threshold at which you would flag a concern.
(c) If your audit reveals significant performance disparities, recommend three remediation strategies.
(d) How should the retailer communicate the results of the bias audit to customers, employees, and stakeholders?
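
One way to make part (b) concrete is to compute the same retrieval metric separately for each group of test images and flag the gap between the best and worst group. The Python sketch below illustrates that idea only. The metric (top-k hit rate), the group labels, the sample results, and the 0.05 threshold are all hypothetical choices, not values from the chapter.

```python
from collections import defaultdict


def hit_rate_by_group(results):
    """Compute the fraction of successful searches for each group.

    results: iterable of (group, hit) pairs, where hit is True when the
    correct catalog item appeared in the top-k search results.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, hit in results:
        totals[group] += 1
        hits[group] += int(hit)
    return {group: hits[group] / totals[group] for group in totals}


def disparity(rates):
    """Gap between the best-served and worst-served group."""
    return max(rates.values()) - min(rates.values())


# HYPOTHETICAL audit results: 10 test images per group.
sample = ([("group_a", True)] * 9 + [("group_a", False)] * 1
          + [("group_b", True)] * 7 + [("group_b", False)] * 3)

FLAG_THRESHOLD = 0.05  # hypothetical; justify your own threshold in part (b)

rates = hit_rate_by_group(sample)
gap = disparity(rates)
print(rates)
print("flag for review" if gap > FLAG_THRESHOLD else "within threshold")
```

A real audit would need far more than ten images per group for the gap to be statistically meaningful, so sample size belongs in your test design alongside the threshold.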

Section D: Integration and Synthesis

Exercise 15.17: CV Strategy Presentation
Select an industry other than retail or manufacturing (e.g., agriculture, logistics, construction, insurance, hospitality). Prepare a 10-minute presentation that:

(a) Identifies the three highest-value computer vision use cases in that industry
(b) For each use case, specifies the CV task (classification, detection, segmentation), the data requirements, and the expected ROI
(c) Recommends a phased deployment strategy (pilot, validate, scale)
(d) Addresses ethical considerations specific to that industry
(e) Proposes a governance framework modeled on Athena's approach but adapted for the industry's unique characteristics

Exercise 15.18: The "Should We Deploy This?" Framework
Athena's board has proposed expanding the shelf analytics camera system to also:

- Track employee movements to optimize staffing allocation
- Analyze customer facial expressions to gauge reactions to product displays
- Monitor checkout interactions to assess cashier performance

Write a memo to the board evaluating each proposal. For each, address:

(a) The potential business value
(b) The technical feasibility
(c) The ethical implications
(d) The regulatory risk
(e) The impact on employee trust and morale
(f) Your recommendation (deploy, modify, or reject) with justification

Reference Athena's existing governance policy and explain whether each proposal is consistent with or violates its principles.

Selected answers are available in Appendix B: Answers to Selected Exercises.