measure whether the model actually works. For classification, the metrics from Chapter 16 apply directly. For topic modeling, use coherence scores. For sentiment analysis, combine accuracy with a qualitative review of the misclassified examples.
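
As a minimal sketch of the sentiment case, the snippet below computes accuracy and collects the misclassified examples so they can be read by hand. The function name `evaluate_sentiment`, the `pos`/`neg` label scheme, and the toy sentences are all hypothetical and chosen only for illustration; in practice the predictions would come from whatever model is being evaluated.

```python
from typing import List, Tuple

def evaluate_sentiment(
    texts: List[str], gold: List[str], predicted: List[str]
) -> Tuple[float, List[Tuple[str, str, str]]]:
    """Return (accuracy, errors), where errors holds (text, gold, predicted)."""
    if not (len(texts) == len(gold) == len(predicted)):
        raise ValueError("texts, gold, and predicted must have the same length")
    if not texts:
        raise ValueError("cannot evaluate an empty dataset")

    # Keep every example where the prediction disagrees with the gold label.
    errors = [
        (text, g, p)
        for text, g, p in zip(texts, gold, predicted)
        if g != p
    ]
    accuracy = (len(texts) - len(errors)) / len(texts)
    return accuracy, errors

# Hypothetical toy data; the labels and predictions are made up for illustration.
texts = [
    "I loved this film.",
    "The service was terrible.",
    "Oh great, another delay.",
    "Works exactly as described.",
]
gold = ["pos", "neg", "neg", "pos"]
predicted = ["pos", "neg", "pos", "pos"]

accuracy, errors = evaluate_sentiment(texts, gold, predicted)
print(f"accuracy = {accuracy:.2f}")
for text, g, p in errors:
    print(f"gold={g} predicted={p}: {text}")
```

Reading the error list alongside the accuracy figure is the point of the exercise: a single number would not reveal that the one mistake here is a sarcastic sentence, which is the kind of pattern a qualitative review is meant to surface.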