- Setting max tokens too low truncates responses mid-sentence, a common source of frustration.
- Setting max tokens too high does not force the model to write longer responses; it merely allows it to.
- For cost control in API-based deployments, max tokens directly affects pricing: you pay per token generated, so the cap bounds the worst-case cost of a single response.
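The points above can be sketched in a few lines. The function names, the price figure, and the response shape here are all hypothetical, a minimal illustration rather than any specific provider's API: many chat APIs do report something like a `"length"` finish reason when the cap was hit, but field names vary.

```python
# Hypothetical sketch: how a max-token cap bounds worst-case output cost,
# and how truncation typically surfaces in an API response.
# Prices and the response dict shape are illustrative assumptions.

def max_output_cost(max_tokens: int, price_per_1k_tokens: float) -> float:
    """Worst-case spend on output tokens for one request (the cap times the rate)."""
    return (max_tokens / 1000) * price_per_1k_tokens

def was_truncated(response: dict) -> bool:
    """Many chat APIs signal a hit cap with a 'length'-style finish reason."""
    return response.get("finish_reason") == "length"

# A 256-token cap at a hypothetical $0.002 per 1K output tokens:
cap_cost = max_output_cost(256, 0.002)
print(f"worst-case output cost: ${cap_cost:.6f}")

# Simulated response that hit the cap mid-sentence:
resp = {"text": "The capital of France is", "finish_reason": "length"}
print("truncated:", was_truncated(resp))
```

Checking the finish reason after every call is the practical counterpart of the first bullet: it distinguishes a response the model chose to end from one the cap cut short.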