**Communication overhead.** Sending model updates back and forth requires bandwidth. For large models, this can be prohibitive.
- **Data heterogeneity.** Different devices have different data distributions. A user who texts primarily in medical terminology trains a very different local model than a