- [ ] Wrap all preprocessing in a scikit-learn Pipeline.
- [ ] Verify no data leakage by checking that all `fit` calls use only training data.
- [ ] Use cross-validation to evaluate the impact of each feature engineering decision.
- [ ] Apply feature selection to remove noise and redundancy.
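
The checklist items can be combined in one place. The following is a minimal sketch, not a definitive recipe: the synthetic dataset, the choice of `StandardScaler`, `SelectKBest` with `f_classif`, `LogisticRegression`, and the candidate values of `k` are all assumptions made for illustration. Because the scaler and the selector live inside the `Pipeline`, `cross_val_score` refits them on the training portion of each fold only, which is what keeps the held-out fold from leaking into preprocessing.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data standing in for a real dataset (illustrative assumption).
X, y = make_classification(
    n_samples=300, n_features=20, n_informative=5, n_redundant=5, random_state=0
)


def build_pipeline(k):
    """Hypothetical helper: scaling, feature selection, and model in one Pipeline."""
    return Pipeline(
        [
            ("scale", StandardScaler()),
            ("select", SelectKBest(score_func=f_classif, k=k)),
            ("model", LogisticRegression(max_iter=1000)),
        ]
    )


# Compare one feature engineering decision (how many features to keep)
# using 5-fold cross-validation. Each fold fits the whole pipeline on
# its training split only.
scores_by_k = {}
for k in (5, 10, 20):
    scores = cross_val_score(build_pipeline(k), X, y, cv=5)
    scores_by_k[k] = scores.mean()
    print(f"k={k}: mean accuracy {scores.mean():.3f} (std {scores.std():.3f})")
```

The same pattern applies to any other decision on the list: change one step of the pipeline, rerun the cross-validation, and compare the mean scores along with their spread across folds.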