Benchmark critical transactions (response time, throughput) - Run batch jobs with production-scale data volumes - Compare CPU time, elapsed time, and I/O counts against legacy baselines - Define acceptable performance thresholds before testing (e.g., "within 10% of legacy")