Glossary

flat minima

regions where the loss is low across a wide neighborhood of parameter values. - **Large batches** provide more accurate gradient estimates, which tend to converge to **sharp minima**---narrow valleys in the loss landscape where small perturbations in parameters cause large increases in loss.

Learn More

Related Terms