that captures the essential factors of variation in the data. This simple idea, when extended with probabilistic reasoning (Variational Autoencoders), regularization (sparse and denoising variants), and modern training paradigms (contrastive and self-supervised learning), has become one of the pilla