- CNN features from individual frames
- 3D CNN features from frame sequences (e.g., I3D, SlowFast networks)
- Transformer-based video representations
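
To make the difference between the first two kinds of features concrete, here is a minimal NumPy sketch. It is a toy under stated assumptions, not an implementation of I3D or SlowFast: a random linear projection (`proj`) stands in for a trained 2D CNN applied to each frame, and a single random 3D filter (`kernel`) stands in for one spatiotemporal convolution layer. The function names, shapes, and weights are all hypothetical and chosen only for illustration.

```python
import numpy as np


def per_frame_features(video, proj):
    """Stand-in for a 2D CNN: embed each frame independently.

    video: (T, H, W) grayscale clip; proj: (H*W, D) hypothetical projection.
    Returns (T, D), one feature vector per frame.
    """
    num_frames = video.shape[0]
    return video.reshape(num_frames, -1) @ proj


def clip_features_3d(video, kernel):
    """Stand-in for one 3D conv layer: valid cross-correlation over (T, H, W).

    kernel: (kt, kh, kw). Returns (T-kt+1, H-kh+1, W-kw+1).
    """
    windows = np.lib.stride_tricks.sliding_window_view(video, kernel.shape)
    return np.einsum("thwijk,ijk->thw", windows, kernel)


rng = np.random.default_rng(0)
video = rng.standard_normal((8, 16, 16))    # 8 frames of 16x16 pixels
proj = rng.standard_normal((16 * 16, 32))   # hypothetical frame embedding
kernel = rng.standard_normal((3, 3, 3))     # hypothetical spatiotemporal filter

frame_feats = per_frame_features(video, proj)   # shape (8, 32)
clip_feats = clip_features_3d(video, kernel)    # shape (6, 14, 14)

# Averaging per-frame features over time discards frame order:
# the reversed clip pools to the same vector (up to rounding).
reversed_video = video[::-1]
pooled = frame_feats.mean(axis=0)
pooled_reversed = per_frame_features(reversed_video, proj).mean(axis=0)
```

In this sketch the per-frame features carry no motion information once they are pooled, whereas each output of the 3D filter mixes three consecutive frames and so depends on their order. Transformer-based representations are not covered here.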