Glossary

pre-norm

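Applying normalization to the input of each sublayer (attention or feed-forward), inside the residual branch, rather than to the output after the residual addition (post-norm). Pre-norm leads to more stable training, especially for deeper models.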
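A minimal sketch in plain Python may help show where the normalization sits, contrasting pre-norm with post-norm. It assumes a simplified layer norm without learned scale or shift, and `toy_sublayer` is a hypothetical stand-in for a real attention or feed-forward layer; none of these names come from a particular library.

```python
import math

def layer_norm(x, eps=1e-5):
    """Normalize a vector to zero mean and (roughly) unit variance.
    Simplified: no learned scale or shift parameters."""
    mean = sum(x) / len(x)
    var = sum((v - mean) ** 2 for v in x) / len(x)
    return [(v - mean) / math.sqrt(var + eps) for v in x]

def pre_norm_block(x, sublayer):
    """Pre-norm: normalize the input, apply the sublayer,
    then add the result to the un-normalized residual x."""
    return [a + b for a, b in zip(x, sublayer(layer_norm(x)))]

def post_norm_block(x, sublayer):
    """Post-norm: apply the sublayer, add the residual,
    then normalize the sum."""
    return layer_norm([a + b for a, b in zip(x, sublayer(x))])

def toy_sublayer(x):
    # Hypothetical stand-in for attention or a feed-forward layer.
    return [0.5 * v for v in x]

x = [1.0, 2.0, 3.0, 4.0]
pre = pre_norm_block(x, toy_sublayer)    # residual path carries x through unchanged
post = post_norm_block(x, toy_sublayer)  # residual sum is re-normalized
```

In the pre-norm variant the residual path passes `x` through untouched, so stacking many blocks keeps an identity route from input to output. In the post-norm variant every block re-normalizes that path.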

Related Terms