🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers1

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#DNA sequence modeling

Stronger Normalization-Free Transformers

Intermediate
Mingzhi Chen, Taiming Lu et al.Dec 11arXiv

This paper shows that we can remove normalization layers from Transformers and still train them well by using a simple point‑by‑point function called Derf.

#Normalization‑free Transformers#LayerNorm replacement#Point‑wise activation