🎓 How I Study AI (HISA)

Concepts (2)

Filtering by: #multi-head attention
📚 Theory · Advanced

Transformer Theory

Transformers map sequences to sequences using stacked layers of self-attention and position-wise feed-forward networks, each wrapped with residual connections and LayerNorm, as in the sketch below the tags.

#transformer #self-attention #positional encoding +12
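A minimal sketch of one such encoder block, assuming PyTorch; the class name TransformerBlock and the hyperparameters d_model, n_heads, and d_ff are illustrative choices, and nn.MultiheadAttention stands in for a hand-rolled attention layer.

```python
# Minimal sketch of a post-LayerNorm transformer encoder block (assumes PyTorch).
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    def __init__(self, d_model: int = 512, n_heads: int = 8, d_ff: int = 2048):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff),
            nn.ReLU(),
            nn.Linear(d_ff, d_model),
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Self-attention sublayer, wrapped in a residual connection and LayerNorm.
        attn_out, _ = self.attn(x, x, x)
        x = self.norm1(x + attn_out)
        # Position-wise feed-forward sublayer, wrapped the same way.
        x = self.norm2(x + self.ff(x))
        return x

# Example: a batch of 2 sequences, 10 tokens each, embedded in 512 dimensions.
block = TransformerBlock()
out = block(torch.randn(2, 10, 512))
print(out.shape)  # torch.Size([2, 10, 512])
```

A full transformer stacks several of these blocks and adds positional encodings to the token embeddings so attention can use word order.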
📚 Theory · Intermediate

Attention Mechanism Theory

Attention computes a weighted sum of the values V, where the weights come from the softmax-normalized similarity between queries Q and keys K; see the sketch below the tags.

#attention #self-attention #multi-head attention +12
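A minimal sketch of single-head scaled dot-product attention, assuming NumPy; the function name and the toy shapes are illustrative, and multi-head attention would run several such heads in parallel over separately projected Q, K, and V.

```python
# Minimal sketch of scaled dot-product attention: softmax(QK^T / sqrt(d_k)) V.
import numpy as np

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    d_k = Q.shape[-1]
    # Similarity of each query to each key, scaled by sqrt(d_k).
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over the keys turns similarities into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Output: weighted sum of the value vectors.
    return weights @ V

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(3, 4, 8))  # 4 tokens, d_k = 8
print(attention(Q, K, V).shape)  # (4, 8)
```

The sqrt(d_k) scaling keeps the dot products from growing with dimension, which would otherwise push the softmax into regions with tiny gradients.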