Concepts (2)
📚 Theory · Intermediate
Attention Mechanism Theory
Attention computes a weighted sum of the values V, where the weights come from how similar the queries Q are to the keys K.
#attention #self-attention #multi-head attention +12
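A minimal NumPy sketch of single-head scaled dot-product attention, where the weights are the row-wise softmax of QKᵀ/√d_k (the function name, shapes, and random inputs are illustrative, not taken from the concept card):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Single-head attention sketch: output = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # similarity of each query to each key
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ V                                   # weighted sum of the values

# Toy example: 3 tokens, dimension 4 (purely illustrative).
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((3, 4)) for _ in range(3))
out = scaled_dot_product_attention(Q, K, V)              # shape (3, 4)
```

Each output row is a convex combination of the rows of V, with the mixing weights determined by how well that query matches each key.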
📚 Theory · Advanced
Neural Network Expressivity
Neural network expressivity studies what kinds of functions different network architectures can represent and how efficiently they can do so.
#neural network expressivity #depth separation #relu linear regions +12
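One concrete expressivity measure is the number of linear regions a ReLU network carves its input domain into. The sketch below, assuming a 1D input and a single hidden ReLU layer (the function name, random weights, and grid are illustrative assumptions), counts regions by tracking where the set of active units changes:

```python
import numpy as np

def count_activation_regions_1d(w1, b1, xs):
    """Count the intervals on which a one-hidden-layer ReLU network
    f(x) = w2 . relu(w1 * x + b1) is linear, by detecting where the
    set of active hidden units changes along a 1D grid of inputs."""
    pre = np.outer(xs, w1) + b1              # pre-activations, shape (len(xs), n_hidden)
    patterns = pre > 0                       # which hidden units are active at each x
    changes = np.any(patterns[1:] != patterns[:-1], axis=1)
    return 1 + int(changes.sum())

rng = np.random.default_rng(0)
n_hidden = 8
w1, b1 = rng.standard_normal(n_hidden), rng.standard_normal(n_hidden)
xs = np.linspace(-5.0, 5.0, 10_001)
print(count_activation_regions_1d(w1, b1, xs))   # at most n_hidden + 1 for one hidden layer
```

With one hidden layer and a 1D input, each unit contributes at most one breakpoint, so the count is bounded by n_hidden + 1; depth-separation results show that deeper networks can realize far more regions than shallow networks of comparable size, which is one way architecture affects expressive efficiency.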