Groups
Category
Mamba uses a state-space model whose parameters are selected (gated) by the current input token, letting the model adapt its memory dynamics at each step.
Long Short-Term Memory (LSTM) networks use gates (forget, input, and output) to control what information to erase, write, and reveal at each time step.