Principal Component Analysis (PCA) finds new orthogonal axes (principal components), ordered so that each one captures as much of the remaining variance in your data as possible.
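As a minimal sketch of the idea, PCA can be computed with NumPy by centering the data and taking its singular value decomposition. The function name `pca`, the parameter `k`, and the synthetic data are illustrative assumptions, not part of the original text.

```python
import numpy as np

def pca(X, k):
    """Project X (n_samples, n_features) onto its top-k principal components.

    Hypothetical helper for illustration only.
    """
    # PCA is defined on mean-centered data.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are orthonormal directions, sorted by decreasing singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:k]
    # Variance captured along each kept component.
    explained_variance = (S[:k] ** 2) / (X.shape[0] - 1)
    # Coordinates of each sample in the new axes.
    scores = X_centered @ components.T
    return scores, components, explained_variance

# Synthetic data (assumed for the example): three features with very different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])
scores, components, variance = pca(X, k=2)
```

Here `components` holds the new axes as rows, and `variance` lists how much variance each axis captures, largest first.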
Standard softmax attention costs O(n²) in sequence length because every token compares with every other token.
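The quadratic cost shows up as an n × n score matrix. The sketch below assumes single-head scaled dot-product attention with no masking; the function name `softmax_attention` and the random inputs are illustrative, not taken from the original text.

```python
import numpy as np

def softmax_attention(Q, K, V):
    """Single-head attention over n tokens; Q, K, V have shape (n, d).

    Hypothetical helper for illustration only.
    """
    d = Q.shape[-1]
    # One score per (query token, key token) pair: this (n, n) matrix
    # is the source of the quadratic cost in sequence length.
    scores = Q @ K.T / np.sqrt(d)
    # Subtract the row max before exponentiating for numerical stability.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Random inputs (assumed for the example).
rng = np.random.default_rng(0)
n, d = 8, 4
Q, K, V = (rng.normal(size=(n, d)) for _ in range(3))
out, weights = softmax_attention(Q, K, V)
```

Doubling `n` quadruples the number of entries in `weights`, while the output `out` only doubles in size.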