Big AI models used to get better mainly by getting wider or by reading longer texts, but those tricks now give diminishing returns.
The paper fixes a stability problem in Hyper-Connections (HC) by constraining the network’s mixing matrix to a safe shape (a manifold) on which signals neither blow up nor vanish.
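
To make the idea concrete, here is a minimal sketch in Python. The summary above does not say which manifold is used, so the sketch assumes one plausible choice: doubly stochastic matrices, which have non-negative entries and rows and columns that each sum to 1. The helper name `sinkhorn_project`, the iteration count, and the toy sizes are all hypothetical and chosen only for illustration. The script stacks many random mixing matrices, once unconstrained and once constrained, and compares how large the signal ends up.

```python
import numpy as np

def sinkhorn_project(logits, n_iters=50):
    """Approximately map a square matrix to a doubly stochastic one.

    Assumption: this alternating row/column normalization stands in for
    whatever constraint the paper actually applies.
    """
    m = np.exp(logits - logits.max())  # make every entry strictly positive
    for _ in range(n_iters):
        m = m / m.sum(axis=1, keepdims=True)  # rescale so each row sums to 1
        m = m / m.sum(axis=0, keepdims=True)  # rescale so each column sums to 1
    return m

rng = np.random.default_rng(0)
n, depth = 4, 64          # toy sizes: 4 parallel streams, 64 layers
x = rng.normal(size=n)    # toy input signal

free, constrained = x.copy(), x.copy()
for _ in range(depth):
    w = rng.normal(size=(n, n))                     # unconstrained mixing matrix
    free = w @ free                                 # mix with the raw matrix
    constrained = sinkhorn_project(w) @ constrained # mix with the constrained matrix

print("unconstrained norm:", np.linalg.norm(free))
print("constrained norm:  ", np.linalg.norm(constrained))
print("input sum vs constrained sum:", x.sum(), constrained.sum())
```

Under this assumption, the constrained signal cannot grow beyond its starting size, because a doubly stochastic matrix never stretches a vector. Its component sum also stays fixed, because each column sums to 1, so the signal cannot shrink to nothing unless that sum was zero to begin with. The unconstrained product has no such guarantee.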