Transformers converge to invariant algorithmic cores
Intermediate · Joshua S. Schiffman · Feb 26 · arXiv
Different transformers may have very different weights, but they often hide the same tiny "engine" that actually does the task.
#algorithmic cores #mechanistic interpretability #transformers