How Do Decoder-Only LLMs Perceive Users? Rethinking Attention Masking for User Representation Learning
Intermediate · Jiahao Yuan, Yike Xu et al. · Feb 11 · arXiv
Decoder-only language models can produce strong user representations (embeddings), but the attention mask, which determines which positions in the sequence each token may attend to, affects the quality of those embeddings.
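To make the role of the attention mask concrete, here is a minimal sketch comparing a causal mask with a fully bidirectional one. It is an illustration only, not the paper's method: the helper names `build_mask` and `masked_softmax`, the 4-event toy sequence, and the uniform scores are all assumptions.

```python
import math

def build_mask(seq_len, causal):
    """Return a seq_len x seq_len boolean mask; True means query i may attend to key j."""
    return [[(j <= i) if causal else True for j in range(seq_len)]
            for i in range(seq_len)]

def masked_softmax(scores, mask_row):
    """Softmax over one row of scores, giving zero weight where mask_row is False."""
    visible = [s for s, m in zip(scores, mask_row) if m]
    peak = max(visible)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) if m else 0.0 for s, m in zip(scores, mask_row)]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical toy sequence of 4 user events with uniform attention scores.
seq_len = 4
scores = [0.0] * seq_len

causal = build_mask(seq_len, causal=True)
bidirectional = build_mask(seq_len, causal=False)

# Causal: the first event sees only itself, the last event sees the whole history.
first_causal = masked_softmax(scores, causal[0])
last_causal = masked_softmax(scores, causal[-1])

# Bidirectional: every event, including the first, sees the whole sequence.
first_bidir = masked_softmax(scores, bidirectional[0])

print(first_causal, last_causal, first_bidir)
```

Under the causal mask only the final position has seen the full sequence, which is one reason the choice of mask can matter when a single embedding is meant to summarize a user's whole history.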
#decoder-only LLM #attention masking #causal attention