Large language models often sound confident even when they are wrong, and existing methods for catching those mistakes are either slow or inaccurate.
The paper shows that many AI image generators are trained to prefer one popular idea of beauty, even when a user clearly asks for something messy, dark, blurry, or emotionally heavy.