This paper shows a simple way to make image-generating AIs (diffusion Transformers) produce clearer, more accurate pictures by letting the model guide its own generation using signals from inside the network.
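The summary above does not spell out how the internal guidance works, so the sketch below shows one generic form of "self-guidance" during diffusion sampling: contrast the model's normal noise prediction with a prediction from a deliberately weakened forward pass, and push the sample toward the stronger one (similar in spirit to classifier-free guidance, but with no external model needed). This is a minimal illustration under that assumption, not the paper's actual code; `DummyDenoiser`, `weakened`, and `guidance_scale` are all placeholders.

```python
# Illustrative sketch: a diffusion model "guiding itself" at sampling time by
# contrasting a normal prediction with one from a weakened internal pathway.
import torch
import torch.nn as nn

class DummyDenoiser(nn.Module):
    """Stand-in for a diffusion Transformer's noise-prediction network."""
    def __init__(self, dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))

    def forward(self, x, t, weakened=False):
        # `weakened=True` mimics running the model with some internal component
        # (e.g. attention) deliberately degraded, giving a blurrier prediction.
        eps = self.net(x + t)
        return 0.5 * eps if weakened else eps

@torch.no_grad()
def self_guided_step(model, x, t, guidance_scale=2.0):
    """One denoising step where the guidance signal comes from the model itself."""
    eps_full = model(x, t, weakened=False)   # normal prediction
    eps_weak = model(x, t, weakened=True)    # prediction from the weakened pass
    # Push away from the weak prediction, toward the full one.
    eps_guided = eps_weak + guidance_scale * (eps_full - eps_weak)
    return x - eps_guided                    # simplified update (no noise schedule)

model = DummyDenoiser()
x = torch.randn(4, 64)                        # a batch of noisy "latents"
for step in range(10):
    t = torch.full((4, 64), 1.0 - step / 10)  # crude stand-in for a timestep signal
    x = self_guided_step(model, x, t)
```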
DiffThinker turns hard picture-based puzzles into an image-to-image drawing task instead of a long text-based reasoning task.
This paper shows a new way to teach an autoencoder to shape its hidden space (the 'latent space') so that it follows any distribution we want, not just a simple bell curve (a Gaussian).
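The summary does not say which training objective the paper uses, so the sketch below illustrates the general idea with one standard technique: add a sample-based distance (here an RBF-kernel maximum mean discrepancy, as used in Wasserstein autoencoders) between the encoded latents and draws from an arbitrary target prior. The two-bump Gaussian-mixture target, the network sizes, and the loss weight are all illustrative assumptions, not the paper's method.

```python
# Illustrative sketch: nudging an autoencoder's latent space toward an arbitrary
# target distribution by penalizing the mismatch between encoded latents and
# samples from that target.
import torch
import torch.nn as nn

def rbf_mmd(a, b, sigma=1.0):
    """Maximum mean discrepancy between two sample sets, with an RBF kernel."""
    def kernel(x, y):
        d2 = torch.cdist(x, y).pow(2)
        return torch.exp(-d2 / (2 * sigma ** 2))
    return kernel(a, a).mean() + kernel(b, b).mean() - 2 * kernel(a, b).mean()

def sample_target_prior(n, dim=2):
    """Arbitrary target: a two-bump Gaussian mixture, not a single bell curve."""
    centers = torch.tensor([[-2.0] * dim, [2.0] * dim])
    idx = torch.randint(0, 2, (n,))
    return centers[idx] + 0.3 * torch.randn(n, dim)

encoder = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 2))
decoder = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 8))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

for step in range(200):
    x = torch.randn(64, 8)                            # stand-in for real training data
    z = encoder(x)
    recon_loss = (decoder(z) - x).pow(2).mean()       # usual reconstruction term
    prior_loss = rbf_mmd(z, sample_target_prior(64))  # pull latents toward the target
    loss = recon_loss + 10.0 * prior_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```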