Papers3

#ImageNet 256x256

Image Generation with a Sphere Encoder

Kaiyu Yue, Menglin Jia et al.Feb 16arXiv

The Sphere Encoder is a new way to make images fast by teaching an autoencoder to place all images evenly on a big imaginary sphere and then decode random spots on that sphere back into pictures.

#Sphere Encoder#Spherical Latent Space#RMS Normalization

Not triaged yet

Boosting Latent Diffusion Models via Disentangled Representation Alignment

Intermediate

John Page, Xuesong Niu et al.Jan 9arXiv

This paper shows that the best VAEs for image generation are the ones whose latents neatly separate object attributes, a property called semantic disentanglement.

#Send-VAE#semantic disentanglement#latent diffusion

Not triaged yet

Guiding a Diffusion Transformer with the Internal Dynamics of Itself

Beginner

Xingyu Zhou, Qifan Li et al.Dec 30arXiv

This paper shows a simple way to make image-generating AIs (diffusion Transformers) produce clearer, more accurate pictures by letting the model guide itself from the inside.

#Internal Guidance#Diffusion Transformer#Intermediate Supervision

Not triaged yet