dLLM is a single, open-source toolbox that standardizes how diffusion language models are trained, run, and tested.
Hepato-LLaVA is a specialized AI model that reads giant microscope pictures of the liver and answers medical questions about cancer.
This paper speeds up image and video generators called diffusion transformers by changing how big their puzzle pieces (patches) are at each step.
This paper teaches image models to copy a change shown in one image pair and apply it to a new image, like saying 'hat added here, add a similar hat there.'
Decoder-only language models can be great at making user profiles (embeddings), but the attention mask, which controls which parts of the sequence each token is allowed to look at, changes how good those profiles are.
The paper asks a simple question: which kind of step-by-step reasoning helps small language models learn best, and why?
When you tune the learning rate carefully, plain old LoRA fine-tuning works about as well as fancy new versions.
The paper tries several different ways to translate five low-resource Turkic languages, instead of forcing one method to fit all.
LatentMem is a new memory system that helps teams of AI agents remember the right things for their specific jobs without overloading them with text.
This paper teaches AI how to fix broken Lean math proofs by learning from the compiler’s feedback, not just from finished, perfect proofs.
SLIME is a new way to train chatbots so they follow human preferences without forgetting how to write well.
The paper shows that three popular ways to control language models (fine-tuning a few weights, LoRA, and activation steering) are actually the same kind of action: a dynamic weight update driven by a control knob.
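
One way to read that last summary is sketched below. It is a toy illustration under stated assumptions, not the paper's own formulation: the matrices, the `knob` value, and the helper names (`lora_delta`, `steering_delta`) are all made up for this example. It shows that adding a steering vector to a layer's output gives the same result as applying an input-dependent rank-1 weight update, and that LoRA is a fixed low-rank weight update scaled by the same kind of knob.

```python
# Toy sketch: hypothetical names and values, chosen only for illustration.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def add(W, D):
    """Elementwise sum of two same-shaped matrices."""
    return [[w + d for w, d in zip(rw, rd)] for rw, rd in zip(W, D)]

def lora_delta(B, A, knob):
    """LoRA-style update: a fixed low-rank product B @ A, scaled by the knob."""
    return [[knob * m for m in row] for row in matmul(B, A)]

def steering_delta(v, x, knob):
    """Steering rewritten as a weight update that depends on the input x:
    delta_W = knob * v x^T / (x . x), so that delta_W @ x == knob * v."""
    norm_sq = sum(xi * xi for xi in x)
    return [[knob * vi * xi / norm_sq for xi in x] for vi in v]

W = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0]]  # frozen 3x2 layer weight
x = [2.0, 1.0]                             # layer input
v = [0.5, -1.0, 2.0]                       # steering direction
knob = 0.5                                 # control strength

# Activation steering: add knob * v to the layer output.
steered = [h + knob * vi for h, vi in zip(matvec(W, x), v)]

# Same result expressed as (W + delta_W) @ x with an input-dependent delta_W.
as_weight_update = matvec(add(W, steering_delta(v, x, knob)), x)

# LoRA: (W + knob * B @ A) @ x with a rank-1 B @ A.
B = [[1.0], [0.0], [2.0]]
A = [[1.0, -1.0]]
lora_out = matvec(add(W, lora_delta(B, A, knob)), x)
```

In this sketch the only difference between the two cases is whether the update depends on the input: the LoRA delta is the same for every `x`, while the steering delta is recomputed per input.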