How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (21)

#LoRA

dLLM: Simple Diffusion Language Modeling

Intermediate
Zhanhui Zhou, Lingjie Chen et al. · Feb 26 · arXiv

dLLM is a single, open-source toolbox that standardizes how diffusion language models are trained, run, and tested.

#diffusion language models #masked diffusion #block diffusion

Hepato-LLaVA: An Expert MLLM with Sparse Topo-Pack Attention for Hepatocellular Pathology Analysis on Whole Slide Images

Intermediate
Yuxuan Yang, Zhonghao Yan et al. · Feb 23 · arXiv

Hepato-LLaVA is a special AI that reads giant microscope pictures of the liver and answers medical questions about cancer.

#Hepato-LLaVA #Hepatocellular Carcinoma #Whole Slide Images

DDiT: Dynamic Patch Scheduling for Efficient Diffusion Transformers

Intermediate
Dahye Kim, Deepti Ghadiyaram et al. · Feb 19 · arXiv

This paper speeds up image and video generators called diffusion transformers by changing how big their puzzle pieces (patches) are at each step.

#Diffusion Transformer #Dynamic Tokenization #Patch Scheduling

Spanning the Visual Analogy Space with a Weight Basis of LoRAs

Intermediate
Hila Manor, Rinon Gal et al. · Feb 17 · arXiv

This paper teaches image models to copy a change shown in one image pair and apply it to a new image, like saying 'hat added here, add a similar hat there.'

#visual analogy learning #LoRA #LoRA basis

How Do Decoder-Only LLMs Perceive Users? Rethinking Attention Masking for User Representation Learning

Intermediate
Jiahao Yuan, Yike Xu et al. · Feb 11 · arXiv

Decoder-only language models can be great at making user profiles (embeddings), but how we let them look at the sequence—called attention masking—changes how smart those profiles are.

#decoder-only LLM #attention masking #causal attention

No One-Size-Fits-All: Building Systems For Translation to Bashkir, Kazakh, Kyrgyz, Tatar and Chuvash Using Synthetic And Original Data

Intermediate
Dmitry Karpov · Feb 4 · arXiv

The paper tries several different ways to translate five low-resource Turkic languages, instead of forcing one method to fit all.

#low-resource machine translation #Turkic languages #NLLB-200

LatentMem: Customizing Latent Memory for Multi-Agent Systems

Intermediate
Muxin Fu, Guibin Zhang et al. · Feb 3 · arXiv

LatentMem is a new memory system that helps teams of AI agents remember the right things for their specific jobs without overloading them with text.

#multi-agent systems #latent memory #role-aware memory

Learning to Repair Lean Proofs from Compiler Feedback

Intermediate
Evan Wang, Simon Chess et al. · Feb 3 · arXiv

This paper teaches AI how to fix broken Lean math proofs by learning from the compiler’s feedback, not just from finished, perfect proofs.

#Lean proof repair #compiler feedback #APRIL dataset

SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization

Intermediate
Maksim Afanasyev, Illarion Iov · Feb 2 · arXiv

SLIME is a new way to train chatbots so they follow human preferences without forgetting how to write well.

#SLIME #preference optimization #DPO

Why Steering Works: Toward a Unified View of Language Model Parameter Dynamics

Intermediate
Ziwen Xu, Chenyan Wu et al. · Feb 2 · arXiv

The paper shows that three popular ways to control language models—fine-tuning a few weights, LoRA, and activation steering—are actually the same kind of action: a dynamic weight update driven by a control knob.

#language model steering #dynamic weight updates #activation steering

LRAgent: Efficient KV Cache Sharing for Multi-LoRA LLM Agents

Intermediate
Hyesung Jeon, Hyeongju Ha et al. · Feb 1 · arXiv

Multi-agent LLM systems often use LoRA adapters so each agent has a special role, but they all rebuild almost the same KV cache, wasting memory and time.

#LoRA #Multi-LoRA #KV cache

JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion

Intermediate
Anthony Chen, Naomi Ken Korem et al. · Jan 29 · arXiv

This paper shows a simple, one-model way to dub videos that makes the new voice and the lips move together naturally.

#video dubbing #audio-visual diffusion #joint generation