How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (8)

#preference optimization

Demo-ICL: In-Context Learning for Procedural Video Knowledge Acquisition

Beginner
Yuhao Dong, Shulin Tian et al. - Feb 9 - arXiv

This paper teaches AI to learn how-to steps from demonstrations in the moment, the way people do.

#video in-context learning #procedural video understanding #multimodal large language models

SLIME: Stabilized Likelihood Implicit Margin Enforcement for Preference Optimization

Intermediate
Maksim Afanasyev, Illarion Iov - Feb 2 - arXiv

SLIME is a new way to train chatbots so they follow human preferences without forgetting how to write well.

#SLIME #preference optimization #DPO

HeartMuLa: A Family of Open Sourced Music Foundation Models

Intermediate
Dongchao Yang, Yuxin Xie et al. - Jan 15 - arXiv

HeartMuLa is a family of open-source music AI models that can understand and generate full songs with clear lyrics and strong musical structure.

#music generation #audio tokenizer #residual vector quantization

Token-Level LLM Collaboration via FusionRoute

Intermediate
Nuoya Xiong, Yuhang Zhou et al. - Jan 8 - arXiv

Big all-in-one language models are powerful but too expensive to run everywhere, while small specialists are cheaper but narrow.

#FusionRoute #token-level collaboration #expert routing

ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing

Beginner
Hengjia Li, Liming Jiang et al. - Jan 6 - arXiv

ThinkRL-Edit teaches an image editor to think first and draw second, which makes tricky, reasoning-heavy edits much more accurate.

#reasoning-centric image editing #reinforcement learning #chain-of-thought

Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

Intermediate
Taekyung Ki, Sangwon Jang et al. - Jan 2 - arXiv

This paper builds a real-time talking-listening head avatar that reacts naturally to your words, tone, nods, and smiles in about half a second.

#interactive avatar #talking head generation #causal diffusion forcing

Factorized Learning for Temporally Grounded Video-Language Models

Intermediate
Wenzheng Zeng, Difei Gao et al. - Dec 30 - arXiv

This paper teaches video-language models to first find when the supporting evidence appears in a video and then answer using that evidence, instead of mixing both steps together.

#temporal grounding #video-language models #evidence tokens

Adaptation of Agentic AI

Intermediate
Pengcheng Jiang, Jiacheng Lin et al. - Dec 18 - arXiv

This paper organizes how AI agents learn and improve into one simple map with four roads: A1, A2, T1, and T2.

#agentic AI #adaptation #A1 A2 T1 T2
