🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers7

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#Transformer

MeKi: Memory-based Expert Knowledge Injection for Efficient LLM Scaling

Intermediate
Ning Ding, Fangcheng Liu et al.Feb 3arXiv

MeKi is a new way to grow a language model’s knowledge by using storage (ROM) instead of extra heavy calculations (FLOPs).

#MeKi#memory-based scaling#token-level experts

FrankenMotion: Part-level Human Motion Generation and Composition

Beginner
Chuqiao Li, Xianghui Xie et al.Jan 15arXiv

FrankenMotion is a new AI that makes human motion by controlling each body part over time, like a careful puppeteer.

#Human motion generation#Part-level control#Hierarchical conditioning

Deep Delta Learning

Intermediate
Yifan Zhang, Yifeng Liu et al.Jan 1arXiv

Deep Delta Learning (DDL) replaces the usual “add the shortcut” rule in deep networks with a smarter, learnable move that can gently erase old info and write new info along a chosen direction.

#Deep Delta Learning#Delta Operator#Residual connection

Dynamic Large Concept Models: Latent Reasoning in an Adaptive Semantic Space

Intermediate
Xingwei Qu, Shaowen Wang et al.Dec 31arXiv

Language is lumpy: easy stretches and tricky jumps are mixed together, but old models spend the same effort on every word.

#Dynamic Large Concept Models#semantic boundaries#latent reasoning

End-to-End Test-Time Training for Long Context

Intermediate
Arnuv Tandon, Karan Dalal et al.Dec 29arXiv

This paper shows how a language model can keep learning while you use it, so it handles very long inputs without slowing down.

#Test-Time Training#Meta-learning#Long-context language modeling

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Intermediate
NVIDIA, : et al.Dec 23arXiv

Nemotron 3 Nano is a new open-source language model that mixes two brain styles (Mamba and Transformer) and adds a team of special experts (MoE) so it thinks better while running much faster.

#Mixture-of-Experts#Mamba-2#Transformer

Bidirectional Normalizing Flow: From Data to Noise and Back

Intermediate
Yiyang Lu, Qiao Sun et al.Dec 11arXiv

Normalizing Flows are models that learn how to turn real images into simple noise and then back again.

#Normalizing Flow#Bidirectional Normalizing Flow#Hidden Alignment