How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (807)


World Models That Know When They Don't Know: Controllable Video Generation with Calibrated Uncertainty

Intermediate
Zhiting Mei, Tenny Yin et al. · Dec 5 · arXiv

This paper teaches video-making AI models to say how sure they are about each tiny part of every frame they create.

#controllable video generation #uncertainty quantification #calibration

SCAIL: Towards Studio-Grade Character Animation via In-Context Learning of 3D-Consistent Pose Representations

Intermediate
Wenhao Yan, Sheng Ye et al. · Dec 5 · arXiv

SCAIL is a new AI system that turns a single character image into a studio-quality animation by following the moves in a driving video.

#character animation #3D pose representation #occlusion-aware pose

Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning

Intermediate
Zhenpeng Su, Leiyu Pan et al. · Dec 5 · arXiv

Reinforcement learning (RL) can make large language models smarter, but off-policy training often pushes updates too far from the “safe zone,” causing unstable learning.

#reinforcement learning #PPO-clip #KL penalty

ProPhy: Progressive Physical Alignment for Dynamic World Simulation

Intermediate
Zijun Wang, Panwen Hu et al. · Dec 5 · arXiv

ProPhy is a new two-step method that helps video AIs follow real-world physics, not just make pretty pictures.

#physics-aware video generation #mixture-of-experts #token-level routing

BEAVER: An Efficient Deterministic LLM Verifier

Intermediate
Tarun Suresh, Nalin Wadhwa et al. · Dec 5 · arXiv

BEAVER is a new way to check, with guaranteed certainty, how likely a language model is to give answers that obey important rules.

#BEAVER #deterministic verification #large language models

SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling

Intermediate
Elisabetta Fedele, Francis Engelmann et al. · Dec 5 · arXiv

SpaceControl lets you steer a powerful 3D generator with simple shapes you draw, without retraining the model.

#3D generative modeling #test-time guidance #latent space intervention

From Segments to Scenes: Temporal Understanding in Autonomous Driving via Vision-Language Model

Intermediate
Kevin Cannons, Saeed Ranjbar Alvar et al. · Dec 4 · arXiv

This paper builds TAD, a new benchmark that checks whether AI can understand what happens over time in real driving videos.

#Temporal understanding #Autonomous driving #Vision-language models

Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image

Intermediate
Yanran Zhang, Ziyi Wang et al. · Dec 4 · arXiv

This paper teaches a computer to turn a single picture into a moving 3D scene that stays consistent from every camera angle.

#4D scene generation #single-image to 4D #joint geometry and motion

Arbitrage: Efficient Reasoning via Advantage-Aware Speculation

Intermediate
Monishwaran Maheswaran, Rishabh Tiwari et al. · Dec 4 · arXiv

ARBITRAGE makes AI solve step-by-step problems faster by only using the big, slow model when it is predicted to truly help.

#speculative decoding #step-level speculative decoding #advantage-aware routing

EMMA: Efficient Multimodal Understanding, Generation, and Editing with a Unified Architecture

Intermediate
Xin He, Longhui Wei et al. · Dec 4 · arXiv

EMMA is a single AI model that can understand images, write about them, create new images from text, and edit images, all within one unified system.

#EMMA #unified multimodal architecture #32x autoencoder

EtCon: Edit-then-Consolidate for Reliable Knowledge Editing

Intermediate
Ruilin Li, Yibin Wang et al. · Dec 4 · arXiv

Large language models forget or misuse new facts if their weights are edited only once; EtCon fixes this with a two-step plan.

#knowledge editing #EtCon #TPSFT

TwinFlow: Realizing One-step Generation on Large Models with Self-adversarial Flows

Intermediate
Zhenglin Cheng, Peng Sun et al. · Dec 3 · arXiv

TwinFlow is a new way to make big image models draw great pictures in just one step instead of 40–100 steps.

#TwinFlow #one-step generation #twin trajectories