How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (8)

#iterative refinement

Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels

Intermediate
Jiahao Lu, Jiayi Xu et al. · Mar 3 · arXiv

Track4World is a fast, feedforward AI that can follow the 3D path of every pixel in a video using just one camera.

#dense 3D tracking #scene flow #2D-to-3D correlation

Large Multimodal Models as General In-Context Classifiers

Intermediate
Marco Garosi, Matteo Farina et al. · Feb 26 · arXiv

People often pick CLIP-like models for image labeling, but this paper shows that large multimodal models (LMMs) can be just as good, or even better, when you give them a few examples in the prompt (in-context learning).

#in-context learning #multimodal models #open-world classification

Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models

Intermediate
Sen Ye, Mengde Xu et al. · Feb 17 · arXiv

Big idea: Make image-making AIs stop, think, check, and fix their own work so they get better at both creating pictures and understanding them.

#multimodal models #image generation #reasoning

FS-Researcher: Test-Time Scaling for Long-Horizon Research Tasks with File-System-Based Agents

Intermediate
Chiwei Zhu, Benfeng Xu et al. · Feb 2 · arXiv

FS-Researcher is a two-agent system that lets AI carry out very long research tasks by saving its work to a folder on the file system, which serves as external memory.

#FS-Researcher #file-system agents #external memory

AgentIF-OneDay: A Task-level Instruction-Following Benchmark for General AI Agents in Daily Scenarios

Beginner
Kaiyuan Chen, Qimin Wu et al. · Jan 28 · arXiv

This paper builds a new test called AgentIF-OneDay that checks if AI helpers can follow everyday instructions the way people actually give them.

#AgentIF-OneDay #instruction following #AI agents

VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning

Intermediate
Vikash Singh, Darion Cassel et al. · Jan 27 · arXiv

VERGE is a teamwork system where an AI writer (an LLM) works with a strict math checker (an SMT solver) to make answers both smart and logically sound.

#VERGE #neurosymbolic reasoning #SMT solver

Diffusion In Diffusion: Reclaiming Global Coherence in Semi-Autoregressive Diffusion

Intermediate
Linrui Ma, Yufei Cui et al. · Jan 20 · arXiv

The paper proposes Diffusion in Diffusion, a draft-then-revise method that brings back global coherence to fast, block-based diffusion language models.

#discrete diffusion #block diffusion #semi-autoregressive

On the Role of Discreteness in Diffusion LLMs

Intermediate
Ziqi Jin, Bin Wang et al. · Dec 27 · arXiv

The paper asks what a truly good diffusion-based language model should look like and lists five must-have properties.

#diffusion language models #smooth corruption #discrete tokens