🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
SearchSettings
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers807

AllBeginnerIntermediateAdvanced
All SourcesarXiv

SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios

Intermediate
Minh V. T. Thai, Tue Le et al.Dec 20arXiv

SWE-EVO is a new test (benchmark) that checks if AI coding agents can upgrade real software projects over many steps, not just fix one small bug.

#SWE-EVO#software evolution#coding agents

MatSpray: Fusing 2D Material World Knowledge on 3D Geometry

Intermediate
Philipp Langsteiner, Jan-Niklas Dihlmann et al.Dec 20arXiv

MatSpray turns 2D guesses about what materials look like (color, shininess, metal) into a clean 3D model you can relight realistically.

#MatSpray#3D Gaussian Splatting#Gaussian Ray Tracing

Is There a Better Source Distribution than Gaussian? Exploring Source Distributions for Image Flow Matching

Intermediate
Junho Lee, Kwanseok Kim et al.Dec 20arXiv

Flow Matching is like teaching arrows to push points from a simple cloud (source) to real pictures (target); most people start from a Gaussian cloud because it points equally in all directions.

#flow matching#conditional flow matching#source distribution

SAM Audio: Segment Anything in Audio

Intermediate
Bowen Shi, Andros Tjandra et al.Dec 19arXiv

SAM Audio is a new AI that can pull out exactly the sound you want from a noisy mix using text, clicks on a video, and time ranges—together or separately.

#audio source separation#multimodal prompting#text-guided separation

When Reasoning Meets Its Laws

Intermediate
Junyu Zhang, Yifan Sun et al.Dec 19arXiv

The paper proposes the Laws of Reasoning (LORE), simple rules that say how much a model should think and how accurate it can be as problems get harder.

#Large Reasoning Models#Laws of Reasoning#Compute Law

RadarGen: Automotive Radar Point Cloud Generation from Cameras

Intermediate
Tomer Borreda, Fangqiang Ding et al.Dec 19arXiv

RadarGen is a tool that learns to generate realistic car radar point clouds just from multiple camera views.

#automotive radar#radar point cloud generation#latent diffusion

Region-Constraint In-Context Generation for Instructional Video Editing

Intermediate
Zhongwei Zhang, Fuchen Long et al.Dec 19arXiv

ReCo is a new way to edit videos just by telling the computer what to change with words, no extra masks needed.

#instruction-based video editing#in-context generation#region constraint

InsertAnywhere: Bridging 4D Scene Geometry and Diffusion Models for Realistic Video Object Insertion

Intermediate
Hoiyeong Jin, Hyojin Jang et al.Dec 19arXiv

InsertAnywhere is a two-stage system that lets you add a new object into any video so it looks like it was always there.

#video object insertion#4D scene geometry#diffusion video generation

GroundingME: Exposing the Visual Grounding Gap in MLLMs through Multi-Dimensional Evaluation

Intermediate
Rang Li, Lei Li et al.Dec 19arXiv

Visual grounding is when an AI finds the exact thing in a picture that a sentence is talking about, and this paper shows today’s big vision-language AIs are not as good at it as we thought.

#visual grounding#multimodal large language models#benchmark

3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework

Intermediate
Tobias Sautter, Jan-Niklas Dihlmann et al.Dec 19arXiv

3D-RE-GEN turns a single photo of a room into a full 3D scene with separate, textured objects and a usable background.

#single-image 3D reconstruction#scene composition#context-aware inpainting

UCoder: Unsupervised Code Generation by Internal Probing of Large Language Models

Intermediate
Jiajun Wu, Jian Yang et al.Dec 19arXiv

The paper introduces UCoder, a way to teach a code-generating AI to get better without using any outside datasets, not even unlabeled code.

#unsupervised code generation#self-training#internal probing

Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers

Intermediate
Zeyuan Allen-ZhuDec 19arXiv

The paper introduces Canon layers, tiny add-ons that let nearby words share information directly, like passing notes along a row of desks.

#Canon layers#horizontal information flow#transformer architecture
4950515253