Papers (1262)

Code2Math: Can Your Code Agent Effectively Evolve Math Problems Through Exploration?

This paper shows that code-writing AI agents can take an existing math problem and automatically turn it into a new, harder one while keeping it solvable.

#code agents #multi-agent systems #mathematical reasoning

Specificity-aware reinforcement learning for fine-grained open-world classification

Intermediate

Samuele Angheben, Davide Berasi et al. · Mar 3 · arXiv

This paper teaches AI to name things in pictures very specifically (like “golden retriever” instead of just “dog”) without making more mistakes.

#open-world classification #fine-grained recognition #large multimodal models

Chain of World: World Model Thinking in Latent Motion

Intermediate

Fuxiang Yang, Donglin Di et al. · Mar 3 · arXiv

Robots learn better when they think about how things move over time, not by redrawing every pixel of a video.

#Vision-Language-Action #World Model #Latent Motion

BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing?

Intermediate

Guoxin Chen, Fanzhe Meng et al. · Mar 3 · arXiv

BeyondSWE is a new benchmark that tests code agents on tougher, more real-life tasks than single-repo bug fixing.

#BeyondSWE #code agents #software engineering benchmark

NOVA: Sparse Control, Dense Synthesis for Pair-Free Video Editing

Intermediate

Tianlin Pan, Jiayi Dai et al. · Mar 3 · arXiv

NOVA is a new video editor that lets you change a few key frames (sparse control) while it carefully keeps the original motion and background details (dense synthesis).

#video editing #pair-free training #sparse control

Next Embedding Prediction Makes World Models Stronger

Intermediate

George Bredis, Nikita Balagansky et al. · Mar 3 · arXiv

NE-Dreamer is a model-based reinforcement learning agent that skips rebuilding pixels and instead learns by predicting the next step’s hidden features.

#model-based reinforcement learning #world models #next-embedding prediction

DREAM: Where Visual Understanding Meets Text-to-Image Generation

Beginner

Chao Li, Tianhong Li et al. · Mar 3 · arXiv

DREAM is a single model that both understands images (like CLIP) and generates images from text (like top text-to-image models).

#DREAM #contrastive learning #masked autoregressive modeling

Heterogeneous Agent Collaborative Reinforcement Learning

Intermediate

Zhixia Zhang, Zixuan Huang et al. · Mar 3 · arXiv

This paper introduces HACRL, a way for different kinds of AI agents to learn together during training but still work alone during use.

#HACRL #HACPO #heterogeneous agents

How Controllable Are Large Language Models? A Unified Evaluation across Behavioral Granularities

Beginner

Ziwen Xu, Kewei Xu et al. · Mar 3 · arXiv

Large language models can act unpredictably in sensitive places like schools, hospitals, and customer support, so we need reliable ways to guide how they talk and behave.

#LLM controllability #behavioral granularity #hierarchical evaluation

Track4World: Feedforward World-centric Dense 3D Tracking of All Pixels

Intermediate

Jiahao Lu, Jiayi Xu et al. · Mar 3 · arXiv

Track4World is a fast, feedforward model that tracks the 3D path of every pixel in a video using just one camera.

#dense 3D tracking #scene flow #2D-to-3D correlation

MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning

Intermediate

Jiejun Tan, Zhicheng Dou et al. · Mar 3 · arXiv

MemSifter is a smart helper that picks the right memories for a big AI so the big AI doesn’t have to read everything.

#long-term memory #LLM retrieval #proxy model

ParEVO: Synthesizing Code for Irregular Data: High-Performance Parallelism through Agentic Evolution

Intermediate

Liu Yang, Zeyu Nie et al. · Mar 3 · arXiv

ParEVO teaches AI to write fast, safe parallel code for messy, irregular data like big graphs and uneven trees.

#ParEVO #ParlayLib #irregular parallelism
