Papers (1262)

GARDO: Reinforcing Diffusion Models without Reward Hacking

GARDO is a new way to fine-tune text-to-image diffusion models with reinforcement learning without getting tricked by bad reward signals.

#GARDO #reward hacking #gated KL regularization

Factorized Learning for Temporally Grounded Video-Language Models

Intermediate

Wenzheng Zeng, Difei Gao et al. · Dec 30 · arXiv

This paper teaches video-language models to first locate when the supporting evidence appears in a video and then answer using that evidence, instead of mixing both steps together.

#temporal grounding #video-language models #evidence tokens

Fantastic Reasoning Behaviors and Where to Find Them: Unsupervised Discovery of the Reasoning Process

Beginner

Zhenyu Zhang, Shujian Zhang et al. · Dec 30 · arXiv

This paper shows a new way (called RISE) to find and control how AI models think without needing any human-made labels.

#RISE #sparse auto-encoder #reasoning vectors

Improving Multi-step RAG with Hypergraph-based Memory for Long-Context Complex Relational Modeling

Intermediate

Chulun Zhou, Chunkang Zhang et al. · Dec 30 · arXiv

Multi-step RAG systems often struggle with long documents because their memory is just a pile of isolated facts, not a connected understanding.

#multi-step RAG #hypergraph memory #hyperedge merging

Pretraining Frame Preservation in Autoregressive Video Memory Compression

Intermediate

Lvmin Zhang, Shengqu Cai et al. · Dec 29 · arXiv

The paper teaches a video model to squeeze long video history into a tiny memory while still keeping sharp details in single frames.

#autoregressive video generation #video memory compression #frame retrieval pretraining

Stream-DiffVSR: Low-Latency Streamable Video Super-Resolution via Auto-Regressive Diffusion

Intermediate

Hau-Shiang Shiu, Chin-Yang Lin et al. · Dec 29 · arXiv

This paper makes diffusion-based video super-resolution (VSR) practical for live, low-latency use by removing the need for future frames and cutting denoising from ~50 steps down to just 4.

#video super-resolution #diffusion model #latent diffusion

Training AI Co-Scientists Using Rubric Rewards

Intermediate

Shashwat Goel, Rishi Hazra et al. · Dec 29 · arXiv

The paper teaches AI to write strong research plans by letting it grade its own work using checklists (rubrics) pulled from real scientific papers.

#AI co-scientist #research plan generation #rubric rewards

Diffusion Knows Transparency: Repurposing Video Diffusion for Transparent Object Depth and Normal Estimation

Intermediate

Shaocong Xu, Songlin Wei et al. · Dec 29 · arXiv

Transparent and shiny objects confuse conventional depth cameras, but video diffusion models have already learned how light bends through and reflects off them.

#video diffusion model #transparent object depth #normal estimation

Web World Models

Intermediate

Jichen Feng, Yifan Zhang et al. · Dec 29 · arXiv

This paper introduces Web World Models (WWMs), a way to build huge, explorable worlds by putting strict rules in code and letting AI write the fun details.

#Web World Model #typed interfaces #deterministic hashing
