Papers784

All Beginner Intermediate Advanced

All Sources arXiv

DenseGRPO: From Sparse to Dense Reward for Flow Matching Model Alignment

Intermediate

Haoyou Deng, Keyu Yan et al.Jan 28arXiv

DenseGRPO teaches image models using lots of small, timely rewards instead of one final score at the end.

#DenseGRPO#flow matching#GRPO

Spark: Strategic Policy-Aware Exploration via Dynamic Branching for Long-Horizon Agentic Learning

Intermediate

Jinyang Wu, Shuo Yang et al.Jan 28arXiv

SPARK is a new way to train AI agents that saves compute by exploring more only at the most important moments.

#SPARK#dynamic branching#strategic exploration

VERGE: Formal Refinement and Guidance Engine for Verifiable LLM Reasoning

Intermediate

Vikash Singh, Darion Cassel et al.Jan 27arXiv

VERGE is a teamwork system where an AI writer (an LLM) works with a strict math checker (an SMT solver) to make answers both smart and logically sound.

#VERGE#neurosymbolic reasoning#SMT solver

Self-Distillation Enables Continual Learning

Intermediate

Idan Shenfeld, Mehul Damani et al.Jan 27arXiv

This paper shows a simple way for AI models to keep learning new things without forgetting what they already know.

#Self-Distillation Fine-Tuning#On-Policy Distillation#Continual Learning

Post-LayerNorm Is Back: Stable, ExpressivE, and Deep

Intermediate

Chen Chen, Lai WeiJan 27arXiv

Big AI models used to get better by getting wider or reading longer texts, but those tricks are slowing down.

#Keel#Post-LayerNorm#Pre-LayerNorm

Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

Intermediate

Jialong Wu, Xiaoying Zhang et al.Jan 27arXiv

The paper argues that making and using pictures inside an AI’s thinking can help it reason more like humans, especially for real-world, physical and spatial problems.

#visual world modeling#multimodal chain-of-thought#unified multimodal models

Youtu-VL: Unleashing Visual Potential via Unified Vision-Language Supervision

Intermediate

Zhixiang Wei, Yi Li et al.Jan 27arXiv

Youtu-VL is a new kind of vision-language model that learns to predict both words and tiny image pieces, not just words.

#Vision-Language Models#Unified Autoregressive Supervision#Visual Tokenization

AACR-Bench: Evaluating Automatic Code Review with Holistic Repository-Level Context

Intermediate

Lei Zhang, Yongda Yu et al.Jan 27arXiv

AACR-Bench is a new test set that checks how well AI can do code reviews using the whole project, not just one file.

#Automated Code Review#Benchmark#Repository-level Context

Selective Steering: Norm-Preserving Control Through Discriminative Layer Selection

Intermediate

Quy-Anh Dang, Chris NgoJan 27arXiv

Selective Steering is a new way to gently nudge a language model’s inner thoughts without breaking its flow or skills.

#Selective Steering#Activation Steering#Angular Steering

Revisiting Parameter Server in LLM Post-Training

Intermediate

Xinyi Wan, Penghui Qi et al.Jan 27arXiv

Large language model (LLM) post-training has uneven work per GPU because some text sequences are much longer than others.

#On-Demand Communication#Fully Sharded Data Parallel#Parameter Server

Innovator-VL: A Multimodal Large Language Model for Scientific Discovery

Intermediate

Zichen Wen, Boxue Yang et al.Jan 27arXiv

Innovator-VL is a new multimodal AI model that understands both pictures and text to help solve science problems without needing mountains of special data.

#Innovator-VL#multimodal large language model#scientific reasoning

Group Distributionally Robust Optimization-Driven Reinforcement Learning for LLM Reasoning

Intermediate

Kishan Panaganti, Zhenwen Liang et al.Jan 27arXiv

LLMs are usually trained by treating every question the same and giving each one the same number of tries, which wastes compute on easy problems and neglects hard ones.

#LLM reasoning#Reinforcement Learning (RL)#GRPO

15 16 17 18 19