Papers (1055)

MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era

Lei Zhang, Mouxiang Chen et al. · Jan 12 · arXiv

MegaFlow is a new system that helps thousands of AI agents practice and test big, messy tasks (like fixing real software bugs) all at once without crashing or wasting money.

#agent orchestration #distributed systems #event-driven architecture

OpenTinker: Separating Concerns in Agentic Reinforcement Learning

Intermediate

Siqi Zhu, Jiaxuan You · Jan 12 · arXiv

OpenTinker is an open-source system that makes training AI agents with reinforcement learning simple, modular, and reusable.

#Reinforcement learning #LLM agents #Agent–environment interaction

Beyond Hard Masks: Progressive Token Evolution for Diffusion Language Models

Intermediate

Linhao Zhong, Linyu Wu et al. · Jan 12 · arXiv

Diffusion Language Models (DLMs) write by polishing whole sentences in several passes instead of one token at a time.

#Diffusion Language Models #Masked Diffusion #Soft Token Distributions

Controlled Self-Evolution for Algorithmic Code Optimization

Intermediate

Tu Hu, Ronghao Chen et al. · Jan 12 · arXiv

The paper introduces Controlled Self-Evolution (CSE), a smarter way for AI to write and improve code quickly under a tight budget of tries.

#Controlled Self-Evolution #Code optimization #Self-evolving agents

VideoLoom: A Video Large Language Model for Joint Spatial-Temporal Understanding

Intermediate

Jiapeng Shi, Junke Wang et al. · Jan 12 · arXiv

VideoLoom is a single AI model that can tell both when something happens in a video and where it happens, at the pixel level.

#Video Large Language Model #Temporal Grounding #Referring Video Object Segmentation

Focal Guidance: Unlocking Controllability from Semantic-Weak Layers in Video Diffusion Models

Intermediate

Yuanyang Yin, Yufan Deng et al. · Jan 12 · arXiv

Image-to-Video models often keep the picture looking right but ignore parts of the text instructions.

#Image-to-Video generation #Diffusion Transformer #Controllability

MeepleLM: A Virtual Playtester Simulating Diverse Subjective Experiences

Intermediate

Zizhen Li, Chuanhao Li et al. · Jan 12 · arXiv

MeepleLM is a special AI that reads a board game’s rulebook and pretends to be different kinds of players to give helpful, honest feedback.

#virtual playtesting #persona-aligned critique #MDA reasoning

Lost in the Noise: How Reasoning Models Fail with Contextual Distractors

Intermediate

Seongyun Lee, Yongrae Jo et al. · Jan 12 · arXiv

The paper shows that when we give AI lots of extra text, even harmless extra text, it can get badly confused—sometimes losing up to 80% of its accuracy.

#NoisyBench #Rationale-Aware Reward #RARE

Dr. Zero: Self-Evolving Search Agents without Training Data

Intermediate

Zhenrui Yue, Kartikeya Upasani et al. · Jan 11 · arXiv

Dr. Zero is a pair of AI agents (a Proposer and a Solver) that teach each other to do web-search-based reasoning without any human-written training data.

#Dr. Zero #self-evolution #proposer-solver

Solar Open Technical Report

Intermediate

Sungrae Park, Sanghoon Kim et al. · Jan 11 · arXiv

Solar Open is a giant bilingual AI (102 billion parameters) that focuses on helping underserved languages like Korean catch up with English-level AI quality.

#Solar Open #Mixture-of-Experts #bilingual LLM

X-Coder: Advancing Competitive Programming with Fully Synthetic Tasks, Solutions, and Tests

Intermediate

Jie Wu, Haoling Li et al. · Jan 11 · arXiv

X-Coder shows that models can learn expert-level competitive programming using data that is 100% synthetic—no real contest problems needed.

#competitive programming #synthetic data generation #feature-based synthesis

Watching, Reasoning, and Searching: A Video Deep Research Benchmark on Open Web for Agentic Video Reasoning

Intermediate

Chengwen Liu, Xiaomin Yu et al. · Jan 11 · arXiv

VideoDR is a new benchmark that tests if AI can watch a video, pull out key visual clues, search the open web, and chain the clues together to find one verifiable answer.

#video deep research #multimodal reasoning #open-domain question answering
