Robots often get confused on long, multi-step tasks when they only see the final goal image and try to guess the next move directly.
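To make that failure concrete, here is a toy sketch (my own, with made-up function names, not any paper's code): the policy conditions only on the current frame and the distant goal image, and planning an intermediate subgoal first is the usual remedy.

```python
import numpy as np

# Hypothetical setup being critiqued: the policy sees only the current frame
# and the final goal image, then must predict one action directly.
def direct_policy(current_frame: np.ndarray, goal_image: np.ndarray) -> np.ndarray:
    # Stand-in "network": pixel difference between goal and current as a feature.
    features = (goal_image - current_frame).mean(axis=(0, 1))
    return features  # pretend this is an action vector

# On a long task the goal image alone under-specifies the next step, so a
# common fix is to plan a nearer subgoal first and condition on that instead.
def subgoal_policy(current_frame, goal_image, subgoal_planner):
    subgoal = subgoal_planner(current_frame, goal_image)  # nearest milestone image
    return direct_policy(current_frame, subgoal)
```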
Mixture-of-Experts (MoE) models use many small specialist networks (experts) and a router to pick which experts handle each token, but the router isn’t explicitly taught what each expert is good at.
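A minimal sketch of the standard top-k routing this describes (a generic MoE layer in PyTorch, not any specific paper's code); note that the router is trained only through the task loss, so nothing explicitly tells it what each expert should specialize in.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Minimal mixture-of-experts layer: a learned router scores all experts,
    keeps the top-k per token, and mixes their outputs by the routing weights."""
    def __init__(self, dim=64, num_experts=8, k=2):
        super().__init__()
        self.router = nn.Linear(dim, num_experts)   # one score per expert
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(dim, dim * 4), nn.GELU(), nn.Linear(dim * 4, dim))
             for _ in range(num_experts)]
        )
        self.k = k

    def forward(self, x):                           # x: (tokens, dim)
        scores = self.router(x)                     # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)  # top-k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e            # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

x = torch.randn(10, 64)
print(TopKMoE()(x).shape)  # torch.Size([10, 64])
```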
MindWatcher is a smart AI agent that can think step by step and decide when to use tools such as web search, image zooming, and a code interpreter (for calculations) to solve tough, multi-step problems.
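A hedged sketch of the think-then-act loop such an agent runs; the prompt format, tool names, and `llm` callable below are stand-ins, not MindWatcher's actual interface.

```python
# Toy agent loop: the model thinks, optionally calls a tool, sees the result,
# and repeats until it emits a final answer or runs out of steps.
def run_agent(question: str, llm, tools: dict, max_steps: int = 8) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)                 # model thinks, then picks an action
        transcript += reply + "\n"
        if reply.startswith("FINAL:"):          # model decided it is done
            return reply[len("FINAL:"):].strip()
        if reply.startswith("TOOL:"):           # e.g. "TOOL: search | quantum dots"
            name, _, arg = reply[len("TOOL:"):].partition("|")
            result = tools[name.strip()](arg.strip())
            transcript += f"OBSERVATION: {result}\n"
    return "No answer within the step budget."

tools = {
    "search": lambda q: f"(top web results for {q!r})",     # web search stub
    "zoom":   lambda r: f"(cropped, enlarged region {r})",  # image zooming stub
    "python": lambda c: str(eval(c)),                       # toy code calculator
}
```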
CoLog is a new AI system that reads computer logs like a story and spots both single strange events (point anomalies) and strange patterns over time (collective anomalies).
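A toy illustration of the two anomaly types (not CoLog's method): score single events by how surprising each one is on its own, and score windows by their average surprise.

```python
import numpy as np

def point_scores(event_logprobs: np.ndarray) -> np.ndarray:
    # Point anomaly: one weird event, i.e. low probability under a model
    # of normal logs, so higher surprise = more anomalous.
    return -event_logprobs

def collective_scores(event_logprobs: np.ndarray, window: int = 10) -> np.ndarray:
    # Collective anomaly: a stretch of individually plausible events that
    # is strange as a pattern; here, average surprise over a sliding window.
    kernel = np.ones(window) / window
    return np.convolve(-event_logprobs, kernel, mode="same")

logp = np.log(np.clip(np.random.rand(100), 1e-6, 1.0))  # fake per-event log-probs
print(point_scores(logp).argmax(), collective_scores(logp).argmax())
```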
This survey links how human brains remember things to how AI agents should remember things so they can act smarter over time.
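As a rough picture of the mapping such surveys draw, here is a sketch using the standard working/episodic/semantic split from cognitive science; the class and field names are my own, not taken from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class AgentMemory:
    working:  list = field(default_factory=list)   # current context: small, fast, transient
    episodic: list = field(default_factory=list)   # records of specific past episodes
    semantic: dict = field(default_factory=dict)   # distilled general facts and skills

    def end_of_episode(self, transcript: str, lessons: dict):
        self.episodic.append(transcript)   # remember what happened
        self.semantic.update(lessons)      # keep only the reusable takeaways
        self.working.clear()               # working memory resets between tasks
```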
YOLO-Master is a new real-time object detector that uses a Mixture-of-Experts (MoE) design to spend more compute on hard scenes and less on easy ones.
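A hypothetical sketch of difficulty-adaptive routing in that spirit (not YOLO-Master's actual head): a predicted difficulty score decides how many experts each image activates, so cluttered scenes get more compute.

```python
import torch
import torch.nn as nn

class AdaptiveDetHead(nn.Module):
    def __init__(self, dim=128, num_experts=4):
        super().__init__()
        self.difficulty = nn.Linear(dim, 1)     # predicts how hard the scene is
        self.gate = nn.Linear(dim, num_experts)
        self.experts = nn.ModuleList([nn.Linear(dim, dim) for _ in range(num_experts)])

    def forward(self, feat):                    # feat: (batch, dim) pooled features
        hard = torch.sigmoid(self.difficulty(feat))               # (batch, 1) in [0, 1]
        k = 1 + (hard * (len(self.experts) - 1)).round().long()   # experts per image
        gates = self.gate(feat).softmax(dim=-1)
        out = torch.zeros_like(feat)
        for i in range(feat.size(0)):           # per image: keep only its top-k gates
            w, idx = gates[i].topk(int(k[i]))
            w = w / w.sum()                     # renormalize the kept weights
            for weight, e in zip(w, idx):
                out[i] += weight * self.experts[int(e)](feat[i])
        return out
```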
KernelEvolve is a smart, self-improving system that writes and tunes tiny but crucial programs (kernels) so AI runs fast on many kinds of chips.
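A toy version of the tune-benchmark-keep loop (KernelEvolve also rewrites the kernel code itself; here only tuning knobs are mutated, and `run_kernel` is a placeholder you would supply).

```python
import random
import time

# Hypothetical tuning knobs for a kernel; real search spaces are hardware-specific.
SEARCH_SPACE = {"block": [32, 64, 128, 256], "unroll": [1, 2, 4, 8], "vectorize": [False, True]}

def mutate(cfg):
    knob = random.choice(list(SEARCH_SPACE))
    return {**cfg, knob: random.choice(SEARCH_SPACE[knob])}

def benchmark(cfg, run_kernel):
    start = time.perf_counter()
    run_kernel(cfg)                  # compile and run the kernel with this config
    return time.perf_counter() - start

def evolve(run_kernel, generations=20, population=8):
    pool = [{k: random.choice(v) for k, v in SEARCH_SPACE.items()} for _ in range(population)]
    for _ in range(generations):
        scored = sorted(pool, key=lambda c: benchmark(c, run_kernel))
        survivors = scored[: population // 2]            # keep the fastest half
        pool = survivors + [mutate(random.choice(survivors)) for _ in survivors]
    return min(pool, key=lambda c: benchmark(c, run_kernel))
```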
UniMAGE is a single “director” AI that writes a film-like script and draws the key pictures for each shot, so stories stay clear and characters look the same from scene to scene.
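One simple way to get the consistency this describes is to thread a fixed character sheet through every shot's keyframe prompt; the sketch below is my illustration of that data flow, not UniMAGE's actual pipeline.

```python
from dataclasses import dataclass

@dataclass
class Character:
    name: str
    look: str            # fixed visual description reused in every shot

@dataclass
class Shot:
    action: str
    characters: list

def keyframe_prompt(shot: Shot) -> str:
    # Repeating the same look in every prompt keeps the character on-model.
    cast = "; ".join(f"{c.name}: {c.look}" for c in shot.characters)
    return f"{shot.action}. Consistent characters: {cast}"

mina = Character("Mina", "short red hair, yellow raincoat, round glasses")
script = [Shot("Mina boards the night train", [mina]),
          Shot("Mina reads a map by lantern light", [mina])]
for shot in script:
    print(keyframe_prompt(shot))   # each shot repeats the identical look description
```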
The paper asks which small, add-on training methods (parameter-efficient fine-tuning, or PEFT) work best when we teach language models with yes/no rewards we can automatically check (RLVR).
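The two ingredients, sketched side by side (my minimal pairing, not the paper's training recipe): a LoRA-style adapter as the PEFT piece, and a binary automatically checkable reward as the RLVR piece.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen base weight plus a small trainable low-rank update (PEFT)."""
    def __init__(self, base: nn.Linear, rank=8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False            # the big base model stays frozen
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # starts as a no-op

    def forward(self, x):
        return self.base(x) + x @ self.A.t() @ self.B.t()

def verifiable_reward(model_answer: str, checker) -> float:
    """RLVR-style reward: 1.0 if an automatic check passes, else 0.0
    (e.g. a unit test passing, or exact match against a known math answer)."""
    return 1.0 if checker(model_answer) else 0.0
```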
SurgWorld teaches surgical robots using videos plus text, then guesses the missing robot moves so we can train good policies without collecting tons of real robot-action data.
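The core trick as described is an inverse-dynamics model: guess the action that connects two consecutive frames, then use it to pseudo-label action-free videos. The sketch below assumes made-up feature and action dimensions, not SurgWorld's architecture.

```python
import torch
import torch.nn as nn

class InverseDynamics(nn.Module):
    def __init__(self, feat_dim=256, action_dim=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2 * feat_dim, 256), nn.ReLU(), nn.Linear(256, action_dim)
        )

    def forward(self, frame_t, frame_t1):      # features of two consecutive frames
        return self.net(torch.cat([frame_t, frame_t1], dim=-1))

def pseudo_label(video_feats, model):
    """Turn a (T, feat_dim) feature sequence into (T-1) inferred actions,
    which can then serve as training targets for a robot policy."""
    return torch.stack([model(video_feats[t], video_feats[t + 1])
                        for t in range(len(video_feats) - 1)])
```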
Co2S is a new way to train segmentation models with very few labels by letting two different students (CLIP and DINOv3) learn together and correct each other.
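A schematic co-training step in that spirit (my simplification of the idea, not Co2S's exact losses): both students fit the few labeled pixels, and each also learns from the other's confident pseudo-labels on unlabeled images.

```python
import torch
import torch.nn.functional as F

def masked_ce(logits, target, mask):
    # logits: (B, C, H, W); target/mask: (B, H, W). Train on confident pixels only.
    picked = logits.permute(0, 2, 3, 1)[mask]          # (N, C) selected pixels
    return F.cross_entropy(picked, target[mask]) if mask.any() else logits.sum() * 0.0

def cotrain_step(student_a, student_b, labeled, unlabeled, thresh=0.9):
    x, y = labeled                                     # the tiny labeled set
    loss = F.cross_entropy(student_a(x), y) + F.cross_entropy(student_b(x), y)

    pa = student_a(unlabeled).softmax(dim=1)           # (B, C, H, W)
    pb = student_b(unlabeled).softmax(dim=1)
    conf_a, lab_a = pa.max(dim=1)                      # each: (B, H, W)
    conf_b, lab_b = pb.max(dim=1)
    # Cross supervision: a learns from b's confident pixels, and vice versa.
    loss = loss + masked_ce(student_a(unlabeled), lab_b, conf_b > thresh)
    loss = loss + masked_ce(student_b(unlabeled), lab_a, conf_a > thresh)
    return loss
```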
The paper shows that teaching a language model with a special “reward-shaped” next-token objective can make later reinforcement learning (RL) work much better.
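One plausible reading of such an objective (a sketch, not the paper's exact formula): weight each token's next-token loss by how good its sequence is, so the model is already pulled toward high-reward behavior before RL begins.

```python
import torch
import torch.nn.functional as F

def reward_shaped_nll(logits, tokens, seq_reward):
    # logits: (B, T, V); tokens: (B, T); seq_reward: (B,) in [0, 1]
    nll = F.cross_entropy(
        logits.reshape(-1, logits.size(-1)), tokens.reshape(-1), reduction="none"
    ).view(tokens.shape)                       # per-token loss, shape (B, T)
    return (seq_reward[:, None] * nll).mean()  # high-reward sequences count more
```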