How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (1055)


A unified framework for detecting point and collective anomalies in operating system logs via collaborative transformers

Intermediate
Mohammad Nasirzadeh, Jafar Tahmoresnezhad et al. · Dec 29 · arXiv

CoLog is a new AI system that reads computer logs like a story and spots both single strange events (point anomalies) and strange patterns over time (collective anomalies).

#log anomaly detection · #multimodal learning · #collaborative transformer

AI Meets Brain: Memory Systems from Cognitive Neuroscience to Autonomous Agents

Intermediate
Jiafeng Liang, Hao Li et al. · Dec 29 · arXiv

This survey links how human brains remember things to how AI agents should remember things, so agents can act smarter over time.

#agent memory · #episodic memory · #semantic memory

YOLO-Master: MOE-Accelerated with Specialized Transformers for Enhanced Real-time Detection

Intermediate
Xu Lin, Jinlong Peng et al. · Dec 29 · arXiv

YOLO-Master is a new real-time object detector that uses a Mixture-of-Experts (MoE) design to spend more compute on hard scenes and less on easy ones.

#YOLO-Master · #Mixture of Experts · #ES-MoE

KernelEvolve: Scaling Agentic Kernel Coding for Heterogeneous AI Accelerators at Meta

Intermediate
Gang Liao, Hongsen Qin et al. · Dec 29 · arXiv

KernelEvolve is a self-improving system that writes and tunes the small but crucial programs (kernels) that let AI run fast on many kinds of chips.

#KernelEvolve · #agentic kernel coding · #graph-based search

Bridging Your Imagination with Audio-Video Generation via a Unified Director

Intermediate
Jiaxu Zhang, Tianshu Hu et al. · Dec 29 · arXiv

UniMAGE is a single "director" AI that writes a film-like script and draws the key picture for each shot, so stories stay coherent and characters look the same from scene to scene.

#Unified Director Model · #Mixture-of-Transformers · #Interleaved Concept Learning

Evaluating Parameter Efficient Methods for RLVR

Intermediate
Qingyu Yin, Yulun Wu et al. · Dec 29 · arXiv

The paper asks which lightweight, add-on training methods (parameter-efficient fine-tuning, PEFT) work best when we train language models with verifiable yes/no rewards (RLVR).

#RLVR · #parameter-efficient fine-tuning · #LoRA

SurgWorld: Learning Surgical Robot Policies from Videos via World Modeling

Intermediate
Yufan He, Pengfei Guo et al. · Dec 29 · arXiv

SurgWorld trains surgical robots from videos plus text, then infers the missing robot actions so good policies can be learned without collecting large amounts of real robot-action data.

#surgical world model · #SATA dataset · #inverse dynamics model

Toward Stable Semi-Supervised Remote Sensing Segmentation via Co-Guidance and Co-Fusion

Intermediate
Yi Zhou, Xuechao Zou et al. · Dec 28 · arXiv

Co2S is a new way to train segmentation models with very few labels, letting two different students (CLIP and DINOv3) learn together and correct each other.

#semi-supervised segmentation · #remote sensing · #pseudo-label drift

Diversity or Precision? A Deep Dive into Next Token Prediction

Intermediate
Haoyuan Wu, Hai Wang et al. · Dec 28 · arXiv

The paper shows that pretraining a language model with a special "reward-shaped" next-token objective can make subsequent reinforcement learning (RL) work much better.

#next-token prediction · #cross-entropy as policy gradient · #reward shaping

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

Intermediate
Kai Liu, Jungang Li et al. · Dec 28 · arXiv

JavisGPT is a single AI that can both understand sounding videos (audio and video together) and generate new ones that stay in sync.

#multimodal large language model · #audio-video synchronization · #SyncFusion

On the Role of Discreteness in Diffusion LLMs

Intermediate
Ziqi Jin, Bin Wang et al. · Dec 27 · arXiv

The paper asks what a well-designed diffusion-based language model should look like and identifies five must-have properties.

#diffusion language models · #smooth corruption · #discrete tokens

Dream-VL & Dream-VLA: Open Vision-Language and Vision-Language-Action Models with Diffusion Language Model Backbone

Intermediate
Jiacheng Ye, Shansan Gong et al. · Dec 27 · arXiv

Dream-VL and Dream-VLA use a diffusion language model backbone to understand images, describe them, and plan actions better than many standard (autoregressive) models.

#diffusion language model · #vision-language model · #vision-language-action