Papers (1262)

MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models

Multimodal AI models handle text, images, and audio, but signal magnitudes differ widely across these modalities, which breaks standard low-bit compression methods.

#post-training quantization #multimodal LLM #channel-wise smoothing

Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling

Intermediate

Yong Liu, Xingjian Su et al. · Mar 5 · arXiv

Timer-S1 is a huge time-series model (8.3B parameters, only 0.75B used per step) that predicts the future by thinking step-by-step inside one forward pass.

#time series forecasting #foundation models #Mixture-of-Experts

HiMAP-Travel: Hierarchical Multi-Agent Planning for Long-Horizon Constrained Travel

Intermediate

The Viet Bui, Wenjun Li et al. · Mar 5 · arXiv

HiMAP-Travel is a team-based AI planner that splits a long trip into daily chunks so it can follow tough rules like budgets without drifting off course.

#hierarchical planning #multi-agent systems #constraint drift

DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval

Intermediate

Maojun Sun, Yue Wu et al. · Mar 5 · arXiv

DARE is a new way for AI assistants to find the right R functions by also looking at what the data looks like, not just the words in the question.

#distribution-aware retrieval #RPKB #RCodingAgent

Interactive Benchmarks

Beginner

Baoqing Yue, Zihan Zhu et al. · Mar 5 · arXiv

This paper says we should test AI the way real life works: by letting it ask questions, gather clues, and make smart moves step by step under a limited budget.

#interactive benchmarks #information acquisition #budgeted reasoning

RoboMME: Benchmarking and Understanding Memory for Robotic Generalist Policies

Intermediate

Yinpei Dai, Hongze Fu et al. · Mar 4 · arXiv

RoboMME is a new, big test playground that checks whether robot brains can remember important things over time, not just what they see right now.

#robot memory #long-horizon manipulation #vision-language-action (VLA)

Helios: Real Real-Time Long Video Generation Model

Intermediate

Shenghai Yuan, Yuanyang Yin et al. · Mar 4 · arXiv

Helios is a 14-billion-parameter video model that can make minute-long videos in real time at about 19.5 frames per second on a single NVIDIA H100 GPU.

#real-time video generation #long video diffusion #autoregressive diffusion

ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors

Beginner

Zihao Huang, Tianqi Liu et al. · Mar 4 · arXiv

ArtHOI is a new zero-shot method that makes people and everyday articulated objects (like doors, drawers, and fridges) move together realistically using only a single generated video as guidance.

#articulated human-object interaction #4D reconstruction #optical flow segmentation

$V_1$: Unifying Generation and Self-Verification for Parallel Reasoners

Intermediate

Harman Singh, Xiuyu Li et al. · Mar 4 · arXiv

The paper shows that when a model compares two of its own answers head-to-head, it picks the right one more often than when it judges each answer alone.

#pairwise self-verification #test-time scaling #parallel reasoning

CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video

Intermediate

Lingen Li, Guangzhi Wang et al. · Mar 4 · arXiv

CubeComposer is a new AI method that turns a normal forward-facing video into a full 360° VR video at true 4K quality without using super-resolution upscaling.

#360° video generation #cubemap #spatio-temporal autoregression

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Beginner

Zhenting Wang, Huancheng Chen et al. · Mar 4 · arXiv

This paper teaches long-horizon AI agents to recall past experience exactly without stuffing their whole memory into context at once.

#indexed memory #LLM agents #long-horizon tasks

RIVER: A Real-Time Interaction Benchmark for Video LLMs

Intermediate

Yansong Shi, Qingsong Zhao et al. · Mar 4 · arXiv

RIVER Bench is a new test that checks how well AI can watch a video stream and talk with you in real time.

#RIVER Bench #online video understanding #multimodal large language models
