Papers (1262)

MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models

Yitian Gong, Kuangwei Chen et al. | Feb 11 | arXiv

This paper builds a new audio tokenizer, MOSS-Audio-Tokenizer, that converts sound into discrete tokens, much as text tokenizers split sentences into tokens.

#audio tokenizer #causal transformer #residual vector quantization


DeepImageSearch: Benchmarking Multimodal Agents for Context-Aware Image Retrieval in Visual Histories

Intermediate

Chenlong Deng, Mengjie Deng et al. | Feb 11 | arXiv

Most image search systems judge each photo by itself, which fails when clues are split across many photos taken over time.

#context-aware image retrieval #multimodal agents #visual history exploration


Benchmarking Large Language Models for Knowledge Graph Validation

Beginner

Farzad Shami, Stefano Marchesin et al. | Feb 11 | arXiv

Knowledge graphs are like giant fact maps, and keeping every fact correct is hard and important.

#Knowledge Graph Validation #Fact Checking #Large Language Models


VESPO: Variational Sequence-Level Soft Policy Optimization for Stable Off-Policy LLM Training

Intermediate

Guobin Shen, Chenxiao Zhao et al. | Feb 11 | arXiv

VESPO is a new, stable way to train language models with reinforcement learning even when training data comes from older or mismatched policies.

#VESPO #off-policy reinforcement learning #importance sampling


How Do Decoder-Only LLMs Perceive Users? Rethinking Attention Masking for User Representation Learning

Intermediate

Jiahao Yuan, Yike Xu et al. | Feb 11 | arXiv

Decoder-only language models can be great at making user profiles (embeddings), but attention masking, which controls how the model is allowed to look at the sequence, changes how good those profiles are.

#decoder-only LLM #attention masking #causal attention


Online Causal Kalman Filtering for Stable and Effective Policy Optimization

Intermediate

Shuo He, Lang Feng et al. | Feb 11 | arXiv

Training big language models with reinforcement learning can become unstable because the per-token importance-sampling (IS) ratios swing wildly.

#Kalman filter #importance sampling ratio #policy optimization


Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Intermediate

Ailin Huang, Ang Li et al. | Feb 11 | arXiv

Step 3.5 Flash is a huge but efficient AI model with 196 billion total parameters, of which only about 11 billion are activated per token, so it stays capable while running fast.

#Sparse Mixture-of-Experts #Sliding-Window Attention #Head-wise Gated Attention


MetaphorStar: Image Metaphor Understanding and Reasoning with End-to-End Visual Reinforcement Learning

Intermediate

Chenhao Zhang, Yazhe Niu et al. | Feb 11 | arXiv

Pictures can hide deeper meanings, like a wilted plant meaning someone feels burned out; most AI models miss these hints.

#image metaphor understanding #image implication #visual reinforcement learning


When to Memorize and When to Stop: Gated Recurrent Memory for Long-Context Reasoning

Intermediate

Leheng Sheng, Yongtao Zhang et al. | Feb 11 | arXiv

Long texts overwhelm many language models, which forget important bits and slow down as the context grows.

#gated recurrent memory #update gate #exit gate


LiveMedBench: A Contamination-Free Medical Benchmark for LLMs with Automated Rubric Evaluation

Beginner

Zhiling Yan, Dingjie Song et al. | Feb 10 | arXiv

LiveMedBench is a new, continually updated test for medical AI models that keeps test questions separated from training data, so models cannot score well by memorization.

#LiveMedBench #medical benchmark #data contamination


Blockwise Advantage Estimation for Multi-Objective RL with Verifiable Rewards

Intermediate

Kirill Pavlenko, Alexander Golubev et al. | Feb 10 | arXiv

The paper fixes a common mistake in training language models for multi-part tasks: giving the same reward signal to every token, even when different text parts aim at different goals.

#Blockwise Advantage Estimation #Outcome-Conditioned Baseline #Group Relative Policy Optimization


Latent Thoughts Tuning: Bridging Context and Reasoning with Fused Information in Latent Tokens

Intermediate

Weihao Liu, Dehai Min et al. | Feb 10 | arXiv

The paper introduces LT-Tuning, a way for AI models to “think silently” using special hidden tokens instead of writing every step out loud.

#latent tokens #chain-of-thought #context-prediction fusion

