Papers (1055)

From Data to Behavior: Predicting Unintended Model Behaviors Before Training

Intermediate

Mengru Wang, Zhenqian Xu et al. | Feb 4 | arXiv

Large language models can quietly pick up hidden preferences from training data that looks harmless.

#Data2Behavior #Manipulating Data Features #activation injection

ERNIE 5.0 Technical Report

Intermediate

Haifeng Wang, Hua Wu et al. | Feb 4 | arXiv

ERNIE 5.0 is a single giant model that can read and create text, images, video, and audio by predicting the next pieces step by step, like writing a story one line at a time.

#ERNIE 5.0 #unified autoregressive model #mixture-of-experts

WideSeek-R1: Exploring Width Scaling for Broad Information Seeking via Multi-Agent Reinforcement Learning

Intermediate

Zelai Xu, Zhexuan Xu et al. | Feb 4 | arXiv

WideSeek-R1 teaches a small 4B-parameter language model to act like a well-run team: one leader plans, many helpers work in parallel, and everyone learns together with reinforcement learning.

#width scaling #multi-agent reinforcement learning #orchestration

ASA: Training-Free Representation Engineering for Tool-Calling Agents

Intermediate

Youjin Wang, Run Zhou et al. | Feb 4 | arXiv

The paper finds a strange gap: the model's hidden activations almost perfectly show when it should use a tool, but its actual output often fails to trigger the tool under strict rules.

#activation steering #representation engineering #tool calling

LycheeDecode: Accelerating Long-Context LLM Inference via Hybrid-Head Sparse Decoding

Intermediate

Gang Lin, Dongfang Li et al. | Feb 4 | arXiv

Long texts make language models slow because they must keep and re-check a huge memory called the KV cache for every new word they write.

#long-context LLM #sparse attention #head specialization

EgoActor: Grounding Task Planning into Spatial-aware Egocentric Actions for Humanoid Robots via Visual-Language Models

Intermediate

Yu Bai, MingMing Yu et al. | Feb 4 | arXiv

EgoActor is a vision-language model that turns everyday instructions like 'Go to the door and say hi' into step-by-step, egocentric actions a humanoid robot can actually do.

#EgoActing #vision-language model #humanoid robot

Beyond Unimodal Shortcuts: MLLMs as Cross-Modal Reasoners for Grounded Named Entity Recognition

Intermediate

Jinlong Ma, Yu Zhang et al. | Feb 4 | arXiv

The paper teaches multimodal large language models (MLLMs) to stop guessing from just text or just images and instead check both together before answering.

#GMNER #Multimodal Large Language Models #Modality Bias

No One-Size-Fits-All: Building Systems For Translation to Bashkir, Kazakh, Kyrgyz, Tatar and Chuvash Using Synthetic And Original Data

Intermediate

Dmitry Karpov | Feb 4 | arXiv

The paper tries several different approaches to translating into five low-resource Turkic languages, instead of forcing one method to fit all.

#low-resource machine translation #Turkic languages #NLLB-200

Proxy Compression for Language Modeling

Intermediate

Lin Zheng, Xinyu Li et al. | Feb 4 | arXiv

Most language models are trained on compressed tokens, which makes training fast but ties the model to a specific tokenizer.

#proxy compression #byte-level language modeling #tokenizer-free inference

Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning

Intermediate

Yansong Ning, Jun Fang et al. | Feb 4 | arXiv

Agent-Omit teaches AI agents to skip unneeded thinking and old observations, cutting tokens while keeping accuracy high.

#LLM agents #reinforcement learning #agentic RL

Steering LLMs via Scalable Interactive Oversight

Intermediate

Enyu Zhou, Zhiheng Xi et al. | Feb 4 | arXiv

The paper tackles a common problem: people can ask AI to do big, complex tasks, but they can’t always explain exactly what they want or check the results well.

#scalable oversight #interactive alignment #requirement elicitation

Training Data Efficiency in Multimodal Process Reward Models

Intermediate

Jinyuan Li, Chengsong Huang et al. | Feb 4 | arXiv

Multimodal Process Reward Models (MPRMs) teach AI to judge each step of a picture-and-text reasoning process, not just the final answer.

#Multimodal Process Reward Model #Process Supervision #Monte Carlo Annotation
