Papers (1055)

Meta-RL Induces Exploration in Language Agents

Yulun Jiang, Liangze Jiang et al. · Dec 18 · arXiv

This paper introduces LAMER, a Meta-RL training framework that teaches language agents to explore first and then use what they learned to solve tasks faster.

#Meta-Reinforcement Learning#Language Agents#Exploration vs Exploitation

PhysBrain: Human Egocentric Data as a Bridge from Vision Language Models to Physical Intelligence

Intermediate

Xiaopeng Lin, Shijie Lian et al. · Dec 18 · arXiv

Robots learn best from what they would actually see, which is a first-person (egocentric) view, but most AI models are trained on third-person videos and get confused.

#egocentric vision#first-person video#vision-language model

Kling-Omni Technical Report

Intermediate

Kling Team, Jialu Chen et al. · Dec 18 · arXiv

Kling-Omni is a single, unified model that can understand text, images, and videos together and then make or edit high-quality videos from those mixed instructions.

#multimodal visual language#MVL#prompt enhancer

Vision-Language-Action Models for Autonomous Driving: Past, Present, and Future

Intermediate

Tianshuai Hu, Xiaolu Liu et al. · Dec 18 · arXiv

Traditional self-driving used separate boxes for seeing, thinking, and acting, but tiny mistakes in early boxes could snowball into big problems later.

#Vision-Language-Action#End-to-End Autonomous Driving#Dual-System VLA

DataFlow: An LLM-Driven Framework for Unified Data Preparation and Workflow Automation in the Era of Data-Centric AI

Intermediate

Hao Liang, Xiaochen Ma et al. · Dec 18 · arXiv

DataFlow is a building-block system that helps large language models get better data by unifying how we create, clean, check, and organize that data.

#DataFlow#LLM data preparation#operator pipeline

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Intermediate

Bingxiang He, Zekai Qu et al. · Dec 18 · arXiv

JustRL shows that a tiny, steady recipe for reinforcement learning (RL) can make a 1.5B-parameter language model much better at math without fancy tricks.

#Reinforcement Learning#GRPO#Policy Entropy

REGLUE Your Latents with Global and Local Semantics for Entangled Diffusion

Intermediate

Giorgos Petsangourakis, Christos Sgouropoulos et al. · Dec 18 · arXiv

Latent diffusion models are great at making images but learn the meaning of scenes slowly because their training goal mostly teaches them to clean up noise, not to understand objects and layouts.

#latent diffusion#REGLUE#representation entanglement

DeContext as Defense: Safe Image Editing in Diffusion Transformers

Intermediate

Linghui Shen, Mingyue Cui et al. · Dec 18 · arXiv

This paper protects your photos from being misused by new AI image editors that can copy your face or style from just one picture.

#Diffusion Transformer#cross-attention#in-context image editing

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Intermediate

Yuxin Wang, Lei Ke et al. · Dec 18 · arXiv

This paper teaches a vision-language model to first find objects in real 3D space (not just 2D pictures) and then reason about where things are.

#3D grounding#vision-language models#spatial reasoning

StageVAR: Stage-Aware Acceleration for Visual Autoregressive Models

Intermediate

Senmao Li, Kai Wang et al. · Dec 18 · arXiv

StageVAR makes image-generating AI much faster by recognizing that early steps set the meaning and structure, while later steps just polish details.

#Visual Autoregressive Modeling#Next-Scale Prediction#Stage-Aware Acceleration

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Intermediate

Wanghan Xu, Yuhao Zhou et al. · Dec 18 · arXiv

The paper defines Scientific General Intelligence (SGI) as an AI that can do science like a human scientist across the full loop: study, imagine, test, and understand.

#Scientific General Intelligence#Practical Inquiry Model#Scientist-aligned benchmark

Adaptation of Agentic AI

Intermediate

Pengcheng Jiang, Jiacheng Lin et al. · Dec 18 · arXiv

This paper organizes how AI agents learn and improve into one simple map with four roads: A1, A2, T1, and T2.

#agentic AI#adaptation#A1 A2 T1 T2
