Papers791

All Beginner Intermediate Advanced

All Sources arXiv

KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs

Intermediate

Yixuan Tang, Yi YangJan 3arXiv

This paper shows how to get strong text embeddings from decoder-only language models without any training.

#text embeddings#decoder-only LLMs#causal attention

The Reasoning-Creativity Trade-off: Toward Creativity-Driven Problem Solving

Intermediate

Max Ruiz Luyten, Mihaela van der SchaarJan 2arXiv

Modern AI models can get very good at being correct, but in the process they often lose their ability to think in many different ways.

#Distributional Creative Reasoning#diversity energy#creativity kernel

Fast-weight Product Key Memory

Intermediate

Tianyu Zhao, Llion JonesJan 2arXiv

The paper introduces Fast-weight Product Key Memory (FwPKM), a memory layer that can quickly learn from the current text it reads, not just from past training.

#Fast-weight memory#Product Key Memory#Sparse retrieval

Avatar Forcing: Real-Time Interactive Head Avatar Generation for Natural Conversation

Intermediate

Taekyung Ki, Sangwon Jang et al.Jan 2arXiv

This paper builds a real-time talking-listening head avatar that reacts naturally to your words, tone, nods, and smiles in about half a second.

#interactive avatar#talking head generation#causal diffusion forcing

CPPO: Contrastive Perception for Vision Language Policy Optimization

Intermediate

Ahmad Rezaei, Mohsen Gholami et al.Jan 1arXiv

CPPO is a new way to fine‑tune vision‑language models so they see pictures more accurately before they start to reason.

#CPPO#Contrastive Perception Loss#Vision-Language Models

E-GRPO: High Entropy Steps Drive Effective Reinforcement Learning for Flow Models

Intermediate

Shengjun Zhang, Zhang Zhang et al.Jan 1arXiv

This paper shows that when teaching image generators with reinforcement learning, only a few early, very noisy steps actually help the model learn what people like.

#E-GRPO#Group Relative Policy Optimization#Flow Matching

Deep Delta Learning

Intermediate

Yifan Zhang, Yifeng Liu et al.Jan 1arXiv

Deep Delta Learning (DDL) replaces the usual “add the shortcut” rule in deep networks with a smarter, learnable move that can gently erase old info and write new info along a chosen direction.

#Deep Delta Learning#Delta Operator#Residual connection

MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing

Intermediate

Xiaokun Sun, Zeyu Cai et al.Jan 1arXiv

MorphAny3D is a training-free way to smoothly change one 3D object into another, even if they are totally different (like a bee into a biplane).

#3D morphing#Structured Latent#SLAT

GaMO: Geometry-aware Multi-view Diffusion Outpainting for Sparse-View 3D Reconstruction

Intermediate

Yi-Chuan Huang, Hao-Jen Chien et al.Dec 31arXiv

GaMO is a new way to rebuild 3D scenes from just a few photos by expanding each photo’s edges (outpainting) instead of inventing whole new camera views.

#3D reconstruction#outpainting#multi-view diffusion

Scaling Open-Ended Reasoning to Predict the Future

Intermediate

Nikhil Chandak, Shashwat Goel et al.Dec 31arXiv

The paper teaches small language models to predict open-ended future events by turning daily news into thousands of safe, graded practice questions.

#open-ended forecasting#calibrated prediction#Brier score

ShowUI-$π$: Flow-based Generative Models as GUI Dexterous Hands

Intermediate

Siyuan Hu, Kevin Qinghong Lin et al.Dec 31arXiv

Computers usually click like a woodpecker, but they struggle to drag smoothly like a human hand; this paper fixes that.

#GUI automation#continuous control#flow matching

BEDA: Belief Estimation as Probabilistic Constraints for Performing Strategic Dialogue Acts

Intermediate

Hengli Li, Zhaoxin Yu et al.Dec 31arXiv

This paper presents BEDA, a simple way to make chatty AI act strategically by turning what it believes into gentle rules (probabilistic constraints) that guide what it can say.

#strategic dialogue#belief estimation#probabilistic constraints

39 40 41 42 43