Papers (7)

#attention mechanism

FastVMT: Eliminating Redundancy in Video Motion Transfer

Intermediate

Yue Ma, Zhikai Wang et al. · Feb 5 · arXiv

FastVMT is a faster way to copy motion from one video to another without training a new model for each video.

#FastVMT #video motion transfer #Diffusion Transformer

HERMES: KV Cache as Hierarchical Memory for Efficient Streaming Video Understanding

Intermediate

Haowei Zhang, Shudong Yang et al. · Jan 21 · arXiv

HERMES is a training-free way to make video-language models understand live, streaming video quickly and accurately.

#HERMES #KV cache #hierarchical memory

The AI Hippocampus: How Far are We From Human Memory?

Intermediate

Zixia Jia, Jiaqi Li et al. · Jan 14 · arXiv

This survey asks how close AI memory systems are to human memory and organizes the answer into three parts: implicit memory (stored inside the model), explicit memory (external storage the model can look up), and agentic memory (what an AI agent keeps over time to plan and act).

#LLM memory #implicit memory #explicit memory

KV-Embedding: Training-free Text Embedding via Internal KV Re-routing in Decoder-only LLMs

Intermediate

Yixuan Tang, Yi Yang · Jan 3 · arXiv

This paper shows how to get strong text embeddings from decoder-only language models without any training.

#text embeddings #decoder-only LLMs #causal attention

MorphAny3D: Unleashing the Power of Structured Latent in 3D Morphing

Intermediate

Xiaokun Sun, Zeyu Cai et al. · Jan 1 · arXiv

MorphAny3D is a training-free way to smoothly change one 3D object into another, even if they are totally different (like a bee into a biplane).

#3D morphing #Structured Latent #SLAT

LitePT: Lighter Yet Stronger Point Transformer

Intermediate

Yuanwen Yue, Damien Robert et al. · Dec 15 · arXiv

LitePT is a new AI backbone for 3D point clouds that uses convolutions in early layers and attention in later layers to be both fast and accurate.

#LitePT #Point Transformer #3D point cloud

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Beginner

Yuran Wang, Bohan Zeng et al. · Dec 14 · arXiv

Scone is a new AI method that generates images from instructions while picking out the correct subject even when several candidates look similar.

#subject-driven image generation #multi-subject composition #subject distinction
