How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (807)


RePo: Language Models with Context Re-Positioning

Intermediate
Huayang Li, Tianyu Zhao et al. | Dec 16 | arXiv

Large language models usually line words up in fixed order slots, which can waste mental energy and make it harder to find the important parts of a long or noisy text.

#context re-positioning #positional encoding #self-attention

Vector Prism: Animating Vector Graphics by Stratifying Semantic Structure

Intermediate
Jooyeol Yun, Jaegul Choo | Dec 16 | arXiv

Vector Prism helps computers animate SVG images by first discovering which tiny shapes belong together as meaningful parts.

#SVG animation #semantic restructuring #vision–language models

SS4D: Native 4D Generative Model via Structured Spacetime Latents

Intermediate
Zhibing Li, Mengchen Zhang et al. | Dec 16 | arXiv

SS4D is a new AI model that turns a short single-camera video into a full 3D object that moves over time (that’s 4D), and it does this in about 2 minutes.

#4D generation #structured spacetime latents #temporal attention

Zoom-Zero: Reinforced Coarse-to-Fine Video Understanding via Temporal Zoom-in

Intermediate
Xiaoqian Shen, Min-Hung Chen et al. | Dec 16 | arXiv

Zoom-Zero helps AI answer questions about videos by first finding the right moment and then zooming in to double-check tiny details.

#Grounded Video Question Answering #Temporal Grounding #Coarse-to-Fine

Understanding and Improving Hyperbolic Deep Reinforcement Learning

Intermediate
Timo Klein, Thomas Lang et al. | Dec 16 | arXiv

Reinforcement learning agents often see the world in straight, flat space (Euclidean), but many decision problems look more like branching trees that fit curved, hyperbolic space better.

#hyperbolic reinforcement learning #Hyperboloid #Poincaré Ball

SonicMoE: Accelerating MoE with IO and Tile-aware Optimizations

Intermediate
Wentao Guo, Mayank Mishra et al. | Dec 16 | arXiv

SonicMoE makes Mixture-of-Experts (MoE) models train faster and use less memory by redesigning how data is moved and computed on GPUs.

#Mixture of Experts #Grouped GEMM #Token Rounding

Efficient-DLM: From Autoregressive to Diffusion Language Models, and Beyond in Speed

Intermediate
Yonggan Fu, Lexington Whalen et al. | Dec 16 | arXiv

Autoregressive (AR) models write one word at a time, which is accurate but slow, especially when your computer or GPU can’t keep many tasks in memory at once.

#diffusion language models #autoregressive models #AR-to-dLM conversion

HyperVL: An Efficient and Dynamic Multimodal Large Language Model for Edge Devices

Intermediate
HyperAI Team, Yuchen Liu et al. | Dec 16 | arXiv

HyperVL is a small but smart model that understands images and text, designed to run fast on phones and tablets.

#HyperVL #on-device multimodal #edge AI

OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value

Intermediate
Mengzhang Cai, Xin Gao et al. | Dec 16 | arXiv

OpenDataArena (ODA) is a fair, open platform that measures how valuable different post‑training datasets are for large language models by holding everything else constant.

#OpenDataArena #post-training datasets #data-centric AI

FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition

Intermediate
Jonas Golde, Patrick Haller et al. | Dec 15 | arXiv

FiNERweb is a new, carefully built dataset pipeline that teaches computers to spot names of people, places, and more across 91 languages and 25 writing systems.

#multilingual NER #named entity recognition #LLM supervision

SAGE: Training Smart Any-Horizon Agents for Long Video Reasoning with Reinforcement Learning

Intermediate
Jitesh Jain, Jialuo Li et al. | Dec 15 | arXiv

SAGE is a smart video-watching agent that decides when to answer quickly and when to take multiple steps, much as people skim or rewind long videos.

#any-horizon reasoning #video agents #temporal grounding

LitePT: Lighter Yet Stronger Point Transformer

Intermediate
Yuanwen Yue, Damien Robert et al. | Dec 15 | arXiv

LitePT is a new AI backbone for 3D point clouds that uses convolutions in early layers and attention in later layers to be both fast and accurate.

#LitePT #Point Transformer #3D point cloud