How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (1252)

LMEB: Long-horizon Memory Embedding Benchmark

Intermediate
Xinping Zhao, Xinshuo Hu et al. · Mar 13 · arXiv

LMEB is a new test that checks whether text-embedding models can remember and find information across long stretches of time, not just in short, neat passages.

#LMEB #long-horizon memory retrieval #memory embeddings

RoboPocket: Improve Robot Policies Instantly with Your Phone

Intermediate
Junjie Fang, Wendi Chen et al. · Mar 5 · arXiv

RoboPocket turns an ordinary smartphone into a pocket robot coach that helps you fix robot mistakes instantly, without touching a robot.

#RoboPocket #Imitation Learning #Interactive Imitation Learning

RealWonder: Real-Time Physical Action-Conditioned Video Generation

Intermediate
Wei Liu, Ziyu Chen et al. · Mar 5 · arXiv

RealWonder is a system that turns a single picture and 3D physical actions (like pushes, wind, and robot gripper moves) into a realistic video in real time.

#action-conditioned video generation #physics simulation #optical flow

On-Policy Self-Distillation for Reasoning Compression

Beginner
Hejian Sang, Yuanda Xu et al. · Mar 5 · arXiv

Reasoning models often talk too much, and those extra words can actually make their answers less accurate.

#on-policy self-distillation #reasoning compression #conciseness instruction

UltraDexGrasp: Learning Universal Dexterous Grasping for Bimanual Robots with Synthetic Data

Intermediate
Sizhe Yang, Yiman Xie et al. · Mar 5 · arXiv

Robots need many different ways to grab things, just like people use pinch, tripod, whole-hand, or two hands together.

#bimanual dexterous grasping #universal grasp policy #synthetic data generation

KARL: Knowledge Agents via Reinforcement Learning

Beginner
Jonathan D. Chang, Andrew Drozdov et al. · Mar 5 · arXiv

KARL is a smart search helper that learns to look up information step by step and explain answers using the facts it finds.

#grounded reasoning #enterprise search #reinforcement learning

Locality-Attending Vision Transformer

Intermediate
Sina Hajimiri, Farzad Beizaee et al. · Mar 5 · arXiv

Vision Transformers (ViTs) are great at recognizing what is in a whole image but often blur the tiny details needed to label each pixel (segmentation).

#Vision Transformer #self-attention #segmentation

MASQuant: Modality-Aware Smoothing Quantization for Multimodal Large Language Models

Intermediate
Lulu Hu, Wenhu Xiao et al. · Mar 5 · arXiv

Multimodal AI models handle text, images, and audio, but their signals are very different in size, which breaks standard low‑bit compression methods.

#post‑training quantization #multimodal LLM #channel‑wise smoothing

Timer-S1: A Billion-Scale Time Series Foundation Model with Serial Scaling

Intermediate
Yong Liu, Xingjian Su et al. · Mar 5 · arXiv

Timer-S1 is a huge time-series model (8.3B parameters, only 0.75B used per step) that predicts the future by thinking step-by-step inside one forward pass.

#time series forecasting #foundation models #Mixture-of-Experts

DARE: Aligning LLM Agents with the R Statistical Ecosystem via Distribution-Aware Retrieval

Intermediate
Maojun Sun, Yue Wu et al. · Mar 5 · arXiv

DARE is a new way for AI assistants to find the right R functions by also looking at what the data looks like, not just the words in the question.

#distribution-aware retrieval #RPKB #RCodingAgent

Interactive Benchmarks

Beginner
Baoqing Yue, Zihan Zhu et al. · Mar 5 · arXiv

This paper says we should test AI the way real life works: by letting it ask questions, gather clues, and make smart moves step by step under a limited budget.

#interactive benchmarks #information acquisition #budgeted reasoning

Helios: Real Real-Time Long Video Generation Model

Intermediate
Shenghai Yuan, Yuanyang Yin et al. · Mar 4 · arXiv

Helios is a 14-billion-parameter video model that can make minute-long videos in real time at about 19.5 frames per second on a single NVIDIA H100 GPU.

#real-time video generation #long video diffusion #autoregressive diffusion