How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (784)


SAGE: Steerable Agentic Data Generation for Deep Search with Execution Feedback

Intermediate
Fangyuan Xu, Rujun Han et al. · Jan 26 · arXiv

SAGE is a two-agent system that automatically writes tough, multi-step search questions and checks them by actually trying to solve them.

#deep search #agentic data generation #execution feedback

Agentic Very Long Video Understanding

Intermediate
Aniket Rege, Arka Sadhu et al. · Jan 26 · arXiv

The paper tackles understanding super long, first‑person videos (days to a week) by giving an AI a smarter memory and better tools.

#entity scene graph #agentic planning #long-horizon video understanding

FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning

Intermediate
Zhaopeng Qiu, Shuang Yu et al. · Jan 26 · arXiv

The paper shows how to speed up reinforcement learning (RL) for large language models (LLMs) by using lower-precision 8-bit floating-point numbers (FP8) without breaking training.

#FP8 quantization #LLM reinforcement learning #KV-cache

DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints

Intermediate
Yinger Zhang, Shutong Jiang et al. · Jan 26 · arXiv

DeepPlanning is a new benchmark that tests whether AI can make long, realistic plans that fit time and money limits.

#long-horizon planning #agentic tool use #global constrained optimization

FABLE: Forest-Based Adaptive Bi-Path LLM-Enhanced Retrieval for Multi-Document Reasoning

Intermediate
Lin Sun, Linglin Zhang et al. · Jan 26 · arXiv

FABLE is a new retrieval system that helps AI find and combine facts from many documents by letting the AI both organize the library and choose the right shelves to read.

#FABLE #Structured RAG #Hierarchical retrieval

DRPG (Decompose, Retrieve, Plan, Generate): An Agentic Framework for Academic Rebuttal

Intermediate
Peixuan Han, Yingjie Yu et al. · Jan 26 · arXiv

DRPG is a four-step AI helper that writes strong academic rebuttals by first breaking a review into parts, then fetching evidence, planning a strategy, and finally writing the response.

#academic rebuttal #agentic framework #planning with LLMs

Masked Depth Modeling for Spatial Perception

Intermediate
Bin Tan, Changjiang Sun et al. · Jan 25 · arXiv

The paper turns the 'holes' (missing spots) in depth camera images into helpful training hints instead of treating them as garbage.

#Masked Depth Modeling #RGB-D cameras #Depth completion

EEG Foundation Models: Progresses, Benchmarking, and Open Problems

Intermediate
Dingkun Liu, Yuheng Chen et al. · Jan 25 · arXiv

This paper builds a big, fair playground (a benchmark) to test many EEG foundation models side by side under the same rules.

#EEG foundation models #brain-computer interface #self-supervised learning

AR-Omni: A Unified Autoregressive Model for Any-to-Any Generation

Intermediate
Dongjie Cheng, Ruifeng Yuan et al. · Jan 25 · arXiv

AR-Omni is a single autoregressive model that can take in and produce text, images, and speech without extra expert decoders.

#autoregressive modeling #multimodal large language model #any-to-any generation

The Script is All You Need: An Agentic Framework for Long-Horizon Dialogue-to-Cinematic Video Generation

Intermediate
Chenyu Mu, Xin He et al. · Jan 25 · arXiv

This paper teaches AI to turn simple dialogue into full movie scenes by first writing a detailed script and then filming it step by step.

#dialogue-to-video #cinematic script generation #ScripterAgent

Fast KVzip: Efficient and Accurate LLM Inference with Gated KV Eviction

Intermediate
Jang-Hyun Kim, Dongyoon Han et al. · Jan 25 · arXiv

Fast KVzip is a new way to shrink an LLM’s memory (the KV cache) while keeping answers just as accurate.

#KV cache compression #gated KV eviction #sink attention

AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking

Intermediate
Xilin Jiang, Qiaolin Wang et al. · Jan 25 · arXiv

AVMeme Exam is a new test made by humans that checks if AI can understand famous internet audio and video clips the way people do.

#AVMeme Exam #multimodal large language models #audio-visual memes