How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (7)

#long-term memory

RIVER: A Real-Time Interaction Benchmark for Video LLMs

Intermediate
Yansong Shi, Qingsong Zhao et al. · Mar 4 · arXiv

RIVER Bench is a new test that checks how well AI can watch a video stream and talk with you in real time.

#RIVER Bench #online video understanding #multimodal large language models

MemSifter: Offloading LLM Memory Retrieval via Outcome-Driven Proxy Reasoning

Intermediate
Jiejun Tan, Zhicheng Dou et al. · Mar 3 · arXiv

MemSifter is a smart helper that picks the right memories for a big AI so the big AI doesn’t have to read everything.

#long-term memory #LLM retrieval #proxy model

MemGUI-Bench: Benchmarking Memory of Mobile GUI Agents in Dynamic Environments

Intermediate
Guangyi Liu, Pengxiang Zhao et al. · Feb 3 · arXiv

MemGUI-Bench is a new test that checks how well phone-controlling AI agents can remember important information both during a task and across different tries.

#mobile GUI agents #memory benchmarking #short-term memory

Research on World Models Is Not Merely Injecting World Knowledge into Specific Tasks

Intermediate
Bohan Zeng, Kaixin Zhu et al. · Feb 2 · arXiv

This paper argues that true world-model research is not just about sprinkling facts into single tasks, but about building a unified system that can see, think, remember, act, and generate across many situations.

#world models #unified framework #multimodal reasoning

MemoryRewardBench: Benchmarking Reward Models for Long-Term Memory Management in Large Language Models

Beginner
Zecheng Tang, Baibei Ji et al. · Jan 17 · arXiv

This paper builds MemoryRewardBench, a large benchmark that checks whether reward models (AI judges) can fairly grade how other AIs manage long-term memory, not just whether their final answers are right.

#reward models #long-term memory #long-context reasoning

RealMem: Benchmarking LLMs in Real-World Memory-Driven Interaction

Beginner
Haonan Bian, Zhiyuan Yao et al. · Jan 11 · arXiv

RealMem is a new benchmark that tests how well AI assistants remember and manage long, ongoing projects across many conversations.

#RealMem #long-term memory #project-oriented interactions

Confucius Code Agent: Scalable Agent Scaffolding for Real-World Codebases

Beginner
Sherman Wong, Zhenting Qi et al. · Dec 11 · arXiv

This paper introduces the Confucius Code Agent (CCA), a coding helper built to handle huge real-world codebases with long tasks and many tools.

#coding agents #agent scaffolding #context management