๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐ŸงฉProblems๐ŸŽฏPrompts๐Ÿง Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers3

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#out-of-distribution

RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

Beginner
Yinjie Wang, Tianbao Xie et al.Feb 2arXiv

RLAnything is a new reinforcement learning (RL) framework that trains three things together at once: the policy (the agent), the reward model (the judge), and the environment (the tasks).

#reinforcement learning#closed-loop optimization#reward modeling

Thinking in Frames: How Visual Context and Test-Time Scaling Empower Video Reasoning

Intermediate
Chengzu Li, Zanyi Wang et al.Jan 28arXiv

This paper shows that making short videos can help AI plan and reason in pictures better than writing out steps in text.

#video reasoning#visual planning#test-time scaling

OmniSafeBench-MM: A Unified Benchmark and Toolbox for Multimodal Jailbreak Attack-Defense Evaluation

Intermediate
Xiaojun Jia, Jie Liao et al.Dec 6arXiv

OmniSafeBench-MM is a one-stop, open-source test bench that fairly compares how multimodal AI models get tricked (jailbroken) and how well different defenses stop that.

#multimodal large language models#jailbreak attacks#safety alignment