RLAnything is a new reinforcement learning (RL) framework that trains three things together at once: the policy (the agent), the reward model (the judge), and the environment (the tasks).
This paper shows that making short videos can help AI plan and reason in pictures better than writing out steps in text.
OmniSafeBench-MM is a one-stop, open-source test bench that fairly compares how multimodal AI models get tricked (jailbroken) and how well different defenses stop that.