How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#reasoning distillation

Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model

Intermediate
Tianyi Wu, Mingzhe Du et al. | Feb 7 | arXiv

This paper introduces SecCoderX, a way to teach code-writing AIs to be secure without breaking what the code is supposed to do.

#secure code generation #reinforcement learning #vulnerability reward model

MMFineReason: Closing the Multimodal Reasoning Gap via Open Data-Centric Methods

Intermediate
Honglin Lin, Zheng Liu et al. | Jan 29 | arXiv

MMFineReason is a huge, open dataset (1.8 million examples, 5.1 billion solution tokens) that teaches AIs to think step by step about pictures and text together.

#multimodal reasoning #vision-language models #chain-of-thought