How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (138)

#GRPO

CPPO: Contrastive Perception for Vision Language Policy Optimization

Intermediate
Ahmad Rezaei, Mohsen Gholami et al. · Jan 1 · arXiv

CPPO is a new way to fine‑tune vision‑language models so they see pictures more accurately before they start to reason.

#CPPO #Contrastive Perception Loss #Vision-Language Models

Scaling Open-Ended Reasoning to Predict the Future

Intermediate
Nikhil Chandak, Shashwat Goel et al. · Dec 31 · arXiv

The paper teaches small language models to predict open-ended future events by turning daily news into thousands of safe, graded practice questions.

#open-ended forecasting #calibrated prediction #Brier score

Figure It Out: Improve the Frontier of Reasoning with Executable Visual States

Intermediate
Meiqi Chen, Fandong Meng et al. · Dec 30 · arXiv

FIGR is a new way for AI to ‘think by drawing,’ using code to build clean, editable diagrams while it reasons.

#executable visual states #diagrammatic reasoning #reinforcement learning for reasoning

GARDO: Reinforcing Diffusion Models without Reward Hacking

Intermediate
Haoran He, Yuxiao Ye et al. · Dec 30 · arXiv

GARDO is a new way to fine-tune text-to-image diffusion models with reinforcement learning without getting tricked by bad reward signals.

#GARDO #reward hacking #gated KL regularization

Training AI Co-Scientists Using Rubric Rewards

Intermediate
Shashwat Goel, Rishi Hazra et al. · Dec 29 · arXiv

The paper teaches AI to write strong research plans by letting it grade its own work using checklists (rubrics) pulled from real scientific papers.

#AI co-scientist #research plan generation #rubric rewards

ProGuard: Towards Proactive Multimodal Safeguard

Intermediate
Shaohan Yu, Lijun Li et al. · Dec 29 · arXiv

ProGuard is a safety guard for text and images that doesn’t just spot known problems; it can also recognize and name new, never-seen-before risks.

#proactive safety #multimodal moderation #out-of-distribution detection

MindWatcher: Toward Smarter Multimodal Tool-Integrated Reasoning

Intermediate
Jiawei Chen, Xintian Shen et al. · Dec 29 · arXiv

MindWatcher is a smart AI agent that can think step by step and decide when to use tools like web search, image zooming, and a code calculator to solve tough, multi-step problems.

#Tool-Integrated Reasoning #Interleaved Thinking #Multimodal Chain-of-Thought

Evaluating Parameter Efficient Methods for RLVR

Intermediate
Qingyu Yin, Yulun Wu et al. · Dec 29 · arXiv

The paper asks which small, add-on training tricks (PEFT) work best when we teach language models with yes/no rewards we can check (RLVR).

#RLVR #parameter-efficient fine-tuning #LoRA

See Less, See Right: Bi-directional Perceptual Shaping For Multimodal Reasoning

Intermediate
Shuoshuo Zhang, Yizhen Zhang et al. · Dec 26 · arXiv

The paper teaches vision-language models (AIs that look and read) to pay attention to the right parts of a picture without needing extra tools at answering time.

#BiPS #perceptual shaping #vision-language models

MAI-UI Technical Report: Real-World Centric Foundation GUI Agents

Intermediate
Hanzhang Zhou, Xu Zhang et al. · Dec 26 · arXiv

MAI-UI is a family of AI agents that can see, understand, and control phone and computer screens using plain language.

#GUI agent #GUI grounding #mobile navigation

SmartSnap: Proactive Evidence Seeking for Self-Verifying Agents

Intermediate
Shaofei Cai, Yulei Qin et al. · Dec 26 · arXiv

SmartSnap teaches an agent not only to finish a phone task but also to prove it with a few perfect snapshots it picks itself.

#Self-verifying agents #Evidence curation #3C principles

UniPercept: Towards Unified Perceptual-Level Image Understanding across Aesthetics, Quality, Structure, and Texture

Intermediate
Shuo Cao, Jiayang Li et al. · Dec 25 · arXiv

This paper teaches AI to notice not just what is in a picture, but how the picture looks and feels to people.

#perceptual image understanding #image aesthetics assessment (IAA) #image quality assessment (IQA)