๐ŸŽ“How I Study AIHISA
๐Ÿ“–Read
๐Ÿ“„Papers๐Ÿ“ฐBlogs๐ŸŽฌCourses
๐Ÿ’กLearn
๐Ÿ›ค๏ธPaths๐Ÿ“šTopics๐Ÿ’กConcepts๐ŸŽดShorts
๐ŸŽฏPractice
๐ŸงฉProblems๐ŸŽฏPrompts๐Ÿง Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers3

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#reward model

HyperAlign: Hypernetwork for Efficient Test-Time Alignment of Diffusion Models

Intermediate
Xin Xie, Jiaxian Guo et al.Jan 22arXiv

Diffusion models make pictures from noise but often miss what people actually want in the prompt or what looks good to humans.

#diffusion models#rectified flow#hypernetwork

SWE-RM: Execution-free Feedback For Software Engineering Agents

Intermediate
KaShun Shum, Binyuan Hui et al.Dec 26arXiv

Coding agents used to fix software rely on feedback; unit tests give only pass/fail signals that are often noisy or missing.

#execution-free feedback#reward model#software engineering agents

Self-Improving VLM Judges Without Human Annotations

Intermediate
Inna Wanyin Lin, Yushi Hu et al.Dec 2arXiv

The paper shows how a vision-language model (VLM) can train itself to be a fair judge of answers about images without using any human preference labels.

#vision-language model#VLM judge#reward model