ManCAR helps recommendation systems reason step by step while keeping that reasoning on realistic paths, guided by a map of how items connect.
This paper introduces P-GenRM, a personalized generative reward model that judges AI answers using a custom scorecard built just for each user and situation.