Papers (1262)

Classroom Final Exam: An Instructor-Tested Reasoning Benchmark

Chongyang Gao, Diji Yang et al. · Feb 23 · arXiv

CFE-BENCH is a new, teacher-verified "Classroom Final Exam" for AI that uses real college STEM problems to test deep, step-by-step reasoning.

#CFE-BENCH #variable-based verification #reasoning flow

Hepato-LLaVA: An Expert MLLM with Sparse Topo-Pack Attention for Hepatocellular Pathology Analysis on Whole Slide Images

Intermediate

Yuxuan Yang, Zhonghao Yan et al. · Feb 23 · arXiv

Hepato-LLaVA is a specialized multimodal AI model that reads giant microscope images of liver tissue (whole slide images) and answers medical questions about liver cancer.

#Hepato-LLaVA #Hepatocellular Carcinoma #Whole Slide Images

Anatomy of Agentic Memory: Taxonomy and Empirical Analysis of Evaluation and System Limitations

Intermediate

Dongming Jiang, Yi Li et al. · Feb 22 · arXiv

This paper explains how AI agents remember things across long conversations and why many current tests don’t truly measure that memory.

#agentic memory #memory-augmented generation #long-context LLMs

TOPReward: Token Probabilities as Hidden Zero-Shot Rewards for Robotics

Intermediate

Shirui Chen, Cole Harrison et al. · Feb 22 · arXiv

Robots learn better when they get small hints at every step instead of only a final thumbs-up or thumbs-down.

#TOPReward #token probabilities #logits

JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation

Intermediate

Kai Liu, Yanhao Zheng et al. · Feb 22 · arXiv

JavisDiT++ is a new AI that makes short videos and matching sounds from a text prompt, keeping sight and sound in sync.

#joint audio-video generation #multimodal diffusion transformer #modality-specific mixture-of-experts

Learning Cross-View Object Correspondence via Cycle-Consistent Mask Prediction

Intermediate

Shannan Yan, Leqi Zheng et al. · Feb 22 · arXiv

This paper teaches a computer to find the same object when seen from two very different cameras, like a body camera (first-person) and a room camera (third-person).

#cross-view correspondence #egocentric to exocentric #binary segmentation

DREAM: Deep Research Evaluation with Agentic Metrics

Intermediate

Elad Ben Avraham, Changhao Li et al. · Feb 21 · arXiv

Deep research agents write long reports, but old tests often judge only how smooth they sound and whether they add links, not whether the facts are true today or the logic really holds.

#deep research agents #agentic evaluation #capability parity

RoboCurate: Harnessing Diversity with Action-Verified Neural Trajectory for Robot Learning

Beginner

Seungku Kim, Suhyeok Jang et al. · Feb 21 · arXiv

RoboCurate is a way to make better robot training videos by checking if the actions in a generated video actually match what a robot would do in a simulator.

#RoboCurate #neural trajectory #action verification

Spilled Energy in Large Language Models

Intermediate

Adrian Robert Minut, Hazem Dewidar et al. · Feb 21 · arXiv

The paper treats the last layer of a Large Language Model (the softmax over tokens) as an Energy-Based Model, which lets us measure a new signal called spilled energy.

#spilled energy #energy-based models #marginal energy

Decoding ML Decision: An Agentic Reasoning Framework for Large-Scale Ranking System

Intermediate

Longfei Yun, Yihan Wu et al. · Feb 20 · arXiv

GEARS is a new way to improve big ranking systems (like what shows up first in your feed) by letting an AI agent explore options safely, instead of humans tweaking knobs by hand.

#GEARS #agentic ranking #Specialized Agent Skills

SARAH: Spatially Aware Real-time Agentic Humans

Intermediate

Evonne Ng, Siwei Zhang et al. · Feb 20 · arXiv

SARAH is a real-time system that makes virtual characters move their whole bodies naturally during a conversation while knowing where the user is.

#spatially aware motion #real-time avatars #causal transformer

Generated Reality: Human-centric World Simulation using Interactive Video Generation with Hand and Camera Control

Intermediate

Linxi Xie, Lisong C. Sun et al. · Feb 20 · arXiv

This paper builds a "generated reality" system that lets AI-made videos react to your real head and hand movements in VR.

#generated reality #hand pose conditioning #video diffusion transformer
