Papers (1262)

CASA: Cross-Attention via Self-Attention for Efficient Vision-Language Fusion

Moritz Böhle, Amélie Royer et al. · Dec 22 · arXiv

CASA is a new way to mix images and text inside a language model that keeps speed and memory low while keeping accuracy high.

#CASA #cross-attention #self-attention


QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models

Intermediate

Li Puyin, Tiange Xiang et al. · Dec 22 · arXiv

QuantiPhy is a new test that checks if AI models can measure real-world physics from videos using numbers, not guesses.

#QuantiPhy #Vision-Language Models #Physical reasoning


Real2Edit2Real: Generating Robotic Demonstrations via a 3D Control Interface

Beginner

Yujie Zhao, Hongwei Fan et al. · Dec 22 · arXiv

Robots learn better when they see many examples, but collecting lots of real videos is slow and expensive.

#robotic demonstration generation #depth-controlled video generation #metric-scale 3D reconstruction


QuCo-RAG: Quantifying Uncertainty from the Pre-training Corpus for Dynamic Retrieval-Augmented Generation

Intermediate

Dehai Min, Kailin Zhang et al. · Dec 22 · arXiv

QuCo-RAG is a new way to decide when an AI should look things up while it writes, using facts from its training data instead of its own shaky confidence.

#Dynamic RAG #Retrieval-Augmented Generation #Uncertainty Quantification


DramaBench: A Six-Dimensional Evaluation Framework for Drama Script Continuation

Intermediate

Shijian Ma, Yunqi Huang et al. · Dec 22 · arXiv

DramaBench is a new test that checks how well AI continues drama scripts across six separate skills instead of one big score.

#DramaBench #script continuation #screenplay evaluation


Can LLMs Estimate Student Struggles? Human-AI Difficulty Alignment with Proficiency Simulation for Item Difficulty Prediction

Intermediate

Ming Li, Han Chen et al. · Dec 21 · arXiv

This paper asks a simple question with big impact: can AI tell which test questions are hard for humans?

#Item Difficulty Prediction #Item Response Theory #Rasch Model


From Word to World: Can Large Language Models be Implicit Text-based World Models?

Intermediate

Yixia Li, Hongru Wang et al. · Dec 21 · arXiv

This paper asks whether large language models (LLMs) can act as "world models" that predict what happens next in text-based environments, not just the next word in a sentence.

#world models #next-state prediction #text-based environments


MemEvolve: Meta-Evolution of Agent Memory Systems

Beginner

Guibin Zhang, Haotian Ren et al. · Dec 21 · arXiv

MemEvolve teaches AI agents not only to remember past experiences but also to improve the way they remember, like a student who upgrades their study habits over time.

#LLM agents #agent memory #meta-evolution


InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search

Intermediate

Kaican Li, Lewei Yao et al. · Dec 21 · arXiv

This paper builds a tough new test called O3-BENCH to check if AI can truly think with images, not just spot objects.

#multimodal reasoning #generalized visual search #reinforcement learning


Does It Tie Out? Towards Autonomous Legal Agents in Venture Capital

Intermediate

Pierre Colombo, Malik Boudiaf et al. · Dec 21 · arXiv

Capitalization tie-out checks if a company’s ownership table truly matches what its legal documents say.

#capitalization tie-out #dataroom #cap table verification


SWE-EVO: Benchmarking Coding Agents in Long-Horizon Software Evolution Scenarios

Intermediate

Minh V. T. Thai, Tue Le et al. · Dec 20 · arXiv

SWE-EVO is a new benchmark that checks whether AI coding agents can upgrade real software projects over many steps, not just fix one small bug.

#SWE-EVO #software evolution #coding agents


MatSpray: Fusing 2D Material World Knowledge on 3D Geometry

Intermediate

Philipp Langsteiner, Jan-Niklas Dihlmann et al. · Dec 20 · arXiv

MatSpray turns 2D guesses about what materials look like (color, shininess, metal) into a clean 3D model you can relight realistically.

#MatSpray #3D Gaussian Splatting #Gaussian Ray Tracing

