Papers (200)

SageBwd: A Trainable Low-bit Attention

Jintao Zhang, Marco Chen et al. · Mar 2 · arXiv

SageBwd makes Transformer attention both fast and trainable by performing most of its large matrix multiplications in 8-bit (INT8) precision instead of full precision.

#SageBwd, #low-bit attention, #INT8 training


MMR-Life: Piecing Together Real-life Scenes for Multimodal Multi-image Reasoning

Beginner

Jiachun Li, Shaoping Huang et al. · Mar 2 · arXiv

MMR-Life is a new benchmark that tests how well AI understands everyday situations when given several real photos at once.

#multimodal reasoning, #multi-image understanding, #real-life benchmark


OpenAutoNLU: Open Source AutoML Library for NLU

Beginner

Grigory Arshinov, Aleksandr Boriskin et al. · Mar 2 · arXiv

OpenAutoNLU is a simple, open-source tool that automatically builds text understanding models for you.

#AutoML, #Natural Language Understanding, #Text Classification


Legal RAG Bench: an end-to-end benchmark for legal RAG

Beginner

Abdur-Rahman Butler, Umar Butler · Mar 2 · arXiv

Legal RAG Bench is a new, end-to-end test that checks how well legal AI systems find information and use it to answer tough, real-world legal questions.

#legal RAG, #retrieval-augmented generation, #embedding models


PhotoBench: Beyond Visual Matching Towards Personalized Intent-Driven Photo Retrieval

Beginner

Tianyi Xu, Rong Shan et al. · Mar 2 · arXiv

PhotoBench is a new benchmark built from real people’s photo albums to see whether AI can find photos based on what you actually mean, not just on what the photo shows.

#PhotoBench, #personalized photo retrieval, #multi-source reasoning


SciDER: Scientific Data-centric End-to-end Researcher

Beginner

Ke Lin, Yilin Lu et al. · Mar 2 · arXiv

SciDER is a team of AI helpers that can run almost the whole research process: coming up with ideas, reading raw data, writing and running code, and improving with feedback.

#data-centric AI, #AI research agent, #self-evolving memory


Spectral Attention Steering for Prompt Highlighting

Beginner

Weixian Waylon Li, Yuchen Niu et al. · Mar 1 · arXiv

This paper presents a new way to make a language model pay extra attention to the exact words you highlight in a prompt.

#attention steering, #prompt highlighting, #key embeddings


CHIMERA: Compact Synthetic Data for Generalizable LLM Reasoning

Beginner

Xinyu Zhu, Yihao Feng et al. · Mar 1 · arXiv

CHIMERA is a small (about 9,000 examples) but very carefully built synthetic dataset that teaches AI to solve hard problems step by step.

#CHIMERA dataset, #synthetic data generation, #chain-of-thought


Memory Caching: RNNs with Growing Memory

Beginner

Ali Behrouz, Zeman Li et al. · Feb 27 · arXiv

Recurrent neural networks (RNNs) are fast but forgetful because they squeeze everything they’ve seen into a tiny, fixed memory.

#Memory Caching, #Recurrent Neural Networks, #Attention


Humans and LLMs Diverge on Probabilistic Inferences

Beginner

Gaurav Kamath, Sreenath Madathil et al. · Feb 26 · arXiv

Humans often draw conclusions about the world that are likely but not certain, and this paper studies how humans and AI compare when making such inferences.

#probabilistic reasoning, #uncertainty calibration, #natural language inference


Synthetic Visual Genome 2: Extracting Large-scale Spatio-Temporal Scene Graphs from Videos

Beginner

Ziqi Gao, Jieyu Zhang et al. · Feb 26 · arXiv

This paper builds a large, automatically generated video dataset called SVG2 that records who or what appears in a video, what they look like, and how they interact over time.

#video scene graph, #spatio-temporal reasoning, #panoptic segmentation


CiteAudit: You Cited It, But Did You Read It? A Benchmark for Verifying Scientific References in the LLM Era

Beginner

Zhengqing Yuan, Kaiwen Shi et al. · Feb 26 · arXiv

The paper tackles a new integrity problem in science: large language models sometimes invent realistic-looking citations that do not exist.

#citation verification, #hallucinated citations, #scholarly integrity

