DARE is a new way for AI assistants to find the right R functions by also looking at what the data looks like, not just the words in the question.
LaSER teaches a fast search model to “think” quietly inside its hidden space, so it gets the benefits of step-by-step reasoning without writing those steps out as text.
The paper shows that when we train with the popular InfoNCE contrastive loss, the learned features start to behave like they come from a Gaussian (bell-shaped) distribution.
Similarity-based image–text models like CLIP can be fooled by “half-truths,” where adding one plausible but wrong detail makes a caption look more similar to an image instead of less similar.
MAEB is a giant, fair report card for audio AI that tests 50+ models on 30 tasks across speech, music, environmental sounds, and audio–text tasks in 100+ languages.
The paper teaches small AI models to make high‑quality text embeddings by first copying a big expert model (distillation) and then practicing four jobs with special mini‑modules (LoRA adapters): retrieval, similarity, clustering, and classification.
QRRanker is a lightweight way to sort many long text chunks by how helpful they are to a question, using the model’s own attention to score relevance.
Decoder-only language models can be great at making user profiles (embeddings), but how we let them look at the sequence—called attention masking—changes how smart those profiles are.
ObjEmbed teaches an AI to understand not just whole pictures, but each object inside them, and to link those objects to the right words.
The paper asks a simple question: if a language model becomes better at step-by-step reasoning (using RLVR), do its text embeddings also get better? The short answer is no.
This paper introduces CGPT, a way to help computers find the right tables by building smarter mini-versions of tables and training with tough practice questions.
Agentic-R is a new way to teach a search retriever to find not just similar text, but the text that truly helps an AI get the final answer right.