Papers (9)

#self-consistency

Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

Tong Zheng, Chengsong Huang et al. · Feb 3 · arXiv

Parallel-Probe is a simple add-on that lets many AI “thought paths” think at once but stop early when they already agree.

#parallel thinking #2D probing #consensus-based early stopping
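
To make "stop early when they already agree" concrete, here is a minimal sketch of consensus-based early stopping over several reasoning paths. It is a generic illustration, not the paper's 2D probing procedure: the function name `consensus_early_stop`, the `threshold` parameter, and the idea of reading one tentative answer per path at each probe step are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Optional, Tuple


def consensus_early_stop(
    paths: List[List[Optional[str]]],
    threshold: float = 0.8,
) -> Tuple[Optional[str], int]:
    """Stop at the first probe step where enough paths share one answer.

    paths[i][t] is the tentative answer of path i at probe step t
    (None if the path has no answer yet). Hypothetical data layout.
    Returns (majority answer, step at which we stopped).
    """
    num_steps = max(len(p) for p in paths)
    majority: Optional[str] = None
    for t in range(num_steps):
        # Use each path's latest available answer at step t.
        answers = [p[min(t, len(p) - 1)] for p in paths]
        answers = [a for a in answers if a is not None]
        if not answers:
            continue
        majority, count = Counter(answers).most_common(1)[0]
        # Agreement is measured against all paths, not only those that answered.
        if count / len(paths) >= threshold:
            return majority, t  # consensus reached: skip the remaining steps
    # No consensus: fall back to the majority answer at the final step.
    return majority, num_steps - 1


probes = [
    ["3", "4", "4"],
    [None, "4", "4"],
    ["5", "4", "4"],
    ["4", "4", "4"],
]
answer, step = consensus_early_stop(probes, threshold=0.75)
```

In this toy input the four paths disagree at step 0 and all agree at step 1, so the loop returns there and the third probe step is never needed.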

TTCS: Test-Time Curriculum Synthesis for Self-Evolving

Intermediate

Chengyi Yang, Zhishang Xiang et al. · Jan 30 · arXiv

TTCS lets a model teach itself at test time: it first generates easier practice questions similar to the hard target question, then learns from them.

#test-time training #test-time reinforcement learning #curriculum learning

Enhancing Sentiment Classification and Irony Detection in Large Language Models through Advanced Prompt Engineering Techniques

Beginner

Marvin Schmitt, Anne Schwerk et al. · Jan 13 · arXiv

Giving large language models a few good examples and step-by-step instructions can make them much better at classifying sentiment and detecting irony in text.

#prompt engineering #few-shot learning #chain-of-thought

Illusions of Confidence? Diagnosing LLM Truthfulness via Neighborhood Consistency

Beginner

Haoming Xu, Ningyuan Zhao et al. · Jan 9 · arXiv

LLMs can look confident but still change their answers when the surrounding text nudges them, showing that confidence alone does not indicate truthfulness.

#Neighbor-Consistency Belief #belief robustness #self-consistency

Atlas: Orchestrating Heterogeneous Models and Tools for Multi-Domain Complex Reasoning

Beginner

Jinyang Wu, Guocheng Zhai et al. · Jan 7 · arXiv

ATLAS is a system that picks the best mix of AI models and helper tools for each question, instead of using just one model or a fixed tool plan.

#ATLAS #LLM routing #tool augmentation

Confidence Estimation for LLMs in Multi-turn Interactions

Intermediate

Caiqi Zhang, Ruihan Yang et al. · Jan 5 · arXiv

This paper studies how confident large language models are during multi-turn chats where clues arrive step by step.

#multi-turn confidence estimation #LLM calibration #InfoECE

UCoder: Unsupervised Code Generation by Internal Probing of Large Language Models

Intermediate

Jiajun Wu, Jian Yang et al. · Dec 19 · arXiv

The paper introduces UCoder, a way to teach a code-generating AI to get better without using any outside datasets, not even unlabeled code.

#unsupervised code generation #self-training #internal probing

Reasoning Palette: Modulating Reasoning via Latent Contextualization for Controllable Exploration for (V)LMs

Intermediate

Rujiao Long, Yang Li et al. · Dec 19 · arXiv

Reasoning Palette gives a language or vision-language model a tiny hidden “mood” (a latent code) before it starts answering, so it chooses a smarter plan rather than just rolling dice on each next word.

#Reasoning Palette #latent contextualization #VAE

ThreadWeaver: Adaptive Threading for Efficient Parallel Reasoning in Language Models

Intermediate

Long Lian, Sida Wang et al. · Nov 24 · arXiv

ThreadWeaver teaches a language model to split big problems into smaller parts it can solve at the same time, like teammates working in parallel.

#adaptive parallel reasoning #fork–join #threaded inference
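
The fork–join pattern named in the tags can be sketched in a few lines. This is only an illustration of the general pattern using Python's standard thread pool, not ThreadWeaver's method: the helper `fork_join` and its `solve` and `combine` arguments are hypothetical names, and the decomposition into subproblems is assumed to be given.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, TypeVar

P = TypeVar("P")  # subproblem type
R = TypeVar("R")  # partial-result type
A = TypeVar("A")  # final-answer type


def fork_join(
    subproblems: List[P],
    solve: Callable[[P], R],
    combine: Callable[[List[R]], A],
) -> A:
    """Fork: solve each subproblem on its own worker. Join: merge the results."""
    # max_workers must be positive, so fall back to 1 for an empty input.
    with ThreadPoolExecutor(max_workers=len(subproblems) or 1) as pool:
        # pool.map returns results in the same order as the inputs.
        partials = list(pool.map(solve, subproblems))
    return combine(partials)


# Toy usage: square each part in parallel, then sum the partial results.
total = fork_join([1, 2, 3], lambda x: x * x, sum)
```

In a language-model setting, `solve` would stand in for a call that reasons about one sub-part and `combine` for the step that merges the partial conclusions into a final answer.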