How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (925)


DiffCoT: Diffusion-styled Chain-of-Thought Reasoning in LLMs

Intermediate
Shidong Cao, Hongzhan Lin et al. · Jan 7 · arXiv

DiffCoT treats a model’s step-by-step thinking (Chain-of-Thought) like a messy draft that can be cleaned up over time, not something fixed forever.

#Chain-of-Thought · #Diffusion models · #Autoregressive decoding
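
To make the contrast with ordinary left-to-right decoding concrete, here is a toy sketch, not DiffCoT's actual algorithm: the scoring and revision functions below are stand-ins for what a trained model would do, but the loop shows the diffusion-style idea of holding a whole draft and revising its weakest step each round.

```python
# Toy sketch of diffusion-style refinement of a reasoning draft.
# NOT DiffCoT's real method: score_step and refine are stand-ins.
import random

random.seed(0)

def score_step(step: str) -> float:
    """Stand-in confidence score; a real model would rate each step."""
    return 0.0 if "[MASK]" in step else random.random()

def refine(step: str) -> str:
    """Stand-in reviser; a real denoiser would rewrite the step."""
    return "x + x + x = 3x = 6" if "[MASK]" in step else step

def diffusion_style_cot(draft: list[str], rounds: int = 3) -> list[str]:
    # Unlike autoregressive decoding, the whole draft exists up front,
    # and the least-confident step is revised on every round.
    for _ in range(rounds):
        scores = [score_step(s) for s in draft]
        worst = min(range(len(draft)), key=scores.__getitem__)
        draft[worst] = refine(draft[worst])
    return draft

draft = ["Let x = 2.", "[MASK]", "So the answer is 6."]
print(diffusion_style_cot(draft))
```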

What Users Leave Unsaid: Under-Specified Queries Limit Vision-Language Models

Beginner
Dasol Choi, Guijin Son et al. · Jan 7 · arXiv

Real people often ask vague questions with pictures, and today’s vision-language models (VLMs) struggle with them.

#vision-language models · #under-specified queries · #query explicitation

Evolving Programmatic Skill Networks

Intermediate
Haochen Shi, Xingdi Yuan et al. · Jan 7 · arXiv

This paper teaches a computer agent to grow a toolbox of skills that are real, runnable programs, not just text ideas.

#Programmatic Skill Network · #continual learning · #symbolic programs
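
For a flavor of what "skills as runnable programs" could look like, here is a minimal sketch under assumed interfaces, not the paper's actual system: every skill in the library is executable code, and later skills can compose earlier ones, so the toolbox compounds over time.

```python
# Minimal sketch: a growing library of executable skills.
# The skill names and tasks are illustrative, not from the paper.
from typing import Callable

class SkillLibrary:
    def __init__(self) -> None:
        self.skills: dict[str, Callable[..., str]] = {}

    def add(self, name: str, fn: Callable[..., str]) -> None:
        """Register a new skill as a real, runnable program."""
        self.skills[name] = fn

    def run(self, name: str, *args: str) -> str:
        return self.skills[name](*args)

lib = SkillLibrary()
lib.add("chop_wood", lambda: "planks")
# A later skill composes an earlier one instead of starting from scratch.
lib.add("craft_table", lambda: f"table built from {lib.run('chop_wood')}")
print(lib.run("craft_table"))  # -> table built from planks
```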

EpiQAL: Benchmarking Large Language Models in Epidemiological Question Answering for Enhanced Alignment and Reasoning

Intermediate
Mingyang Wei, Dehai Min et al. · Jan 6 · arXiv

EpiQAL is a new benchmark that tests how well AI models answer population-level disease questions using real research papers.

#Epidemiological reasoning · #Question answering · #Benchmarking LLMs

ThinkRL-Edit: Thinking in Reinforcement Learning for Reasoning-Centric Image Editing

Beginner
Hengjia Li, Liming Jiang et al. · Jan 6 · arXiv

ThinkRL-Edit teaches an image editor to think first and draw second, which makes tricky, reasoning-heavy edits much more accurate.

#reasoning-centric image editing · #reinforcement learning · #chain-of-thought

Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks

Beginner
Atsuki Yamaguchi, Maggie Mi et al. · Jan 6 · arXiv

The paper teaches language models using extra 'language homework' made from the same raw text so they learn grammar and meaning, not just next-word guessing.

#language model pretraining · #causal language modeling · #linguistic competence
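
For intuition, here is a hedged sketch of what extra "homework" built from the same raw text might look like; the two task types below (cloze and word reordering) are assumptions for illustration, not necessarily the tasks the paper uses.

```python
# Illustrative only: derive extra training tasks from one raw sentence,
# so the model practices more than next-word guessing.
import random

random.seed(0)

def make_tasks(sentence: str) -> dict[str, str]:
    words = sentence.split()
    i = random.randrange(len(words))
    cloze = " ".join("___" if j == i else w for j, w in enumerate(words))
    shuffled = words[:]
    random.shuffle(shuffled)
    return {
        "cloze": f"Fill in the blank: {cloze} (answer: {words[i]})",
        "reorder": f"Put in order: {' '.join(shuffled)} (answer: {sentence})",
    }

for name, task in make_tasks("the cat sat on the mat").items():
    print(f"{name}: {task}")
```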

The Illusion of Specialization: Unveiling the Domain-Invariant "Standing Committee" in Mixture-of-Experts Models

Intermediate
Yan Wang, Yitao Xu et al. · Jan 6 · arXiv

Mixture-of-Experts (MoE) language models don’t split cleanly into domain specialists; instead, a small, stable group of experts gets chosen again and again across many subjects.

#Mixture-of-Experts · #Standing Committee · #Sparse routing
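
The finding suggests a simple diagnostic, sketched below with made-up routing logs (the real analysis would run on an actual MoE router): tally which experts get picked across different domains, and a "standing committee" shows up as the same few experts topping every domain's counts.

```python
# Sketch with hypothetical routing logs: count expert usage per domain.
from collections import Counter

routing_logs = {  # fabricated-for-illustration expert IDs chosen per token
    "code":    [0, 3, 3, 7, 3, 0, 3],
    "biology": [3, 3, 0, 3, 5, 3, 0],
    "law":     [0, 3, 3, 3, 1, 0, 3],
}

for domain, picks in routing_logs.items():
    print(domain, "->", Counter(picks).most_common(2))
# Experts 3 and 0 dominate every domain: domain-invariant, not specialists.
```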

InfiniDepth: Arbitrary-Resolution and Fine-Grained Depth Estimation with Neural Implicit Fields

Intermediate
Hao Yu, Haotong Lin et al. · Jan 6 · arXiv

InfiniDepth is a new way to predict depth that treats depth as a smooth, continuous function of image location, one you can query anywhere in the image rather than only at the fixed pixels of a grid.

#monocular depth estimation · #neural implicit fields · #arbitrary resolution depth
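
Here is a minimal sketch of the core idea, with a tiny untrained network standing in for the real model (the architecture is assumed, not InfiniDepth's): an implicit depth field is just a function from continuous coordinates in [0, 1]² to a depth value, so the same field can be sampled at any resolution.

```python
# Illustrative implicit depth field: continuous (x, y) -> depth.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 32)), rng.normal(size=32)  # tiny untrained MLP
W2, b2 = rng.normal(size=(32, 1)), rng.normal(size=1)

def depth_field(xy: np.ndarray) -> np.ndarray:
    """Query depth at continuous coordinates; shape (N, 2) -> (N,)."""
    h = np.tanh(xy @ W1 + b1)
    return (h @ W2 + b2).ravel()

# Sample the same field at 4x4 and at 512x512: any resolution, no regridding.
for res in (4, 512):
    xs = np.linspace(0, 1, res)
    grid = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)
    print(f"{res}x{res} ->", depth_field(grid).shape)
```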

LTX-2: Efficient Joint Audio-Visual Foundation Model

Intermediate
Yoav HaCohen, Benny Brazowski et al. · Jan 6 · arXiv

LTX-2 is an open-source model that makes video and sound together from a text prompt, so the picture and audio match in time and meaning.

#text-to-video · #text-to-audio · #audiovisual generation

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Beginner
Ruiyan Han, Zhen Fang et al. · Jan 6 · arXiv

This paper fixes a common problem in multimodal AI: models can understand pictures and words well but stumble when asked to create matching images.

#Unified Multimodal Models · #Self-Generated Supervision · #Conduction Aphasia

Unified Thinker: A General Reasoning Modular Core for Image Generation

Intermediate
Sashuai Zhou, Qiang Zhou et al. · Jan 6 · arXiv

Unified Thinker separates “thinking” (planning) from “drawing” (image generation) so complex instructions get turned into clear, doable steps before any pixels are painted.

#reasoning-aware image generation · #structured planning · #edit-only prompt
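
In spirit (interfaces assumed; this is not the paper's pipeline), the separation looks like a planner that emits explicit steps and a generator that only ever sees one simple step at a time:

```python
# Toy sketch: "thinking" produces a plan; "drawing" executes it.
# The naive split-on-"and" planner is a placeholder for a reasoning model.
def think(instruction: str) -> list[str]:
    return [f"step {i + 1}: {part.strip()}"
            for i, part in enumerate(instruction.split(" and "))]

def draw(step: str) -> str:
    return f"rendered [{step}]"  # a real image model would paint pixels here

plan = think("add a red hat and turn the sky to night and reflect it in the lake")
for step in plan:
    print(draw(step))
```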

One Sample to Rule Them All: Extreme Data Efficiency in RL Scaling

Beginner
Yiyuan Li, Zhen Huang et al. · Jan 6 · arXiv

This paper shows that training a language model with reinforcement learning on just one carefully designed example can boost reasoning across many school subjects, not just math.

#polymath learning · #one-shot reinforcement learning · #GRPO
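
GRPO, the RL algorithm named in the tags, scores each sampled answer relative to the other answers drawn for the same prompt. Below is a minimal sketch of that group-relative advantage; the one-example training recipe itself is the paper's contribution and is not reproduced here.

```python
# Minimal sketch of GRPO's group-relative advantage computation.
def grpo_advantages(rewards: list[float]) -> list[float]:
    """Score each sampled answer relative to its group's mean and std."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5 or 1.0  # avoid divide-by-zero when all rewards tie
    return [(r - mean) / std for r in rewards]

# Several answers sampled for the *same* prompt, e.g. 1 = correct, 0 = not:
rewards = [1.0, 0.0, 0.0, 1.0, 1.0]
print(grpo_advantages(rewards))  # above-average answers get positive advantage
```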