How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (196)


Wiki Live Challenge: Challenging Deep Research Agents with Expert-Level Wikipedia Articles

Beginner
Shaohan Wang, Benfeng Xu et al. · Feb 2 · arXiv

This paper builds a live challenge that tests how well Deep Research Agents (DRAs) can write expert-level Wikipedia-style articles.

#Deep Research Agents #Wikipedia Good Articles #Benchmark

Making Avatars Interact: Towards Text-Driven Human-Object Interaction for Controllable Talking Avatars

Beginner
Youliang Zhang, Zhengguang Zhou et al. · Feb 2 · arXiv

This paper teaches talking avatars not just to speak, but to look around their scene and handle nearby objects exactly as a text instruction says.

#grounded human-object interaction #talking avatars #diffusion transformer

PolySAE: Modeling Feature Interactions in Sparse Autoencoders via Polynomial Decoding

Beginner
Panagiotis Koromilas, Andreas D. Demou et al. · Feb 1 · arXiv

PolySAE is a new kind of sparse autoencoder that keeps a simple linear encoder for finding features but uses a polynomial decoder that can multiply features together to model their interactions.

#Sparse Autoencoder #Polynomial Decoder #Feature Interactions

LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs

Beginner
Benno Krojer, Shravan Nayak et al. · Jan 31 · arXiv

LatentLens is a simple, training-free way to translate what a model "sees" in image patches into clear words and phrases.

#LatentLens #visual tokens #contextual embeddings

PaperBanana: Automating Academic Illustration for AI Scientists

Beginner
Dawei Zhu, Rui Meng et al. · Jan 30 · arXiv

PaperBanana is a team of AI helpers that turns a paper’s method text and caption into a clean, accurate, publication-ready figure.

#academic illustration #methodology diagrams #visual language models

SSL: Sweet Spot Learning for Differentiated Guidance in Agentic Optimization

Beginner
Jinyang Wu, Changpeng Yang et al. · Jan 30 · arXiv

Most reinforcement learning agents only get a simple pass/fail reward, which hides how good or bad their attempts really were.

#Sweet Spot Learning #tiered rewards #reinforcement learning with verifiable rewards

One-step Latent-free Image Generation with Pixel Mean Flows

Beginner
Yiyang Lu, Susie Lu et al. · Jan 29 · arXiv

This paper shows how to make a whole picture in one go, directly in pixels, without using a hidden “latent” space or many tiny steps.

#pixel MeanFlow #one-step generation #x-prediction

PaddleOCR-VL-1.5: Towards a Multi-Task 0.9B VLM for Robust In-the-Wild Document Parsing

Beginner
Cheng Cui, Ting Sun et al. · Jan 29 · arXiv

This paper presents PaddleOCR-VL-1.5, an upgraded small (0.9B-parameter) vision-language model that handles several document-parsing tasks and is built to read messy, real-world documents more robustly.

#document parsing #vision-language model #layout analysis

ConceptMoE: Adaptive Token-to-Concept Compression for Implicit Compute Allocation

Beginner
Zihao Huang, Jundong Zhou et al. · Jan 29 · arXiv

ConceptMoE teaches a language model to group easy, similar tokens into bigger ideas called concepts, so it spends more brainpower on the hard parts.

#ConceptMoE #Mixture of Experts #Adaptive Compression

Less Noise, More Voice: Reinforcement Learning for Reasoning via Instruction Purification

Beginner
Yiju Guo, Tianyi Hu et al. · Jan 29 · arXiv

This paper shows that many reasoning failures in AI are caused by just a few distracting words in the prompt, not by the problems being too hard.

#LENS #Interference Tokens #Reinforcement Learning with Verifiable Rewards

Llama-3.1-FoundationAI-SecurityLLM-Reasoning-8B Technical Report

Beginner
Zhuoran Yang, Ed Li et al. · Jan 28 · arXiv

This paper introduces Foundation-Sec-8B-Reasoning, a small (8-billion-parameter) AI model that is trained to “think out loud” before answering cybersecurity questions.

#native reasoning #cybersecurity LLM #chain-of-thought

DeepSearchQA: Bridging the Comprehensiveness Gap for Deep Research Agents

Beginner
Nikita Gupta, Riju Chatterjee et al. · Jan 28 · arXiv

DeepSearchQA is a new benchmark of 900 real-world-style questions that checks whether AI agents can find complete lists of answers, not just one fact.

#DeepSearchQA #agentic information retrieval #systematic collation