How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (129)


VIBEVOICE-ASR Technical Report

Beginner
Zhiliang Peng, Jianwei Yu et al. · Jan 26 · arXiv

VIBEVOICE-ASR is a single-pass system that listens to up to 60 minutes of audio at once and outputs who spoke, when they spoke, and what they said in one stream.

#long-form ASR #speaker diarization #timestamping

Typhoon-S: Minimal Open Post-Training for Sovereign Large Language Models

Beginner
Kunat Pipatanakul, Pittawat Taveekitworachai · Jan 26 · arXiv

Typhoon-S is a simple, open recipe that turns a basic language model into a helpful assistant and then teaches it important local skills, all on small budgets.

#Typhoon-S #on-policy distillation #full-logits distillation

Elastic Attention: Test-time Adaptive Sparsity Ratios for Efficient Transformers

Beginner
Zecheng Tang, Quantong Qiu et al. · Jan 24 · arXiv

Transformers slow down on very long inputs because standard attention looks at every token pair, which is expensive.

#elastic attention #sparse attention #full attention

LongCat-Flash-Thinking-2601 Technical Report

Beginner
Meituan LongCat Team, Anchun Gui et al. · Jan 23 · arXiv

LongCat-Flash-Thinking-2601 is a huge 560-billion-parameter Mixture-of-Experts model built to act like a careful helper that can use tools, browse, code, and solve multi-step tasks.

#Agentic reasoning #Mixture-of-Experts #Asynchronous reinforcement learning

DSGym: A Holistic Framework for Evaluating and Training Data Science Agents

Beginner
Fan Nie, Junlin Wang et al. · Jan 22 · arXiv

DSGym is a unified 'gym' where AI data science agents are tested and trained by actually running code on real datasets, not just chatting about them.

#DSGym #data science agents #execution-grounded evaluation

IVRA: Improving Visual-Token Relations for Robot Action Policy with Training-Free Hint-Based Guidance

Beginner
Jongwoo Park, Kanchana Ranasinghe et al. · Jan 22 · arXiv

IVRA is a simple, training-free add-on that helps robot policies (vision-language-action models) preserve the 2D spatial layout of images while following language instructions.

#Vision-Language-Action #affinity map #training-free guidance

LLM-in-Sandbox Elicits General Agentic Intelligence

Beginner
Daixuan Cheng, Shaohan Huang et al. · Jan 22 · arXiv

This paper shows that giving an AI a safe, tiny virtual computer (a sandbox) lets it solve many kinds of problems better, not just coding ones.

#LLM-in-Sandbox #Agentic Intelligence #Reinforcement Learning

Agentic Confidence Calibration

Beginner
Jiaxin Zhang, Caiming Xiong et al. · Jan 22 · arXiv

AI agents often act very sure of themselves even when they are wrong, especially on long, multi-step tasks.

#agentic confidence calibration #holistic trajectory calibration #general agent calibrator

Dancing in Chains: Strategic Persuasion in Academic Rebuttal via Theory of Mind

Beginner
Zhitao He, Zongwei Lyu et al. · Jan 22 · arXiv

Academic rebuttals are not just about being polite; they are about smart, strategic persuasion under hidden information.

#academic rebuttal #theory of mind #strategic persuasion

Robust Tool Use via Fission-GRPO: Learning to Recover from Execution Errors

Beginner
Zhiwei Zhang, Fei Zhao et al. · Jan 22 · arXiv

Small AI models often stumble when a tool call fails and then get stuck repeating bad calls instead of fixing the mistake.

#FISSION-GRPO #error recovery #tool use

Rethinking Video Generation Model for the Embodied World

Beginner
Yufan Deng, Zilin Pan et al. · Jan 21 · arXiv

Robots need generated videos that not only look good but also obey real-world physics and complete the requested task.

#robot video generation #embodied AI #benchmark

The Flexibility Trap: Why Arbitrary Order Limits Reasoning Potential in Diffusion Language Models

Beginner
Zanlin Ni, Shenzhi Wang et al. · Jan 21 · arXiv

Diffusion language models can write tokens in any order, but that freedom can accidentally hurt their ability to reason well.

#diffusion language model #arbitrary order generation #autoregressive training