How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (1,055)


Recurrent-Depth VLA: Implicit Test-Time Compute Scaling of Vision-Language-Action Models via Latent Iterative Reasoning

Intermediate
Yalcin Tur, Jalal Naghiyev et al. · Feb 8 · arXiv

Robots often spend the same amount of computation on easy and hard moves, wasting time on simple steps while falling short on tricky ones.

#Recurrent depth #Latent iterative reasoning #Vision-Language-Action
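The idea of spending more iterations on hard inputs can be sketched as a latent refinement loop with early stopping. This is a toy illustration, not the paper's model: `refine_latent`, `step_fn`, and the convergence test are all made-up names, and real VLA models refine learned latent states rather than raw vectors.

```python
import numpy as np

def refine_latent(z, step_fn, max_iters=16, tol=1e-3):
    """Repeatedly refine a latent state, stopping early once it
    converges, so easy inputs use fewer iterations than hard ones.
    Illustrative sketch only."""
    for i in range(1, max_iters + 1):
        z_next = step_fn(z)
        if np.linalg.norm(z_next - z) < tol:  # converged -> stop early
            return z_next, i
        z = z_next
    return z, max_iters

rng = np.random.default_rng(0)
z0 = rng.normal(size=8)
# A fast-converging update stands in for an "easy" input...
z_easy, iters_easy = refine_latent(z0, lambda z: 0.1 * z)
# ...and a slow-converging one for a "hard" input that needs more compute.
z_hard, iters_hard = refine_latent(z0, lambda z: 0.9 * z)
print(iters_easy, iters_hard)
```

The implicit test-time compute scaling in the title is exactly this property: the number of iterations is chosen per input by the loop itself, not fixed in advance.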

Secure Code Generation via Online Reinforcement Learning with Vulnerability Reward Model

Intermediate
Tianyi Wu, Mingzhe Du et al. · Feb 7 · arXiv

This paper introduces SecCoderX, a way to teach code-writing AIs to be secure without breaking what the code is supposed to do.

#secure code generation #reinforcement learning #vulnerability reward model

Optimizing Few-Step Generation with Adaptive Matching Distillation

Intermediate
Lichen Bai, Zikai Zhou et al. · Feb 7 · arXiv

Diffusion models make great images and videos but are slow because they usually need many tiny steps.

#diffusion distillation #few-step generation #distribution matching distillation

InftyThink+: Effective and Efficient Infinite-Horizon Reasoning via Reinforcement Learning

Intermediate
Yuchen Yan, Liang Jiang et al. · Feb 6 · arXiv

Long chains of thought make AI smarter but also slower, pricier, and limited by memory windows.

#Iterative reasoning #Reinforcement learning for LLMs #Trajectory-level optimization

Ex-Omni: Enabling 3D Facial Animation Generation for Omni-modal Large Language Models

Intermediate
Haoyu Zhang, Zhipeng Li et al. · Feb 6 · arXiv

Ex-Omni is a new open-source AI system that can understand text or speech and then talk back while moving a 3D face in sync with the voice.

#omni-modal LLM #3D facial animation #lip-sync

AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents

Intermediate
Alisia Lupidi, Bhavul Gauri et al. · Feb 6 · arXiv

AIRS-Bench is a new test suite that checks whether AI research agents can do real machine learning research from start to finish, not just answer questions.

#AIRS-Bench #AI research agents #LLM agents

NanoQuant: Efficient Sub-1-Bit Quantization of Large Language Models

Intermediate
Hyochan Chong, Dongkyu Kim et al. · Feb 6 · arXiv

NanoQuant is a new way to shrink large language models down to 1-bit and even less than 1-bit per weight without retraining on huge datasets.

#post-training quantization #sub-1-bit quantization #binary LLMs
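How can a weight cost less than one bit? One common trick (shown here as a toy sketch, not NanoQuant's actual algorithm) is to stop storing a sign bit per weight and instead let groups of weights share an entry from a small codebook of sign patterns, so the per-weight index cost falls below 1 bit. All names and sizes below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 8))                 # 512 weights, in groups of 8

# A shared codebook of 16 sign patterns: indexing it costs 4 bits
# per group of 8 weights, i.e. 0.5 bits per weight (ignoring the
# small per-group scale, as real methods amortize such overhead).
codebook = np.sign(rng.normal(size=(16, 8)))

signs = np.sign(W)
# Assign each group the codebook pattern agreeing with the most signs.
idx = np.argmax(signs @ codebook.T, axis=1)
scale = np.abs(W).mean(axis=1, keepdims=True)  # per-group magnitude
W_q = scale * codebook[idx]                    # reconstructed weights

bits_per_weight = 4 / 8
print(bits_per_weight)
```

The price of sharing patterns is extra reconstruction error versus true 1-bit signs, which is exactly the trade-off sub-1-bit methods have to manage.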

PlanViz: Evaluating Planning-Oriented Image Generation and Editing for Computer-Use Tasks

Intermediate
Junxian Li, Kai Liu et al. · Feb 6 · arXiv

PlanViz is a new test that checks whether AI image models can plan and draw useful computer-related pictures like routes on maps, work flowcharts, and website screens.

#PlanViz benchmark #planning-oriented image generation #image editing evaluation

AgentCPM-Report: Interleaving Drafting and Deepening for Open-Ended Deep Research

Intermediate
Yishan Li, Wentong Chen et al. · Feb 6 · arXiv

This paper teaches small, local AI models to write deep, insightful research reports by letting writing and planning work together instead of staying separate.

#AgentCPM-Report #WARP #Writing As Reasoning Policy

QuantaAlpha: An Evolutionary Framework for LLM-Driven Alpha Mining

Intermediate
Jun Han, Shuo Zhang et al. · Feb 6 · arXiv

QuantaAlpha is an evolving system that helps find trading signals (called alpha factors) even when markets are noisy and keep changing.

#Alpha mining #Evolutionary agents #Trajectory optimization

RelayGen: Intra-Generation Model Switching for Efficient Reasoning

Intermediate
Jiwon Song, Yoongon Kim et al. · Feb 6 · arXiv

RelayGen is a training-free way to switch between a big model and a small model while one answer is being generated.

#RelayGen #intra-generation model switching #segment-level routing
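Segment-level routing within a single answer can be sketched in a few lines. This is a deliberately crude illustration, not RelayGen itself: the `is_hard` difficulty signal here is a made-up stand-in (real routers might use token entropy or model disagreement), and `big_model`/`small_model` are placeholder functions.

```python
def big_model(segment):
    # Stand-in for an expensive, capable model.
    return f"[BIG:{segment}]"

def small_model(segment):
    # Stand-in for a cheap, fast model.
    return f"[small:{segment}]"

def is_hard(segment):
    # Hypothetical difficulty heuristic for this sketch only.
    return any(ch.isdigit() for ch in segment)

def generate(segments):
    """Route each segment of one answer to the cheapest adequate model."""
    out = []
    for seg in segments:
        model = big_model if is_hard(seg) else small_model
        out.append(model(seg))
    return " ".join(out)

result = generate(["hello", "compute 2+2", "bye"])
print(result)  # [small:hello] [BIG:compute 2+2] [small:bye]
```

Because the switch happens mid-generation and needs no retraining of either model, the approach is "training-free": only the routing rule is added.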

Alleviating Sparse Rewards by Modeling Step-Wise and Long-Term Sampling Effects in Flow-Based GRPO

Intermediate
Yunze Tong, Mushui Liu et al. · Feb 6 · arXiv

Text-to-image models trained with GRPO give every sampling step the same final reward, which is like giving the whole team one grade no matter who did what.

#TurningPoint-GRPO #GRPO #Flow Matching
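The "same grade for everyone" problem contrasts with step-wise credit assignment, sketched below. This is a toy contrast only, not the paper's method: the per-step quality scores are hypothetical, and real flow-based GRPO variants derive step-wise signals from the sampling dynamics rather than from a given score list.

```python
def uniform_credit(final_reward, n_steps):
    """Trajectory-level credit: every step receives the same reward."""
    return [final_reward] * n_steps

def stepwise_credit(step_scores, final_reward):
    """Step-wise credit: split the final reward in proportion to each
    step's (hypothetical) contribution, so steps are graded unequally."""
    total = sum(step_scores)
    return [final_reward * s / total for s in step_scores]

scores = [0.1, 0.7, 0.2]                  # made-up per-step quality signals
uniform = uniform_credit(1.0, 3)          # [1.0, 1.0, 1.0]
stepwise = stepwise_credit(scores, 1.0)   # middle step earns the most
print(uniform, stepwise)
```

With uniform credit, a decisive middle step and a wasted first step get identical gradients; step-wise credit is what lets training favor the steps that actually mattered.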