How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (943)


WAY: Estimation of Vessel Destination in Worldwide AIS Trajectory

Intermediate
Jin Sob Kim, Hyun Joon Park et al. · Dec 15 · arXiv

Ships constantly broadcast AIS messages, but these messages are messy, unevenly spaced in time, and sometimes wrong.

#AIS trajectory #vessel destination prediction #nested sequence

Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows

Intermediate
Haoyu Dong, Pengkun Zhang et al. · Dec 15 · arXiv

FINCH is a new benchmark that checks whether AI can handle real finance and accounting work using messy, real-world spreadsheets, emails, PDFs, charts, and more.

#FINCH benchmark #finance and accounting AI #spreadsheet agents

Spatial-Aware VLA Pretraining through Visual-Physical Alignment from Human Videos

Intermediate
Yicheng Feng, Wanpeng Zhang et al. · Dec 15 · arXiv

Robots often see the world as flat pictures but must move in a 3D world, which makes accurate actions hard.

#Vision-Language-Action #3D spatial grounding #visual-physical alignment

GTR-Turbo: Merged Checkpoint is Secretly a Free Teacher for Agentic VLM Training

Intermediate
Tong Wei, Yijun Yang et al. · Dec 15 · arXiv

GTR-Turbo teaches a vision-language agent using a 'free teacher' made by merging its own past checkpoints, so no costly external model is needed.

#GTR-Turbo #checkpoint merging #TIES-merging
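
To make the idea of merging past checkpoints concrete, here is a minimal sketch that assumes each checkpoint is a plain dictionary mapping a parameter name to a list of floats, and that merging means a uniform element-wise average. The tags above mention TIES-merging, which is more involved; plain averaging is swapped in here purely for illustration and is not presented as the paper's method. The name `merge_checkpoints` and the toy data are hypothetical.

```python
def merge_checkpoints(checkpoints):
    """Uniformly average parameters across checkpoints (illustrative only).

    Each checkpoint is assumed to be a dict: parameter name -> list of floats,
    and all checkpoints are assumed to share the same names and shapes.
    """
    if not checkpoints:
        raise ValueError("need at least one checkpoint")
    merged = {}
    for name in checkpoints[0]:
        # Group the i-th value of this parameter from every checkpoint.
        columns = zip(*(ckpt[name] for ckpt in checkpoints))
        merged[name] = [sum(col) / len(checkpoints) for col in columns]
    return merged


# Toy "past checkpoints" of the same model (made-up numbers).
past = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
teacher = merge_checkpoints(past)
print(teacher)
```

The merged dictionary would play the role of the teacher's weights; how the teacher then guides the agent's training is not covered by this sketch.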

Few-Step Distillation for Text-to-Image Generation: A Practical Guide

Intermediate
Yifan Pu, Yizeng Han et al. · Dec 15 · arXiv

Big text-to-image models make amazing pictures but are slow because they take hundreds of tiny steps to turn noise into an image.

#text-to-image #diffusion models #few-step generation

Reveal Hidden Pitfalls and Navigate Next Generation of Vector Similarity Search from Task-Centric Views

Beginner
Tingyang Chen, Cong Fu et al. · Dec 15 · arXiv

The paper shows that judging vector search only by distance-based recall and speed can be very misleading for real tasks.

#vector similarity search #approximate nearest neighbor #maximum inner product search
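
For readers unfamiliar with distance-based recall, the sketch below shows the conventional recall@k computation: the fraction of the true k nearest neighbors that an approximate search actually returned. The function name `recall_at_k` and the ID lists are hypothetical, and the task-centric measures the paper argues for are not shown because the summary does not describe them.

```python
def recall_at_k(retrieved_ids, true_neighbor_ids, k):
    """Fraction of the true top-k neighbors found in the retrieved top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    retrieved = set(retrieved_ids[:k])
    truth = set(true_neighbor_ids[:k])
    return len(retrieved & truth) / k


# Made-up IDs: the approximate index found 2 of the 4 exact neighbors.
approx_result = [7, 2, 9, 4]
exact_result = [2, 7, 5, 1]
score = recall_at_k(approx_result, exact_result, k=4)
print(score)
```

Note that this score only counts ID overlap with the exact neighbors; it says nothing about whether the returned items are useful for the downstream task, which is the gap the summary points to.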

QwenLong-L1.5: Post-Training Recipe for Long-Context Reasoning and Memory Management

Intermediate
Weizhou Shen, Ziyi Yang et al. · Dec 15 · arXiv

QwenLong-L1.5 is a post-training recipe that helps AI read and reason over very long documents by improving the data it learns from, the way it is trained, and how it manages its memory of important information.

#long-context reasoning #reinforcement learning #GRPO

Improving Recursive Transformers with Mixture of LoRAs

Intermediate
Mohammadmahdi Nouriborji, Morteza Rohanian et al. · Dec 14 · arXiv

Recursive transformers save memory by reusing the same layer over and over, but that makes them less expressive and hurts accuracy.

#Mixture of LoRAs #recursive transformers #parameter sharing

DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning

Intermediate
Zhe Liu, Runhui Huang et al. · Dec 14 · arXiv

DrivePI is a single, small (0.5B) multimodal language model that sees with cameras and LiDAR, talks in natural language, and plans driving actions all at once.

#DrivePI #Vision-Language-Action #3D occupancy

State over Tokens: Characterizing the Role of Reasoning Tokens

Intermediate
Mosh Levy, Zohar Elyoseph et al. · Dec 14 · arXiv

Reasoning tokens (the words a model writes before its final answer) help the model think better, but they are not a trustworthy diary of how it really thought.

#State over Tokens #reasoning tokens #chain-of-thought

NL2Repo-Bench: Towards Long-Horizon Repository Generation Evaluation of Coding Agents

Intermediate
Jingzhe Ding, Shengda Long et al. · Dec 14 · arXiv

NL2Repo-Bench is a new benchmark that tests if coding agents can build a whole Python library from just one long natural-language document and an empty folder.

#NL2Repo-Bench #autonomous coding agents #long-horizon reasoning

WebOperator: Action-Aware Tree Search for Autonomous Agents in Web Environment

Intermediate
Mahir Labib Dihan, Tanzima Hashem et al. · Dec 14 · arXiv

WebOperator lets an AI agent use a map of choices (a search tree) to navigate websites safely and reach its goals.

#web agent #tree search #best-first search