How I Study AI - Learn AI Papers & Lectures the Easy Way

On Data Engineering for Scaling LLM Terminal Capabilities

Intermediate

Renjie Pi, Grace Lam et al. · Feb 24 · arXiv

This paper shows that you can vastly improve a model’s command-line (terminal) skills by carefully engineering the training data, not just by using a bigger model.

#Terminal-Bench 2.0 #terminal agents #synthetic task generation

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Intermediate

Wenxuan Huang, Yu Zeng et al. · Jan 29 · arXiv

The paper tackles a real problem: one-shot image or text searches often miss the right evidence (low hit-rate), especially in noisy, cluttered pictures.

#multimodal deep research #visual question answering #ReAct reasoning

ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands

Intermediate

Siyuan Hu, Kevin Qinghong Lin et al. · Dec 31 · arXiv

Computers usually click like a woodpecker, but they struggle to drag smoothly like a human hand; this paper fixes that.

#GUI automation #continuous control #flow matching
