How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (4)

#visual tokens

CodeOCR: On the Effectiveness of Vision Language Models in Code Understanding

Intermediate
Yuling Shi, Chaoxiang Xie et al., Feb 2, arXiv

The paper tests a simple but bold idea: show code to AI as pictures instead of plain text, then shrink those pictures to save tokens and time.

#multimodal language models #code as images #visual code understanding

DeepSeek-OCR 2: Visual Causal Flow

Intermediate
Haoran Wei, Yaofeng Sun et al., Jan 28, arXiv

DeepSeek-OCR 2 teaches a computer to “read” pictures of documents in a smarter order, more like how people read.

#DeepSeek-OCR 2 #DeepEncoder V2 #visual tokens

AgentOCR: Reimagining Agent History via Optical Self-Compression

Intermediate
Lang Feng, Fuchao Yang et al., Jan 8, arXiv

AgentOCR turns an agent’s long text history into pictures so it can remember more using fewer tokens.

#AgentOCR #optical self-compression #visual tokens

VTCBench: Can Vision-Language Models Understand Long Context with Vision-Text Compression?

Intermediate
Hongbo Zhao, Meng Wang et al., Dec 17, arXiv

Long texts are expensive for AI to read because each extra token costs a lot of compute and memory.

#vision-text compression #VTCBench #vision-language models