Papers2

#reading order

FireRed-OCR Technical Report

FireRed-OCR turns a general vision-language model into a careful document reader that follows strict rules, so its outputs are usable in the real world.

#FireRed-OCR#structural hallucination#document parsing

Not triaged yet

DeepSeek-OCR 2: Visual Causal Flow

Intermediate

Haoran Wei, Yaofeng Sun et al.Jan 28arXiv

DeepSeek-OCR 2 teaches a computer to “read” pictures of documents in a smarter order, more like how people read.

#DeepSeek-OCR 2#DeepEncoder V2#visual tokens

Not triaged yet