FireRed-OCR Technical Report
Intermediate · Hao Wu, Haoran Lou et al. · Mar 2 · arXiv
FireRed-OCR turns a general vision-language model into a careful document reader that follows strict rules, so its outputs are usable in the real world.
#FireRed-OCR · #structural hallucination · #document parsing