Phi-4-reasoning-vision-15B Technical Report
Intermediate | Jyoti Aneja, Michael Harrison et al. | Mar 4 | arXiv
Phi-4-reasoning-vision-15B is a compact, 15B-parameter open-weight model that understands images and text together and is especially strong at math, science, and working with computer screens.
#multimodal reasoning #vision-language model #mid-fusion