Hepato-LLaVA is an AI model that reads giant microscope pictures of the liver (whole-slide images) and answers medical questions about liver cancer.
SLA2 is a new way for AI to pay attention faster by smartly splitting work between two helpers: a precise one (sparse attention) and a speedy one (linear attention).
HySparse is a new way for AI models to pay attention that mixes a few full attention layers with many fast, memory‑saving sparse layers.
This paper speeds up how AI models read very long texts by carefully choosing which words (tokens) to focus on at each step.
This paper shows how to make powerful image‑generating Transformers run fast on phones without needing the cloud.
Long texts make standard attention in large language models very slow because it checks every word against every other word, so the work grows with the square of the text length.
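To see why checking every word against every other word gets slow, here is a minimal NumPy sketch (illustrative only, not any paper's actual code): dense attention builds a score matrix with one entry per pair of tokens, so doubling the text length quadruples the work.

```python
import numpy as np

def attention_scores(n_tokens: int, d: int = 8) -> np.ndarray:
    """Dense attention: every token is compared against every other token."""
    rng = np.random.default_rng(0)
    q = rng.standard_normal((n_tokens, d))  # one query vector per token
    k = rng.standard_normal((n_tokens, d))  # one key vector per token
    # The score matrix has n_tokens * n_tokens entries -- quadratic cost.
    return q @ k.T / np.sqrt(d)

small = attention_scores(512)
large = attention_scores(1024)
print(small.size)  # 262144
print(large.size)  # 1048576 -- twice the tokens, four times the work
```

Sparse and linear attention methods, like those summarized above, exist precisely to avoid materializing this full quadratic score matrix.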