InfiniteVL: Synergizing Linear and Sparse Attention for Highly-Efficient, Unlimited-Input Vision-Language Models
Intermediate · Hongyuan Tao, Bencheng Liao et al. · Dec 9 · arXiv
InfiniteVL is a vision-language model that combines two mechanisms: Sliding Window Attention, which handles local context, and Gated DeltaNet, a linear-attention module that provides long-term memory.
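To make the two mechanisms concrete, here is a minimal NumPy sketch. It is not InfiniteVL's implementation. The function names, shapes, and single-head setup are assumptions for illustration, and the recurrence is a simplified form of the gated delta rule. The first function restricts each token to a fixed window of recent tokens. The second keeps a fixed-size state matrix that is decayed, partially erased, and rewritten at every step, so its memory cost does not grow with sequence length.

```python
import numpy as np

def sliding_window_attention(q, k, v, window):
    """Causal softmax attention where token t sees only the last `window` tokens."""
    T, d = q.shape
    scores = q @ k.T / np.sqrt(d)
    idx = np.arange(T)
    # allowed[t, s] is True when s <= t and t - s < window
    allowed = (idx[None, :] <= idx[:, None]) & (idx[:, None] - idx[None, :] < window)
    scores = np.where(allowed, scores, -np.inf)
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ v

def gated_delta_memory(q, k, v, alpha, beta):
    """Simplified gated delta-rule recurrence with a fixed-size state S (d_v x d_k).

    alpha[t] in [0, 1] decays old memory; beta[t] in [0, 1] sets the write strength.
    Both are plain inputs here; in a real model they would be learned projections.
    """
    T, d_k = k.shape
    d_v = v.shape[1]
    S = np.zeros((d_v, d_k))
    out = np.zeros((T, d_v))
    for t in range(T):
        k_t = k[t] / (np.linalg.norm(k[t]) + 1e-6)  # unit-norm key keeps the update stable
        # Decay the state, erase what was stored under k_t, then write v[t] under k_t.
        S = alpha[t] * (S - beta[t] * np.outer(S @ k_t, k_t)) + beta[t] * np.outer(v[t], k_t)
        out[t] = S @ q[t]
    return out

# Toy usage: 6 tokens, key/query dim 4, value dim 3 (arbitrary sizes).
rng = np.random.default_rng(0)
q, k = rng.normal(size=(6, 4)), rng.normal(size=(6, 4))
v = rng.normal(size=(6, 3))
local_out = sliding_window_attention(q, k, rng.normal(size=(6, 4)), window=2)
memory_out = gated_delta_memory(q, k, v, alpha=np.full(6, 0.9), beta=np.full(6, 0.5))
```

The state `S` has the same size after six tokens as it would after six million, which is the property that lets a linear module carry long-term memory, while the windowed attention keeps exact softmax attention over nearby tokens.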
#InfiniteVL #linear attention #Gated DeltaNet