How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (4)

#residual vector quantization

MIBURI: Towards Expressive Interactive Gesture Synthesis

Intermediate
M. Hamza Mughal, Rishabh Dabral et al. · Mar 3 · arXiv

MIBURI is a system that makes a talking digital character move its body and face expressively in real time while it speaks.

#co-speech gesture synthesis #embodied conversational agents #causal generation

MOSS-Audio-Tokenizer: Scaling Audio Tokenizers for Future Audio Foundation Models

Intermediate
Yitian Gong, Kuangwei Chen et al. · Feb 11 · arXiv

This paper builds a new audio tokenizer, called MOSS-Audio-Tokenizer, that turns sound into compact discrete tokens, much as text tokenizers turn sentences into word-level tokens.

#audio tokenizer #causal transformer #residual vector quantization

SAMTok: Representing Any Mask with Two Words

Intermediate
Yikang Zhou, Tao Zhang et al. · Jan 22 · arXiv

SAMTok turns any object’s mask in an image into just two special “words” so language models can handle pixels like they handle text.

#SAMTok #mask tokenizer #residual vector quantization

HeartMuLa: A Family of Open Sourced Music Foundation Models

Intermediate
Dongchao Yang, Yuxin Xie et al. · Jan 15 · arXiv

HeartMuLa is a family of open-source music AI models that can understand and generate full songs with clear lyrics and strong musical structure.

#music generation #audio tokenizer #residual vector quantization