How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#Adam

On Surprising Effectiveness of Masking Updates in Adaptive Optimizers

Intermediate
Taejong Joo, Wenhan Xia et al. · Feb 17 · arXiv

The paper finds that a simple trick, randomly skipping some parameter updates, can train large language models better than more elaborate optimizers.

#Magma, #random masking, #adaptive optimizers

ECO: Quantized Training without Full-Precision Master Weights

Intermediate
Mahdi Nikdan, Amir Zandieh et al. · Jan 29 · arXiv

Training large AI models uses a lot of memory because most methods still keep a hidden full-precision copy of the weights, called master weights.

#quantized training, #master weights, #error feedback