How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (2)

#token utilization

Pretraining A Large Language Model using Distributed GPUs: A Memory-Efficient Decentralized Paradigm

Beginner
Jinrui Zhang, Chaodong Xiao et al. | Feb 12 | arXiv

Training big language models usually requires very expensive, tightly connected GPU clusters, which most people do not have access to.

#decentralized LLM pretraining, #mixture-of-experts (MoE), #sparse expert synchronization

SimpleMem: Efficient Lifelong Memory for LLM Agents

Intermediate
Jiaqi Liu, Yaofeng Su et al. | Jan 5 | arXiv

SimpleMem is a new memory system that helps AI remember long conversations without wasting space or tokens.

#LLM memory, #semantic compression, #online synthesis