🎓How I Study AIHISA
📖Read
📄Papers📰Blogs🎬Courses
💡Learn
🛤️Paths📚Topics💡Concepts🎴Shorts
🎯Practice
🧩Problems🎯Prompts🧠Review
Search
How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers18

AllBeginnerIntermediateAdvanced
All SourcesarXiv
#Mixture-of-Experts

NVIDIA Nemotron 3: Efficient and Open Intelligence

Intermediate
NVIDIA, : et al.Dec 24arXiv

Nemotron 3 is a new family of open AI models (Nano, Super, Ultra) built to think better while running faster and cheaper.

#Nemotron 3#Mixture-of-Experts#LatentMoE

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Intermediate
NVIDIA, : et al.Dec 23arXiv

Nemotron 3 Nano is a new open-source language model that mixes two brain styles (Mamba and Transformer) and adds a team of special experts (MoE) so it thinks better while running much faster.

#Mixture-of-Experts#Mamba-2#Transformer

INTELLECT-3: Technical Report

Intermediate
Prime Intellect Team, Mika Senghaas et al.Dec 18arXiv

INTELLECT-3 is a 106B-parameter Mixture-of-Experts model (about 12B active per token) trained with large-scale reinforcement learning and it beats many bigger models on math, coding, science, and reasoning tests.

#INTELLECT-3#prime-rl#verifiers

VersatileFFN: Achieving Parameter Efficiency in LLMs via Adaptive Wide-and-Deep Reuse

Intermediate
Ying Nie, Kai Han et al.Dec 16arXiv

Large language models get smarter when they get bigger, but storing all those extra weights eats tons of memory.

#VersatileFFN#parameter efficiency#virtual experts

Improving Recursive Transformers with Mixture of LoRAs

Intermediate
Mohammadmahdi Nouriborji, Morteza Rohanian et al.Dec 14arXiv

Recursive transformers save memory by reusing the same layer over and over, but that makes them less expressive and hurts accuracy.

#Mixture of LoRAs#recursive transformers#parameter sharing

Scone: Bridging Composition and Distinction in Subject-Driven Image Generation via Unified Understanding-Generation Modeling

Beginner
Yuran Wang, Bohan Zeng et al.Dec 14arXiv

Scone is a new AI method that makes images from instructions while correctly picking the right subject even when many look similar.

#subject-driven image generation#multi-subject composition#subject distinction
12