How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers (791)


UltraShape 1.0: High-Fidelity 3D Shape Generation via Scalable Geometric Refinement

Intermediate
Tanghui Jia, Dongyu Yan et al. · Dec 24 · arXiv

UltraShape 1.0 is a two-step 3D generator that first makes a simple overall shape and then zooms in to add tiny details.

#3D diffusion #coarse-to-fine generation #voxel-based refinement

Quantile Rendering: Efficiently Embedding High-dimensional Feature on 3D Gaussian Splatting

Intermediate
Yoonwoo Jeong, Cheng Sun et al. · Dec 24 · arXiv

This paper speeds up how 3D scenes handle big, 512‑dimensional features without throwing away important information.

#3D Gaussian Splatting #Quantile Rendering #Open-vocabulary segmentation

NVIDIA Nemotron 3: Efficient and Open Intelligence

Intermediate
NVIDIA et al. · Dec 24 · arXiv

Nemotron 3 is a new family of open AI models (Nano, Super, Ultra) built to think better while running faster and cheaper.

#Nemotron 3 #Mixture-of-Experts #LatentMoE

Nemotron 3 Nano: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Intermediate
NVIDIA et al. · Dec 23 · arXiv

Nemotron 3 Nano is a new open-source language model that mixes two brain styles (Mamba and Transformer) and adds a team of special experts (MoE) so it thinks better while running much faster.

#Mixture-of-Experts #Mamba-2 #Transformer

SemanticGen: Video Generation in Semantic Space

Intermediate
Jianhong Bai, Xiaoshi Wu et al. · Dec 23 · arXiv

SemanticGen is a new way to make videos that starts by planning in a small, high-level 'idea space' (semantic space) and then adds the tiny visual details later.

#Video generation #Diffusion model #Semantic representation

LongVideoAgent: Multi-Agent Reasoning with Long Videos

Intermediate
Runtao Liu, Ziyi Liu et al. · Dec 23 · arXiv

LongVideoAgent is a team of three AIs that work together to answer questions about hour‑long TV episodes without missing small details.

#long video question answering #multi-agent reasoning #temporal grounding

SpatialTree: How Spatial Abilities Branch Out in MLLMs

Intermediate
Yuxi Xiao, Longfei Li et al. · Dec 23 · arXiv

SpatialTree is a new, four-level "ability tree" that tests how multimodal AI models (that see and read) handle space: from basic seeing to acting in the world.

#Spatial Intelligence #Multimodal Large Language Models #Hierarchical Benchmark

Active Intelligence in Video Avatars via Closed-loop World Modeling

Intermediate
Xuanhua He, Tianyu Yang et al. · Dec 23 · arXiv

The paper turns video avatars from passive puppets into active doers that can plan, act, check their own work, and fix mistakes over many steps.

#ORCA #L-IVA #Internal World Model

Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Intermediate
Seijin Kobayashi, Yanick Schimpf et al. · Dec 23 · arXiv

The paper shows that big sequence models (like transformers) quietly learn longer goals inside their hidden activations, even though they are trained one step at a time.

#hierarchical reinforcement learning #temporal abstractions #autoregressive models

Can LLMs Predict Their Own Failures? Self-Awareness via Internal Circuits

Intermediate
Amirhosein Ghasemabadi, Di Niu · Dec 23 · arXiv

Large language models often sound confident even when they are wrong, and existing ways to catch mistakes are slow or not very accurate.

#self-awareness #large language models #hidden states

Learning to Reason in 4D: Dynamic Spatial Understanding for Vision Language Models

Intermediate
Shengchao Zhou, Yuxin Chen et al. · Dec 23 · arXiv

The paper tackles a big blind spot in vision-language models: understanding how objects move and relate in 3D over time (dynamic spatial reasoning, or DSR).

#dynamic spatial reasoning #vision-language models #4D understanding

Step-DeepResearch Technical Report

Intermediate
Chen Hu, Haikuo Du et al. · Dec 23 · arXiv

Search is not the same as research; real research needs planning, checking many sources, fixing mistakes, and writing a clear report.

#Deep Research #Atomic Capabilities #ReAct Agent