How I Study AI - Learn AI Papers & Lectures the Easy Way

Papers: 10

#self-supervised learning

Utonia: Toward One Encoder for All Point Clouds

Intermediate
Yujia Zhang, Xiaoyang Wu et al. · Mar 3 · arXiv

Utonia is a single brain (encoder) that learns from many kinds of 3D point clouds, like indoor rooms, outdoor streets, tiny toys, and even city maps.

#Utonia #point cloud #self-supervised learning

Next Embedding Prediction Makes World Models Stronger

Intermediate
George Bredis, Nikita Balagansky et al. · Mar 3 · arXiv

NE-Dreamer is a model-based reinforcement learning agent that skips rebuilding pixels and instead learns by predicting the next step’s hidden features.

#model-based reinforcement learning #world models #next-embedding prediction

InfoNCE Induces Gaussian Distribution

Intermediate
Roy Betser, Eyal Gofer et al. · Feb 27 · arXiv

The paper shows that when we train with the popular InfoNCE contrastive loss, the learned features start to behave like they come from a Gaussian (bell-shaped) distribution.

#InfoNCE #contrastive learning #Gaussian embeddings
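
For readers who have not met the InfoNCE loss named in the summary above, here is a minimal sketch of its standard form: each query is scored against every key by cosine similarity, and the loss is the cross-entropy of picking its own positive key. The function name `info_nce`, the `temperature` default, and the toy vectors are assumptions for illustration, not details taken from the paper.

```python
import math


def info_nce(queries, keys, temperature=0.1):
    """Mean InfoNCE loss, assuming keys[i] is the positive for queries[i].

    All other keys in the batch act as negatives. The temperature value
    is an illustrative default, not one taken from the paper.
    """
    def normalize(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    q = [normalize(v) for v in queries]
    k = [normalize(v) for v in keys]

    total = 0.0
    for i, qi in enumerate(q):
        # Cosine similarity of query i to every key, scaled by temperature.
        logits = [sum(a * b for a, b in zip(qi, kj)) / temperature for kj in k]
        # Numerically stable log-sum-exp over the positive and the negatives.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(l - m) for l in logits))
        # Cross-entropy with the matching index as the correct class.
        total += log_denom - logits[i]
    return total / len(q)


aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

In this toy batch the loss is close to zero when every query matches its own key and much larger when the keys are swapped, which is the pressure that shapes the learned features.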

BatCoder: Self-Supervised Bidirectional Code-Documentation Learning via Back-Translation

Intermediate
Jingwen Xu, Yiyang Lu et al. · Jan 30 · arXiv

BatCoder teaches a code model to write both code and its documentation by doing a round trip: from code to docs and back to code.

#back-translation #self-supervised learning #reinforcement learning for code

EEG Foundation Models: Progresses, Benchmarking, and Open Problems

Intermediate
Dingkun Liu, Yuheng Chen et al. · Jan 25 · arXiv

This paper builds a big, fair playground (a benchmark) for testing many EEG foundation models side by side under the same rules.

#EEG foundation models #brain-computer interface #self-supervised learning

Enhancing Linguistic Competence of Language Models through Pre-training with Language Learning Tasks

Beginner
Atsuki Yamaguchi, Maggie Mi et al. · Jan 6 · arXiv

The paper teaches language models using extra 'language homework' made from the same raw text so they learn grammar and meaning, not just next-word guessing.

#language model pretraining #causal language modeling #linguistic competence

Toward Stable Semi-Supervised Remote Sensing Segmentation via Co-Guidance and Co-Fusion

Intermediate
Yi Zhou, Xuechao Zou et al. · Dec 28 · arXiv

Co2S is a new way to train segmentation models with very few labels by letting two different students (CLIP and DINOv3) learn together and correct each other.

#semi-supervised segmentation #remote sensing #pseudo-label drift

Next-Embedding Prediction Makes Strong Vision Learners

Beginner
Sihan Xu, Ziqiao Ma et al. · Dec 18 · arXiv

This paper introduces NEPA, a very simple way to teach vision models by having them predict the next patch’s embedding in an image sequence, just like language models predict the next word.

#self-supervised learning #vision transformer #autoregression
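
To make the idea of "predicting the next patch's embedding" concrete, here is a toy sketch of a next-embedding objective: the prediction made at position t is compared with the actual embedding at position t + 1. The helper names and the choice of one minus cosine similarity as the distance are assumptions for illustration. The summary above does not say which loss the paper uses, and a real system would produce `predicted` with a trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def next_embedding_loss(predicted, embeddings):
    """Average distance between predicted[t] and embeddings[t + 1].

    The last prediction has no target and is ignored. The distance
    (1 - cosine similarity) is an illustrative assumption.
    """
    pairs = list(zip(predicted[:-1], embeddings[1:]))
    return sum(1.0 - cosine(p, e) for p, e in pairs) / len(pairs)


# Hypothetical patch embeddings for a three-patch sequence.
patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# A predictor that just copies the current patch does poorly;
# one that outputs the true next patch drives the loss toward zero.
copy_current = patches
oracle = [[0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
print(next_embedding_loss(oracle, patches) < next_embedding_loss(copy_current, patches))
```

The shift by one position is the same trick language models use for next-word prediction, applied here to continuous vectors instead of discrete tokens.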

In Pursuit of Pixel Supervision for Visual Pre-training

Intermediate
Lihe Yang, Shang-Wen Li et al. · Dec 17 · arXiv

Pixels are the raw stuff of images, and this paper shows you can learn great vision skills by predicting pixels directly, not by comparing fancy hidden features.

#pixel supervision #masked autoencoders #MAE redesign

Towards Scalable Pre-training of Visual Tokenizers for Generation

Intermediate
Jingfeng Yao, Yuda Song et al. · Dec 15 · arXiv

The paper tackles a paradox: visual tokenizers that get great pixel reconstructions often make worse images when used for generation.

#visual tokenizer #latent space #Vision Transformer