Papers (6)

#semantic segmentation

Utonia: Toward One Encoder for All Point Clouds

Yujia Zhang, Xiaoyang Wu et al. · Mar 3 · arXiv

Utonia is a single brain (encoder) that learns from many kinds of 3D point clouds, like indoor rooms, outdoor streets, tiny toys, and even city maps.

#Utonia #point cloud #self-supervised learning

RadarGen: Automotive Radar Point Cloud Generation from Cameras

Intermediate

Tomer Borreda, Fangqiang Ding et al. · Dec 19 · arXiv

RadarGen is a tool that learns to generate realistic car radar point clouds just from multiple camera views.

#automotive radar #radar point cloud generation #latent diffusion

Next-Embedding Prediction Makes Strong Vision Learners

Beginner

Sihan Xu, Ziqiao Ma et al. · Dec 18 · arXiv

This paper introduces NEPA, a very simple way to teach vision models by having them predict the next patch’s embedding in a sequence of image patches, just like language models predict the next word.

#self-supervised learning #vision transformer #autoregression

In Pursuit of Pixel Supervision for Visual Pre-training

Intermediate

Lihe Yang, Shang-Wen Li et al. · Dec 17 · arXiv

Pixels are the raw stuff of images, and this paper shows you can learn great vision skills by predicting pixels directly, not by comparing fancy hidden features.

#pixel supervision #masked autoencoders #MAE redesign

LitePT: Lighter Yet Stronger Point Transformer

Intermediate

Yuanwen Yue, Damien Robert et al. · Dec 15 · arXiv

LitePT is a new AI backbone for 3D point clouds that uses convolutions in early layers and attention in later layers to be both fast and accurate.

#LitePT #Point Transformer #3D point cloud

COOPER: A Unified Model for Cooperative Perception and Reasoning in Spatial Intelligence

Beginner

Zefeng Zhang, Xiangzhao Hao et al. · Dec 4 · arXiv

COOPER is a single AI model that both “looks better” (perceives depth and object boundaries) and “thinks smarter” (reasons step by step) to answer spatial questions about images.

#COOPER #multimodal large language model #unified model
