Theory of learned representations: embeddings, contrastive learning, disentanglement, and feature space geometry.
10 concepts
Embedding spaces map discrete things like words or products to dense vectors so that similar items are close together.
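A minimal sketch of this idea, using tiny hand-picked 3-dimensional vectors (the values and the `embeddings` dictionary are hypothetical, chosen only for illustration): similarity between items is measured as the cosine of the angle between their vectors.

```python
import math

# Hypothetical toy embeddings: "cat" and "dog" point in similar
# directions, "car" does not.
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

def cosine_similarity(u, v):
    """Cosine of the angle between u and v: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

print(cosine_similarity(embeddings["cat"], embeddings["dog"]))  # high
print(cosine_similarity(embeddings["cat"], embeddings["car"]))  # low
```

In a real system the vectors are learned by a model rather than written by hand, but the geometry is the same: nearby vectors mean similar items.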
Contrastive learning teaches models by pulling together similar examples (positives) and pushing apart dissimilar ones (negatives).
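One common way to formalize this pull/push objective is the InfoNCE loss: treat the positive as the correct class in a softmax over the positive plus all negatives. Below is a minimal sketch for a single anchor (the vectors, the `temperature` default, and the function name are illustrative assumptions, not a specific library's API):

```python
import math

def info_nce_loss(anchor, positive, negatives, temperature=0.5):
    """InfoNCE for one anchor: cross-entropy of a softmax over cosine
    similarities, with the positive as the correct class (index 0)."""
    def cos(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))

    logits = [cos(anchor, positive) / temperature]
    logits += [cos(anchor, n) / temperature for n in negatives]
    # numerically stable log-sum-exp, then -log softmax[0]
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[0]

anchor   = [1.0, 0.0]
positive = [0.9, 0.1]  # similar to the anchor -> low loss
negative = [0.0, 1.0]  # dissimilar -> should be pushed away
print(info_nce_loss(anchor, positive, [negative]))
```

Minimizing this loss over many anchors is what pulls positives together and pushes negatives apart in the embedding space.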
Self-supervised learning (SSL) teaches models to learn useful representations from unlabeled data by solving proxy tasks created directly from the data.
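The key move in SSL is manufacturing labeled pairs from unlabeled data. A minimal sketch of one classic proxy task, masked-word prediction (the sentence, the `[MASK]` token string, and the function name are illustrative assumptions):

```python
def make_masked_pairs(sentence, mask_token="[MASK]"):
    """Build self-supervised (input, target) pairs from raw text:
    mask out each word in turn; the proxy task is to predict it."""
    words = sentence.split()
    pairs = []
    for i, w in enumerate(words):
        masked = words[:i] + [mask_token] + words[i + 1:]
        pairs.append((" ".join(masked), w))
    return pairs

pairs = make_masked_pairs("embeddings map words to vectors")
for inp, target in pairs:
    print(inp, "->", target)
```

No human labeling is needed: the data supervises itself, and a model trained to solve the proxy task is forced to learn representations that capture the structure of the inputs.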
Metric learning learns a distance function from data so that similar items end up close together and dissimilar items end up far apart in a feature space.
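The standard training signal here is the triplet loss: an anchor should be closer to a positive than to a negative, by at least some margin. A minimal sketch with 2-D points (the points and the `margin` default are illustrative assumptions):

```python
def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: zero once
    d(anchor, positive) + margin <= d(anchor, negative)."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return max(0.0, sq_dist(anchor, positive)
                    - sq_dist(anchor, negative) + margin)

# Well-separated triplet: positive near, negative far -> loss is 0.
print(triplet_loss([0.0, 0.0], [0.1, 0.0], [3.0, 0.0]))  # 0.0
# Violated triplet: negative is closer than the positive -> positive loss.
print(triplet_loss([0.0, 0.0], [2.0, 0.0], [1.0, 0.0]))  # 4.0
```

Gradient descent on this loss reshapes the feature space so that the learned distance reflects task-specific similarity.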
Principal Component Analysis (PCA) finds new orthogonal axes (principal components) that capture the maximum variance in your data.
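A minimal sketch of finding the first principal component, using power iteration on the covariance matrix (this computes only the top component; real libraries use a full eigendecomposition or SVD, and the data points here are made up for illustration):

```python
def first_principal_component(data, iters=200):
    """Power iteration on the covariance matrix: returns a unit vector
    along the direction of maximum variance."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1)
            for j in range(d)] for i in range(d)]
    v = [1.0] * d  # starting vector for the iteration
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Points scattered along y = x: the first PC should be close to
# the diagonal direction [0.707, 0.707].
data = [[1.0, 1.1], [2.0, 1.9], [3.0, 3.2], [4.0, 3.9], [5.0, 5.1]]
print(first_principal_component(data))
```

Projecting the data onto the top few such components gives a lower-dimensional view that retains as much variance as possible.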
t-SNE and UMAP are nonlinear dimensionality-reduction methods that preserve local neighborhoods to make high-dimensional data visible in 2D or 3D.
Disentangled representations aim to encode independent factors of variation (like shape, size, or color) into separate coordinates of a latent vector.
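To make "separate coordinates for separate factors" concrete, here is a toy hand-written decoder with a perfectly disentangled latent (the factor names `size` and `hue` and the scaling constants are hypothetical; a real model would learn this mapping rather than have it written out):

```python
def decode(z):
    """Toy decoder with a disentangled 2-D latent:
    z[0] controls only size, z[1] controls only hue."""
    return {"size": 10.0 * z[0], "hue": 360.0 * z[1]}

a = decode([0.5, 0.2])
b = decode([0.9, 0.2])  # move only z[0]: size changes, hue stays fixed
print(a)
print(b)
```

In an entangled representation, by contrast, changing a single latent coordinate would change several output factors at once, which is what disentanglement methods try to avoid.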
Transfer learning theory studies when and why a model trained on a source distribution will work on a different target distribution.