From Theory to Algorithms
Chapter 1: What is learning? Types of learning, relations to other fields, and how to read this book.
Chapter 2: The statistical learning framework and empirical risk minimization.
Chapter 3: PAC learning and agnostic PAC learning, formalized.
Chapter 4: When uniform convergence guarantees learnability.
Chapter 5: The No-Free-Lunch theorem and error decomposition.
Chapter 6: Measuring hypothesis-class complexity with the VC-dimension.
Chapter 7: SRM, MDL, Occam's razor, and consistency.
Chapter 8: Computational complexity of learning and implementing ERM.
Chapter 9: Halfspaces, linear regression, and logistic regression.
Chapter 10: Weak learnability, AdaBoost, and face recognition.
Chapter 11: Cross-validation, hold-out sets, and model selection.
Chapter 12: Convexity, Lipschitzness, and surrogate loss functions.
Chapter 13: Tikhonov regularization and the fitting-stability tradeoff.
Chapter 14: GD, SGD, subgradients, and convergence analysis.
Chapter 15: Hard-SVM, Soft-SVM, margins, and support vectors.
Chapter 16: Feature-space embeddings and the kernel trick.
Chapter 17: One-vs-All, structured output, and ranking.
Chapter 18: Decision-tree algorithms, gain measures, pruning, and random forests.
Chapter 19: k-NN, generalization bounds, and the curse of dimensionality.
Chapter 20: Feedforward networks, backpropagation, and expressive power.
Chapter 21: Online classification, weighted majority, and online convex optimization.
Chapter 22: Linkage, k-means, spectral clustering, and the information bottleneck.
Chapter 23: PCA, random projections, and compressed sensing.
Chapter 24: MLE, Naive Bayes, LDA, the EM algorithm, and Bayesian reasoning.
Chapter 25: Filters, greedy selection, sparsity, and auto-encoders.
Chapter 26: Rademacher complexity, Rademacher calculus, and generalization bounds.
Chapter 27: Covering numbers and chaining for complexity analysis.
Chapter 28: Upper and lower bounds for PAC learning.
Chapter 29: The Natarajan dimension and the multiclass fundamental theorem.
Chapter 30: Compression-based generalization bounds.
Chapter 31: PAC-Bayes bounds for learning theory.
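The ERM rule at the heart of Chapters 2-4 picks, from a hypothesis class, the hypothesis with the smallest error on the training sample. A minimal sketch for a finite class of 1-D threshold classifiers under 0-1 loss; the thresholds and toy sample below are illustrative, not from the book:

```python
# Empirical Risk Minimization (ERM) over a finite hypothesis class.
# Hypotheses are thresholds h_t(x) = 1 iff x >= t; data is a toy sample.

def empirical_risk(h, samples):
    """Fraction of labeled examples (x, y) that hypothesis h misclassifies (0-1 loss)."""
    return sum(1 for x, y in samples if h(x) != y) / len(samples)

def erm(hypotheses, samples):
    """Return a hypothesis minimizing the empirical risk on the sample."""
    return min(hypotheses, key=lambda h: empirical_risk(h, samples))

thresholds = [0.0, 1.0, 2.0, 3.0, 4.0]
hypotheses = [lambda x, t=t: 1 if x >= t else 0 for t in thresholds]

# Toy sample: points below 2 labeled 0, points at or above 2 labeled 1.
samples = [(0.5, 0), (1.5, 0), (2.5, 1), (3.5, 1)]

best = erm(hypotheses, samples)
print(empirical_risk(best, samples))  # the t = 2.0 threshold fits perfectly: 0.0
```

With a finite class, ERM is an exhaustive minimum; the chapters then ask when this sample-based minimum also has small true risk.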
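The descent methods of Chapter 14 can be sketched on a 1-D convex objective. The loss f(w) = (w - 3)^2, step size, and iteration count below are illustrative; returning the average iterate follows the form of the standard convergence analysis for convex problems:

```python
# Fixed-step gradient descent on a convex 1-D objective.
# Objective f(w) = (w - 3)^2, whose gradient is 2(w - 3); minimizer is w* = 3.

def gradient_descent(grad, w0, eta, steps):
    """Run fixed-step gradient descent and return the average iterate."""
    w, total = w0, 0.0
    for _ in range(steps):
        total += w
        w = w - eta * grad(w)
    return total / steps

grad = lambda w: 2.0 * (w - 3.0)
w_bar = gradient_descent(grad, w0=0.0, eta=0.1, steps=200)
print(w_bar)  # close to the minimizer w* = 3
```

SGD (the chapter's main subject) replaces `grad` with an unbiased noisy estimate of the gradient at each step; the update rule itself is unchanged.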
Visualize underfitting vs overfitting as model complexity changes.
Step through gradient descent on 2D loss surfaces.
Interactive support vector machine with margin visualization.
Build decision trees step by step on sample data.
Train a simple feedforward network and watch it learn.
Interactive clustering with animated centroid updates.
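The animated centroid updates in the clustering demo follow Lloyd's k-means iteration (covered in Chapter 22): assign each point to its nearest centroid, then move each centroid to the mean of its assigned points. A minimal sketch; the 1-D toy points, initial centroids, and k = 2 are illustrative:

```python
# Lloyd's algorithm for k-means on toy 1-D data.

def kmeans(points, centroids, iters=20):
    for _ in range(iters):
        # Assignment step: group each point with its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda j: (p - centroids[j]) ** 2)
            clusters[j].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its old centroid).
        centroids = [sum(c) / len(c) if c else centroids[j]
                     for j, c in enumerate(clusters)]
    return centroids

points = [1.0, 1.2, 0.8, 8.0, 8.2, 7.8]
print(kmeans(points, centroids=[0.0, 10.0]))  # converges near the two group means
```

Each iteration never increases the k-means objective (sum of squared distances to the nearest centroid), which is why the animation settles into a fixed configuration.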