# Mathematics for Machine Learning

## Chapters

2 Notation

3 Linear Algebra

3.1 Vector spaces

• 3.1.1 Euclidean space
• 3.1.2 Subspaces

3.2 Linear maps

• 3.2.1 The matrix of a linear map
• 3.2.2 Nullspace, range

3.3 Metric spaces

3.4 Normed spaces

3.5 Inner product spaces

• 3.5.1 Pythagorean Theorem
• 3.5.2 Cauchy-Schwarz inequality
• 3.5.3 Orthogonal complements and projections

3.6 Eigenthings

3.7 Trace

3.8 Determinant

3.9 Orthogonal matrices

3.10 Symmetric matrices

• 3.10.1 Rayleigh quotients

3.11 Positive (semi-)definite matrices

• 3.11.1 The geometry of positive definite quadratic forms

3.12 Singular value decomposition

3.13 Fundamental Theorem of Linear Algebra

3.14 Operator and matrix norms

3.15 Low-rank approximation

3.16 Pseudoinverses

3.17 Some useful matrix identities

• 3.17.1 Matrix-vector product as linear combination of matrix columns
• 3.17.2 Sum of outer products as matrix-matrix product

4 Calculus and Optimization

4.1 Extrema

4.3 The Jacobian

4.4 The Hessian

4.5 Matrix calculus

• 4.5.1 The chain rule

4.6 Taylor’s theorem

4.7 Conditions for local minima

4.8 Convexity

• 4.8.1 Convex sets
• 4.8.2 Basics of convex functions
• 4.8.3 Consequences of convexity
• 4.8.4 Showing that a function is convex
• 4.8.5 Examples

5 Probability

5.1 Basics

• 5.1.1 Conditional probability
• 5.1.2 Chain rule
• 5.1.3 Bayes’ rule

5.2 Random variables

• 5.2.1 The cumulative distribution function
• 5.2.2 Discrete random variables
• 5.2.3 Continuous random variables
• 5.2.4 Other kinds of random variables

5.3 Joint distributions

• 5.3.1 Independence of random variables
• 5.3.2 Marginal distributions

5.4 Great Expectations

• 5.4.1 Properties of expected value

5.5 Variance

• 5.5.1 Properties of variance
• 5.5.2 Standard deviation

5.6 Covariance

• 5.6.1 Correlation

5.8 Estimation of Parameters

• 5.8.1 Maximum likelihood estimation
• 5.8.2 Maximum a posteriori estimation

5.9 The Gaussian distribution

• 5.9.1 The geometry of multivariate Gaussians