Mathematics for Machine Learning

Chapters

1 About

2 Notation

3 Linear Algebra

3.1 Vector spaces

  • 3.1.1 Euclidean space
  • 3.1.2 Subspaces

3.2 Linear maps

  • 3.2.1 The matrix of a linear map
  • 3.2.2 Nullspace, range

3.3 Metric spaces

3.4 Normed spaces

3.5 Inner product spaces

  • 3.5.1 Pythagorean Theorem
  • 3.5.2 Cauchy-Schwarz inequality
  • 3.5.3 Orthogonal complements and projections

3.6 Eigenthings

3.7 Trace

3.8 Determinant

3.9 Orthogonal matrices

3.10 Symmetric matrices

  • 3.10.1 Rayleigh quotients

3.11 Positive (semi-)definite matrices

  • 3.11.1 The geometry of positive definite quadratic forms

3.12 Singular value decomposition

3.13 Fundamental Theorem of Linear Algebra

3.14 Operator and matrix norms

3.15 Low-rank approximation

3.16 Pseudoinverses

3.17 Some useful matrix identities

  • 3.17.1 Matrix-vector product as linear combination of matrix columns
  • 3.17.2 Sum of outer products as matrix-matrix product
  • 3.17.3 Quadratic forms

4 Calculus and Optimization

4.1 Extrema

4.2 Gradients

4.3 The Jacobian

4.4 The Hessian

4.5 Matrix calculus

  • 4.5.1 The chain rule

4.6 Taylor’s theorem

4.7 Conditions for local minima

4.8 Convexity

  • 4.8.1 Convex sets
  • 4.8.2 Basics of convex functions
  • 4.8.3 Consequences of convexity
  • 4.8.4 Showing that a function is convex
  • 4.8.5 Examples

5 Probability

5.1 Basics

  • 5.1.1 Conditional probability
  • 5.1.2 Chain rule
  • 5.1.3 Bayes’ rule

5.2 Random variables

  • 5.2.1 The cumulative distribution function
  • 5.2.2 Discrete random variables
  • 5.2.3 Continuous random variables
  • 5.2.4 Other kinds of random variables

5.3 Joint distributions

  • 5.3.1 Independence of random variables
  • 5.3.2 Marginal distributions

5.4 Great Expectations

  • 5.4.1 Properties of expected value

5.5 Variance

  • 5.5.1 Properties of variance
  • 5.5.2 Standard deviation

5.6 Covariance

  • 5.6.1 Correlation

5.7 Random vectors

5.8 Estimation of Parameters

  • 5.8.1 Maximum likelihood estimation
  • 5.8.2 Maximum a posteriori estimation

5.9 The Gaussian distribution

  • 5.9.1 The geometry of multivariate Gaussians

Source: gwthomas.github.io/docs/math4ml.pdf