InfoCoBuild

18.S096 Matrix Calculus for Machine Learning and Beyond

18.S096 Matrix Calculus for Machine Learning and Beyond (IAP 2023, MIT OCW). Instructors: Prof. Alan Edelman and Prof. Steven G. Johnson. This class covers a coherent approach to matrix calculus showing techniques that allow you to think of a matrix holistically (not just as an array of scalars), generalize and compute derivatives of important matrix factorizations and many other complicated-looking operations, and understand how differentiation formulas must be reimagined in large-scale computing. We will discuss reverse/adjoint/backpropagation differentiation, custom vector-Jacobian products, and how modern automatic differentiation is more computer science than calculus (it is neither symbolic formulas nor finite differences). (from ocw.mit.edu)

Lecture 05 - Part3: Differentiation on Computational Graphs

Instructors: Prof. Alan Edelman and Prof. Steven G. Johnson. A very general way to think about the chain rule is to view computations as flowing through "graphs" consisting of nodes (intermediate values) connected by edges (functions acting on those values). When we propagate derivatives through the graph from inputs to outputs, we get the structure of forward-mode automatic differentiation; going from outputs to inputs yields reverse mode, which we will return to in Lecture 8.
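The two traversal directions can be illustrated with a minimal Python sketch. This is not taken from the lecture: the function f(x1, x2) = x1 * x2 + sin(x1), the node names a, b, y, and the helper names forward_mode and reverse_mode are all hypothetical, chosen only to make the graph small enough to differentiate by hand.

```python
import math

# Hypothetical example function: f(x1, x2) = x1 * x2 + sin(x1).
# Graph nodes (intermediate values): a = x1 * x2, b = sin(x1), y = a + b.

def forward_mode(x1, x2, dx1, dx2):
    """Inputs to outputs: carry a tangent (dx1, dx2) alongside each value."""
    a, da = x1 * x2, dx1 * x2 + x1 * dx2      # product rule at node a
    b, db = math.sin(x1), math.cos(x1) * dx1  # chain rule at node b
    y, dy = a + b, da + db                    # sum rule at node y
    return y, dy

def reverse_mode(x1, x2):
    """Outputs to inputs: evaluate the graph once, then pull adjoints back."""
    # Forward sweep: compute the node values.
    a = x1 * x2
    b = math.sin(x1)
    y = a + b
    # Backward sweep: start from dy/dy = 1 and visit nodes in reverse order.
    y_bar = 1.0
    a_bar = y_bar                               # y = a + b, so dy/da = 1
    b_bar = y_bar                               # y = a + b, so dy/db = 1
    x1_bar = a_bar * x2 + b_bar * math.cos(x1)  # x1 feeds both a and b
    x2_bar = a_bar * x1                         # x2 feeds only a
    return y, (x1_bar, x2_bar)

x1, x2 = 2.0, 3.0
# Forward mode needs one pass per input direction.
_, df_dx1 = forward_mode(x1, x2, 1.0, 0.0)
_, df_dx2 = forward_mode(x1, x2, 0.0, 1.0)
# Reverse mode gets the whole gradient of the scalar output in one backward sweep.
y, grad = reverse_mode(x1, x2)
```

In this sketch, forward mode requires one pass per input to assemble the gradient, while reverse mode recovers both partial derivatives from a single backward sweep, which is why reverse mode tends to be preferred for functions with many inputs and a scalar output.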


Go to the Course Home or watch other lectures:

Lecture 01 - Part1: Introduction and Motivation
Lecture 01 - Part2: Derivatives as Linear Operators
Lecture 02 - Part1: Derivatives in Higher Dimensions: Jacobians and Matrix Functions
Lecture 02 - Part2: Vectorization of Matrix Functions
Lecture 03 - Part1: Kronecker Products and Jacobians
Lecture 03 - Part2: Finite Difference Approximations
Lecture 04 - Part1: Gradients and Inner Products in Other Vector Spaces
Lecture 04 - Part2: Nonlinear Root Finding, Optimization, and Adjoint Gradient Methods
Lecture 05 - Part1: Derivative of Matrix Determinant and Inverse
Lecture 05 - Part2: Forward Automatic Differentiation via Dual Numbers
Lecture 05 - Part3: Differentiation on Computational Graphs
Lecture 06 - Part1: Adjoint Differentiation of ODE Solutions
Lecture 06 - Part2: Calculus of Variations and Gradients of Functionals
Lecture 07 - Part1: Derivatives of Random Functions
Lecture 07 - Part2: Second Derivatives, Bilinear Forms, and Hessian Matrices
Lecture 08 - Part1: Derivatives of Eigenproblems
Lecture 08 - Part2: Automatic Differentiation on Computational Graphs