18.S096 Matrix Calculus for Machine Learning and Beyond
18.S096 Matrix Calculus for Machine Learning and Beyond (IAP 2023, MIT OCW). Instructors: Prof. Alan Edelman and Prof. Steven G. Johnson. This class covers a coherent approach to matrix calculus, showing techniques that allow you to think of a matrix holistically (not just as an array of scalars), generalize and compute derivatives of important matrix factorizations and many other complicated-looking operations, and understand how differentiation formulas must be reimagined in large-scale computing. We will discuss reverse/adjoint/backpropagation differentiation, custom vector-Jacobian products, and how modern automatic differentiation is more computer science than calculus (it is neither symbolic formulas nor finite differences). (from ocw.mit.edu)
Lecture 05 - Part 3: Differentiation on Computational Graphs
Instructors: Prof. Alan Edelman and Prof. Steven G. Johnson. A very general way to think about the chain rule is to view computations as flowing through "graphs" consisting of nodes (intermediate values) connected by edges (functions acting on those values). When we propagate derivatives through the graph from inputs to outputs, we get the structure of forward-mode automatic differentiation; going from outputs to inputs yields reverse mode, which we will return to in lecture 8.
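The sketch below is a minimal illustration (in Python; the course itself is not tied to this code) of how derivatives can flow through a computational graph from inputs to outputs, i.e. forward mode. Each intermediate node carries its value together with its derivative with respect to the chosen input, and the chain rule is applied edge by edge. The class and function names here are illustrative assumptions, not taken from the lecture.

import math

class Dual:
    """A node value paired with its derivative d(value)/d(input)."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: d(u + v) = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: d(u * v) = u' v + u v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __radd__ = __add__
    __rmul__ = __mul__

def sin(x):
    # Chain rule through the sin edge: d(sin u) = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)

# Example: f(x) = x*sin(x) + x, evaluated and differentiated at x = 2.0.
# Seeding deriv = 1.0 marks x as the input we differentiate with respect to.
x = Dual(2.0, 1.0)
f = x * sin(x) + x
print(f.value)   # f(2)  = 2*sin(2) + 2
print(f.deriv)   # f'(2) = sin(2) + 2*cos(2) + 1

Reverse mode, returned to in Lecture 8, traverses the same graph in the opposite direction: it records the intermediate values on a forward sweep and then propagates output sensitivities back toward the inputs.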