18.S096 Matrix Calculus for Machine Learning and Beyond
18.S096 Matrix Calculus for Machine Learning and Beyond (IAP 2023, MIT OCW). Instructors: Prof. Alan Edelman and Prof. Steven G. Johnson. This class covers a coherent approach to matrix calculus, showing techniques that allow you to think of a matrix holistically (not just as an array of scalars), generalize and compute derivatives of important matrix factorizations and many other complicated-looking operations, and understand how differentiation formulas must be reimagined in large-scale computing. We will discuss reverse/adjoint/backpropagation differentiation, custom vector-Jacobian products, and how modern automatic differentiation is more computer science than calculus (it is neither symbolic formulas nor finite differences). (from ocw.mit.edu)
Lecture 05 - Part 3: Differentiation on Computational Graphs
Instructors: Prof. Alan Edelman and Prof. Steven G. Johnson. A very general way to think about the chain rule is to view computations as flowing through "graphs" consisting of nodes (intermediate values) connected by edges (functions acting on those values). When we propagate derivatives through the graph from inputs to outputs, we get the structure of forward-mode automatic differentiation; going from outputs to inputs yields reverse mode, which we will return to in lecture 8.
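The sketch below is a minimal illustration (in Python; the course itself is not tied to this code) of how derivatives can flow through a computational graph from inputs to outputs, i.e. forward mode. Each intermediate node carries its value together with its derivative with respect to the chosen input, and the chain rule is applied edge by edge. The class and function names here are illustrative assumptions, not taken from the lecture.

import math

class Dual:
    """A node value paired with its derivative d(value)/d(input)."""
    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Sum rule: d(u + v) = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule: d(u * v) = u' v + u v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __radd__ = __add__
    __rmul__ = __mul__

def sin(x):
    # Chain rule through the sin edge: d(sin u) = cos(u) * u'
    return Dual(math.sin(x.value), math.cos(x.value) * x.deriv)

# Example: f(x) = x*sin(x) + x, evaluated and differentiated at x = 2.0.
# Seeding deriv = 1.0 marks x as the input we differentiate with respect to.
x = Dual(2.0, 1.0)
f = x * sin(x) + x
print(f.value)   # f(2)  = 2*sin(2) + 2
print(f.deriv)   # f'(2) = sin(2) + 2*cos(2) + 1

Reverse mode, returned to in Lecture 8, traverses the same graph in the opposite direction: it records the intermediate values on a forward sweep and then propagates output sensitivities back toward the inputs.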