18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning
18.065 Matrix Methods in Data Analysis, Signal Processing, and Machine Learning (Spring 2018, MIT OCW). Instructor: Prof. Gilbert Strang. Linear algebra concepts are key for understanding and creating machine learning algorithms, especially as applied to deep learning and neural networks. This course reviews linear algebra with applications to probability and statistics and optimization, and above all gives a full explanation of deep learning. (from ocw.mit.edu)
Lecture 27 - Backpropagation: Find Partial Derivatives
In this lecture, Professor Strang presents Professor Sra's theorem on the convergence of stochastic gradient descent (SGD). He then reviews backpropagation, a method for computing derivatives quickly using the chain rule.
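The idea behind backpropagation can be sketched numerically. The following is a minimal illustration, not code from the lecture: a tiny one-hidden-layer network whose partial derivatives are computed by applying the chain rule backward from the loss, then checked against a finite-difference estimate. All variable names (`W1`, `W2`, the tanh activation, the squared-error loss) are illustrative choices, not from the source.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3)        # input vector
W1 = rng.standard_normal((4, 3))  # hidden-layer weights
W2 = rng.standard_normal((1, 4))  # output-layer weights
y = 1.0                           # target output

# Forward pass: z = W1 x, a = tanh(z), out = W2 a, loss = (out - y)^2 / 2
z = W1 @ x
a = np.tanh(z)
out = (W2 @ a)[0]
loss = 0.5 * (out - y) ** 2

# Backward pass: chain rule applied from the loss back to each weight
dout = out - y                    # dL/dout
dW2 = dout * a[None, :]           # dL/dW2 = dL/dout * dout/dW2
da = dout * W2[0]                 # dL/da
dz = da * (1.0 - a ** 2)          # dL/dz  (tanh' = 1 - tanh^2)
dW1 = np.outer(dz, x)             # dL/dW1 = dL/dz * dz/dW1

# Check one partial derivative against a central finite difference
eps = 1e-6
def loss_at(w00):
    W1p = W1.copy()
    W1p[0, 0] = w00
    return 0.5 * ((W2 @ np.tanh(W1p @ x))[0] - y) ** 2

num = (loss_at(W1[0, 0] + eps) - loss_at(W1[0, 0] - eps)) / (2 * eps)
print(abs(dW1[0, 0] - num))       # tiny: backprop agrees with numerics
```

The point of the lecture is that the backward pass reuses the forward pass's intermediate values (`z`, `a`, `out`), so all partial derivatives come out in roughly the cost of one extra forward pass.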