Lecture 28 Summary


Gave the BFGS update formula, and showed that it satisfies at least the third property. Reduced the problem of proving positive-definiteness to showing that the dot product γᵀδ of the change in gradient (γ) with the change in x (δ) must be positive.
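For reference, with δ = x_{k+1} − x_k and γ = ∇f(x_{k+1}) − ∇f(x_k) as above, the update of the approximate Hessian B_k takes the standard form

    B_{k+1} = B_k - \frac{B_k \delta \delta^T B_k}{\delta^T B_k \delta} + \frac{\gamma \gamma^T}{\gamma^T \delta} ,

and multiplying through by δ shows immediately that B_{k+1}δ = γ (the secant condition), while symmetry is inherited from B_k.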

Explained why an exact line search leads to positive γᵀδ and hence positive-definite approximate Hessians, why an approximate line search can usually be defined to enforce this (cf. the Wolfe conditions), and what to do when this doesn't happen.
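To spell out the standard argument: for a step δ = αp along a descent direction p (so that ∇f(x)ᵀp < 0 and α > 0), the curvature (second Wolfe) condition ∇f(x+αp)ᵀp ≥ c₂ ∇f(x)ᵀp with c₂ < 1 gives

    \gamma^T \delta = \alpha \left[ \nabla f(x + \alpha p) - \nabla f(x) \right]^T p \ge \alpha (c_2 - 1) \nabla f(x)^T p > 0 ,

and an exact line search is the limiting case ∇f(x+αp)ᵀp = 0, for which γᵀδ = −α ∇f(x)ᵀp > 0.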

Noted that the BFGS update for the approximate Hessian can be transformed into a similar update for the inverse Hessian, using the Sherman-Morrison formula for rank-1 updates. Briefly derived this formula.
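Concretely, Sherman–Morrison states that for an invertible matrix A and vectors u, v with 1 + vᵀA⁻¹u ≠ 0,

    (A + u v^T)^{-1} = A^{-1} - \frac{A^{-1} u v^T A^{-1}}{1 + v^T A^{-1} u} ,

and applying it once per rank-1 term of the BFGS update gives the familiar update for the inverse H_k = B_k^{-1}, with ρ = 1/(γᵀδ):

    H_{k+1} = (I - \rho \delta \gamma^T) H_k (I - \rho \gamma \delta^T) + \rho \delta \delta^T .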

Briefly discussed the connection of the BFGS update to minimizing a weighted Frobenius norm of the change in the inverse Hessian approximation.
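Schematically, H_{k+1} solves

    \min_H \; \| H - H_k \|_W \quad \text{subject to } H = H^T, \; H \gamma = \delta ,

where ‖·‖_W is a Frobenius norm weighted by a matrix W satisfying Wδ = γ (e.g. an averaged Hessian): the new approximation is the symmetric matrix closest to H_k that is consistent with the secant condition.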

Showed that the same considerations, applied to the inverse Hessian with the roles of δ and γ interchanged, lead to another possible update formula, the DFP update (which minimizes a change in the Hessian approximation rather than its inverse). In practice, BFGS seems to work better than DFP for this problem.
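Explicitly, the DFP update of the inverse Hessian is

    H_{k+1} = H_k - \frac{H_k \gamma \gamma^T H_k}{\gamma^T H_k \gamma} + \frac{\delta \delta^T}{\gamma^T \delta} ,

the same algebraic form as the BFGS update of B_k above with B ↔ H and δ ↔ γ exchanged.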

Briefly discussed the limited-memory BFGS (L-BFGS) algorithm, and applications to sequential quadratic programming (SQP).
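As an illustration of the limited-memory idea, here is a minimal sketch (variable names are my own, not code from the lecture) of the standard L-BFGS two-loop recursion, which applies the inverse-Hessian approximation to a gradient using only the m most recent (δ, γ) pairs, never forming a matrix:

    import numpy as np

    def lbfgs_direction(grad, deltas, gammas):
        # deltas[i], gammas[i]: the i-th stored pair (oldest first), with
        # deltas[i] = change in x and gammas[i] = change in gradient.
        q = grad.copy()
        rhos = [1.0 / (g @ d) for d, g in zip(deltas, gammas)]
        alphas = []
        # First loop: most recent pair to oldest.
        for d, g, rho in reversed(list(zip(deltas, gammas, rhos))):
            a = rho * (d @ q)
            alphas.append(a)
            q -= a * g
        # Scale by a common choice of initial inverse Hessian,
        # H0 = (deltaᵀgamma / gammaᵀgamma) I, using the newest pair.
        if deltas:
            q *= (deltas[-1] @ gammas[-1]) / (gammas[-1] @ gammas[-1])
        # Second loop: oldest pair to most recent.
        for (d, g, rho), a in zip(zip(deltas, gammas, rhos), reversed(alphas)):
            b = rho * (g @ q)
            q += (a - b) * d
        return q  # approximately H_k @ grad; the search direction is -q

The search direction −q is then combined with a Wolfe line search, which (per the discussion above) keeps γᵀδ > 0 for each newly stored pair.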