Analyzing Inexact Hypergradients for Bilevel Learning [slides available]


Estimating hyperparameters is an important and long-standing problem in machine learning. We consider the case where hyperparameter estimation can be formulated as a bilevel optimization problem. One difficulty with this formulation is that the exact gradient with respect to the hyperparameters cannot be computed and must instead be approximated. We provide an analysis of hypergradient estimation in a flexible framework which allows using both automatic differentiation or the Implicit Function Theorem, and demonstrate the relative performance of several approaches. This is joint work with Matthias Ehrhardt (Bath, UK).

9 Dec 2022
Lindon Roberts

My research is in numerical analysis, particularly nonconvex and derivative-free optimization.