Sitemap

A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.

Pages

Posts

Future Blog Post

less than 1 minute read

Published:

This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
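For reference, a minimal sketch of the relevant Jekyll setting (only the line in question is shown; the rest of the configuration file is omitted):

```yaml
# _config.yml -- Jekyll site configuration
# With future set to false, posts dated in the future are not published.
future: false
```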

Blog Post number 4

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 3

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 2

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Blog Post number 1

less than 1 minute read

Published:

This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.

Portfolio

Publications

Software

trustregion

Python routines for solving trust-region subproblems
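For context, the standard form of the trust-region subproblem is: given a gradient $g$, a symmetric (possibly indefinite) Hessian $H$ and a radius $\Delta > 0$,

$$\min_{s \in \mathbb{R}^n} \; g^T s + \tfrac{1}{2} s^T H s \quad \text{subject to} \quad \|s\| \le \Delta.$$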

DFBGN

Derivative-free block Gauss-Newton method (for large-scale nonlinear least-squares problems)

Download Paper

Talks

Derivative-Free Optimisation Methods for Nonlinear Least-Squares Problems

Published:

Derivative-free optimisation (DFO) algorithms are a category of optimisation methods for situations when one is unable to compute or estimate derivatives of the objective. The need for DFO arises in applications where techniques such as finite differencing or algorithmic differentiation are inaccurate or impractical, such as when the objective has noise (e.g. Monte Carlo simulations in finance) or is very expensive (e.g. climate simulations).

Improving the efficiency of derivative-free methods for nonlinear least squares problems

Published:

We present DFO-GN, a derivative-free version of the Gauss-Newton method for solving nonlinear least-squares problems. As is common in derivative-free optimization, DFO-GN uses interpolation of function values to build a model of the objective, which is then used within a trust-region framework to give a globally-convergent algorithm requiring $O(\epsilon^{-2})$ iterations to reach approximate first-order criticality within tolerance $\epsilon$. This algorithm is a simplification of the method by Zhang, Conn and Scheinberg (SIOPT, 2010), where we replace quadratic models for each residual with linear models. We demonstrate that DFO-GN performs comparably to the method of Zhang et al. in terms of objective evaluations, as well as having a substantially faster runtime and improved scalability. Finally, we present two extensions of DFO-GN designed to improve its performance on large-scale and noisy problems respectively.
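As a sketch of the construction described above: with a linear interpolation model $r(x_k + s) \approx r(x_k) + J_k s$ for the residual vector (no derivatives of $r$ are evaluated), the trust-region subproblem solved at iteration $k$ is the Gauss-Newton-type problem

$$\min_{s \in \mathbb{R}^n} \; \|r(x_k) + J_k s\|^2 \quad \text{subject to} \quad \|s\| \le \Delta_k.$$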

A flexible, robust and efficient derivative-free solver for least-squares

Published:

We present DFO-LS, a software package for derivative-free optimization (DFO) for nonlinear least-squares problems, that has simplified models, flexible initialization and improved robustness to noise. Inspired by the Gauss-Newton method, DFO-LS constructs simplified linear regression models for the residuals. DFO-LS also has improved flexibility for expensive problems, whereby it can begin making progress from as few as two objective evaluations. Numerical results show DFO-LS can gain reasonable progress on some medium-scale problems with fewer objective evaluations than is needed for one gradient evaluation. For noisy problems, DFO-LS allows a wide variety of sample averaging methodologies, the construction of highly overdetermined regression models, and restart strategies. Our extensive numerical experimentation shows that restarting the solver when stagnation is detected is a cheap and effective mechanism for achieving robustness, with superior performance. We also discuss our package Py-BOBYQA, a Python implementation of BOBYQA (Powell, 2009) which implements some of these features for general objective problems.
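A minimal usage sketch for DFO-LS, assuming the package is installed as dfols and exposes a solve(objfun, x0) entry point taking a function that returns the residual vector (the attributes of the returned solution object are also assumptions):

```python
# Hypothetical minimal example of calling DFO-LS on a 2D least-squares problem.
import numpy as np
import dfols  # assumes the DFO-LS package is installed under the name 'dfols'

def residuals(x):
    # Rosenbrock-style residual vector r(x); the solver minimises ||r(x)||^2
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

x0 = np.array([-1.2, 1.0])          # starting point
soln = dfols.solve(residuals, x0)   # derivative-free Gauss-Newton-type solve
print(soln.x)                       # approximate minimiser, expected near [1, 1]
```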

Improving the Flexibility and Robustness of Derivative-Free Optimization Solvers

Published:

Classical nonlinear optimization algorithms require the availability of gradient evaluations for constructing local approximations to the objective and testing for convergence. In settings where the objective is expensive to evaluate or noisy, evaluating the gradient may be too expensive or inaccurate, so cannot be used; we must turn to optimization methods which do not require gradient information, so-called derivative-free optimization (DFO). In this talk, I will introduce DFO and discuss two new software packages for DFO: DFO-LS for nonlinear least-squares problems, and Py-BOBYQA (a Python implementation of Powell’s BOBYQA) for general minimization problems. I will describe their novel features aimed at expensive and/or noisy problems, and show their state-of-the-art performance. Time permitting, I will also show a heuristic method which helps Py-BOBYQA to escape local minima, and show its favorable performance on global optimization problems.
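A minimal usage sketch for Py-BOBYQA on a general scalar-valued objective, assuming the package is installed as pybobyqa and exposes a solve(objfun, x0) entry point (the attributes of the returned solution object are also assumptions):

```python
# Hypothetical minimal example of calling Py-BOBYQA for general minimisation.
import numpy as np
import pybobyqa  # assumes the Py-BOBYQA package is installed under the name 'pybobyqa'

def objective(x):
    # A scalar objective: the 2D Rosenbrock function
    return 100.0 * (x[1] - x[0] ** 2) ** 2 + (1.0 - x[0]) ** 2

x0 = np.array([-1.2, 1.0])
soln = pybobyqa.solve(objective, x0)  # derivative-free model-based solve
print(soln.x)                         # approximate minimiser, expected near [1, 1]
```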

Improving the Flexibility and Robustness of Derivative-Free Optimisation Solvers

Published:

Classical nonlinear optimisation algorithms require the availability of gradient evaluations for constructing local approximations to the objective and testing for convergence. In settings where the objective is expensive to evaluate or noisy, evaluating the gradient may be too expensive or inaccurate, so cannot be used; we must turn to optimisation methods which do not require gradient information, so-called derivative-free optimisation (DFO). This has applications in areas such as climate modelling, hyperparameter tuning and generating adversarial examples in deep learning. In this talk, I will introduce DFO and discuss two new software packages for DFO for nonlinear least-squares problems and for general minimisation problems. I will describe their novel features aimed at expensive and/or noisy problems, and show their state-of-the-art performance. Time permitting, I will also show a heuristic method which improves the ability of these methods to escape local minima, and show its favourable performance on global optimisation problems.

Improving the efficiency and robustness of black-box optimisation

Published:

In classical nonlinear optimisation, the availability of first-order information is crucial to constructing accurate local models for the objective and finding descent directions. However, when the objective function is black-box and computationally expensive or stochastic, gradients may be too expensive to compute or too inaccurate to be useful. In this setting, derivative-free optimisation (DFO) provides an alternative class of algorithm. In this talk, I will present new techniques for model-based DFO, which yield an improvement in the efficiency and robustness of existing methods for general minimisation and nonlinear least-squares problems.

Improving the scalability of derivative-free optimization for nonlinear least-squares problems

Published:

In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.

Improving the scalability of derivative-free optimization for nonlinear least-squares problems

Published:

In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.

Improving the scalability of derivative-free optimization for nonlinear least-squares problems

Published:

In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.

Derivative-Free Algorithms for Nonconvex Optimisation

Published:

Classical nonconvex optimisation algorithms require the availability of gradient evaluations for constructing local approximations to the objective function and testing for convergence. In settings where the objective is expensive to evaluate and/or noisy, evaluating its gradient may be too expensive or inaccurate, so cannot be used; we must turn to optimisation methods which do not require gradient information, so-called derivative-free optimisation (DFO). DFO has applications in areas such as finance, climate modelling and machine learning. In this talk, I will introduce DFO methods and recent progress in the theory and implementation of these methods, with a particular focus on least-squares problems (e.g. parameter fitting).

Improving the efficiency and robustness of black-box optimisation

Published:

In classical nonlinear optimisation, the availability of first-order information is crucial to constructing accurate local models for the objective and finding descent directions. However, when the objective function is black-box, computationally expensive and/or stochastic - which occurs in a variety of practical settings - gradients may be too expensive to compute or too inaccurate to be useful. In this setting, derivative-free optimisation (DFO) provides an alternative class of algorithm, which has grown rapidly in popularity and maturity in recent years. In this talk, I will present new techniques for model-based DFO, which yield an improvement in the efficiency and robustness of existing methods. My focus will be on both general minimisation and nonlinear least-squares problems specifically.

Improving the scalability of model-based derivative-free optimization

Published:

Derivative-free optimization (DFO) methods are an important class of optimization routines for many problems in data science, such as hyperparameter optimization and adversarial attacks for neural networks. However, in model-based DFO methods, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations. I will also discuss how this approach may be extended to DFO for general nonlinear optimization problems.

Download Slides

Derivative-free optimisation for least-squares problems

Published:

Least-squares problems (such as parameter estimation) are ubiquitous across quantitative disciplines. Optimisation algorithms for solving such problems are numerous and well-established. However, in cases where models are computationally expensive, black box, or noisy, classical algorithms can be impractical or even fail. Derivative-free optimisation (DFO) methods provide an alternative approach which can handle these settings. In this talk, Lindon will introduce a derivative-free version of the classical Gauss-Newton method, discuss its theoretical guarantees and software implementation, and describe applications of this technique to parameter estimation of global climate models and image reconstruction.

Download Slides

Inexact Derivative-Free Optimisation for Bilevel Learning

Published:

When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence is able to be implemented in practice. Using problems from image analysis, we demonstrate that our approach dramatically reduces the computational requirements of bilevel learning. This is joint work with Matthias Ehrhardt (Bath).
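As a schematic of the bilevel problem described above (the notation here is illustrative, not taken from the talk): with regularisation parameters $\theta$, training pairs $(y_i, x_i^\dagger)$ of measurements and ground-truth reconstructions, and variational energy $\Phi$,

$$\min_{\theta} \; \sum_i \big\|\hat{x}_i(\theta) - x_i^\dagger\big\|^2 \qquad \text{where} \qquad \hat{x}_i(\theta) = \arg\min_{x} \; \Phi(x, y_i, \theta),$$

and the practical difficulty is that each $\hat{x}_i(\theta)$ can only be computed inexactly.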

Download Slides

Scalable Derivative-Free Optimization for Nonlinear Least-Squares Problems

Published:

Derivative-free optimisation (DFO) methods are an important class of optimisation routines with applications in areas such as in image analysis and data science. However, in model-based DFO methods, the computational cost of constructing local models can be high, particularly for large-scale problems. Considering nonlinear least-squares problems, we improve on state-of-the-art DFO by performing dimensionality reduction in the observational space using sketching methods, avoiding the construction of a full local model. Our approach has a per-iteration computational cost which is linear in problem dimension in a big data regime, and numerical evidence demonstrates that, compared to existing software, it has dramatically improved runtime performance on overdetermined least-squares problems. This is joint work with Coralia Cartis and Tyler Ferguson (University of Oxford).
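Schematically, the dimensionality reduction in the observational space replaces the full residual vector $r(x) \in \mathbb{R}^m$ with a randomly sketched one (the precise way the sketch enters the method is not specified here):

$$\min_{x} \; \|r(x)\|^2 \quad \longrightarrow \quad \min_{x} \; \|S\,r(x)\|^2, \qquad S \in \mathbb{R}^{m' \times m}, \; m' \ll m.$$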

Download Slides

Block Methods for Scalable Derivative-Free Optimisation

Published:

Derivative-free optimisation (DFO) methods are an important class of optimisation routines with applications in areas such as in image analysis and data science. However, in model-based DFO methods, the computational cost of constructing local models can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.

Download Slides

Inexact Derivative-Free Optimization for Bilevel Learning

Published:

When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence is able to be implemented in practice. Using problems from image analysis, we demonstrate that our approach dramatically reduces the computational requirements of bilevel learning. This is joint work with Matthias Ehrhardt (Bath).

Download Slides

Large-Scale Derivative-Free Optimization using Subspace Methods

Published:

In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a model-based derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results.

Download Slides

Large-Scale Derivative-Free Optimization using Subspace Methods

Published:

In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a model-based derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results.

Download Slides

Large-Scale Derivative-Free Optimization using Subspace Methods

Published:

In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a model-based derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results.

Download Slides

Scalable Subspace Methods for Derivative-Free Nonlinear Least-Squares Optimization

Published:

When optimizing functions which are computationally expensive and/or noisy, gradient information is often impractical to obtain or inaccurate. As a result, so-called ‘derivative-free’ optimization methods are a suitable alternative. In existing techniques for derivative-free optimization, the linear algebra cost of constructing function approximations can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a new derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results. This is joint work with Coralia Cartis (Oxford).

Download Slides

Inexact Derivative-Free Optimization for Bilevel Learning

Published:

When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence can be implemented in practice. This is joint work with Matthias Ehrhardt (Bath, UK).

Download Slides

Inexact Derivative-Free Optimization for Bilevel Learning

Published:

When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence is able to be implemented in practice.

Download Slides

Derivative-Free Optimization with Convex Constraints

Published:

When optimizing functions which are computationally expensive and/or noisy, gradient information is often impractical to obtain or inaccurate. As a result, so-called ‘derivative-free’ optimization (DFO) methods are a suitable alternative. In this talk, I will show how existing methods for interpolation-based DFO can be extended to nonconvex problems with convex constraints, accessed only through projections. I will introduce a worst-case complexity analysis and show how existing geometric considerations of model accuracy (from the unconstrained setting) can be generalized to the constrained case. I will then show numerical results in the case of nonlinear least-squares optimization. This is joint work with Matthew Hough (University of Queensland and University of Waterloo).
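To illustrate what ‘accessed only through projections’ means, here is a small self-contained sketch (illustrative only, not the algorithm from the talk) in which a convex feasible set, here a box, is used solely via its Euclidean projection to keep trial points feasible:

```python
# Illustrative only: a convex feasible set accessed solely via its projection.
import numpy as np

def project_onto_box(x, lower, upper):
    # Euclidean projection onto the box {x : lower <= x <= upper}
    return np.minimum(np.maximum(x, lower), upper)

# A trial step from a derivative-free method is made feasible by projecting back:
x_current = np.array([0.5, 2.0])
trial_step = np.array([1.0, 1.5])
lower, upper = np.array([0.0, 0.0]), np.array([1.0, 3.0])
x_trial = project_onto_box(x_current + trial_step, lower, upper)
print(x_trial)  # the infeasible trial point [1.5, 3.5] is projected back to [1.0, 3.0]
```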

Download Slides

Derivative-Free Optimization with Convex Constraints

Published:

When optimizing functions which are computationally expensive and/or noisy, gradient information is often impractical to obtain or inaccurate. As a result, so-called ‘derivative-free’ optimization (DFO) methods are a suitable alternative. In this talk, I will show how existing methods for interpolation-based DFO can be extended to nonconvex problems with convex constraints, accessed only through projections. I will introduce a worst-case complexity analysis and show how existing geometric considerations of model accuracy (from the unconstrained setting) can be generalized to the constrained case. I will then show numerical results in the case of nonlinear least-squares optimization. This is joint work with Matthew Hough (University of Queensland and University of Waterloo).

Download Slides

Large-scale derivative-free optimization using random subspace methods

Published:

Many standard optimization algorithms require being able to cheaply and accurately compute derivatives for the objective and/or constraint functions. However, in the presence of noise, or computationally expensive or black-box procedures, derivative information may be inaccurate or impractical to compute. Derivative-Free Optimization (DFO) encompasses a variety of techniques for nonlinear optimization in the absence of derivatives. However, such techniques can struggle on large-scale problems for reasons including high linear algebra costs and strong dimension-dependency of worst-case complexity bounds. In this talk, I will discuss model-based and direct search DFO algorithms based on iterative searches in randomly drawn subspaces and show how these methods can be used to improve the scalability of DFO. This is joint work with Coralia Cartis (Oxford) and Clément Royer (Paris Dauphine-PSL).
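A schematic of the core idea (illustrative only, not the algorithms from the talk): draw a thin random matrix and restrict the objective to the resulting low-dimensional subspace around the current iterate:

```python
# Illustrative sketch of restricting an objective to a randomly drawn subspace.
import numpy as np

def restrict_to_random_subspace(f, x, p, rng):
    # Draw a random n-by-p matrix; its columns span the subspace explored from x
    n = x.size
    P = rng.standard_normal((n, p)) / np.sqrt(p)
    # Reduced objective in p variables: u -> f(x + P u)
    return lambda u: f(x + P @ u), P

rng = np.random.default_rng(0)
f = lambda x: np.sum(x ** 2)          # toy objective in n = 100 variables
x = np.ones(100)
f_reduced, P = restrict_to_random_subspace(f, x, p=5, rng=rng)
print(f_reduced(np.zeros(5)))         # equals f(x) = 100.0 at the subspace origin
```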

Download Slides

Large-scale derivative-free optimization using random subspace methods

Published:

Many standard optimization algorithms require being able to cheaply and accurately compute derivatives for the objective and/or constraint functions. However, in the presence of noise, or computationally expensive or black-box procedures, derivative information may be inaccurate or impractical to compute. Derivative-Free Optimization (DFO) encompasses a variety of techniques for nonlinear optimization in the absence of derivatives. However, such techniques can struggle on large-scale problems for reasons including high linear algebra costs and strong dimension-dependency of worst-case complexity bounds. In this talk, I will discuss model-based and direct search DFO algorithms based on iterative searches in randomly drawn subspaces and show how these methods can be used to improve the scalability of DFO. This is joint work with Coralia Cartis (Oxford) and Clément Royer (Paris Dauphine-PSL).

Black-Box Optimisation Techniques for Complex Systems

Published:

I will introduce some techniques for optimisation suitable for complex systems, such as those involving significant computation or random noise. I will also discuss some applications of these ideas, focusing specifically on optimising data acquisition processes, drawing on examples from medical imaging.

Download Slides

Analyzing Inexact Hypergradients for Bilevel Learning

Published:

Estimating hyperparameters is an important and long-standing problem in machine learning. We consider the case where hyperparameter estimation can be formulated as a bilevel optimization problem. One difficulty with this formulation is that the exact gradient with respect to the hyperparameters cannot be computed and must instead be approximated. We provide an analysis of hypergradient estimation in a flexible framework which allows using both automatic differentiation or the Implicit Function Theorem, and demonstrate the relative performance of several approaches. This is joint work with Matthias Ehrhardt (Bath, UK).
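For context, the Implicit Function Theorem route mentioned above gives, schematically: if the lower-level solution is $x^*(\theta) = \arg\min_x g(x, \theta)$ and the upper-level objective is $F(\theta) = f(x^*(\theta))$, then under suitable smoothness and invertibility assumptions

$$\nabla F(\theta) = -\big(\partial_\theta \nabla_x g(x^*(\theta), \theta)\big)^{T} \big[\nabla_{xx}^2 g(x^*(\theta), \theta)\big]^{-1} \nabla f\big(x^*(\theta)\big),$$

and in practice both $x^*(\theta)$ and the inner linear solve are available only approximately, which is the source of the inexactness being analysed.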

Download Slides

Analyzing Inexact Hypergradients for Bilevel Learning

Published:

Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modelled as the solution to an optimization problem. In this case, the exact gradient with respect to the hyperparameters cannot be computed and approximate strategies are required. We provide an analysis of two classes of methods based on inexact automatic differentiation or approximate implicit differentiation. Our analysis reveals that these two strategies are actually tightly connected, and we derive a priori and a posteriori estimates for both methods which can be used to bound computations and gain further insight into what their accuracy actually depends on.

Download Slides

Large-scale derivative-free optimization using random subspace methods

Published:

Many standard optimization algorithms require being able to cheaply and accurately compute derivatives for the objective and/or constraint functions. However, in the presence of noise, or computationally expensive or black-box procedures, derivative information may be inaccurate or impractical to compute. Derivative-Free Optimization (DFO) encompasses a variety of techniques for nonlinear optimization in the absence of derivatives. However, such techniques can struggle on large-scale problems for reasons including high linear algebra costs and strong dimension-dependency of worst-case complexity bounds. In this talk, I will discuss model-based and direct search DFO algorithms based on iterative searches in randomly drawn subspaces and show how these methods can be used to improve the scalability of DFO. This is joint work with Coralia Cartis (Oxford) and Clément Royer (Paris Dauphine-PSL).

Download Slides

Expected decrease for derivative-free algorithms using random subspaces

Published:

When optimising functions that are black-box, noisy and/or computationally expensive, it may be impractical to obtain gradient information, requiring the use of derivative-free optimisation (DFO) algorithms. Compared to traditional nonlinear optimisation methods, DFO methods are typically not as scalable (in terms of number of decision variables). However, recent DFO approaches based on iterative steps in randomly drawn subspaces have shown promise as a way of improving scalability. In this talk, I will outline these approaches, and a novel average-case analysis that demonstrates why lower-dimensional subspaces typically perform well (even though this is not guaranteed by existing theory).

Download Slides

Differentiation: the good, the bad, and the ugly

Published:

Most high school calculus students can compute the derivative of a (sufficiently simple) smooth function by repeated application of calculus identities (derivatives of simple functions, the chain rule, etc.). In this talk I aim to convince you that this simple topic leads to some surprisingly difficult and interesting mathematical problems. I will discuss how derivatives are calculated in real-world settings, and show that without proper care, you can get catastrophically bad results. I will then introduce some advanced techniques that overcome these issues.
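A small self-contained illustration (not taken from the talk) of how a naive forward-difference derivative estimate can go badly wrong when the step size is taken too small, because of floating-point cancellation:

```python
# Forward-difference approximation of d/dx exp(x) at x = 1 for shrinking step sizes.
import numpy as np

f = np.exp
x, exact = 1.0, np.exp(1.0)
for h in [1e-4, 1e-8, 1e-12, 1e-16]:
    approx = (f(x + h) - f(x)) / h
    print(f"h = {h:.0e}  estimate = {approx:.8f}  error = {abs(approx - exact):.2e}")
# The error first shrinks as h decreases, then grows as rounding error dominates;
# at h = 1e-16, x + h rounds back to x and the estimate collapses to 0.
```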

Download Slides

Expected decrease for derivative-free algorithms using random subspaces

Published:

A promising approach for improving the scalability of derivative-free optimisation (DFO) methods is to work in low-dimensional subspaces that are iteratively drawn at random. For such methods, the connection between the subspace dimension and the algorithmic guarantees is not yet fully understood. I will introduce a new average-case analysis for direct search and model-based DFO in random subspaces which allows us to better understand why working in low-dimensional subspaces often outperforms higher-dimensional subspaces.

Download Slides

Randomised Subspace Methods for Scalable Derivative-Free Optimisation

Published:

Most algorithms for optimising nonlinear functions rely on access to (possibly stochastic) derivative information. However, for problems including adversarial example generation for neural networks and fine-tuning large language models, good derivative information can be difficult to obtain and derivative-free optimisation (DFO) algorithms are beneficial. Although there are many approaches for DFO, they generally struggle to solve large-scale problems such as those arising in machine learning. In this talk, I will introduce new scalable DFO algorithms based on random subspaces and develop a novel average-case analysis of such algorithms.

Download Slides

Model Construction for Convex-Constrained Derivative-Free Optimization

Published:

In model-based derivative-free optimisation algorithms, black-box functions are typically approximated using polynomial interpolation models. Most existing model-based DFO methods for constrained optimisation assume the ability to construct sufficiently accurate interpolation models, but this is not always achievable when sampling only feasible points. In this talk, I will outline a new approximation theory for linear and quadratic interpolation models in the presence of convex constraints.
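For concreteness, a linear interpolation model $m(s) = c + g^T s$ of a black-box function $f$ around $x_k$ is determined by the interpolation conditions

$$m(y_i - x_k) = f(y_i), \qquad i = 1, \dots, n+1,$$

for sample points $y_1, \dots, y_{n+1}$; when these points must all be feasible for the convex constraints, guaranteeing that the resulting model is sufficiently accurate is exactly what the new approximation theory addresses.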

Download Slides

An adaptively inexact first-order method for bilevel optimization with application to hyperparameter learning

Published:

A common problem in data science is the determination of model hyperparameters. One approach for learning hyperparameters is to use bilevel optimisation, where the lower-level problem is the standard learning optimisation problem, and the upper-level problem is to learn the hyperparameters (e.g. by minimising validation error). In this setting, particularly for large-scale problems, neither exact function values nor exact gradients are attainable, necessitating methods that only rely on inexact evaluation of such quantities. I will present a new bilevel optimisation algorithm with adaptive inexactness suitable for hyperparameter learning. Numerical results on problems from imaging demonstrate its robustness and strong performance. This is joint work with Mohammad Sadegh Salehi, Matthias Ehrhardt (University of Bath) and Subhadip Mukherjee (IIT Kharagpur).

Download Slides

Teaching

Teaching experience 1

Undergraduate course, University 1, Department, 2014

This is a description of a teaching experience. You can use markdown like any other post.

Teaching experience 2

Workshop, University 1, Department, 2015

This is a description of a teaching experience. You can use markdown like any other post.