Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
Published in 116th European Study Group with Industry, 2016
Published in InFoMM CDT Report, 2016
Published in Agri-Food Study Group with Industry, 2017
Published in International Communications in Heat and Mass Transfer, 2017
Published in 136th European Study Group with Industry, 2018
Published in Alan Turing Institute Data Study Group, 2018
Published in Mathematical Programming Computation, 2019
Published in ACM Transactions on Mathematical Software, 2019
Published in PhD Thesis, University of Oxford, 2019
Published in Workshop on "Beyond first-order methods in ML systems", 37th International Conference on Machine Learning (ICML 2020), 2020
Published in 12th OPT Workshop on Optimization for Machine Learning (NeurIPS 2020), 2020
Published in Journal of Mathematical Imaging and Vision, 2021
Published in Optimization, 2021
Published in Journal of Climate, 2021
Published in Mathematical Programming, 2022
Published in EURO Journal on Computational Optimization, 2022
Published in SIAM Journal on Optimization, 2022
Published in Journal of Open Source Software, 2022
Published in Wood Science and Technology, 2022
Published in ANZIAM Journal, 2023
Published in Physical Review A, 2023
Published in Optics Express, 2023
Published in SIAM Journal on Optimization, 2023
Published in IMA Journal of Applied Mathematics, 2023
Published in Transactions on Machine Learning Research, 2024
Published in arXiv preprint arXiv:2407.14915, 2024
Published in Mathematics of Computation, 2024
Published in arXiv preprint arXiv:2412.14431, 2024
Published in arXiv preprint arXiv:2501.04889, 2025
Published in Scale Space and Variational Methods in Computer Vision (SSVM), 2025
Published in SIAM Journal on Optimization, 2025
Published in Journal of Synchrotron Radiation, 2025
Published in Physical Review A, 2025
Published in SIAM Journal on Mathematics of Data Science, 2025
Derivative-Free Gauss-Newton solver
Derivative-free solver for nonlinear least-squares problems
General-purpose derivative-free optimization solver
Python interface to the CUTEst optimization testing package
Python routines for solving trust-region subproblems
Derivative-free block Gauss-Newton method (for large-scale nonlinear least-squares problems)
Flexible optimization package using direct search (derivative-free) methods
Published:
Derivative-free optimisation (DFO) algorithms are a class of optimisation methods for situations in which one is unable to compute or estimate derivatives of the objective. The need for DFO arises in applications where techniques such as finite differencing or algorithmic differentiation are inaccurate or impractical, for example when the objective is noisy (e.g. Monte Carlo simulations in finance) or very expensive to evaluate (e.g. climate simulations).
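As a minimal illustration of why finite differencing struggles on noisy objectives (a sketch of my own, not taken from the thesis; the noise level and step size are arbitrary): the evaluation noise is amplified by the reciprocal of the differencing step, swamping the true derivative.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_f(x, sigma=1e-3):
    # Smooth objective plus simulated evaluation noise (e.g. Monte Carlo error)
    return x**2 + sigma * rng.standard_normal()

x, h = 1.0, 1e-6
fd = (noisy_f(x + h) - noisy_f(x)) / h  # noise of size sigma is amplified by 1/h
print(f"true f'(1) = 2, finite-difference estimate = {fd:.1f}")
```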
Published:
We present DFO-GN, a derivative-free version of the Gauss-Newton method for solving nonlinear least-squares problems. As is common in derivative-free optimization, DFO-GN uses interpolation of function values to build a model of the objective, which is then used within a trust-region framework to give a globally-convergent algorithm requiring $O(\epsilon^{-2})$ iterations to reach approximate first-order criticality within tolerance $\epsilon$. This algorithm is a simplification of the method by Zhang, Conn and Scheinberg (SIOPT, 2010), where we replace quadratic models for each residual with linear models. We demonstrate that DFO-GN performs comparably to the method of Zhang et al. in terms of objective evaluations, as well as having a substantially faster runtime and improved scalability. Finally, we present two extensions of DFO-GN designed to improve its performance on large-scale and noisy problems respectively.
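To make the model construction concrete, here is the standard Gauss-Newton identity the abstract relies on (the notation and the factor of one half are my own choices): a linear model of each residual induces a quadratic model of the objective, which is then minimised over the trust region. With $f(x) = \tfrac{1}{2}\|r(x)\|^2$ and interpolation-based linear models $r(x_k + s) \approx r(x_k) + J_k s$,

$$
f(x_k + s) \approx \tfrac{1}{2}\big\| r(x_k) + J_k s \big\|^2
= f(x_k) + \big(J_k^T r(x_k)\big)^T s + \tfrac{1}{2} s^T J_k^T J_k s,
\qquad \text{minimised over } \|s\| \le \Delta_k,
$$

where $J_k$ is determined from function values alone, so no derivative evaluations are required.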
Published:
We present DFO-LS, a software package for derivative-free optimization (DFO) for nonlinear least-squares problems, which has simplified models, flexible initialization and improved robustness to noise. Inspired by the Gauss-Newton method, DFO-LS constructs simplified linear regression models for the residuals. DFO-LS also has improved flexibility for expensive problems, whereby it can begin making progress from as few as two objective evaluations. Numerical results show DFO-LS can make reasonable progress on some medium-scale problems with fewer objective evaluations than is needed for one gradient evaluation. For noisy problems, DFO-LS allows a wide variety of sample averaging methodologies, the construction of highly overdetermined regression models, and restart strategies. Our extensive numerical experimentation shows that restarting the solver when stagnation is detected is a cheap and effective mechanism for achieving robustness, with superior performance. We also discuss our package Py-BOBYQA, a Python implementation of BOBYQA (Powell, 2009) which implements some of these features for general objective problems.
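A minimal usage sketch for DFO-LS (hedged: the `dfols.solve(objfun, x0)` call reflects the package interface as I understand it, but the exponential-fit residual, data and starting point below are invented purely for illustration):

```python
import numpy as np
import dfols

# Toy data-fitting problem: DFO-LS minimises ||r(x)||^2 using only residual
# evaluations, with no derivative information required.
tdata = np.linspace(0.0, 1.0, 20)
ydata = 2.0 * np.exp(-1.3 * tdata)

def residuals(x):
    return x[0] * np.exp(-x[1] * tdata) - ydata

x0 = np.array([1.0, 1.0])         # initial guess for the two parameters
soln = dfols.solve(residuals, x0)
print(soln.x)                     # estimated minimiser; see the DFO-LS docs for the full solution object
```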
Published:
Classical nonlinear optimization algorithms require the availability of gradient evaluations for constructing local approximations to the objective and testing for convergence. In settings where the objective is expensive to evaluate or noisy, evaluating the gradient may be too expensive or inaccurate, so it cannot be used; we must turn to optimization methods which do not require gradient information, so-called derivative-free optimization (DFO). In this talk, I will introduce DFO and discuss two new software packages for DFO: DFO-LS for nonlinear least-squares problems, and Py-BOBYQA (a Python implementation of Powell’s BOBYQA) for general minimization problems. I will describe their novel features aimed at expensive and/or noisy problems, and show their state-of-the-art performance. Time permitting, I will also show a heuristic method which helps Py-BOBYQA to escape local minima, and show its favorable performance on global optimization problems.
Published:
Classical nonlinear optimisation algorithms require the availability of gradient evaluations for constructing local approximations to the objective and testing for convergence. In settings where the objective is expensive to evaluate or noisy, evaluating the gradient may be too expensive or inaccurate, so it cannot be used; we must turn to optimisation methods which do not require gradient information, so-called derivative-free optimisation (DFO). This has applications in areas such as climate modelling, hyperparameter tuning and generating adversarial examples in deep learning. In this talk, I will introduce DFO and discuss two new software packages for DFO for nonlinear least-squares problems and for general minimisation problems. I will describe their novel features aimed at expensive and/or noisy problems, and show their state-of-the-art performance. Time permitting, I will also show a heuristic method which improves the ability of these methods to escape local minima, and show its favourable performance on global optimisation problems.
Published:
In classical nonlinear optimisation, the availability of first-order information is crucial to constructing accurate local models for the objective and finding descent directions. However, when the objective function is black-box and computationally expensive or stochastic, gradients may be too expensive to compute or too inaccurate to be useful. In this setting, derivative-free optimisation (DFO) provides an alternative class of algorithm. In this talk, I will present new techniques for model-based DFO, which yield an improvement in the efficiency and robustness of existing methods for general minimisation and nonlinear least-squares problems.
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.
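A rough sketch of the subspace idea for intuition (my own simplification, not the algorithm from the talk; the step rule, sample set and acceptance test are deliberately naive): each iteration draws a random low-dimensional subspace, builds a linear interpolation model of the objective restricted to that subspace, and takes a short step there.

```python
import numpy as np

def subspace_step(f, x, delta=0.1, p=2, rng=np.random.default_rng(0)):
    """One illustrative model-based step in a random p-dimensional subspace."""
    n = x.size
    Q, _ = np.linalg.qr(rng.standard_normal((n, p)))  # orthonormal basis of a random subspace
    # Linear interpolation model of f restricted to the subspace: f(x + Q s) ~ c + g^T s
    S = delta * np.vstack([np.zeros(p), np.eye(p)])   # p+1 sample points (subspace coordinates)
    fvals = np.array([f(x + Q @ s) for s in S])
    g = (fvals[1:] - fvals[0]) / delta                # model gradient in subspace coordinates
    if np.linalg.norm(g) == 0.0:
        return x
    s = -delta * g / np.linalg.norm(g)                # trust-region-style step along -g
    trial = x + Q @ s
    return trial if f(trial) < fvals[0] else x        # accept only if the objective decreases

# Example: 50 cheap-linear-algebra steps on a quadratic in R^100
f = lambda x: np.sum((x - 1.0) ** 2)
x = np.zeros(100)
for _ in range(50):
    x = subspace_step(f, x)
print(f(x))
```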
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.
Published:
Classical nonconvex optimisation algorithms require the availability of gradient evaluations for constructing local approximations to the objective function and testing for convergence. In settings where the objective is expensive to evaluate and/or noisy, evaluating its gradient may be too expensive or inaccurate, so it cannot be used; we must turn to optimisation methods which do not require gradient information, so-called derivative-free optimisation (DFO). DFO has applications in areas such as finance, climate modelling and machine learning. In this talk, I will introduce DFO methods and recent progress in the theory and implementation of these methods, with a particular focus on least-squares problems (e.g. parameter fitting).
Published:
In classical nonlinear optimisation, the availability of first-order information is crucial to constructing accurate local models for the objective and finding descent directions. However, when the objective function is black-box, computationally expensive and/or stochastic (which occurs in a variety of practical settings), gradients may be too expensive to compute or too inaccurate to be useful. In this setting, derivative-free optimisation (DFO) provides an alternative class of algorithm, which has grown rapidly in popularity and maturity in recent years. In this talk, I will present new techniques for model-based DFO, which yield an improvement in the efficiency and robustness of existing methods. My focus will be on both general minimisation problems and nonlinear least-squares problems.
Published:
Derivative-free optimization (DFO) methods are an important class of optimization routines for many problems in data science, such as hyperparameter optimization and adversarial attacks for neural networks. However, in model-based DFO methods, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations. I will also discuss how this approach may be extended to DFO for general nonlinear optimization problems.
Published:
Least-squares problems (such as parameter estimation) are ubiquitous across quantitative disciplines. Optimisation algorithms for solving such problems are numerous and well-established. However, in cases where models are computationally expensive, black box, or noisy, classical algorithms can be impractical or even fail. Derivative-free optimisation (DFO) methods provide an alternative approach which can handle these settings. In this talk, Lindon will introduce a derivative-free version of the classical Gauss-Newton method, discuss its theoretical guarantees and software implementation, and describe applications of this technique to parameter estimation of global climate models and image reconstruction.
Published:
When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence is able to be implemented in practice. Using problems from image analysis, we demonstrate that our approach dramatically reduces the computational requirements of bilevel learning. This is joint work with Matthias Ehrhardt (Bath).
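For context, the bilevel learning problem referred to in this abstract takes the generic form below (the notation, quadratic data fit and squared-error upper-level loss are my own choices; the actual formulation depends on the application):

$$
\min_{\theta}\ \sum_{i=1}^{N} \big\| x_i^*(\theta) - x_i^{\mathrm{true}} \big\|^2
\qquad \text{where} \qquad
x_i^*(\theta) \in \operatorname*{arg\,min}_{x}\ \tfrac{1}{2}\,\| A x - y_i \|^2 + \mathcal{R}_\theta(x).
$$

The dynamic accuracy approach described in the talk only requires each lower-level solution $x_i^*(\theta)$ to be computed to a tolerance controlled by the upper-level DFO algorithm, rather than exactly.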
Published:
Derivative-free optimisation (DFO) methods are an important class of optimisation routines with applications in areas such as image analysis and data science. However, in model-based DFO methods, the computational cost of constructing local models can be high, particularly for large-scale problems. Considering nonlinear least-squares problems, we improve on state-of-the-art DFO by performing dimensionality reduction in the observational space using sketching methods, avoiding the construction of a full local model. Our approach has a per-iteration computational cost which is linear in problem dimension in a big data regime, and numerical evidence demonstrates that, compared to existing software, it has dramatically improved runtime performance on overdetermined least-squares problems. This is joint work with Coralia Cartis and Tyler Ferguson (University of Oxford).
Published:
Derivative-free optimisation (DFO) methods are an important class of optimisation routines with applications in areas such as image analysis and data science. However, in model-based DFO methods, the computational cost of constructing local models can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.
Published:
When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence is able to be implemented in practice. Using problems from image analysis, we demonstrate that our approach dramatically reduces the computational requirements of bilevel learning. This is joint work with Matthias Ehrhardt (Bath).
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a model-based derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results.
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a model-based derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results.
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a model-based derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results.
Published:
When optimizing functions which are computationally expensive and/or noisy, gradient information is often impractical to obtain or inaccurate. As a result, so-called ‘derivative-free’ optimization methods are a suitable alternative. In existing techniques for derivative-free optimization, the linear algebra cost of constructing function approximations can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a new derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results. This is joint work with Coralia Cartis (Oxford).
Published:
When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence can be implemented in practice. This is joint work with Matthias Ehrhardt (Bath, UK).
Published:
When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence is able to be implemented in practice.
Published:
When optimizing functions which are computationally expensive and/or noisy, gradient information is often impractical to obtain or inaccurate. As a result, so-called ‘derivative-free’ optimization (DFO) methods are a suitable alternative. In this talk, I will show how existing methods for interpolation-based DFO can be extended to nonconvex problems with convex constraints, accessed only through projections. I will introduce a worst-case complexity analysis and show how existing geometric considerations of model accuracy (from the unconstrained setting) can be generalized to the constrained case. I will then show numerical results in the case of nonlinear least-squares optimization. This is joint work with Matthew Hough (University of Queensland and University of Waterloo).
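To illustrate the access model in the simplest possible terms (a hedged sketch with invented names; the real algorithms embed this inside interpolation models and a trust-region framework): the feasible set is available only through its Euclidean projection operator, and trial points are made feasible by projecting.

```python
import numpy as np

# Convex feasible set C known only through its Euclidean projection operator.
# Example: the unit ball; any convex set with a computable projection works.
def proj_C(x):
    nrm = np.linalg.norm(x)
    return x if nrm <= 1.0 else x / nrm

def feasible_trial_point(xk, step):
    # Candidate step proposed by a derivative-free model, made feasible via projection
    return proj_C(xk + step)

xk = np.array([0.5, 0.5])
print(feasible_trial_point(xk, np.array([1.0, 0.0])))  # lies on the unit circle
```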
Published:
When optimizing functions which are computationally expensive and/or noisy, gradient information is often impractical to obtain or inaccurate. As a result, so-called ‘derivative-free’ optimization (DFO) methods are a suitable alternative. In this talk, I will show how existing methods for interpolation-based DFO can be extended to nonconvex problems with convex constraints, accessed only through projections. I will introduce a worst-case complexity analysis and show how existing geometric considerations of model accuracy (from the unconstrained setting) can be generalized to the constrained case. I will then show numerical results in the case of nonlinear least-squares optimization. This is joint work with Matthew Hough (University of Queensland and University of Waterloo).
Published:
Many standard optimization algorithms require being able to cheaply and accurately compute derivatives for the objective and/or constraint functions. However, in the presence of noise, or computationally expensive or black-box procedures, derivative information may be inaccurate or impractical to compute. Derivative-Free Optimization (DFO) encompasses a variety of techniques for nonlinear optimization in the absence of derivatives. However, such techniques can struggle on large-scale problems for reasons including high linear algebra costs and strong dimension-dependency of worst-case complexity bounds. In this talk, I will discuss model-based and direct search DFO algorithms based on iterative searches in randomly drawn subspaces and show how these methods can be used to improve the scalability of DFO. This is joint work with Coralia Cartis (Oxford) and Clément Royer (Paris Dauphine-PSL).
Published:
Many standard optimization algorithms require being able to cheaply and accurately compute derivatives for the objective and/or constraint functions. However, in the presence of noise, or computationally expensive or black-box procedures, derivative information may be inaccurate or impractical to compute. Derivative-Free Optimization (DFO) encompasses a variety of techniques for nonlinear optimization in the absence of derivatives. However, such techniques can struggle on large-scale problems for reasons including high linear algebra costs and strong dimension-dependency of worst-case complexity bounds. In this talk, I will discuss model-based and direct search DFO algorithms based on iterative searches in randomly drawn subspaces and show how these methods can be used to improve the scalability of DFO. This is joint work with Coralia Cartis (Oxford) and Clément Royer (Paris Dauphine-PSL).
Published:
I will introduce some techniques for optimisation suitable for complex systems, such as those involving significant computation or random noise. I will also discuss some applications of these ideas, focusing specifically on optimising data acquisition processes, drawing on examples from medical imaging.
Published:
Estimating hyperparameters is an important and long-standing problem in machine learning. We consider the case where hyperparameter estimation can be formulated as a bilevel optimization problem. One difficulty with this formulation is that the exact gradient with respect to the hyperparameters cannot be computed and must instead be approximated. We provide an analysis of hypergradient estimation in a flexible framework which allows the use of either automatic differentiation or the Implicit Function Theorem, and demonstrate the relative performance of several approaches. This is joint work with Matthias Ehrhardt (Bath, UK).
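For reference, the implicit-differentiation route mentioned here rests on the standard hypergradient formula (notation mine): if the lower-level problem is $x^*(\theta) = \arg\min_x g(x, \theta)$ and the upper-level loss is $\ell(x^*(\theta))$, then under suitable smoothness and invertibility assumptions,

$$
\nabla_\theta\, \ell\big(x^*(\theta)\big)
= - \nabla^2_{\theta x} g\big(x^*(\theta), \theta\big)\,
\big[\nabla^2_{xx} g\big(x^*(\theta), \theta\big)\big]^{-1}\,
\nabla \ell\big(x^*(\theta)\big).
$$

In practice this can only be evaluated approximately, since neither $x^*(\theta)$ nor the inner linear solve is available exactly; that inexactness is what the analysis quantifies.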
Published:
Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modelled as the solution to an optimization problem. In this case the exact gradient with respect to the hyperparameters cannot be computed and approximate strategies are required. We provide an analysis of two classes of methods based on inexact automatic differentiation or approximate implicit differentiation. Our analysis reveals that these two strategies are actually tightly connected, and we derive a priori and a posteriori estimates for both methods which can be used to bound computations and gain further insight into what their accuracy actually depends on.
Published:
Many standard optimization algorithms require being able to cheaply and accurately compute derivatives for the objective and/or constraint functions. However, in the presence of noise, or computationally expensive or black-box procedures, derivative information may be inaccurate or impractical to compute. Derivative-Free Optimization (DFO) encompasses a variety of techniques for nonlinear optimization in the absence of derivatives. However, such techniques can struggle on large-scale problems for reasons including high linear algebra costs and strong dimension-dependency of worst-case complexity bounds. In this talk, I will discuss model-based and direct search DFO algorithms based on iterative searches in randomly drawn subspaces and show how these methods can be used to improve the scalability of DFO. This is joint work with Coralia Cartis (Oxford) and Clément Royer (Paris Dauphine-PSL).
Published:
When optimising functions that are black-box, noisy and/or computationally expensive, it may be impractical to get gradient information, requiring the use of derivative-free optimisation (DFO) algorithms. Compared to traditional nonlinear optimisation methods, DFO methods are typically not as scalable (in terms of number of decision variables). However, recent DFO approaches based on iterative steps in randomly drawn subspaces have shown promise as a way of improving scalability. In this talk, I will outline these approaches, and a novel average-case analysis that demonstrates why lower dimensional subspaces typically perform well (even though this is not guaranteed by existing theory).
Published:
Most high school calculus students can compute the derivative of a (sufficiently simple) smooth function by repeated application of calculus identities (derivatives of simple functions, chain rule, etc.). In this talk I aim to convince you that this simple topic leads to some surprisingly difficult and interesting mathematical problems. I will discuss how derivatives are calculated in real-world settings, and show that without proper care, you can get catastrophically bad results. I will then introduce some advanced techniques that overcome these issues.
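One standard instance of the "catastrophically bad results" alluded to here (my own illustrative example, not necessarily the one used in the talk): forward finite differences lose accuracy to floating-point cancellation when the step is taken too small.

```python
import numpy as np

f = np.exp        # f(x) = e^x, so f'(1) = e
x = 1.0
for h in [1e-1, 1e-4, 1e-8, 1e-12, 1e-16]:
    fd = (f(x + h) - f(x)) / h
    print(f"h = {h:.0e}:  error = {abs(fd - np.e):.2e}")
# The error first shrinks with h (truncation error ~ h) but then grows again
# (rounding error ~ machine epsilon / h); at h = 1e-16, x + h rounds to x and
# the estimate is meaningless.
```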
Published:
A promising approach for improving the scalability of DFO methods is to work in low-dimensional subspaces that are iteratively drawn at random. For such methods, the connection between the subspace dimension and the algorithmic guarantees is not yet fully understood. I will introduce a new average-case analysis for direct search and model-based DFO in random subspaces which allows us to better understand why working in low-dimensional subspaces often outperforms higher-dimensional subspaces.
Published:
Most algorithms for optimising nonlinear functions rely on access to (possibly stochastic) derivative information. However, for problems including adversarial example generation for neural networks and fine-tuning large language models, good derivative information can be difficult to obtain and derivative-free optimisation (DFO) algorithms are beneficial. Although there are many approaches for DFO, they generally struggle to solve large-scale problems such as those arising in machine learning. In this talk, I will introduce new scalable DFO algorithms based on random subspaces and develop a novel average-case analysis of such algorithms.
Published:
In model-based derivative-free optimisation algorithms, black-box functions are typically approximated using polynomial interpolation models. Most existing model-based DFO methods for constrained optimisation assume the ability to construct sufficiently accurate interpolation models, but this is not always achievable when sampling only feasible points. In this talk, I will outline a new approximation theory for linear and quadratic interpolation models in the presence of convex constraints.
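To fix ideas, a linear interpolation model is determined by matching function values at $n+1$ sample points, which in the constrained setting must all be feasible; a minimal sketch (the helper name and test function below are mine):

```python
import numpy as np

def linear_interp_model(f, Y):
    """Fit m(y) = c + g^T y through the sample points in Y (one point per row).
    Y must contain n+1 affinely independent points in R^n."""
    n = Y.shape[1]
    A = np.hstack([np.ones((n + 1, 1)), Y])             # interpolation matrix, rows [1, y_i^T]
    coeffs = np.linalg.solve(A, np.array([f(y) for y in Y]))
    return coeffs[0], coeffs[1:]                         # constant term c and model gradient g

# Example: interpolate f(x) = x1 + 2*x2 at three sample points; since f is
# itself linear, the model gradient is recovered exactly.
f = lambda x: x[0] + 2.0 * x[1]
Y = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
c, g = linear_interp_model(f, Y)
print(c, g)   # -> 0.0 [1. 2.]
```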
Published:
A common problem in data science is the determination of model hyperparameters. One approach for learning hyperparameters is to use bilevel optimisation, where the lower-level problem is the standard learning optimisation problem, and the upper-level problem is to learn the hyperparameters (e.g. by minimising validation error). In this setting, particularly for large-scale problems, neither exact function values nor exact gradients are attainable, necessitating methods that only rely on inexact evaluation of such quantities. I will present a new bilevel optimisation algorithm with adaptive inexactness suitable for hyperparameter learning. Numerical results on problems from imaging demonstrate its robustness and strong performance. This is joint work with Mohammad Sadegh Salehi, Matthias Ehrhardt (University of Bath) and Subhadip Mukherjee (IIT Kharagpur).
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.