Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
Publications
Flow dynamics in complex buildings (airports and rail stations)
Published in 116th European Study Group with Industry, 2016
Derivative-Free Optimisation for Data Fitting
Published in InFoMM CDT Report, 2016
Optimising the Design of Polythene Hydroponics Beds
Published in Agri-Food Study Group with Industry, 2017
A heat and mass transfer study of carbon paste baking
Published in International Communications in Heat and Mass Transfer, 2017
Cable Routing on Ships
Published in 136th European Study Group with Industry, 2018
Capturing the Purpose of Websites
Published in Alan Turing Institute Data Study Group, 2018
A Derivative-Free Gauss-Newton Method
Published in Mathematical Programming Computation, 2019
Improving the Flexibility and Robustness of Model-Based Derivative-Free Optimization Solvers
Published in ACM Transactions on Mathematical Software, 2019
Derivative-Free Algorithms for Nonlinear Optimisation Problems
Published in PhD Thesis, University of Oxford, 2019
Scalable Derivative-Free Optimization for Nonlinear Least-Squares Problems
Published in Workshop on "Beyond first-order methods in ML systems", 37th International Conference on Machine Learning (ICML 2020), 2020
Efficient Hyperparameter Tuning with Dynamic Accuracy Derivative-Free Optimization
Published in 12th OPT Workshop on Optimization for Machine Learning (NeurIPS 2020), 2020
Inexact Derivative-Free Optimization for Bilevel Learning
Published in Journal of Mathematical Imaging and Vision, 2021
Escaping local minima with local derivative-free methods: a numerical investigation
Published in Optimization, 2021
Does Model Calibration Reduce Uncertainty in Climate Projections?
Published in Journal of Climate, 2021
Scalable Subspace Methods for Derivative-Free Nonlinear Least-Squares Optimization
Published in Mathematical Programming, 2022
A Simplified Convergence Theory for Byzantine Resilient Stochastic Gradient Descent
Published in EURO Journal on Computational Optimization, 2022
Model-Based Derivative-Free Methods for Convex-Constrained Optimization
Published in SIAM Journal on Optimization, 2022
PyCUTEst: an open source Python package of optimization test problems
Published in Journal of Open Source Software, 2022
On the selection of the weighting parameter value in optimizing Eucalyptus globulus pulp yield models based on NIR spectra
Published in Wood Science and Technology, 2022
Furnace Vestibule Heat Transport Models
Published in ANZIAM Journal, 2023
Optimizing illumination patterns for classical ghost imaging
Published in Physical Review A, 2023
Mask design, fabrication, and experimental ghost imaging applications for patterned X-ray illumination
Published in Optics Express, 2023
Direct search based on probabilistic descent in reduced spaces
Published in SIAM Journal on Optimization, 2023
Analyzing Inexact Hypergradients for Bilevel Learning
Published in IMA Journal of Applied Mathematics, 2023
Non-Uniform Smoothness for Gradient Descent
Published in Transactions on Machine Learning Research, 2024
Black-box Optimization Algorithms for Regularized Least-squares Problems
Published in arXiv preprint arXiv:2407.14915, 2024
Expected decrease for derivative-free algorithms using random subspaces
Published in Mathematics of Computation, 2024
Randomized Subspace Derivative-Free Optimization with Quadratic Models and Second-Order Convergence
Published in arXiv preprint arXiv:2412.14431, 2024
Projected proximal gradient trust-region algorithm for nonsmooth optimization
Published in arXiv preprint arXiv:2501.04889, 2025
Bilevel Learning with Inexact Stochastic Gradients
Published in Scale Space and Variational Methods in Computer Vision (SSVM), 2025
Model Construction for Convex-Constrained Derivative-Free Optimization
Published in SIAM Journal on Optimization, 2025
High-resolution X-ray scanning with a diffuse Huffman-patterned probe to reduce radiation damage
Published in Journal of Synchrotron Radiation, 2025
Neutron beam shaping by ghost projection
Published in Physical Review A, 2025
An adaptively inexact first-order method for bilevel optimization with application to hyperparameter learning
Published in SIAM Journal on Mathematics of Data Science, 2025
Software
DFO-GN
Derivative-Free Gauss-Newton solver
DFO-LS
Derivative-free solver for nonlinear least-squares problems
Py-BOBYQA
General-purpose derivative-free optimization solver
PyCUTEst
Python interface to CUTEst optimization testing package
trustregion
Python routines for solving trust-region subproblems
DFBGN
Derivative-free block Gauss-Newton method (for large-scale nonlinear least-squares problems)
directsearch
Flexible optimization package using direct search (derivative-free) methods
Talks
Derivative-Free Optimisation Methods for Nonlinear Least-Squares Problems
Published:
Derivative-free optimisation (DFO) algorithms are a category of optimisation methods for situations when one is unable to compute or estimate derivatives of the objective. The need for DFO arises in applications where techniques such as finite differencing or algorithmic differentiation are inaccurate or impractical, such as when the objective has noise (e.g. Monte Carlo simulations in finance) or is very expensive (e.g. climate simulations).
Improving the efficiency of derivative-free methods for nonlinear least squares problems
Published:
We present DFO-GN, a derivative-free version of the Gauss-Newton method for solving nonlinear least-squares problems. As is common in derivative-free optimization, DFO-GN uses interpolation of function values to build a model of the objective, which is then used within a trust-region framework to give a globally-convergent algorithm requiring $O(\epsilon^{-2})$ iterations to reach approximate first-order criticality within tolerance $\epsilon$. This algorithm is a simplification of the method by Zhang, Conn and Scheinberg (SIOPT, 2010), where we replace quadratic models for each residual with linear models. We demonstrate that DFO-GN performs comparably to the method of Zhang et al. in terms of objective evaluations, as well as having a substantially faster runtime and improved scalability. Finally, we present two extensions of DFO-GN designed to improve its performance on large-scale and noisy problems respectively.
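To make the interpolation-plus-trust-region idea above concrete, here is a minimal, self-contained Python sketch of a derivative-free Gauss-Newton style step on a toy least-squares problem. It is an illustration only, not the DFO-GN algorithm itself: the sample set, model update and acceptance rules are deliberately simplified, and all names are my own.

```python
import numpy as np

def residuals(x):
    # Toy least-squares residuals (the Rosenbrock function written as r(x))
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

def linear_models(r, xk, delta):
    """Interpolate r(x) ~ r(xk) + J (x - xk) through n+1 sample points."""
    n = xk.size
    rk = r(xk)
    steps = delta * np.eye(n)                                # sample displacements
    R = np.array([r(xk + steps[i]) - rk for i in range(n)])
    J = np.linalg.solve(steps, R).T                          # model Jacobian (m x n)
    return rk, J

def trust_region_step(rk, J, delta):
    """Approximately minimise ||rk + J s||^2 subject to ||s|| <= delta."""
    s, *_ = np.linalg.lstsq(J, -rk, rcond=None)              # Gauss-Newton step
    norm_s = np.linalg.norm(s)
    return s if norm_s <= delta else s * (delta / norm_s)

xk, delta = np.array([-1.2, 1.0]), 0.1
for _ in range(50):
    rk, J = linear_models(residuals, xk, delta)
    s = trust_region_step(rk, J, delta)
    if np.sum(residuals(xk + s) ** 2) < np.sum(rk ** 2):     # accept if f decreases
        xk, delta = xk + s, min(2.0 * delta, 1.0)
    else:
        delta *= 0.5
print(xk)  # iterates move towards the minimiser (1, 1)
```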
A flexible, robust and efficient derivative-free solver for least-squares
Published:
We present DFO-LS, a software package for derivative-free optimization (DFO) for nonlinear least-squares problems, which has simplified models, flexible initialization and improved robustness to noise. Inspired by the Gauss-Newton method, DFO-LS constructs simplified linear regression models for the residuals. DFO-LS also has improved flexibility for expensive problems, whereby it can begin making progress from as few as two objective evaluations. Numerical results show DFO-LS can make reasonable progress on some medium-scale problems with fewer objective evaluations than is needed for one gradient evaluation. For noisy problems, DFO-LS allows a wide variety of sample averaging methodologies, the construction of highly overdetermined regression models, and restart strategies. Our extensive numerical experimentation shows that restarting the solver when stagnation is detected is a cheap and effective mechanism for achieving robustness, and delivers superior performance. We also discuss our package Py-BOBYQA, a Python implementation of BOBYQA (Powell, 2009) which implements some of these features for general objective problems.
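For reference, the basic calling pattern of DFO-LS, as I recall it from the package documentation (so treat the exact names and printed output as an assumption), looks roughly like the following; note that objfun returns the vector of residuals rather than the summed squares.

```python
import numpy as np
import dfols  # pip install DFO-LS

def objfun(x):
    # Residual vector r(x); DFO-LS minimises the sum of squares of r(x)
    return np.array([10.0 * (x[1] - x[0] ** 2), 1.0 - x[0]])

x0 = np.array([-1.2, 1.0])
soln = dfols.solve(objfun, x0)
print(soln)  # prints a summary of the solution found
# Py-BOBYQA exposes an analogous pybobyqa.solve(objfun, x0) for general
# objectives, where objfun returns a scalar.
```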
Improving the Flexibility and Robustness of Derivative-Free Optimization Solvers
Published:
Classical nonlinear optimization algorithms require the availability of gradient evaluations for constructing local approximations to the objective and testing for convergence. In settings where the objective is expensive to evaluate or noisy, evaluating the gradient may be too expensive or too inaccurate to be used; we must turn to optimization methods which do not require gradient information, so-called derivative-free optimization (DFO). In this talk, I will introduce DFO and discuss two new software packages for DFO: DFO-LS for nonlinear least-squares problems, and Py-BOBYQA (a Python implementation of Powell’s BOBYQA) for general minimization problems. I will describe their novel features aimed at expensive and/or noisy problems, and show their state-of-the-art performance. Time permitting, I will also show a heuristic method which helps Py-BOBYQA to escape local minima, and show its favorable performance on global optimization problems.
Improving the Flexibility and Robustness of Derivative-Free Optimisation Solvers
Published:
Classical nonlinear optimisation algorithms require the availability of gradient evaluations for constructing local approximations to the objective and testing for convergence. In settings where the objective is expensive to evaluate or noisy, evaluating the gradient may be too expensive or too inaccurate to be used; we must turn to optimisation methods which do not require gradient information, so-called derivative-free optimisation (DFO). This has applications in areas such as climate modelling, hyperparameter tuning and generating adversarial examples in deep learning. In this talk, I will introduce DFO and discuss two new software packages for DFO for nonlinear least-squares problems and for general minimisation problems. I will describe their novel features aimed at expensive and/or noisy problems, and show their state-of-the-art performance. Time permitting, I will also show a heuristic method which improves the ability of these methods to escape local minima, and show its favourable performance on global optimisation problems.
Improving the efficiency and robustness of black-box optimisation
Published:
In classical nonlinear optimisation, the availability of first-order information is crucial to constructing accurate local models for the objective and finding descent directions. However, when the objective function is black-box and computationally expensive or stochastic, gradients may be too expensive to compute or too inaccurate to be useful. In this setting, derivative-free optimisation (DFO) provides an alternative class of algorithm. In this talk, I will present new techniques for model-based DFO, which yield an improvement in the efficiency and robustness of existing methods for general minimisation and nonlinear least-squares problems.
Improving the scalability of derivative-free optimization for nonlinear least-squares problems
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.
Improving the scalability of derivative-free optimization for nonlinear least-squares problems
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.
Improving the scalability of derivative-free optimization for nonlinear least-squares problems
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.
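The flavour of the subspace idea in the talks above can be conveyed with a small Python sketch (purely illustrative, not the talk's algorithm): each iteration draws a random low-dimensional subspace, builds a linear interpolation model only within it, and takes a short step, so both the number of evaluations and the linear algebra per iteration scale with the subspace dimension p rather than the ambient dimension n.

```python
import numpy as np

def f(x):
    # Smooth, convex toy objective in n dimensions
    return np.sum((x - 1.0) ** 2) + 0.5 * np.sum(x[:-1] * x[1:])

rng = np.random.default_rng(0)
n, p = 100, 5                         # ambient and subspace dimensions
x, delta = np.zeros(n), 1.0
for _ in range(300):
    Q, _ = np.linalg.qr(rng.standard_normal((n, p)))  # random subspace basis
    fx = f(x)
    # Gradient of the linear interpolation model of f restricted to x + Q s,
    # built from p extra evaluations (independent of n)
    g = np.array([(f(x + delta * Q[:, i]) - fx) / delta for i in range(p)])
    s = -delta * g / (np.linalg.norm(g) + 1e-12)      # step within the subspace
    if f(x + Q @ s) < fx:                             # accept on decrease
        x, delta = x + Q @ s, min(2.0 * delta, 1.0)
    else:
        delta *= 0.5
print(f(x))  # final objective value (monotonically non-increasing over iterations)
```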
Derivative-Free Algorithms for Nonconvex Optimisation
Published:
Classical nonconvex optimisation algorithms require the availability of gradient evaluations for constructing local approximations to the objective function and testing for convergence. In settings where the objective is expensive to evaluate and/or noisy, evaluating its gradient may be too expensive or too inaccurate to be used; we must turn to optimisation methods which do not require gradient information, so-called derivative-free optimisation (DFO). DFO has applications in areas such as finance, climate modelling and machine learning. In this talk, I will introduce DFO methods and recent progress in the theory and implementation of these methods, with a particular focus on least-squares problems (e.g. parameter fitting).
Improving the efficiency and robustness of black-box optimisation
Published:
In classical nonlinear optimisation, the availability of first-order information is crucial to constructing accurate local models for the objective and finding descent directions. However, when the objective function is black-box, computationally expensive and/or stochastic - which occurs in a variety of practical settings - gradients may be too expensive to compute or too inaccurate to be useful. In this setting, derivative-free optimisation (DFO) provides an alternative class of algorithm, which has grown rapidly in popularity and maturity in recent years. In this talk, I will present new techniques for model-based DFO, which yield an improvement in the efficiency and robustness of existing methods. My focus will be on both general minimisation and nonlinear least-squares problems specifically.
Improving the scalability of model-based derivative-free optimization
Published:
Derivative-free optimization (DFO) methods are an important class of optimization routines for many problems in data science, such as hyperparameter optimization and adversarial attacks for neural networks. However, in model-based DFO methods, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations. I will also discuss how this approach may be extended to DFO for general nonlinear optimization problems.
Derivative-free optimisation for least-squares problems
Published:
Least-squares problems (such as parameter estimation) are ubiquitous across quantitative disciplines. Optimisation algorithms for solving such problems are numerous and well-established. However, in cases where models are computationally expensive, black box, or noisy, classical algorithms can be impractical or even fail. Derivative-free optimisation (DFO) methods provide an alternative approach which can handle these settings. In this talk, Lindon will introduce a derivative-free version of the classical Gauss-Newton method, discuss its theoretical guarantees and software implementation, and describe applications of this technique to parameter estimation of global climate models and image reconstruction.
Inexact Derivative-Free Optimisation for Bilevel Learning
Published:
When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence is able to be implemented in practice. Using problems from image analysis, we demonstrate that our approach dramatically reduces the computational requirements of bilevel learning. This is joint work with Matthias Ehrhardt (Bath).
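For context, the bilevel learning problem referred to here can be written in generic notation (the symbols below are illustrative, not the talk's exact statement) as

```latex
% Generic bilevel learning formulation (illustrative notation)
\[
\begin{aligned}
\min_{\theta}\quad & \frac{1}{N}\sum_{i=1}^{N} \bigl\| \hat{x}_i(\theta) - x_i^{\dagger} \bigr\|^2 \\
\text{s.t.}\quad & \hat{x}_i(\theta) \in \arg\min_{x}\; \tfrac{1}{2}\bigl\| A x - y_i \bigr\|^2 + R(x;\theta),
\qquad i = 1,\dots,N,
\end{aligned}
\]
```

where $x_i^{\dagger}$ are ground-truth images, $y_i$ the measured data, $A$ the forward operator and $R(\cdot\,;\theta)$ the parametrised regulariser; the dynamic accuracy approach only requires each $\hat{x}_i(\theta)$ to be computed to a tolerance controlled by the upper-level algorithm.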
Scalable Derivative-Free Optimization for Nonlinear Least-Squares Problems
Published:
Derivative-free optimisation (DFO) methods are an important class of optimisation routines with applications in areas such as image analysis and data science. However, in model-based DFO methods, the computational cost of constructing local models can be high, particularly for large-scale problems. Considering nonlinear least-squares problems, we improve on state-of-the-art DFO by performing dimensionality reduction in the observational space using sketching methods, avoiding the construction of a full local model. Our approach has a per-iteration computational cost which is linear in problem dimension in a big data regime, and numerical evidence demonstrates that, compared to existing software, it has dramatically improved runtime performance on overdetermined least-squares problems. This is joint work with Coralia Cartis and Tyler Ferguson (University of Oxford).
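As a rough illustration of dimensionality reduction in the observational space (plain sketch-and-solve for a linear least-squares problem, not the DFO method from the talk), a random sketch compresses the $m$ observations to $k \ll m$ before any linear algebra is performed:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 10000, 50, 400      # m observations, n variables, sketch dimension k
A = rng.standard_normal((m, n))
b = A @ rng.standard_normal(n) + 0.01 * rng.standard_normal(m)

S = rng.standard_normal((k, m)) / np.sqrt(k)   # Gaussian sketch of the observations
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x_sketch - x_full) / np.linalg.norm(x_full))  # small relative error
```

In practice, structured sketches (e.g. sparse or subsampled transforms) are preferred so that applying the sketch itself is cheap.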
Block Methods for Scalable Derivative-Free Optimisation
Published:
Derivative-free optimisation (DFO) methods are an important class of optimisation routines with applications in areas such as image analysis and data science. However, in model-based DFO methods, the computational cost of constructing local models can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will introduce a derivative-free method based on exploration of random subspaces, suitable for nonlinear least-squares problems. This method has a substantially reduced computational cost (in terms of linear algebra), while still making progress using few objective evaluations.
Inexact Derivative-Free Optimization for Bilevel Learning
Published:
When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence is able to be implemented in practice. Using problems from image analysis, we demonstrate that our approach dramatically reduces the computational requirements of bilevel learning. This is joint work with Matthias Ehrhardt (Bath).
Large-Scale Derivative-Free Optimization using Subspace Methods
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a model-based derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results.
Large-Scale Derivative-Free Optimization using Subspace Methods
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a model-based derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results.
Large-Scale Derivative-Free Optimization using Subspace Methods
Published:
In existing techniques for model-based derivative-free optimization, the computational cost of constructing local models and Lagrange polynomials can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a model-based derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results.
Scalable Subspace Methods for Derivative-Free Nonlinear Least-Squares Optimization
Published:
When optimizing functions which are computationally expensive and/or noisy, gradient information is often impractical to obtain or inaccurate. As a result, so-called ‘derivative-free’ optimization methods are a suitable alternative. In existing techniques for derivative-free optimization, the linear algebra cost of constructing function approximations can be high. As a result, these algorithms are not as suitable for large-scale problems as derivative-based methods. In this talk, I will discuss a new derivative-free algorithm based on exploration of random subspaces, its worst-case complexity bounds, and some numerical results. This is joint work with Coralia Cartis (Oxford).
Inexact Derivative-Free Optimization for Bilevel Learning
Published:
When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence can be implemented in practice. This is joint work with Matthias Ehrhardt (Bath, UK).
Inexact Derivative-Free Optimization for Bilevel Learning
Published:
When variational regularisation methods are used to solve inverse problems, they suffer from the drawback of having potentially many parameters which the user must specify. A common approach to handle this is to learn these parameters from data. While mathematically appealing, this strategy leads to a bilevel optimisation problem which is difficult to solve computationally. Theoretically, algorithms for bilevel learning rely on access to exact solutions to the lower-level regularisation problem, but this condition is not guaranteed in practice. In this talk, we describe a novel approach using dynamic accuracy derivative-free optimisation for solving bilevel learning problems. This approach still retains convergence guarantees but allows the regularisation problem to be solved inexactly and hence is able to be implemented in practice.
Derivative-Free Optimization with Convex Constraints
Published:
When optimizing functions which are computationally expensive and/or noisy, gradient information is often impractical to obtain or inaccurate. As a result, so-called ‘derivative-free’ optimization (DFO) methods are a suitable alternative. In this talk, I will show how existing methods for interpolation-based DFO can be extended to nonconvex problems with convex constraints, accessed only through projections. I will introduce a worst-case complexity analysis and show how existing geometric considerations of model accuracy (from the unconstrained setting) can be generalized to the constrained case. I will then show numerical results in the case of nonlinear least-squares optimization. This is joint work with Matthew Hough (University of Queensland and University of Waterloo).
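A schematic Python sketch of the 'constraints accessed only through projections' setting (a simple projected step with finite-difference gradient estimates, standing in for the interpolation-based method of the talk): the only information about the feasible set C is a routine returning the Euclidean projection onto it.

```python
import numpy as np

def proj_C(x, centre=np.zeros(2), radius=1.0):
    """Euclidean projection onto the ball ||x - centre|| <= radius: the only
    access the algorithm has to the feasible set C."""
    d = x - centre
    nrm = np.linalg.norm(d)
    return x if nrm <= radius else centre + (radius / nrm) * d

def f(x):
    # Unconstrained minimiser (2, -1) lies outside the unit ball C
    return (x[0] - 2.0) ** 2 + (x[1] + 1.0) ** 2

x, delta = proj_C(np.array([0.5, 0.5])), 0.5
for _ in range(200):
    fx = f(x)
    # Crude forward-difference gradient estimate (standing in for an
    # interpolation model)
    g = np.array([(f(x + delta * e) - fx) / delta for e in np.eye(2)])
    trial = proj_C(x - delta * g / (np.linalg.norm(g) + 1e-12))  # project the step
    if f(trial) < fx:
        x, delta = trial, min(2.0 * delta, 0.5)
    else:
        delta *= 0.5
print(x)  # approaches the constrained minimiser, the projection of (2, -1) onto C
```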
Derivative-Free Optimization with Convex Constraints
Published:
When optimizing functions which are computationally expensive and/or noisy, gradient information is often impractical to obtain or inaccurate. As a result, so-called ‘derivative-free’ optimization (DFO) methods are a suitable alternative. In this talk, I will show how existing methods for interpolation-based DFO can be extended to nonconvex problems with convex constraints, accessed only through projections. I will introduce a worst-case complexity analysis and show how existing geometric considerations of model accuracy (from the unconstrained setting) can be generalized to the constrained case. I will then show numerical results in the case of nonlinear least-squares optimization. This is joint work with Matthew Hough (University of Queensland and University of Waterloo).
Large-scale derivative-free optimization using random subspace methods
Published:
Many standard optimization algorithms require being able to cheaply and accurately compute derivatives for the objective and/or constraint functions. However, in the presence of noise, or computationally expensive or black-box procedures, derivative information may be inaccurate or impractical to compute. Derivative-Free Optimization (DFO) encompasses a variety of techniques for nonlinear optimization in the absence of derivatives. However, such techniques can struggle on large-scale problems for reasons including high linear algebra costs and strong dimension-dependency of worst-case complexity bounds. In this talk, I will discuss model-based and direct search DFO algorithms based on iterative searches in randomly drawn subspaces and show how these methods can be used to improve the scalability of DFO. This is joint work with Coralia Cartis (Oxford) and Clément Royer (Paris Dauphine-PSL).
Large-scale derivative-free optimization using random subspace methods
Published:
Many standard optimization algorithms require being able to cheaply and accurately compute derivatives for the objective and/or constraint functions. However, in the presence of noise, or computationally expensive or black-box procedures, derivative information may be inaccurate or impractical to compute. Derivative-Free Optimization (DFO) encompasses a variety of techniques for nonlinear optimization in the absence of derivatives. However, such techniques can struggle on large-scale problems for reasons including high linear algebra costs and strong dimension-dependency of worst-case complexity bounds. In this talk, I will discuss model-based and direct search DFO algorithms based on iterative searches in randomly drawn subspaces and show how these methods can be used to improve the scalability of DFO. This is joint work with Coralia Cartis (Oxford) and Clément Royer (Paris Dauphine-PSL).
Black-Box Optimisation Techniques for Complex Systems
Published:
I will introduce some techniques for optimisation suitable for complex systems, such as those involving significant computation or random noise. I will also discuss some applications of these ideas, focusing specifically on optimising data acquisition processes, drawing on examples from medical imaging.
Analyzing Inexact Hypergradients for Bilevel Learning
Published:
Estimating hyperparameters is an important and long-standing problem in machine learning. We consider the case where hyperparameter estimation can be formulated as a bilevel optimization problem. One difficulty with this formulation is that the exact gradient with respect to the hyperparameters cannot be computed and must instead be approximated. We provide an analysis of hypergradient estimation in a flexible framework which allows using either automatic differentiation or the Implicit Function Theorem, and demonstrate the relative performance of several approaches. This is joint work with Matthias Ehrhardt (Bath, UK).
Analyzing Inexact Hypergradients for Bilevel Learning
Published:
Estimating hyperparameters has been a long-standing problem in machine learning. We consider the case where the task at hand is modelled as the solution to an optimization problem. In this case the exact gradient with respect to the hyperparameters cannot be computed and approximate strategies are required. We provide an analysis of two classes of methods based on inexact automatic differentiation or approximate implicit differentiation. Our analysis reveals that these two strategies are actually tightly connected, and we derive a priori and a posteriori estimates for both methods which can be used to bound computations and gain further insight into what their accuracy actually depends on.
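For readers unfamiliar with the setup: writing the lower-level problem as $x^*(\theta) = \arg\min_x h(x,\theta)$ and the upper-level loss as $f(\theta) = g(x^*(\theta))$ (generic notation, not necessarily the talk's), the exact hypergradient that both strategies approximate is, via the implicit function theorem,

```latex
% Exact hypergradient via the implicit function theorem (generic notation)
\[
\nabla f(\theta)
  = -\,\nabla^2_{\theta x} h\bigl(x^*(\theta),\theta\bigr)\,
      \bigl[\nabla^2_{xx} h\bigl(x^*(\theta),\theta\bigr)\bigr]^{-1}
      \nabla g\bigl(x^*(\theta)\bigr).
\]
```

In practice neither $x^*(\theta)$ nor the linear solve with $\nabla^2_{xx} h$ is available exactly, and this is the inexactness that the a priori and a posteriori estimates quantify.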
Large-scale derivative-free optimization using random subspace methods
Published:
Many standard optimization algorithms require being able to cheaply and accurately compute derivatives for the objective and/or constraint functions. However, in the presence of noise, or computationally expensive or black-box procedures, derivative information may be inaccurate or impractical to compute. Derivative-Free Optimization (DFO) encompasses a variety of techniques for nonlinear optimization in the absence of derivatives. However, such techniques can struggle on large-scale problems for reasons including high linear algebra costs and strong dimension-dependency of worst-case complexity bounds. In this talk, I will discuss model-based and direct search DFO algorithms based on iterative searches in randomly drawn subspaces and show how these methods can be used to improve the scalability of DFO. This is joint work with Coralia Cartis (Oxford) and Clément Royer (Paris Dauphine-PSL).
Expected decrease for derivative-free algorithms using random subspaces
Published:
When optimising functions that are black-box, noisy and/or computationally expensive, it may be impractical to obtain gradient information, requiring the use of derivative-free optimisation (DFO) algorithms. Compared to traditional nonlinear optimisation methods, DFO methods are typically not as scalable (in terms of number of decision variables). However, recent DFO approaches based on iterative steps in randomly drawn subspaces have shown promise as a way of improving scalability. In this talk, I will outline these approaches, and a novel average-case analysis that demonstrates why lower dimensional subspaces typically perform well (even though this is not guaranteed by existing theory).
Differentiation: the good, the bad, and the ugly
Published:
For most (sufficiently simple) smooth functions, most high school calculus students can compute its derivative by repeated application of calculus identities (derivatives of simple functions, chain rule, etc.). In this talk I aim to convince you that this simple topic leads to some surprisingly difficult and interesting mathematical problems. I will discuss how derivatives are calculated in real-world settings, and show that without proper care, you can get catastrophically bad results. I will then introduce some advanced techniques that overcome these issues.
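As a small toy example of the kind of pitfall alluded to (my own illustration, not taken from the talk): forward differences for $f(x) = e^x$ at $x = 1$ lose accuracy badly if the step size is chosen carelessly.

```python
import numpy as np

x, exact = 1.0, np.exp(1.0)  # d/dx exp(x) at x = 1 is exp(1)
for h in [1e-1, 1e-4, 1e-8, 1e-12, 1e-15]:
    fd = (np.exp(x + h) - np.exp(x)) / h   # forward-difference estimate
    print(f"h = {h:.0e}   error = {abs(fd - exact):.2e}")
# Truncation error shrinks with h, but rounding error grows as h -> 0: the
# estimate is best near h ~ 1e-8 and becomes very poor for much smaller h.
```

Algorithmic differentiation and the complex-step approximation are standard ways around this truncation-versus-rounding trade-off.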
Expected decrease for derivative-free algorithms using random subspaces
Published:
A promising approach for improving the scalability of DFO methods is to work in low-dimensional subspaces that are iteratively drawn at random. For such methods, the connection between the subspace dimension and the algorithmic guarantees is not yet fully understood. I will introduce a new average-case analysis for direct search and model-based DFO in random subspaces which allows us to better understand why working in low-dimensional subspaces often outperforms higher-dimensional subspaces.
Randomised Subspace Methods for Scalable Derivative-Free Optimisation
Published:
Most algorithms for optimising nonlinear functions rely on access to (possibly stochastic) derivative information. However, for problems including adversarial example generation for neural networks and fine-tuning large language models, good derivative information can be difficult to obtain and derivative-free optimisation (DFO) algorithms are beneficial. Although there are many approaches for DFO, they generally struggle to solve large-scale problems such as those arising in machine learning. In this talk, I will introduce new scalable DFO algorithms based on random subspaces and develop a novel average-case analysis of such algorithms.
Model Construction for Convex-Constrained Derivative-Free Optimization
Published:
In model-based derivative-free optimisation algorithms, black-box functions are typically approximated using polynomial interpolation models. Most existing model-based DFO methods for constrained optimisation assume the ability to construct sufficiently accurate interpolation models, but this is not always achievable when sampling only feasible points. In this talk, I will outline a new approximation theory for linear and quadratic interpolation models in the presence of convex constraints.
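To make 'sufficiently accurate' precise in the usual sense (this is the standard unconstrained definition from the model-based DFO literature, in generic notation; the talk's analysis adapts it to the constrained setting), a model $m_k$ is called fully linear on $B(x_k, \Delta_k)$ if

```latex
% Fully linear accuracy on the trust region B(x_k, \Delta_k) (standard
% unconstrained definition from the model-based DFO literature)
\[
|f(y) - m_k(y)| \le \kappa_{\mathrm{ef}}\,\Delta_k^2
\quad\text{and}\quad
\|\nabla f(y) - \nabla m_k(y)\| \le \kappa_{\mathrm{eg}}\,\Delta_k
\qquad \text{for all } y \in B(x_k, \Delta_k),
\]
```

for constants $\kappa_{\mathrm{ef}}, \kappa_{\mathrm{eg}}$ independent of $k$; the difficulty highlighted in the abstract is guaranteeing such bounds when all interpolation points must lie in the feasible set.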
An adaptively inexact first-order method for bilevel optimization with application to hyperparameter learning
Published:
A common problem in data science is the determination of model hyperparameters. One approach for learning hyperparameters is to use bilevel optimisation, where the lower-level problem is the standard learning optimisation problem, and the upper-level problem is to learn the hyperparameters (e.g. by minimising validation error). In this setting, particularly for large-scale problems, neither exact function values nor exact gradients are attainable, necessitating methods that only rely on inexact evaluation of such quantities. I will present a new bilevel optimisation algorithm with adaptive inexactness suitable for hyperparameter learning. Numerical results on problems from imaging demonstrate its robustness and strong performance. This is joint work with Mohammad Sadegh Salehi, Matthias Ehrhardt (University of Bath) and Subhadip Mukherjee (IIT Kharagpur).
Teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.