New Optimizer Development

New optimizers can be developed within modOpt. Developing optimizers in modOpt offers several advantages. For instance, stable and efficient modules already available in modOpt can be reused for the new algorithm, eliminating the need for developers to implement these components from scratch. Since all optimizers in modOpt must be derived from the Optimizer base class, tools for checking first derivatives, visualizing, recording, and hot-starting are automatically inherited by new optimizers.

Subclasses derived from Optimizer must implement an initialize method that sets the solver_name and declares any optimizer-specific options. Developers are required to define the available_outputs attribute within the initialize method. This attribute specifies the data that the optimizer will provide after each iteration of the algorithm by calling the update_outputs method. Developers must also define a setup method to handle any pre-processing of the problem data and configuration of the optimizer’s modules.

The core of an optimizer in modOpt lies in the solve method. This method implements the numerical algorithm and iteratively calls the ‘_compute’ methods from the problem object. Upon completion of the optimization, the solve method should assign a results attribute that holds the optimization results in the form of a dictionary. Developers may optionally implement a print_results method to override the default implementation provided by the base class and customize the presentation of the results.

Developers may need to implement additional methods for setting up constraints, their bounds, and derivatives, depending on their optimizer.

Note

Since HDF5 files from optimizer recording are incompatible with text editors, developers can provide users with the readable_outputs option during optimizer instantiation to export optimizer-generated data as plain text files. For each variable listed in readable_outputs, a separate file is generated, with rows representing optimizer iterations. While creating a new optimizer, developers may declare this option so that users are able to take advantage of this feature already implemented in the Optimizer base class. The list of variables allowed for readable_outputs is any subset of the keys in the available_outputs attribute.

Developing the BFGS optimizer

The following example shows the implementation of the BFGS algorithm for unconstrained optimization, employing modules for approximating Hessians and performing line searches. The Hessians are approximated using a damped BFGS algorithm and the line searches enforce strong Wolfe conditions. The code also demonstrates how to document the API for a new optimizer.

import numpy as np
import time
from modopt import Optimizer
from modopt.line_search_algorithms import Minpack2LS
from modopt.approximate_hessians import BFGSScipy as BFGS

class QuasiNewton(Optimizer):
    """
    Quasi-Newton method for unconstrained optimization.

    Parameters
    ----------
    problem : Problem or ProblemLite
        Object containing the problem to be solved.
    recording : bool, default=False
        If ``True``, record all outputs from the optimization.
        This needs to be enabled for hot-starting the same problem later,
        if the optimization is interrupted.
    out_dir : str, optional
        The directory to store all the output files generated from the optimization.
    hot_start_from : str, optional
        The record file from which to hot-start the optimization.
    hot_start_atol : float, default=0.
        The absolute tolerance check for the inputs
        when reusing outputs from the hot-start record.
    hot_start_rtol : float, default=0.
        The relative tolerance check for the inputs
        when reusing outputs from the hot-start record.
    visualize : list, default=[]
        The list of scalar variables to visualize during the optimization.
    keep_viz_open : bool, default=False
        If ``True``, keep the visualization window open after the optimization is complete.
    turn_off_outputs : bool, default=False
        If ``True``, prevent modOpt from generating any output files.

    maxiter : int, default=500
        Maximum number of iterations.
    opt_tol : float, default=1e-6
        Optimality tolerance.
        Certifies convergence when the 2-norm of the gradient is less than this value.
    readable_outputs : list, default=[]
        List of outputs to be written to readable text output files.
        Available outputs are: 'itr', 'obj', 'x', 'opt', 'time', 'nfev', 'ngev', 'step'.
    """
    def initialize(self):
        self.solver_name = 'bfgs'

        self.obj = self.problem._compute_objective
        self.grad = self.problem._compute_objective_gradient

        self.options.declare('maxiter', default=500, types=int)
        self.options.declare('opt_tol', types=float, default=1e-6)
        self.options.declare('readable_outputs', types=list, default=[])

        self.available_outputs = {
            'itr': int,
            'obj': float,
            # for arrays from each iteration, sizes need to be declared
            'x': (float, (self.problem.nx, )),
            'opt': float,
            'time': float,
            'nfev': int,
            'ngev': int,
            'step': float,
        }

    def setup(self):
        self.LS = Minpack2LS(f=self.obj, g=self.grad)
        self.QN = BFGS(nx=self.problem.nx,
                       exception_strategy='damp_update')

    def solve(self):
        # Assign shorter names to variables and methods
        opt_tol = self.options['opt_tol']
        maxiter = self.options['maxiter']

        obj = self.obj
        grad = self.grad

        start_time = time.time()

        # Set initial values for current iterates
        x_k = self.problem.x0 * 1.
        f_k = obj(x_k)
        g_k = grad(x_k)

        # Iteration counter
        itr = 0

        opt = np.linalg.norm(g_k)   # optimality measure
        nfev = 1                    # number of objective function evaluations
        ngev = 1                    # number of objective gradient evaluations

        # Initializing declared outputs
        self.update_outputs(itr=0,
                            x=x_k,
                            obj=f_k,
                            opt=opt,
                            time=time.time() - start_time,
                            nfev=nfev,
                            ngev=ngev,
                            step=0.)

        while (opt > opt_tol and itr < maxiter):
            itr_start = time.time()
            itr += 1

            # Hessian approximation
            B_k = self.QN.B_k

            # ALGORITHM STARTS HERE
            # >>>>>>>>>>>>>>>>>>>>>

            # Compute the search direction toward the next iterate
            p_k = np.linalg.solve(B_k, -g_k)

            # Compute the step length along the search direction via a line search
            alpha, f_k, g_new, slope_new, new_f_evals, new_g_evals, converged = self.LS.search(
                x=x_k, p=p_k, f0=f_k, g0=g_k)

            nfev += new_f_evals
            ngev += new_g_evals

            # A step of length 1e-4 is taken along p_k if line search does not converge
            if not converged:
                alpha = 1e-4
                d_k = p_k * alpha

                x_k += d_k
                f_k = obj(x_k)

                g_new = grad(x_k)
                w_k = g_new - g_k
                g_k = g_new

            else:
                d_k = alpha * p_k
                x_k += d_k

                w_k = g_new - g_k
                g_k = g_new

            opt = np.linalg.norm(g_k)

            # Update the Hessian approximation
            self.QN.update(d_k, w_k)

            # <<<<<<<<<<<<<<<<<<<
            # ALGORITHM ENDS HERE

            # Update arrays inside outputs dict with new values from the current iteration
            self.update_outputs(itr=itr,
                                x=x_k,
                                obj=f_k,
                                opt=opt,
                                time=time.time() - start_time,
                                nfev=nfev,
                                ngev=ngev,
                                step=alpha)

        self.total_time = time.time() - start_time

        self.results = {
            'x': x_k, 
            'objective': f_k, 
            'optimality': opt, 
            'nfev': nfev, 
            'ngev': ngev,
            'niter': itr, 
            'time': self.total_time,
            'converged': opt <= opt_tol,
            }
        
        # Run post-processing for the Optimizer() base class
        self.run_post_processing()
        
        return self.results

Solving an unconstrained problem using the BFGS optimizer

In the following code, we solve the problem

\[ \underset{x_1, x_2 \in \mathbb{R}}{\text{minimize}} \quad x_1^2 + x_2^2 \]

using the QuasiNewton optimizer we developed above.

from modopt import ProblemLite
name = 'quartic'
x0 = np.array([500., 5.])
obj = lambda x: np.sum(x**2)
grad = lambda x: 2 * x
prob = ProblemLite (name=name, x0=x0, obj=obj, grad =grad)

optimizer = QuasiNewton(problem=prob, maxiter=100, opt_tol=1e-6)
results = optimizer.solve()
optimizer.print_results(summary_table=True, all=True)

	Solution from modOpt:
	----------------------------------------------------------------------------------------------------
	Problem                  : quartic
	Solver                   : bfgs
	x                        : [0. 0.]
	objective                : 0.0
	optimality               : 0.0
	nfev                     : 4
	ngev                     : 4
	niter                    : 2
	time                     : 0.010946035385131836
	converged                : True
	total_callbacks          : 8
	obj_evals                : 4
	grad_evals               : 4
	hess_evals               : 0
	con_evals                : 0
	jac_evals                : 0
	reused_callbacks         : 0
	out_dir                  : quartic_outputs/2025-01-27_12.49.49.006498
	----------------------------------------------------------------------------------------------------

================================================================================================================
                                             modOpt summary table:                                              
================================================================================================================
         #        itr              obj              opt             time       nfev       ngev             step 
         0          0     2.500250E+05     1.000050E+03     1.288891E-03          1          1     0.000000E+00 
         1          1     2.500250E-03     1.000050E-01     1.006889E-02          3          3     4.999500E-01 
         2          2     0.000000E+00     0.000000E+00     1.075673E-02          4          4     1.000000E+00 
================================================================================================================

Modules

Reusable modules are available for line searches, merit functions, Hessian approximations, and quadratic programming. For more details on the Optimizer class or any of the modules, visit the API Reference page.

Advanced users should note that the repository includes additional modules beyond those documented in the API Reference. Exploring the directories corresponding to the respective module categories might be helpful.