Line search

Given current parameters \(x_k\) and a descent direction \(p_k\), the goal of a line search method is to find a step size \(\alpha_k\) such that the one-dimensional function

\[\varphi(\alpha) \triangleq f(x_k + \alpha p_k)\]

is minimized at \(\alpha = \alpha_k\), or at least a sufficient decrease of its value is guaranteed.
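For concreteness, here is a minimal JAX sketch of this one-dimensional restriction; the quadratic objective, the choice of \(x_k\) and the steepest-descent choice of \(p_k\) are illustrative assumptions, not part of JAXopt.

import jax
import jax.numpy as jnp

# Toy objective (illustrative assumption).
def f(x):
    return jnp.sum((x - 1.0) ** 2)

x_k = jnp.zeros(3)           # current parameters
p_k = -jax.grad(f)(x_k)      # a descent direction (here: steepest descent)

# One-dimensional restriction of f along p_k.
def phi(alpha):
    return f(x_k + alpha * p_k)

print(phi(0.0), phi(0.25))   # phi(0.0) == f(x_k); the value decreases along p_k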

Sufficient decrease and curvature conditions

Exactly minimizing \(\varphi\) is often computationally costly. Instead, it is usually preferable to search for an \(\alpha_k\) that satisfies certain conditions. One such set of conditions is the Wolfe conditions

\[\begin{split}\begin{aligned} f(x_k + \alpha_k p_k) &\le f(x_k) + c_1 \alpha_k \nabla f(x_k)^\top p_k \\ \nabla f(x_k + \alpha_k p_k)^\top p_k &\ge c_2 \nabla f(x_k)^\top p_k \end{aligned}\end{split}\]

where \(0 < c_1 < c_2 < 1\). These conditions are explained in greater detail in Nocedal and Wright; see equations (3.6a) and (3.6b) there.

A step size may satisfy the Wolfe conditions without being particularly close to a minimizer of \(\varphi\) (Nocedal and Wright, Figure 3.5). The curvature condition in the second equation can be modified to force the step size to lie in at least a broad neighborhood of a stationary point of \(\varphi\). Combined with the sufficient decrease condition in the first equation, these are known as the strong Wolfe conditions

\[\begin{split}\begin{aligned} f(x_k + \alpha_k p_k) &\le f(x_k) + c_1 \alpha_k \nabla f(x_k)^\top p_k \\ |\nabla f(x_k + \alpha_k p_k)^\top p_k| &\le c_2 |\nabla f(x_k)^\top p_k| \end{aligned}\end{split}\]

where again \(0 < c_1 < c_2 < 1\). See Nocedal and Wright, equations (3.7a) and (3.7b).
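As a sanity check, the strong Wolfe conditions can be evaluated directly for a candidate step size. The sketch below does so on a toy quadratic; the objective, the values of \(c_1\), \(c_2\) and the candidate step are illustrative assumptions.

import jax
import jax.numpy as jnp

# Toy objective (illustrative assumption).
def f(x):
    return jnp.sum((x - 1.0) ** 2)

x_k = jnp.zeros(3)
g_k = jax.grad(f)(x_k)
p_k = -g_k                       # descent direction
c1, c2 = 1e-4, 0.9               # typical values, 0 < c1 < c2 < 1

def satisfies_strong_wolfe(alpha):
    slope_0 = g_k @ p_k                               # directional derivative at x_k
    value_a = f(x_k + alpha * p_k)
    slope_a = jax.grad(f)(x_k + alpha * p_k) @ p_k    # directional derivative at the trial point
    sufficient_decrease = value_a <= f(x_k) + c1 * alpha * slope_0
    curvature = jnp.abs(slope_a) <= c2 * jnp.abs(slope_0)
    return sufficient_decrease & curvature

print(satisfies_strong_wolfe(0.5))   # True: alpha = 0.5 reaches the minimizer of this quadratic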

Algorithms

  • jaxopt.BacktrackingLineSearch(fun[, ...]): Backtracking line search.

  • jaxopt.HagerZhangLineSearch(fun[, ...]): Hager-Zhang line search.

The BacktrackingLineSearch algorithm iteratively reduces the step size by a constant factor (the decrease_factor argument) until the selected condition above is satisfied. Example:

from jaxopt import BacktrackingLineSearch

ls = BacktrackingLineSearch(fun=fun, maxiter=20, condition="strong-wolfe",
                            decrease_factor=0.8)
stepsize, state = ls.run(init_stepsize=1.0, params=params,
                         descent_direction=descent_direction,
                         value=value, grad=grad)

where

  • init_stepsize is the first step size value to try,

  • params are the current parameters \(x_k\),

  • descent_direction is the provided descent direction \(p_k\) (optional, defaults to \(-\nabla f(x_k)\)),

  • value is the current value \(f(x_k)\) (optional, recomputed if not provided),

  • grad is the current gradient \(\nabla f(x_k)\) (optional, recomputed if not provided).

The returned state contains useful information such as state.params, which contains \(x_k + \alpha_k p_k\) and state.grad, which contains \(\nabla f(x_k + \alpha_k p_k)\).
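The example above leaves fun, params, value, grad and descent_direction abstract. Below is a minimal end-to-end sketch on a toy quadratic objective; the objective and the problem dimension are illustrative assumptions, while the solver calls follow the example above.

import jax
import jax.numpy as jnp
from jaxopt import BacktrackingLineSearch

# Toy objective (illustrative assumption).
def fun(x):
    return jnp.sum((x - 1.0) ** 2)

params = jnp.zeros(3)                          # x_k
value, grad = jax.value_and_grad(fun)(params)  # f(x_k) and its gradient
descent_direction = -grad                      # p_k

ls = BacktrackingLineSearch(fun=fun, maxiter=20, condition="strong-wolfe",
                            decrease_factor=0.8)
stepsize, state = ls.run(init_stepsize=1.0, params=params,
                         descent_direction=descent_direction,
                         value=value, grad=grad)

next_params = state.params   # x_k + alpha_k p_k
next_grad = state.grad       # gradient at the new point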

References:

  • Jorge Nocedal and Stephen J. Wright. Numerical Optimization. Second edition, Springer, 2006.
