General Unconstrained Minimization¶
Modeling¶
FirstOrderFunction
¶
-
class FirstOrderFunction¶
Instances of
FirstOrderFunction
implement the evaluation of a function and its gradient.class FirstOrderFunction { public: virtual ~FirstOrderFunction() {} virtual bool Evaluate(const double* const parameters, double* cost, double* gradient) const = 0; virtual int NumParameters() const = 0; };
-
bool FirstOrderFunction::Evaluate(const double *const parameters, double *cost, double *gradient) const¶
Evaluate the cost/value of the function. If
gradient
is notnullptr
then evaluate the gradient too. If evaluation is successful return,true
else returnfalse
.cost
guaranteed to be nevernullptr
,gradient
can benullptr
.
-
int FirstOrderFunction::NumParameters() const¶
Number of parameters in the domain of the function.
GradientProblem
¶
-
class GradientProblem¶
class GradientProblem {
public:
explicit GradientProblem(FirstOrderFunction* function);
GradientProblem(FirstOrderFunction* function,
Manifold* manifold);
int NumParameters() const;
int NumTangentParameters() const;
bool Evaluate(const double* parameters, double* cost, double* gradient) const;
bool Plus(const double* x, const double* delta, double* x_plus_delta) const;
};
Instances of GradientProblem
represent general non-linear
optimization problems that must be solved using just the value of the
objective function and its gradient. Unlike the Problem
class, which can only be used to model non-linear least squares
problems, instances of GradientProblem
not restricted in the
form of the objective function.
Structurally GradientProblem
is a composition of a
FirstOrderFunction
and optionally a Manifold
.
The FirstOrderFunction
is responsible for evaluating the cost
and gradient of the objective function.
The Manifold
is responsible for going back and forth between the
ambient space and the local tangent space. When a Manifold
is not
provided, then the tangent space is assumed to coincide with the ambient
Euclidean space that the gradient vector lives in.
The constructor takes ownership of the FirstOrderFunction
and
Manifold
objects passed to it.
-
void Solve(const GradientProblemSolver::Options &options, const GradientProblem &problem, double *parameters, GradientProblemSolver::Summary *summary)¶
Solve the given
GradientProblem
using the values inparameters
as the initial guess of the solution.
Solving¶
GradientProblemSolver::Options
¶
-
class GradientProblemSolver::Options¶
GradientProblemSolver::Options
controls the overall behavior of the solver. We list the various settings and their default values below.
-
bool GradientProblemSolver::Options::IsValid(std::string *error) const¶
Validate the values in the options struct and returns true on success. If there is a problem, the method returns false with
error
containing a textual description of the cause.
-
LineSearchDirectionType GradientProblemSolver::Options::line_search_direction_type¶
Default:
LBFGS
Choices are
STEEPEST_DESCENT
,NONLINEAR_CONJUGATE_GRADIENT
,BFGS
andLBFGS
.
-
LineSearchType GradientProblemSolver::Options::line_search_type¶
Default:
WOLFE
Choices are
ARMIJO
andWOLFE
(strong Wolfe conditions). Note that in order for the assumptions underlying theBFGS
andLBFGS
line search direction algorithms to be guaranteed to be satisfied, theWOLFE
line search should be used.
-
NonlinearConjugateGradientType GradientProblemSolver::Options::nonlinear_conjugate_gradient_type¶
Default:
FLETCHER_REEVES
Choices are
FLETCHER_REEVES
,POLAK_RIBIERE
andHESTENES_STIEFEL
.
-
int GradientProblemSolver::Options::max_lbfs_rank¶
Default: 20
The L-BFGS hessian approximation is a low rank approximation to the inverse of the Hessian matrix. The rank of the approximation determines (linearly) the space and time complexity of using the approximation. Higher the rank, the better is the quality of the approximation. The increase in quality is however is bounded for a number of reasons.
The method only uses secant information and not actual derivatives.
The Hessian approximation is constrained to be positive definite.
So increasing this rank to a large number will cost time and space complexity without the corresponding increase in solution quality. There are no hard and fast rules for choosing the maximum rank. The best choice usually requires some problem specific experimentation.
-
bool GradientProblemSolver::Options::use_approximate_eigenvalue_bfgs_scaling¶
Default:
false
As part of the
BFGS
update step /LBFGS
right-multiply step, the initial inverse Hessian approximation is taken to be the Identity. However, [Oren] showed that using instead \(I * \gamma\), where \(\gamma\) is a scalar chosen to approximate an eigenvalue of the true inverse Hessian can result in improved convergence in a wide variety of cases. Settinguse_approximate_eigenvalue_bfgs_scaling
to true enables this scaling inBFGS
(before first iteration) andLBFGS
(at each iteration).Precisely, approximate eigenvalue scaling equates to
\[\gamma = \frac{y_k' s_k}{y_k' y_k}\]With:
\[y_k = \nabla f_{k+1} - \nabla f_k\]\[s_k = x_{k+1} - x_k\]Where \(f()\) is the line search objective and \(x\) the vector of parameter values [NocedalWright].
It is important to note that approximate eigenvalue scaling does not always improve convergence, and that it can in fact significantly degrade performance for certain classes of problem, which is why it is disabled by default. In particular it can degrade performance when the sensitivity of the problem to different parameters varies significantly, as in this case a single scalar factor fails to capture this variation and detrimentally downscales parts of the Jacobian approximation which correspond to low-sensitivity parameters. It can also reduce the robustness of the solution to errors in the Jacobians.
-
LineSearchInterpolationType GradientProblemSolver::Options::line_search_interpolation_type¶
Default:
CUBIC
Degree of the polynomial used to approximate the objective function. Valid values are
BISECTION
,QUADRATIC
andCUBIC
.
-
double GradientProblemSolver::Options::min_line_search_step_size¶
The line search terminates if:
\[\|\Delta x_k\|_\infty < \text{min_line_search_step_size}\]where \(\|\cdot\|_\infty\) refers to the max norm, and \(\Delta x_k\) is the step change in the parameter values at the \(k\)-th iteration.
-
double GradientProblemSolver::Options::line_search_sufficient_function_decrease¶
Default:
1e-4
Solving the line search problem exactly is computationally prohibitive. Fortunately, line search based optimization algorithms can still guarantee convergence if instead of an exact solution, the line search algorithm returns a solution which decreases the value of the objective function sufficiently. More precisely, we are looking for a step size s.t.
\[f(\text{step_size}) \le f(0) + \text{sufficient_decrease} * [f'(0) * \text{step_size}]\]This condition is known as the Armijo condition.
-
double GradientProblemSolver::Options::max_line_search_step_contraction¶
Default:
1e-3
In each iteration of the line search,
\[\text{new_step_size} \geq \text{max_line_search_step_contraction} * \text{step_size}\]Note that by definition, for contraction:
\[0 < \text{max_step_contraction} < \text{min_step_contraction} < 1\]
-
double GradientProblemSolver::Options::min_line_search_step_contraction¶
Default:
0.6
In each iteration of the line search,
\[\text{new_step_size} \leq \text{min_line_search_step_contraction} * \text{step_size}\]Note that by definition, for contraction:
\[0 < \text{max_step_contraction} < \text{min_step_contraction} < 1\]
-
int GradientProblemSolver::Options::max_num_line_search_step_size_iterations¶
Default:
20
Maximum number of trial step size iterations during each line search, if a step size satisfying the search conditions cannot be found within this number of trials, the line search will stop.
As this is an ‘artificial’ constraint (one imposed by the user, not the underlying math), if
WOLFE
line search is being used, and points satisfying the Armijo sufficient (function) decrease condition have been found during the current search (in \(\leq\)max_num_line_search_step_size_iterations
). Then, the step size with the lowest function value which satisfies the Armijo condition will be returned as the new valid step, even though it does not satisfy the strong Wolfe conditions. This behaviour protects against early termination of the optimizer at a sub-optimal point.
-
int GradientProblemSolver::Options::max_num_line_search_direction_restarts¶
Default:
5
Maximum number of restarts of the line search direction algorithm before terminating the optimization. Restarts of the line search direction algorithm occur when the current algorithm fails to produce a new descent direction. This typically indicates a numerical failure, or a breakdown in the validity of the approximations used.
-
double GradientProblemSolver::Options::line_search_sufficient_curvature_decrease¶
Default:
0.9
The strong Wolfe conditions consist of the Armijo sufficient decrease condition, and an additional requirement that the step size be chosen s.t. the magnitude (‘strong’ Wolfe conditions) of the gradient along the search direction decreases sufficiently. Precisely, this second condition is that we seek a step size s.t.
\[\|f'(\text{step_size})\| \leq \text{sufficient_curvature_decrease} * \|f'(0)\|\]Where \(f()\) is the line search objective and \(f'()\) is the derivative of \(f\) with respect to the step size: \(\frac{d f}{d~\text{step size}}\).
-
double GradientProblemSolver::Options::max_line_search_step_expansion¶
Default:
10.0
During the bracketing phase of a Wolfe line search, the step size is increased until either a point satisfying the Wolfe conditions is found, or an upper bound for a bracket containing a point satisfying the conditions is found. Precisely, at each iteration of the expansion:
\[\text{new_step_size} \leq \text{max_step_expansion} * \text{step_size}\]By definition for expansion
\[\text{max_step_expansion} > 1.0\]
-
int GradientProblemSolver::Options::max_num_iterations¶
Default:
50
Maximum number of iterations for which the solver should run.
-
double GradientProblemSolver::Options::max_solver_time_in_seconds¶
Default:
1e6
Maximum amount of time for which the solver should run.
-
double GradientProblemSolver::Options::function_tolerance¶
Default:
1e-6
Solver terminates if
\[\frac{|\Delta \text{cost}|}{\text{cost}} \leq \text{function_tolerance}\]where, \(\Delta \text{cost}\) is the change in objective function value (up or down) in the current iteration of the line search.
-
double GradientProblemSolver::Options::gradient_tolerance¶
Default:
1e-10
Solver terminates if
\[\|x - \Pi \boxplus(x, -g(x))\|_\infty \leq \text{gradient_tolerance}\]where \(\|\cdot\|_\infty\) refers to the max norm, \(\Pi\) is projection onto the bounds constraints and \(\boxplus\) is Plus operation for the manifold associated with the parameter vector.
-
double GradientProblemSolver::Options::parameter_tolerance¶
Default:
1e-8
Solver terminates if
\[\|\Delta x\| \leq (\|x\| + \text{parameter_tolerance}) * \text{parameter_tolerance}\]where \(\Delta x\) is the step computed by the linear solver in the current iteration of the line search.
-
bool GradientProblemSolver::Options::minimizer_progress_to_stdout¶
Default:
false
By default the
Minimizer
progress is logged toSTDERR
depending on thevlog
level. If this flag is set to true, andGradientProblemSolver::Options::logging_type
is notSILENT
, the logging output is sent toSTDOUT
.The progress display looks like
0: f: 2.317806e+05 d: 0.00e+00 g: 3.19e-01 h: 0.00e+00 s: 0.00e+00 e: 0 it: 2.98e-02 tt: 8.50e-02 1: f: 2.312019e+05 d: 5.79e+02 g: 3.18e-01 h: 2.41e+01 s: 1.00e+00 e: 1 it: 4.54e-02 tt: 1.31e-01 2: f: 2.300462e+05 d: 1.16e+03 g: 3.17e-01 h: 4.90e+01 s: 2.54e-03 e: 1 it: 4.96e-02 tt: 1.81e-01
Here
f
is the value of the objective function.d
is the change in the value of the objective function if the step computed in this iteration is accepted.g
is the max norm of the gradient.h
is the change in the parameter vector.s
is the optimal step length computed by the line search.it
is the time take by the current iteration.tt
is the total time taken by the minimizer.
-
std::vector<IterationCallback> GradientProblemSolver::Options::callbacks¶
Callbacks that are executed at the end of each iteration of the
Minimizer
. They are executed in the order that they are specified in this vector. By default, parameter blocks are updated only at the end of the optimization, i.e., when theMinimizer
terminates. This behavior is controlled byGradientProblemSolver::Options::update_state_every_iteration
. If the user wishes to have access to the update parameter blocks when his/her callbacks are executed, then setGradientProblemSolver::Options::update_state_every_iteration
to true.The solver does NOT take ownership of these pointers.
-
bool GradientProblemSolver::Options::update_state_every_iteration¶
Default:
false
Normally the parameter vector is only updated when the solver terminates. Setting this to true updates it every iteration. This setting is useful when building an interactive application using Ceres and using an
IterationCallback
.
GradientProblemSolver::Summary
¶
-
class GradientProblemSolver::Summary¶
Summary of the various stages of the solver after termination.
-
std::string GradientProblemSolver::Summary::BriefReport() const¶
A brief one line description of the state of the solver after termination.
-
std::string GradientProblemSolver::Summary::FullReport() const¶
A full multiline description of the state of the solver after termination.
-
bool GradientProblemSolver::Summary::IsSolutionUsable() const¶
Whether the solution returned by the optimization algorithm can be relied on to be numerically sane. This will be the case if GradientProblemSolver::Summary:termination_type is set to CONVERGENCE, USER_SUCCESS or NO_CONVERGENCE, i.e., either the solver converged by meeting one of the convergence tolerances or because the user indicated that it had converged or it ran to the maximum number of iterations or time.
-
TerminationType GradientProblemSolver::Summary::termination_type¶
The cause of the minimizer terminating.
-
double GradientProblemSolver::Summary::initial_cost¶
Cost of the problem (value of the objective function) before the optimization.
-
double GradientProblemSolver::Summary::final_cost¶
Cost of the problem (value of the objective function) after the optimization.
-
std::vector<IterationSummary> GradientProblemSolver::Summary::iterations¶
IterationSummary
for each minimizer iteration in order.
-
int num_cost_evaluations¶
Number of times the cost (and not the gradient) was evaluated.
-
int num_gradient_evaluations¶
Number of times the gradient (and the cost) were evaluated.
-
double GradientProblemSolver::Summary::total_time_in_seconds¶
Time (in seconds) spent in the solver.
-
double GradientProblemSolver::Summary::cost_evaluation_time_in_seconds¶
Time (in seconds) spent evaluating the cost vector.
-
double GradientProblemSolver::Summary::gradient_evaluation_time_in_seconds¶
Time (in seconds) spent evaluating the gradient vector.
-
int GradientProblemSolver::Summary::num_tangent_parameters¶
Dimension of the tangent space of the problem. This is different from
GradientProblemSolver::Summary::num_parameters
if aManifold
object is used.
-
LineSearchDirectionType GradientProblemSolver::Summary::line_search_direction_type¶
Type of line search direction used.
-
LineSearchType GradientProblemSolver::Summary::line_search_type¶
Type of the line search algorithm used.
-
LineSearchInterpolationType GradientProblemSolver::Summary::line_search_interpolation_type¶
When performing line search, the degree of the polynomial used to approximate the objective function.