Davidon–Fletcher–Powell formula explained

The Davidon–Fletcher–Powell formula (or DFP; named after William C. Davidon, Roger Fletcher, and Michael J. D. Powell) finds the solution to the secant equation that is closest to the current estimate and satisfies the curvature condition. It was the first quasi-Newton method to generalize the secant method to a multidimensional problem. This update maintains the symmetry and positive definiteness of the Hessian matrix.

Given a function f(x), its gradient \nabla f, and a positive-definite Hessian matrix B, the Taylor series is

f(x_k + s_k) = f(x_k) + \nabla f(x_k)^T s_k + \frac{1}{2} s_k^T B s_k + \ldots,

and the Taylor series of the gradient itself (the secant equation)

\nabla f(x_k + s_k) = \nabla f(x_k) + B s_k + \ldots

is used to update B.
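For a purely quadratic objective the gradient expansion above has no higher-order terms, so the secant relation holds exactly. A minimal NumPy sketch (all names are ours, chosen for illustration):

```python
import numpy as np

# Hypothetical quadratic f(x) = 1/2 x^T B x - b^T x with SPD Hessian B,
# so grad f(x) = B x - b and the secant relation
# grad f(x_k + s_k) - grad f(x_k) = B s_k holds with no remainder.
rng = np.random.default_rng(42)
n = 4
M = rng.standard_normal((n, n))
B = M @ M.T + n * np.eye(n)       # symmetric positive-definite Hessian
b = rng.standard_normal(n)

grad = lambda x: B @ x - b

x_k = rng.standard_normal(n)
s_k = rng.standard_normal(n)      # an arbitrary step
y_k = grad(x_k + s_k) - grad(x_k)

print(np.allclose(y_k, B @ s_k))  # True: the secant equation is exact here
```

For a non-quadratic f the same relation holds only approximately, which is why quasi-Newton methods use it as a condition on the updated B rather than an identity.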

The DFP formula finds a solution that is symmetric, positive-definite and closest to the current approximate value of B_k:

B_{k+1} = (I - \gamma_k y_k s_k^T)\, B_k \,(I - \gamma_k s_k y_k^T) + \gamma_k y_k y_k^T,

where

y_k = \nabla f(x_k + s_k) - \nabla f(x_k),

\gamma_k = \frac{1}{y_k^T s_k},

and B_k is a symmetric and positive-definite matrix.
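The update can be sketched in a few lines of NumPy (function and variable names are ours, not from the text). Given a pair (s_k, y_k) satisfying the curvature condition y_k^T s_k > 0, the update keeps the approximation symmetric and enforces the secant equation B_{k+1} s_k = y_k:

```python
import numpy as np

def dfp_update(B, s, y):
    # One DFP update: (I - gamma y s^T) B (I - gamma s y^T) + gamma y y^T
    gamma = 1.0 / (y @ s)
    E = np.eye(len(s)) - gamma * np.outer(y, s)   # I - gamma_k y_k s_k^T
    return E @ B @ E.T + gamma * np.outer(y, y)   # note E.T = I - gamma_k s_k y_k^T

rng = np.random.default_rng(0)
n = 4
B_k = np.eye(n)                     # symmetric positive-definite starting matrix
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)         # SPD model Hessian used only to generate data
s_k = rng.standard_normal(n)
y_k = A @ s_k                       # then y_k^T s_k = s_k^T A s_k > 0 (curvature)

B_next = dfp_update(B_k, s_k, y_k)
print(np.allclose(B_next @ s_k, y_k))   # True: secant equation holds
print(np.allclose(B_next, B_next.T))    # True: symmetry is preserved
```

Note that (I − γ_k s_k y_k^T) is the transpose of (I − γ_k y_k s_k^T), which is why the product form automatically preserves symmetry.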

The corresponding update to the inverse Hessian approximation H_k = B_k^{-1} is given by

H_{k+1} = H_k - \frac{H_k y_k y_k^T H_k}{y_k^T H_k y_k} + \frac{s_k s_k^T}{y_k^T s_k}.
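The two updates are consistent: applying the rank-two formula for H_{k+1} to H_k = B_k^{-1} yields exactly the inverse of the updated B_{k+1}. A sketch, under our naming:

```python
import numpy as np

def dfp_update_B(B, s, y):
    # DFP update of the Hessian approximation B_k
    gamma = 1.0 / (y @ s)
    E = np.eye(len(s)) - gamma * np.outer(y, s)
    return E @ B @ E.T + gamma * np.outer(y, y)

def dfp_update_H(H, s, y):
    # DFP update of the inverse Hessian approximation H_k = B_k^{-1}
    Hy = H @ y
    return H - np.outer(Hy, Hy) / (y @ Hy) + np.outer(s, s) / (y @ s)

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
B_k = M @ M.T + n * np.eye(n)       # SPD current approximation
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)
s_k = rng.standard_normal(n)
y_k = A @ s_k                       # curvature condition holds by construction

B_next = dfp_update_B(B_k, s_k, y_k)
H_next = dfp_update_H(np.linalg.inv(B_k), s_k, y_k)
print(np.allclose(H_next, np.linalg.inv(B_next)))  # True: H-update inverts B-update
```

The H form is what is used in practice, since it replaces a linear solve per iteration with a matrix-vector product.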

B is assumed to be positive-definite, and the vectors s_k and y_k must satisfy the curvature condition

s_k^T y_k = s_k^T B s_k > 0.

The DFP formula is quite effective, but it was soon superseded by the Broyden–Fletcher–Goldfarb–Shanno formula, which is its dual (interchanging the roles of y and s).[1]
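The duality can be checked directly: swapping the roles of s_k and y_k (and of B and H) in the DFP formula for H_{k+1} produces the BFGS update of B_k. A hedged sketch (names ours; the BFGS formula itself is not stated in this article):

```python
import numpy as np

def dfp_update_H(H, s, y):
    # DFP rank-two update of the inverse Hessian approximation
    Hy = H @ y
    return H - np.outer(Hy, Hy) / (y @ Hy) + np.outer(s, s) / (y @ s)

def bfgs_update_B(B, s, y):
    # Duality: the DFP H-formula with (H, s, y) -> (B, y, s) is the
    # BFGS update B - B s s^T B / (s^T B s) + y y^T / (y^T s).
    return dfp_update_H(B, y, s)

rng = np.random.default_rng(2)
n = 4
B_k = np.eye(n)
A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)
s_k = rng.standard_normal(n)
y_k = A @ s_k                              # curvature condition holds

B_next = bfgs_update_B(B_k, s_k, y_k)
print(np.allclose(B_next @ s_k, y_k))      # True: secant equation holds
print(np.allclose(B_next, B_next.T))       # True: symmetry is preserved
```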

Compact representation

By unwinding the matrix recurrence for B_k, the DFP formula can be expressed as a compact matrix representation. Specifically, defining

S_k = \begin{bmatrix} s_0 & s_1 & \ldots & s_{k-1} \end{bmatrix}, \quad Y_k = \begin{bmatrix} y_0 & y_1 & \ldots & y_{k-1} \end{bmatrix},

and upper triangular and diagonal matrices

\big(R_k\big)_{ij} := \big(R^{YS}_k\big)_{ij} = y^T_{i-1} s_{j-1}, \quad \big(R^{SY}_k\big)_{ij} = s^T_{i-1} y_{j-1}, \quad (D_k)_{ii} := \big(D^{YS}_k\big)_{ii} = y^T_{i-1} s_{i-1}, \quad \text{for } 1 \le i \le j \le k,

the DFP matrix has the equivalent formula

B_k = B_0 + J_k N^{-1}_k J^T_k,

J_k = \begin{bmatrix} Y_k & Y_k - B_0 S_k \end{bmatrix},

N_k = \begin{bmatrix} 0_{k \times k} & R^{YS}_k \\ \big(R^{YS}_k\big)^T & R_k + R^T_k - (D_k + S^T_k B_0 S_k) \end{bmatrix}.
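The equivalence can be verified numerically: build B_k by applying the DFP recurrence k times, then compare against B_0 + J_k N_k^{-1} J_k^T assembled from the stored pairs. A sketch, with R_k taken as the upper triangular part of Y_k^T S_k and all variable names ours:

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 5, 3

A = rng.standard_normal((n, n))
A = A @ A.T + n * np.eye(n)               # SPD model Hessian for generating data
S = rng.standard_normal((n, k))           # columns s_0, ..., s_{k-1}
Y = A @ S                                 # y_i = A s_i, so y_i^T s_i > 0

B0 = np.eye(n)
B = B0.copy()
for i in range(k):                        # the unwound DFP recurrence
    s, y = S[:, i], Y[:, i]
    gamma = 1.0 / (y @ s)
    E = np.eye(n) - gamma * np.outer(y, s)
    B = E @ B @ E.T + gamma * np.outer(y, y)

R = np.triu(Y.T @ S)                      # R_k = R_k^{YS}, upper triangular
D = np.diag(np.diag(Y.T @ S))             # D_k
J = np.hstack([Y, Y - B0 @ S])            # J_k
N = np.block([[np.zeros((k, k)), R],
              [R.T, R + R.T - (D + S.T @ B0 @ S)]])
B_compact = B0 + J @ np.linalg.solve(N, J.T)
print(np.allclose(B, B_compact))          # True: compact form matches recurrence
```

In limited-memory settings only the most recent pairs (columns of S_k and Y_k) are kept, so the small matrices R_k and D_k stay cheap to form and N_k stays small.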

The inverse compact representation can be found by applying the Sherman–Morrison–Woodbury formula to B_k. The compact representation is particularly useful for limited-memory and constrained problems.[2]

Notes and References

  1. Avriel, Mordecai (1976). Nonlinear Programming: Analysis and Methods. Prentice-Hall. pp. 352–353. ISBN 0-13-623603-0.
  2. Brust, J. J. (2024). "Useful Compact Representations for Data-Fitting". arXiv:2403.12206 [math.OC].