Chapter 10. Calculating and using derivatives

In previous chapters, we have referred several times to derivatives, that is, to gradients and Hessians of functions. In practice, having good derivative information is important for obtaining solutions and for confirming that we have a valid solution. This chapter will look at ways in which we can acquire and use such information.

10.1 Why and how

Derivative information is important

  • because the KKT conditions (Karush, 1939; Kuhn and Tucker, 1951) for a minimum require first derivatives to be “zero” and the Hessian, that is, the second-derivative matrix, to be positive definite;
  • because many methods can use gradient information.

Indeed, even methods that claim to be “derivative free” will often use the concepts of gradients and Hessians, either for the function to be minimized or for an approximating model.
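To illustrate the first point in the list above, the following sketch computes a numerical gradient and Hessian at a candidate solution and inspects the Hessian eigenvalues. It assumes the numDeriv package is available; the Rosenbrock test function and the point being checked are chosen purely for illustration.

library(numDeriv)

frosen <- function(x) {
  100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
}

xstar <- c(1, 1)            # candidate solution to be tested
g <- grad(frosen, xstar)    # numerical approximation to the gradient
H <- hessian(frosen, xstar) # numerical approximation to the Hessian

cat("max |gradient element|:", max(abs(g)), "\n")
cat("Hessian eigenvalues:", eigen(H, symmetric = TRUE)$values, "\n")
## A small gradient and all-positive eigenvalues suggest (but do not prove)
## that xstar satisfies the conditions for a local minimum.

Note that this check itself relies on approximate derivatives, so the tolerances used to judge “small” and “positive” matter in practice.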

It is my experience that the main utility of good derivative information is in testing that we indeed have a solution. That is, it is useful for termination tests and improves performance because it allows us to cease trying to proceed when our journey is complete. In some cases, approximate derivatives may actually give better performance for some gradient methods in initial steps when we are far from the solution. This is similar to secant methods outperforming Newton methods in early iterations.
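As a small sketch of the difference a supplied gradient can make, the code below runs base R's optim() with the BFGS method both without a gradient function (so a numerical approximation is used) and with an analytic gradient. The Rosenbrock function, its gradient, and the starting point are assumptions made only for this example.

frosen <- function(x) {
  100 * (x[2] - x[1]^2)^2 + (1 - x[1])^2
}
grosen <- function(x) {  # analytic gradient of frosen
  c(-400 * x[1] * (x[2] - x[1]^2) - 2 * (1 - x[1]),
     200 * (x[2] - x[1]^2))
}
x0 <- c(-1.2, 1)

resn <- optim(x0, frosen, method = "BFGS")          # numerical gradient approximation
resa <- optim(x0, frosen, grosen, method = "BFGS")  # analytic gradient supplied

resn$counts  # function and gradient evaluation counts
resa$counts

Comparing the evaluation counts and the reported function values gives a rough feel for when accurate derivatives pay off and when an approximation is adequate.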

Unfortunately, the calculation of derivatives is not a trivial task. This chapter looks at some approaches and presents some recommendations ...
