Most of the limitations of NRM in multiple dimensions are the same as those for Newton's method in one dimension. The main issues stem from three major areas (as usual): the initial point, the derivatives, and the function evaluations.
Our initial value for Newton's method greatly determines its effectiveness. The closer our initial guess is to our minimum, the better the method will work. However, this requires some intuition about the function, which can get quite complex. Let's take a look at two different initial points, (20,20) and (2,2). The graph for the (20,20) run is below.
As you can see, the method is flying back and forth past our minimum. However, it eventually converges to (0.9987, 0.4987), our target minimum. Below I have included a matrix containing our initial step and all of the following steps (taken from the MATLAB output).
Now let's try it again with our other initial point, (2,2). This is already close to our minimum, so it shouldn't take too many iterations.
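The runs above were done in MATLAB, but if you want to reproduce the comparison yourself, here is a rough Python/NumPy sketch of the same iteration for f(x,y) = x^2 - x + cos(x+y) + y^2, started from both points. The gradient and Hessian are worked out by hand, and the tolerance, iteration cap, and use of a linear solve instead of an explicit inverse are my own choices, not taken from the original run.

```python
import numpy as np

# f(x, y) = x^2 - x + cos(x + y) + y^2
def grad(p):
    x, y = p
    return np.array([2*x - 1 - np.sin(x + y),
                     2*y - np.sin(x + y)])

def hess(p):
    x, y = p
    c = np.cos(x + y)
    return np.array([[2 - c,    -c],
                     [   -c, 2 - c]])

def newton(p0, tol=1e-8, max_iter=200):
    """Newton-Raphson iteration: x_{k+1} = x_k - H(x_k)^{-1} g(x_k)."""
    p = np.asarray(p0, dtype=float)
    for k in range(1, max_iter + 1):
        step = np.linalg.solve(hess(p), grad(p))  # H^{-1} g via a linear solve
        p = p - step
        if np.linalg.norm(step) < tol:            # x_k ≈ x_{k+1}, so stop
            return p, k
    return p, max_iter

for start in [(20, 20), (2, 2)]:
    p, iters = newton(start)
    print(f"start {start}: converged to {p.round(4)} in {iters} iterations")
```

Both runs should end up at roughly (0.9987, 0.4987); the interesting part is how many more steps the far starting point needs.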
The closer initial starting point converged in a fifth of the iterations, which means fewer function evaluations. This brings us to our next two issues: derivatives and function evaluations. We need to be able to find the Hessian of our function as well as the gradient. If we cannot compute these two, we will need to pick a different method or modify our existing one. Most problems stem from the Hessian. Sometimes the Hessian cannot be found; other times it can be found but not inverted; most of the time, though, it is just too expensive to evaluate or invert. Mathematicians have come up with alternative methods called quasi-Newton methods (these actually stem from the secant methods I mentioned before) which seek to remedy this issue.

Like the one-dimensional version, there are a lot of function evaluations. Every partial derivative that makes up the gradient must be evaluated, as well as all of the second derivatives that make up the Hessian. The Hessian then needs to be inverted, which is expensive in higher dimensions.

Now you have two tools for finding minima in higher dimensions. Up next we will explore new, more advanced methods for optimization.

-Marcello




We now take the inverse of the Hessian and then find our next point. The iterative equation is listed below as well as the output.
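For reference, the update being described here is the standard multivariate Newton-Raphson step,

x_{k+1} = x_k - H(x_k)^{-1} g(x_k),

where g is the gradient and H is the Hessian of f, both evaluated at the current point x_k.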


This looks an awful lot like our iterative step from our one-dimensional Newton's method. As we get closer to our minimum, the gradient g of our function moves toward 0, which makes x_k ≈ x_{k+1}. This is when we know our method has found a minimum (hopefully). Hopefully this all feels familiar. Next post we will see how Newton-Raphson works.

-Marcello
From this point we must calculate the gradient. I have taken the gradient of our function f(x,y) = x^2 - x + cos(y+x) + y^2. This gradient must now be evaluated at our point (8,8).
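As a rough sketch of that evaluation (the gradient and Hessian expressions below are my own hand-derived ones for this f, and Python/NumPy stands in for whatever environment the post used):

```python
import numpy as np

x, y = 8.0, 8.0                     # the point from the post

# Hand-derived gradient of f(x, y) = x^2 - x + cos(x + y) + y^2
g = np.array([2*x - 1 - np.sin(x + y),
              2*y - np.sin(x + y)])

# Hand-derived Hessian of the same function
c = np.cos(x + y)
H = np.array([[2 - c,    -c],
              [   -c, 2 - c]])

print("gradient at (8, 8):", g)
print("Hessian at (8, 8):\n", H)
print("next Newton iterate:", np.array([x, y]) - np.linalg.solve(H, g))
```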





We have an infinite number of directions to move in, but let's choose four test cases. We will choose the gradient, the negative gradient, and two other directions. For our initial point let's choose (-6,8); at this point we evaluate our gradient. Mapping these directions (all with a step of 0.1) produces the following graph.
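To make the direction test concrete, here is a small sketch that evaluates f after a step of length 0.1 along the gradient, the negative gradient, and two other directions from (-6, 8). The two extra directions are arbitrary picks of mine (the post does not say which ones it used), and the gradient formula is the hand-derived one from earlier.

```python
import numpy as np

def f(p):
    x, y = p
    return x**2 - x + np.cos(x + y) + y**2

def grad(p):
    x, y = p
    return np.array([2*x - 1 - np.sin(x + y),
                     2*y - np.sin(x + y)])

p0 = np.array([-6.0, 8.0])
g = grad(p0)

# Four test directions: gradient, negative gradient, and two arbitrary ones
directions = {
    "gradient":          g,
    "negative gradient": -g,
    "direction (1, 0)":  np.array([1.0, 0.0]),
    "direction (0, 1)":  np.array([0.0, 1.0]),
}

step = 0.1
print(f"f at start point: {f(p0):.4f}")
for name, d in directions.items():
    unit = d / np.linalg.norm(d)      # normalize so every step has length 0.1
    print(f"{name:20s} -> f = {f(p0 + step * unit):.4f}")
```

Running this, the step along the negative gradient should give the largest decrease in f, which is exactly what the graph was illustrating.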
So you probably realized where we were headed. Following the steps above, we have determined our iterative step. Like our previous one-dimensional search methods, our gradient search method iterates the above equation until a tolerance condition is met (a quick sketch of this loop is included below). The method described above is called the Method of Steepest Descent, or Gradient Descent. Next post we will go over the method with a real example.

-Marcello
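And for completeness, a minimal sketch of that loop, written as x_{k+1} = x_k - alpha * g(x_k) and using the hand-derived gradient from before. The fixed step size alpha and the tolerance are illustrative choices of mine; the post has not yet said how it picks them.

```python
import numpy as np

def grad(p):
    x, y = p
    return np.array([2*x - 1 - np.sin(x + y),
                     2*y - np.sin(x + y)])

def steepest_descent(p0, alpha=0.1, tol=1e-6, max_iter=10_000):
    """Iterate x_{k+1} = x_k - alpha * g(x_k) until the gradient is tiny."""
    p = np.asarray(p0, dtype=float)
    for k in range(1, max_iter + 1):
        g = grad(p)
        if np.linalg.norm(g) < tol:   # tolerance condition
            return p, k
        p = p - alpha * g
    return p, max_iter

p, iters = steepest_descent((-6, 8))
print(f"minimum ≈ {p.round(4)} found after {iters} iterations")
```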