fminunc with gradient slower than without

Hi there,

I have a problem with fminunc. When I supply the gradient and turn the GradObj option on, the minimization is much slower. Here is a simple example: I minimize the function ff(x) = x^4 with fminunc.
This is what I get:

function [e, g] = ff(x)
% Objective value e and its analytic gradient g.
e = x^4;
g = 4*(x^3);

tic, for i=1:500; fminunc(@ff, 100, optimset('gradobj', 'on')); end; toc
Elapsed time is 16.014073 seconds.

tic, for i=1:500; fminunc(@ff, 100, optimset('gradobj', 'off')); end; toc
Elapsed time is 6.857243 seconds.

version 7.9.0.529 (R2009b)

How is it possible that the run with the gradient on takes more than twice as long as the run with the gradient off? Shouldn't the run without the gradient be the slower one, since it has to spend time approximating the gradient? Or, even if that cost is negligible in the 1-D case, shouldn't the two take about the same time?

thanks for your help,

best, 

 ste
Reply stefano 11/17/2009 6:10:21 PM

This is because fminunc runs a different algorithm depending on whether you provide the gradient or not. 

The default setting of the 'LargeScale' option is 'on', but the large-scale algorithm only runs when you supply the gradient yourself (GradObj set to 'on'). Otherwise, fminunc runs the medium-scale quasi-Newton algorithm.

It is plausible that the large-scale algorithm takes more computation time than the medium-scale one when the problem is small and the objective is inexpensive, which appears to be the case here. A better test is to compare the number of iterations taken, or to turn the LargeScale option off and then run with and without the gradient.
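
A minimal sketch of that comparison, assuming ff.m from the original post is saved on the MATLAB path and that the optimset option names from this release (R2009b) apply. The variable names are only illustrative:

```matlab
% Assumes ff.m from the original post is on the MATLAB path.
x0 = 100;

% Medium-scale (quasi-Newton) algorithm, with and without the user gradient.
optsGrad   = optimset('LargeScale', 'off', 'GradObj', 'on',  'Display', 'off');
optsNoGrad = optimset('LargeScale', 'off', 'GradObj', 'off', 'Display', 'off');

[xG, fG, flagG, outG] = fminunc(@ff, x0, optsGrad);
[xN, fN, flagN, outN] = fminunc(@ff, x0, optsNoGrad);

% Large-scale algorithm: only used when GradObj is 'on'.
optsLarge = optimset('LargeScale', 'on', 'GradObj', 'on', 'Display', 'off');
[xL, fL, flagL, outL] = fminunc(@ff, x0, optsLarge);

% Compare the work done by each run, not only the wall-clock time.
fprintf('%s: %d iterations, %d function evaluations\n', ...
    outG.algorithm, outG.iterations, outG.funcCount);
fprintf('%s (no gradient): %d iterations, %d function evaluations\n', ...
    outN.algorithm, outN.iterations, outN.funcCount);
fprintf('%s: %d iterations, %d function evaluations\n', ...
    outL.algorithm, outL.iterations, outL.funcCount);
```

The fourth output of fminunc is a structure whose iterations, funcCount, and algorithm fields show which algorithm actually ran and how much work it did, so the first two runs give a like-for-like comparison of gradient on versus off.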

-Steve
Reply Steve 11/17/2009 7:04:20 PM

Indeed, this solves my problem.
Thanks a lot!
Reply stefano 11/17/2009 8:39:19 PM
