COMPGROUPS.NET | Search | Post Question | Groups | Stream | About | Register

### Nonlinear curve fit fails to converge

• Follow

```I have data that exhibit a transition from one state to another as time passes.  The initial and final values are unknown but are expected to be constant, and the sharpness and timing of the transition vary, but should be approximately exponential in shape.  In short, I'll need at least 4 free parameters (and probably 5) for a description of the data.

Using TableCurve2D (which claims to use proprietary tweaks to Leverberg-Marquardt), I can fit them beautifully and without difficulty with an asymmetric sigmoidal form with reverse asymmetry.  However, I'd like to automate the fitting process using MatLab.

I've found that if I encode the function in MatLab and use "fit", it either fails to converge, or it comes up with utterly unreasonable values for the free parameters.  However, if I feed it something very close to the output of TableCurve for starting guesses, then MatLab is well-behaved and the parameters aren't altered.

A representative dataset can be found here:

http://bowdoin.edu/faculty/m/mbattle/matlab/

Here's the relevant piece of my MatLab code (with starting parameters that give an excellent fit):

fo_ = fitoptions('method','NonlinearLeastSquares','Lower',[-Inf -Inf -Inf -Inf    0]);
st_ = [-32.0187 141.47 0.7675  0.000217 0.493 ];
set(fo_,'Startpoint',st_);
ft_ = fittype('A+B*(1.0-(1.0+exp((x+D*log(2.0.^(1.0/E)-1.0)-C)/D)).^(-E))',...
'dependent',{'y'},'independent',{'x'},...
'coefficients',{'A', 'B', 'C', 'D', 'E'});

cf_ = fit(xdata,ydata,ft_,fo_);

If I use the following starting values instead:

st_ = [-32.0 140.0 0.75 0.0004 0.51 ];

MatLab "optimizes" to the following coefficients

-47.0175  84.025   0.7670   0.000385  25.009

which give an awful fit.

Does anyone have any idea why MatLab should be so finicky about starting values and so unstable in its minimization algorithm?  Is there some other approach I should take?  I've tried both "trust region" and "Levenberg-Marquardt" with equal (i.e no) success.

Thanks in advance for any help you can offer.

Mark
```
 0
Reply Mark 10/18/2010 7:50:05 PM

```"Mark BAttle" <mabttle@bowdoinremovethis.edu> wrote in message <i9i8dd\$cek\$1@fred.mathworks.com>...
> I have data that exhibit a transition from one state to another as time passes.  The initial and final values are unknown but are expected to be constant, and the sharpness and timing of the transition vary, but should be approximately exponential in shape.  In short, I'll need at least 4 free parameters (and probably 5) for a description of the data.
>
> Using TableCurve2D (which claims to use proprietary tweaks to Leverberg-Marquardt), I can fit them beautifully and without difficulty with an asymmetric sigmoidal form with reverse asymmetry.  However, I'd like to automate the fitting process using MatLab.
>
> I've found that if I encode the function in MatLab and use "fit", it either fails to converge, or it comes up with utterly unreasonable values for the free parameters.  However, if I feed it something very close to the output of TableCurve for starting guesses, then MatLab is well-behaved and the parameters aren't altered.
>
> A representative dataset can be found here:
>
> http://bowdoin.edu/faculty/m/mbattle/matlab/
>
> Here's the relevant piece of my MatLab code (with starting parameters that give an excellent fit):
>
> fo_ = fitoptions('method','NonlinearLeastSquares','Lower',[-Inf -Inf -Inf -Inf    0]);
> st_ = [-32.0187 141.47 0.7675  0.000217 0.493 ];
> set(fo_,'Startpoint',st_);
> ft_ = fittype('A+B*(1.0-(1.0+exp((x+D*log(2.0.^(1.0/E)-1.0)-C)/D)).^(-E))',...
>      'dependent',{'y'},'independent',{'x'},...
>      'coefficients',{'A', 'B', 'C', 'D', 'E'});
>
> cf_ = fit(xdata,ydata,ft_,fo_);
>
> If I use the following starting values instead:
>
> st_ = [-32.0 140.0 0.75 0.0004 0.51 ];
>
> MatLab "optimizes" to the following coefficients
>
> -47.0175  84.025   0.7670   0.000385  25.009
>
> which give an awful fit.
>
> Does anyone have any idea why MatLab should be so finicky about starting values and so unstable in its minimization algorithm?  Is there some other approach I should take?  I've tried both "trust region" and "Levenberg-Marquardt" with equal (i.e no) success.
>
> Thanks in advance for any help you can offer.
>
> Mark

This is not matlab being finicky. It is a fact of life when
dealing with highly nonlinear functions. And a function
that transitions sharply from one state to another is
quite nonlinear.

Good starting values improve the behavior. Use of a tool
like a partitioned least squares can help very much too,
if there are a few linear parameters that can be removed
from the process.

I would also ask why you are using a nonlinear model?
Very often there is no reason other than that the user
thought they had no other choice. If so, then a better
choice does exist, a least squares spline model. Since a
spline can be constrained to be both monotone and to
have the fundamental shape you desire, it will give
very nice results with out any need for starting values
at all. Find SLM here:

http://www.mathworks.com/matlabcentral/fileexchange/24443

It requires nothing more than the optimization toolbox.

John
```
 0
Reply John 10/18/2010 8:24:04 PM

```John,

Thanks very much for the reply (below).  I understand your point about the challenges of nonlinear fitting, but I was frustrated because TableCurve2D made such easy work of it (it was completely insensitive to the starting guesses).  I guess I hadn't realized how good their engine is.

I actually do have good reasons for choosing this functional form.  The underlying process (gas mixing in a volume with input and output flows) should lead to something very close to this shape.  Most importantly, I want to be able to extrapolate to the eventual asymptote in some cases where the actual observations haven't had time to reach that point.  A spline (or loess or any other non-parametric approach) won't do this for me, but the underlying model of the asymmetric sigmoid has this feature built into it very nicely.

Can you point me to any references for partitioned least squares?  Any further thoughts?

Thanks again,

Mark

> This is not matlab being finicky. It is a fact of life when
> dealing with highly nonlinear functions. And a function
> that transitions sharply from one state to another is
> quite nonlinear.
>
> Good starting values improve the behavior. Use of a tool
> like a partitioned least squares can help very much too,
> if there are a few linear parameters that can be removed
> from the process.
>
> I would also ask why you are using a nonlinear model?
> Very often there is no reason other than that the user
> thought they had no other choice. If so, then a better
> choice does exist, a least squares spline model. Since a
> spline can be constrained to be both monotone and to
> have the fundamental shape you desire, it will give
> very nice results with out any need for starting values
> at all. Find SLM here:
>
> http://www.mathworks.com/matlabcentral/fileexchange/24443
>
> It requires nothing more than the optimization toolbox.
>
> John
```
 0
Reply Mark 10/18/2010 9:30:06 PM

```"Mark BAttle" <mabttle@bowdoinremovethis.edu> wrote in message <i9ie8u\$4ms\$1@fred.mathworks.com>...
> John,
>
> Thanks very much for the reply (below).  I understand your point about the challenges of nonlinear fitting, but I was frustrated because TableCurve2D made such easy work of it (it was completely insensitive to the starting guesses).  I guess I hadn't realized how good their engine is.
>
> I actually do have good reasons for choosing this functional form.  The underlying process (gas mixing in a volume with input and output flows) should lead to something very close to this shape.  Most importantly, I want to be able to extrapolate to the eventual asymptote in some cases where the actual observations haven't had time to reach that point.  A spline (or loess or any other non-parametric approach) won't do this for me, but the underlying model of the asymmetric sigmoid has this feature built into it very nicely.
>
> Can you point me to any references for partitioned least squares?  Any further thoughts?
>

I seem to recall that Seber and Wild had a description,
although they did not call it that. Perhaps they called
it separable least squares, or something like that. (My
copy is upstairs, and I'm feeling lazy. I can find it if
needed.)

You can find it as the method used in my fminspleas tool
on the fex:

http://www.mathworks.com/matlabcentral/fileexchange/10093

It will help greatly if you have several linear parameters
in the model. You can also find pleas, a similar tool
that uses partitioned least squares on top of fminunc.
It is in my optimization tips and tricks.

http://www.mathworks.com/matlabcentral/fileexchange/8553

The shape of the extrapolant out in the tails of the
curve is crucial to that predicted value. And this is
VERY dependent on the particular model you choose
as well as the actual data itself. A little bit of noise will
strongly impact that predicted extrapolated value.

However, I disagree that a spline won't be able to
extrapolate for you. You need to know how to make
it work. Extrapolation is dangerous of course, and is
only as good as the data you have as well as the tool.

x = sort(randn(1,20));
y = erf(x);
slm = slmengine(x,y,'knots',-5:.5:5,'increasing','on', ...
'plot','on','concavedown',[.5,5],'concaveup',[-5,-.5], ...
'rightslope',0,'leftslope',0,'endconditions','natural');

slmeval(5,slm)
ans =
0.9986

Good choices for the constraints applied on the slm
approximant are important of course.

John
```
 0
Reply John 10/18/2010 11:43:03 PM

```John,

Very impressive set of routines in your "tips and tricks" package!  Since I'm getting to the point of questions too narrow for this broad audience, I've sent you two e-mails to the address listed in your profile.  I'd be delighted if you could read and respond to those.

Thanks very much,

Mark

>
> I seem to recall that Seber and Wild had a description,
> although they did not call it that. Perhaps they called
> it separable least squares, or something like that. (My
> copy is upstairs, and I'm feeling lazy. I can find it if
> needed.)
>
> You can find it as the method used in my fminspleas tool
> on the fex:
>
> http://www.mathworks.com/matlabcentral/fileexchange/10093
>
> It will help greatly if you have several linear parameters
> in the model. You can also find pleas, a similar tool
> that uses partitioned least squares on top of fminunc.
> It is in my optimization tips and tricks.
>
> http://www.mathworks.com/matlabcentral/fileexchange/8553
>
> The shape of the extrapolant out in the tails of the
> curve is crucial to that predicted value. And this is
> VERY dependent on the particular model you choose
> as well as the actual data itself. A little bit of noise will
> strongly impact that predicted extrapolated value.
>
> However, I disagree that a spline won't be able to
> extrapolate for you. You need to know how to make
> it work. Extrapolation is dangerous of course, and is
> only as good as the data you have as well as the tool.
>
> x = sort(randn(1,20));
> y = erf(x);
> slm = slmengine(x,y,'knots',-5:.5:5,'increasing','on', ...
>      'plot','on','concavedown',[.5,5],'concaveup',[-5,-.5], ...
>      'rightslope',0,'leftslope',0,'endconditions','natural');
>
> slmeval(5,slm)
> ans =
>        0.9986
>
> Good choices for the constraints applied on the slm
> approximant are important of course.
>
> John
```
 0
Reply Mark 10/19/2010 8:18:43 PM

```John,

Very impressive set of routines in your "tips and tricks" package!  Since I'm getting to the point of questions too narrow for this broad audience, I've sent you two e-mails to the address listed in your profile.  I'd be delighted if you could read and respond to those.

Thanks very much,

Mark

>
> I seem to recall that Seber and Wild had a description,
> although they did not call it that. Perhaps they called
> it separable least squares, or something like that. (My
> copy is upstairs, and I'm feeling lazy. I can find it if
> needed.)
>
> You can find it as the method used in my fminspleas tool
> on the fex:
>
> http://www.mathworks.com/matlabcentral/fileexchange/10093
>
> It will help greatly if you have several linear parameters
> in the model. You can also find pleas, a similar tool
> that uses partitioned least squares on top of fminunc.
> It is in my optimization tips and tricks.
>
> http://www.mathworks.com/matlabcentral/fileexchange/8553
>
> The shape of the extrapolant out in the tails of the
> curve is crucial to that predicted value. And this is
> VERY dependent on the particular model you choose
> as well as the actual data itself. A little bit of noise will
> strongly impact that predicted extrapolated value.
>
> However, I disagree that a spline won't be able to
> extrapolate for you. You need to know how to make
> it work. Extrapolation is dangerous of course, and is
> only as good as the data you have as well as the tool.
>
> x = sort(randn(1,20));
> y = erf(x);
> slm = slmengine(x,y,'knots',-5:.5:5,'increasing','on', ...
>      'plot','on','concavedown',[.5,5],'concaveup',[-5,-.5], ...
>      'rightslope',0,'leftslope',0,'endconditions','natural');
>
> slmeval(5,slm)
> ans =
>        0.9986
>
> Good choices for the constraints applied on the slm
> approximant are important of course.
>
> John
```
 0
Reply Mark 10/19/2010 8:18:43 PM

```In article <i9kuf3\$p7r\$1@fred.mathworks.com>,
Mark Battle <mbattle@bowdoinremovethis.edu> wrote:
>Very impressive set of routines in your "tips and tricks" package!
>Since I'm getting to the point of questions too narrow for this broad
>audience, I've sent you two e-mails to the address listed in your
>profile.  I'd be delighted if you could read and respond to those.

Mark, John - Please do continue your exchange here. It is
still interesting to some and it's nice to read such articulate
questions and answers.

Francis
```
 0
Reply fburton 10/20/2010 10:59:34 AM

```>
> Mark, John - Please do continue your exchange here. It is
> still interesting to some and it's nice to read such articulate
> questions and answers.
>
> Francis

Back by popular demand....

The e-mail exchange that occurred off-line could be summarized as follows:

I (Mark) gave more of the physical description of my system:

*  The physical system is gas flowing through a measurement cell.  At some time, the input gas stream is switched and the new gas gradually flushes out the measurement cell.  I want to know the composition of the new gas.
* The data are measured gas concentrations (more properly, mixing ratios), as a function of time.
* If I had a long time series and the switch occurred early enough, I'd just average the last n values of my dataset and have my answer.  However, for a variety of reasons, the dataset often ends too soon after the switch to make this simple calculation.
* Physics tells me that in a perfect world, the concentration should simply be of the form C(t) = Cf + (Ci-Cf)e-f(t-t*) for all t>t* (where t* is the time of the transition) and C(t) = Ci for all t=<t*
* Unfortunately, the world isn't perfect.  So while the approach to the final asypmtotic value Cf is indeed exponential, the increase initiated at t* is not sudden and discontinuous, but occurs with some smoothness (probably reflecting a different, small volume that needs to be flushed) with a different time constant.  For these reasons, I settled on the asymmetric sigmoid as a model, since it mathematically embodies all of these mechanistic properties.
* I have hundreds of datasets (maybe over a thousand), so fits have to be done programmatically

That's the narrative form.  Stated in the most simplistic terms...

* Values start out ~constant at Ci, either high (or low), undergo a smooth transition at t = t* and end up lower (or higher).
* The time of the transition (t*) varies from one dataset to the next
* The transition is always late enough that Ci is easily determined
* Usually, the transition is early enough that there is a constant final period.  However, even in these cases, exactly how much data should be averaged to determine Cf is a challenge
* Sometimes the transition is late enough so that the asymptotic value has not yet reached a constant final value
* My goal is to determine the Cf if it has been reached, or if it hasn't, to determine what it would be.

That's my challenge.  I don't give a hoot about Ci, f, or even t* (although it would be nice to track some objective measure of t* from one dataset to the next).  If I could get MatLab to reliably fit with a sigmoid, I'm pretty sure I'd be all set, but that doesn't seem to be in the cards.

What do you think?  Could SLM really do this for me programatically?

==============================================

To which John replied (with several graphics that I can't paste in here):

The problem with sigmoid models in my experience is they always seem to have the wrong shape. The virtue of these models is they have nice parameters that MAY sometimes be useful to characterize a problem, even when there is no valid mechanistic model for their shape. We used lots of these models for models for the copier project in my early years at Kodak, and on many other problems.

It is also one big reason why I developed the ideas behind SLM. Many people don't have a reason for a specific nonlinear fit, except that it has the rough shape they want. Then they find that the model they choose fails to have the exact shape they need, so they fail to get decent convergence. Many times what I saw is that the toe and shoulder of such a curve had different shapes. This is a characteristic of problems with multiple "rate" constants.

My thought is to use a few simple constraints on the form of spline model. Here is a simple set of data, that we know has asymptotes at both ends of the curve, at +/- 1. I'll choose a set of data that stops before it gets that close though.

>> x = rand(100,1)*5 - 2;
>> y = erf(x/2) + randn(size(x))/100;
>> plot(x,y,'o')
>> grid on

We can fit this with a spline model of course, but extrapolation is rough with splines. See that I've chosen knots that extend well outside of the range of the data, so the spline model will smoothly extrapolate.

>> slm = slmengine(x,y,'plot','on','knots',-4:.25:4);

The trick is to think about what characteristics you will expect.

1. The curve should be monotone increasing.
2. At each end, if it has reached an asymptote, then the slope at each end must be zero.

>> slm = slmengine(x,y,'plot','on','knots',[-4:.25:4],'increasing','on','leftslope',0,'rightslope',0);

Ok, so it is still poor. But we are not done. I might try adding natural end conditions, i.e., the second derivative is also zero at the ends of the curve.

>>  slm = slmengine(x,y,'plot','on','knots',[-4:.25:4],'increasing','on','leftslope',0,'rightslope',0,'endconditions','natural');

What helps in the end is to recognize that the shape has positive curvature over part of the curve, and negative curvature in the top half. See that I've not told it exactly where the curvature changes sign. I left the middle part of the curve inconclusive.

>> slm = slmengine(x,y,'plot','on','knots',[-4:.25:4],'increasing','on','leftslope',0,'rightslope',0,'endconditions','natural','concaveup',[-4,-1],'concavedown',[1 4]);

But I will argue that this is actually a decent estimate of the upper asymptote, gained from nothing more than a spline fit with some intelligently chosen constraints. I never needed to pick a model. No starting values were necessary. And I had to make no assumptions about the curve beyond a few simple ones that were very logical. Also note that I used obscenely more knots in this spline than I really needed, but that the constraints provide a tremendous regularization on the shape of the curve. The nice thing about my use of too many knots, is I don't need to worry about the exact location of the transition. The spline figures it all out.

The prediction of the upper end point is not far off from the value we know to be true. Even the lower end point, which is extrapolated much farther, seems reasonable.

>> slmeval(4,slm,0)
ans =
1.017

I'm not sure that a nonlinear model would give me a better measure of that endpoint, but I do know that if I tried to fit this curve with a exponentially based sigmoid, the shoulders would not have the correct shape. They never do.

You can get other measures of the curve. For example, plotting the first derivative identifies the point of maximum slope.

>> plotslm(slm)

We can compute that point too.

>> [maxsl,location] = slmpar(slm,'maxslope')
maxsl =
0.57594

location =
-0.11286

Perhaps this may be a good measure of the transition point you mentioned. One might even be able to come up with measures of the other parameters in your curve, effective rate constants for example.

Regards,
John
```
 0
Reply Mark 10/20/2010 12:25:05 PM

```fburton@nyx.net (Francis Burton) wrote in message <1287572374.751872@irys.nyx.net>...
> In article <i9kuf3\$p7r\$1@fred.mathworks.com>,
> Mark Battle <mbattle@bowdoinremovethis.edu> wrote:
> >Very impressive set of routines in your "tips and tricks" package!
> >Since I'm getting to the point of questions too narrow for this broad
> >audience, I've sent you two e-mails to the address listed in your
> >profile.  I'd be delighted if you could read and respond to those.
>
> Mark, John - Please do continue your exchange here. It is
> still interesting to some and it's nice to read such articulate
> questions and answers.
>

I'll see what I can do.

People have for many years been building nonlinear models
of the form Mark has posed. Nasty things, with asymmetric
shapes in the toe and shoulder of a vaguely sigmoidal
curve. These things are achieved by putting in parameters
in exponents or other nasty places. And those parameters
are always tough to fit.

In fact, in my early years at Kodak, this was one of the things
I worked with. Kodak had many problems of that sort, since
any type of exposure system has a similar system response -
asymptotes at each end, and a transition between. Film,
paper, copiers, a few others I can think of.

What happens is the scientist usually does not have an explicit
mathematical model for the system under study. At best, they
may have a metaphorical model, i.e., a model of a system
they do understand, where that model has parameters they
can estimate from data. A good example of this is the use of
disease transmission models to then model sales of a product.
Since sales are often encouraged by word of mouth, the
disease models may well end up being viable. This is what I
call a metaphorical model - one model used as a metaphor
to understand a completely different but at the same time,
similar process.

A problem with metaphorical models is also their virtue. They
can be used to nicely predict behaviors of a system, but they
can also predict some strange stuff when the metaphor is
abused. In a talk on mathematical modeling I gave, I used
the example of Romeo referring to Juliet as the sun. The
metaphor gives us understanding about Romeo's feelings
towards Juliet, but at the same time we understand that
Juliet does not have a surface temperature that is sufficient
to vaporize titanium. We need to know that all metaphors
have their limits.

Other scientists build nonlinear mathematical models
specifically based on shape. (I even teach how to do this in
one of my FEX submissions.) They know the basic shape of
the system they wish to model, so they construct a functional
form that can take on that shape, then they estimate the
coefficients of the model. If the coefficients can be interpreted
in useful ways, so much the better.

There are a third group of scientists who have actual
mathematical models of their system, based on first
principles and mechanistic models of their processes.

Really, I tend to argue that only the latter group of scientists
have a strong case for the use of nonlinear regression to
estimate model coefficients. The other groups of scientists
might at least consider the use of least squares spline models
as an alternative to a nonlinear regression.

The problem with nonlinear models when applied to curves
is they never seem to have exactly the right shape. The toe
or shoulder always seems just a bit wrong. And if you are
using a model for extrapolation, then fitting that toe or
shoulder accurately is crucial.

Spline models can be constrained to do many of the common
things that people want to see. In fact, these constraints can
provide a powerful regularizing effect on the shape of a curve.
This is the fundamental thesis behind SLM. Those tools allow
the user to embody their knowledge of the fundamental shape
of a system in the form of a set of simply named constraints.

The point of all this is that a spline model can be used on the
curves I've seen from Mark. Splines can indeed be used to
extrapolate, at least as well as a nonlinear metaphor might do,
as long as the proper information about the curve shape is
built into the model.

So if you have chosen the model for a curve simply because
it has the shape you know must be there, then use a spline,
but build your knowledge of that shape into the spline itself.
This is a somewhat Bayesian way of thinking about the problem,
an approach that has worked nicely for me for many years of
data modeling. It works. It allows me to use spline models
with relatively little fear of where I put the knots - just throw
plenty of knots at the problem, but use intelligently chosen
constraints on the final shape.

John
```
 0
Reply John 10/20/2010 1:07:03 PM

```John,

Thanks for all of the effort that went in to your replies (prev. post and via e-mail) as well as straightening out my elementary coding error.

Here's where things stand:

1) In an effort to give the conventional non-linear fit one more chance, I implemented the partitioned non-linear least squares method using your PLEAS package.  While this reduced the number of free parameters the non-linear fitter was optimizing (from 5 to 3), the problem was still too much for MatLab, as even the slightest deviation from optimal starting guesses caused Matlab to wander off and give a lousy fit.

2) More importantly, your arguments for spline fitting were so convincing that I got things going with your SLM package.  You were good enough to actually do a fit to my sample dataset.  As you said in your e-mail:

===================================

Using the data you posted, I got this fairly easily.
>> slm = slmengine(x,y,'plot','on','knots',linspace(.764,.77,25),'increasing','on','leftslope',0,'rightslope',0,'concavedown',[.7675 .77])

A predicted upper asymptote.
>> slmeval(0.77,slm,0)
ans =
108.28
And a maximum slope, which happens at 0.76744.
>> [maxsl,location] = slmpar(slm,'maxslope')
maxsl =
1.2148e+05
location =
0.76744

I see that the bottom end has some funny stuff happening, but that does not hurt anything on top.
==================================

The fit looked great and the arguments looked utterly compelling.  That is, until I implemented the code myself and realized two things:

1) You cheated!  The whole idea is to do this programatically, and yet when you set up the spline, you specified a point (0.7675) beyond which the spline must be concave down.  I'm willing to bet nearly anything that you chose this point after looking at a plot of the data.  In my hundreds of other datasets, I don't know where this inflection point is, so in principle, I can't make that specification.  That's why a fixed functional form with Lev-Marq minimization is so attractive to me.  In principle, I should be able to just wander around in parameter space until I find the transitions like this, with limited a-priori knowledge.

Now, in truth, your spline formalism is so attractive that I did some cheating of my own.  I defined the beginning-of-downward-concavity as the point (in X) just beyond the where the Y-value has gone through at least half of its observed change.  I've visually scanned enough of the datasets to know that this is a pretty safe definition (but it's still a cheat).  Once I had this input to your SLMengine, I was off and running.  Except...

2) The extrapolated value seems to be highly dependent on the details of the choices for the SLM parameters.  Specifically, you spaced 25 knots between
0.764 and 0.77 and started downward concavity at .7675.  That gave a predicted upper asymptote of 108.28.  My code spaces 25 knots between 0.7628 and 0.7711
and starts concavity at 0.767569.  I get an upper asymptote of 106.60.  This difference (108.28 vs 106.60) is really significant for me.  If you let SLMengine make plots for you, you can see the difference in the two splines right away.

I'm not sure where this leaves me.  I'd love to use SLM (and am quite happy to accept the "cheat" that I implemented), but if values hop around by 1-1.5% depending on initial assumptions, I'm sunk.

Any thoughts?

Mark
```
 0
Reply Mark 10/20/2010 1:14:05 PM

```"Mark Battle" <mbattle@bowdoinremovethis.edu> wrote in message <i9mput\$m6e\$1@fred.mathworks.com>...
> John,
>
> Thanks for all of the effort that went in to your replies (prev. post and via e-mail) as well as straightening out my elementary coding error.
>
> Here's where things stand:
>
> 1) In an effort to give the conventional non-linear fit one more chance, I implemented the partitioned non-linear least squares method using your PLEAS package.  While this reduced the number of free parameters the non-linear fitter was optimizing (from 5 to 3), the problem was still too much for MatLab, as even the slightest deviation from optimal starting guesses caused Matlab to wander off and give a lousy fit.
>

Yep.

That is what happens. Those nonlinear models are very
unstable. An issue is when the shape of the toe or
shoulder is not perfect, then the curve will either fail
to converge, or it will yield a strangely poor prediction
for the upper endpoint.

Either case is bad.

> 2) More importantly, your arguments for spline fitting were so convincing that I got things going with your SLM package.  You were good enough to actually do a fit to my sample dataset.  As you said in your e-mail:
>
> ===================================
>
> Using the data you posted, I got this fairly easily.
> >> slm = slmengine(x,y,'plot','on','knots',linspace(.764,.77,25),'increasing','on','leftslope',0,'rightslope',0,'concavedown',[.7675 .77])
>
> A predicted upper asymptote.
> >> slmeval(0.77,slm,0)
> ans =
>        108.28
> And a maximum slope, which happens at 0.76744.
> >> [maxsl,location] = slmpar(slm,'maxslope')
> maxsl =
>    1.2148e+05
> location =
>       0.76744
>
> I see that the bottom end has some funny stuff happening, but that does not hurt anything on top.
> ==================================
>
> The fit looked great and the arguments looked utterly compelling.  That is, until I implemented the code myself and realized two things:
>
> 1) You cheated!  The whole idea is to do this programatically, and yet when you set up the spline, you specified a point (0.7675) beyond which the spline must be concave down.  I'm willing to bet nearly anything that you chose this point after looking at a plot of the data.

Yes, I did, mainly because the scaling of the x-axis was so
narrow that I had to pick a number off the plot. :)

On the other hand, how would you choose the parameters
for a nonlinear curve fit? If you pick some fixed number
for all curves, how would you possibly ever expect that
nonlinear regression model to converge?

If I wanted to do this programmatically, I would...

1. Find a rough estimate of the maximum of the data. max(y)
will suffice.

2. Find a rough estimate of the minimum of the data.
min(y) will suffice.

3. Find an estimate of the mid point of the transition region.
I might do this via an initial spline fit with only a few knots,
and no constraints, with no extrapolation.

For example,

slm0 = slmengine(x,y,'knots',15);

% find the location of the approximate mid point on the
transition as

xmid = slmeval((min(y) + max(y))/2,slm0,-1);

Now, choose regions where the curve is forced to be
concavedown as some point above that mid point, to the
end of the curve. So perhaps the upper one third of the
curve will be concave down. The lower third of the
curve will be concaveup. Leave it alone in the middle
third.

Once you have chosen the appropriate regions on the
curve programmatically, do a second fit to the data.
Use extra knots to allow the spline to extrapolate a
reasonable distance, giving you a decent estimate of
the asymptotes.

> In my hundreds of other datasets, I don't know where this inflection point is, so in principle, I can't make that specification.

Sure you can. Just leave a wide enough region in the
center where no constraint is applied. The spline will take
care of the rest. Problems in the fit generally happen
only in the tails, so that is the only place where you
need the constraint.

>  That's why a fixed functional form with Lev-Marq minimization is so attractive to me.  In principle, I should be able to just wander around in parameter space until I find the transitions like this, with limited a-priori knowledge.
>

Curve fits don't work that way. The tool gets lost in
the parameter space when you give it poor starting
values, diverging to some crapola fit.

> Now, in truth, your spline formalism is so attractive that I did some cheating of my own.  I defined the beginning-of-downward-concavity as the point (in X) just beyond the where the Y-value has gone through at least half of its observed change.  I've visually scanned enough of the datasets to know that this is a pretty safe definition (but it's still a cheat).  Once I had this input to your SLMengine, I was off and running.  Except...
>
> 2) The extrapolated value seems to be highly dependent on the details of the choices for the SLM parameters.  Specifically, you spaced 25 knots between
> 0.764 and 0.77 and started downward concavity at .7675.  That gave a predicted upper asymptote of 108.28.  My code spaces 25 knots between 0.7628 and 0.7711
> and starts concavity at 0.767569.  I get an upper asymptote of 106.60.  This difference (108.28 vs 106.60) is really significant for me.  If you let SLMengine make plots for you, you can see the difference in the two splines right away.
>
> I'm not sure where this leaves me.  I'd love to use SLM (and am quite happy to accept the "cheat" that I implemented), but if values hop around by 1-1.5% depending on initial assumptions, I'm sunk.
>
> Any thoughts?

The problem is, if we put confidence intervals around
that asymptote based on your data, the two estimates
may not be distinguishable from each other. Essentially,
if we change the region in the center where the spline
is allowed to be either concave up or down, and this
dramatically effects your estimate of the upper end
point, then that end point is probably not adequately
defined by your data to yield a stable estimate.

The best that I can offer is to use some fixed scheme
as described above. Possibly use bootstrp to wrap
confidence intervals around the estimate.

All I can offer are the words of Mark Twin, in my favorite
quote on mathematics that nobody seems to know...

&#8220;In the space of one hundred and seventy six years the Lower Mississippi has shortened itself two hundred and forty-two miles. That is an average of a trifle over a mile and a third per year. Therefore, any calm person, who is not blind or idiotic, can see that in the Old Oölitic Silurian Period, just a million years ago next November, the Lower Mississippi was upwards of one million three hundred thousand miles long, and stuck out over the Gulf of Mexico like a fishing-pole. And by the same token any person can see that seven hundred and forty-two years from now the Lower Mississippi will be only a mile and three-quarters long, and Cairo [Illinois] and New Orleans will have joined their streets together and be plodding comfortably along under a single mayor and a mutual board of aldermen. There is something fascinating about science. One gets such wholesale returns of conjecture
out of such a trifling investment of fact.&#8221;

Mark Twain, "Life on the Mississippi"

John
```
 0
Reply John 10/20/2010 2:26:03 PM

```Mark, I played around with your model and data a bit.  I imagine that as
usual, everything John says is right and on target.  But I wonder if
you've looked at the underlying reason why you're having trouble fitting
this parametric model:

If you plot the 1-D profiles of the (5-dimensional) SSE surface, you
find that four of the profiles are the kind of thing you hope for:
vaguely quadratic bowls that are pretty wide.  But the SSE profile for
the third param, C, is entirely different.  It's essentially flat,
except that once you are within about 0.5% (not 50%, not 5%, but one
half of one percent) of the solution, it spikes sharply down to the
minimum.  Which is to say, you have to be VERY close to the solution to
be able to usefully look at the gradient and move in the right direction.

I don't know if that is typical of this model for other data sets.  It
seems that what you'd have to do is some kind of 1D search on C,
minimizing the SSE for the other four params at each value of C, until
you find something that gets you somewhere.  Then you could apply FIT or
NLINFIT or LSQCURVEFIT.

Hope this helps.
```
 0
Reply Peter 10/20/2010 6:43:43 PM

```I think you can set a good initial condition and bounds automatically based on the data.

Given your desired equation:
A+B*(1.0-(1.0+exp((x+D*log(2.0.^(1.0/E)-1.0)-C)/D)).^(-E))

A represents the initial asymptote
B represents the distance to the final asymptote
C represents the x location of the inflection point
D represents the sharpness at the inflection point. The derivative at the inflection is B/4/D
E represents the skewness, with E=1 being unskewed.

Does the following work?

%---------------------------------------------------------------------
upsidedown = 0;
if ydata(end) < ydata(1)
ydata = -ydata;         %There's probably a more elegant way to do this
upsidedown = 1;
end

% Find the extents of the data
xmin = xdata(1); xmax = xdata(end);
ymin = ydata(1); ymax = ydata(end);
dydx = (ymax-ymin)/(xmax-xmin);

% Use these to set up the initial conditions and bounds
Ao = ymin;
Bo = ymax - ymin;
Co = (xmax+xmin)/2;
Do = Bo/4/dydx;
Eo = 1;

IC = [Ao Bo Co Do Eo];
LB = [-Inf ymin xmin 0 1/1023];  % 2^(1024) = Inf
UB = [ymax Inf xmax Inf Inf];

% Do the fitting
fo_ = fitoptions('method','NonlinearLeastSquares','Lower',LB,'Upper',UB,'Startpoint',IC);
ft_ = fittype('A+B*(1.0-(1.0+exp((x+D*log(2.0.^(1.0/E)-1.0)-C)/D)).^(-E))',...
'dependent',{'y'},'independent',{'x'},...
'coefficients',{'A', 'B', 'C', 'D', 'E'});
cf_ = fit(xdata,ydata,ft_,fo_);

if upsidedown
ydata = -ydata;
cf_.A = -cf_.A;
cf_.B = -cf_.B;
end

plot(xdata,ydata,xdata,cf_(xdata));
cf_
```
 0
Reply Teja 10/27/2010 12:15:05 PM

```I wonder why TableCurve2D supposedly does the fit without effect, whereas Matlab struggles.

This makes me also wonder what are proprietary tweaks to the Leverberg-Marquardt method.
```
 0
Reply Godzilla 10/28/2010 1:45:05 AM

13 Replies
426 Views

(page loaded in 0.197 seconds)

Similiar Articles:

7/23/2012 9:31:30 AM

 Reply: