On 5/5/2010 1:14 AM, naimead wrote:
> Hello all,
> I am relatively new to MATLAB. For a project, I have 39 numbers and need to find their distribution. I performed a Lilliefors test and found that the distribution is not Gaussian. From an example in the literature, I know that if this kind of data isn't Gaussian it should be gamma, so I did the following:
> [phat, pci] = mle(X,'distribution','gamma');
> But the 39 numbers include negative values, and data with negative values can't be gamma. How can I find the right distribution? Is it right to use abs(X)? When I did that, the distribution came out as both gamma and Poisson, so something must be wrong.
Naimead, if this is a random sample, then clearly it did not come from a
Gaussian distribution, as LILLIETEST confirms. Just as clearly, it did not
come from a gamma distribution, because as you point out, something like
half of them are negative, and even if you add a constant to all of the
values, the shape is completely wrong. If your theory says that it must
be normal or gamma, then either your theory is wrong, or you've
misinterpreted it, or your data have not been collected according to the
assumptions of your theory.
Without meaning to sound unhelpful, no one is going to be able to
reconcile the above for you. Arbitrarily modifying 50% of your data to
meet your theoretical expectations probably isn't the right path to take.
There are other distributions you could investigate. One of them is the
extreme value, which LILLIETEST can test for (and in fact, the p-value
is not significant at 5%, so the test does not reject it). I am NOT
saying that this is the distribution you should use, only that it is a
possibility. There are
other possibilities, such as the generalized extreme value. Both the EV
and the GEV are supported by the Distribution Fitting Tool GUI,
DFITTOOL. This tool will help you look at the different fits against
the data, both the PDFs and the CDFs. It turns out that the GEV is not
such a bad fit (look at the CDF plot in DFITTOOL). Again, I am NOT
saying that this is the distribution you should use.
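
If you prefer the command line to DFITTOOL, the EV and GEV checks above can be sketched roughly as follows. This is only an illustration: the 39 values are not shown in this thread, so X here is a synthetic stand-in, and the name-value form of LILLIETEST is the one in recent releases (older releases take the distribution as a positional argument).

```matlab
% Stand-in data: the original 39 values are not available here, so a
% synthetic sample is used purely for illustration. Replace with your X.
rng(0);
X = evrnd(0, 1, 39, 1);

% Lilliefors test against the extreme value family, with the parameters
% estimated from the data. hEV == 0 means no rejection at the 5% level.
[hEV, pEV] = lillietest(X, 'Distribution', 'extreme value');

% Maximum likelihood fits of the two candidate distributions.
evParams  = evfit(X);    % [mu sigma]
gevParams = gevfit(X);   % [k sigma mu]

% Overlay the fitted CDFs on the empirical CDF, as DFITTOOL would.
[F, xs] = ecdf(X);
stairs(xs, F); hold on
xg = linspace(min(X), max(X), 200);
plot(xg, evcdf(xg, evParams(1), evParams(2)));
plot(xg, gevcdf(xg, gevParams(1), gevParams(2), gevParams(3)));
legend('Empirical', 'EV fit', 'GEV fit', 'Location', 'best');
hold off
```

A non-rejection from LILLIETEST with only 39 points is weak evidence, so treat the plot as the more informative part of this check.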
There are other, more complicated distributions, such as a
three-parameter gamma. That is not explicitly supported by the Statistics
Toolbox, but you could perhaps use MLE to fit it. You would be on your
own there.
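
As a starting point only, a three-parameter (shifted) gamma fit with MLE might look something like the sketch below. The data, the pdf3 handle, and the starting values are all assumptions made for illustration, not anything taken from your problem, and this kind of likelihood can be badly behaved when the threshold gets close to the smallest observation.

```matlab
% Hypothetical shifted-gamma sample standing in for the real data.
rng(0);
X = gamrnd(2, 1.5, 39, 1) - 3;

% Custom PDF: gamma with shape a and scale b, shifted by threshold c.
pdf3 = @(x, a, b, c) gampdf(x - c, a, b);

% Start with the threshold a little below the smallest observation and
% fit an ordinary gamma to the shifted data to get a and b.
c0    = min(X) - 0.1 * (max(X) - min(X));
start = [gamfit(X - c0), c0];

% The density is zero at or below c, so c is bounded above by min(X).
% FunValCheck is off because the likelihood can hit zero during the search.
opt  = statset('MaxIter', 1e5, 'MaxFunEvals', 1e5, 'FunValCheck', 'off');
phat = mle(X, 'pdf', pdf3, 'start', start, 'Options', opt, ...
           'LowerBound', [0 0 -Inf], 'UpperBound', [Inf Inf min(X)]);
% phat = [shape, scale, threshold]
```

If the estimated threshold ends up pinned at min(X), or the shape estimate drops below 1, I would not trust the fit.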
My advice would be to revisit your assumptions, because clearly they are
wrong. Hope this helps.