On 9/1/2010 2:58 PM, Arthur Zheng wrote:
> Background information:
> I have two samples, X1 and X2. X1 and X2 are categorical (2 or 3 cases).
> I want to use a chi-square test to test whether X1 and X2 are drawn from
> the same underlying distribution.
> Does matlab have a function for chi-square test for two samples, like
> There is a function called "CHI2GOF", but I haven't figured out how to
> use it for the two-sample case.
You can use CHI2GOF, but it's really intended for testing goodness of
fit of a single sample against a distribution family. What you're doing
is closer to contingency table analysis, which is what CROSSTAB is for.
I'm not sure what form your data are in, but any of the three approaches
below should work. The first two start from category counts, the third
from the raw observations:
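% Cook up some sample data: two samples from the same k-category distribution
k = 5;                                           % number of categories
p = rand(1,k); p = p./sum(p);                    % random category probabilities
M = 200; N = 250;                                % sample sizes
x = randsample(1:k,M,true,p); m = histc(x,1:k);  % sample 1 and its counts
y = randsample(1:k,N,true,p); n = histc(y,1:k);  % sample 2 and its counts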
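% Do the test by hand
phat = (m+n) ./ (M+N);        % pooled estimate of the category probabilities
em = phat*M; en = phat*N;     % expected counts for each sample under the null
chi2 = sum(([m n] - [em en]).^2 ./ [em en]);   % Pearson chi-square statistic
df = k-1;
pval = 1 - chi2cdf(chi2,df);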
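% Trick CHI2GOF into doing a two-sample test by treating the 2*k counts
% as 2*k bins. Note the nparams value must be such that
% 2*k - nparams - 1 = k-1, i.e. nparams = k. Setting 'emin' to 0 stops
% CHI2GOF from pooling bins that have small expected counts.
[~,pval,stats] = chi2gof(1:(2*k),'ctrs',1:(2*k),'freq',[m n], ...
'expected',[em en],'nparams',k, 'emin',0)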
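% Use CROSSTAB on the raw data: the first input is the observed category,
% the second labels which sample each observation came from
[tbl,chi2,pval] = crosstab([x y],[ones(size(x)) 2*ones(size(y))])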
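If what you have is already a table of counts rather than the raw observations, the by-hand version is probably the easiest to adapt. Here is a minimal sketch with made-up counts for three categories (the numbers are purely illustrative, not your data). The warning about small expected counts is an addition of mine, based on the usual rule of thumb of at least 5 per cell; the test itself does not require it. CHI2CDF needs the Statistics Toolbox.

```matlab
% Hypothetical counts for k = 3 categories (made-up numbers)
m = [30 50 20];               % counts for sample 1
n = [45 40 40];               % counts for sample 2
M = sum(m); N = sum(n);       % sample sizes
k = numel(m);                 % number of categories

% Pooled category proportions under the null of a common distribution
phat = (m + n) ./ (M + N);
em = phat*M; en = phat*N;     % expected counts for each sample

% Pearson chi-square statistic with k-1 degrees of freedom
chi2 = sum(([m n] - [em en]).^2 ./ [em en]);
df = k - 1;
pval = 1 - chi2cdf(chi2, df);

% Rule of thumb (an assumption, not part of the test itself): the
% chi-square approximation can be poor when expected counts are small
if any([em en] < 5)
    warning('Some expected counts are below 5; the approximation may be poor.');
end
```

With 2 or 3 categories and small samples that last check is worth a look before you trust the p-value.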
Hope this helps.