I want to calculate a covariance matrix on a matrix of time series data. The individual series may be of differing lengths; the data matrix contains "NaN" entries for the period before a given time series starts. (Thus, for instance, we may have n values in column 1; column 2 may start with m "NaN" entries followed by an n-m column of values; column 3 may have k "NaN" entries followed by an n-k column of values and so on.)
I currently have some rather clumsy code to calculate each covariance entry separately. For entry (n,m), it extracts columns n and m, takes only the subset with entries in each column, and takes the covariance of those series. The code appears below.
I have two questions:
1. The code doesn't work. I wind up with a "Subscripted assignment dimension mismatch" error at the line "retcov(n, m) = cov(T1,T2);" Does anyone have an idea what I'm doing wrong?
2. This feels like a laboured approach to manage a matrix calculation that Excel performs fairly effortlessly (it automatically disregards nonnumeric entries). Is there a better way to go about calculating a covariance matrix on incomplete data?
Thanks.
sz = size(Returns);
retcov = zeros(sz(2),sz(2));
for n = 1:sz(2)
    tempdata1 = Returns(:,n);
    tempdata1 = tempdata1(isfinite(tempdata1));
    k1 = length(tempdata1);
    for m = 1:sz(2)
        tempdata2 = Returns(:,m);
        tempdata2 = tempdata2(isfinite(tempdata2));
        k2 = length(tempdata2);
        k = min(k1, k2);
        tempdata = [tempdata1(k1 - k + 1:k1) tempdata2(k2 - k + 1:k2)];
        retcov(n, m) = cov(tempdata(:,1),tempdata(:,2));
    end
end
William
9/9/2010 6:30:26 AM
A = [NaN 10, 20, 13, NaN];
B = [NaN NaN, 5, 3, 2];
idx = ~isnan(A) & ~isnan(B);
cov(A(idx), B(idx))
ans =
24.50 7.00
7.00 2.00
Oleg
Oleg
9/9/2010 7:15:22 AM
Also, if you have the Statistics or the Financial Toolbox, you can use nancov:
"c = nancov(..., 'pairwise') computes c(i,j) using rows with no NaN values in columns i or j. The result may not be a positive definite matrix. c = nancov(..., 'complete') is the default, and it omits rows with any NaN values, even if they are not in column i or j. The mean is removed from each column before calculating the result."
Be aware that the "pairwise" option (even if self-implemented) can lead to a non-invertible variance-covariance matrix...
Oleg
Oleg
9/9/2010 7:55:23 AM
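To illustrate the warning about the 'pairwise' option, here is a minimal sketch using a contrived data matrix (X is hypothetical, chosen only to make the effect visible). It builds the pairwise covariance by hand with the base cov function, so no toolbox is assumed, and then tests the result with the second output of chol, which is nonzero when the matrix is not positive definite.

```matlab
% Hypothetical data: each pair of columns overlaps on a different pair of rows
X = [  1    1   NaN
       2    2   NaN
      NaN   1    1
      NaN   2    2
       1   NaN   2
       2   NaN   1 ];

p = size(X, 2);
C = zeros(p);
for i = 1:p
    for j = 1:p
        % Use only the rows where both column i and column j have data
        ok  = ~isnan(X(:, i)) & ~isnan(X(:, j));
        cij = cov(X(ok, i), X(ok, j));   % 2-by-2 result for two vectors
        C(i, j) = cij(1, 2);             % keep the covariance term
    end
end

% chol returns a nonzero second output if C is not positive definite
[~, flag] = chol(C);
isPosDef  = (flag == 0);
minEig    = min(eig(C));                 % a negative value confirms the problem
```

With this particular X, columns 1 and 2 and columns 2 and 3 move together on their shared rows while columns 1 and 3 move in opposite directions, which no single complete data set could produce. That inconsistency is what makes the pairwise estimate fail the positive definiteness check.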
"Oleg Komarov" <oleg.komarovRemove.this@hotmail.it> wrote in message <i6a3tb$eos$1@fred.mathworks.com>...
> Also, if you have the Statistics or the Financial Toolbox, you can use nancov:
>
> "c = nancov(..., 'pairwise') computes c(i,j) using rows with no NaN values in columns i or j. The result may not be a positive definite matrix. c = nancov(..., 'complete') is the default, and it omits rows with any NaN values, even if they are not in column i or j. The mean is removed from each column before calculating the result."
>
> Be aware that the "pairwise" option (even if self-implemented) can lead to a non-invertible variance-covariance matrix...
>
> Oleg
Thanks for that; the nancov approach worked. With the idx approach I still got the same "Subscripted assignment dimension mismatch" error message. I suspect there's something else going on with the data that's causing it.
William
9/10/2010 4:21:04 AM
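A likely cause of the persistent "Subscripted assignment dimension mismatch" error: when cov is called with two vectors it returns a 2-by-2 matrix (as in the 2-by-2 output shown earlier in the thread), and that matrix cannot be assigned to the single element retcov(n, m). The sketch below keeps the loop structure from the question but stores only the off-diagonal element. The Returns values are hypothetical sample data, and the idx mask is the approach suggested above.

```matlab
% Hypothetical sample data: three return series with different start dates
Returns = [  0.01    NaN    NaN
             0.02   0.03    NaN
            -0.01   0.01   0.02
             0.03  -0.02   0.01
             0.00   0.02  -0.01 ];

nSeries = size(Returns, 2);
retcov  = zeros(nSeries);

for n = 1:nSeries
    for m = n:nSeries        % upper triangle only; covariance is symmetric
        % Rows where both series have data
        idx = isfinite(Returns(:, n)) & isfinite(Returns(:, m));
        c   = cov(Returns(idx, n), Returns(idx, m));   % 2-by-2 matrix
        retcov(n, m) = c(1, 2);   % off-diagonal element is the covariance
        retcov(m, n) = c(1, 2);   % mirror into the lower triangle
    end
end
```

When n equals m, c(1, 2) is the variance of that series, so the diagonal is filled correctly by the same line. Each pair needs at least two overlapping rows for the estimate to be meaningful.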