Kaplan-Meier survival analysis

  • Follow


Hi,
I'm new to survival analysis. I'm taking over a project from someone using SPSS. 
I tried ecdf(t, 'censoring', censored), but the plots look different from SPSS. 
I already realised that SPSS asks for [time_to_event logical_event_happend] as input while Matlab asks for [time_to_event censored]. 
Did I get that right? 
But why is the curve upside down? and how could I plot the points where data is censored (how do I get y-values)?

Thanks for any hint
Jasmin 
0
Reply Jasmin 1/25/2011 1:43:03 PM

On 1/25/2011 8:43 AM, Jasmin wrote:
> But why is the curve upside down?

You may need to use

 >> help ecdf
  ECDF Empirical (Kaplan-Meier) cumulative distribution function.
[snip]
    'function'    The type of function returned as the F output argument,
                  chosen from 'cdf' (the default), 'survivor', or
                  'cumulative hazard'.

and then plot the values yourself as shown in the help.  If you call 
ECDF with no outputs, it will plot the empirical CDF, and you may be 
asking for the survivor function.
0
Reply Peter 1/25/2011 2:45:47 PM


Hi Peter,
thank you for your help. 
[f,x] = ecdf(t,'censoring',censored, 'function', 'survivor');
I managed to plot the survivor curve, however it still looks different than an spss output. For example, I have survival times of t >100 in my input data, but the curve stops at ~90. Why don't I get x values for the complete timeframe t?

Thank you
Jasmin
0
Reply Jasmin 1/26/2011 10:10:07 AM

Hi,
thanks to the help from Peter, my code for Kaplan-Meier plots looks like this:

% ecdf is the empirical cumulative distribution function (Kaplan-Meier) 
[f,x] = ecdf(t,'censoring',censored, 'function', 'survivor'); %
% add first and last data points
x=[0; x; max(t)];
f= [1; f; f(end)];
% stair plot
figure
stairs(x,f);
ylim([0 1])

% to add marks for censored data
hold on
% get censored times and function values
xi = t(logical(censored));
[dum,step] = histc(xi,x);
step(xi>max(t)) = numel(x);
fi = f(step);
plot(xi,fi, 'b+')
hold off

My next step is to figure out how a log-rank test could be implemented, then I can finally migrate that project from SPSS to MatLab.
Any ideas or links to a pseudo code for the log-rank test? might also be called Mantel-Cox?

Thank you
Jasmin
0
Reply Jasmin 1/26/2011 10:41:07 AM

On 1/26/2011 5:10 AM, Jasmin wrote:
> I managed to plot the survivor curve, however it still looks different
> than an spss output. For example, I have survival times of t >100 in my
> input data, but the curve stops at ~90. Why don't I get x values for the
> complete timeframe t?

Hard to tell from your description, but if all of your largest 
observations are censored, then ECDF will plot the K-M curve only out to 
the largest observed failure, and the curve does not go to 1 (CDF) or 0 
(SF).  That's all it _can_ do.  You can see that if you run the ECDF 
help example several times (it uses random data, so you have to try it a 
few times).
0
Reply Peter 1/26/2011 4:54:19 PM

> My next step is to figure out how a log-rank test could be implemented, 
> then I can finally migrate that project from SPSS to MatLab.
> Any ideas or links to a pseudo code for the log-rank test? might also be 
> called Mantel-Cox?

Jasmin, I can confirm there is no log-rank function in the Statistics 
Toolbox. I see a couple on the file exchange, but I don't have any 
experience with them.

Another option is to perform a Cox proportional hazards regression 
(coxphfit) using an indicator variable for one of the groups, and test 
whether the coefficient of the indicator variable is significantly different 
from zero. I cannot offer a mathematical proof that this will work, but 
quoting Wikipedia:

"The logrank statistic can be derived as the score test for the Cox 
proportional hazards model comparing two groups. It is therefore 
asymptotically equivalent to the likelihood ratio test statistic based from 
that model."

-- Tom 

0
Reply Tom 1/27/2011 7:23:25 PM

Hi, 

I managed an implementation that gives me similar results for the p value of a log rank test compared to SPSS or R. But only similar, not exactly identical. Why? Any clues?

1.
I get x=times of events, N=number at risk and D=number of deaths or events according to ecdf() for cumulated data of group1 and group2.
2.
Then I calculate number at risk per group:
d1 = histc(t1(~censored1),x); % deaths 
c1 = histc(t1(censored1),x); % censored
atRisk1 = size(t1,1) - cumsum(d1) - cumsum([0; c1(1:end-1)]);
d2 = histc(t2(~censored2),x); % deaths
c2 = histc(t2(censored2),x); % censored
atRisk2 = size(t2,1) - cumsum(d2) - cumsum([0; c2(1:end-1)]);
3.
expected number of event, given null hypothesis that risk is identical in both groups:
E=[ sum(atRisk1.*D./N) sum(atRisk2.*D./N) ]
4. 
observed number of events:
O = [sum(~censored1) sum(~censored2)]
5.
Chi square test for difference between observed and expected
chi2 = sum(((O - E).^2)./E);
pval = 1 - chi2cdf(chi2,1);


Any idea is welcome and appreciated.

Thanks
Jasmin
0
Reply Jasmin 2/3/2011 11:50:05 AM

6 Replies
633 Views

(page loaded in 0.058 seconds)

Similiar Articles:










7/21/2012 4:29:21 AM


Reply: