COMPGROUPS.NET | Search | Post Question | Groups | Stream | About | Register

### Kaplan-Meier survival analysis

• Follow

```Hi,
I'm new to survival analysis. I'm taking over a project from someone using SPSS.
I tried ecdf(t, 'censoring', censored), but the plots look different from SPSS.
I already realised that SPSS asks for [time_to_event logical_event_happend] as input while Matlab asks for [time_to_event censored].
Did I get that right?
But why is the curve upside down? and how could I plot the points where data is censored (how do I get y-values)?

Thanks for any hint
Jasmin
```
 0

```On 1/25/2011 8:43 AM, Jasmin wrote:
> But why is the curve upside down?

You may need to use

>> help ecdf
ECDF Empirical (Kaplan-Meier) cumulative distribution function.
[snip]
'function'    The type of function returned as the F output argument,
chosen from 'cdf' (the default), 'survivor', or
'cumulative hazard'.

and then plot the values yourself as shown in the help.  If you call
ECDF with no outputs, it will plot the empirical CDF, and you may be
```
 0

```Hi Peter,
[f,x] = ecdf(t,'censoring',censored, 'function', 'survivor');
I managed to plot the survivor curve, however it still looks different than an spss output. For example, I have survival times of t >100 in my input data, but the curve stops at ~90. Why don't I get x values for the complete timeframe t?

Thank you
Jasmin
```
 0

```Hi,
thanks to the help from Peter, my code for Kaplan-Meier plots looks like this:

% ecdf is the empirical cumulative distribution function (Kaplan-Meier)
[f,x] = ecdf(t,'censoring',censored, 'function', 'survivor'); %
% add first and last data points
x=[0; x; max(t)];
f= [1; f; f(end)];
% stair plot
figure
stairs(x,f);
ylim([0 1])

% to add marks for censored data
hold on
% get censored times and function values
xi = t(logical(censored));
[dum,step] = histc(xi,x);
step(xi>max(t)) = numel(x);
fi = f(step);
plot(xi,fi, 'b+')
hold off

My next step is to figure out how a log-rank test could be implemented, then I can finally migrate that project from SPSS to MatLab.
Any ideas or links to a pseudo code for the log-rank test? might also be called Mantel-Cox?

Thank you
Jasmin
```
 0

```On 1/26/2011 5:10 AM, Jasmin wrote:
> I managed to plot the survivor curve, however it still looks different
> than an spss output. For example, I have survival times of t >100 in my
> input data, but the curve stops at ~90. Why don't I get x values for the
> complete timeframe t?

Hard to tell from your description, but if all of your largest
observations are censored, then ECDF will plot the K-M curve only out to
the largest observed failure, and the curve does not go to 1 (CDF) or 0
(SF).  That's all it _can_ do.  You can see that if you run the ECDF
help example several times (it uses random data, so you have to try it a
few times).
```
 0

```> My next step is to figure out how a log-rank test could be implemented,
> then I can finally migrate that project from SPSS to MatLab.
> Any ideas or links to a pseudo code for the log-rank test? might also be
> called Mantel-Cox?

Jasmin, I can confirm there is no log-rank function in the Statistics
Toolbox. I see a couple on the file exchange, but I don't have any
experience with them.

Another option is to perform a Cox proportional hazards regression
(coxphfit) using an indicator variable for one of the groups, and test
whether the coefficient of the indicator variable is significantly different
from zero. I cannot offer a mathematical proof that this will work, but
quoting Wikipedia:

"The logrank statistic can be derived as the score test for the Cox
proportional hazards model comparing two groups. It is therefore
asymptotically equivalent to the likelihood ratio test statistic based from
that model."

-- Tom

```
 0

```Hi,

I managed an implementation that gives me similar results for the p value of a log rank test compared to SPSS or R. But only similar, not exactly identical. Why? Any clues?

1.
I get x=times of events, N=number at risk and D=number of deaths or events according to ecdf() for cumulated data of group1 and group2.
2.
Then I calculate number at risk per group:
d1 = histc(t1(~censored1),x); % deaths
c1 = histc(t1(censored1),x); % censored
atRisk1 = size(t1,1) - cumsum(d1) - cumsum([0; c1(1:end-1)]);
d2 = histc(t2(~censored2),x); % deaths
c2 = histc(t2(censored2),x); % censored
atRisk2 = size(t2,1) - cumsum(d2) - cumsum([0; c2(1:end-1)]);
3.
expected number of event, given null hypothesis that risk is identical in both groups:
E=[ sum(atRisk1.*D./N) sum(atRisk2.*D./N) ]
4.
observed number of events:
O = [sum(~censored1) sum(~censored2)]
5.
Chi square test for difference between observed and expected
chi2 = sum(((O - E).^2)./E);
pval = 1 - chi2cdf(chi2,1);

Any idea is welcome and appreciated.

Thanks
Jasmin
```
 0

6 Replies
633 Views

Similiar Articles:

7/21/2012 4:29:21 AM