1 Introduction

It is a “well-known fact” that with time-varying covariates, there is no such thing as proportional hazards (PH) (Cox and Oakes 1984). That is, if the ratio between any two individuals’ hazard functions is not time-constant, then by definition the proportional hazards assumption does not hold. Nevertheless, under certain conditions, it is possible to perform statistical analysis with standard Cox regression software despite the presence of time-varying covariates. It is also possible to follow up with formal tests of the assumption of proportional hazards. This is a continuing source of confusion, since it appears contradictory to test for a property that by definition is not present.

We argue that there is a need for a distinction in terminology, and we introduce the concept of Uniform Relative Risk Increase (URRI), which should be understood as an extension of the proportional hazards assumption into the space of extended Cox regression models (Therneau and Grambsch 2000), sometimes called extended proportional hazards models. It should be clear, however, that no new methods are introduced: this is just a matter of terminology, and a justification of current practice.

URRI is defined for situations where a time-dependent covariate is a step function. In that situation it is possible to formulate a PH model for a pseudo population, created by splitting each observational spell into pieces with cut points equal to the jump time points of the covariate in question. Each piece is then regarded as an independent individual, with a time-constant value of that covariate.

The need for new terminology became urgent when I, the first author of this paper, was a second supervisor on a thesis (Boman and Öhman 2018) by the second and third authors. They were studying the effect of certain crisis years on mortality in a historical context (nineteenth-century Sweden). The effect of the crisis was modeled with the aid of a time-varying indicator for the years of crisis in an extended Cox regression. When I suggested that they should “test for proportional hazards” of the crisis indicator, they objected on the grounds that, by assumption, an extended Cox model is not a PH model. Since they insisted, I had to give them credit, and we started discussing ways out of this dilemma. We agreed upon the need for a new terminology that would cover their case, hence this paper.

2 A motivating example

Suppose there are two individuals, A and B, where under “normal conditions” the hazard of dying for B is twice that of A at adult ages. Formally, we have

\[ h_B(t) = 2 h_A(t), \; 20 < t \le 60, \] where \(t\) is age, see Figure 2.1.

Figure 2.1: Proportional hazards, two individuals.

This is proportional hazards. Now, suppose that A experiences a five-year crisis at ages 40–45, while B experiences the same crisis at ages 45–50 (B is exactly five years older than A). Suppose further that the effect of the crisis on mortality is to increase the hazard of dying by 20 per cent for both A and B, see Figure 2.2.

Figure 2.2: Non-proportional hazards, two individuals.
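Written out, the hazards in Figure 2.2 are as follows. Here \(\tilde h_A\) and \(\tilde h_B\) are ad hoc notation, introduced only for this display, for the hazards including the crisis effect, and the half-open age intervals are an assumption that follows the convention of the display above:

\[ \tilde h_A(t) = \begin{cases} 1.2\, h_A(t), & 40 < t \le 45, \\ h_A(t), & \text{otherwise}, \end{cases} \qquad \tilde h_B(t) = \begin{cases} 1.2\, h_B(t) = 2.4\, h_A(t), & 45 < t \le 50, \\ h_B(t) = 2\, h_A(t), & \text{otherwise}, \end{cases} \]

so that, for \(20 < t \le 60\), the hazard ratio is

\[ \frac{\tilde h_B(t)}{\tilde h_A(t)} = \begin{cases} 2 / 1.2 \approx 1.67, & 40 < t \le 45, \\ 2.4, & 45 < t \le 50, \\ 2, & \text{otherwise}. \end{cases} \]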

Clearly not proportional hazards! The data set may look like

##   id enter exit event crisis
## 1  A    20   40 FALSE     no
## 2  A    40   45 FALSE    yes
## 3  A    45   60 FALSE     no
## 4  B    20   45 FALSE     no
## 5  B    45   50 FALSE    yes
## 6  B    50   60 FALSE     no
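As a minimal sketch in base R, a listing of this shape could be produced by cutting each spell at the ages where the crisis indicator jumps. The data frame `spells`, its columns `crisis_start` and `crisis_end`, and the helper `split_spell` are hypothetical names introduced for this illustration. They are not part of eha.

```r
# Hypothetical input: one observation spell per individual, plus the ages
# at which the crisis starts and ends for that individual.
spells <- data.frame(
  id           = c("A", "B"),
  enter        = c(20, 20),
  exit         = c(60, 60),
  event        = c(FALSE, FALSE),   # both are censored at age 60
  crisis_start = c(40, 45),
  crisis_end   = c(45, 50),
  stringsAsFactors = FALSE
)

# Cut one spell (a one-row data frame) at the jump points of the indicator.
split_spell <- function(s) {
  cuts <- c(s$enter, s$crisis_start, s$crisis_end, s$exit)
  # Keep cut points inside the spell and drop duplicates.
  cuts <- sort(unique(pmin(pmax(cuts, s$enter), s$exit)))
  lower <- cuts[-length(cuts)]
  upper <- cuts[-1]
  data.frame(
    id     = s$id,
    enter  = lower,
    exit   = upper,
    # An event, if any, belongs to the last piece only.
    event  = c(rep(FALSE, length(lower) - 1), s$event),
    crisis = ifelse(lower >= s$crisis_start & upper <= s$crisis_end, "yes", "no"),
    stringsAsFactors = FALSE
  )
}

pieces <- do.call(rbind, lapply(seq_len(nrow(spells)),
                                function(i) split_spell(spells[i, ])))
rownames(pieces) <- NULL
print(pieces)
```

The six rows of `pieces` carry the same values as the listing above: each individual contributes one piece before, one during, and one after the crisis.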

However, we may hypothetically regard the data as generated by four individuals, \(A_0\), \(A_1\), \(B_0\), and \(B_1\), see Figure 2.3.

Figure 2.3: Proportional hazards, four pseudo-individuals.

Then we (potentially) have proportional hazards: there are no time-varying covariates, and it is perfectly legitimate to test for proportional hazards. The data set may look like

##   id enter exit event crisis
## 1 A0    20   40 FALSE     no
## 2 A0    45   60 FALSE     no
## 3 A1    40   45 FALSE    yes
## 4 B0    20   45 FALSE     no
## 5 B0    50   60 FALSE     no
## 6 B1    45   50 FALSE    yes
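The relabelling into pseudo-individuals can be sketched in base R as follows. The object names `pieces` and `pseudo`, and the rule of suffixing the id with 1 during the crisis and 0 otherwise, are assumptions made for this illustration.

```r
# The split data for the two real individuals (values as in the first listing).
pieces <- data.frame(
  id     = c("A", "A", "A", "B", "B", "B"),
  enter  = c(20, 40, 45, 20, 45, 50),
  exit   = c(40, 45, 60, 45, 50, 60),
  event  = FALSE,
  crisis = c("no", "yes", "no", "no", "yes", "no"),
  stringsAsFactors = FALSE
)

# Hypothetical relabelling: suffix 1 during the crisis, 0 otherwise, so that
# each pseudo-individual has a time-constant value of crisis.
pseudo <- pieces
pseudo$id <- paste0(pieces$id, ifelse(pieces$crisis == "yes", 1, 0))
pseudo <- pseudo[order(pseudo$id, pseudo$enter), ]
rownames(pseudo) <- NULL
print(pseudo)

# Check: crisis takes exactly one value within each pseudo-individual.
n_values <- tapply(pseudo$crisis, pseudo$id, function(x) length(unique(x)))
print(n_values)
```

In terms of the motivating example, the four pseudo-individuals \(A_0\), \(A_1\), \(B_0\), and \(B_1\) then have hazards \(h_A(t)\), \(1.2\, h_A(t)\), \(2\, h_A(t)\), and \(2.4\, h_A(t)\) on their respective age intervals, all constant multiples of \(h_A(t)\).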

3 Male mortality, 19th century Skellefteå

As a real-life example, consider male.mortality, a data set built into the R (R Core Team 2018) package eha (Broström 2012, 2018). Its first six rows are:

##   id  enter   exit event birthdate   ses
## 1  1  0.000 20.000     0  1800.010 upper
## 2  2  3.478 17.562     1  1800.015 lower
## 3  3  0.000 13.463     0  1800.031 upper
## 4  3 13.463 20.000     0  1800.031 lower
## 5  4  0.000 20.000     0  1800.064 lower
## 6  5  0.000  0.089     0  1800.084 lower

Here, the covariate ses (“socio-economic status”) is time-varying: individual 3, for example, changes from upper to lower after 13.463 years. We may check for URRI by regarding each row in the data frame as a unique individual and testing for proportional hazards.

library(eha)       # coxreg() and the male.mortality data
library(survival)  # Surv() and cox.zph()
fit <- coxreg(Surv(enter, exit, event) ~ birthdate + ses, data = male.mortality)
cox.zph(fit)
##               rho  chisq     p
## birthdate -0.0643 1.0630 0.303
## sesupper   0.0158 0.0687 0.793
## GLOBAL         NA 1.1263 0.569

Neither the individual tests nor the global test (p = 0.569) gives any evidence of non-proportionality.

drop1(fit, test = "Chisq")
## Single term deletions
## 
## Model:
## Surv(enter, exit, event) ~ birthdate + ses
##           Df    AIC     LRT  Pr(>Chi)
## <none>       3687.3                  
## birthdate  1 3692.6  7.2752  0.006991
## ses        1 3701.4 16.1095 5.978e-05

Both birthdate (p ≈ 0.007) and ses (p < 0.001) are statistically significant at conventional levels.

For a graphical illustration regarding ses, see Figure 3.1.

fit0 <- coxreg(Surv(enter, exit, event) ~ birthdate + strata(ses), data = male.mortality)
plot(fit0, col = c("red", "blue"), xlab = "Years after age 20.", ylab = "Cumulative hazards", las = 1)