Modern Stochastics: Theory and Applications


Estimation in Cox proportional hazards model with heteroscedastic errors in covariates
Volume 11, Issue 4 (2024), pp. 479–489
Oksana Chernova
https://doi.org/10.15559/24-VMSTA258
Pub. online: 30 May 2024      Type: Research Article      Open Access

Received
1 March 2024
Revised
13 May 2024
Accepted
14 May 2024
Published
30 May 2024

Abstract

Consistent estimators of the baseline hazard rate and the regression parameter are constructed in the Cox proportional hazards model with heteroscedastic measurement errors, assuming that the baseline hazard function belongs to a certain class of functions with bounded Lipschitz constants.

1 Introduction

Survival analysis is a set of statistical methods for analyzing data that represent times until the occurrence of a specified event. It is an important part of mathematical statistics due to its wide range of applications: medicine, reliability theory, etc.
The Cox proportional hazards (CPH) model is a semi-parametric regression model that is used to study the association between the survival time of subjects (the so-called lifetime) and one or more predictor variables (the so-called covariates). One of the main features of survival analysis is the presence of incomplete observations or censoring. In such cases only partial information about the true lifetime is available, e.g., that it exceeds some value in case of right censoring. The primary quantities of interest to estimate are the hazard function and the regression parameter. The hazard function represents the instantaneous rate at which events occur at a particular time, given that the individual has survived up to that time. The vector of regression parameters represents effects of covariates on the hazard.
D. R. Cox [4] estimates the regression parameter using partial likelihood, without additional assumptions on the baseline hazard rate. Estimators of the cumulative hazard are proposed in [9] and [3]; there, the baseline hazard rate is assumed to belong to a parametric family (piecewise constant on certain intervals). Moreover, these papers consider models without errors in variables.
Nonparametric inference under shape constraints has been an active research field in recent decades. In this framework, the estimated parameters or functions are constrained to satisfy certain shape properties such as monotonicity, convexity or log-concavity. The development of nonparametric methods for the estimation of a monotone density started with the pioneering paper of Grenander [7]. The 2018 special issue of Statistical Science was devoted to inference under shape constraints. A review of recent progress in log-concave density estimation is given in [16]. A review of methods for shape-constrained baseline hazard functions in the case of censored data is presented in [8]. A monotone baseline hazard rate in the Cox model is considered in [14] and [5]. A different variant of the shape-constrained Cox model, with an application to breast cancer patients’ survival, is considered in [15].
One should be cautious when making inference with regression models in which covariates may be measured with errors. It is known that applying naive methodology can lead to inconsistent estimation; see Wallace [17]. The CPH model with measurement errors is studied, among others, by Kong and Gu [10] and Augustin [1]. Typically, one first estimates the vector of regression parameters, and then constructs an estimator of the cumulative baseline hazard rate.
In Kukush et al. [11] the baseline hazard rate belongs to a bounded set of nonnegative Lipschitz functions, and is estimated simultaneously with the vector of regression parameters. This approach is further developed in Kukush and Chernova [12], where the baseline hazard rate belongs to an unbounded set of nonnegative Lipschitz functions.
In all the aforementioned papers, measurement errors are assumed to be independent and identically distributed. In practice, however, measurement errors can vary considerably among subjects. For example, Augustin et al. [2] propose a regression calibration estimation method for the CPH model under heteroscedastic measurement errors for nutritional data.
In the present paper we consider a CPH model similar to [11] and [12], but with heteroscedastic measurement errors. The paper is organized as follows. Section 2 describes the observation model and states the main assumptions. In Section 3 we construct a simultaneous consistent estimator $({\hat{\lambda }_{n}},{\hat{\beta }_{n}})$ of the baseline hazard rate and the regression parameter in the CPH model with heteroscedastic errors in covariates under a bounded parameter set. In Section 4 we do the same for an unbounded parameter set, and Section 5 concludes.

2 Model description

The Cox proportional hazards model introduced in [4] assumes that the lifetime T has a hazard rate at moment t for a subject with random vector of covariates X as follows:
\[ \lambda (t|\mathbf{X};{\lambda _{0}},{\beta _{0}})={\lambda _{0}}(t)\exp ({\beta _{0}^{\top }}\mathbf{X}),\hspace{1em}t\ge 0.\]
Here, ${\beta _{0}}$ is a regression parameter belonging to ${\Theta _{\beta }}\subset {\mathbb{R}^{k}}$, and ${\lambda _{0}}(\cdot )\in {\Theta _{\lambda }}\subset C[0,\tau ]$ is a baseline hazard function. The unbounded parameter set ${\Theta _{\lambda }}$ consists of all nonnegative functions with a bounded Lipschitz constant. Instead of the lifetime T, right-censored data $Y:=\min \{T,C\}$ and $\Delta :={I_{\{T\le C\}}}$ are available. The censoring time C has an unknown distribution concentrated on a given interval $[0,\tau ]$. The pair $(\mathbf{X},T)$ and the random variable C are independent.
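As a numerical illustration (not part of the paper), lifetimes with this hazard structure can be simulated by inverting the cumulative baseline hazard. The sketch below assumes a scalar covariate, the Lipschitz baseline hazard $\lambda_0(t)=1+t/2$, and uniform censoring on $[0,\tau ]$; all concrete choices are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta0, tau = 1000, 0.7, 2.0

def cum_hazard(t):
    # Lambda_0(t) for lambda_0(t) = 1 + t/2 (Lipschitz with L = 1/2)
    return t + t**2 / 4

def inv_cum_hazard(v):
    # solve t + t^2/4 = v for t >= 0 (positive root of the quadratic)
    return 2 * (np.sqrt(1 + v) - 1)

X = rng.normal(size=n)                      # true covariates
E = rng.exponential(size=n)                 # unit-exponential draws
T = inv_cum_hazard(E * np.exp(-beta0 * X))  # Lambda_0(T) * e^{beta0 X} = E
C = rng.uniform(0, tau, size=n)             # censoring times on [0, tau]
Y = np.minimum(T, C)                        # observed (possibly censored) time
Delta = (T <= C).astype(int)                # censoring indicator
```

This uses the standard fact that $\Lambda_0(T)\exp(\beta_0 X)$ is unit-exponential under the model, so T can be drawn by inverse transform.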
Assume that instead of true covariates, one can only observe surrogate vector variables
\[ \mathbf{{W_{i}}}=\mathbf{{X_{i}}}+\mathbf{{U_{i}}},\]
where (not necessarily identically distributed) measurement errors $\mathbf{{U_{i}}},i=1,2,\dots ,n$, are mutually independent centered random vectors that are also independent of the random sequence $(\mathbf{{X_{i}}},{T_{i}},{C_{i}},{Y_{i}},{\Delta _{i}}),i=1,2,\dots ,n$. The moment generating functions ${M_{\mathbf{{U_{i}}}}}(z):=\mathsf{E}\hspace{2.5pt}{e^{{z^{\top }}\mathbf{{U_{i}}}}}$ of the measurement errors $\mathbf{{U_{i}}}$ are assumed known. The goal is to estimate ${\beta _{0}}$ and ${\lambda _{0}}$ based on the observations $\left({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}}\right),i=1,\dots ,n$.
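For concreteness, heteroscedastic surrogates with known moment generating functions can be generated as follows. This sketch (not from the paper) assumes scalar normal errors ${U_{i}}\sim N(0,{\sigma _{i}^{2}})$, for which ${M_{{U_{i}}}}(\beta )=\exp ({\sigma _{i}^{2}}{\beta ^{2}}/2)$; the subject-specific variances are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
X = rng.normal(size=n)                 # true (unobserved) covariates

# heteroscedastic setting: each subject has its own error variance
sigma = rng.uniform(0.1, 0.5, size=n)
U = rng.normal(scale=sigma)            # centered, independent, not i.i.d.
W = X + U                              # observed surrogates

def mgf_U(beta, sigma_i):
    # known MGF of U_i ~ N(0, sigma_i^2): E exp(beta * U_i)
    return np.exp(0.5 * (sigma_i * beta) ** 2)
```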
Introduce the following assumptions.
  • (i) ${\Theta _{\lambda }}:=\{\hspace{2.5pt}f:[0,\tau ]\to \mathbb{R}\hspace{2.5pt}|\hspace{0.2778em}f(t)\ge a,\hspace{2.5pt}\forall t\in [0,\tau ],\hspace{0.2778em}f(0)\le A,\hspace{0.2778em}\text{and}\hspace{2.5pt}|f(t)-f(s)|\le L|t-s|,\forall t,s\in [0,\tau ]\hspace{2.5pt}\}$, where a, A and L are fixed positive constants, with $a\lt A$.
  • (i’) ${\Theta _{\lambda }}:=\{\hspace{2.5pt}f:[0,\tau ]\to \mathbb{R}\hspace{2.5pt}|\hspace{0.2778em}f(t)\ge 0,\hspace{2.5pt}\forall t\in [0,\tau ],\hspace{0.2778em}\text{and}\hspace{2.5pt}|f(t)-f(s)|\le L|t-s|,\forall t,s\in [0,\tau ]\hspace{2.5pt}\}$, where L is a fixed positive constant.
  • (ii) ${\Theta _{\boldsymbol{\beta }}}$ is a compact set in ${\mathbb{R}^{k}}$.
  • (iii) There exist positive K and ϵ such that for all $n\ge 1$,
    \[ \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\hspace{2.5pt}{e^{2D\| \mathbf{{U_{i}}}\| }}\le K,\hspace{2.5pt}\text{with}\hspace{2.5pt}D:=\underset{\boldsymbol{\beta }\in {\Theta _{\boldsymbol{\beta }}}}{\max }\| \boldsymbol{\beta }\| +\epsilon .\]
  • (iv) $\mathsf{E}\hspace{2.5pt}{e^{2D\| \mathbf{X}\| }}\lt \infty $, with D defined in (iii).
  • (v) $\tau \gt 0$ is the right endpoint of censor’s distribution, i.e. $\mathsf{P}(C\gt \tau )=0$ and for all $\epsilon \gt 0$, $\mathsf{P}(C\gt \tau -\epsilon )\gt 0$.
  • (vi) The matrix $\mathsf{E}\hspace{2.5pt}\mathbf{\mathbf{X}{X^{\top }}}$ of second moments is positive definite.
Likelihood construction in presence of censored data is described in [13]. In case where covariates are observed without errors, the log-likelihood function is given by
\[\begin{aligned}{}{Q_{n}}(\lambda ,\boldsymbol{\beta })& :=\frac{1}{n}{\sum \limits_{i=1}^{n}}q({Y_{i}},{\Delta _{i}},{\mathbf{X}_{i}};\lambda ,\boldsymbol{\beta }),\hspace{1em}\text{with}\\ {} q({Y_{i}},{\Delta _{i}},\mathbf{{X_{i}}};\lambda ,\boldsymbol{\beta })& :={\Delta _{i}}\cdot (\log \lambda ({Y_{i}})+{\boldsymbol{\beta }^{\top }}\mathbf{{X_{i}}})-\exp ({\boldsymbol{\beta }^{\top }}\mathbf{{X_{i}}}){\int _{0}^{{Y_{i}}}}\lambda (u)du.\end{aligned}\]
T. Augustin [1] proposed the following objective function to adjust for homoscedastic measurement errors
\[ {Q_{n}^{cor}}(\lambda ,\boldsymbol{\beta }):=\frac{1}{n}{\sum \limits_{i=1}^{n}}{q^{cor}}({Y_{i}},{\Delta _{i}},{\mathbf{W}_{i}};\lambda ,\boldsymbol{\beta }),\]
with
\[ {q^{cor}}({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}};\lambda ,\boldsymbol{\beta }):={\Delta _{i}}\cdot (\log \lambda ({Y_{i}})+{\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}})-\frac{\exp ({\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}})}{{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}{\int _{0}^{{Y_{i}}}}\lambda (u)du.\]
We have
\[ \mathsf{E}[{q^{cor}}({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}})\hspace{2.5pt}|\hspace{2.5pt}{Y_{i}},{\Delta _{i}},\mathbf{{X_{i}}}]=q({Y_{i}},{\Delta _{i}},\mathbf{{X_{i}}}).\]
Therefore,
\[ \mathsf{E}\hspace{2.5pt}{q^{cor}}({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}})=\mathsf{E}\hspace{2.5pt}q({Y_{i}},{\Delta _{i}},\mathbf{{X_{i}}})=\mathsf{E}\hspace{2.5pt}q({Y_{1}},{\Delta _{1}},\mathbf{{X_{1}}}).\]
The latter equality means that $\mathsf{E}[{q^{cor}}({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}})]$ does not depend on i.
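As an illustration (not from the paper), assuming scalar covariates and normal errors ${U_{i}}\sim N(0,{\sigma _{i}^{2}})$, so that ${M_{{U_{i}}}}(\beta )=\exp ({\sigma _{i}^{2}}{\beta ^{2}}/2)$, the corrected objective ${Q_{n}^{cor}}$ can be coded as:

```python
import numpy as np

def Q_cor(lam, Lam, beta, Y, Delta, W, sigma):
    """Corrected objective Q_n^cor for a scalar covariate with
    U_i ~ N(0, sigma_i^2), i.e. M_{U_i}(beta) = exp(sigma_i^2 beta^2 / 2).
    lam: baseline hazard lambda(.); Lam: its antiderivative with Lam(0) = 0."""
    mgf = np.exp(0.5 * (sigma * beta) ** 2)
    return np.mean(Delta * (np.log(lam(Y)) + beta * W)
                   - np.exp(beta * W) / mgf * Lam(Y))
```

Dividing the exponential term by the known MGF is exactly what removes the bias induced by the measurement errors, as the conditional-expectation identity above shows.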
Denote
\[ {q_{\infty }}(\lambda ,\beta )=\frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}{q^{cor}}({Y_{i}},{\Delta _{i}},{\mathbf{W}_{i}};\lambda ,\boldsymbol{\beta })=\mathsf{E}{q^{cor}}({Y_{1}},{\Delta _{1}},\mathbf{{W_{1}}}).\]
We will make use of both forms of ${q_{\infty }}$.
We define a simultaneous estimator of the baseline hazard rate and regression parameter under bounded and unbounded parameter sets as follows.
Definition 1 (under bounded parameter set).
A Borel function $\left(\hat{\lambda },\hat{\boldsymbol{\beta }}\right)=\left({\hat{\lambda }_{n}},{\hat{\boldsymbol{\beta }}_{n}}\right)$ of observations $({Y_{i}},{\Delta _{i}},{\mathbf{W}_{i}})$, $i=1,\dots ,n$, with values in $\Theta ={\Theta _{\beta }}\times {\Theta _{\lambda }}$, where ${\Theta _{\lambda }}$ is bounded, and such that
(1)
\[ \left(\hat{\lambda },\hat{\boldsymbol{\beta }}\right)=\arg \underset{(\lambda ,\boldsymbol{\beta })\in \Theta }{\max }{Q_{n}^{cor}}(\lambda ,\boldsymbol{\beta }),\]
is called a simultaneous estimator of the baseline hazard rate and the regression parameter under the bounded parameter set Θ.
Definition 2 (under unbounded parameter set).
Let $\{{\varepsilon _{n}}\}$ be a fixed sequence of positive numbers such that ${\varepsilon _{n}}\downarrow 0$ as $n\to \infty $. A Borel function $\left(\hat{\lambda },\hat{\boldsymbol{\beta }}\right)=\left({\hat{\lambda }_{n}},{\hat{\boldsymbol{\beta }}_{n}}\right)$ of observations $({Y_{i}},{\Delta _{i}},{\mathbf{W}_{i}})$, $i=1,\dots ,n$, with values in $\Theta ={\Theta _{\beta }}\times {\Theta _{\lambda }}$, where ${\Theta _{\lambda }}$ is unbounded, and such that
(2)
\[ {Q_{n}^{cor}}\left(\hat{\lambda },\hat{\boldsymbol{\beta }}\right)\ge \underset{(\lambda ,\boldsymbol{\beta })\in \Theta }{\sup }{Q_{n}^{cor}}(\lambda ,\boldsymbol{\beta })-{\varepsilon _{n}},\]
is called a simultaneous estimator of the baseline hazard rate and the regression parameter over the unbounded parameter set Θ.
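To make Definition 1 concrete, here is a toy grid-search sketch (not the paper's method): it restricts ${\Theta _{\lambda }}$ to constant hazards $\lambda (t)\equiv c$, which belong to ${\Theta _{\lambda }}$ with Lipschitz constant 0, and maximizes ${Q_{n}^{cor}}$ over a finite grid. The normal-error MGF and all grid values are hypothetical.

```python
import numpy as np

def fit_bounded(Y, Delta, W, sigma, betas, cs):
    """Illustrative maximizer for Definition 1, with Theta_lambda restricted
    to constant hazards lambda(t) = c (so that the integral of lambda over
    [0, Y_i] is c * Y_i) and U_i ~ N(0, sigma_i^2), for which
    M_{U_i}(beta) = exp(sigma_i^2 beta^2 / 2)."""
    best, arg = -np.inf, None
    for c in cs:
        for b in betas:
            mgf = np.exp(0.5 * (sigma * b) ** 2)
            q = np.mean(Delta * (np.log(c) + b * W)
                        - np.exp(b * W) / mgf * c * Y)
            if q > best:
                best, arg = q, (c, b)
    return arg
```

In practice one would parametrize $\lambda$ more richly (e.g. piecewise linear with slope bounded by L) and use a constrained optimizer; the brute-force grid only illustrates the maximization in (1).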

3 Simultaneous estimation under bounded parameter set

We extend the result about consistency of a simultaneous estimator of regression parameter and baseline hazard under bounded parameter set from Kukush et al. [11] to the case of heterogeneous measurement errors. In the next section we proceed with a similar result for unbounded parameter set. The main result of this section is the following theorem.
Theorem 1.
Under (i)–(vi), $\left(\hat{\lambda },\hat{\boldsymbol{\beta }}\right)$ defined in (1) is a strongly consistent estimator of the true parameters $\left({\lambda _{0}},{\boldsymbol{\beta }_{0}}\right)$.
In what follows, we rely on the following version of the Strong Law of Large Numbers for independent, not necessarily identically distributed random variables, and on the easy-to-prove Statement 1.
Theorem (Kolmogorov’s Strong Law of Large Numbers, section 10.7 [6]).
Let ${\{{\xi _{n}}\}_{n\ge 1}}$ be a sequence of independent (not necessarily identically distributed) random variables such that
\[ {\sum \limits_{n=1}^{\infty }}\frac{\mathrm{Var}({\xi _{n}})}{{n^{2}}}\lt \infty \hspace{2.5pt}.\]
Then
\[ \frac{1}{n}{\sum \limits_{i=1}^{n}}{\xi _{i}}-\frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}{\xi _{i}}\to 0\hspace{1em}\textit{a.s. as}\hspace{2.5pt}n\to \infty \hspace{2.5pt}.\]
Statement 1.
Let ${\{{s_{n}}\}_{n\ge 1}}$ be a real valued sequence such that $\frac{1}{n}{\textstyle\sum _{i=1}^{n}}{s_{i}}$ is bounded. Then ${\textstyle\sum _{n=1}^{\infty }}{s_{n}}/{n^{2}}$ converges.
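Statement 1 admits a short Abel-summation proof (a sketch added for completeness). Put ${S_{n}}={\textstyle\sum _{i=1}^{n}}{s_{i}}$ and ${S_{0}}=0$, so that $|{S_{n}}|\le Mn$ for some $M\gt 0$. Then
\[ {\sum \limits_{n=1}^{N}}\frac{{s_{n}}}{{n^{2}}}={\sum \limits_{n=1}^{N}}\frac{{S_{n}}-{S_{n-1}}}{{n^{2}}}=\frac{{S_{N}}}{{N^{2}}}+{\sum \limits_{n=1}^{N-1}}{S_{n}}\left(\frac{1}{{n^{2}}}-\frac{1}{{(n+1)^{2}}}\right),\]
where ${S_{N}}/{N^{2}}\to 0$ and the last series converges absolutely, since $|{S_{n}}|\left(\frac{1}{{n^{2}}}-\frac{1}{{(n+1)^{2}}}\right)\le Mn\cdot \frac{2n+1}{{n^{2}}{(n+1)^{2}}}\le \frac{3M}{{n^{2}}}$.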
Remark 1.
By K we will denote any positive deterministic constant the exact value of which is not important. Note that K may change from line to line (or even within one line).
Remark 2.
Condition (iii) implies that there exists positive K, such that for all $n\ge 1$,
(3)
\[ \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\hspace{2.5pt}\| \mathbf{{U_{i}}}{\| ^{2}}\le K.\]
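Indeed (a step added for completeness), the implication follows from the elementary bound ${x^{2}}\le {e^{-2}}{D^{-2}}{e^{2Dx}}$ for $x\ge 0$ (maximize ${x^{2}}{e^{-2Dx}}$ at $x=1/D$), so that
\[ \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\hspace{2.5pt}\| \mathbf{{U_{i}}}{\| ^{2}}\le \frac{{e^{-2}}}{{D^{2}}}\cdot \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\hspace{2.5pt}{e^{2D\| \mathbf{{U_{i}}}\| }}\le \frac{{e^{-2}}K}{{D^{2}}}.\]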
Proof of Theorem 1.
Similarly to the proof of Theorem 1 from [11], one can show the strong consistency of the estimators if
  • (a) $\underset{(\lambda ,\boldsymbol{\beta })\in \Theta }{\sup }|{Q_{n}^{cor}}(\lambda ,\boldsymbol{\beta })-{q_{\infty }}(\lambda ,\beta )|\to 0\hspace{1em}\text{a.s. as}\hspace{2.5pt}n\to \infty $;
  • (b) ${q_{\infty }}(\lambda ,\beta )\le {q_{\infty }}({\lambda _{0}},{\beta _{0}})$, and equality holds if and only if $\lambda ={\lambda _{0}}$, $\beta ={\beta _{0}}$.
Denote by $\frac{\partial {q^{cor}}}{\partial \boldsymbol{\beta }}$ the derivative of ${q^{cor}}$ with respect to vector $\boldsymbol{\beta }$. For a fixed value of $\boldsymbol{\beta }$, consider ${q^{cor}}$ as a function of λ, i.e. ${q^{cor}}(\cdot ,\boldsymbol{\beta }):C[0,\tau ]\to \mathbb{R}$. Denote by $\frac{\partial {q^{cor}}}{\partial \lambda }$ the Fréchet derivative of ${q^{cor}}$ with respect to the function λ. Then $\frac{\partial {q^{cor}}}{\partial \lambda }:C[0,\tau ]\to \mathcal{L}(C[0,\tau ],\mathbb{R})$ is a linear continuous functional. For $h\in C[0,\tau ]$, let $\left\langle \frac{\partial {q^{cor}}}{\partial \lambda },h\right\rangle $ denote the action of the functional $\frac{\partial {q^{cor}}}{\partial \lambda }$ on h. We have
\[ \left\langle \frac{\partial {q^{cor}}}{\partial \lambda }(Y,\Delta ,W;\lambda ,\boldsymbol{\beta }),h\right\rangle =\frac{\Delta h(Y)}{\lambda (Y)}-\frac{{e^{{\boldsymbol{\beta }^{\top }}\mathbf{W}}}}{{M_{\mathbf{U}}}(\boldsymbol{\beta })}{\int _{0}^{Y}}h(u)du,\]
\[ \left|\left|\frac{\partial {q^{cor}}}{\partial \lambda }(Y,\Delta ,W;\lambda ,\boldsymbol{\beta })\right|\right|=\underset{\left|\left|h\right|\right|=1}{\sup }\left\langle \frac{\partial {q^{cor}}}{\partial \lambda }(Y,\Delta ,W;\lambda ,\boldsymbol{\beta }),\hspace{2.5pt}h\right\rangle ,\]
\[\begin{aligned}{}\frac{\partial {q^{cor}}}{\partial \boldsymbol{\beta }}({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}};\lambda ,\boldsymbol{\beta })& ={\Delta _{i}}\cdot \mathbf{{W_{i}}}-\frac{{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })\mathbf{{W_{i}}}-\mathsf{E}(\mathbf{{U_{i}}}{\mathbf{e}^{{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}}}})}{{M_{\mathbf{{U_{i}}}}^{2}}(\boldsymbol{\beta })}{e^{{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}}}}{\int _{0}^{Y}}\lambda (u)du.\end{aligned}\]
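These derivative formulas can be sanity-checked numerically. The sketch below (scalar case, hypothetical values, a normal error so that ${M_{U}}(\beta )=\exp ({\sigma ^{2}}{\beta ^{2}}/2)$) compares the Fréchet derivative of ${q^{cor}}$ in a direction h with a finite difference.

```python
import numpy as np

# hypothetical data point and parameters (scalar case)
Yv, Dv, Wv, beta, sigma = 1.2, 1, 0.4, 0.5, 0.3
mgf = np.exp(0.5 * (sigma * beta) ** 2)      # M_U(beta) for U ~ N(0, sigma^2)

lam = lambda t: 1.0 + 0.5 * t                # baseline hazard (Lipschitz)
h = lambda t: np.sin(t)                      # direction h in C[0, tau]
Lam = Yv + Yv**2 / 4                         # integral of lam over [0, Yv]
H = 1 - np.cos(Yv)                           # integral of h over [0, Yv]

def q_cor(l_at_Y, int_lam):
    # q^cor as a function of lambda(Y) and the integral of lambda over [0, Y]
    return Dv * (np.log(l_at_Y) + beta * Wv) - np.exp(beta * Wv) / mgf * int_lam

# analytic <d q^cor / d lambda, h> versus a finite difference in direction h
analytic = Dv * h(Yv) / lam(Yv) - np.exp(beta * Wv) / mgf * H
eps = 1e-6
numeric = (q_cor(lam(Yv) + eps * h(Yv), Lam + eps * H) - q_cor(lam(Yv), Lam)) / eps
```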
According to [11] in order to verify (a) it suffices to show:
  • (a1) ${Q_{n}^{cor}}(\lambda ,\boldsymbol{\beta })-{q_{\infty }}(\lambda ,\beta )\to 0\hspace{1em}\text{a.s. as}\hspace{2.5pt}n\to \infty \hspace{2.5pt}\text{for all}\hspace{2.5pt}(\lambda ,\beta )\in \Theta $;
  • (a2) there exists a positive constant K such that
    (4)
    \[\begin{aligned}{}& \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\underset{(\lambda ,\boldsymbol{\beta })\in \Theta }{\sup }\left|\left|\frac{\partial {q^{cor}}}{\partial \beta }({Y_{i}},{\Delta _{i}},{W_{i}};\lambda ,\boldsymbol{\beta })\right|\right|\le K,\end{aligned}\]
    (5)
    \[\begin{aligned}{}& \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\underset{(\lambda ,\boldsymbol{\beta })\in \Theta }{\sup }\left|\left|\frac{\partial {q^{cor}}}{\partial \lambda }({Y_{i}},{\Delta _{i}},{W_{i}};\lambda ,\boldsymbol{\beta })\right|\right|\le K,\end{aligned}\]
    for all $n\ge 1$, where supremum is taken over ${\Theta _{\lambda }}\times \text{conv}({\Theta _{\beta }})$.
  • (a3) ${q_{\infty }}(\lambda ,\beta )$ is continuous in $(\lambda ,\beta )$.
To verify condition (a1), rewrite
\[\begin{aligned}{}{Q_{n}^{cor}}(\lambda ,\boldsymbol{\beta }):=& \frac{1}{n}{\sum \limits_{i=1}^{n}}{\Delta _{i}}\cdot \log \lambda ({Y_{i}})+\frac{1}{n}{\sum \limits_{i=1}^{n}}{\Delta _{i}}{\boldsymbol{\beta }^{\top }}\mathbf{{X_{i}}}+\frac{1}{n}{\sum \limits_{i=1}^{n}}{\Delta _{i}}{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}}-\\ {} & -\frac{1}{n}{\sum \limits_{i=1}^{n}}\frac{\exp ({\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}})}{{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}{\int _{0}^{{Y_{i}}}}\lambda (u)du.\end{aligned}\]
The first and second summands converge to their expectations due to the SLLN. Consider the third summand. It holds that $\mathrm{Var}({\Delta _{i}}{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}})\le \mathsf{E}{({\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}})^{2}}\le K\cdot \mathsf{E}\mathbf{{\left|\left|{U_{i}}\right|\right|^{2}}}$. Then
\[ \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathrm{Var}({\Delta _{i}}{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}})\le K\cdot \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\mathbf{{\left|\left|{U_{i}}\right|\right|^{2}}}\]
is bounded due to Remark 2. Therefore by Statement 1
\[ {\sum \limits_{i=1}^{\infty }}\frac{\mathrm{Var}({\Delta _{i}}{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}})}{{i^{2}}}\lt \infty \hspace{2.5pt}.\]
SLLN yields
\[ \frac{1}{n}{\sum \limits_{i=1}^{n}}{\Delta _{i}}{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}}-\frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}[{\Delta _{i}}{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}}]\to 0\hspace{1em}\text{a.s. as}\hspace{2.5pt}n\to \infty .\]
Consider the fourth summand. We have
\[ \exp ({\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}}){\int _{0}^{{Y_{i}}}}\lambda (u)du\le K\cdot {e^{(D-\varepsilon )\left|\left|\mathbf{{W_{i}}}\right|\right|}}.\]
Due to Jensen’s inequality ${M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })\ge {e^{\mathsf{E}{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}}}}=1$. For all $i\ge 1$ using (iii)–(iv) we obtain
\[\begin{aligned}{}& \mathrm{Var}\left(\frac{\exp ({\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}})}{{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}{\int _{0}^{{Y_{i}}}}\lambda (u)du\right)\le \mathsf{E}{\left(\frac{\exp ({\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}})}{{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}{\int _{0}^{{Y_{i}}}}\lambda (u)du\right)^{2}}\le \\ {} & \le \frac{K}{{\min _{\beta }}{M_{\mathbf{{U_{i}}}}^{2}}(\boldsymbol{\beta })}\mathsf{E}\left[{e^{2(D-\varepsilon )\left|\left|\mathbf{{X_{i}}}\right|\right|}}\right]\cdot \mathsf{E}\left[{e^{2(D-\varepsilon )\left|\left|\mathbf{{U_{i}}}\right|\right|}}\right]\le K\hspace{2.5pt}\cdot \mathsf{E}\left[{e^{2(D-\varepsilon )\left|\left|\mathbf{{U_{i}}}\right|\right|}}\right].\end{aligned}\]
Then
\[ \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathrm{Var}\left(\frac{\exp ({\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}})}{{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}{\int _{0}^{{Y_{i}}}}\lambda (u)du\right)\le K\cdot \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\left[{e^{2(D-\varepsilon )\left|\left|\mathbf{{U_{i}}}\right|\right|}}\right].\]
Using Statement 1
(6)
\[ {\sum \limits_{i=1}^{\infty }}\frac{1}{{i^{2}}}\mathrm{Var}\left(\frac{\exp ({\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}})}{{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}{\int _{0}^{{Y_{i}}}}\lambda (u)du\right)\lt \infty \hspace{2.5pt},\]
and therefore
\[ \frac{1}{n}{\sum \limits_{i=1}^{n}}\frac{\exp ({\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}})}{{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}{\int _{0}^{{Y_{i}}}}\lambda (u)du-\frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\left[\frac{\exp ({\boldsymbol{\beta }^{\top }}\mathbf{{W_{i}}})}{{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}{\int _{0}^{{Y_{i}}}}\lambda (u)du\right]\to 0\]
a.s. as $n\to \infty $. Thus, under conditions (i)–(iv) (a1) holds.
Next, we will verify (a2). We have
\[\begin{aligned}{}\underset{(\lambda ,\boldsymbol{\beta })\in \Theta }{\sup }& \left|\left|\frac{\partial {q^{cor}}}{\partial \beta }({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}};\lambda ,\boldsymbol{\beta })\right|\right|\le \left|\left|\mathbf{{W_{i}}}\right|\right|+\frac{K}{{\min _{\beta }}{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}\underset{\beta }{\sup }\left|\left|\mathbf{{W_{i}}}\right|\right|{e^{{\boldsymbol{\beta }^{\top }}\left|\left|\mathbf{{W_{i}}}\right|\right|}}+\\ {} & +\frac{K}{{\min _{\beta }}{M_{\mathbf{{U_{i}}}}^{2}}(\boldsymbol{\beta })}\underset{\beta }{\sup }\mathsf{E}\left[\left|\left|\mathbf{{U_{i}}}\right|\right|{e^{{\boldsymbol{\beta }^{\top }}\left|\left|\mathbf{{U_{i}}}\right|\right|}}\right]{e^{{\boldsymbol{\beta }^{\top }}\left|\left|\mathbf{{W_{i}}}\right|\right|}}.\end{aligned}\]
Then
\[\begin{aligned}{}& \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\underset{(\lambda ,\boldsymbol{\beta })\in \Theta }{\sup }\left|\left|\frac{\partial {q^{cor}}}{\partial \beta }({Y_{i}},{\Delta _{i}},{W_{i}};\lambda ,\boldsymbol{\beta })\right|\right|\le \mathsf{E}\left|\left|{X_{1}}\right|\right|+\frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\left|\left|{U_{i}}\right|\right|+\\ {} & +\mathsf{E}\underset{\beta }{\sup }\left(\left|\left|X\right|\right|{e^{{\boldsymbol{\beta }^{\top }}\mathbf{X}}}\right)\cdot \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}{e^{D\left|\left|{U_{i}}\right|\right|}}+\mathsf{E}\hspace{2.5pt}{e^{D\left|\left|X\right|\right|}}\cdot \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\underset{\beta }{\sup }\left(\left|\left|{U_{i}}\right|\right|{e^{{\boldsymbol{\beta }^{\top }}\mathbf{{U_{i}}}}}\right).\end{aligned}\]
Conditions (iii)–(iv) imply that $\mathsf{E}\left|\left|{X_{i}}\right|\right|{e^{(D-\varepsilon )\left|\left|{X_{i}}\right|\right|}}$ and $\mathsf{E}\left|\left|{U_{i}}\right|\right|{e^{(D-\varepsilon )\left|\left|{U_{i}}\right|\right|}}$ are bounded for all $i\ge 1$. So there exists a positive constant K such that (4) holds.
One can show that
\[ \underset{(\lambda ,\boldsymbol{\beta })\in \Theta }{\sup }\left|\left|\frac{\partial {q^{cor}}}{\partial \lambda }({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}};\lambda ,\boldsymbol{\beta })\right|\right|\le \underset{\lambda }{\sup }\frac{1}{\left|\left|\lambda \right|\right|}+\frac{\tau \cdot {e^{D(\left|\left|{X_{i}}\right|\right|+\left|\left|{U_{i}}\right|\right|)}}}{{\min _{\beta }}{M_{\mathbf{{U_{i}}}}}(\boldsymbol{\beta })}.\]
Therefore
\[ \frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}\underset{(\lambda ,\boldsymbol{\beta })\in \Theta }{\sup }\left|\left|\frac{\partial {q^{cor}}}{\partial \lambda }({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}};\lambda ,\boldsymbol{\beta })\right|\right|\le K\left(1+\mathsf{E}{e^{D\left|\left|X\right|\right|}}\frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}{e^{D\left|\left|{U_{i}}\right|\right|}}\right).\]
Conditions (i)–(iv) imply that (5) holds.
(a3) It is clear that ${q^{cor}}({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}};\lambda ,\boldsymbol{\beta })$ is continuous in $(\lambda ,\beta )\in \Theta $, and under conditions (i)–(iv) $\mathsf{E}|{q^{cor}}({Y_{i}},{\Delta _{i}},\mathbf{{W_{i}}};\lambda ,\boldsymbol{\beta })|$ is bounded for all $1\le i\le n$. Thus, the dominated convergence theorem implies that ${q_{\infty }}(\lambda ,\boldsymbol{\beta })$ is continuous in $(\lambda ,\beta )\in \Theta $.
Proof that (b) holds is essentially the same as in [11].
To sum up, under (i)–(vi), the pair $\left(\hat{\lambda },\hat{\boldsymbol{\beta }}\right)$ is a strongly consistent estimator of the true parameters $\left({\lambda _{0}},{\boldsymbol{\beta }_{0}}\right)$.  □

4 Simultaneous estimator under unbounded parameter set

Let us now consider baseline hazard functions from the unbounded parameter set defined in (i’). The goal of this section is to extend Theorem 3 from [12] to the case of heteroscedastic measurement errors. The main result of this section follows.
Theorem 2.
Under (i’)–(vi), $\left(\hat{\lambda },\hat{\boldsymbol{\beta }}\right)$ defined in (2) is a strongly consistent estimator of the true parameters $\left({\lambda _{0}},{\boldsymbol{\beta }_{0}}\right)$.
Proof.
We follow the line of the proof of Theorem 3 from [12]. We say that a relation holds eventually if, almost surely, it is valid for all sample sizes n starting from some random number.
We first show that
(7)
\[ \underset{(\lambda ,\beta )\in {\Theta ^{R}}}{\sup }{Q_{n}^{cor}}(\lambda ,\beta )\gt \underset{(\lambda ,\beta )\in \Theta \setminus {\Theta ^{R}}}{\sup }{Q_{n}^{cor}}(\lambda ,\beta )\]
eventually for sufficiently large nonrandom numbers $R\gt ||{\lambda _{0}}||$, where ${\Theta _{\lambda }^{R}}={\Theta _{\lambda }}\cap \bar{B}(0,R)$, ${\Theta ^{R}}={\Theta _{\lambda }^{R}}\times {\Theta _{\beta }}$.
Denote ${D_{1}}={\max _{\beta }}\| \beta \| $. We have
\[ \underset{(\lambda ,\beta )\in \Theta \setminus {\Theta ^{R}}}{\sup }{Q_{n}^{cor}}(\lambda ,\beta )\le {I_{1}}+\underset{\substack{\lambda \in {\Theta _{\lambda }}:\\ {} \lambda (0)\gt R}}{\sup }{I_{2}}+{I_{3}},\]
where
\[\begin{aligned}{}{I_{1}}& =-(R-L\tau )\frac{1}{n}{\sum \limits_{i=1}^{n}}\frac{\exp (-{D_{1}}||{W_{i}}||){Y_{i}}\cdot I(\Delta =0)}{\underset{\beta \in {\Theta _{\beta }}}{\max }{M_{U}}(\beta )},\\ {} {I_{2}}& =\ln (\lambda (0)+L\tau )\frac{1}{n}\sum \limits_{i:{\Delta _{i}}=1}{\Delta _{i}}-(\lambda (0)+L\tau )\frac{1}{n}{\sum \limits_{i=1}^{n}}\frac{\exp (-{D_{1}}||{W_{i}}||){Y_{i}}\cdot I(\Delta =1)}{\underset{\beta \in {\Theta _{\beta }}}{\max }{M_{U}}(\beta )},\\ {} {I_{3}}& =\frac{1}{n}\sum \limits_{i:{\Delta _{i}}=1}{D_{1}}||{W_{i}}||+2L\tau \frac{1}{n}{\sum \limits_{i=1}^{n}}\frac{\exp (-{D_{1}}||{W_{i}}||){Y_{i}}\cdot I(\Delta =1)}{\underset{\beta \in {\Theta _{\beta }}}{\max }{M_{U}}(\beta )}\hspace{2.5pt}.\end{aligned}\]
We have
\[\begin{aligned}{}& Var\left({Y_{i}}{e^{-{D_{1}}\left|\left|{W_{i}}\right|\right|}}I(\Delta =0)\right)\le \mathsf{E}\left({Y_{i}^{2}}{e^{-2{D_{1}}\left|\left|{W_{i}}\right|\right|}}\right)\le K\cdot \mathsf{E}{e^{-2{D_{1}}\left|\left|{X_{1}}\right|\right|}}\mathsf{E}{e^{-2{D_{1}}\left|\left|{U_{i}}\right|\right|}},\\ {} & \frac{1}{n}{\sum \limits_{i=1}^{n}}Var\left({Y_{i}}{e^{-{D_{1}}\left|\left|{W_{i}}\right|\right|}}I(\Delta =0)\right)\le K\cdot \mathsf{E}{e^{-2{D_{1}}\left|\left|{X_{1}}\right|\right|}}\frac{1}{n}{\sum \limits_{i=1}^{n}}\mathsf{E}{e^{-2{D_{1}}\left|\left|{U_{i}}\right|\right|}}\le K.\end{aligned}\]
SLLN yields
\[ {I_{1}}+(R-L\tau )\frac{1}{n}{\sum \limits_{i=1}^{n}}\frac{\mathsf{E}[\hspace{2.5pt}C\cdot I(\Delta =0)\exp (-{D_{1}}||{W_{i}}||)\hspace{2.5pt}]}{\underset{\beta \in {\Theta _{\beta }}}{\max }{M_{U}}(\beta )}\to 0\]
almost surely as $n\to \infty $. This means that eventually
\[ {I_{1}}\le -(R-L\tau ){D_{2}},\]
where ${D_{2}}\gt 0$.
Let
\[ {A_{n}}=\frac{1}{n}{\sum \limits_{i=1}^{n}}{\Delta _{i}},\hspace{2.5pt}\hspace{2.5pt}{B_{n}}=\frac{1}{n}{\sum \limits_{i=1}^{n}}\frac{\exp (-{D_{1}}||{W_{i}}||){Y_{i}}}{\underset{\beta \in {\Theta _{\beta }}}{\max }{M_{U}}(\beta )}{1_{\{{\Delta _{i}}=1\}}}.\]
Since ${A_{n}}\gt 0$ and ${B_{n}}\gt 0$ eventually, we obtain
\[ {I_{2}}\le \underset{z\gt 0}{\max }({A_{n}}\ln z-z{B_{n}})={A_{n}}\left(\ln \left(\frac{{A_{n}}}{{B_{n}}}\right)-1\right).\]
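The maximization in the last display is elementary (a step added for completeness): $f(z)={A_{n}}\ln z-{B_{n}}z$ is concave on $(0,\infty )$ with ${f^{\prime }}(z)={A_{n}}/z-{B_{n}}$, which vanishes at ${z^{\ast }}={A_{n}}/{B_{n}}$, whence
\[ f({z^{\ast }})={A_{n}}\ln \frac{{A_{n}}}{{B_{n}}}-{A_{n}}={A_{n}}\left(\ln \left(\frac{{A_{n}}}{{B_{n}}}\right)-1\right).\]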
Analogously,
\[\begin{aligned}{}& Var\left({Y_{i}}{e^{-{D_{1}}\left|\left|{W_{i}}\right|\right|}}I(\Delta =1)\right)\le \mathsf{E}\left({Y_{i}^{2}}{e^{-2{D_{1}}\left|\left|{W_{i}}\right|\right|}}\right)\le K\cdot \mathsf{E}{e^{-2{D_{1}}\left|\left|{X_{1}}\right|\right|}}\mathsf{E}{e^{-2{D_{1}}\left|\left|{U_{i}}\right|\right|}},\\ {} & \frac{1}{n}{\sum \limits_{i=1}^{n}}Var\left({Y_{i}}{e^{-{D_{1}}\left|\left|{W_{i}}\right|\right|}}I(\Delta =1)\right)\le K.\end{aligned}\]
By SLLN,
\[ {A_{n}}\to \mathsf{P}(\Delta =1)\gt 0,\hspace{1em}{B_{n}}-\frac{1}{n}{\sum \limits_{i=1}^{n}}\frac{\mathsf{E}[\hspace{2.5pt}T\cdot I(\Delta =1)\exp (-{D_{1}}||{W_{i}}||)\hspace{2.5pt}]}{\underset{\beta \in {\Theta _{\beta }}}{\max }{M_{U}}(\beta )}\to 0\]
respectively, almost surely as $n\to \infty $. Hence ${I_{2}}$ is eventually bounded from above by some positive constant ${D_{3}}$.
Further, it follows from the strong law of large numbers that ${I_{3}}$ is eventually bounded from above by some positive constant ${D_{4}}$. Hence
\[ \underset{n\to \infty }{\overline{\lim }}\underset{(\lambda ,\beta )\in \Theta \setminus {\Theta ^{R}}}{\sup }{Q_{n}^{cor}}(\lambda ,\beta )\le -(R-L\tau ){D_{2}}+{D_{3}}+{D_{4}}.\]
Note that the constants ${D_{2}}$, ${D_{3}}$ and ${D_{4}}$ introduced above do not depend on $\beta \in {\Theta _{\beta }}$. Letting $R\to +\infty $, we get
\[ \underset{n\to \infty }{\overline{\lim }}\underset{(\lambda ,\beta )\in \Theta \setminus {\Theta ^{R}}}{\sup }{Q_{n}^{cor}}(\lambda ,\beta )\to -\infty ,\hspace{1em}R\to +\infty .\]
This proves that inequality (7) holds eventually for sufficiently large R.
Further, one can repeat reasoning from [12] to show that
\[ ({\hat{\lambda }_{n}}(\omega ),{\hat{\beta }_{n}}(\omega ))\to ({\lambda _{0}},{\beta _{0}}),\hspace{1em}n\to \infty ,\]
for all $\omega \in A$, where $\mathsf{P}(A)=1$.
The theorem is proved.  □

5 Conclusions

We have constructed a simultaneous estimator of the baseline hazard rate and the regression parameter in the Cox proportional hazards model with heteroscedastic measurement errors in the covariates, and have proved its strong consistency under the assumption that the baseline hazard function is Lipschitz continuous with a fixed constant, for both the bounded and the unbounded parameter set.

Acknowledgement

I would like to express my sincere gratitude to my mentor, Professor Oleksandr Kukush, for his invaluable guidance and support throughout my journey in survival analysis. His expertise, encouragement, and dedication have been instrumental in shaping my understanding of this field. I am deeply thankful for his mentorship, which has greatly enriched my research experience and academic pursuits.

References

[1] 
Augustin, T.: An exact corrected log-likelihood function for Cox’s proportional hazards model under measurement error and some extensions. Scand. J. Stat. 31(1), 43–50 (2004). MR2042597. https://doi.org/10.1111/j.1467-9469.2004.00371.x
[2] 
Augustin, T., Döring, A., Rummel, D.: Regression calibration for Cox regression under heteroscedastic measurement error—determining risk factors of cardiovascular diseases from error-prone nutritional replication data. In: Recent Advances in Linear Models and Related Areas, pp. 253–278. Springer (2008). MR2523854. https://doi.org/10.1007/978-3-7908-2064-5_13
[3] 
Breslow, N.: Covariance analysis of censored survival data. Biometrics, 89–99 (1974)
[4] 
Cox, D.R.: Regression models and life-tables. J. Roy. Statist. Soc. Ser. B 34, 187–220 (1972). MR0341758
[5] 
Durot, C., Lopuhaä, H.P.: Limit theory in monotone function estimation. Statist. Sci. 33(4), 547–567 (2018). MR3881208. https://doi.org/10.1214/18-STS664
[6] 
Feller, W.: An Introduction to Probability Theory and Its Applications, Vol. I (1968). MR0228020
[7] 
Grenander, U.: On the theory of mortality measurement. II. Skand. Aktuarietidskr. 39, 125–153 (1956). MR0093415. https://doi.org/10.1080/03461238.1956.10414944
[8] 
Groeneboom, P., Jongbloed, G.: Some developments in the theory of shape constrained inference. Statist. Sci. 33(4), 473–492 (2018). MR3881204. https://doi.org/10.1214/18-STS657
[9] 
Kalbfleisch, J.D., Prentice, R.L.: The Statistical Analysis of Failure Time Data vol. 360. John Wiley & Sons (2011). MR0570114
[10] 
Kong, F.H., Gu, M.: Consistent estimation in Cox proportional hazards model with covariate measurement errors. Statist. Sinica 9(4), 953–969 (1999). MR1744820
[11] 
Kukush, A., Baran, S., Fazekas, I., Usoltseva, E.: Simultaneous estimation of baseline hazard rate and regression parameters in Cox proportional hazards model with measurement error. J. Statist. Res. 45(2), 77–94 (2011). MR2934363
[12] 
Kukush, O.G., Chernova, O.O.: Consistent estimation in the Cox proportional hazards model with measurement errors under an unboundedness condition for the parameter set. Teor. Ĭmovīr. Mat. Stat. 96, 100–109 (2017). MR3666874. https://doi.org/10.1090/tpms/1036
[13] 
Lawless, J.F.: Statistical Models and Methods for Lifetime Data vol. 362. John Wiley & Sons (2011). MR0640866
[14] 
Lopuhaä, H.P., Nane, G.F.: Shape constrained non-parametric estimators of the baseline distribution in Cox proportional hazards model. Scand. J. Stat. 40(3), 619–646 (2013). MR3091700. https://doi.org/10.1002/sjos.12008
[15] 
Qin, J., Deng, G., Ning, J., Yuan, A., Shen, Y.: Estrogen receptor expression on breast cancer patients’ survival under shape-restricted Cox regression model. Ann. Appl. Stat. 15(3), 1291–1307 (2021). MR4316649. https://doi.org/10.1214/21-aoas1446
[16] 
Samworth, R.J.: Recent progress in log-concave density estimation. Statist. Sci. 33(4), 493–509 (2018). MR3881205. https://doi.org/10.1214/18-STS666
[17] 
Wallace, M.: Analysis in an imperfect world. Significance 17(1), 14–19 (2020). MR4446481


Copyright
© 2024 The Author(s). Published by VTeX
Open access article under the CC BY license.

Keywords
Cox proportional hazards model; right censoring; heteroscedastic measurement errors

MSC2010
62H12; 62N02


