Modern Stochastics: Theory and Applications

Extreme residuals in regression model. Minimax approach
Volume 2, Issue 3 (2015): PRESTO-2015, pp. 297–308
Aleksander Ivanov, Ivan Matsak, Sergiy Polotskiy

https://doi.org/10.15559/15-VMSTA40CNF
Pub. online: 5 October 2015    Type: Research Article    Open Access

Received: 26 August 2015
Revised: 22 September 2015
Accepted: 22 September 2015
Published: 5 October 2015

Abstract

We obtain limit theorems for extreme residuals in a linear regression model when the parameters are estimated by the minimax method.

1 Introduction

Consider the linear regression model
(1)
\[y_{j}=\sum \limits_{i=1}^{q}\theta _{i}x_{ji}+\epsilon _{j},\hspace{1em}j=\overline{1,N},\]
where $\theta =(\theta _{1},\theta _{2},\dots ,\theta _{q})$ is an unknown parameter, $\epsilon _{j}$ are independent identically distributed (i.i.d.) random variables (r.v.-s) with distribution function (d.f.) $F(x)$, and $X=(x_{ji})$ is a regression design matrix.
Let $\widehat{\theta }=(\widehat{\theta _{1}},\dots ,\widehat{\theta _{q}})$ be the least squares estimator (LSE) of θ. Introduce the notation
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \widehat{y_{j}}=\sum \limits_{i=1}^{q}\widehat{\theta _{i}}x_{ji},\hspace{2em}\widehat{\epsilon _{j}}=y_{j}-\widehat{y_{j}},\hspace{1em}j=\overline{1,N};\\{} & \displaystyle Z_{N}=\underset{1\le j\le N}{\max }\epsilon _{j},\hspace{2em}\widehat{Z_{N}}=\underset{1\le j\le N}{\max }\widehat{\epsilon _{j}},\\{} & \displaystyle {Z_{N}^{\ast }}=\underset{1\le j\le N}{\max }|\epsilon _{j}|,\hspace{2em}{\widehat{Z_{N}}}^{\ast }=\underset{1\le j\le N}{\max }|\widehat{\epsilon _{j}}|.\end{array}\]
Asymptotic behavior of the r.v.-s $Z_{N}$, ${Z_{N}^{\ast }}$ is studied in extreme value theory (see the classical works by Fréchet [10], Fisher and Tippett [3], and Gnedenko [5] and the monographs [4, 8]). In the papers [6, 7], it was shown that, under mild assumptions, the asymptotic properties of the r.v.-s $Z_{N}$, $\widehat{Z_{N}}$, ${Z_{N}^{\ast }}$, and ${\widehat{Z_{N}}}^{\ast }$ are similar in the cases of both finite variance and heavy tails of the observation errors $\epsilon _{j}$.
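The following minimal numerical sketch (with an assumed Gaussian design and Gaussian errors, chosen purely for illustration) computes the quantities just introduced: the LSE residuals $\widehat{\epsilon _{j}}$ and the extremes $Z_{N}$, $\widehat{Z_{N}}$, ${Z_{N}^{\ast }}$, ${\widehat{Z_{N}}}^{\ast }$.

import numpy as np

rng = np.random.default_rng(7)
N, q = 500, 2
X = np.column_stack([np.ones(N), rng.normal(size=N)])  # assumed design matrix
theta = np.array([1.0, 2.0])                           # assumed true parameter
eps = rng.normal(size=N)                               # assumed error law
y = X @ theta + eps                                    # model (1)
theta_lse, *_ = np.linalg.lstsq(X, y, rcond=None)      # LSE of theta
res = y - X @ theta_lse                                # residuals eps_hat_j
print(eps.max(), res.max())                            # Z_N vs Z_N hat
print(np.abs(eps).max(), np.abs(res).max())            # Z_N^* vs Z_N hat^*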
In the present paper, we study asymptotic properties of the minimax estimator (MME) of θ and of the maximal absolute residual. For the MME, we keep the same notation $\widehat{\theta }$.
Definition 1.
A random vector $\widehat{\theta }=(\widehat{\theta _{1}},\dots ,\widehat{\theta _{q}})$ is called the MME of θ from the observations (1) if
(2)
\[\widehat{\varDelta }=\varDelta (\widehat{\theta })=\underset{\tau \in {\mathbb{R}}^{q}}{\min }\varDelta (\tau ),\]
where
\[\varDelta (\tau )=\underset{1\le j\le N}{\max }\left|y_{j}-\sum \limits_{i=1}^{q}\tau _{i}x_{ji}\right|.\]
Denote $W_{N}=\min _{1\le j\le N}\epsilon _{j}$ and let $R_{N}=Z_{N}-W_{N}$ and $Q_{N}=\frac{Z_{N}+W_{N}}{2}$ be the range and midrange of the sequence $\epsilon _{j},\hspace{2.5pt}j=\overline{1,N}$.
The following statement shows an essential difference between the behavior of the MME and that of the LSE.
Statement 1.
  • (i) If the model (1) contains a constant term, namely, $x_{j1}=1$, $j=\overline{1,N}$, then almost surely (a.s.)
    (3)
    \[\widehat{\varDelta }\le \frac{R_{N}}{2}.\]
  • (ii) If the model (1) has the form
    (4)
    \[y_{j}=\theta +\epsilon _{j},\hspace{1em}j=\overline{1,N},\]
    then a.s.
    \[\widehat{\varDelta }=\frac{R_{N}}{2},\hspace{2em}\widehat{\theta }-\theta =Q_{N}.\]
Remark 1.
From point (ii) of Statement 1 it follows that the MME $\widehat{\theta }$ is not consistent in the model (4) even for some $\epsilon _{j}$ possessing all moments (see Example 2).
Remark 2.
The value $\widehat{\varDelta }$ can be represented as a solution of the following linear programming problem (LPP):
(5)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \widehat{\varDelta }& \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\underset{\varDelta \in \mathcal{D}}{\min }\varDelta ,\\{} \displaystyle \mathcal{D}& \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\Bigg\{(\tau ,\varDelta )\in {\mathbb{R}}^{q}\hspace{0.1667em}\times \hspace{0.1667em}\mathbb{R}_{+}:\left|y_{j}\hspace{0.1667em}-\hspace{0.1667em}\sum \limits_{i=1}^{q}\tau _{i}x_{ji}\right|\hspace{0.1667em}\le \hspace{0.1667em}\varDelta ,\hspace{2.5pt}j\hspace{0.1667em}=\hspace{0.1667em}\overline{1,N}\Bigg\}\\{} & \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\Bigg\{(\tau ,\varDelta )\in {\mathbb{R}}^{q}\hspace{0.1667em}\times \hspace{0.1667em}\mathbb{R}_{+}:\sum \limits_{i=1}^{q}\tau _{i}x_{ji}\hspace{0.1667em}+\hspace{0.1667em}\varDelta \hspace{0.1667em}\ge \hspace{0.1667em}y_{j},-\sum \limits_{i=1}^{q}\tau _{i}x_{ji}\hspace{0.1667em}+\hspace{0.1667em}\varDelta \ge -y_{j},\hspace{2.5pt}j\hspace{0.1667em}=\hspace{0.1667em}\overline{1,N}\Bigg\}.\end{array}\]
So, the problem (2) of determining the values $\widehat{\varDelta }$ and $\widehat{\theta }$ is reduced to solving LPP (5). The LPP can be solved efficiently by the numerical simplex method (see [2, 12]). Investigation of the asymptotic properties of the maximal absolute residual $\widehat{\varDelta }$ and the MME $\widehat{\theta }$ is quite difficult in the case of the general model (1). However, under additional assumptions on the regression experiment design and the observation errors $\epsilon _{j}$, it is possible to find the limiting distribution of $\widehat{\varDelta }$, to prove the consistency of the MME $\widehat{\theta }$, and even to estimate the rate of convergence $\widehat{\theta }\to \theta $ as $N\to \infty $.
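For illustration, the following sketch solves LPP (5) numerically with scipy.optimize.linprog; the design matrix, the true θ, the sample size, and the uniform error law are our own assumptions, not part of the setup above.

import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
N, q = 200, 2
X = np.column_stack([np.ones(N), rng.choice([-1.0, 1.0], size=N)])  # assumed design
theta = np.array([1.0, 2.0])                                        # assumed true theta
y = X @ theta + rng.uniform(-1, 1, size=N)                          # model (1)

# Variables z = (tau_1, ..., tau_q, Delta); minimize Delta subject to |y - X tau| <= Delta.
c = np.r_[np.zeros(q), 1.0]
A_ub = np.block([[-X, -np.ones((N, 1))],      #  y_j - x_j tau <= Delta
                 [ X, -np.ones((N, 1))]])     # -y_j + x_j tau <= Delta
b_ub = np.r_[-y, y]
bounds = [(None, None)] * q + [(0, None)]     # tau free, Delta >= 0
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
theta_hat, Delta_hat = res.x[:q], res.x[q]
print("MME:", theta_hat, "max |residual|:", Delta_hat)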

2 The main theorems

First, we briefly recall some results of extreme value theory. Let the r.v.-s $(\epsilon _{j})$ have the d.f. $F(x)$. Assume that, for some constants $b_{n}>0$ and $a_{n}$, as $n\to \infty $,
(6)
\[b_{n}(Z_{n}-a_{n})\stackrel{D}{\longrightarrow }\zeta ,\]
and ζ has a nondegenerate d.f. $G(x)=\mathbb{P}(\zeta <x)$. If assumption (6) holds, then we say that the d.f. F belongs to the domain of maximum attraction of the probability distribution G and write $F\in D(G)$.
If $F\in D(G)$, then G must be of exactly one of the following three types [5, 8]:
Type I:
\[\varPhi _{\alpha }(x)=\left\{\begin{array}{l@{\hskip10.0pt}l}0,\hspace{1em}& x\le 0,\\{} \exp \big(-{x}^{-\alpha }\big),\hspace{1em}& \alpha >0,\hspace{2.5pt}x>0;\end{array}\right.\]
Type II:
\[\varPsi _{\alpha }(x)=\left\{\begin{array}{l@{\hskip10.0pt}l}\exp \big(-{(-x)}^{\alpha }\big),\hspace{1em}& \alpha >0,\hspace{2.5pt}x\le 0,\\{} 1,\hspace{1em}& x>0;\end{array}\right.\]
Type III:
(7)
\[\varLambda (x)=\exp \big(-{e}^{-x}\big),\hspace{1em}-\infty <x<\infty .\]
Necessary and sufficient conditions for convergence to each of the d.f.-s $\varPhi _{\alpha }$, $\varPsi _{\alpha }$, Λ are also well known.
Suppose in the model (1) that:
  • (A1) ($\epsilon _{j}$) are symmetric r.v.-s;
  • (A2) ($\epsilon _{j}$) satisfy relation (6), that is, $F\in D(G)$ with normalizing constants $a_{n}$ and $b_{n}$, where G is one of the d.f.-s $\varPhi _{\alpha }$, $\varPsi _{\alpha }$, Λ defined in (7).
Assume further that the regression experiment design is organized as follows:
(8)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle x_{j}& \displaystyle =(x_{j1},\dots ,x_{jq})\in \{v_{1},v_{2},\dots ,v_{k}\},\hspace{1em}v_{l}=(v_{l1},\dots ,v_{lq})\in {\mathbb{R}}^{q},\\{} \displaystyle v_{m}& \displaystyle \ne v_{l},\hspace{1em}m\ne l;\end{array}\]
that is, $x_{j}$ take only the fixed values $v_{1},\dots ,v_{k}$. Moreover, suppose that
(9)
\[x_{j}=v_{l}\hspace{1em}\text{for}\hspace{2.5pt}j\in I_{l},\hspace{2.5pt}l=\overline{1,k},\]
$\operatorname{card}(I_{l})=n$, $I_{m}\cap I_{l}=\varnothing $, $m\ne l$, $N=kn$ is the sample size, and
\[V=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}v_{11}& v_{12}& \dots & v_{1q}\\{} v_{21}& v_{22}& \dots & v_{2q}\\{} \dots & \dots & \dots & \dots \\{} v_{k1}& v_{k2}& \dots & v_{kq}\end{array}\right).\]
Theorem 1.
Under assumptions (A1), (A2), (8), and (9),
(10)
\[\varDelta _{n}=b_{n}(\widehat{\varDelta }-a_{n})\stackrel{D}{\to }\varDelta _{0},\hspace{1em}n\to \infty ,\]
where
(11)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \varDelta _{0}& \displaystyle =\underset{u\in {\mathcal{D}}^{\ast }}{\max }{L_{0}^{\ast }}(u),\\{} \displaystyle {L_{0}^{\ast }}(u)& \displaystyle =\sum \limits_{l=1}^{k}\big(u_{l}\zeta _{l}+{u^{\prime }_{l}}{\zeta ^{\prime }_{l}}\big),\hspace{1em}u=\big(u_{1},\dots ,u_{k},{u^{\prime }_{1}},\dots ,{u^{\prime }_{k}}\big),\\{} \displaystyle {\mathcal{D}}^{\ast }& \displaystyle =\Bigg\{u\ge 0:\sum \limits_{l=1}^{k}\big(u_{l}-{u^{\prime }_{l}}\big)v_{li}=0,\hspace{0.1667em}\sum \limits_{l=1}^{k}\big(u_{l}+{u^{\prime }_{l}}\big)=1,\hspace{2.5pt}i=\overline{1,q}\Bigg\},\end{array}\]
$\zeta _{l}$, ${\zeta ^{\prime }_{l}}$, $l=\overline{1,k}$, are independent r.v.-s having d.f. $G(x)$.
For a numerical sequence $b_{n}\to \infty $ and a random sequence $(\xi _{n})$, we write $\xi _{n}\stackrel{P}{=}O({b_{n}^{-1}})$ if
\[\underset{n}{\sup }\mathbb{P}\big(b_{n}|\xi _{n}|>C\big)\to 0\hspace{1em}\text{as}\hspace{2.5pt}C\to \infty .\]
Assume that $k\ge q$ and that there exists a square submatrix $\widetilde{V}\subset V$ of order q,
\[\widetilde{V}=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}v_{l_{1}1}& \dots & v_{l_{1}q}\\{} \dots & \dots & \dots \\{} v_{l_{q}1}& \dots & v_{l_{q}q}\end{array}\right),\]
such that
(12)
\[\det \widetilde{V}\ne 0.\]
Theorem 2.
Assume that, under the conditions of Theorem 1, $k\ge q$, assumption (12) holds, and
(13)
\[b_{n}\to \infty \hspace{1em}\text{as}\hspace{2.5pt}n\to \infty .\]
Then MME $\widehat{\theta }$ is consistent, and
\[\widehat{\theta }_{i}-\theta _{i}\stackrel{P}{=}O\big({b_{n}^{-1}}\big),\hspace{1em}i=\overline{1,q}.\]
Example 1.
In the simple linear regression model
(14)
\[y_{j}=\theta _{0}+\theta _{1}x_{j}+\epsilon _{j},\hspace{1em}j=\overline{1,N},\]
let $x_{j}=v$, $j=\overline{1,N}$; that is, $k=1$ and $q=2$.
Then such a model can be rewritten in the form (4) with $\theta =\theta _{0}+\theta _{1}v$. Clearly, the parameters $\theta _{0}$, $\theta _{1}$ cannot be determined unambiguously here, so it makes no sense to speak about consistency of the MME $\widehat{\theta }$ when $k<q$.
Example 2.
Consider the regression model (4) with errors $\epsilon _{j}$ having the Laplace density $f(x)=\frac{1}{2}{e}^{-|x|}$. For this distribution, the well-known von Mises condition ([8], p. 16) is satisfied for the type III distribution, that is, $F\in D(\varLambda )$. For symmetric $F\in D(\varLambda ),$ we have
\[\underset{n\to \infty }{\lim }\mathbb{P}\{2b_{n}Q_{n}<x\}=\frac{1}{1+{e}^{-x}}.\]
The limiting distribution is logistic (see [9], p. 62). Using the well-known formulas for the type Λ ([9], p. 49), $a_{n}={F}^{-1}(1-\frac{1}{n})$ and $b_{n}=nf(a_{n})$, we find $a_{n}=\ln \frac{n}{2}$ and $b_{n}=1$. It now follows from Statement 1 that the MME $\widehat{\theta }$ is not consistent. Thus, condition (13) of Theorem 2 cannot be weakened.
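A quick Monte Carlo sketch of Example 2 (the sample sizes below are assumed values): for Laplace errors, $b_{n}=1$, so the empirical d.f. of $2Q_{n}$ should approach the logistic limit $\frac{1}{1+{e}^{-x}}$.

import numpy as np

rng = np.random.default_rng(1)
n, reps = 2_000, 5_000
eps = rng.laplace(size=(reps, n))               # Laplace errors, f(x) = exp(-|x|)/2
Q = (eps.max(axis=1) + eps.min(axis=1)) / 2     # midrange Q_n
for x in (-2.0, 0.0, 2.0):
    print(x, np.mean(2 * Q < x), 1 / (1 + np.exp(-x)))  # empirical vs logistic (b_n = 1)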
The following lemma allows us to check condition (13).
Lemma 1.
Let $F\in D(G)$. Then we have:
  • 1. If $G=\varPhi _{\alpha }$, then
    \[\begin{array}{r@{\hskip0pt}l}\displaystyle x_{F}& \displaystyle =\sup \big\{x:F(x)<1\big\}=\infty ,\hspace{2em}\gamma _{n}={F}^{-1}\bigg(1-\frac{1}{n}\bigg)\to \infty ,\\{} \displaystyle b_{n}& \displaystyle ={\gamma _{n}^{-1}}\to 0\hspace{1em}\textit{as }\hspace{2.5pt}n\to \infty .\end{array}\]
    Thus, (13) does not hold.
  • 2. If $G=\varPsi _{\alpha }$, then
    \[x_{F}<\infty ,\hspace{1em}1-F(x_{F}-x)={x}^{\alpha }L(x),\]
    where $L(x)$ is a slowly varying (s.v.) function at zero, and there exists s.v. at infinity function $L_{1}(x)$ such that
    \[b_{n}={(x_{F}-\gamma _{n})}^{-1}={n}^{\alpha }L_{1}(n)\to \infty \hspace{1em}\textit{as }\hspace{2.5pt}n\to \infty .\]
    So (13) is true.
  • 3. If $G=\varLambda $, then
    \[b_{n}=r(\gamma _{n}),\hspace{1em}\textit{where }\hspace{2.5pt}r(x)={R^{\prime }}(x),\hspace{2.5pt}R(x)=-\ln \big(1-F(x)\big).\]
    Clearly, (13) holds if
    \[x_{F}=\infty ,\hspace{2em}r(x)\to \infty \hspace{1em}\textit{as }\hspace{2.5pt}x\to \infty .\]
Similar results can be found in [9], Corollary 2.7, pp. 44–45; see also [4, 8].
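The three regimes of Lemma 1 can be made concrete with standard representatives (our choice, for illustration only): the Pareto law $F(x)=1-{x}^{-1}$, $x\ge 1$, for $D(\varPhi _{1})$; the uniform law on $[-1,1]$ for $D(\varPsi _{1})$; and the Laplace law for $D(\varLambda )$, where $r(x)\equiv 1$, so that (13) fails, in agreement with Example 2.

import numpy as np

for n in (10**2, 10**4, 10**6):
    gamma_pareto = 1 / (1 - (1 - 1 / n))   # F^{-1}(1 - 1/n) = n for F(x) = 1 - 1/x
    b_phi = 1 / gamma_pareto               # -> 0, so (13) fails for Phi_1
    gamma_unif = 2 * (1 - 1 / n) - 1       # F^{-1}(1 - 1/n) for U[-1,1]; x_F = 1
    b_psi = 1 / (1 - gamma_unif)           # = n/2 -> infinity, so (13) holds for Psi_1
    b_lambda = 1.0                         # Laplace: r(x) = R'(x) = 1, so (13) fails
    print(n, b_phi, b_psi, b_lambda)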
Set
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle Z_{nl}=\underset{j\in I_{l}}{\max }\epsilon _{j},\hspace{2em}W_{nl}=\underset{j\in I_{l}}{\min }\epsilon _{j}\\{} & \displaystyle R_{nl}=Z_{nl}-W_{nl},\hspace{2em}Q_{nl}=\frac{Z_{nl}+W_{nl}}{2},\hspace{1em}l=\overline{1,k}.\end{array}\]
It turns out that Theorems 1 and 2 can be significantly simplified in the case $k=q$.
Theorem 3.
Let conditions (8) and (9) be satisfied for the model (1), let $k=q$, and let the matrix V satisfy condition (12). Then we have:
  • (i)  
    (15)
    \[\begin{array}{r@{\hskip0pt}l}\displaystyle \hat{\varDelta }& \displaystyle =\frac{1}{2}\underset{1\le l\le q}{\max }R_{nl},\end{array}\]
    (16)
    \[\begin{array}{r@{\hskip0pt}l}\displaystyle \hat{\theta }_{i}-\theta _{i}& \displaystyle =\frac{\det VQ_{(i)}}{\det V},\hspace{1em}i=\overline{1,q},\end{array}\]
    where the matrix $VQ_{(i)}$ is obtained from V by replacement of the ith column by the column ${(Q_{n1},\dots ,Q_{nq})}^{T}$.
  • (ii) If additionally conditions $(A_{1}),(A_{2})$ are satisfied, then
    (17)
    \[\underset{n\to \infty }{\lim }\mathbb{P}\big(2b_{n}(\hat{\varDelta }-a_{n})<x\big)={\big(G\star G(x)\big)}^{q},\]
    where $G\star G(x)={\int _{-\infty }^{\infty }}G(x-y)dG(y),$ and for $i=\overline{1,q}$, as $n\to \infty $,
    (18)
    \[2b_{n}(\hat{\theta }_{i}-\theta _{i})\stackrel{D}{\longrightarrow }\frac{\det V\zeta _{(i)}}{\det V},\]
    the matrix $V\zeta _{(i)}$ is obtained from V by replacement of the ith column by the column ${(\zeta _{1}-{\zeta ^{\prime }_{1}},\dots ,\zeta _{q}-{\zeta ^{\prime }_{q}})}^{T}$, where all the r.v.-s $\zeta _{i},{\zeta ^{\prime }_{i}}$ are independent and have d.f. G.
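The closed forms (15)–(16) are straightforward to evaluate; the sketch below does so for an assumed 2 × 2 design V and uniform errors (illustration only), with no LP solver needed since $k=q$.

import numpy as np

rng = np.random.default_rng(2)
q, n = 2, 500
V = np.array([[1.0, -1.0],
              [1.0,  2.0]])                    # assumed design, det V != 0
theta = np.array([0.5, -1.5])                  # assumed true parameter
y = V @ theta[:, None] + rng.uniform(-1, 1, size=(q, n))  # n observations per group
Z, W = y.max(axis=1), y.min(axis=1)            # per-group maxima and minima of y_j
Delta_hat = ((Z - W) / 2).max()                # formula (15): half the largest group range
theta_hat = np.linalg.solve(V, (Z + W) / 2)    # group midranges; equivalent to formula (16)
print(Delta_hat, theta_hat)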
Remark 3.
Suppose that, in the model (1) under assumptions (8) and (9), $k<q$ and there exists a nondegenerate submatrix $\widetilde{V}\subset V$ of order k. Then
\[\hat{\varDelta }\le \frac{1}{2}\underset{1\le l\le k}{\max }R_{nl}\hspace{2.5pt}\hspace{2.5pt}\hspace{2.5pt}a.s.\]
Remark 4.
For standard LSE,
\[\hat{\theta _{i}}-\theta _{i}\stackrel{P}{=}O\big({n}^{-1/2}\big);\]
therefore, if, under the conditions of Theorems 2 and 3,
(19)
\[{n}^{-1/2}b_{n}\to \infty \hspace{1em}\text{as}\hspace{2.5pt}n\to \infty ,\]
then MME is more efficient than LSE.
In [6] (see also [9]), it is proved that if $F\in D(\varLambda )$, then $b_{n}=O({n}^{\delta })$ for any $\delta >0$. From this relation and Lemma 1 it follows that (19) is not satisfied for the domains of maximum attraction $D(\varPhi _{\alpha })$ and $D(\varLambda )$. In the case of the domain $D(\varPsi _{\alpha })$, condition (19) holds for $\alpha \in (0,2)$. For example, assume that the r.v.-s $(\epsilon _{j})$ are symmetrically distributed on the interval $[-1,1]$ and
\[1-F(1-h)={h}^{\alpha }L(h)\hspace{1em}\text{as}\hspace{2.5pt}\hspace{2.5pt}h\downarrow 0,\hspace{2.5pt}\alpha \in (0,2),\]
where $L(h)$ is a function s.v. at zero. Then $b_{n}={n}^{1/\alpha }L_{1}(n)$, where $L_{1}$ is a function s.v. at infinity, and, under the conditions of Theorems 2 and 3, as $n\to \infty $,
\[|\hat{\theta _{i}}-\theta _{i}|\stackrel{P}{=}O\big({\big({n}^{1/\alpha }L_{1}(n)\big)}^{-1}\big)=o\big({n}^{-1/2}\big).\]
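A small simulation (a sketch in the location model (4) with uniform errors, so that $\alpha =1<2$; the sample sizes and replication counts are assumed) illustrates Remark 4: the midrange-based MME error decreases like ${n}^{-1}$, whereas the LSE (sample mean) error decreases like ${n}^{-1/2}$.

import numpy as np

rng = np.random.default_rng(3)
for n in (100, 1_000, 10_000):
    eps = rng.uniform(-1, 1, size=(1_000, n))
    mme = (eps.max(axis=1) + eps.min(axis=1)) / 2   # MME error = midrange in model (4)
    lse = eps.mean(axis=1)                          # LSE error = sample mean in model (4)
    print(n, np.abs(mme).mean(), np.abs(lse).mean())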
The next example is also of interest.
Example 3.
Let $(\epsilon _{j})$ be uniformly distributed in $[-1,1]$, that is, $F(x)=\frac{x+1}{2},\hspace{0.1667em}x\in [-1,1]$. It is well known that $F\in D(\varPsi _{1})$ with $a_{n}=1$ and $b_{n}=\frac{n}{2}$. Then, under the conditions of Theorem 3, as $n\to \infty $,
\[\mathbb{P}\big(n(1-\hat{\varDelta })<x\big)\to 1-{\big[\mathbb{P}\{\zeta _{1}+\zeta _{2}>x\}\big]}^{q}=1-{(1+x)}^{q}\exp (-qx),\]
where $\zeta _{1},\zeta _{2}$ are i.i.d. r.v.-s, and $\mathbb{P}(\zeta _{i}<x)=1-\exp (-x),\hspace{0.1667em}x>0$.
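The limit in Example 3 is easy to check by simulation; in the following sketch, q, n, and the number of replications are assumed values.

import numpy as np

rng = np.random.default_rng(4)
q, n, reps = 2, 1_000, 5_000
eps = rng.uniform(-1, 1, size=(reps, q, n))
R = eps.max(axis=2) - eps.min(axis=2)         # per-group ranges R_nl
Delta_hat = R.max(axis=1) / 2                 # formula (15)
for x in (0.5, 1.0, 2.0):
    print(x, np.mean(n * (1 - Delta_hat) < x),
          1 - (1 + x)**q * np.exp(-q * x))    # Example 3 limit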
The following corollary is an immediate consequence of Theorem 3.
Corollary 1.
If, for the simple linear regression (14), conditions (8) and (9) are satisfied, $k=q=2$, and
\[V=\left(\begin{array}{c@{\hskip10.0pt}c}1& v_{1}\\{} 1& v_{2}\end{array}\right),\hspace{1em}v_{1}\ne v_{2},\]
then we have:
  • (i)  
    \[\begin{array}{r@{\hskip0pt}l}& \displaystyle \hat{\varDelta }=\frac{1}{2}\max (R_{n1},R_{n2}),\\{} & \displaystyle \hat{\theta }_{1}-\theta _{1}=\frac{Q_{n2}-Q_{n1}}{v_{2}-v_{1}},\hspace{2em}\hat{\theta }_{0}-\theta _{0}=\frac{Q_{n1}v_{2}-Q_{n2}v_{1}}{v_{2}-v_{1}};\end{array}\]
  • (ii) under assumptions $(A_{1})$ and $(A_{2})$, relation (17) holds for $q=2$, and, as $n\to \infty $,
    \[2b_{n}(\hat{\theta }_{1}-\theta _{1})\stackrel{D}{\longrightarrow }\frac{\zeta _{2}-{\zeta ^{\prime }_{2}}-\zeta _{1}+{\zeta ^{\prime }_{1}}}{v_{2}-v_{1}},\]
    \[2b_{n}(\hat{\theta }_{0}-\theta _{0})\stackrel{D}{\longrightarrow }\frac{(\zeta _{1}-{\zeta ^{\prime }_{1}})v_{2}-(\zeta _{2}-{\zeta ^{\prime }_{2}})v_{1}}{v_{2}-v_{1}},\]
    where the r.v.-s $\zeta _{1},{\zeta ^{\prime }_{1}},\zeta _{2},{\zeta ^{\prime }_{2}}$ are independent and have d.f. G.
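Part (ii) of Corollary 1 can also be checked by simulation. In the sketch below, the errors are uniform on $[-1,1]$, so $b_{n}=\frac{n}{2}$ and, as in Example 3, the differences $\zeta _{l}-{\zeta ^{\prime }_{l}}$ may be sampled from standard exponentials (a difference of two i.i.d. exponentials is symmetric, so the sign convention of $\varPsi _{1}$ is immaterial); $v_{1}$, $v_{2}$, and the sample sizes are assumed values.

import numpy as np

rng = np.random.default_rng(6)
n, reps, v1, v2 = 1_000, 5_000, 0.0, 1.0
e = rng.uniform(-1, 1, size=(reps, 2, n))     # group errors (k = q = 2)
Qn = (e.max(axis=2) + e.min(axis=2)) / 2      # midranges Q_n1, Q_n2
lhs = 2 * (n / 2) * (Qn[:, 1] - Qn[:, 0]) / (v2 - v1)   # 2 b_n (theta1_hat - theta1)
z = rng.exponential(size=(reps, 4))           # stands in for zeta1, zeta1', zeta2, zeta2'
rhs = (z[:, 2] - z[:, 3] - z[:, 0] + z[:, 1]) / (v2 - v1)
print(np.quantile(lhs, [0.1, 0.5, 0.9]))      # should be close to the line below
print(np.quantile(rhs, [0.1, 0.5, 0.9]))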
Remark 5.
The conditions of Theorem 3 do not include (13), so the theorem describes the asymptotic distribution of $\hat{\theta }$ even for an inconsistent MME.

3 Proofs of the main results

Let us start with the following elementary lemma, in which $Z_{n}(t)$, $W_{n}(t)$, $R_{n}(t)$, and $Q_{n}(t)$ are, respectively, the maximum, minimum, range, and midrange of a sequence $t=\{t_{1},\dots ,t_{n}\}$.
Lemma 2.
Let $t_{1},\dots ,t_{n}$ be any real numbers, and
(20)
\[\alpha _{n}=\underset{s\in \mathbb{R}}{\min }\underset{1\le j\le n}{\max }|t_{j}-s|.\]
Then $\alpha _{n}=R_{n}(t)/2$; moreover, the minimum in (20) is attained at the point $s=Q_{n}(t)$.
Proof.
Choose $s=Q_{n}(t)$. Then
\[\underset{1\le i\le n}{\max }|t_{i}-s|=Z_{n}(t)-Q_{n}(t)=Q_{n}(t)-W_{n}(t)=\frac{1}{2}R_{n}(t).\]
If $s=Q_{n}(t)+\delta $, then, for $\delta >0$,
\[\underset{1\le i\le n}{\max }|t_{i}-s|=s-W_{n}(t)=\frac{1}{2}R_{n}(t)+\delta ,\]
and, for $\delta <0$,
\[\underset{1\le i\le n}{\max }|t_{i}-s|=Z_{n}(t)-s=\frac{1}{2}R_{n}(t)-\delta ,\]
that is, $s=Q_{n}(t)$ is the point of minimum.  □
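A brute-force check of Lemma 2 over a grid of shifts s (a toy sketch with arbitrarily chosen numbers $t_{j}$):

import numpy as np

t = np.array([0.3, -1.2, 2.5, 0.9])           # arbitrary sample
s_grid = np.linspace(-3, 3, 100_001)
vals = np.abs(t[None, :] - s_grid[:, None]).max(axis=1)   # max_j |t_j - s|
i = vals.argmin()
print(s_grid[i], (t.max() + t.min()) / 2)     # argmin vs midrange Q_n(t)
print(vals[i], (t.max() - t.min()) / 2)       # minimum vs R_n(t)/2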
Proof of Statement 1.
We will use Lemma 2:
\[\hat{\varDelta }=\underset{\tau \in {\mathbb{R}}^{q}}{\min }\underset{1\le j\le N}{\max }\Bigg|\epsilon _{j}-\sum \limits_{i=1}^{q}(\tau _{i}-\theta _{i})x_{ji}\Bigg|\le \underset{\tau _{1}\in \mathbb{R}}{\min }\underset{1\le j\le N}{\max }\big|\epsilon _{j}-(\tau _{1}-\theta _{1})\big|=\frac{1}{2}R_{N}\]
(we take $\tau _{i}=\theta _{i}$, $i\ge 2$). Point (ii) of Statement 1 follows directly from Lemma 2.  □
Proof of Theorem 1.
Using the notation
\[d=(d_{1},\dots ,d_{q}),\hspace{1em}d_{i}=\tau _{i}-\theta _{i},\hspace{2.5pt}i=\overline{1,q},\]
and taking into account Eq. (1) and conditions (8) and (9), we rewrite LPP (5) in the following form:
(21)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \hat{\varDelta }& \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\underset{\varDelta \in \mathcal{D}_{1}}{\min }\varDelta ,\\{} \displaystyle \mathcal{D}_{1}& \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\Bigg\{(d,\varDelta )\in {\mathbb{R}}^{q}\times \mathbb{R}_{+}:\sum \limits_{i=1}^{q}d_{i}x_{ji}+\varDelta \ge \epsilon _{j},-\sum \limits_{i=1}^{q}d_{i}x_{ji}+\varDelta \ge -\epsilon _{j},j=\overline{1,N}\Bigg\}\\{} & \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\Bigg\{(d,\varDelta )\in {\mathbb{R}}^{q}\hspace{0.1667em}\times \hspace{0.1667em}\mathbb{R}_{+}\hspace{0.1667em}:\hspace{0.1667em}\sum \limits_{i=1}^{q}d_{i}v_{li}\hspace{0.1667em}+\hspace{0.1667em}\varDelta \hspace{0.1667em}\ge \hspace{0.1667em}Z_{nl},-\sum \limits_{i=1}^{q}d_{i}v_{li}\hspace{0.1667em}+\hspace{0.1667em}\varDelta \ge -W_{nl},l=\overline{1,k}\Bigg\}.\end{array}\]
LPP dual to (21) has the form
(22)
\[\underset{u\in {\mathcal{D}}^{\ast }}{\max }{L_{n}^{\ast }}(u),\]
where ${L_{n}^{\ast }}(u)={\sum _{l=1}^{k}}(u_{l}Z_{nl}-{u^{\prime }_{l}}W_{nl})$, and the domain ${\mathcal{D}}^{\ast }$ is given by (11).
According to the basic duality theorem ([11], Chap. 4),
\[\hat{\varDelta }=\underset{u\in {\mathcal{D}}^{\ast }}{\max }{L_{n}^{\ast }}(u).\]
Hence, we obtain
\[\begin{array}{r@{\hskip0pt}l}\displaystyle b_{n}(\hat{\varDelta }-a_{n})& \displaystyle =\underset{u\in {\mathcal{D}}^{\ast }}{\max }b_{n}\big({L_{n}^{\ast }}(u)-a_{n}\big)=\underset{u\in {\mathcal{D}}^{\ast }}{\max }g_{n}(u),\\{} \displaystyle g_{n}(u)& \displaystyle =\sum \limits_{l=1}^{k}\big[u_{l}b_{n}(Z_{nl}-a_{n})+{u^{\prime }_{l}}b_{n}(-W_{nl}-a_{n})\big].\end{array}\]
Denote by ${\varGamma }^{\ast }$ the set of vertices of the domain ${\mathcal{D}}^{\ast }$ and
\[g_{0}(u)=\sum \limits_{l=1}^{k}\big(u_{l}\zeta _{l}+{u^{\prime }_{l}}{\zeta ^{\prime }_{l}}\big).\]
Since the maximum in LPP (22) is attained at one of the vertices ${\varGamma }^{\ast }$,
\[\underset{u\in {\mathcal{D}}^{\ast }}{\max }g_{n}(u)=\underset{u\in {\varGamma }^{\ast }}{\max }g_{n}(u),\hspace{1em}n\ge 1.\]
Obviously, $\operatorname{card}({\varGamma }^{\ast })<\infty $. Thus, to prove (10), it suffices to show that, as $n\to \infty $,
\[\underset{u\in {\varGamma }^{\ast }}{\max }g_{n}(u)\stackrel{D}{\longrightarrow }\underset{u\in {\varGamma }^{\ast }}{\max }g_{0}(u)\]
or
(23)
\[\big(g_{n}(u),u\in {\varGamma }^{\ast }\big)\stackrel{D}{\longrightarrow }\big(g_{0}(u),u\in {\varGamma }^{\ast }\big).\]
The Cramér–Wold argument (see, e.g., §7 of the book [1]) reduces (23) to the following relation: for any $t_{m}\in \mathbb{R}$, as $n\to \infty $,
\[\sum \limits_{{u}^{(m)}\in {\varGamma }^{\ast }}g_{n}\big({u}^{(m)}\big)t_{m}\stackrel{D}{\longrightarrow }\sum \limits_{{u}^{(m)}\in {\varGamma }^{\ast }}g_{0}\big({u}^{(m)}\big)t_{m}.\]
The last convergence holds if for any $c_{l},{c^{\prime }_{l}}$, as $n\to \infty $,
(24)
\[\sum \limits_{l=1}^{k}\big[c_{l}(Z_{nl}-a_{n})+{c^{\prime }_{l}}(-W_{nl}-a_{n})\big]\stackrel{D}{\longrightarrow }\sum \limits_{l=1}^{k}\big(c_{l}\zeta _{l}+{c^{\prime }_{l}}{\zeta ^{\prime }_{l}}\big).\]
Under the conditions of Theorem 1,
(25)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \zeta _{nl}& \displaystyle =b_{n}(Z_{nl}-a_{n})\stackrel{D}{\longrightarrow }\zeta _{l},\\{} \displaystyle {\zeta ^{\prime }_{nl}}& \displaystyle =b_{n}(-W_{nl}-a_{n})\stackrel{D}{\longrightarrow }{\zeta ^{\prime }_{l}},\hspace{1em}l=\overline{1,k}.\end{array}\]
The vectors $(Z_{nl},W_{nl})$, $l=\overline{1,k}$, are independent, and, on the other hand, $Z_{nl}$ and $W_{nl}$ are asymptotically independent as $n\to \infty $ ([8], p. 28). To obtain (24), it remains to apply the Cramér–Wold argument once more.  □
Proof of Theorem 2.
Let $(\hat{d},\hat{\varDelta })$, $\hat{d}=(\hat{d}_{1},\dots ,\hat{d}_{q})$, be the solution of LPP (21), and let $\gamma _{l}={\sum _{i=1}^{q}}\hat{d}_{i}v_{li}$. Then, for any $l=\overline{1,k}$,
(26)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \gamma _{l}+\hat{\varDelta }& \displaystyle \ge Z_{nl},\\{} \displaystyle -\gamma _{l}+\hat{\varDelta }& \displaystyle \ge -W_{nl}.\end{array}\]
Rewrite the asymptotic relations (25) and (10) in the form
(27)
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle Z_{nl}=a_{n}+\frac{\zeta _{nl}}{b_{n}},\hspace{2em}-W_{nl}=a_{n}+\frac{{\zeta ^{\prime }_{nl}}}{b_{n}},\\{} & \displaystyle \zeta _{nl}\stackrel{D}{\longrightarrow }\zeta _{l},\hspace{2em}{\zeta ^{\prime }_{nl}}\stackrel{D}{\longrightarrow }{\zeta ^{\prime }_{l}},\end{array}\]
and
(28)
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \hat{\varDelta }=a_{n}+\frac{\varDelta _{n}}{b_{n}},\\{} & \displaystyle \varDelta _{n}\stackrel{D}{\longrightarrow }\varDelta _{0}\hspace{1em}\text{as}\hspace{2.5pt}n\to \infty .\end{array}\]
Combining (26)–(28), we obtain, for $l=\overline{1,k}$,
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \gamma _{l}\ge Z_{nl}-\hat{\varDelta }& \displaystyle =\frac{\zeta _{nl}-\varDelta _{n}}{b_{n}}=O\big({b_{n}^{-1}}\big),\\{} \displaystyle \gamma _{l}\le W_{nl}+\hat{\varDelta }& \displaystyle =\frac{-{\zeta ^{\prime }_{nl}}+\varDelta _{n}}{b_{n}}=O\big({b_{n}^{-1}}\big).\end{array}\]
Choose $l_{1},\dots ,l_{q}$ satisfying (12). Then
\[\sum \limits_{i=1}^{q}\hat{d}_{i}v_{l_{j}i}=\gamma _{l_{j}}=O\big({b_{n}^{-1}}\big),\hspace{1em}j=\overline{1,q},\]
and by Cramer’s rule,
\[\hat{\theta }_{i}-\theta _{i}=\hat{d}_{i}=\frac{\det \tilde{V}\gamma _{(i)}}{\det \tilde{V}}=O\big({b_{n}^{-1}}\big),\]
where the matrix $\tilde{V}\gamma _{(i)}$ is obtained from $\tilde{V}$ by replacement of the ith column by the column ${(\gamma _{l_{1}},\dots ,\gamma _{l_{q}})}^{T}$.  □
Proof of Theorem 3.
(i) We have
(29)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \hat{\varDelta }& \displaystyle =\underset{\tau \in {\mathbb{R}}^{q}}{\min }\underset{1\le l\le q}{\max }\underset{j\in I_{l}}{\max }\left|y_{j}-\sum \limits_{i=1}^{q}\tau _{i}v_{li}\right|\\{} & \displaystyle =\underset{d\in {\mathbb{R}}^{q}}{\min }\underset{1\le l\le q}{\max }\underset{j\in I_{l}}{\max }\left|\epsilon _{j}-\sum \limits_{i=1}^{q}d_{i}v_{li}\right|.\end{array}\]
By Lemma 2,
\[\underset{s\in \mathbb{R}}{\min }\underset{j\in I_{l}}{\max }|\epsilon _{j}-s|=\frac{1}{2}R_{nl}\hspace{1em}\text{at}\hspace{2.5pt}s=Q_{nl},\hspace{2.5pt}l=\overline{1,q}.\]
Therefore, the minimum in d in (29) is attained at the point $\hat{d}$, the solution of the system of linear equations
\[\sum \limits_{i=1}^{q}d_{i}v_{li}=Q_{nl},\hspace{1em}l=\overline{1,q}.\]
Since the matrix V is nonsingular, by Cramer’s rule
\[\hat{d}_{i}=\hat{\theta }_{i}-\theta _{i}=\frac{\det VQ_{(i)}}{\det V},\hspace{1em}i=\overline{1,q}.\]
Obviously, for such a choice of $\hat{d}$, $\hat{\varDelta }=\frac{1}{2}\max _{1\le l\le q}R_{nl}$; that is, we have obtained formulas (15) and (16).
(ii) Using the asymptotic independence of r.v.-s $Z_{n}$ and $W_{n}$, we derive the following statement.
Lemma 3.
If r.v.-s $(\epsilon _{j})$ satisfy conditions $(A_{1})$, $(A_{2})$, then, as $n\to \infty $,
(30)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle b_{n}(R_{n}-2a_{n})& \displaystyle \stackrel{D}{\longrightarrow }\zeta +{\zeta ^{\prime }},\end{array}\]
(31)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle 2b_{n}Q_{n}& \displaystyle \stackrel{D}{\longrightarrow }\zeta -{\zeta ^{\prime }},\end{array}\]
where ζ and ${\zeta ^{\prime }}$ are independent r.v.-s and have d.f. G.
In fact, this lemma is contained in Theorem 2.9.2 of the book [4] (see also Theorem 2.10 in [9]).
Equality (17) of Theorem 3 follows immediately from relation (30) of Lemma 3.
Similarly, from the asymptotic relation (31) and Eq. (16), we obtain (18) by applying the Cramér–Wold argument once more.  □
Remark 3 follows directly from Theorem 3. Indeed, let $k<q$, and let there exist a nonsingular submatrix $\widetilde{V}\subset V$,
\[\widetilde{V}=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}v_{1i_{1}}& \dots & v_{1i_{k}}\\{} \dots & \dots & \dots \\{} v_{ki_{1}}& \dots & v_{ki_{k}}\end{array}\right).\]
Choosing in LPP (21) from Theorem 1 $d_{i}=0$ for all $i\ne i_{1},i_{2},\dots ,i_{k}$ (i.e., taking $\tau _{i}=\theta _{i}$ for such indices i), we pass to problem (29). It remains to apply Eq. (15) of Theorem 3.
Remark 6.
Using the notation $\bar{\zeta }-\bar{{\zeta ^{\prime }}}={(\zeta _{1}-{\zeta ^{\prime }_{1}},\dots ,\zeta _{q}-{\zeta ^{\prime }_{q}})}^{T}$, the coordinatewise relation (18) of Theorem 3 can be rewritten in the equivalent vector form
(32)
\[2b_{n}(\hat{\theta }-\theta )\stackrel{D}{\longrightarrow }{V}^{-1}\big(\bar{\zeta }-\bar{{\zeta ^{\prime }}}\big)\hspace{1em}\text{as}\hspace{2.5pt}n\to \infty .\]
If the variance ${\sigma _{G}^{2}}=\operatorname{Var}\zeta $ of an r.v. ζ having d.f. G exists, then the covariance matrix of the limiting distribution in (32) is $C_{G}=2{\sigma _{G}^{2}}{({V}^{T}V)}^{-1}$.
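For instance, computing $C_{G}$ is one line of linear algebra; in this sketch, both V and ${\sigma _{G}^{2}}$ are assumed values.

import numpy as np

V = np.array([[1.0, 0.0],
              [1.0, 1.0]])                    # assumed square design (as in Corollary 1)
sigma2 = 1.0                                  # assumed Var(zeta) under d.f. G
C_G = 2 * sigma2 * np.linalg.inv(V.T @ V)     # covariance of the limit in (32)
print(C_G)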

References

[1] 
Billingsley, P.: Convergence of Probability Measures. Wiley, New York (1968). MR0233396
[2] 
Ermoliev, Y.M., et al.: Mathematical Methods of Operations Research. Vyshcha Shkola, Kyiv (1979)
[3] 
Fisher, R.A., Tippett, L.H.C.: Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proc. Camb. Philos. Soc. 24, 180–190 (1928)
[4] 
Galambos, J.: The Asymptotic Theory of Extreme Order Statistics. Wiley, New York (1978). MR0489334
[5] 
Gnedenko, B.: Sur la distribution limite du terme maximum d’une série aléatoire. Ann. Math. 44, 423–453 (1943). MR0008655
[6] 
Ivanov, A.V., Matsak, I.K.: Limit theorems for extreme residuals in linear and nonlinear regression models. Theory Probab. Math. Stat. 86, 79–91 (2013). MR2986451. doi:10.1090/S0094-9000-2013-00890-4
[7] 
Ivanov, O.V., Matsak, I.K.: Limit theorems for extreme residuals in regression models with heavy tails of observation errors. Theory Probab. Math. Stat. 88, 99–108 (2014). MR3112637. doi:10.1090/S0094-9000-2014-00921-7
[8] 
Leadbetter, M.R., Lindgren, G., Rootzén, H.: Extremes and Related Properties of Random Sequences and Processes. Springer (1983). MR0691492
[9] 
Matsak, I.K.: Elements of the Theory of Extreme Values. Comprint, Kyiv (2014)
[10] 
Fréchet, M.: Sur la loi de probabilité de l’écart maximum. Ann. Soc. Pol. Math. Crac. 6, 93–116 (1927)
[11] 
Murtagh, B.A.: Advanced Linear Programming: Computation and Practice. McGraw-Hill, New York (1981). MR0609151
[12] 
Zaychenko, Y.P.: Operations Research. Vyshcha Shkola, Kyiv (1988)

Copyright
© 2015 The Author(s). Published by VTeX
Open access article under the CC BY license.

Keywords
Linear regression, minimax estimator, maximal residual

MSC2010
60G70, 62J05
