Extreme residuals in regression model. Minimax approach
Volume 2, Issue 3 (2015): PRESTO-2015, pp. 297–308
Aleksander Ivanov, Ivan Matsak, Sergiy Polotskiy

https://doi.org/10.15559/15-VMSTA40CNF
Pub. online: 5 October 2015 · Type: Research Article · Open Access

Received: 26 August 2015
Revised: 22 September 2015
Accepted: 22 September 2015
Published: 5 October 2015

Abstract

We obtain limit theorems for extreme residuals in a linear regression model in the case of minimax estimation of the parameters.

1 Introduction

Consider the linear regression model
(1)
\[y_{j}=\sum \limits_{i=1}^{q}\theta _{i}x_{ji}+\epsilon _{j},\hspace{1em}j=\overline{1,N},\]
where $\theta =(\theta _{1},\theta _{2},\dots ,\theta _{q})$ is an unknown parameter, $\epsilon _{j}$ are independent identically distributed (i.i.d.) random variables (r.v.-s) with distribution function (d.f.) $F(x)$, and $X=(x_{ji})$ is a regression design matrix.
Let $\widehat{\theta }=(\widehat{\theta _{1}},\dots ,\widehat{\theta _{q}})$ be the least squares estimator (LSE) of θ. Introduce the notation
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \widehat{y_{j}}=\sum \limits_{i=1}^{q}\widehat{\theta _{i}}x_{ji},\hspace{2em}\widehat{\epsilon _{j}}=y_{j}-\widehat{y_{j}},\hspace{1em}j=\overline{1,N};\\{} & \displaystyle Z_{N}=\underset{1\le j\le N}{\max }\epsilon _{j},\hspace{2em}\widehat{Z_{N}}=\underset{1\le j\le N}{\max }\widehat{\epsilon _{j}},\\{} & \displaystyle {Z_{N}^{\ast }}=\underset{1\le j\le N}{\max }|\epsilon _{j}|,\hspace{2em}{\widehat{Z_{N}}}^{\ast }=\underset{1\le j\le N}{\max }|\widehat{\epsilon _{j}}|.\end{array}\]
Asymptotic behavior of the r.v.-s $Z_{N}$, ${Z_{N}^{\ast }}$ is studied in extreme value theory (see the classical works by Fréchet [10], Fisher and Tippett [3], and Gnedenko [5] and the monographs [4, 8]). In the papers [6, 7], it was shown that, under mild assumptions, the asymptotic properties of the r.v.-s $Z_{N}$, $\widehat{Z_{N}}$, ${Z_{N}^{\ast }}$, and ${\widehat{Z_{N}}}^{\ast }$ are similar in the cases of both finite variance and heavy tails of the observation errors $\epsilon _{j}$.
In the present paper, we study the asymptotic properties of the minimax estimator (MME) of θ and of the maximal absolute residual. For the MME, we keep the same notation $\widehat{\theta }$.
Definition 1.
A random vector $\widehat{\theta }=(\widehat{\theta _{1}},\dots ,\widehat{\theta _{q}})$ is called the MME of θ based on the observations (1) if
(2)
\[\widehat{\varDelta }=\varDelta (\widehat{\theta })=\underset{\tau \in {\mathbb{R}}^{q}}{\min }\varDelta (\tau ),\]
where
\[\varDelta (\tau )=\underset{1\le j\le N}{\max }\left|y_{j}-\sum \limits_{i=1}^{q}\tau _{i}x_{ji}\right|.\]
Denote $W_{N}=\min _{1\le j\le N}\epsilon _{j}$ and let $R_{N}=Z_{N}-W_{N}$ and $Q_{N}=\frac{Z_{N}+W_{N}}{2}$ be the range and midrange of the sequence $\epsilon _{j},\hspace{2.5pt}j=\overline{1,N}$.
The following statement shows an essential difference in the behavior of the MME and the LSE.
Statement 1.
  • (i) If the model (1) contains a constant term, namely, $x_{j1}=1$, $j=\overline{1,N}$, then almost surely (a.s.)
    (3)
    \[\widehat{\varDelta }\le \frac{R_{N}}{2}.\]
  • (ii) If the model (1) has the form
    (4)
    \[y_{j}=\theta +\epsilon _{j},\hspace{1em}j=\overline{1,N},\]
    then a.s.
    \[\widehat{\varDelta }=\frac{R_{N}}{2},\hspace{2em}\widehat{\theta }-\theta =Q_{N}.\]
Remark 1.
Point (ii) of Statement 1 implies that the MME $\widehat{\theta }$ is not consistent in model (4) even for some $\epsilon _{j}$ possessing all moments (see Example 2).
Remark 2.
The value $\widehat{\varDelta }$ can be represented as the solution of the following linear programming problem (LPP):
(5)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \widehat{\varDelta }& \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\underset{\varDelta \in \mathcal{D}}{\min }\varDelta ,\\{} \displaystyle \mathcal{D}& \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\Bigg\{(\tau ,\varDelta )\in {\mathbb{R}}^{q}\hspace{0.1667em}\times \hspace{0.1667em}\mathbb{R}_{+}:\left|y_{j}\hspace{0.1667em}-\hspace{0.1667em}\sum \limits_{i=1}^{q}\tau _{i}x_{ji}\right|\hspace{0.1667em}\le \hspace{0.1667em}\varDelta ,\hspace{2.5pt}j\hspace{0.1667em}=\hspace{0.1667em}\overline{1,N}\Bigg\}\\{} & \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\Bigg\{(\tau ,\varDelta )\in {\mathbb{R}}^{q}\hspace{0.1667em}\times \hspace{0.1667em}\mathbb{R}_{+}:\sum \limits_{i=1}^{q}\tau _{i}x_{ji}\hspace{0.1667em}+\hspace{0.1667em}\varDelta \hspace{0.1667em}\ge \hspace{0.1667em}y_{j},-\sum \limits_{i=1}^{q}\tau _{i}x_{ji}\hspace{0.1667em}+\hspace{0.1667em}\varDelta \ge -y_{j},\hspace{2.5pt}j\hspace{0.1667em}=\hspace{0.1667em}\overline{1,N}\Bigg\}.\end{array}\]
So, problem (2) of determining the values $\widehat{\varDelta }$ and $\widehat{\theta }$ is reduced to solving LPP (5). The LPP can be solved efficiently by the simplex method (see [2, 12]). Investigating the asymptotic properties of the maximal absolute residual $\widehat{\varDelta }$ and the MME $\widehat{\theta }$ is quite difficult for the general model (1). However, under additional assumptions on the regression design and the observation errors $\epsilon _{j}$, it is possible to find the limiting distribution of $\widehat{\varDelta }$, to prove the consistency of the MME $\widehat{\theta }$, and even to estimate the rate of convergence $\widehat{\theta }\to \theta $ as $N\to \infty $.
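As an illustration, the following minimal sketch (ours, not part of the original paper) solves LPP (5) with SciPy's LP solver over the variable vector $z=(\tau _{1},\dots ,\tau _{q},\varDelta )$; the "highs" backend is used in place of a hand-rolled simplex method.

```python
# A sketch of solving LPP (5): minimize Delta subject to
# |y_j - sum_i tau_i * x_ji| <= Delta, j = 1..N.
import numpy as np
from scipy.optimize import linprog

def minimax_estimate(X, y):
    """Return the MME tau_hat and the maximal absolute residual Delta_hat."""
    N, q = X.shape
    c = np.zeros(q + 1)
    c[-1] = 1.0                                   # objective: minimize Delta
    ones = np.ones((N, 1))
    # |y_j - X_j tau| <= Delta  <=>   X_j tau - Delta <= y_j
    #                            and -X_j tau - Delta <= -y_j
    A_ub = np.vstack([np.hstack([X, -ones]),
                      np.hstack([-X, -ones])])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * q + [(0, None)]     # tau free, Delta >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:q], res.x[-1]
```

For the location model (4), the returned $\widehat{\varDelta }$ equals $R_{N}/2$, and $\widehat{\theta }=\theta +Q_{N}$, in agreement with Statement 1(ii).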

2 The main theorems

First, we recall briefly some results of extreme value theory. Let r.v.-s $(\epsilon _{j})$ have the d.f. $F(x)$. Assume that for some constants $b_{n}>0$ and $a_{n}$, as $n\to \infty $,
(6)
\[b_{n}(Z_{n}-a_{n})\stackrel{D}{\longrightarrow }\zeta ,\]
and ζ has a nondegenerate d.f. $G(x)=\mathbb{P}(\zeta <x)$. If assumption (6) holds, then we say that d.f. F belongs to the domain of maximum attraction of the probability distribution G and write $F\in D(G)$.
If $F\in D(G)$, then G must be of one of the following three types [5, 8]:
Type I:
\[\varPhi _{\alpha }(x)=\left\{\begin{array}{l@{\hskip10.0pt}l}0,\hspace{1em}& x\le 0,\\{} \exp \big(-{x}^{-\alpha }\big),\hspace{1em}& \alpha >0,\hspace{2.5pt}x>0;\end{array}\right.\]
Type II:
\[\varPsi _{\alpha }(x)=\left\{\begin{array}{l@{\hskip10.0pt}l}\exp \big(-{(-x)}^{\alpha }\big),\hspace{1em}& \alpha >0,\hspace{2.5pt}x\le 0,\\{} 1,\hspace{1em}& x>0;\end{array}\right.\]
Type III:
(7)
\[\varLambda (x)=\exp \big(-{e}^{-x}\big),\hspace{1em}-\infty <x<\infty .\]
Necessary and sufficient conditions for convergence to each of d.f.-s $\varPhi _{\alpha }$, $\varPsi _{\alpha }$, Λ are also well known.
Suppose in the model (1) that:
  • (A1) ($\epsilon _{j}$) are symmetric r.v.-s;
  • (A2) ($\epsilon _{j}$) satisfy relation (6), that is, $F\in D(G)$ with normalizing constants $a_{n}$ and $b_{n}$, where G is one of the d.f.-s $\varPhi _{\alpha }$, $\varPsi _{\alpha }$, Λ defined in (7).
Assume further that the regression experiment design is organized as follows:
(8)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle x_{j}& \displaystyle =(x_{j1},\dots ,x_{jq})\in \{v_{1},v_{2},\dots ,v_{k}\},\hspace{1em}v_{l}=(v_{l1},\dots ,v_{lq})\in {\mathbb{R}}^{q},\\{} \displaystyle v_{m}& \displaystyle \ne v_{l},\hspace{1em}m\ne l;\end{array}\]
that is, $x_{j}$ take some fixed values only. Besides, suppose that
(9)
\[x_{j}=v_{l}\hspace{1em}\text{for}\hspace{2.5pt}j\in I_{l},\hspace{2.5pt}l=\overline{1,k},\]
$\operatorname{card}(I_{l})=n$, $I_{m}\cap I_{l}=\varnothing $, $m\ne l$, $N=kn$ is the sample size, and
\[V=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c@{\hskip10.0pt}c}v_{11}& v_{12}& \dots & v_{1q}\\{} v_{21}& v_{22}& \dots & v_{2q}\\{} \dots & \dots & \dots & \dots \\{} v_{k1}& v_{k2}& \dots & v_{kq}\end{array}\right).\]
Theorem 1.
Under assumptions (A1), (A2), (8), and (9),
(10)
\[\varDelta _{n}=b_{n}(\widehat{\varDelta }-a_{n})\stackrel{D}{\to }\varDelta _{0},\hspace{1em}n\to \infty ,\]
where
(11)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \varDelta _{0}& \displaystyle =\underset{u\in {\mathcal{D}}^{\ast }}{\max }{L_{0}^{\ast }}(u),\\{} \displaystyle {L_{0}^{\ast }}(u)& \displaystyle =\sum \limits_{l=1}^{k}\big(u_{l}\zeta _{l}+{u^{\prime }_{l}}{\zeta ^{\prime }_{l}}\big),\hspace{1em}u=\big(u_{1},\dots ,u_{k},{u^{\prime }_{1}},\dots ,{u^{\prime }_{k}}\big),\\{} \displaystyle {\mathcal{D}}^{\ast }& \displaystyle =\Bigg\{u\ge 0:\sum \limits_{l=1}^{k}\big(u_{l}-{u^{\prime }_{l}}\big)v_{li}=0,\hspace{0.1667em}\sum \limits_{l=1}^{k}\big(u_{l}+{u^{\prime }_{l}}\big)=1,\hspace{2.5pt}i=\overline{1,q}\Bigg\},\end{array}\]
$\zeta _{l}$, ${\zeta ^{\prime }_{l}}$, $l=\overline{1,k}$, are independent r.v.-s having d.f. $G(x)$.
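Since $\varDelta _{0}$ in (11) is the optimal value of a random LP over ${\mathcal{D}}^{\ast }$, its law can be explored by Monte Carlo. The following sketch (ours; the choice $G=\varLambda $, i.e., Gumbel ζ's, is an assumption made purely for concreteness) draws one realization of $\varDelta _{0}$:

```python
# A sketch of sampling Delta_0 from (10)-(11): draw the 2k independent
# zeta's from G, then maximize L_0^*(u) over D* by linear programming.
import numpy as np
from scipy.optimize import linprog

def sample_delta0(V, rng):
    """One draw of Delta_0 for a (k, q) design matrix V; G = Gumbel assumed."""
    k, q = V.shape
    zeta = rng.gumbel(size=2 * k)        # (zeta_1..zeta_k, zeta'_1..zeta'_k)
    # D*: sum_l (u_l - u'_l) v_li = 0 (i = 1..q), sum_l (u_l + u'_l) = 1, u >= 0.
    A_eq = np.vstack([np.hstack([V.T, -V.T]),
                      np.ones((1, 2 * k))])
    b_eq = np.concatenate([np.zeros(q), [1.0]])
    res = linprog(-zeta, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * (2 * k), method="highs")
    return -res.fun                      # linprog minimizes, so negate back
```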
For a number sequence $b_{n}\to \infty $ and random sequence $(\xi _{n})$, we will write $\xi _{n}\stackrel{P}{=}O({b_{n}^{-1}})$ if
\[\underset{n}{\sup }\mathbb{P}\big(b_{n}|\xi _{n}|>C\big)\to 0\hspace{1em}\text{as}\hspace{2.5pt}C\to \infty .\]
Assume that $k\ge q$ and there exists a square submatrix $\widetilde{V}\subset V$ of order q,
\[\widetilde{V}=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}v_{l_{1}1}& \dots & v_{l_{1}q}\\{} \dots & \dots & \dots \\{} v_{l_{q}1}& \dots & v_{l_{q}q}\end{array}\right),\]
such that
(12)
\[\det \widetilde{V}\ne 0.\]
Theorem 2.
Assume that, under the conditions of Theorem 1, $k\ge q$, condition (12) holds, and
(13)
\[b_{n}\to \infty \hspace{1em}as\hspace{2.5pt}n\to \infty .\]
Then MME $\widehat{\theta }$ is consistent, and
\[\widehat{\theta }_{i}-\theta _{i}\stackrel{P}{=}O\big({b_{n}^{-1}}\big),\hspace{1em}i=\overline{1,q}.\]
Example 1.
Suppose that, in the simple linear regression model
(14)
\[y_{j}=\theta _{0}+\theta _{1}x_{j}+\epsilon _{j},\hspace{1em}j=\overline{1,N},\]
$x_{j}=v$, $j=\overline{1,N}$, that is, $k=1$ and $q=2$.
Then this model can be rewritten in the form (4) with $\theta =\theta _{0}+\theta _{1}v$. Clearly, the parameters $\theta _{0}$, $\theta _{1}$ cannot be determined unambiguously here. So, it does not make sense to speak about the consistency of the MME $\widehat{\theta }$ when $k<q$.
Example 2.
Consider regression model (4) with errors $\epsilon _{j}$ having the Laplace density $f(x)=\frac{1}{2}{e}^{-|x|}$. For this distribution, the well-known von Mises condition for the type III limit is satisfied ([8], p. 16), that is, $F\in D(\varLambda )$. For symmetric $F\in D(\varLambda )$, we have
\[\underset{n\to \infty }{\lim }\mathbb{P}\{2b_{n}Q_{n}<x\}=\frac{1}{1+{e}^{-x}}.\]
The limiting distribution is logistic (see [9], p. 62). Using the well-known formulas for type Λ ([9], p. 49), $a_{n}={F}^{-1}(1-\frac{1}{n})$ and $b_{n}=nf(a_{n})$, we find $a_{n}=\ln \frac{n}{2}$ and $b_{n}=1$. It now follows from Statement 1 that the MME $\widehat{\theta }$ is not consistent. Thus, condition (13) of Theorem 2 cannot be weakened.
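A quick simulation (ours, not from the paper) illustrates the logistic limit: for Laplace errors $b_{n}=1$, so $2b_{n}Q_{n}=2Q_{n}$ should already be approximately logistic for large n.

```python
# Sanity check of the logistic limit for the midrange of Laplace errors:
# compare the empirical d.f. of 2*Q_n at a point x with 1/(1 + e^{-x}).
import numpy as np

rng = np.random.default_rng(0)
n, reps, x = 10_000, 5_000, 1.0
eps = rng.laplace(size=(reps, n))
Q = (eps.max(axis=1) + eps.min(axis=1)) / 2      # midranges; here b_n = 1
print(np.mean(2 * Q < x), 1 / (1 + np.exp(-x)))  # both approximately 0.73
```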
The following lemma allows us to check condition (13).
Lemma 1.
Let $F\in D(G)$. Then we have:
  • 1. If $G=\varPhi _{\alpha }$, then
    \[\begin{array}{r@{\hskip0pt}l}\displaystyle x_{F}& \displaystyle =\sup \big\{x:F(x)<1\big\}=\infty ,\hspace{2em}\gamma _{n}={F}^{-1}\bigg(1-\frac{1}{n}\bigg)\to \infty ,\\{} \displaystyle b_{n}& \displaystyle ={\gamma _{n}^{-1}}\to 0\hspace{1em}\textit{as }\hspace{2.5pt}n\to \infty .\end{array}\]
    Thus, (13) does not hold.
  • 2. If $G=\varPsi _{\alpha }$, then
    \[x_{F}<\infty ,\hspace{1em}1-F(x_{F}-x)={x}^{\alpha }L(x),\]
    where $L(x)$ is a slowly varying (s.v.) function at zero, and there exists s.v. at infinity function $L_{1}(x)$ such that
    \[b_{n}={(x_{F}-\gamma _{n})}^{-1}={n}^{1/\alpha }L_{1}(n)\to \infty \hspace{1em}\textit{as }\hspace{2.5pt}n\to \infty .\]
    So (13) is true.
  • 3. If $G=\varLambda $, then
    \[b_{n}=r(\gamma _{n}),\hspace{1em}\textit{where }\hspace{2.5pt}r(x)={R^{\prime }}(x),R(x)=-\ln (1-F(x)).\]
    Clearly, (13) holds if
    \[x_{F}=\infty ,\hspace{2em}r(x)\to \infty \hspace{1em}\textit{as }\hspace{2.5pt}x\to \infty .\]
Similar results can be found in [9], Corollary 2.7, pp. 44–45; see also [4, 8].
Set
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle Z_{nl}=\underset{j\in I_{l}}{\max }\epsilon _{j},\hspace{2em}W_{nl}=\underset{j\in I_{l}}{\min }\epsilon _{j}\\{} & \displaystyle R_{nl}=Z_{nl}-W_{nl},\hspace{2em}Q_{nl}=\frac{Z_{nl}+W_{nl}}{2},\hspace{1em}l=\overline{1,k}.\end{array}\]
It turns out that Theorems 1 and 2 can be significantly simplified in the case $k=q$.
Theorem 3.
Let conditions (8) and (9) be satisfied for model (1), let $k=q$, and let the matrix V satisfy condition (12). Then we have:
  • (i)  
    (15)
    \[\begin{array}{r@{\hskip0pt}l}\displaystyle \hat{\varDelta }& \displaystyle =\frac{1}{2}\underset{1\le l\le q}{\max }R_{nl},\end{array}\]
    (16)
    \[\begin{array}{r@{\hskip0pt}l}\displaystyle \hat{\theta }_{i}-\theta _{i}& \displaystyle =\frac{\det VQ_{(i)}}{\det V},\hspace{1em}i=\overline{1,q},\end{array}\]
    where the matrix $VQ_{(i)}$ is obtained from V by replacement of the ith column by the column ${(Q_{n1},\dots ,Q_{nq})}^{T}$.
  • (ii) If additionally conditions $(A_{1}),(A_{2})$ are satisfied, then
    (17)
    \[\underset{n\to \infty }{\lim }\mathbb{P}\big(2b_{n}(\hat{\varDelta }-a_{n})<x\big)={\big(G\star G(x)\big)}^{q},\]
    where $G\star G(x)={\int _{-\infty }^{\infty }}G(x-y)dG(y),$ and for $i=\overline{1,q}$, as $n\to \infty $,
    (18)
    \[2b_{n}(\hat{\theta }_{i}-\theta _{i})\stackrel{D}{\longrightarrow }\frac{\det V\zeta _{(i)}}{\det V},\]
    the matrix $V\zeta _{(i)}$ is obtained from V by the replacement of the ith column by the column ${(\zeta _{1}-{\zeta ^{\prime }_{1}},\dots ,\zeta _{q}-{\zeta ^{\prime }_{q}})}^{T}$, where all the r.v.-s $\zeta _{i},{\zeta ^{\prime }_{i}}$ are independent and have d.f. G.
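When $k=q$, Theorem 3 makes the MME available in closed form, with no LP to solve: ranges and midranges are shift invariant, and the group midranges of the observed $y_{j}$ equal $V\theta +{(Q_{n1},\dots ,Q_{nq})}^{T}$, so a single $q\times q$ linear solve reproduces (15) and (16). A minimal sketch (ours):

```python
# A sketch of the closed-form MME of Theorem 3 (k = q, replicated design):
# Delta_hat from group ranges, eq. (15); theta_hat from group midranges, eq. (16).
import numpy as np

def mme_replicated(V, y_groups):
    """V: nonsingular (q, q) array of design rows; y_groups[l]: y_j for j in I_l."""
    Z = np.array([g.max() for g in y_groups])
    W = np.array([g.min() for g in y_groups])
    delta_hat = ((Z - W) / 2).max()            # eq. (15): largest half-range
    midranges = (Z + W) / 2                    # = V theta + (Q_n1, ..., Q_nq)^T
    theta_hat = np.linalg.solve(V, midranges)  # eq. (16) via Cramer's rule
    return theta_hat, delta_hat
```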
Remark 3.
Suppose that in model (1), under assumptions (8) and (9), $k<q$ and there exists a nondegenerate submatrix $\widetilde{V}\subset V$ of order k. Then
\[\hat{\varDelta }\le \frac{1}{2}\underset{1\le l\le k}{\max }R_{nl}\hspace{2.5pt}\hspace{2.5pt}\hspace{2.5pt}a.s.\]
Remark 4.
For the standard LSE,
\[\hat{\theta _{i}}-\theta _{i}\stackrel{P}{=}O\big({n}^{-1/2}\big);\]
therefore, if, under the conditions of Theorems 2 and 3,
(19)
\[{n}^{-1/2}b_{n}\to \infty \hspace{1em}\text{as}\hspace{2.5pt}n\to \infty ,\]
then the MME is more efficient than the LSE.
In [6] (see also [9]), it is proved that if $F\in D(\varLambda )$, then $b_{n}=O({n}^{\delta })$ for any $\delta >0$. From this relation and Lemma 1 it follows that (19) is not satisfied for the domains of maximum attraction $D(\varPhi _{\alpha })$ and $D(\varLambda )$. In the case of the domain $D(\varPsi _{\alpha })$, condition (19) holds for $\alpha \in (0,2)$. For example, assume that the r.v.-s $(\epsilon _{j})$ are symmetrically distributed on the interval $[-1,1]$ and
\[1-F(1-h)={h}^{\alpha }L(h)\hspace{1em}\text{as}\hspace{2.5pt}\hspace{2.5pt}h\downarrow 0,\hspace{2.5pt}\alpha \in (0,2),\]
where $L(h)$ is a function s.v. at zero. Then $b_{n}={n}^{1/\alpha }L_{1}(n)$, where $L_{1}$ is s.v. at infinity, and, under the conditions of Theorems 2 and 3, as $n\to \infty $,
\[|\hat{\theta _{i}}-\theta _{i}|\stackrel{P}{=}O\big({\big({n}^{1/\alpha }L_{1}(n)\big)}^{-1}\big)=o\big({n}^{-1/2}\big).\]
The following example is also of interest.
Example 3.
Let $(\epsilon _{j})$ be uniformly distributed in $[-1,1]$, that is, $F(x)=\frac{x+1}{2},\hspace{0.1667em}x\in [-1,1]$. It is well known that $F\in D(\varPsi _{1})$, $a_{n}=1,\hspace{0.1667em}b_{n}=\frac{n}{2}$. Then, under the conditions of Theorem 3, as $n\to \infty $,
\[\mathbb{P}\big(n(1-\hat{\varDelta })<x\big)\to 1-{\big[\mathbb{P}\{\zeta _{1}+\zeta _{2}>x\}\big]}^{q}=1-{(1+x)}^{q}\exp (-qx),\]
where $\zeta _{1},\zeta _{2}$ are i.i.d. r.v.-s, and $\mathbb{P}(\zeta _{i}<x)=1-\exp (-x),\hspace{0.1667em}x>0$.
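A Monte Carlo sketch (ours) of this limit: simulate the group ranges of uniform errors directly, form $\hat{\varDelta }$ by (15), and compare the empirical d.f. of $n(1-\hat{\varDelta })$ with $1-{(1+x)}^{q}\exp (-qx)$.

```python
# Monte Carlo check of Example 3: uniform errors on [-1, 1], q groups of size n.
import numpy as np

rng = np.random.default_rng(1)
n, q, reps, x = 1_000, 2, 5_000, 1.0
eps = rng.uniform(-1.0, 1.0, size=(reps, q, n))
R = eps.max(axis=2) - eps.min(axis=2)   # group ranges R_nl
delta_hat = R.max(axis=1) / 2           # eq. (15)
lhs = np.mean(n * (1 - delta_hat) < x)
rhs = 1 - (1 + x) ** q * np.exp(-q * x)
print(lhs, rhs)                         # both approximately 0.46 for q=2, x=1
```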
The following corollary is an immediate consequence of Theorem 3.
Corollary 1.
If, for the simple linear regression (14), conditions (8) and (9) are satisfied, $k=q=2$, and
\[V=\left(\begin{array}{c@{\hskip10.0pt}c}1& v_{1}\\{} 1& v_{2}\end{array}\right),\hspace{1em}v_{1}\ne v_{2},\]
then we have:
  • (i)  
    \[\begin{array}{r@{\hskip0pt}l}& \displaystyle \hat{\varDelta }=\frac{1}{2}\max (R_{n1},R_{n2}),\\{} & \displaystyle \hat{\theta }_{1}-\theta _{1}=\frac{Q_{n2}-Q_{n1}}{v_{2}-v_{1}},\hspace{2em}\hat{\theta }_{0}-\theta _{0}=\frac{Q_{n1}v_{2}-Q_{n2}v_{1}}{v_{2}-v_{1}};\end{array}\]
  • (ii) under assumptions $(A_{1})$ and $(A_{2})$, relation (17) holds for $q=2$, and, as $n\to \infty $,
    \[2b_{n}(\hat{\theta }_{1}-\theta _{1})\stackrel{D}{\longrightarrow }\frac{\zeta _{2}-{\zeta ^{\prime }_{2}}-\zeta _{1}+{\zeta ^{\prime }_{1}}}{v_{2}-v_{1}},\]
    \[2b_{n}(\hat{\theta }_{0}-\theta _{0})\stackrel{D}{\longrightarrow }\frac{(\zeta _{1}-{\zeta ^{\prime }_{1}})v_{2}-(\zeta _{2}-{\zeta ^{\prime }_{2}})v_{1}}{v_{2}-v_{1}},\]
    where the r.v.-s $\zeta _{1},{\zeta ^{\prime }_{1}},\zeta _{2},{\zeta ^{\prime }_{2}}$ are independent and have d.f. G.
Remark 5.
The conditions of Theorem 3 do not include (13), so the theorem describes the asymptotic distribution of $\hat{\theta }$ even for an inconsistent MME.

3 Proofs of the main results

Let us start with the following elementary lemma, where $Z_{n}(t)$, $W_{n}(t)$, $R_{n}(t)$, and $Q_{n}(t)$ are determined by a sequence $t=\{t_{1},\dots ,t_{n}\}$ and are respectively the maximum, minimum, range, and midrange of the sequence t.
Lemma 2.
Let $t_{1},\dots ,t_{n}$ be any real numbers, and
(20)
\[\alpha _{n}=\underset{s\in R}{\min }\underset{1\le j\le n}{\max }|t_{j}-s|.\]
Then $\alpha _{n}=R_{n}(t)/2$; moreover, the minimum in (20) is attained at the point $s=Q_{n}(t)$.
Proof.
Choose $s=Q_{n}(t)$. Then
\[\underset{1\le i\le n}{\max }|t_{i}-s|=Z_{n}(t)-Q_{n}(t)=Q_{n}(t)-W_{n}(t)=\frac{1}{2}R_{n}(t).\]
If $s=Q_{n}(t)+\delta $, then, for $\delta >0$,
\[\underset{1\le i\le n}{\max }|t_{i}-s|=s-W_{n}(t)=\frac{1}{2}R_{n}(t)+\delta ,\]
and, for $\delta <0$,
\[\underset{1\le i\le n}{\max }|t_{i}-s|=Z_{n}(t)-s=\frac{1}{2}R_{n}(t)-\delta ,\]
that is, $s=Q_{n}(t)$ is the point of minimum.  □
Proof of Statement 1.
We will use Lemma 2:
\[\hat{\varDelta }=\underset{\tau \in {\mathbb{R}}^{q}}{\min }\underset{1\le j\le N}{\max }\Bigg|\epsilon _{j}-\sum \limits_{i=1}^{q}(\tau _{i}-\theta _{i})x_{ji}\Bigg|\le \underset{\tau _{1}\in \mathbb{R}}{\min }\underset{1\le j\le N}{\max }\big|\epsilon _{j}-(\tau _{1}-\theta _{1})\big|=\frac{1}{2}R_{N}\]
(we set $\tau _{i}=\theta _{i}$, $i\ge 2$). Point (ii) of Statement 1 follows directly from Lemma 2.  □
Proof of Theorem 1.
Using the notation
\[d=(d_{1},\dots ,d_{q}),\hspace{1em}d_{i}=\tau _{i}-\theta _{i},\hspace{2.5pt}i=\overline{1,q},\]
and taking into account Eq. (1) and conditions (8) and (9), we rewrite LPP (5) in the following form:
(21)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \hat{\varDelta }& \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\underset{\varDelta \in \mathcal{D}_{1}}{\min }\varDelta ,\\{} \displaystyle \mathcal{D}_{1}& \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\Bigg\{(d,\varDelta )\in {\mathbb{R}}^{q}\times \mathbb{R}_{+}:\sum \limits_{i=1}^{q}d_{i}x_{ji}+\varDelta \ge \epsilon _{j},-\sum \limits_{i=1}^{q}d_{i}x_{ji}+\varDelta \ge -\epsilon _{j},j=\overline{1,N}\Bigg\}\\{} & \displaystyle \hspace{0.1667em}=\hspace{0.1667em}\Bigg\{(d,\varDelta )\in {\mathbb{R}}^{q}\hspace{0.1667em}\times \hspace{0.1667em}\mathbb{R}_{+}\hspace{0.1667em}:\hspace{0.1667em}\sum \limits_{i=1}^{q}d_{i}v_{li}\hspace{0.1667em}+\hspace{0.1667em}\varDelta \hspace{0.1667em}\ge \hspace{0.1667em}Z_{nl},-\sum \limits_{i=1}^{q}d_{i}v_{li}\hspace{0.1667em}+\hspace{0.1667em}\varDelta \ge -W_{nl},l=\overline{1,k}\Bigg\}.\end{array}\]
The LPP dual to (21) has the form
(22)
\[\underset{u\in {\mathcal{D}}^{\ast }}{\max }{L_{n}^{\ast }}(u),\]
where ${L_{n}^{\ast }}(u)={\sum _{l=1}^{k}}(u_{l}Z_{nl}-{u^{\prime }_{l}}W_{nl})$, and the domain ${\mathcal{D}}^{\ast }$ is given by (11).
According to the basic duality theorem ([11], Chap. 4),
\[\hat{\varDelta }=\underset{u\in {\mathcal{D}}^{\ast }}{\max }{L_{n}^{\ast }}(u).\]
Hence, we obtain
\[\begin{array}{r@{\hskip0pt}l}\displaystyle b_{n}(\hat{\varDelta }-a_{n})& \displaystyle =\underset{u\in {\mathcal{D}}^{\ast }}{\max }b_{n}\big({L_{n}^{\ast }}(u)-a_{n}\big)=\underset{u\in {\mathcal{D}}^{\ast }}{\max }g_{n}(u),\\{} \displaystyle g_{n}(u)& \displaystyle =\sum \limits_{l=1}^{k}\big[u_{l}b_{n}(Z_{nl}-a_{n})+{u^{\prime }_{l}}b_{n}(-W_{nl}-a_{n})\big].\end{array}\]
Denote by ${\varGamma }^{\ast }$ the set of vertices of the domain ${\mathcal{D}}^{\ast }$, and set
\[g_{0}(u)=\sum \limits_{l=1}^{k}\big(u_{l}\zeta _{l}+{u^{\prime }_{l}}{\zeta ^{\prime }_{l}}\big).\]
Since the maximum in LPP (22) is attained at one of the vertices ${\varGamma }^{\ast }$,
\[\underset{u\in {\mathcal{D}}^{\ast }}{\max }g_{n}(u)=\underset{u\in {\varGamma }^{\ast }}{\max }g_{n}(u),\hspace{1em}n\ge 1.\]
Obviously, $\operatorname{card}({\varGamma }^{\ast })<\infty $. Thus, to prove (10), it suffices to show that, as $n\to \infty $,
\[\underset{u\in {\varGamma }^{\ast }}{\max }g_{n}(u)\stackrel{D}{\longrightarrow }\underset{u\in {\varGamma }^{\ast }}{\max }g_{0}(u)\]
or
(23)
\[\big(g_{n}(u),u\in {\varGamma }^{\ast }\big)\stackrel{D}{\longrightarrow }\big(g_{0}(u),u\in {\varGamma }^{\ast }\big).\]
The Cramér–Wold argument (see, e.g., §7 of the book [1]) reduces (23) to the following relation: for any $t_{m}\in \mathbb{R}$, as $n\to \infty $,
\[\sum \limits_{{u}^{(m)}\in {\varGamma }^{\ast }}g_{n}\big({u}^{(m)}\big)t_{m}\stackrel{D}{\longrightarrow }\sum \limits_{{u}^{(m)}\in {\varGamma }^{\ast }}g_{0}\big({u}^{(m)}\big)t_{m}.\]
The last convergence holds if for any $c_{l},{c^{\prime }_{l}}$, as $n\to \infty $,
(24)
\[\sum \limits_{l=1}^{k}\big[c_{l}(Z_{nl}-a_{n})+{c^{\prime }_{l}}(-W_{nl}-a_{n})\big]\stackrel{D}{\longrightarrow }\sum \limits_{l=1}^{k}\big(c_{l}\zeta _{l}+{c^{\prime }_{l}}{\zeta ^{\prime }_{l}}\big).\]
Under the conditions of Theorem 1,
(25)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \zeta _{nl}& \displaystyle =b_{n}(Z_{nl}-a_{n})\stackrel{D}{\longrightarrow }\zeta _{l},\\{} \displaystyle {\zeta ^{\prime }_{nl}}& \displaystyle =b_{n}(-W_{nl}-a_{n})\stackrel{D}{\longrightarrow }{\zeta ^{\prime }_{l}},\hspace{1em}l=\overline{1,k}.\end{array}\]
The vectors $(Z_{nl},W_{nl})$, $l=\overline{1,k}$, are independent, and, on the other hand, $Z_{nl}$ and $W_{nl}$ are asymptotically independent as $n\to \infty $ ([8], p. 28). To obtain (24), it remains to apply once more the Cramér–Wold argument.  □
Proof of Theorem 2.
Let $(\hat{d},\hat{\varDelta })$, $\hat{d}=(\hat{d}_{1},\dots ,\hat{d}_{q})$, be the solution of LPP (21), and let $\gamma _{l}={\sum _{i=1}^{q}}\hat{d}_{i}v_{li}$. Then, for any $l=\overline{1,k}$,
(26)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \gamma _{l}+\hat{\varDelta }& \displaystyle \ge Z_{nl},\\{} \displaystyle -\gamma _{l}+\hat{\varDelta }& \displaystyle \ge -W_{nl}.\end{array}\]
Rewrite the asymptotic relations (25) and (10) in the form
(27)
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle Z_{nl}=a_{n}+\frac{\zeta _{nl}}{b_{n}},\hspace{2em}-W_{nl}=a_{n}+\frac{{\zeta ^{\prime }_{nl}}}{b_{n}},\\{} & \displaystyle \zeta _{nl}\stackrel{D}{\longrightarrow }\zeta _{l},\hspace{2em}{\zeta ^{\prime }_{nl}}\stackrel{D}{\longrightarrow }{\zeta ^{\prime }_{l}},\end{array}\]
and
(28)
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \hat{\varDelta }=a_{n}+\frac{\varDelta _{n}}{b_{n}},\\{} & \displaystyle \varDelta _{n}\stackrel{D}{\longrightarrow }\varDelta _{0}\hspace{1em}as\hspace{2.5pt}n\to \infty .\end{array}\]
Combining (26)–(28), we obtain, for $l=\overline{1,k}$,
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \gamma _{l}\ge Z_{nl}-\hat{\varDelta }& \displaystyle =\frac{\zeta _{nl}-\varDelta _{n}}{b_{n}}=O\big({b_{n}^{-1}}\big),\\{} \displaystyle \gamma _{l}\le W_{nl}+\hat{\varDelta }& \displaystyle =\frac{-{\zeta ^{\prime }_{nl}}+\varDelta _{n}}{b_{n}}=O\big({b_{n}^{-1}}\big).\end{array}\]
Choose $l_{1},\dots ,l_{q}$ satisfying (12). Then
\[\sum \limits_{i=1}^{q}\hat{d}_{i}v_{l_{j}i}=\gamma _{l_{j}}=O\big({b_{n}^{-1}}\big),\hspace{1em}j=\overline{1,q},\]
and by Cramer’s rule,
\[\hat{\theta }_{i}-\theta _{i}=\hat{d}_{i}=\frac{\det \tilde{V}\gamma _{(i)}}{\det \tilde{V}}=O\big({b_{n}^{-1}}\big),\]
where the matrix $\tilde{V}\gamma _{(i)}$ is obtained from $\tilde{V}$ by replacement of the ith column by the column ${(\gamma _{l_{1}},\dots ,\gamma _{l_{q}})}^{T}$.  □
Proof of Theorem 3.
(i) We have
(29)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \hat{\varDelta }& \displaystyle =\underset{\tau \in {\mathbb{R}}^{q}}{\min }\underset{1\le l\le q}{\max }\underset{j\in I_{l}}{\max }\left|y_{j}-\sum \limits_{i=1}^{q}\tau _{i}v_{li}\right|\\{} & \displaystyle =\underset{d\in {\mathbb{R}}^{q}}{\min }\underset{1\le l\le q}{\max }\underset{j\in I_{l}}{\max }\left|\epsilon _{j}-\sum \limits_{i=1}^{q}d_{i}v_{li}\right|.\end{array}\]
By Lemma 2,
\[\underset{s\in \mathbb{R}}{\min }\underset{j\in I_{l}}{\max }|\epsilon _{j}-s|=\frac{1}{2}R_{nl}\hspace{1em}\text{at}\hspace{2.5pt}s=Q_{nl},\hspace{2.5pt}l=\overline{1,q}.\]
Therefore, the minimum in d in (29) is attained at the point $\hat{d}$ solving the system of linear equations
\[\sum \limits_{i=1}^{q}d_{i}v_{li}=Q_{nl},\hspace{1em}l=\overline{1,q}.\]
Since the matrix V is nonsingular, by Cramer’s rule
\[\hat{d}_{i}=\hat{\theta }_{i}-\theta _{i}=\frac{\det VQ_{(i)}}{\det V},\hspace{1em}i=\overline{1,q}.\]
Obviously, for such a choice of $\hat{d}$, $\hat{\varDelta }=\frac{1}{2}\max _{1\le l\le q}R_{nl}$; that is, we have obtained formulas (15) and (16).
(ii) Using the asymptotic independence of r.v.-s $Z_{n}$ and $W_{n}$, we derive the following statement.
Lemma 3.
If r.v.-s $(\epsilon _{j})$ satisfy conditions $(A_{1})$, $(A_{2})$, then, as $n\to \infty $,
(30)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle b_{n}(R_{n}-2a_{n})& \displaystyle \stackrel{D}{\longrightarrow }\zeta +{\zeta ^{\prime }},\end{array}\]
(31)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle 2b_{n}Q_{n}& \displaystyle \stackrel{D}{\longrightarrow }\zeta -{\zeta ^{\prime }},\end{array}\]
where ζ and ${\zeta ^{\prime }}$ are independent r.v.-s and have d.f. G.
In fact, this lemma is contained in Theorem 2.9.2 of the book [4] (see also Theorem 2.10 in [9]).
Equality (17) of Theorem 3 follows immediately from relation (30) of Lemma 3.
Similarly, from the asymptotic relation (31) and Eq. (16) we obtain (18) by applying once more the Cramér–Wold argument.  □
Remark 3 follows directly from Theorem 3. Indeed, let $k<q$, and let there exist a nonsingular submatrix $\widetilde{V}\subset V$,
\[\widetilde{V}=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}v_{1i_{1}}& \dots & v_{1i_{k}}\\{} \dots & \dots & \dots \\{} v_{ki_{1}}& \dots & v_{ki_{k}}\end{array}\right).\]
Choosing in LPP (21) from Theorem 1 $d_{i}=0$ for all $i\ne i_{1},i_{2},\dots ,i_{k}$ (i.e., taking $\tau _{i}=\theta _{i}$ for such indices i), we pass to problem (29). It remains to apply Eq. (15) of Theorem 3.
Remark 6.
Using the notation $\bar{\zeta }-\bar{{\zeta ^{\prime }}}={(\zeta _{1}-{\zeta ^{\prime }_{1}},\dots ,\zeta _{q}-{\zeta ^{\prime }_{q}})}^{T}$, the coordinatewise relation (18) of Theorem 3 can be rewritten in the equivalent vector form
(32)
\[2b_{n}(\hat{\theta }-\theta )\stackrel{D}{\longrightarrow }{V}^{-1}\big(\bar{\zeta }-\bar{{\zeta ^{\prime }}}\big)\hspace{1em}as\hspace{2.5pt}n\to \infty .\]
If the variance ${\sigma _{G}^{2}}=\operatorname{Var}\zeta $ of an r.v. ζ having d.f. G exists, then the covariance matrix of the limiting distribution in (32) is $C_{G}=2{\sigma _{G}^{2}}{({V}^{T}V)}^{-1}$.
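A numerical confirmation sketch of this covariance formula (ours; $G=\varLambda $, i.e., Gumbel, is assumed, for which ${\sigma _{G}^{2}}={\pi }^{2}/6$):

```python
# Check C_G = 2*sigma_G^2*(V^T V)^{-1} for the limit law in (32), G = Gumbel.
import numpy as np

rng = np.random.default_rng(3)
V = np.array([[1.0, 0.0], [1.0, 1.0]])               # a toy 2 x 2 design
reps = 200_000
diff = rng.gumbel(size=(reps, 2)) - rng.gumbel(size=(reps, 2))  # zeta - zeta'
samples = diff @ np.linalg.inv(V).T                  # V^{-1}(zeta - zeta')
print(np.cov(samples, rowvar=False))                 # empirical covariance
print(2 * (np.pi ** 2 / 6) * np.linalg.inv(V.T @ V)) # theoretical C_G
```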

References

[1] 
Billingsley, P.: Convergence of Probability Measures. Wiley, New York (1968). MR0233396
[2] 
Ermoliev, Y.M., et al.: Mathematical Methods of Operations Research. Vyshcha Shkola, Kyiv (1979)
[3] 
Fisher, R.A., Tippett, L.H.C.: Limiting forms of the frequency distribution of the largest or smallest member of a sample. Proc. Camb. Philos. Soc. 24, 180–190 (1928)
[4] 
Galambos, J.: The Asymptotic Theory of Extreme Order Statistics. Wiley, New York (1978). MR0489334
[5] 
Gnedenko, B.: Sur la distribution limite du terme maximum d’une série aléatoire. Ann. Math. 44, 423–453 (1943). MR0008655
[6] 
Ivanov, A.V., Matsak, I.K.: Limit theorems for extreme residuals in linear and nonlinear regression models. Theory Probab. Math. Stat. 86, 79–91 (2013). MR2986451. doi:10.1090/S0094-9000-2013-00890-4
[7] 
Ivanov, O.V., Matsak, I.K.: Limit theorems for extreme residuals in regression models with heavy tails of observation errors. Theory Probab. Math. Stat. 88, 99–108 (2014). MR3112637. doi:10.1090/S0094-9000-2014-00921-7
[8] 
Leadbetter, M.R., Lindgren, G., Rootzén, H.: Extremes and Related Properties of Random Sequences and Processes. Springer (1983). MR0691492
[9] 
Matsak, I.K.: Elements of the Theory of Extreme Values. Comprint, Kyiv (2014)
[10] 
Fréchet, M.: Sur la loi de probabilité de l’écart maximum. Ann. Soc. Pol. Math. Crac. 6, 93–116 (1927)
[11] 
Murtagh, B.A.: Advanced Linear Programming: Computation and Practice. McGraw-Hill, New York (1981). MR0609151
[12] 
Zaychenko, Y.P.: Operations Research. Vyshcha Shkola, Kyiv (1988)