1 Introduction
We study an overdetermined system of linear equations $AX\approx B$, which often arises in problems of dynamical system identification [10]. If the matrices A and B are observed with additive uncorrelated errors of equal variance, then the total least squares (TLS) method is used to solve the system [10].
In papers [3, 7, 9], under various conditions, the consistency of the TLS estimator $\hat{X}$ is proven as the number m of rows of the matrix A increases, assuming that the true value ${A}^{0}$ of the input matrix is nonrandom. The asymptotic normality of the estimator is studied in [3] and [6].
The model $AX\approx B$ with random measurement errors corresponds to the vector linear errors-in-variables model (EIVM). In [2], a goodness-of-fit test is constructed for a polynomial EIVM with a nonrandom latent variable (i.e., in the functional case); the test can also be used in the structural case, where the latent variable is random with an unknown probability distribution. A more powerful test for the polynomial EIVM is elaborated in [4].
In [5], a goodness-of-fit test is constructed for the functional model $AX\approx B$, assuming that the error matrices $\tilde{A}$ and $\tilde{B}$ are independent and the covariance structure of $\tilde{A}$ is known. In the present paper, we construct a goodness-of-fit test in a more general situation, where the total covariance structure of the matrices $\tilde{A}$ and $\tilde{B}$ is known up to a scalar factor. The test statistic is based on the TLS estimator $\hat{X}$. Under the null hypothesis, the asymptotic behavior of the test statistic is studied based on the results of [6] and, under local alternatives, based on [9].
The present paper is organized as follows. In Section 2, we describe the observation model, introduce the TLS estimator, and state known results on the strong consistency and asymptotic normality of the estimator. In the next section, we construct the goodness-of-fit test and show that the proposed test statistic has an asymptotic chi-squared distribution with the corresponding number of degrees of freedom. The power of the test with respect to local alternatives is studied in Section 4, and Section 5 concludes. The proofs are given in the Appendix.
We use the following notation: $\| C\| =\sqrt{\sum _{i,j}{c_{ij}^{2}}}$ is the Frobenius norm of a matrix $C=(c_{ij})$, and $\mathrm{I}_{p}$ is the identity matrix of size p. The symbol $\operatorname{\mathsf{E}}$ denotes the expectation and acts as an operator on the entire product of quantities that follows it, and $\operatorname{\mathbf{cov}}$ denotes the covariance matrix of a random vector. The superscript ⊤ denotes transposition. In this paper, all vectors are column vectors. The bar means averaging over $i=1,\dots ,m$, for example, $\bar{a}:={m}^{-1}{\sum _{i=1}^{m}}a_{i}$, $\overline{a{b}^{\top }}:={m}^{-1}{\sum _{i=1}^{m}}a_{i}{b_{i}^{\top }}$. Convergence with probability one, in probability, and in distribution are denoted by $\stackrel{\mathrm{P}\mathrm{1}}{\to }$, $\stackrel{\mathrm{P}}{\to }$, and $\stackrel{\mathrm{d}}{\to }$, respectively. A sequence of random matrices that converges to zero in probability is denoted by $o_{p}(1)$, and a sequence of stochastically bounded random matrices is denoted by $O_{p}(1)$. The notation $\varepsilon \stackrel{\mathrm{d}}{=}\varepsilon _{1}$ means that the random variables ε and $\varepsilon _{1}$ have the same probability distribution. Positive constants that do not depend on the sample size m are denoted by $\mathit{const}$, so that equalities like $2\cdot \mathit{const}=\mathit{const}$ are possible.
2 Observation model and total least squares estimator
2.1 The TLS problem
Consider the observation model
where ${A}^{0}\in {\mathbb{R}}^{m\times n}$, ${X}^{0}\in {\mathbb{R}}^{n\times d}$, and ${B}^{0}\in {\mathbb{R}}^{m\times d}$. The matrices A and B contain the data, ${A}^{0}$ and ${B}^{0}$ are unknown nonrandom matrices, and $\tilde{A}$, $\tilde{B}$ are the matrices of random errors.
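To make the setting concrete, here is a minimal Python sketch that simulates data from model (2.1); the dimensions m, n, d, the true matrix ${X}^{0}$, and the error scale σ are hypothetical illustration values, and Gaussian errors are used only for simplicity.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, d = 500, 3, 2          # hypothetical dimensions: m rows, n inputs, d responses
sigma = 0.1                  # assumed common error standard deviation

X0 = rng.standard_normal((n, d))       # true coefficient matrix X^0
A0 = rng.standard_normal((m, n))       # true (noise-free) input matrix A^0
B0 = A0 @ X0                           # exact linear relation between A^0 and B^0

A = A0 + sigma * rng.standard_normal((m, n))   # observed inputs:  A = A^0 + tilde A
B = B0 + sigma * rng.standard_normal((m, d))   # observed outputs: B = B^0 + tilde B
```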
We can rewrite model (2.1) in an implicit way. Introduce three matrices of size $m\times (n+d)$:
Then
Let ${A}^{\top }=[a_{1}\dots a_{m}]$, ${B}^{\top }=[b_{1}\dots b_{m}]$, and we use similar notation for the rows of the matrices C, ${A}^{0}$, ${B}^{0}$, $\tilde{A}$, $\tilde{B}$, and $\tilde{C}$. Rewrite model (2.1) as a multivariate linear one:
Throughout the paper, the following assumption holds for the errors $\tilde{c}_{i}={[{\tilde{a}_{i}^{\top }}\hspace{2.5pt}{\tilde{b}_{i}^{\top }}]}^{\top }$:
Thus, the total error covariance structure is assumed to be known up to a scalar factor ${\sigma }^{2}$, and the errors are uncorrelated with equal variances.
For model (2.1), the TLS problem consists in finding disturbances $\Delta \hat{A}$ and $\Delta \hat{B}$ that minimize the sum of squared corrections
provided that
2.2 The TLS estimator and its consistency
It can happen that, for a particular random realization, the optimization problem (2.6)–(2.7) has no solution. In that case, we set $\hat{X}=\infty $.
Definition 1.
The TLS estimator $\hat{X}$ of the matrix parameter ${X}^{0}$ in the model (2.1) is a Borel-measurable function of the observed matrices A and B, taking values in ${\mathbb{R}}^{n\times d}\cup \{\infty \}$, that provides a solution to problem (2.6)–(2.7) whenever a solution exists and equals $\infty $ otherwise.
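A minimal sketch of the classical SVD-based construction of the TLS solution; it covers only the regular case in which the relevant block of right singular vectors is invertible, so that a finite solution to (2.6)–(2.7) exists.

```python
import numpy as np

def tls_estimate(A: np.ndarray, B: np.ndarray) -> np.ndarray:
    """Classical SVD-based TLS estimate of X in A X ~ B.

    Returns the minimizer of the sum of squared corrections when the
    lower-right block of the right singular vectors is invertible;
    otherwise no finite TLS solution exists (numpy raises on an exactly
    singular block, corresponding to the case X_hat = infinity).
    """
    n = A.shape[1]
    d = B.shape[1]
    C = np.hstack([A, B])                    # C = [A B], size m x (n + d)
    _, _, Vt = np.linalg.svd(C, full_matrices=False)
    V = Vt.T
    V12 = V[:n, n:n + d]                     # upper-right block of V
    V22 = V[n:n + d, n:n + d]                # lower-right block of V
    return -V12 @ np.linalg.inv(V22)         # X_hat = -V12 V22^{-1}
```

With the simulated data above, `tls_estimate(A, B)` approaches ${X}^{0}$ as m grows, in line with the strong consistency result stated below.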
We need the following conditions to provide the consistency of the estimator:
The next result on the strong consistency of the estimator follows, for example, from Theorem 4.3 in [9].
Define the loss function $Q(X)$ as follows:
It is known that the TLS estimator minimizes the loss function (2.9); see formula (24) in [7].
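As a numerical illustration, the sketch below evaluates the TLS loss written in the standard form $Q(X)=\operatorname{tr}\big[(AX-B)^{\top }(AX-B)(\mathrm{I}_{d}+{X}^{\top }X)^{-1}\big]$; that this form agrees with the displayed loss (2.9) is an assumption of the sketch. It can be used to check that the SVD-based estimate from the previous sketch attains a smaller loss than nearby perturbed arguments.

```python
import numpy as np

def tls_loss(A: np.ndarray, B: np.ndarray, X: np.ndarray) -> float:
    """Standard TLS loss Q(X) = tr[(A X - B)^T (A X - B) (I_d + X^T X)^{-1}],
    assumed here to coincide with the loss function (2.9)."""
    d = B.shape[1]
    R = A @ X - B                                   # residual matrix A X - B
    M = np.linalg.inv(np.eye(d) + X.T @ X)          # (I_d + X^T X)^{-1}
    return float(np.trace(R.T @ R @ M))
```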
Introduce the following unbiased estimating function related to the elementary loss function (2.8):
2.3 Asymptotic normality of the estimator
We need further restrictions on the model. Recall that the augmented errors $\tilde{c}_{i}$ were introduced in Section 2.1, and the vectors ${a_{i}^{0}}$, $\tilde{b}_{i}$, and so on are those from model (2.3)–(2.4).
-
(iv) $\operatorname{\mathsf{E}}\| \tilde{c}_{1}{\| }^{4+2\delta }<\infty $ for some $\delta >0$;
-
(v) For δ from condition (iv), $\frac{1}{{m}^{1+\delta /2}}{\sum _{i=1}^{m}}\| {a_{i}^{0}}{\| }^{2+\delta }\to 0$ as $m\to \infty $.
Denote by ${\tilde{c}_{1}^{(p)}}$ the pth coordinate of the vector $\tilde{c}_{1}$.
Under assumptions (i) and (iv), condition (vi) holds, for example, in two cases: (a) when the random vector $\tilde{c}_{1}$ is symmetrically distributed, or (b) when the components of the vector $\tilde{c}_{1}$ are independent and, moreover, for each $p=1,\dots ,n+d$, the skewness coefficient of the random variable ${\tilde{c}_{1}^{(p)}}$ equals 0.
Introduce the following random element in the space of collections of five matrices:
The next statement on the asymptotic normality of the estimator follows from the proof of Theorem 8(b) in [6], where condition (vi) was replaced by the stronger assumption that $\tilde{c}_{1}$ is symmetrically distributed; the proof of Theorem 8(b) in [6] still works under the weaker condition (vi).
Theorem 4.
Assume conditions (i) and (iii)–(vi). Then:
-
(b)
(2.13)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \sqrt{m}\big(\hat{X}-{X}^{0}\big)& \displaystyle \stackrel{\mathrm{d}}{\to }{V_{A}^{-1}}\varGamma \big({X}^{0}\big)\hspace{1em}\textit{as }\hspace{2.5pt}m\to \infty ,\end{array}\]
where
(2.14)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \varGamma (X)& \displaystyle :=\varGamma _{1}X-\varGamma _{2}+\varGamma _{3}X-\varGamma _{4}\\{} & \displaystyle \hspace{1em}-X{\big(\mathrm{I}_{d}+{X}^{\top }X\big)}^{-1}\big({X}^{\top }\varGamma _{3}X-{X}^{\top }\varGamma _{4}-{\varGamma _{4}^{\top }}X+\varGamma _{5}\big),\end{array}\]
Let $f\in {\mathbb{R}}^{n\times 1}$. Under the conditions of Theorem 4, the convergence (2.13) implies that
Let a consistent estimator $\hat{f}=\hat{f}_{m}$ of the vector f be given. We want to construct a consistent estimator of the matrix (2.16). The matrix $S({X}^{0},f)$ is expressed, for instance, in terms of the fourth moments of the errors $\tilde{c}_{i}$, and those moments cannot be consistently estimated without additional assumptions on the error probability distribution. Therefore, an explicit expression for the latter matrix does not help to construct the desired estimator. Nevertheless, we can construct a sandwich-type estimator [1, pp. 368–369].
The next statement on the consistency of the nuisance parameter estimators follows from the proof of Lemma 10 in [6]. Recall that the bar means averaging over the observations; see Section 1.
Lemma 6.
Assume the conditions of Theorem 4. Define the estimators:
Then
(2.17)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle {\hat{\sigma }}^{2}& \displaystyle =\frac{1}{d}\mathrm{tr}\big[\big(\overline{b{b}^{\top }}-2{\hat{X}}^{\top }\overline{a{b}^{\top }}+{\hat{X}}^{\top }\overline{a{a}^{\top }}\hat{X}\big){\big(\mathrm{I}_{d}+{\hat{X}}^{\top }\hat{X}\big)}^{-1}\big],\end{array}\]
The next asymptotic expansion of the TLS estimator is presented in [6], formulas (4.10) and (4.11).
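A direct rendering of (2.17) in code, using the bar averages defined in Section 1 (a sketch; `Xhat` stands for the TLS estimate $\hat{X}$):

```python
import numpy as np

def sigma2_hat(A: np.ndarray, B: np.ndarray, Xhat: np.ndarray) -> float:
    """Estimator (2.17) of sigma^2 built from the averaged cross-products."""
    m, d = B.shape
    bb = B.T @ B / m                 # \overline{b b^T}
    ab = A.T @ B / m                 # \overline{a b^T}
    aa = A.T @ A / m                 # \overline{a a^T}
    inner = bb - 2 * Xhat.T @ ab + Xhat.T @ aa @ Xhat
    return float(np.trace(inner @ np.linalg.inv(np.eye(d) + Xhat.T @ Xhat)) / d)
```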
In view of Lemma 7, introduce the sandwich estimator $\hat{S}(\hat{f})$ of the matrix (2.16):
(2.21)
\[\hat{S}(\hat{f})=\frac{1}{m}\sum \limits_{i=1}^{m}{s}^{\top }(a_{i},b_{i};\hat{X})\hspace{2.5pt}{\hat{V}_{A}^{-1}}\hat{f}{\hat{f}}^{\top }{\hat{V}_{A}^{-1}}\hspace{2.5pt}s(a_{i},b_{i};\hat{X}),\]
where the estimator $\hat{V}_{A}$ is given in (2.18).
Theorem 8.
Let $f\in {\mathbb{R}}^{n\times 1}$, and let $\hat{f}$ be a consistent estimator of this vector. Under the conditions of Theorem 4, the statistic $\hat{S}(\hat{f})$ is a consistent estimator of the matrix $S({X}^{0},f)$, that is, $\hat{S}(\hat{f})\stackrel{\mathrm{P}}{\to }S({X}^{0},f)$.
The Appendix contains the proofs of this theorem and of all further statements.
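A sketch of the sandwich estimator (2.21). The estimating function $s(a,b;X)$ from (2.11) and the estimator $\hat{V}_{A}$ from (2.18) are passed in as inputs, since their concrete forms are fixed by the corresponding displays; the code only implements the outer sandwich structure.

```python
import numpy as np

def sandwich_S(A, B, Xhat, VA_hat, f_hat, s_fn):
    """Sandwich estimator (2.21) of S(X^0, f).

    `s_fn(a, b, X)` must return the n x d value of the estimating function
    s(a, b; X) from display (2.11); it is supplied by the user of this sketch.
    """
    m, d = B.shape
    VA_inv = np.linalg.inv(VA_hat)
    w = VA_inv @ f_hat                         # n-vector V_A^{-1} f
    S = np.zeros((d, d))
    for a_i, b_i in zip(A, B):
        s_i = s_fn(a_i, b_i, Xhat)             # n x d value of s(a_i, b_i; Xhat)
        u = s_i.T @ w                          # d-vector s^T V_A^{-1} f
        S += np.outer(u, u)                    # s^T V_A^{-1} f f^T V_A^{-1} s
    return S / m
```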
3 Construction of goodness-of-fit test
For the observation model (2.4), we test the following hypotheses concerning the response b and the latent variable ${a}^{0}$:
$\textbf{H}_{0}$ There exists a matrix $X\in {\mathbb{R}}^{n\times d}$ such that
$\textbf{H}_{1}$ For each matrix $X\in {\mathbb{R}}^{n\times d}$,
In fact, the null hypothesis means that the observation model (2.3)–(2.4) holds. Based on the observations $a_{i}$, $b_{i}$, $i=1,\dots ,m$, we want to construct a test statistic to check this hypothesis. Let
We need the following stabilization condition on the latent variable:
To ensure the nonsingularity of the matrix $\varSigma _{T}$, we impose a final restriction on the observation model:
and, moreover, the matrix $S_{a}$ is nonsingular.
Lemma 10.
Assume conditions (i) and (iii)–(vii). Then
(3.5)
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \sqrt{m}{T_{m}^{0}}\stackrel{\mathrm{d}}{\to }N(0,\varSigma _{T}),\\{} & \displaystyle \hspace{1em}\varSigma _{T}={\sigma }^{2}\big(1-2{\mu _{a}^{\top }}{V_{A}^{-1}}\mu _{a}\big)\big(\mathrm{I}_{d}+{X}^{0\top }{X}^{0}\big)+S\big({X}^{0},\mu _{a}\big).\end{array}\]
Lemma 11.
Assume the conditions of Lemma 10. Then:
-
(a) A strongly consistent estimator of the vector $\mu _{a}$ from condition (vii) is given by the statistic
-
(b) A consistent estimator of matrix (3.5) is given by the matrix statistic
(3.6)
\[\hat{\varSigma }_{T}:={\hat{\sigma }}^{2}\big(1-2{\hat{\mu }_{a}^{\top }}{\hat{V}_{A}^{-1}}\hat{\mu }_{a}\big)\big(\mathrm{I}_{d}+{\hat{X}}^{\top }\hat{X}\big)+\hat{S}(\hat{\mu }_{a}),\]
Remark 12.
Assume conditions (vii) and (viii). Then
\[\frac{1}{m}{A}^{0\top }{A}^{0}=\frac{1}{m}\sum \limits_{i=1}^{m}{a_{i}^{0}}{a_{i}^{0\top }}\to V_{A}=S_{a}+\mu _{a}{\mu _{a}^{\top }}\hspace{1em}\text{as}\hspace{2.5pt}m\to \infty ,\]
and $V_{A}$ is nonsingular as the sum of a positive definite and a positive semidefinite matrix. Thus, condition (iii) is a consequence of assumptions (vii) and (viii).

For $m\ge 1$ and ω from the underlying probability space Ω such that $\hat{\varSigma }_{T}$ is positive definite, we define the test statistic
Given a significance level α, $0<\alpha <1/2$, let ${\chi _{d\alpha }^{2}}$ be the upper α-quantile of the ${\chi _{d}^{2}}\hspace{2.5pt}$ probability law, that is, $\operatorname{\mathsf{P}}\{{\chi _{d}^{2}}>{\chi _{d\alpha }^{2}}\}=\alpha $. Based on Theorem 14, we construct the following goodness-of-fit test with asymptotic confidence probability $1-\alpha $:
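The following sketch renders the decision rule of this test together with the estimator (3.6). The statistic ${T_{m}^{0}}$ is taken as an input, and the quadratic-form shape ${T_{m}^{2}}=m\,{T_{m}^{0\top }}{\hat{\varSigma }_{T}^{-1}}{T_{m}^{0}}$ used below is an assumption of the sketch, consistent with the asymptotic ${\chi _{d}^{2}}$ behavior of the test statistic.

```python
import numpy as np
from scipy.stats import chi2

def sigma_T_hat(sigma2, mu_a_hat, VA_hat, Xhat, S_hat):
    """Estimator (3.6) of the asymptotic covariance matrix Sigma_T."""
    d = Xhat.shape[1]
    VA_inv = np.linalg.inv(VA_hat)
    scale = sigma2 * (1.0 - 2.0 * mu_a_hat @ VA_inv @ mu_a_hat)
    return scale * (np.eye(d) + Xhat.T @ Xhat) + S_hat

def gof_test(T0, Sigma_T_hat, m, alpha=0.05):
    """Reject H0 when m * T0^T Sigma_T_hat^{-1} T0 exceeds the upper
    alpha-quantile of chi^2_d; T0 is the statistic T_m^0 defined in the text,
    and the quadratic form is an assumed rendering of T_m^2."""
    d = len(T0)
    T2 = m * T0 @ np.linalg.solve(Sigma_T_hat, T0)
    return T2, T2 > chi2.ppf(1.0 - alpha, d)
```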
4 Power of the test
Consider a sequence of models
(4.1)
\[\textbf{H}_{1,m}:\hspace{1em}b_{i}={X}^{\top }{a_{i}^{0}}+\frac{g({a_{i}^{0}})}{\sqrt{m}}+\tilde{b}_{i},\hspace{2em}a_{i}={a_{i}^{0}}+\tilde{a}_{i},\hspace{1em}i=1,\dots ,m.\]
Here $g:{\mathbb{R}}^{n}\to {\mathbb{R}}^{d}$ is a given nonlinear perturbation of the linear regression function.
For an arbitrary function $f({a}^{0})$, denote the limit of averages
provided that the limit exists and is finite.
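For simulation purposes, data under the local alternatives $\textbf{H}_{1,m}$ of (4.1) can be generated as follows; the particular perturbation g, the dimensions, and the Gaussian errors are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def g(a0: np.ndarray) -> np.ndarray:
    """Hypothetical nonlinear perturbation g: R^n -> R^d (here d = 2)."""
    return np.array([np.sum(a0**2), np.sum(a0)**2])

def sample_local_alternative(A0, X, sigma, rng):
    """Draw (a_i, b_i) under H_{1,m} as in (4.1): the nonlinear term is
    damped by 1/sqrt(m), so the alternative approaches H_0 as m grows."""
    m, n = A0.shape
    d = X.shape[1]
    drift = np.apply_along_axis(g, 1, A0) / np.sqrt(m)     # g(a_i^0)/sqrt(m)
    B = A0 @ X + drift + sigma * rng.standard_normal((m, d))
    A = A0 + sigma * rng.standard_normal((m, n))
    return A, B

# Example usage with hypothetical dimensions m = 400, n = 3, d = 2:
m, n = 400, 3
A0 = rng.standard_normal((m, n))
X = rng.standard_normal((n, 2))
A, B = sample_local_alternative(A0, X, sigma=0.1, rng=rng)
```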
In order to study the behavior of the test statistic under local alternatives $\textbf{H}_{1,m}$, we impose two restrictions on the perturbation function g:
Under the local alternatives $\textbf{H}_{1,m}$, we establish the weak consistency and asymptotic normality of the TLS estimator $\hat{X}$.
Lemma 15.
Lemma 16.
Assume the conditions of Lemma 15. Then under local alternatives $\textbf{\textit{H}}_{1,m}$, we have:
Now, we define the noncentral chi-squared distribution ${\chi _{d}^{2}}(\tau )$ with d degrees of freedom and the noncentrality parameter τ.
Definition 17.
For $d\ge 1$ and $\tau \ge 0$, let ${\chi _{d}^{2}}(\tau )\stackrel{\mathrm{d}}{=}\| N(\tau e,\mathrm{I}_{d}){\| }^{2}$, where $e\in {\mathbb{R}}^{d}$, $\| e\| =1$, or, equivalently, ${\chi _{d}^{2}}(\tau )\stackrel{\mathrm{d}}{=}{(\gamma _{1}+\tau )}^{2}+{\sum _{i=2}^{d}}{\gamma _{i}^{2}}$, where $\{\gamma _{i}\}$ are i.i.d. standard normal random variables.
Lemma 16 implies directly the following convergence.
Theorem 18 makes it possible to find the asymptotic power of the test under local alternatives $\textbf{H}_{1,m}$. It is evident that the asymptotic power is an increasing function of $\tau =\| {\varSigma _{T}^{-1/2}}C_{T}\| $. In other words, the larger τ, the more powerful the test.
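Given τ, the asymptotic power can be evaluated numerically; the sketch below uses Definition 17, under which the noncentrality parameter in the usual sum-of-squared-means parametrization equals ${\tau }^{2}$.

```python
from scipy.stats import chi2, ncx2

def asymptotic_power(tau: float, d: int, alpha: float = 0.05) -> float:
    """Asymptotic power P{chi^2_d(tau) > chi^2_{d,alpha}} under H_{1,m}.

    By Definition 17, chi^2_d(tau) = ||N(tau*e, I_d)||^2, so in scipy's
    parametrization the noncentrality is tau**2; at tau = 0 the power
    reduces to the level alpha.
    """
    crit = chi2.ppf(1.0 - alpha, d)           # upper alpha-quantile of chi^2_d
    return float(ncx2.sf(crit, d, tau**2))    # survival function of chi^2_d(tau)

# E.g., for d = 2 and alpha = 0.05, asymptotic_power(t, 2) increases in t.
```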
5 Conclusion
We constructed a goodness-of-fit test for a multivariate linear errors-in-variables model, provided that the errors are uncorrelated with equal (unknown) variances and vanishing third moments. The latter moment assumption makes it possible to estimate consistently the asymptotic covariance matrix $\varSigma _{T}$ of the statistic ${T_{m}^{0}}$ and construct the test statistic ${T_{m}^{2}}$, which has the asymptotic ${\chi _{d}^{2}}$ distribution under the null hypothesis. The local alternatives $\textbf{H}_{1,m}$ are presented, under which the test statistic has the noncentral ${\chi _{d}^{2}}(\tau )$ asymptotic distribution. The larger τ, the larger the asymptotic power of the test.
In the future, we will try to construct, as in [5], a more powerful test by using within the test statistic the exponential weight function
\[\omega _{\lambda }(a)={e}^{{\lambda }^{\top }a},\hspace{1em}\lambda \in {\mathbb{R}}^{n\times 1}.\]
To this end, it is necessary to require the independence of the errors $\tilde{b}_{i}$ and $\tilde{a}_{i}$ and also the existence of exponential moments of the errors $\tilde{a}_{i}$. This is the price for the greater power of the test.