1 Introduction
Consider a stochastic equation of the form
(1)
\[ dX_{t}=a_{\theta }(X_{t})\hspace{0.1667em}dt+dZ_{t},\]
where $\{a_{\theta }(x),\theta \in \varTheta ,x\in \mathbb{R}\}$ is a measurable function and $\varTheta =(\theta _{1},\theta _{2})\subset \mathbb{R}$ is a parametric set. For a given $\theta \in \varTheta $, assuming that the drift term $a_{\theta }$ satisfies the standard local Lipschitz and linear growth conditions, Eq. (1) uniquely defines a Markov process X. The aim of this paper is to establish the local asymptotic normality property (LAN in the sequel) in a model where the process X is discretely observed with a fixed time step $h>0$ and the number of observations $n\to \infty $.
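To fix ideas about the observation scheme, the following short sketch generates a sample $X_{h},X_{2h},\dots ,X_{nh}$ from an equation of the form (1) by an Euler scheme; the linear drift $a_{\theta }(x)=-\theta x$ and the compound Poisson noise standing in for Z are illustrative choices only, not the assumptions of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_observations(theta, x0, h, n, dt=1e-3, jump_rate=1.0, jump_scale=0.5):
    """Euler scheme for dX_t = a_theta(X_t) dt + dZ_t, sampled at t = h, 2h, ..., nh.

    Illustrative choices: a_theta(x) = -theta * x, and Z a compound Poisson
    process with Gaussian jump sizes (a pure-jump Levy process).
    """
    drift = lambda x: -theta * x
    steps = int(round(h / dt))
    x, obs = x0, np.empty(n)
    for k in range(n):
        for _ in range(steps):
            jumps = rng.poisson(jump_rate * dt)            # jumps of Z in [t, t+dt]
            dz = rng.normal(0.0, jump_scale, jumps).sum()  # increment of Z
            x += drift(x) * dt + dz
        obs[k] = x
    return obs

X_obs = simulate_observations(theta=1.0, x0=0.0, h=0.1, n=1000)
```

All inference discussed below is based on such a finite vector of low-frequency observations only; the continuous-time trajectory is not observed.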
The LAN property provides a convenient and powerful tool for establishing lower efficiency bounds in a statistical model, see, e.g., [6, 17, 18]. Such a property for statistical models based on discrete observations of processes with Lévy noise was studied mostly in the cases where the likelihood function (or, at least, its “main part”) is explicit in a sense, see, e.g., [1, 2, 7, 12, 13]. In the above references the models are linear in the sense that the process under observation is either a Lévy process or a solution of a linear (Ornstein–Uhlenbeck type) SDE driven by a Lévy process. The general non-linear case remains largely unstudied, and apparently the main reason for this is that the transition probability density of the observed Markov process is in this case highly implicit. In this paper we develop tools convenient for proving the LAN property in the framework of discretely observed solutions to SDE’s with a Lévy noise. To make the exposition reasonably transparent we confine ourselves to the particular case of a one-dimensional, one-parameter model and a fixed sampling step h. Various extensions (general state space, multiparameter model, high-frequency sampling, etc.) are possible, but we postpone their detailed analysis to further research.
Our approach consists of two principal parts. On one hand, we design a general sufficient condition for a statistical model based on discrete observations of a Markov process to possess the LAN property, see Theorem 1 below. This result extends the classical result of LeCam on the LAN property for i.i.d. samples. It is closely related to [5, Theorem 13], but with some substantial differences in the basic assumptions, which make our result well suited for studying a model based on observations of a Lévy driven SDE, see Remark 1 below. On the other hand, we exploit Malliavin calculus-based integral representations of the first- and second-order derivatives of the log-likelihood, which we have derived in our recent papers [11] and [10]. The combination of these two principal parts leads to the required LAN property. We note that in the diffusion setting with high-frequency sampling a Malliavin calculus-based approach to the proof of the LAN property is developed in [4]. Our approach is substantially different; the differences are caused by the non-diffusive structure of the noise.
The structure of the paper follows the two-stage scheme outlined above. First, we formulate in Section 2.1 (and prove in Section 3) a general sufficient condition for the LAN property in a Markov model. Then we formulate in Section 2.2 (and prove in Section 4) our main result about the LAN property for the discretely observed solution to a Lévy driven SDE; here the proof involves the Malliavin calculus-based integral representations of derivatives of the log-likelihood from [11] and [10].
2 The main results
2.1 LAN property for discretely observed Markov processes
Let X be a Markov process taking values in a locally compact metric space $\mathbb{X}$. The law of X is assumed to depend on a real-valued parameter θ; in what follows, we assume that the parametric set Θ is an interval $(\theta _{1},\theta _{2})\subset \mathbb{R}$. We denote by ${\mathsf{P}_{x}^{\theta }}$ the law of X with $X_{0}=x$, which corresponds to the parameter value θ; the expectation w.r.t. ${\mathsf{P}_{x}^{\theta }}$ is denoted by ${\mathsf{E}_{x}^{\theta }}$. For a given $h>0$, we denote by ${\mathsf{P}_{x,n}^{\theta }}$ the law w.r.t. ${\mathsf{P}_{x}^{\theta }}$ of the vector ${X}^{n}=\{X_{hk},k=1,\dots ,n\}$ of discrete-time observations of X with the step h. Denote by $\mathcal{E}_{n}$ the statistical experiment generated by the sample ${X}^{n}$ with $X_{0}=x$, i.e.
(2)
\[ \mathcal{E}_{n}=\big({\mathbb{X}}^{n},\mathcal{B}\big({\mathbb{X}}^{n}\big),{\mathsf{P}_{x,n}^{\theta }},\theta \in \varTheta \big);\]
we refer to [8] for the notation and terminology. Our aim is to establish the LAN property for the sequence of experiments $\{\mathcal{E}_{n}\}$.
Recall that the sequence of statistical experiments $\{\mathcal{E}_{n}\}$ (or, equivalently, the family $\{{\mathsf{P}_{x,n}^{\theta }},\theta \in \varTheta \}$) is said to have the LAN property at the point $\theta _{0}\in \varTheta $ as $n\to \infty $ if for some sequence $r(n)>0,n\ge 1$ and all $u\in \mathbb{R}$
\[ Z_{n,\theta _{0}}(u):=\frac{d{\mathsf{P}_{x,n}^{\theta _{0}+r(n)u}}}{d{\mathsf{P}_{x,n}^{\theta _{0}}}}\big({X}^{n}\big)=\exp \bigg\{\Delta _{n}(\theta _{0})u-\frac{1}{2}{u}^{2}+\varPsi _{n}(u,\theta _{0})\bigg\},\]
with
(3)
\[ \Delta _{n}(\theta _{0})\Rightarrow \mathcal{N}(0,1)\hspace{1em}\text{w.r.t.}\hspace{2.5pt}{\mathsf{P}_{x,n}^{\theta _{0}}},\hspace{1em}n\to \infty ,\]
and
(4)
\[ \varPsi _{n}(u,\theta _{0})\to 0\hspace{1em}\text{in}\hspace{2.5pt}{\mathsf{P}_{x,n}^{\theta _{0}}}\text{-probability},\hspace{1em}n\to \infty .\]
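For orientation, recall the simplest example (which is not treated in this paper, but fixes the meaning of the expansion): for an i.i.d. sample $X_{1},\dots ,X_{n}$ from $\mathcal{N}(\theta ,1)$ and $r(n)={n}^{-1/2}$ one has exactly
\[ \log Z_{n,\theta _{0}}(u)={\sum \limits_{k=1}^{n}}\Big[\frac{1}{2}{(X_{k}-\theta _{0})}^{2}-\frac{1}{2}{\big(X_{k}-\theta _{0}-u/\sqrt{n}\big)}^{2}\Big]=u\cdot \frac{1}{\sqrt{n}}{\sum \limits_{k=1}^{n}}(X_{k}-\theta _{0})-\frac{{u}^{2}}{2},\]
so that $\Delta _{n}(\theta _{0})={n}^{-1/2}{\sum _{k=1}^{n}}(X_{k}-\theta _{0})$ is exactly standard normal and $\varPsi _{n}(u,\theta _{0})\equiv 0$.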
In what follows we assume that X admits a transition probability density $p_{h}(\theta ;x,y)$ w.r.t. some σ-finite measure λ. Furthermore, we assume that the experiment $\mathcal{E}_{1}$ is regular; that is, for every $x\in \mathbb{X}$ the function $\theta \mapsto \sqrt{p_{h}(\theta ;x,\cdot )}$, considered as an element of $L_{2}(\mathbb{X},\lambda )$, is differentiable, and we denote its $L_{2}$-derivative by $q_{h}(\theta ;x,\cdot )$.
Denote
\[ g_{h}(\theta ;x,y)=\frac{2q_{h}(\theta ;x,y)}{\sqrt{p_{h}(\theta ;x,y)}}1_{p_{h}(\theta ;x,y)\ne 0};\]
note that the function $g_{h}$ is well defined by the definition of $q_{h}$ and satisfies
(6)
\[ \int _{\mathbb{X}}g_{h}(\theta ;x,y)p_{h}(\theta ;x,y)\lambda (dy)=0\]
for every $x\in \mathbb{X}$, $\theta \in \varTheta $. Furthermore, denote
(7)
\[ I_{n}(\theta )={\sum \limits_{k=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\big(g_{h}(\theta ;X_{h(k-1)},X_{hk})\big)}^{2}=4{\mathsf{E}_{x}^{\theta }}{\sum \limits_{k=1}^{n}}\int _{\mathbb{X}}{\big(q_{h}(\theta ;X_{h(k-1)},y)\big)}^{2}\lambda (dy).\]
Assuming that the statistical experiment $\mathcal{E}_{n}$ is regular, the above integral is finite and defines the Fisher information for $\mathcal{E}_{n}$. We fix $\theta _{0}\in \varTheta $, and put $r(n)={I_{n}^{-1/2}}(\theta _{0})$ for large enough n, assuming that for those n one has $I_{n}(\theta _{0})>0$.
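It may be instructive to record the heuristics behind these definitions (assuming, purely for this remark, that $p_{h}(\theta ;x,y)$ is smooth and positive in θ): by the Markov property the likelihood of the sample factorizes, $g_{h}$ becomes the usual score of a one-step transition, and (7) is then the chain-rule decomposition of the Fisher information,
\[ \log \frac{d{\mathsf{P}_{x,n}^{\theta }}}{d{\lambda }^{\otimes n}}\big({X}^{n}\big)={\sum \limits_{k=1}^{n}}\log p_{h}(\theta ;X_{h(k-1)},X_{hk}),\hspace{2em}g_{h}(\theta ;x,y)=\frac{2q_{h}(\theta ;x,y)}{\sqrt{p_{h}(\theta ;x,y)}}=\partial _{\theta }\log p_{h}(\theta ;x,y).\]
The regularity assumption above replaces this pointwise smoothness by $L_{2}$-differentiability, which is what is actually available for Lévy driven SDE’s (see Remark 1 below).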
Theorem 1.
Suppose the following.
1. Statistical experiment (2) is regular for every $x\in \mathbb{X}$ and $n\ge 1$, and $I_{n}(\theta _{0})>0$ for large enough n.
2. The sequence $r(n){\sum _{j=1}^{n}}g_{h}(\theta _{0};X_{h(j-1)},X_{hj})$, $n\ge 1$, is asymptotically normal w.r.t. ${\mathsf{P}_{x}^{\theta _{0}}}$ with parameters $(0,1)$.
3. ${r}^{2}(n){\sum _{j=1}^{n}}{\big(g_{h}(\theta _{0};X_{h(j-1)},X_{hj})\big)}^{2}\to 1$ in ${\mathsf{P}_{x}^{\theta _{0}}}$-probability as $n\to \infty $.
4. For every $\varepsilon >0$, ${\sum _{j=1}^{n}}{\mathsf{P}_{x}^{\theta _{0}}}\big\{\big|g_{h}(\theta _{0};X_{h(j-1)},X_{hj})\big|>\varepsilon {r}^{-1}(n)\big\}\to 0$ as $n\to \infty $.
5. For every $N>0$
(9)
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \underset{n\to \infty }{\lim }\underset{|v|<N}{\sup }{r}^{2}(n){\mathsf{E}_{x}^{\theta _{0}}}{\sum \limits_{j=1}^{n}}\int _{\mathbb{X}}\big(q_{h}\big(\theta _{0}+r(n)v;X_{h(j-1)},y\big)\\{} & \displaystyle \hspace{1em}-q_{h}(\theta _{0};X_{h(j-1)},y)\big){}^{2}\lambda (dy)=0.\end{array}\]
Then $\{{\mathsf{P}_{x,n}^{\theta }},\theta \in \varTheta \}$ has the LAN property at the point $\theta _{0}$.
Remark 1.
The above theorem is closely related to [5, Theorem 13]. One important difference is that in [5] the main conditions are formulated in terms of the functions
while within our approach the main assumptions are imposed on the log-likelihood derivative $g_{h}(\theta ;x,y)$, and can be verified efficiently e.g. in a model where X is defined by an SDE with jumps (see Section 2.2 below). Another important difference is that the whole approach in [5] is developed under the assumption that the log-likelihood function smoothly depends on the parameter θ. For a model where X is defined by an SDE with jumps, such an assumption may be very restrictive (see the detailed discussion in [11]). This is the reason why we use the assumption of regularity of the experiments instead. It is much milder and easily verifiable (see [11]).
Let us briefly mention two possible extensions of the result above, which can be obtained without any essential changes in the proof. We do not present them here in detail since they will not be used in the current paper.
Remark 3.
The statement of Theorem 1 still holds true if, instead of a single point $\theta _{0}$, a sequence $\theta _{n}\to \theta _{0}$ is considered, with conditions 2–5 changed accordingly. Moreover, in that case relations (3) and (4) would still hold true if, instead of the fixed u, a sequence $u_{n}\to u$ is considered. That is, under the uniform version of conditions 2–5 the uniform asymptotic normality holds true (see [8, Definition 2.2]).
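To illustrate the objects entering Theorem 1, consider the following numerical sketch for a toy Markov chain with an explicit Gaussian transition density, namely a discretely sampled Ornstein–Uhlenbeck process. This toy model is not the Lévy driven model of Section 2.2, and the score $g_{h}=\partial _{\theta }\log p_{h}$ is computed here by finite differences purely for convenience.

```python
import numpy as np

rng = np.random.default_rng(1)
h, theta0, n, u = 0.5, 1.0, 2000, 1.3        # illustrative values

def trans_params(theta):
    """Mean factor and variance of the OU transition over one step of length h."""
    rho = np.exp(-theta * h)
    var = (1.0 - np.exp(-2.0 * theta * h)) / (2.0 * theta)
    return rho, var

def log_ph(theta, x, y):
    """log p_h(theta; x, y) for the Gaussian OU transition."""
    rho, var = trans_params(theta)
    return -0.5 * np.log(2 * np.pi * var) - (y - rho * x) ** 2 / (2 * var)

def score(theta, x, y, eps=1e-5):
    """g_h(theta; x, y) = d/dtheta log p_h, by a central finite difference."""
    return (log_ph(theta + eps, x, y) - log_ph(theta - eps, x, y)) / (2 * eps)

# one trajectory X_0, X_h, ..., X_nh under theta0
rho0, var0 = trans_params(theta0)
X = np.empty(n + 1)
X[0] = 0.0
for k in range(n):
    X[k + 1] = rho0 * X[k] + np.sqrt(var0) * rng.normal()

g = score(theta0, X[:-1], X[1:])
I_n = n * np.mean(g ** 2)           # crude estimate of the Fisher information (7)
r_n = I_n ** -0.5                   # normalization r(n) = I_n^{-1/2}
Delta_n = r_n * g.sum()

logZ = np.sum(log_ph(theta0 + r_n * u, X[:-1], X[1:]) - log_ph(theta0, X[:-1], X[1:]))
print("log Z_n(u):        ", logZ)
print("u*Delta_n - u^2/2: ", u * Delta_n - 0.5 * u ** 2)
```

For large n the two printed numbers should be close, which is exactly the LAN expansion with a small remainder $\varPsi _{n}$.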
2.2 LAN property for families of distributions of solutions to Lévy driven SDE’s
We assume that Z in the SDE (1) is a Lévy process without a diffusion component; that is,
\[ Z_{t}=ct+{\int _{0}^{t}}\int _{|u|>1}u\nu (ds,du)+{\int _{0}^{t}}\int _{|u|\le 1}u\tilde{\nu }(ds,du),\]
where ν is the Poisson point measure with the intensity measure $ds\mu (du)$, and $\tilde{\nu }(ds,du)=\nu (ds,du)-ds\mu (du)$ is the respective compensated Poisson measure. In the sequel, we assume the Lévy measure μ to satisfy the following.
One particularly important class of Lévy processes satisfying H consists of tempered α-stable processes (see [21]), which arise naturally in models of turbulence [20], economic models of stochastic volatility [3], etc.
Denote by ${C}^{k,m}(\mathbb{R}\times \varTheta )$, $k,m\ge 0$, the class of functions $f:\mathbb{R}\times \varTheta \to \mathbb{R}$ that have continuous derivatives ${\partial _{x}^{i}}{\partial _{\theta }^{j}}f$, $i\le k$, $j\le m$.
We assume the coefficient $a_{\theta }(x)$ in Eq. (1) to satisfy the following.
It is proved in [11] that under conditions A(i) and H, the following properties hold:
Hence all the prerequisites for Theorem 1, given in Section 2.1, hold true with $\lambda (dx)=dx$ (the Lebesgue measure).
Furthermore, under conditions A and H, for $\theta =\theta _{0}$ the corresponding Markov process X is ergodic, i.e. there exists a unique invariant probability measure ${\varkappa _{inv}^{\theta _{0}}}$ for X. One can verify this easily using the sufficient conditions for ergodicity of solutions to Lévy driven SDE’s given in [19] and [14]. Denote by $\{{X_{t}^{st,\theta _{0}}},t\in \mathbb{R}\}$ the corresponding stationary version of X; that is, a Markov process, defined on the whole real line, which has the same transition probabilities as X and one-dimensional distributions equal to ${\varkappa _{inv}^{\theta _{0}}}$. Clearly, the existence of such a process, on a proper probability space, is guaranteed by the Kolmogorov consistency theorem. Denote
(11)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle {\sigma }^{2}(\theta _{0})& \displaystyle =\mathsf{E}{\big(g_{h}\big(\theta _{0};{X_{0}^{st,\theta _{0}}},{X_{h}^{st,\theta _{0}}}\big)\big)}^{2}\\{} & \displaystyle =\int _{\mathbb{R}}\int _{\mathbb{R}}{\big(g_{h}(\theta _{0};x,y)\big)}^{2}p_{h}(\theta _{0};x,y)\hspace{0.1667em}dy{\varkappa _{inv}^{\theta _{0}}}(dx).\end{array}\]
The following theorem is the main result of this paper. Its proof is given in Section 4 below.
3 Proof of Theorem 1
The method of proof goes back to LeCam’s proof of the LAN property for i.i.d. samples, see e.g. Theorem II.1.1 and Theorem II.3.1 in [8]. In the Markov setting, the dependence between the observations leads to some additional technicalities (see e.g. (19)). Possible ways to overcome these additional difficulties can be found, in a slightly different setting, in the proof of [5, Theorem 13]. In order to keep the exposition transparent and self-contained, we prefer to give a complete proof of Theorem 1 rather than a chain of partially relevant references.
We divide the proof into several lemmas; in all the lemmas in this section we assume the conditions of Theorem 1 to be fulfilled. The values x, $\theta _{0}$, and u are fixed; we assume that n is large enough, so that $\theta _{0}+r(n)u\in \varTheta $. In order to simplify the notation, below we write θ instead of $\theta _{0}$.
Denote
\[ {\zeta _{jn}^{\theta }}(u)=\sqrt{\frac{p_{h}(\theta +r(n)u;X_{h(j-1)},X_{hj})}{p_{h}(\theta ;X_{h(j-1)},X_{hj})}}-1,\hspace{1em}j=1,\dots ,n.\]
Lemma 1.
The following relations hold:
(12)
\[ \underset{n\to \infty }{\limsup }{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}\le \frac{{u}^{2}}{4};\]
(13)
\[ {\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\bigg({\zeta _{jn}^{\theta }}(u)-\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg)}^{2}\to 0,\hspace{1em}n\to \infty .\]
Proof.
By the regularity of $\mathcal{E}_{1}$ and the Cauchy inequality we have
hence the above inequality and (13) lead to the bound
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle {\mathsf{E}_{x}^{\theta }}{\bigg({\zeta _{jn}^{\theta }}(u)-\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg)}^{2}\\{} & \displaystyle \hspace{1em}={\mathsf{E}_{x}^{\theta }}\underset{\{y:p_{h}(\theta ;X_{h(j-1)},y)\ne 0\}}{\int }\Big(\sqrt{p_{h}\big(\theta +r(n)u;X_{h(j-1)},y\big)}\\{} & \displaystyle \hspace{2em}-\sqrt{p_{h}(\theta ;X_{h(j-1)},y)}-r(n)uq_{h}(\theta ;X_{h(j-1)},y)\Big){}^{2}\lambda (dy)\\{} & \displaystyle \hspace{1em}\le {\big(r(n)u\big)}^{2}{\mathsf{E}_{x}^{\theta }}\int _{\mathbb{X}}{\bigg({\int _{0}^{1}}\big(q_{h}\big(\theta +r(n)uv;X_{h(j-1)},y\big)-q_{h}(\theta ;X_{h(j-1)},y)\big)dv\bigg)}^{2}\lambda (dy)\\{} & \displaystyle \hspace{1em}\le {\big(r(n)u\big)}^{2}{\mathsf{E}_{x}^{\theta }}\int _{\mathbb{X}}\lambda (dy){\int _{0}^{1}}{\big(q_{h}\big(\theta +r(n)uv;X_{h(j-1)},y\big)-q_{h}(\theta ;X_{h(j-1)},y)\big)}^{2}dv.\end{array}\]
This inequality and (9) yield (13). To deduce (12) from (13) recall an elementary inequality
and write
\[ {\zeta _{jn}^{\theta }}(u)=\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})+\bigg({\zeta _{jn}^{\theta }}(u)-\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg)=:A+B.\]
Then
\[\begin{array}{r@{\hskip0pt}l}\displaystyle {\mathsf{E}_{x}^{\theta }}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}& \displaystyle \le (1+\alpha )\frac{1}{4}{u}^{2}{r}^{2}(n){\mathsf{E}_{x}^{\theta }}{\big(g_{h}(\theta ;X_{h(j-1)},X_{hj})\big)}^{2}\\{} & \displaystyle \hspace{1em}+\bigg(1+\frac{1}{2\alpha }\bigg){\mathsf{E}_{x}^{\theta }}{\bigg({\zeta _{jn}^{\theta }}(u)-\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg)}^{2}.\end{array}\]
Recall that
(15)
\[ {\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\big(g_{h}(\theta ;X_{h(j-1)},X_{hj})\big)}^{2}=I_{n}(\theta )={r}^{-2}(n),\]
Hence, summing the above bound over j and taking (13) into account, we obtain
\[ \underset{n\to \infty }{\limsup }{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}\le \frac{1+\alpha }{4}{u}^{2}.\]
Since $\alpha >0$ is arbitrary, this completes the proof.  □
Lemma 2.
In ${\mathsf{P}_{x}^{\theta }}$-probability,
(16)
\[ {\sum \limits_{j=1}^{n}}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}\to \frac{{u}^{2}}{4},\hspace{1em}n\to \infty .\]
Proof.
By the Chebyshev inequality,
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle {\mathsf{P}_{x}^{\theta }}\bigg\{\bigg|{\sum \limits_{j=1}^{n}}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}-\frac{1}{4}{r}^{2}(n){u}^{2}{\sum \limits_{j=1}^{n}}{\big(g_{h}(\theta ;X_{h(j-1)},X_{hj})\big)}^{2}\bigg|>\varepsilon \bigg\}\\{} & \displaystyle \hspace{1em}\le \frac{1}{\varepsilon }{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}\bigg|{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}-\frac{1}{4}{r}^{2}(n){u}^{2}{\big(g_{h}(\theta ;X_{h(j-1)},X_{hj})\big)}^{2}\bigg|\\{} & \displaystyle \hspace{1em}=\frac{1}{\varepsilon }{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}\bigg|{\zeta _{jn}^{\theta }}(u)-\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg|\\{} & \displaystyle \hspace{2em}\times \bigg|{\zeta _{jn}^{\theta }}(u)+\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg|\end{array}\]
which by (14), for a given $\alpha >0$, is dominated by
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \frac{1}{2\alpha \varepsilon }{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\bigg({\zeta _{jn}^{\theta }}(u)-\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg)}^{2}\\{} & \displaystyle \hspace{1em}+\frac{\alpha }{2\varepsilon }{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\bigg({\zeta _{jn}^{\theta }}(u)+\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg)}^{2}.\end{array}\]
By (13) the first term of this expression tends to zero as $n\to \infty $. Furthermore, the Cauchy inequality together with (12) and (15) implies that the upper limit of the second term does not exceed
\[ \underset{n\to \infty }{\limsup }\bigg(\frac{\alpha }{\varepsilon }{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}+\frac{\alpha {u}^{2}}{2\varepsilon }{r}^{2}(n){\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\big(g_{h}(\theta ;X_{h(j-1)},X_{hj})\big)}^{2}\bigg)\le \frac{3\alpha {u}^{2}}{2\varepsilon }.\]
Since $\alpha >0$ is arbitrary, this proves that the difference
\[ {\sum \limits_{j=1}^{n}}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}-\frac{1}{4}{r}^{2}(n){u}^{2}{\sum \limits_{j=1}^{n}}{\big(g_{h}(\theta ;X_{h(j-1)},X_{hj})\big)}^{2}\]
tends to 0 in ${\mathsf{P}_{x}^{\theta }}$-probability. Combined with condition 3 of Theorem 1, this gives the required statement.  □
Lemma 3.
For every $\varepsilon >0$,
\[ {\mathsf{P}_{x}^{\theta }}\Big\{\underset{1\le j\le n}{\max }\big|{\zeta _{jn}^{\theta }}(u)\big|>\varepsilon \Big\}\to 0,\hspace{1em}n\to \infty .\]
Proof.
We have
\[\begin{array}{r@{\hskip0pt}l}\displaystyle {\mathsf{P}_{x}^{\theta }}\Big\{\underset{1\le j\le n}{\max }\big|{\zeta _{jn}^{\theta }}(u)\big|>\varepsilon \Big\}& \displaystyle \le {\sum \limits_{j=1}^{n}}{\mathsf{P}_{x}^{\theta }}\big\{\big|{\zeta _{jn}^{\theta }}(u)\big|>\varepsilon \big\}\\{} & \displaystyle \le {\sum \limits_{j=1}^{n}}{\mathsf{P}_{x}^{\theta }}\bigg\{\bigg|{\zeta _{jn}^{\theta }}(u)-\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg|>\frac{\varepsilon }{2}\bigg\}\\{} & \displaystyle \hspace{1em}+{\sum \limits_{j=1}^{n}}{\mathsf{P}_{x}^{\theta }}\bigg\{\big|g_{h}(\theta ;X_{h(j-1)},X_{hj})\big|>\frac{\varepsilon }{4r(n)|u|}\bigg\}.\end{array}\]
The first sum in the r.h.s. of this inequality vanishes as $n\to \infty $ because of (13); the second sum vanishes because of condition 4 of Theorem 1.  □
Because of the Markov structure of the sample, in addition to Lemma 2 we will need the following statement. Denote by ${\mathsf{E}_{x,j-1}^{\theta }}$ the conditional expectation w.r.t. ${\mathsf{P}_{x}^{\theta }}$ given the observations $X_{h},\dots ,X_{h(j-1)}$.
Lemma 4.
In ${\mathsf{P}_{x}^{\theta }}$-probability,
\[ {\sum \limits_{j=1}^{n}}{\mathsf{E}_{x,j-1}^{\theta }}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}\to \frac{{u}^{2}}{4},\hspace{1em}n\to \infty .\]
Proof.
Denote
\[ \chi _{jn}={\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}-{\mathsf{E}_{x,j-1}^{\theta }}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2},\hspace{1em}S_{n}={\sum \limits_{j=1}^{n}}\chi _{jn},\]
then by (16) it is enough to prove that $S_{n}\to 0$ in ${\mathsf{P}_{x}^{\theta }}$-probability. Fix $\varepsilon >0$, and put
\[ {\chi _{jn}^{\varepsilon }}={\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}1_{|{\zeta _{jn}^{\theta }}(u)|\le \varepsilon }-{\mathsf{E}_{x,j-1}^{\theta }}\big({\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}1_{|{\zeta _{jn}^{\theta }}(u)|\le \varepsilon }\big),\hspace{1em}{S_{n}^{\varepsilon }}={\sum \limits_{j=1}^{n}}{\chi _{jn}^{\varepsilon }}.\]
By construction, $\{{\chi _{jn}^{\varepsilon }},\hspace{2.5pt}j=1,\dots ,n\}$ is a martingale difference sequence, hence
\[\begin{array}{r@{\hskip0pt}l}\displaystyle {\mathsf{E}_{x}^{\theta }}{\big({S_{n}^{\varepsilon }}\big)}^{2}& \displaystyle ={\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\big({\chi _{jn}^{\varepsilon }}\big)}^{2}\le {\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{4}1_{|{\zeta _{jn}^{\theta }}(u)|\le \varepsilon }\le {\varepsilon }^{2}{\mathsf{E}_{x}^{\theta }}{\sum \limits_{j=1}^{n}}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}.\end{array}\]
Hence by (12) and the Cauchy inequality,
(20)
\[ \underset{n\to \infty }{\limsup }{\mathsf{E}_{x}^{\theta }}\big|{S_{n}^{\varepsilon }}\big|\le \frac{\varepsilon |u|}{2}.\]
Now, let us estimate the difference $S_{n}-{S_{n}^{\varepsilon }}$. Note that, using the first statement in Lemma 1, one can improve the statement of Lemma 2 and show that the convergence (16) holds true in $L_{1}({\mathsf{P}_{x}^{\theta }})$; see e.g. Theorem A.I.4 in [8]. In particular, this means that the sequence
\[ {\sum \limits_{j=1}^{n}}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2},\hspace{1em}n\ge 1,\]
is uniformly integrable. Hence, because by Lemma 3 the probabilities of the sets
(21)
\[ {\varOmega _{n}^{\varepsilon }}=\Big\{\underset{j\le n}{\max }\big|{\zeta _{jn}^{\theta }}(u)\big|>\varepsilon \Big\}\]
tend to zero as $n\to \infty $, we have
\[ {\mathsf{E}_{x}^{\theta }}\bigg(1_{{\varOmega _{n}^{\varepsilon }}}{\sum \limits_{j=1}^{n}}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}\bigg)\to 0.\]
One has
\[ \chi _{jn}-{\chi _{jn}^{\varepsilon }}={\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}1_{|{\zeta _{jn}^{\theta }}(u)|>\varepsilon }-{\mathsf{E}_{x,j-1}^{\theta }}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}1_{|{\zeta _{jn}^{\theta }}(u)|>\varepsilon },\]
hence
\[\begin{array}{r@{\hskip0pt}l}\displaystyle {\mathsf{E}_{x}^{\theta }}\big|S_{n}-{S_{n}^{\varepsilon }}\big|& \displaystyle \le 2{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}1_{|{\zeta _{jn}^{\theta }}(u)|>\varepsilon }\le 2{\mathsf{E}_{x}^{\theta }}\bigg(1_{{\varOmega _{n}^{\varepsilon }}}{\sum \limits_{j=1}^{n}}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}\bigg)\to 0.\end{array}\]
Together with (20) this gives
\[ \underset{n\to \infty }{\limsup }{\mathsf{E}_{x}^{\theta }}|S_{n}|\le \frac{\varepsilon |u|}{2},\]
which completes the proof because $\varepsilon >0$ is arbitrary.  □
The final preparatory result we require is the following.
Lemma 5.
In ${\mathsf{P}_{x}^{\theta }}$-probability,
\[ 2{\sum \limits_{j=1}^{n}}{\zeta _{jn}^{\theta }}(u)-r(n)u{\sum \limits_{j=1}^{n}}g_{h}(\theta ;X_{h(j-1)},X_{hj})\to -\frac{{u}^{2}}{4},\hspace{1em}n\to \infty .\]
Proof.
We have the equality
\[ {\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}=\frac{p_{h}(\theta +r(n)u;X_{h(j-1)},X_{hj})}{p_{h}(\theta ;X_{h(j-1)},X_{hj})}-1-2{\zeta _{jn}^{\theta }}(u)\]
valid ${\mathsf{P}_{x}^{\theta }}$-a.s. Note that by the Markov property of X one has
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle {\mathsf{E}_{x,j-1}^{\theta }}\frac{p_{h}(\theta +r(n)u;X_{h(j-1)},X_{hj})}{p_{h}(\theta ;X_{h(j-1)},X_{hj})}\\{} & \displaystyle \hspace{1em}=\int _{\mathbb{X}}\frac{p_{h}(\theta +r(n)u;X_{h(j-1)},y)}{p_{h}(\theta ;X_{h(j-1)},y)}p_{h}(\theta ;X_{h(j-1)},y)\lambda (dy)=1;\end{array}\]
hence by Lemma 4 one has that
\[ {\sum \limits_{j=1}^{n}}{\mathsf{E}_{x,j-1}^{\theta }}{\zeta _{jn}^{\theta }}(u)\to -\frac{{u}^{2}}{8}\]
in ${\mathsf{P}_{x}^{\theta }}$-probability. Therefore, what we have to prove in fact is that
\[ V_{n}:=2{\sum \limits_{j=1}^{n}}\big({\zeta _{jn}^{\theta }}(u)-{\mathsf{E}_{x,j-1}^{\theta }}{\zeta _{jn}^{\theta }}(u)\big)-r(n)u{\sum \limits_{j=1}^{n}}g_{h}(\theta ;X_{h(j-1)},X_{hj})\to 0\]
in ${\mathsf{P}_{x}^{\theta }}$-probability. By (6) the sequence
\[ {\zeta _{jn}^{\theta }}(u)-{\mathsf{E}_{x,j-1}^{\theta }}{\zeta _{jn}^{\theta }}(u)-\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj}),\hspace{1em}j=1,\dots ,n\]
is a martingale difference, hence
\[ {\mathsf{E}_{x}^{\theta }}{V_{n}^{2}}\le 4{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}{\bigg({\zeta _{jn}^{\theta }}(u)-\frac{1}{2}r(n)ug_{h}(\theta ;X_{h(j-1)},X_{hj})\bigg)}^{2},\]
which tends to zero as $n\to \infty $ by (13).  □
Now we can finalize the proof of Theorem 1. Fix $\varepsilon \in (0,1)$ and consider the sets ${\varOmega _{n}^{\varepsilon }}$ defined by (21); by Lemma 3 we have ${\mathsf{P}_{x}^{\theta }}({\varOmega _{n}^{\varepsilon }})\to 0$. Using the Taylor expansion of the function $\log (1+x)$, we obtain that there exist a constant $C_{\varepsilon }$ and random variables $\alpha _{jn}$ with $|\alpha _{jn}|<C_{\varepsilon }$ such that the following identity holds true outside of the set ${\varOmega _{n}^{\varepsilon }}$:
\[ {\sum \limits_{j=1}^{n}}\log \frac{p_{h}(\theta +r(n)u;X_{h(j-1)},X_{hj})}{p_{h}(\theta ;X_{h(j-1)},X_{hj})}=2{\sum \limits_{j=1}^{n}}{\zeta _{jn}^{\theta }}(u)-{\sum \limits_{j=1}^{n}}{\big({\zeta _{jn}^{\theta }}(u)\big)}^{2}+{\sum \limits_{j=1}^{n}}\alpha _{jn}|{\zeta _{jn}^{\theta }}(u){|}^{3}.\]
Then by Lemma 2, Lemma 5, and Corollary 1 we have
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \log Z_{n,\theta }(u)& \displaystyle ={\sum \limits_{j=1}^{n}}\log \frac{p_{h}(\theta +r(n)u;X_{h(j-1)},X_{hj})}{p_{h}(\theta ;X_{h(j-1)},X_{hj})}\\{} & \displaystyle =r(n)u{\sum \limits_{j=1}^{n}}g_{h}(\theta ;X_{h(j-1)},X_{hj})-\frac{{u}^{2}}{4}-\frac{{u}^{2}}{4}+\varPsi _{n},\end{array}\]
where $\varPsi _{n}\to 0$ in ${\mathsf{P}_{x}^{\theta }}$-probability. By the asymptotic normality condition 2, this completes the proof.  □
4 Proof of Theorem 2
To prove Theorem 2 we verify the conditions of Theorem 1. While doing that, we use the constructions and results from our recent papers [11, 10].
The regularity property required in condition 1 of Theorem 1 is already proved in [11]. To prove the other claims involved in the conditions of Theorem 1, we will use the following auxiliary result several times.
Lemma 6.
Under conditions A and H for every $p\in (2,4+\beta )$ there exists a constant C such that for all $x\in \mathbb{R}$, $\theta \in (\theta _{-},\theta _{+})$ and $t\ge 0$
Let us briefly outline the subsequent argument. To prove conditions 2 and 3 of Theorem 1, we in fact need to prove a CLT and an LLN for the sums ${\sum _{j=1}^{n}}g_{h}(\theta _{0};X_{h(j-1)},X_{hj})$. The way to do this is quite standard: one should first prove such limit theorems for the stationary version of the process X, and then derive the limit behaviour of these sums under ${\mathsf{P}_{x}^{\theta _{0}}}$. In this last step, the ergodic rates for the process X, and therefore the assumption A(ii), are essential. In the first step, which concerns the stationary version of the process X, we will need the following moment bounds for the invariant measure ${\varkappa _{inv}^{\theta _{0}}}$ of the process X with $\theta =\theta _{0}$.
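As an aside that is not used in the proofs, the limit variance ${\sigma }^{2}(\theta _{0})$ from (11) can be approximated from a single long trajectory. The sketch below does this for the toy Gaussian chain from the earlier illustration: it compares the plain empirical second moment of the score with a batch-means estimate of the long-run variance; the two should agree, since by (6) the scores form a martingale difference sequence and the cross covariances vanish.

```python
import numpy as np

rng = np.random.default_rng(2)
h, theta0, n = 0.5, 1.0, 20000            # illustrative values

def log_ph(theta, x, y):
    """Gaussian OU transition density over one step of length h (toy model)."""
    rho = np.exp(-theta * h)
    var = (1.0 - np.exp(-2.0 * theta * h)) / (2.0 * theta)
    return -0.5 * np.log(2 * np.pi * var) - (y - rho * x) ** 2 / (2 * var)

def score(theta, x, y, eps=1e-5):
    return (log_ph(theta + eps, x, y) - log_ph(theta - eps, x, y)) / (2 * eps)

# simulate the chain under theta0
rho0 = np.exp(-theta0 * h)
sd0 = np.sqrt((1.0 - np.exp(-2.0 * theta0 * h)) / (2.0 * theta0))
X = np.empty(n + 1)
X[0] = 0.0
for k in range(n):
    X[k + 1] = rho0 * X[k] + sd0 * rng.normal()

g = score(theta0, X[:-1], X[1:])

sigma2_plain = np.mean(g ** 2)            # empirical version of (11)

b = 100                                   # batch length for the batch-means estimator
batches = g[: (n // b) * b].reshape(-1, b).mean(axis=1)
sigma2_batch = b * batches.var(ddof=1)    # long-run variance estimate

print(sigma2_plain, sigma2_batch)
```

The same quantity enters the proofs below through the normalization of $I_{n}(\theta _{0})$, see (29).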
Recall (e.g. [14], Section 3.2) that one standard way to construct ${\varkappa _{inv}^{\theta _{0}}}$ is to take a weak limit point (as $T\to \infty $) for the family of Khas’minskii’s averages
\[ {\varkappa _{T}^{\theta _{0}}}(dy)=\frac{1}{T}{\int _{0}^{T}}{\mathsf{P}_{x}^{\theta _{0}}}(X_{t}\in dy)\hspace{0.1667em}dt.\]
Then, by the Fatou lemma, the second relation in (23) implies the following moment bounds for ${\varkappa _{inv}^{\theta _{0}}}$.
Corollary 2.
For every $p\in (2,4+\beta )$,
\[ \int _{\mathbb{R}}|y{|}^{p}{\varkappa _{inv}^{\theta _{0}}}(dy)<\infty .\]
Everywhere below we assume the conditions of Theorem 2 to hold true.
Lemma 7.
The sequence
\[ \frac{1}{\sqrt{n}}{\sum \limits_{j=1}^{n}}g_{h}(\theta _{0};X_{h(j-1)},X_{hj}),\hspace{1em}n\ge 1\]
is asymptotically normal w.r.t. ${P_{x}^{\theta _{0}}}$ with parameters $(0,{\sigma }^{2}(\theta _{0}))$, where ${\sigma }^{2}(\theta _{0})$ is defined in (11).
Proof.
The idea of the proof is similar to that of the proof of Theorem 3.3 [16]. Denote
\[ Q_{n}(\theta _{0},X)=\frac{1}{\sqrt{n}}{\sum \limits_{j=1}^{n}}g_{h}(\theta _{0};X_{h(j-1)},X_{hj}).\]
By Theorem 2.2 [19] (see also Theorem 1.2 [14]), the α-mixing coefficient $\alpha (t)$ for the stationary version ${X}^{st}$ of the process X does not exceed $C_{3}{e}^{-C_{4}t}$, where $C_{3},C_{4}$ are some positive constants. Then by the CLT for stationary sequences (Theorem 18.5.3 [9]), the first relation in (23), and (24) we have
\[ Q_{n}\big(\theta _{0},{X}^{st,\theta _{0}}\big)\hspace{2.5pt}\Rightarrow \hspace{2.5pt}\mathcal{N}\big(0,{\widetilde{\sigma }}^{2}(\theta _{0})\big),\hspace{1em}n\to \infty \]
with
\[ {\widetilde{\sigma }}^{2}(\theta _{0})={\sum \limits_{k=-\infty }^{+\infty }}\mathsf{E}\big(g_{h}\big(\theta _{0};{X_{0}^{st,\theta _{0}}},{X_{h}^{st,\theta _{0}}}\big)g_{h}\big(\theta _{0};{X_{h(k-1)}^{st,\theta _{0}}},{X_{hk}^{st,\theta _{0}}}\big)\big).\]
Furthermore, under the conditions of Theorem 2 there exists an exponential coupling for the process X; that is, a two-component process $Y=({Y}^{1},{Y}^{2})$, possibly defined on another probability space, such that ${Y}^{1}$ has the distribution ${\mathsf{P}_{x}^{\theta _{0}}}$, ${Y}^{2}$ has the same distribution as ${X}^{st,\theta _{0}}$, and for all $t>0$
(25)
\[ P\big({Y_{t}^{1}}\ne {Y_{t}^{2}}\big)\le C_{1}{e}^{-C_{2}t}\]
with some constants $C_{1}$, $C_{2}$. The proof of this fact can be found in [15] (Theorem 2.2). Then for any Lipschitz continuous function $f:\mathbb{R}\to \mathbb{R}$ we have
(26)
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \big|{\mathsf{E}_{x}^{\theta _{0}}}f\big(Q_{n}(\theta _{0},X)\big)-\mathsf{E}f\big(Q_{n}\big(\theta _{0},{X}^{st,\theta _{0}}\big)\big)\big|\\{} & \displaystyle \hspace{1em}=\big|\mathsf{E}f\big(Q_{n}\big(\theta _{0},{Y}^{1}\big)\big)-\mathsf{E}f\big(Q_{n}\big(\theta _{0},{Y}^{2}\big)\big)\big|\\{} & \displaystyle \hspace{1em}\le \mathrm{Lip}(f)\mathsf{E}\big|Q_{n}\big(\theta _{0},{Y}^{1}\big)-Q_{n}\big(\theta _{0},{Y}^{2}\big)\big|\\{} & \displaystyle \hspace{1em}\le \frac{\mathrm{Lip}(f)}{\sqrt{n}}{\sum \limits_{k=1}^{n}}\mathsf{E}\big|g_{h}\big(\theta _{0};{Y_{h(k-1)}^{1}},{Y_{hk}^{1}}\big)\\{} & \displaystyle \hspace{2em}-g_{h}\big(\theta _{0};{Y_{h(k-1)}^{2}},{Y_{hk}^{2}}\big)\big|1_{({Y_{h(k-1)}^{1}},{Y_{hk}^{1}})\ne ({Y_{h(k-1)}^{2}},{Y_{hk}^{2}})}\\{} & \displaystyle \hspace{1em}\le \frac{2\mathrm{Lip}(f)}{\sqrt{n}}{\sum \limits_{k=1}^{n}}{\big(\mathsf{E}{\big|g_{h}\big(\theta _{0};{Y_{h(k-1)}^{1}},{Y_{hk}^{1}}\big)\big|}^{p}+\mathsf{E}{\big|g_{h}\big(\theta _{0};{Y_{h(k-1)}^{2}},{Y_{hk}^{2}}\big)\big|}^{p}\big)}^{1/p}\\{} & \displaystyle \hspace{2em}\times {\big(P\big({Y_{h(k-1)}^{1}}\ne {Y_{h(k-1)}^{2}}\big)+P\big({Y_{hk}^{1}}\ne {Y_{hk}^{2}}\big)\big)}^{1/q},\end{array}\]
where $p,q>1$ are such that $1/p+1/q=1$. Since ${Y}^{1}$ has the distribution ${\mathsf{P}_{x}^{\theta _{0}}}$, by (23) we have for $p\in (2,4+\beta )$
(27)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \mathsf{E}{\big|g_{h}\big(\theta _{0};{Y_{h(k-1)}^{1}},{Y_{hk}^{1}}\big)\big|}^{p}& \displaystyle ={\mathsf{E}_{x}^{\theta _{0}}}{\big|g_{h}(\theta _{0};X_{h(k-1)},X_{hk})\big|}^{p}\\{} & \displaystyle \le C{\mathsf{E}_{x}^{\theta _{0}}}\big(1+|X_{h(k-1)}{|}^{p}\big)\le C+{C}^{2}\big(1+|x{|}^{p}\big).\end{array}\]
Similarly,
(28)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \mathsf{E}{\big|g_{h}\big(\theta _{0};{Y_{h(k-1)}^{2}},{Y_{hk}^{2}}\big)\big|}^{p}& \displaystyle =\mathsf{E}{\big|g_{h}\big(\theta _{0};{X_{h(k-1)}^{st,\theta _{0}}},{X_{hk}^{st,\theta _{0}}}\big)\big|}^{p}\\{} & \displaystyle \le C\mathsf{E}\big(1+|{X_{h(k-1)}^{st,\theta _{0}}}{|}^{p}\big)=C+C\int _{\mathbb{R}}|y{|}^{p}{\varkappa _{inv}^{\theta _{0}}}(dy),\end{array}\]
and the constant in the right-hand side is finite by Corollary 2. Hence (25) and (26) yield that
\[ {\mathsf{E}_{x}^{\theta _{0}}}f\big(Q_{n}(\theta _{0},X)\big)\to \mathsf{E}f(\xi ),\hspace{1em}n\to \infty ,\hspace{2.5pt}\xi \sim \mathcal{N}\big(0,{\widetilde{\sigma }}^{2}(\theta _{0})\big)\]
for every Lipschitz continuous function $f:\mathbb{R}\to \mathbb{R}$. This means that the sequence $Q_{n}(\theta _{0},X),n\ge 1$ is asymptotically normal w.r.t. ${\mathsf{P}_{x}^{\theta _{0}}}$ with parameters $(0,{\widetilde{\sigma }}^{2}(\theta _{0}))$.
To conclude the proof, it remains to show that ${\widetilde{\sigma }}^{2}(\theta _{0})={\sigma }^{2}(\theta _{0})$. This follows easily from (6) because, by the Markov property of ${X}^{st,\theta _{0}}$,
\[\begin{array}{r@{\hskip0pt}l}\displaystyle {\widetilde{\sigma }}^{2}(\theta _{0})& \displaystyle ={\sigma }^{2}(\theta _{0})+2{\sum \limits_{k=2}^{\infty }}\mathsf{E}\big(g_{h}\big(\theta _{0};{X_{0}^{st,\theta _{0}}},{X_{h}^{st,\theta _{0}}}\big)g_{h}\big(\theta _{0};{X_{h(k-1)}^{st,\theta _{0}}},{X_{hk}^{st,\theta _{0}}}\big)\big)\\{} & \displaystyle ={\sigma }^{2}(\theta _{0})+2{\sum \limits_{k=2}^{\infty }}\mathsf{E}\big[g_{h}\big(\theta _{0};{X_{0}^{st,\theta _{0}}},{X_{h}^{st,\theta _{0}}}\big)\big({\mathsf{E}_{x}^{\theta _{0}}}g_{h}(\theta _{0};x,X_{h})\big)_{x={X_{h(k-1)}^{st,\theta _{0}}}}\big].\end{array}\]
□
Similarly, one can prove that
\[ \frac{1}{n}{\sum \limits_{j=1}^{n}}{\big(g_{h}(\theta _{0};X_{h(j-1)},X_{hj})\big)}^{2}\to {\sigma }^{2}(\theta _{0}),\hspace{1em}n\to \infty \]
in $L_{1}({\mathsf{P}_{x}^{\theta _{0}}})$; the argument is completely the same, with the CLT for stationary sequences replaced by the Birkhoff–Khinchin ergodic theorem (we omit the details). Hence
(29)
\[ I_{n}(\theta _{0})\sim n{\sigma }^{2}(\theta _{0}),\hspace{2em}r(n)\sim \frac{1}{\sqrt{n}\sigma (\theta _{0})},\hspace{1em}n\to \infty .\]
This proves that conditions 2–4 of Theorem 1 hold true. Condition 1 of Theorem 1 also holds true: the regularity property is already proved in [11], and the positivity of $I_{n}(\theta _{0})$ follows from (29).
Let us prove (9), which would then allow us to apply Theorem 1. It is proved in [10] that, under the conditions of Theorem 2, the function $q_{h}(\theta ,x,y)$ is $L_{2}$-differentiable w.r.t. θ, and
\[ \partial _{\theta }q_{h}=\frac{1}{2}(\partial _{\theta }g_{h})\sqrt{p_{h}}+\frac{1}{4}{(g_{h})}^{2}\sqrt{p_{h}}.\]
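Note that, at least formally (i.e., when all the densities involved are smooth and positive in θ), this identity follows at once from the relation $q_{h}=\partial _{\theta }\sqrt{p_{h}}=\frac{1}{2}g_{h}\sqrt{p_{h}}$:
\[ \partial _{\theta }q_{h}=\frac{1}{2}(\partial _{\theta }g_{h})\sqrt{p_{h}}+\frac{1}{2}g_{h}\hspace{0.1667em}\partial _{\theta }\sqrt{p_{h}}=\frac{1}{2}(\partial _{\theta }g_{h})\sqrt{p_{h}}+\frac{1}{4}{g_{h}^{2}}\sqrt{p_{h}}.\]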
In addition, it is proved therein that for every $\gamma \in [1,2+\beta /2)$
Then
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle {\mathsf{E}_{x}^{\theta }}\int _{\mathbb{R}}{\big(q_{h}\big(\theta +r(n)v,X_{h(j-1)},y\big)-q_{h}(\theta ,X_{h(j-1)},y)\big)}^{2}dy\\{} & \displaystyle \hspace{1em}\le r(n)v{\mathsf{E}_{x}^{\theta }}\int _{\mathbb{R}}dy{\int _{0}^{r(n)v}}{\big(\partial _{\theta }q_{h}(\theta +s,X_{h(j-1)},y)\big)}^{2}ds\\{} & \displaystyle \hspace{1em}\le \frac{r(n)v}{4}{\mathsf{E}_{x}^{\theta }}{\int _{0}^{r(n)v}}ds\int _{\mathbb{R}}{\bigg(\partial _{\theta }g_{h}(\theta +s;X_{h(j-1)},y)+\frac{1}{2}g_{h}{(\theta +s;X_{h(j-1)},y)}^{2}\bigg)}^{2}\\{} & \displaystyle \hspace{2em}\times {p_{h}^{s}}(X_{h(j-1)},y)dy\\{} & \displaystyle \hspace{1em}\le Cr{(n)}^{2}{v}^{2}{\mathsf{E}_{x}^{\theta }}\big(1+{(X_{h(j-1)})}^{4}\big);\end{array}\]
in the last inequality we have used (30) and the first relation in (23). Using the second relation in (23), we then get
\[ \underset{|v|<N}{\sup }r{(n)}^{2}{\sum \limits_{j=1}^{n}}{\mathsf{E}_{x}^{\theta }}\int _{\mathbb{R}}{\big(q_{h}\big(\theta +r(n)v,X_{h(j-1)},y\big)-q_{h}(\theta ,X_{h(j-1)},y)\big)}^{2}dy\le C{N}^{2}nr{(n)}^{4}\]
with a constant C that depends only on x. Since $nr{(n)}^{4}\to 0$ by (29), this relation completes the proof.  □