1 Introduction
The stochastic differential equation (SDE) in ${\mathbb{R}^{d}}$,
(1)
\[ {X_{t}}=x+{\int _{0}^{t}}b({X_{s}})ds+{\int _{0}^{t}}\sigma ({X_{s}})d{W_{s}},\hspace{1em}t\ge 0,\]
is considered. Here $({W_{t}},\hspace{0.1667em}t\ge 0)$ is a ${d_{1}}$-dimensional Wiener process, b and σ are vector and matrix-valued Borel measurable functions of dimensions d and $d\times {d_{1}}$, respectively. To avoid any ambiguity, both coefficients will be assumed bounded, although this is not always necessary in what follows. It is assumed that Equation (1) has a (weak or strong) solution, unique in distribution, which is a strong Markov process, see [8]. Naturally, under this condition the process ${X_{n}}$ – that is, our solution ${X_{t}}$ considered at integer times $t=0,1,\dots $ – is a Markov chain (MC), which is, of course, also strong Markov. The advantage of the total variation distance for Markov processes (although it is not unique in this respect) is that once a convergence rate is established for the chain, say,
(2)
\[ \| {\mu _{n}}-\mu {\| _{TV}}\le \psi (n),\hspace{1em}n\ge 0,\]
where ${\mu _{t}}$ is the marginal distribution of ${X_{t}}$, μ is any probability measure (the ergodic limit for $({\mu _{n}})$) and ψ is some rate function, then this rate of convergence can be nearly verbatim transferred to the continuous time:
\[ \| {\mu _{t}}-\mu {\| _{TV}}\le \psi ([t]),\hspace{1em}t\ge 0,\]
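For the reader who prefers to experiment, here is a minimal Euler–Maruyama sketch of Equation (1) in dimension one; the coefficients below are purely illustrative (they are not taken from the text, only bounded and measurable as assumed above), and the path recorded at integer times is exactly the Markov chain $({X_{n}})$ discussed above.

```python
import numpy as np

rng = np.random.default_rng(1)

def euler_chain(x0, b, sigma, T=10, dt=1e-3):
    """Euler-Maruyama scheme for (1); returns X_0, X_1, ..., X_T at integer times."""
    steps_per_unit = round(1 / dt)
    x = float(x0)
    chain = [x]
    for i in range(1, steps_per_unit * T + 1):
        dw = rng.normal(scale=np.sqrt(dt))
        x = x + b(x) * dt + sigma(x) * dw
        if i % steps_per_unit == 0:          # record the chain X_n at integer times
            chain.append(x)
    return np.array(chain)

# illustrative bounded coefficients (an assumption for this sketch only):
b = lambda v: -np.tanh(v)
sigma = lambda v: 1.0 + 0.5 * np.cos(v)
print(euler_chain(2.0, b, sigma))
```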
where $[t]$ is the integer part of t. Consider two independent versions of our Markov process ${X_{t}}$ (in continuous time), say, ${X_{t}^{1}}$ and ${X_{t}^{2}}$, with two different initial values ${x^{1}}$ and ${x^{2}}$ (or initial distributions), respectively. Since we allow weak solutions, the processes ${X_{t}^{1}}$ and ${X_{t}^{2}}$, generally speaking, are defined on two different probability spaces with two different Wiener processes; without loss of generality, we may regard them as independent processes on the direct product of these two probability spaces. Thus, we have two trajectories ${X^{1}}$ and ${X^{2}}$ on the same probability space (recall that the Wiener processes are also different and independent of each other). Denote by $Q(x,d{x^{\prime }})$ the transition kernel
\[ Q(x,d{x^{\prime }}):={\mathbb{P}_{x}}({X_{1}}\in d{x^{\prime }});\]
the (global) Markov–Dobrushin (MD) condition then reads
\[ \underset{{x_{1}},{x_{2}}}{\inf }\int \left(\frac{Q({x_{1}},d{x^{\prime }})}{Q({x_{2}},d{x^{\prime }})}\wedge 1\right)Q({x_{2}},d{x^{\prime }})>0.\]
Here $Q({x_{1}},d{x^{\prime }})$ is not necessarily assumed to be absolutely continuous with respect to $Q({x_{2}},d{x^{\prime }})$; the integrand here is understood as the Radon–Nikodym derivative of the absolutely continuous component of $Q({x_{1}},d{x^{\prime }})$ with respect to $Q({x_{2}},d{x^{\prime }})$. In what follows a localised version of this condition will be stated, and this localised version will be the object of our main interest in this paper. General approaches to coupling for SDEs require (usually positive) recurrence and some form of local mixing. For the latter, beside intersections (applicable only in the case $d=1$), the following tools can be used.
• Lower and upper bounds of the transition density (this requires Hölder continuity, at least, of the diffusion coefficient, as well as an "elliptic" or "hypoelliptic" nondegeneracy); here the petite sets condition, popular in the discrete time theory of ergodic Markov chains, along with recurrence properties may be used.
• Lower and upper bounds of the transition density only for the equation without the drift, including degenerate and highly degenerate cases: here Girsanov's transformation is an efficient tool; petite sets conditions, generally speaking, do not work.
• In the absence of lower and upper bounds of the density, under the nondegeneracy condition and for general measurable coefficients of the SDE, Harnack inequalities in parabolic or elliptic versions may be applied; a petite sets condition apparently may be proved; however, it is less efficient than the MD condition, because the latter guarantees better estimates of the convergence rate.
Hence, the goal of this paper is to attract more attention to the (local) MD condition, which, in the author's humble opinion, deserves it. There is also a hope that this list of available techniques may help in the future in studying ergodic properties of more general classes of processes.
One more point: except for the method based on lower and upper bounds of the density, in all other, more involved, situations the popular DD (Doeblin–Doob) condition [4] – or, in its local version, the petite set condition – is difficult to apply to SDEs, unlike the MD one; and even where it could be applied, the MD condition requires weaker assumptions and provides better estimates of the convergence rate.
Note that for discrete time stochastic models, and in certain cases for continuous time, too, one more natural approach to coupling is to use regeneration. Unfortunately, for general SDEs this method is not available. So, we do not discuss it here, although, the multidimensional coupling constructions for processes with continuous distributions are sometimes called “a generalised regeneration”.
Let us warn the reader that most of the results of this paper are known, perhaps, in a slightly different form; we just collect them here together. Yet, Section 2.6 is new, and the version of the result as it is stated in Section 2.5 is new, too. For simplicity we do not touch more general equations such as SDEs with jumps. However, in principle, more general Markov processes, in particular, SDEs with jumps, may also be tackled with the help of similar techniques.
It should also be highlighted that all methods discussed in what follows (except for the one based on the elliptic Harnack inequality) can be applied with minor differences to nonhomogeneous SDEs, too, except that the convergence would be in the total variation distance between the marginal distributions corresponding to any two initial measures, not to the invariant measure, which does not exist in this situation.
The paper consists of two sections: this introduction and the main Section 2; in its turn, Section 2 is split into six subsections, most of them related to one of the coupling tools listed above. The majority of proofs are sketched or dropped because the results are known; the only exceptions are the parts in Subsections 2.5 and 2.6; the latter, about the elliptic Harnack inequality, is new to the best of the author's knowledge. The proof of Lemma 2 has been provided for the reader's convenience at the suggestion of one of the referees. We repeat that the paper presents a set of various tools for coupling for (homogeneous) SDEs. Neither recurrence – the necessary second ingredient in studying convergence and mixing rates – nor coupling itself (except for the basic Lemma 1 added for the reader's convenience) is the goal of this paper.
2 Main results
2.1 Case $d=1$, MD condition and coupling using intersections
In the 1D case for local coupling we can use intersections of two independent solutions of the same SDE with different initial values. Assume that ${X_{t}}$ and ${X^{\prime }_{t}}$ are two independent solutions of Equation (1) with different initial values ${X_{0}}=x$ and ${X^{\prime }_{0}}={x^{\prime }}$ in the one-dimensional case. The basis for applying coupling via intersections is the following result.
Proposition 1.
If b, σ, and ${\sigma ^{-1}}$ are bounded, then
(3)
\[ \underset{-1\le x,{x^{\prime }}\le 1}{\inf }{\mathbb{P}_{x,{x^{\prime }}}}(\exists \hspace{0.1667em}s\in [0,1]:\hspace{0.1667em}{X_{s}}={X^{\prime }_{s}})>0.\]
The first meeting time $\tau :=\inf (t\ge 0:{X_{t}}={X^{\prime }_{t}})$ is a stopping time.
Proof.
The claim follows from two elementary steps.
1. Perform a random time change in each of the two SDEs, making the diffusion coefficients equal to one. There is no need to use the same time change for both equations: generally speaking, this is not possible unless the diffusion coefficient is constant. Since σ and ${\sigma ^{-1}}$ are bounded, the interval $[0,1]$ after this change becomes random, but for both processes it contains the nonrandom interval $[0,T]:=[0,{\inf _{x}}{\sigma ^{-2}}(x)]$. This step also applies to nonhomogeneous SDEs with coefficients depending on time.
2. The random time change leaves the drift bounded. Hence, by Girsanov's transformation of measure, the probability that the process with the lower initial value attains the level $+1$ over $[0,T]$ is positive and bounded away from zero. Similarly, the probability that the process with the higher initial value attains the level $-1$ over $[0,T]$ is positive and bounded away from zero. Therefore, the two paths meet on $[0,T]$ with a probability which is bounded away from zero, as required. □
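The probability in (3) is easy to estimate by Monte Carlo. The sketch below (not part of the original argument) runs two independent Euler–Maruyama copies of (1) from x and x′ and declares a meeting on $[0,1]$ once the ordering of the two paths flips; the coefficients are again only illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

def meet_prob(x, xp, b, sigma, dt=1e-3, n_paths=500):
    """Monte Carlo estimate of P(exists s in [0,1]: X_s = X'_s) for two
    independent copies of (1); a meeting is detected by a sign change of X - X'."""
    n_steps = round(1 / dt)
    hits = 0
    for _ in range(n_paths):
        y, yp = float(x), float(xp)
        for _ in range(n_steps):
            y  += b(y)  * dt + sigma(y)  * rng.normal(scale=np.sqrt(dt))
            yp += b(yp) * dt + sigma(yp) * rng.normal(scale=np.sqrt(dt))
            if (y - yp) * (x - xp) <= 0:   # the initial ordering flipped: the paths crossed
                hits += 1
                break
    return hits / n_paths

b = lambda v: -v                         # illustrative coefficients (assumed)
sigma = lambda v: 1.0 + 0.25 * np.sin(v)
print(meet_prob(-1.0, 1.0, b, sigma))    # strictly positive, in line with (3)
```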
2.2 MD condition, “case b” and “petite set” conditions
In dimensions $d>1$ intersections do not work for the "normal" SDEs, and we now switch to the main topic of this paper – local mixing conditions. Global and local versions of the "petite set" and Markov–Dobrushin (MD) conditions will be stated. Most frequently either of them is applied in its local variant, but the global option also works in cases of uniform ergodicity. It should be noted that, in fact, local versions may vary slightly depending on a particular setting; we only show their main appearances. The "petite set" condition is a localised version of the "case (b)" condition from [4, Chapter V, Section 5], which is, in turn, a simplification of "condition D" (nowadays called the Doeblin–Doob condition) from the same chapter in [4]. Let us highlight that the MD condition may also be in a global or local form.
Definition 2.
The process satisfies the condition "(b)" (from [4, Chapter V]) iff there exist a probability measure ν on the state space $\mathcal{X}$ and constants $T,c>0$ such that
(4)
\[ {\mu _{T}^{x}}(A)\ge c\hspace{0.1667em}\nu (A)\hspace{1em}\text{for any}\hspace{2.5pt}x\in \mathcal{X}\hspace{2.5pt}\text{and any Borel set}\hspace{2.5pt}A\subset \mathcal{X},\]
where ${\mu _{T}^{x}}$ denotes the distribution of ${X_{T}}$ given ${X_{0}}=x$.
See, in particular, [10] about the usage of the petite set condition in convergence studies. Let us recall that normally this local condition – as well as the local MD condition in Definition 5 below – should be complemented by certain recurrence assumptions or properties; however, as was said earlier, recurrence is not the goal of this paper; it makes sense to study it separately. Both conditions "case (b)" and MD in their global forms lead to an efficient exponential convergence, uniform with respect to the initial data.
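To make condition (4) concrete, here is a small numerical sketch in which the dynamics are assumed to be the Ornstein–Uhlenbeck equation $d{X_{t}}=-{X_{t}}\hspace{0.1667em}dt+d{W_{t}}$ (an illustrative choice, not from the text), so that ${\mu _{T}^{x}}$ is Gaussian with mean $x{e^{-T}}$ and variance $(1-{e^{-2T}})/2$; in the local variant, with x restricted to $D=[-1,1]$ and ν uniform on $[-1,1]$, a valid constant c is computed directly.

```python
import numpy as np
from scipy.stats import norm

# mu_T^x = N(x e^{-T}, (1 - e^{-2T})/2) for the OU process (illustrative assumption);
# nu = Uniform[-1, 1] has density 1/2, so mu_T^x(A) >= c * nu(A) on D = [-1, 1] with
# c = 2 * inf over x in D, y in [-1, 1] of the transition density f_T(x, y).
T = 1.0
m, s = np.exp(-T), np.sqrt((1 - np.exp(-2 * T)) / 2)
xs = np.linspace(-1, 1, 201)   # initial points in D
ys = np.linspace(-1, 1, 201)   # support of nu
c = 2 * min(norm.pdf(y, loc=m * x, scale=s) for x in xs for y in ys)
print(c)   # strictly positive: a local version of (4) holds on D
```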
The next is a more general version of Definition 1.
Definition 5.
The following is called a local Markov–Dobrushin condition: there exist sets $D,{D^{\prime }}\subset \mathcal{X}$ in the state space and a constant $T>0$ such that
(7)
\[ \kappa (D,{D^{\prime }};T):=\underset{{x_{0}},{x_{1}}\in D}{\inf }{\int _{{D^{\prime }}}}\left(\frac{{\mu _{T}^{{x_{0}}}}(dy)}{{\mu _{T}^{{x_{1}}}}(dy)}\wedge 1\right){\mu _{T}^{{x_{1}}}}(dy)>0.\]
Remark 1.
Usually, but not necessarily, ${D^{\prime }}=D$; in this case we use the notation $\kappa (D,{D^{\prime }};T)=:\kappa (D;T)$. Another possibility is ${D^{\prime }}={\mathbb{R}^{d}}$. A sufficient condition for (7) is as follows: there exists a dominating measure $\nu (dy)$ such that ${\mu _{T}^{x}}(dy)\ll \nu (dy)$ for any $x\in D$, and
(8)
\[ \kappa (D,{D^{\prime }};T)=\underset{{x_{0}},{x_{1}}\in D}{\inf }{\int _{{D^{\prime }}}}\left(\frac{{\mu _{T}^{{x_{0}}}}(dy)}{\nu (dy)}\wedge \frac{{\mu _{T}^{{x_{1}}}}(dy)}{\nu (dy)}\right)\nu (dy)>0.\]
In general, there might be no dominating measure for all x simultaneously. Yet, as we shall see, (8) may be verified in most of the cases in what follows. Note that, of course, (8) with any $(D,{D^{\prime }})$ implies the same condition with $(D,{\mathbb{R}^{d}})$; however, it may be more convenient for technical reasons to have a bounded set ${D^{\prime }}$, as was realised, for example, in [1].
Clearly, the "petite set" condition implies the MD one, both in the global ("case (b)") and local versions; for example, (4) implies (6): we have
\[ \underset{{x_{0}},{x_{1}}}{\inf }\int \left(\frac{{\mu _{T}^{{x_{0}}}}(dy)}{{\mu _{T}^{{x_{1}}}}(dy)}\wedge 1\right){\mu _{T}^{{x_{1}}}}(dy)=\underset{{x_{0}},{x_{1}}}{\inf }\int \left(\frac{{\mu _{T}^{{x_{0}}}}(dy)}{\nu (dy)}\wedge \frac{{\mu _{T}^{{x_{1}}}}(dy)}{\nu (dy)}\right)\nu (dy)\ge c>0\]
under (4); but, generally speaking, not vice versa. The basis for applying coupling via any of them is the following coupling lemma (not to be confused with the coupling inequality). Let us add that the MD condition admits a further generalisation, see [15, 16], which in certain cases provides a slightly better bound on the convergence rate under slightly weaker assumptions. However, this note is just about the tools which allow one to check a local MD condition for SDEs. The following lemma clarifies why the MD condition is so useful; at the same time it serves as the basis for further applications of the MD condition to the coupling technique for Markov processes.
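In the same toy Ornstein–Uhlenbeck example (again an assumption made purely for illustration), the constant $\kappa (D,{D^{\prime }};T)$ in (8) may be evaluated numerically with the Lebesgue measure as ν, $D=[-1,1]$ and ${D^{\prime }}=\mathbb{R}$ truncated to a grid:

```python
import numpy as np
from scipy.stats import norm

def kappa_md(T=1.0, n_start=41, n_grid=4001):
    """Numerical value of kappa(D, D'; T) from (8) for the OU example (assumed)."""
    m = np.exp(-T)
    s = np.sqrt((1 - np.exp(-2 * T)) / 2)
    xs = np.linspace(-1.0, 1.0, n_start)             # initial points in D = [-1, 1]
    ys = np.linspace(-1 - 6 * s, 1 + 6 * s, n_grid)  # integration grid for D' (truncated R)
    dy = ys[1] - ys[0]
    kappa = np.inf
    for x0 in xs:
        for x1 in xs:
            p0 = norm.pdf(ys, loc=m * x0, scale=s)   # density of mu_T^{x0} w.r.t. Lebesgue
            p1 = norm.pdf(ys, loc=m * x1, scale=s)
            kappa = min(kappa, np.sum(np.minimum(p0, p1)) * dy)
    return kappa

print(kappa_md())   # strictly positive, so the local MD condition (8) holds
```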
Lemma 1.
Let ${X^{1}}$ and ${X^{2}}$ be two random vectors in ${\mathbb{R}^{d}}$ defined on their probability spaces $({\Omega ^{1}},{\mathcal{F}^{1}},{\mathbb{P}^{1}})$ and $({\Omega ^{2}},{\mathcal{F}^{2}},{\mathbb{P}^{2}})$ (without loss of generality different; they may be made independent after we take the direct product of these spaces), with densities ${p^{1}}$ and ${p^{2}}$ with respect to some reference measure Λ, correspondingly. If
\[ \int \left({p^{1}}(x)\wedge {p^{2}}(x)\right)\Lambda (dx)=1-p>0,\]
then there exists one more probability space $(\Omega ,\mathcal{F},\mathbb{P})$ and two random variables on it, ${\tilde{X}^{1}}$, ${\tilde{X}^{2}}$, such that
\[ \mathcal{L}({\tilde{X}^{j}})=\mathcal{L}({X^{j}}),\hspace{1em}j=1,2,\hspace{1em}\& \hspace{1em}\frac{\| \mathcal{L}({X^{1}})-\mathcal{L}({X^{2}}){\| _{TV}}}{2}=\mathbb{P}({\tilde{X}^{1}}\ne {\tilde{X}^{2}})=p.\]
This is a well-known technical tool in the coupling method. The proof – simple enough – may be found, for example, in [13]. This reference should not be regarded as a claim that the lemma belongs to the author; in fact, it is not clear to him who first invented it. The way this lemma may be used for studying convergence rates for Markov chains can be seen, for example, in [13, 16]. The state space could be much more general than ${\mathbb{R}^{d}}$, but in this paper we are only concerned with SDEs in finite-dimensional Euclidean spaces.
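For completeness, here is a constructive sketch of Lemma 1 in the simplest, finite state space setting (this is the standard construction, not specific to this paper): with probability $1-p$, equal to the mass of the common part ${p^{1}}\wedge {p^{2}}$, both coordinates are drawn from the normalised common part, so that they coincide; otherwise they are drawn independently from the normalised residuals, whose supports are disjoint.

```python
import numpy as np

rng = np.random.default_rng(0)

def maximal_coupling(p1, p2, n=20_000):
    """Sample pairs with marginals p1, p2 such that P(x1 != x2) = ||p1 - p2||_TV / 2."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    common = np.minimum(p1, p2)
    q = common.sum()                        # q = 1 - p in the notation of Lemma 1
    assert 0 < q < 1
    r1 = (p1 - common) / (1 - q)            # residual laws; their supports are disjoint
    r2 = (p2 - common) / (1 - q)
    k = len(p1)
    out = np.empty((n, 2), dtype=int)
    for i in range(n):
        if rng.random() < q:                # coupled branch: one common value
            v = rng.choice(k, p=common / q)
            out[i] = v, v
        else:                               # uncoupled branch: independent residual draws
            out[i] = rng.choice(k, p=r1), rng.choice(k, p=r2)
    return out

pairs = maximal_coupling([0.5, 0.3, 0.2], [0.2, 0.3, 0.5])
print((pairs[:, 0] != pairs[:, 1]).mean())  # approx 0.3 = TV distance / 2
```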
The next lemma justifies the hint that in order to estimate the rate of convergence of a Markov process $({X_{t}},t\ge 0)$ in continuous time to its invariant measure (assuming that the latter exists) it suffices to evaluate this rate at discrete times $n=0,1,\dots $ Its elementary proof is provided for the reader's convenience. Let ${\mu _{t}^{X}}$ be the marginal distribution of ${X_{t}}$ with ${X_{0}}=x$, and let ${\mu ^{X}}$ be its invariant measure.
Lemma 2.
For any $t\ge 0$ and any integer $0\le n\le t$,
\[ \| {\mu _{t}^{X}}-{\mu ^{X}}{\| _{TV}}\le \| {\mu _{n}^{X}}-{\mu ^{X}}{\| _{TV}}.\]
Proof.
Due to the Markov property of X and the Chapman–Kolmogorov equation, using the convention ${a_{+}}=a\vee 0$, ${a_{-}}=(-a)\vee 0$, we get
\[\begin{array}{r}\displaystyle \frac{1}{2}\| {\mu _{t}^{X}}-{\mu ^{X}}{\| _{TV}}=\underset{A}{\sup }({\mathbb{P}_{x}}({X_{t}}\in A)-{\mathbb{P}_{\mu }}({X_{t}}\in A))\\ {} \displaystyle =\underset{A}{\sup }\iint 1(z\in A)({\mathbb{P}_{x}}({X_{n}}\in dy)-{\mathbb{P}_{\mu }}({X_{0}}\in dy)){\mathbb{P}_{y}}({X_{t-n}}\in dz)\\ {} \displaystyle =\underset{A}{\sup }\left(\iint 1(z\in A){({\mathbb{P}_{x}}({X_{n}}\in dy)-{\mathbb{P}_{\mu }}({X_{0}}\in dy))_{+}}{\mathbb{P}_{y}}({X_{t-n}}\in dz)\right.\\ {} \displaystyle \left.-\iint 1(z\in A){({\mathbb{P}_{x}}({X_{n}}\in dy)-{\mathbb{P}_{\mu }}({X_{0}}\in dy))_{-}}{\mathbb{P}_{y}}({X_{t-n}}\in dz)\right)\\ {} \displaystyle \le \underset{A}{\sup }\iint 1(z\in A){({\mathbb{P}_{x}}({X_{n}}\in dy)-{\mathbb{P}_{\mu }}({X_{0}}\in dy))_{+}}{\mathbb{P}_{y}}({X_{t-n}}\in dz)\\ {} \displaystyle =\iint {\mathbb{P}_{y}}({X_{t-n}}\in dz){({\mathbb{P}_{x}}({X_{n}}\in dy)-{\mathbb{P}_{\mu }}({X_{0}}\in dy))_{+}}\\ {} \displaystyle =\int {({\mathbb{P}_{x}}({X_{n}}\in dy)-{\mathbb{P}_{\mu }}({X_{0}}\in dy))_{+}}=\frac{1}{2}\| {\mu _{n}^{X}}-{\mu ^{X}}{\| _{TV}},\end{array}\]
as required. □
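Lemma 2 is easy to confirm numerically in a model where everything is explicit; the sketch below assumes (for illustration only) the Ornstein–Uhlenbeck dynamics $d{X_{t}}=-{X_{t}}\hspace{0.1667em}dt+d{W_{t}}$, for which ${\mu _{t}^{X}}=\mathcal{N}(x{e^{-t}},(1-{e^{-2t}})/2)$ and ${\mu ^{X}}=\mathcal{N}(0,1/2)$.

```python
import numpy as np
from scipy.stats import norm

def tv_to_invariant(x, t, grid=np.linspace(-10, 10, 100001)):
    """||mu_t^X - mu^X||_TV for the OU example (assumed), via density integration."""
    p_t = norm.pdf(grid, loc=x * np.exp(-t), scale=np.sqrt((1 - np.exp(-2 * t)) / 2))
    p_inf = norm.pdf(grid, loc=0.0, scale=np.sqrt(0.5))
    return np.trapz(np.abs(p_t - p_inf), grid)

x, t = 3.0, 2.5
print(tv_to_invariant(x, t) <= tv_to_invariant(x, int(t)))  # True, as Lemma 2 asserts
```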
2.3 MD condition using lower and upper bounds of the transition density
Assume $d\ge 1$ and that for the transition densities (fundamental solutions in the PDE language) there exist Gaussian type upper and lower bounds
(9)
\[ {C^{\prime }_{t}}\exp (-{c_{t}}g(x,{x^{\prime }}))\le {f_{t}}(x,{x^{\prime }})\le {C_{t}}\exp (-{c_{t}^{-1}}g(x,{x^{\prime }}))\]
with some appropriate function $g(x,{x^{\prime }})\ge 0$ and constants ${C_{t}},{C^{\prime }_{t}},{c_{t}},{c^{\prime }_{t}}>0$. In the nondegenerate case the function g may be taken in the form $g(x,{x^{\prime }})=|x-{x^{\prime }}{|^{2}}$; in the hypoelliptic cases it may be chosen similarly, replacing the Euclidean norm $|x-{x^{\prime }}|$ with some other appropriate norm which reflects the structure of the Hörmander type condition assumed. The upper bound in (9) can be established under the nondegeneracy of $\sigma {\sigma ^{\ast }}$ for bounded coefficients Hölder continuous in x, see the details in [5, 6, 12]. For the full double Inequality (9) under the nondegeneracy and some other conditions which are not specified here, see, for example, [17, Theorem A], [2, Theorem 21], [7, Theorem 5]; under certain hypoellipticity conditions, see [3, Theorem 1.1] (where, actually, g itself also depends on t). Various similar inequalities may be found in other sources. In particular, under the nondegeneracy condition on $\sigma {\sigma ^{\ast }}$, one can use ${C_{t}}=C{t^{-d/2}}$, ${C^{\prime }_{t}}={C^{\prime }}{t^{-d/2}}$, ${c_{t}}=ct$, ${c^{\prime }_{t}}={c^{\prime }}t$ with some $C,{C^{\prime }},c,{c^{\prime }}>0$; under the hypoelliptic conditions such constants have some other form. Let us emphasise that various versions of Inequality (9) suit our goal: the essential property is that both the lower and the upper bounds are locally finite and locally bounded away from zero for some $t>0$.
Clearly, a local "petite set" condition is satisfied under (9) with any bounded domain D (an open set, by definition) and with the Lebesgue measure as ν. Hence, the MD condition is also valid. To the best of the author's knowledge this is the only case where the "petite set" condition can be applied to Markov SDEs in order to arrange coupling; however, this class of coefficients is wide enough, although far from the most general.
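For instance, (9) yields a crude but fully explicit lower bound for the constant in (8) (a sketch under the stated assumptions): if D and ${D^{\prime }}$ are bounded, $G:={\sup _{x\in D,{x^{\prime }}\in {D^{\prime }}}}g(x,{x^{\prime }})$ and λ denotes the Lebesgue measure, then
\[ \kappa (D,{D^{\prime }};T)\ge {\int _{{D^{\prime }}}}\underset{x\in D}{\inf }{f_{T}}(x,{x^{\prime }})\hspace{0.1667em}d{x^{\prime }}\ge \lambda ({D^{\prime }})\hspace{0.1667em}{C^{\prime }_{T}}\exp (-{c_{T}}G)>0.\]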
2.4 MD condition using stochastic exponentials
In this section let us assume that $d\ge 1$ and that lower and upper bounds for transition densities hold true for the SDE with a "truncated drift",
\[ {X_{t}^{0}}=x+{\int _{0}^{t}}\sigma ({X_{s}^{0}})d{W_{s}}+{\int _{0}^{t}}{b_{1}}({X_{s}^{0}})ds,\]
while the goal is to arrange a local coupling for the full SDE (1) with the more involved drift of the form
\[ b(x)={b_{1}}(x)+{b_{2}}(x),\]
where the function ${b_{2}}$ is just Borel measurable and bounded (this boundedness may be relaxed); ${\sigma ^{-1}}$ is assumed bounded, too. We are interested in establishing an MD condition for the full Equation (1). Note that, in general, upper and lower bounds from the previous subsection for the solution of Equation (1) are not known. Denote
\[ {\tilde{b}_{2}}(x):={\sigma ^{-1}}(x){b_{2}}(x),\]
and let
\[ {\rho _{T}}:=\exp \left(-{\int _{0}^{T}}{\tilde{b}_{2}}({X_{t}})\hspace{0.1667em}d{W_{t}}-\frac{1}{2}{\int _{0}^{T}}{\left|{\tilde{b}_{2}}({X_{t}})\right|^{2}}\hspace{0.1667em}dt\right).\]
Recall that ${\rho _{T}}$ is a probability density for any $T>0$ (that is, $\mathbb{E}{\rho _{T}}=1$). Denote by ${\mu _{T}^{x}}$ the marginal distribution of ${X_{T}}$ with ${X_{0}}=x$. Then, for any $R>0$ there exists $T>0$ such that, with ${B_{R}}:=\{x:|x|\le R\}$,
(10)
\[ \underset{{x_{1}},{x_{2}}\in {B_{R}}}{\inf }{\int _{{B_{R}}}}\left(\frac{{\mu _{T}^{{x_{1}}}}(dy)}{{\mu _{T}^{{x_{2}}}}(dy)}\wedge 1\right){\mu _{T}^{{x_{2}}}}(dy)>0.\]
This inequality suffices for applications to coupling and convergence rates (given suitable recurrence estimates). For the proofs of very close statements (actually, even for degenerate SDEs), see [1, 14]. These proofs are based on Girsanov's change of measure via the stochastic exponential ${\rho _{T}}$. Some other localised versions of this result may be established: for example, the sets ${B_{R}}$ under the infimum sign and as the domain of integration may, actually, differ.
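A sketch of how the stochastic exponential may be used in practice (the coefficients are assumed for illustration; the discretisation of the stochastic integral is the standard Euler one): the full SDE is simulated and ${\rho _{T}}$ is accumulated along the path, so that $\mathbb{E}f({X_{T}}){\rho _{T}}$ recovers the expectation of $f({X_{T}^{0}})$ for the truncated-drift equation.

```python
import numpy as np

rng = np.random.default_rng(3)

def path_with_exponential(x, b1, b2, sigma, T=1.0, dt=1e-3):
    """Simulate the full SDE (drift b1 + b2) and accumulate the discretised
    stochastic exponential rho_T with tilde b2 = sigma^{-1} b2 (Girsanov weight)."""
    n = round(T / dt)
    log_rho = 0.0
    for _ in range(n):
        dw = rng.normal(scale=np.sqrt(dt))
        tb2 = b2(x) / sigma(x)                       # tilde b2 = sigma^{-1} b2
        log_rho -= tb2 * dw + 0.5 * tb2 ** 2 * dt    # -int tb2 dW - (1/2) int |tb2|^2 dt
        x += (b1(x) + b2(x)) * dt + sigma(x) * dw
    return x, np.exp(log_rho)

b1 = lambda v: -v                  # "nice" part of the drift (illustrative)
b2 = lambda v: float(np.sign(v))   # merely measurable and bounded part (illustrative)
sigma = lambda v: 1.0
x_T, rho = path_with_exponential(0.5, b1, b2, sigma)
print(x_T, rho)
```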
2.5 MD condition using parabolic Harnack inequalities
As usual in this paper, in this section we assume that $d\ge 1$, the coefficients b and σ are bounded (which can be relaxed by a localisation) and Borel measurable, and $\sigma {\sigma ^{\ast }}$ is uniformly nondegenerate. Under such conditions Krylov–Safonov's parabolic Harnack inequality holds true [9, Theorem 1.1]; stated in terms of probabilities rather than solutions of PDEs it reads:
(11)
\[ \underset{|{x_{1}}|,|{x_{2}}|\le 1/4}{\sup }\frac{\mathbb{P}({X_{\tau }^{0,{x_{1}}}}\in d\gamma )}{\mathbb{P}({X_{\tau }^{\epsilon ,{x_{2}}}}\in d\gamma )}{\Big|_{{\Gamma _{\epsilon }}}}\le N<\infty ,\]
where ${\Gamma _{\epsilon }}$ is the part of the parabolic boundary of the cylinder $\{(t,x):|x|\le 1,\hspace{0.2778em}\epsilon \le t\le 1\}$ given by
\[ {\Gamma _{\epsilon }}={\Gamma _{\epsilon }^{(t<1)}}\cup {\Gamma _{\epsilon }^{(t=1)}}=\{(t,x):|x|=1,\hspace{0.2778em}\epsilon \le t\le 1\}\cup \{(t,x):|x|\le 1,\hspace{0.2778em}t=1\},\]
and
\[ \tau :=\inf (t\ge 0:|{X_{t}}|\ge 1),\hspace{0.2778em}\text{with the convention}\hspace{0.2778em}\inf (\varnothing )=1;\]
the constant N depends on d, on the ellipticity constants of the diffusion, on the sup-norm of the drift, and on ϵ. Note that in (11) the measure in the numerator is absolutely continuous with respect to the one in the denominator, that is, there is no singular component in this situation. Let
\[ {\mu ^{{x_{1}}}}(d\gamma )=\mathbb{P}({X_{\tau }^{0,{x_{1}}}}\in d\gamma ),\hspace{1em}{\mu ^{\epsilon ,{x_{2}}}}(d\gamma )=\mathbb{P}({X_{\tau }^{\epsilon ,{x_{2}}}}\in d\gamma ),\]
where $d\gamma $ is an element of the boundary ${\Gamma _{\epsilon }}$. Then the following local mixing bound holds true.
Theorem 2 (local MD via parabolic Harnack).
Let ${\mu ^{{x_{1}}}}({\Gamma _{\epsilon }})\ge q$ with some $q>0$. Then a version of the Markov–Dobrushin condition holds:
(12)
\[ {\int _{{\Gamma _{\epsilon }}}}\left(\frac{{\mu ^{\epsilon ,{x_{2}}}}(d\gamma )}{{\mu ^{{x_{1}}}}(d\gamma )}\wedge 1\right){\mu ^{{x_{1}}}}(d\gamma )\ge \frac{q}{N}.\]
Note that the value q here may be chosen arbitrarily close to one if $\epsilon >0$ is small enough. However, decreasing ϵ increases the constant N in (11).
Proof.
Indeed, due to Inequality (11) we have
\[ f:=\frac{d{\mu ^{{x_{1}}}}}{d{\mu ^{\epsilon ,{x_{2}}}}}{|_{{\Gamma _{\epsilon }}}}\le N\hspace{1em}\& \hspace{1em}{\mu ^{{x_{1}}}}\ll {\mu ^{\epsilon ,{x_{2}}}}\hspace{0.2778em}\text{on}\hspace{2.5pt}{\Gamma _{\epsilon }}.\]
Denote by ${\tilde{\mu }^{\epsilon ,{x_{2}}}}$ the absolutely continuous part of ${\mu ^{\epsilon ,{x_{2}}}}$ with respect to ${\mu ^{{x_{1}}}}$ (we do not know whether there exists a nontrivial singular component here, but the calculation in what follows does not depend on this). Then
\[ \frac{d{\tilde{\mu }^{\epsilon ,{x_{2}}}}}{d{\mu ^{{x_{1}}}}}{|_{{\Gamma _{\epsilon }}}}=\frac{1}{f}\ge \frac{1}{N}.\]
Hence, the assumption ${\mu ^{{x_{1}}}}({\Gamma _{\epsilon }})\ge q$ implies
\[\begin{array}{r}\displaystyle {\int _{{\Gamma _{\epsilon }}}}\left(\frac{{\mu ^{\epsilon ,{x_{2}}}}(d\gamma )}{{\mu ^{{x_{1}}}}(d\gamma )}\wedge 1\right){\mu ^{{x_{1}}}}(d\gamma )={\int _{{\Gamma _{\epsilon }}}}\left(\frac{{\tilde{\mu }^{\epsilon ,{x_{2}}}}(d\gamma )}{{\mu ^{{x_{1}}}}(d\gamma )}\wedge 1\right){\mu ^{{x_{1}}}}(d\gamma )\\ {} \displaystyle \ge {\int _{{\Gamma _{\epsilon }}}}\frac{1}{N}{\mu ^{{x_{1}}}}(d\gamma )\ge \frac{q}{N}>0.\end{array}\]
□
Sometimes it may be more convenient to use another version of the MD condition, which follows from Theorem 2. Denote
\[ {\mu _{1}^{x}}(dy):={\mathbb{P}_{x}}({X_{1}}\in dy),\hspace{1em}{\mu _{1}^{\epsilon ,z}}(dy):=\mathbb{P}({X_{1}^{\epsilon ,z}}\in dy),\]
where ${X^{\epsilon ,z}}$ is the solution started at time ϵ from the point z.
Corollary 1.
Under the assumptions of Theorem 2, the following version of the MD condition holds: there exists ${q^{\prime }}\in (0,q)$ such that
(13)
\[ \underset{|{x_{1}}|,|{x_{2}}|\le 1/4}{\inf }{\int _{{\mathbb{R}^{d}}}}\left(\frac{{\mu _{1}^{{x_{2}}}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\wedge 1\right){\mu _{1}^{{x_{1}}}}(dy)\ge \frac{{q^{\prime }}}{N}.\]
Note that here ${\mathbb{R}^{d}}$ plays the role of ${D^{\prime }}$ in the MD condition. In some cases this may not be convenient; however, using moment bounds for the solution, a reasonable version of this inequality with some bounded ball ${B_{R}}$ in place of ${\mathbb{R}^{d}}$ is, of course, possible. We leave this till further studies where such a replacement may be required.
Proof.
Note that due to the boundedness of σ and b,
\[ \underset{|{x_{2}}|\le 1/4}{\inf }{\mathbb{P}_{{x_{2}}}}(|{X_{\epsilon }}|\le 1/4)>0.\]
We have
\[\begin{array}{r}\displaystyle {\mu _{1}^{{x_{2}}}}(dy)={\mathbb{P}_{{x_{2}}}}({X_{1}}\in dy)={\mathbb{E}_{{x_{2}}}}\mathbb{E}(1({X_{1}}\in dy)|{X_{\epsilon }})\\ {} \displaystyle \ge {\mathbb{E}_{{x_{2}}}}1(|{X_{\epsilon }}|\le 1/4)\mathbb{E}(1({X_{1}}\in dy)|{X_{\epsilon }})={\mathbb{E}_{{x_{2}}}}1(|{X_{\epsilon }}|\le 1/4){\mu _{1}^{\epsilon ,{X_{\epsilon }}}}(dy).\end{array}\]
Hence, denoting ${\nu _{\epsilon ,{x_{2}}}}(dz):={\mathbb{P}_{{x_{2}}}}({X_{\epsilon }}\in dz)$, we find
\[\begin{array}{r}\displaystyle 1\wedge \frac{{\mu _{1}^{{x_{2}}}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\ge 1\wedge \frac{{\mathbb{E}_{{x_{2}}}}1(|{X_{\epsilon }}|\le 1/4){\mu _{1}^{\epsilon ,{X_{\epsilon }}}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\\ {} \displaystyle =1\wedge \frac{\textstyle\int {\nu _{\epsilon ,{x_{2}}}}(dz)1(|z|\le 1/4){\mu _{1}^{\epsilon ,z}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\\ {} \displaystyle =1\wedge \left(\int {\nu _{\epsilon ,{x_{2}}}}(dz)1(|z|\le 1/4)\frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\right)\\ {} \displaystyle \ge \int {\nu _{\epsilon ,{x_{2}}}}(dz)\left(1\wedge 1(|z|\le 1/4)\frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\right)\\ {} \displaystyle \ge \int {\nu _{\epsilon ,{x_{2}}}}(dz)1(|z|\le 1/4)\left(1\wedge \frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\right).\end{array}\]
So,
(14)
\[\begin{array}{r}\displaystyle \int \left(1\wedge \frac{{\mu _{1}^{{x_{2}}}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\right){\mu _{1}^{{x_{1}}}}(dy)\\ {} \displaystyle \ge \int \left[\int {\nu _{\epsilon ,{x_{2}}}}(dz)1(|z|\le 1/4)\left(1\wedge \frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\right)\right]{\mu _{1}^{{x_{1}}}}(dy)\\ {} \displaystyle =\int {\nu _{\epsilon ,{x_{2}}}}(dz)1(|z|\le 1/4)\left[\int \left(1\wedge \frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\right){\mu _{1}^{{x_{1}}}}(dy)\right].\end{array}\]
This was the first step in the reduction of the MD characteristic on the left-hand side of (14) to the Harnack inequality: now we may deal with the measures ${\mu _{1}^{\epsilon ,z}}(dy)$ and ${\mu _{1}^{{x_{1}}}}(dy)$. However, these are still not the ones which show up in (11) or in (12). The next step will complete this reduction. Let
\[ {\tilde{\Lambda }_{{x_{1}},z}}(dy):={\mu _{1}^{{x_{1}}}}(dy)+{\mu _{1}^{\epsilon ,z}}(dy).\]
Then both measures are absolutely continuous with respect to ${\tilde{\Lambda }_{{x_{1}},z}}$, and
(15)
\[ \int \left(1\wedge \frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\right){\mu _{1}^{{x_{1}}}}(dy)=\int \left(\frac{{\mu _{1}^{{x_{1}}}}(dy)}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}\wedge \frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}\right){\tilde{\Lambda }_{{x_{1}},z}}(dy).\]
We have
(16)
\[ \frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}\wedge \frac{{\mu _{1}^{{x_{1}}}}(dy)}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}\ge \frac{{\mathbb{P}_{{x_{1}}}}({X_{1}}\in dy,\tau <1)}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}\wedge \frac{{\mathbb{P}_{z}}({X_{1-\epsilon }}\in dy,\tau <1-\epsilon )}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}.\]
Therefore,
\[\begin{array}{r}\displaystyle \int \left(\frac{{\mu _{1}^{{x_{1}}}}(dy)}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}\wedge \frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}\right){\tilde{\Lambda }_{{x_{1}},z}}(dy)\\ {} \displaystyle \ge {\int _{{\mathbb{R}^{d}}}}\left(\frac{{\mathbb{P}_{{x_{1}}}}({X_{1}}\in dy,\tau <1)}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}\wedge \frac{{\mathbb{P}_{z}}({X_{1-\epsilon }}\in dy,\tau <1-\epsilon )}{{\tilde{\Lambda }_{{x_{1}},z}}(dy)}\right){\tilde{\Lambda }_{{x_{1}},z}}(dy)\\ {} \displaystyle ={\int _{{\mathbb{R}^{d}}}}\left(\frac{{\mathbb{P}_{{x_{1}}}}({X_{1}}\in dy,\tau <1)}{{\mathbb{P}_{z}}({X_{1-\epsilon }}\in dy,\tau <1-\epsilon )}\wedge 1\right){\mathbb{P}_{z}}({X_{1-\epsilon }}\in dy,\tau <1-\epsilon ).\end{array}\]
Further, since $|{x_{1}}|\le 1/4$ and $|z|\le 1/4$, due to the strong Markov property and by virtue of Inequality (11) we have
\[\begin{array}{r}\displaystyle {\mathbb{P}_{{x_{1}}}}({X_{1}}\in dy,\tau <1)={\mathbb{E}_{{x_{1}}}}1(\tau <1)1({X_{1}}\in dy)\\ {} \displaystyle ={\mathbb{E}_{{x_{1}}}}\mathbb{E}\left(1(\tau <1)1({X_{1}}\in dy)|{\mathcal{F}_{\tau }}\right)={\mathbb{E}_{{x_{1}}}}1(\tau <1)\mathbb{E}\left(1({X_{1}}\in dy)|{\mathcal{F}_{\tau }}\right)\\ {} \displaystyle ={\mathbb{E}_{{x_{1}}}}1(\tau <1)\mathbb{E}\left(1({X_{1}}\in dy)|{X_{\tau }}\right)\\ {} \displaystyle ={\mathbb{E}_{{x_{1}}}}1(\tau <1){\mathbb{E}_{t,x}}\left(1({X_{1-t}}\in dy)\right){|_{(t,x)=(\tau ,{X_{\tau }})}}\\ {} \displaystyle \ge \frac{q}{N}{\mathbb{E}_{z}}1(\tau <1-\epsilon ){\mathbb{E}_{t,x}}\left(1({X_{1-\epsilon -t}}\in dy)\right){|_{(t,x)=(\tau ,{X_{\tau }})}}\\ {} \displaystyle =\frac{q}{N}{\mathbb{P}_{z}}({X_{1-\epsilon }}\in dy,\tau <1-\epsilon ).\end{array}\]
So,
\[\begin{array}{r}\displaystyle {\int _{{\mathbb{R}^{d}}}}\left(\frac{{\mathbb{P}_{{x_{1}}}}({X_{1}}\in dy,\tau <1)}{{\mathbb{P}_{z}}({X_{1-\epsilon }}\in dy,\tau <1-\epsilon )}\wedge 1\right){\mathbb{P}_{z}}({X_{1-\epsilon }}\in dy,\tau <1-\epsilon )\\ {} \displaystyle \ge \frac{q}{N}{\int _{{\mathbb{R}^{d}}}}{\mathbb{P}_{z}}({X_{1-\epsilon }}\in dy,\tau <1-\epsilon )=\frac{q}{N}{\mathbb{P}_{z}}(\tau <1-\epsilon ).\end{array}\]
Recall that in (14) the integrand involves the indicator $1(|z|\le 1/4)$. Clearly, due to the uniform nondegeneracy of σ,
\[ \kappa :=\underset{|z|\le 1/4}{\inf }{\mathbb{P}_{z}}(\tau <1-\epsilon )>0.\]
Hence, due to (14), (15) and (16),
\[\begin{array}{r}\displaystyle \int {\nu _{\epsilon ,{x_{2}}}}(dz)1(|z|\le 1/4)\left[\int \left(1\wedge \frac{{\mu _{1}^{\epsilon ,z}}(dy)}{{\mu _{1}^{{x_{1}}}}(dy)}\right){\mu _{1}^{{x_{1}}}}(dy)\right]\\ {} \displaystyle \ge \frac{q\kappa }{N}\int {\nu _{\epsilon ,{x_{2}}}}(dz)1(|z|\le 1/4)=\frac{q\kappa }{N}\int 1(|z|\le 1/4){\mathbb{P}_{{x_{2}}}}({X_{\epsilon }}\in dz)\ge \frac{{q^{\prime }}}{N}\end{array}\]
with some $0<{q^{\prime }}<q\kappa $, as required. Inequality (13) follows. □
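The overlap on the left-hand side of (13) can be visualised by simulation: the sketch below (purely illustrative; a merely measurable bounded drift and $\sigma =1$ are assumed) draws samples from ${\mu _{1}^{{x_{1}}}}$ and ${\mu _{1}^{{x_{2}}}}$ with $|{x_{1}}|,|{x_{2}}|\le 1/4$ and computes the overlap of the two histograms, a discretised version of $\int (1\wedge d{\mu _{1}^{{x_{2}}}}/d{\mu _{1}^{{x_{1}}}})\hspace{0.1667em}d{\mu _{1}^{{x_{1}}}}$.

```python
import numpy as np

rng = np.random.default_rng(5)

def kernel_sample(x0, t=1.0, dt=1e-3, n=20000):
    """n samples from mu_1^{x0} via the Euler scheme for dX = -sign(X) dt + dW (assumed)."""
    x = np.full(n, float(x0))
    for _ in range(round(t / dt)):
        x += -np.sign(x) * dt + rng.normal(scale=np.sqrt(dt), size=n)
    return x

bins = np.linspace(-4, 4, 81)
h1, _ = np.histogram(kernel_sample(+0.25), bins=bins)
h2, _ = np.histogram(kernel_sample(-0.25), bins=bins)
print(np.minimum(h1, h2).sum() / h1.sum())   # strictly positive, in line with (13)
```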
2.6 MD condition using elliptic Harnack inequalities
The assumptions of this section are the same as in the previous one: $d\ge 1$, the coefficients b and σ are bounded (which can be relaxed) and Borel measurable, and $\sigma {\sigma ^{\ast }}$ is uniformly nondegenerate. We have the elliptic Harnack inequality due to [11, Theorem 3.1], stated here in its probabilistic form (while in [11] it is offered in the language of elliptic PDEs): there exists a constant $N>0$ such that for any $0<R\le 1$ and any Borel set $A\subset \partial {B_{R}}$,
(17)
\[ \underset{|x|\le R/8}{\sup }{\mathbb{P}_{x}}({X_{{\tau _{R}}}}\in A)\le N\underset{|x|\le R/8}{\inf }{\mathbb{P}_{x}}({X_{{\tau _{R}}}}\in A),\]
where ${\tau _{R}}=\inf (t\ge 0:|{X_{t}}|\ge R)$, and $\partial {B_{R}}$ is the boundary of the ball ${B_{R}}$. This inequality is itself a form of the MD condition. In fact, it is not clear whether this version of the Harnack inequality may be helpful for estimating the convergence rate of the distribution of ${X_{t}}$ to its stationary regime. Nevertheless, if it can be used for such a purpose – which is the author's hope – then it might be more convenient to apply the following version of the MD condition based on Inequality (17). Let
\[ {Q_{R,T}}:=[0,T)\times {B_{R}}.\]
Note that
\[ \bigcup \limits_{T>0}{Q_{R,T}}={\mathbb{R}_{+}}\times {B_{R}}={\mathbb{R}_{+}}\times \{x:\hspace{0.1667em}|x|\le R\}.\]
Denote by ${\Gamma _{R,T}}$ the part of the parabolic boundary of ${Q_{R,T}}$ corresponding to $t<T$, namely,
\[ {\Gamma _{R,T}}:=\{(t,x):\hspace{0.1667em}0\le t<T,\hspace{0.2778em}|x|=R\}.\]
Denote ${\tau _{R,T}}:=\inf (t\ge 0:{X_{t}}\notin {B_{R}})\wedge T$, ${\tau _{R}}:=\inf (t\ge 0:{X_{t}}\notin {B_{R}})$, and
\[ {\nu _{R,T}^{x}}(A):={\mathbb{P}_{x}}({X_{{\tau _{R,T}}}}\in A),\hspace{1em}{\nu _{R}^{x}}(A):={\mathbb{P}_{x}}({X_{{\tau _{R}}}}\in A).\]
Theorem 3 (local MD via elliptic Harnack).
The local MD condition
(18)
\[ \underset{{x_{1}},{x_{2}}\in {B_{R}}}{\inf }{\int _{{B_{R}}}}\left(\frac{{\nu _{R,T}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R,T}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\ge \frac{{C_{R}}}{2N}\]
holds true for any $T>0$ large enough.
In principle, it is possible to evaluate those values of T for which (18) holds, and potentially this might be useful for estimating convergence rates.
Proof.
Clearly, ${\tau _{R,T}}\le {\tau _{R}}$. Note that due to the nondegeneracy of σ we have ${\tau _{R}}<\infty $ a.s., and
\[ \underset{T\to \infty }{\lim }\underset{|x|\le R/8}{\inf }{\mathbb{P}_{x}}({\tau _{R,T}}={\tau _{R}})=\underset{T\to \infty }{\lim }\underset{|x|\le R/8}{\inf }{\mathbb{P}_{x}}({\tau _{R}}<T)=1.\]
Equivalently,
\[ \underset{T\to \infty }{\lim }\underset{|x|\le R/8}{\sup }{\mathbb{P}_{x}}({\tau _{R,T}}<{\tau _{R}})=\underset{T\to \infty }{\lim }\underset{|x|\le R/8}{\sup }{\mathbb{P}_{x}}(T<{\tau _{R}})=0.\]
Hence,
(19)
\[ {\nu _{R,T}^{x}}(A)={\mathbb{P}_{x}}({X_{{\tau _{R,T}}}}\in A)={\mathbb{P}_{x}}({X_{{\tau _{R}}}}\in A,{\tau _{R}}<T)\uparrow {\mathbb{P}_{x}}({X_{{\tau _{R}}}}\in A)={\nu _{R}^{x}}(A)\]
as $T\uparrow \infty $ for any Borel $A\subset \partial {B_{R}}$, where the convergence is uniform with respect to A and $|x|\le R/8$.
Inequality (17) implies the following:
\[ 0<{N^{-1}}\le \underset{|x|\le R/8}{\inf }\frac{{\nu _{R}^{x}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}{\Big|_{\partial {B_{R}}}}\le \underset{|x|\le R/8}{\sup }\frac{{\nu _{R}^{x}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}{\Big|_{\partial {B_{R}}}}\le N<\infty .\]
As a consequence, for any $R>0$ we get
(20)
\[ \underset{{x_{1}},{x_{2}}\in {B_{R}}}{\inf }{\int _{{B_{R}}}}\left(\frac{{\nu _{R}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\ge \frac{{C_{R}}}{N}>0.\]
By virtue of the monotone convergence theorem and due to (19) we have, for any ${x_{1}}$, ${x_{2}}$,
\[\begin{array}{r}\displaystyle {\int _{{B_{R}}}}\left(\frac{{\nu _{R,T}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R,T}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\\ {} \displaystyle \to {\int _{{B_{R}}}}\left(\frac{{\nu _{R}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma ),\hspace{1em}T\to \infty .\end{array}\]
Hence, for T large enough we obtain from (20),
\[ {\int _{{B_{R}}}}\left(\frac{{\nu _{R,T}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R,T}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\ge \frac{{C_{R}}}{2N}.\]
However, technically this is still not sufficient because we want a similar inequality with infimum ${\inf _{{x_{1}},{x_{2}}\in {B_{R}}}}$. Using the elementary inequality $(a-b)\wedge (c-d)\ge a\wedge c-b-d$ along with the identity
\[ {\mathbb{P}_{x}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}<T)={\mathbb{P}_{x}}({X_{{\tau _{R}}}}\in d\gamma )-{\mathbb{P}_{x}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}\ge T),\]
we get
\[\begin{array}{r}\displaystyle {\int _{{B_{R}}}}\left(\frac{{\nu _{R}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\ge {\int _{{B_{R}}}}\left(\frac{{\nu _{R,T}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R,T}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\\ {} \displaystyle ={\int _{{B_{R}}}}\left(\frac{{\mathbb{P}_{{x_{1}}}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}<T)}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\mathbb{P}_{{x_{2}}}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}<T)}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\\ {} \displaystyle \ge {\int _{{B_{R}}}}\left(\frac{{\mathbb{P}_{{x_{1}}}}({X_{{\tau _{R}}}}\in d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\mathbb{P}_{{x_{2}}}}({X_{{\tau _{R}}}}\in d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\\ {} \displaystyle -{\int _{{B_{R}}}}\left(\frac{{\mathbb{P}_{{x_{1}}}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}\ge T)}{{\nu _{R}^{0}}(d\gamma )}+\frac{{\mathbb{P}_{{x_{2}}}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}\ge T)}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma ).\end{array}\]
Here
\[\begin{array}{r}\displaystyle \underset{{x_{1}},{x_{2}}\in {B_{R}}}{\sup }{\int _{{B_{R}}}}\left(\frac{{\mathbb{P}_{{x_{1}}}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}\ge T)}{{\nu _{R}^{0}}(d\gamma )}+\frac{{\mathbb{P}_{{x_{2}}}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}\ge T)}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\\ {} \displaystyle =\underset{{x_{1}},{x_{2}}\in {B_{R}}}{\sup }{\int _{{B_{R}}}}\left({\mathbb{P}_{{x_{1}}}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}\ge T)+{\mathbb{P}_{{x_{2}}}}({X_{{\tau _{R}}}}\in d\gamma ,{\tau _{R}}\ge T)\right)\\ {} \displaystyle \le 2\underset{x\in {B_{R}}}{\sup }{\mathbb{P}_{x}}({\tau _{R}}\ge T)\to 0,\hspace{1em}T\to \infty .\end{array}\]
So,
\[\begin{array}{r}\displaystyle \underset{{x_{1}},{x_{2}}\in {B_{R}}}{\inf }{\int _{{B_{R}}}}\left(\frac{{\nu _{R,T}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R,T}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\\ {} \displaystyle \to \underset{{x_{1}},{x_{2}}\in {B_{R}}}{\inf }{\int _{{B_{R}}}}\left(\frac{{\nu _{R}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma ),\hspace{1em}T\to \infty .\end{array}\]
On the other hand,
\[ \underset{{x_{1}},{x_{2}}\in {B_{R}}}{\inf }{\int _{{B_{R}}}}\left(\frac{{\mathbb{P}_{{x_{1}}}}({X_{{\tau _{R}}}}\in d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\mathbb{P}_{{x_{2}}}}({X_{{\tau _{R}}}}\in d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\ge \frac{{C_{R}}}{N}.\]
Therefore, there exists ${T_{0}}$ such that for all $T\ge {T_{0}}$,
\[ \underset{{x_{1}},{x_{2}}\in {B_{R}}}{\inf }{\int _{{B_{R}}}}\left(\frac{{\nu _{R,T}^{{x_{1}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\wedge \frac{{\nu _{R,T}^{{x_{2}}}}(d\gamma )}{{\nu _{R}^{0}}(d\gamma )}\right){\nu _{R}^{0}}(d\gamma )\ge \frac{{C_{R}}}{2N},\]
which completes the proof. □
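Finally, the overlap of the exit laws in (18) can also be explored by a rough Monte Carlo experiment; the sketch below (with assumed coefficients $b(x)=-x$, $\sigma =I$, $d=2$, and with ${\nu _{R}^{x}}$ in place of ${\nu _{R,T}^{x}}$, which is justified by (19) for large T) bins the exit points on $\partial {B_{R}}$ by angle and computes the overlap of two binned exit laws.

```python
import numpy as np

rng = np.random.default_rng(4)

def exit_angles(x0, n_paths=1000, R=1.0, dt=2e-3):
    """Angles of X_{tau_R} on |x| = R for dX = -X dt + dW in d = 2 (assumed)."""
    out = np.empty(n_paths)
    for i in range(n_paths):
        x = np.array(x0, float)
        while np.linalg.norm(x) < R:
            x += -x * dt + rng.normal(scale=np.sqrt(dt), size=2)
        out[i] = np.arctan2(x[1], x[0])
    return out

bins = np.linspace(-np.pi, np.pi, 25)
h1, _ = np.histogram(exit_angles([0.1, 0.0]), bins=bins)
h2, _ = np.histogram(exit_angles([-0.1, 0.0]), bins=bins)
print(np.minimum(h1, h2).sum() / h1.sum())   # strictly positive overlap, cf. (18)
```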