1 Introduction and the main result
This paper aims at the large deviation principle (LDP) for the solutions to the SDEs
(1)
\[d{X_{t}^{\varepsilon }}=a\big({X_{t}^{\varepsilon }}\big)\hspace{0.1667em}dt+\varepsilon \sigma \big({X_{t}^{\varepsilon }}\big)\hspace{0.1667em}dW_{t},\hspace{1em}{X_{0}^{\varepsilon }}=x_{0}\in \mathbb{R},\]
with possibly discontinuous coefficients $a,\sigma $. Recall that a family of (the distributions of) random elements $\{{X}^{\varepsilon }\}$ taking values in a Polish space $\mathbb{X}$ is said to satisfy the LDP with rate function $I:\mathbb{X}\to [0,\infty ]$ and speed function $r:{\mathbb{R}}^{+}\to {\mathbb{R}}^{+}$ if
(2)
\[\underset{\varepsilon \to 0}{\limsup }r(\varepsilon )\log \mathbf{P}\big\{{X}^{\varepsilon }\in F\big\}\le -\underset{x\in F}{\inf }I(x)\]
for each closed $F\subset \mathbb{X}$ and
(3)
\[\underset{\varepsilon \to 0}{\liminf }r(\varepsilon )\log \mathbf{P}\big\{{X}^{\varepsilon }\in G\big\}\ge -\underset{x\in G}{\inf }I(x)\]
for each open $G\subset \mathbb{X}$. The rate function is assumed to be lower semicontinuous; that is, all level sets $\{x:I(x)\le c\}$, $c\ge 0$, are closed. If all level sets are compact, then the rate function is called good.
We assume that, for some $C,c>0$,
(4)
\[\big|a(x)\big|\le C\big(1+|x|\big),\hspace{2em}c\le {\sigma }^{2}(x)\le C,\hspace{1em}x\in \mathbb{R}.\]
It is well known that, in this case, the SDE (1) has a unique weak solution, which can be obtained by a proper combination of the time change transformation of a Wiener process and the Girsanov transformation of the measure; see [10], IV, §4. In what follows, we fix $T>0$, interpret the (weak) solution ${X}^{\varepsilon }=\{{X_{t}^{\varepsilon }},t\in [0,T]\}$ to (1) as a random element in $C(0,T)$, and prove the LDP for the family $\{{X}^{\varepsilon }\}$. Since the law of ${X}^{\varepsilon }$ does not depend on a possible change of the sign of σ, in what follows, we assume without loss of generality that $\sigma >0$.
Our principal regularity assumption on the coefficients $a,\sigma $ is that they have no discontinuities of the second kind, that is, they have left- and right-hand limits at every point $x\in \mathbb{R}$. For a given pair of such functions $a,\sigma $, we define the modified functions $\bar{a},\bar{\sigma }$ as follows:
Denote by $\mathit{AC}(0,T)$ the class of absolutely continuous functions $\phi :[0,T]\to \mathbb{R}$, and for each $f\in \mathit{AC}(0,T)$, we denote by $\dot{f}$ its derivative, which is well defined for almost all $t\in [0,T]$.
Theorem 1.
Let $a,\sigma $ satisfy (4) and have no discontinuities of the second kind. Then the family of distributions in $C(0,T)$ of the solutions to SDEs (1) satisfies the LDP with the speed function $r(\varepsilon )={\varepsilon }^{2}$ and the good rate function equal to
\[I(f)=\frac{1}{2}{\int _{0}^{T}}\frac{{(\dot{f}_{t}-\bar{a}(f_{t}))}^{2}}{{\bar{\sigma }}^{2}(f_{t})}\hspace{0.1667em}dt\]
for $f\in \mathit{AC}(0,T)$ with $f_{0}=x_{0}$ and $I(f)=\infty $ otherwise.
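To make the formula for the rate function concrete, the following minimal Python sketch approximates $I(f)$ for a path sampled on a uniform grid. It is an illustration only: the names `rate_function`, `a_bar`, `sigma_bar` are hypothetical, the modified coefficients are supplied by the user, and the midpoint rule is just one possible discretization.

```python
import math


def rate_function(path, T, x0, a_bar, sigma_bar):
    """Approximate I(f) = 1/2 * int_0^T (f' - a_bar(f))^2 / sigma_bar(f)^2 dt.

    `path` holds the values of f on a uniform grid of [0, T].
    Returns infinity when f_0 differs from x0, as in Theorem 1.
    """
    if path[0] != x0:
        return math.inf
    n = len(path) - 1
    dt = T / n
    total = 0.0
    for i in range(n):
        slope = (path[i + 1] - path[i]) / dt   # finite-difference derivative
        mid = 0.5 * (path[i] + path[i + 1])    # state at the cell midpoint
        total += (slope - a_bar(mid)) ** 2 / sigma_bar(mid) ** 2 * dt
    return 0.5 * total


# Illustrative (assumed) modified coefficients with one discontinuity at 0.
def a_bar(x):
    return 1.0 if x < 0 else (0.0 if x == 0 else -1.0)


def sigma_bar(x):
    return 1.0


constant_path = [0.0] * 101                    # f = 0 on [0, 1]
print(rate_function(constant_path, 1.0, 0.0, a_bar, sigma_bar))
```

With these assumed coefficients the constant path at the discontinuity point has zero cost, whereas a path that starts from a point other than $x_{0}$ is assigned infinite cost.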
Theorem 1 has a form very similar to the classical Wentzel–Freidlin theorem ([9], Chapter 3, §2), which establishes the LDP in a much more general setting, for multidimensional Markov processes that may contain both diffusive and jump parts. However, the Wentzel–Freidlin approach substantially exploits the continuity of the infinitesimal characteristics of the process. A natural question arises: to what extent can the continuity assumption be relaxed in this theory? In [5, 6], the LDP was established for multidimensional diffusions with unit diffusion matrix and drift coefficients discontinuous along a given hyperplane; see also [1, 2, 4] for some other results in this direction. In [11], this result, in the one-dimensional setting, is extended to the case of piecewise smooth drift and diffusion coefficients with one common discontinuity point. The technique in the aforementioned papers is based on the analysis of the joint distribution of the process itself and its occupation time in the half-space above the discontinuity point (surface) and is hardly applicable when the structure of the discontinuity sets for the coefficients is more complicated. In [13], the LDP for a one-dimensional SDE with zero drift coefficient was established under a very mild regularity condition on σ: it was only assumed that the discontinuity set of σ has zero Lebesgue measure. Extending this result to the case of a nonzero drift coefficient is far from trivial. In [14], such an extension was provided, but the assumption therein that $a/{\sigma }^{2}$ possesses a bounded derivative is definitely too restrictive. In this paper, we summarize the studies from [13] and [14]; note that the assumption on σ in the current paper is slightly stronger than in [13].
We note that our main result, Theorem 1, well illustrates the relation of the LDP with discontinuous coefficients to the classical Wentzel–Freidlin theory: the rate function in this theorem is given in a classical form, but with the properly modified coefficients. The heuristics of this modification is clearly seen. Namely, thanks to (ii), the rate functional I is lower semicontinuous; see Section 2.2. Assertion (i) corresponds to the fact that, in the case $a(x-)\ge 0$ and $a(x+)\le 0$, the family ${X}^{\varepsilon }$ with ${X_{0}^{\varepsilon }}=x$ weakly converges to the constant function equal to x. We interpret the limiting function as the solution to the ODE $\dot{x}_{t}=\bar{a}(x_{t})$, and note that a similar ODE for a may fail to have a solution at all.
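The heuristic behind assertion (i) can be observed numerically. The sketch below uses a plain Euler–Maruyama scheme, which is an assumption of this illustration and not the construction used in the paper, with the hypothetical coefficients $a(x)=1$ for $x<0$, $a(x)=-1$ for $x\ge 0$, and $\sigma \equiv 1$, so that $a(0-)\ge 0$ and $a(0+)\le 0$.

```python
import math
import random


def euler_maruyama(a, sigma, x0, eps, T, n_steps, rng):
    """Simulate dX = a(X) dt + eps * sigma(X) dW on [0, T] on a uniform grid."""
    dt = T / n_steps
    x = x0
    path = [x]
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0) * math.sqrt(dt)   # Wiener increment
        x += a(x) * dt + eps * sigma(x) * noise
        path.append(x)
    return path


def a(x):
    # Drift pushing towards 0 from both sides: a(0-) = 1, a(0+) = -1.
    return 1.0 if x < 0 else -1.0


def sigma(x):
    return 1.0


rng = random.Random(0)                                # fixed seed for reproducibility
path = euler_maruyama(a, sigma, 0.0, 0.05, 1.0, 1000, rng)
sup_norm = max(abs(x) for x in path)
print(sup_norm)
```

For small $\varepsilon $ the simulated trajectory started at the discontinuity point stays in a narrow neighbourhood of it, in agreement with the weak convergence to the constant function described above.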
2 Preliminaries to the proof
2.1 Exponential tightness, contraction principle
Recall that a family $\{{X}^{\varepsilon }\}$ is called exponentially tight with the speed function $r(\varepsilon )$ if for each $Q>0$, there exists a compact set $K\subset \mathbb{X}$ such that
\[\underset{\varepsilon \to 0}{\limsup }r(\varepsilon )\log \mathbf{P}\big({X}^{\varepsilon }\notin K\big)\le -Q.\]
For an exponentially tight family, the LDP is equivalent to the weak LDP; the latter by definition means that the upper bound (2) holds for all compact sets F, whereas the lower bound (3) still holds for all open sets G. An equivalent formulation of the weak LDP is the following: for each $x\in \mathbb{X}$,
(5)
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \underset{\delta \to 0}{\lim }\underset{\varepsilon \to 0}{\limsup }r(\varepsilon )\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(x)\big)\\{} & \displaystyle \hspace{1em}=\underset{\delta \to 0}{\lim }\underset{\varepsilon \to 0}{\liminf }r(\varepsilon )\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(x)\big)=-I(x),\end{array}\]
where $B_{\delta }(x)$ denotes the open ball with center x and radius δ.
To prove (5), we will use a certain extension of the contraction principle, which in its classical form (e.g., [8], Section 3.1, and [7], Section 4.2.1) states the LDP for a family ${X}^{\varepsilon }=F({Y}^{\varepsilon })$, where ${Y}^{\varepsilon }$ is a family of random elements in a Polish space $\mathbb{Y}$ that satisfies an LDP with a good rate function J, and $F:\mathbb{Y}\to \mathbb{X}$ is a continuous mapping. The rate function for ${X}^{\varepsilon }$ in this case has the form
\[I(x)=\underset{y:F(y)=x}{\inf }J(y).\]
In the sequel, we use two different representations of our particular family ${X}^{\varepsilon }$ as an image of a certain family whose LDP is well understood; however, the functions F in these representations fail to be continuous. Within such a framework, the following general lemma appears quite useful. We denote by $\rho _{\mathbb{X}},\rho _{\mathbb{Y}}$ the metrics in $\mathbb{X},\mathbb{Y}$ and by $\varLambda _{F}$ the set of continuity points of a mapping $F:\mathbb{Y}\to \mathbb{X}$. Note that $\varLambda _{F}$ is Borel measurable; see Appendix II in [3].
Lemma 1.
Let a family $\{{Y}^{\varepsilon }\}$ satisfy the LDP with speed function $r(\varepsilon )$ and rate function J. Assume also that
Then, for any $x\in \mathbb{X}$,
(6)
\[\underset{\delta \to 0}{\lim }\underset{\varepsilon \to 0}{\limsup }r(\varepsilon )\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(x)\big)\le -I_{\mathit{upper}}(x),\]
(7)
\[\underset{\delta \to 0}{\lim }\underset{\varepsilon \to 0}{\liminf }r(\varepsilon )\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(x)\big)\ge -I_{\mathit{lower}}(x)\]
with
\[I_{\mathit{upper}}(x)=\underset{\delta \to 0}{\lim }\underset{\gamma \to 0}{\lim }\underset{y\in \varXi _{\gamma ,\delta }(x)}{\inf }J(y),\hspace{2em}I_{\mathit{lower}}(x)=\underset{\delta \to 0}{\lim }\underset{y\in \varTheta _{\delta }(x)}{\inf }J(y),\]
where
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \varTheta _{\delta }(x)& \displaystyle =\big\{y\in \varLambda _{F}:\rho _{\mathbb{X}}\big(x,F(y)\big)<\delta \big\},\\{} \displaystyle \varXi _{\gamma ,\delta }(x)& \displaystyle =\big\{y\in \mathbb{Y}:\rho _{\mathbb{Y}}\big(y,\varTheta _{\delta }(x)\big)<\gamma \big\}.\end{array}\]
Proof.
We have
\[\mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(x)\big)=\mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(x),{Y}^{\varepsilon }\in \varLambda _{F}\big)=\mathbf{P}\big({Y}^{\varepsilon }\in \varTheta _{\delta }(x)\big).\]
Thus, the upper bound in the LDP for $\{{Y}^{\varepsilon }\}$ gives
\[\underset{\varepsilon \to 0}{\limsup }r(\varepsilon )\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(x)\big)\le -\inf \big\{J(y),y\in \bar{\varTheta }_{\delta }(x)\big\},\]
where $\bar{\varTheta }_{\delta }(x)$ denotes the closure of $\varTheta _{\delta }(x)$. Since $\bar{\varTheta }_{\delta }(x)\subset \varXi _{\gamma ,\delta }(x)$ for any $\gamma >0$, this provides (6). The proof of (7) is even simpler: for any $y\in \varTheta _{\delta }(x)$, there exists $r>0$ such that the image of the ball $B_{r}(y)$ under F is contained in $B_{\delta }(x)$, which yields
\[\underset{\varepsilon \to 0}{\liminf }r(\varepsilon )\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(x)\big)\ge \underset{\varepsilon \to 0}{\liminf }r(\varepsilon )\log \mathbf{P}\big({Y}^{\varepsilon }\in B_{r}(y)\big)\ge -J(y).\]
□
Lemma 1 is a simplified and more precise version of Lemma 4 in [12]. The functions $I_{\mathit{upper}}$, $I_{\mathit{lower}}$ are lower semicontinuous: we can show this easily using that, for any sequence $x_{n}\to x$, the sets $\varTheta _{\delta /2}(x_{n})$ are embedded into $\varTheta _{\delta }(x)$ for n large enough (see, e.g., Proposition 3 in [12]). In fact, Lemma 1 says that for an arbitrary image of a family $\{{Y}^{\varepsilon }\}$, one part of an LDP (the upper bound) holds with one rate function, whereas the other part (the lower bound) holds with another rate function. This is our reason to call (6) and (7) the upper and the lower semicontraction principles. To prove (5), it suffices to verify the inequalities
(8)
\[I_{\mathit{lower}}(x)\le I(x),\hspace{2em}I(x)\le I_{\mathit{upper}}(x),\hspace{1em}x\in \mathbb{X}.\]
We refer to [12] for more discussion and an example where the pair of semicontraction principles does not provide an LDP.
2.2 Lower semicontinuity of I
In this section, we prove directly that the functional I specified in Theorem 1 is lower semicontinuous, that is, it is indeed a rate functional. This will explain the particular choice of the modified functions $\bar{a},\bar{\sigma }$. In addition, this will simplify the proofs, where we will use the representation for $I(x)$ presented further.
Define
\[S(x)={\int _{0}^{x}}\frac{\bar{a}(z)}{{\bar{\sigma }}^{2}(z)}\hspace{0.1667em}dz,\hspace{1em}x\in \mathbb{R}.\]
Then $I(f)$, if it is finite, can be represented as
(9)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle I(f)& \displaystyle =\frac{1}{2}{\int _{0}^{T}}\frac{{(\dot{f}_{t})}^{2}}{{\bar{\sigma }}^{2}(f_{t})}\hspace{0.1667em}dt+\big[S(f_{0})-S(f_{T})\big]+\frac{1}{2}{\int _{0}^{T}}\frac{{\bar{a}}^{2}(f_{t})}{{\bar{\sigma }}^{2}(f_{t})}\hspace{0.1667em}dt\\{} & \displaystyle =:I_{1}(f)+I_{2}(f)+I_{3}(f).\end{array}\]
The function S is continuous; hence, the functional $I_{2}$ is continuous as well. The function ${\bar{a}}^{2}/{\bar{\sigma }}^{2}$ is lower semicontinuous by the choice of $\bar{a},\bar{\sigma }$; thus, the functional $I_{3}$ is lower semicontinuous. Finally, we can represent $I_{1}$ in the form $I_{1}(f)=I_{0}(\varSigma (f_{\cdot }))$, where the function $\varSigma $ is continuous and the functional $I_{0}$ is known to be lower semicontinuous (this is just the rate functional for the family $\{\varepsilon W\}$). Hence, $I_{1}$ is lower semicontinuous, which completes the proof of the statement.
3 Proof of Theorem 1
3.1 Exponential tightness and the weak LDP
In this section, we prove that the family $\{{X}^{\varepsilon }\}$ is exponentially tight with the speed function $r(\varepsilon )={\varepsilon }^{2}$. Note that
\[{M_{t}^{\varepsilon }}:={\int _{0}^{t}}\sigma \big({X_{s}^{\varepsilon }}\big)\hspace{0.1667em}dW_{s}\]
is a continuous martingale with the quadratic characteristics
\[\big\langle {M}^{\varepsilon }\big\rangle _{t}={\int _{0}^{t}}{\sigma }^{2}\big({X_{s}^{\varepsilon }}\big)\hspace{0.1667em}ds\le Ct;\]
see (4). Recall that ${M}^{\varepsilon }$ can be represented as a Wiener process with the time change $t\mapsto \langle {M}^{\varepsilon }\rangle _{t}$; see, for example, [10], II. §7. Then, for each R,
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \underset{\varepsilon \to 0}{\limsup }{\varepsilon }^{2}\log \mathbf{P}\Big(\underset{t\in [0,T]}{\sup }\varepsilon \big|{M_{t}^{\varepsilon }}\big|>R\Big)\\{} & \displaystyle \hspace{1em}\le \underset{\varepsilon \to 0}{\limsup }{\varepsilon }^{2}\log \mathbf{P}\Big(\underset{t\in [0,CT]}{\sup }\varepsilon |W_{t}|>R\Big)=-\frac{{R}^{2}}{2CT}.\end{array}\]
On the other hand, for each $\omega \in \varOmega $ such that $\varepsilon |{M_{t}^{\varepsilon }}(\omega )|\le R$ for all $t\in [0,T]$, the corresponding trajectory of ${X}^{\varepsilon }$ satisfies
\[\big|{X_{t}^{\varepsilon }}\big|\le |x_{0}|+C{\int _{0}^{t}}\big(1+\big|{X_{s}^{\varepsilon }}\big|\big)\hspace{0.1667em}ds+R,\]
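and therefore, by the Gronwall inequality,
\[\underset{t\in [0,T]}{\sup }\big|{X_{t}^{\varepsilon }}\big|\le \big(|x_{0}|+CT+R\big){e}^{CT}.\]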
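Therefore, for any $Q>0$, there exists R such that
\[\underset{\varepsilon \to 0}{\limsup }{\varepsilon }^{2}\log \mathbf{P}\Big(\underset{t\in [0,T]}{\sup }\big|{X_{t}^{\varepsilon }}\big|>R\Big)\le -Q.\]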
Next, recall the Arzelà–Ascoli theorem: for a closed set $K\subset C(0,T)$ to be compact, it is necessary and sufficient that it is bounded and equicontinuous. The family $\varepsilon {M}^{\varepsilon }$ is represented as a time changed family $\varepsilon {W}^{\varepsilon }$, where each ${W}^{\varepsilon }$ is a Wiener process, and the derivative of $\langle {M}^{\varepsilon }\rangle _{t}$ is bounded by C. Using these observations, it is easy to deduce the exponential tightness for $\{\varepsilon {M}^{\varepsilon }\}$ using the well-known fact that the family $\{\varepsilon W\}$ is exponentially tight. On the other hand, for any ω such that the trajectory of ${X_{t}^{\varepsilon }}$ is bounded by R, the corresponding trajectory of the process ${X_{t}^{\varepsilon }}-\varepsilon {M_{t}^{\varepsilon }}$ satisfies the Lipschitz condition w.r.t. t with the constant $C(1+R)$. Combined with the previous calculation, this easily yields the required exponential tightness.
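The Gaussian tail asymptotics used above for $\sup \varepsilon |W_{t}|$ can be checked numerically. The sketch below relies on the reflection principle for the one-sided supremum, $\mathbf{P}(\sup _{t\le S}W_{t}>x)=2\mathbf{P}(W_{S}>x)$; the two-sided probability differs from it by a factor between 1 and 2, which does not affect the logarithmic limit. The function name `scaled_log_tail` and the parameter values are assumptions made for this illustration.

```python
import math


def scaled_log_tail(R, S, eps):
    """Return eps^2 * log P(sup_{t <= S} W_t > R / eps).

    By the reflection principle the probability equals erfc(x / sqrt(2 S))
    with x = R / eps.  For very small eps the value of erfc underflows in
    double precision, so eps should not be taken too small here.
    """
    x = R / eps
    return eps ** 2 * math.log(math.erfc(x / math.sqrt(2.0 * S)))


R, S = 1.0, 1.0                       # S plays the role of C*T in the text
limit = -R ** 2 / (2.0 * S)           # the predicted limit -R^2 / (2 C T)
for eps in (0.2, 0.1, 0.05):
    print(eps, scaled_log_tail(R, S, eps), limit)
```

The printed values approach $-{R}^{2}/(2S)$ as $\varepsilon $ decreases, which is the rate appearing in the estimate for $\varepsilon {M}^{\varepsilon }$.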
In what follows, we proceed with the proof of (5). Since now the state space $\mathbb{X}=C(0,T)$ is specified, we change the notation and denote the points in this space by $f,g,\dots \hspace{0.1667em}$. Since the set $B_{1}(f)$ is bounded, the law of ${X}^{\varepsilon }$ restricted to any $B_{\delta }(f)$, $\delta \le 1$, does not change if we change the coefficients $a,\sigma $ on the intervals $(-\infty ,-R],[R,\infty )$ with $R>0$ large enough. Hence, we furthermore assume the coefficients $a,\sigma $ to be constant on such intervals for some R.
3.2 Case I. Piecewise constant $a,\sigma $ with one discontinuity point
We proceed with the further proof in a step-by-step way, increasing gradually the classes of the coefficients $a,\sigma $ for which the corresponding LDP is proved. First, let $a,\sigma $ be constant on the intervals $(-\infty ,z)$ and $(z,\infty )$ with some $z\in \mathbb{R}$. Without loss of generality, we can assume that $z=0$. Then we can use Theorem 2.2 [11], where the LDP with the speed function $r(\varepsilon )={\varepsilon }^{2}$ is established for the pair $({X}^{\varepsilon },{Z}^{\varepsilon })$ with
\[{Z_{t}^{\varepsilon }}={\int _{0}^{t}}1_{(0,\infty )}\big({X_{s}^{\varepsilon }}\big)\hspace{0.1667em}ds,\hspace{1em}t\in [0,T].\]
The corresponding rate function in [11] is given in the following form. Denote $a_{\pm }=a(0\pm ),\sigma _{\pm }=\sigma (0\pm )$ and define the class $H(f)$ of functions $\psi \in \mathit{AC}(0,T)$ such that
\[\dot{\psi }_{t}\hspace{1em}\left\{\begin{array}{l@{\hskip10.0pt}l}=0,& f_{t}<0;\\{} =1,& f_{t}>0;\\{} \in [0,1],& f_{t}=0.\end{array}\right.\]
Then the rate functional for $({X}^{\varepsilon },{Z}^{\varepsilon })$ equals
\[I(f,\psi )={\int _{0}^{T}}L(f_{t},\dot{f}_{t},\dot{\psi }_{t})\hspace{0.1667em}dt\]
with
\[L(x,y,z)=\left\{\begin{array}{l@{\hskip10.0pt}l}\frac{{(y-a(x))}^{2}}{{\sigma }^{2}(x)},& x\ne 0;\\{} \frac{{(a_{+}z+a_{-}(1-z))}^{2}}{{\sigma _{+}^{2}}z+{\sigma _{-}^{2}}(1-z)},& x=0,\hspace{0.1667em}\frac{a_{-}}{{\sigma _{-}^{2}}}>\frac{a_{+}}{{\sigma _{+}^{2}}};\\{} \frac{{a_{+}^{2}}}{{\sigma _{+}^{2}}}z+\frac{{a_{-}^{2}}}{{\sigma _{-}^{2}}}(1-z),& x=0,\hspace{0.1667em}\frac{a_{-}}{{\sigma _{-}^{2}}}\le \frac{a_{+}}{{\sigma _{+}^{2}}}\end{array}\right.\]
for all pairs $(f,\psi )$ such that $f\in \mathit{AC}(0,T),f_{0}=x_{0},\psi \in H(f)$ and, for all other pairs, $I(f,\psi )=\infty $.
From this result, using the contraction principle (see Section 2), we easily derive the LDP for ${X}^{\varepsilon }$ with the rate function
\[I(f)=\underset{\psi \in H(f)}{\inf }I(f,\psi )={\int _{0}^{T}}L(f_{t},\dot{f}_{t})\hspace{0.1667em}dt,\hspace{2em}L(x,y)=\underset{z\in [0,1]}{\inf }L(x,y,z)\]
for $f\in \mathit{AC}(0,T)$, $f_{0}=x_{0}$ and $I(f)=\infty $ otherwise. Now only a minor analysis is required to show that this rate function actually coincides with that specified in Theorem 1. First, we observe that
This is obvious if either $a_{-}/{\sigma _{-}^{2}}\le a_{+}/{\sigma _{+}^{2}}$ or $a_{-}>0$, $a_{+}<0$. In the case where $a_{-}/{\sigma _{-}^{2}}>a_{+}/{\sigma _{+}^{2}}$ and $a_{-}$, $a_{+}$ have the same sign, we can verify directly that ${L^{\prime }_{z}}(0,y,z)$ has the same sign for all $z\in [0,1]$, which completes the proof of the required identity.
We will use repeatedly the following fact, which follows easily from the change-of-variables formula: for any $f\in \mathit{AC}(0,T)$ and any set $A\subset \mathbb{R}$ with zero Lebesgue measure, the Lebesgue measure of the set
(10)
\[\big\{t\in [0,T]:f_{t}\in A,\hspace{0.1667em}\dot{f}_{t}\ne 0\big\}\]
is zero; see, for example, Lemma 1 in [13]. Applying (10) with $A=\{0\}$, we conclude that in the above expression for $I(f)$, the function L can be changed to
which completes the proof of Theorem 1 in this case.
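The infimum over $z\in [0,1]$ that defines $L(0,y)$ can be explored numerically. The following sketch evaluates the two branches of $L(0,y,z)$ displayed above on a uniform grid in z; the function names and the grid search are assumptions of this illustration and are not taken from [11].

```python
def L_at_zero(z, a_minus, a_plus, s_minus, s_plus):
    """Value of L(0, y, z); at x = 0 the expression does not depend on y."""
    if a_minus / s_minus ** 2 > a_plus / s_plus ** 2:
        num = (a_plus * z + a_minus * (1.0 - z)) ** 2
        den = s_plus ** 2 * z + s_minus ** 2 * (1.0 - z)
        return num / den
    # Otherwise the expression is linear in z.
    return a_plus ** 2 / s_plus ** 2 * z + a_minus ** 2 / s_minus ** 2 * (1.0 - z)


def inf_over_z(a_minus, a_plus, s_minus, s_plus, n=1000):
    """Grid approximation of the infimum of L(0, y, z) over z in [0, 1]."""
    return min(L_at_zero(i / n, a_minus, a_plus, s_minus, s_plus)
               for i in range(n + 1))


# Drift pointing towards 0 from both sides: the infimum vanishes.
print(inf_over_z(1.0, -1.0, 1.0, 1.0))
# Linear branch: the infimum is attained at an endpoint of [0, 1].
print(inf_over_z(-1.0, 2.0, 1.0, 1.0))
# Same-sign drifts in the quadratic branch: again an endpoint value.
print(inf_over_z(2.0, 1.0, 1.0, 1.0))
```

In these three examples the infimum is either zero or the smaller of the two one-sided values ${a_{\pm }^{2}}/{\sigma _{\pm }^{2}}$, which matches the case analysis given in the text.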
3.3 Case II. Piecewise constant $a,\sigma $
Let, for some $z_{1}<\cdots <z_{m}$, the functions $a,\sigma $ be constant on the intervals $(-\infty ,z_{1})$, $(z_{1},z_{2}),$ $\dots ,(z_{m},\infty )$. Assume that $x_{0}\notin \{z_{k},k=1,\dots ,m\}$, which does not restrict the generality of the construction given further, and define the functions $a_{k},\sigma _{k},k=0,\dots ,m$ by
\[\begin{array}{r@{\hskip0pt}l}\displaystyle a_{k}(x)& \displaystyle =\left\{\begin{array}{l@{\hskip10.0pt}l}a(z_{k}-),& x<z_{k},\\{} a(z_{k}+),& x\ge z_{k},\end{array}\right.\\{} \displaystyle \sigma _{k}(x)& \displaystyle =\left\{\begin{array}{l@{\hskip10.0pt}l}\sigma (z_{k}-),& x<z_{k},\\{} \sigma (z_{k}+),& x\ge z_{k},\end{array}\right.\hspace{1em}k=1,\dots ,m.\end{array}\]
Consider a family of independent processes ${Y}^{0,\varepsilon },{Y}^{n,k,\varepsilon }$, $k=1,\dots ,m$, $n\ge 1$, such that ${Y}^{0,\varepsilon }$ solves SDE (1) with the coefficients $a_{0},\sigma _{0}$ and each ${Y}^{n,k,\varepsilon }$ solves a similar SDE with the coefficients $a_{k},\sigma _{k}$ and the initial value $z_{k}$. Define iteratively the process ${\tilde{X}}^{\varepsilon }$ in the following way: put ${\tilde{X}}^{\varepsilon }$ equal to ${Y}^{0,\varepsilon }$ until the time moment
\[\tau _{1}=\inf \big\{t:{Y_{t}^{0,\varepsilon }}\in \{z_{k},k=1,\dots ,m\}\big\}.\]
Define the random index $\kappa _{1}\in \{1,\dots ,m\}$ such that ${Y_{\tau _{1}}^{0,\varepsilon }}=z_{\kappa _{1}}$. Then put ${\tilde{X}_{t}^{\varepsilon }}={Y_{t-\tau _{1}}^{1,\kappa _{1},\varepsilon }}$ until the first time moment $\tau _{2}$ when this process hits $\{z_{k},k=1,\dots ,m\}\setminus \{z_{\kappa _{1}}\}$. Iterating this procedure, we get a process ${\tilde{X}_{t}^{\varepsilon }}$ with
\[{\tilde{X}_{t}^{\varepsilon }}={Y_{t}^{0,\varepsilon }},\hspace{1em}t\le \tau _{1},\hspace{2em}{\tilde{X}_{t}^{\varepsilon }}={Y_{t-\tau _{n}}^{n,\kappa _{n},\varepsilon }},\hspace{1em}t\in [\tau _{n},\tau _{n+1}],\hspace{2.5pt}n\ge 1.\]
It follows from the strong Markov property of ${X}^{\varepsilon }$ that ${\tilde{X}}^{\varepsilon }$ has the same law as ${X}^{\varepsilon }$. Hence, the given construction in fact represents the law of ${X}^{\varepsilon }$ as the image of the joint law of the family of independent processes ${Y}^{0,\varepsilon },{Y}^{n,k,\varepsilon }$, $k=1,\dots ,m$, $n\ge 1$. Each of these processes is a solution to (1) with corresponding coefficients having at most one discontinuity point; hence, the LDP for them is provided in the previous section. Our idea is to deduce the LDP for ${X}^{\varepsilon }$ via a version of the contraction principle. With this idea in mind, we first perform a simplification of the above representation. For some N (the choice of N will be discussed below), we consider the space $\mathbb{Y}=C{(0,T)}^{1+mN}$ and construct a function $F:\mathbb{Y}\to \mathbb{X}=C(0,T)$ in the following way. For $y=({y}^{0},{y}^{n,k},k=1,\dots ,m,n=1,\dots ,N)$, we first define $\tau _{1}(y)=\inf \{t:{y_{t}^{0}}\in \{z_{k},k=1,\dots ,m\}\}$ with the usual convention that $\inf \varnothing =T.$ The function $[F(y)]_{t}$, $t\in [0,T]$, is defined to be equal to ${y_{t}^{0}}$ for $t\le \tau _{1}(y)$. If $\tau _{1}(y)<T$, then the construction is iterated: we define $\kappa _{1}(y)$ by ${y_{\tau _{1}(y)}^{0}}=z_{\kappa _{1}(y)}$ and put, for $t\ge \tau _{1}(y)$, $[F(y)]_{t}$ equal to ${y_{t-\tau _{1}(y)}^{1,\kappa _{1}(y)}}$ up to the first moment $\tau _{2}(y)$ when this function hits $\{z_{k},k=1,\dots ,m\}\setminus \{z_{\kappa _{1}(y)}\}$. We iterate this procedure at most N times; that is, if $\tau _{N}(y)<T$, then we put
\[\big[F(y)\big]_{t}={y_{t-\tau _{N}(y)}^{N,\kappa _{N}(y)}},\hspace{1em}t\in \big[\tau _{N}(y),T\big].\]
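The pasting procedure that defines F can be sketched on discretized trajectories as follows. This is only an illustration under stated assumptions: all paths are sampled on one uniform time grid, a hit of a level is detected when two consecutive grid values lie on opposite sides of it (or on it), and the names `first_crossing`, `patchwork`, `pieces` are hypothetical.

```python
def first_crossing(path, levels, skip=None):
    """Return (i, k): the first grid index i >= 1 at which `path` touches or
    crosses levels[k] with k != skip, or (None, None) if this never happens."""
    for i in range(1, len(path)):
        for k, z in enumerate(levels):
            if k != skip and (path[i - 1] - z) * (path[i] - z) <= 0:
                return i, k
    return None, None


def patchwork(y0, pieces, levels, n_max):
    """Discrete analogue of F: follow y0 until it hits a level z_k, then follow
    pieces[(1, k)] (a path started at z_k) until it hits a different level,
    and so on, switching at most n_max times."""
    m = len(y0)
    out = []
    current, skip, n = y0, None, 0
    while True:
        segment = current[:m - len(out)]           # part that still fits in [0, T]
        i, k = first_crossing(segment, levels, skip) if n < n_max else (None, None)
        if i is None:
            out.extend(segment)                    # no further switch before T
            return out
        out.extend(segment[:i])                    # keep values before the hit
        n += 1
        current, skip = pieces[(n, k)], k          # restart from the level z_k


levels = [0.0, 1.0]
y0 = [0.5, 0.75, 1.0, 1.25, 1.5]
pieces = {
    (1, 1): [1.0, 0.5, 0.0, -0.5, -1.0],           # starts at z = 1.0
    (2, 0): [0.0, 0.1, 0.2, 0.3, 0.4],             # starts at z = 0.0
}
print(patchwork(y0, pieces, levels, n_max=2))
```

In the example the pasted path follows `y0` until it reaches the level 1.0, then the piece started at 1.0 until it reaches 0.0, and then the piece started at 0.0; only the pieces actually used need to be present in `pieces` here.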
Denote
For any fixed $f\in C(0,T)$, we can choose $\delta _{f}>0$ small enough and $N_{f}$ large enough so that each $g\in B_{\delta _{f}}(f)$ has fewer than $N_{f}$ Δ-oscillations on $[0,T]$. Hence, if in the above construction, N is taken equal to $N_{f}$, then the restriction of the law of ${X}^{\varepsilon }$ to any ball $B_{\delta }(f),\delta \le \delta _{f}$, equals the same restriction of the image of the joint law of the finite family ${Y}^{0,\varepsilon }$, ${Y}^{n,k,\varepsilon }$, $k=1,\dots ,m$, $n=1,\dots ,N$, under the mapping F specified before.
We aim to verify (5), and we argue in the following way. We fix f and choose $N=N_{f}$ as before, so that the laws of ${X}^{\varepsilon }$, restricted to $B_{\delta }(f)$ for δ small, can be obtained as the image under F specified before. Then we prove (8) at this particular point $x=f$, with $I_{\mathit{lower}}$, $I_{\mathit{upper}}$ being constructed by this particular F. This yields the required weak LDP (5).
Within such an argument, we have to treat for any N the image under the corresponding F of the family of laws in $\mathbb{Y}=C{(0,T)}^{1+mN}$, which, according to the result proved in the previous section, satisfies the LDP with the rate function
\[J(y)=\frac{1}{2}{\int _{0}^{T}}\frac{{({\dot{y}_{t}^{0}}-a_{0}({y_{t}^{0}}))}^{2}}{{\sigma _{0}^{2}}({y_{t}^{0}})}\hspace{0.1667em}dt+\frac{1}{2}\sum \limits_{k=1}^{m}\sum \limits_{n=1}^{N}{\int _{0}^{T}}\frac{{({\dot{y}_{t}^{k,n}}-\bar{a}_{k}({y_{t}^{k,n}}))}^{2}}{{\bar{\sigma }_{k}^{2}}({y_{t}^{k,n}})}\hspace{0.1667em}dt\]
for
\[y=\big({y}^{0},{y}^{n,k},k=1,\dots ,m,n=1,\dots ,N\big)\]
such that ${y}^{0},{y}^{n,k}\in \mathit{AC}(0,T)$, ${y_{0}^{0}}=x_{0}$, ${y_{0}^{n,k}}=z_{k}$ and $J(y)=\infty $ otherwise. To apply Lemma 1 in this setting, we first analyze the structure and the properties of the corresponding F.
Each trajectory $f=F(y)\in C(0,T)$ is actually a patchwork, which consists of pieces of trajectories ${y}^{0},{y}^{n,k}$, $k=1,\dots ,m$, $n=1,\dots ,N$: the pasting points are $\tau _{1}(y),\dots ,\tau _{r}(y),$ $r=r(y)\le N$, and after $\tau _{n}(y)$, the (part of the) new trajectory is used with the number $\kappa _{n}(y)$. For a sequence $y_{l}\to y$ in $\mathbb{Y}$, the corresponding sequence of trajectories $f_{l}=F(y_{l})$ may fail to converge to f because the functionals $\tau _{n}(\cdot ),\kappa _{n}(\cdot )$ are not continuous. However, the above “patchwork representation” easily yields the following two facts:
• any limit point $f_{\ast }$ of the sequence $\{f_{l}\}$ possesses a similar representation with the same $y=\lim _{l}y_{l}$ and with the corresponding pasting points ${\tau _{1}^{\ast }},{\tau _{2}^{\ast }},\dots \hspace{0.1667em}$ and numbers ${\kappa _{1}^{\ast }},{\kappa _{2}^{\ast }},\dots \hspace{0.1667em}$ being partial limits of the sequences $\{\tau _{1}(y_{l})\},\{\tau _{2}(y_{l})\},\dots \hspace{0.1667em}$ and $\{\kappa _{1}(y_{l})\},\{\kappa _{2}(y_{l})\},\dots \hspace{0.1667em}$;
• if the functions $\tau _{1}(\cdot ),\tau _{2}(\cdot ),\dots \hspace{0.1667em}$ are continuous at a given point $y\in \mathbb{Y}$, then $y\in \varLambda _{F}$.
Using the first fact, it is now easy to prove the second inequality in (8). If it fails, then for a given f, there exists a sequence $\{y_{l}\}$ such that $F(y_{l})\to f$ and $J(y_{l})\le c<I(f)$. Since the level set $\{y:J(y)\le c\}$ is compact, we can assume without loss of generality that $y_{l}$ converge to some y; recall that J is lower semicontinuous and thus $J(y)\le c$. The function f possesses the above patchwork representation with the trajectories taken from y, some pasting points ${\tau _{1}^{\ast }},\dots ,{\tau _{r}^{\ast }}$, and some numbers ${\kappa _{1}^{\ast }},\dots ,{\kappa _{r}^{\ast }}$. From this representation it is clear that $f\in \mathit{AC}(0,T)$ and $f_{0}=x_{0}$: if this fails, then the same properties fail at least for one trajectory from the family y and thus $J(y)=\infty $, which contradicts $J(y)\le c$. Hence, we have
\[I(f)=\frac{1}{2}{\int _{0}^{T}}\frac{{(\dot{f}_{t}-\bar{a}(f_{t}))}^{2}}{{\bar{\sigma }}^{2}(f_{t})}\hspace{0.1667em}dt=\frac{1}{2}\sum \limits_{n=1}^{r+1}{\int _{{\tau _{n-1}^{\ast }}}^{{\tau _{n}^{\ast }}}}\frac{{(\dot{f}_{t}-\bar{a}(f_{t}))}^{2}}{{\bar{\sigma }}^{2}(f_{t})}\hspace{0.1667em}dt,\]
where we put ${\tau _{0}^{\ast }}=0,{\tau _{r+1}^{\ast }}=T$. Let $x_{0}$ be located on some interval $(z_{k-1},z_{k})$, $k=2,\dots ,m$, say, $x_{0}\in (z_{1},z_{2})$. Then, on the interval $(0,{\tau _{1}^{\ast }})$, the trajectory f is contained in the segment $[z_{1},z_{2}]$. The functions $a_{0},\sigma _{0}$ are constant and coincide with $\bar{a},\bar{\sigma }$ on $(z_{1},z_{2})$. In addition, $a_{0}=a(z_{1}+)=a(z_{2}-)$, $\sigma _{0}=\sigma (z_{1}+)=\sigma (z_{2}-)$; hence, by the choice of $\bar{a},\bar{\sigma }$ we have
\[\frac{{\bar{a}}^{2}(z)}{{\bar{\sigma }}^{2}(z)}\le \frac{{a_{0}^{2}}(z)}{{\sigma _{0}^{2}}(z)},\hspace{1em}z=z_{1},z_{2}.\]
Then by (10) with $A=\{z_{1},z_{2}\}$ we have
\[\begin{array}{r@{\hskip0pt}l}\displaystyle {\int _{0}^{{\tau _{1}^{\ast }}}}\frac{{(\dot{f}_{t}-\bar{a}(f_{t}))}^{2}}{{\bar{\sigma }}^{2}(f_{t})}\hspace{0.1667em}dt& \displaystyle ={\int _{0}^{{\tau _{1}^{\ast }}}}\bigg(\frac{{(\dot{f}_{t}-a_{0}(f_{t}))}^{2}}{{\sigma _{0}^{2}}(f_{t})}1_{f_{t}\notin A}+\frac{{\bar{a}}^{2}(f_{t})}{{\bar{\sigma }}^{2}(f_{t})}1_{f_{t}\in A}\bigg)\hspace{0.1667em}dt\\{} & \displaystyle \le {\int _{0}^{{\tau _{1}^{\ast }}}}\bigg(\frac{{(\dot{f}_{t}-a_{0}(f_{t}))}^{2}}{{\sigma _{0}^{2}}(f_{t})}1_{f_{t}\notin A}+\frac{{a_{0}^{2}}(f_{t})}{{\sigma _{0}^{2}}(f_{t})}1_{f_{t}\in A}\bigg)\hspace{0.1667em}dt\\{} & \displaystyle ={\int _{0}^{{\tau _{1}^{\ast }}}}\frac{{(\dot{f}_{t}-a_{0}(f_{t}))}^{2}}{{\sigma _{0}^{2}}(f_{t})}\hspace{0.1667em}dt={\int _{0}^{{\tau _{1}^{\ast }}}}\frac{{({\dot{y}_{t}^{0}}-a_{0}({y_{t}^{0}}))}^{2}}{{\sigma _{0}^{2}}({y_{t}^{0}})}\hspace{0.1667em}dt.\end{array}\]
Analogous inequalities hold on each of the time intervals $({\tau _{n}^{\ast }},{\tau _{n+1}^{\ast }}),$ $n=1,\dots ,r$, with $a_{0},\sigma _{0}$ changed to $\bar{a}_{{\kappa _{n}^{\ast }}},\bar{\sigma }_{{\kappa _{n}^{\ast }}}$ (the proof is similar and omitted). Thus,
(11)
\[\begin{array}{r@{\hskip0pt}l}\displaystyle I(f)& \displaystyle \le \frac{1}{2}{\int _{0}^{{\tau _{1}^{\ast }}}}\frac{{({\dot{y}_{t}^{0}}-a_{0}({y_{t}^{0}}))}^{2}}{{\sigma _{0}^{2}}({y_{t}^{0}})}\hspace{0.1667em}dt+\frac{1}{2}\sum \limits_{n=1}^{r}{\int _{{\tau _{n}^{\ast }}}^{{\tau _{n+1}^{\ast }}}}\frac{{({\dot{y}_{t}^{{\kappa _{n}^{\ast }},n}}-\bar{a}_{{\kappa _{n}^{\ast }}}({y_{t}^{{\kappa _{n}^{\ast }},n}}))}^{2}}{{\bar{\sigma }_{{\kappa _{n}^{\ast }}}^{2}}({y_{t}^{{\kappa _{n}^{\ast }},n}})}\hspace{0.1667em}dt\\{} & \displaystyle \le \frac{1}{2}{\int _{0}^{T}}\frac{{({\dot{y}_{t}^{0}}-a_{0}({y_{t}^{0}}))}^{2}}{{\sigma _{0}^{2}}({y_{t}^{0}})}\hspace{0.1667em}dt+\frac{1}{2}\sum \limits_{k=1}^{m}\sum \limits_{n=1}^{N}{\int _{0}^{T}}\frac{{({\dot{y}_{t}^{k,n}}-\bar{a}_{k}({y_{t}^{k,n}}))}^{2}}{{\bar{\sigma }_{k}^{2}}({y_{t}^{k,n}})}\hspace{0.1667em}dt=J(y).\end{array}\]
This gives a contradiction with the inequalities $J(y)\le c$ and $I(f)>c$, which completes the proof of the second inequality in (8).
The first inequality in (8) holds immediately for f such that $I(f)=\infty $. We fix f with $I(f)<\infty $ and $\gamma >0$ and construct $y_{\gamma }$ such that $F(y_{\gamma })=f$, the functions $\tau _{1}(\cdot ),\tau _{2}(\cdot ),\dots \hspace{0.1667em}$ are continuous at $y_{\gamma }$, and $J(y_{\gamma })\le I(f)+\gamma $. Since then $y_{\gamma }\in \varLambda _{F}$ by the second fact above, this gives $I_{\mathit{lower}}(f)\le I(f)+\gamma $ for every $\gamma >0$ and completes the proof of (8).
The construction explained gives a cue for the choice of $y=y_{\gamma }$ (we omit the index γ to simplify the notation). We put ${y}^{0}$ equal to f until its first time moment ${\tau _{1}^{\ast }}$ of hitting the set $\{z_{1},z_{2}\}$ (we still assume that $x_{0}\in (z_{1},z_{2})$). Then we extend ${y}^{0}$ to the entire time interval $[0,T]$, and we aim to make the integral
(12)
\[{\int _{{\tau _{1}^{\ast }}}^{T}}\frac{{({\dot{y}_{t}^{0}}-a_{0}({y_{t}^{0}}))}^{2}}{{\sigma _{0}^{2}}({y_{t}^{0}})}\hspace{0.1667em}dt\]
small enough; that is, to make small the error in the second inequality in (11), which arises because of the integral of ${y}^{0}$. If we put ${y_{t}^{0}}={y_{{\tau _{1}^{\ast }}}^{0}}+a_{0}(t-{\tau _{1}^{\ast }})$ for $t\ge {\tau _{1}^{\ast }}$, then we obtain a trajectory on which the integral (12) equals zero; we call such a trajectory a zero-energy one. However, under such a choice, we may fail to meet our other requirement that $\tau _{1}(\cdot )$ should be continuous at the point y. It is easy to verify that for such a continuity, it suffices that ${y}^{0}$, if it hits $\{z_{1},z_{2}\}$ at a point, say, $z_{1}$, takes values both from $(-\infty ,z_{1})$ and $(z_{1},\infty )$ on every interval $({\tau _{1}^{\ast }},{\tau _{1}^{\ast }}+\delta ),\hspace{2.5pt}\delta >0$. We can perturb the zero-energy trajectory introduced above on a small time interval near ${\tau _{1}^{\ast }}$ in such a way that this new trajectory possesses the continuity property explained before, and the integral (12) is $\le \gamma /N$.
Then we iterate this procedure. Observe that, for any k, by the construction of the function $\bar{a}_{k}$ there exists at least one corresponding zero-energy trajectory with the initial value $z_{k}$, which now is defined as a solution to the ODE
\[\dot{y}_{t}=\bar{a}_{k}(y_{t}),\hspace{1em}y_{0}=z_{k}.\]
We have ${\kappa _{1}^{\ast }}$ uniquely determined by the trajectory f (in fact, by the part of this trajectory up to time ${\tau _{1}^{\ast }}$). For $k\ne {\kappa _{1}^{\ast }}$, we define ${y}^{k,1}$ as the zero-energy trajectory on $[0,T]$ that starts from $z_{k}$ and corresponds to the coefficient $\bar{a}_{k}$. All these trajectories are “phantom” in the sense that they neither are involved in the representation of f through y nor contribute to $J(y)$. For $k={\kappa _{1}^{\ast }}$, we define ${y}^{k,1}$ similarly as before: it equals $f_{t+{\tau _{1}^{\ast }}}$ for $t\le {\tau _{2}^{\ast }}-{\tau _{1}^{\ast }}$, and afterwards it is defined as a perturbation of a zero-energy trajectory that makes $\tau _{2}(\cdot )$ continuous at y and
\[{\int _{{\tau _{2}^{\ast }}-{\tau _{1}^{\ast }}}^{T}}\frac{{({\dot{y}_{t}^{k,1}}-\bar{a}_{k}({y_{t}^{k,1}}))}^{2}}{{\bar{\sigma }_{k}^{2}}({y_{t}^{k,1}})}\hspace{0.1667em}dt\le \frac{\gamma }{N}.\]
Repeating this construction $\le N$ times, we finally get the required function $y=y_{\gamma }$. This completes the proof of (8) and thus of (5). Together with the exponential tightness proved in Section 3.1, this completes the proof of the LDP in this case.
3.4 Case III. Piecewise constant $a/{\sigma }^{2}$, general σ
In this section, we remove the assumption that $a,\sigma $ are piecewise constant, while keeping this assumption for $a/{\sigma }^{2}$; we also assume that $a,\sigma $ are constant on $(-\infty ,-R]$ and $[R,\infty )$ for some R. Our basic idea is to represent $\{{X}^{\varepsilon }\}$ as the image of a family $\{{Y}^{\varepsilon }\}$ under a time change transformation and then to use the semicontraction principles. The same approach was used in [13], where the LDP was established for a solution of (1) with $a\equiv 0$; in this case, ${Y}^{\varepsilon }$ was taken in the form ${Y_{t}^{\varepsilon }}=x_{0}+\varepsilon W_{t}$. In our current setting, the choice of the coefficients for the SDE that defines ${Y}^{\varepsilon }$ should take into account the common discontinuity points for $a/\sigma $ and σ. This becomes visible both from an analysis of the proof of Theorem 1 in [13] and from the definition of the functions $\bar{a},\bar{\sigma }$, which combines the left- and right-hand values of both a and σ at the discontinuity points. The proper choice of the family is explained below. Some parts of the arguments are similar to those in [13]. We omit detailed proofs whenever it is possible to give a reference to [13] and focus on the genuinely new points.
We assume $a/{\sigma }^{2}$ to be piecewise constant with discontinuity points $z_{1}<\cdots <z_{m}$ and put (with the convention $\prod _{\varnothing }=1$)
\[\tilde{\sigma }(x)=\prod \limits_{k:z_{k}\le x}\frac{\sigma (z_{k}+)}{\sigma (z_{k}-)},\hspace{2em}\upsilon (x)=\frac{\tilde{\sigma }(x)}{\sigma (x)},\hspace{2em}\tilde{a}(x)=a(x){\upsilon }^{2}(x).\]
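The definitions above can be checked numerically on a toy example. The following Python sketch is only an illustration: the jump points `Z`, the functions `sigma` and `ratio` (standing for $\sigma $ and $a/{\sigma }^{2}$), and the step `h` used to approximate one-sided limits are hypothetical choices, not taken from the text. Under these assumptions, it builds $\tilde{\sigma }$, $\upsilon $, $\tilde{a}$ and lets one observe that $\upsilon $ has no jump at the points $z_{k}$ and that $\tilde{a}/{\tilde{\sigma }}^{2}$ coincides with $a/{\sigma }^{2}$.

```python
import math

# Hypothetical example data (an assumption, not from the text):
# a / sigma^2 is piecewise constant with jumps at the points in Z.
Z = [0.0, 1.0]

def sigma(x):
    """Hypothetical diffusion coefficient with jumps at 0 and 1."""
    base = 1.0 if x < 0.0 else (2.0 if x < 1.0 else 0.5)
    return base + 0.1 * math.sin(x)

def ratio(x):
    """Hypothetical piecewise constant function a / sigma^2."""
    return 1.0 if x < 0.0 else (-1.0 if x < 1.0 else 0.5)

def a(x):
    """Drift recovered from the ratio: a = (a / sigma^2) * sigma^2."""
    return ratio(x) * sigma(x) ** 2

def sigma_tilde(x, h=1e-9):
    """Product of sigma(z_k+) / sigma(z_k-) over all z_k <= x.

    The one-sided limits are approximated by evaluating sigma at z_k +/- h.
    """
    prod = 1.0
    for z in Z:
        if z <= x:
            prod *= sigma(z + h) / sigma(z - h)
    return prod

def upsilon(x):
    """upsilon = sigma_tilde / sigma."""
    return sigma_tilde(x) / sigma(x)

def a_tilde(x):
    """a_tilde = a * upsilon^2."""
    return a(x) * upsilon(x) ** 2
```

In this toy setting, `upsilon(z - 1e-7)` and `upsilon(z + 1e-7)` agree up to the approximation error for each `z` in `Z`, although `sigma` itself jumps there.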
Under such a choice, $\tilde{\sigma }=\sigma \upsilon $, and thus the function $\tilde{a}/{\tilde{\sigma }}^{2}$ equals $a/{\sigma }^{2}$ and is constant on each of the intervals $(-\infty ,z_{1}),\dots ,(z_{m},\infty )$. By construction, $\tilde{\sigma }$ is constant on these intervals as well; hence, $\tilde{a},\tilde{\sigma }$ fit the case studied in the previous section, and the required LDP holds for the family ${Y}^{\varepsilon }$ of the solutions to (1) with these coefficients and ${Y_{0}^{\varepsilon }}=x_{0}$. This construction yields also the following property, which will be important below: the function $a=(a/{\sigma }^{2}){\sigma }^{2}$ does not change its sign on each of the intervals $(-\infty ,z_{1}),\dots ,(z_{m},\infty )$. Hence, denoting $B={a}^{2}/{\sigma }^{2}$ and $\bar{B}={\bar{a}}^{2}/{\bar{\sigma }}^{2}$, we get
Fix $\varepsilon >0$ and define
\[\eta _{t}={\int _{0}^{t}}{\upsilon }^{2}\big({Y_{s}^{\varepsilon }}\big)\hspace{0.1667em}ds,\hspace{1em}t\ge 0,\]
$\tau _{t}={[\eta ]_{t}^{-1}}$ (the inverse function w.r.t. t), and ${X_{t}^{\varepsilon }}={Y_{\tau _{t}}^{\varepsilon }}$. Then ${X}^{\varepsilon }$ is a weak solution to (1) with ${X_{0}^{\varepsilon }}=x_{0}$; see [10], IV §7.
In the above construction, $\eta _{t}\ge {c}^{2}t$ and thus $\tau _{t}\le {c}^{-2}t$; see (4). We put $\tilde{T}={c}^{-2}T$, $\mathbb{Y}=C(0,\tilde{T})$, and define ${Y}^{\varepsilon }$ as a family of solutions to (1) with the coefficients $\tilde{a},\tilde{\sigma }$ and the time horizon $\tilde{T}$. Then the family ${X}^{\varepsilon }$ possesses a representation ${X}^{\varepsilon }=F({Y}^{\varepsilon })$ with the mapping $F:\mathbb{Y}\to \mathbb{X}$ defined by
\[\big[F(y)\big]_{t}=y_{\tau _{t}(y)},\hspace{2em}\tau _{t}(y)={\big[\eta (y)\big]_{t}^{-1}},\hspace{1em}t\in [0,T],\]
\[\eta _{t}(y)={\int _{0}^{t}}{\upsilon }^{2}(y_{s})\hspace{0.1667em}ds,\hspace{1em}t\in [0,\tilde{T}].\]
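The mapping F can be made concrete by a simple discretization. The sketch below is a hedged illustration only: the function name `time_change`, the left Riemann sum used for $\eta $, and the linear interpolation used to invert $\eta $ and to evaluate y between grid points are assumptions of this example, not part of the construction in the text.

```python
def time_change(y, ds, upsilon, T, n_out):
    """Discretized version of [F(y)]_t = y_{tau_t(y)}.

    y       : samples y_0, y_1, ... of a path on the grid s_j = j * ds
    upsilon : positive function of the space variable
    T       : time horizon of the output path
    n_out   : the output is sampled at t_i = T * i / n_out, i = 0..n_out
    """
    # eta_{s_j}(y): left Riemann sum of upsilon^2(y_s) ds.
    eta = [0.0]
    for value in y[:-1]:
        eta.append(eta[-1] + upsilon(value) ** 2 * ds)
    if eta[-1] < T:
        raise ValueError("input path is too short: eta does not reach T")

    f = []
    j = 0
    for i in range(n_out + 1):
        t = T * i / n_out
        # Find j with eta[j] <= t <= eta[j + 1]; eta is strictly increasing.
        while j < len(eta) - 2 and eta[j + 1] < t:
            j += 1
        # tau_t lies between s_j and s_{j+1}; interpolate y linearly there.
        w = (t - eta[j]) / (eta[j + 1] - eta[j])
        f.append((1.0 - w) * y[j] + w * y[j + 1])
    return f
```

For instance, with the constant choice $\upsilon \equiv 2$ one has $\eta _{t}=4t$ and $\tau _{t}=t/4$, so the path $y_{s}=s$ is mapped to $f_{t}=t/4$.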
Observe that for F to be continuous at a point $y\in \mathbb{Y}$, it suffices that y spends zero time in the set $\varDelta _{\upsilon }$ of the discontinuity points of the function υ; see [13], Lemma 1 and Corollary 1. Now $\varDelta _{\upsilon }\subset \varDelta _{a}\cup \varDelta _{\sigma }$ is at most countable, and it is easy to see that the continuity set $\varLambda _{F}$ has probability 1 w.r.t. the distribution of each ${Y}^{\varepsilon }$, that is, we can apply Lemma 1.
Our further aim is to prove (8) in the above setting, which then would imply (5) and thus prove the LDP. The general idea of the proof is similar to that of Theorem 1 in [13], though particular technicalities differ substantially.
First, for a given $f\in \mathbb{X}$, we describe explicitly the set ${F}^{-1}(\{f\})$. We put
If $f=F(y)$, then
\[\zeta _{t}(f)={\int _{0}^{t}}{\upsilon }^{-2}(y_{\tau _{s}(y)})\hspace{0.1667em}ds={\int _{0}^{\tau _{t}(y)}}{\upsilon }^{-2}(y_{r}){\upsilon }^{2}(y_{r})\hspace{0.1667em}dr=\tau _{t}(y),\hspace{1em}t\in [0,T];\]
here we changed the variables $r=\tau _{s}(y)$ and used that
\[dr={\tau ^{\prime }_{s}}(y)\hspace{0.1667em}ds=\frac{1}{{\upsilon }^{2}(y_{\tau _{s}(y)})}\hspace{0.1667em}ds=\frac{1}{{\upsilon }^{2}(y_{r})}\hspace{0.1667em}ds.\]
Therefore,
Observe that $\zeta _{T}(f)\le {c}^{-2}T=\tilde{T}$ and define
\[\pi _{t}(f)={\big[\zeta (f)\big]_{t}^{-1}}=\inf \big\{r:\zeta _{r}(f)\ge t\big\},\hspace{1em}t\in \big[0,\zeta _{T}(f)\big].\]
Then we conclude that
that is, for any $y\in {F}^{-1}(\{f\})$, the part of its trajectory with $t\le \zeta _{T}(f)$ is uniquely defined. On the other hand, it is easy to show that any $y\in \mathbb{Y}$ satisfying (14) belongs to ${F}^{-1}(\{f\})$.
Next, we denote by $\hat{a},\hat{\sigma }$ the modified coefficients, which correspond to the coefficients $\tilde{a},\tilde{\sigma }$ in the sense explained in Section 1. Since $\tilde{a}=a{\upsilon }^{2}$ and $\tilde{\sigma }=\sigma \upsilon $, we easily see that
(15)
\[\hat{a}(x)=\bar{a}(x){\upsilon }^{2}(x),\hspace{2em}\hat{\sigma }(x)=\bar{\sigma }(x)\upsilon (x)\]
at every continuity point x for υ. Then, for any $y\in \mathit{AC}(0,\tilde{T})$ with $y_{0}=x_{0}$ that spends zero time in the set $\varDelta _{\upsilon }$, we have
\[J(y)=\frac{1}{2}{\int _{0}^{\tilde{T}}}\frac{{(\dot{y}_{t}-\hat{a}(y_{t}))}^{2}}{{\hat{\sigma }}^{2}(y_{t})}\hspace{0.1667em}dt=\frac{1}{2}{\int _{0}^{\tilde{T}}}\frac{{(\dot{y}_{t}-\bar{a}(y_{t}){\upsilon }^{2}(y_{t}))}^{2}}{{\bar{\sigma }}^{2}(y_{t}){\upsilon }^{2}(y_{t})}\hspace{0.1667em}dt.\]
On the other hand, using (14) and making the time change $s=\pi _{t}(f)$ with $f=F(y)$, we get
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \frac{1}{2}{\int _{0}^{\zeta _{T}(F(y))}}\frac{{(\dot{y}_{t}-\bar{a}(y_{t}){\upsilon }^{2}(y_{t}))}^{2}}{{\bar{\sigma }}^{2}(y_{t}){\upsilon }^{2}(y_{t})}\hspace{0.1667em}dt\\{} & \displaystyle \hspace{1em}=\frac{1}{2}{\int _{0}^{\zeta _{T}(F(y))}}\frac{{(\dot{y}_{t}{\upsilon }^{-2}(y_{t})-\bar{a}(y_{t}))}^{2}}{{\bar{\sigma }}^{2}(y_{t})}{\upsilon }^{2}(y_{t})\hspace{0.1667em}dt\\{} & \displaystyle \hspace{1em}=\frac{1}{2}{\int _{0}^{T}}\frac{{(\dot{f}_{s}-\bar{a}(f_{s}))}^{2}}{{\bar{\sigma }}^{2}(f_{s})}\hspace{0.1667em}ds=I\big(F(y)\big)\end{array}\]
because now $t=\zeta _{s}(f)=\tau _{s}(y)$ and thus
Thus,
with
Now we are ready to proceed with the proof of the first inequality in (8).
Proof.
We consider only f such that $I(f)<\infty $; otherwise, the required inequality is trivial. Let us fix a function y corresponding to f by the following convention: it is given by identity (14) up to the time moment $t=\zeta _{T}(f)$ and follows a zero-energy trajectory afterward, that is, satisfies
a.e. w.r.t. the Lebesgue measure. We note that at least one such zero-energy trajectory exists (it may be nonunique, and in this case, we just fix one such trajectory). Indeed, by construction, $\tilde{a}$ is piecewise constant, so that the corresponding $\hat{a}$ is piecewise constant as well. The proper choice of $\hat{a}(z_{k})$ at those points $z_{k}$ where $\tilde{a}(z_{k}-)>0,\tilde{a}(z_{k}+)<0$ yields that the above ODE, which determines a zero-energy trajectory, admits at least one solution.
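To visualize what a zero-energy trajectory for a piecewise constant drift looks like, one may use the following Python sketch. It is an illustration under stated assumptions, not the construction of the paper: the function names are hypothetical, the drift is taken to be zero at every breakpoint toward which the drift points from both sides (so the trajectory sticks there), and at a breakpoint from which the drift points away on both sides the sketch arbitrarily selects the solution leaving to the right.

```python
def zero_energy_nodes(x0, breaks, drifts, T):
    """Nodes (t, y_t) of a piecewise linear solution of y' = drift(y), y_0 = x0.

    breaks : increasing breakpoints z_1 < ... < z_m
    drifts : m + 1 constants, the drift on (-inf, z_1), ..., (z_m, inf)
    Assumption of this sketch: the drift at an attracting breakpoint is 0.
    """
    t, x = 0.0, x0
    nodes = [(t, x)]
    while t < T:
        if x in breaks:
            k = breaks.index(x)
            left, right = drifts[k], drifts[k + 1]
            if right > 0:
                v = right      # leave to the right
            elif left < 0:
                v = left       # leave to the left
            else:
                v = 0.0        # attracting (or flat) point: stay there
        else:
            v = drifts[sum(1 for z in breaks if z < x)]
        if v == 0:
            nodes.append((T, x))
            break
        # Next breakpoint in the direction of motion, if any.
        ahead = [z for z in breaks if (z > x if v > 0 else z < x)]
        nxt = (min(ahead) if v > 0 else max(ahead)) if ahead else None
        if nxt is None or t + (nxt - x) / v >= T:
            nodes.append((T, x + v * (T - t)))
            break
        t, x = t + (nxt - x) / v, nxt
        nodes.append((t, x))
    return nodes

def sample(nodes, t):
    """Value at time t of the piecewise linear path through the nodes."""
    for (t0, x0), (t1, x1) in zip(nodes, nodes[1:]):
        if t0 <= t <= t1:
            return x0 if t1 == t0 else x0 + (x1 - x0) * (t - t0) / (t1 - t0)
    return nodes[-1][1]
```

For example, with one breakpoint at 0, drift $+1$ to the left and $-1$ to the right, the trajectory started from $-1$ moves with unit speed until time 1 and then stays at 0, which matches the piecewise linear picture (motion with constant nonzero speed, or rest at some $z_{k}$) used in the argument.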
If f spends zero time in the set $\varDelta _{\upsilon }$ of discontinuity points for υ, then the same property holds for the corresponding y constructed above. Indeed, the first part of the trajectory y is just the time-changed trajectory f, and the second part is a zero-energy trajectory. The latter trajectory is piecewise linear, and we can separate a finite set of time intervals where it either (a) moves with a constant speed $\ne 0$ (and thus spends zero time in the set $\varDelta _{\upsilon }$, which has zero Lebesgue measure) or (b) stays constant (in this case, it equals $z_{k}$ for some k, and, by construction, υ is continuous at $z_{k}$). Hence, we conclude that (16) holds and, moreover, ${J}^{\mathit{tail}}(y)=0$, that is, $J(y)=I(f)$. In addition, $y\in \varLambda _{F}$, which gives for this f the required inequality
For a general f, we will show that, for each $\delta >0$, there exists ${f}^{\delta }$ such that ${f}^{\delta }\in B_{\delta }(f)$, $I({f}^{\delta })\le I(f)+\delta $, and ${f}^{\delta }$ spends zero time in $\varDelta _{\upsilon }$; since $I_{\mathit{lower}}$ is known to be lower semicontinuous, this will complete the proof of the first inequality in (8). Recall the decomposition $I=I_{1}+I_{2}+I_{3}$ from Section 2.2 and note that $I_{2}({f}^{n})\to I_{2}(f)$ if ${f}^{n}\to f$ in the uniform distance and $I_{1}({f}^{n})\to I_{1}(f)$ if ${f}^{n}\to f$ in the distance
\[d_{\varSigma }\big({f}^{1},{f}^{2}\big)={\Bigg({\int _{0}^{T}}{\big|{\big(\varSigma \big({f_{t}^{1}}\big)\big)^{\prime }}-{\big(\varSigma \big({f_{t}^{2}}\big)\big)^{\prime }}\big|}^{2}\hspace{0.1667em}dt\Bigg)}^{1/2}.\]
Hence, our aim is to construct a function ${f}^{\delta }$ that is close to f both in the uniform distance and in $d_{\varSigma }$, spends zero time in the set $\varDelta _{\upsilon }$, and
We decompose the time set
into a disjoint union of open intervals and modify the function f on each of these intervals. On the complement to this union, the function ${f}^{\delta }$ will remain the same; note that υ is continuous at every point $z_{k}$, and hence in order to get a function that spends zero time in $\varDelta _{\upsilon }$, it suffices to modify f on Q only. In what follows, we fix an interval $(\alpha ,\beta )$ from the decomposition of the set Q and describe the way to modify f on $(\alpha ,\beta )$. The construction below is mostly motivated by (13). We fix some $\gamma >0$ and choose a finite partition $\{u_{j}\}$ of the set $\{f_{t},t\in [\alpha ,\beta ]\}$ such that the oscillation of the function ${\bar{\sigma }}^{2}$ on each interval $(u_{j-1},u_{j})$ does not exceed γ. Then there exists a finite partition $\alpha =t_{0}<\cdots <t_{m}=\beta $ such that, on each time segment $[t_{i-1},t_{i}]$, the function f visits at most one point from the set $\{u_{j}\}$. Then, on each interval $[t_{i-1},t_{i}]$, we consider the family
where ${\phi }^{i}$ is a function such that
\[{\phi _{t_{i-1}}^{i}}={\phi _{t_{i}}^{i}}=0,\hspace{2em}{\int _{t_{i-1}}^{t_{i}}}{\big({\dot{\phi }_{t}^{i}}\big)}^{2}\hspace{0.1667em}dt<\infty ,\]
and $s_{i}$ is defined by the following convention: $s_{i}=+1$ if $\bar{B}$ is right-continuous at the (unique) point from the set $\{u_{j}\}$ that is visited by f on $[t_{i-1},t_{i}]$ or if f does not visit this set; otherwise, $s_{i}=-1$. If, in addition, ${\dot{\phi }_{t}^{i}}\ne 0$ a.e., then, for all $\kappa >0$ except for an at most countable set of points, ${f}^{i,\kappa }$ spends zero time in $\varDelta _{\upsilon }$ on the time interval; see [13], Lemma 2. The choice of the sign $s_{i}$ yields that, for $\kappa >0$ small enough,
\[B\big({f_{t}^{i,\kappa }}\big)\le \bar{B}(f_{t})+\gamma ,\hspace{1em}t\in [t_{i-1},t_{i}]\cap \varDelta _{B}.\]
Then $\kappa >0$ can be chosen small enough and the same for all intervals $[t_{i-1},t_{i}]$, so that the corresponding function ${\tilde{f}}^{\kappa }$, which coincides with ${f}^{i,\kappa }$ on $[t_{i-1},t_{i}]$, satisfies
\[{\int _{\alpha }^{\beta }}\bar{B}\big({\tilde{f}_{t}^{\kappa }}\big)\hspace{0.1667em}dt\le {\int _{\alpha }^{\beta }}\bar{B}(f_{t})\hspace{0.1667em}dt+2\gamma (\beta -\alpha ).\]
It is also easy to see that, in addition, the following inequalities can be guaranteed by the choice of (small) κ:
\[\underset{t\in (\alpha ,\beta )}{\sup }\big|{\tilde{f}_{t}^{\kappa }}-f_{t}\big|\le \gamma (\beta -\alpha ),\hspace{2em}{\int _{\alpha }^{\beta }}{\big|{\big(\varSigma \big({\tilde{f}_{t}^{\kappa }}\big)\big)^{\prime }}-{\big(\varSigma (f_{t})\big)}^{\prime }\big|}^{2}\hspace{0.1667em}dt\le \gamma (\beta -\alpha ).\]
Repeating the same construction on each interval from the partition for Q, we get a function $\tilde{f}$ such that
\[\| \tilde{f}-f\| \le \gamma T,\hspace{2em}{d_{\varSigma }^{2}}(\tilde{f},f)\le \gamma T,\hspace{2em}I_{3}(\tilde{f})\le I_{3}(f)+\gamma T,\]
and $\tilde{f}$ spends zero time in $\varDelta _{\upsilon }$. Taking in this construction $\gamma >0$ small enough, we obtain the required function ${f}^{\delta }=\tilde{f}$, which completes the proof of (17). □
Recall that B and $\bar{B}$ satisfy (13). For the similar pair of functions $\tilde{B}={\tilde{a}}^{2}/{\tilde{\sigma }}^{2}$ and $\hat{B}={\hat{a}}^{2}/{\hat{\sigma }}^{2}$, we have even more: the functions $\hat{a},\hat{\sigma }$ are constant on each of the intervals $(-\infty ,z_{1}),\dots ,(z_{m},\infty )$; hence,
On the other hand, since $\tilde{a}=a{\upsilon }^{2}$ and $\tilde{\sigma }=\sigma \upsilon $, we have $B=\tilde{B}{\upsilon }^{-2}$, and thus
This yields, for $z\notin \{z_{k}\}$,
Recall that υ is continuous at each point $z_{k}$; hence, by (15) identity (18) holds for all $z\in \mathbb{R}$.
Now we are ready to proceed with the proof of the second inequality in (8).
Proof.
Assuming that (19) fails for some f, we will have sequences $\{{y}^{n}\},\{{\tilde{y}}^{n}\}$ such that $\{{\tilde{y}}^{n}\}\subset \varLambda _{F}$,
\[F\big({\tilde{y}}^{n}\big)\to f,\hspace{2em}\big\| {y}^{n}-{\tilde{y}}^{n}\big\| \to 0,\hspace{1em}\text{and}\hspace{1em}\underset{n}{\limsup }J\big({y}^{n}\big)<I(f).\]
Then $\{{y}^{n}\}$ belongs to some level set $\{J(y)\le c\}$ of a good rate function J. Hence, passing to a subsequence, we can assume that both $\{{y}^{n}\}$ and $\{{\tilde{y}}^{n}\}$ converge to some $y\in \mathbb{Y}$. In addition, $J(y)\le \liminf _{n}J({y}^{n})<I(f)$.
Next, denote
where $\tau (\cdot )$ is the function introduced in the definition of F. Then each ${\tau }^{n}\in \mathit{AC}(0,T)$ with its derivative taking values from $[{C}^{-2},{c}^{-2}]$; see (4). This allows us, passing to a subsequence, to assume that there exists a uniform limit $\tau =\lim _{n}{\tau }^{n}$ and that ${\dot{\tau }}^{n}\to \dot{\tau }$ weakly in $L_{2}(0,T)$.
Observe that
\[\underset{t\in [0,T]}{\sup }\big|F\big({\tilde{y}}^{n}\big)_{t}-y_{{\tau _{t}^{n}}}\big|=\underset{t\in [0,T]}{\sup }\big|{\tilde{y}_{{\tau _{t}^{n}}}^{n}}-y_{{\tau _{t}^{n}}}\big|\to 0.\]
Thus,
Then we have a representation for the part of the trajectory y similar to (14):
We observe that
\[J(y)\ge \frac{1}{2}{\int _{0}^{\tau _{T}}}\frac{{(\dot{y}_{s}-\hat{a}(y_{s}))}^{2}}{{\hat{\sigma }}^{2}(y_{s})}\hspace{0.1667em}ds\]
and give a decomposition for the latter integral, similar to (9). Recall that the function $\hat{a}/{\hat{\sigma }}^{2}$ coincides with $\bar{a}/{\bar{\sigma }}^{2}$ at each point except the finite set $\{z_{k}\}$. Then
\[S(x)={\int _{0}^{x}}\frac{\bar{a}(z)}{{\bar{\sigma }}^{2}(z)}\hspace{0.1667em}dz={\int _{0}^{x}}\frac{\hat{a}(z)}{{\hat{\sigma }}^{2}(z)}\hspace{0.1667em}dz,\hspace{1em}x\in \mathbb{R},\]
and thus
(21)
\[J(y)\ge \frac{1}{2}{\int _{0}^{\tau _{T}}}\frac{{(\dot{y}_{s})}^{2}}{{\hat{\sigma }}^{2}(y_{s})}\hspace{0.1667em}ds+\big[S(y_{\tau _{T}})-S(y_{0})\big]+\frac{1}{2}{\int _{0}^{\tau _{T}}}\frac{{\hat{a}}^{2}(y_{s})}{{\hat{\sigma }}^{2}(y_{s})}\hspace{0.1667em}ds=:J_{1}+J_{2}+J_{3}.\]
Now we will use (20) in order to compare $J_{i}$, $i=1,2,3$, with $I_{i}(f)$, $i=1,2,3$. We have directly that $J_{2}=I_{2}(f)$. Next, we change the variables $s=\tau _{t}$, and get
\[J_{3}=\frac{1}{2}{\int _{0}^{T}}\frac{{\hat{a}}^{2}(f_{t})}{{\hat{\sigma }}^{2}(f_{t})}\dot{\tau }_{t}\hspace{0.1667em}dt.\]
Recall that we assumed $\dot{\tau }$ to be the $L_{2}$-weak limit of
On the other hand, ${\tilde{y}_{{\tau _{t}^{n}}}^{n}}\to f_{t}$, and then it is easy to show that, for a.a. $t\in [0,T]$,
(22)
\[\frac{1}{\max ({\upsilon }^{2}(f_{t}-),{\upsilon }^{2}(f_{t}+))}\le \dot{\tau }_{t}\le \frac{1}{\min ({\upsilon }^{2}(f_{t}-),{\upsilon }^{2}(f_{t}+))}.\]
Then by (18) we get
\[J_{3}\ge \frac{1}{2}{\int _{0}^{T}}\frac{{\hat{a}}^{2}(f_{t})}{{\hat{\sigma }}^{2}(f_{t})}\frac{1}{\max ({\upsilon }^{2}(f_{t}-),{\upsilon }^{2}(f_{t}+))}\hspace{0.1667em}dt=\frac{1}{2}{\int _{0}^{T}}\bar{B}(f_{t})\hspace{0.1667em}dt=I_{3}(f).\]
Finally, changing the variables $s=\tau _{t}$, we get
\[J_{1}=\frac{1}{2}{\int _{0}^{T}}\frac{{(\dot{f}_{t})}^{2}}{{\hat{\sigma }}^{2}(f_{t})\dot{\tau }_{t}}\hspace{0.1667em}dt.\]
Denote $Q=\{t\in [0,T]:f_{t}\in \varDelta _{\upsilon }\}$ and recall that because $\varDelta _{\upsilon }$ has zero Lebesgue measure, $\dot{f}_{t}=0$ for a.a. $t\in Q$. On the other hand, if $f_{t}\notin \varDelta _{\upsilon },$ then by (15) and (22) we have
thus,
\[J_{1}=\frac{1}{2}\int _{[0,T]\setminus Q}\frac{{(\dot{f}_{t})}^{2}}{{\hat{\sigma }}^{2}(f_{t})\dot{\tau }_{t}}\hspace{0.1667em}dt=\frac{1}{2}\int _{[0,T]\setminus Q}\frac{{(\dot{f}_{t})}^{2}}{{\bar{\sigma }}^{2}(f_{t})}\hspace{0.1667em}dt=I_{1}(f).\]
Summarizing the above, we get $J(y)\ge I(f)$, which contradicts the assumption made at the beginning of the proof. □
3.5 Completion of the proof: general $a,\sigma $
In this last part of the proof, we remove the assumption that $a/{\sigma }^{2}$ is piecewise constant and prove the required statement in full generality. According to Section 2.1, it suffices to prove that, for fixed $f\in C(0,T)$ and $\varkappa >0$,
(23)
\[\underset{\varepsilon \to 0}{\limsup }{\varepsilon }^{2}\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(f)\big)\le -I(f)+\varkappa \]
for some $\delta >0$ and
(24)
\[\underset{\varepsilon \to 0}{\liminf }{\varepsilon }^{2}\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(f)\big)\ge -I(f)-\varkappa \]
for each $\delta >0$. While doing that, we can and will assume that $a,\sigma $ are constant on $(-\infty ,-R]$ and $[R,\infty )$ for some R.
Consider, together with the original SDE (1), the SDE
(25)
\[d{Y_{t}^{\varepsilon }}=\big[a\big({Y_{t}^{\varepsilon }}\big)+\alpha \big({Y_{t}^{\varepsilon }}\big)\sigma \big({Y_{t}^{\varepsilon }}\big)\big]\hspace{0.1667em}dt+\varepsilon \sigma \big({Y_{t}^{\varepsilon }}\big)dW_{t},\hspace{1em}{Y_{0}^{\varepsilon }}=x_{0}\in \mathbb{R},\]
where α is a bounded function to be specified later. Then by the Girsanov theorem
\[\mathbf{P}\big({X_{\cdot }^{\varepsilon }}\in A\big)=\mathbf{E}1_{{Y}^{\varepsilon }\in A}{\mathcal{E}_{T}^{\varepsilon }},\]
\[{\mathcal{E}_{T}^{\varepsilon }}:=\exp \Bigg[-{\varepsilon }^{-1}{\int _{0}^{T}}\alpha \big({Y_{t}^{\varepsilon }}\big)\hspace{0.1667em}dW_{t}-\frac{{\varepsilon }^{-2}}{2}{\int _{0}^{T}}{\alpha }^{2}\big({Y_{t}^{\varepsilon }}\big)\hspace{0.1667em}dt\Bigg];\]
see [10], Chapter IV, Theorem 4.2. Then, for arbitrary $p,q>1:1/p+1/q=1$, we have
\[\mathbf{P}\big({X_{\cdot }^{\varepsilon }}\in A\big)\le \mathbf{P}{\big({Y_{\cdot }^{\varepsilon }}\in A\big)}^{1/p}{\big(\mathbf{E}{\big({\mathcal{E}_{T}^{\varepsilon }}\big)}^{q}\big)}^{1/q}.\]
Let $|\alpha (x)|\le \gamma $. Then we have
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \mathbf{E}{\big({\mathcal{E}_{T}^{\varepsilon }}\big)}^{q}& \displaystyle =\mathbf{E}\exp \Bigg[-q{\varepsilon }^{-1}{\int _{0}^{T}}\alpha \big({Y_{t}^{\varepsilon }}\big)\hspace{0.1667em}dW_{t}-q\frac{{\varepsilon }^{-2}}{2}{\int _{0}^{T}}{\alpha }^{2}\big({Y_{t}^{\varepsilon }}\big)\hspace{0.1667em}dt\Bigg]\\{} & \displaystyle =\mathbf{E}\exp \Bigg[\big({q}^{2}-q\big)\frac{{\varepsilon }^{-2}}{2}{\int _{0}^{T}}{\alpha }^{2}\big({Y_{t}^{\varepsilon }}\big)\hspace{0.1667em}dt\Bigg]{\mathcal{E}_{T}^{q\varepsilon }}\le {e}^{({q}^{2}-q){\gamma }^{2}{\varepsilon }^{-2}/2}\mathbf{E}\hspace{0.1667em}{\mathcal{E}_{T}^{q\varepsilon }}.\end{array}\]
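The moment computation can be sanity-checked in the simplest special case of a constant α, where the stochastic integral reduces to $\alpha W_{T}$ with $W_{T}\sim N(0,T)$. The Python sketch below is an illustration only: the function names, the parameter values in the usage note, and the trapezoid quadrature are assumptions of this example. It compares a numerical value of $\mathbf{E}{({\mathcal{E}_{T}^{\varepsilon }})}^{q}$ with the closed form $\exp [({q}^{2}-q){\alpha }^{2}T{\varepsilon }^{-2}/2]$, in which the horizon T enters explicitly through ${\int _{0}^{T}}{\alpha }^{2}\hspace{0.1667em}dt={\alpha }^{2}T$.

```python
import math

def moment_numeric(alpha, eps, q, T, half_width=12.0, n=8000):
    """Trapezoid approximation of E[(E_T)^q] for a constant alpha.

    Uses W_T = sqrt(T) * Z with Z standard normal and integrates over
    z in [-half_width, half_width].
    """
    h = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        z = -half_width + i * h
        # log of (E_T)^q = -q * alpha * W_T / eps - q * alpha^2 * T / (2 eps^2)
        log_power = (-q * alpha * math.sqrt(T) * z / eps
                     - q * alpha ** 2 * T / (2.0 * eps ** 2))
        phi = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.exp(log_power) * phi * h
    return total

def moment_closed_form(alpha, eps, q, T):
    """exp((q^2 - q) * alpha^2 * T / (2 * eps^2)), valid for constant alpha."""
    return math.exp((q * q - q) * alpha ** 2 * T / (2.0 * eps ** 2))
```

For example, `moment_numeric(0.3, 0.5, 2.0, 1.0)` and `moment_closed_form(0.3, 0.5, 2.0, 1.0)` should agree up to quadrature error, and replacing `alpha` by any larger bound γ in the closed form gives an upper estimate of the same type as in the display above.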
Since α is bounded, ${\mathcal{E}}^{q\varepsilon }$ is a martingale. Thus, we can summarize the above calculation as follows:
In what follows, we will choose α such that the function $(a+\alpha \sigma )/{\sigma }^{2}$ is piecewise constant. Then the result proved in the previous section will provide that, for given $f\in C(0,T)$ and $\kappa >0$, there exists $\delta >0$ such that
(26)
\[\underset{\varepsilon \to 0}{\limsup }{\varepsilon }^{2}\log \mathbf{P}\big({Y}^{\varepsilon }\in B_{\delta }(f)\big)\le -\tilde{I}(f)+\frac{\kappa }{4},\]
where $\tilde{I}$ is the rate functional that corresponds to the new drift coefficient $a+\alpha \sigma $ and the same diffusion coefficient. It is easy to verify using the representation (9) and its analogue for $\tilde{I}$ that we can choose γ small enough so that the above construction with arbitrary α such that $\| \alpha \| \le \gamma $ yields
In that case, we get
\[\underset{\varepsilon \to 0}{\limsup }{\varepsilon }^{2}\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(f)\big)\le -\frac{1}{p}I(f)+\frac{\kappa }{4p}+\frac{(q-1){\gamma }^{2}}{2}.\]
Now we are ready to summarize the entire argument. For given $f\in C(0,T)$ and $\kappa >0$, we take $p>1$ close enough to 1 such that
Then we take $\gamma >0$ small enough such that (27) holds and
Observe that the function $a/{\sigma }^{2}$ has left and right limits at every point and is constant on $(-\infty ,-R]$ and $[R,\infty )$. Then it can be approximated by piecewise constant functions in the uniform norm. This means that we can find α with $\| \alpha \| \le \gamma $ such that the function $(a+\alpha \sigma )/{\sigma }^{2}$ is piecewise constant. Then there exists $\delta >0$ such that (26) holds, and we finally deduce
\[\underset{\varepsilon \to 0}{\limsup }{\varepsilon }^{2}\log \mathbf{P}\big({X}^{\varepsilon }\in B_{\delta }(f)\big)\le -I(f)+\frac{\kappa }{4}+\frac{\kappa }{4p}+\frac{\kappa }{4}+\frac{\kappa }{4}<-I(f)+\kappa ,\]
which completes the proof of (23).
Exactly the same argument provides the proof of (24) as well, with the minor change that now the law of ${Y}^{\varepsilon }$ should be expressed in terms of ${X}^{\varepsilon }$; that is, we should use the Girsanov theorem in the following form:
The rest of the proof remains literally the same; we omit the detailed exposition.
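The approximation step used in this section (choosing α so that $(a+\alpha \sigma )/{\sigma }^{2}$ is piecewise constant) can be illustrated by the following Python sketch. It works under simplifying assumptions that are not in the text: the function $g=a/{\sigma }^{2}$ has finitely many known jump points and is Lipschitz with a known constant `lip` between them, and the names `step_approx`, `evaluate`, and `alpha_from_step` are hypothetical. Given a step function $g_{\mathrm{step}}$ with $|g_{\mathrm{step}}-g|\le \gamma $, the choice $\alpha =(g_{\mathrm{step}}-g)\sigma $ gives $(a+\alpha \sigma )/{\sigma }^{2}=g_{\mathrm{step}}$ and $|\alpha |\le \gamma \sup \sigma $.

```python
import math
from bisect import bisect_right

def step_approx(g, jumps, lo, hi, gamma, lip):
    """Step function within gamma of g on [lo, hi], away from the jump points.

    Assumes (for this sketch only) that g is Lipschitz with constant lip
    on each interval between consecutive jump points.
    Returns the left endpoints of the cells and the constant values on them.
    """
    edges = [lo] + sorted(jumps) + [hi]
    lefts, values = [], []
    for left_end, right_end in zip(edges, edges[1:]):
        # Cells of width <= 2 * gamma / lip; the midpoint value is within gamma.
        n = max(1, math.ceil((right_end - left_end) * lip / (2.0 * gamma)))
        width = (right_end - left_end) / n
        for i in range(n):
            left = left_end + i * width
            lefts.append(left)
            values.append(g(left + 0.5 * width))
    return lefts, values

def evaluate(step, x):
    """Value of the step function at x (the cell whose left endpoint is <= x)."""
    lefts, values = step
    return values[max(0, bisect_right(lefts, x) - 1)]

def alpha_from_step(step, g, sigma, x):
    """alpha(x) = (g_step(x) - g(x)) * sigma(x)."""
    return (evaluate(step, x) - g(x)) * sigma(x)

# Hypothetical example: g = a / sigma^2 with one jump at 0, sigma bounded by 2.
def g_example(x):
    return math.sin(x) if x < 0.0 else 2.0 + 0.5 * x

def sigma_example(x):
    return 1.0 if x < 0.0 else 2.0
```

The values of the step function exactly at the jump points are not treated specially here; since there are finitely many of them, they can be set to the values of g separately.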