1 Introduction
The goal of the paper is to study the stochastic differential equation (SDE) whose diffusion coefficient contains an additional stochastic process:
(1)
\[ dX_{t}=\theta a(t,X_{t})\hspace{0.1667em}dt+\sigma (t,X_{t},Y_{t})\hspace{0.1667em}dW_{t},\]
where $\sigma (t,x,y)=\sigma _{1}(t,x)\sigma _{2}(t,y)$, and to estimate the drift parameter θ by the observations of stochastic processes X and Y. Such equations often arise as models of a financial market in mathematical finance. For example, one of the first models of such a type with $\sigma (t,x,y)=xy$ was proposed in [8], where Y was the square root of the geometric Brownian motion process. A similar model was considered by Heston [6], where the volatility was governed by the Ornstein–Uhlenbeck process. Fouque et al. used the model with stochastic volatility driven by the Cox–Ingersoll–Ross process; see [4, 5]. The case where $\sigma (t,x,y)=x\sigma _{2}(y)$ and Y is the Ornstein–Uhlenbeck process was studied in [12, 13].
In the present paper, we investigate the existence and uniqueness of weak and strong solutions to the equation (1). We adapt the approaches of Skorokhod [20], Stroock and Varadhan [21, 22], and Krylov [10, 11] to establish the weak existence and weak uniqueness. Concerning the strong existence and uniqueness, we adapt the well-known conditions of Yamada and Watanabe [23] (see also [2]) to inhomogeneous coefficients, and we also use Lipschitz conditions. In the present paper, we consider only the case of multiplicative stochastic volatility, where, as mentioned above, the diffusion coefficient is factorized as $\sigma (t,x,y)=\sigma _{1}(t,x)\sigma _{2}(t,y)$. Then we construct the maximum likelihood estimator for the unknown drift parameter and prove its strong consistency. As an example, we consider a linear model with stochastic volatility driven by a solution to some Itô SDE. In particular, we study in detail an SDE with constant coefficients, the Ornstein–Uhlenbeck process, and the geometric Brownian motion, as the model for volatility (note that the process Y can be interpreted not only as a volatility, but also as an additional source of randomness). Note that the maximum likelihood estimation in the Ornstein–Uhlenbeck model with stochastic volatility was studied in [1]. Similar statistical methods for the case of deterministic volatility can be found in [7, 9, 16, 18].
The paper is organized as follows. In Section 2, we prove the existence of weak and strong solutions under different conditions. In Section 3, we establish the strong consistency of the maximum likelihood estimator of the unknown drift parameter θ. Section 4 contains the illustrations of our results with some simulations. Auxiliary statements are gathered in Section 5.
2 Existence and uniqueness results for weak and strong solutions
Let $(\varOmega ,\mathcal{F},\overline{\mathcal{F}},\mathbb{P})$ be a complete probability space with filtration $\overline{\mathcal{F}}=\left\{\mathcal{F}_{t},t\ge 0\right\}$ satisfying the standard assumptions. We assume that all processes under consideration are adapted to the filtration $\overline{\mathcal{F}}$.
2.1 Existence of weak solution in terms of Skorokhod conditions
Consider the following stochastic differential equation:
(2)
\[ dX_{t}=a(t,X_{t})\hspace{0.1667em}dt+\sigma _{1}(t,X_{t})\sigma _{2}(t,Y_{t})\hspace{0.1667em}dW_{t},\]
where $X|_{t=0}=X_{0}\in \mathbb{R}$, W is a Wiener process, and Y is some adapted stochastic process to be specified later.
Theorem 1.
Let Y be a measurable and continuous process, let a, $\sigma _{1}$, and $\sigma _{2}$ be continuous w.r.t. $x\in \mathbb{R}$, $y\in \mathbb{R}$, and $t\in [0,T]$, let $\sigma _{2}$ be bounded, and let
\[ \big|a(t,x)\big|+\big|\sigma _{1}(t,x)\big|\le K\big(1+\left|x\right|\big),\hspace{1em}t\in [0,T],\hspace{2.5pt}x\in \mathbb{R},\]
for some constant $K>0$. Then Eq. (2) has a weak solution.
Proof.
Consider a sequence of partitions of $[0,T]$: $0={t_{0}^{n}}<{t_{1}^{n}}<\cdots <{t_{n}^{n}}=T$ such that $\lim _{n\to \infty }\max _{k}({t_{k+1}^{n}}-{t_{k}^{n}})=0$. Define ${\xi _{k}^{n}}$ by ${\xi _{0}^{n}}=X(0)$ and
\[ {\xi _{k+1}^{n}}={\xi _{k}^{n}}+a\big({t_{k}^{n}},{\xi _{k}^{n}}\big)\Delta {t_{k}^{n}}+\sigma _{1}\big({t_{k}^{n}},{\xi _{k}^{n}}\big)\sigma _{2}\big({t_{k}^{n}},Y\big({t_{k}^{n}}\big)\big)\Delta {W_{k}^{n}}.\]
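For intuition, the recursion above is the Euler–Maruyama scheme; a minimal numerical sketch is given below (the coefficient choices and the frozen path of Y are hypothetical, purely for illustration):

```python
import math
import random

def euler_scheme(a, sigma1, sigma2, y_path, x0, T, n, seed=0):
    """One trajectory of xi_{k+1} = xi_k + a*dt + sigma1*sigma2*dW
    on the uniform partition of [0, T] with n steps."""
    rng = random.Random(seed)
    dt = T / n
    xi = [x0]
    for k in range(n):
        t_k = k * dt
        dW = rng.gauss(0.0, math.sqrt(dt))  # increment of the Wiener process
        xi.append(xi[k]
                  + a(t_k, xi[k]) * dt
                  + sigma1(t_k, xi[k]) * sigma2(t_k, y_path[k]) * dW)
    return xi

# Hypothetical coefficients: a(t,x) = -x, sigma1(t,x) = 1, sigma2(t,y) = 1/(1+y^2)
n = 1000
y_path = [1.0] * (n + 1)  # a frozen sample path of Y, for illustration only
path = euler_scheme(lambda t, x: -x, lambda t, x: 1.0,
                    lambda t, y: 1.0 / (1.0 + y * y), y_path, 1.0, 1.0, n)
```

As n grows, the distributions of such trajectories converge to that of a weak solution of (2), which is the content of the theorem.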
It follows from Lemma 1, Lemma 2, and Proposition 1 in Section 5 that it is possible to choose a subsequence ${n^{\prime }}$ and construct processes $\widetilde{\xi }_{{n^{\prime }}}$, $\widetilde{W}_{{n^{\prime }}}$, and $\widetilde{Y}_{{n^{\prime }}}$ such that the finite-dimensional distributions of $\widetilde{\xi }_{{n^{\prime }}}$, $\widetilde{W}_{{n^{\prime }}}$, and $\widetilde{Y}_{{n^{\prime }}}$ coincide with those of ${\xi }^{{n^{\prime }}}$, W, and Y, and $\widetilde{\xi }_{{n^{\prime }}}\to \widetilde{\xi }$, $\widetilde{W}_{{n^{\prime }}}\to \widetilde{W}$, and $\widetilde{Y}_{{n^{\prime }}}\to \widetilde{Y}$ in probability, where $\widetilde{\xi }$, $\widetilde{W}$, and $\widetilde{Y}$ are some stochastic processes (evidently, $\widetilde{W}$ is a Wiener process). It suffices to prove that $\widetilde{\xi }$ is a solution of Eq. (2) with W and Y replaced by $\widetilde{W}$ and $\widetilde{Y}$.
We have that $\widetilde{\xi }_{{n^{\prime }}}$ satisfies the equation
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \widetilde{\xi }_{{n^{\prime }}}(t)& \displaystyle =\widetilde{\xi }_{{n^{\prime }}}(0)+\sum \limits_{{t_{k+1}^{{n^{\prime }}}}\le t}a\big({t_{k}^{{n^{\prime }}}},\widetilde{\xi }_{{n^{\prime }}}\big({t_{k}^{{n^{\prime }}}}\big)\big)\Delta {t_{k}^{{n^{\prime }}}}\\{} & \displaystyle \hspace{1em}+\sum \limits_{{t_{k+1}^{{n^{\prime }}}}\le t}\sigma _{1}\big({t_{k}^{{n^{\prime }}}},\widetilde{\xi }_{{n^{\prime }}}\big({t_{k}^{{n^{\prime }}}}\big)\big)\sigma _{2}\big({t_{k}^{{n^{\prime }}}},\widetilde{Y}_{{n^{\prime }}}\big({t_{k}^{{n^{\prime }}}}\big)\big)\Delta {\widetilde{W}_{k}^{{n^{\prime }}}}.\end{array}\]
Since $\sigma _{2}$ is bounded and $\sigma _{1}$ is of linear growth, their product is of linear growth:
\[ \big|\sigma _{1}(t,x)\sigma _{2}(t,y)\big|\le K_{1}\big(1+\left|x\right|\big),\hspace{1em}t\in [0,T],\hspace{2.5pt}x,y\in \mathbb{R},\]
where $K_{1}>0$ is a constant. Therefore,
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \mathbb{P}\Big(\underset{0\le t\le T}{\sup }\left|\sigma _{1}\big(t,\widetilde{\xi }_{{n^{\prime }}}(t)\big)\sigma _{2}\big(t,\widetilde{Y}_{{n^{\prime }}}(t)\big)\right|>C\Big)\\{} & \displaystyle \hspace{1em}\le \mathbb{P}\Big(\underset{0\le t\le T}{\sup }K_{1}\big(1+\left|\widetilde{\xi }_{{n^{\prime }}}(t)\right|\big)>C\Big)=\mathbb{P}\bigg(\underset{0\le t\le T}{\sup }\left|\widetilde{\xi }_{{n^{\prime }}}(t)\right|>\frac{C}{K_{1}}-1\bigg)\\{} & \displaystyle \hspace{1em}=\mathbb{P}\bigg(\underset{0\le t\le T}{\sup }\big|{\xi }^{{n^{\prime }}}(t)\big|>\frac{C}{K_{1}}-1\bigg).\end{array}\]
Using Lemma 1, we get that
\[ \mathbb{P}\Big(\underset{0\le t\le T}{\sup }\left|\sigma _{1}\big(t,\widetilde{\xi }_{{n^{\prime }}}(t)\big)\sigma _{2}\big(t,\widetilde{Y}_{{n^{\prime }}}(t)\big)\right|>C\Big)\to 0\hspace{1em}\text{as}\hspace{2.5pt}C\to \infty .\]
Moreover, we have that $\sigma _{1}(t,x)\sigma _{2}(t,y)$ is continuous w.r.t. $t\in [0,T]$, $x,y\in \mathbb{R}$. Then, for any $\varepsilon >0$, there exists $\delta >0$ such that
\[ \big|\sigma _{1}(t_{1},x_{1})\sigma _{2}(t_{1},y_{1})-\sigma _{1}(t_{2},x_{2})\sigma _{2}(t_{2},y_{2})\big|<\varepsilon \]
whenever $\left|t_{1}-t_{2}\right|<\delta $, $\left|x_{1}-x_{2}\right|<\delta $, $\left|y_{1}-y_{2}\right|<\delta $. Therefore,
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \mathbb{P}& \displaystyle \big(\left|\sigma _{1}\big(t_{1},\widetilde{\xi }_{{n^{\prime }}}(t_{1})\big)\sigma _{2}\big(t_{1},\widetilde{Y}_{{n^{\prime }}}(t_{1})\big)-\sigma _{1}\big(t_{2},\widetilde{\xi }_{{n^{\prime }}}(t_{2})\big)\sigma _{2}\big(t_{2},\widetilde{Y}_{{n^{\prime }}}(t_{2})\big)\right|>\varepsilon \big)\\{} & \displaystyle \le \mathbb{P}\big(\left|\widetilde{\xi }_{{n^{\prime }}}(t_{1})-\widetilde{\xi }_{{n^{\prime }}}(t_{2})\right|<\delta ,\left|\widetilde{Y}_{{n^{\prime }}}(t_{1})-\widetilde{Y}_{{n^{\prime }}}(t_{2})\right|<\delta ,\\{} & \displaystyle \hspace{1em}\left|\sigma _{1}\big(t_{1},\widetilde{\xi }_{{n^{\prime }}}(t_{1})\big)\sigma _{2}\big(t_{1},\widetilde{Y}_{{n^{\prime }}}(t_{1})\big)-\sigma _{1}\big(t_{2},\widetilde{\xi }_{{n^{\prime }}}(t_{2})\big)\sigma _{2}\big(t_{2},\widetilde{Y}_{{n^{\prime }}}(t_{2})\big)\right|>\varepsilon \big)\\{} & \displaystyle \hspace{1em}+\mathbb{P}\big(\left|\widetilde{\xi }_{{n^{\prime }}}(t_{1})-\widetilde{\xi }_{{n^{\prime }}}(t_{2})\right|\ge \delta \big)+\mathbb{P}\big(\left|\widetilde{Y}_{{n^{\prime }}}(t_{1})-\widetilde{Y}_{{n^{\prime }}}(t_{2})\right|\ge \delta \big)\\{} & \displaystyle =\mathbb{P}\big(\left|\xi _{{n^{\prime }}}(t_{1})-\xi _{{n^{\prime }}}(t_{2})\right|\ge \delta \big)+\mathbb{P}\big(\left|Y(t_{1})-Y(t_{2})\right|\ge \delta \big),\end{array}\]
and the last relation implies the following one:
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \underset{h\to 0}{\lim }\underset{{n^{\prime }}\to \infty }{\lim }\underset{\left|t_{1}-t_{2}\right|\le h}{\sup }\mathbb{P}& \displaystyle \big(\big|\sigma _{1}\big(t_{1},\widetilde{\xi }_{{n^{\prime }}}(t_{1})\big)\sigma _{2}\big(t_{1},\widetilde{Y}_{{n^{\prime }}}(t_{1})\big)\\{} & \displaystyle -\sigma _{1}\big(t_{2},\widetilde{\xi }_{{n^{\prime }}}(t_{2})\big)\sigma _{2}\big(t_{2},\widetilde{Y}_{{n^{\prime }}}(t_{2})\big)\big|>\varepsilon \big)=0.\end{array}\]
Applying Lemma 1, we get that
\[ \sum \limits_{{t_{k+1}^{{n^{\prime }}}}\le t}\sigma _{1}\big({t_{k}^{{n^{\prime }}}},\widetilde{\xi }_{{n^{\prime }}}\big({t_{k}^{{n^{\prime }}}}\big)\big)\sigma _{2}\big({t_{k}^{{n^{\prime }}}},\widetilde{Y}_{{n^{\prime }}}\big({t_{k}^{{n^{\prime }}}}\big)\big)\Delta {W_{k}^{{n^{\prime }}}}\to {\int _{0}^{t}}\sigma _{1}\big(s,\widetilde{\xi }(s)\big)\sigma _{2}\big(s,\widetilde{Y}(s)\big)\hspace{0.1667em}d\widetilde{W}(s)\]
in probability as ${n^{\prime }}\to \infty $, and we also have that
\[ \sum \limits_{{t_{k+1}^{{n^{\prime }}}}\le t}a\big({t_{k}^{{n^{\prime }}}},\widetilde{\xi }_{{n^{\prime }}}\big({t_{k}^{{n^{\prime }}}}\big)\big)\Delta {t_{k}^{{n^{\prime }}}}\to {\int _{0}^{t}}a\big(s,\widetilde{\xi }(s)\big)\hspace{0.1667em}ds\hspace{1em}\text{in probability as}\hspace{2.5pt}{n^{\prime }}\to \infty ,\]
whence the proof follows. □
2.2 Existence and uniqueness of weak solution in terms of Stroock–Varadhan conditions
In this approach, we additionally assume that the process Y is itself a solution of some diffusion-type stochastic differential equation. Let ${W}^{1}$ and ${W}^{2}$ be two Wiener processes, possibly correlated, so that $d{W_{t}^{1}}\hspace{0.1667em}d{W_{t}^{2}}=\rho \hspace{0.1667em}dt$ for some $|\rho |\le 1$. In this case, we can represent ${W_{t}^{2}}=\rho {W_{t}^{1}}+\sqrt{1-{\rho }^{2}}{W_{t}^{3}}$, where ${W}^{3}$ is a Wiener process independent of ${W}^{1}$.
Theorem 2.
Consider the system of stochastic differential equations
(3)
\[ dX_{t}=a(t,X_{t})\hspace{0.1667em}dt+\sigma _{1}(t,X_{t})\sigma _{2}(t,Y_{t})\hspace{0.1667em}d{W_{t}^{1}},\]
(4)
\[ dY_{t}=\alpha (t,Y_{t})\hspace{0.1667em}dt+\beta (t,Y_{t})\hspace{0.1667em}d{W_{t}^{2}},\]
where all the coefficients a, α, $\sigma _{1}$, $\sigma _{2}$, and β are nonrandom, measurable, and bounded functions, and $\sigma _{1}$, $\sigma _{2}$, and β are continuous in all arguments. Let $|\rho |<1$, and let $\sigma _{1}(t,x)>0$, $\sigma _{2}(t,y)>0$, and $\beta (t,y)>0$ for all $t,x,y$. Then weak existence and uniqueness in law hold for system (3)–(4); in particular, weak existence and uniqueness in law hold for Eq. (3) with Y being a weak solution of Eq. (4).
Proof.
Equations (3) and (4) are equivalent to the two-dimensional stochastic differential equation
\[ dZ(t)=A\big(t,Z(t)\big)\hspace{0.1667em}dt+B\big(t,Z(t)\big)\hspace{0.1667em}dW(t),\]
where $Z(t)=\left(\begin{array}{c}X(t)\\{} Y(t)\end{array}\right)$, $W(t)=\left(\begin{array}{c}{W}^{1}(t)\\{} {W}^{3}(t)\end{array}\right)$ is a two-dimensional Wiener process,
\[ A(t,x,y)\hspace{0.1667em}=\left(\begin{array}{c}a(t,x)\\{} \alpha (t,y)\end{array}\right),\hspace{1em}\text{and}\hspace{1em}B(t,x,y)\hspace{0.1667em}=\left(\begin{array}{c@{\hskip10.0pt}c}\sigma _{1}(t,x)\sigma _{2}(t,y)& 0\\{} \rho \beta (t,y)& \sqrt{1-{\rho }^{2}}\beta (t,y)\end{array}\right).\]
It follows from the measurability and boundedness of a and α and from the continuity and boundedness of $\sigma _{1}$, $\sigma _{2}$, and β that the entries of the matrices A and B are nonrandom, measurable, and bounded, and additionally the entries of B are continuous in all arguments. Then we can apply Theorems 4.2 and 5.6 from [21] (see also Prop. 1.14 in [3]); it remains to prove the following relation: for any $(t,x,y)\in {\mathbb{R}}^{+}\times {\mathbb{R}}^{2}$, there exists $\varepsilon (t,x,y)>0$ such that, for all $\lambda \in {\mathbb{R}}^{2}$,
(5)
\[ {\lambda }^{\top }B(t,x,y)B{(t,x,y)}^{\top }\lambda \ge {\varepsilon }^{2}(t,x,y)\big({\lambda _{1}^{2}}+{\lambda _{2}^{2}}\big).\]
Relation (5) is equivalent to the following one (we omit the arguments):
\[ {\sigma _{1}^{2}}{\sigma _{2}^{2}}{\lambda _{1}^{2}}+{\beta }^{2}{\big(\rho \lambda _{1}+\sqrt{1-{\rho }^{2}}\lambda _{2}\big)}^{2}\ge {\varepsilon }^{2}\big({\lambda _{1}^{2}}+{\lambda _{2}^{2}}\big)\]
or
(6)
\[ \big({\sigma _{1}^{2}}{\sigma _{2}^{2}}+{\beta }^{2}{\rho }^{2}\big){\lambda _{1}^{2}}+{\beta }^{2}\big(1-{\rho }^{2}\big){\lambda _{2}^{2}}+2\rho \sqrt{1-{\rho }^{2}}{\beta }^{2}\lambda _{1}\lambda _{2}\ge {\varepsilon }^{2}\big({\lambda _{1}^{2}}+{\lambda _{2}^{2}}\big).\]
The quadratic form
\[ Q(\lambda _{1},\lambda _{2})=\big({\sigma _{1}^{2}}{\sigma _{2}^{2}}+{\beta }^{2}{\rho }^{2}\big){\lambda _{1}^{2}}+{\beta }^{2}\big(1-{\rho }^{2}\big){\lambda _{2}^{2}}+2\rho \sqrt{1-{\rho }^{2}}{\beta }^{2}\lambda _{1}\lambda _{2}\]
in the left-hand side of (6) is positive definite since its discriminant
\[ D={\rho }^{2}\big(1-{\rho }^{2}\big){\beta }^{4}-{\beta }^{2}\big(1-{\rho }^{2}\big)\big({\sigma _{1}^{2}}{\sigma _{2}^{2}}+{\beta }^{2}{\rho }^{2}\big)=-{\beta }^{2}\big(1-{\rho }^{2}\big){\sigma _{1}^{2}}{\sigma _{2}^{2}}<0.\]
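As a numerical sanity check (with arbitrary sample values of the coefficients, not taken from the paper), one can verify that $D<0$ and that Q is positive on the unit circle:

```python
import math

def Q(l1, l2, s1s2, beta, rho):
    """The quadratic form from (6) with frozen scalar coefficients."""
    return ((s1s2**2 + beta**2 * rho**2) * l1**2
            + beta**2 * (1 - rho**2) * l2**2
            + 2 * rho * math.sqrt(1 - rho**2) * beta**2 * l1 * l2)

# Arbitrary sample values: s1s2 stands for sigma_1*sigma_2 > 0, |rho| < 1
s1s2, beta, rho = 1.05, 0.9, 0.3

# Discriminant D of Q, which factors as -beta^2(1-rho^2)(sigma_1 sigma_2)^2 < 0
D = rho**2 * (1 - rho**2) * beta**4 \
    - beta**2 * (1 - rho**2) * (s1s2**2 + beta**2 * rho**2)

# Minimum of Q over a fine grid on the unit circle
min_Q = min(Q(math.cos(t), math.sin(t), s1s2, beta, rho)
            for t in (2 * math.pi * k / 10000 for k in range(10000)))
```

The positivity of `min_Q` is exactly the ε² that the proof extracts by homogeneity.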
The continuity of $Q(\lambda _{1},\lambda _{2})$ implies the existence of $\min _{{\lambda _{1}^{2}}+{\lambda _{2}^{2}}=1}Q(\lambda _{1},\lambda _{2})>0$. Then, putting $\varepsilon =\min _{{\lambda _{1}^{2}}+{\lambda _{2}^{2}}=1}Q(\lambda _{1},\lambda _{2})$ and using homogeneity, we get (6). □
2.3 Existence of strong solution in terms of Yamada–Watanabe conditions
Now we consider strong existence–uniqueness conditions for Eq. (2), adapting the Yamada–Watanabe conditions for inhomogeneous coefficients from [2].
Theorem 3.
Let a, $\sigma _{1}$, and $\sigma _{2}$ be nonrandom measurable bounded functions such that
-
(i) there exists a strictly increasing function $\rho (u)$, $u\ge 0$, with $\rho (0)=0$ and ${\int _{0+}}{\rho }^{-2}(u)\hspace{0.1667em}du=+\infty $ such that
\[ \big|\sigma _{1}(t,x)-\sigma _{1}(t,y)\big|\le \rho (\left|x-y\right|),\hspace{1em}t\ge 0,\hspace{2.5pt}x,y\in \mathbb{R};\]
-
(ii) there exists a positive increasing concave function $k(u)$, $u\in (0,\infty )$, satisfying $k(0)=0$ such that
\[ \big|a(t,x)-a(t,y)\big|\le k(\left|x-y\right|),\hspace{1em}t\ge 0,\hspace{2.5pt}x,y\in \mathbb{R},\]
and ${\int _{0+}}{k}^{-1}(u)\hspace{0.1667em}du=+\infty $.
Also, let Y be an adapted continuous stochastic process. Then the pathwise uniqueness of solution holds for Eq. (2), and hence it has a unique strong solution.
Proof.
Let $1>a_{1}>a_{2}>\cdots >a_{n}>\cdots >0$ be defined by
\[ {\int _{a_{1}}^{1}}{\rho }^{-2}(u)\hspace{0.1667em}du=1,\hspace{0.2778em}{\int _{a_{2}}^{a_{1}}}{\rho }^{-2}(u)\hspace{0.1667em}du=2,\dots ,{\int _{a_{n}}^{a_{n-1}}}{\rho }^{-2}(u)\hspace{0.1667em}du=n,\dots .\]
We have that $a_{n}\to 0$ as $n\to \infty $. Let $\phi _{n}(u)$, $n=1,2,\dots $, be a continuous function with support contained in $(a_{n},a_{n-1})$ such that $0\le \phi _{n}(u)\le \frac{2{\rho }^{-2}(u)}{n}$ and ${\int _{a_{n}}^{a_{n-1}}}\phi _{n}(u)\hspace{0.1667em}du=1$. Such a function obviously exists. Set
\[ \varphi _{n}(x)={\int _{0}^{\left|x\right|}}{\int _{0}^{y}}\phi _{n}(u)\hspace{0.1667em}du\hspace{0.1667em}dy,\hspace{1em}x\in \mathbb{R}.\]
Clearly, $\varphi _{n}\in {C}^{2}(\mathbb{R})$, $\left|{\varphi ^{\prime }_{n}}(x)\right|\le 1$, and $\varphi _{n}(x)\nearrow \left|x\right|$ as $n\to \infty $.
Let $X_{1}$ and $X_{2}$ be two solutions of Eq. (2) on the same probability space with the same Wiener process and such that $X_{1}(0)=X_{2}(0)$. Then we can represent their difference as
\[\begin{array}{r@{\hskip0pt}l}\displaystyle X_{1}(t)-X_{2}(t)& \displaystyle ={\int _{0}^{t}}\sigma _{2}\big(s,Y(s)\big)\big(\sigma _{1}\big(s,X_{1}(s)\big)-\sigma _{1}\big(s,X_{2}(s)\big)\big)\hspace{0.1667em}dW(s)\\{} & \displaystyle \hspace{1em}+{\int _{0}^{t}}\big(a\big(s,X_{1}(s)\big)-a\big(s,X_{2}(s)\big)\big)\hspace{0.1667em}ds.\end{array}\]
By the Itô formula,
\[\begin{array}{r@{\hskip0pt}l}& \displaystyle \varphi _{n}\big(X_{1}(t)-X_{2}(t)\big)\\{} & \displaystyle \hspace{1em}={\int _{0}^{t}}{\varphi ^{\prime }_{n}}\big(X_{1}(s)-X_{2}(s)\big)\sigma _{2}\big(s,Y(s)\big)\big(\sigma _{1}\big(s,X_{1}(s)\big)-\sigma _{1}\big(s,X_{2}(s)\big)\big)\hspace{0.1667em}dW(s)\\{} & \displaystyle \hspace{2em}+{\int _{0}^{t}}{\varphi ^{\prime }_{n}}\big(X_{1}(s)-X_{2}(s)\big)\big(a\big(s,X_{1}(s)\big)-a\big(s,X_{2}(s)\big)\big)\hspace{0.1667em}ds\\{} & \displaystyle \hspace{2em}+\frac{1}{2}{\int _{0}^{t}}{\varphi ^{\prime\prime }_{n}}\big(X_{1}(s)-X_{2}(s)\big)\sigma _{2}{\big(s,Y(s)\big)}^{2}{\big(\sigma _{1}\big(s,X_{1}(s)\big)-\sigma _{1}\big(s,X_{2}(s)\big)\big)}^{2}\hspace{0.1667em}ds\\{} & \displaystyle \hspace{1em}=J_{1}+J_{2}+J_{3}.\end{array}\]
We have that $\mathbb{E}(J_{1})=0$,
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \left|\mathbb{E}(J_{2})\right|& \displaystyle \le {\int _{0}^{t}}\mathbb{E}\left|a\big(s,X_{1}(s)\big)-a\big(s,X_{2}(s)\big)\right|\hspace{0.1667em}ds\\{} & \displaystyle \le {\int _{0}^{t}}\mathbb{E}\big(k\big(\left|X_{1}(s)-X_{2}(s)\right|\big)\big)\hspace{0.1667em}ds\le {\int _{0}^{t}}k\big(\mathbb{E}\left|X_{1}(s)-X_{2}(s)\right|\big)\hspace{0.1667em}ds\end{array}\]
by Jensen’s inequality, and
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \left|\mathbb{E}(J_{3})\right|& \displaystyle \le \frac{{C}^{2}}{2}{\int _{0}^{t}}\mathbb{E}\bigg(\frac{2}{n}{\rho }^{-2}\big(\big|X_{1}(s)-X_{2}(s)\big|\big){\rho }^{2}\big(\big|X_{1}(s)-X_{2}(s)\big|\big)\bigg)ds\\{} & \displaystyle \le \frac{t}{n}\to 0\hspace{1em}\text{as}\hspace{2.5pt}n\to \infty .\end{array}\]
So, letting $n\to \infty $, we get
\[ \mathbb{E}\big(\big|X_{1}(t)-X_{2}(t)\big|\big)\le {\int _{0}^{t}}k\big(\mathbb{E}\big(\big|X_{1}(s)-X_{2}(s)\big|\big)\big)\hspace{0.1667em}ds.\]
Since ${\int _{0+}}{k}^{-1}(u)\hspace{0.1667em}du=+\infty $, an Osgood-type comparison argument yields $\mathbb{E}(\left|X_{1}(t)-X_{2}(t)\right|)=0$ for all $t\ge 0$, and hence $X_{1}(t)=X_{2}(t)$ a.s. □
2.4 Existence and uniqueness of strong solution in terms of Lipschitz conditions
Theorem 4.
Let a, $\sigma _{1}$, and $\sigma _{2}$ be nonrandom measurable functions, and let Y be an adapted continuous stochastic process. Consider the following assumptions:
-
(i) the functions a and $\sigma _{1}$ are Lipschitz continuous in x uniformly in t:
\[ \big|a(t,x)-a(t,y)\big|+\big|\sigma _{1}(t,x)-\sigma _{1}(t,y)\big|\le K\left|x-y\right|,\hspace{1em}t\ge 0,\hspace{2.5pt}x,y\in \mathbb{R};\]
-
(ii) the functions a and $\sigma _{1}$ are of linear growth:
\[ \big|a(t,x)\big|+\big|\sigma _{1}(t,x)\big|\le K\big(1+\left|x\right|\big),\hspace{1em}t\ge 0,\hspace{2.5pt}x\in \mathbb{R};\]
-
(iii) the function $\sigma _{2}$ is bounded.
Under assumptions (i)–(iii), Eq. (2) has a unique strong solution.
This result can be proved by using the successive approximation method; see, for example, [19, Thm. 1.2].
3 Drift parameter estimation
3.1 General results
Let $(\varOmega ,\mathcal{F},\overline{\mathcal{F}},\mathbb{P})$ be a complete probability space with filtration $\overline{\mathcal{F}}=\left\{\mathcal{F}_{t},t\ge 0\right\}$ satisfying the standard assumptions. We assume that all processes under consideration are adapted to the filtration $\overline{\mathcal{F}}$. Consider a parameterized version of Eq. (2)
(7)
\[ dX_{t}=\theta a(t,X_{t})\hspace{0.1667em}dt+\sigma _{1}(t,X_{t})\sigma _{2}(t,Y_{t})\hspace{0.1667em}dW_{t},\]
where W is a Wiener process. Assume that Eq. (7) has a unique strong solution $X=\{X_{t},t\in [0,T]\}$. Our main problem is to estimate the unknown parameter θ by continuous observations of X and Y.
Denote
\[ f(t,x,y)=\frac{a(t,x)}{{\sigma _{1}^{2}}(t,x){\sigma _{2}^{2}}(t,y)},\hspace{2em}g(t,x,y)=\frac{a(t,x)}{\sigma _{1}(t,x)\sigma _{2}(t,y)}.\]
Assume that
(8)
\[ \sigma _{1}(t,X_{t})\sigma _{2}(t,Y_{t})\ne 0,\hspace{1em}t\ge 0,\hspace{1em}\text{a.s.},\]
(9)
\[ {\int _{0}^{T}}{g}^{2}(t,X_{t},Y_{t})\hspace{0.1667em}dt<\infty \hspace{1em}\text{a.s. for all}\hspace{2.5pt}T>0,\]
(10)
\[ {\int _{0}^{\infty }}{g}^{2}(t,X_{t},Y_{t})\hspace{0.1667em}dt=\infty \hspace{1em}\text{a.s.}\]
Then a likelihood function for Eq. (7) has the form
\[ \frac{dP_{\theta }(T)}{dP_{0}(T)}=\exp \left\{\theta {\int _{0}^{T}}f(t,X_{t},Y_{t})\hspace{0.1667em}dX_{t}-\frac{{\theta }^{2}}{2}{\int _{0}^{T}}{g}^{2}(t,X_{t},Y_{t})\hspace{0.1667em}dt\right\};\]
see [15, Ch. 7]. Hence, the maximum likelihood estimator of the parameter θ constructed by the observations of X and Y on the interval $[0,T]$ has the form
(11)
\[ \hat{\theta }_{T}=\frac{{\textstyle\int _{0}^{T}}f(t,X_{t},Y_{t})\hspace{0.1667em}dX_{t}}{{\textstyle\int _{0}^{T}}{g}^{2}(t,X_{t},Y_{t})\hspace{0.1667em}dt}=\theta +\frac{{\textstyle\int _{0}^{T}}g(t,X_{t},Y_{t})\hspace{0.1667em}dW_{t}}{{\textstyle\int _{0}^{T}}{g}^{2}(t,X_{t},Y_{t})\hspace{0.1667em}dt}.\]
Theorem 5.
Assume that conditions (8)–(10) hold. Then the estimator $\hat{\theta }_{T}$ given by (11) is strongly consistent, that is, $\hat{\theta }_{T}\to \theta $ a.s. as $T\to \infty $.
Proof.
Note that, under condition (9) the process $M_{t}={\int _{0}^{t}}g(s,X_{s},Y_{s})\hspace{0.1667em}dW_{s}$ is a square-integrable local martingale with quadratic variation $\langle M\rangle _{t}={\int _{0}^{t}}{g}^{2}(s,X_{s},Y_{s})\hspace{0.1667em}ds$. According to the strong law of large numbers for martingales [14, Ch. 2, § 6, Thm. 10, Cor. 1], under the condition $\langle M\rangle _{T}\to \infty $ a.s. as $T\to \infty $, we have that $\frac{M_{T}}{\langle M\rangle _{T}}\to 0$ a.s. as $T\to \infty $. Therefore, it follows from representation (11) that $\hat{\theta }_{T}$ is strongly consistent. □
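With discrete observations of X and Y on a grid of mesh h, the integrals in (11) can be approximated by Itô- and Riemann-type sums. The following sketch is ours, not part of the paper (the function names are hypothetical):

```python
def mle_theta(x, y, f, g, h):
    """Discretized version of the estimator (11): the numerator approximates
    the stochastic integral of f against X, the denominator the time integral
    of g^2, both by left-point sums on a grid of mesh h."""
    num = sum(f(k * h, x[k], y[k]) * (x[k + 1] - x[k]) for k in range(len(x) - 1))
    den = sum(g(k * h, x[k], y[k]) ** 2 * h for k in range(len(x) - 1))
    return num / den
```

For the linear model of the next subsection, one would plug in $f(t,x,y)=1/(x{\sigma _{2}^{2}}(y))$ and $g(t,x,y)=1/\sigma _{2}(y)$.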
3.2 Linear equation with stochastic volatility
As an example, let us consider the model
(12)
\[ dX_{t}=\theta X_{t}\hspace{0.1667em}dt+X_{t}\sigma _{2}(Y_{t})\hspace{0.1667em}dW_{t},\hspace{1em}X_{0}=x_{0}\in \mathbb{R},\]
where $W_{t}$ is a Wiener process, and $Y_{t}$ is a continuous stochastic process with values in an open interval $J=(l,r)$ (further, in examples, we consider $J=\mathbb{R}$ or $J=(0,+\infty )$). By Theorem 4, under the assumption
-
(A1) $\sigma _{2}$ is bounded on J,
there exists a unique strong solution of (12).
Let Y be a J-valued solution of the equation
(13)
\[ dY_{t}=\alpha (Y_{t})\hspace{0.1667em}dt+\beta (Y_{t})\hspace{0.1667em}d{W_{t}^{1}},\hspace{1em}Y_{0}=y_{0}\in J,\]
where ${W}^{1}$ is a Wiener process, possibly correlated with W.
By ${L_{\mathrm{loc}}^{1}}(J)$ we denote the set of Borel functions $J\to [-\infty ,\infty ]$ that are locally integrable on J, that is, integrable on compact subsets of J. By ${L_{\mathrm{loc}}^{1}}(l+)$ we denote the set of Borel functions $f:J\to [-\infty ,\infty ]$ such that ${\int _{l}^{z}}\left|f(y)\right|\hspace{0.1667em}dy<\infty $ for some $z\in J$. The notation ${L_{\mathrm{loc}}^{1}}(r-)$ is introduced similarly.
Assume that the coefficients α and β satisfy the Engelbert–Schmidt conditions
-
(A2) $\beta (y)\ne 0$ for all $y\in J$,
-
(A3) $\frac{1}{{\beta }^{2}},\frac{\alpha }{{\beta }^{2}}\in {L_{\mathrm{loc}}^{1}}(J)$.
Let us introduce the following notation:
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \rho (y)& \displaystyle =\exp \left\{-2{\int _{c}^{y}}\frac{\alpha (u)}{{\beta }^{2}(u)}\hspace{0.1667em}du\right\},\hspace{1em}y\in J,\\{} \displaystyle s(y)& \displaystyle ={\int _{c}^{y}}\rho (u)\hspace{0.1667em}du,\hspace{1em}y\in \bar{J}=[l,r],\end{array}\]
for some $c\in J$. Assume additionally that
-
(A4) either $s(r)=\infty $, or $s(r)<\infty $ and $\frac{s(r)-s}{\rho {\beta }^{2}}\notin {L_{\mathrm{loc}}^{1}}(r-)$,
-
(A5) either $s(l)=-\infty $, or $s(l)>-\infty $ and $\frac{s-s(l)}{\rho {\beta }^{2}}\notin {L_{\mathrm{loc}}^{1}}(l+)$.
Under (A2)–(A3), the SDE (13) has a weak solution, unique in law, which possibly exits J at some time ζ. Moreover, $\zeta =\infty $ a.s. if and only if conditions (A4)–(A5) are satisfied; see, for example, [17, Prop. 2.6].
Assume also that
-
(A6) ${\beta }^{-2}{\sigma _{2}^{-2}}\in {L_{\mathrm{loc}}^{1}}(J)$,
-
(A7) one of the following four conditions holds:
-
(i) $s(r)=\infty $, $s(l)=-\infty $,
-
(ii) $s(r)<\infty $, $s(l)=-\infty $, $\frac{s(r)-s}{\rho {\beta }^{2}{\sigma _{2}^{2}}}\notin {L_{\mathrm{loc}}^{1}}(r-)$,
-
(iii) $s(r)=\infty $, $s(l)>-\infty $, $\frac{s-s(l)}{\rho {\beta }^{2}{\sigma _{2}^{2}}}\notin {L_{\mathrm{loc}}^{1}}(l+)$,
-
(iv) $s(r)<\infty $, $s(l)>-\infty $, $\frac{s(r)-s}{\rho {\beta }^{2}{\sigma _{2}^{2}}}\notin {L_{\mathrm{loc}}^{1}}(r-)$, $\frac{s-s(l)}{\rho {\beta }^{2}{\sigma _{2}^{2}}}\notin {L_{\mathrm{loc}}^{1}}(l+)$,
-
-
(A8) $X_{t}\sigma _{2}(Y_{t})\ne 0$ a.s., $t\ge 0$.
Theorem 6.
Assume that conditions (A1)–(A8) hold. Then the maximum likelihood estimator
(14)
\[ \hat{\theta }_{T}=\frac{{\textstyle\int _{0}^{T}}\frac{dX_{t}}{X_{t}{\sigma _{2}^{2}}(Y_{t})}}{{\textstyle\int _{0}^{T}}{\sigma _{2}^{-2}}(Y_{t})\hspace{0.1667em}dt}\]
is strongly consistent.
Proof.
We need to verify conditions (8)–(10) of Theorem 5. For model (12), they read as follows:
(15)
\[ X_{t}\sigma _{2}(Y_{t})\ne 0,\hspace{1em}t\ge 0,\hspace{1em}\text{a.s.},\]
(16)
\[ {\int _{0}^{t}}{\sigma _{2}^{-2}}(Y_{s})\hspace{0.1667em}ds<\infty ,\hspace{1em}t>0,\hspace{1em}\text{a.s.},\]
(17)
\[ {\int _{0}^{\infty }}{\sigma _{2}^{-2}}(Y_{s})\hspace{0.1667em}ds=\infty \hspace{1em}\text{a.s.}\]
Note that (15) is assumption (A8). By [17, Thm. 2.7], the local integrability condition (A6), together with (A2)–(A5), implies (16). Further, if assumption (A7)(i) holds, then (17) is satisfied by [17, Thm. 2.11]. In the remaining cases, where $s(r)<\infty $ or $s(l)>-\infty $, we have that
\[ \varOmega =\left\{\underset{t\uparrow \infty }{\lim }Y_{t}=r\right\}\cup \left\{\underset{t\uparrow \infty }{\lim }Y_{t}=l\right\};\]
see [17]. Moreover, if $s(r)=\infty $, then $\mathbb{P}(\lim _{t\uparrow \infty }Y_{t}=r)=0$ by [17, Prop. 2.4]. If $s(r)<\infty $ and $\frac{s(r)-s}{\rho {\beta }^{2}{\sigma _{2}^{2}}}\notin {L_{\mathrm{loc}}^{1}}(r-)$, then ${\int _{0}^{\infty }}{\sigma _{2}^{-2}}(Y_{s})\hspace{0.1667em}ds=\infty $ a.s. on $\{\lim _{t\uparrow \infty }Y_{t}=r\}$ by [17, Thm. 2.12]. Similar statements hold on $\{\lim _{t\uparrow \infty }Y_{t}=l\}$. This implies that (17) is satisfied under each of the conditions (ii)–(iv) of assumption (A7). □
Now we consider several examples of the process Y, namely the Bachelier model, the Ornstein–Uhlenbeck model, the geometric Brownian motion, and the Cox–Ingersoll–Ross model. We concentrate on the verification of assumption (A7) for these models, assuming that the other conditions of Theorem 6 are satisfied.
Example 1 (Bachelier model).
Let Y be a solution of the SDE
\[ dY_{t}=\alpha \hspace{0.1667em}dt+\beta \hspace{0.1667em}d{W_{t}^{1}},\hspace{1em}Y_{0}=y_{0}\in \mathbb{R},\]
where $\alpha \in \mathbb{R}$ and $\beta \ne 0$ are some constants. Assume that ${\sigma _{2}^{-2}}(y)\in {L_{\mathrm{loc}}^{1}}(\mathbb{R})$ and one of the following assumptions holds:
-
(i) $\alpha =0$,
-
(ii) $\alpha >0$ and ${\sigma _{2}^{-2}}\notin {L_{\mathrm{loc}}^{1}}(+\infty )$,
-
(iii) $\alpha <0$ and ${\sigma _{2}^{-2}}\notin {L_{\mathrm{loc}}^{1}}(-\infty )$.
Then estimator (14) is strongly consistent.
Indeed, in this case, $J=\mathbb{R}$,
\[ \rho (y)=\exp \left\{-\frac{2\alpha }{{\beta }^{2}}\hspace{0.1667em}y\right\},\hspace{1em}\text{and}\hspace{1em}s(y)={\int _{0}^{y}}\exp \left\{-\frac{2\alpha }{{\beta }^{2}}\hspace{0.1667em}u\right\}\hspace{0.1667em}du.\]
If $\alpha =0$, then $s(y)=y$, $s(+\infty )=\infty $, $s(-\infty )=-\infty $, and assumption (A7)(i) is satisfied. Otherwise, we have
\[ s(y)=\frac{{\beta }^{2}}{2\alpha }\bigg(1-\exp \left\{-\frac{2\alpha }{{\beta }^{2}}\hspace{0.1667em}y\right\}\bigg).\]
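This closed form can be checked against a direct midpoint-rule approximation of the defining integral (a throwaway numerical check; the values of α and β below are arbitrary):

```python
import math

alpha, beta = 0.5, 1.2  # arbitrary sample values with alpha != 0

def s_closed(y):
    """Closed form of the scale function for the Bachelier model."""
    return beta**2 / (2 * alpha) * (1 - math.exp(-2 * alpha / beta**2 * y))

def s_numeric(y, n=100000):
    """Midpoint-rule approximation of s(y) = int_0^y exp(-2*alpha/beta^2 * u) du."""
    du = y / n
    return du * sum(math.exp(-2 * alpha / beta**2 * (k + 0.5) * du)
                    for k in range(n))

err = abs(s_closed(3.0) - s_numeric(3.0))
```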
If $\alpha >0$, then $s(+\infty )=\frac{{\beta }^{2}}{2\alpha }$, $s(-\infty )=-\infty $, and
\[ \frac{s(+\infty )-s(y)}{\rho (y){\beta }^{2}{\sigma _{2}^{2}}(y)}=\frac{1}{2\alpha {\sigma _{2}^{2}}(y)}\notin {L_{\mathrm{loc}}^{1}}(+\infty ),\]
and hence (A7)(ii) holds. The case $\alpha <0$ is considered similarly.
Example 2 (Ornstein–Uhlenbeck or Vasicek model).
Let Y be a solution of the SDE
\[ dY_{t}=a(b-Y_{t})\hspace{0.1667em}dt+\gamma \hspace{0.1667em}d{W_{t}^{1}},\hspace{1em}Y_{0}=y_{0}\in \mathbb{R},\]
where $a,b\in \mathbb{R}$, and $\gamma >0$ are some constants. Assume that ${\sigma _{2}^{-2}}\in {L_{\mathrm{loc}}^{1}}(\mathbb{R})$ and one of the following assumptions holds:
-
(i) $a\ge 0$,
-
(ii) $a<0$, ${y}^{-1}{\sigma _{2}^{-2}}(y)\notin {L_{\mathrm{loc}}^{1}}(+\infty )$, and ${\left|y\right|}^{-1}{\sigma _{2}^{-2}}(y)\notin {L_{\mathrm{loc}}^{1}}(-\infty )$.
Then estimator (14) is strongly consistent.
In this case, we also take $J=\mathbb{R}$. Then
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \rho (y)& \displaystyle =\exp \left\{-2{\int _{b}^{y}}\frac{a(b-u)}{{\gamma }^{2}}\hspace{0.1667em}du\right\}=\exp \left\{\frac{a}{{\gamma }^{2}}{(y-b)}^{2}\right\},\\{} \displaystyle s(y)& \displaystyle ={\int _{b}^{y}}\exp \left\{\frac{a}{{\gamma }^{2}}{(u-b)}^{2}\right\}\hspace{0.1667em}du.\end{array}\]
If $a\ge 0$, then $\exp \{\frac{a}{{\gamma }^{2}}{(u-b)}^{2}\}\ge 1$, and we get that $s(+\infty )=\infty $, $s(-\infty )=-\infty $.
If $a<0$, then
\[ s(y)={\int _{b}^{y}}\exp \left\{\frac{a}{{\gamma }^{2}}{(u-b)}^{2}\right\}\hspace{0.1667em}du=\frac{\gamma }{\sqrt{-a}}{\int _{0}^{\frac{\sqrt{-a}}{\gamma }(y-b)}}{e}^{-{z}^{2}}\hspace{0.1667em}dz.\]
Therefore, $s(+\infty )=-s(-\infty )=\frac{\gamma \sqrt{\pi }}{2\sqrt{-a}}<\infty $, and we need to verify (A7)(iv). Since ${\int _{x}^{\infty }}{e}^{-{z}^{2}}dz\sim \frac{1}{2x}{e}^{-{x}^{2}}$ as $x\to \infty $, we see that
\[ \frac{s(+\infty )-s(y)}{\rho (y){\gamma }^{2}{\sigma _{2}^{2}}(y)}=\frac{\frac{\gamma }{\sqrt{-a}}{\underset{\frac{\sqrt{-a}}{\gamma }(y-b)}{\overset{\infty }{\int }}}{e}^{-{z}^{2}}dz}{\exp \left\{\frac{a}{{\gamma }^{2}}{(y-b)}^{2}\right\}{\gamma }^{2}{\sigma _{2}^{2}}(y)}\sim \frac{1}{-2a(y-b){\sigma _{2}^{2}}(y)}\]
as $y\to \infty $. Then $\frac{s(+\infty )-s(y)}{\rho (y){\gamma }^{2}{\sigma _{2}^{2}}(y)}\notin {L_{\mathrm{loc}}^{1}}(+\infty )$ if ${y}^{-1}{\sigma _{2}^{-2}}(y)\notin {L_{\mathrm{loc}}^{1}}(+\infty )$. The condition $\frac{s-s(-\infty )}{\rho {\gamma }^{2}{\sigma _{2}^{2}}}\notin {L_{\mathrm{loc}}^{1}}(-\infty )$ is considered similarly.
Example 3 (Geometric Brownian motion).
Let Y be a solution of the SDE
\[ dY_{t}=\alpha Y_{t}\hspace{0.1667em}dt+\beta Y_{t}\hspace{0.1667em}d{W_{t}^{1}},\hspace{1em}Y_{0}=y_{0}>0,\]
where $\alpha \in \mathbb{R}$ and $\beta \ne 0$ are some constants. Assume that ${y}^{-2}{\sigma _{2}^{-2}}(y)\in {L_{\mathrm{loc}}^{1}}((0,+\infty ))$ and one of the following assumptions holds:
-
(i) ${\beta }^{2}=2\alpha $,
-
(ii) ${\beta }^{2}<2\alpha $ and ${y}^{-1}{\sigma _{2}^{-2}}(y)\notin {L_{\mathrm{loc}}^{1}}(+\infty )$,
-
(iii) ${\beta }^{2}>2\alpha $ and ${y}^{-1}{\sigma _{2}^{-2}}(y)\notin {L_{\mathrm{loc}}^{1}}(0+)$.
Then estimator (14) is strongly consistent.
In this case, the process Y is positive, and hence $J=(0,\infty )$. We have
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \rho (y)& \displaystyle =\exp \left\{-2{\int _{1}^{y}}\frac{\alpha }{{\beta }^{2}u}\hspace{0.1667em}du\right\}={y}^{-\frac{2\alpha }{{\beta }^{2}}},\\{} \displaystyle s(y)& \displaystyle ={\int _{1}^{y}}{u}^{-\frac{2\alpha }{{\beta }^{2}}}\hspace{0.1667em}du=\left\{\begin{array}{l@{\hskip10.0pt}l}\frac{{y}^{1-\frac{2\alpha }{{\beta }^{2}}}-1}{1-\frac{2\alpha }{{\beta }^{2}}},\hspace{1em}& {\beta }^{2}\ne 2\alpha ,\\{} \ln y,\hspace{1em}& {\beta }^{2}=2\alpha .\end{array}\right.\end{array}\]
If ${\beta }^{2}=2\alpha $, then $s(0)=-\infty $ and $s(+\infty )=\infty $, so that assumption (A7)(i) holds. If ${\beta }^{2}<2\alpha $, then $s(0)=-\infty $, $s(+\infty )<\infty $, and
\[ \frac{s(+\infty )-s(y)}{\rho (y){\beta }^{2}{y}^{2}{\sigma _{2}^{2}}(y)}=\frac{1}{(2\alpha -{\beta }^{2})y{\sigma _{2}^{2}}(y)}\notin {L_{\mathrm{loc}}^{1}}(+\infty ),\]
so that (A7)(ii) holds. If ${\beta }^{2}>2\alpha $, then $s(0)>-\infty $, $s(+\infty )=\infty $, and
\[ \frac{s(y)-s(0)}{\rho (y){\beta }^{2}{y}^{2}{\sigma _{2}^{2}}(y)}=\frac{1}{({\beta }^{2}-2\alpha )y{\sigma _{2}^{2}}(y)}\notin {L_{\mathrm{loc}}^{1}}(0+),\]
so that (A7)(iii) holds.
Example 4 (Cox–Ingersoll–Ross model).
Let Y be a solution of the SDE
\[ dY_{t}=a(b-Y_{t})\hspace{0.1667em}dt+\gamma \sqrt{Y_{t}}\hspace{0.1667em}d{W_{t}^{1}},\hspace{1em}Y_{0}=y_{0}>0,\]
where a, b, γ are positive constants, and $2ab\ge {\gamma }^{2}$. Assume that
\[ {y}^{-1}{\sigma _{2}^{-2}}(y)\in {L_{\mathrm{loc}}^{1}}\big((0,+\infty )\big).\]
Then estimator (14) is strongly consistent.
Under the condition $2ab\ge {\gamma }^{2}$, the process Y is positive, and hence $J=(0,\infty )$. Further,
\[\begin{array}{r@{\hskip0pt}l}\displaystyle \rho (y)& \displaystyle =\exp \left\{-2{\int _{1}^{y}}\frac{a(b-u)}{{\gamma }^{2}u}\hspace{0.1667em}du\right\}={y}^{-\frac{2ab}{{\gamma }^{2}}}{e}^{\frac{2a}{{\gamma }^{2}}(y-1)},\\{} \displaystyle s(y)& \displaystyle ={e}^{-\frac{2a}{{\gamma }^{2}}}{\int _{1}^{y}}{u}^{-\frac{2ab}{{\gamma }^{2}}}{e}^{\frac{2a}{{\gamma }^{2}}u}\hspace{0.1667em}du.\end{array}\]
Since ${u}^{-\frac{2ab}{{\gamma }^{2}}}{e}^{\frac{2a}{{\gamma }^{2}}u}\to \infty $ as $u\to \infty $, we see that $s(+\infty )=\infty $. Moreover, using the inequality ${e}^{\frac{2a}{{\gamma }^{2}}u}\ge 1$, we get
\[ s(0)=-{e}^{-\frac{2a}{{\gamma }^{2}}}{\int _{0}^{1}}{u}^{-\frac{2ab}{{\gamma }^{2}}}{e}^{\frac{2a}{{\gamma }^{2}}u}\hspace{0.1667em}du\le -{e}^{-\frac{2a}{{\gamma }^{2}}}{\int _{0}^{1}}{u}^{-\frac{2ab}{{\gamma }^{2}}}\hspace{0.1667em}du=-\infty \]
since $\frac{2ab}{{\gamma }^{2}}\ge 1$. Thus, assumption (A7)(i) is satisfied.
4 Simulations
We illustrate the quality of the estimator $\hat{\theta }_{T}$ in model (12)–(13) by simulation experiments. We simulate the trajectories of the Wiener processes W and ${W}^{1}$ at the points $t=0,h,2h,3h,\dots $ and compute the approximate values of the processes X and Y as solutions to the corresponding SDEs using Euler's approximations. For each set of parameters, we simulate 100 sample paths with step $h=0.0001$. The initial values of the processes are $x_{0}=y_{0}=1$, and the true value of the parameter is $\theta =2$. The results are reported in Table 1.
Table 1.
The means and standard deviations of $\hat{\theta }_{T}$
$\alpha (y)$ | $\beta (y)$ | $\sigma _{2}(y)$ | | $T=10$ | $T=50$ | $T=100$ | $T=200$ |
1 | 1 | ${\left|y\right|}^{1/4}$ | Mean | 1.9455 | 1.9431 | 1.9711 | 1.9762 |
Std.dev. | 0.4260 | 0.2576 | 0.2367 | 0.2022 | |||
y | $2y$ | $\sqrt{y}$ | Mean | 2.0104 | 2.0000 | 2.0000 | 2.0000 |
Std.dev. | 0.1225 | $5.7\cdot {10}^{-5}$ | $4.7\cdot {10}^{-8}$ | $1.6\cdot {10}^{-14}$ | |||
y | y | ${(1+y)}^{-1}$ | Mean | 2.0008 | 2.0001 | 2.0000 | 2.0000 |
Std.dev. | 0.0769 | 0.0010 | $2.2\cdot {10}^{-12}$ | $1.4\cdot {10}^{-14}$ | |||
y | 1 | $2+\sin y$ | Mean | 1.9358 | 1.9819 | 1.9927 | 1.9939 |
Std.dev. | 0.5436 | 0.2437 | 0.1679 | 0.1077 | |||
$-y$ | 1 | $2+\sin y$ | Mean | 1.9061 | 1.9684 | 1.9700 | 1.9786 |
Std.dev. | 0.5994 | 0.2472 | 0.1781 | 0.1254 | |||
$2-y$ | $\sqrt{y}$ | $\sqrt{y}$ | Mean | 1.9923 | 2.0039 | 1.9796 | 1.9872 |
Std.dev. | 0.3540 | 0.1604 | 0.1173 | 0.0782 | |||
$2-y$ | $\sqrt{y}$ | y | Mean | 2.0830 | 1.9835 | 1.9803 | 1.9886 |
Std.dev. | 0.4347 | 0.1974 | 0.1205 | 0.0840 |
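The experiment behind one cell of Table 1 can be sketched as follows (our own rough illustration: independent driving Wiener processes, a coarser step than $h=0.0001$, and the row $\alpha (y)=y$, $\beta (y)=1$, $\sigma _{2}(y)=2+\sin y$ are chosen for definiteness):

```python
import math
import random

def simulate_theta_hat(theta, alpha, beta, sigma2, T, h, x0=1.0, y0=1.0, seed=0):
    """Euler scheme for the pair (12)-(13), with W and W^1 taken independent
    for simplicity, combined with the discretized estimator (14)."""
    rng = random.Random(seed)
    n = int(T / h)
    x, y = x0, y0
    num = den = 0.0
    for k in range(n):
        dw = rng.gauss(0.0, math.sqrt(h))   # increment of W
        dw1 = rng.gauss(0.0, math.sqrt(h))  # increment of W^1
        s2 = sigma2(y)
        x_new = x + theta * x * h + x * s2 * dw
        y_new = y + alpha(y) * h + beta(y) * dw1
        num += (x_new - x) / (x * s2 ** 2)  # Ito sum for int dX/(X sigma_2^2(Y))
        den += h / s2 ** 2                  # Riemann sum for int sigma_2^{-2}(Y) dt
        x, y = x_new, y_new
    return num / den

est = simulate_theta_hat(2.0, lambda y: y, lambda y: 1.0,
                         lambda y: 2.0 + math.sin(y), T=10.0, h=0.001)
```

Averaging such estimates over independent seeds should reproduce the qualitative behavior of Table 1: the spread around $\theta =2$ shrinks as T grows.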