1 Introduction
We consider a one-dimensional continuous stochastic process ${({X_{t}})}_{t\in [0,T]}$ that satisfies the stochastic differential equation
(1.1)
\[ d{X_{t}}=b(t,{X_{t}})dt+\sigma (t,{X_{t}})d{W_{t}}\hspace{1em}\text{with}\hspace{2.5pt}{X_{0}}={x_{0}},\]
where ${({W_{t}})}_{t\in [0,T]}$ is a Brownian motion defined on a filtered probability space $(\varOmega ,\mathcal{F},{({\mathcal{F}_{t}})}_{t\in [0,T]},\mathbb{P})$. Various numerical approximation schemes for the SDE (1.1) have been proposed and studied in the literature over the past decades. Probably the most prominent numerical approximation for the solution of (1.1) is the Euler scheme, which can be described as follows. Let ${\varphi _{n}}:{\mathbb{R}_{+}}\to {\mathbb{R}_{+}}$ be the function defined by ${\varphi _{n}}(t)=i/n$ for $t\in [i/n,(i+1)/n)$. The continuous Euler approximation scheme is given by
(1.2)
\[ d{X_{t}^{n}}=b\big({\varphi _{n}}(t),{X_{{\varphi _{n}}(t)}^{n}}\big)dt+\sigma \big({\varphi _{n}}(t),{X_{{\varphi _{n}}(t)}^{n}}\big)d{W_{t}}\hspace{1em}\text{with}\hspace{2.5pt}{X_{0}^{n}}={x_{0}}.\]
The probabilistic properties of the Euler approximation scheme have been investigated in numerous papers. We refer to the classical works [2, 3, 14, 16, 17, 23], among many others, for studies of weak and strong approximation errors. Asymptotic results in the framework of non-regular coefficients can be found in e.g. [1, 8, 9, 27].
In this paper we take a different route and propose to use the Wiener chaos expansion (also called polynomial chaos in the literature) to approximate the solution of the SDE (1.1). To explain the idea, let us fix an orthonormal basis ${({e_{i}})}_{i\ge 1}$ of the separable Hilbert space ${L^{2}}([0,T])$. It is a well-known result (cf. [6]) that if ${X_{t}}\in {L^{2}}(\varOmega ,\mathcal{F},\mathbb{P})$ for all $t\in [0,T]$, where $\mathcal{F}$ is generated by the Brownian motion ${({W_{t}})}_{t\in [0,T]}$, then it admits the chaotic expansion
(1.3)
\[ {X_{t}}=\sum \limits_{\alpha \in \mathcal{I}}{x_{\alpha }}(t){\varPsi ^{\alpha }},\]
where ${x_{\alpha }}$ are deterministic functions, ${\varPsi ^{\alpha }}$ are mutually orthogonal random projections associated to the basis ${({e_{i}})}_{i\ge 1}$, and the index set $\mathcal{I}$ is defined via
(1.4)
\[ \mathcal{I}:=\big\{\alpha ={({\alpha _{i}})}_{i\ge 1}:\hspace{2.5pt}{\alpha _{i}}\in {\mathbb{N}_{0}}\hspace{2.5pt}\text{and almost all}\hspace{2.5pt}{\alpha _{i}}\text{'s are}\hspace{2.5pt}0\big\}.\]
Such orthogonal expansions have been successfully applied in numerous fields of stochastic and numerical analysis. We refer e.g. to [11–13, 25] for applications of the Wiener chaos expansion in the context of SDEs and to [20–22, 26] for applications of polynomial expansions in modelling, simulation and filtering of stochastic partial differential equations. The aim of this work is to use the chaos expansion (1.3) to numerically approximate the solution of the SDE (1.1). For this purpose we need to truncate the infinite sum in (1.3). An obvious approach is to consider the approximation
(1.5)
\[ {X_{t}^{p,k}}=\sum \limits_{\alpha \in {\mathcal{I}_{p,k}}}{x_{\alpha }}(t){\varPsi ^{\alpha }},\]
where the subset ${\mathcal{I}_{p,k}}\subset \mathcal{I}$ refers to using orthogonal projections ${\varPsi ^{\alpha }}$ only with respect to the first k basis elements ${({e_{i}})}_{1\le i\le k}$ and only up to the pth order Wiener chaos. This method is most closely related to the articles [11, 12, 21, 22]. More specifically, the works [21, 22] study the ${L^{2}}$-error associated with the approximation (1.5), but only for a particular choice of the basis ${({e_{i}})}_{i\ge 1}$. In this paper we study the decay rate of $\mathbb{E}[{({X_{t}}-{X_{t}^{p,k}})^{2}}]$ as $k,p\to \infty $ for a general basis ${({e_{i}})}_{i\ge 1}$ of ${L^{2}}([0,T])$, applying methods from Malliavin calculus.
The paper is structured as follows. In Section 2 we present the elements of Malliavin calculus. The main results of the paper are presented in Section 3. Section 4 is devoted to proofs. In Section 5 we illustrate our approach with exemplary numerical results for the Haar and a trigonometric basis, and propose a heuristic based on sparse indices for computational speedup.
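Before turning to the Malliavin calculus background, the Euler scheme (1.2) can be sketched in a few lines of code. The following Python snippet is only an illustration; the chosen drift, diffusion and parameters (a geometric Brownian motion with $b(t,x)=0.1x$, $\sigma (t,x)=0.2x$) are not taken from the paper.

```python
import numpy as np

def euler_scheme(b, sigma, x0, T, n, rng):
    """Simulate one path of the Euler scheme (1.2) on the grid i*T/n, i = 0, ..., n."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        t = i * dt
        dW = rng.normal(0.0, np.sqrt(dt))  # Brownian increment over [t, t + dt]
        x[i + 1] = x[i] + b(t, x[i]) * dt + sigma(t, x[i]) * dW
    return x

# Illustrative coefficients: geometric Brownian motion dX = 0.1 X dt + 0.2 X dW.
rng = np.random.default_rng(1)
path = euler_scheme(lambda t, x: 0.1 * x, lambda t, x: 0.2 * x,
                    x0=1.0, T=1.0, n=500, rng=rng)
```

Note that the scheme freezes the coefficients at the left grid point ${\varphi _{n}}(t)$, exactly as in (1.2).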
2 Background on Malliavin calculus
In this section we introduce some basic concepts of Malliavin calculus. The interested reader is referred to [24] for a more thorough exposition of this subject. Set $\mathbb{H}={L^{2}}([0,T])$ and let ${\langle \cdot ,\cdot \rangle _{\mathbb{H}}}$ denote the scalar product on $\mathbb{H}$. We note that $\mathbb{H}$ is a separable Hilbert space and denote by ${({e_{i}})}_{i\ge 1}$ an orthonormal basis of $\mathbb{H}$. We consider the isonormal Gaussian family $W=\{W(h):h\in \mathbb{H}\}$ indexed by $\mathbb{H}$ defined on a probability space $(\varOmega ,\mathcal{F},\mathbb{P})$, i.e. the random variables $W(h)$ are centered Gaussian with covariance structure determined via
\[ \mathbb{E}\big[W(g)W(h)\big]={\langle g,h\rangle _{\mathbb{H}}},\hspace{1em}g,h\in \mathbb{H}.\]
Here and throughout the paper we assume that $\mathcal{F}=\sigma (W)$. In our setting we consider $W(h)={\int _{0}^{T}}{h_{s}}d{W_{s}}$, where W is a standard Brownian motion. We define the normalised Hermite polynomials through the identities
(2.1)
\[ {H_{0}}(x):=1,\hspace{2em}{H_{n}}(x):=\frac{{(-1)^{n}}}{\sqrt{n!}}\exp \bigg(\frac{{x^{2}}}{2}\bigg)\frac{{\mathrm{d}^{n}}}{\mathrm{d}{x^{n}}}\bigg(\exp \bigg(-\frac{{x^{2}}}{2}\bigg)\bigg),\hspace{1em}n\ge 1.\]
The nth Wiener chaos ${\mathcal{H}_{n}}$ is the closed linear subspace of ${L^{2}}(\varOmega ,\mathcal{F},\mathbb{P})$ generated by the family of random variables $\{{H_{n}}(W(h)):\hspace{2.5pt}\| h{\| _{\mathbb{H}}}=1\}$. The vector spaces ${\mathcal{H}_{n}}$, $n\ge 0$, are orthogonal and we have the Wiener chaos expansion
(2.2)
\[ {L^{2}}(\varOmega ,\mathcal{F},\mathbb{P})={\underset{n=0}{\overset{\infty }{\bigoplus }}}{\mathcal{H}_{n}}.\]
(See [24, Theorem 1.1.1] for more details.) Next, for $\alpha \in \mathcal{I}$, where $\mathcal{I}$ has been introduced in (1.4), we define the random variable
(2.3)
\[ {\varPsi ^{\alpha }}:={\prod \limits_{i=1}^{\infty }}{H_{{\alpha _{i}}}}\big(W({e_{i}})\big),\]
where ${({e_{i}})}_{i\ge 1}$ is a fixed orthonormal basis of $\mathbb{H}$. We define $|\alpha |={\sum _{i=1}^{\infty }}{\alpha _{i}}$ and $\alpha !={\prod _{i=1}^{\infty }}{\alpha _{i}}!$ for $\alpha \in \mathcal{I}$. We deduce that the set $\{{\varPsi ^{\alpha }}:\hspace{2.5pt}\alpha \in \mathcal{I}\hspace{2.5pt}\text{with}\hspace{2.5pt}|\alpha |=n\}$ forms a complete orthonormal basis of the nth Wiener chaos ${\mathcal{H}_{n}}$ and consequently $\{{\varPsi ^{\alpha }}:\hspace{2.5pt}\alpha \in \mathcal{I}\}$ is a complete orthonormal basis of ${L^{2}}(\varOmega ,\mathcal{F},\mathbb{P})$ (cf. [24, Proposition 1.1.1]).
Now, we introduce multiple stochastic integrals of order $n\in \mathbb{N}$, which are denoted by ${I_{n}}$. For an element ${h^{\otimes n}}:=h\otimes \cdots \otimes h$ in ${\mathbb{H}^{\otimes n}}$ with $\| h{\| _{\mathbb{H}}}=1$, we define
(2.4)
\[ {I_{n}}\big({h^{\otimes n}}\big):=\sqrt{n!}\hspace{2.5pt}{H_{n}}\big(W(h)\big),\hspace{1em}n\ge 1.\]
Requiring the mapping ${I_{n}}$ to be linear, the definition (2.4) can be extended to all symmetric elements $h\in {\mathbb{H}^{\otimes n}}$ by the polarisation identity. Finally, for an arbitrary function $h\in {\mathbb{H}^{\otimes n}}$ we set
\[ {I_{n}}(h):={I_{n}}(\widetilde{h}),\]
where $\widetilde{h}$ denotes the symmetrised version of $h\in {\mathbb{H}^{\otimes n}}$. By definition ${I_{n}}$ maps ${\mathbb{H}^{\otimes n}}$ into ${\mathcal{H}_{n}}$, so multiple integrals of different orders are orthogonal. In particular, they satisfy the isometry property
(2.5)
\[ \mathbb{E}\big[{I_{m}}(g){I_{n}}(h)\big]=n!{\langle \widetilde{g},\widetilde{h}\rangle _{{\mathbb{H}^{\otimes n}}}}{1_{\{n=m\}}},\hspace{1em}h\in {\mathbb{H}^{\otimes n}},g\in {\mathbb{H}^{\otimes m}}.\]
Furthermore, for any symmetric $h\in {\mathbb{H}^{\otimes n}}$, $g\in {\mathbb{H}^{\otimes m}}$, the following multiplication formula holds:
(2.6)
\[ {I_{m}}(g){I_{n}}(h)={\sum \limits_{r=0}^{\min (m,n)}}r!\left(\genfrac{}{}{0pt}{}{m}{r}\right)\left(\genfrac{}{}{0pt}{}{n}{r}\right){I_{m+n-2r}}(g{\otimes _{r}}h),\]
where the rth contraction $g{\otimes _{r}}h$ is defined by
\[\begin{aligned}{}g{\otimes _{r}}h({t_{1}},\dots ,{t_{m+n-2r}})& :={\int _{{[0,T]^{r}}}}g({u_{1}},\dots ,{u_{r}},{t_{1}},\dots ,{t_{m-r}})\\ {} & \hspace{2em}\times h({u_{1}},\dots ,{u_{r}},{t_{m-r+1}},\dots ,{t_{m+n-2r}})d{u_{1}}\dots d{u_{r}}.\end{aligned}\]
(See [24, Proposition 1.1.3].) The Wiener chaos expansion (2.2) transfers to the context of multiple integrals as follows. For each random variable $F\in {L^{2}}(\varOmega ,\mathcal{F},\mathbb{P})$ we obtain the orthogonal decomposition
(2.7)
\[ F={\sum \limits_{n=0}^{\infty }}{I_{n}}({g_{n}}),\hspace{1em}{g_{n}}\in {\mathbb{H}^{\otimes n}},\]
where ${I_{0}}={\mathbb{E}}[F]$ and the decomposition is unique when ${g_{n}}$, $n\ge 1$, are assumed to be symmetric (cf. [24, Theorem 1.1.2]).
Next, we introduce the notion of the Malliavin derivative and its adjoint operator. We define the set of smooth random variables via
\[ \mathcal{S}=\big\{F=f\big(W({h_{1}}),\dots ,W({h_{n}})\big):\hspace{2.5pt}n\ge 1,{h_{i}}\in \mathbb{H}\big\},\]
where $f\in {C_{p}^{\infty }}({\mathbb{R}^{n}})$ (i.e. the space of infinitely differentiable functions such that all derivatives exhibit polynomial growth). The kth order Malliavin derivative of $F\in \mathcal{S}$, denoted by ${D^{k}}F$, is defined by
(2.8)
\[ {D^{k}}F={\sum \limits_{{i_{1}},\dots ,{i_{k}}=1}^{n}}\frac{{\partial ^{k}}}{\partial {x_{{i_{1}}}}\cdots \partial {x_{{i_{k}}}}}f\big(W({h_{1}}),\dots ,W({h_{n}})\big){h_{{i_{1}}}}\otimes \cdots \otimes {h_{{i_{k}}}}.\]
Notice that ${D^{k}}F$ is an ${\mathbb{H}^{\otimes k}}$-valued random variable, and we write ${D_{x}^{k}}F$ for the realisation of the function ${D^{k}}F$ at the point $x\in {[0,T]^{k}}$. The space ${\mathbb{D}^{k,q}}$ denotes the completion of the set $\mathcal{S}$ with respect to the norm
\[ \| F{\| _{k,q}}:={\Bigg(\mathbb{E}\big[|F{|^{q}}\big]+{\sum \limits_{m=1}^{k}}\mathbb{E}\big[{\big\| {D^{m}}F\big\| _{{\mathbb{H}^{\otimes m}}}^{q}}\big]\Bigg)^{1/q}}.\]
We define ${\mathbb{D}^{k,\infty }}={\cap _{q\mathrm{>}1}}{\mathbb{D}^{k,q}}$. The Malliavin derivative of the random variable ${\varPsi ^{\alpha }}\in \mathcal{S}$, $\alpha \in \mathcal{I}$, can be easily computed using the definition (2.8) and the formula ${H^{\prime }_{n}}(x)=\sqrt{n}{H_{n-1}}(x)$:
(2.9)
\[ {D_{t}}{\varPsi ^{\alpha }}={\sum \limits_{j=1}^{\infty }}\sqrt{{\alpha _{j}}}{e_{j}}(t){\varPsi ^{{\alpha ^{-}}(j)}},\hspace{1em}t\in [0,T],\]
where the multi-index ${\alpha ^{-}}(j)\in \mathcal{I}$ is defined by
(2.10)
\[ {\alpha ^{-}}{(j)_{i}}:={\alpha _{i}}-{1_{\{i=j\}}},\hspace{1em}i\ge 1.\]
Higher order Malliavin derivatives of ${\varPsi ^{\alpha }}$ are computed similarly. The operator ${D^{k}}$ possesses an unbounded adjoint denoted by ${\delta ^{k}}$, which is often referred to as the multiple Skorokhod integral. The following integration by parts formula holds (see [24, Exercise 1.3.7]): if $u\in \text{Dom}({\delta ^{k}})$ and $F\in {\mathbb{D}^{k,2}}$, where $\text{Dom}({\delta ^{k}})$ consists of all elements $u\in {L^{2}}(\varOmega ;{\mathbb{H}^{\otimes k}})$ such that the inequality $|\mathbb{E}[{\langle {D^{k}}F,u\rangle _{{\mathbb{H}^{\otimes k}}}}]|\le c{(\mathbb{E}[{F^{2}}])^{1/2}}$ holds for some $c\mathrm{>}0$ and all $F\in {\mathbb{D}^{k,2}}$, then we have the identity
(2.11)
\[ \mathbb{E}\big[F{\delta ^{k}}(u)\big]=\mathbb{E}\big[{\big\langle {D^{k}}F,u\big\rangle _{{\mathbb{H}^{\otimes k}}}}\big].\]
In case the random variable $F\in {L^{2}}(\varOmega ,\mathcal{F},\mathbb{P})$ has a chaos decomposition as displayed in (2.7), the statement $F\in {\mathbb{D}^{k,2}}$ is equivalent to the condition ${\sum _{n=1}^{\infty }}{n^{k}}n!\| {g_{n}}{\| _{{\mathbb{H}^{\otimes n}}}^{2}}\mathrm{<}\infty $. In particular, when $F\in {\mathbb{D}^{1,2}}$ we deduce an explicit chaos representation of the derivative $DF={({D_{t}}F)}_{t\in [0,T]}$:
(2.12)
\[ {D_{t}}F={\sum \limits_{n=1}^{\infty }}n{I_{n-1}}\big({g_{n}}(\cdot ,t)\big),\hspace{1em}t\in [0,T],\]
where ${g_{n}}(\cdot ,t):{[0,T]^{n-1}}\to \mathbb{R}$ is obtained from the function ${g_{n}}$ by setting the last argument equal to t (see [24, Proposition 1.2.7]). Finally, we present an explicit formula for the Malliavin derivative of a solution of a stochastic differential equation. Assume that ${({X_{t}})}_{t\in [0,T]}$ is a solution of the stochastic differential equation (1.1) and that b, σ are continuously differentiable in the space variable. In this setting $D{X_{t}}={({D_{s}}{X_{t}})}_{s\in [0,T]}$ is given as the solution of the SDE
(2.13)
\[ {D_{s}}{X_{t}}=\sigma (s,{X_{s}})+{\int _{s}^{t}}\frac{\partial }{\partial x}b(u,{X_{u}}){D_{s}}{X_{u}}du+{\int _{s}^{t}}\frac{\partial }{\partial x}\sigma (u,{X_{u}}){D_{s}}{X_{u}}d{W_{u}}\]
for $s\le t$, and ${D_{s}}{X_{t}}=0$ if $s\mathrm{>}t$ (see [24, Theorem 2.2.1]). Throughout the paper $\partial /\partial x$ denotes the derivative with respect to the space variable and $\partial /\partial t$ denotes the derivative with respect to the time variable.
3 Main results
We start with the analysis of the Wiener chaos expansion introduced in (1.3). Under the square integrability assumption on the solution X of the SDE (1.1) we obtain the Wiener chaos expansion
\[ {X_{t}}=\sum \limits_{\alpha \in \mathcal{I}}{x_{\alpha }}(t){\varPsi ^{\alpha }}\hspace{1em}\text{with}\hspace{2.5pt}{\varPsi ^{\alpha }}={\prod \limits_{i=1}^{\infty }}{H_{{\alpha _{i}}}}\big(W({e_{i}})\big).\]
In order to study the strong approximation error we require good control of the coefficients ${x_{\alpha }}(t)$. When ${X_{t}}$ is sufficiently smooth in the Malliavin sense, we deduce the identity
(3.1)
\[ {x_{\alpha }}(t)=\mathbb{E}\big[{X_{t}}{\varPsi ^{\alpha }}\big]=\frac{1}{\sqrt{\alpha !}}\mathbb{E}\Bigg[{\Bigg\langle {D^{|\alpha |}}{X_{t}},{\underset{i=1}{\overset{\infty }{\bigotimes }}}{e_{i}^{\otimes {\alpha _{i}}}}\Bigg\rangle }_{{\mathbb{H}^{\otimes |\alpha |}}}\Bigg]\]
by applying the duality formula (2.11). In fact, the coefficients ${x_{\alpha }}(t)$ satisfy a system of ordinary differential equations. The following propagator system has been derived in [12, 21]. We state the proof for completeness.
Theorem 3.1.
Let ${({X_{t}})}_{t\in [0,T]}$ be the unique solution of the SDE (1.1) and assume that $X\in {L^{2}}(\varOmega \times [0,T])$. Then ${X_{t}}$ possesses the chaos expansion (1.3) and the coefficients ${x_{\alpha }}(t)$ satisfy the system of ordinary differential equations
(3.2)
\[\begin{aligned}{}{x^{\prime }_{\alpha }}(t)& ={b_{\alpha }}(t,{X_{t}})+{\sum \limits_{j=1}^{\infty }}\sqrt{{\alpha _{j}}}{e_{j}}(t){\sigma _{{\alpha ^{-}}(j)}}(t,{X_{t}}),\\ {} {x_{\alpha }}(0)& ={1_{\{\alpha =0\}}}{x_{0}}.\end{aligned}\]
Here ${b_{\alpha }}(t,{X_{t}})$ (resp. ${\sigma _{\alpha }}(t,{X_{t}})$) denotes the α-coefficient of the Wiener chaos expansion (1.3) associated with the random variable $b(t,{X_{t}})$ (resp. $\sigma (t,{X_{t}})$), and the multi-index ${\alpha ^{-}}(j)$ is defined by (2.10).
Proof.
Using the SDE (1.1) and applying the formula (3.1) we obtain the identity
\[ {x_{\alpha }}(t)={x_{0}}{1_{\{\alpha =0\}}}+{\int _{0}^{t}}\mathbb{E}\big[{\varPsi ^{\alpha }}b(s,{X_{s}})\big]ds+\mathbb{E}\Bigg[{\varPsi ^{\alpha }}{\int _{0}^{t}}\sigma (s,{X_{s}})d{W_{s}}\Bigg].\]
Applying the formula (3.1) once again to the random variable $b(s,{X_{s}})$ we immediately deduce that
\[ \mathbb{E}\big[{\varPsi ^{\alpha }}b(s,{X_{s}})\big]={b_{\alpha }}(s,{X_{s}}).\]
On the other hand, observing the identity $\delta ({1_{[0,t]}}\sigma (\cdot ,{X_{\cdot }}))={\int _{0}^{t}}\sigma (s,{X_{s}})d{W_{s}}$, we get by the duality formula (2.11) and (2.9) that
\[\begin{aligned}{}\mathbb{E}\Bigg[{\varPsi ^{\alpha }}{\int _{0}^{t}}\sigma (s,{X_{s}})d{W_{s}}\Bigg]& ={\sum \limits_{j=1}^{\infty }}{\int _{0}^{t}}\sqrt{{\alpha _{j}}}{e_{j}}(s)\mathbb{E}\big[{\varPsi ^{{\alpha ^{-}}(j)}}\sigma (s,{X_{s}})\big]ds\\ {} & ={\sum \limits_{j=1}^{\infty }}{\int _{0}^{t}}\sqrt{{\alpha _{j}}}{e_{j}}(s){\sigma _{{\alpha ^{-}}(j)}}(s,{X_{s}})ds.\end{aligned}\]
Putting things together we obtain the identity
\[ {x_{\alpha }}(t)={x_{0}}{1_{\{\alpha =0\}}}+{\int _{0}^{t}}{b_{\alpha }}(s,{X_{s}})ds+{\sum \limits_{j=1}^{\infty }}{\int _{0}^{t}}\sqrt{{\alpha _{j}}}{e_{j}}(s){\sigma _{{\alpha ^{-}}(j)}}(s,{X_{s}})ds.\]
Consequently, the assertion follows after taking the derivative with respect to t.  □
We remark that the propagator system (3.2) is recursive. Let us give some simple examples to illustrate how (3.2) can be solved explicitly.
Example 3.2.
(i) (Scaled Brownian motion with drift) We start with the toy example of a scaled Brownian motion with drift:
\[ d{X_{t}}=b\hspace{0.1667em}dt+\sigma \hspace{0.1667em}d{W_{t}}\hspace{1em}\text{with}\hspace{2.5pt}{X_{0}}={x_{0}},\hspace{1em}b,\sigma \in \mathbb{R}.\]
In this case we obviously have that ${x_{\alpha }}(t)=0$ for any $\alpha \in \mathcal{I}$ with $|\alpha |\ge 2$. Applying formula (3.2) we obtain the representation
\[ {X_{t}}={x_{0}}+bt+\sigma {\sum \limits_{j=1}^{\infty }}\Bigg({\int _{0}^{t}}{e_{j}}(s)ds\Bigg)W({e_{j}}),\]
which is the well-known Karhunen–Loève expansion of the scaled Brownian motion with drift.
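The expansion above can be checked numerically. The sketch below is an illustration (all names and parameter values are our own choices): it uses the trigonometric basis that reappears in Example 3.4, evaluates ${E_{j}}(t)={\int _{0}^{t}}{e_{j}}(s)ds$ in closed form, and verifies that the variance ${\sigma ^{2}}{\sum _{j\le k}}{E_{j}^{2}}(t)$ of the truncated martingale part approaches ${\sigma ^{2}}t$, since the $W({e_{j}})$ are i.i.d. standard normal.

```python
import numpy as np

# Antiderivatives E_j(t) = \int_0^t e_j(s) ds of the trigonometric
# orthonormal basis of L^2([0,1]): e_1 = 1, e_{2m} = sqrt(2) sin(2 pi m s),
# e_{2m+1} = sqrt(2) cos(2 pi m s).
def E(j, t):
    if j == 1:
        return t
    m = j // 2  # frequency of the j-th basis function
    if j % 2 == 0:
        return (1.0 - np.cos(2 * np.pi * m * t)) / (np.sqrt(2.0) * np.pi * m)
    return np.sin(2 * np.pi * m * t) / (np.sqrt(2.0) * np.pi * m)

# Variance of the truncated expansion x0 + b t + sigma * sum_{j<=k} E_j(t) W(e_j):
# by Parseval's identity, sum_j E_j(t)^2 = ||1_{[0,t]}||^2 = t, so the variance
# sigma^2 * sum_{j<=k} E_j(t)^2 should approach sigma^2 * t as k grows.
sigma, t, k = 0.5, 0.7, 2000
var_trunc = sigma**2 * sum(E(j, t) ** 2 for j in range(1, k + 1))
```

Since ${E_{j}}(t)={\langle {1_{[0,t]}},{e_{j}}\rangle _{\mathbb{H}}}$, the truncation error of the variance is exactly the Parseval tail, in line with the rate $\mathcal{O}({k^{-1}})$ obtained in Example 3.4.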
(ii) (Geometric Brownian motion) Let us consider the geometric Brownian motion defined via the SDE
\[ d{X_{t}}=b{X_{t}}dt+\sigma {X_{t}}d{W_{t}}\hspace{1em}\text{with}\hspace{2.5pt}{X_{0}}={x_{0}}.\]
In this setting the propagator system (3.2) translates to
\[\begin{aligned}{}{x^{\prime }_{\alpha }}(t)& =b{x_{\alpha }}(t)+\sigma {\sum \limits_{j=1}^{\infty }}\sqrt{{\alpha _{j}}}{e_{j}}(t){x_{{\alpha ^{-}}(j)}}(t),\\ {} {x_{\alpha }}(0)& ={1_{\{\alpha =0\}}}{x_{0}}.\end{aligned}\]
This system of ordinary differential equations can be solved recursively. For $\alpha =0$ we have ${x^{\prime }_{0}}(t)=b{x_{0}}(t)$ and hence ${x_{0}}(t)={x_{0}}\exp (bt)$. If α is the jth canonical unit vector in $\mathcal{I}$ (and hence $|\alpha |=1$) we obtain the differential equation
\[\begin{aligned}{}{x^{\prime }_{\alpha }}(t)& =b{x_{\alpha }}(t)+\sigma {e_{j}}(t){x_{0}}(t),\\ {} {x_{\alpha }}(0)& ={1_{\{\alpha =0\}}}{x_{0}}.\end{aligned}\]
Hence, ${x_{\alpha }}(t)={x_{0}}\sigma \exp (bt){\int _{0}^{t}}{e_{j}}(s)ds$. Following this recursion we obtain the general formula
\[ {x_{\alpha }}(t)=\frac{1}{\sqrt{\alpha !}}{x_{0}}{\sigma ^{p}}\exp (bt){\prod \limits_{j=1}^{\infty }}{\Bigg({\int _{0}^{t}}{e_{j}}(s)ds\Bigg)^{{\alpha _{j}}}}\]
for any $\alpha \in \mathcal{I}$ with $|\alpha |=p$.
For a general specification of the drift coefficient b and diffusion coefficient σ in model (1.1) the propagator system (3.2) cannot be solved explicitly. Thus, the precise infinite dimensional Wiener chaos expansion (1.3) is out of reach. For simulation purposes it is an obvious idea to consider a finite subset of $\mathcal{I}$ in the expansion (1.3). We introduce the index set
(3.3)
\[ {\mathcal{I}_{p,k}}:=\big\{\alpha \in \mathcal{I}:\hspace{2.5pt}|\alpha |\le p\hspace{2.5pt}\hspace{2.5pt}\text{and}\hspace{2.5pt}\hspace{2.5pt}{\alpha _{i}}=0\hspace{2.5pt}\text{for all}\hspace{2.5pt}i\mathrm{>}k\big\}.\]
The approximation of ${X_{t}}$ is now defined via (1.5):
(3.4)
\[ {X_{t}^{p,k}}=\sum \limits_{\alpha \in {\mathcal{I}_{p,k}}}{x_{\alpha }}(t){\varPsi ^{\alpha }}.\]
We remark that the quantity ${X_{t}^{p,k}}$ is more useful than the Euler approximation ${X_{t}^{n}}$ introduced in (1.2) if we are interested in the approximation of the first two moments of ${X_{t}}$. Indeed, the first two moments of ${X_{t}^{p,k}}$ are given explicitly by
\[ \mathbb{E}\big[{X_{t}^{p,k}}\big]={x_{0}}(t)\hspace{1em}\text{and}\hspace{1em}\mathbb{E}\big[{\big({X_{t}^{p,k}}\big)^{2}}\big]=\sum \limits_{\alpha \in {\mathcal{I}_{p,k}}}{x_{\alpha }^{2}}(t),\]
while higher order moments can be computed via an application of the multiplication formula (2.6).
The strong approximation error associated with the truncation (3.4) has been previously studied in [21, 22] in a slightly different context. In particular, in both papers the authors consider one specific basis ${({e_{i}})}_{i\ge 1}$ of ${L^{2}}([0,T])$, whereas we are interested in the asymptotic analysis for a general basis ${({e_{i}})}_{i\ge 1}$. While [22] mostly uses methods from analysis, our approach is based upon Malliavin calculus and is close in spirit to [21]. The main result of our paper gives an upper bound on the ${L^{2}}$-error $\mathbb{E}[{({X_{t}^{p,k}}-{X_{t}})^{2}}]$.
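As a concrete check of these moment formulas, the following sketch combines the closed-form coefficients of Example 3.2(ii) with the identity $\mathbb{E}[{({X_{t}^{p,k}})^{2}}]={\sum _{\alpha \in {\mathcal{I}_{p,k}}}}{x_{\alpha }^{2}}(t)$ and compares the result with the exact second moment ${x_{0}^{2}}{e^{(2b+{\sigma ^{2}})t}}$ of the geometric Brownian motion. The choice of the trigonometric basis, the parameters and all function names are illustrative, not taken from the paper.

```python
import numpy as np
from itertools import combinations_with_replacement
from math import factorial

# Antiderivatives E_j(t) = \int_0^t e_j(s) ds of the trigonometric basis of L^2([0,1]).
def E(j, t):
    if j == 1:
        return t
    m = j // 2
    if j % 2 == 0:
        return (1.0 - np.cos(2 * np.pi * m * t)) / (np.sqrt(2.0) * np.pi * m)
    return np.sin(2 * np.pi * m * t) / (np.sqrt(2.0) * np.pi * m)

def second_moment(x0, b, sig, t, p, k):
    """Sum of x_alpha(t)^2 over alpha in I_{p,k} for geometric Brownian
    motion, using the closed-form coefficients of Example 3.2(ii)."""
    Ev = [E(j, t) for j in range(1, k + 1)]
    total = x0**2 * np.exp(2 * b * t)  # term for alpha = 0
    for n in range(1, p + 1):
        # alpha with |alpha| = n corresponds to a multiset of n indices from {1,...,k}
        for idx in combinations_with_replacement(range(k), n):
            fact, prod = 1.0, 1.0
            for j in set(idx):
                a_j = idx.count(j)      # entry alpha_j of the multi-index
                fact *= factorial(a_j)  # builds alpha!
                prod *= Ev[j] ** (2 * a_j)
            total += x0**2 * sig ** (2 * n) * np.exp(2 * b * t) * prod / fact
    return total

# Truncated second moment vs. the exact value x0^2 * exp((2b + sigma^2) t).
approx = second_moment(x0=1.0, b=0.05, sig=0.3, t=0.6, p=4, k=40)
exact = np.exp((2 * 0.05 + 0.3**2) * 0.6)
```

Already for small p the truncation error of the chaos order is negligible here, which is consistent with the factorial decay in Theorem 3.3 below.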
Theorem 3.3.
Let ${({X_{t}})}_{t\in [0,T]}$ be the solution of the SDE (1.1). Suppose that the coefficient functions b and σ satisfy the Lipschitz and linear growth conditions
(3.5)
\[\begin{aligned}{}\big|b(t,x)-b(t,y)\big|+\big|\sigma (t,x)-\sigma (t,y)\big|& \le K|x-y|,\hspace{1em}t\in [0,T],\\ {} {\big|b(t,x)\big|^{2}}+{\big|\sigma (t,x)\big|^{2}}& \le {K^{2}}\big(1+|x{|^{2}}\big),\hspace{1em}t\in [0,T].\end{aligned}\]
Moreover, assume that $b,\sigma \in {C^{1,\infty }}([0,T]\times \mathbb{R})$, where ${C^{1,\infty }}([0,T]\times \mathbb{R})$ denotes the space of functions $f:[0,T]\times \mathbb{R}\to \mathbb{R}$ that are once continuously differentiable in the first component and infinitely differentiable in the second component, such that
\[\begin{aligned}{}\bigg|\frac{{\partial ^{l+m}}}{\partial {t^{l}}\partial {x^{m}}}b(t,x)-\frac{{\partial ^{l+m}}}{\partial {t^{l}}\partial {x^{m}}}b(t,y)\bigg|+\bigg|\frac{{\partial ^{l+m}}}{\partial {t^{l}}\partial {x^{m}}}\sigma (t,x)-\frac{{\partial ^{l+m}}}{\partial {t^{l}}\partial {x^{m}}}\sigma (t,y)\bigg|& \hspace{0.1667em}\le \hspace{0.1667em}K|x\hspace{0.1667em}-\hspace{0.1667em}y|\end{aligned}\]
for $t\in [0,T]$, any $l=0,1$, and $m\ge 0$. Then it holds that
(3.6)
\[ \mathbb{E}\big[{\big({X_{t}^{p,k}}-{X_{t}}\big)^{2}}\big]\le C\big(1+{{x_{0}}^{2}}\big)\Bigg(\frac{1}{(p+1)!}+{\sum \limits_{l=k+1}^{\infty }}\Bigg({E_{l}^{2}}(t)+{\int _{0}^{t}}{E_{l}^{2}}(\tau )d\tau \Bigg)\Bigg),\]
where $C=C(t,K)$ is a positive constant and the function ${E_{l}}(t)$ is defined by
(3.7)
\[ {E_{l}}(t):={\int _{0}^{t}}{e_{l}}(s)ds.\]
Let us give some remarks about the statement (3.6). First of all, recalling the Karhunen–Loève expansion ${W_{t}}={\sum _{l=1}^{\infty }}{E_{l}}(t){\int _{0}^{T}}{e_{l}}(s)d{W_{s}}$, we readily deduce that
\[\begin{aligned}{}\mathbb{E}\big[{W_{t}^{2}}\big]& ={\sum \limits_{l=1}^{\infty }}{E_{l}^{2}}(t)\mathrm{<}\infty \hspace{1em}\text{and}\\ {} {\int _{0}^{t}}\mathbb{E}\big[{W_{s}^{2}}\big]ds& ={\sum \limits_{l=1}^{\infty }}{\int _{0}^{t}}{E_{l}^{2}}(\tau )d\tau \mathrm{<}\infty .\end{aligned}\]
Hence, the upper bound on the right-hand side of (3.6) indeed converges to 0 as $p,k\to \infty $. We also note that the error associated with the truncation of the chaos order does not depend on the particular choice of the basis ${({e_{i}})}_{i\ge 1}$, while the error associated with the truncation of the basis strongly depends on ${({e_{i}})}_{i\ge 1}$ (which is not really surprising). Furthermore, we remark that it is computationally very costly to compute all coefficients ${x_{\alpha }}(t)$, $\alpha \in {\mathcal{I}_{p,k}}$, for a large chaos order p. Thanks to the first bound ${((p+1)!)^{-1}}$ in (3.6) it is sufficient to use small values of p in practical situations (usually $p\le 4$).
Example 3.4.
Let us explicitly compute the last terms of the upper bound (3.6) for two prominent bases of ${L^{2}}([0,1])$.
(i) (Trigonometric basis) Consider the orthonormal basis ${({e_{i}})}_{i\ge 1}$ given by
\[ {e_{1}}(t)=1,\hspace{2em}{e_{2j}}(t)=\sqrt{2}\sin (2\pi jt),\hspace{2em}{e_{2j+1}}(t)=\sqrt{2}\cos (2\pi jt),\hspace{1em}j\ge 1.\]
In this setting we obtain that
\[ {E_{2j}^{2}}(t)=\frac{1}{2{\pi ^{2}}{j^{2}}}{\big(1-\cos (2\pi jt)\big)^{2}},\hspace{2em}{E_{2j+1}^{2}}(t)=\frac{1}{2{\pi ^{2}}{j^{2}}}{\sin ^{2}}(2\pi jt),\]
for any $j\ge 1$. Consequently, we deduce that
(3.8)
\[ {\sum \limits_{l=k+1}^{\infty }}\Bigg({E_{l}^{2}}(t)+{\int _{0}^{t}}{E_{l}^{2}}(\tau )d\tau \Bigg)\le \frac{C}{k}\]
for some $C\mathrm{>}0$.
(ii) (Haar basis) The Haar basis is the collection of functions $\{{e_{0}},{e_{j,n}}:\hspace{2.5pt}n\ge 1,\hspace{2.5pt}1\le j\le {2^{n-1}}\}$ defined as follows:
\[ {e_{0}}(t)=1,\hspace{2em}{e_{j,n}}(t)=\left\{\begin{array}{l@{\hskip10.0pt}l}{2^{(n-1)/2}}:\hspace{1em}& t\in [{2^{-n+1}}(j-1),{2^{-n}}(2j-1)],\\ {} -{2^{(n-1)/2}}:\hspace{1em}& t\in [{2^{-n}}(2j-1),{2^{-n+1}}j],\\ {} 0:\hspace{1em}& \text{else}.\end{array}\right.\]
In this case we deduce the following representation for ${E_{j,n}}(t)$:
\[ {E_{j,n}}(t)=\left\{\begin{array}{l@{\hskip10.0pt}l}{2^{(n-1)/2}}(t-{2^{-n+1}}(j-1)):\hspace{1em}& t\in [{2^{-n+1}}(j-1),{2^{-n}}(2j-1)],\\ {} {2^{(n-1)/2}}({2^{-n+1}}j-t):\hspace{1em}& t\in [{2^{-n}}(2j-1),{2^{-n+1}}j],\\ {} 0:\hspace{1em}& \text{else}.\end{array}\right.\]
Since ${\max _{t\in [0,1]}}{E_{j,n}^{2}}(t)={2^{-(n+1)}}$ and ${E_{j,n}}(t)\ne 0$ only for $t\in [{2^{-n+1}}(j-1),{2^{-n+1}}j]$, we finally obtain that
(3.9)
\[ {\sum \limits_{l=n+1}^{\infty }}{\sum \limits_{j=1}^{{2^{l-1}}}}\Bigg({E_{j,l}^{2}}(t)+{\int _{0}^{t}}{E_{j,l}^{2}}(\tau )d\tau \Bigg)\le C{2^{-n}}.\]
Note that this exponential asymptotic rate for a fixed pth order Wiener chaos is equivalent to $\mathcal{O}({k^{-1}})$ if we choose $k={2^{n}}$ basis elements $({e_{i}})$. We will come back to these two bases in Section 5, where we provide exemplary numerical results that highlight this identical asymptotic rate, but also the different error distributions over time.
4 Proofs
Throughout the proofs $C=C(t,K)$ denotes a generic positive constant, which may change from line to line. We start with a proposition that gives an upper bound for the ${L^{2}}$-norm of the Malliavin derivatives of ${X_{t}}$.
Proposition 4.1.
Suppose that the conditions of Theorem 3.3 hold. Then, for any $n\in {\mathbb{N}_{0}}$ and all ${s_{1}},\dots ,{s_{n}},t\in [0,T]$, it holds that
(4.1)
\[ \mathbb{E}\big[{\big|{D_{{s_{1}},\dots ,{s_{n}}}^{n}}{X_{t}}\big|^{2}}\big]\le {C^{n}}\big(1+{{x_{0}}^{2}}\big).\]
Proof.
We show the assertion of Proposition 4.1 by induction over n. For $n=0$ the result is a well-known consequence of the Lipschitz and linear growth conditions (3.5) on the functions b and σ (see e.g. [15, Theorem 2.9, page 289], which relates the second moment of ${X_{t}}$ to the second moment of the initial condition ${X_{0}}={x_{0}}$). For $n=1$ we know from the formula (2.13) that the process ${({D_{s}}{X_{t}})}_{s\in [0,T]}$ satisfies the SDE
\[ {D_{s}}{X_{t}}=\sigma (s,{X_{s}})+{\int _{s}^{t}}\frac{\partial }{\partial x}b(u,{X_{u}}){D_{s}}{X_{u}}du+{\int _{s}^{t}}\frac{\partial }{\partial x}\sigma (u,{X_{u}}){D_{s}}{X_{u}}d{W_{u}}\]
for $s\le t$, and ${D_{s}}{X_{t}}=0$ for $s\mathrm{>}t$. Next, we introduce the two-dimensional stochastic process ${({Y_{s;t}^{(1)}})_{t\ge s}}$ via
\[ {Y_{s;t}^{(1)}}:=({X_{t}},{D_{s}}{X_{t}}).\]
Note that the process ${({Y_{s;t}^{(1)}})_{t\ge s}}$ satisfies a two-dimensional SDE with initial condition given by ${Y_{s;s}^{(1)}}$. In particular, setting ${Y_{s;t}^{(1)}}=({Y_{s;t}^{(1.1)}},{Y_{s;t}^{(1.2)}})$, we have the representation
\[ {D_{s}}{X_{t}}=\sigma \big(s,{Y_{s;s}^{(1.1)}}\big)+{\int _{s}^{t}}\overline{b}\big(u,{Y_{s;u}^{(1)}}\big)du+{\int _{s}^{t}}\overline{\sigma }\big(u,{Y_{s;u}^{(1)}}\big)d{W_{u}},\]
where $\overline{b}(u,y)=\frac{\partial }{\partial x}b(u,{y_{1}}){y_{2}}$ and $\overline{\sigma }(u,y)=\frac{\partial }{\partial x}\sigma (u,{y_{1}}){y_{2}}$ for $y=({y_{1}},{y_{2}})$. Applying the estimates of [15, Theorem 2.9, page 289] to the diffusion process ${({Y_{s;t}^{(1)}})_{t\ge s}}$, we conclude the estimate
(4.2)
\[\begin{aligned}{}\mathbb{E}\big[{({D_{s}}{X_{t}})^{2}}\big]& \le \mathbb{E}\big[{\big\| {Y_{s,t}^{(1)}}\big\| ^{2}}\big]\le C\big(1+\mathbb{E}\big[{\big\| {Y_{s,s}^{(1)}}\big\| ^{2}}\big]\big)\\ {} & \le C\big(1+\mathbb{E}\big[{X_{s}^{2}}\big]+\mathbb{E}\big[\sigma {(s,{X_{s}})^{2}}\big]\big)\\ {} & \le C\big(1+{x_{0}^{2}}\big).\end{aligned}\]
Now, we perform the induction step. Assume that the assertion of Proposition 4.1 holds for all $k=1,\dots ,n-1$. Notice that ${D_{s}}{X_{t}}\in {\mathbb{D}^{1,\infty }}$, so its Malliavin derivative satisfies again an SDE according to [24, Theorem 2.2.2]. We define the stochastic process
\[ {Y_{{s_{1}},\dots ,{s_{n}};t}^{(n)}}:=\big({X_{t}},{D_{{s_{1}}}}{X_{t}},{D_{{s_{1}},{s_{2}}}^{2}}{X_{t}},\dots ,{D_{{s_{1}},\dots ,{s_{n}}}^{n}}{X_{t}}\big).\]
In order to compute the estimate, we have to consider the initial values of the SDE system satisfied by ${Y_{{s_{1}},\dots ,{s_{n}};\hspace{0.1667em}t}^{(n)}}$. The formula below can be found in [24, Theorem 2.2.2]. Before we state it, we need some additional notation. The stochastic process ${D^{m}}{X_{t}}=\{{D_{{s_{1}},\dots ,{s_{m}}}^{m}}{X_{t}}:\hspace{2.5pt}({s_{1}},\dots ,{s_{m}})\in {[0,T]^{m}}\}$ depends on the m time points ${s_{1}},\dots ,{s_{m}}$. For any subset $J=\{{j_{1}}\mathrm{<}\cdots \mathrm{<}{j_{\eta }}\}$ of $\{1,\dots ,m\}$ with $|J|=\eta \le m$ elements, denote $s(J)=({s_{{j_{1}}}},\dots ,{s_{{j_{\eta }}}})$. Further on, we define
(4.3)
\[ \mathfrak{z}(t,{s_{1}},\dots ,{s_{m}})=\sum \limits_{{\mathfrak{P}^{m}}}\frac{{\partial ^{\nu }}}{\partial {x^{\nu }}}\sigma (t,{X_{t}}){D_{s({J_{1}})}^{|s({J_{1}})|}}{X_{t}}\cdots {D_{s({J_{\nu }})}^{|s({J_{\nu }})|}}{X_{t}}\]
and
(4.4)
\[ \mathfrak{y}(t,{s_{1}},\dots ,{s_{m}})=\sum \limits_{{\mathfrak{P}^{m}}}\frac{{\partial ^{\nu }}}{\partial {x^{\nu }}}b(t,{X_{t}}){D_{s({J_{1}})}^{|s({J_{1}})|}}{X_{t}}\cdots {D_{s({J_{\nu }})}^{|s({J_{\nu }})|}}{X_{t}},\]
where the sums run over the set ${\mathfrak{P}^{m}}$ of all partitions ${J_{1}}\cup \cdots \cup {J_{\nu }}$ of $\{1,\dots ,m\}$ into disjoint nonempty sets ${J_{1}},\dots ,{J_{\nu }}$ (so that ν denotes the number of sets of the partition). For $m=0$ we set $\mathfrak{z}(t)=\sigma (t,{X_{t}})$. With these notations at hand, we find by (2.13) and induction that the nth order Malliavin derivative ${D_{{s_{1}},\dots ,{s_{n}}}^{n}}{X_{t}}$ satisfies the linear SDE
(4.5)
\[\begin{aligned}{}{D_{{s_{1}},\dots ,{s_{n}}}^{n}}{X_{t}}& ={\sum \limits_{i=1}^{n}}\mathfrak{z}({s_{i}},{s_{1}},\dots ,{s_{i-1}},{s_{i+1}},\dots ,{s_{n}})\\ {} & \hspace{2em}+{\int _{\hat{s}}^{t}}\mathfrak{y}(u,{s_{1}},\dots ,{s_{n}})du+{\int _{\hat{s}}^{t}}\mathfrak{z}(u,{s_{1}},\dots ,{s_{n}})d{W_{u}}\end{aligned}\]
for $\hat{s}:=\max \{{s_{1}},\dots ,{s_{n}}\}\le t$ and ${D_{{s_{1}},\dots ,{s_{n}}}^{n}}{X_{t}}=0$ else. Hence, its initial value is given by
\[ {\sum \limits_{i=1}^{n}}\mathfrak{z}({s_{i}},{s_{1}},\dots ,{s_{i-1}},{s_{i+1}},\dots ,{s_{n}}),\]
where, for instance,
\[\begin{aligned}{}& \mathfrak{z}({s_{1}},{s_{2}},\dots ,{s_{n}})=\frac{{\partial ^{n-1}}}{\partial {x^{n-1}}}\sigma ({s_{1}},{X_{{s_{1}}}}){D_{{s_{2}}}}{X_{{s_{1}}}}\cdots {D_{{s_{n}}}}{X_{{s_{1}}}}\\ {} & \hspace{2em}+\frac{{\partial ^{n-2}}}{\partial {x^{n-2}}}\sigma ({s_{1}},{X_{{s_{1}}}}){D_{{s_{2}},{s_{3}}}^{2}}{X_{{s_{1}}}}{D_{{s_{4}}}}{X_{{s_{1}}}}\cdots {D_{{s_{n}}}}{X_{{s_{1}}}}+\cdots +\frac{\partial }{\partial x}\sigma ({s_{1}},{X_{{s_{1}}}}){D_{{s_{2}},\dots ,{s_{n}}}^{n-1}}{X_{{s_{1}}}}.\end{aligned}\]
Finally, we apply [15, Theorem 2.9, page 289] as in the case $n=1$, taking into account that the assertion holds for $k=1,\dots ,n-1$:
\[\begin{aligned}{}& \mathbb{E}\big[{\big|{D_{{s_{1}},\dots ,{s_{n}}}^{n}}{X_{t}}\big|^{2}}\big]\le \mathbb{E}\big[{\big\| {Y_{{s_{1}},\dots ,{s_{n}};\hspace{0.1667em}t}^{(n)}}\big\| ^{2}}\big]\\ {} & \hspace{2em}\le C\Bigg(1+\mathbb{E}\big[|{X_{\hat{s}}}{|^{2}}\big]+\cdots +\mathbb{E}\Bigg[{\Bigg|{\sum \limits_{i=1}^{n}}\mathfrak{z}({s_{i}},{s_{1}},\dots ,{s_{i-1}},{s_{i+1}},\dots ,{s_{n}})\Bigg|^{2}}\Bigg]\Bigg)\\ {} & \hspace{2em}\le {C^{n}}\big(1+{{x_{0}}^{2}}\big).\end{aligned}\]
This completes the proof of Proposition 4.1. □Now, we proceed with the proof of the main result of Theorem 3.3, which follows the ideas of [21, Proof of Theorem 2.2]. Observe the decomposition
where $d(\alpha ):=\max \{i\ge 1:\hspace{2.5pt}{\alpha _{i}}\mathrm{>}0\}$. To determine an upper bound for ${A_{1}}(p)$, we use the chaos expansions (1.3) and (2.7) of ${X_{t}}$, i.e.,
where we abbreviate
Finally, using (4.8) and applying Proposition 4.1 we obtain
The treatment of the term ${A_{2}}(p,k)$ is more involved. Recalling the definition of $d(\alpha )=\max \{i\ge 1:\hspace{2.5pt}{\alpha _{i}}\mathrm{>}0\}$, we introduce the characteristic set $({i_{1}},\dots ,{i_{n}})$ of $\alpha \in \mathcal{I}$ for ${i_{1}}\le {i_{2}}\le \cdots \le {i_{n}}$ and $|\alpha |=n$. It is defined as follows: ${i_{1}}$ is the index of the first non-zero component of α. If ${\alpha _{{i_{1}}}}=1$ then ${i_{2}}$ is the index of the second non-zero component of α; otherwise ${i_{1}}={i_{2}}=\cdots ={i_{{\alpha _{{i_{1}}}}}}$ and ${i_{{\alpha _{{i_{1}}}}+1}}$ is the second non-zero component of α. The same operation is repeated for the index ${i_{{\alpha _{{i_{1}}}}+1}}$ and further non-zero components of α. In this fashion, the characteristic set is constructed, resulting in the observation that $d(\alpha )={i_{n}}$. To give a simple example, consider the multiindex $\alpha =(2,0,1,4,0,0,\dots )\in \mathcal{I}$. Here the characteristic set of α is given by
For any $\alpha \in \mathcal{I}$ with $|\alpha |=n$ and a basis ${({e_{i}})}_{i\ge 1}$ of $\mathbb{H}={L^{2}}([0,T])$, we denote by ${\widetilde{e}_{\alpha }}({\mathbf{t}^{n}})$ the (scaled) symmetrised form of $\bigotimes {e_{i}^{\otimes {\alpha _{i}}}}$ defined via
where $({i_{1}},\dots ,{i_{n}})$ is the characteristic set of α and the sum runs over all permutations π within the permutation group ${\mathfrak{P}^{n}}$ of $\{1,\dots ,n\}$. From (3.1) we know that ${x_{\alpha }}(t)=\mathbb{E}[{X_{t}}{\varPsi ^{\alpha }}]$. On the other hand, we have the representation
by (2.11). Since
with ${t_{0}}:=0$ and ${t_{n+1}}:=t$, which follows by changing the order of integration. For any ${i_{n}}=l\ge 1$ we then integrate by parts to deduce
where
Now we substitute in (4.12) by renaming the components of ${\mathbf{t}_{j}^{n}}$ as follows: for each j, set ${s_{i}}={t_{i}}$ for all $i\le j-1$ and ${s_{i}}={t_{i+1}}$ for all $i>j-1$, so that ${\mathbf{s}^{n-1}}:={\mathbf{t}_{j}^{n}}$ with ${s_{0}}=0$ and ${s_{n}}=t$. Moreover, we denote by ${\mathbf{s}^{n-1,r}}$, $r=1,\dots ,n-1$, the vector obtained from ${\mathbf{s}^{n-1}}$ by taking ${s_{r}}$ twice. To complete this notation, we set ${\mathbf{s}^{n-1,0}}=({s_{0}},{s_{1}},\dots ,{s_{n-1}})$ and ${\mathbf{s}^{n-1,n}}=({s_{1}},\dots ,{s_{n-1}},{s_{n}})$. Then
Consequently, we deduce
Combining (4.9) and (4.15) we obtain the result of Theorem 3.3.
(4.6)
\[ \mathbb{E}\big[{\big({X_{t}^{p,k}}-{X_{t}}\big)^{2}}\big]\le 2\underset{=:{A_{1}}(p)}{\underbrace{{\sum \limits_{n=p+1}^{\infty }}\hspace{2.5pt}\sum \limits_{|\alpha |=n}{x_{\alpha }^{2}}(t)}}+2\underset{=:{A_{2}}(p,k)}{\underbrace{{\sum \limits_{l=k+1}^{\infty }}\hspace{2.5pt}{\sum \limits_{n=0}^{p}}\hspace{2.5pt}\sum \limits_{\begin{array}{c}|\alpha |=n\\ {} d(\alpha )=l\end{array}}{{x_{\alpha }}^{2}}(t)}},\]
\[ {X_{t}}={\sum \limits_{n=0}^{\infty }}\hspace{2.5pt}\sum \limits_{|\alpha |=n}{x_{\alpha }}(t){\varPsi ^{\alpha }}={\sum \limits_{n=0}^{\infty }}{I_{n}}\big({\xi _{n}}\big({\mathbf{t}^{n}};t\big)\big),\]
with ${\mathbf{t}^{n}}=({t_{1}},\dots ,{t_{n}})$ and symmetric kernel functions ${\xi _{n}}({\mathbf{t}^{n}};t)$ being defined by
for all $n\in {\mathbb{N}_{0}}$. We conclude that
(4.7)
\[\begin{aligned}{}\sum \limits_{|\alpha |=n}{{x_{\alpha }}^{2}}(t)& =\mathbb{E}\bigg[{\bigg(\sum \limits_{|\alpha |=n}{x_{\alpha }}(t){\varPsi ^{\alpha }}\bigg)^{2}}\bigg]=\mathbb{E}\big[{\big({I_{n}}\big({\xi _{n}}\big({\mathbf{t}^{n}};t\big)\big)\big)^{2}}\big]\\ {} & =n!{\big\langle {\xi _{n}}\big({\mathbf{t}^{n}};\hspace{0.1667em}t\big),{\xi _{n}}\big({\mathbf{t}^{n}};\hspace{0.1667em}t\big)\big\rangle }_{\mathbb{H}}\\ {} & ={(n!)^{2}}{\int ^{(n);t}}{\big({\xi _{n}}\big({\mathbf{t}^{n}};\hspace{0.1667em}t\big)\big)^{2}}d{\mathbf{t}^{n}},\end{aligned}\]
\[ {\int ^{(n);t}}f(\cdot )d{\mathbf{t}^{n}}:={\int _{0}^{t}}\dots {\int _{0}^{{t_{2}}}}f(\cdot )d{t_{1}}\dots d{t_{n}}.\]
Therefore, we deduce via (4.7) that
(4.8)
\[ {A_{1}}(p)={\sum \limits_{n=p+1}^{\infty }}{(n!)^{2}}{\int ^{(n);t}}{\big({\xi _{n}}\big({\mathbf{t}^{n}};t\big)\big)^{2}}d{\mathbf{t}^{n}}={\sum \limits_{n=p+1}^{\infty }}{\int ^{(n);t}}\mathbb{E}\big[{\big({D_{{t_{1}},\dots ,{t_{n}}}^{n}}{X_{t}}\big)^{2}}\big]d{\mathbf{t}^{n}}.\]
(4.9)
\[\begin{aligned}{}{A_{1}}(p)& ={\sum \limits_{n=p+1}^{\infty }}{\int ^{(n);t}}\mathbb{E}\big[{\big({D_{{t_{1}},\dots ,{t_{n}}}^{n}}{X_{t}}\big)^{2}}\big]d{\mathbf{t}^{n}}\\ {} & \le \big(1+{{x_{0}}^{2}}\big){\sum \limits_{n=p+1}^{\infty }}{C^{n}}{\int ^{(n);t}}d{\mathbf{t}^{n}}\\ {} & =\big(1+{{x_{0}}^{2}}\big){\sum \limits_{n=p+1}^{\infty }}\frac{{(Ct)^{n}}}{n!}\\ {} & \le C\big(1+{{x_{0}}^{2}}\big)\frac{1}{(p+1)!}.\end{aligned}\]
(4.10)
\[ {\widetilde{e}_{\alpha }}\big({\mathbf{t}^{n}}\big):=\sum \limits_{\pi \in {\mathfrak{P}^{n}}}{e_{{i_{1}}}}({t_{\pi (1)}})\cdots {e_{{i_{n}}}}({t_{\pi (n)}}),\]
\[ {\varPsi ^{\alpha }}=\frac{1}{\sqrt{\alpha !}}{I_{|\alpha |}}\Bigg({\underset{i=1}{\overset{\infty }{\bigotimes }}}{e_{i}^{\otimes {\alpha _{i}}}}\Bigg)=\frac{1}{\sqrt{\alpha !}}{I_{|\alpha |}}\bigg(\frac{1}{|\alpha |!}{\widetilde{e}_{\alpha }}\bigg).\]
Now, for any $\alpha \in \mathcal{I}$ with $|\alpha |=n$, we obtain the identity
(4.11)
\[ {x_{\alpha }}(t)=\frac{n!}{\sqrt{\alpha !}}{\int ^{(n);t}}{\xi _{n}}\big({\mathbf{t}^{n}};t\big){\widetilde{e}_{\alpha }}\big({\mathbf{t}^{n}}\big)d{\mathbf{t}^{n}},\]
\[ {\widetilde{e}_{\alpha }}\big({\mathbf{t}^{n}}\big)={\sum \limits_{j=1}^{n}}{e_{{i_{n}}}}({t_{j}})\cdot {\widetilde{e}_{{\alpha ^{-}}({i_{n}})}}\big({\mathbf{t}_{j}^{n}}\big),\]
where ${\mathbf{t}_{j}^{n}}$ is obtained from ${\mathbf{t}^{n}}$ by omitting ${t_{j}}$ and ${\alpha ^{-}}(\cdot )$ denotes the diminished multi-index as defined in (2.10), we deduce that
(4.12)
\[ {x_{\alpha }}(t)=\frac{n!}{\sqrt{\alpha !}}{\sum \limits_{j=1}^{n}}\hspace{2.5pt}{\int ^{(n-1);t}}\Bigg({\int _{{t_{j-1}}}^{{t_{j+1}}}}{\xi _{n}}\big({\mathbf{t}^{n}};t\big)\hspace{0.1667em}{e_{{i_{n}}}}({t_{j}})d{t_{j}}\Bigg){\widetilde{e}_{{\alpha ^{-}}({i_{n}})}}\big({\mathbf{t}_{j}^{n}}\big)d{\mathbf{t}_{j}^{n}},\]
(4.13)
\[ {\int _{{t_{j-1}}}^{{t_{j+1}}}}{\xi _{n}}\big({\mathbf{t}^{n}};t\big)\hspace{0.1667em}{e_{l}}({t_{j}})d{t_{j}}={\big[{\xi _{n}}\big({\mathbf{t}^{n}};t\big)\hspace{0.1667em}{E_{l}}({t_{j}})\big]_{{t_{j}}={t_{j-1}}}^{{t_{j}}={t_{j+1}}}}-{\int _{{t_{j-1}}}^{{t_{j+1}}}}\frac{\partial }{\partial {t_{j}}}{\xi _{n}}\big({\mathbf{t}^{n}};t\big)\hspace{0.1667em}{E_{l}}({t_{j}})d{t_{j}},\]
\[ {\big[{\xi _{n}}\big({\mathbf{t}^{n}};t\big)\hspace{0.1667em}{E_{l}}({t_{j}})\big]_{{t_{j}}={t_{j-1}}}^{{t_{j}}={t_{j+1}}}}={\xi _{n}}\big({\mathbf{s}^{n-1,j}}\big)\hspace{0.1667em}{E_{l}}({s_{j}})-{\xi _{n}}\big({\mathbf{s}^{n-1,j-1}}\big)\hspace{0.1667em}{E_{l}}({s_{j-1}}),\hspace{2.5pt}\hspace{2.5pt}j=1,\dots ,n.\]
Because ${E_{l}}({s_{0}})={E_{l}}(0)=0$ and ${E_{l}}({s_{n}})={E_{l}}(t)$, we see from (4.13) that, upon summing over j, all boundary terms except one cancel out. Hence, setting
\[\begin{aligned}{}{\psi _{l}}\big({\mathbf{s}^{n-1}};t\big)& :={\xi _{n}}\big({\mathbf{s}^{n-1,n}}\big)\hspace{0.1667em}{E_{l}}(t)\\ {} & \hspace{2em}-{\int _{0}^{{s_{1}}}}\frac{\partial }{\partial {s_{1}}}{\xi _{n}}\big(\tau ,{\mathbf{s}^{n-1}};t\big)\hspace{0.1667em}{E_{l}}(\tau )d\tau \\ {} & \hspace{2em}-{\sum \limits_{j=2}^{n-1}}{\int _{{s_{j-1}}}^{{s_{j}}}}\frac{\partial }{\partial {s_{j}}}{\xi _{n}}(\dots ,{s_{j-1}},\tau ,{s_{j+1}},\dots ;t)\hspace{0.1667em}{E_{l}}(\tau )d\tau \\ {} & \hspace{2em}-{\int _{{s_{n-1}}}^{t}}\frac{\partial }{\partial {s_{n}}}{\xi _{n}}\big({\mathbf{s}^{n-1}},\tau ;t\big)\hspace{0.1667em}{E_{l}}(\tau )d\tau ,\end{aligned}\]
we obtain from (4.12)
\[\begin{aligned}{}\sum \limits_{\begin{array}{c}|\alpha |=n\\ {} {i_{n}}=d(\alpha )=l\end{array}}{{x_{\alpha }}^{2}}(t)& =\sum \limits_{\begin{array}{c}|\alpha |=n\\ {} {i_{n}}=d(\alpha )=l\end{array}}{\bigg(\frac{n!}{\sqrt{\alpha !}}{\int ^{(n-1);t}}{\psi _{l}}\big({\mathbf{s}^{n-1}};t\big)\hspace{0.1667em}{\widetilde{e}_{{\alpha ^{-}}(l)}}\big({\mathbf{s}^{n-1}}\big)d{\mathbf{s}^{n-1}}\bigg)^{2}}\\ {} & \le {n^{2}}\hspace{0.1667em}\sum \limits_{|\beta |=n-1}{\bigg(\frac{(n-1)!}{\sqrt{\beta !}}{\int ^{(n-1);t}}{\psi _{l}}\big({\mathbf{s}^{n-1}};t\big)\hspace{0.1667em}{\widetilde{e}_{\beta }}\big({\mathbf{s}^{n-1}}\big)d{\mathbf{s}^{n-1}}\bigg)^{2}},\end{aligned}\]
since $|{\alpha ^{-}}({i_{|\alpha |}})|=|\alpha |-1$ and $\alpha !\ge {\alpha ^{-}}({i_{|\alpha |}})!$. In order to interpret the last sum, consider the random variable ${I_{n-1}}({\psi _{l}}({\mathbf{s}^{n-1}};t))$. Then, following (4.7) and the identity (4.11), we conclude that
\[\begin{aligned}{}\mathbb{E}\big[{\big({I_{n-1}}\big({\psi _{l}}\big({\mathbf{s}^{n-1}};t\big)\big)\big)^{2}}\big]& ={\big((n-1)!\big)^{2}}{\int ^{(n-1);t}}{\psi _{l}^{2}}\big({\mathbf{s}^{n-1}};t\big)d{\mathbf{s}^{n-1}}\\ {} & =\sum \limits_{|\beta |=n-1}\hspace{-0.1667em}{\bigg(\frac{(n-1)!}{\sqrt{\beta !}}\hspace{-0.1667em}{\int ^{(n-1);t}}\hspace{-0.1667em}{\psi _{l}}\big({\mathbf{s}^{n-1}};t\big)\hspace{0.1667em}{\widetilde{e}_{\beta }}\big({\mathbf{s}^{n-1}}\big)d{\mathbf{s}^{n-1}}\bigg)^{2}}.\end{aligned}\]
Thus, we get the estimate
\[ \sum \limits_{\begin{array}{c}|\alpha |=n\\ {} {i_{n}}=l\end{array}}{{x_{\alpha }}^{2}}(t)\le {(n!)^{2}}{\int ^{(n-1);t}}{\psi _{l}^{2}}\big({\mathbf{s}^{n-1}};t\big)d{\mathbf{s}^{n-1}}.\]
Now, using the Cauchy–Schwarz inequality we deduce that
\[\begin{aligned}{}{\psi _{l}^{2}}\big({\mathbf{s}^{n-1}};t\big)& \le (n+1)\Bigg({\xi _{n}^{2}}\big({\mathbf{s}^{n-1,n}}\big)\hspace{0.1667em}{E_{l}^{2}}(t)\\ {} & \hspace{2em}+{\int _{0}^{t}}{E_{l}^{2}}(\tau )d\tau \cdot {\sum \limits_{j=1}^{n}}{\int _{{s_{j-1}}}^{{s_{j}}}}{\bigg|\frac{\partial }{\partial {s_{j}}}{\xi _{n}}(\dots ,{s_{j-1}},\tau ,{s_{j+1}},\dots ;t)\bigg|^{2}}d\tau \Bigg).\end{aligned}\]
Recall the identity ${\xi _{n}}(\cdot ;t)={(n!)^{-1}}\mathbb{E}[{D^{n}}{X_{t}}]$. Using similar arguments as in the proof of Proposition 4.1, we obtain the inequality
\[ {\bigg|\frac{\partial }{\partial {t_{i}}}{\xi _{n}}({t_{1}},\dots ,{t_{n}};\hspace{0.1667em}t)\bigg|^{2}}\le \big(1+{x_{0}^{2}}\big)\frac{{C^{n}}}{{(n!)^{2}}},\hspace{1em}i=1,\dots ,n.\]
Since $|{s_{j}}-{s_{j-1}}|\le t$, we finally conclude that
(4.14)
\[ {\psi _{l}^{2}}\big({\mathbf{s}^{n-1}};t\big)\le \big(1+{x_{0}^{2}}\big)\frac{{C^{n}}(n+1)}{{(n!)^{2}}}\Bigg({E_{l}^{2}}(t)+{\int _{0}^{t}}{E_{l}^{2}}(\tau )d\tau \Bigg).\]
\[ \sum \limits_{\begin{array}{c}|\alpha |=n\\ {} {i_{n}}=l\end{array}}{{x_{\alpha }}^{2}}(t)\le \big(1+{x_{0}^{2}}\big)\frac{{C^{n}}(n+1)}{(n-1)!}\Bigg({E_{l}^{2}}(t)+{\int _{0}^{t}}{E_{l}^{2}}(\tau )d\tau \Bigg).\]
Putting things together we conclude
(4.15)
\[\begin{aligned}{}{A_{2}}(p,k)& ={\sum \limits_{l=k+1}^{\infty }}\hspace{2.5pt}{\sum \limits_{n=1}^{p}}\hspace{2.5pt}\sum \limits_{\begin{array}{c}|\alpha |=n\\ {} {i_{n}}=l\end{array}}{{x_{\alpha }}^{2}}(t)\\ {} & \le \big(1+{x_{0}^{2}}\big){\sum \limits_{n=1}^{\infty }}\frac{{C^{n}}(n+1)}{(n-1)!}\cdot {\sum \limits_{l=k+1}^{\infty }}\Bigg({E_{l}^{2}}(t)+{\int _{0}^{t}}{E_{l}^{2}}(\tau )d\tau \Bigg)\\ {} & \le C\big(1+{x_{0}^{2}}\big){\sum \limits_{l=k+1}^{\infty }}\Bigg({E_{l}^{2}}(t)+{\int _{0}^{t}}{E_{l}^{2}}(\tau )d\tau \Bigg).\end{aligned}\]
Fig. 1. Absolute errors of the variance as calculated with different polynomial chaos expansions for the geometric Brownian motion, $|\mathrm{Var}[{X_{t}}]-\mathrm{Var}[{X_{t}^{p,k}}]|$. Shown are the trigonometric and Haar bases (with $k={2^{n}}$) from Example 3.4
5 Numerical results and sparse truncation
To get an idea of the performance of the polynomial chaos expansion for different bases and truncations, we apply the trigonometric basis from Example 3.4 (i) and the Haar basis from Example 3.4 (ii) to the geometric Brownian motion from Example 3.2,
on the horizon $t\in [0,1]$ with $\mu =\sigma ={x_{0}}=1$. In this setting we have the identity
which can be used to compare the absolute errors of the approximated variances for different bases and values of k and p. We used the ode45 solver with default settings in Matlab 2017b (64bit, single core) on an Intel Core i5-6300U (2.4GHz, 8GB RAM).
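For completeness, the mean and variance used in this comparison are the classical moments of geometric Brownian motion, $\mathbb{E}[{X_{t}}]={x_{0}}{e^{\mu t}}$ and $\mathrm{Var}[{X_{t}}]={x_{0}^{2}}{e^{2\mu t}}({e^{{\sigma ^{2}}t}}-1)$. A minimal sketch (the function name is ours):

```python
import math

def gbm_moments(x0, mu, sigma, t):
    """Mean and variance of geometric Brownian motion
    dX_t = mu X_t dt + sigma X_t dW_t with X_0 = x0."""
    mean = x0 * math.exp(mu * t)
    second_moment = x0**2 * math.exp((2 * mu + sigma**2) * t)
    return mean, second_moment - mean**2

# Setting of this section: mu = sigma = x0 = 1, end of horizon t = 1
mean, var = gbm_moments(1.0, 1.0, 1.0, 1.0)
print(mean, var)  # e ≈ 2.71828, e^2 (e - 1) ≈ 12.6965
```

These closed-form moments serve as the reference values against which the truncated chaos approximations are measured.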
5.1 Absolute errors of variances
Figure 1 shows the absolute errors between the variances of the analytical solution and different polynomial chaos expansions. One observes the fast decay of the approximation error for fixed k and increasing p, and vice versa. The main difference between the observed errors is that the maximum error of the trigonometric expansion is always attained at the end of the time horizon, $t=1$, while the Haar basis leads to an almost vanishing error at k equidistant time points, including $t=0$ and $t=1$.
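The k-dependence of the error is consistent with the tail factor ${\sum _{l>k}}({E_{l}^{2}}(t)+{\int _{0}^{t}}{E_{l}^{2}}(\tau )d\tau )$ appearing in (4.15). As a rough numerical illustration — assuming the cosine basis ${e_{1}}\equiv 1$, ${e_{l}}(t)=\sqrt{2}\cos ((l-1)\pi t)$ on $[0,1]$, which may differ in detail from the basis of Example 3.4 (i) — this tail decays like $1/k$:

```python
import math

def E_sq_integral(l):
    # ∫_0^1 E_l(τ)^2 dτ for the assumed cosine basis e_1 ≡ 1,
    # e_l(t) = √2 cos((l-1)πt) for l ≥ 2, with E_l(t) = ∫_0^t e_l(s) ds.
    # Closed form for l ≥ 2: E_l(t) = √2 sin((l-1)πt)/((l-1)π),
    # hence ∫_0^1 E_l^2 = 1/((l-1)π)^2 and E_l(1) = 0.
    return 1.0 / ((l - 1) * math.pi) ** 2

def tail(k, L=200_000):
    # Truncated tail sum over l = k+1, ..., L-1 of E_l(1)^2 + ∫_0^1 E_l^2;
    # the boundary term E_l(1)^2 vanishes for this basis.
    return sum(E_sq_integral(l) for l in range(k + 1, L))

for k in (4, 8, 16, 32):
    print(k, tail(k))  # the tail roughly halves as k doubles, i.e. it is O(1/k)
```

Under this assumption the tail factor in (4.15) decreases like $1/k$ in the truncation index k.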
Table 1 in the Appendix lists approximation errors, numbers of coefficients, and computational times for several choices of p and k. The exponential increase in the number of coefficients with increasing p and k leads to an exponential increase in runtime, as expected. To overcome this issue, we propose a heuristic to reduce the computational burden.
5.2 Sparse truncation
The information contained in the coefficient functions ${x_{\alpha }}(\cdot )$ decays with increasing order of the basis polynomials ${\varPsi ^{\alpha }}$, as well as with the decay rate of the Gaussian expansion, i.e., with increasing index k of the basis functions ${e_{k}}$ used for constructing ${\varPsi ^{\alpha }}$ [22].
Hence, we can define sparse index sets that reduce the number of multi-indices α compared to the full truncated index set ${\mathcal{I}_{p,k}}$ without losing too much information, i.e., accuracy in a numerical simulation.
Definition 1 (Sparse truncation of first order).
Let p be the maximum order of the index $\alpha \in {\mathcal{I}_{p,k}}$. Then the first order sparse index $\mathbb{r}=({r_{1}},\dots ,{r_{k}})$ satisfies $p={r_{1}}\ge {r_{2}}\ge \cdots \ge {r_{k}}$ and we define the first order sparse index set
Example 5.1.
Let $k=5$ and $p=3$. Then a possible choice of the sparse index is $\mathbb{r}=(3,2,2,1,1)$. For constructing the first order polynomials all five first order basis polynomials can be used. The second order polynomials comprise all possible combinations of the first order basis polynomials depending on ${e_{1}},\dots ,{e_{5}}$ and the second order basis polynomials of ${e_{1}},{e_{2}},{e_{3}}$. The third order polynomials are constructed analogously. By using this first order sparse index set ${\mathcal{I}_{p,k}^{\mathbb{r}}}$, the number of coefficient functions ${x_{\alpha }}(\cdot )$ appearing in the propagator system (3.2) can be reduced drastically without impairing the solution much. The full index set ${\mathcal{I}_{p,k}}$ consists of $\frac{(k+p)!}{k!p!}=56$ terms, whereas this first order sparse index set includes only 42 terms.
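The counts above can be checked by enumeration. A minimal sketch, assuming the first order sparse index set keeps exactly those $\alpha \in {\mathcal{I}_{p,k}}$ with $|\alpha |\le p$ and ${\alpha _{i}}\le {r_{i}}$ (a reading that reproduces the counts 56 and 42 of this example):

```python
from itertools import product
from math import comb

def full_index_set(p, k):
    """All multi-indices alpha in N_0^k with |alpha| <= p."""
    return [a for a in product(range(p + 1), repeat=k) if sum(a) <= p]

def sparse_index_set(r, p):
    """First order sparse truncation: additionally require alpha_i <= r_i."""
    return [a for a in full_index_set(p, len(r))
            if all(ai <= ri for ai, ri in zip(a, r))]

p, k, r = 3, 5, (3, 2, 2, 1, 1)
full = full_index_set(p, k)
sparse = sparse_index_set(r, p)
print(len(full), len(sparse))          # 56 42
assert len(full) == comb(k + p, p)     # (k+p)! / (k! p!)
```

The enumeration confirms the binomial count of the full index set and the reduction to 42 terms under the sparse index $\mathbb{r}=(3,2,2,1,1)$.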
An even more drastic reduction of the number of coefficients included in the propagator system can be achieved by using a second order sparse index, i.e., a series of first order sparse indices ${({\mathbb{r}^{j}})_{j=0,\dots ,p}}$ that depend on the actual order $j=|\alpha |$ of the polynomials ${\varPsi ^{\alpha }}$ [22]. This approach makes it possible to exclude cross products of the random variables $W({e_{i}})$ from the construction of higher order basis polynomials ${\varPsi ^{\alpha }}$ that add only negligible information to the system.
Definition 2 (Sparse truncation of second order).
Let p be the maximum order of the index $\alpha \in {\mathcal{I}_{p,k}}$. Then the second order sparse index $(\mathbb{r})={({\mathbb{r}^{j}})_{j\le p}}$ is a series of first order sparse indices ${\mathbb{r}^{j}}=({r_{1}^{j}},\dots ,{r_{k}^{j}})$ satisfying $j={r_{1}^{j}}\ge {r_{2}^{j}}\ge \cdots \ge {r_{k}^{j}}$ for all $j=|\alpha |\le p$ and we define the second order sparse index set
Example 5.2.
Considering again the setting of the previous example with $k=5$ and $p=3$, one possible choice of a second order sparse index is given via ${\mathbb{r}^{1}}=(1,1,1,1,1)$, ${\mathbb{r}^{2}}=(2,2,2,1,0)$, and ${\mathbb{r}^{3}}=(3,2,0,0,0)$. In constructing basis polynomials of order $|\alpha |=3$ we can use all combinations of basis polynomials depending on the first two random variables $W({e_{1}})$ and $W({e_{2}})$ up to orders 3 and 2, respectively. Thus, these are $\sqrt{6}{H_{3}}({e_{1}})$, $\sqrt{2}{H_{2}}({e_{1}}){H_{1}}({e_{2}})$, and $\sqrt{2}{H_{1}}({e_{1}}){H_{2}}({e_{2}})$; compare [22].
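Under the analogous reading — at each order j keep the multi-indices with $|\alpha |=j$ and ${\alpha _{i}}\le {r_{i}^{j}}$ — a short enumeration recovers exactly the three order-3 products listed above:

```python
from itertools import product

def level_set(r_j, j):
    """Multi-indices alpha with |alpha| = j and alpha_i <= r_i^j
    (our reading of the second order sparse truncation at level j)."""
    return [a for a in product(range(j + 1), repeat=len(r_j))
            if sum(a) == j and all(ai <= ri for ai, ri in zip(a, r_j))]

# Second order sparse index of this example (k = 5, p = 3)
r = {1: (1, 1, 1, 1, 1), 2: (2, 2, 2, 1, 0), 3: (3, 2, 0, 0, 0)}
for j, rj in r.items():
    print(j, len(level_set(rj, j)))
print(level_set(r[3], 3))  # [(1, 2, 0, 0, 0), (2, 1, 0, 0, 0), (3, 0, 0, 0, 0)]
```

The three level-3 multi-indices correspond to ${H_{1}}{H_{2}}$, ${H_{2}}{H_{1}}$, and ${H_{3}}$ in the first two variables, matching the example.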
Table 2 in the Appendix lists the first and second order sparse indices that were used for the numerical study. The results in Table 1 show that, at the price of a slightly higher error and, of course, the loss of a guaranteed upper bound as in (3.6), the computational times can be reduced by several orders of magnitude.