1 Introduction
We consider a one-dimensional continuous stochastic process (Xt)t∈[0,T] that satisfies the stochastic differential equation
dX_t = b(t,X_t) dt + σ(t,X_t) dW_t,  X_0 = x_0,  t∈[0,T],  (1.1)
where (Wt)t∈[0,T] is a Brownian motion defined on a filtered probability space (Ω,F,(Ft)t∈[0,T],P). Various numerical approximation schemes for the SDE (1.1) have been proposed and studied in the literature over the past decades. Probably the most prominent numerical approximation for the solution of (1.1) is the Euler scheme, which can be described as follows. Let φn:R_+→R_+ be the function defined by φn(t)=i/n for t∈[i/n,(i+1)/n). The continuous Euler approximation scheme is described by
X^n_t = x_0 + ∫_0^t b(φn(s), X^n_{φn(s)}) ds + ∫_0^t σ(φn(s), X^n_{φn(s)}) dW_s.  (1.2)
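To fix ideas, the discrete skeleton of the Euler scheme on the grid points i/n can be sketched in a few lines of Python. This is only an illustrative sketch; the function name and the test coefficients below are ours, not quantities from the paper.

```python
import math
import random

def euler_maruyama(b, sigma, x0, T, n, seed=0):
    """Euler scheme on the grid i/n, i = 0,...,n: a minimal sketch of (1.2).
    b and sigma are callables (t, x) -> float; returns the skeleton X^n_{i/n}."""
    rng = random.Random(seed)
    h = T / n
    x, path = x0, [x0]
    for i in range(n):
        t = i * h
        dw = rng.gauss(0.0, math.sqrt(h))  # Brownian increment over [i/n, (i+1)/n]
        x = x + b(t, x) * h + sigma(t, x) * dw
        path.append(x)
    return path
```

In the deterministic case σ = 0 and b(t,x) = x the scheme reduces to the classical Euler method for dx/dt = x, so with x0 = 1 and T = 1 the terminal value approaches e as n grows.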
The probabilistic properties of the Euler approximation scheme have been investigated in numerous papers. We refer to the classical works [2, 3, 14, 16, 17, 23] for the studies on weak and strong approximation errors among many others. Asymptotic results in the framework of non-regular coefficients can be found in e.g. [1, 8, 9, 27].
In this paper we take a different route and propose to use the Wiener chaos expansion (also called polynomial chaos in the literature) to approximate the solution of the SDE (1.1). To explain the ideas, let us fix an orthonormal basis (ei)i≥1 of the separable Hilbert space L2([0,T]). It is a well-known statement (cf. [6]) that if Xt∈L2(Ω,F,P) for all t∈[0,T], where F is generated by the Brownian motion (Wt)t∈[0,T], then Xt admits the chaotic expansion
where xα are deterministic functions, Ψα are mutually orthogonal random projections that are associated to the basis (ei)i≥1, and the index set I is defined via
Such orthogonal expansions have been successfully applied in numerous fields of stochastic and numerical analysis. We refer e.g. to [11–13, 25] for applications of the Wiener chaos expansion in the context of SDEs and to [20–22, 26] for applications of polynomial expansion in modelling, simulation and filtering of stochastic partial differential equations. The aim of this work is to use the chaos expansion (1.3) to numerically approximate the solution of the SDE (1.1). For this purpose we need to truncate the infinite sum in (1.3). An obvious approach is to consider the approximation
where the subset Ip,k⊂I refers to using orthogonal projections Ψα only with respect to the first k basis elements (ei)1≤i≤k and only up to the pth order Wiener chaos. This method is mostly related to the articles [11, 12, 21, 22]. More specifically, the works [21, 22] study the L2-error associated with the approximation (1.5), but only for a particular choice of the basis (ei)i≥1. In this paper we study the decay rate of E[(X_t − X^{p,k}_t)²] as k,p→∞ for a general basis (ei)i≥1 of L2([0,T]), applying methods from Malliavin calculus.
The paper is structured as follows. In Section 2 we present the elements of Malliavin calculus. The main results of the paper are demonstrated in Section 3. Section 4 is devoted to proofs. In Section 5 we illustrate our approach with exemplary numerical results for the Haar and a trigonometric basis, and propose a heuristic based on sparse indices for computational speedup.
2 Background on Malliavin calculus
In this section we introduce some basic concepts of Malliavin calculus. The interested reader is referred to [24] for a more thorough exposition of this subject. Set H=L2([0,T]) and let ⟨⋅,⋅⟩H denote the scalar product on H. We note that H is a separable Hilbert space and denote by (ei)i≥1 an orthonormal basis of H. We consider the isonormal Gaussian family W={W(h):h∈H} indexed by H and defined on a probability space (Ω,F,P), i.e. the random variables W(h) are centered Gaussian with covariance structure determined via
Here and throughout the paper we assume that F=σ(W). In our setting we consider W(h)=∫_0^T h_s dW_s, where W is a standard Brownian motion. We define the normalised Hermite polynomials through the identities
The nth Wiener chaos Hn is the closed linear subspace of L2(Ω,F,P) generated by the family of random variables {Hn(W(h)):‖h‖H=1}. The vector spaces Hn, n≥0, are orthogonal and we have the Wiener chaos expansion
(See [24, Theorem 1.1.1] for more details.) Next, for α∈I, where I has been introduced in (1.4), we define the random variable
where (ei)i≥1 is a fixed orthonormal basis of H. We define |α|=∑_{i=1}^∞ α_i and α!=∏_{i=1}^∞ α_i! for α∈I. We deduce that the set {Ψα : α∈I with |α|=n} forms a complete orthonormal basis of the nth Wiener chaos Hn, and consequently {Ψα : α∈I} is a complete orthonormal basis of L2(Ω,F,P) (cf. [24, Proposition 1.1.1]).
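For concreteness, the normalised Hermite polynomials can be evaluated by the three-term recursion H_{n+1}(x) = (x H_n(x) − √n H_{n−1}(x))/√(n+1), which follows from the classical recursion for the probabilists' Hermite polynomials after dividing by √(n!). A small Python sketch (function name ours):

```python
import math

def hermite_norm(n, x):
    """Normalised probabilists' Hermite polynomial H_n(x) = h_n(x) / sqrt(n!),
    via the recursion H_{n+1}(x) = (x H_n(x) - sqrt(n) H_{n-1}(x)) / sqrt(n+1)."""
    if n == 0:
        return 1.0
    h_prev, h = 1.0, x  # H_0 = 1 and H_1(x) = x
    for k in range(1, n):
        h_prev, h = h, (x * h - math.sqrt(k) * h_prev) / math.sqrt(k + 1)
    return h
```

In this normalisation H_2(x) = (x² − 1)/√2 and H_3(x) = (x³ − 3x)/√6, consistent with the derivative identity H'_n(x) = √n H_{n−1}(x) used later in this section.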
Now, we introduce multiple stochastic integrals of order n∈N, which are denoted by In. For an element h⊗n:=h⊗⋯⊗h in H⊗n with ‖h‖H=1, we define
Requiring the mapping In to be linear, the definition (2.4) can be extended to all symmetric elements of H^{⊗n} by the polarisation identity. Finally, for an arbitrary element h∈H^{⊗n} we set
where h̃ denotes the symmetrised version of h∈H^{⊗n}. By definition In maps H^{⊗n} into Hn, so multiple integrals of different orders are orthogonal. In particular, they satisfy the isometry property
Furthermore, for any symmetric h∈H⊗n,g∈H⊗m, the following multiplication formula holds:
where the rth contraction g⊗rh is defined by
(See [24, Proposition 1.1.3].) The Wiener chaos expansion (2.2) transfers to the context of multiple integrals as follows. For each random variable F∈L2(Ω,F,P) we obtain the orthogonal decomposition
where I0(g0)=E[F] and the decomposition is unique when the kernels gn, n≥1, are assumed to be symmetric (cf. [24, Theorem 1.1.2]).
Next, we introduce the notion of Malliavin derivative and its adjoint operator. We define the set of smooth random variables via
where f∈C∞p(Rn) (i.e. the space of infinitely differentiable functions such that all derivatives exhibit polynomial growth). The kth order Malliavin derivative of F∈S, denoted by DkF, is defined by
Notice that DkF is an H^{⊗k}-valued random variable, and we write D^k_x F for the realisation of the function DkF at the point x∈[0,T]^k. The space D^{k,q} denotes the completion of the set S with respect to the norm
We define D^{k,∞}=∩_{q>1}D^{k,q}. The Malliavin derivative of the random variable Ψα∈S, α∈I, can be easily computed using the definition (2.8) and the formula H'_n(x)=√n H_{n−1}(x); higher order Malliavin derivatives of Ψα are computed similarly.
The operator Dk possesses an unbounded adjoint δk, which is often referred to as the multiple Skorokhod integral. Its domain Dom(δk) consists of all u∈L2(Ω;H^{⊗k}) such that the inequality |E[⟨DkF,u⟩_{H^{⊗k}}]| ≤ c(E[F²])^{1/2} holds for some c>0 and all F∈D^{k,2}. The following integration by parts formula holds (see [24, Exercise 1.3.7]): if u∈Dom(δk) and F∈D^{k,2}, then we have the identity
In case the random variable F∈L2(Ω,F,P) has a chaos decomposition as displayed in (2.7), the statement F∈D^{k,2} is equivalent to the condition ∑_{n=1}^∞ n^k n! ‖g_n‖²_{H^{⊗n}} < ∞. In particular, when F∈D^{1,2} we deduce an explicit chaos representation of the derivative DF=(D_t F)_{t∈[0,T]}
where g_n(·,t):[0,T]^{n−1}→R is obtained from the function g_n by setting the last argument equal to t (see [24, Proposition 1.2.7]). Finally, we present an explicit formula for the Malliavin derivative of the solution of a stochastic differential equation. Assume that (Xt)t∈[0,T] is the solution of the SDE (1.1) and that b(t,·),σ(t,·)∈C1(R). In this setting DXt=(DsXt)s∈[0,T] is given as the solution of the SDE
for s≤t, and DsXt=0 if s>t (see [24, Theorem 2.2.1]). Throughout the paper ∂/∂x denotes the derivative with respect to the space variable and ∂/∂t denotes the derivative with respect to the time variable.
3 Main results
We start with the analysis of the Wiener chaos expansion introduced in (1.3). Under the square integrability assumption on the solution X of the SDE (1.1), we obtain the Wiener chaos expansion
In order to study the strong approximation error we require a good control of the coefficients xα(t). When Xt is sufficiently smooth in the Malliavin sense, we deduce the identity
applying the duality formula (2.11). In fact, the coefficients xα(t) satisfy a system of ordinary differential equations. The following propagator system has been derived in [12, 21]; we present the proof for completeness.
Theorem 3.1.
Let (Xt)t∈[0,T] be the unique solution of the SDE (1.1) and assume that X∈L2(Ω×[0,T]). Then Xt possesses the chaos expansion (1.3) and the coefficients xα(t) satisfy the system of ordinary differential equations
Here bα(t,Xt) (resp. σα(t,Xt)) denotes the α-coefficient of the Wiener chaos expansion (1.3) associated with the random variable b(t,Xt) (resp. σ(t,Xt)), and the multi-index α−(j) is defined by (2.10).
Proof.
Using the SDE (1.1) and applying the formula (3.1) we obtain the identity
Applying the formula (3.1) once again for the random variable b(s,Xs) we immediately deduce that
On the other hand, observing the identity δ(1_{[0,t]} σ(·,X_·)) = ∫_0^t σ(s,X_s) dW_s, we get by the duality formula (2.11) and (2.9) that
Putting things together we obtain the identity
Consequently, the assertion follows after taking the derivative with respect to t. □
We remark that the propagator system (3.2) is recursive. Let us give some simple examples to illustrate how (3.2) can be solved explicitly.
Example 3.2.
(i) (Scaled Brownian motion with drift) We start with the toy example of a scaled Brownian motion with drift:
In this case we obviously have that xα(t)=0 for any α∈I with |α|≥2. Applying formula (3.2) we obtain the representation
which is the well-known Karhunen–Loève expansion of the scaled Brownian motion.
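Such an expansion is easy to check numerically. Assuming for illustration the classical cosine basis e_l(s) = √2 cos((l − 1/2)πs) of L2([0,1]) (one possible choice; the paper's trigonometric basis appears in Example 3.4), the functions E_l(t) = ∫_0^t e_l(s) ds are available in closed form and the truncated variance ∑_{l≤k} E_l(t)² should approach Var[W_t] = t:

```python
import math

def E_cos(l, t):
    """E_l(t) = integral of e_l over [0, t] for e_l(s) = sqrt(2) cos((l - 1/2) pi s)."""
    w = (l - 0.5) * math.pi
    return math.sqrt(2.0) * math.sin(w * t) / w

def kl_variance(t, k):
    """Variance of the k-term truncated Karhunen-Loeve expansion of W_t."""
    return sum(E_cos(l, t) ** 2 for l in range(1, k + 1))
```

The partial sums increase to t from below, with a tail of order 1/k for this basis, which matches the slow O(k^{−1}) basis-truncation rate discussed after Theorem 3.3.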
(ii) (Geometric Brownian motion) Let us consider the geometric Brownian motion defined via the SDE
In this setting the propagator system (3.2) translates to
This system of ordinary differential equations can be solved recursively. For α=0 we have x_0'(t)=b x_0(t) and hence x_0(t)=x_0 exp(bt). If α is the jth canonical unit vector in I (and hence |α|=1) we obtain the differential equation
Hence, x_α(t)=x_0 σ exp(bt) ∫_0^t e_j(s)ds. Following this recursion we obtain the general formula
x_α(t) = x_0 exp(bt) (σ^p/√(α!)) ∏_{i≥1} E_i(t)^{α_i},  where E_i(t) = ∫_0^t e_i(s) ds,
for any α∈I with |α|=p.
For a general specification of the drift coefficient b and the diffusion coefficient σ in model (1.1), the propagator system (3.2) cannot be solved explicitly. Thus, the precise infinite-dimensional Wiener chaos expansion (1.3) is out of reach. For simulation purposes it is an obvious idea to consider a finite subset of I in the expansion (1.3). We introduce the index set
The approximation of Xt is now defined via (1.5):
We remark that the quantity X^{p,k}_t is more useful than the Euler approximation X^n_t introduced in (1.2) if we are interested in the approximation of the first two moments of Xt. Indeed, the first two moments of X^{p,k}_t are given explicitly by
while higher order moments can be computed via an application of the multiplication formula (2.6).
The strong approximation error associated with the truncation (3.4) has been previously studied in [21, 22] in a slightly different context. In particular, in both papers the authors consider one specific basis (ei)i≥1 of L2([0,T]) whereas we are interested in the asymptotic analysis for a general basis (ei)i≥1. While [22] mostly uses methods from analysis, our approach is based upon Malliavin calculus and is close in spirit to [21]. The main result of our paper gives an upper bound on the L2-error E[(Xp,kt−Xt)2].
Theorem 3.3.
Let (Xt)t∈[0,T] be the solution of the SDE (1.1). Suppose that the coefficient functions b and σ satisfy the Lipschitz and linear growth conditions
Moreover, assume that b,σ∈C1,∞([0,T]×R), where C1,∞([0,T]×R) denotes the space of functions f:[0,T]×R→R that are once continuously differentiable in the first component and infinitely differentiable in the second component, such that
for t∈[0,T], any l=0,1, and m≥0. Then it holds that
where C=C(t,K) is a positive constant and the function E_l(t) is defined by E_l(t) = ∫_0^t e_l(s) ds.
Let us give some remarks about the statement (3.6). First of all, recalling the Karhunen–Loève expansion W_t = ∑_{l=1}^∞ E_l(t) ∫_0^T e_l(s) dW_s, we readily deduce that
Hence, the upper bound on the right-hand side of (3.6) indeed converges to 0 as p,k→∞. We also note that the error associated with the truncation of the chaos order does not depend on the particular choice of the basis (ei)i≥1, while the error associated with the truncation of the basis strongly depends on (ei)i≥1 (which is not surprising). Furthermore, we remark that computing all coefficients xα(t), α∈Ip,k, is extremely costly for a large chaos order p. Thanks to the first bound ((p+1)!)^{−1} in (3.6), it is sufficient to use small values of p in practical situations (usually p≤4).
Example 3.4.
Let us explicitly compute the last terms of the upper bound (3.6) for two prominent bases of L2([0,1]).
(i) (Trigonometric basis) Consider the orthonormal basis (ei)i≥1 given by
In this setting we obtain that
for any j≥1. Consequently, we deduce that
for some C>0.
(ii) (Haar basis) The Haar basis is a collection of functions
defined as follows:
In this case we deduce the following representation for Ej,n(t):
E_{j,n}(t) = { 2^{(n−1)/2}(t − 2^{−n+1}(j−1)),  t ∈ [2^{−n+1}(j−1), 2^{−n}(2j−1)],
               2^{(n−1)/2}(2^{−n+1}j − t),      t ∈ [2^{−n}(2j−1), 2^{−n+1}j],
               0,                                else.
Since max_{t∈[0,1]} E²_{j,n}(t) = 2^{−(n+1)} and E_{j,n}(t) ≠ 0 only for t∈[2^{−n+1}(j−1), 2^{−n+1}j], we finally obtain that
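The piecewise description of E_{j,n} translates directly into code. The following sketch (naming ours) reproduces the maximum max_t E²_{j,n}(t) = 2^{−(n+1)} attained at the midpoint of the support:

```python
def E_haar(j, n, t):
    """E_{j,n}(t): integral of the Haar wavelet e_{j,n} over [0, t], a hat function
    supported on [2^{-n+1}(j-1), 2^{-n+1} j] with peak at the midpoint 2^{-n}(2j-1)."""
    a = 2.0 ** (-n + 1) * (j - 1)  # left endpoint of the support
    m = 2.0 ** (-n) * (2 * j - 1)  # midpoint (location of the maximum)
    b = 2.0 ** (-n + 1) * j        # right endpoint
    if t <= a or t >= b:
        return 0.0                 # the wavelet integrates to zero over its full support
    c = 2.0 ** ((n - 1) / 2.0)
    return c * (t - a) if t <= m else c * (b - t)
```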
Note that the exponential asymptotic rate for a fixed pth order Wiener chaos is equivalent to O(k^{−1}) if we choose k=2^n basis elements (ei). We will come back to these two bases in Section 5, where we provide exemplary numerical results that highlight this identical asymptotic rate, but also the different error distributions over time.

4 Proofs
Throughout the proofs C=C(t,K) denotes a generic positive constant, which might change from line to line. We start with a proposition that gives an upper bound for the L2-norm of the Malliavin derivatives of Xt.
Proposition 4.1. Suppose the assumptions of Theorem 3.3 hold. Then there exists a positive constant C=C(t,K) such that for all n≥0 and all s_1,…,s_n∈[0,T]
E[|D^n_{s_1,…,s_n} X_t|²] ≤ C^n(1 + x_0²).

Proof.
We show the assertion of Proposition 4.1 by induction over n. For n=0 the result is a well-known consequence of the Lipschitz and linear growth conditions (3.5) on the functions b and σ (see e.g. [15, Theorem 2.9, page 289], which relates the second moment of Xt to the second moment of the initial condition X0=x0). For n=1 we know from the formula (2.13) that the process (DsXt)s∈[0,T] satisfies the SDE
for s≤t, and DsXt=0 for s>t. Next, we introduce the two-dimensional stochastic process (Y(1)s;t)t≥s via
Note that the process (Y(1)s;t)t≥s satisfies a two-dimensional SDE with initial condition given by Y(1)s;s. In particular, setting Y(1)s;t=(Y(1.1)s;t,Y(1.2)s;t), we have the representation
where b̄(u,y) = ∂_x b(u,y_1)·y_2 and σ̄(u,y) = ∂_x σ(u,y_1)·y_2 for y=(y_1,y_2). Applying again the estimates of [15, Theorem 2.9, page 289] to the diffusion process (Y^{(1)}_{s;t})_{t≥s}, we conclude the estimate
Now, we will perform the induction step. Notice that DsXt∈D1,∞, so its Malliavin derivative satisfies again an SDE according to [24, Theorem 2.2.2]. We define the stochastic process
Assume that the assertion of Proposition 4.1 holds for all k=1,…,n−1. In order to compute the estimate, we have to consider the initial values of the SDE system satisfied by Y^{(n)}_{s_1,…,s_n;t}. The formula below can be found in [24, Theorem 2.2.2]. Before we state it, we need some additional notation. The stochastic process D^m X_t = {D^m_{s_1,…,s_m} X_t | (s_1,…,s_m)∈[0,T]^m} depends on the m time points s_1,…,s_m. For any subset J={j_1<⋯<j_η} of {1,…,m} with |J|=η≤m elements, we denote s(J)=(s_{j_1},…,s_{j_η}). Further on, we define
and
where the sums run over the set Pm of all partitions J1∪⋯∪Jν of {1,…,m} into disjoint sets J1,…,Jν. Moreover, we set z(t)=σ(t,Xt). With these notations at hand, we find by (2.13) and induction that the nth order Malliavin derivative D^n_{s_1,…,s_n} X_t satisfies the linear SDE
for ˆs:=max{s1,…,sn}≤t and Dns1,…,snXt=0 else. Hence, its initial value is given by
where
Finally, we apply [15, Theorem 2.9, page 289] as for the case n=1 and taking into account that the assertion holds for k=1,…,n−1:
E[|D^n_{s_1,…,s_n} X_t|²] ≤ E[‖Y^{(n)}_{s_1,…,s_n;t}‖²] ≤ C(1 + E[|X_ŝ|²] + ⋯ + E[|∑_{i=1}^n z(s_i; s_1,…,s_{i−1},s_{i+1},…,s_n)|²]) ≤ C^n(1 + x_0²).
This completes the proof of Proposition 4.1. □

Now we proceed with the proof of the main result, Theorem 3.3, which follows the ideas of [21, Proof of Theorem 2.2]. Observe the decomposition
where d(α):=max{i≥1:αi>0}. To determine an upper bound for A1(p), we use the chaos expansions (1.3) and (2.7) of Xt, i.e.,
with tn=(t1,…,tn) and symmetric kernel functions ξn(tn;t) being defined by
for all n∈N0. We conclude that
where we abbreviate
Therefore, we deduce via (4.7) that
Finally, using (4.8) and applying Proposition 4.1 we obtain
The treatment of the term A2(p,k) is more involved. Recalling the definition d(α)=max{i≥1: α_i>0}, we introduce the characteristic set (i_1,…,i_n) of α∈I with |α|=n, where i_1≤i_2≤⋯≤i_n. It is defined as follows: i_1 is the index of the first non-zero component of α. If α_{i_1}=1, then i_2 is the index of the second non-zero component of α; otherwise we set i_1=i_2=⋯=i_{α_{i_1}} and let i_{α_{i_1}+1} be the index of the second non-zero component of α. The same operation is repeated for the index i_{α_{i_1}+1} and the further non-zero components of α. In this fashion the characteristic set is constructed, and we observe that d(α)=i_n. To give a simple example, consider the multi-index α=(2,0,1,4,0,0,…)∈I. Here the characteristic set of α is given by (i_1,…,i_7)=(1,1,3,4,4,4,4).
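Equivalently, the characteristic set simply lists each index i with multiplicity α_i, in increasing order. A direct transcription of this definition (function name ours):

```python
def characteristic_set(alpha):
    """Characteristic set (i_1, ..., i_n) of a multi-index alpha = (alpha_1, alpha_2, ...):
    each index i is repeated alpha_i times, in increasing order, so that i_n = d(alpha)."""
    out = []
    for i, a in enumerate(alpha, start=1):
        out.extend([i] * a)
    return tuple(out)
```

For the example α=(2,0,1,4,0,0,…) this returns (1,1,3,4,4,4,4), and the last entry equals d(α)=4.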
For any α∈I with |α|=n and a basis (ei)i≥1 of H=L2([0,T]), we denote by ẽ_α(t_n) the (scaled) symmetrised form of ⊗_{i≥1} e_i^{⊗α_i} defined via
where (i1,…,in) is the characteristic set of α and the sum runs over all permutations π within the permutation group Pn of {1,…,n}. From (3.1) we know that xα(t)=E[XtΨα]. On the other hand, we have the representation
Now, for any α∈I with |α|=n, we obtain the identity
by (2.11). Since
where t_n^j is obtained from t_n by omitting t_j, and α−(·) denotes the diminished multi-index as defined in (2.10), we deduce that
with t0:=0 and tn+1:=t, by changing the order of integration. Then for any in=l≥1 we integrate by parts to deduce
where
Now, we substitute in (4.12) by renaming t_n^j in the following way for each j: with s_i = t_i for all i ≤ j−1 and s_i = t_{i+1} for all i > j−1, we have s_{n−1} := t_n^j, where we additionally set s_0 = 0 and s_n = t. Moreover, we denote by s_{n−1,r}, r=1,…,n−1, the vector generated from s_{n−1} by taking s_r twice. To complete this notation, we set s_{n−1,0} = (s_0,s_1,…,s_{n−1}) and s_{n−1,n} = (s_1,…,s_{n−1},s_n). Then
Because E_l(s_0)=E_l(0)=0 and E_l(s_n)=E_l(t), we see from (4.13) that, summing over j, all terms except one cancel out. Hence, setting
Combining (4.9) and (4.15) we obtain the result of Theorem 3.3.
(4.7)
∑_{|α|=n} x_α²(t) = E[(∑_{|α|=n} x_α(t) Ψ_α)²] = E[(I_n(ξ_n(t_n;t)))²] = n! ⟨ξ_n(·;t), ξ_n(·;t)⟩_{H^{⊗n}} = (n!)² ∫_{(n);t} (ξ_n(t_n;t))² dt_n,

(4.9)
A_1(p) = ∑_{n=p+1}^∞ ∫_{(n);t} E[(D^n_{t_1,…,t_n} X_t)²] dt_n ≤ (1+x_0²) ∑_{n=p+1}^∞ C^n ∫_{(n);t} dt_n = (1+x_0²) ∑_{n=p+1}^∞ (Ct)^n/n! ≤ C(1+x_0²) · 1/(p+1)!.
ψ_l(s_{n−1};t) := ξ_n(s_{n−1,n};t) E_l(t) − ∫_0^{s_1} (∂/∂s_1) ξ_n(τ, s_{n−1};t) E_l(τ) dτ − ∑_{j=2}^{n−1} ∫_{s_{j−1}}^{s_j} (∂/∂s_j) ξ_n(…,s_{j−1},τ,s_{j+1},…;t) E_l(τ) dτ − ∫_{s_{n−1}}^{t} (∂/∂s_n) ξ_n(s_{n−1},τ;t) E_l(τ) dτ,
we obtain from (4.12)
∑_{|α|=n, i_n=d(α)=l} x_α²(t) = ∑_{|α|=n, i_n=d(α)=l} ( (n!/√(α!)) ∫_{(n−1);t} ψ_l(s_{n−1};t) ẽ_{α−(l)}(s_{n−1}) ds_{n−1} )² ≤ n² ∑_{|β|=n−1} ( ((n−1)!/√(β!)) ∫_{(n−1);t} ψ_l(s_{n−1};t) ẽ_β(s_{n−1}) ds_{n−1} )²,
since |α−(i_{|α|})| = |α|−1 and α! ≥ α−(i_{|α|})!. In order to interpret the last sum, consider the random variable I_{n−1}(ψ_l(s_{n−1};t)). Then, following (4.7) and the identity (4.11), we conclude that
E[(I_{n−1}(ψ_l(s_{n−1};t)))²] = ((n−1)!)² ∫_{(n−1);t} ψ_l²(s_{n−1};t) ds_{n−1} = ∑_{|β|=n−1} ( ((n−1)!/√(β!)) ∫_{(n−1);t} ψ_l(s_{n−1};t) ẽ_β(s_{n−1}) ds_{n−1} )².
Thus, we get the estimate
Now, using the Cauchy–Schwarz inequality we deduce that
Recall the identity ξ_n(·;t) = (n!)^{−1} E[D^n X_t]. Using similar arguments as in the proof of Proposition 4.1, we obtain the inequality
Since |sj−sj−1|≤t, we finally conclude that
Consequently, we deduce
Putting things together we conclude
(4.15)
(4.15)
A_2(p,k) = ∑_{l=k+1}^∞ ∑_{n=1}^p ∑_{|α|=n, i_n=l} x_α²(t) ≤ (1+x_0²) ∑_{n=1}^∞ C^n(n+1)/(n−1)! · ∑_{l=k+1}^∞ ( E_l²(t) + ∫_0^t E_l²(τ) dτ ) ≤ C(1+x_0²) ∑_{l=k+1}^∞ ( E_l²(t) + ∫_0^t E_l²(τ) dτ ).

Fig. 1. Absolute errors of the variance |Var[X_t] − Var[X^{p,k}_t]| as calculated with different polynomial chaos expansions for the geometric Brownian motion. The trigonometric and Haar bases (with k=2^n) from Example 3.4 are shown.
5 Numerical results and sparse truncation
To get an idea of the performance of the polynomial chaos expansion for different bases and truncations, we apply the trigonometric basis from Example 3.4 (i) and the Haar basis from Example 3.4 (ii) to the geometric Brownian motion from Example 3.2,
on the horizon t∈[0,1] with μ=σ=x0=1. In this setting we have the identity
which can be used to compare the absolute errors of the approximated variances using different bases and values of k and p. We used the ode45 solver with standard settings in Matlab 2017b (64 bit, single core) on an Intel Core i5-6300U (2.4 GHz, 8 GB RAM).
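As an independent cross-check of such a comparison, the geometric Brownian motion admits closed-form chaos coefficients (Example 3.2 (ii)), so the truncated variance ∑_{α∈I_{p,k}, α≠0} x_α²(t) can be evaluated without an ODE solver. The following Python sketch does this for the Haar basis, taking the constant function together with the Haar functions up to a given level as the first k = 2^levels basis elements; all function names are ours, and this is an illustration rather than the paper's Matlab implementation.

```python
import math
from itertools import product

def E_haar(j, n, t):
    # E_{j,n}(t): integral of the Haar wavelet over [0, t], cf. Example 3.4 (ii)
    a, m, b = 2.0**(-n + 1) * (j - 1), 2.0**(-n) * (2 * j - 1), 2.0**(-n + 1) * j
    if t <= a or t >= b:
        return 0.0
    c = 2.0 ** ((n - 1) / 2.0)
    return c * (t - a) if t <= m else c * (b - t)

def gbm_var_error(t, p, levels, b=1.0, sig=1.0, x0=1.0):
    """|Var[X_t] - Var[X^{p,k}_t]| for dX = b X dt + sig X dW, using the closed-form
    coefficients x_alpha(t) = x0 e^{bt} sig^{|alpha|} prod_i E_i(t)^{alpha_i} / sqrt(alpha!)."""
    # E_i(t) for the basis: the constant function first, then Haar levels 1..levels
    Es = [t] + [E_haar(j, n, t) for n in range(1, levels + 1)
                                for j in range(1, 2 ** (n - 1) + 1)]
    var_trunc = 0.0
    for alpha in product(range(p + 1), repeat=len(Es)):
        order = sum(alpha)
        if order == 0 or order > p:
            continue  # keep only multi-indices with 1 <= |alpha| <= p
        fact = math.prod(math.factorial(a) for a in alpha)
        coef = (x0 * math.exp(b * t) * sig ** order
                * math.prod(e ** a for e, a in zip(Es, alpha)) / math.sqrt(fact))
        var_trunc += coef ** 2
    var_exact = x0 ** 2 * math.exp(2 * b * t) * (math.exp(sig ** 2 * t) - 1.0)
    return abs(var_exact - var_trunc)
```

At dyadic time points the Haar functions reproduce ∑_{i≤k} E_i(t)² = t exactly, so there the error is driven purely by the chaos order p, consistent with the near-zero errors of the Haar expansion on the dyadic grid visible in Figure 1.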
5.1 Absolute errors of variances
Figure 1 shows the absolute errors between the variances of the analytical solution and different polynomial chaos expansions. One observes the fast decay of the approximation error for fixed k and increasing p, and vice versa. The main difference between the observed errors is that the maximum error of the trigonometric expansion always occurs at the end of the time horizon t=1, while the Haar basis leads to almost zero error at k equidistant time points, including t=0 and t=1.
Table 1 in the Appendix lists approximation errors, numbers of coefficients, and computational times for several choices of p and k. The exponential increase in the number of coefficients with increasing p and k leads to an exponential increase in runtime, as expected. To overcome this issue, we propose a heuristic to reduce the computational burden.
5.2 Sparse truncation
The information contained in the coefficient functions xα(⋅) decays both with increasing order p of the basis polynomials Ψα and with increasing index k of the basis elements ek used for constructing Ψα [22].
Hence, we can define sparse index sets for reducing the number of multi-indices α from the full truncated index set Ip,k without losing too much information, i.e., accuracy in a numerical simulation.
Example 5.1.
Let k=5 and p=3. Then a possible choice of the sparse index is r=(3,2,2,1,1). For constructing the first order polynomials all five first order basis polynomials can be used. The second order polynomials consist of all possible combinations of first order basis polynomials depending on e1,…,e5 and the second order basis polynomials of e1,e2,e3. The third order polynomials are constructed analogously. By using this first order sparse index set I^r_{p,k}, the number of coefficient functions xα(⋅) appearing in the propagator system (3.2) can be reduced drastically without impairing the solution much. The full index set Ip,k consists of (k+p)!/(k!p!)=56 terms, whereas this first order sparse index set includes only 42 terms.
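The two counts can be verified by direct enumeration. Interpreting the first order sparse index r componentwise (the basis element e_i may appear in Ψα at most up to order r_i), a short sketch (naming ours):

```python
from itertools import product

def count_indices(k, p, r=None):
    """Number of multi-indices alpha over k basis elements with |alpha| <= p,
    optionally restricted by a first order sparse index r (alpha_i <= r_i)."""
    bounds = list(r) if r is not None else [p] * k
    return sum(1 for alpha in product(*[range(b + 1) for b in bounds])
               if sum(alpha) <= p)
```

Here count_indices(5, 3) returns the 56 terms of the full index set, and count_indices(5, 3, (3, 2, 2, 1, 1)) the 42 terms of the sparse set.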
An even more rigorous reduction of the number of coefficients included in the propagator system can be achieved by using a second order sparse index, i.e., a series of first order sparse indices (rj)j=1,…,p that depend on the actual order j=|α| of the polynomials Ψα [22]. This approach makes it possible to exclude from the construction of higher order basis polynomials Ψα those cross products of random variables W(ei) that add only negligible information to the system.
Example 5.2.
Considering again the setting of the previous example with k=5 and p=3, one possible choice of a second order sparse index is given via r1=(1,1,1,1,1), r2=(2,2,2,1,0), and r3=(3,2,0,0,0). In constructing basis polynomials of order |α|=3 we can use all combinations of basis polynomials depending on the first two random variables W(e1) and W(e2) up to orders 3 and 2, respectively. Thus, these are √6 H3(W(e1)), √2 H2(W(e1))H1(W(e2)), and √2 H1(W(e1))H2(W(e2)); compare [22].
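Interpreting the second order sparse index as imposing, for each order j = |α|, the componentwise bound α_i ≤ (r_j)_i, the admitted multi-indices can again be enumerated directly (a sketch under this interpretation; naming ours):

```python
from itertools import product

def second_order_sparse(p, r_list):
    """Multi-indices admitted by a second order sparse index (r_1, ..., r_p):
    alpha with |alpha| = j must satisfy alpha_i <= (r_j)_i componentwise."""
    k = len(r_list[0])
    admitted = []
    for alpha in product(range(p + 1), repeat=k):
        j = sum(alpha)
        if 1 <= j <= p and all(a <= b for a, b in zip(alpha, r_list[j - 1])):
            admitted.append(alpha)
    return admitted
```

For the choice above, exactly the three third order polynomials listed in the example survive, corresponding to α = (3,0,0,0,0), (2,1,0,0,0) and (1,2,0,0,0).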
Table 2 in the Appendix lists the first and second order sparse indices that were used for the numerical study. The results in Table 1 show that, at the price of a slightly higher error and, of course, the loss of a guaranteed upper bound as in (3.6), the computational times can be reduced by several orders of magnitude.