1 Introduction
For standard optimal control problems, the dynamic principle of optimality [40] shows that an optimal control remains optimal when it is restricted to a later time interval, meaning that optimal controls are time-consistent. This time-consistency feature is a powerful tool for dealing with optimal control problems. The dynamic principle of optimality consists in establishing relationships among a family of time-consistent optimal control problems, parameterized by the initial pairs of time and state, through the so-called Hamilton–Jacobi–Bellman (HJB) equation, which is a nonlinear partial differential equation. If the HJB equation is solvable, then an optimal feedback control can be found by taking the optimizer of the general Hamiltonian involved in the HJB equation.
However, in reality, time-consistency can be lost in various ways, meaning that, as time goes by, an optimal control might not remain optimal. Among the several possible causes of time-inconsistency, a few play particularly important roles.
The portfolio optimization problem with a hyperbolic discount function [11] and the risk aversion attitude in mean-variance models [17, 43] and [44] are two well-known cases of time-inconsistency in mathematical finance. Motivated by the second example, the present paper studies a general linear-quadratic optimal control problem for jump diffusions, which is time-inconsistent in the sense that it does not satisfy the Bellman optimality principle, due to the presence of quadratic terms in the expected controlled state process as well as a state-dependent risk aversion term in the running and terminal cost functionals. The fundamental challenge when dealing with time-inconsistent optimal control models is that, in general, we cannot employ the dynamic programming approach and the standard HJB equation. One way to get around the time-inconsistency issue is to consider only precommitted strategies; see, e.g., [45] and [26].
However, the main method for dealing with time-inconsistency is to treat time-inconsistent problems as noncooperative games, in which a decision is taken at every moment of time by a separate player who aims to maximize or minimize his/her own objective functional. As a result, Nash equilibria are considered rather than optimal solutions; see, e.g., [3, 8, 11, 15, 16, 23, 24, 28, 37, 38] and [39]. Strotz [28] was the first to apply this game perspective to a dynamic time-inconsistent decision problem, namely the deterministic Ramsey problem. He proposed a rudimentary notion of Nash equilibrium strategy by capturing the concept of noncommitment and allowing the commitment period to become infinitesimally small. Further references extending [28] are [16, 24] and [13]. Ekeland and Pirvu [11] gave a formal definition of feedback Nash equilibrium controls in a continuous-time setting in order to investigate the optimal investment–consumption problem under general discount functions in both deterministic and stochastic frameworks. Björk and Murgoci [3] and Ekeland et al. [10] are two further extensions of Ekeland and Pirvu’s work. Yong [39] proposed an alternative method for analyzing general discounting time-inconsistent optimal control problems in a continuous-time setting by considering a discrete-time counterpart. Zhao et al. [42] investigated the consumption–investment problem under a general discount function and a logarithmic utility function using Yong’s method. Wang and Wu investigated a partially observed time-inconsistent recursive optimization problem in [33]. Basak and Chabakauri [1] touched upon the continuous-time Markowitz mean-variance portfolio selection problem, while Björk et al. [4] addressed mean-variance portfolio selection with state-dependent risk aversion. Hu et al. [15], followed by Czichowski [8], found a time-consistent strategy for mean-variance portfolio selection in a non-Markovian framework.
Linear-quadratic (LQ) optimal control problems are well known as a fundamental class of optimal control problems, since they cover a wide range of problems in applications, such as the mean-variance portfolio selection model in finance. Furthermore, the LQ model may be used to approximate many nonlinear control problems. In recent years, time-inconsistent LQ control problems have received a lot of attention. Yong worked on a general discounted time-inconsistent deterministic LQ model in [37], where he considered a forward ordinary differential equation coupled with a backward Riccati–Volterra integral equation to obtain closed-loop equilibrium strategies. Hu et al. [15] presented a specific definition of open-loop Nash equilibrium controls in a continuous-time setting, which is distinct from that for the feedback controls provided in [11], in order to analyze a time-inconsistent stochastic linear-quadratic optimal control problem with stochastic coefficients. Yong [39] studied a time-inconsistent stochastic LQ problem for mean-field type stochastic differential equations. Finally, Hu et al. [14] looked into the uniqueness of the equilibrium solution found in [15]. They were the first to give a positive result regarding the uniqueness of the solution to a time-inconsistent problem.
There is little work in the literature concerning equilibrium strategies for optimal investment and reinsurance problems under the mean-variance criterion. Zeng and Li [43] were the first to study Nash equilibrium strategies for mean-variance insurers with constant risk aversion, where the surplus process of the insurers is described by a diffusion model and the price processes of the risky stocks are driven by geometric Brownian motions. They obtained equilibrium reinsurance and investment strategies explicitly using the technique described in [3]. Li and Li [17] obtained equilibrium strategies in the case of state-dependent risk aversion through a set of well-posed integral equations. Zeng et al. [44] investigated time-consistent investment and reinsurance strategies for mean-variance insurers under constant risk aversion, in which the surplus process and the price process of the risky stock are both jump-diffusion processes.
Markov regime-switching models have recently attracted a lot of interest in financial applications; see, for example, [46, 5, 6, 34] and [18]. Markov regime-switching models allow the market to face financial crises at any moment: at any given time the market is supposed to be governed by some regime. A bull market, in which stock prices are generally increasing, is a standard illustration of such a regime. After a financial crisis the market’s behavior radically changes, and a switch of the regime represents the crisis. The problem of mean-variance optimization in a continuous-time Markov regime-switching financial market was first studied by Zhou and Yin [46]. By applying stochastic linear-quadratic control methods, they obtained mean-variance efficient portfolios and efficient frontiers by solving two systems of linear ordinary differential equations. Chen et al. [5] and Chen and Yang [6] studied the mean-variance asset-liability management problem in continuous-time and multiperiod models, respectively. Mean-variance asset-liability management problems in a continuous-time Markov regime-switching setup have been studied by Wei et al. [34], who explicitly derived a time-consistent investment strategy using the method described in [3]. Liang and Song [18] investigated optimal investment and reinsurance problems for insurers with mean-variance utility under partial information, where the stock’s drift rate and the risk aversion of the insurer are both Markov-modulated.
In this work, we present a general time-inconsistent stochastic conditional LQ control problem. Unlike most existing studies [15, 39, 2, 42], where the noise is driven by a Brownian motion, in our LQ system the state evolves according to an SDE in which the noise is driven by a multidimensional Brownian motion and an independent multidimensional Poisson point process under a Markov regime-switching setup. The objective functional covers continuous-time mean-variance criteria with state-dependent risk aversion. We establish a stochastic system that describes open-loop Nash equilibrium controls, using the variational technique proposed by Hu et al. [14]. We emphasize that our model generalizes the ones investigated by Zeng and Li [43], Li and Li [17], Sun and Guo [30] and Zeng et al. [44], in addition to some classes of time-inconsistent stochastic LQ optimal control problems introduced in [15].
The paper is organized as follows. In the second section, we formulate the problem and provide essential notations and preliminaries. Section 3 is dedicated to our main result, namely the necessary and sufficient conditions for equilibrium, and to obtaining the unique equilibrium control in state feedback representation through a specific class of ordinary differential equations. In the last section, we apply the results of Section 3 to find the unique equilibrium reinsurance, investment and consumption strategies for the mean-variance-utility portfolio problem, and we discuss some special cases. The paper concludes with an Appendix containing some proofs.
2 Problem setting
Let $(\Omega ,\mathcal{F},\mathbb{F},\mathbb{P})$ be a filtered probability space where $\mathbb{F}:=\left\{\left.{\mathcal{F}_{t}}\right|t\in [0,T]\right\}$ is a right-continuous, $\mathbb{P}$-completed filtration to which all of the processes outlined below are adapted, such as the Markov chain, the Brownian motions, and the Poisson random measures.
Throughout the present paper, we assume that the Markov chain $\alpha \left(\cdot \right)$ takes values in a finite state space $\chi =\left\{{e_{1}},{e_{2}},\dots ,{e_{d}}\right\}$, where $d\in \mathbb{N}$, ${e_{i}}\in {\mathbb{R}^{d}}$ and the j-th component of ${e_{i}}$ is the Kronecker delta ${\delta _{ij}}$ for each $\left(i,j\right)\in {\left\{1,\dots ,d\right\}^{2}}$. $\mathcal{H}:={\left({\lambda _{ij}}\right)_{1\le i,j\le d}}$ denotes the rate matrix of the Markov chain under $\mathbb{P}$; that is, ${\lambda _{ij}}$ is the constant transition intensity of the chain from state ${e_{i}}$ to state ${e_{j}}$ at time t, for each $\left(i,j\right)\in {\left\{1,\dots ,d\right\}^{2}}$. As a result, for $i\ne j$ we have ${\lambda _{ij}}\ge 0$ and ${\textstyle\sum \limits_{j=1}^{d}}{\lambda _{ij}}=0$, thus ${\lambda _{ii}}\le 0$. In the sequel, for each $i,j=1,2,\dots ,d$ with $i\ne j$, we assume that ${\lambda _{ij}}>0$; consequently, ${\lambda _{ii}}<0$. We have the following semimartingale representation of the Markov chain $\alpha \left(\cdot \right)$, obtained from Elliott et al. [12]:
\[ \alpha \left(t\right)=\alpha \left(0\right)+{\int _{0}^{t}}{\mathcal{H}^{\top }}\alpha (\tau )d\tau +\mathcal{M}(t),\]
where $\{\mathcal{M}(t)|t\in [0,T]\}$ is an ${\mathbb{R}^{d}}$-valued $(\mathbb{F},\mathbb{P})$-martingale. First, we provide a set of Markov jump martingales linked with the chain $\alpha \left(\cdot \right)$, which will be used to model the controlled state process. For each $\left(i,j\right)\in {\left\{1,\dots ,d\right\}^{2}}$, with $i\ne j$, and $t\in \left[0,T\right]$, denote by ${J^{ij}}\left(t\right):={\lambda _{ij}}{\textstyle\int _{0}^{t}}\left\langle \alpha \left(\tau -\right),{e_{i}}\right\rangle d\tau +{m_{ij}}(t)$ the number of jumps from state ${e_{i}}$ to state ${e_{j}}$ up to time t, where ${m_{ij}}(t):={\textstyle\int _{0}^{t}}\left\langle \alpha \left(\tau -\right),{e_{i}}\right\rangle \left\langle d\mathcal{M}\left(\tau \right),{e_{j}}\right\rangle $ is an $(\mathbb{F},\mathbb{P})$-martingale. Let ${\Phi _{j}}(t)$ denote the number of jumps into state ${e_{j}}$ up to time t, for each fixed $j=1,2,\dots ,d$; then
\[\begin{aligned}{}{\Phi _{j}}(t)& ={\sum \limits_{i=1,i\ne j}^{d}}{J^{ij}}\left(t\right),\\ {} & ={\sum \limits_{i=1,i\ne j}^{d}}{\lambda _{ij}}{\int _{0}^{t}}\left\langle \alpha \left(\tau -\right),{e_{i}}\right\rangle d\tau +{\tilde{\Phi }_{j}}(t),\end{aligned}\]
where ${\tilde{\Phi }_{j}}(t):={\textstyle\sum \limits_{i=1,i\ne j}^{d}}{m_{ij}}(t)$ is an $\left(\mathbb{F},\mathbb{P}\right)$-martingale for each $j=1,2,\dots ,d$. For each $j=1,2,\dots ,d$ set
\[ {\lambda _{j}}(t):={\sum \limits_{i=1,i\ne j}^{d}}{\lambda _{ij}}{\int _{0}^{t}}\left\langle \alpha \left(\tau -\right),{e_{i}}\right\rangle d\tau .\]
Note that the process ${\tilde{\Phi }_{j}}(t)={\Phi _{j}}(t)-{\lambda _{j}}(t)$ is an $\left(\mathbb{F},\mathbb{P}\right)$-martingale, for each $j=1,2,\dots ,d$.
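To make the compensated counting processes above concrete, the following minimal Python sketch (illustrative only; the two-state rate matrix and all parameters are assumptions, not taken from the paper) simulates the chain $\alpha(\cdot)$ on a time grid, counts the jumps ${\Phi _{j}}$ into a fixed state ${e_{j}}$, accumulates the compensator ${\lambda _{j}}$, and checks that ${\tilde{\Phi }_{j}}(T)={\Phi _{j}}(T)-{\lambda _{j}}(T)$ has mean close to zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical two-state rate matrix H = (lambda_ij); rows sum to zero.
H = np.array([[-1.0, 1.0],
              [0.5, -0.5]])
T, dt, n_paths = 5.0, 0.01, 2000
n_steps = int(T / dt)
j = 1                                    # count jumps into state e_2 (index 1)

samples = []
for _ in range(n_paths):
    state, phi_j, lam_j = 0, 0.0, 0.0
    for _ in range(n_steps):
        if state != j:
            lam_j += H[state, j] * dt    # compensator lambda_j grows while the chain sits in e_i, i != j
        if rng.random() < -H[state, state] * dt:   # the chain leaves its current state
            state = 1 - state
            if state == j:
                phi_j += 1.0             # one more jump into e_j
    samples.append(phi_j - lam_j)        # a sample of Phi_tilde_j(T)

print(np.mean(samples))                  # approximately 0, by the martingale property
```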
Now, we present the Markov regime-switching Poisson random measures. Assume that ${N_{i}}(dt,dz)$, $i=1,2,\dots ,l$, are independent Poisson random measures on $\left(\left[0,T\right]\times {\mathbb{R}_{0}},\mathcal{B}\left(\left[0,T\right]\right)\otimes {\mathcal{B}_{0}}\right)$ under $\mathbb{P}$. Assume that the compensator for the Poisson random measure ${N_{i}}(dt,dz)$ is defined by
\[ {n_{\alpha }^{i}}(dt,dz):={\theta _{\alpha \left(t-\right)}^{i}}(dz)dt=\left\langle \alpha \left(t-\right),{\theta ^{i}}(dz)\right\rangle dt,\]
where ${\theta ^{i}}(dz):={\left({\theta _{{e_{1}}}^{i}}(dz),{\theta _{{e_{2}}}^{i}}(dz),\dots ,{\theta _{{e_{d}}}^{i}}(dz)\right)^{\top }}\in {\mathbb{R}^{d}}$. The subscript α in ${n_{\alpha }^{i}}$, for $i=1,2,\dots ,l$, indicates the dependence of the probability law of the Poisson random measure on the Markov chain $\alpha \left(\cdot \right)$. In fact, ${\theta _{{e_{j}}}^{i}}(dz)$ is the conditional Lévy density of the jump sizes of the random measure ${N_{i}}(dt,dz)$ at time t when $\alpha \left(t-\right)={e_{j}}$, for each $j=1,2,\dots ,d$. Furthermore, the compensated Poisson random measure ${\tilde{N}_{\alpha }^{i}}(dt,dz)$ is given by
\[ {\tilde{N}_{\alpha }^{i}}(dt,dz):={N_{i}}(dt,dz)-{n_{\alpha }^{i}}(dt,dz),\hspace{1em}i=1,2,\dots ,l.\]
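As a complementary sketch (again with made-up parameters, not taken from the paper), one can sample from such a Markov-modulated Poisson random measure by letting both the jump intensity (the total mass of the regime-dependent Lévy density) and the conditional jump-size law switch with the current regime $\alpha(t-)$:

```python
import numpy as np

rng = np.random.default_rng(1)

T, dt = 1.0, 0.001
H = np.array([[-2.0, 2.0], [1.0, -1.0]])       # assumed chain rate matrix
intensity = {0: 3.0, 1: 8.0}                   # assumed total mass of theta^1_{e_j} in regime j
sample_size = {0: lambda: rng.exponential(0.5),   # assumed conditional jump-size laws
               1: lambda: rng.exponential(1.5)}

t, state, jumps = 0.0, 0, []
while t < T:
    if rng.random() < intensity[state] * dt:   # a jump of the random measure occurs in [t, t+dt)
        jumps.append((t, sample_size[state]()))
    if rng.random() < -H[state, state] * dt:   # regime switch of the chain alpha
        state = 1 - state
    t += dt

print(len(jumps), sum(z for _, z in jumps))    # number of jumps and aggregate jump size
```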
2.1 Notations
Throughout this paper, we use the following notations: ${S^{n}}$ is the set of $n\times n$ symmetric real matrices. ${C^{\top }}$ is the transpose of the vector (or matrix) C. $\left\langle \cdot ,\cdot \right\rangle $ is the inner product in some Euclidean space. For any Euclidean space $H={\mathbb{R}^{n}}$, or ${S^{n}}$ with Frobenius norm $\left|\cdot \right|$, and $p,l,d\in \mathbb{N}$ we denote for any $t\in \left[0,T\right]$:
• ${\mathbb{L}^{p}}\left(\Omega ,{\mathcal{F}_{t}},\mathbb{P};H\right)=\left\{\xi :\Omega \to H|\xi \hspace{2.5pt}\text{is}\hspace{2.5pt}{\mathcal{F}_{t}}\text{-measurable},\hspace{2.5pt}\text{s.t.}\hspace{2.5pt}\mathbb{E}\left[{\left|\xi \right|^{p}}\right]<\infty \right\}$, for any $p\ge 1$;
• ${\mathbb{L}^{2}}\left({\mathbb{R}^{\ast }},\mathcal{B}\left({\mathbb{R}^{\ast }}\right),\theta ;{H^{l}}\right)=\bigg\{r\left(\cdot \right):{\mathbb{R}^{\ast }}\to {H^{l}}|r\left(\cdot \right)={\left({r_{k}}\left(\cdot \right)\right)_{k=1,2,\dots ,l}}\hspace{2.5pt}\text{is}\hspace{2.5pt}\mathcal{B}\left({\mathbb{R}^{\ast }}\right)\text{-measurable with}\hspace{2.5pt}{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{\left|{r_{k}}\left(z\right)\right|^{2}}{\theta _{\alpha }^{k}}\left(dz\right)<\infty \bigg\}$;
• ${\mathcal{S}_{\mathcal{F}}^{2}}\left(t,T;H\right)=\bigg\{\mathcal{Y}\left(\cdot \right):\left[t,T\right]\times \Omega \to H|\mathcal{Y}\left(\cdot \right)\hspace{2.5pt}\text{is}\hspace{2.5pt}{\left({\mathcal{F}_{s}}\right)_{s\in \left[t,T\right]}}\text{-adapted},s\mapsto \mathcal{Y}(s)\hspace{2.5pt}\text{is càdlàg},\hspace{2.5pt}\text{with}\hspace{2.5pt}\mathbb{E}\left[\underset{s\in \left[t,T\right]}{\sup }{\left|\mathcal{Y}\left(s\right)\right|^{2}}\right]<\infty \bigg\}$;
• ${\mathcal{L}_{\mathcal{F}}^{2}}\left(t,T;{H^{p}}\right)=\Big\{\mathcal{Y}\left(\cdot \right):\left[t,T\right]\times \Omega \to {H^{p}}|\mathcal{Y}\left(\cdot \right)\hspace{2.5pt}\text{is}\hspace{2.5pt}{\left({\mathcal{F}_{s}}\right)_{s\in \left[t,T\right]}}\text{-adapted},\text{with}\hspace{2.5pt}\mathbb{E}\left[{\textstyle\int _{t}^{T}}{\left|\mathcal{Y}\left(s\right)\right|^{2}}ds\right]<\infty \Big\}$;
• ${\mathcal{L}_{\mathcal{F},p}^{2}}\left(t,T;H\right)=\Big\{\mathcal{Y}\left(\cdot \right):\left[t,T\right]\times \Omega \to H|\mathcal{Y}\left(\cdot \right)\hspace{2.5pt}\text{is}\hspace{2.5pt}{\left({\mathcal{F}_{s}}\right)_{s\in \left[t,T\right]}}\text{-predictable},\text{with}\hspace{2.5pt}\mathbb{E}\left[{\textstyle\int _{t}^{T}}{\left|\mathcal{Y}\left(s\right)\right|^{2}}ds\right]<\infty \Big\}$;
• ${\mathcal{L}_{\mathcal{F},p}^{\theta ,2}}\left(\left[t,T\right]\times {\mathbb{R}^{\ast }};{H^{l}}\right)=\bigg\{\mathcal{R}\left(\cdot ,\cdot \right):\left[t,T\right]\times \Omega \times {\mathbb{R}^{\ast }}\to {H^{l}}|\mathcal{R}\left(\cdot ,\cdot \right)\hspace{2.5pt}\text{is}\hspace{2.5pt}{\left({\mathcal{F}_{s}}\right)_{s\in \left[t,T\right]}}\text{-predictable, with}\hspace{2.5pt}{\textstyle\sum \limits_{k=1}^{l}}\mathbb{E}\left[{\textstyle\int _{t}^{T}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{\left|{R_{k}}\left(s,z\right)\right|^{2}}{\theta _{\alpha }^{k}}\left(dz\right)ds\right]<\infty \bigg\}$;
• ${\mathcal{L}_{\mathcal{F},p}^{\lambda ,2}}\left(t,T;{H^{d}}\right)=\bigg\{\mathcal{Y}\left(\cdot \right):\left[t,T\right]\times \Omega \to {H^{d}}|\mathcal{Y}\left(\cdot \right)={\left({\mathcal{Y}_{j}}\left(\cdot \right)\right)_{j=1,\dots ,d}}\hspace{2.5pt}\text{is}\hspace{2.5pt}{\left({\mathcal{F}_{s}}\right)_{s\in \left[t,T\right]}}\text{-predictable, with}\hspace{2.5pt}\mathbb{E}\bigg[{\textstyle\int _{t}^{T}}{\textstyle\sum \limits_{j=1}^{d}}{\left|{\mathcal{Y}_{j}}\left(s\right)\right|^{2}}{\lambda _{j}}\left(s\right)ds\bigg]<\infty \bigg\}$;
• $\mathcal{C}\left(\left[0,T\right];H\right)=\left\{f:\left[0,T\right]\to H|f\left(\cdot \right)\hspace{2.5pt}\text{is continuous}\right\}$;
• ${\mathcal{C}^{1}}\left(\left[0,T\right];H\right)=\left\{f:\left[0,T\right]\to H|f\left(\cdot \right)\hspace{2.5pt}\text{and}\hspace{2.5pt}\frac{df}{ds}\left(\cdot \right)\hspace{2.5pt}\text{are continuous}\right\}$;
• $\mathcal{D}\left[0,T\right]=\left\{\left(t,s\right)\in \left[0,T\right]\times \left[0,T\right]\hspace{2.5pt}\text{such that}\hspace{2.5pt}s\ge t\right\}$.
2.2 Assumptions and problem formulation
Throughout this paper, we consider a multidimensional nonhomogeneous linear controlled jump-diffusion system starting from the initial data $\left(t,\xi ,{e_{i}}\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\big(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};{\mathbb{R}^{n}}\big)\times \chi $, defined by
(2.1)
\[ \left\{\begin{array}{l}dX\left(s\right)=\left\{A\left(s,\alpha \left(s\right)\right)X\left(s\right)+B\left(s,\alpha \left(s\right)\right)u\left(s\right)+b\left(s,\alpha \left(s\right)\right)\right\}ds\hspace{1em}\\ {} \phantom{dX\left(s\right)=}+{\textstyle\sum \limits_{i=1}^{p}}\left\{{C_{i}}\left(s,\alpha \left(s\right)\right)X\left(s\right)+{D_{i}}\left(s,\alpha \left(s\right)\right)u\left(s\right)+{\sigma _{i}}\left(s,\alpha \left(s\right)\right)\right\}d{W^{i}}\left(s\right)\hspace{1em}\\ {} \phantom{dX\left(s\right)=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}\left\{{E_{k}}\left(s,z,\alpha \left(s\right)\right)X\left(s-\right)\right.+{F_{k}}\left(s,z,\alpha \left(s\right)\right)u\left(s\right)\hspace{1em}\\ {} \phantom{dX\left(s\right)=+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}}+\left.{c_{k}}\left(s,z,\alpha \left(s\right)\right)\right\}{\tilde{N}_{\alpha }^{k}}\left(ds,dz\right),\hspace{2.5pt}s\in \left[t,T\right],\hspace{1em}\\ {} X\left(t\right)=\xi ,\alpha \left(t\right)={e_{i}}.\hspace{1em}\end{array}\right.\]The coefficients $A\left(\cdot ,\cdot \right)$, ${C_{i}}\left(\cdot ,\cdot \right):\left[0,T\right]\times \chi \to {\mathbb{R}^{n\times n}}$; $B\left(\cdot ,\cdot \right),{D_{i}}\left(\cdot ,\cdot \right):\left[0,T\right]\times \chi \to {\mathbb{R}^{n\times m}}$; $b\left(\cdot ,\cdot \right)$, ${\sigma _{i}}\left(\cdot ,\cdot \right):\left[0,T\right]\times \chi \to {\mathbb{R}^{n}}$; ${E_{k}}\left(\cdot ,\cdot ,\cdot \right):\left[0,T\right]\times {\mathbb{R}^{\ast }}\times \chi \to {\mathbb{R}^{n\times n}}$; ${F_{k}}\left(\cdot ,\cdot ,\cdot \right):\left[0,T\right]\times {\mathbb{R}^{\ast }}\times \chi \to {\mathbb{R}^{n\times m}}$; ${c_{k}}\left(\cdot ,\cdot ,\cdot \right):\left[0,T\right]\times {\mathbb{R}^{\ast }}\times \chi \to {\mathbb{R}^{n}}$ are deterministic matrix-valued functions. Here, for any $t\in \left[0,T\right)$, the class of admissible control processes over $\left[t,T\right)$ is restricted to ${\mathcal{L}_{\mathcal{F},p}^{2}}\left(t,T;{\mathbb{R}^{m}}\right)$. For any $u\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(t,T;{\mathbb{R}^{m}}\right)$ we denote by $X\left(\cdot \right)={X^{t,\xi ,{e_{i}}}}\left(\cdot ;u\left(\cdot \right)\right)$ its solution. Different controls $u\left(\cdot \right)$ will lead to different solutions $X\left(\cdot \right)$.
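For intuition, the state dynamics (2.1) can be discretized by an Euler-type scheme. The sketch below is a simplified scalar illustration (one Brownian motion, one compensated Poisson term, two regimes, and made-up coefficient values; it is not the paper's scheme):

```python
import numpy as np

rng = np.random.default_rng(2)

# Regime-dependent scalar coefficients (hypothetical values).
A = {0: 0.1, 1: -0.3};  B = {0: 1.0, 1: 1.0};  b = {0: 0.05, 1: 0.0}
C = {0: 0.2, 1: 0.4};   D = {0: 0.1, 1: 0.1};  sg = {0: 0.1, 1: 0.2}
E = {0: 0.1, 1: 0.1};   F = {0: 0.0, 1: 0.0};  c = {0: 0.0, 1: 0.0}
nu = {0: 2.0, 1: 5.0}                      # jump intensity of the Poisson measure per regime
H = np.array([[-1.0, 1.0], [0.5, -0.5]])   # chain rate matrix

T, n = 1.0, 1000
dt = T / n

def simulate(u_feedback, x0=1.0, i0=0):
    x, i = x0, i0
    for k in range(n):
        u = u_feedback(k * dt, x, i)
        dW = rng.normal(0.0, np.sqrt(dt))
        dN = rng.poisson(nu[i] * dt)
        dNc = dN - nu[i] * dt              # compensated Poisson increment
        x += (A[i] * x + B[i] * u + b[i]) * dt \
             + (C[i] * x + D[i] * u + sg[i]) * dW \
             + (E[i] * x + F[i] * u + c[i]) * dNc
        if rng.random() < -H[i, i] * dt:   # regime switch of the chain
            i = 1 - i
    return x

# e.g. a linear feedback control u(t) = -0.5 X(t)
print(simulate(lambda t, x, i: -0.5 * x))
```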
Remark 1.
In practice, an observable switching process is used to represent the interest rate process across various market settings. For example, the market may be broadly split into “bullish” and “bearish” states, with characteristics varying greatly between the two modes. Applications of switching models in mathematical finance can be found, for example, in [5, 6] and the references therein.
To measure the performance of $u\left(\cdot \right)\in {L_{\mathcal{F},p}^{2}}\left(t,T;{\mathbb{R}^{m}}\right)$, we introduce the following cost functional
(2.2)
\[\begin{aligned}{}& J\left(t,\xi ,{e_{i}};u\left(\cdot \right)\right)\\ {} & \hspace{1em}=\mathbb{E}\left[{\int _{t}^{T}}\frac{1}{2}\left\{\left\langle Q\left(s\right)X\left(s\right),X\left(s\right)\right\rangle +\left\langle \bar{Q}\left(s\right)\mathbb{E}\left[X\left(s\right)\left|{\mathcal{F}_{s}^{\alpha }}\right.\right],\mathbb{E}\left[X\left(s\right)\left|{\mathcal{F}_{s}^{\alpha }}\right.\right]\right\rangle \right.\right.\\ {} & \hspace{2em}+\left.\left\langle R\left(t,s\right)u\left(s\right),u\left(s\right)\right\rangle \right\}ds+\left\langle {\mu _{1}}\xi +{\mu _{2}},X\left(T\right)\right\rangle +\frac{1}{2}\left\langle GX\left(T\right),X\left(T\right)\right\rangle \\ {} & \hspace{2em}+\left.\frac{1}{2}\left\langle \bar{G}\mathbb{E}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right],\mathbb{E}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right\rangle \right].\end{aligned}\]Remark 2.
Due to the general influence of the modulating switching process $\alpha (\cdot )$, the conditional expectation, rather than the expectation, is employed in (2.2). The presence of $\alpha (\cdot )$ in all coefficients of the state equation (2.1) makes the objective functional depend on the history of the chain. This type of cost functional is also motivated by practical problems such as the conditional mean-variance portfolio selection problem considered in Section 4 of this paper. A reader interested in this type of problem is referred to [21] and [19]. The term $\left\langle {\mu _{1}}\xi +{\mu _{2}},X\left(T\right)\right\rangle $ stems from a state-dependent utility function in economics [4].
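To make the role of the conditional expectation in (2.2) concrete, the following toy Monte Carlo sketch (scalar dynamics and invented parameters; an illustration only) fixes one path of the chain $\alpha(\cdot)$ and approximates $\mathbb{E}[X(T)|{\mathcal{F}_{T}^{\alpha }}]$ and the conditional variance by averaging over the Brownian noise only, with the regime path held fixed:

```python
import numpy as np

rng = np.random.default_rng(3)

T, n_steps, n_mc = 1.0, 200, 20_000
dt = T / n_steps
lam01, lam10 = 1.0, 0.5                      # hypothetical chain transition rates
drift = {0: 0.05, 1: -0.02}                  # hypothetical regime-dependent coefficients
vol   = {0: 0.20, 1: 0.35}

# one fixed path of the chain alpha (the conditioning variable)
alpha = np.zeros(n_steps, dtype=int)
for k in range(1, n_steps):
    rate = lam01 if alpha[k - 1] == 0 else lam10
    alpha[k] = 1 - alpha[k - 1] if rng.random() < rate * dt else alpha[k - 1]

# Monte Carlo over the Brownian noise, with the chain path held fixed
X = np.full(n_mc, 1.0)
for k in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt), n_mc)
    X += drift[alpha[k]] * X * dt + vol[alpha[k]] * X * dW

cond_mean = X.mean()   # approximates E[X(T) | F_T^alpha] for this chain path
cond_var = X.var()     # approximates the conditional variance of X(T)
print(cond_mean, cond_var)
```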
We need to impose the following assumptions on the coefficients.
(H1) The functions $A\left(\cdot ,\cdot \right)$, $B\left(\cdot ,\cdot \right)$, $b\left(\cdot ,\cdot \right)$, ${C_{i}}\left(\cdot ,\cdot \right)$, ${D_{i}}\left(\cdot ,\cdot \right)$, ${\sigma _{i}}\left(\cdot ,\cdot \right)$, ${E_{k}}\left(\cdot ,\cdot ,\cdot \right)$, ${F_{k}}\left(\cdot ,\cdot ,\cdot \right)$ and ${c_{k}}\left(\cdot ,\cdot ,\cdot \right)$ are deterministic, continuous and uniformly bounded. The coefficients of the cost functional satisfy\[ \left\{\begin{array}{l}Q\left(\cdot \right),\bar{Q}\left(\cdot \right)\in C\left(\left[0,T\right];{S^{n}}\right),\hspace{1em}\\ {} R\left(\cdot ,\cdot \right)\in C\left(\mathcal{D}\left[0,T\right];{S^{m}}\right),\hspace{1em}\\ {} G,\bar{G}\in {S^{n}},\hspace{1em}{\mu _{1}}\in {\mathbb{R}^{n\times n}},\hspace{2.5pt}{\mu _{2}}\in {\mathbb{R}^{n}}.\hspace{1em}\end{array}\right.\]
(H2) The functions $R\left(\cdot ,\cdot \right)$, $Q\left(\cdot \right)$ and G satisfy $R\left(t,t\right)\ge 0$, $Q\left(t\right)\ge 0$, $\forall t\in \left[0,T\right]$ and $G\ge 0$.
Based on [25], we can prove under (H1) that, for any $\left(t,\xi ,{e_{i}},u\left(\cdot \right)\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};{\mathbb{R}^{n}}\right)\times \chi \times {\mathcal{L}_{\mathcal{F},p}^{2}}\left(t,T;{\mathbb{R}^{m}}\right)$, the state equation (2.1) has a unique solution $X\left(\cdot \right)\in {\mathcal{S}_{\mathcal{F}}^{2}}\left(t,T;{\mathbb{R}^{n}}\right)$. Moreover, we have the estimate
(2.3)
\[ \mathbb{E}\left[\underset{t\le s\le T}{\sup }{\left|X\left(s\right)\right|^{2}}\right]\le K\left(1+\mathbb{E}\left[{\left|\xi \right|^{2}}\right]\right),\]
for some positive constant K. In particular, for $t=0$ and $u\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(0,T;{\mathbb{R}^{m}}\right)$, Equation (2.1) starting from the initial data $\left(0,{x_{0}}\right)$ has a unique solution $X\left(\cdot \right)\in {\mathcal{S}_{\mathcal{F}}^{2}}(0,T;{\mathbb{R}^{n}})$ for which a similar estimate holds.
Our optimal control problem can be formulated as follows.
Problem (N).
For any initial triple $\left(t,\xi ,{e_{i}}\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};{\mathbb{R}^{n}}\right)\times \chi $, find a control $\hat{u}\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(t,T;{\mathbb{R}^{m}}\right)$ such that
\[ J\left(t,\xi ,{e_{i}};\hat{u}\left(\cdot \right)\right)=\underset{u\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(t,T;{\mathbb{R}^{m}}\right)}{\inf }J\left(t,\xi ,{e_{i}};u\left(\cdot \right)\right).\]
Any $\hat{u}\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(t,T;{\mathbb{R}^{m}}\right)$ satisfying the above is called a pre-commitment optimal control. The presence of quadratic terms in the conditional expectation of the state process, as well as a state-dependent term, in the objective functional destroys the time-consistency of pre-committed optimal solutions of Problem (N). Hence, Problem (N) is time-inconsistent, with two different sources of time-inconsistency.
3 The main results: characterization and uniqueness of equilibrium
In view of the fact that Problem (N) is time-inconsistent, the aim of this paper is to characterize open-loop Nash equilibria as an alternative to optimal strategies. We employ the game-theoretic approach to handle the time-inconsistency, from the same viewpoint as Ekeland and Pirvu [11] and Björk and Murgoci [3]. Let us briefly explain the game perspective that we consider.
• We consider a game with one player at every point t in the interval $[0,T)$. This player corresponds to the incarnation of the controller at instant t and is referred to as “player t”.
• Player t can act on the system only at time t, by choosing his/her policy $u\left(t,\cdot \right):\Omega \to {\mathbb{R}^{m}}$.
• A control process $u(\cdot )$ is then viewed as a complete description of the strategies selected by all players in the game.
• The reward to the player t is specified by the functional $J\left(t,\xi ,{e_{i}};u\left(\cdot \right)\right)$.
We now explain the concept of a “Nash equilibrium strategy” for the game described above: this is an admissible control process $\hat{u}\left(\cdot \right)$ fulfilling the following criterion. Assume that every player s, with $s>t$, applies the strategy $\hat{u}\left(s\right)$; then the optimal decision for player t is to also use the strategy $\hat{u}\left(t\right)$. However, the difficulty with this “definition” is that an individual player t has no effect on the outcome of the game: he/she only selects the control at the single instant t, and since $\{t\}$ has Lebesgue measure zero, the state dynamics are unaffected.
As a result, to identify open-loop Nash equilibrium controls, we follow [15], where a formal definition (Definition 4 below), inspired by [11], is proposed.
Remark 3.
In the rest of the paper, for brevity, we suppress the arguments $\left(s,\alpha \left(s\right)\right)$ for the coefficients $A\left(s,\alpha \left(s\right)\right)$, $B\left(s,\alpha \left(s\right)\right)$, $b\left(s,\alpha \left(s\right)\right)$, ${C_{i}}\left(s,\alpha \left(s\right)\right)$, ${D_{i}}\left(s,\alpha \left(s\right)\right)$, ${\sigma _{i}}\left(s,\alpha \left(s\right)\right)$; in addition, we suppress the arguments $\left(s\right)$ and $\left(t,s\right)$ for the coefficients $Q\left(s\right)$, $\bar{Q}\left(s\right)$, $R\left(t,s\right)$, and we use the notation $\varrho \left(z\right)$ instead of $\varrho \left(s,z,\alpha \left(s\right)\right)$ for $\varrho ={E_{k}},{F_{k}}$ and ${c_{k}}$. Furthermore, when there is no risk of confusion, we simply call $\hat{u}\left(\cdot \right)$ an equilibrium control instead of an open-loop Nash equilibrium control.
In this section, we provide the main results about the necessary and sufficient conditions for equilibrium of the control problem formulated in the preceding section. To make the presentation of the paper more clear, the proofs will be relegated to Appendix A. To proceed towards the definition of an equilibrium, we first introduce the local spike variation for a given admissible control $\hat{u}\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(t,T;{\mathbb{R}^{m}}\right)$: for any $t\in \left[0,T\right)$, $v\in {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t-}^{\alpha }},\mathbb{P};{\mathbb{R}^{m}}\right)$ and $\varepsilon \in \left(0,T-t\right)$, define
(3.1)
\[ {u^{\varepsilon }}\left(s\right)=\left\{\begin{array}{l@{\hskip10.0pt}l}\hat{u}\left(s\right)+v,\hspace{1em}& \hspace{2.5pt}\text{for}\hspace{2.5pt}s\in \left[t,t+\varepsilon \right),\\ {} \hat{u}\left(s\right),\hspace{1em}& \hspace{2.5pt}\text{for}\hspace{2.5pt}s\in \left[t+\varepsilon ,T\right).\end{array}\right.\]
We have the following definition.
Definition 4 (Open-loop Nash equilibrium).
An admissible control $\hat{u}\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}(t,T;{\mathbb{R}^{m}})$ is an open-loop Nash equilibrium control for Problem (N) if, for every sequence ${\varepsilon _{n}}\downarrow 0$, we have
(3.2)
\[ \underset{{\varepsilon _{n}}\downarrow 0}{\lim }\frac{1}{{\varepsilon _{n}}}\left\{J\left(t,\hat{X}\left(t\right),\alpha \left(t\right);{u^{{\varepsilon _{n}}}}\left(\cdot \right)\right)-J\left(t,\hat{X}\left(t\right),\alpha \left(t\right);\hat{u}\left(\cdot \right)\right)\right\}\ge 0,\]
for any $t\in \left[0,T\right]$ and $v\in {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t-}^{\alpha }},\mathbb{P};{\mathbb{R}^{m}}\right)$. The corresponding equilibrium dynamics solves the following SDE with jumps: for $s\in \left[0,T\right]$,
\[ \left\{\begin{array}{l}d\hat{X}\left(s\right)=\left\{A\hat{X}\left(s\right)+B\hat{u}\left(s\right)+b\right\}ds\hspace{1em}\\ {} \phantom{d\hat{X}\left(s\right)=}+{\textstyle\sum \limits_{i=1}^{p}}\left\{{C_{i}}\hat{X}\left(s\right)+{D_{i}}\hat{u}\left(s\right)+{\sigma _{i}}\right\}d{W^{i}}\left(s\right)\hspace{1em}\\ {} \phantom{d\hat{X}\left(s\right)=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}\left\{{E_{k}}\left(z\right)\hat{X}\left(s-\right)+{F_{k}}\left(z\right)\hat{u}\left(s\right)+{c_{k}}\left(z\right)\right\}{\widetilde{N}_{\alpha }^{k}}\left(ds,dz\right),\hspace{1em}\\ {} {\hat{X}_{0}}={x_{0}},\alpha \left(0\right)={e_{{i_{0}}}}.\hspace{1em}\end{array}\right.\]
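For intuition about Definition 4, the spike variation (3.1) and the difference quotient in (3.2) can be probed numerically. The toy sketch below uses a deterministic scalar example with an invented quadratic cost functional; it only illustrates how the perturbed control and the quotient are formed, not the stochastic problem of this paper.

```python
import numpy as np

def cost(u, t_grid):
    """Toy cost J(u) = int_0^T (x(s)^2 + u(s)^2) ds with dx = u ds, x(0) = 1
    (a hypothetical functional, used only to illustrate the difference quotient)."""
    dt = t_grid[1] - t_grid[0]
    x = 1.0 + np.cumsum(u) * dt             # forward Euler for the toy state
    return np.sum(x**2 + u**2) * dt

def spike_quotient(u_hat, t_idx, v, eps, t_grid):
    """(J(u^eps) - J(u_hat)) / eps for the spike perturbation (3.1) at time t_grid[t_idx]."""
    dt = t_grid[1] - t_grid[0]
    u_eps = u_hat.copy()
    n_pert = max(1, int(round(eps / dt)))   # indices covering [t, t + eps)
    u_eps[t_idx:t_idx + n_pert] += v
    return (cost(u_eps, t_grid) - cost(u_hat, t_grid)) / eps

t_grid = np.linspace(0.0, 1.0, 2001)
u_hat = -np.ones_like(t_grid)               # some candidate control
for eps in [0.1, 0.05, 0.01, 0.005]:
    print(eps, spike_quotient(u_hat, t_idx=1000, v=0.5, eps=eps, t_grid=t_grid))
```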
3.1 Flow of the adjoint equations and characterization of equilibrium controls
In this subsection, we provide general necessary and sufficient conditions to characterize the equilibrium strategies of Problem (N). First, we consider the adjoint equations used within the characterization of equilibrium controls. Let $\hat{u}\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}(t,T;{\mathbb{R}^{m}})$ be a fixed control and denote by $\hat{X}\left(\cdot \right)\in {\mathcal{S}_{\mathcal{F}}^{2}}\left(0,T;{\mathbb{R}^{n}}\right)$ its corresponding state process. For each $t\in \left[0,T\right]$, the first order adjoint equation defined on the time interval $\left[t,T\right]$ and satisfied by the 4-tuple of processes $\left(p\left(\cdot ;t\right),q\left(\cdot ;t\right),r\left(\cdot ,\cdot ;t\right),l\left(\cdot ;t\right)\right)$ is given as follows:
(3.3)
\[ \left\{\begin{array}{l}dp\left(s;t\right)=-\left\{{A^{\top }}p\left(s;t\right)+{\textstyle\sum \limits_{i=1}^{p}}{C_{i}^{\top }}{q_{i}}\left(s;t\right)\right.\hspace{1em}\\ {} \phantom{dp\left(s;t\right)=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{E_{k}}{\left(z\right)^{\top }}{r_{k}}\left(s,z;t\right){\theta _{\alpha }^{k}}\left(dz\right)\left.-Q\hat{X}\left(s\right)-\bar{Q}\mathbb{E}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{s}^{\alpha }}\right.\right]\right\}ds\hspace{1em}\\ {} \phantom{dp\left(s;t\right)=}+{\textstyle\sum \limits_{i=1}^{p}}{q_{i}}\left(s;t\right)d{W^{i}}\left(s\right)+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{r_{k}}\left(s,z;t\right){\widetilde{N}_{\alpha }^{k}}\left(ds,dz\right)\hspace{1em}\\ {} \phantom{dp\left(s;t\right)=}+{\textstyle\sum \limits_{j=1}^{d}}{l_{j}}\left(s,t\right)d{\tilde{\Phi }_{j}}\left(s\right),\hspace{2.5pt}s\in \left[t,T\right],\hspace{1em}\\ {} p\left(T;t\right)=-G\hat{X}\left(T\right)-\bar{G}\mathbb{E}\left[\hat{X}\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]-{\mu _{1}}\hat{X}\left(t\right)-{\mu _{2}}.\hspace{1em}\end{array}\right.\]Through this section, we will prove that we can get the equilibrium strategy by solving a system of FBSDEs which is not standard since the flow of the unknown process $\left(p\left(\cdot ;t\right),q\left(\cdot ;t\right),r\left(\cdot ,\cdot ;t\right),l\left(\cdot ;t\right)\right)$ for $t\in [0,T]$ is involved. To the best of our knowledge, the ability to explicitely solve this type of equation remains an open problem, except for a certain form of the objective function. However, by the separating variables approach we are able to completely solve this problem.
Lemma 5.
Consider a deterministic matrix-valued function $\phi \left(\cdot ,\cdot \right)$ as a solution of the following ODE
For any $t\in \left[0,T\right]$ and $s\in \left[t,T\right]$, the solution of Equation (3.3) has the representation
\[\begin{aligned}{}p\left(s;t\right)& =-\phi {\left(s,\alpha \left(s\right)\right)^{-1}}\left(\bar{p}\left(s\right)+\bar{G}\mathbb{E}\left[\hat{X}\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]+{\mu _{1}}\hat{X}\left(t\right)+{\mu _{2}}\right)\\ {} & \hspace{1em}-\phi {\left(s,\alpha \left(s\right)\right)^{-1}}{\int _{s}^{T}}\phi \left(\tau ,\alpha \left(\tau \right)\right)\bar{Q}\mathbb{E}\left[\hat{X}\left(\tau \right)\left|{\mathcal{F}_{\tau }^{\alpha }}\right.\right]d\tau ,\end{aligned}\]
and $\left({q_{i}}\left(s;t\right),{r_{k}}\left(s,z;t\right),{l_{j}}\left(s;t\right)\right)=-\phi {\left(s,\alpha \left(s\right)\right)^{-1}}\left({\bar{q}_{i}}\left(s\right),{\bar{r}_{k}}\left(s,z\right),{\bar{l}_{j}}\left(s\right)\right)$ for $i=1,2,\dots ,p$; $k=1,2,\dots ,l$; $j=1,2,\dots ,d$, where
(3.4)
\[ \left\{\begin{array}{l}d\bar{p}\left(s\right)=-\left\{{\textstyle\sum \limits_{i=1}^{p}}\phi \left(s,\alpha \left(s\right)\right){C_{i}^{\top }}\phi {\left(s,\alpha \left(s\right)\right)^{-1}}{\bar{q}_{i}}\left(s\right)\right.\hspace{1em}\\ {} \phantom{d\bar{p}\left(s\right)=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}\phi \left(s,\alpha \left(s\right)\right){E_{k}}{\left(z\right)^{\top }}\phi {\left(s,\alpha \left(s\right)\right)^{-1}}{\bar{r}_{k}}\left(s,z\right){\theta _{\alpha }^{k}}\left(dz\right)\hspace{1em}\\ {} \phantom{d\bar{p}\left(s\right)=}+\phi \left.\left(s,\alpha \left(s\right)\right)Q\hat{X}\left(s\right)\right\}ds+{\textstyle\sum \limits_{i=1}^{p}}{\bar{q}_{i}}\left(s\right)d{W^{i}}\left(s\right)\hspace{1em}\\ {} \phantom{d\bar{p}\left(s\right)=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{\bar{r}_{k}}\left(s-,z\right){\tilde{N}_{\alpha }^{k}}\left(ds,dz\right)+{\textstyle\sum \limits_{j=1}^{d}}{\bar{l}_{j}}\left(s\right)d{\tilde{\Phi }_{j}}\left(s\right),\hspace{2.5pt}s\in \left[t,T\right],\hspace{1em}\\ {} \bar{p}\left(T\right)=G\hat{X}\left(T\right).\hspace{1em}\end{array}\right.\]Remark 6.
(2) From the representation of $\left(p\left(\cdot ;t\right),q\left(\cdot ;t\right),r\left(\cdot ,\cdot ;t\right),l\left(\cdot ;t\right)\right)$, for $t\in \left[0,T\right]$ given by Lemma 5, we can check that under (H1) Equation (3.3) admits a unique solution\[\begin{aligned}{}& \left(p\left(\cdot ;t\right),q\left(\cdot ;t\right),r\left(\cdot ,\cdot ;t\right),l\left(\cdot ;t\right)\right)\in {\mathcal{S}_{\mathcal{F}}^{2}}\left(t,T;{\mathbb{R}^{n}}\right)\\ {} & \hspace{1em}\times {\mathcal{L}^{2}}\left(t,T;{\left({\mathbb{R}^{n}}\right)^{p}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\theta ,2}}\left(\left[t,T\right]\times {\mathbb{R}^{\ast }};{\left({\mathbb{R}^{n}}\right)^{l}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\lambda ,2}}\left(t,T;{\left({\mathbb{R}^{n}}\right)^{d}}\right).\end{aligned}\]
The following second order adjoint equation is defined on the time interval $\left[t,T\right]$ and satisfied by the 4-tuple of processes $\left(P\left(\cdot \right),\Lambda \left(\cdot \right),\Gamma \left(\cdot ;\cdot \right),L\left(\cdot \right)\right)$:
(3.5)
\[ \left\{\begin{array}{l}dP\left(s\right)=-\left\{{A^{\top }}P\left(s\right)+P\left(s\right)A+{\textstyle\sum \limits_{i=1}^{p}}\left({C_{i}^{\top }}P\left(s\right){C_{i}}+{\Lambda _{i}}\left(s\right){C_{i}}+{C_{i}^{\top }}{\Lambda _{i}}\left(s\right)\right)\right.\hspace{1em}\\ {} \phantom{dP\left(s\right)=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}\left\{{\Gamma _{k}}\left(s,z\right){E_{k}}\left(z\right){\theta _{\alpha }^{k}}\left(dz\right)+{E_{k}}{\left(z\right)^{\top }}{\Gamma _{k}}\left(s,z\right)\right\}{\theta _{\alpha }^{k}}\left(dz\right)\hspace{1em}\\ {} \phantom{dP\left(s\right)=}+\left.{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{E_{k}}{\left(z\right)^{\top }}\left({\Gamma _{k}}\left(s,z\right)+P\left(s\right)\right){E_{k}}\left(z\right){\theta _{\alpha }^{k}}\left(dz\right)-Q\right\}ds\hspace{1em}\\ {} \phantom{dP\left(s\right)=}+{\textstyle\sum \limits_{i=1}^{p}}{\Lambda _{i}}\left(s\right)d{W^{i}}\left(s\right)+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{\Gamma _{k}}\left(s,z\right){\tilde{N}_{\alpha }^{k}}\left(ds,dz\right)\hspace{1em}\\ {} \phantom{dP\left(s\right)=}+{\textstyle\sum \limits_{j=1}^{d}}{L_{j}}\left(s\right)d{\tilde{\Phi }_{j}}\left(s\right),s\in \left[t,T\right],\hspace{1em}\\ {} P\left(T\right)=-G.\hspace{1em}\end{array}\right.\]Noting that (3.5) is a standard BSDE over the entire time period $[0,T]$, by the same manner of [27], we can verify that Equation (3.5) admits a unique solution
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}& & \displaystyle \left(P\left(\cdot \right),\Lambda \left(\cdot \right),\Gamma \left(\cdot ;\cdot \right),L\left(\cdot \right)\right)\in {\mathcal{S}_{\mathcal{F}}^{2}}\left(t,T;{S^{n}}\right)\\ {} & & \displaystyle \hspace{1em}\times {\mathcal{L}^{2}}\left(t,T;{\left({S^{n}}\right)^{p}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\theta ,2}}\left(\left[t,T\right]\times {\mathbb{R}^{\ast }};{\left({S^{n}}\right)^{l}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\lambda ,2}}\left(t,T;{\left({S^{n}}\right)^{d}}\right).\end{array}\]
Now, associated with $\left(\hat{u}\left(\cdot \right),\hat{X}\left(\cdot \right),p\left(\cdot ;\cdot \right),q\left(\cdot ;\cdot \right),r\left(\cdot ,\cdot ;\cdot \right),P\left(\cdot \right),\Gamma \left(\cdot ;\cdot \right)\right)$ we define, for $\left(s,t\right)\in \mathcal{D}\left[0,T\right]$,
(3.6)
\[ \mathcal{U}\left(s;t\right)={B^{\top }}p\left(s;t\right)+{\sum \limits_{i=1}^{p}}{D_{i}^{\top }}{q_{i}}\left(s;t\right)+{\sum \limits_{k=1}^{l}}{\int _{{\mathbb{R}^{\ast }}}}{F_{k}}{\left(z\right)^{\top }}{r_{k}}\left(s,z;t\right){\theta _{\alpha }^{k}}\left(dz\right)-R\hat{u}\left(s\right),\]
and
(3.7)
\[ \mathcal{V}\left(s;t\right)={\sum \limits_{i=1}^{p}}{D_{i}^{\top }}P\left(s\right){D_{i}}+{\sum \limits_{k=1}^{l}}{\int _{{\mathbb{R}^{\ast }}}}{F_{k}}{\left(z\right)^{\top }}\left(P\left(s\right)+\Gamma \left(s,z\right)\right){F_{k}}\left(z\right){\theta _{\alpha }^{k}}\left(dz\right)-R.\]
Remark 7.
Definition 4 is slightly different from the original definition provided in [15] and [14], where the open-loop equilibrium control is defined by
(3.8)
\[ \underset{\varepsilon \downarrow 0}{\lim }\frac{1}{\varepsilon }\left\{J\left(t,\hat{X}\left(t\right),\alpha \left(t\right);{u^{\varepsilon }}\left(\cdot \right)\right)-J\left(t,\hat{X}\left(t\right),\alpha \left(t\right);\hat{u}\left(\cdot \right)\right)\right\}\ge 0.\]
Although the limit (3.8) already provides a characterizing condition, it is not very useful because it involves an a.s. limit with respect to uncountably many $\varepsilon >0$. In that case, using the RCLL property of the state process $X(\cdot )$, one can deduce an equivalent condition for the equilibrium; see Hu et al. [15]. In this paper, we define an open-loop equilibrium control in the sense of (3.2), which is well defined in general.
The following lemma, which will be used later in this study, provides an important property of the flow of adapted processes.
Lemma 8.
Under assumptions (H1)–(H2), for any $\hat{u}\left(\cdot \right)\in {L_{\mathcal{F},p}^{2}}\left(t,T;{\mathbb{R}^{m}}\right)$, there exists a sequence ${\left({\varepsilon _{n}^{t}}\right)_{n\in \mathbb{N}}}\subset (0,T-t)$ satisfying ${\varepsilon _{n}^{t}}\to 0$ as $n\to \infty $, such that
Now we introduce the space
(3.10)
\[ \mathcal{L}=\left\{\Lambda \left(\cdot ;t\right)\in {\mathcal{S}_{\mathcal{F}}^{2}}\left(t,T;{\mathbb{R}^{n}}\right)\hspace{2.5pt}\text{such that}\hspace{2.5pt}\underset{t\in \left[0,T\right]}{\sup }\mathbb{E}\left[\underset{s\in \left[t,T\right]}{\sup }{\left|\Lambda \left(s;t\right)\right|^{2}}\right]<+\infty \right\}.\]
Clearly, for any $\hat{u}\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(0,T;{\mathbb{R}^{m}}\right)$, its associated flow of adjoint processes satisfies $p\left(\cdot ;\cdot \right)\in \mathcal{L}$.
The following theorem is the first main result of this work; it provides necessary and sufficient conditions for equilibrium controls of the time-inconsistent Problem (N).
Theorem 9 (Characterization of equilibrium).
Let (H1) hold. Given an admissible control $\hat{u}\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(0,T;{\mathbb{R}^{m}}\right)$, let
\[\begin{aligned}{}& \left(p\left(\cdot ;\cdot \right),q\left(\cdot ;\cdot \right),r\left(\cdot ,\cdot ;\cdot \right),l\left(\cdot ;\cdot \right)\right)\\ {} & \hspace{1em}\in \mathcal{L}\times {\mathcal{L}_{\mathcal{F}}^{2}}\left(0,T;{\left({\mathbb{R}^{n}}\right)^{p}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\theta ,2}}\left(\left[0,T\right]\times {\mathbb{R}^{\ast }};{\left({\mathbb{R}^{n}}\right)^{l}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\lambda ,2}}\left(0,T;{\left({\mathbb{R}^{n}}\right)^{d}}\right),\end{aligned}\]
be the unique solution to the BSDE (3.3) and let
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}& & \displaystyle \left(P\left(\cdot \right),\Lambda \left(\cdot \right),\Gamma \left(\cdot ,\cdot \right),L\left(\cdot \right)\right)\in {\mathcal{S}_{\mathcal{F}}^{2}}\left(t,T;{S^{n}}\right)\\ {} & & \displaystyle \hspace{1em}\times {\mathcal{L}^{2}}\left(t,T;{\left({S^{n}}\right)^{p}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\theta ,2}}\left(\left[t,T\right]\times {\mathbb{R}^{\ast }};{\left({S^{n}}\right)^{l}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\lambda ,2}}\left(t,T;{\left({S^{n}}\right)^{d}}\right),\end{array}\]
be the unique solution to the BSDE (3.5). Then $\hat{u}\left(\cdot \right)$ is an open-loop Nash equilibrium control if and only if the following two conditions hold: the first order equilibrium condition
(3.11)
\[ \mathcal{U}\left(t;t\right)=0,\hspace{2.5pt}d\mathbb{P}\textit{-a.s.},\hspace{2.5pt}\forall t\in \left[0,T\right],\]
and the second order equilibrium condition
(3.12)
\[ \mathcal{V}\left(t;t\right)\le 0,\hspace{2.5pt}d\mathbb{P}\textit{-a.s.},\hspace{2.5pt}\forall t\in \left[0,T\right],\]
where $\mathcal{U}\left(t;t\right)$ and $\mathcal{V}\left(t;t\right)$ are given by (3.6) and (3.7), respectively.
The proof of the above theorem relies on variational techniques, in the spirit of the characterization of equilibria in [14] and [15], which were obtained in the absence of random jumps.
Let $\hat{u}\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(0,T;{\mathbb{R}^{m}}\right)$ be an admissible control and $\hat{X}\left(\cdot \right)$ be the corresponding controlled state process. Consider the perturbed control ${u^{\varepsilon }}\left(\cdot \right)$ defined by the spike variation (3.1) for some fixed arbitrary $t\in \left[0,T\right]$, $v\in {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t-}^{\alpha }},\mathbb{P};{\mathbb{R}^{m}}\right)$ and $\varepsilon \in \left(0,T-t\right)$. Denote by ${\hat{X}^{\varepsilon }}\left(\cdot \right)$ the solution of the state equation corresponding to ${u^{\varepsilon }}\left(\cdot \right)$. It follows from the standard perturbation approach, see, for example, [31] and [41], that ${\hat{X}^{\varepsilon }}\left(\cdot \right)-\hat{X}\left(\cdot \right)={y^{\varepsilon ,v}}\left(\cdot \right)+{Y^{\varepsilon ,v}}\left(\cdot \right)$, where ${y^{\varepsilon ,v}}\left(\cdot \right)$ and ${Y^{\varepsilon ,v}}\left(\cdot \right)$ solve the following SDEs, respectively, for $s\in \left[t,T\right]$:
(3.13)
\[\begin{aligned}{}& \left\{\begin{array}{l}d{y^{\varepsilon ,v}}\left(s\right)=A{y^{\varepsilon ,v}}\left(s\right)ds+{\textstyle\textstyle\sum _{i=1}^{p}}\left\{{C_{i}}{y^{\varepsilon ,v}}\left(s\right)+{D_{i}}v{1_{\left[t,t+\varepsilon \right)}}\left(s\right)\right\}d{W^{i}}\left(s\right)\hspace{1em}\\ {} \phantom{d{y^{\varepsilon ,v}}\left(s\right)=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}\left\{{E_{k}}\left(z\right){y^{\varepsilon ,v}}\left(s-\right)+{F_{k}}\left(z\right)v{1_{\left[t,t+\varepsilon \right)}}\left(s\right)\right\}{\tilde{N}_{\alpha }^{k}}\left(ds,dz\right),\hspace{1em}\\ {} {y^{\varepsilon ,v}}\left(t\right)=0,\hspace{1em}\end{array}\right.\end{aligned}\](3.14)
\[\begin{aligned}{}& \left\{\begin{array}{l}d{Y^{\varepsilon ,v}}\left(s\right)=\left\{A{Y^{\varepsilon ,v}}\left(s\right)+Bv{1_{\left[t,t+\varepsilon \right)}}\left(s\right)\right\}ds+{\textstyle\sum \limits_{i=1}^{p}}{C_{i}}{Y^{\varepsilon ,v}}\left(s\right)d{W^{i}}\left(s\right)\hspace{1em}\\ {} \phantom{d{Y^{\varepsilon ,v}}\left(s\right)=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{E_{k}}\left(z\right){Y^{\varepsilon ,v}}\left(s-\right){\tilde{N}_{\alpha }^{k}}\left(ds,dz\right),\hspace{1em}\\ {} {Y^{\varepsilon ,v}}\left(t\right)=0.\hspace{1em}\end{array}\right.\end{aligned}\]We need the following lemma
Lemma 10.
Under assumption (H1), the following estimates hold:
We have also
(3.17)
\[ \underset{s\in \left[t,T\right]}{\sup }{\left|\mathbb{E}\left[{y^{\varepsilon ,v}}\left(s\right)\left|{\mathcal{F}_{s}^{\alpha }}\right.\right]\right|^{2}}=O\left({\varepsilon ^{2}}\right).\]
Moreover, we have the equality
(3.18)
\[\begin{aligned}{}& J\left(t,\hat{X}\left(t\right),\alpha \left(t\right);{u^{\varepsilon }}\left(\cdot \right)\right)-J\left(t,\hat{X}\left(t\right),\alpha \left(t\right);\hat{u}\left(\cdot \right)\right)\\ {} & \hspace{1em}=-{\int _{t}^{t+\varepsilon }}\mathbb{E}\left\{\left\langle \mathcal{U}\left(s;t\right),v\right\rangle +\frac{1}{2}\left\langle \mathcal{V}\left(s;t\right)v,v\right\rangle \right\}ds+o\left(\varepsilon \right).\end{aligned}\]
Now, we are ready to give the proof of Theorem 9.
Proof of Theorem 9.
Given an admissible control $\hat{u}\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(0,T;{\mathbb{R}^{m}}\right)$ for which (3.11) and (3.12) hold, according to Lemma 8 and (3.18), for any $t\in \left[0,T\right]$ and any ${\mathbb{R}^{m}}$-valued, ${\mathcal{F}_{t}^{\alpha }}$-measurable and bounded random variable v, there exists a sequence ${\left({\varepsilon _{n}^{t}}\right)_{n\in \mathbb{N}}}\subset (0,T-t)$ satisfying ${\varepsilon _{n}^{t}}\to 0$ as $n\to \infty $, such that
\[\begin{aligned}{}& \underset{n\to \infty }{\lim }\frac{1}{{\varepsilon _{n}^{t}}}\left\{J\left(t,\hat{X}\left(t\right),\alpha \left(t\right);{u^{{\varepsilon _{n}^{t}}}}\left(\cdot \right)\right)-J\left(t,\hat{X}\left(t\right),\alpha \left(t\right);\hat{u}\left(\cdot \right)\right)\right\}\\ {} & \hspace{1em}=-\left\{\left\langle \mathcal{U}\left(t;t\right),v\right\rangle +\frac{1}{2}\left\langle \mathcal{V}\left(t;t\right)v,v\right\rangle \right\},\\ {} & \hspace{1em}=-\frac{1}{2}\left\langle \mathcal{V}\left(t;t\right)v,v\right\rangle ,\\ {} & \hspace{1em}\ge 0,\hspace{2.5pt}d\mathbb{P}\text{-a.s.}\end{aligned}\]
Hence $\hat{u}\left(\cdot \right)$ is an equilibrium strategy.
Conversely, assume that $\hat{u}\left(\cdot \right)$ is an equilibrium strategy. Then, by (3.2) together with (3.18) and Lemma 8, for any $\left(t,u\right)\in \left[0,T\right]\times {\mathbb{R}^{m}}$, the following inequality holds:
(3.19)
\[ \left\langle \mathcal{U}\left(t;t\right),u\right\rangle +\frac{1}{2}\left\langle \mathcal{V}\left(t;t\right)u,u\right\rangle \le 0,\hspace{2.5pt}d\mathbb{P}\text{-a.s.}\]
Now, we define, for all $\left(t,u\right)\in \left[0,T\right]\times {\mathbb{R}^{m}}$, $\Phi \left(t,u\right)=\left\langle \mathcal{U}\left(t;t\right),u\right\rangle +\frac{1}{2}\left\langle \mathcal{V}\left(t;t\right)u,u\right\rangle $. Easy manipulations show that the inequality (3.19) is equivalent to $\Phi \left(t,0\right)={\max _{u\in {\mathbb{R}^{m}}}}\Phi \left(t,u\right)$, $d\mathbb{P}\text{-a.s.}$, $\forall t\in \left[0,T\right]$. It is then easy to check that this maximum condition is equivalent to the two conditions (3.11) and (3.12).
This completes the proof. □
Remark 11.
It is worth noting that, under the positive semidefiniteness conditions on the coefficients $Q\left(\cdot \right)$, G and $R\left(\cdot ,\cdot \right)$, the corresponding process $P(\cdot )$ in [15] and [14] is indeed positive semidefinite, due to the comparison principle for BSDEs. Thus, as a result of Theorem 9, a necessary and sufficient condition for a control to be an equilibrium strategy is the first order equilibrium condition (3.11) alone. However, there is a significant difference between the estimate for the cost functional obtained here and the one in [15] and [14]. Because stochastic coefficients and random jumps of the controlled system are taken into account, an additional term $\Gamma (\cdot ,\cdot )$ occurs in the formulation of $P(\cdot )$, so in this paper $P(\cdot )$ is not necessarily positive semidefinite. This is why we modify the methodology for deriving the sufficient condition for equilibrium controls. We therefore have the following corollary, whose proof follows the same arguments as the proof of Proposition 3.2 in [30].
Corollary 12.
Let (H1)–(H2) hold. Given an admissible control $\hat{u}\left(\cdot \right)\in {L_{\mathcal{F},p}^{2}}(0,T;{\mathbb{R}^{m}})$, let
\[\begin{aligned}{}& \left(p\left(\cdot ;\cdot \right),q\left(\cdot ;\cdot \right),r\left(\cdot ,\cdot ;\cdot \right),l\left(\cdot ;\cdot \right)\right)\\ {} & \hspace{1em}\in \mathcal{L}\times {\mathcal{L}_{\mathcal{F}}^{2}}\left(0,T;{\left({\mathbb{R}^{n}}\right)^{p}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\theta ,2}}\left(\left[0,T\right]\times {\mathbb{R}^{\ast }};{\left({\mathbb{R}^{n}}\right)^{l}}\right)\times {\mathcal{L}_{\mathcal{F},p}^{\lambda ,2}}\left(0,T;{\left({\mathbb{R}^{n}}\right)^{d}}\right)\end{aligned}\]
be the unique solution to the BSDE (3.3). Then $\hat{u}\left(\cdot \right)$ is an equilibrium control if the following condition holds $d\mathbb{P}$-a.s., $dt$-a.e.:
\[ \mathcal{U}\left(t;t\right)=0.\]
3.2 Linear feedback stochastic equilibrium control
In this subsection, our goal is to obtain a state feedback representation of an equilibrium control for Problem (N) via some class of ordinary differential equations.
Now, suppressing the arguments $\left(s,{e_{i}}\right)$ for the coefficients A, B, b, ${C_{i}}$, ${D_{i}}$, ${\sigma _{i}}$, we use the notation $\varrho \left(z\right)$ instead of $\varrho \left(s,z,{e_{i}}\right)$ for $\varrho ={E_{k}},{F_{k}}$ and ${c_{k}}$. First, for any deterministic, differentiable function $\eta \in C\left(\left[0,T\right]\times \chi ;{\mathbb{R}^{n\times n}}\right)$ consider the differential-difference operator
Then we introduce the following system of differential-difference equations, for $s\in \left[0,T\right]$:
where $\Psi \left(\cdot ,\cdot \right)$ and $\psi \left(\cdot ,\cdot \right)$ are given by
with
(3.23)
\[ \left\{\begin{array}{l}0=\mathcal{L}\left(M\left(s,{e_{i}}\right)\right)+M\left(s,{e_{i}}\right)A+{A^{\top }}M\left(s,{e_{i}}\right)+{\textstyle\sum \limits_{i=1}^{p}}{C_{i}^{\top }}M\left(s,{e_{i}}\right){C_{i}}\hspace{1em}\\ {} \phantom{0=}-\left(M\left(s,{e_{i}}\right)B+{\textstyle\sum \limits_{i=1}^{p}}{C_{i}^{\top }}M\left(s,{e_{i}}\right){D_{i}}\right.\hspace{1em}\\ {} \phantom{0=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}\left.{E_{k}}{\left(z\right)^{\top }}M\left(s,{e_{i}}\right){F_{k}}\left(z\right){\theta _{\alpha }^{k}}\left(dz\right)\right)\Psi \left(s,{e_{i}}\right)\hspace{1em}\\ {} \phantom{0=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{E_{k}}{\left(z\right)^{\top }}M\left(s,{e_{i}}\right){E_{k}}\left(z\right){\theta _{\alpha }^{k}}\left(dz\right)+Q,\hspace{1em}\\ {} 0=\mathcal{L}\left(\bar{M}\left(s,{e_{i}}\right)\right)+\bar{M}\left(s,{e_{i}}\right)A+{A^{\top }}\bar{M}\left(s,{e_{i}}\right)-\bar{M}\left(s,{e_{i}}\right)B\Psi \left(s,{e_{i}}\right)+\bar{Q},\hspace{1em}\\ {} 0=\mathcal{L}\left(\Upsilon \left(s,{e_{i}}\right)\right)+{A^{\top }}\Upsilon \left(s,{e_{i}}\right),\hspace{1em}\\ {} 0=\mathcal{L}\left(\varphi \left(s,{e_{i}}\right)\right)+{A^{\top }}\varphi \left(s,{e_{i}}\right)+\left(M\left(s,{e_{i}}\right)+\bar{M}\left(s,{e_{i}}\right)\right)\left(b-B\psi \left(s,{e_{i}}\right)\right)\hspace{1em}\\ {} \phantom{0=}+{\textstyle\sum \limits_{i=1}^{p}}{C_{i}^{\top }}M\left(s,{e_{i}}\right)\left({\sigma _{i}}-{D_{i}}\psi \left(s,{e_{i}}\right)\right)\hspace{1em}\\ {} \phantom{0=}+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{E_{k}}{\left(z\right)^{\top }}M\left(s,{e_{i}}\right)\left({c_{k}}\left(z\right)-{F_{k}}\left(z\right)\psi \left(s,{e_{i}}\right)\right){\theta _{\alpha }^{k}}\left(dz\right),\hspace{1em}\\ {} M\left(T,{e_{i}}\right)=G;\bar{M}\left(T,{e_{i}}\right)=\bar{G};\Upsilon \left(T,{e_{i}}\right)={\mu _{1}};\varphi \left(T,{e_{i}}\right)={\mu _{2}},\hspace{1em}\end{array}\right.\](3.24)
\[ \left\{\begin{array}{l}\Psi \left(s,{e_{i}}\right)\triangleq \Theta \left(s,{e_{i}}\right)\left(\underset{}{{B^{\top }}}\left(M\left(s,{e_{i}}\right)+\bar{M}\left(s,{e_{i}}\right)+\Upsilon \left(s,{e_{i}}\right)\right)\right.\hspace{1em}\\ {} \phantom{\Psi \left(s,{e_{i}}\right)\triangleq }+{\textstyle\sum \limits_{i=1}^{p}}{D_{i}^{\top }}M\left(s,{e_{i}}\right){C_{i}}+\left.{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}{F_{k}}{\left(z\right)^{\top }}M\left(s,{e_{i}}\right){E_{k}}\left(z\right){\theta _{\alpha }^{k}}\left(dz\right)\right),\hspace{1em}\\ {} \psi \left(s,{e_{i}}\right)\triangleq \Theta \left(s,{e_{i}}\right)\left({B^{\top }}\varphi \left(s,{e_{i}}\right)+{\textstyle\sum \limits_{i=1}^{p}}{D_{i}^{\top }}M\left(s,{e_{i}}\right){\sigma _{i}}\right.\hspace{1em}\\ {} \phantom{\psi \left(s,{e_{i}}\right)\triangleq }+{\textstyle\sum \limits_{k=1}^{l}}{\textstyle\int _{{\mathbb{R}^{\ast }}}}\left.{F_{k}}{\left(z\right)^{\top }}M\left(s,{e_{i}}\right){c_{k}}\left(z\right){\theta _{\alpha }^{k}}\left(dz\right)\right),\hspace{1em}\end{array}\right.\]The following theorem presents the existence condition for a linear feedback equilibrium control.
Theorem 13.
Let (H1)–(H2) hold. Suppose that the system of equations (3.23) admits a solution $M\left(\cdot ,{e_{i}}\right)$, $\bar{M}\left(\cdot ,{e_{i}}\right)$, $\Upsilon \left(\cdot ,{e_{i}}\right)$ and $\varphi \left(\cdot ,{e_{i}}\right)$ in $\mathcal{C}\big(\left[0,T\right];{\mathbb{R}^{n\times n}}\big)$, for any ${e_{i}}\in \chi $. Then the time-inconsistent LQ Problem (N) has an equilibrium control that can be represented in the state feedback form
(3.25)
\[ \hat{u}\left(t\right)=-\Psi \left(t,\alpha \left(t\right)\right)\hat{X}\left(t-\right)-\psi \left(t,\alpha \left(t\right)\right),\]
where $\Psi \left(\cdot ,\cdot \right)$ and $\psi \left(\cdot ,\cdot \right)$ are given by (3.24).
3.3 Uniqueness of the equilibrium control
In this subsection, we prove that if the system of equations (3.23) is solvable, then the state feedback equilibrium control given by (3.25) is the unique open-loop Nash equilibrium control of Problem (N).
Theorem 14.
Let (H1)–(H2) hold. Suppose that $M\left(\cdot ,\cdot \right)$, $\bar{M}\left(\cdot ,\cdot \right)$, $\Upsilon \left(\cdot ,\cdot \right)$ and $\varphi \left(\cdot ,\cdot \right)$ are solutions to the system (3.23). Then $\hat{u}\left(\cdot \right)$ given by (3.25) is the unique open-loop Nash equilibrium control for Problem (N).
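As a numerical illustration of Theorems 13 and 14, the sketch below integrates a drastically simplified scalar, single-regime, jump-free analogue of the system (3.23) backward in time and forms the corresponding feedback gains of (3.25). The weight Θ is not defined in the portion of the text reproduced here, so the form used below (the inverse of R plus the quadratic D-term) is an assumption made only for this illustration; all parameter values are likewise hypothetical.

```python
import numpy as np

T, n = 1.0, 10_000
dt = T / n
# hypothetical scalar coefficients (A, B, C, D, b, sigma) and weights
A, B, C, D, b, sigma = 0.05, 1.0, 0.2, 0.3, 0.1, 0.2
Q, Qbar, R, G, Gbar, mu1, mu2 = 1.0, 0.5, 1.0, 1.0, 0.5, 0.3, 0.1

# terminal conditions M(T)=G, Mbar(T)=Gbar, Upsilon(T)=mu1, phi(T)=mu2
M, Mbar, Ups, phi = G, Gbar, mu1, mu2
Psi = np.empty(n); psi = np.empty(n)
for k in range(n - 1, -1, -1):
    Theta = 1.0 / (R + D * M * D)                     # assumed form of Theta(s)
    Psi[k] = Theta * (B * (M + Mbar + Ups) + D * M * C)
    psi[k] = Theta * (B * phi + D * M * sigma)
    # explicit Euler step of the backward ODEs (scalar analogue of (3.23))
    dM = -(2 * A * M + C * C * M - (M * B + C * M * D) * Psi[k] + Q)
    dMbar = -(2 * A * Mbar - Mbar * B * Psi[k] + Qbar)
    dUps = -(A * Ups)
    dphi = -(A * phi + (M + Mbar) * (b - B * psi[k]) + C * M * (sigma - D * psi[k]))
    M -= dM * dt; Mbar -= dMbar * dt; Ups -= dUps * dt; phi -= dphi * dt

# the candidate equilibrium feedback at time t = 0: u = -Psi[0] * x - psi[0]
print(Psi[0], psi[0])
```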
4 Applications
In this section, we discuss an extension of a new class of optimization problems [36], in which the investor manages her/his wealth by consuming and investing in a financial market subject to a mean–variance criterion controlling the final risk of the portfolio. This problem can eventually be formulated as a time-inconsistent stochastic LQ problem and solved with the results presented in the preceding sections.
4.1 Conditional mean-variance-utility consumption–investment and reinsurance problem
We study equilibrium reinsurance (possibly new business), investment and consumption strategies for a mean-variance-utility portfolio problem in which the surplus of the insurer is assumed to follow a jump-diffusion model. The financial market consists of one riskless asset and one risky asset whose price processes are described by regime-switching SDEs. The problem is formulated as follows. Consider an insurer whose surplus process is described by the jump-diffusion model
where $c>0$ is the premium rate, ${\beta _{0}}$ is a positive constant, ${W^{1}}$ is a one-dimensional standard Brownian motion, ${N_{\alpha }}$ is a Poisson process with intensity $\lambda >0$ and ${\left\{{Y_{i}}\right\}_{i\in \mathbb{N}-\left\{0\right\}}}$ is a sequence of independent and identically distributed positive random variables with common distribution ${\mathbb{P}_{Y}}$ having finite first and second moments ${\mu _{Y}}={\textstyle\int _{0}^{\infty }}z{\mathbb{P}_{Y}}\left(dz\right)$ and ${\sigma _{Y}}={\textstyle\int _{0}^{\infty }}{z^{2}}{\mathbb{P}_{Y}}\left(dz\right)$. We assume that ${W^{1}}$, ${N_{\alpha }}$, and $\left\{{\textstyle\sum \limits_{i=1}^{{N_{\alpha }}\left(.\right)}}{Y_{i}}\right\}$ are independent. Let Y be a generic random variable which has the same distribution as ${Y_{i}}$. The premium rate c is assumed to be calculated via the expected value principle, i.e. $c=\left(1+\eta \right)\lambda {\mu _{Y}}$ with safety loading $\eta >0$.
(4.1)
\[ d\Lambda \left(s\right)=cds+{\beta _{0}}d{W^{1}}\left(s\right)-d{\sum \limits_{i=1}^{{N_{\alpha }}\left(s\right)}}{Y_{i}},\hspace{2.5pt}s\in \left[0,T\right],\]Note that the process ${\textstyle\sum \limits_{i=1}^{{N_{\alpha }}\left(s\right)}}{Y_{i}}$ can also be defined through a random measure ${N_{\alpha }^{1}}\left(ds,dz\right)$ as
\[ {\sum \limits_{i=1}^{{N_{\alpha }}\left(s\right)}}{Y_{i}}={\int _{0}^{s}}{\int _{0}^{\infty }}z{N_{\alpha }^{1}}\left(dr,dz\right),\]
where ${N_{\alpha }^{1}}$ is a finite Poisson random measure with a random compensator having the form ${\theta _{\alpha }^{1}}\left(dz\right)ds=\lambda {\mathbb{P}_{Y}}\left(dz\right)ds$. We recall that ${\tilde{N}_{\alpha }^{1}}\left(ds,dz\right)={N_{\alpha }^{1}}\left(ds,dz\right)-{\theta _{\alpha }^{1}}\left(dz\right)ds$ defines the compensated jump martingale random measure of ${N_{\alpha }^{1}}$. Obviously, we have
\[ d{\sum \limits_{i=1}^{{N_{\alpha }}\left(s\right)}}{Y_{i}}={\int _{0}^{+\infty }}z{N_{\alpha }^{1}}\left(ds,dz\right)={\int _{0}^{+\infty }}z{\tilde{N}_{\alpha }^{1}}\left(ds,dz\right)+\lambda {\mu _{Y}}ds.\]
Hence, since $c=\left(1+\eta \right)\lambda {\mu _{Y}}$, (4.1) is equivalent to
(4.2)
\[ d\Lambda \left(s\right)=\eta \lambda {\mu _{Y}}ds+{\beta _{0}}d{W^{1}}\left(s\right)-{\int _{0}^{+\infty }}z{\tilde{N}_{\alpha }^{1}}\left(ds,dz\right),\hspace{2.5pt}s\in \left[0,T\right].\]
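To fix ideas, the surplus dynamics (4.1)–(4.2) are easy to simulate. The following Python sketch is purely illustrative: it assumes exponentially distributed claim sizes and arbitrary parameter values (none of which are prescribed here) and only mirrors the structure of premium drift, Brownian perturbation and compound Poisson claims.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters (assumptions, not taken from the paper)
T, n_steps = 1.0, 1000
lam, mu_Y = 0.65, 0.6          # claim intensity and mean claim size
eta, beta0 = 0.2, 0.5          # safety loading and diffusion coefficient
c = (1.0 + eta) * lam * mu_Y   # premium rate via the expected value principle

dt = T / n_steps
Lambda = np.empty(n_steps + 1)
Lambda[0] = 1.0                # initial surplus (assumption)

for k in range(n_steps):
    dW = rng.normal(0.0, np.sqrt(dt))
    n_claims = rng.poisson(lam * dt)
    claims = rng.exponential(mu_Y, size=n_claims).sum()  # assumed exponential claims
    Lambda[k + 1] = Lambda[k] + c * dt + beta0 * dW - claims

# In the compensated form (4.2) the net drift is eta * lam * mu_Y per unit time.
print("terminal surplus:", Lambda[-1])
```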
Suppose that the insurer is allowed to invest its wealth in a financial market, in which two securities are traded continuously. One of them is a bond with price ${S^{0}}\left(s\right)$ at time $s\in \left[0,T\right]$ governed by
There is also a risky asset with unit price ${S^{1}}\left(s\right)$ at time $s\in \left[0,T\right]$ governed by
where ${r_{0}},\sigma ,\beta :\left[0,T\right]\times \mathcal{X}\to \left(0,\infty \right)$ are assumed to be deterministic and continuous functions such that $\sigma \left(s,\alpha \left(s\right)\right)>{r_{0}}\left(s,\alpha \left(s\right)\right)>0$, ${W^{2}}\left(\cdot \right)$ is a one-dimensional standard Brownian motion, ${N_{\alpha }^{2}}$ is a finite Poisson random measure with random compensator having the form ${n_{\alpha }^{2}}\left(ds,dz\right)={\theta _{\alpha }^{2}}\left(dz\right)ds$. We assume that ${W^{1}}\left(\cdot \right)$, ${W^{2}}\left(\cdot \right)$, ${N_{\alpha }^{1}}\left(\cdot ,\cdot \right)$ and ${N_{\alpha }^{2}}\left(\cdot ,\cdot \right)$ are independent and ${\theta _{\alpha }^{2}}\left(\cdot \right)$ is a Lévy measure on $\left(-1,+\infty \right)$ such that ${\textstyle\int _{-1}^{+\infty }}{\left|z\right|^{2}}{\theta _{\alpha }^{2}}\left(dz\right)<\infty $.
(4.3)
\[ d{S^{0}}\left(s\right)={r_{0}}\left(s,\alpha \left(s\right)\right){S^{0}}\left(s\right)ds,{S^{0}}\left(0\right)={s_{0}}>0.\](4.4)
\[\begin{aligned}{}d{S^{1}}\left(s\right)& ={S^{1}}\left(s-\right)\left(\sigma \left(s,\alpha \left(s\right)\right)ds+\beta \left(s,\alpha \left(s\right)\right)d{W^{2}}\left(s\right)\right.\\ {} & \hspace{1em}+\left.{\int _{-1}^{+\infty }}z\left({N_{\alpha }^{2}}\left(ds,dz\right)-{\theta _{\alpha }^{2}}\left(dz\right)ds\right)\right),\hspace{2.5pt}{S^{1}}\left(0\right)={s^{1}}>0,\end{aligned}\]The insurer, starting from an initial capital ${x_{0}}>0$ at time 0, is allowed to dynamically purchase proportional reinsurance (acquire new business), invest in the financial market and consume. A trading strategy $u\left(\cdot \right)$ is described by a three-dimensional stochastic process ${\left({u_{1}}\left(\cdot \right),{u_{2}}\left(\cdot \right),{u_{3}}\left(\cdot \right)\right)^{\top }}$. The strategy ${u_{1}}\left(s\right)\ge 0$ represents the retention level of reinsurance or new business acquired at time $s\in \left[0,T\right]$. We point out that ${u_{1}}\left(s\right)\in \left[0,1\right]$ corresponds to a proportional reinsurance cover and means that the cedent diverts part of the premium to the reinsurer at the rate $\left(1-{u_{1}}\left(s\right)\right)\left({\theta _{0}}+1\right)\lambda {\mu _{Y}}$, where ${\theta _{0}}$ is the relative safety loading of the reinsurer satisfying ${\theta _{0}}\ge \eta $. Meanwhile, for each claim Y occurring at time s, the reinsurer pays $\left(1-{u_{1}}\left(s\right)\right)Y$ of the claim, and the cedent pays the rest. ${u_{1}}\left(s\right)\in \left(1,+\infty \right)$ corresponds to acquiring new business. ${u_{2}}\left(s\right)\ge 0$ represents the amount invested in the risky stock at time s. The dollar amount invested in the bond at time s is ${X^{{x_{0}},{e_{{i_{0}}}},u\left(\cdot \right)}}\left(s\right)-{u_{2}}\left(s\right)$, where ${X^{{x_{0}},{e_{{i_{0}}}},u\left(\cdot \right)}}\left(\cdot \right)$ is the wealth process associated with the strategy $u\left(\cdot \right)$ and the initial states $\left({x_{0}},{e_{{i_{0}}}}\right)$, and ${u_{3}}\left(s\right)$ represents the consumption rate at time $s\in \left[0,T\right]$. Thus, incorporating the reinsurance/new business and investment strategies into the surplus process and the risky asset, respectively, we consider the evolution of the controlled stochastic differential equation parametrized by $\left(t,\xi ,{e_{i}}\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};\mathbb{R}\right)\times \chi $ and satisfied by $X\left(\cdot \right)$: for $s\in \left[0,T\right]$,
where $r\left(s,\alpha \left(s\right)\right)=\left(\sigma \left(s,\alpha \left(s\right)\right)-{r_{0}}\left(s,\alpha \left(s\right)\right)\right)$ and $\delta =\eta -{\theta _{0}}$. Then, for any $\left(t,\xi ,{e_{i}}\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};\mathbb{R}\right)\times \chi $, the mean-variance-utility consumption–investment and reinsurance optimization problem reduces to the minimization of the cost functional $J(t,\xi ,{e_{i}};\cdot )$ given by
subject to (4.5), where $h\left(\cdot \right):[0,T]\to \mathbb{R}$ is a general deterministic nonexponential discount function satisfying $h(0)=1$, $h(s)>0$ $ds\text{-a.e.}$ and ${\textstyle\int _{0}^{T}}h(s)ds<\infty $. In this paper we consider general discount functions satisfying the above assumptions. Some possible examples of discount functions are considered in the literature; see, e.g., [42] and [10] (a short numerical check of one such choice is sketched after Remark 15).
(4.5)
\[ \left\{\begin{array}{l}dX\left(s\right)=\left\{{r_{0}}\left(s,\alpha \left(s\right)\right)X\left(s\right)+\left(\delta +{\theta _{0}}{u_{1}}\left(s\right)\right)\lambda {\mu _{Y}}+r\left(s,\alpha \left(s\right)\right){u_{2}}\left(s\right)\right\}ds\hspace{1em}\\ {} \phantom{dX\left(s\right)=}-{u_{3}}\left(s\right)ds+{\beta _{0}}{u_{1}}\left(s\right)d{W^{1}}\left(s\right)+\beta \left(s,\alpha \left(s\right)\right){u_{2}}\left(s\right)d{W^{2}}\left(s\right)\hspace{1em}\\ {} \phantom{dX\left(s\right)=}-{u_{1}}\left(s-\right){\textstyle\textstyle\int _{0}^{+\infty }}z{\tilde{N}_{\alpha }^{1}}\left(ds,dz\right)+{u_{2}}\left(s-\right){\textstyle\textstyle\int _{-1}^{+\infty }}z{\tilde{N}_{\alpha }^{2}}\left(ds,dz\right),\hspace{1em}\\ {} X\left(t\right)=\xi ,\alpha \left(t\right)={e_{i}},\hspace{1em}\end{array}\right.\](4.6)
\[\begin{aligned}{}J\left(t,\xi ,{e_{i}};u\left(\cdot \right)\right)& =\mathbb{E}\left[{\int _{t}^{T}}\frac{1}{2}h\left(s-t\right){u_{3}}{(s)^{2}}ds+\frac{1}{2}\mathrm{Var}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right.\\ {} & \hspace{1em}-\left.\left({\mu _{1}}\xi +{\mu _{2}}\right)\mathbb{E}\underset{}{\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]}\right],\end{aligned}\]Remark 15.
Similar to [19] and [21], due to the presence of the observable random factor $\alpha \left(\cdot \right)$, we consider the expectation of a conditional mean-variance criterion in the above cost functional. This is different from the mean-variance portfolio selection problem with regime switching considered in [41] and [5]. In [21], a conditional mean-variance portfolio selection problem with common noise is proposed and solved using linear-quadratic optimal control of the conditional McKean–Vlasov equation with random coefficients and a dynamic programming approach.
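As a quick illustration of the assumptions imposed on the discount function $h\left(\cdot \right)$, the sketch below checks them numerically for a hyperbolic choice $h(s)={(1+ks)^{-a}}$ with $k,a>0$; this particular form and the constants used are our own illustrative assumptions, not part of the model.

```python
import numpy as np
from scipy.integrate import quad

k, a, T = 1.0, 0.5, 60.0        # illustrative constants (assumptions)

def h(s):
    """Hyperbolic discount function h(s) = (1 + k s)^(-a)."""
    return (1.0 + k * s) ** (-a)

assert h(0.0) == 1.0                       # h(0) = 1
s_grid = np.linspace(0.0, T, 1001)
assert np.all(h(s_grid) > 0.0)             # h > 0 on [0, T]
integral, _ = quad(h, 0.0, T)
print("integral of h over [0, T]:", integral)   # finite
```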
With $n=1$, $p=l=m=3$, the optimal control problem associated with (4.5) and (4.6) is equivalent to the minimization of
subject to (2.1). Here $A={r_{0}}\left(s,\alpha \left(s\right)\right)$, $B=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}\lambda {\mu _{Y}}{\theta _{0}}& r\left(s,\alpha \left(s\right)\right)& -1\end{array}\right)$, $b=\delta \lambda {\mu _{Y}}$, ${D_{1}}=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}{\beta _{0}}& 0& 0\end{array}\right)$, ${D_{2}}=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}0& \beta \left(s,\alpha \left(s\right)\right)& 0\end{array}\right)$, $Q=0$, $\bar{Q}=0$, ${F_{1}}\left(z\right)=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}-z{1_{\left(0,\infty \right)}}\left(z\right)& 0& 0\end{array}\right)$, ${F_{2}}\left(z\right)=\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}0& z{1_{\left(-1,\infty \right)}}\left(z\right)& 0\end{array}\right)$, $\Gamma =\left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}0& 0& 1\end{array}\right)$, $R\left(t,s\right)=h\left(s-t\right){\Gamma ^{\top }}\Gamma $, $G=1$, $\bar{G}=-1$, ${C_{i}}=0$, ${\sigma _{i}}=0$, ${E_{k}}\left(z\right)=0$ and ${c_{k}}\left(z\right)=0$. Thus, the above model is a special case of the general time-inconsistent LQ problem formulated earlier in this paper. Then we apply Corollary 12 and Theorem 13 to obtain the unique Nash equilibrium trading strategy. Define
(4.7)
\[\begin{aligned}{}J\left(t,\xi ,{e_{i}};u\left(.\right)\right)& =\mathbb{E}\left[{\int _{t}^{T}}\frac{1}{2}\left\langle h\left(s-t\right){\Gamma ^{\top }}\Gamma u(s),u\left(s\right)\right\rangle ds+\frac{1}{2}\mathrm{Var}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right.\\ {} & \hspace{1em}-\left.\left({\mu _{1}}\xi +{\mu _{2}}\right)\mathbb{E}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right],\end{aligned}\](4.8)
\[ \rho \left(s,\alpha \left(s\right)\right)\triangleq \left(\frac{{\left(\lambda {\mu _{Y}}{\theta _{0}}\right)^{2}}}{\left({\beta _{0}^{2}}+{\textstyle\textstyle\int _{0}^{+\infty }}{z^{2}}{\theta _{\alpha }^{1}}\left(dz\right)\right)}+\frac{r{\left(s,\alpha \left(s\right)\right)^{2}}}{\left(\beta {\left(s,\alpha \left(s\right)\right)^{2}}+{\textstyle\textstyle\int _{-1}^{+\infty }}{z^{2}}{\theta _{\alpha }^{2}}\left(dz\right)\right)}\right).\]Then the system (3.23) reduces to the following: for $s\in \left[0,T\right]$,
(4.9)
\[ \left\{\begin{array}{l}{M^{\prime }}\left(s,{e_{i}}\right)+M\left(s,{e_{i}}\right)\left(2{r_{0}}\left(s,{e_{i}}\right)-\Upsilon \left(s,{e_{i}}\right)+{\lambda _{ii}}\right)-\rho \left(s,{e_{i}}\right)\Upsilon \left(s,{e_{i}}\right)\hspace{1em}\\ {} \hspace{1em}+{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}M\left(s,{e_{j}}\right)=0,\hspace{1em}\\ {} {\bar{M}^{\prime }}\left(s,{e_{i}}\right)+\bar{M}\left(s,{e_{i}}\right)\left(2{r_{0}}\left(s,{e_{i}}\right)-\Upsilon \left(s,{e_{i}}\right)+{\lambda _{ii}}\right)-\rho \left(s,{e_{i}}\right)\Upsilon \left(s,{e_{i}}\right)\hspace{1em}\\ {} \hspace{1em}+{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}\bar{M}\left(s,{e_{j}}\right)=0,\hspace{1em}\\ {} {\Upsilon ^{\prime }}\left(s,{e_{i}}\right)+\Upsilon \left(s,{e_{i}}\right)\left({r_{0}}\left(s,{e_{i}}\right)+{\lambda _{ii}}\right)+{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}\Upsilon \left(s,{e_{j}}\right)=0,\hspace{1em}\\ {} {\varphi ^{\prime }}\left(s,{e_{i}}\right)+\varphi \left(s,{e_{i}}\right)\left({r_{0}}\left(s,{e_{i}}\right)+{\lambda _{ii}}\right)+{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}\varphi \left(s,{e_{j}}\right)=0,\hspace{1em}\\ {} M\left(T,{e_{i}}\right)=1,\bar{M}\left(T,{e_{i}}\right)=-1,\hspace{2.5pt}\Upsilon \left(T,{e_{i}}\right)=-{\mu _{1}},\varphi \left(T,{e_{i}}\right)=-{\mu _{2}}.\hspace{1em}\end{array}\right.\]By standard arguments, we obtain, for $s\in \left[0,T\right]$ and ${e_{i}}\in \mathcal{X}$,
\[\begin{aligned}{}M\left(s,{e_{i}}\right)& ={e^{{\textstyle\textstyle\int _{s}^{T}}\left(2{r_{0}}\left(\tau ,{e_{i}}\right)-\Upsilon \left(\tau ,{e_{i}}\right)+{\lambda _{ii}}\right)d\tau }}\\ {} & \hspace{1em}\left(1+{\int _{s}^{T}}{e^{-{\textstyle\textstyle\int _{\tau }^{T}}\left(2{r_{0}}\left(u,{e_{i}}\right)-\Upsilon \left(u,{e_{i}}\right)+{\lambda _{ii}}\right)du}}\left\{-\rho \left(\tau ,{e_{i}}\right)\Upsilon \left(\tau ,{e_{i}}\right)\right.\right.\\ {} & \hspace{1em}+\left.{\sum \limits_{j\ne i}^{d}}\left.{\lambda _{ij}}M\left(\tau ,{e_{j}}\right)\right\}d\tau \right),\\ {} & =\bar{M}\left(s,{e_{i}}\right),\end{aligned}\]
Similarly, for ${e_{i}}\in \mathcal{X}$, we have
\[\begin{aligned}{}\Upsilon \left(s,{e_{i}}\right)& ={e^{{\textstyle\textstyle\int _{s}^{T}}\left({r_{0}}\left(\tau ,{e_{i}}\right)+{\lambda _{ii}}\right)d\tau }}\\ {} & \hspace{1em}\times \left(-{\mu _{1}}+{\int _{s}^{T}}{e^{{\textstyle\textstyle\int _{\tau }^{T}}-\left({r_{0}}\left(u,{e_{i}}\right)+{\lambda _{ii}}\right)du}}{\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}\Upsilon \left(\tau ,{e_{j}}\right)d\tau \right)\end{aligned}\]
and
\[\begin{aligned}{}\varphi \left(s,{e_{i}}\right)& ={e^{{\textstyle\textstyle\int _{s}^{T}}\left({r_{0}}\left(\tau ,{e_{i}}\right)+{\lambda _{ii}}\right)d\tau }}\\ {} & \hspace{1em}\times \left(-{\mu _{2}}+{\int _{s}^{T}}{e^{{\textstyle\textstyle\int _{\tau }^{T}}-\left({r_{0}}\left(u,{e_{i}}\right)+{\lambda _{ii}}\right)du}}{\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}\varphi \left(\tau ,{e_{j}}\right)d\tau \right).\end{aligned}\]
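The system (4.9) can also be evaluated numerically. The following Python sketch integrates (4.9) backward in time with scipy for two regimes, using the regime-dependent coefficients of the numerical example of Section 4.2 but an assumed short horizon, assumed values for the jump second moments $\int {z^{2}}{\theta _{\alpha }^{1}}(dz)$ and $\int {z^{2}}{\theta _{\alpha }^{2}}(dz)$, and the usual sign convention for the generator (negative diagonal). It is an illustration under these assumptions rather than the paper's own code.

```python
import numpy as np
from scipy.integrate import solve_ivp

# ---- illustrative data (assumptions unless taken from the example in Section 4.2) ----
d = 2                                    # two regimes
T = 1.0                                  # assumed short horizon (keeps the backward ODE well scaled)
mu1, mu2 = 1.0, 1.0
# generator with the conventional sign convention: q[i, j] >= 0 for i != j, negative diagonal
Q = np.array([[-2.0, 2.0],
              [4.0, -4.0]])
r0 = np.array([0.35, 0.40])              # r0(s, e_i), taken constant in s here
r = np.array([0.20, 0.25])
beta = np.array([0.30, 0.55])
lam, mu_Y, theta0, beta0 = 0.65, 0.6, 1.5, 0.5
m2_claim = 0.8                           # assumed value of  int z^2 theta^1(dz)
m2_asset = 0.1                           # assumed value of  int z^2 theta^2(dz)

def rho(i):
    """rho(s, e_i) of (4.8) with time-constant coefficients."""
    return (lam * mu_Y * theta0) ** 2 / (beta0 ** 2 + m2_claim) \
        + r[i] ** 2 / (beta[i] ** 2 + m2_asset)

def backward_rhs(s, y):
    """Right-hand side of (4.9), stacked as [Upsilon, varphi, M, Mbar]."""
    Ups, phi, M, Mbar = y[:d], y[d:2 * d], y[2 * d:3 * d], y[3 * d:]
    rho_vec = np.array([rho(i) for i in range(d)])
    dUps = -(r0 * Ups + Q @ Ups)
    dphi = -(r0 * phi + Q @ phi)
    dM = -(M * (2.0 * r0 - Ups) + Q @ M) + rho_vec * Ups
    dMbar = -(Mbar * (2.0 * r0 - Ups) + Q @ Mbar) + rho_vec * Ups
    return np.concatenate([dUps, dphi, dM, dMbar])

# terminal conditions of (4.9): M(T)=1, Mbar(T)=-1, Upsilon(T)=-mu1, varphi(T)=-mu2
yT = np.concatenate([-mu1 * np.ones(d), -mu2 * np.ones(d), np.ones(d), -np.ones(d)])
sol = solve_ivp(backward_rhs, (T, 0.0), yT, rtol=1e-8)
Ups0, phi0, M0, Mbar0 = np.split(sol.y[:, -1], 4)
print("Upsilon(0, e_i):", Ups0, " M(0, e_i):", M0)
```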
In view of Theorem 13, the Nash equilibrium control (3.25) gives, for $s\in \left[0,T\right]$,
where, for all $\left(s,{e_{i}}\right)\in \left[0,T\right]\times \mathcal{X}$,
and
(4.10)
\[\begin{aligned}{}{\hat{u}_{1}}\left(s\right)& =-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{\left(\lambda {\mu _{Y}}{\theta _{0}}\right)}{\left({\beta _{0}^{2}}+{\textstyle\textstyle\int _{0}^{+\infty }}{z^{2}}{\theta _{\alpha }^{1}}\left(dz\right)\right)}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),\end{aligned}\](4.11)
\[\begin{aligned}{}{\hat{u}_{2}}\left(s\right)& =-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{r\left(s,{e_{i}}\right)}{\left(\beta {\left(s,{e_{i}}\right)^{2}}+{\textstyle\textstyle\int _{-1}^{+\infty }}{z^{2}}{\theta _{\alpha }^{2}}\left(dz\right)\right)}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),\end{aligned}\](4.13)
\[ {\Phi _{1}}\left(s,{e_{i}}\right)=\frac{{e^{{\textstyle\textstyle\int _{s}^{T}}\left(-{r_{0}}\left(\tau ,{e_{i}}\right)+\Upsilon \left(\tau ,{e_{i}}\right)\right)d\tau }}\left(-{\mu _{1}}+{\textstyle\textstyle\int _{s}^{T}}{e^{{\textstyle\textstyle\int _{\tau }^{T}}-\left({r_{0}}\left(u,{e_{i}}\right)+{\lambda _{ii}}\right)du}}{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}\Upsilon \left(\tau ,{e_{j}}\right)d\tau \right)}{1+{\textstyle\textstyle\int _{s}^{T}}{e^{-{\textstyle\textstyle\int _{\tau }^{T}}\left(2{r_{0}}\left(u,{e_{i}}\right)-\Upsilon \left(u,{e_{i}}\right)+{\lambda _{ii}}\right)du}}\left\{-\rho \left(\tau ,{e_{i}}\right)\Upsilon \left(\tau ,{e_{i}}\right)+{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}M\left(\tau ,{e_{j}}\right)\right\}d\tau },\](4.14)
\[ {\Phi _{2}}\left(s,{e_{i}}\right)=\frac{{e^{{\textstyle\textstyle\int _{s}^{T}}\left(-{r_{0}}\left(\tau ,{e_{i}}\right)+\Upsilon \left(\tau ,{e_{i}}\right)\right)d\tau }}\left(-{\mu _{2}}+{\textstyle\textstyle\int _{s}^{T}}{e^{{\textstyle\textstyle\int _{\tau }^{T}}-\left({r_{0}}\left(u,{e_{i}}\right)+{\lambda _{ii}}\right)du}}{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}\varphi \left(\tau ,{e_{j}}\right)d\tau \right)}{1+{\textstyle\textstyle\int _{s}^{T}}{e^{-{\textstyle\textstyle\int _{\tau }^{T}}\left(2{r_{0}}\left(u,{e_{i}}\right)-\Upsilon \left(u,{e_{i}}\right)+{\lambda _{ii}}\right)du}}\left\{-\rho \left(\tau ,{e_{i}}\right)\Upsilon \left(\tau ,{e_{i}}\right)+{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}M\left(\tau ,{e_{j}}\right)\right\}d\tau }.\]The conditional expectation of the corresponding equilibrium wealth process solves the equation
\[ \left\{\begin{array}{l}d\mathbb{E}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]=\left\{{\mathcal{P}_{1}}\left(s,\alpha \left(s\right)\right)\mathbb{E}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]+{\mathcal{P}_{2}}\left(s,\alpha \left(s\right)\right)\right\}ds,\hspace{1em}\\ {} \mathbb{E}\left[\hat{X}\left(0\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]={x_{0}},\hspace{1em}\end{array}\right.\]
where
\[ \left\{\begin{array}{l}{\mathcal{P}_{1}}\left(s,\alpha \left(s\right)\right)={r_{0}}\left(s,\alpha \left(s\right)\right)-\rho \left(s,\alpha \left(s\right)\right){\Phi _{1}}\left(s,\alpha \left(s\right)\right)-\Upsilon \left(s,\alpha \left(s\right)\right),\hspace{1em}\\ {} {\mathcal{P}_{2}}\left(s,\alpha \left(s\right)\right)=-\rho \left(s,\alpha \left(s\right)\right){\Phi _{2}}\left(s,\alpha \left(s\right)\right)-\varphi \left(s,\alpha \left(s\right)\right)+b\left(s,\alpha \left(s\right)\right).\hspace{1em}\end{array}\right.\]
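To make the conditional-mean dynamics concrete, here is a minimal single-regime ($d=1$) Python sketch with constant, illustrative coefficients and assumed jump second moments: it evaluates ${\Phi _{1}}$ and ${\Phi _{2}}$ of (4.13)–(4.14) by numerical quadrature, forms ${\mathcal{P}_{1}}$ and ${\mathcal{P}_{2}}$ as above, and integrates the conditional-mean equation by forward Euler. It is a sketch under these assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

# Single-regime sketch (d = 1, so lambda_ii = 0) with constant, assumed coefficients.
T, N = 1.0, 2000
r0, r, beta = 0.35, 0.20, 0.30
lam, mu_Y, theta0, beta0 = 0.65, 0.6, 1.5, 0.5
delta = 0.09                              # as in the table of Section 4.2
mu1, mu2, x0 = 1.0, 1.0, 1.1
m2_claim, m2_asset = 0.8, 0.1             # assumed second moments of the jump measures
b = delta * lam * mu_Y
rho = (lam * mu_Y * theta0) ** 2 / (beta0 ** 2 + m2_claim) + r ** 2 / (beta ** 2 + m2_asset)

s = np.linspace(0.0, T, N + 1)
ds = s[1] - s[0]
Ups = -mu1 * np.exp(r0 * (T - s))         # Upsilon(s) for d = 1
phi = -mu2 * np.exp(r0 * (T - s))         # varphi(s)  for d = 1

def tail_integral(f):
    """Return the array of integrals  int_{s_k}^{T} f(tau) dtau."""
    cum = cumulative_trapezoid(f, s, initial=0.0)
    return cum[-1] - cum

A = tail_integral(2.0 * r0 - Ups)                       # int_s^T (2 r0 - Upsilon(u)) du
D = 1.0 + tail_integral(np.exp(-A) * (-rho * Ups))      # denominator of (4.13)-(4.14)
E_factor = np.exp(tail_integral(-r0 + Ups))             # numerator exponential
Phi1 = E_factor * (-mu1) / D
Phi2 = E_factor * (-mu2) / D

P1 = r0 - rho * Phi1 - Ups
P2 = -rho * Phi2 - phi + b

# Forward Euler for  d E[X_hat(s)] = (P1 E + P2) ds,  E[X_hat(0)] = x0
EX = np.empty(N + 1)
EX[0] = x0
for k in range(N):
    EX[k + 1] = EX[k] + (P1[k] * EX[k] + P2[k]) * ds
print("E[X_hat(T)] (sketch):", EX[-1])
```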
Technical computations show that
\[ \left\{\begin{array}{l}d\mathbb{E}\left[\hat{X}{\left(s\right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]=\left(\left\{2{\mathcal{P}_{1}}\left(s,\alpha \left(s\right)\right)+{\mathcal{P}_{3}}\left(s,\alpha \left(s\right)\right)\right\}\mathbb{E}\left[\hat{X}{\left(s\right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right.\hspace{1em}\\ {} \phantom{d\mathbb{E}\left[\hat{X}{\left(s\right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]=}+2\left({\mathcal{P}_{2}}\left(s,\alpha \left(s\right)\right)+{\mathcal{P}_{4}}\left(s,\alpha \left(s\right)\right)\right)\mathbb{E}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\hspace{1em}\\ {} \phantom{d\mathbb{E}\left[\hat{X}{\left(s\right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]=}+\left.{\mathcal{P}_{5}}\left(s,\alpha \left(s\right)\right)\right)ds,\hspace{1em}\\ {} \mathbb{E}\left[\hat{X}{\left(0\right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]={x_{0}^{2}},\hspace{1em}\end{array}\right.\]
and
\[ \left\{\begin{array}{l}d\mathrm{Var}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\hspace{1em}\\ {} \hspace{1em}=\left\{2{\mathcal{P}_{1}}\left(s,\alpha \left(s\right)\right)\mathrm{Var}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]+{\mathcal{P}_{3}}\left(s,\alpha \left(s\right)\right)\mathbb{E}\left[\hat{X}{\left(s\right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right.\hspace{1em}\\ {} \left.\hspace{2em}+2{\mathcal{P}_{4}}\left(s,\alpha \left(s\right)\right)\mathbb{E}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]+{\mathcal{P}_{5}}\left(s,\alpha \left(s\right)\right)\right\}ds,\hspace{1em}\\ {} \mathrm{Var}\left[\hat{X}\left(0\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]=0,\hspace{1em}\end{array}\right.\]
where
\[ \left\{\begin{array}{l}{\mathcal{P}_{3}}\left(s,\alpha \left(s\right)\right)=\rho \left(s,\alpha \left(s\right)\right){\Phi _{1}}{\left(s,\alpha \left(s\right)\right)^{2}},\hspace{1em}\\ {} {\mathcal{P}_{4}}\left(s,\alpha \left(s\right)\right)=\rho \left(s,\alpha \left(s\right)\right){\Phi _{1}}\left(s,\alpha \left(s\right)\right){\Phi _{2}}\left(s,\alpha \left(s\right)\right),\hspace{1em}\\ {} {\mathcal{P}_{5}}\left(s,\alpha \left(s\right)\right)=\rho \left(s,\alpha \left(s\right)\right){\Phi _{2}}{\left(s,\alpha \left(s\right)\right)^{2}}.\hspace{1em}\end{array}\right.\]
Then
\[\begin{aligned}{}& \mathbb{E}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]={\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle {e^{{\textstyle\textstyle\int _{0}^{s}}\hspace{2.5pt}{\mathcal{P}_{1}}\left(\tau ,{e_{i}}\right)d\tau }}\\ {} & \phantom{\mathbb{E}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]=}\times \left({x_{0}}+{\int _{0}^{s}}{e^{{\textstyle\textstyle\int _{0}^{\tau }}\hspace{2.5pt}-{\mathcal{P}_{1}}\left(u,{e_{i}}\right)du}}{\mathcal{P}_{2}}\left(\tau ,{e_{i}}\right)d\tau \right),\\ {} & \mathbb{E}\left[\hat{X}{\left(s\right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]={\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle {e^{{\textstyle\textstyle\int _{0}^{s}}\left\{2{\mathcal{P}_{1}}\left(\tau ,{e_{i}}\right)+{\mathcal{P}_{3}}\left(\tau ,{e_{i}}\right)\right\}\hspace{2.5pt}d\tau }}\\ {} & \phantom{\mathbb{E}\left[\hat{X}{\left(s\right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]=}\times \left\{{x_{0}^{2}}+{\int _{0}^{s}}{e^{{\textstyle\textstyle\int _{0}^{\tau }}-\left\{2{\mathcal{P}_{1}}\left(u,{e_{i}}\right)+{\mathcal{P}_{3}}\left(u,{e_{i}}\right)\right\}du}}\right.\\ {} & \phantom{\mathbb{E}\left[\hat{X}{\left(s\right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]=}\times \left.\left(2\left({\mathcal{P}_{2}}\left(\tau ,{e_{i}}\right)+{\mathcal{P}_{4}}\left(\tau ,{e_{i}}\right)\right)\mathbb{E}\left[\hat{X}\left(\tau \right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]+{\mathcal{P}_{5}}\left(\tau ,{e_{i}}\right)\right)\underset{}{d\tau }\right\},\end{aligned}\]
and
\[\begin{aligned}{}& \mathrm{Var}\left[\hat{X}\left(s\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\\ {} & ={\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle {e^{{\textstyle\textstyle\int _{0}^{s}}\hspace{2.5pt}2{\mathcal{P}_{1}}\left(\tau ,{e_{i}}\right)d\tau }}{\int _{0}^{s}}{e^{{\textstyle\textstyle\int _{0}^{\tau }}\hspace{2.5pt}-2{\mathcal{P}_{1}}\left(u,{e_{i}}\right)du}}\left\{{\mathcal{P}_{3}}\left(\tau ,{e_{i}}\right)\mathbb{E}\left[\hat{X}{\left(\tau \right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right.\\ {} & \left.\hspace{1em}+2{\mathcal{P}_{4}}\left(\tau ,{e_{i}}\right)\mathbb{E}\left[\hat{X}\left(\tau \right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]+{\mathcal{P}_{5}}\left(\tau ,{e_{i}}\right)\right\}d\tau .\end{aligned}\]
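Continuing the single-regime sketch given after the definitions of ${\mathcal{P}_{1}}$ and ${\mathcal{P}_{2}}$ above (it reuses the grid and the arrays Phi1, Phi2, P1, P2, EX defined there), the second-moment and variance equations can be discretized in exactly the same way; again, this is only an illustration under the assumptions stated there.

```python
# Continuation of the single-regime sketch above; reuses rho, Phi1, Phi2, P1, P2, EX, N, ds, x0.
P3 = rho * Phi1 ** 2
P4 = rho * Phi1 * Phi2
P5 = rho * Phi2 ** 2

EX2 = np.empty(N + 1)          # second moment of the equilibrium wealth
VarX = np.empty(N + 1)         # variance of the equilibrium wealth
EX2[0], VarX[0] = x0 ** 2, 0.0
for k in range(N):
    EX2[k + 1] = EX2[k] + ((2.0 * P1[k] + P3[k]) * EX2[k]
                           + 2.0 * (P2[k] + P4[k]) * EX[k] + P5[k]) * ds
    VarX[k + 1] = VarX[k] + (2.0 * P1[k] * VarX[k] + P3[k] * EX2[k]
                             + 2.0 * P4[k] * EX[k] + P5[k]) * ds
print("Var[X_hat(T)] (sketch):", VarX[-1])
```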
Hence the objective function value for the equilibrium trading strategy $\hat{u}\left(\cdot \right)$ is
\[\begin{aligned}{}& J\left(0,{x_{0}},{e_{{i_{0}}}};\hat{u}\left(\cdot \right)\right)\\ {} & \hspace{1em}=\mathbb{E}\left[{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(T\right),{e_{i}}\right\rangle \left\{{\int _{0}^{T}}\frac{1}{2}h\left(s\right){\left(\Upsilon \left(s,{e_{i}}\right)\hat{X}\left(s\right)+\varphi \left(s,{e_{i}}\right)\right)^{2}}ds\right.\right.\\ {} & \hspace{2em}+\frac{1}{2}{e^{{\textstyle\textstyle\int _{0}^{T}}\hspace{2.5pt}2{\mathcal{P}_{1}}\left(\tau ,{e_{i}}\right)d\tau }}{\int _{0}^{T}}{e^{{\textstyle\textstyle\int _{0}^{\tau }}-2{\mathcal{P}_{1}}\left(u,{e_{i}}\right)du}}\left\{{\mathcal{P}_{3}}\left(\tau ,{e_{i}}\right)\mathbb{E}\left[\hat{X}{\left(\tau \right)^{2}}\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right.\\ {} & \hspace{2em}+\left.2{\mathcal{P}_{4}}\left(\tau ,{e_{i}}\right)\mathbb{E}\left[\hat{X}\left(\tau \right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]+{\mathcal{P}_{5}}\left(\tau ,{e_{i}}\right)\right\}d\tau \\ {} & \hspace{2em}-\left.\left.\left({\mu _{1}}{x_{0}}+{\mu _{2}}\right){e^{{\textstyle\textstyle\int _{0}^{T}}{\mathcal{P}_{1}}\left(\tau ,{e_{i}}\right)d\tau }}\left({x_{0}}+{\int _{0}^{T}}{e^{{\textstyle\textstyle\int _{0}^{\tau }}-{\mathcal{P}_{1}}\left(u,{e_{i}}\right)du}}{\mathcal{P}_{2}}\left(\tau ,{e_{i}}\right)d\tau \right)\right\}\right].\end{aligned}\]
4.2 Conditional mean-variance investment and reinsurance strategies
In this subsection, we will address a special case where the insurer does not take into account the consumption strategy. The objective is to maximize the conditional expectation of terminal wealth $\mathbb{E}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]$ and at the same time to minimize the conditional variance of the terminal wealth $\mathrm{Var}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]$, over controls $u\left(\cdot \right)$ valued in ${\mathbb{R}^{2}}$. Then, the mean-variance investment and reinsurance optimization problem is defined as minimizing the cost $J\left(t,\xi ,{e_{i}};\cdot \right)$ given by
subject to, for $s\in \left[0,T\right]$,
where $\left(t,\xi ,{e_{i}}\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};\mathbb{R}\right)\times \chi $ and $u\left(\cdot \right)={\left({u_{1}}\left(\cdot \right),{u_{2}}\left(\cdot \right)\right)^{\top }}$ is an admissible trading strategy.
(4.15)
\[ J\left(t,\xi ,{e_{i}};u\left(\cdot \right)\right)=\frac{1}{2}\mathbb{E}\left[\mathrm{Var}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]-\left({\mu _{1}}\xi +{\mu _{2}}\right)\mathbb{E}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right],\](4.16)
\[ \left\{\begin{array}{l}dX\left(s\right)=\left\{{r_{0}}\left(s,\alpha \left(s\right)\right)X\left(s\right)+\left(\delta +{\theta _{0}}{u_{1}}\left(s\right)\right)\lambda {\mu _{Y}}+r\left(s,\alpha \left(s\right)\right){u_{2}}\left(s\right)\right\}ds\hspace{1em}\\ {} \phantom{dX\left(s\right)=}+{\beta _{0}}{u_{1}}\left(s\right)d{W^{1}}\left(s\right)+\beta \left(s,\alpha \left(s\right)\right){u_{2}}\left(s\right)d{W^{2}}\left(s\right)\hspace{1em}\\ {} \phantom{dX\left(s\right)=}-{u_{1}}\left(s-\right){\textstyle\textstyle\int _{0}^{+\infty }}z{\tilde{N}_{\alpha }^{1}}\left(ds,dz\right)+{u_{2}}\left(s-\right){\textstyle\textstyle\int _{-1}^{+\infty }}z{\tilde{N}_{\alpha }^{2}}\left(ds,dz\right),\hspace{1em}\\ {} X\left(t\right)=\xi ,\alpha \left(t\right)={e_{i}},\hspace{1em}\end{array}\right.\]In this case, the equilibrium strategy given by the expressions (4.10) and (4.11) changes to, for $s\in \left[0,T\right]$,
where $\forall \left(s,{e_{i}}\right)\in \left[0,T\right]\times \mathcal{X}$
(4.17)
\[\begin{aligned}{}{\hat{u}_{1}}\left(s\right)& =-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{\left(\lambda {\mu _{Y}}{\theta _{0}}\right)}{\left({\beta _{0}^{2}}+{\textstyle\textstyle\int _{0}^{+\infty }}{z^{2}}{\theta _{\alpha }^{1}}\left(dz\right)\right)}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),\end{aligned}\](4.18)
\[\begin{aligned}{}{\hat{u}_{2}}\left(s\right)& =-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{r\left(s,{e_{i}}\right)}{\left(\beta {\left(s,{e_{i}}\right)^{2}}+{\textstyle\textstyle\int _{-1}^{+\infty }}{z^{2}}{\theta _{\alpha }^{2}}\left(dz\right)\right)}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),\end{aligned}\](4.19)
\[\begin{aligned}{}{\Phi _{1}}\left(s,{e_{i}}\right)& =\frac{{e^{{\textstyle\textstyle\int _{s}^{T}}-{r_{0}}\left(\tau ,{e_{i}}\right)d\tau }}\left(-{\mu _{1}}+{\textstyle\textstyle\int _{s}^{T}}{e^{{\textstyle\textstyle\int _{\tau }^{T}}-{r_{0}}\left(u,{e_{i}}\right)du}}{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}\Upsilon \left(\tau ,{e_{j}}\right)d\tau \right)}{1+{\textstyle\textstyle\int _{s}^{T}}{e^{-{\textstyle\textstyle\int _{\tau }^{T}}\left(2{r_{0}}\left(u,{e_{i}}\right)+{\lambda _{ii}}\right)du}}\left\{-\rho \left(\tau ,{e_{i}}\right)\Upsilon \left(\tau ,{e_{i}}\right)+{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}M\left(\tau ,{e_{j}}\right)\right\}d\tau },\end{aligned}\](4.20)
\[\begin{aligned}{}{\Phi _{2}}\left(s,{e_{i}}\right)& =\frac{{e^{{\textstyle\textstyle\int _{s}^{T}}-{r_{0}}\left(\tau ,{e_{i}}\right)d\tau }}\left(-{\mu _{2}}+{\textstyle\textstyle\int _{s}^{T}}{e^{{\textstyle\textstyle\int _{\tau }^{T}}-{r_{0}}\left(u,{e_{i}}\right)du}}{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}\varphi \left(\tau ,{e_{j}}\right)d\tau \right)}{1+{\textstyle\textstyle\int _{s}^{T}}{e^{-{\textstyle\textstyle\int _{\tau }^{T}}\left(2{r_{0}}\left(u,{e_{i}}\right)+{\lambda _{ii}}\right)du}}\left\{-\rho \left(\tau ,{e_{i}}\right)\Upsilon \left(\tau ,{e_{i}}\right)+{\textstyle\sum \limits_{j\ne i}^{d}}{\lambda _{ij}}M\left(\tau ,{e_{j}}\right)\right\}d\tau }.\end{aligned}\]
Numerical example. In this part, we provide a numerical example to illustrate the validity and performance of the proposed approach in solving the mean-variance problem with Markov switching. For simplicity, let us consider Equation (4.16), in which the Markov chain takes two possible states ${e_{1}}=1$ and ${e_{2}}=2$, i.e. $\chi =\left\{1,2\right\}$, with the generator of the Markov chain being
\[ \mathcal{H}=\left(\begin{array}{c@{\hskip10.0pt}c}2\hspace{1em}& -2\\ {} -4\hspace{1em}& 4\end{array}\right)\]
and the initial condition $X\left(0\right)=1.1$. For illustration purposes, we assume that the finite time horizon is $T=60$ and that the coefficients of the dynamics are given in the following table.

|  | ${r_{0}}\left(\alpha \left(t\right)\right)$ | $r\left(\alpha \left(t\right)\right)$ | $\beta \left(\alpha \left(t\right)\right)$ | δ | ${\theta _{0}}$ | ${\beta _{0}}$ | λ | ${\mu _{Y}}$ |
| $\alpha \left(t\right)=1$ | 0.35 | 0.20 | 0.30 | 0.09 | 1.5 | 0.5 | 0.65 | 0.6 |
| $\alpha \left(t\right)=2$ | 0.40 | 0.25 | 0.55 | 0.09 | 1.5 | 0.5 | 0.65 | 0.6 |
We consider the cost functional defined by Equation (4.15) with ${\mu _{1}}={\mu _{2}}=1$. For brevity, we use the notation $\mathbf{E}\left[\mathbf{X}(\mathbf{t},\mathbf{i})\right]$ for $\mathbb{E}\left[\hat{X}\left(t\right)\left|{\mathcal{F}_{T}^{i}}\right.\right]$, where $i=1,2$ or $i=\alpha \left(t\right)$.
Figure 1 depicts the state changes of the Markov chain $\alpha (\cdot )$ between 0 and 60 units of time, where the initial state is assumed to be $\alpha (0)=1$.
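Figure 1 can be reproduced with a short simulation of the two-state chain. The sketch below is written in Python rather than Matlab and reads the entries of $\mathcal{H}$ as switching intensities ${q_{12}}=2$ and ${q_{21}}=4$ (the usual generator sign convention); this reading is our interpretation.

```python
import numpy as np

rng = np.random.default_rng(1)

T = 60.0
q = {1: 2.0, 2: 4.0}        # assumed jump intensities out of states 1 and 2

def simulate_chain(alpha0=1):
    """Simulate a two-state continuous-time Markov chain on [0, T]."""
    times, states = [0.0], [alpha0]
    t, state = 0.0, alpha0
    while True:
        t += rng.exponential(1.0 / q[state])     # exponential holding time in the current state
        if t >= T:
            break
        state = 2 if state == 1 else 1           # jump to the other state
        times.append(t)
        states.append(state)
    return np.array(times), np.array(states)

jump_times, visited_states = simulate_chain()
print("number of regime switches on [0, T]:", len(jump_times) - 1)
```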
Figure 2 presents the curves of the different state trajectories of the equilibrium expected wealth $\mathbf{E}\left[\mathbf{X}(\mathbf{t},\mathbf{i})\right]$ in the three modes: $i=1$, $i=2$ and $i=\alpha \left(t\right)$. Using Matlab's ODE solvers (in particular the function ode45) together with the simulated Markov chain $\alpha \left(\cdot \right)$, we obtain the trajectories of $\mathbf{E}\left[\mathbf{X}(\mathbf{t},\mathbf{1})\right]$, $\mathbf{E}\left[\mathbf{X}(\mathbf{t},\mathbf{2})\right]$ and $\mathbf{E}\left[\mathbf{X}(\mathbf{t},\alpha \left(t\right))\right]$ and their graphs: the dashed blue line is the graph of $\mathbf{E}\left[\mathbf{X}(\mathbf{t},\mathbf{1})\right]$, the continuous brown line is the graph of $\mathbf{E}\left[\mathbf{X}(\mathbf{t},\mathbf{2})\right]$, and the solid black line is the graph of $\mathbf{E}\left[\mathbf{X}(\mathbf{t},\alpha \left(t\right))\right]$, whose values switch between the dashed blue and the continuous brown lines.
Figure 3 shows the state trajectory of the equilibrium wealth $\mathrm{X}(\cdot )$, starting from $\mathrm{X}(0)=1.1$ with $\alpha \left(0\right)=1$. The values then switch between two paths, namely the trajectories of the equilibrium wealth corresponding to the two states of the Markov chain, $\alpha \left(t\right)=1$ and $\alpha \left(t\right)=2$. As a result, comparing with Figure 1, we can clearly see how the Markovian switching influences the overall behavior of the state trajectories of the equilibrium wealth.
4.3 Special cases and relationship to other works
4.3.1 Classical Cramér–Lundberg model
Now, assume that the insurer’s surplus is modelled by the classical Cramér–Lundberg (CL) model (i.e. the model (4.2) with ${\beta _{0}}=0$), and that the financial market consists of one risk-free asset whose price process is given by (4.3), and only one risky asset whose price process does not have jumps and is modelled by a diffusion process (i.e. the model (4.4) without the jump component). Then the dynamics of the wealth process $X\left(\cdot \right)={X^{t,\xi ,{e_{i}}}}\left(\cdot ;u\left(\cdot \right)\right)$ which corresponds to an admissible strategy $u\left(\cdot \right)={\left({u_{1}}\left(\cdot \right),{u_{2}}\left(\cdot \right)\right)^{\top }}$ and initial triplet $\left(t,\xi ,{e_{i}}\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};\mathbb{R}\right)\times \mathcal{X}$ can be described, for $s\in \left[t,T\right]$, by
(4.21)
\[ \left\{\begin{array}{l}dX\left(s\right)=\left\{{r_{0}}\left(s,\alpha \left(s\right)\right)X\left(s\right)+\left(\delta +{\theta _{0}}{u_{1}}\left(s\right)\right)\lambda {\mu _{Y}}+r\left(s,\alpha \left(s\right)\right){u_{2}}\left(s\right)\right\}ds\hspace{1em}\\ {} \phantom{dX\left(s\right)=}+\beta \left(s,\alpha \left(s\right)\right){u_{2}}\left(s\right)d{W^{2}}\left(s\right)-{u_{1}}\left(s-\right){\textstyle\textstyle\int _{0}^{+\infty }}z{\tilde{N}_{\alpha }^{1}}\left(ds,dz\right),\hspace{1em}\\ {} X\left(t\right)=\xi ,\alpha \left(t\right)={e_{i}}.\hspace{1em}\end{array}\right.\]We derive the equilibrium strategy which is described for the following two cases.
Case 1: ${\mu _{1}}=0$. We suppose that ${\mu _{1}}=0$ and ${\mu _{2}}=\frac{1}{\gamma }$, such that $\gamma >0$. Then the minimization problem (4.15) reduces to
subject to $u\left(\cdot \right)\in {\mathcal{L}_{\mathcal{F},p}^{2}}\left(0,T;{\mathbb{R}^{2}}\right)$, where $X\left(\cdot \right)={X^{t,\xi ,{e_{i}}}}\left(\cdot ;u\left(\cdot \right)\right)$ satisfies (4.21), for every $\left(t,{x_{t}},{e_{i}}\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};\mathbb{R}\right)\times \chi $. In this case the equilibrium reinsurance–investment strategy given by (4.17) and (4.18) for $s\in \left[0,T\right]$ becomes
where ${\Phi _{1}}\left(s,{e_{i}}\right)$ and ${\Phi _{2}}\left(s,{e_{i}}\right)$ are given by (4.19) and (4.20) for ${\mu _{1}}=0$ and ${\mu _{2}}=\frac{1}{\gamma }$.
(4.22)
\[ \min J\left(t,\xi ,{e_{i}};u\left(\cdot \right)\right)=\mathbb{E}\left[\frac{1}{2}\mathrm{Var}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]-\frac{1}{\gamma }\mathbb{E}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right],\](4.23)
\[\begin{aligned}{}{\hat{u}_{1}}\left(s\right)& =-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{\left(\lambda {\mu _{Y}}{\theta _{0}}\right)}{{\textstyle\textstyle\int _{0}^{+\infty }}{z^{2}}{\theta _{\alpha }^{1}}\left(dz\right)}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),\end{aligned}\](4.24)
\[\begin{aligned}{}{\hat{u}_{2}}\left(s\right)& =-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{r\left(s,{e_{i}}\right)}{\beta {\left(s,{e_{i}}\right)^{2}}}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),\end{aligned}\]In the absence of the Markov chain, i.e. when $d=1$, $\ell \left(s,\alpha \left(s\right)\right)\equiv \ell \left(s\right)$ for $\ell ={r_{0}},r$ and β, the equilibrium solution (4.23) and (4.24) for $s\in \left[0,T\right]$ reduces to
\[\begin{aligned}{}{\hat{u}_{1}}\left(s\right)& =\frac{\left(\lambda {\mu _{Y}}{\theta _{0}}\right){e^{{\textstyle\textstyle\int _{s}^{T}}-{r_{0}}\left(\tau \right)d\tau }}}{\gamma \left({\textstyle\textstyle\int _{0}^{+\infty }}{z^{2}}{\theta ^{1}}\left(dz\right)\right)},\\ {} {\hat{u}_{2}}\left(s\right)& =\frac{r\left(s\right){e^{{\textstyle\textstyle\int _{s}^{T}}-{r_{0}}\left(\tau \right)d\tau }}}{\gamma \beta {\left(s\right)^{2}}}.\end{aligned}\]
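These closed-form strategies are straightforward to evaluate. The sketch below uses constant coefficients and an assumed value for the claim second moment ${\int _{0}^{+\infty }}{z^{2}}{\theta ^{1}}\left(dz\right)$, all of which are illustrative choices rather than values taken from the paper.

```python
import numpy as np

# Illustrative constant coefficients (assumptions)
T = 1.0
r0, r, beta = 0.35, 0.20, 0.30
lam, mu_Y, theta0 = 0.65, 0.6, 1.5
gamma = 2.0
m2_claim = 0.8                      # assumed value of  int_0^inf z^2 theta^1(dz)

def u1_hat(s):
    """Equilibrium reinsurance strategy without regime switching (Case 1)."""
    return lam * mu_Y * theta0 * np.exp(-r0 * (T - s)) / (gamma * m2_claim)

def u2_hat(s):
    """Equilibrium investment strategy without regime switching (Case 1)."""
    return r * np.exp(-r0 * (T - s)) / (gamma * beta ** 2)

s = np.linspace(0.0, T, 5)
print("u1_hat:", u1_hat(s))
print("u2_hat:", u2_hat(s))
```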
It is worth pointing out that the above equilibrium solutions are identical to the ones found in Zeng and Li [43] by solving some extended HJB equations.
Case 2: ${\mu _{2}}=0$. Now, suppose that ${\mu _{1}}=\frac{1}{\gamma }$ and ${\mu _{2}}=0$, such that $\gamma >0$. Then the minimization problem (4.15) reduces to
where ${\Phi _{1}}\left(s,{e_{i}}\right)$ and ${\Phi _{2}}\left(s,{e_{i}}\right)$ are given by (4.19) and (4.20) for ${\mu _{1}}=\frac{1}{\gamma }$ and ${\mu _{2}}=0$.
\[ \min J\left(t,\xi ,{e_{i}};u\left(\cdot \right)\right)=\mathbb{E}\left[\frac{1}{2}\mathrm{Var}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]-\frac{\xi }{\gamma }\mathbb{E}\left[X\left(T\right)\left|{\mathcal{F}_{T}^{\alpha }}\right.\right]\right],\]
for any $\left(t,{x_{t}},{e_{i}}\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};\mathbb{R}\right)\times \chi $. This is the case of the mean-variance problem with state dependent risk aversion. For this case the equilibrium reinsurance–investment strategy given by (4.17) and (4.18) for $s\in \left[0,T\right]$, reduces to (4.25)
\[\begin{aligned}{}{\hat{u}_{1}}\left(s\right)& =-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{\left(\lambda {\mu _{Y}}{\theta _{0}}\right)}{{\textstyle\textstyle\int _{0}^{+\infty }}{z^{2}}{\theta _{\alpha }^{1}}\left(dz\right)}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),\end{aligned}\](4.26)
\[\begin{aligned}{}{\hat{u}_{2}}\left(s\right)& =-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{r\left(s,{e_{i}}\right)}{\beta {\left(s,{e_{i}}\right)^{2}}}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),\end{aligned}\]In the absence of the Markov chain the equilibrium solution reduces for $s\in \left[0,T\right]$ to
(4.27)
\[\begin{aligned}{}{\hat{u}_{1}}\left(s\right)& =\frac{\left(\lambda {\mu _{Y}}{\theta _{0}}\right){e^{{\textstyle\textstyle\int _{s}^{T}}-{r_{0}}\left(\tau \right)d\tau }}\hat{X}\left(s\right)}{\left({\textstyle\textstyle\int _{0}^{+\infty }}{z^{2}}{\theta ^{1}}\left(dz\right)\right)\left(\gamma +{\textstyle\textstyle\int _{s}^{T}}{e^{-{\textstyle\textstyle\int _{\tau }^{T}}{r_{0}}\left(u\right)du}}\rho \left(\tau \right)d\tau \right)},\end{aligned}\](4.28)
\[\begin{aligned}{}{\hat{u}_{2}}\left(s\right)& =\frac{r\left(s\right){e^{{\textstyle\textstyle\int _{s}^{T}}-{r_{0}}\left(\tau \right)d\tau }}\hat{X}\left(s\right)}{\beta {\left(s\right)^{2}}\left(\gamma +{\textstyle\textstyle\int _{s}^{T}}{e^{-{\textstyle\textstyle\int _{\tau }^{T}}{r_{0}}\left(u\right)du}}\rho \left(\tau \right)d\tau \right)}.\end{aligned}\]The equilibrium reinsurance–investment solution presented above is comparable to that found in Li and Li [17] in which the equilibrium is however defined within the class of feedback controls. Note that in [17] the authors adopted the approach developed by Björk et al. [4] and they have obtained feedback equilibrium solutions via some well posed integral equations.
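Similarly, (4.27)–(4.28) can be evaluated explicitly when the coefficients are constant, since the integral in the denominator is then available in closed form. The sketch below again relies on illustrative constants and an assumed claim second moment; in this setting $\rho $ of (4.8) specializes to the Cramér–Lundberg surplus (${\beta _{0}}=0$) and a jump-free risky asset.

```python
import numpy as np

# Illustrative constant coefficients (assumptions)
T = 1.0
r0, r, beta = 0.35, 0.20, 0.30
lam, mu_Y, theta0 = 0.65, 0.6, 1.5
gamma = 2.0
m2_claim = 0.8                              # assumed value of  int_0^inf z^2 theta^1(dz)

# rho of (4.8) specialised to beta0 = 0 and no asset jumps, with constant coefficients
rho = (lam * mu_Y * theta0) ** 2 / m2_claim + r ** 2 / beta ** 2

def denominator(s):
    """gamma + int_s^T exp(-int_tau^T r0 du) * rho dtau  (closed form for constant r0)."""
    return gamma + rho * (1.0 - np.exp(-r0 * (T - s))) / r0

def u1_hat(s, x_hat):
    """Equilibrium reinsurance strategy (4.27) without regime switching."""
    return lam * mu_Y * theta0 * np.exp(-r0 * (T - s)) * x_hat / (m2_claim * denominator(s))

def u2_hat(s, x_hat):
    """Equilibrium investment strategy (4.28) without regime switching."""
    return r * np.exp(-r0 * (T - s)) * x_hat / (beta ** 2 * denominator(s))

print(u1_hat(0.5, 1.1), u2_hat(0.5, 1.1))
```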
4.3.2 The investment only
In this subsection, we consider the investment-only optimization problem. In this case the insurer does not purchase reinsurance or acquire new business, which means that ${u_{1}}\left(s\right)\equiv 1$, and her/his consumption is not taken into account. We assume that the financial market consists of one risk-free asset whose price process is given by (4.3), and only one risky asset whose price process does not have jumps. A trading strategy $u\left(\cdot \right)$ reduces to a one-dimensional stochastic process ${u_{2}}\left(\cdot \right)$ in this case, where ${u_{2}}\left(s\right)$ represents the amount invested in the risky stock at time s. The dynamics of the wealth process $X\left(\cdot \right)$ which corresponds to an admissible investment strategy ${u_{2}}\left(\cdot \right)$ and initial triplet $\left(t,\xi ,{e_{i}}\right)\in \left[0,T\right]\times {\mathbb{L}^{2}}\left(\Omega ,{\mathcal{F}_{t}^{\alpha }},\mathbb{P};\mathbb{R}\right)\times \mathcal{X}$ can be described by
\[ \left\{\begin{array}{l}dX\left(s\right)=\left\{{r_{0}}\left(s,\alpha \left(s\right)\right)X\left(s\right)+\delta \lambda {\mu _{Y}}+r\left(s,\alpha \left(s\right)\right){u_{2}}\left(s\right)\right\}ds+{\beta _{0}}d{W^{1}}\left(s\right)\hspace{1em}\\ {} \phantom{dX\left(s\right)=}+\beta \left(s,\alpha \left(s\right)\right){u_{2}}\left(s\right)d{W^{2}}\left(s\right)-{\textstyle\textstyle\int _{0}^{+\infty }}z{\tilde{N}_{\alpha }^{1}}\left(ds,dz\right),\hspace{2.5pt}\text{for}\hspace{2.5pt}s\in \left[t,T\right],\hspace{1em}\\ {} X\left(t\right)=\xi ,\alpha \left(t\right)={e_{i}}.\hspace{1em}\end{array}\right.\]
Similar to the previous subsection, for the investment-only case we derive the equilibrium strategy which is described in the following two cases.
Case 1: ${\mu _{1}}=0$. We suppose that ${\mu _{1}}=0$ and ${\mu _{2}}=\frac{1}{\gamma }$, such that $\gamma >0$. In this case the equilibrium investment strategy given by (4.18) becomes
\[ {\hat{u}_{2}}\left(s\right)=-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{r\left(s,{e_{i}}\right)}{\beta {\left(s,{e_{i}}\right)^{2}}}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),s\in \left[0,T\right],\]
where ${\Phi _{1}}\left(s,{e_{i}}\right)$ and ${\Phi _{2}}\left(s,{e_{i}}\right)$ are given by (4.19) and (4.20) for ${\mu _{1}}=0$ and ${\mu _{2}}=\frac{1}{\gamma }$. In the absence of the Markov chain the equilibrium solution reduces to
\[ {\hat{u}_{2}}\left(s\right)=\frac{r\left(s\right){e^{{\textstyle\textstyle\int _{s}^{T}}-{r_{0}}\left(\tau \right)d\tau }}}{\gamma \beta {\left(s\right)^{2}}},\hspace{2.5pt}s\in \left[0,T\right].\]
This essentially covers the solution obtained by Björk and Murgoci [3] by solving some extended HJB equations.
Case 2: ${\mu _{2}}=0$. Now, suppose that ${\mu _{1}}=\frac{1}{\gamma }$ and ${\mu _{2}}=0$, such that $\gamma >0$. This is the case of the mean-variance problem with state-dependent risk aversion. For this case the equilibrium investment strategy given by (4.18) reduces to
\[ {\hat{u}_{2}}\left(s\right)=-{\sum \limits_{i=1}^{d}}\left\langle \alpha \left(s-\right),{e_{i}}\right\rangle \frac{r\left(s,{e_{i}}\right)}{\beta {\left(s,{e_{i}}\right)^{2}}}\left({\Phi _{1}}\left(s,{e_{i}}\right)\hat{X}\left(s\right)+{\Phi _{2}}\left(s,{e_{i}}\right)\right),\hspace{2.5pt}s\in \left[0,T\right],\]
where ${\Phi _{1}}\left(s,{e_{i}}\right)$ and ${\Phi _{2}}\left(s,{e_{i}}\right)$ are given by (4.19) and (4.20) for ${\mu _{1}}=\frac{1}{\gamma }$ and ${\mu _{2}}=0$.In the absence of the Markov chain the equilibrium solution reduces to
\[ {\hat{u}_{2}}\left(s\right)=\frac{r\left(s\right){e^{{\textstyle\textstyle\int _{s}^{T}}-{r_{0}}\left(\tau \right)d\tau }}\hat{X}\left(s\right)}{\beta {\left(s\right)^{2}}\left(\gamma +{\textstyle\textstyle\int _{s}^{T}}{e^{-{\textstyle\textstyle\int _{\tau }^{T}}{r_{0}}\left(u\right)du}}\rho \left(\tau \right)d\tau \right)},\hspace{2.5pt}s\in \left[0,T\right].\]
This essentially covers the solution obtained by Hu et al. [15].
5 Conclusion
In this paper, we have considered a class of dynamic decision models of conditional time-inconsistent LQ type under the effect of Markovian regime switching. We have employed the game-theoretic approach to handle the time inconsistency. Throughout this study, open-loop Nash equilibrium strategies are established as an alternative to optimal strategies. This was achieved using a stochastic system that consists of a flow of forward–backward stochastic differential equations together with equilibrium conditions. The concrete examples from mathematical finance illustrate the applicability of the proposed results. This work may be developed in several directions:
-
(1) The methodology may be extended, for example, to a non-Markovian framework, in which the coefficients of the controlled SDE as well as those of the objective functional are allowed to be random. Research on this topic is in progress and will be reported in a forthcoming paper.
-
(2) As suggested by the reviewer, the model discussed in this paper may be extended to progressively measurable controls as an alternative to predictable ones; how to obtain the corresponding state feedback equilibrium strategy is then a very interesting and challenging research problem (see [29] for more details). Further investigations will be carried out in our future publications.