1 Introduction
How to efficiently allocate capital lies at the heart of financial decision making. Portfolio theory, as developed by [35], provides a framework for this problem, based on the means, variances and covariances of the assets in the considered portfolio. The theory revolves around the trade-off between expected return and variance (risk), known as mean-variance optimization. In this setting, investors allocate wealth to maximize the expected return for a given level of risk or, conversely, to minimize the risk for a given level of expected return. Although it has attracted considerable criticism (see, e.g., [42] and [27]), the framework remains one of the most crucial components of portfolio management.
In this paper, we consider the tangency portfolio (TP), which is one of the most important portfolios in the financial literature. The TP weights determine what proportions of the capital to invest in each asset and are obtained by maximizing the expected quadratic utility function. For a portfolio of p risky assets, the TP weights are given by

(1)
\[ {\mathbf{w}_{TP}}={\alpha ^{-1}}{\boldsymbol{\Sigma }^{-1}}(\boldsymbol{\mu }-{r_{f}}{\mathbf{1}_{p}}),\]

where $\boldsymbol{\mu }$ is a p-dimensional mean vector of the asset returns, Σ is a $p\times p$ symmetric positive definite covariance matrix of the asset returns, the coefficient $\alpha >0$ describes the investors’ risk aversion, ${r_{f}}$ denotes the rate of a risk-free asset and ${\mathbf{1}_{p}}$ is a p-dimensional vector of ones. We allow for short sales and, therefore, some weights can be negative. Let us also note that ${\mathbf{w}_{TP}}$ determines the structure of the portfolio which corresponds to the risky assets and does not, in general, sum to 1. Consequently, the rest of the wealth, $1-{\mathbf{w}^{\prime }_{TP}}{\mathbf{1}_{p}}$, needs to be invested in the risk-free asset.

Naturally, the TP weights ${\mathbf{w}_{TP}}$ depend on knowledge of the mean vector $\boldsymbol{\mu }$ and the covariance matrix Σ. In general, these quantities are not known and need to be estimated from data on N historical return vectors ${\mathbf{x}_{1}},\dots ,{\mathbf{x}_{N}}$. Plugging sample estimates of the mean vector and covariance matrix into (1) leads us to the sample estimate of the TP weights expressed as

(2)
\[ {\hat{\mathbf{w}}_{TP}}={\alpha ^{-1}}{\mathbf{S}^{-1}}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}}),\]

where S and $\bar{\mathbf{x}}$ are the sample covariance matrix and the sample mean vector, respectively, of ${\mathbf{x}_{1}},\dots ,{\mathbf{x}_{N}}$. The statistical properties of ${\hat{\mathbf{w}}_{TP}}$ have been extensively studied throughout the literature. [18] derived an exact test of the weights in the multivariate normal case. [39] obtained the univariate density for the TP weights as well as its asymptotic distribution, under the assumption that returns are independent and identically multivariate normally distributed. Further, [4] provided a procedure for monitoring the TP weights with a sequential approach. [6] obtained the density for, and several exact tests on, linear transformations of estimated TP weights, while [32] provided approximate and asymptotic distributions for the weights. [3] studied the distribution of ${\hat{\mathbf{w}}_{TP}}$ from a Bayesian perspective. [15] studied the TP weights in small and large dimensions when both the population and sample covariance matrices are singular. Analytical expressions for higher order moments of the estimated TP weights are derived in [29], while [31] presented the asymptotic distribution of the estimated TP weights as well as the asymptotic distribution of the statistical test on the elements of the TP under a high-dimensional asymptotic regime. [38] derived a test for the location of the TP, and [37] extended this result to the high-dimensional setting. Furthermore, [9] derived central limit theorems for the TP weights estimator under the assumption that the matrix of observations has a matrix-variate location mixture of normal distributions. More recently, [30] investigated the distributional properties of the TP weights under a skew-normal model in small and large dimensions.
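To fix ideas, the following is a minimal numerical sketch, not taken from the paper, of the sample estimator in (2) for the standard case $N>p$; the return-generating model and all parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
p, N = 5, 100                     # portfolio size and sample size, N > p
alpha, r_f = 1.0, 0.001           # assumed risk aversion and risk-free rate

# N independent p-dimensional normal return vectors (illustrative parameters)
X = rng.normal(loc=0.001, scale=0.02, size=(N, p))
x_bar = X.mean(axis=0)            # sample mean vector
S = np.cov(X, rowvar=False)       # sample covariance matrix (divisor N - 1)

# Equation (2): w_hat = alpha^{-1} S^{-1} (x_bar - r_f 1_p)
w_hat = np.linalg.solve(S, x_bar - r_f) / alpha
print(w_hat, 1.0 - w_hat.sum())   # TP weights and the residual risk-free share
```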
The common scenario considered is that the number of observations available for the estimation, denoted by N, is greater than the portfolio size, denoted by p. In this case the sample covariance matrix S is positive definite, and ${\hat{\mathbf{w}}_{TP}}$ can be obtained as presented in (2). However, when the considered portfolio is large, it is possible that the number of available observations is less than the portfolio dimension. This can be due to a lack of data for all the assets in the portfolio, but it may also occur because the covariance of asset returns tends to change over time. As such, the assumption of a constant covariance might only hold for limited periods of time, hence limiting the amount of data available for estimation. Many applications consider portfolios of large dimensions, containing up to 50, 100 or even 1000 assets (see, e.g., [41, 26, 34, 2, 20, 16, 22, 5, 12, 1]). If returns are measured at weekly or monthly intervals, data reaching back several decades might be required to ensure $p\le N$. Unless the considered assets can be assumed to have a constant covariance matrix over very long time periods, data spanning such long time intervals is not suitable for the estimation, or might simply not be available. Any such situation, where $p>N$, results in a singular sample covariance matrix S, which in turn is noninvertible in the standard sense.
This issue can be remedied by estimating ${\boldsymbol{\Sigma }^{-1}}$ in (1) with the Moore–Penrose inverse of S, which we will denote by ${\mathbf{S}^{+}}$. This generalized inverse has previously been successfully employed in portfolio theory for the $p>N$ case by [10, 11, 44, 15]. Applying the Moore–Penrose inverse, the TP weights are estimated as

(3)
\[ {\tilde{\mathbf{w}}_{TP}}={\alpha ^{-1}}{\mathbf{S}^{+}}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}}).\]

An attractive feature of applying the Moore–Penrose inverse ${\mathbf{S}^{+}}$ in (1) is that it provides the least-squares solution to the system of equations

(4)
\[ \mathbf{S}\mathbf{w}={\alpha ^{-1}}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}}),\]

which in the singular case generally lacks an exact solution. That is, as shown in [40], for any vector $\mathbf{v}\in {\mathbb{R}^{p}}$, we have that $\| \mathbf{S}\mathbf{v}-{\alpha ^{-1}}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}}){\| _{2}}\ge \| {\alpha ^{-1}}\mathbf{S}{\mathbf{S}^{+}}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}})-{\alpha ^{-1}}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}}){\| _{2}}$, where $\| \cdot {\| _{2}}$ denotes the Euclidean norm of a vector. Phrased differently, (3) provides the best solution to equation (4) in the least-squares sense. In addition, when $p\le N$, we have that ${\mathbf{S}^{+}}={\mathbf{S}^{-1}}$ and ${\tilde{\mathbf{w}}_{TP}}={\hat{\mathbf{w}}_{TP}}$, such that ${\tilde{\mathbf{w}}_{TP}}$ can be viewed as a general estimator for the TP weights, covering both the singular and nonsingular cases. For further properties of the Moore–Penrose inverse, see, e.g., [17].
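The least-squares property can be illustrated numerically. The sketch below, an assumed setup rather than the paper's code, estimates the weights via (3) in a case with $p>N$ and checks that no randomly drawn vector v attains a smaller residual in (4) than ${\tilde{\mathbf{w}}_{TP}}$.

```python
import numpy as np

rng = np.random.default_rng(1)
p, N = 50, 20                      # p > N, so S is singular
alpha, r_f = 1.0, 0.001

X = rng.normal(loc=0.001, scale=0.02, size=(N, p))
x_bar = X.mean(axis=0)
S = np.cov(X, rowvar=False)        # rank N - 1 < p

eta = (x_bar - r_f) / alpha
w_tilde = np.linalg.pinv(S) @ eta  # equation (3), via the Moore-Penrose inverse

# Residual of the system (4) at w_tilde versus at random alternatives
res_tilde = np.linalg.norm(S @ w_tilde - eta)
res_rand = min(np.linalg.norm(S @ rng.normal(size=p) - eta) for _ in range(1000))
print(res_tilde <= res_rand)       # True: w_tilde is the least-squares solution
```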
The expectation and variance of an estimator are key quantities describing its statistical properties. Under the standard assumption of independent and normally distributed asset returns, the stochastic components of ${\tilde{\mathbf{w}}_{TP}}$ are ${\mathbf{S}^{+}}$ and $\bar{\mathbf{x}}$, which are independent (see, e.g., [10]). Unfortunately, there exists no derivation of the expected value or variance of ${\mathbf{S}^{+}}$ when $p>N$. In [21], however, these quantities are presented in the special case of $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$. The authors also provided approximate results, using moments of standard normal random variables, and exact results for moments of the reflexive generalized inverse, another quantity that can be applied as an inverse of S. Further, in a recent paper [28], several bounds on the mean and variance of ${\mathbf{S}^{+}}$ are provided, based on the Poincaré separation theorem. Our paper builds on the results presented in [21] and [28] to provide bounds and approximations for the moments of the TP weights, $\mathbb{E}[{\tilde{\mathbf{w}}_{TP}}]$ and $\mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]$, where $\mathbb{E}[\cdot ]$ and $\mathbb{V}[\cdot ]$ denote the expected value and variance, respectively. We also present a simulation study, where various measures compare the derived bounds with the equivalent sample quantities obtained from simulated data. Finally, we compare the moments obtained applying the reflexive generalized inverse with the sample moments based on the Moore–Penrose inverse.
The rest of this paper is organized as follows. Section 2.1 provides exact moment results for the case $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$. Section 2.2 presents bounds for the moments of ${\tilde{\mathbf{w}}_{TP}}$ in the general case, while approximate moments are derived in Section 2.3. Exact moments applying the reflexive generalized inverse are derived in Section 3. The simulation study is presented in Section 4 while Section 5 summarizes.
2 Moments with the Moore–Penrose inverse
Let X be a $p\times N$ matrix with N asset return vectors of dimension $p\times 1$ stacked as columns, where $p>N$. Further, we assume that these return vectors are independent and normally distributed with mean vector $\boldsymbol{\mu }$ and positive definite covariance matrix Σ. Thus $\mathbf{X}\sim {\mathcal{MN}_{p,N}}(\boldsymbol{\mu }{\mathbf{1}^{\prime }_{N}},\boldsymbol{\Sigma },{\mathbf{I}_{N}})$, where ${\mathcal{MN}_{p,N}}(\mathbf{M},\boldsymbol{\Sigma },\mathbf{U})$ denotes the matrix-variate normal distribution with $p\times N$ mean matrix M, $p\times p$ row-wise covariance matrix Σ and $N\times N$ column-wise covariance matrix U. Further, let the $p\times 1$ vector $\bar{\mathbf{x}}$ be the row mean of X. Now, define $\mathbf{Y}=\mathbf{X}-\bar{\mathbf{x}}{\mathbf{1}^{\prime }_{N}}$, such that $\mathbf{Y}\sim {\mathcal{MN}_{p,N}}(\mathbf{0},\boldsymbol{\Sigma },{\mathbf{I}_{N}})$. Further, let $\mathbf{S}=\mathbf{Y}{\mathbf{Y}^{\prime }}/n$ with $n=N-1$, such that $\text{rank}(\mathbf{S})=n<p$ and $n\mathbf{S}\sim {\mathcal{W}_{p}}(n,\boldsymbol{\Sigma })$, i.e. $n\mathbf{S}$ follows a p-dimensional singular Wishart distribution with n degrees of freedom and parameter matrix Σ. Let $\mathbf{S}=\mathbf{Q}\mathbf{R}{\mathbf{Q}^{\prime }}$ denote the eigenvalue decomposition of S, where R is the $n\times n$ diagonal matrix of positive eigenvalues and Q is the $p\times n$ matrix with the corresponding eigenvectors as columns. Further, define
\[ {\mathbf{S}^{+}}=\mathbf{Q}{\mathbf{R}^{-1}}{\mathbf{Q}^{\prime }}.\]
Then ${\mathbf{S}^{+}}$ constitutes the Moore–Penrose inverse of $\mathbf{Y}{\mathbf{Y}^{\prime }}/n$, and ${\mathbf{S}^{+}}$ is independent of $\bar{\mathbf{x}}$ (see [10]).
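As a sketch of this construction, with illustrative dimensions, the Moore–Penrose inverse can be computed from the eigenvalue decomposition of S and compared against a library implementation:

```python
import numpy as np

rng = np.random.default_rng(2)
p, N = 40, 15
n = N - 1

X = rng.normal(size=(p, N))                # columns are return vectors
x_bar = X.mean(axis=1, keepdims=True)
Y = X - x_bar                              # Y = X - x_bar 1_N'
S = Y @ Y.T / n                            # singular, rank n < p

vals, vecs = np.linalg.eigh(S)             # eigenvalues in ascending order
keep = vals > 1e-10                        # the n positive eigenvalues
Q, R = vecs[:, keep], np.diag(vals[keep])  # Q is p x n, R is n x n

S_plus = Q @ np.linalg.inv(R) @ Q.T        # S^+ = Q R^{-1} Q'
print(np.allclose(S_plus, np.linalg.pinv(S)))   # True
```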
In the following, let $\boldsymbol{\eta }={\alpha ^{-1}}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}})$ and $\boldsymbol{\theta }=\mathbb{E}[\boldsymbol{\eta }]={\alpha ^{-1}}(\boldsymbol{\mu }-{r_{f}}{\mathbf{1}_{p}})$. Consequently, from Corollary 3.2b.1 in [36], together with the fact that $\mathbb{E}[\bar{\mathbf{x}}]=\boldsymbol{\mu }$ and $\mathbb{V}[\bar{\mathbf{x}}]=\boldsymbol{\Sigma }/(n+1)$, we obtain that

(5)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}[\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}]& \displaystyle =& \displaystyle \boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }}+\frac{\boldsymbol{\Sigma }}{{\alpha ^{2}}(n+1)},\end{array}\]

(6)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}[{\boldsymbol{\eta }^{\prime }}\boldsymbol{\eta }]& \displaystyle =& \displaystyle {\boldsymbol{\theta }^{\prime }}\boldsymbol{\theta }+\frac{\text{tr}(\boldsymbol{\Sigma })}{{\alpha ^{2}}(n+1)},\end{array}\]

(7)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}[{\boldsymbol{\eta }^{\prime }}\boldsymbol{\Sigma }\boldsymbol{\eta }]& \displaystyle =& \displaystyle {\boldsymbol{\theta }^{\prime }}\boldsymbol{\Sigma }\boldsymbol{\theta }+\frac{\text{tr}(\boldsymbol{\Sigma }\boldsymbol{\Sigma })}{{\alpha ^{2}}(n+1)}.\end{array}\]

Further, let ${s^{ij}}$ denote the element in row i and column j of ${\mathbf{S}^{+}}$, and let ${\sigma ^{ij}}$ denote the element in row i and column j of ${\boldsymbol{\Sigma }^{-1}}$. Also, let ${\mathbf{e}_{i}}$ denote the $p\times 1$ vector whose elements are all equal to zero, except the i-th element, which is equal to one. Moreover, we assume that ${\lambda _{1}}(\mathbf{M})\ge {\lambda _{2}}(\mathbf{M})\ge \cdots \ge {\lambda _{p}}(\mathbf{M})$ are the ordered eigenvalues of a symmetric $p\times p$ matrix M, and that $\mathbf{A}{\le _{L}}\mathbf{B}$ denotes the Löwner ordering of two positive semi-definite matrices A and B.
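The identities (5)–(7) are straightforward to verify by Monte Carlo; the snippet below is an illustrative check of (6) under assumed parameter values.

```python
import numpy as np

rng = np.random.default_rng(3)
p, N, alpha, r_f = 4, 10, 2.0, 0.01
n = N - 1
mu = rng.normal(0.0, 0.05, p)
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)            # positive definite covariance
theta = (mu - r_f) / alpha

x_bar = rng.multivariate_normal(mu, Sigma / N, size=200_000)  # V[x_bar] = Sigma/(n+1)
eta = (x_bar - r_f) / alpha

# (6): E[eta' eta] = theta' theta + tr(Sigma) / (alpha^2 (n + 1))
lhs = (eta ** 2).sum(axis=1).mean()
rhs = theta @ theta + np.trace(Sigma) / (alpha ** 2 * (n + 1))
print(lhs, rhs)                            # agree up to Monte Carlo error
```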
2.1 Exact moments when $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$

When Σ is the identity matrix, it is possible to derive exact moments of the TP weights obtained from the Moore–Penrose inverse in the singular case. First, note the following results, presented in Theorem 2.1 of [21], which state that in the case $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$ and $p>n+3$, we have that

(8)
\[ \mathbb{E}[{\mathbf{S}^{+}}]={a_{1}}{\mathbf{I}_{p}},\]

(9)
\[ \mathbb{V}[\text{vec}({\mathbf{S}^{+}})]={a_{2}}({\mathbf{I}_{{p^{2}}}}+{\mathbf{C}_{{p^{2}}}})+2{a_{3}}\text{vec}({\mathbf{I}_{p}})\text{vec}{({\mathbf{I}_{p}})^{\prime }},\]

where ${\mathbf{C}_{{p^{2}}}}$ is the commutation matrix, $\text{vec}(\cdot )$ is the vectorization operator and

(10)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {a_{1}}& \displaystyle =& \displaystyle \frac{{n^{2}}}{p(p-n-1)},\end{array}\]

Note that the constants in (10)–(12) differ slightly from the constants presented in [21], since our paper considers results for $n\mathbf{S}\sim {\mathcal{W}_{p}}(n,\boldsymbol{\Sigma })$, while [21] derived the results for $\mathbf{W}\sim {\mathcal{W}_{p}}(n,\boldsymbol{\Sigma })$. The moments in (8) and (9) allow us to derive the following results.

Theorem 1.
If $p>n+3$ and $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$, then
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}[{\tilde{\mathbf{w}}_{TP}}]& \displaystyle =& \displaystyle {a_{1}}{\mathbf{w}_{TP}},\\ {} \displaystyle \mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]& \displaystyle =& \displaystyle ({a_{2}}+2{a_{3}}){\mathbf{w}_{TP}}{\mathbf{w}^{\prime }_{TP}}+\left[{a_{2}}{\mathbf{w}^{\prime }_{TP}}{\mathbf{w}_{TP}}+\frac{{a_{1}^{2}}+(p+1){a_{2}}+2{a_{3}}}{{\alpha ^{2}}(n+1)}\right]{\mathbf{I}_{p}}\end{array}\]
with the constants ${a_{1}}$, ${a_{2}}$ and ${a_{3}}$ defined in (10)–(12).
Proof.
Since ${\tilde{\mathbf{w}}_{TP}}={\alpha ^{-1}}{\mathbf{S}^{+}}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}})$, the first result follows directly from (8) and the independence of ${\mathbf{S}^{+}}$ and $\bar{\mathbf{x}}$. For the second result, first note that, as discussed in [21], equation (9) can be written as
\[ \operatorname{Cov}({s^{ij}},{s^{kl}})={a_{2}}({\delta _{ik}}{\delta _{jl}}+{\delta _{il}}{\delta _{jk}})+2{a_{3}}{\delta _{ij}}{\delta _{kl}},\]
where ${\delta _{ij}}=1$ if $i=j$ and 0 otherwise, so that ${\delta _{ij}}$, $i,j=1,\dots ,p$, denote the elements of ${\mathbf{I}_{p}}$. Hence, we have that

(13)
\[ \mathbb{E}[{s^{ij}}{s^{kl}}]={a_{2}}({\delta _{ik}}{\delta _{jl}}+{\delta _{il}}{\delta _{jk}})+({a_{1}^{2}}+2{a_{3}}){\delta _{ij}}{\delta _{kl}}.\]

Also note the following element representations of matrix operations, where A and B are symmetric $p\times p$ matrices and $\text{tr}(\cdot )$ denotes the trace operator of a matrix:

(14)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {\left[\mathbf{A}\text{tr}(\mathbf{B}\mathbf{A})\right]_{ij}}& \displaystyle =& \displaystyle {a_{ij}}{\sum \limits_{k=1}^{p}}{\sum \limits_{l=1}^{p}}{a_{kl}}{b_{kl}},\end{array}\]

(15)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {\left[\mathbf{A}\mathbf{B}\mathbf{A}\right]_{ij}}& \displaystyle =& \displaystyle {\sum \limits_{k=1}^{p}}{\sum \limits_{l=1}^{p}}{b_{kl}}{a_{ik}}{a_{jl}}={\sum \limits_{k=1}^{p}}{\sum \limits_{l=1}^{p}}{b_{kl}}{a_{il}}{a_{jk}}.\end{array}\]

Moreover, with $\boldsymbol{\eta }={\alpha ^{-1}}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}})$ and $\mathbb{E}[\boldsymbol{\eta }]=\boldsymbol{\theta }$,

(16)
\[ \mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]=\mathbb{V}[{\mathbf{S}^{+}}\boldsymbol{\eta }]=\mathbb{E}\left[\mathbb{E}[{\mathbf{S}^{+}}\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}{\mathbf{S}^{+}}\mid \boldsymbol{\eta }]\right]-\mathbb{E}[{\mathbf{S}^{+}}]\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }}\mathbb{E}[{\mathbf{S}^{+}}].\]

By letting $\mathbf{H}=\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}$ and applying equations (13)–(15) we obtain
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}{[{\mathbf{S}^{+}}\mathbf{H}{\mathbf{S}^{+}}\mid \boldsymbol{\eta }]_{ij}}& \displaystyle =& \displaystyle {\sum \limits_{k=1}^{p}}{\sum \limits_{l=1}^{p}}{h_{kl}}\mathbb{E}[{s^{ik}}{s^{jl}}]\\ {} & \displaystyle =& \displaystyle {\sum \limits_{k=1}^{p}}{\sum \limits_{l=1}^{p}}{h_{kl}}[{a_{2}}({\delta _{ij}}{\delta _{kl}}+{\delta _{il}}{\delta _{kj}})+({a_{1}^{2}}+2{a_{3}}){\delta _{ik}}{\delta _{jl}}]\\ {} & \displaystyle =& \displaystyle {a_{2}}{\left[{\mathbf{I}_{p}}\text{tr}(\mathbf{H}{\mathbf{I}_{p}})\right]_{ij}}+{a_{2}}{[{\mathbf{I}_{p}}\mathbf{H}{\mathbf{I}_{p}}]_{ij}}+({a_{1}^{2}}+2{a_{3}}){[{\mathbf{I}_{p}}\mathbf{H}{\mathbf{I}_{p}}]_{ij}}.\end{array}\]
Consequently,
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}[{\mathbf{S}^{+}}\mathbf{H}{\mathbf{S}^{+}}\mid \boldsymbol{\eta }]& \displaystyle =& \displaystyle ({a_{1}^{2}}+{a_{2}}+2{a_{3}})\mathbf{H}+{a_{2}}\text{tr}(\mathbf{H}){\mathbf{I}_{p}},\end{array}\]
and inserting the above result into (16) together with (5) and (8) gives
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{V}[{\mathbf{S}^{+}}\boldsymbol{\eta }]& \displaystyle =& \displaystyle ({a_{1}^{2}}+{a_{2}}+2{a_{3}})\left(\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }}+{\alpha ^{-2}}{N^{-1}}{\mathbf{I}_{p}}\right)\\ {} & & \displaystyle +{a_{2}}\left(\text{tr}(\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }})+{\alpha ^{-2}}{N^{-1}}p\right){\mathbf{I}_{p}}-{a_{1}^{2}}\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }},\end{array}\]
and the theorem follows noting that $\boldsymbol{\theta }={\mathbf{w}_{TP}}$ when $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$. □

A direct consequence of Theorem 1 is that the estimator ${\tilde{\mathbf{w}}_{TP}}$ is biased, with bias factor ${a_{1}}$. Hence, in the case of $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$, ${a_{1}^{-1}}{\tilde{\mathbf{w}}_{TP}}$ constitutes an unbiased estimator. Further, in accordance with Corollary 2.1 in [21], as $n,p\to \infty $ with $n/p\to r$, $0<r<1$, the constants of $\mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]$ exhibit the following asymptotic magnitudes: ${a_{1}}=\mathcal{O}(1)$, ${a_{2}}=\mathcal{O}({n^{-1}})=\mathcal{O}({p^{-1}})$ and ${a_{3}}=\mathcal{O}({n^{-2}})=\mathcal{O}({p^{-2}})$. Consequently, since $\text{tr}({\mathbf{w}_{TP}}{\mathbf{w}^{\prime }_{TP}})=\mathcal{O}(p)$ in the general case, we have that ${a_{2}}\text{tr}({\mathbf{w}_{TP}}{\mathbf{w}^{\prime }_{TP}})=\mathcal{O}(1)$. Hence, unless ${\mathbf{w}_{TP}}$ has some specific structure, $\mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]$ does not vanish under this asymptotic regime. This is not unique to the singular case; the corresponding statement is also true for ${\hat{\mathbf{w}}_{TP}}$ in the nonsingular case, when $n,p\to \infty $. Finally, we note that in practice the population covariance matrix of a portfolio of assets will likely never be equal to ${\mathbf{I}_{p}}$, and hence the results in this section are mainly of theoretical nature.
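The bias factor ${a_{1}}$ and the corrected estimator ${a_{1}^{-1}}{\tilde{\mathbf{w}}_{TP}}$ can be checked by simulation, as in the sketch below; all dimensions and parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
p, N, alpha, r_f = 30, 12, 1.0, 0.0
n = N - 1                                   # p > n + 3 holds here
mu = rng.normal(0.0, 0.1, p)
w_tp = (mu - r_f) / alpha                   # Sigma = I_p, so w_TP = theta

a1 = n ** 2 / (p * (p - n - 1))
reps, acc = 20_000, np.zeros(p)
for _ in range(reps):
    X = mu[:, None] + rng.normal(size=(p, N))   # returns with Sigma = I_p
    x_bar = X.mean(axis=1)
    S = np.cov(X)                               # p x p, divisor n
    acc += np.linalg.pinv(S) @ (x_bar - r_f) / alpha
w_bar = acc / reps                          # Monte Carlo estimate of E[w_tilde]

# Least-squares slope of w_bar on w_tp should be close to a1 (Theorem 1)
print(w_bar @ w_tp / (w_tp @ w_tp), a1)
```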
2.2 Bounds on the moments
This section aims to provide upper and lower bounds for the expected value of ${\tilde{\mathbf{w}}_{TP}}$ and upper bounds for the variance of ${\tilde{\mathbf{w}}_{TP}}$. First, define the following $p\times p$ matrices,
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbf{D}& \displaystyle =& \displaystyle {a_{1}}{({\lambda _{p}}({\boldsymbol{\Sigma }^{-1}}))^{2}}\boldsymbol{\Sigma },\\ {} \displaystyle {\mathbf{U}_{a}}& \displaystyle =& \displaystyle {a_{1}}{({\lambda _{1}}({\boldsymbol{\Sigma }^{-1}}))^{2}}\boldsymbol{\Sigma },\\ {} \displaystyle {\mathbf{U}_{b}}& \displaystyle =& \displaystyle \frac{n}{p-n-1}{\lambda _{1}}({\boldsymbol{\Sigma }^{-1}}){\mathbf{I}_{p}},\end{array}\]
with elements ${d_{ij}}$, ${u_{ij}^{(a)}}$ and ${u_{ij}^{(b)}}$, respectively. Further, denote by ${e_{ij}}$ the elements of $\mathbb{E}[{\mathbf{S}^{+}}]$ and let ${u_{ii}^{(\ast )}}=\min \{{u_{ii}^{(a)}},{u_{ii}^{(b)}}\}$, $i=1,\dots ,p$. Then we can derive the following result.

Theorem 2.
Suppose $p>n+3$ and $\boldsymbol{\Sigma }>0$. Let ${w_{i}}$ and ${\theta _{i}}$ be the i-th elements of the $p\times 1$ vectors $\mathbf{w}=\mathbb{E}[{\tilde{\mathbf{w}}_{TP}}]$ and $\boldsymbol{\theta }={\alpha ^{-1}}(\boldsymbol{\mu }-{r_{f}}{\mathbf{1}_{p}})$, respectively. Then, for $i=1,\dots ,p$, it holds that
\[ {v_{ii}}{\theta _{i}}+{\sum \limits_{j\ne i}^{p}}{v_{ij}}{\theta _{j}}\le {w_{i}}\le {z_{ii}}{\theta _{i}}+{\sum \limits_{j\ne i}^{p}}{z_{ij}}{\theta _{j}}\]
where, for $i,j=1,\dots ,p$,
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {v_{ij}}& \displaystyle =& \displaystyle \left\{\begin{array}{l@{\hskip10.0pt}l}{g_{ij}}& \text{if}\hspace{2.5pt}{\theta _{j}}\ge 0,\\ {} {h_{ij}}& \text{if}\hspace{2.5pt}{\theta _{j}}<0,\end{array}\right.\\ {} \displaystyle {z_{ij}}& \displaystyle =& \displaystyle \left\{\begin{array}{l@{\hskip10.0pt}l}{g_{ij}}& \text{if}\hspace{2.5pt}{\theta _{j}}<0,\\ {} {h_{ij}}& \text{if}\hspace{2.5pt}{\theta _{j}}\ge 0,\end{array}\right.\end{array}\]
with ${g_{ii}}={d_{ii}}$, ${h_{ii}}={u_{ii}^{(\ast )}}$, while for $i\ne j$,
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {g_{ij}}& \displaystyle =& \displaystyle \max \left\{\begin{array}{l}{d_{ij}}-\sqrt{({u_{ii}^{(\ast )}}-{d_{ii}})({u_{jj}^{(\ast )}}-{d_{jj}})},\\ {} {u_{ij}^{(a)}}-\sqrt{({u_{ii}^{(a)}}-{d_{ii}})({u_{jj}^{(a)}}-{d_{jj}})},\\ {} -\sqrt{({u_{ii}^{(b)}}-{d_{ii}})({u_{jj}^{(b)}}-{d_{jj}})},\\ {} -\sqrt{{u_{ii}^{(\ast )}}{u_{jj}^{(\ast )}}}\end{array}\right\},\\ {} \displaystyle {h_{ij}}& \displaystyle =& \displaystyle \min \left\{\begin{array}{l}{d_{ij}}+\sqrt{({u_{ii}^{(\ast )}}-{d_{ii}})({u_{jj}^{(\ast )}}-{d_{jj}})},\\ {} {u_{ij}^{(a)}}+\sqrt{({u_{ii}^{(a)}}-{d_{ii}})({u_{jj}^{(a)}}-{d_{jj}})},\\ {} \sqrt{({u_{ii}^{(b)}}-{d_{ii}})({u_{jj}^{(b)}}-{d_{jj}})},\\ {} \sqrt{{u_{ii}^{(\ast )}}{u_{jj}^{(\ast )}}}\end{array}\right\}.\end{array}\]
Proof.
The result follows directly from the element-wise bounds in Lemma A2, together with the fact that, due to the independence of ${\mathbf{S}^{+}}$ and $\bar{\mathbf{x}}$, we have $\mathbb{E}[{\tilde{\mathbf{w}}_{TP}}]=\mathbb{E}[{\mathbf{S}^{+}}]\boldsymbol{\theta }$. □
Note that when $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$, we have that ${({\lambda _{1}}({\boldsymbol{\Sigma }^{-1}}))^{2}}={({\lambda _{p}}({\boldsymbol{\Sigma }^{-1}}))^{2}}=1$, and hence $\mathbb{E}[{\mathbf{S}^{+}}]=\mathbf{D}={\mathbf{U}_{a}}={a_{1}}{\mathbf{I}_{p}}$. Consequently, ${g_{ij}}={h_{ij}}=0$, $i\ne j$, and ${g_{ii}}={h_{ii}}={a_{1}}$, $i=1,\dots ,p$, since ${u_{ii}^{(a)}}={a_{1}}<{a_{1}}\frac{p}{n}={u_{ii}^{(b)}}$ when $p>n$. Hence, Theorem 2 yields $\mathbb{E}[{\tilde{\mathbf{w}}_{TP}}]={a_{1}}\boldsymbol{\theta }$, consistent with the result of Theorem 1.
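For a given Σ and θ, the bounds of Theorem 2 are directly computable. The following sketch, an illustrative implementation under the stated assumptions with a hypothetical helper name `mean_bounds`, assembles D, ${\mathbf{U}_{a}}$, ${\mathbf{U}_{b}}$ and the resulting element-wise bounds.

```python
import numpy as np

def mean_bounds(Sigma, theta, n):
    """Element-wise lower/upper bounds on E[w_tilde] from Theorem 2."""
    p = Sigma.shape[0]
    lam = np.linalg.eigvalsh(Sigma)          # eigenvalues of Sigma, ascending
    l1, lp = 1.0 / lam[0], 1.0 / lam[-1]     # lambda_1, lambda_p of Sigma^{-1}
    a1 = n ** 2 / (p * (p - n - 1))
    D = a1 * lp ** 2 * Sigma
    Ua = a1 * l1 ** 2 * Sigma
    Ub = n / (p - n - 1) * l1 * np.eye(p)
    da, ua, ub = np.diag(D), np.diag(Ua), np.diag(Ub)
    us = np.minimum(ua, ub)                  # u_ii^(*)
    g = np.zeros((p, p))
    h = np.zeros((p, p))
    for i in range(p):
        for j in range(p):
            if i == j:
                g[i, i], h[i, i] = da[i], us[i]
                continue
            rs = np.sqrt((us[i] - da[i]) * (us[j] - da[j]))
            ra = np.sqrt((ua[i] - da[i]) * (ua[j] - da[j]))
            rb = np.sqrt((ub[i] - da[i]) * (ub[j] - da[j]))
            g[i, j] = max(D[i, j] - rs, Ua[i, j] - ra, -rb, -np.sqrt(us[i] * us[j]))
            h[i, j] = min(D[i, j] + rs, Ua[i, j] + ra, rb, np.sqrt(us[i] * us[j]))
    v = np.where(theta >= 0, g, h)           # v_ij chosen by the sign of theta_j
    z = np.where(theta >= 0, h, g)
    return v @ theta, z @ theta              # lower and upper bounds on w_i
```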
The following result provides two upper bounds for the variance of the TP weights estimate ${\tilde{\mathbf{w}}_{TP}}$.
Theorem 3.
Suppose $p>n+3$ and $\boldsymbol{\Sigma }>0$. Then

(17)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]& \displaystyle {\le _{L}}& \displaystyle (2{c_{1}}+{c_{2}}){({\lambda _{1}}({\boldsymbol{\Sigma }^{-1}}))^{4}}\left({k_{1}}\boldsymbol{\Sigma }\mathbb{E}[\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}]\boldsymbol{\Sigma }+{k_{2}}\boldsymbol{\Sigma }\mathbb{E}[{\boldsymbol{\eta }^{\prime }}\boldsymbol{\Sigma }\boldsymbol{\eta }]\right),\end{array}\]

(18)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]& \displaystyle {\le _{L}}& \displaystyle (2{c_{1}}+{c_{2}}){({\lambda _{1}}({\boldsymbol{\Sigma }^{-1}}))^{4}}\mathbb{E}[{\boldsymbol{\eta }^{\prime }}\boldsymbol{\eta }]{\mathbf{I}_{p}},\end{array}\]

with the expected values given in (5)–(7) and
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {c_{1}}& \displaystyle =& \displaystyle {n^{2}}{[(p-n)(p-n-1)(p-n-3)]^{-1}},\\ {} \displaystyle {c_{2}}& \displaystyle =& \displaystyle (p-n-2){c_{1}},\\ {} \displaystyle {k_{1}}& \displaystyle =& \displaystyle \left[1+n-\frac{(p+1)(p(n+1)-2)}{p(p+1)-2}\right]\frac{n}{p},\\ {} \displaystyle {k_{2}}& \displaystyle =& \displaystyle \left[1-\frac{(p+1)(p-n)}{p(p+1)-2}\right]\frac{n}{p}.\end{array}\]
Proof.
We are interested in bounds for the quantity ${\boldsymbol{\alpha }^{\prime }}\mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]\boldsymbol{\alpha }={\boldsymbol{\alpha }^{\prime }}\mathbb{V}[{\mathbf{S}^{+}}\boldsymbol{\eta }]\boldsymbol{\alpha }$, for all $\boldsymbol{\alpha }\in {\mathbb{R}^{p}}$. First, by the tower property we have
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{V}[{\mathbf{S}^{+}}\boldsymbol{\eta }]& \displaystyle =& \displaystyle \mathbb{E}\left[\mathbb{E}[{\mathbf{S}^{+}}\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}{\mathbf{S}^{+}}\mid \boldsymbol{\eta }]\right]-\mathbb{E}[{\mathbf{S}^{+}}]\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }}\mathbb{E}[{\mathbf{S}^{+}}].\end{array}\]
Hence, we can obtain
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {\boldsymbol{\alpha }^{\prime }}\mathbb{V}[{\mathbf{S}^{+}}\boldsymbol{\eta }]\boldsymbol{\alpha }& \displaystyle =& \displaystyle \mathbb{E}\left[\mathbb{E}[{\boldsymbol{\alpha }^{\prime }}{\mathbf{S}^{+}}\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}{\mathbf{S}^{+}}\boldsymbol{\alpha }\mid \boldsymbol{\eta }]\right]-{\boldsymbol{\alpha }^{\prime }}\mathbb{E}[{\mathbf{S}^{+}}]\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }}\mathbb{E}[{\mathbf{S}^{+}}]\boldsymbol{\alpha }\\ {} & \displaystyle =& \displaystyle \mathbb{E}\left[\mathbb{E}[{({\boldsymbol{\alpha }^{\prime }}{\mathbf{S}^{+}}\boldsymbol{\eta })^{2}}\mid \boldsymbol{\eta }]\right]-{({\boldsymbol{\alpha }^{\prime }}\mathbb{E}[{\mathbf{S}^{+}}]\boldsymbol{\theta })^{2}}.\end{array}\]
Then, by noting that ${({\boldsymbol{\alpha }^{\prime }}\mathbb{E}[{\mathbf{S}^{+}}]\boldsymbol{\theta })^{2}}\ge 0$ and applying the bounds from Lemma A4 to $\mathbb{E}[{({\boldsymbol{\alpha }^{\prime }}{\mathbf{S}^{+}}\boldsymbol{\eta })^{2}}]$, we can derive
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {\boldsymbol{\alpha }^{\prime }}\mathbb{V}[{\mathbf{S}^{+}}\boldsymbol{\eta }]\boldsymbol{\alpha }& \displaystyle \le & \displaystyle (2{c_{1}}+{c_{2}}){({\lambda _{1}}({\boldsymbol{\Sigma }^{-1}}))^{4}}\mathbb{E}\left[{k_{1}}{({\boldsymbol{\alpha }^{\prime }}\boldsymbol{\Sigma }\boldsymbol{\eta })^{2}}+{k_{2}}({\boldsymbol{\alpha }^{\prime }}\boldsymbol{\Sigma }\boldsymbol{\alpha })({\boldsymbol{\eta }^{\prime }}\boldsymbol{\Sigma }\boldsymbol{\eta })\right],\\ {} \displaystyle {\boldsymbol{\alpha }^{\prime }}\mathbb{V}[{\mathbf{S}^{+}}\boldsymbol{\eta }]\boldsymbol{\alpha }& \displaystyle \le & \displaystyle (2{c_{1}}+{c_{2}}){({\lambda _{1}}({\boldsymbol{\Sigma }^{-1}}))^{4}}\mathbb{E}[({\boldsymbol{\alpha }^{\prime }}\boldsymbol{\alpha })({\boldsymbol{\eta }^{\prime }}\boldsymbol{\eta })],\end{array}\]
and with the aid of (5)–(7) the result follows. □
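For reference, the two Löwner upper bounds (17) and (18) can be evaluated in closed form from (5)–(7), for instance as in the sketch below; this is an illustrative implementation and the function name is ours.

```python
import numpy as np

def variance_bounds(Sigma, mu, n, alpha=1.0, r_f=0.0):
    """Right-hand sides B1 and B2 of the bounds (17) and (18) in Theorem 3."""
    p = Sigma.shape[0]
    theta = (mu - r_f) / alpha
    l1 = 1.0 / np.linalg.eigvalsh(Sigma)[0]              # lambda_1(Sigma^{-1})
    c1 = n ** 2 / ((p - n) * (p - n - 1) * (p - n - 3))
    c2 = (p - n - 2) * c1
    k1 = (1 + n - (p + 1) * (p * (n + 1) - 2) / (p * (p + 1) - 2)) * n / p
    k2 = (1 - (p + 1) * (p - n) / (p * (p + 1) - 2)) * n / p
    e_outer = np.outer(theta, theta) + Sigma / (alpha ** 2 * (n + 1))          # (5)
    e_quad = theta @ theta + np.trace(Sigma) / (alpha ** 2 * (n + 1))          # (6)
    e_sigma = theta @ Sigma @ theta \
        + np.trace(Sigma @ Sigma) / (alpha ** 2 * (n + 1))                     # (7)
    B1 = (2 * c1 + c2) * l1 ** 4 * (k1 * Sigma @ e_outer @ Sigma + k2 * e_sigma * Sigma)
    B2 = (2 * c1 + c2) * l1 ** 4 * e_quad * np.eye(p)
    return B1, B2
```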
2.3 Approximate moments

For general Σ, it is possible to provide approximate moments for ${\tilde{\mathbf{w}}_{TP}}$ using simulations of standard normal matrices. Following Section 3.1 in [21], we denote the eigendecomposition of Σ by $\boldsymbol{\Sigma }={\boldsymbol{\Gamma }\boldsymbol{\Lambda }\boldsymbol{\Gamma }^{\prime }}$, with ${\lambda _{i}}$ denoting the i-th diagonal element of Λ, and let $\mathbf{Z}\sim {\mathcal{MN}_{p,n}}(\mathbf{0},{\mathbf{I}_{p}},{\mathbf{I}_{n}})$, with ${\mathbf{z}^{\prime }_{i}}$ denoting row i of Z. Further, denote ${m_{ij}}(\boldsymbol{\Lambda })=\mathbb{E}[{\mathbf{z}^{\prime }_{i}}{({\mathbf{Z}^{\prime }}\boldsymbol{\Lambda }\mathbf{Z})^{-2}}{\mathbf{z}_{j}}]$ and ${v_{ij,kl}}(\boldsymbol{\Lambda })=\operatorname{Cov}[{\mathbf{z}^{\prime }_{i}}{({\mathbf{Z}^{\prime }}\boldsymbol{\Lambda }\mathbf{Z})^{-2}}{\mathbf{z}_{j}},{\mathbf{z}^{\prime }_{k}}{({\mathbf{Z}^{\prime }}\boldsymbol{\Lambda }\mathbf{Z})^{-2}}{\mathbf{z}_{l}}]$, where $\operatorname{Cov}[X,Y]$ denotes the covariance between X and Y.
Also define

(19)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbf{M}(\boldsymbol{\Lambda })& \displaystyle =& \displaystyle n{\sum \limits_{i=1}^{p}}{\lambda _{i}}{m_{ii}}(\boldsymbol{\Lambda }){\mathbf{e}_{i}}{\mathbf{e}^{\prime }_{i}},\\ {} \displaystyle \mathbf{V}(\boldsymbol{\Lambda })& \displaystyle =& \displaystyle {n^{2}}\left[{\sum \limits_{i=1}^{p}}{\sum \limits_{j=1}^{p}}{\lambda _{i}}{\lambda _{j}}{v_{ii,jj}}(\boldsymbol{\Lambda })({\mathbf{e}_{i}}{\mathbf{e}^{\prime }_{j}}\otimes {\mathbf{e}_{i}}{\mathbf{e}^{\prime }_{j}})\right.\\ {} & & \displaystyle +{\sum \limits_{i=1}^{p}}{\sum \limits_{j=1}^{p}}{\lambda _{i}}{\lambda _{j}}{v_{ij,ij}}(\boldsymbol{\Lambda })({\mathbf{e}_{j}}{\mathbf{e}^{\prime }_{j}}\otimes {\mathbf{e}_{i}}{\mathbf{e}^{\prime }_{i}})({\mathbf{I}_{{p^{2}}}}+{\mathbf{C}_{{p^{2}}}})\\ {} & & \displaystyle \left.-2{\sum \limits_{i=1}^{p}}{\lambda _{i}^{2}}{v_{ii,ii}}(\boldsymbol{\Lambda })({\mathbf{e}_{i}}{\mathbf{e}^{\prime }_{i}}\otimes {\mathbf{e}_{i}}{\mathbf{e}^{\prime }_{i}})\right]\end{array}\]

and make the decomposition

(20)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle (\boldsymbol{\Gamma }\otimes \boldsymbol{\Gamma })\mathbf{V}(\boldsymbol{\Lambda })({\boldsymbol{\Gamma }^{\prime }}\otimes {\boldsymbol{\Gamma }^{\prime }})& \displaystyle =& \displaystyle \left(\begin{array}{c@{\hskip10.0pt}c@{\hskip10.0pt}c}{\boldsymbol{\Psi }_{11}}& \cdots & {\boldsymbol{\Psi }_{1p}}\\ {} \vdots & \ddots & \vdots \\ {} {\boldsymbol{\Psi }_{p1}}& \cdots & {\boldsymbol{\Psi }_{pp}}\end{array}\right),\end{array}\]

where ${\boldsymbol{\Psi }_{ij}}$ are $p\times p$ matrices, $i,j=1,\dots ,p$. The following result can then be derived.

Theorem 4.
If $p>n+3$ and $\boldsymbol{\Sigma }>0$, then
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}[{\tilde{\mathbf{w}}_{TP}}]& \displaystyle =& \displaystyle \boldsymbol{\Gamma }\mathbf{M}(\boldsymbol{\Lambda }){\boldsymbol{\Gamma }^{\prime }}\boldsymbol{\theta },\\ {} \displaystyle \mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]& \displaystyle =& \displaystyle {\sum \limits_{i=1}^{p}}{\sum \limits_{j=1}^{p}}\left({\theta _{i}}{\theta _{j}}+\frac{{\sigma _{ij}}}{{\alpha ^{2}}(n+1)}\right){\boldsymbol{\Psi }_{ij}}+\frac{1}{{\alpha ^{2}}(n+1)}\boldsymbol{\Gamma }\mathbf{M}(\boldsymbol{\Lambda })\boldsymbol{\Lambda }\mathbf{M}(\boldsymbol{\Lambda }){\boldsymbol{\Gamma }^{\prime }}\end{array}\]
with ${\theta _{i}}={\alpha ^{-1}}({\mu _{i}}-{r_{f}})$.
Proof.
From Theorem 3.1 in [21], we have that $\mathbb{E}[{\mathbf{S}^{+}}]=\boldsymbol{\Gamma }\mathbf{M}(\boldsymbol{\Lambda }){\boldsymbol{\Gamma }^{\prime }}$, and the first result follows due to the independence of ${\mathbf{S}^{+}}$ and $\bar{\mathbf{x}}$. For the second result, we have that

(21)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{V}[{\mathbf{S}^{+}}\boldsymbol{\eta }]& \displaystyle =& \displaystyle \mathbb{E}\left[\mathbb{E}[{\mathbf{S}^{+}}\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}{\mathbf{S}^{+}}\mid \boldsymbol{\eta }]\right]-\mathbb{E}[{\mathbf{S}^{+}}]\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }}\mathbb{E}[{\mathbf{S}^{+}}].\end{array}\]

Again we let $\mathbf{H}=\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}$. Applying Theorem 3.1 in [21] we have that
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{V}[\text{vec}({\mathbf{S}^{+}})]& \displaystyle =& \displaystyle (\boldsymbol{\Gamma }\otimes \boldsymbol{\Gamma })\mathbf{V}(\boldsymbol{\Lambda })({\boldsymbol{\Gamma }^{\prime }}\otimes {\boldsymbol{\Gamma }^{\prime }}),\end{array}\]
and in accordance with equation (6.8) in [23], we get
\[ \mathbb{E}[{\mathbf{S}^{+}}\mathbf{H}{\mathbf{S}^{+}}\mid \boldsymbol{\eta }]={\sum \limits_{i=1}^{p}}{\sum \limits_{j=1}^{p}}{h_{ij}}{\boldsymbol{\Psi }_{ij}}+\mathbb{E}[{\mathbf{S}^{+}}]\mathbf{H}\mathbb{E}[{\mathbf{S}^{+}}],\]
where ${\boldsymbol{\Psi }_{ij}}$ is obtained from the decomposition (20). Inserting the above into (21) gives
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{V}[{\mathbf{S}^{+}}\boldsymbol{\eta }]& \displaystyle =& \displaystyle {\sum \limits_{i=1}^{p}}{\sum \limits_{j=1}^{p}}\mathbb{E}[{h_{ij}}]{\boldsymbol{\Psi }_{ij}}+\mathbb{E}[{\mathbf{S}^{+}}]\mathbb{E}[\mathbf{H}]\mathbb{E}[{\mathbf{S}^{+}}]-\mathbb{E}[{\mathbf{S}^{+}}]\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }}\mathbb{E}[{\mathbf{S}^{+}}]\\ {} & \displaystyle =& \displaystyle {\sum \limits_{i=1}^{p}}{\sum \limits_{j=1}^{p}}\left({\theta _{i}}{\theta _{j}}+\frac{{\sigma _{ij}}}{{\alpha ^{2}}(n+1)}\right){\boldsymbol{\Psi }_{ij}}+\frac{1}{{\alpha ^{2}}(n+1)}\boldsymbol{\Gamma }\mathbf{M}(\boldsymbol{\Lambda })\boldsymbol{\Lambda }\mathbf{M}(\boldsymbol{\Lambda }){\boldsymbol{\Gamma }^{\prime }}\end{array}\]
due to (5) and since ${\boldsymbol{\Gamma }^{\prime }}\boldsymbol{\Sigma }\boldsymbol{\Gamma }=\boldsymbol{\Lambda }$. The theorem is proved. □

In [21] the authors note that the moments ${m_{ij}}(\boldsymbol{\Lambda })$ and ${v_{ij,kl}}(\boldsymbol{\Lambda })$ do not seem to have tractable closed-form representations. However, these quantities can be approximated by simulation of Z, given the eigenvalues of Σ.
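A sketch of this simulation-based approximation, with illustrative sizes, for the moments ${m_{ij}}(\boldsymbol{\Lambda })$ is given below; the covariances ${v_{ij,kl}}(\boldsymbol{\Lambda })$ can be estimated from the same draws in an analogous way.

```python
import numpy as np

rng = np.random.default_rng(5)
p, n, reps = 10, 6, 5000
lam = np.linspace(2.0, 1.0, p)              # eigenvalues of Sigma (illustrative)

m = np.zeros((p, p))
for _ in range(reps):
    Z = rng.normal(size=(p, n))             # Z ~ MN(0, I_p, I_n)
    G = np.linalg.inv(Z.T @ (lam[:, None] * Z))   # (Z' Lambda Z)^{-1}
    m += Z @ (G @ G) @ Z.T                  # element (i,j): z_i'(Z'Lambda Z)^{-2} z_j
m /= reps                                   # Monte Carlo estimate of m_ij(Lambda)

M = n * np.diag(lam * np.diag(m))           # M(Lambda) as defined in (19)
```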
3 Exact moments with reflexive generalized inverse
An alternative to using the Moore–Penrose inverse ${\mathbf{S}^{+}}$ to estimate ${\boldsymbol{\Sigma }^{-1}}$ is an application of the reflexive generalized inverse, defined as
\[ {\mathbf{S}^{\dagger }}={\boldsymbol{\Sigma }^{-1/2}}{\left({\boldsymbol{\Sigma }^{-1/2}}\mathbf{S}{\boldsymbol{\Sigma }^{-1/2}}\right)^{+}}{\boldsymbol{\Sigma }^{-1/2}},\]
where the elements of ${\mathbf{S}^{\dagger }}$ are denoted by ${s_{ij}^{\dagger }}$. Then, the TP weights vector can be estimated by
\[ {\mathbf{w}_{TP}^{\dagger }}={\alpha ^{-1}}{\mathbf{S}^{\dagger }}(\bar{\mathbf{x}}-{r_{f}}{\mathbf{1}_{p}}),\]
and we derive the following result.

Theorem 5.
If $p>n+3$ and $\boldsymbol{\Sigma }>0$, then
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}[{\mathbf{w}_{TP}^{\dagger }}]& \displaystyle =& \displaystyle {a_{1}}{\mathbf{w}_{TP}},\\ {} \displaystyle \mathbb{V}[{\mathbf{w}_{TP}^{\dagger }}]& \displaystyle =& \displaystyle ({a_{2}}+2{a_{3}}){\mathbf{w}_{TP}}{\mathbf{w}^{\prime }_{TP}}\\ {} & & \displaystyle +\left[{a_{2}}{\mathbf{w}^{\prime }_{TP}}\boldsymbol{\Sigma }{\mathbf{w}_{TP}}+\frac{{a_{1}^{2}}+(p+1){a_{2}}+2{a_{3}}}{{\alpha ^{2}}(n+1)}\right]{\boldsymbol{\Sigma }^{-1}}.\end{array}\]
Proof.
The first result follows directly from Corollary 2.3 in [21] and the independence of ${\mathbf{S}^{\dagger }}$ and $\bar{\mathbf{x}}$. For the second result, we have that

(22)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{V}[{\mathbf{S}^{\dagger }}\boldsymbol{\eta }]& \displaystyle =& \displaystyle \mathbb{E}\left[\mathbb{E}[{\mathbf{S}^{\dagger }}\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}{\mathbf{S}^{\dagger }}\mid \boldsymbol{\eta }]\right]-\mathbb{E}[{\mathbf{S}^{\dagger }}]\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }}\mathbb{E}[{\mathbf{S}^{\dagger }}].\end{array}\]

Again we let $\mathbf{H}=\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}$, and note that by Corollary 2.3 in [21] we have
\[ \mathbb{E}[{s_{ik}^{\dagger }}{s_{lj}^{\dagger }}]={a_{2}}({\sigma ^{ij}}{\sigma ^{kl}}+{\sigma ^{il}}{\sigma ^{kj}})+({a_{1}^{2}}+2{a_{3}}){\sigma ^{ik}}{\sigma ^{jl}},\]
which combined with (14)–(15) allows us to obtain
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{E}{\left[{\mathbf{S}^{\dagger }}\mathbf{H}{\mathbf{S}^{\dagger }}\mid \boldsymbol{\eta }\right]_{ij}}& \displaystyle =& \displaystyle {\sum \limits_{k=1}^{p}}{\sum \limits_{l=1}^{p}}{h_{kl}}\mathbb{E}\left[{s_{ik}^{\dagger }}{s_{lj}^{\dagger }}\right]\\ {} & \displaystyle =& \displaystyle {\sum \limits_{k=1}^{p}}{\sum \limits_{l=1}^{p}}{h_{kl}}\left({a_{2}}({\sigma ^{ij}}{\sigma ^{kl}}+{\sigma ^{il}}{\sigma ^{kj}})\right.\\ {} & & \displaystyle \left.+({a_{1}^{2}}+2{a_{3}}){\sigma ^{ik}}{\sigma ^{jl}}\right)\\ {} & \displaystyle =& \displaystyle ({a_{1}^{2}}+{a_{2}}+2{a_{3}}){\left[{\boldsymbol{\Sigma }^{-1}}\mathbf{H}{\boldsymbol{\Sigma }^{-1}}\right]_{ij}}+{a_{2}}\text{tr}(\mathbf{H}{\boldsymbol{\Sigma }^{-1}}){\left[{\boldsymbol{\Sigma }^{-1}}\right]_{ij}},\end{array}\]
so that
\[ \mathbb{E}[{\mathbf{S}^{\dagger }}\mathbf{H}{\mathbf{S}^{\dagger }}\mid \boldsymbol{\eta }]=({a_{1}^{2}}+{a_{2}}+2{a_{3}}){\boldsymbol{\Sigma }^{-1}}\mathbf{H}{\boldsymbol{\Sigma }^{-1}}+{a_{2}}\text{tr}(\mathbf{H}{\boldsymbol{\Sigma }^{-1}}){\boldsymbol{\Sigma }^{-1}}.\]
Inserting this into equation (22) gives
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle \mathbb{V}[{\mathbf{S}^{\dagger }}\boldsymbol{\eta }]& \displaystyle =& \displaystyle ({a_{1}^{2}}+{a_{2}}+2{a_{3}}){\boldsymbol{\Sigma }^{-1}}\mathbb{E}[\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}]{\boldsymbol{\Sigma }^{-1}}+{a_{2}}\text{tr}(\mathbb{E}[\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}]{\boldsymbol{\Sigma }^{-1}}){\boldsymbol{\Sigma }^{-1}}\\ {} & & \displaystyle -\mathbb{E}[{\mathbf{S}^{\dagger }}]\boldsymbol{\theta }{\boldsymbol{\theta }^{\prime }}\mathbb{E}[{\mathbf{S}^{\dagger }}],\end{array}\]
and applying the first result to $\mathbb{E}[{\mathbf{S}^{\dagger }}]$ together with (5) concludes the proof. □

An obvious drawback of ${\mathbf{w}_{TP}^{\dagger }}$ is that Σ must be known in order to construct ${\mathbf{S}^{\dagger }}$. Moreover, in the case of $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$ the results in Theorem 5 coincide with the results in Theorem 1, since in this case ${\mathbf{S}^{\dagger }}={\mathbf{S}^{+}}$.
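The construction of ${\mathbf{S}^{\dagger }}$ and the mean result of Theorem 5 can be illustrated as follows; this is a simulation sketch with assumed parameters, where the symmetric square root of Σ is computed via its eigendecomposition.

```python
import numpy as np

rng = np.random.default_rng(6)
p, N, alpha, r_f = 20, 8, 1.0, 0.0
n = N - 1                                       # p > n + 3 holds here
mu = rng.normal(0.0, 0.05, p)
A = rng.normal(size=(p, p))
Sigma = A @ A.T + p * np.eye(p)

vals, vecs = np.linalg.eigh(Sigma)
root = (vecs * np.sqrt(vals)) @ vecs.T          # Sigma^{1/2}
root_inv = (vecs / np.sqrt(vals)) @ vecs.T      # Sigma^{-1/2}

a1 = n ** 2 / (p * (p - n - 1))
w_tp = np.linalg.solve(Sigma, mu - r_f) / alpha

reps, acc = 5000, np.zeros(p)
for _ in range(reps):
    X = mu[:, None] + root @ rng.normal(size=(p, N))
    x_bar = X.mean(axis=1)
    S = np.cov(X)
    S_dag = root_inv @ np.linalg.pinv(root_inv @ S @ root_inv) @ root_inv
    acc += S_dag @ (x_bar - r_f) / alpha
w_bar = acc / reps

# Least-squares slope of w_bar on w_tp should be close to a1 (Theorem 5)
print(w_bar @ w_tp / (w_tp @ w_tp), a1)
```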
4 Simulation study
The aim of this section is to compare the bounds on the moments of ${\tilde{\mathbf{w}}_{TP}}$ derived in Section 2.2 with the sample mean and sample variance of this estimator. We will also investigate the difference between the moments of ${\mathbf{w}_{TP}^{\dagger }}$ derived in Theorem 5 and the sample moments of ${\tilde{\mathbf{w}}_{TP}}$. Ideally, the bounds should deviate as little as possible from the obtained sample moments. To this end, define ${\mathbf{b}^{l}}$ and ${\mathbf{b}^{u}}$ as the $p\times 1$ vectors with elements
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {b_{i}^{l}}& \displaystyle =& \displaystyle {v_{ii}}{\mu _{i}}+{\sum \limits_{j\ne i}^{p}}{v_{ij}}{\mu _{j}},\\ {} \displaystyle {b_{i}^{u}}& \displaystyle =& \displaystyle {z_{ii}}{\mu _{i}}+{\sum \limits_{j\ne i}^{p}}{z_{ij}}{\mu _{j}},\end{array}\]
such that ${\mathbf{b}^{l}}$ and ${\mathbf{b}^{u}}$ represent the element-wise lower and upper bounds for the expected TP weights vector presented in Theorem 2, where we set $\alpha =1$ and ${r_{f}}=0$. Let
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {\mathbf{B}_{1}}& \displaystyle =& \displaystyle (2{c_{1}}+{c_{2}}){({\lambda _{1}}({\boldsymbol{\Sigma }^{-1}}))^{4}}\left({k_{1}}\boldsymbol{\Sigma }\mathbb{E}[\boldsymbol{\eta }{\boldsymbol{\eta }^{\prime }}]\boldsymbol{\Sigma }+{k_{2}}\boldsymbol{\Sigma }\mathbb{E}[{\boldsymbol{\eta }^{\prime }}\boldsymbol{\Sigma }\boldsymbol{\eta }]\right),\\ {} \displaystyle {\mathbf{B}_{2}}& \displaystyle =& \displaystyle (2{c_{1}}+{c_{2}}){({\lambda _{1}}({\boldsymbol{\Sigma }^{-1}}))^{4}}\mathbb{E}[({\boldsymbol{\eta }^{\prime }}\boldsymbol{\eta })]{\mathbf{I}_{p}}\end{array}\]
represent the bounds in equations (17) and (18) in Theorem 3, respectively. Further, let m and V respectively denote the sample mean vector and sample covariance matrix of ${\tilde{\mathbf{w}}_{TP}}$ based on an observed matrix X, as described in Section 2. Moreover, we define

(23)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {t_{l}}& \displaystyle =& \displaystyle \frac{{\mathbf{1}^{\prime }_{p}}|{\mathbf{b}^{l}}-\mathbf{m}|}{p},\end{array}\]

(24)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {t_{u}}& \displaystyle =& \displaystyle \frac{{\mathbf{1}^{\prime }_{p}}|{\mathbf{b}^{u}}-\mathbf{m}|}{p},\end{array}\]

(25)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {t^{\dagger }}& \displaystyle =& \displaystyle \frac{{\mathbf{1}^{\prime }_{p}}|\mathbb{E}[{\mathbf{w}_{TP}^{\dagger }}]-\mathbf{m}|}{p},\end{array}\]

so that ${t_{l}}$, ${t_{u}}$ and ${t^{\dagger }}$ measure the element-wise difference between the sample mean vector and, respectively, the lower bound on the mean, the upper bound on the mean, and the mean of ${\mathbf{w}_{TP}^{\dagger }}$. Dividing by p allows comparing the measures between various portfolio sizes. Further, let

(26)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {T_{1}}& \displaystyle =& \displaystyle \frac{\left|{\mathbf{1}^{\prime }_{p}}\left({\mathbf{B}_{1}}-\mathbf{V}\right){\mathbf{1}_{p}}\right|}{{p^{2}}},\end{array}\]

(27)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {T_{2}}& \displaystyle =& \displaystyle \frac{\left|{\mathbf{1}^{\prime }_{p}}\left({\mathbf{B}_{2}}-\mathbf{V}\right){\mathbf{1}_{p}}\right|}{{p^{2}}},\end{array}\]

(28)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {T^{\dagger }}& \displaystyle =& \displaystyle \frac{\left|{\mathbf{1}^{\prime }_{p}}\left(\mathbb{V}[{\mathbf{w}_{TP}^{\dagger }}]-\mathbf{V}\right){\mathbf{1}_{p}}\right|}{{p^{2}}},\end{array}\]

where ${\mathbf{1}_{p}}$ is a $p\times 1$ vector of ones. Then, ${T_{1}}$ and ${T_{2}}$ provide a measure of discrepancy between the sample covariance matrix and the bounds presented in Theorem 3, while ${T^{\dagger }}$ measures the discrepancy between the variance of ${\mathbf{w}_{TP}^{\dagger }}$ presented in Theorem 5 and the sample covariance matrix of ${\tilde{\mathbf{w}}_{TP}}$. Since they are divided by ${p^{2}}$, the number of elements in ${\mathbf{B}_{1}}$, ${\mathbf{B}_{2}}$, $\mathbb{V}[{\mathbf{w}_{TP}^{\dagger }}]$ and V, the measures ${T_{1}}$, ${T_{2}}$ and ${T^{\dagger }}$ again allow for comparison between different portfolio sizes. Moreover, define

(29)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {f_{l}}& \displaystyle =& \displaystyle \| {\mathbf{b}^{l}}-\mathbf{m}{\| _{F}^{2}}/\| \mathbf{m}{\| _{F}^{2}},\end{array}\]

(30)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {f_{u}}& \displaystyle =& \displaystyle \| {\mathbf{b}^{u}}-\mathbf{m}{\| _{F}^{2}}/\| \mathbf{m}{\| _{F}^{2}},\end{array}\]

(31)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {f^{\dagger }}& \displaystyle =& \displaystyle \| \mathbb{E}[{\mathbf{w}_{TP}^{\dagger }}]-\mathbf{m}{\| _{F}^{2}}/\| \mathbf{m}{\| _{F}^{2}},\end{array}\]

(32)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {F_{1}}& \displaystyle =& \displaystyle \| {\mathbf{B}_{1}}-\mathbf{V}{\| _{F}^{2}}/\| \mathbf{V}{\| _{F}^{2}},\end{array}\]

(33)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {F_{2}}& \displaystyle =& \displaystyle \| {\mathbf{B}_{2}}-\mathbf{V}{\| _{F}^{2}}/\| \mathbf{V}{\| _{F}^{2}},\end{array}\]

(34)
\[\begin{array}{r@{\hskip10.0pt}c@{\hskip10.0pt}l}\displaystyle {F^{\dagger }}& \displaystyle =& \displaystyle \| \mathbb{V}[{\mathbf{w}_{TP}^{\dagger }}]-\mathbf{V}{\| _{F}^{2}}/\| \mathbf{V}{\| _{F}^{2}},\end{array}\]

where $\| \mathbf{M}{\| _{F}}$ denotes the Frobenius norm of the matrix M. Hence ${f_{l}}$, ${f_{u}}$, ${F_{1}}$ and ${F_{2}}$ represent the normalized squared Frobenius norms of the differences between the bounds and the sample moments, while ${f^{\dagger }}$ and ${F^{\dagger }}$ measure the differences between the moments of ${\mathbf{w}_{TP}^{\dagger }}$ and the sample moments of ${\tilde{\mathbf{w}}_{TP}}$.

In the following, we will study simulations of (23)–(34) for various parameter values. In order to account for a wide range of values of $\boldsymbol{\mu }$ and Σ, these values will be randomly generated in the simulation study. Each of the p elements in the mean vector $\boldsymbol{\mu }$ will be independently generated as $\mathcal{U}(-0.1,0.1)$, where $\mathcal{U}(l,u)$ denotes the uniform distribution between l and u. The positive definite covariance matrix Σ will be determined as $\boldsymbol{\Sigma }=\boldsymbol{\Gamma }\boldsymbol{\Lambda }{\boldsymbol{\Gamma }^{\prime }}$, where the $p\times p$ matrix Γ represents the eigenvectors of Σ and is generated according to the Haar distribution. The $p\times p$ matrix Λ is diagonal, and its elements represent the ordered eigenvalues of Σ. Here we let the p eigenvalues be equally spaced from d to 1, for various values of d. The parameter d then represents a measure of dependency between the p assets in the portfolio, where $d=1$ represents no dependency and larger d represents a stronger dependency structure. Consequently, the simulation procedure can be described as follows:
1) Generate $\boldsymbol{\mu }$, with ${\mu _{i}}\sim \mathcal{U}(-0.1,0.1)$, $i=1,\dots ,p$.
2) Generate Γ according to the Haar distribution, and compute $\boldsymbol{\Sigma }=\boldsymbol{\Gamma }\boldsymbol{\Lambda }{\boldsymbol{\Gamma }^{\prime }}$, where $\text{diag}(\boldsymbol{\Lambda })=d,\dots ,1$.
3) Independently generate $\bar{\mathbf{x}}\sim {\mathcal{N}_{p,1}}(\boldsymbol{\mu },\boldsymbol{\Sigma }/N)$ and $n\mathbf{S}\sim {\mathcal{W}_{p}}(n,\boldsymbol{\Sigma })$.
4) Compute ${\tilde{\mathbf{w}}_{TP}}$.
5) Repeat steps 3) and 4) $s=10000$ times.
6) Based on the s samples of ${\tilde{\mathbf{w}}_{TP}}$, compute m and V. A condensed code sketch of these steps is given after this list.

The above procedure is repeated $r=10$ times to get r values of (23)–(34) for a given combination of p, N and d. Figures 1–12 display the mean value, over the r simulations, of each respective measure, for $p=\{25,50,75,100\}$, $d=\{1,\dots ,10\}$ and $N=\{2,0.4p,0.7p,p-3\}$. For easier reading, the values are displayed on a logarithmic scale and are connected with a solid line. First, we notice that most measures seem to increase with increasing dependency measure d. Further, ${t_{l}}$, ${t_{u}}$, ${t^{\dagger }}$, ${T_{1}}$, ${T_{2}}$ and ${T^{\dagger }}$ increase with increasing sample size N. However, ${F_{2}}$, the measure of the discrepancy between the sample variance of ${\tilde{\mathbf{w}}_{TP}}$ and the variance bound ${\mathbf{B}_{2}}$, on the contrary decreases with increasing N. Regarding the bounds on the expected value of ${\tilde{\mathbf{w}}_{TP}}$, ${t_{l}}$ and ${t_{u}}$ become very similar, and so do ${f_{l}}$ and ${f_{u}}$. The measures of the difference between $\mathbb{E}[{\tilde{\mathbf{w}}_{TP}}]$ and $\mathbb{E}[{\mathbf{w}_{TP}^{\dagger }}]$, ${t^{\dagger }}$ and ${f^{\dagger }}$, are fairly small for most of the considered simulation parameters. This suggests that $\mathbb{E}[{\mathbf{w}_{TP}^{\dagger }}]$ can serve as a rough approximation of $\mathbb{E}[{\tilde{\mathbf{w}}_{TP}}]$, especially for $N\in (0.4p,0.7p)$. Furthermore, when $d=1$ we have $\boldsymbol{\Sigma }={\mathbf{I}_{p}}$, and hence both bounds ${\mathbf{b}^{l}}$ and ${\mathbf{b}^{u}}$, as well as $\mathbb{E}[{\mathbf{w}_{TP}^{\dagger }}]$, coincide with $\mathbb{E}[{\tilde{\mathbf{w}}_{TP}}]$. In particular, for $d=1$, these measures simply capture the sampling variation of m. Similarly, when $d=1$, ${T^{\dagger }}$ and ${F^{\dagger }}$ capture the sampling variation of V. Further, for $N<p-3$ and low values of d, ${T^{\dagger }}$ and ${F^{\dagger }}$ are fairly small, suggesting that $\mathbb{V}[{\mathbf{w}_{TP}^{\dagger }}]$ could be applied as a rough approximation of $\mathbb{V}[{\tilde{\mathbf{w}}_{TP}}]$ in these cases. Finally, we notice that the measures ${F_{1}}$ and ${F_{2}}$ become very large for most of the combinations of p, N and d. It is, however, important to note that the squared Frobenius norm of differences, on which these measures are based, captures element-wise squared discrepancies, while ${\mathbf{B}_{1}}$ and ${\mathbf{B}_{2}}$ are not element-wise bounds, but rather bounds in the Löwner order sense.
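The sketch below condenses steps 1)–6); it is illustrative, with a smaller s for speed and with $\alpha =1$ and ${r_{f}}=0$ as above. A Haar-distributed Γ is obtained from the QR decomposition of a Gaussian matrix.

```python
import numpy as np

rng = np.random.default_rng(7)
p, N, d, s = 25, 10, 5.0, 1000
n = N - 1

mu = rng.uniform(-0.1, 0.1, p)                  # step 1
G, _ = np.linalg.qr(rng.normal(size=(p, p)))    # step 2: Haar-distributed Gamma
Lam = np.linspace(d, 1.0, p)                    # eigenvalues from d down to 1
Sigma = (G * Lam) @ G.T
root = (G * np.sqrt(Lam)) @ G.T                 # Sigma^{1/2}

draws = np.empty((s, p))
for k in range(s):                              # steps 3)-5)
    x_bar = mu + root @ rng.normal(size=p) / np.sqrt(N)   # x_bar ~ N(mu, Sigma/N)
    Y = root @ rng.normal(size=(p, n))          # n S = Y Y' ~ W_p(n, Sigma)
    S = Y @ Y.T / n
    draws[k] = np.linalg.pinv(S) @ x_bar        # step 4: w_tilde, alpha=1, r_f=0

m = draws.mean(axis=0)                          # step 6: sample mean ...
V = np.cov(draws, rowvar=False)                 # ... and sample covariance
```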
5 Summary
The TP is an important portfolio in the mean-variance optimization framework of [35], and the statistical properties of the standard TP weights estimator have been thoroughly studied. However, when the portfolio dimension is greater than the sample size, this estimator is not applicable, since standard inversion of the now singular sample covariance matrix is not possible. This issue can be solved by applying the Moore–Penrose inverse, with which a general TP weights estimator can be constructed, covering both the singular and nonsingular cases. Unfortunately, there exists no derivation of the moments of the Moore–Penrose inverse of a singular Wishart matrix, and consequently the moments of the general TP estimator cannot be obtained.
In this paper, we provide bounds on the mean and variance of the TP weights estimator in the singular case. Further, we present approximate results, as well as exact moment results in the case when the population covariance is equal to the identity matrix. We also provide exact moment results when the reflexive generalized inverse is applied in the TP weights equation.
Moreover, we investigate the properties of the derived bounds, and of the estimator based on the reflexive generalized inverse, in a simulation study. The differences between the various bounds and their sample counterparts are measured by several quantities, and studied for numerous dimensions, sample sizes and levels of dependency in the population covariance matrix. The results suggest that many of the derived bounds are closest to the sample moments when the population covariance matrix implies low dependency between the considered assets. Finally, the study implies that in some cases the moments of the TP weights based on the reflexive generalized inverse can be used as a rough approximation of the moments of the TP weights based on the Moore–Penrose inverse. For future studies, it would be relevant, for example, to perform a sensitivity analysis of how fluctuations in the population covariance matrix affect the estimated TP weights.