Principal components analysis for mixtures with varying concentrations
Volume 8, Issue 4 (2021), pp. 509–523
Pub. online: 12 November 2021
Type: Research Article
Open Access
Received
14 August 2021
Revised
30 October 2021
Accepted
30 October 2021
Published
12 November 2021
Abstract
Principal Component Analysis (PCA) is a classical dimension-reduction technique for multivariate data. When the data are a mixture of subjects from different subpopulations, one may be interested in the PCA of some (or each) subpopulation separately. In this paper, estimators are considered for the PC directions and the corresponding eigenvalues of subpopulations in the nonparametric model of mixture with varying concentrations. Consistency and asymptotic normality of the obtained estimators are proved. These results allow one to construct confidence sets for the PC model parameters. The performance of the resulting confidence intervals for the leading eigenvalues is investigated via simulations.
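To make the setting concrete, the following is a minimal sketch of how PCA for a single mixture component might be computed when the concentrations are known. The abstract does not spell out the estimator, so everything here is an assumption: the function names `minimax_weights` and `component_pca` are hypothetical, and the weights are the least-squares ("minimax") weights commonly used for mixtures with varying concentrations, chosen so that the weighted empirical moments are unbiased for the moments of the selected component.

```python
import numpy as np


def minimax_weights(P):
    """Weights a (n x M) such that (1/n) * a.T @ P is the identity matrix.

    P is the (n x M) matrix of known concentrations: P[j, m] is the
    probability that observation j belongs to component m.
    Assumption: the columns of P are linearly independent.
    """
    n = P.shape[0]
    gram = P.T @ P / n                 # M x M Gram matrix of concentrations
    return P @ np.linalg.inv(gram)     # gram is symmetric, so a.T @ P / n = I


def component_pca(X, P, k):
    """Eigenvalues (descending) and PC directions for component k.

    X is the (n x d) data matrix. The covariance of component k is
    estimated from weighted first and second moments.
    """
    n = X.shape[0]
    a = minimax_weights(P)[:, k]
    mean = (a[:, None] * X).sum(axis=0) / n
    second = (X * a[:, None]).T @ X / n
    cov = second - np.outer(mean, mean)
    cov = (cov + cov.T) / 2            # remove floating-point asymmetry
    # Weights can be negative, so cov is not guaranteed to be positive
    # semidefinite in finite samples; eigh still applies to symmetric input.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]
    return eigvals[order], eigvecs[:, order]
```

Calling `component_pca(X, P, k)` returns the estimated eigenvalues in descending order together with a matrix whose columns are the corresponding PC directions. Confidence intervals of the kind mentioned in the abstract would additionally require an estimate of the asymptotic variance, which this sketch does not attempt.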