Abstract
Sample size determination is a fundamental step in the design of experiments. Methods for sample size determination are abundant for univariate analysis methods, but scarce in the multivariate case. Omics data are multivariate in nature and are commonly investigated using multivariate statistical methods, such as principal component analysis (PCA) and partial least-squares discriminant analysis (PLS-DA). No simple approaches to sample size determination exist for PCA and PLS-DA. In this paper we introduce important concepts and offer strategies for estimating the (minimally) required sample size when planning experiments to be analyzed using PCA and/or PLS-DA.
Original language | English |
---|---|
Pages (from-to) | 2379-2393 |
Number of pages | 15 |
Journal | Journal of Proteome Research |
Volume | 15 |
Issue number | 8 |
Publication status | Published - Aug-2016 |

Keywords
- loading estimation
- covariance estimation
- eigenvalue distribution
- random matrix theory
- hypothesis testing
- dimensionality
- multivariate analysis
- power analysis
- PRINCIPAL-COMPONENTS-ANALYSIS
- COVARIANCE MATRICES
- LARGEST EIGENVALUE
- STATISTICAL POWER
- CONFIDENCE-INTERVALS
- CARDIOVASCULAR RISK
- METABOLOMICS DATA
- CROSS-VALIDATION
- STANDARD ERRORS
- STOPPING RULES