Clustering and dimension reduction for mixed variables

Maurizio Vichi, Donatella Vicari*, Henk A. L. Kiers

*Corresponding author for this work

Research output: Contribution to journal › Article › Academic › peer-review

2 Citations (Scopus)

Abstract

Clustering (partitioning) and simultaneous dimension reduction of objects and variables of a two-way two-mode data matrix is proposed here. The methodology is based on a general model that includes K-means clustering, factorial K-means, projection pursuit clustering (also known as reduced K-means), principal component analysis and intermediate cases of object clustering and variable reduction. Since we often have sets consisting of both qualitative and quantitative variables, the general model is now extended to deal with the general relevant case of mixed variables, analogous to variants of PCA handling qualitative (nominal and ordinal) variables in addition to quantitative variables. The model, called clustering and dimension reduction (CDR), is fully discussed in all the special cases cited above. For least-squares estimation of the model, an efficient coordinate descent algorithm is presented. Finally, a simulation study and two analyses on real data illustrate the features of CDR and study the performance of the proposed algorithm.
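
As a rough illustration of one special case named in the abstract, the sketch below fits reduced K-means by alternating least squares on quantitative data. It is not the authors' CDR algorithm and does not cover the mixed-variable extension. The function name `reduced_kmeans`, its parameters, and the toy data are assumptions made for this example. The loss being minimized is ||X - U C A'||^2, where U is the binary membership matrix, C holds the centroids in the reduced space, and A has orthonormal columns.

```python
import numpy as np

def reduced_kmeans(X, n_clusters, n_components, n_iter=50, seed=0):
    """Hypothetical sketch: alternating least squares for reduced K-means."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)                      # column-center the data
    n = X.shape[0]
    labels = rng.integers(n_clusters, size=n)   # random initial partition
    for _ in range(n_iter):
        U = np.eye(n_clusters)[labels]          # n x K membership matrix
        counts = U.sum(axis=0)
        # Cluster centroids in the full variable space (empty clusters stay at 0).
        Xbar = (U.T @ X) / np.maximum(counts, 1)[:, None]
        # Update A: leading eigenvectors of X' U (U'U)^-1 U' X.
        M = Xbar.T @ (counts[:, None] * Xbar)
        _, vecs = np.linalg.eigh(M)             # eigenvalues in ascending order
        A = vecs[:, -n_components:]             # J x Q, orthonormal columns
        # Update C: centroids projected onto the reduced space.
        C = Xbar @ A
        # Update U: assign each object to the nearest centroid in the reduced space.
        Y = X @ A
        dist = ((Y[:, None, :] - C[None, :, :]) ** 2).sum(axis=-1)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    loss = float(np.sum((X - C[labels] @ A.T) ** 2))
    return labels, A, C, loss

# Toy data (an assumption for this example): two shifted Gaussian groups in 5 variables.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 5)), rng.normal(6.0, 1.0, (50, 5))])
labels, A, C, loss = reduced_kmeans(X, n_clusters=2, n_components=1)
```

Like ordinary K-means, this procedure can stop at a local minimum, so in practice it would typically be run from several random starts, keeping the solution with the smallest loss.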

Original language: English
Pages (from-to): 243-269
Number of pages: 27
Journal: Behaviormetrika
Volume: 46
Publication status: Published - Oct-2019
