Analysis of flow cytometry data by matrix relevance learning vector quantization

Michael Biehl*, Kerstin Bunte, Petra Schneider

*Corresponding author for this work

Research output: Academic, peer review

32 Citations (Scopus)
479 Downloads (Pure)

Abstract

Flow cytometry is a widely used technique for the analysis of cell populations in the study and diagnosis of human diseases. It yields large amounts of high-dimensional data, the analysis of which would clearly benefit from efficient computational approaches aimed at automated diagnosis and decision support. This article presents our analysis of flow cytometry data in the framework of the DREAM6/FlowCAP2 Molecular Classification of Acute Myeloid Leukemia (AML) Challenge, 2011. In the challenge, example data was provided for a set of 179 subjects, comprising healthy donors and 23 cases of AML. The participants were asked to predict the condition of 180 patients in a test set. We extracted feature vectors from the data in terms of single marker statistics, including characteristic moments, median and interquartile range of the observed values. Subsequently, we applied Generalized Matrix Relevance Learning Vector Quantization (GMLVQ), a machine learning technique which extends standard LVQ by an adaptive distance measure. Our method achieved the best possible performance with respect to the diagnoses of test set patients. The extraction of features from the flow cytometry data is outlined in detail, the machine learning approach is discussed and classification results are presented. In addition, we illustrate how GMLVQ can provide deeper insight into the problem by making it possible to infer the relevance of specific markers and features for the diagnosis.
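The two steps named in the abstract, per-marker summary statistics followed by a prototype classifier with an adaptive distance, can be illustrated with a minimal sketch. It is not the authors' implementation. The choice of mean, standard deviation, skewness and kurtosis as the "characteristic moments" is an assumption, and all function names (`marker_features`, `gmlvq_distance`, `classify`, `relevances`) are hypothetical. The distance uses the standard GMLVQ form d(x, w) = (x - w)^T Ω^T Ω (x - w). Training of the prototypes and of Ω is omitted.

```python
import numpy as np

def marker_features(events):
    """Map one subject's events (n_cells, n_markers) to a feature vector.

    Assumed statistics per marker: mean, standard deviation, skewness,
    kurtosis, median and interquartile range.
    """
    events = np.asarray(events, dtype=float)
    mean = events.mean(axis=0)
    std = events.std(axis=0)
    # Standardize for the higher moments; guard against zero spread.
    safe_std = np.where(std > 0, std, 1.0)
    z = (events - mean) / safe_std
    skew = (z ** 3).mean(axis=0)
    kurt = (z ** 4).mean(axis=0)
    median = np.median(events, axis=0)
    q75, q25 = np.percentile(events, [75, 25], axis=0)
    return np.concatenate([mean, std, skew, kurt, median, q75 - q25])

def gmlvq_distance(x, w, omega):
    """Adaptive squared distance (x - w)^T Omega^T Omega (x - w)."""
    diff = omega @ (np.asarray(x, dtype=float) - np.asarray(w, dtype=float))
    return float(diff @ diff)

def classify(x, prototypes, labels, omega):
    """Assign the label of the closest prototype under the adaptive distance."""
    dists = [gmlvq_distance(x, w, omega) for w in prototypes]
    return labels[int(np.argmin(dists))]

def relevances(omega):
    """Diagonal of Lambda = Omega^T Omega, one relevance value per feature."""
    return np.diag(omega.T @ omega)
```

In this sketch, a feature whose entry in `relevances(omega)` is close to zero contributes almost nothing to the distance, which is the sense in which a trained relevance matrix can indicate which markers and statistics matter for the diagnosis.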

Original language: English
Article number: e59401
Number of pages: 11
Journal: PLoS ONE
Volume: 8
Issue number: 3
Status: Published - 18 Mar 2013
