Invariant coordinate selection as preprocessing for clustering
Alfons, A., Archimbaud, A., Nordhausen, K. and Ruiz-Gazen, A.
Invited talk
Date
December 17 – 19, 2022
Time
12:00 AM
Location
King’s College London, UK (hybrid)
Event
Abstract
Dimension reduction is an important preprocessing step in multivariate analysis and can make clusters easier to identify. Principal Component Analysis (PCA) is one of the best-known dimension reduction techniques, but it may not be the best choice for clustering purposes. An alternative approach, Invariant Coordinate Selection (ICS), relies on the simultaneous diagonalization of two scatter matrices. It goes beyond PCA by finding directions of interest through the optimization of a generalized kurtosis measure, and it returns affine invariant components. Two challenging steps are the choice of the pair of scatter matrices and the selection of the components to retain. Existing theoretical results guarantee that, under some elliptical mixture models, the structure of the data can be highlighted on a subset of the first and/or last components. ICS has nevertheless received little attention for clustering tasks. We evaluate the performance of several well-known clustering algorithms with ICS as a preprocessing step. We consider different combinations of scatter matrices, different component selection approaches and the impact of outliers, on simulated data and on benchmark data sets.
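As an illustration of the simultaneous diagonalization mentioned in the abstract, here is a minimal Python sketch. It assumes one classical scatter pair, the covariance matrix (COV) and the scatter matrix of fourth moments (COV4). The abstract does not say which pairs the talk evaluates, so this choice is only an example. The function name `ics_cov_cov4` and the toy data are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def ics_cov_cov4(X):
    """Illustrative ICS with the scatter pair (COV, COV4).

    Returns the invariant components, the generalized kurtosis values
    (sorted in decreasing order) and the unmixing matrix B (rows are directions).
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)

    # First scatter: the sample covariance matrix.
    S1 = np.cov(Xc, rowvar=False)

    # Whiten the data with the symmetric inverse square root of S1.
    evals, evecs = np.linalg.eigh(S1)
    S1_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = Xc @ S1_inv_sqrt

    # Second scatter on the whitened data: COV4 weights each observation
    # by its squared Mahalanobis distance (here simply ||z||^2).
    r2 = np.sum(Z ** 2, axis=1)
    S2 = (Z * r2[:, None]).T @ Z / (n * (p + 2))

    # Diagonalizing S2 in the whitened space diagonalizes both scatters at once.
    kurt, U = np.linalg.eigh(S2)
    order = np.argsort(kurt)[::-1]
    kurt, U = kurt[order], U[:, order]

    B = U.T @ S1_inv_sqrt
    return Xc @ B.T, kurt, B


# Toy data (made up for this sketch): two Gaussian groups in three dimensions.
rng = np.random.default_rng(0)
group_a = rng.normal(size=(250, 3))
group_b = rng.normal(size=(250, 3)) + np.array([4.0, 0.0, 0.0])
X = np.vstack([group_a, group_b])

components, kurt, B = ics_cov_cov4(X)
print(kurt)  # generalized kurtosis values, largest first
```

The components with the most extreme generalized kurtosis values (the first and/or last ones) are the candidates to keep before running a clustering algorithm. Selecting them is a separate step that this sketch does not automate.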
Details
- Posted on: December 19, 2022