Invariant coordinate selection as preprocessing for clustering

Alfons, A., Archimbaud, A., Nordhausen, K. and Ruiz-Gazen, A. (Invited)

Date

December 17 – 19, 2022

Time

12:00 AM

Location

King’s College London, UK (hybrid)

Event

Abstract

Dimension reduction is an important preprocessing step in multivariate analysis and can improve the identification of clusters. Principal Component Analysis (PCA) is one of the best-known dimension reduction techniques, but it may not be the best choice for clustering purposes. An alternative approach, Invariant Coordinate Selection (ICS), relies on the simultaneous diagonalization of two scatter matrices. It goes beyond PCA by finding directions of interest through the optimization of a generalized kurtosis measure, and it returns affine invariant components. Two challenging steps are the choice of the pair of scatter matrices and the selection of the components to retain. Existing theoretical results guarantee that, under some elliptical mixture models, the structure of the data is highlighted on a subset of the first and/or last components. So far, ICS has received little attention as a preprocessing step for clustering. We evaluate the performance of several well-known clustering algorithms with ICS as a preprocessing step. We consider different combinations of scatter matrices, different component selection approaches, and the impact of outliers, both in simulations and on benchmark data sets.
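
As a rough illustration of the simultaneous diagonalization mentioned in the abstract, the following minimal sketch computes invariant coordinates for one particular scatter pair: the sample covariance matrix and the fourth-moment scatter matrix (often written COV-COV4). This pair is an assumption made for the example only; the abstract does not say which pairs the talk compares. The function name `ics_cov_cov4` is hypothetical and does not come from any ICS package.

```python
import numpy as np
from scipy.linalg import eigh


def ics_cov_cov4(X):
    """ICS sketch with the COV-COV4 scatter pair (an illustrative choice)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)

    # First scatter: the usual sample covariance matrix.
    S1 = Xc.T @ Xc / (n - 1)

    # Squared Mahalanobis distances with respect to S1.
    r2 = np.einsum("ij,ij->i", Xc @ np.linalg.inv(S1), Xc)

    # Second scatter: fourth-moment matrix, scaled so that its
    # generalized eigenvalues are close to 1 for Gaussian data.
    S2 = (Xc * r2[:, None]).T @ Xc / (n * (p + 2))

    # Simultaneous diagonalization: solve S2 b = rho * S1 b.
    # eigh normalizes B so that B.T @ S1 @ B is the identity.
    rho, B = eigh(S2, S1)

    # Order the components by decreasing generalized kurtosis.
    order = np.argsort(rho)[::-1]
    return rho[order], Xc @ B[:, order]


if __name__ == "__main__":
    # Toy data (assumed for illustration): two balanced Gaussian
    # clusters separated along the first of four variables.
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=1000)
    X = rng.normal(size=(1000, 4))
    X[:, 0] += 6 * labels - 3

    rho, Z = ics_cov_cov4(X)
    print("generalized kurtosis values:", rho)
    print("|corr(last component, labels)|:",
          abs(np.corrcoef(Z[:, -1], labels)[0, 1]))
```

In this toy setting the two groups are balanced, so the separating direction has low kurtosis and the cluster structure would be expected on the last component. A clustering algorithm could then be run on the selected columns of `Z` instead of on `X`. How to choose those columns automatically is one of the open steps the abstract refers to and is not addressed by this sketch.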

Details
Posted on:
December 19, 2022