Invariant Coordinate Selection for Identifying the Structure of Multivariate Datasets

Alfons, A., Archimbaud, A., Nordhausen, K. and Ruiz-Gazen, A. (Invited)

Date

February 29, 2024

Time

12:00 AM

Location

Erasmus University Rotterdam, The Netherlands

Event

Abstract

Invariant Coordinate Selection (ICS) is a powerful unsupervised multivariate method designed to identify the structure of multivariate datasets on a subspace. It goes beyond the well-known Principal Component Analysis (PCA) in two ways: rather than maximizing inertia it optimizes a generalized kurtosis, and it is invariant not only under orthogonal transformations of the data but under any affine transformation. More precisely, ICS compares two scatter matrices through their joint diagonalization. Theoretical results show that, under some elliptical mixture models, the subspace spanned by the first and/or last components carries the information about the multivariate structure and recovers the Fisher discriminant subspace, whatever the choice of scatter matrices. Among other things, we have studied the relevance of ICS for outlier detection and clustering from both a theoretical and an empirical point of view. However, choosing the pair of scatter matrices and selecting the components to retain remain two challenging steps, so we built several R packages for ease of use. Building on an updated version of the ICS package, which implements and unifies different algorithms for accurately computing the joint diagonalization and uses S3 classes and methods instead of S4, we propose three main packages: (i) ICSShiny, (ii) ICSOutlier and (iii) ICSClust. ICSShiny is a graphical user interface that allows researchers without prior programming experience to perform ICS with different scatter matrices and to identify outliers as proposed in the ICSOutlier package. Finally, the recently developed ICSClust package implements tandem clustering with ICS, together with different methods for selecting the components, supported by visualizations.
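To make the joint diagonalization mentioned in the abstract concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the algorithm implemented in the ICS R package. The abstract does not fix a scatter pair, so the sketch assumes the sample covariance (COV) and the fourth-moment scatter (COV4), and it whitens the data with the first scatter before taking an eigendecomposition of the second. The function name `ics_cov_cov4` and the simulated data are hypothetical.

```python
import numpy as np


def ics_cov_cov4(X):
    """Sketch of ICS with the (assumed) scatter pair COV and COV4.

    Returns the generalized kurtosis values rho (in decreasing order),
    the transformation matrix B, and the invariant coordinates Z.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)  # center the data

    # First scatter: the sample covariance matrix.
    S1 = np.cov(Xc, rowvar=False)

    # Symmetric inverse square root of S1, used to whiten the data.
    evals, evecs = np.linalg.eigh(S1)
    S1_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Y = Xc @ S1_inv_sqrt  # whitened data: cov(Y) is the identity

    # Second scatter (COV4) in whitened coordinates: each observation is
    # weighted by its squared Mahalanobis distance, which for whitened
    # data is simply its squared Euclidean norm.
    d2 = np.sum(Y ** 2, axis=1)
    S2_white = (Y * d2[:, None]).T @ Y / (n * (p + 2))

    # Diagonalizing S2 in the whitened space jointly diagonalizes S1 and S2.
    rho, U = np.linalg.eigh(S2_white)
    order = np.argsort(rho)[::-1]  # sort by decreasing generalized kurtosis
    rho, U = rho[order], U[:, order]

    B = S1_inv_sqrt @ U  # satisfies B.T @ S1 @ B = I and B.T @ S2 @ B = diag(rho)
    Z = Xc @ B           # invariant coordinates
    return rho, B, Z


# Hypothetical example: a small shifted group hidden in Gaussian noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(300, 4))
X_demo[:15, 0] += 6.0  # 5% of the observations form a second group
rho_demo, B_demo, Z_demo = ics_cov_cov4(X_demo)
print(np.round(rho_demo, 3))
```

Because both scatters are affine equivariant, applying any invertible linear map plus a shift to `X_demo` leaves `rho_demo` unchanged up to numerical error, and the components change at most in sign. This is the affine invariance the abstract contrasts with PCA. Which components to keep (first, last, or both) is the selection step the abstract describes as challenging, and the sketch does not address it.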

Details
Posted on:
February 29, 2024