Sparse Clustering (and Alignment) of Functional Data
Abstract
Finding sparse solutions to clustering problems has emerged as a hot topic in statistics in recent years, and sparse clustering approaches have also been proposed for functional data, where it is often of interest to select the portion of the curves' domain in which the clustering is most evident. I will show that functional sparse clustering can be analytically defined as a variational problem with a hard thresholding constraint ensuring the sparsity of the solution: this problem is well-posed and has a unique optimal solution. Moreover, this definition of functional sparse clustering provides good insights in real applications. When dealing with curve clustering, we cannot ignore the presence of misalignment: it occurs frequently in functional data analysis problems and can heavily affect the sparse clustering results, potentially leading to meaningless conclusions. Many methods to jointly cluster and align curves, which efficiently decouple amplitude and phase variability, have already been proposed in the literature on functional data. I propose a possible approach that performs sparse functional clustering while also aligning the curves, thereby carrying out three tasks jointly: functional clustering, curve alignment, and domain selection. The well-posedness of the method is studied, and its performance and parameter tuning are explored in a variety of simulated scenarios and real case studies.