ShaRP: Shape-Regularized Multidimensional Projections
Problem
Multidimensional projections are well-established methods for visualizing large datasets in which every observation has tens to hundreds of dimensions. Many such algorithms exist, including PCA, t-SNE, UMAP, and our own Neural Network Projection.
However, most projection algorithms do not control the shapes of the clusters they generate. Take the well-known MNIST dataset as an example. The image below shows the projections created by an autoencoder (AE, a) and by SSNP (b). The shapes formed by same-color points (data points with the same label, and thus similar) are largely arbitrary. This can mislead users into inferring a specific distribution of these points that is not actually present in the data.
![](http://webspace.science.uu.nl/~telea001/uploads/InfoVis/sharp_intro.png)
Solution
The ShaRP method we propose solves this problem simply and efficiently. It extends SSNP (itself an autoencoder with an additional label-based cost) to force same-label points to follow a distribution specified by the user. In the image above, we force this distribution to be Gaussian, so same-label points form roughly circular clusters.
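To make the idea of "an autoencoder cost, plus a label-based cost, plus a shape term" concrete, here is a minimal sketch of how such a combined loss could be assembled for a single sample. It is an illustration under stated assumptions, not the ShaRP implementation: the function names, the `shape_weight` parameter, and the choice of a closed-form KL divergence to a unit Gaussian as the shape term are all hypothetical and are not taken from the ShaRP source code.

```python
import math


def mse(original, reconstructed):
    """Reconstruction cost: mean squared error between input and output."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def cross_entropy(class_probs, true_label):
    """Label-based cost: negative log-probability of the true label."""
    return -math.log(max(class_probs[true_label], 1e-12))


def kl_to_unit_gaussian(mean, log_var):
    """Closed-form KL( N(mean, diag(exp(log_var))) || N(0, I) ).

    ASSUMPTION: used here only as one possible Gaussian shape term.
    """
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mean, log_var))


def combined_loss(original, reconstructed, class_probs, true_label,
                  mean, log_var, shape_weight=1.0):
    """Hypothetical total cost: reconstruction + label + weighted shape term."""
    return (mse(original, reconstructed)
            + cross_entropy(class_probs, true_label)
            + shape_weight * kl_to_unit_gaussian(mean, log_var))
```

In this sketch, `shape_weight` trades off how strongly the shape term is enforced against reconstruction and label accuracy; other target distributions would need a different shape term.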
Much more shape control is possible. The image below shows the same MNIST dataset, now with same-label points forced to group into squares. This helps, for example, when annotating point clusters with (rectangular) image thumbnails.
![](http://webspace.science.uu.nl/~telea001/uploads/InfoVis/sharp_squares.png)
The image below shows three datasets (MNIST, HAR, Reuters, from left to right) projected by ShaRP to create triangular clusters.
![](http://webspace.science.uu.nl/~telea001/uploads/InfoVis/sharp_triangles.png)
Results
Why is this useful? Consider the image below, which compares ShaRP (favoring round clusters) with t-SNE, UMAP, and SSNP on three datasets. ShaRP produces images that are easier to understand and less shape-biased.
![](http://webspace.science.uu.nl/~telea001/uploads/InfoVis/sharp_evaluation.png)
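One simple way to put a number on how round the same-label clusters in a 2D projection are is to compare the two eigenvalues of each cluster's covariance matrix. The sketch below does this in plain Python. It is a hypothetical diagnostic, not a metric from the ShaRP paper or code: the names `cluster_anisotropy` and `mean_anisotropy` are introduced here for illustration only.

```python
import math
from collections import defaultdict


def cluster_anisotropy(points):
    """Return 1 - (smallest / largest covariance eigenvalue) for 2D points.

    0 means the spread is the same in every direction (round cluster);
    values near 1 mean the cluster is stretched along one direction.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Entries of the 2x2 covariance matrix [[a, b], [b, c]]
    a = sum((x - mx) ** 2 for x, _ in points) / n
    c = sum((y - my) ** 2 for _, y in points) / n
    b = sum((x - mx) * (y - my) for x, y in points) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix
    mean = (a + c) / 2.0
    spread = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    largest, smallest = mean + spread, mean - spread
    if largest <= 0.0:
        return 0.0  # all points coincide; treat as having no preferred direction
    return 1.0 - max(smallest, 0.0) / largest


def mean_anisotropy(points, labels):
    """Average the anisotropy over all labels that have at least 3 points."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    scores = [cluster_anisotropy(g) for g in groups.values() if len(g) >= 3]
    return sum(scores) / len(scores) if scores else 0.0
```

Calling `mean_anisotropy(projected_points, labels)` on the output of each projection method would give one score per method, with lower values indicating rounder clusters. Note that this only captures elongation; it does not distinguish, say, a square from a disc.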
Implementation
ShaRP is implemented in Python. The full source code is available here.
Publications
ShaRP: Shape-Regularized Multidimensional Projections. A. Machado, M. Behrisch, A. Telea. Proc. EuroVA 2023 (Best paper award)