
Self-Supervised Neural Projection (SSNP)

Problem

Multidimensional projections are the methods of choice for depicting large, high-dimensional datasets. Dozens of such methods exist. So, which is the best? We quantitatively evaluated over 40 such methods and concluded that there is no clear winner: speed, quality, stability, out-of-sample ability, ease of use, and implementation simplicity tend to trade off against one another.

Recently, we proposed NNP, a projection method that uses deep learning to achieve nearly all of the above features. Its one drawback is that NNP is supervised: it requires a training projection, which takes effort and attention to generate.

Solution

Here we propose a different route. We extend a classical autoencoder architecture with a cost based on point labels, which are either supplied with the data or computed by clustering (pseudo-labels). This removes the need for a training projection while keeping all other desirable aspects of NNP.
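As a rough illustration of the idea above, the sketch below computes a joint cost for one forward pass: a reconstruction term, as in a plain autoencoder, plus a classification term on the point labels. It is a minimal NumPy sketch under stated assumptions, not the SSNP implementation. The single-layer encoder and heads, the tanh bottleneck, the equal weighting of the two terms, and all names (dense, softmax, joint_loss) are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Small random weights and zero biases for one dense layer (illustrative)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

def softmax(logits):
    shifted = logits - logits.max(axis=1, keepdims=True)  # avoid overflow in exp
    e = np.exp(shifted)
    return e / e.sum(axis=1, keepdims=True)

def joint_loss(X, labels, params):
    """Forward pass returning the 2D projection and the two cost terms."""
    (W_enc, b_enc), (W_dec, b_dec), (W_cls, b_cls) = params
    Z = np.tanh(X @ W_enc + b_enc)       # 2D bottleneck: the projection itself
    X_hat = Z @ W_dec + b_dec            # reconstruction head (autoencoder part)
    P = softmax(Z @ W_cls + b_cls)       # classification head (label-based part)
    recon = np.mean((X - X_hat) ** 2)    # mean squared reconstruction error
    # Cross-entropy between predicted class probabilities and the given labels.
    ce = -np.mean(np.log(P[np.arange(len(labels)), labels] + 1e-12))
    return Z, recon, ce

# Toy data: 100 points in 10 dimensions with 3 (pseudo-)labels.
n_points, n_dims, n_classes = 100, 10, 3
X = rng.random((n_points, n_dims))
labels = rng.integers(0, n_classes, size=n_points)

params = [dense(n_dims, 2), dense(2, n_dims), dense(2, n_classes)]
Z, recon, ce = joint_loss(X, labels, params)
total = recon + ce  # assumed equal weighting of the two terms
print(Z.shape, float(total))
```

Training would minimise the total cost with respect to all weights. In a setup like this one, only the encoder part would be needed afterwards to project new points.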

Results

The image below compares SSNP's results, obtained using agglomerative clustering (Agg), K-means clustering (Km), and ground-truth labels (GT), with those of three state-of-the-art methods: t-SNE, UMAP, and autoencoders. SSNP's results are better than those of autoencoders. Compared to t-SNE and UMAP, SSNP is much faster, simpler to implement and use, and deterministic.
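The Km variant above needs pseudo-labels before any training can start. The sketch below shows one way such labels could be produced, using a hand-written Lloyd's K-means in NumPy. This is an assumption for illustration only: the SSNP source may well rely on a library clusterer instead, and the function name kmeans_pseudo_labels, its parameters, and the toy data are all hypothetical.

```python
import numpy as np

def kmeans_pseudo_labels(X, k, n_iter=50, seed=0):
    """Assign each row of X a cluster index in [0, k) using Lloyd's algorithm."""
    rng = np.random.default_rng(seed)
    # Start from k distinct data points as initial centers.
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for it in range(n_iter):
        # Squared Euclidean distance from every point to every center: shape (n, k).
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if it > 0 and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels

# Toy data: two well-separated blobs in 5 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 0.1, (20, 5)), rng.normal(5.0, 0.1, (20, 5))])
pseudo_labels = kmeans_pseudo_labels(X, k=2)
print(pseudo_labels)
```

The resulting integer labels would then play the same role as ground-truth labels in the label-based cost, so no manual annotation is required.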

Performance

The graph below shows SSNP's performance compared to that of NNP, t-SNE, autoencoders, and UMAP. Our method is as fast as NNP and autoencoders, but delivers higher quality (see the previous image), and is orders of magnitude faster than t-SNE and UMAP.

Implementation

SSNP is implemented in Python. The full source code is available here.

Publications

Self-Supervised Dimensionality Reduction with Neural Networks and Pseudo-labeling. M. Espadoto, N. Hirata, A. Telea (2023). Proc. IVAPP.

Improving Self-Supervised Dimensionality Reduction: Exploring Hyperparameters and Pseudo-labeling Strategies. A. Oliveira, M. Espadoto, R. Hirata, N. Hirata, A. Telea (2023). Springer CCIS 1691, 135-161.