Performance Analysis of Semi-Supervised Learning Methods under Different Missing Label Patterns

F. Ilhan and E. Mumcuoglu, “Performance Analysis of Semi-Supervised Learning Methods under Different Missing Label Patterns”, 28th IEEE Signal Processing and Communications Applications, 2020.

IEEEXplore [code]

Abstract

In this study, we analyze the performance of semi-supervised learning methods under different missing label patterns and missing label proportions. Some semi-supervised learning methods make assumptions about the missingness mechanism or the data characteristics in order to improve on supervised techniques, while other works do not consider the underlying pattern of missing labels at all. To investigate the behavior of these methods and verify their assumptions, we construct partially labeled datasets by simulating different missingness patterns over fully labeled datasets. We analyze the performance of support vector machines with self-training (SVM-ST) and Gaussian mixture models with semi-supervised expectation maximization (GMMSSEM), and we compare these methods with their supervised counterparts. Results show that missing label patterns and proportions have significant effects on performance.
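
As a rough illustration of how a partially labeled dataset can be simulated from a fully labeled one, the sketch below hides labels under two patterns. It is not the code used in the paper. The function name `mask_labels`, the `UNLABELED = -1` marker, and the linear per-class weighting in the `class_dependent` pattern are all assumptions made for this example. The abstract does not specify which missingness patterns were studied.

```python
import random
from collections import Counter

UNLABELED = -1  # hypothetical marker for a hidden label


def mask_labels(labels, missing_rate, pattern="mcar", seed=0):
    """Return a copy of `labels` with about `missing_rate` of them hidden.

    pattern="mcar": every label is hidden with the same probability.
    pattern="class_dependent": the hiding probability grows with the class
    rank (an illustrative non-random pattern, assumed for this sketch).
    """
    if not labels:
        return []
    rng = random.Random(seed)
    classes = sorted(set(labels))
    k = len(classes)
    if pattern == "mcar":
        weight = {c: 1.0 for c in classes}
    elif pattern == "class_dependent":
        # Weights ramp linearly from 0.5 (first class) to 1.5 (last class).
        weight = {c: 0.5 + (i / (k - 1) if k > 1 else 0.5)
                  for i, c in enumerate(classes)}
    else:
        raise ValueError(f"unknown pattern: {pattern!r}")
    # Rescale so the average hiding probability matches missing_rate
    # (only approximately if a probability is capped at 1.0).
    mean_w = sum(weight[y] for y in labels) / len(labels)
    prob = {c: min(1.0, missing_rate * weight[c] / mean_w) for c in classes}
    return [UNLABELED if rng.random() < prob[y] else y for y in labels]


# Usage: compare the per-class fraction of hidden labels under each pattern.
labels = [0] * 5000 + [1] * 5000
for pattern in ("mcar", "class_dependent"):
    partial = mask_labels(labels, 0.4, pattern)
    hidden = Counter(y for y, p in zip(labels, partial) if p == UNLABELED)
    print(pattern, {c: hidden[c] / 5000 for c in (0, 1)})
```

Under the uniform pattern both classes lose labels at about the same rate. Under the class-dependent pattern the overall proportion is similar but one class is left with far fewer labeled examples, which is the kind of difference a semi-supervised method may be sensitive to.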