Journal of Engineering and Applied Sciences

Year: 2018
Volume: 13
Issue: 21
Page No. 9065 - 9077

Improved Performance of Support Vector Machine for Imbalanced Data Sets Using Oversampling and Optimization

Authors: Sana Saeed and Hong Choon Ong

Abstract: Classification of imbalanced data sets, particularly in the presence of noise, is a significant problem in machine learning and data mining. Support Vector Machine (SVM) is one of the most renowned supervised classification algorithms. However, its performance is limited on imbalanced data sets. To improve the performance of SVM on imbalanced data sets, including noisy borderline and real data sets, a methodology based on oversampling and an optimization algorithm is proposed for two-class classification problems. The performance of SVM is improved by generating synthetic samples in the minority class and by searching for the best values of the SVM parameters through minimization of an objective function. To confirm the validity of the proposed methodology, an experimental study was conducted on noisy borderline and real imbalanced data sets. SVM was applied to all the data sets using the proposed methodology, two optimization algorithms and one oversampling algorithm. The performance of SVM with each method was evaluated using sensitivity, G-mean and F-measure. A significantly improved performance of SVM was observed with the proposed methodology.
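
The three evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the common textbook definitions, with the minority class treated as the positive class: sensitivity = TP / (TP + FN), G-mean = sqrt(sensitivity × specificity), and F-measure as the harmonic mean of precision and sensitivity. The abstract does not state the exact formulas used in the article, so the function name `imbalance_metrics` and these definitions are illustrative assumptions rather than the authors' implementation.

```python
import math


def imbalance_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, G-mean and F-measure for a two-class problem.

    Assumption: `positive` marks the minority class, and the measures
    follow their usual textbook definitions (not confirmed by the abstract).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    # Guard every ratio against an empty denominator.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0

    g_mean = math.sqrt(sensitivity * specificity)
    denom = precision + sensitivity
    f_measure = 2 * precision * sensitivity / denom if denom else 0.0

    return {"sensitivity": sensitivity, "g_mean": g_mean, "f_measure": f_measure}


# Hypothetical labels: 4 minority (1) and 6 majority (0) samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
scores = imbalance_metrics(y_true, y_pred)
```

Unlike plain accuracy, all three measures depend on how well the minority class is recognized, which is presumably why they suit imbalanced data: a classifier that predicts only the majority class scores zero on each of them.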

How to cite this article:

Sana Saeed and Hong Choon Ong, 2018. Improved Performance of Support Vector Machine for Imbalanced Data Sets Using Oversampling and Optimization. Journal of Engineering and Applied Sciences, 13: 9065-9077.
