Improving Panic Disorder Classification Using SMOTE and Random Forest
Abstract
Panic disorder is a serious anxiety disorder that can significantly impair an individual's mental health. Left undetected, it can disrupt daily life, social relationships, and overall quality of life, so early detection and intervention are essential for managing the disorder and improving the well-being of those affected. Data-driven approaches support early detection by using algorithms to identify patterns of behavior or symptoms associated with panic disorder, and accurate classification is a prerequisite for effective diagnosis and treatment. However, machine learning models trained on imbalanced datasets, such as panic disorder datasets, are prone to overfitting and therefore generalize poorly. This study investigates the effectiveness of the Synthetic Minority Oversampling Technique (SMOTE) in addressing overfitting when a panic disorder dataset is classified with the Random Forest algorithm. The results show that SMOTE substantially improves the classification performance of Random Forest: accuracy rises from 82% without SMOTE to 97% with SMOTE, a gain of 15 percentage points, which the study attributes to reduced overfitting and better generalization to unseen data. The findings underscore the promise of SMOTE as a tool for boosting the performance of machine learning algorithms in classifying panic disorder from imbalanced data.
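To make the oversampling idea concrete, the following is a minimal sketch of SMOTE-style interpolation in plain Python. It is an illustration under stated assumptions, not the study's implementation: the function name `smote`, its parameters, and the toy data are hypothetical, and the base sample is drawn at random rather than iterating over every minority sample as in the original formulation. Each synthetic point is placed at a random position on the line segment between a minority sample and one of its k nearest minority neighbours.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by SMOTE-style interpolation.

    minority: list of feature vectors (lists of floats) from the minority class.
    k: number of nearest minority neighbours considered for each base sample.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        # Pick a random minority sample as the base point (a simplification).
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        # Choose one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Interpolate: the new point lies on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 3 minority samples against 8 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1]]
majority_count = 8
new_points = smote(minority, n_new=majority_count - len(minority), k=2)
balanced_minority = minority + new_points
```

After the call, `balanced_minority` holds as many samples as the majority class. In practice one would more likely rely on a library implementation, for example `SMOTE` from imbalanced-learn combined with scikit-learn's `RandomForestClassifier`; the abstract does not state which tooling the study used, so that pairing is an assumption.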
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who have published articles in JAIC agree that:
1. The article has not been, and has never been, previously published in another scientific journal, proceedings, or other electronic journal.
2. Full rights to the submitted article are transferred to the management of JAIC Politeknik Negeri Batam.
3. The article may be shared with the public to increase references to and citations of the published manuscript.