Abstract
Random Forest is a classification method widely used in data mining. A common problem in data mining classification is class imbalance, which arises when the classes in the data do not contain equal numbers of instances. Imbalanced class data biases the classification results towards the majority class. Adaptive Synthetic Sampling (ADASYN) can be used to deal with this problem. ADASYN assigns different weights to minority class samples and then generates synthetic data with similar characteristics. ADASYN is suitable for fishery production data, which is prone to class imbalance. Fish production is part of the measured fishery. This study aims to classify the value of measured fishery production at PPN Pemangkat using Random Forest classification with ADASYN to handle the class imbalance, and to compare the results with those obtained without ADASYN. The study uses four predictor variables: fishing gear type, number of trip days, number of crew, and total weight of fish, with production value as the response variable. Model performance is measured by accuracy, precision, recall, specificity, and G-mean. The results show that ADASYN successfully handles the class imbalance problem in Random Forest classification: accuracy, specificity, precision, and G-mean all increase when ADASYN is applied. The decrease in recall is small enough to be negligible, so Random Forest classification with ADASYN performs better than Random Forest classification without it.
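The five performance indicators named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch, assuming a two-class problem and the usual definition of G-mean as the geometric mean of recall and specificity; the study's own definitions may differ. The function name `binary_metrics` and the counts passed to it are hypothetical and are not results from the study.

```python
from math import sqrt


def binary_metrics(tp, fp, tn, fn):
    """Return the five indicators from binary confusion-matrix counts.

    tp/fn count positive-class instances, tn/fp count negative-class ones.
    Which class is treated as positive is an assumption of this sketch.
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    # Guard each ratio against an empty denominator.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    # G-mean is high only when both classes are recognised well.
    g_mean = sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "g_mean": g_mean,
    }


# Hypothetical counts for a classifier biased towards the positive class:
# 100 positive instances (90 found) and 40 negative instances (10 found).
example = binary_metrics(tp=90, fp=30, tn=10, fn=10)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

With counts like these, recall is high while specificity is low. G-mean makes that gap visible because it multiplies the two rates, whereas accuracy is dominated by the larger class.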
Article Details
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
References
- Ali, J., Khan, R., Ahmad, N., & Maqsood, I. (2012). Random Forests and Decision Trees. IJCSI International Journal of Computer Science Issues, 9(5), 272–278.
- Aqsha, M., Thamrin, S., & Lawi, A. (2021). Combination of ADASYN-N and Random Forest in Predicting of Obesity Status in Indonesia: A Case Study of Indonesian Basic Health Research 2013. Journal of Physics: Conference Series, 2123(1), 012039. https://doi.org/10.1088/1742-6596/2123/1/012039
- Breiman, L. (2001). Random Forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324
- Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16, 321–357.
- Chen, Z., Zhou, L., & Yu, W. (2021). ADASYN−Random Forest Based Intrusion Detection Model. 2021 4th International Conference on Signal Processing and Machine Learning, 152–159. https://doi.org/10.1145/3483207.3483232
- Genuer, R., & Poggi, J.-M. (2020). Random Forests. In Random Forests with R (Use R!). Springer, Cham. https://doi.org/10.1007/978-3-030-56485-8_3
- Irnawati, R., Simbolon, D., Wiryawan, B., Murdianto, B., & Nurani, T. W. (2011). Leading commodity analysis of capture fisheries in Karimunjawa National Park. Jurnal Perikanan dan Kelautan, 1(1), 11–17.
- Jatmiko, Y. A., Padmadisastra, S., & Chadidjah, A. (2020). Analisis Perbandingan Kinerja Cart Konvensional, Bagging Dan Random Forest Pada Klasifikasi Objek: Hasil Dari Dua Simulasi. Media Statistika, 12(1), 1–12.
- Lee, T.-H., Ullah, A., & Wang, R. (2020). Bootstrap Aggregating and Random Forest. In P. Fuleky (Ed.), Macroeconomic Forecasting in the Era of Big Data, 52, 389–429.
- Safitri, I., & Magdalena, W. (2018). Perikanan Tangkap Purse Seine di Pelabuhan Perikanan Nusantara (PPN) Pemangkat Kalimantan Barat. Jurnal Laut Khatulistiwa, 1(3), 89–96.
- Syukron, M., Santoso, R., & Widiharih, T. (2020). Perbandingan Metode Smote Random Forest dan Smote Xgboost untuk Klasifikasi Tingkat Penyakit Hepatitis C pada Imbalance Class Data. Jurnal Gaussian, 9(3), 227–236.