DDOS attack detection with data imperfections using machine learning algorithms

Artem Dremov; Artem Volokyta

doi:10.20535/2786-8729.7.2025.334076

Authors

Artem Dremov National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine https://orcid.org/0009-0005-7214-9458
Artem Volokyta National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv, Ukraine https://orcid.org/0000-0001-9069-5544

DOI:

https://doi.org/10.20535/2786-8729.7.2025.334076

Keywords:

Machine Learning, DDoS, network security, network traffic analysis, data resampling

Abstract

DDoS (Distributed Denial of Service) attacks have remained prevalent in recent years. Modern network environments are highly dynamic and carry large volumes of traffic. Existing research covers several models, techniques and approaches for detecting DDoS traffic, which aim to optimize detection on controlled datasets. However, unintentional noise or data corruption may lower the efficacy of such methods. Determining the most effective ways to detect DDoS traffic under data imperfections is therefore necessary for reliable network performance.

Therefore, the object of this research is the use of machine learning algorithms for the detection of incoming DDoS attacks. The purpose of this research is to evaluate, by detection accuracy, how well machine learning algorithms detect incoming DDoS attacks under simulated imperfect data conditions. The study also examines the impact of class rebalancing on the modified data. To this end, a variety of machine learning algorithms were implemented and tested on the CIC-DDoS2019 dataset. The data is modified by removing values and introducing noise, and the models are tested; the classes are then resampled and the models are tested again. The goal is to achieve over 90% accuracy in classifying the type of DDoS attack and to determine how much the changes affect the performance of the algorithms.
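As an illustration of the data modification step described above, the following is a minimal sketch, not the authors' implementation. It assumes values are removed completely at random, that the noise is Gaussian and scaled to each feature's standard deviation, and that missing entries are later filled with the column median. The function names `degrade` and `impute_median` and the parameters `missing_rate` and `noise_std` are hypothetical.

```python
import numpy as np


def degrade(X, missing_rate=0.1, noise_std=0.05, seed=0):
    """Return a noisy copy of X with some entries replaced by NaN.

    Assumptions (not taken from the paper): Gaussian noise proportional
    to each feature's standard deviation, and values removed completely
    at random with probability `missing_rate`.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float).copy()

    # Additive noise, scaled per feature so wide-range columns are not drowned out.
    scale = X.std(axis=0, keepdims=True)
    X += rng.normal(0.0, 1.0, size=X.shape) * noise_std * scale

    # Simulate lost or corrupted measurements by blanking random cells.
    mask = rng.random(X.shape) < missing_rate
    X[mask] = np.nan
    return X, mask


def impute_median(X):
    """Fill NaN entries with the median of the observed values in each column."""
    medians = np.nanmedian(X, axis=0)
    return np.where(np.isnan(X), medians, X)
```

A classifier would then be trained and evaluated on `impute_median(degrade(X)[0])` and its accuracy compared with the accuracy obtained on the unmodified `X`.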

The results indicate that several solutions reach the target mark and that changes to the dataset under realistic conditions do not significantly affect the final result. However, all tested models show a decrease in accuracy compared to unmodified data, with more complex models showing higher resilience (a smaller decrease in accuracy). In addition, resampling the data leads to a comparable decrease in accuracy, with more complex models again being affected less.
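The abstract does not name the resampling method used for class rebalancing. As a hedged illustration only, the sketch below uses plain random oversampling, which duplicates minority-class rows until every class matches the size of the largest one. The function name `random_oversample` is hypothetical, and the technique is a stand-in rather than the method used in the study.

```python
import numpy as np


def random_oversample(X, y, seed=0):
    """Balance classes by resampling minority rows with replacement.

    Illustrative stand-in: the resampling method actually used in the
    study is not stated in the abstract.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)

    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()  # every class is grown to the majority size

    parts = []
    for cls, n in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        # Draw the missing number of rows from this class, with replacement.
        extra = rng.choice(idx, size=target - n, replace=True)
        parts.append(np.concatenate([idx, extra]))

    # Shuffle so that classes are not grouped in contiguous blocks.
    order = rng.permutation(np.concatenate(parts))
    return X[order], y[order]
```

In practice, resampling of this kind is applied to the training split only, so that the test split keeps the original class distribution.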

The results of this study may be used to develop an algorithm for repairing corrupted data or to develop models that are more resistant to such data changes. They may also inform the choice of models for practical implementations of a DDoS traffic classification system.

Author Biographies

Artem Dremov, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

PhD student at the Computer Engineering Department of the Faculty of Informatics and Computer Technique

Artem Volokyta, National Technical University of Ukraine “Igor Sikorsky Kyiv Polytechnic Institute”, Kyiv

Associate Professor at the Computer Engineering Department of the Faculty of Informatics and Computer Technique, Candidate of Technical Sciences, Associate Professor

