TriBoost: Cyberthreat Profiling in Networks using Machine Learning
2025 (English) Independent thesis Advanced level (degree of Master (Two Years)), 10 credits / 15 HE credits
Student thesis
Abstract [en]
The cyberthreat landscape has grown significantly, prompting societies to strengthen their cybersecurity postures and protect their assets. Continuous digital transformation has considerably expanded the interconnectedness of IT networks, substantially increasing the scale and volume of network communications. Because evolving cyberthreats are increasingly sophisticated, traditional Intrusion Detection System (IDS) applications may not provide effective cybersecurity. Machine-learning-based IDS applications have been shown to perform more effectively than traditional ones. However, they can still produce false detections, causing false alerts, halting legitimate traffic, or failing to detect malicious activity. Evaluating the performance of machine learning models therefore helps identify practical candidates and supports the development of more reliable, scalable, and effective IDS applications. Random Forest is a classic machine learning model that has demonstrated robust performance in IDS applications. This study designed an experimental research method, based on the general research design applied in a wide range of machine learning studies, to evaluate and compare the performance and time cost of two state-of-the-art models against Random Forest: TabNet, a deep learning model, and eXtreme Gradient Boosting (XGBoost), a gradient boosting model. Performance was measured using Accuracy, F1, Recall, and Precision. The models performed multiclass classification of ten types of network traffic activity using a publicly available network traffic dataset. The study employed a controlled environment to ensure consistent conditions across the models, thereby enhancing the reliability of the findings and supporting replication and reproducibility.
The findings show that the models struggled with the minority classes in the dataset. The dataset is significantly imbalanced, and some classes were insufficiently represented, which could have impaired the models’ handling of the minority classes. Nevertheless, the findings suggest that XGBoost delivered promising performance at an efficient time cost, outperforming a robust model such as Random Forest, which makes XGBoost a potential candidate for IDS applications.
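To illustrate why class imbalance matters for the four metrics named above, the following is a minimal sketch, not the thesis's own code. It computes Accuracy and macro-averaged Precision, Recall, and F1 on a small made-up label set. The class names (`benign`, `dos`, `worm`), the label counts, and the choice of macro averaging are all assumptions for illustration; the abstract does not state which averaging scheme the study used, and the data here is not drawn from the thesis dataset.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    metrics = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = (precision, recall, f1)
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_average(metrics):
    """Unweighted mean of (precision, recall, f1) across classes."""
    n = len(metrics)
    return tuple(sum(m[i] for m in metrics.values()) / n for i in range(3))


# Hypothetical, heavily imbalanced toy labels (NOT the thesis dataset).
y_true = ["benign"] * 95 + ["dos"] * 4 + ["worm"] * 1
# Hypothetical predictions: every benign sample is correct, half of the
# "dos" samples and the single "worm" sample are misread as benign.
y_pred = ["benign"] * 95 + ["dos"] * 2 + ["benign"] * 2 + ["benign"] * 1

acc = accuracy(y_true, y_pred)
per_class = per_class_metrics(y_true, y_pred)
macro_precision, macro_recall, macro_f1 = macro_average(per_class)

print(f"accuracy={acc:.3f}")
print(f"macro precision={macro_precision:.3f} "
      f"recall={macro_recall:.3f} f1={macro_f1:.3f}")
```

On this toy data, accuracy is 0.97 while macro-averaged recall is only 0.5, because the two minority classes count as much as the majority class in the macro average. This is one reason a study of this kind would report F1, Recall, and Precision alongside Accuracy.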
Place, publisher, year, edition, pages
2025. p. iii, 44, iii
Keywords [en]
Cybersecurity, Intrusion Detection System, Machine Learning
National Category
Computer Sciences
Identifiers
URN: urn:nbn:se:his:diva-25264
OAI: oai:DiVA.org:his-25264
DiVA, id: diva2:1972395
Subject / course
Information Technology
Educational program
Privacy, Information and Cyber Security - Master's Programme 120 ECTS
Supervisors
Examiners
2025-06-18 2025-06-18 2025-09-29 Bibliographically approved