Dynamic, data-driven vulnerability assessment must cope with the massive, heterogeneous data held in and produced by Security Operations Centres (SOCs). Manual vulnerability assessment practices yield inaccurate data and demand complex analytical reasoning. The diversity, incompleteness and redundancy of contemporary security repositories compound these problems. Such issues are typical of public and manufacturer vulnerability reports, and they complicate direct analysis aimed at rooting out security deficiencies. Recent advances in machine learning promise novel approaches to overcoming these diversity and incompleteness issues across rapidly growing corpora of vulnerability reports. Yet these techniques themselves perform unevenly because of their diverse methods. We propose a cognitive cybersecurity approach that empowers human cognitive capital along two dimensions. First, we resolve conflicting vulnerability reports and preprocess the embedded security indicators into reliable data sets. Then, we use these data sets as the basis for our proposed ensemble meta-classifier methods, which fuse machine learning techniques to improve predictive accuracy over individual machine learning algorithms. The application and implications of this methodology for the vulnerability analysis of computer systems have yet to be fully explored. The cognitive security methodology proposed in this paper is shown to improve performance when addressing the above-mentioned incompleteness and diversity issues across cybersecurity alert repositories. Experimental analysis on actual cybersecurity data sources reveals interesting trade-offs in our proposed selective ensemble methodology for inferring patterns of computer system vulnerabilities.
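The two steps outlined above (reconciling conflicting reports, then fusing several classifiers) might be sketched as follows. This is only an illustrative sketch under stated assumptions, not the method evaluated in this paper: per-field majority voting stands in for conflict resolution, a validation-accuracy threshold stands in for selective ensemble membership, and plain majority voting stands in for the meta-classifier. All function names, field names (`vector`, `auth`), labels, and the toy records are hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

Report = Dict[str, str]
Classifier = Callable[[Report], str]


def resolve_conflicts(reports: Sequence[Report]) -> Report:
    """Merge reports about one vulnerability by per-field majority vote."""
    merged: Report = {}
    fields = {key for report in reports for key in report}
    for field in sorted(fields):
        # Ignore missing or empty values (incomplete reports).
        values = [r[field] for r in reports if r.get(field)]
        if values:
            merged[field] = Counter(values).most_common(1)[0][0]
    return merged


def accuracy(clf: Classifier, samples: Sequence[Report],
             labels: Sequence[str]) -> float:
    """Fraction of validation samples the classifier labels correctly."""
    hits = sum(clf(s) == y for s, y in zip(samples, labels))
    return hits / len(labels)


def select_members(candidates: Sequence[Classifier],
                   samples: Sequence[Report], labels: Sequence[str],
                   min_accuracy: float = 0.6) -> List[Classifier]:
    """Keep only base classifiers that reach the accuracy threshold."""
    return [c for c in candidates
            if accuracy(c, samples, labels) >= min_accuracy]


def ensemble_predict(members: Sequence[Classifier], sample: Report) -> str:
    """Fuse the members' predictions by simple majority vote."""
    votes = Counter(clf(sample) for clf in members)
    return votes.most_common(1)[0][0]


# Hypothetical rule-based stand-ins for trained base classifiers.
def by_vector(r: Report) -> str:
    return "high" if r.get("vector") == "network" else "low"


def by_auth(r: Report) -> str:
    return "high" if r.get("auth") == "none" else "low"


def by_either(r: Report) -> str:
    risky = r.get("vector") == "network" or r.get("auth") == "none"
    return "high" if risky else "low"


def always_low(r: Report) -> str:
    return "low"


# Toy, made-up records: three conflicting reports for one vulnerability.
conflicting = [
    {"vector": "network", "auth": "none"},
    {"vector": "network", "auth": ""},
    {"vector": "local", "auth": "none"},
]
merged = resolve_conflicts(conflicting)

# Toy validation set used only to select ensemble members.
val_samples = [
    {"vector": "network", "auth": "none"},
    {"vector": "local", "auth": "single"},
    {"vector": "network", "auth": "single"},
    {"vector": "local", "auth": "none"},
]
val_labels = ["high", "low", "high", "high"]

members = select_members(
    [by_vector, by_auth, by_either, always_low], val_samples, val_labels
)
prediction = ensemble_predict(members, merged)
```

In this toy setting the weakest candidate (`always_low`) falls below the threshold and is excluded, and the remaining members vote on the reconciled record. A real system would replace the rule-based stand-ins with trained models and could replace the majority vote with a learned meta-classifier.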
CC BY 4.0
Available online 4 September 2021, 103210
This research has been supported in part by EU ISF (Internal Security Fund) in the context of Project Grant # A431.678/2016.