Predicting Cyber Vulnerability Exploits with Machine Learning

Michel Edkrantz and Alan Said. 2015. "Predicting Cyber Vulnerability Exploits with Machine Learning." Thirteenth Scandinavian Conference on Artificial Intelligence.


For an information security manager, keeping up with newly disclosed cyber vulnerabilities and deciding which to patch first can be a daunting task. Every day, numerous new vulnerabilities and exploits are reported for a wide variety of software configurations. We use machine learning to automatically predict whether previously unseen vulnerabilities will be exploited, based on previous exploit patterns. As sources of historical vulnerability data, we use the National Vulnerability Database (NVD) and the Exploit Database (EDB). Our work shows that common words from the vulnerability descriptions, external references, and vendor products are the most important features to consider. Common Vulnerability Scoring System (CVSS) scores and categorical parameters, as well as Common Weakness Enumeration (CWE) numbers, are redundant when a large number of common words are used, since this information is often contained within the vulnerability description. Using machine learning algorithms, it is possible to achieve a prediction accuracy of 83% for binary classification. The performance differences between some of the algorithms are marginal with respect to metrics such as accuracy, precision, and recall. The best classifier with respect to both performance metrics and execution time is a linear-time Support Vector Machine (SVM) algorithm. We conclude that better predictions require higher-quality data.
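
The approach summarized above (common-word features fed to a linear SVM for binary exploited / not-exploited classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the four descriptions and their labels are invented toy data rather than NVD or EDB records, the function names and hyperparameters are hypothetical, and the trainer is plain subgradient descent on the hinge loss, swapped in for whichever linear-time SVM solver the paper used.

```python
from collections import Counter

# Invented toy data (NOT real NVD/EDB entries): +1 = exploited, -1 = not exploited.
DESCRIPTIONS = [
    "sql injection in login php allows remote attackers to execute arbitrary sql commands",
    "buffer overflow in ftp server allows remote attackers to execute arbitrary code",
    "unspecified vulnerability in vendor component has unknown impact",
    "unspecified vulnerability in database component affects confidentiality via unknown vectors",
]
LABELS = [1, 1, -1, -1]


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def build_vocab(docs, max_words):
    """Map the max_words most common words (by document frequency) to feature indices."""
    doc_freq = Counter(word for doc in docs for word in set(tokenize(doc)))
    return {word: i for i, (word, _) in enumerate(doc_freq.most_common(max_words))}


def vectorize(doc, vocab):
    """Binary bag-of-words: the set of feature indices present in the document."""
    return {vocab[word] for word in tokenize(doc) if word in vocab}


def train_linear_svm(X, y, n_features, epochs=50, lam=0.01, lr=0.1):
    """Subgradient descent on the L2-regularized hinge loss (hypothetical settings)."""
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(w[j] for j in xi) + b)
            # The L2 penalty shrinks every weight a little on each step.
            w = [wj * (1.0 - lr * lam) for wj in w]
            # The hinge loss only contributes when the margin is violated.
            if margin < 1.0:
                for j in xi:
                    w[j] += lr * yi
                b += lr * yi
    return w, b


def predict(x, w, b):
    """Return +1 (predicted exploited) or -1 (predicted not exploited)."""
    return 1 if sum(w[j] for j in x) + b > 0 else -1


vocab = build_vocab(DESCRIPTIONS, max_words=50)
X = [vectorize(doc, vocab) for doc in DESCRIPTIONS]
w, b = train_linear_svm(X, LABELS, n_features=len(vocab))

new_description = "remote attackers can execute arbitrary code via sql injection"
print(predict(vectorize(new_description, vocab), w, b))
```

Binary word-presence features keep each training step proportional to the number of words in a description, which is in the spirit of the linear-time behavior the abstract highlights. A real experiment would need far more data, a held-out test set, and a tuned regularization strength.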
