Improving Malware Detection in Cybersecurity
Opening up the black box to diagnose problems and foster collaboration

An AI revolution in cybersecurity?
Machine learning has enabled a new paradigm for real-time malware detection, in which predictive models assess running programs for abnormal behavior. This approach has the potential to block never-before-seen malware.
Many cybersecurity vendors have been eager to deploy these AI-based protection algorithms without fully understanding how they behave and, in particular, how they can be exploited.
Perils of black-box modeling
Detection performance was highly variable: it was strong on average, but on some days the ability to protect against the newest threats was significantly lower.
Neither the data scientists nor the security researchers were able to understand the black-box models or, more importantly, to identify where they were failing.

Diagnosis using interpretable models
- Auditable for security flaws: The model can be manually audited by security researchers to ensure there are no exploitable weak points.
- Identification of problem areas: Looking at the leaves of the tree where performance is weakest shows where the model is most uncertain.
Example decision tree predicting attack probability. The tree highlights problem areas, such as the paths in red, where the model is most uncertain.
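The leaf-level diagnosis described above can be illustrated with a minimal Python sketch. Everything in it is hypothetical: the tree, the feature names (`api_calls_per_sec`, `entropy`), the thresholds, and the eight labeled samples are invented for illustration, and the uncertainty score (0 for a pure leaf, 1 for an even split) is just one reasonable choice, not necessarily the measure used in this case study.

```python
from collections import defaultdict

# Hypothetical hand-written tree over two made-up behavioral features.
# Internal node: (feature, threshold, left_subtree, right_subtree).
# Leaf: a string naming the leaf.
TREE = ("api_calls_per_sec", 50.0,
        ("entropy", 7.0, "leaf_A", "leaf_B"),
        ("entropy", 6.0, "leaf_C", "leaf_D"))

# Illustrative samples and labels (1 = malware, 0 = benign); not real data.
SAMPLES = [
    {"api_calls_per_sec": 10, "entropy": 3.0},
    {"api_calls_per_sec": 20, "entropy": 4.0},
    {"api_calls_per_sec": 30, "entropy": 7.5},
    {"api_calls_per_sec": 40, "entropy": 7.9},
    {"api_calls_per_sec": 60, "entropy": 5.0},
    {"api_calls_per_sec": 70, "entropy": 5.5},
    {"api_calls_per_sec": 80, "entropy": 6.5},
    {"api_calls_per_sec": 90, "entropy": 7.0},
]
LABELS = [0, 0, 1, 0, 0, 0, 1, 1]


def find_leaf(node, sample):
    """Route a sample down the tree and return the name of its leaf."""
    while not isinstance(node, str):
        feature, threshold, left, right = node
        node = left if sample[feature] < threshold else right
    return node


def leaf_report(tree, samples, labels):
    """Return {leaf: (n_samples, malware_rate, uncertainty)}.

    Uncertainty is 0.0 for a pure leaf and 1.0 for a 50/50 split.
    """
    counts = defaultdict(lambda: [0, 0])  # leaf -> [n_samples, n_malware]
    for sample, label in zip(samples, labels):
        entry = counts[find_leaf(tree, sample)]
        entry[0] += 1
        entry[1] += label
    report = {}
    for leaf, (n, n_malware) in counts.items():
        rate = n_malware / n
        report[leaf] = (n, rate, 1.0 - abs(2.0 * rate - 1.0))
    return report


def weakest_leaves(report, min_samples=1):
    """List leaves from most to least uncertain, skipping tiny leaves."""
    eligible = [(leaf, stats) for leaf, stats in report.items()
                if stats[0] >= min_samples]
    return sorted(eligible, key=lambda item: item[1][2], reverse=True)


REPORT = leaf_report(TREE, SAMPLES, LABELS)
for leaf, (n, rate, uncertainty) in weakest_leaves(REPORT):
    print(f"{leaf}: n={n}, malware rate={rate:.2f}, uncertainty={uncertainty:.2f}")
```

In this toy data the leaf with an even mix of malware and benign samples sorts to the top, which is the kind of region a reviewer would inspect first.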
Unlocking a culture of collaboration
Security researchers were able to analyze the samples in the weak areas of the tree and derive new features that help the model better detect malware.
The security researchers also became confident in the logic of the model and placed more trust in its predictions.
These improvements helped stabilize the model performance and ensure consistently strong protection.

Unique Advantage
Why is the Interpretable AI solution unique?
- Enables collaboration with domain experts: The interpretability of the model allows security researchers to collaborate with data scientists to improve the prediction quality.
- Auditable, robust and trusted: Security researchers can easily validate the logic of the model, minimizing the risk of exploitable flaws in the protection engine.
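One way to picture what validating the model's logic can look like in practice is to flatten a decision tree into plain if/then rules that a reviewer can read line by line. The Python sketch below assumes a hypothetical tree: the feature names (`writes_to_system_dir`, `outbound_connections`, `spawns_shell`), thresholds, and probabilities are invented for illustration and do not come from the model in this case study.

```python
# Hypothetical tree: internal nodes are (feature, threshold, left, right);
# leaves are floats giving an illustrative attack probability.
TREE = ("writes_to_system_dir", 0.5,
        ("outbound_connections", 20.0, 0.02, 0.35),
        ("spawns_shell", 0.5, 0.60, 0.97))


def extract_rules(node, conditions=()):
    """Yield (conditions, probability) for every root-to-leaf path."""
    if isinstance(node, float):
        yield list(conditions), node
        return
    feature, threshold, left, right = node
    # Left branch: feature value below the threshold.
    yield from extract_rules(left, conditions + (f"{feature} < {threshold}",))
    # Right branch: feature value at or above the threshold.
    yield from extract_rules(right, conditions + (f"{feature} >= {threshold}",))


def format_rules(tree):
    """Render each path as one human-readable rule string."""
    return [
        f"IF {' AND '.join(conds)} THEN attack probability = {prob:.2f}"
        for conds, prob in extract_rules(tree)
    ]


RULES = format_rules(TREE)
for rule in RULES:
    print(rule)
```

Because every prediction corresponds to exactly one of these rules, a reviewer can check each rule against domain knowledge and flag any path that looks easy for an attacker to satisfy.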