Machine Learning & Cybersecurity: Securing ML in an Adversarial Environment

This is the second in a multipart series that explores machine learning (ML) within cybersecurity. The first post in this series provided a brief background to set the stage. In this second article, I look at how ML models and data might be attacked and how we can make them more secure. Stay tuned for the third part, in which I will discuss ethical considerations.

Machine learning (ML) for cybersecurity offers tremendous benefits and has become a vital component in many security solutions. However, there are also many risks that security professionals must understand when deploying ML as a component of a cybersecurity solution. This article examines possible vulnerabilities and risks, including attacks on the ML process, data, and models. It then proposes adversarial ML as a way to test and protect the vital ML models used for cybersecurity.

Classical ML does not account for an adversary who deliberately tries to mislead it. Traditionally, ML focuses on uncovering knowledge and discovering relationships within the data and assumes a non-adversarial environment. For most traditional ML problems, such an approach is acceptable and efficient. (Refer to the previous paper, Machine Learning Overview for Cybersecurity Professionals, for background on ML and its uses within cybersecurity.)

Figure 1 – Traditional ML process in a non-adversarial environment.

Unlike many other applications of ML, cybersecurity is an adversarial environment. Adversaries seek to exploit ML vulnerabilities to disrupt ML systems. When ML is used in an adversarial environment, it must be designed and built with the assumption that it will be attacked in all phases. Figure 2 depicts the same ML process within the context of an adversarial environment. The following section will discuss the various types of attacks shown in Figure 2.

Figure 2 – The ML process within an adversarial context, such as cybersecurity.

Attacks on Machine Learning

Though ML has had many successful real-world applications, including in cybersecurity, its methods are vulnerable. Attacks on ML can be categorized based on the phase (training, testing, or inference). These ML attacks can also be classified as causative or exploratory. Causative attacks, which typically occur in the training stage, alter the training data or model parameters. An example of a causative attack is injecting adversarial samples into the training data. In contrast, an exploratory attack does not tamper with the training data or the model. Instead, during the inference stage, an exploratory attack collects information about the data and the model, possibly to evade classification later.

Training and Testing Phase Attacks

Poisoning attacks are causative attacks that alter the training data or model parameters. Since ML methods rely on the quality of the training data, they are vulnerable to training data manipulation. In poisoning attacks, attackers inject adversarial samples into the training data set or manipulate labels to impact the ML algorithm’s performance. The poisoning can also be either direct or indirect. Direct poisoning targets the training data set. Indirect poisoning injects data into the raw data before preprocessing and extracting the training data set.
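The effect of direct poisoning can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the one-feature data set, the nearest-centroid classifier, and the names train_centroids and classify are hypothetical and do not come from any particular product or library.

```python
from statistics import mean

def train_centroids(samples):
    """Return {label: mean feature value} for a list of (value, label) pairs."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}

def classify(centroids, value):
    """Assign the label whose centroid is closest to the value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))

# Hypothetical one-feature training set (for example, connections per second).
clean = [(1.0, "benign"), (2.0, "benign"), (3.0, "benign"),
         (8.0, "malicious"), (9.0, "malicious"), (10.0, "malicious")]

# Direct poisoning: the attacker injects high-valued samples labeled benign.
poison = [(9.0, "benign"), (10.0, "benign"), (11.0, "benign")]

clean_model = train_centroids(clean)
poisoned_model = train_centroids(clean + poison)

probe = 7.0  # a sample the clean model places closer to the malicious centroid
clean_verdict = classify(clean_model, probe)
poisoned_verdict = classify(poisoned_model, probe)
print(clean_verdict, poisoned_verdict)
```

In this toy setup, three mislabeled samples are enough to pull the benign centroid toward the malicious range, so the same probe receives a different verdict from the poisoned model. Real classifiers and real poisoning campaigns are far more subtle, but the mechanism is the same.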

Since the training data set is often well guarded, poisoning attacks against the original training data set may be difficult. However, in a changing environment, ML models may require retraining so they can adapt. Attackers seek to exploit the need for retraining by targeting the retraining stage of an ML model. For example, an ML model that seeks to determine anomalous network activity must periodically be retrained. An adversary could launch a poisoning attack during this retraining phase by injecting adversarial data.

Inference Phase Attacks

Exploratory attacks do not tamper with the training data or the model. Instead, an exploratory attack collects information about the training data and the model during the inference stage. The attacker could use reverse-engineering techniques to discover how the ML algorithms work, possibly duplicate the model, or recover training data. The attacker can then use the knowledge gained during an exploratory attack to launch an integrity attack.

Integrity attacks seek to evade detection by producing a false negative from a classifier. The adversary aims to produce a negative or benign result on an adversarial sample, thereby evading classification by the cybersecurity system. Such attacks often rely on exploratory attacks to understand how the classifier works. A related type of attack, often called an availability attack in the adversarial ML literature, is analogous to a denial of service. Such an attack causes the classifier to misclassify many benign samples, increasing false positives for the security team to evaluate.
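A rough sketch of an evasion attempt against a linear classifier follows. It assumes the attacker has already learned the weights and bias through an exploratory attack; the weights, the sample, and the function names are hypothetical, and real attacks must also keep the modified sample functional, which this sketch ignores.

```python
def score(weights, bias, features):
    """Linear decision score: positive means the sample is flagged."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def is_malicious(weights, bias, features):
    return score(weights, bias, features) > 0

def evade(weights, bias, features, step=0.1, max_steps=1000):
    """Nudge the features against the weight vector until the sample
    is no longer flagged. Returns None if the step limit is reached."""
    x = list(features)
    for _ in range(max_steps):
        if not is_malicious(weights, bias, x):
            return x
        x = [xi - step * w for xi, w in zip(x, weights)]
    return None

# Hypothetical model parameters recovered by the attacker.
weights, bias = [1.0, 2.0], -3.0
sample = [2.0, 2.0]

evasive = evade(weights, bias, sample)
print(is_malicious(weights, bias, sample), is_malicious(weights, bias, evasive))
```

The original sample is flagged, while the nudged copy produces the false negative that an integrity attack aims for.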

Output integrity attacks, which are analogous to man-in-the-middle (MiTM) attacks, do not attack the ML model directly. Instead, these attacks intercept the result from an ML classifier and change it. For example, with a malware classifier, the attacker could intercept the result and change that result from malicious to benign.
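This article does not prescribe a specific countermeasure for output tampering. One general-purpose option, shown here only as a sketch, is to have the classifier attach a keyed message authentication code (HMAC) to each verdict so the consumer can detect changes in transit. The key, the message layout, and the function names sign_verdict and verify_verdict are assumptions for illustration; the hmac, hashlib, and json modules are part of the Python standard library.

```python
import hashlib
import hmac
import json

# Hypothetical key; a real deployment would load it from a secrets store.
SECRET_KEY = b"replace-with-a-managed-secret"

def sign_verdict(sample_id, verdict, key=SECRET_KEY):
    """Serialize the verdict and attach an HMAC-SHA256 tag."""
    payload = json.dumps({"sample_id": sample_id, "verdict": verdict},
                         sort_keys=True)
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}

def verify_verdict(message, key=SECRET_KEY):
    """Return the decoded verdict, or None if the tag does not match."""
    expected = hmac.new(key, message["payload"].encode("utf-8"),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, message["tag"]):
        return None
    return json.loads(message["payload"])

message = sign_verdict("sample-001", "malicious")

# An interceptor rewrites the verdict but cannot recompute the tag.
tampered = dict(message,
                payload=message["payload"].replace("malicious", "benign"))

print(verify_verdict(message), verify_verdict(tampered))
```

The untouched message verifies and decodes normally, while the rewritten one is rejected. This protects only the path between classifier and consumer; it does nothing if the attacker controls the classifier host or the key.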

Defending ML

Fortunately, there are several steps we can take to help protect ML data and models. The foundational elements for securing ML include a mixture of fundamental security practices and methods specific to ML. This section looks at various aspects of securing ML when used in an adversarial environment, such as cybersecurity.

Sanity Check Training Data

Much of the effort in developing effective ML models goes into data collection and preparation. The learning is based on historical data, which defines the ground truth. The data’s quality, quantity, and relevance will affect the learning. Often, the data must be cleansed, and many of the decisions, such as how to deal with missing data, can greatly impact learning. Should the missing data be ignored, or should it be imputed? If imputed, by what means? Similar considerations must be given to outliers. Also, the data distribution can affect which ML methods to use.
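As one possible illustration of these decisions, the sketch below imputes missing values with the median and flags outliers using the modified z-score based on the median absolute deviation (MAD). Median imputation, the 3.5 cutoff, the sample values, and the function names are choices made for this example, not recommendations from the article; the right choices depend on the data.

```python
from statistics import median

def impute_missing(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]

def flag_outliers(values, threshold=3.5):
    """Return indices whose MAD-based modified z-score exceeds the threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Hypothetical feature column with one missing value and one extreme value.
raw = [10, 12, None, 11, 13, 500, 12]

cleaned = impute_missing(raw)
suspects = flag_outliers(cleaned)
print(cleaned, suspects)
```

Flagged values are candidates for review rather than automatic removal, since in cybersecurity data an extreme value may be exactly the event of interest.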

Training Data and Model Protection

Security fundamentals such as version control and access control are extremely important, especially for the training data and models. Protecting the data and models from unauthorized access is paramount to guard against poisoning attacks. Effective version control of the data and models will also allow for reverting to a known good state in the event of an attack or error. As with traditional software development, solid change management processes are a must.
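One lightweight way to support a known good state is to record a cryptographic fingerprint of each approved version of the training data and check it before use. The sketch below assumes the records are text lines and uses a hypothetical in-memory registry; in practice this role is usually filled by a version control or artifact management system.

```python
import hashlib

def fingerprint(records):
    """Return a SHA-256 digest over an ordered list of text records."""
    digest = hashlib.sha256()
    for record in records:
        digest.update(record.encode("utf-8"))
        digest.update(b"\n")  # separator so record boundaries affect the digest
    return digest.hexdigest()

# Hypothetical training records and registry of approved versions.
training_v1 = ["10.0.0.5,443,benign", "10.0.0.9,23,malicious"]
known_good = {"v1": fingerprint(training_v1)}

def is_unmodified(version, records, registry=known_good):
    """True only if the records match the fingerprint stored for the version."""
    return registry.get(version) == fingerprint(records)

# The same data with one label flipped by an attacker.
tampered = ["10.0.0.5,443,benign", "10.0.0.9,23,benign"]

print(is_unmodified("v1", training_v1), is_unmodified("v1", tampered))
```

The check detects modification only if the registry itself is protected by the access controls described above.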

Robust Learning

Improving the robustness of the learning algorithms to guard against poisoning attacks is a burgeoning area of research. Robust learning seeks to make the model inherently less susceptible to adversarial data or outliers. During the testing phase of a new classifier model, the designer can simulate attacks to see how the model responds. The classifier model can then be updated based on the results of the test attack. The result is a more robust model that is less susceptible to disruption due to poisoning. Another method used to create a more robust classifier is bagging (bootstrap aggregating), which trains multiple classifiers on resampled versions of the training data and combines their predictions.
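A minimal sketch of bagging is shown below, assuming a single numeric feature where larger values indicate malicious activity. The base learner (a threshold halfway between the class means), the data, and the function names are hypothetical simplifications; production systems would typically rely on an established ML library.

```python
import random
from statistics import mean

def fit_stump(samples):
    """Fit a threshold halfway between the benign (0) and malicious (1) means."""
    benign = [x for x, y in samples if y == 0]
    malicious = [x for x, y in samples if y == 1]
    if not benign or not malicious:
        return None  # this resample missed a class, so it cannot be used
    return (mean(benign) + mean(malicious)) / 2

def fit_bagged(samples, n_models=25, seed=0):
    """Train n_models stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    stumps = []
    while len(stumps) < n_models:
        bootstrap = [rng.choice(samples) for _ in samples]
        stump = fit_stump(bootstrap)
        if stump is not None:
            stumps.append(stump)
    return stumps

def predict(stumps, x):
    """Majority vote: 1 (malicious) if most stumps place x above their threshold."""
    votes = sum(1 for threshold in stumps if x > threshold)
    return 1 if votes > len(stumps) / 2 else 0

# Hypothetical labeled data: (feature value, label).
data = [(1.0, 0), (2.0, 0), (3.0, 0), (2.0, 0), (1.5, 0),
        (8.0, 1), (9.0, 1), (10.0, 1), (9.0, 1), (8.5, 1)]

ensemble = fit_bagged(data)
print(predict(ensemble, 1.0), predict(ensemble, 9.0))
```

Because each stump sees a different resample, a small number of poisoned records influences only some of the voters, which is the intuition behind using bagging for robustness.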

Detecting Attacks

The community needs to do more research on how to detect attacks during the training and testing phases. Current research includes analyzing model drift. However, in many cybersecurity use cases, some drift occurs naturally as attack parameters and methods change. Distinguishing this natural drift from drift caused by a low-and-slow attack may prove difficult. This difficulty is another reason for version control of the training data: the original training data can be run back through the model to see if it produces the same results.
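The replay check described above can be sketched as follows. The two threshold models, the baseline data, and the 0.95 alert level are hypothetical values chosen for the example; a suitable alert level depends on how much natural drift a given use case exhibits.

```python
def agreement_rate(model, baseline_inputs, baseline_outputs):
    """Fraction of baseline inputs for which the model still returns
    the output recorded when the baseline was captured."""
    matches = sum(1 for x, expected in zip(baseline_inputs, baseline_outputs)
                  if model(x) == expected)
    return matches / len(baseline_inputs)

def original_model(x):
    """Hypothetical model as first deployed."""
    return 1 if x > 5.0 else 0

def retrained_model(x):
    """Hypothetical model after several rounds of retraining."""
    return 1 if x > 8.5 else 0

# Versioned original training inputs and the outputs recorded at deployment.
baseline_inputs = [1.0, 2.0, 3.0, 8.0, 9.0, 10.0]
baseline_outputs = [original_model(x) for x in baseline_inputs]

ALERT_BELOW = 0.95  # hypothetical alert level
rate = agreement_rate(retrained_model, baseline_inputs, baseline_outputs)
needs_review = rate < ALERT_BELOW
print(rate, needs_review)
```

A drop in agreement does not prove an attack, since it may reflect legitimate adaptation, but it gives the security team a concrete signal to investigate.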

However, with online training, the live production environment is often used for initial training or periodic retraining of the models. This online retraining is typically done to adjust models to subtle changes. Therefore, with online training, detecting malicious subtle drift may prove extremely difficult. An alternative would be to use offline training with periodic snapshots of data. Before retraining, the sanity checking of training data and versioning discussed earlier can be used to help detect adversarial data.

Deploy Clustering with Extreme Caution

Use caution when deploying clustering ML models for cybersecurity use cases, especially for classification systems. Classifiers that leverage clustering algorithms are particularly susceptible to evasion attacks. If an adversary knows the state of clusters, the adversary can easily craft a new data point near one of the clusters. Until further research can improve the robustness of clustering, such methods should be used with great caution in cybersecurity.
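To make the risk concrete, the following sketch assumes a nearest-centroid classifier with two clusters and an adversary who knows both centroids. The coordinates and function names are hypothetical; math.dist is available in the Python standard library from version 3.8.

```python
import math

def nearest_cluster(centroids, point):
    """Return the name of the cluster whose centroid is closest to the point."""
    return min(centroids, key=lambda name: math.dist(centroids[name], point))

def craft_evasive_point(centroids, point, target="benign", steps=100):
    """Slide the point toward the target centroid in small increments and
    return the first position that is assigned to the target cluster."""
    goal = centroids[target]
    for i in range(steps + 1):
        t = i / steps
        candidate = tuple(p + t * (g - p) for p, g in zip(point, goal))
        if nearest_cluster(centroids, candidate) == target:
            return candidate
    return None

# Hypothetical cluster state known to the adversary.
centroids = {"benign": (1.0, 1.0), "malicious": (9.0, 9.0)}
attack_sample = (8.0, 8.0)

evasive_point = craft_evasive_point(centroids, attack_sample)
print(nearest_cluster(centroids, attack_sample),
      nearest_cluster(centroids, evasive_point))
```

The adversary needs only enough change to cross the boundary between clusters, not to reach the benign centroid itself, which is why knowledge of the cluster state makes evasion inexpensive.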

Using Adversarial ML to Improve Models

Though ML has had many successful real-world applications, its methods have been less successful in cybersecurity than they could be. Classical ML methods and models do not account for an adversary who deliberately tries to mislead the ML system. Traditionally, ML has focused on uncovering knowledge and discovering relationships within all data supplied. As was shown previously, adversaries seek to exploit ML vulnerabilities to disrupt ML systems. Cybersecurity systems, including ML models used in cybersecurity, must be designed on the assumption that an adversary will attempt to exploit the system and disrupt the model. In the ML field, the data distribution within the test data is typically assumed to be statistically similar to the training data distribution. However, this assumption may not hold in cybersecurity if an adversary is actively manipulating the testing data.

Unfortunately, many methods used to assess the performance of an ML model evaluate the model under normal operation instead of in an adversarial context. The adversarial ML field seeks to improve ML algorithms’ robustness, security, and resiliency in adversarial settings, such as cybersecurity. The three main pillars of adversarial ML research are (a) recognizing training stage and inference stage vulnerabilities, (b) developing corresponding attacks to exploit these vulnerabilities, and (c) devising countermeasures. With adversarial ML, the security team proactively attempts to exploit vulnerabilities, much like red teaming in traditional cybersecurity.
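The gap between evaluation under normal operation and evaluation in an adversarial context can be sketched as below. The sketch assumes a one-feature threshold model and an attacker who can shift the feature of each malicious sample by up to a fixed budget; checking only the two extremes of that budget is sufficient for a threshold model but not for models in general. All names and values are hypothetical.

```python
def accuracy(model, samples):
    """Standard accuracy on unmodified (feature, label) pairs."""
    return sum(1 for x, y in samples if model(x) == y) / len(samples)

def accuracy_under_attack(model, samples, budget):
    """Accuracy when each malicious sample (label 1) may be shifted by up to
    `budget` in either direction. A malicious sample counts as correct only
    if the model still flags it at both extremes and at its original value."""
    correct = 0
    for x, y in samples:
        candidates = [x - budget, x, x + budget] if y == 1 else [x]
        if all(model(c) == y for c in candidates):
            correct += 1
    return correct / len(samples)

def threshold_model(x):
    """Hypothetical classifier: flag anything above 5.0."""
    return 1 if x > 5.0 else 0

test_set = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (9.0, 1)]

normal = accuracy(threshold_model, test_set)
attacked = accuracy_under_attack(threshold_model, test_set, budget=1.5)
print(normal, attacked)
```

In this example the model looks perfect under normal evaluation, yet one malicious sample sits close enough to the threshold to be pushed across it. Reporting both numbers is one way a team can bring a red-team perspective into routine model assessment.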

Figure 3 – Reactive versus adversarial ML model development.

Conclusion and Recommendations

As we have seen, ML offers tremendous benefits when applied to cybersecurity. However, we must understand the limits and vulnerabilities of ML. Cybersecurity is an adversarial environment. We must realize that traditional ML methods do not consider willful misleading or disruption by an adversary. Furthermore, deploying poor ML models is easy, but deploying robust, secure ML models requires much effort. The following recommendations can help ensure the safe and efficient use of ML within an adversarial environment.

  • The training data used for ML must be sanitized. Proper data collection and preparation can reduce errors introduced through poor data.
  • Strict version and access controls must be employed to protect the training data and the models.
  • ML processes and data must be evaluated to understand where they may be vulnerable to attack.
  • When applying ML to cybersecurity, we must assume the data and models will be attacked. Therefore, we should employ adversarial development methods when developing models for an adversarial environment, such as cybersecurity.
  • Caution must be used when employing clustering methods within an adversarial environment, especially for classification. Clustering methods are particularly susceptible to evasion attacks.

The use of ML within cybersecurity will expand rapidly. We are only beginning to unleash the potential of ML. This is an exciting time, full of promise. However, we must ensure that this promise is realized safely and securely.