Using Machine Learning to Bridge the Gaps in Microsegmentation

JOHN O'NEIL
July 14, 2021 - 7 min read

Microsegmentation has become more widely adopted in the cloud and data center because of its enormous promise. Breaking the network into smaller and smaller fragments (often as small as the individual workloads themselves) brings significant security and performance benefits. According to some estimates, as much as 76% of the traffic in a network is east-west, so allowing only the application communication necessary for business operations can interrupt an attack's progression.

The problem, though, is that getting to a fully deployed and realized microsegmented environment is challenging. It requires fine-grained knowledge of the network's application communication patterns and application topologies, followed by an inventory of what communication should be allowed and what shouldn't. This leads to a large number of rules/policies, and large rule sets are difficult for humans to understand, manage, and modify. Finally, the rules remain address-centric, which means the translation from application policies to address-based rules lives only in people's heads.

Gaining the advantages of microsegmentation requires a lot of work, and the work doesn't stop, since networks continue to change.

Is there some way to complete microsegmentation projects faster and more easily? To make them more accurate, and more resilient in the face of changing business and security challenges?

Turns out, there is.
 

A new approach

We want to make application-centric, rather than address-centric, rules. Starting with collected data about which applications are communicating, we use machine learning to analyze the data and create a nearly optimal set of automatically generated rules. This moves much of the complexity from humans to computers, which, after all, are much better at sorting through lots of information.
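To make the contrast concrete, here is a minimal sketch of the two rule styles. The Python below is illustrative only; the class and field names are assumptions, not Zscaler's actual data model.

```python
from dataclasses import dataclass

# Address-centric: tied to network coordinates, which change constantly.
@dataclass(frozen=True)
class AddressRule:
    src_ip: str    # e.g. "10.1.4.17"
    dst_ip: str    # e.g. "10.2.0.3"
    dst_port: int  # e.g. 5432
    action: str    # "allow" or "deny"

# Application-centric: tied to what the software is, not where it runs.
@dataclass(frozen=True)
class ApplicationRule:
    src_app: str   # e.g. "billing-web", identified by software attributes
    dst_app: str   # e.g. "billing-db"
    action: str

# One application rule can replace many address rules: every instance of
# billing-web may reach every instance of billing-db, wherever they run.
rule = ApplicationRule("billing-web", "billing-db", "allow")
```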

The machine learning isn’t used here to find malware; rather, it’s used to establish the state of the network—what is talking to what. We’re developing techniques to identify possibly suspicious activity from applications on a network—for now, the learned rules and our UI allow unexpected applications to be easily identified and dealt with.

Once the rules are created, people can begin to use human insight to protect their network. Because the rules are readable, and phrased explicitly in terms of protecting applications, they can be deployed application by application; a human network security expert can use their knowledge and insight to deploy (or edit) the rules in an optimal way. A human is usually better at deciding which applications should be protected first, so we make it easy to find the relevant rules for the most important applications and use them to lock those applications down.
 

Using machine learning

Starting with all the network traffic, we want to create a set of rules with the following goals:

  • As few rules as possible.
  • The simplest possible rules, without superfluous information.
  • Specific rules rather than overly general ones.
  • Human-readable rules.
  • Rules with the broadest coverage possible.

These goals often conflict with one another, so it's necessary to balance them to arrive at an optimal rule set. These constraints, along with the constraints imposed by the data, rule out most standard machine learning techniques.
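One way to make that balance concrete is to fold the goals into a single score for a search to maximize. The sketch below is an assumption about the general shape of such an objective; the toy Rule class, weights, and penalty terms are invented for illustration and are not Zscaler's actual scoring function.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    src_app: str  # "*" is a wildcard
    dst_app: str

    def matches(self, flow):
        src, dst = flow
        return self.src_app in ("*", src) and self.dst_app in ("*", dst)

def score(rules, flows, w_cov=10.0, w_count=1.0, w_wild=2.0):
    """Reward coverage of observed flows; penalize rule count and
    over-general wildcards (invented weights, for illustration)."""
    covered = sum(any(r.matches(f) for r in rules) for f in flows)
    coverage = covered / len(flows)
    wildcards = sum((r.src_app == "*") + (r.dst_app == "*") for r in rules)
    return w_cov * coverage - w_count * len(rules) - w_wild * wildcards

flows = [("web", "db"), ("web", "cache"), ("batch", "db")]
print(score([Rule("web", "db"), Rule("web", "cache"), Rule("batch", "db")], flows))  # 7.0
print(score([Rule("*", "*")], flows))  # 5.0: full coverage, but over-general
```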

We ended up performing a stochastic search through the space of candidate rule sets, maximizing a score based on the constraints above. If this sounds a little hard to picture, it's not just you. Let's try an analogy.
 

Lost in New York

Imagine you're standing on a street corner in Manhattan with a latitude-longitude GPS. You know where you want to go, because you have the lat-long coordinates of your destination, but you don't know which direction to start walking to get there. So you measure where you are, walk a few blocks in one particular direction, and look at your GPS again. If you're closer to your destination than you were, you stay at your new location and do it again. If you're farther away, you backtrack to where you started and then pick a different distance and direction. Sometimes, even if you're a bit farther away, you still keep the new location, since maybe it's a shortcut (or a way to get around Central Park!). As time goes on, you get closer to your destination, until nothing you do makes you any closer. You've arrived, more or less.

Of course, Manhattan is two-dimensional, except for the tall buildings. When we’re looking through the space of possible rule sets, there are a lot more “directions” to investigate. That’s why we leave it to the algorithms.
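For the curious, here is what that kind of walk looks like in code: a stochastic local search that occasionally accepts worse moves, much like simulated annealing. This is a generic sketch of the technique, not Zscaler's actual algorithm; the toy destination objective just mirrors the analogy.

```python
import math
import random

def stochastic_search(initial, neighbor, score, steps=10_000, temp=1.0, cool=0.999):
    """Take a random step; keep it if the score improves, and sometimes
    keep a worse step to escape dead ends (a shortcut around Central Park)."""
    current, s_cur = initial, score(initial)
    best, s_best = current, s_cur
    for _ in range(steps):
        candidate = neighbor(current)   # walk a few blocks
        s_cand = score(candidate)       # check the GPS
        if s_cand >= s_cur or random.random() < math.exp((s_cand - s_cur) / temp):
            current, s_cur = candidate, s_cand
            if s_cur > s_best:
                best, s_best = current, s_cur
        temp *= cool                    # grow pickier as time goes on
    return best

# Toy usage: wander toward a destination, scored by negative Manhattan distance.
dest = (40, -74)
print(stochastic_search(
    initial=(0, 0),
    neighbor=lambda p: (p[0] + random.randint(-3, 3), p[1] + random.randint(-3, 3)),
    score=lambda p: -(abs(p[0] - dest[0]) + abs(p[1] - dest[1])),
))
```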
 

Applying machine learning to microsegmentation: faster to achieve, more secure

Zscaler machine learning makes policy suggestions in three stages, which I'll describe in more detail below:

  1. Observe and describe the network’s intended state.
  2. Define optimized policies to enforce that observed/intended state.
  3. Continue to learn, adapt, and optimize policies while enforcing that intended state.

Step 1: Observe and describe

In this stage, we want to understand what the network is doing and what it's supposed to look like. Zscaler does this from the point of view of communicating applications, and uses knowledge of application communication patterns to identify anomalous traffic in the future that doesn't fit in with previously observed network patterns.

Most people we talk with about our machine learning capabilities are especially interested in this stage, since application-centric policies are very different from what they've encountered before, and we often have to describe in more detail how we accomplish this. Zscaler collects fixed, immutable data about applications: hundreds of attributes that can securely identify them. Communication patterns between applications, and the hosts and users involved, are also stored.
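As a rough mental model, the collected records might look something like the sketch below. All field names here are hypothetical; the actual product gathers hundreds of attributes per application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AppFingerprint:
    """Immutable identity of a piece of software (hypothetical fields;
    the real fingerprint draws on hundreds of attributes)."""
    binary_hash: str   # cryptographic hash of the executable
    product: str       # e.g. "postgres"
    version: str       # e.g. "13.2"

@dataclass(frozen=True)
class ObservedFlow:
    """One observed communication between identified applications."""
    src_app: AppFingerprint
    dst_app: AppFingerprint
    src_host: str
    dst_host: str
    user: str
    dst_port: int
    protocol: str      # e.g. "tcp"
```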

Within 48 hours (often less, depending on the nature of your network and how "locked down" it already is), there’s enough information for Zscaler to begin using machine learning to create policies automatically.
 

Step 2: Define and enforce

The wealth of information Zscaler collects about the applications and their communication patterns allows it to discover a nearly optimal set of policies that describe what’s been observed, using a relatively small number of features for each policy.

Zscaler produces a set of policies that is dramatically smaller than sets constructed with address-based solutions. The policies are also easier to understand, so even managers who aren't application experts can see how their applications are being secured.

For example, one customer previously required more than 13,000 address-based security policies to protect their applications. Zscaler was able to accomplish the same protection with several dozen application-based policies. That's the real benefit of combining application-based policy creation with machine learning. It becomes much easier to understand security, because the policies are few enough that you can browse them all, and clear enough that you can understand and act upon them.
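A toy example shows why the counts shrink so dramatically. With invented numbers (not the customer's actual data), 20 web instances talking to 3 database replicas need 60 address-based rules but a single application-based rule:

```python
# Hypothetical observed flows: (src_app, src_ip, dst_app, dst_ip, port).
flows = [
    ("web", f"10.0.1.{i}", "db", f"10.0.2.{j}", 5432)
    for i in range(1, 21)   # 20 web instances
    for j in range(1, 4)    # 3 database replicas
]

# Address-based: one allow rule per (src_ip, dst_ip, port) triple.
address_rules = {(s_ip, d_ip, port) for _, s_ip, _, d_ip, port in flows}

# Application-based: one allow rule per (src_app, dst_app) pair.
app_rules = {(s_app, d_app) for s_app, _, d_app, _, _ in flows}

print(len(address_rules), len(app_rules))  # 60 1
```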

It’s important to note that these policies don't decide what's “good” or “bad” on the network; they only describe what's actually happening on the network, as efficiently and simply as possible. The goal is to make it as easy as possible for humans to understand what's happening on their network and decide for themselves whether a given suggested policy should be deployed, modified, or eliminated.
 

Step 3: Learn and optimize

Zscaler machine learning doesn't stop after the first few days of use. Application traffic continues, and, more importantly, the network and its applications keep changing, so there is always more information to gather, and it may differ from what was gathered initially.

Hence, Zscaler continues to create new policy sets based on all the collected information. Because new policies could contradict (or confuse) existing ones, already-enforced policies are taken into account when new rules are created. For parts of the network where no policies are in place, Zscaler regularly updates its recommended policies to keep up with the evolving network and provides a current confidence score, so users have a sense of how accurately the recommendations reflect current network behavior.
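One simple way to picture such a confidence score is as coverage of recent traffic: the fraction of newly observed flows that the current recommendations explain. The sketch below is an illustration of the idea, not Zscaler's actual metric.

```python
def coverage_confidence(recommended_pairs, recent_flows):
    """Fraction of recently observed (src_app, dst_app) flows explained
    by the recommended policies. A falling score suggests the network
    has drifted and the recommendations should be refreshed."""
    if not recent_flows:
        return 1.0
    matched = sum(flow in recommended_pairs for flow in recent_flows)
    return matched / len(recent_flows)

recommended = {("web", "db"), ("web", "cache")}
recent = [("web", "db"), ("web", "db"), ("web", "metrics")]
print(coverage_confidence(recommended, recent))  # ~0.67: one new flow unexplained
```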

These three stages (observation, creation, and optimization) explain how Zscaler creates policies that provide effective security and can be understood by people. This frees security professionals to focus on their most important job, protecting critical applications from attack, without excessive drudgery. That's how machine learning accelerates the time-to-deploy for a microsegmentation project from weeks or months to days, and lets users create security policy without having to write it by hand.

Read more:

Blog: How Microsegmentation Differs from Network Segmentation

On-Demand Webinar: Microsegmentation Made Easy with Identity and Automation

Blog: Microsegmentation 101
