Predictive analytics and machine learning in cybersecurity: an untapped opportunity for ‘negative’ response time

Brad Moldenhauer

Contributor

Zscaler

May 2, 2022

Predictive analytics can be one of the most powerful applications of AI and machine learning in cybersecurity. But it takes the right data to pull it off.

The chief information security officer (CISO) is measured by his or her ability to reduce risk, control cost, and minimize friction among employees, data, and the business at large. The increasingly volatile threat landscape makes these objectives more difficult to achieve successfully, particularly risk reduction. And while there are many types of risk, the immediate threat is malicious actors compromising mission-critical data, applications, or services.

In the past, mitigating such risk required a suite of related, integrated technologies working in concert to recognize and block threats via logical policies. However, the increasing sophistication of threats – ranging from polymorphic malware to state-sponsored cybercriminals – requires a new, more proactive approach to address current and known threats and, to a significant extent, future threats as well.

To this end, artificial intelligence (AI) and machine learning (ML) capabilities offer promise and already have a growing number of use cases in cybersecurity. When deployed in a modern computational architecture such as a cloud, they can draw on processing power, memory, storage, and network bandwidth that are dynamically allocated in proportion to changing workloads.

AI/ML in cybersecurity delivers powerful pattern recognition capabilities applicable to many business scenarios and threat classes. Once informed by analysis of a large body of training data, these tools often recognize known and unknown threats with accuracy comparable to trained security professionals. Unlike human professionals, they scale with the technical resources allocated to them and work continuously.

One of the most powerful capabilities is predictive analytics, which detects patterns and then makes predictions by extrapolating forward in time to determine a likely outcome. This capability is especially valuable in cybersecurity because the best response time to threats is negative: the host organization is informed of future threats and of what should be done to stop them, and defensive actions are taken without human intervention.
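To make "extrapolating forward in time" concrete, here is a minimal sketch in Python. It assumes the simplest possible model, a straight line fitted by least squares to a daily event count, which is far cruder than anything a production system would use. The data, the `ALERT_THRESHOLD` value, and the function names are all hypothetical.

```python
from typing import Sequence, Tuple


def fit_line(ys: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of y = slope * t + intercept for t = 0..n-1."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations")
    t_mean = (n - 1) / 2
    y_mean = sum(ys) / n
    cov = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ys))
    var = sum((t - t_mean) ** 2 for t in range(n))
    slope = cov / var
    return slope, y_mean - slope * t_mean


def forecast(ys: Sequence[float], steps_ahead: int) -> float:
    """Project the fitted line steps_ahead periods past the last observation."""
    slope, intercept = fit_line(ys)
    return slope * (len(ys) - 1 + steps_ahead) + intercept


# Hypothetical daily counts of blocked login attempts (illustrative only).
daily_counts = [10, 12, 14, 16, 18]
ALERT_THRESHOLD = 25  # hypothetical level at which defenders would act

predicted = forecast(daily_counts, steps_ahead=4)
should_act_early = predicted >= ALERT_THRESHOLD
```

If the projected count crosses the threshold before the observed count does, defenders can act ahead of the event, which is the sense in which response time becomes "negative."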

Fulfilling the potential of AI/ML in security is complex, but not impossible

AI/ML-powered predictive analytics should, at least in theory, enable negative response times to threats. But the challenge of accurately predicting future threats is far from trivial, and some of today’s solutions fall short of delivering on their value proposition.

This is partly because of the sheer complexity of analyzing cyber threats. Consider that the most sophisticated ones are driven not by code, however cleverly programmed, but by experienced human specialists working in concert. Anticipating how such individuals will act, whether alone or as part of a team, is largely beyond the scope of any AI/ML predictive analytics solution available today. This is why CISOs interested in managing risk will continue to need skilled teams of experts with deep domain knowledge.

Some other challenges in delivering predictive analytics solutions, however, are within the security solution provider’s power to address (again, at least in theory). These challenges predominantly revolve around the data used to train the initial AI/ML models. Providers can use this data to refine and improve models, thus improving the accuracy of their predictions.

The best predictive analytics results require the right training data – and plenty of it

Where can organizations get pertinent data? That simple question is not so simple because the training data must meet several requirements. For instance:

It must accurately reflect the production environments (IT infrastructures and all related resources) to be protected.
It must include numerous instances of false positives that might fool less sophisticated security solutions. AI/ML models should be capable of recognizing and passing over false positives with a high degree of confidence.
It must include genuine security threats and attacks, including information about the status and composition of critical resources like dynamic libraries, scripts, executables, and other elements both before and after the attack. It’s best to use successful attacks carried out against modern, real-world infrastructure.
Above all, the total volume of data used to train the AI/ML – whether data about false positives, the infrastructure generally, or successful attacks – must be truly colossal. All AI/ML technologies require massive volumes of training data to deliver accurate results, and those pertaining to predictive analytics are no exception.
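The requirement about false positives can be checked on any labeled evaluation set. The sketch below is illustrative only: the `results` data is made up, and `false_positive_rate` is a hypothetical helper that applies the standard definition, false positives divided by all benign samples.

```python
from typing import Iterable, Tuple


def false_positive_rate(pairs: Iterable[Tuple[bool, bool]]) -> float:
    """Compute FP / (FP + TN) from (is_actually_malicious, was_flagged) pairs."""
    fp = tn = 0
    for actual, flagged in pairs:
        if actual:
            continue  # malicious samples do not affect the false-positive rate
        if flagged:
            fp += 1
        else:
            tn += 1
    if fp + tn == 0:
        raise ValueError("no benign samples: false-positive rate is undefined")
    return fp / (fp + tn)


# Hypothetical evaluation results: (is_actually_malicious, was_flagged).
results = [
    (True, True),    # real attack, caught
    (True, False),   # real attack, missed
    (False, False),  # benign, correctly passed over
    (False, False),
    (False, True),   # benign look-alike, wrongly flagged
    (False, False),
]
fpr = false_positive_rate(results)
```

A training set with too few benign look-alikes gives a model little chance to drive this number down, which is one reason the requirement matters.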

Zscaler is a world leader in leveraging AI/ML to manage cybersecurity risk

After considering these requirements and caveats, it’s clear that Zscaler is one of the few cybersecurity vendors capable of meeting the challenge.

Why? Zscaler’s cybersecurity solutions are not merely cloud-hosted and scalable, but also applicable to widely varying contexts. For instance, they can analyze a stream of encrypted data in real time and sandbox a potential threat, such as a file, to assess its behavior and instantly determine whether it’s malware.

By applying trained AI/ML models derived from our technology to current and historical system logs, customers can discover past patterns and trends, then predict where those trends are headed. This approach is well suited to spotting phishing campaigns, for instance, and URL patterns used by certain classes of malware.
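As a rough illustration of spotting URL patterns in historical logs, the following Python sketch groups look-alike URLs and reports the groups whose daily volume keeps rising. It is a simple heuristic under stated assumptions, not a trained model and not a description of Zscaler's technology. The log entries, the `url_shape` normalization, and the "strictly increasing" rule are all hypothetical choices.

```python
import re
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def url_shape(url: str) -> str:
    """Replace digit runs with a placeholder so look-alike URLs share a shape."""
    return re.sub(r"\d+", "<n>", url.lower())


def rising_shapes(logs: Sequence[Tuple[int, str]], days: Sequence[int]) -> List[str]:
    """Return URL shapes whose daily hit count strictly increases across days."""
    if len(days) < 2:
        raise ValueError("need at least two days to talk about a trend")
    index = {day: i for i, day in enumerate(days)}
    counts: Dict[str, List[int]] = defaultdict(lambda: [0] * len(days))
    for day, url in logs:
        counts[url_shape(url)][index[day]] += 1
    return sorted(
        shape
        for shape, series in counts.items()
        if all(a < b for a, b in zip(series, series[1:]))
    )


# Hypothetical log entries: (day number, requested URL).
LOGS = [
    (1, "http://login-update1.example.com/verify"),
    (2, "http://login-update2.example.com/verify"),
    (2, "http://login-update7.example.com/verify"),
    (3, "http://login-update3.example.com/verify"),
    (3, "http://login-update4.example.com/verify"),
    (3, "http://login-update9.example.com/verify"),
    (1, "http://intranet.example.com/home"),
    (2, "http://intranet.example.com/home"),
    (3, "http://intranet.example.com/home"),
]

suspects = rising_shapes(LOGS, days=[1, 2, 3])
```

Here the numbered login-update hosts collapse into one shape with counts of 1, 2, and 3 hits, so that shape is reported, while the steady intranet traffic is not.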

It’s then possible to translate these insights into policies – either manually or by closing the technology loop using AI/ML itself. The added security from such a design can, as a result, be far more comprehensive and adaptive than even the most advanced next-generation firewalls.

As a world leader in cybersecurity services, Zscaler is also constantly processing a staggering volume of security-relevant data passing through global infrastructures leveraged by every class of business.

Our integrated, AI-powered services stop countless advanced attacks before they gain momentum by analyzing more than three hundred trillion signals per day. And our real-time security updates are shared more than two hundred thousand times per day across the Zscaler cloud.

The result is that Zscaler has both the type of data needed and the extreme volume necessary for training AI/ML models to the best effect – even for the challenging task of predictive analytics in the cybersecurity domain. Together with our world-class data scientists and machine learning engineers, and with the deep cybersecurity domain knowledge we have accumulated over the years, Zscaler is poised to leverage AI/ML to transform the cybersecurity industry.
