How and When to Embed Machine Learning in Your Product

HOWIE XU - Vice President of Machine Learning and AI

April 27, 2021 - 8 min read

To help companies think about how to build ML into their product infrastructures at scale, CapitalG held a Q&A session with Howie Xu. It was originally published March 15, 2021, on Medium.

Machine learning can enable a company to make sense of mass quantities of data, separate noise from signal on an immense level and unleash a product’s ability to truly scale. Many CapitalG companies have embedded ML into their products with great success (and in fact this is an area where CapitalG often provides hands-on support and training), but getting there is no easy feat. Embedding ML is among the most complex tasks that product teams can attempt.

To help companies think about how to build ML into their product infrastructures at scale, we decided to sit down with Howie Xu, VP of Machine Learning & AI at leading cybersecurity company Zscaler (NASDAQ: ZS). A CapitalG portfolio company since 2015, Zscaler is now a large, publicly-traded global firm with more than 2,000 employees worldwide.

In this Q&A, Howie discusses the value that machine learning can provide, when to use ML (and, maybe more importantly, when not to) and how to organize teams for ML success.

Johan: Howie, before we dive in, can you describe what Zscaler does and your role as VP of Machine Learning and AI?

Howie: Sure! Zscaler enables secure digital transformation by rethinking traditional network security and replacing on-premises security appliances to empower enterprises to work securely from anywhere. You can think of us as the “Salesforce of security.”

Zscaler is the world’s largest inline security cloud company, securing more than 150 billion transactions per day, protecting thousands of customers from cyberattacks and data loss.

My role is to drive the technology, as well as awareness and use cases for AI-based product delivery at Zscaler. Much of my role involves close collaboration with other internal teams, as with the rollout of the world’s first AI-powered cloud sandboxing engine.

Johan: What advice would you give a founder, CTO or product leader who has achieved product-market-fit and is looking to embed ML in their product to help it scale?

Howie: My best piece of advice is to realize that AI is merely a means to an end. I think of it as a giant hammer: it can accomplish huge tasks, but it’s overkill in many scenarios. For successful implementations, seek out the 10X opportunities. Anything below that threshold, like a 20–30% increase in speed or efficiency, simply isn’t worth the effort required. In those cases, it makes more sense to use conventional technology.

I’ve found that the greatest challenges to ML success are often challenges of alignment rather than technology. So much of what leads to a successful ML implementation comes down to stakeholder alignment and education. This may sound trivial, but it’s absolutely not. Building AI empathy and literacy across technical and non-technical teams, and moving from what computer scientists have done for the last 20 to 30 years to what the AI team truly needs, is not linear and definitely not easy. You have to put the time into education and alignment, or the teams aren’t going to collaborate with you closely enough, and you won’t be able to get the domain knowledge or the data you need to make the algorithms work well.

Johan: You have both founder & scale-up perspective, having founded an AI/ML startup and now overseeing the embedding of AI & ML at Zscaler. What’s different about doing this at scale at a large, established company like Zscaler with 150B transactions and 100M threats detected per day?

Howie: At Zscaler, my greatest opportunity is having an infinite amount of data; likewise, my greatest challenge is having an infinite amount of data.

But it is a fantastic problem to have. Most AI startups struggle to find sufficient data. In fact, a lot of security companies claim to use ML before they’re actually able to because they just don’t have enough data yet. We, on the other hand, have a trove of it. But it is challenging for us to process all of it. For starters, that’s just too much data for even the most powerful computers and largest teams to analyze. On top of that, issues like user privacy and security are of paramount importance and must be carefully factored into the equation. And of course, on top of all of that, we have to balance speed and cost in every implementation. Being able to process 99.99% of the data would have no value if it took too long to do so.

The challenge for companies at scale is to architect ML to be both “massive scale data proof” and “existing product compatible.” We have to figure out which data to analyze, and we constantly face a spectrum of tradeoffs that necessarily come with having access to data on such a massive scale.

Johan: AI & ML can seem like buzzwords that are promoted as near-panaceas for all business problems, which is obviously unrealistic. In what kinds of scenarios and applications are AI & ML most promising?

Howie: It may sound simple, but within cybersecurity, AI & ML are most promising in use cases requiring both speed and accuracy. These are the scenarios in which conventional technology simply can’t scale to meet the needs.

On the flip side, it’s important to understand that ML may generate a non-trivial number of false positives. However, if you leverage the technology right, false positives are not a showstopper. For example, let’s say that I can use AI to determine that 95% or even 99% of logs, alerts, or files are not threats. We’re still going to investigate what’s left, and typically what’s left will be the most serious threats requiring closer investigation. Creating filters with AI allows teams to spend the time necessary to assess the more critical risks.
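The filtering idea described above can be illustrated with a minimal Python sketch. It is not Zscaler's implementation: the `Event` type, the `threat_score` field, the `triage` function, and the 0.05 threshold are all hypothetical names and values chosen for illustration, and the scores are made-up sample data standing in for a real model's output.

```python
from dataclasses import dataclass


@dataclass
class Event:
    event_id: str
    # Hypothetical model output in [0, 1]; higher means more suspicious.
    threat_score: float


def triage(events, benign_threshold=0.05):
    """Split events into (auto_cleared, needs_review).

    Events scoring below the threshold are treated as benign and set aside.
    Everything else goes to analysts, most suspicious first.
    """
    auto_cleared = [e for e in events if e.threat_score < benign_threshold]
    needs_review = sorted(
        (e for e in events if e.threat_score >= benign_threshold),
        key=lambda e: e.threat_score,
        reverse=True,
    )
    return auto_cleared, needs_review


# Made-up sample: 95 clearly benign events plus 5 that warrant a closer look.
events = [Event(f"evt-{i}", 0.01) for i in range(95)]
events += [
    Event("evt-95", 0.10),
    Event("evt-96", 0.30),
    Event("evt-97", 0.55),
    Event("evt-98", 0.80),
    Event("evt-99", 0.92),
]

auto_cleared, needs_review = triage(events)
print(f"{len(auto_cleared)} auto-cleared, {len(needs_review)} left for analysts")
```

In this sketch a false positive costs only some analyst time, because flagged events are reviewed rather than blocked outright. The threshold is the knob that trades review workload against the risk of clearing a real threat.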

Johan: In your experience, what is the best way to organize the product and engineering teams to embed ML into an existing product?

Howie: There is no correct or incorrect way. What’s right will vary by company, but at Zscaler we organize ML as an independent functional team reporting directly to the president of the company. This structure helps us in a number of ways:

With hiring and retention because it provides the opportunity for top data science PhDs to work with each other. The opportunity to collaborate with high-caliber peers is often extremely important in the recruiting and retention of these highly sought after specialists.
Providing broad coverage across the company, as well as insight into the organization’s potential use cases for AI implementations.
Enforcing best practices, given our broad company-wide mandate.

The challenge with this structure is that it leads to more time needing to be spent communicating and collaborating with other functional organizations. I believe that’s a worthwhile tradeoff and have found this to be a highly successful organizational structure.

Johan: Zscaler participated in ML@CapitalG, our program in which Googlers train engineers & product teams at our portfolio companies on machine learning. You had many of your team members participate in the program — how important is it to train not just engineers but the teams around them, such as product managers or UX/UI designers?

Howie: My machine learning engineers loved the training of Google’s AI cooking recipes. But it is more than that. I’ll give you a specific example to show the importance. Amir Levy, one of our product managers who attended the training, sought me out on the session’s last day. He came back to the office and grabbed me to brainstorm a use case he thought about during the training. Within a few short months, the output of that decision was already significantly improving our company’s bottom line.

Johan: What tools or resources do you recommend to startups and more established companies implementing AI & ML?

Howie: We don’t use all of them, but some tools and resources to consider are AutoML, H2O AI, DataRobot, and Google Cloud Platform for exploring use cases.

More than anything, though, I highly recommend seeking out the advice of peers who can discuss what worked and what failed terribly and why. Many of the biggest AI failures I know have not been on the technology side; they’ve been due to a lack of alignment between business and technical priorities. I’ve been incredibly fortunate to benefit from the knowledge and support of my company’s CEO Jay Chaudhry and President Amit Sinha, as well as my peers in the company and in the industry.

Johan: What accomplishments of your team’s are you most proud of this past year and why?

Howie: Protecting our customers by processing over 150 billion transactions and blocking 100 million threats per day is no joke. Helping Zscaler to do this is our biggest accomplishment. There were a lot of things happening along the way, of course, including the world’s first AI-powered cloud sandboxing engine with a successful integration of the TrustPath technology after the acquisition, all the unknown threats we helped to discover via AI/ML, the anomaly detection functionality we showcased at Zenith Live in December, and some very interesting AIOps progress we made recently.
