My Zscaler 2019 summer intern, Anupma Sharan, returned to her Carnegie Mellon University campus this week and blogged about her amazing internship experiences. Anupma’s blog provides a glimpse into the exciting “powered by Machine Learning and AI” journey Zscaler is on. Thanks to Anupma and her ML teammates who supported and mentored her!
Machine learning and AI for cybersecurity: An intern's perspective
By Anupma Sharan, Carnegie Mellon University (CMU)
Machine learning (ML) and artificial intelligence (AI) have been getting a lot of attention, and deservedly so. The applications for these technologies are seemingly endless and could lead to major advances in a range of industries, including cybersecurity.
That was one of the reasons I was so excited about my internship at Zscaler. For 11 weeks this summer, I worked as a data science intern within the Zscaler machine learning and AI team. This team, which developed out of Zscaler’s acquisition of ML and AI technology from TrustPath in 2018, has embarked on an ambitious plan to use machine learning and AI to make a significant impact in cybersecurity. As an intern with little security background, I found that goal a bit daunting.
I had the opportunity to work on a fascinating project involving a multiclass malware machine learning model. The security industry has made good progress developing the binary-class malware model during the past few years but not as much with the multiclass model. Obviously, making a multiclass prediction is more complex and, as the project progressed, it was exciting to see how effective these models could be.
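To make the binary versus multiclass distinction concrete, here is a minimal sketch in Python. It is not the model I worked on at Zscaler. The family labels, scores, and function names are all hypothetical and chosen only for illustration. A binary model outputs one score and makes a yes/no call, while a multiclass model has to rank several classes against each other.

```python
import math

# Hypothetical malware family labels, chosen only for illustration.
FAMILIES = ["benign", "ransomware", "trojan", "adware"]


def binary_predict(score, threshold=0.5):
    """Binary model: one probability, one yes/no decision."""
    return "malicious" if score >= threshold else "benign"


def softmax(logits):
    """Turn raw per-class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multiclass_predict(logits, labels=FAMILIES):
    """Multiclass model: pick the most probable of several classes."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Made-up scores standing in for the output of a trained model.
print(binary_predict(0.83))
label, confidence = multiclass_predict([0.2, 2.9, 1.1, -0.5])
print(label, round(confidence, 2))
```

The multiclass version has more ways to be wrong: with four classes there are twelve possible confusions between one class and another, compared with two for the binary case. That is one reason it is harder to get right.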
And what did I learn? In essence, this project was about improving the performance of our machine learning models to better detect, predict, and block unknown threats. The experience showed me just how interesting and exciting machine learning, AI, and data science are. And, as would be expected, it reinforced one thing for me: data is key. No matter how much of an expert you are at tuning your machine learning models, you cannot overlook the importance of your data.
As Howie Xu, my leader and the VP of machine learning and AI at Zscaler, said: “The road to solving a real-world problem using data science is not straight. In fact, it is a three-dimensional space, with the three dimensions being the data, the model, and the business value.” This is true, but data scientists working in the security space have to be mindful of which direction they take in this 3D space, because time limits how far they can explore.
As students, we often have a set of labeled data ready for model training, and we usually have enough time to solve a problem because our projects deal with small-scale data. The same can’t be said for the real world, where teams work with very large-scale data and often have to find a way to label it in an automated fashion. When it comes to cybersecurity, time and accuracy are of the essence. After all, cybercriminals aren’t known for waiting around while security professionals strengthen their defenses.
My teammates and I visited Mission Peak in June 2019
One of the most powerful things I learned during my internship was the importance of being passionate about the team’s mission. Every person within the Zscaler family and the Zscaler machine learning team possesses this quality. I learned so much from my mentors and teammates, including Dianhuan and Changsha. And I believe, following my internship this summer, I have come out as a better data scientist and a better team player. It was truly an unforgettable experience.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Howie Xu is the VP of Machine Learning and AI at Zscaler