This post also appeared on LinkedIn.
In November of 2019 (back before the pandemic lockdown), I was the CISO for an international professional services organization. My most important strategic project at that time was creating the “always-on VPN.” What was this top-tier objective? An attempt to modernize our global secure remote access. Why? With our old systems, it was too much effort to access company assets from outside the security perimeter, and we expected to grow as more people worked outside the office. (Little did we know...)
My CIO and I identified four important pain points the always-on VPN had to relieve:
- Inconvenience: Reduce login complexity to just a single sign-on (SSO) service for any device, anywhere to improve the user experience.
- Threat risk: Reduce attack surface as the number of people outside the security perimeter increased.
- Poor performance: Reduce traffic backhauled to our data centers via MPLS connections to improve the user experience.
- High overhead: Reduce MPLS backhaul expenses by using direct internet connections.
We had different ideas about what was most important and what to do first. But we both agreed that our current architecture wasn’t going to support our move into a cloud-first, remote environment.
The recurring dilemma: user experience or enterprise security
My CIO questioned our approach to remote access: “Why do I have to authenticate to a company laptop, launch a VPN client, provide my network password again, enable my soft multi-factor authentication (MFA) token, and manually input an 8-digit rotating authenticator just to connect to the business network?”
Our tedious network-security login workflow contributed to a terrible user experience and was so onerous it made employees want to avoid security. The problem was further compounded by other services employees regularly use—banking, social media, and e-commerce, for example—requiring additional multi-factor authentication (MFA) push notifications to registered or authorized devices.
I agreed with his user experience concerns. I also wanted to reduce the number of inbound VPN listener services (three) and VDI gateways (two) running in three different continental DMZs. There had been several well-publicized VPN concentrator exploits that required emergency patches or updates around this same time. Updates like these needed to be tested and reviewed for operational impact, which meant a longer time for both implementation and network risk exposure.
Turns out these two issues were connected.
O365 broke us
We backhauled all of our United States remote traffic to a VPN concentrator at a colocation data center in Phoenix, Arizona. (The London and Beijing data centers had similar setups in Europe and Asia, respectively.) Our 2019 strategic plan was to adopt Office 365 via an Azure tenant. A business-impact assessment suggested that our road-warrior traffic (already trending up) would grow exponentially with increased use of Office 365 apps like SharePoint, OneDrive, Teams, and Skype.
If every eligible remote worker in the US went on the road simultaneously, all internet and Office 365 traffic would be backhauled to Phoenix and then sent to the Los Angeles service edge before heading for its real destination—the cloud. Our O365 data traveled a circuitous route across multiple networks, slowing app performance.
We had to figure out how to improve Office 365 performance. One option was to implement a split-tunnel solution, but the lack of device, traffic, and network visibility heightened risk far beyond our appetite. We needed a way to separate internet-bound traffic from internal-application-bound traffic.
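To make that separation concrete, here is a minimal Python sketch of the decision itself: given a destination, is it bound for an internal application or for the internet? The address range `10.0.0.0/8` and the domain suffix `.corp.example` are hypothetical placeholders, not our actual configuration, and the sketch deliberately ignores the visibility and inspection problem that made plain split tunneling too risky for us.

```python
import ipaddress

# Hypothetical values for illustration only; not the real network layout.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]
INTERNAL_DOMAIN_SUFFIX = ".corp.example"


def classify(destination: str) -> str:
    """Return 'private' for internal-application traffic, else 'internet'."""
    try:
        addr = ipaddress.ip_address(destination)
    except ValueError:
        # Not an IP literal, so treat the value as a hostname.
        host = destination.lower().rstrip(".")
        return "private" if host.endswith(INTERNAL_DOMAIN_SUFFIX) else "internet"
    # IP literal: private only if it falls inside a known internal range.
    if any(addr in net for net in INTERNAL_NETWORKS):
        return "private"
    return "internet"
```

In this sketch, `classify("wiki.corp.example")` lands on the private path while `classify("teams.microsoft.com")` lands on the internet path, which is the distinction a backhaul-everything VPN never makes.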
Zero trust shows us the way
Zero trust wasn’t new to me: we had used a cloud secure web gateway (SWG) for years to address threats and security objectives. The SWG excelled at inspecting outbound traffic for threats and data loss, and seamlessly secured all of our business SaaS tenants. It also offered the best visibility into outbound traffic behavior and usage metrics I’d ever experienced. So I thought, “Can we use this model for inbound private cloud traffic using zero trust?” (Spoiler alert: Yes. Yes we can.)
Zero trust is a new paradigm, and it requires a fundamental change in connectivity mindset. Legacy castle-and-moat security practitioners ask, “How do I securely connect this device to this network?” Zero trust is more direct (in more ways than one), and asks “How do I securely connect this user to this application?”
ZTNA, also known as the software-defined perimeter (SDP), is a set of technologies that operates on an adaptive trust model, where trust is never implicit, and access is granted on a “need-to-know,” least-privileged basis defined by granular policies.*
We deployed a network connector and set up a straightforward entitlement configuration for user provisioning. In other words: No new endpoint software deployment and installation. Now we could replace our VPN solution with a zero trust solution, and reduce our attack surface across all users. Added benefit: We immediately discovered half a dozen hidden “shadow IT” servers. Fortunately, they were properly maintained and patched via automation, but until ZTNA, nobody knew who used them or what they were for.
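For readers who think in code, the user-to-application model can be sketched as a default-deny lookup. This is an illustrative Python sketch under my own assumptions, not the vendor's policy engine or our real entitlement configuration: the `Entitlement` type, the group names, and the application names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entitlement:
    """One explicit grant: members of `group` may reach `application`."""
    group: str
    application: str


# Hypothetical grants for illustration; anything not listed is denied.
ENTITLEMENTS = {
    Entitlement("finance", "erp"),
    Entitlement("engineering", "git"),
}


def is_allowed(user_groups, application):
    """Default deny: allow only if some group has an explicit grant."""
    return any(Entitlement(g, application) in ENTITLEMENTS for g in user_groups)
```

Here a finance user can reach `erp` but not `git`, and a user with no groups reaches nothing. The policy never asks which network the user is on, only who they are and which application they want.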
The big change: COVID-19
By February 2020, we were ready to deploy zero trust across the whole enterprise. Then COVID-19 shut down the world.
We faced several obstacles to a successful pivot to remote work. Around 20% of our workforce had never worked remotely: no business laptop, no MFA token, no knowledge of BYOD application management practices. All IT resources went towards adding licensing and infrastructure to our VDI environment. As we stabilized our remote workforce, the predictions of our 2019 business-impact assessment became reality. The explosion of Zoom, Teams, and BlueJeans web conferencing saturated our U.S. VPN gateway bandwidth.
Rolling out zero trust alleviated that pain, and addressed user experience and security in a single solution.
The zero trust footnote: Why is my meeting quality bad?
After we’d dealt with keeping employees safe during the COVID-19 crisis, I was gung-ho for moving fully forward with a zero trust infrastructure for the entire company. But our CIO and I had a difference of opinion on where to take zero trust. I wanted to push it further and use it to enable direct internet connections to our cloud applications and assets, while he felt that might introduce too much change for employees during the pandemic.
But oddly enough, in April 2020 our CEO called a meeting. The CEO and his wife had concurrent Zoom calls, each with a similar number of participants. The CEO’s online meeting suffered continual audio and video drops. But his wife’s call ran with great sound and video. Why was this happening? What was the problem?
The issue was exactly what the CIO and I had discussed a week earlier: backhauling. Our CEO’s web-conferencing traffic was routing through his home network in Washington, D.C., out to our Phoenix data center, and then to the Zscaler service edge in Los Angeles—all before connecting to the Zoom infrastructure. As for his wife’s call, well, we can make the simple assumption that her device traffic went directly through their local gateway. As I told the CIO: “You can fix this with direct internet connections—using zero trust.”
This meeting with the CEO triggered our global zero trust user deployment—a happy ending!