In this second blog of our six-part series on The Impact of Public Cloud Across Your Organization, we are going to look at Cloud Native Application Protection Platforms (CNAPP) through the lens of a cloud operations team. Hopefully, we can answer the question, “How can CNAPP help CloudOps teams do their jobs more effectively?”

DISCLAIMER: I know, I know, I know… The term “cloud operations” has countless variations. What falls under that moniker at one organization may not be 100% aligned with yours. I get it, I do. I am going to use some generalities here and focus on critical capabilities, with the hope that many of those capabilities align with your cloud operations team’s organizational responsibilities. As they say, your mileage may vary.

In this post, we are going to focus on the job functions listed below. Other responsibilities (e.g. DR planning and testing, training, etc.) are out of the scope of this discussion.

Establishing and maintaining a catalog of approved assets (e.g. approved images for compute deployments)
Asset management, including both initial deployment and configuration of cloud assets and ongoing management
Monitoring of cloud estate for usage, configuration drift, and other operational issues
Patching and updating of cloud resources as required
Coordinating with internal Lines of Business (LOB) owners, security, and network teams for incident resolution
Troubleshooting
General access and security of cloud resources
Identity and access management responsibilities

As public cloud operating models become more mainstream for enterprises, identifying and delivering an approved service catalog to internal teams is of paramount importance. New services roll out every month in the public cloud space, a pace much faster than that of traditional enterprise software. Ungoverned consumption of these services can lead to cost overruns, entitlement complexities, insecure configurations, and potentially new threat vectors. It is not enough that the cloud operations team has principal responsibility for defining this catalog; it also has to manage the existing cloud estate to identify drift from those approved services.

Understanding What You Have

“You cannot protect what you don’t know” is one of the first principles of any enterprise cyber security strategy. It follows that clarity of consumption is the first and primary step of any operations team’s ability to protect the enterprise in the public cloud space.

At the core of this need is the ability to quickly and intuitively survey the entire multi-cloud estate in real time. The ability to see what resources are deployed, and into what regions, and to identify appropriate and required tagging frameworks is simply table stakes for any cloud operations team.

Understanding and visualizing the changes that have occurred on a given asset, and the identities responsible for those changes, allow the operations team to either remedy the issue itself or assign the remediation task(s) to the appropriate department.
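To make the idea of change attribution a little more concrete, here is a minimal Python sketch. It assumes a hypothetical approved baseline for one asset and a hypothetical list of change events ordered oldest to newest; the setting names, the event shape, and the detect_drift function are all illustrative and do not reflect any particular platform's data model.

```python
def detect_drift(baseline, change_events):
    """Replay change events over an approved baseline and report drifted settings.

    Each event is a dict with hypothetical keys: "identity", "setting", "value".
    Events are assumed to be ordered oldest to newest.
    """
    current = dict(baseline)
    last_changed_by = {}
    for event in change_events:
        current[event["setting"]] = event["value"]
        last_changed_by[event["setting"]] = event["identity"]

    drift = {}
    for setting, value in current.items():
        if baseline.get(setting) != value:
            drift[setting] = {
                "expected": baseline.get(setting),
                "actual": value,
                "changed_by": last_changed_by[setting],
            }
    return drift


# Hypothetical baseline and change history for a single storage asset.
baseline = {"public_access": False, "encryption": "enabled"}
events = [
    {"identity": "alice", "setting": "public_access", "value": True},
    {"identity": "bob", "setting": "encryption", "value": "disabled"},
    {"identity": "carol", "setting": "encryption", "value": "enabled"},
]

drift = detect_drift(baseline, events)
for setting, detail in drift.items():
    print(setting, detail)
```

In this example only public_access is reported, because the encryption change was reverted. Knowing that the last change came from alice is what lets the team route the remediation to the right owner.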

All of the major Cloud Service Providers (CSPs) have consoles to see deployed resources and services, but the truth is that, for many organizations, consuming services from multiple cloud providers is a reality. Being able to pull all the asset data from each of these disparate environments cleanly is critical to a speedy evaluation of those deployed assets and services.

Cloud operations teams also operate through a policy lens. The ability to programmatically investigate and report on unsanctioned deployments should also be a critical capability, not only for reducing risk but for saving time and costs as well.
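As a rough illustration of what “programmatically investigate and report” could look like, the following Python sketch checks a normalized multi-cloud inventory against an approved catalog and a required tag set. Everything here is an assumption made for the example: the Asset record, the APPROVED_SERVICES and REQUIRED_TAGS values, and the find_violations function are hypothetical, and a real inventory would come from each provider's own APIs.

```python
from dataclasses import dataclass, field

# Hypothetical policy: sanctioned service types and mandatory tag keys.
APPROVED_SERVICES = {"compute", "object_storage", "managed_sql"}
REQUIRED_TAGS = {"owner", "environment"}


@dataclass
class Asset:
    asset_id: str
    provider: str
    region: str
    service: str
    tags: dict = field(default_factory=dict)


def find_violations(assets):
    """Return {asset_id: [reasons]} for assets that break catalog or tagging policy."""
    violations = {}
    for asset in assets:
        reasons = []
        if asset.service not in APPROVED_SERVICES:
            reasons.append(f"unsanctioned service: {asset.service}")
        missing = sorted(REQUIRED_TAGS - set(asset.tags))
        if missing:
            reasons.append("missing tags: " + ", ".join(missing))
        if reasons:
            violations[asset.asset_id] = reasons
    return violations


# Hypothetical inventory already normalized across providers.
inventory = [
    Asset("vm-001", "cloud-a", "us-east", "compute",
          {"owner": "payments", "environment": "prod"}),
    Asset("ml-007", "cloud-b", "eu-west", "ml_notebook",
          {"owner": "research", "environment": "dev"}),
    Asset("bkt-042", "cloud-a", "us-west", "object_storage",
          {"owner": "marketing"}),
]

report = find_violations(inventory)
for asset_id, reasons in report.items():
    print(asset_id, reasons)
```

The point of the sketch is the shape of the check rather than the specifics: once asset data from every provider lands in one normalized form, a single policy pass can flag both unsanctioned services and tagging gaps.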

Prioritizing the Focus

Cloud operations teams also focus on patching and updating existing assets and services. For many, the difficulty lies in understanding which services should be the focus of these activities. If an organization has 10,000 compute instances across multiple clouds, and 80% of them have a brand new vulnerability just reported in the news, that data point alone does not really help identify the instances that pose an imminent risk. A cloud security operating platform should be able to understand and correlate other risk signals to help focus the team’s response.

Upon learning of the new vulnerability, operations teams should be able to identify which vulnerable assets also have other risk attributes, such as:

Public exposure
Access to storage accounts or buckets containing sensitive data
Highly privileged identities and entitlements
Environmental tags

These additional signals are what allow cloud operations teams to prioritize those assets that represent the highest risk.
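One simple way to picture this correlation is a weighted score over the signals listed above. The Python sketch below is only an illustration under stated assumptions: the Instance fields, the WEIGHTS values, and the risk_score and prioritize functions are hypothetical, and the weights are arbitrary rather than drawn from any real scoring model.

```python
from dataclasses import dataclass

# Hypothetical weights for each additional risk signal (illustrative only).
WEIGHTS = {
    "public": 4,
    "sensitive_data_access": 3,
    "high_privilege": 2,
    "production": 1,
}


@dataclass
class Instance:
    instance_id: str
    vulnerable: bool
    public: bool = False
    sensitive_data_access: bool = False
    high_privilege: bool = False
    environment: str = "dev"  # taken from an environment tag


def risk_score(inst):
    """Score a vulnerable instance by the other risk signals it carries."""
    if not inst.vulnerable:
        return 0
    score = 1  # baseline for having the vulnerability at all
    if inst.public:
        score += WEIGHTS["public"]
    if inst.sensitive_data_access:
        score += WEIGHTS["sensitive_data_access"]
    if inst.high_privilege:
        score += WEIGHTS["high_privilege"]
    if inst.environment == "prod":
        score += WEIGHTS["production"]
    return score


def prioritize(instances):
    """Return vulnerable instances, highest risk first (ties broken by id)."""
    vulnerable = [i for i in instances if i.vulnerable]
    return sorted(vulnerable, key=lambda i: (-risk_score(i), i.instance_id))


fleet = [
    Instance("i-dev", vulnerable=True),
    Instance("i-batch", vulnerable=True, high_privilege=True, environment="prod"),
    Instance("i-web", vulnerable=True, public=True,
             sensitive_data_access=True, environment="prod"),
    Instance("i-patched", vulnerable=False, public=True),
]

for inst in prioritize(fleet):
    print(inst.instance_id, risk_score(inst))
```

Even with made-up weights, the ordering shows the idea: a publicly exposed production instance with a path to sensitive data rises to the top, while an equally vulnerable dev instance with no other signals can wait.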

Who Has Access?

Cloud operations teams also need to look beyond asset configurations and understand the identities and entitlements related to those assets. Many organizations start their “cloud journey” (come on, you know I had to use that term at least once) with an LOB asking whether an application that is currently running on-prem can run effectively in a public cloud environment. The benefits of scale, elasticity, global reach, and cost can drive these evaluations. Security often does not. In fact, application owners might not have deep background knowledge of cloud infrastructure concepts.

The differences between identity and access management (IAM) in a traditional data center and a public cloud environment can also be extremely daunting. Roles are assigned not only to humans but also to services (e.g. compute, secrets, storage), and the interplay between them can be complex. An application owner may simply find it expedient in these early evaluations to give themselves (or cloud services) “root”-like privileges just to remove some of these complexities. Over time, as more applications move to the public cloud, these overly broad entitlements become a threat vector in and of themselves.

As the enterprise matures and develops a robust cloud operations structure, it becomes incumbent on the team to go back and work toward a least-privilege model. On the surface, IAM and asset management seem like different disciplines, and one can make the argument that they are. However, it is critical for the operations team to integrate its view and understanding of identity and asset management into a single platform. It is incredibly difficult, if not impossible, to do this with two different solutions.

For example, consider a scenario where a compute administrator has full access to the console of various instances but has no direct entitlement to a storage bucket. If one or more compute instances have access to that bucket, the compute administrator can potentially assume the role of an instance, gaining indirect access to the bucket. Operations teams need to be able to see and identify that weakness, where a full compute admin may be able to assume the role attached to a compute instance and use it to access and exfiltrate data from a cloud storage object. No easy task.
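The scenario above is essentially a path-finding problem across identity and asset data. The Python sketch below shows the idea with four small lookup tables; the table names, the identities, and the indirect_bucket_access function are all hypothetical, and real cloud IAM evaluation involves far more than this (policy conditions, deny rules, and so on).

```python
def indirect_bucket_access(direct_access, admin_of, instance_role, role_access):
    """Find buckets an identity can reach only by way of an instance role.

    Returns {identity: {bucket: [instances that provide the path]}}.
    """
    findings = {}
    for identity, instances in admin_of.items():
        direct = direct_access.get(identity, set())
        for instance in sorted(instances):
            role = instance_role.get(instance)
            for bucket in sorted(role_access.get(role, set())):
                if bucket not in direct:
                    findings.setdefault(identity, {}).setdefault(bucket, []).append(instance)
    return findings


# Hypothetical entitlement data.
direct_access = {                  # identity -> buckets granted directly
    "compute-admin": set(),
    "data-engineer": {"customer-records"},
}
admin_of = {                       # identity -> instances it fully administers
    "compute-admin": {"vm-1", "vm-2"},
    "data-engineer": {"vm-2"},
}
instance_role = {"vm-1": "role-web", "vm-2": "role-etl"}   # instance -> attached role
role_access = {"role-etl": {"customer-records"}}           # role -> buckets it can read

paths = indirect_bucket_access(direct_access, admin_of, instance_role, role_access)
for identity, buckets in paths.items():
    print(identity, buckets)
```

Here the compute admin is flagged because vm-2 carries a role that can read the bucket, while the data engineer is not, since that access is already granted directly. The join only works because identity data and asset data sit side by side, which is the argument for a single platform.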

Conclusion

There are myriad requirements for the modern cloud operations team today. Understanding their daily responsibilities to the enterprise and mapping those requirements to a platform capability is the approach that guides the development of Zscaler Posture Control.

Please check out the other parts of this series as we examine the requirements of other teams within a public cloud enterprise. We will continue to examine how Zscaler is designing platforms from the ground up to address those requirements while reducing the manual stitching together of individual point solutions, lowering costs for customers while delivering critical insights in an ever-complex multi-cloud world.

See the power of Zscaler Posture Control with our free cloud security risk assessment.