Locking down the git in GitHub

Jan 23, 2023

For security professionals, concerns about third-party security risk predate cloud computing. But as SaaS-based applications have exploded and integrations have tightened, the risk has grown. This is especially true for organizations that rely on source code repositories like GitHub, which recently made news for two high-profile breaches involving customer data.

The first involved Okta, a prominent identity management and authentication solution provider. GitHub notified Okta in late December of suspicious access to its code repositories. Subsequent analysis revealed that hackers had obtained code used by Okta’s enterprise-class Workforce Identity Cloud. While the hackers did not gain access to Okta’s customer data, the IAM provider nevertheless began restricting access to its GitHub files. It shut down GitHub integrations with third-party Okta apps until it could thoroughly investigate the problem. The specific means by which the hackers gained access have not been made public.

The second breach, which occurred in the first week of January, targeted Slack’s chat service. As with Okta, hackers managed to access and download certain Slack source code stored in private repositories at GitHub. Similarly, the Slack breach did not affect up-and-running services, and hackers did not gain access to Slack’s customer data. Unlike in the Okta breach, however, Slack’s primary codebase remained uncompromised throughout. The access method is also clearer: according to Slack, the hackers stole a limited number of employees’ access tokens.

Researchers remain unsure whether the two events are related, though GitHub did sustain similar breaches in April of last year, in which attackers used stolen OAuth tokens.

Dialing down the potential vulnerability of third-party code repositories

Organizations using GitHub, the world’s most popular source code repository, don’t own or manage GitHub’s architecture and can’t directly oversee its security. What steps can they take to protect themselves against a similar breach? Should they continue to use GitHub at all?

The answer depends on the organization, because each has its own context, requirements, and perceived vulnerability to stolen code. Many organizations, including Okta, do not rely on private code for security, so the potential exposure is of secondary concern. Furthermore, since GitHub is the leading version control solution worldwide, it’s likely to remain the go-to resource for enterprise-class software development teams.

That being the case, it’s worth considering how to apply security best practices and solutions to address a similar breach (or any other involving a third-party vendor), and how these operational, management, and technical controls are complementary to zero trust outcomes. 

I suggest the following:

Audit the security of third-party services thoroughly and frequently. Verify access control, server patching, and other practices to ensure they meet your expectations and minimize the odds of a breach. This applies to the software supply chain generally, as major breaches like the SolarWinds and Codecov attacks emphasized. Minimize the damage any one vendor can cause your operations if it is breached. Many SaaS vendors have established customer-facing trust portals or a chief trust officer function that, among other responsibilities, maintains customer relationships through a trust-oriented lens by being transparent about data handling security and ethics, supply chain vulnerabilities, and other compliance mandates.

Conduct continuous red team exercises that try to attack, breach, and exploit your GitHub repository or similar cloud services, ideally with an automated attack simulation solution. The goal should be to expose existing security flaws continuously rather than annually. I believe there is value in leveraging an independent vendor to perform targeted or specialized assessments, but the threat landscape is far too volatile, and too subject to SaaS providers’ feature and enhancement releases, for annual exercises to be effective. We need automated and continuous assessments to properly assess risk.

Make sure you’ve deployed the latest security patches for anything pertaining to your code repository. For instance, if Slack’s repository was compromised via stolen employee tokens, make it your goal to prevent token theft by inventorying every system or tool that uses tokens, analyzing how they are stored and exchanged, and determining what level of monitoring and validation can be implemented, again ideally through automation.

Leverage short-lived rotating tokens and avoid highly privileged tokens by using dedicated tokens for granular access and blast radius reduction. For GitHub users, replace classic personal access tokens (PATs) with the newer fine-grained PATs and restrict them to a specific repository where possible. Finally, have an automated control to identify and revoke unused tokens, and test your forced rotation or revocation capability to build confidence in your response to a stolen token.
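As a minimal sketch of what an automated token review might look like, the Python below checks a token inventory against the practices just described. The TokenRecord fields, the 30-day idle threshold, and the sample data are all assumptions made for illustration; this is not a GitHub API object, and a real control would pull its inventory from whatever system actually issues and tracks your tokens.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class TokenRecord:
    """One row of a hypothetical token inventory (not a GitHub API object)."""
    token_id: str
    kind: str                        # "classic" or "fine-grained"
    repositories: List[str]          # repositories in scope; empty means unrestricted
    last_used: Optional[datetime]    # None if the token has never been used
    expires_at: Optional[datetime]   # None if the token never expires


def review_token(token: TokenRecord, now: datetime,
                 max_idle: timedelta = timedelta(days=30)) -> List[str]:
    """Return the reasons a token should be replaced or revoked (empty if none)."""
    findings = []
    if token.kind == "classic":
        findings.append("classic PAT: replace with a fine-grained PAT")
    if not token.repositories:
        findings.append("not restricted to specific repositories")
    if token.expires_at is None:
        findings.append("no expiry: use a short-lived token")
    if token.last_used is None or now - token.last_used > max_idle:
        findings.append("unused: revoke")
    return findings


# Illustrative data only; the token names and repository are made up.
NOW = datetime(2023, 1, 23, tzinfo=timezone.utc)
inventory = [
    TokenRecord("ci-deploy", "fine-grained", ["org/app"],
                NOW - timedelta(days=2), NOW + timedelta(days=7)),
    TokenRecord("old-script", "classic", [], None, None),
]

report = {t.token_id: review_token(t, NOW) for t in inventory}
for token_id, findings in report.items():
    print(token_id, findings or "ok")
```

Returning a list of findings rather than a single pass/fail flag keeps the reason for each revocation visible, which is useful when testing a rotation or revocation process.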

Limit repository access by applying role-based access (RBA) and multi-factor authentication (MFA) to employees. Any time access is granted, an access trail should document that process, since this is key intel for analyzing a breach. If a breach would be of especially high consequence, more robust MFA practices, such as hardware-based tokens that are unusually difficult for attackers to obtain or compromise, may be called for.

How zero trust provides additional security for third-party code repositories

Zero trust principles can also help mitigate the consequences of a breach of a third-party vendor or sanctioned application by minimizing reachability and risk. Reachability is the degree to which a vulnerability can be attacked and exploited to gain privileged access, directly or indirectly, to an asset. As one of my colleagues is wont to say, “if it isn’t reachable, it isn’t breachable.” This perfectly sums up the relationship between reachability and risk. In general, less reachability means less risk.

With zero trust architecture, applications are simply destinations, regardless of where they reside. This is important with regard to GitHub, because it can be deployed as a SaaS application on the internet or as a private application in a private data center or public cloud environment. Many organizations use both, with a service enabled to connect them.

Zero trust architecture provides a collection of integrated, cloud-centric security capabilities that facilitate safe access to websites, software-as-a-service (SaaS) applications, and private applications. 

Let’s address recommended protections for our SaaS GitHub tenant while the engineering team is deploying the enterprise server to an on-premises virtualization hypervisor:

Enable API integration between GitHub and our zero trust architecture. This API integration will establish out-of-band CASB and provide us visibility over data at rest in our sanctioned GitHub SaaS tenant. With this visibility, we can now apply data loss and malware protection controls, and we have the flexibility to make those controls preventive or detective. We will also have SaaS security posture management (SSPM), as our API will continuously scan all resource types, repositories and tenant configuration items against our defined policies. For example, SSPM will notify or take action if MFA isn’t enabled. Assuming best practices with our configuration, I would want to know if secret scanning, code scanning, or Dependabot alerts were disabled on repositories.
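To make the posture checks concrete, here is a minimal sketch in Python of the kind of evaluation an SSPM capability performs. It assumes a hypothetical tenant snapshot expressed as a plain dictionary; the field names, repository names, and the evaluate_tenant function are illustrative and do not correspond to any particular vendor’s or GitHub’s API.

```python
from typing import Dict, List

# Hypothetical policy: the repository settings called out above.
REQUIRED_REPO_SETTINGS = ("secret_scanning", "code_scanning", "dependabot_alerts")


def evaluate_tenant(tenant: Dict) -> List[str]:
    """Compare a tenant configuration snapshot against the defined policy."""
    findings = []
    if not tenant.get("mfa_required", False):
        findings.append("tenant: MFA is not enforced")
    for repo in tenant.get("repositories", []):
        for setting in REQUIRED_REPO_SETTINGS:
            # A missing setting is treated the same as a disabled one.
            if not repo.get(setting, False):
                findings.append(f"{repo['name']}: {setting} is disabled")
    return findings


# Illustrative snapshot; the repository names are made up.
snapshot = {
    "mfa_required": True,
    "repositories": [
        {"name": "payments-api", "secret_scanning": True,
         "code_scanning": True, "dependabot_alerts": True},
        {"name": "legacy-tools", "secret_scanning": False,
         "code_scanning": True},
    ],
}

findings = evaluate_tenant(snapshot)
for finding in findings:
    print(finding)
```

Treating an absent setting as disabled is a deliberate fail-closed choice in this sketch: a repository that does not report a control should be flagged rather than assumed compliant.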

Enable an inline contextual CASB policy to enforce access control for authorized users. Assume we have integrated our GitHub tenant with our identity provider (IdP), established our identity and authentication requirements (MFA), and defined a security group. We will pass that group membership as a SAML attribute from our IdP to our zero trust architecture, with SCIM supporting near real-time changes and access authorizations. Let’s use that as our first policy criterion and supplement it with high device trust, where the device requires our managed endpoint protection solution, full disk encryption (FDE), and a client certificate issued by our internal root certificate authority (CA).
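The two criteria, group membership and high device trust, can be sketched as a simple policy evaluation in Python. Everything named here (the AccessRequest fields, the group name, the CA name) is a hypothetical stand-in; an actual zero trust platform expresses this as policy configuration rather than code.

```python
from dataclasses import dataclass
from typing import FrozenSet

# Hypothetical values used only for illustration.
REQUIRED_GROUP = "github-developers"
TRUSTED_ISSUER = "Example Internal Root CA"


@dataclass(frozen=True)
class AccessRequest:
    user: str
    groups: FrozenSet[str]      # group membership asserted by the IdP
    endpoint_protection: bool   # managed endpoint protection is running
    disk_encrypted: bool        # full disk encryption is enabled
    client_cert_issuer: str     # issuer of the device certificate, "" if none


def device_trust_is_high(req: AccessRequest) -> bool:
    """High device trust requires all three posture checks to pass."""
    return (req.endpoint_protection
            and req.disk_encrypted
            and req.client_cert_issuer == TRUSTED_ISSUER)


def is_allowed(req: AccessRequest) -> bool:
    """Allow only when both identity and device criteria are met."""
    return REQUIRED_GROUP in req.groups and device_trust_is_high(req)


managed = AccessRequest("alice", frozenset({"github-developers"}),
                        True, True, TRUSTED_ISSUER)
unmanaged = AccessRequest("alice", frozenset({"github-developers"}),
                          False, True, "")
wrong_group = AccessRequest("bob", frozenset({"finance"}),
                            True, True, TRUSTED_ISSUER)

for req in (managed, unmanaged, wrong_group):
    print(req.user, "allow" if is_allowed(req) else "deny")
```

The same user is allowed from a managed device and denied from an unmanaged one, which is the point of combining identity with device posture.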

I have the foundational access policy in place, and I have options to allow or restrict uploading or downloading, as well as the ability to isolate authorized sessions for further control, perhaps only when access is initiated from certain locations. NOTE: Because GitHub allows users to create unique vanity URLs associated with differing repositories, I would have to create several custom URL categories (for restricted uploads, non-code-related services, and others used by the GitHub CLI) to properly track, monitor, and control for risk, along with a few URL filtering rules limited to only the required HTTP request methods. Let’s make yet another, and hopefully final, supposition that I’ve done that, and I’ll spare you the laborious administrivia.

Create a cloud firewall rule for non-web GitHub services. We need to create a custom network service with the two TCP destination ports that the GitHub CLI requires to move files between a managed device and our sanctioned SaaS GitHub repository. That rule will also have granular criteria that include our device trust and security group context.

I’ve received word that our GitHub enterprise server is operational in our data center, and we have our zero trust architecture application controllers already provisioned on that enclave.

Create a logical application segment for our source code repository. Our enterprise server is discoverable, and we will add its freshly minted FQDN into our new source code application segment. Before we move data onto this server, we must ensure all required services are running, and using dynamic application discovery we test all functional scenarios to see which ports and protocols are needed. After accessing the management console, administrative shell, maintenance mode, private mode, and configuring backups, we have a handful of standard and ephemeral ports that require TCP. Those have been added to the application segment, and the lights have been turned off for the other 65,500.
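The idea of an application segment that permits only discovered ports and denies everything else can be sketched in Python as follows. The ApplicationSegment class, the FQDN, and the port numbers are placeholders chosen for illustration; the real port list would come from the discovery testing described above.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class ApplicationSegment:
    """A named set of destinations; anything not listed is denied."""
    name: str
    fqdns: Set[str] = field(default_factory=set)
    tcp_ports: Set[int] = field(default_factory=set)

    def permits(self, fqdn: str, port: int) -> bool:
        # Default deny: both the host and the port must be explicitly listed.
        return fqdn.lower() in self.fqdns and port in self.tcp_ports


# Placeholder FQDN and ports, not the ports the discovery exercise found.
segment = ApplicationSegment(
    name="source-code",
    fqdns={"github.internal.example.com"},
    tcp_ports={22, 443},
)

checks = [
    ("github.internal.example.com", 443),
    ("github.internal.example.com", 3306),
    ("wiki.internal.example.com", 443),
]
for fqdn, port in checks:
    print(fqdn, port, "allow" if segment.permits(fqdn, port) else "deny")
```

Because the segment stores only what is allowed, there is no separate deny list to maintain; every port not added during discovery stays dark.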

User-to-application segmentation through access policy creation. Leveraging the same SAML attribute from my IdP that I used for our sanctioned GitHub SaaS tenant, along with our defined device posture, only those users and their devices can send a packet to our enterprise GitHub server. Its internal URL isn’t resolvable by any internal or external users who do not meet the access policy criteria, even if they are using our zero trust architecture.

There are several more controls we can put in place such as controlling the outbound communication of our GitHub enterprise server to our GitHub SaaS tenant, which is trivial for our zero trust architecture to handle. We can also establish unmanaged device access for third parties with an approved request. With browser-based access to our private application and our identity proxy to enforce the requisite IAM controls, we can further reduce data loss risk by allowing that access only through remote browser isolation (RBI). 

There are several other requests or usage scenarios we could accommodate at acceptable levels of risk given this highly flexible and dynamic architecture. We will continue to evolve beyond perimeter-based, castle-and-moat control environments toward more cloud adoption, APIs, IoT, SaaS, and distributed systems. With this, we will encounter more potential points for attack ingress and exfiltration egress, but we will also have far more sophisticated control over inherent digital trust and reachability risks. We are building a new perimeter around identity with continuous authorization and zero trust. This is our protect surface.
