Data loss prevention (DLP) is not a simple undertaking. It requires a comprehensive data protection strategy that is supported by critical stakeholders across the business, HR, legal departments, and IT. Gaining alignment on use cases, finding tools that address them, and building a team that will shepherd the implementation and administration of those tools are all crucial.

But even if you have a strategy and you’ve bought a DLP solution that supports advanced detection techniques and inspects encrypted traffic, you still need to address the most crucial aspect of a successful DLP implementation: creating policies that support your goal.

Let’s take the example of protecting payment card information.

Most DLP solutions have preconfigured dictionaries to detect credit card numbers. These dictionaries are based on regular expression patterns and may perform checksum validation. So, you simply set up a basic rule that scans all outbound traffic across all your users and triggers on any 16-digit pattern that fits the regular expression, right? Unfortunately, all the “false positives” you have to weed out will make your head spin.
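To make the naive rule concrete, here is a minimal sketch of what a pattern-based dictionary with checksum validation might look like. The regular expression, the function names, and the restriction to 16-digit numbers are illustrative assumptions, not how any particular DLP product implements its dictionaries.

```python
import re

# Hypothetical pattern: 16 digits, optionally grouped in fours by spaces or hyphens.
CARD_PATTERN = re.compile(r"(?<!\d)\d{4}(?:[ -]?\d{4}){3}(?!\d)")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) > 0 and total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return pattern matches that also pass checksum validation."""
    return [m.group() for m in CARD_PATTERN.finditer(text) if luhn_valid(m.group())]
```

Note that a rule like this would still flag an employee's personal card on a shopping site, because the number itself is perfectly valid. The checksum filters out random digits, not harmless context.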

To clarify, most false positives are not just the algorithm detecting a series of random 16-digit numbers; they are more likely to be actual credit card numbers whose use poses no risk to the business in the context in which they're being used. This could be your employees doing online shopping with a personal credit card or your finance department paying invoices. For DLP administrators, such activities aren't considered data loss incidents.

To avoid false positives, preconfigured pattern-based dictionaries typically come with parameters for context that can be adjusted based on your use case.

If you want to detect the accidental loss of large quantities of credit card numbers, you can adjust the threshold of pattern occurrences that determines when a dictionary will trigger. For example, if you set the threshold to 50, the transaction will be blocked and you will receive an incident notification only if it contains 50 or more patterns that match the regular expression for credit cards.
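As a rough illustration of the threshold idea, the sketch below counts matches in a transaction and blocks only when the count reaches the configured value. The simplified pattern, the `evaluate_transaction` name, and the "block"/"allow" return values are assumptions made for this example.

```python
import re

# Simplified, hypothetical pattern: exactly 16 consecutive digits.
CARD_PATTERN = re.compile(r"(?<!\d)\d{16}(?!\d)")


def evaluate_transaction(text: str, threshold: int = 50) -> str:
    """Return "block" when the number of pattern matches reaches the threshold."""
    match_count = len(CARD_PATTERN.findall(text))
    return "block" if match_count >= threshold else "allow"
```

With a threshold of 50, a single card number typed into a checkout form passes through, while a spreadsheet export containing hundreds of numbers does not.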

If you want to detect the accidental loss of smaller quantities of credit card numbers, you can adjust the amount of context in proximity to the detected pattern, which the DLP system takes into consideration before triggering an incident. There are usually tiered confidence ratings that range from doing a Luhn checksum validation only, to additionally checking if the patterns are represented in common formats, all the way to looking for payment card-related keywords in proximity.
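The tiers described above can be sketched roughly as follows. The tier names ("low", "medium", "high"), the keyword list, and the 40-character proximity window are all assumptions chosen for illustration; actual products define their own levels and vocabularies.

```python
import re

# Hypothetical candidate pattern: 16 digits with optional space or hyphen separators.
CANDIDATE = re.compile(r"(?<!\d)(?:\d[ -]?){15}\d(?!\d)")
# "Common formats" assumed here: unbroken, or four groups of four with one consistent separator.
COMMON_FORMAT = re.compile(r"\d{16}|\d{4}([ -])\d{4}\1\d{4}\1\d{4}")
KEYWORDS = ("visa", "mastercard", "credit card", "card number", "cvv", "expiry")
WINDOW = 40  # characters inspected on each side of a match (assumed value)


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    total = 0
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return len(digits) > 0 and total % 10 == 0


def confidence_levels(text: str) -> list[tuple[str, str]]:
    """Return (match, tier) pairs for every candidate that passes the checksum."""
    results = []
    lowered = text.lower()
    for match in CANDIDATE.finditer(text):
        raw = match.group()
        if not luhn_valid(raw):
            continue  # fails even the lowest tier
        level = "low"  # checksum only
        if COMMON_FORMAT.fullmatch(raw):
            level = "medium"  # checksum plus a common format
            nearby = lowered[max(0, match.start() - WINDOW): match.end() + WINDOW]
            if any(keyword in nearby for keyword in KEYWORDS):
                level = "high"  # checksum, format, and a keyword in proximity
        results.append((raw, level))
    return results
```

A policy could then trigger only on "high" matches when the goal is to catch small quantities of card numbers without flooding the incident queue.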

Using these parameters to tune your policy can significantly increase your detection accuracy and cut down on the false positives that clutter your remediation system.

The limitations of pattern-based dictionaries become obvious when you need to detect the loss of a single record, whether accidental or malicious. For this use case, you should use an advanced detection technique called fingerprinting or exact data match (EDM). This technique can identify any single record of sensitive payment card information (for example, from your customers) that you store in your databases. Because EDM is compute-intensive, you will need a DLP solution that can perform this type of inspection at scale. Learn more about EDM in this blog.
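To give a feel for the idea behind EDM, here is a heavily simplified sketch: known card numbers are turned into keyed hashes ahead of time, and candidates found in traffic are hashed the same way and looked up in that index. The single-field matching, the HMAC-SHA-256 fingerprint, and every name below are assumptions for illustration; commercial EDM engines typically index multiple fields per record and work quite differently at scale.

```python
import hashlib
import hmac
import re

# Hypothetical candidate pattern: 16 digits with optional space or hyphen separators.
CANDIDATE = re.compile(r"(?<!\d)(?:\d[ -]?){15}\d(?!\d)")


def fingerprint(value: str, key: bytes) -> str:
    """Strip non-digits, then compute a keyed SHA-256 hash of the number."""
    normalized = re.sub(r"\D", "", value)
    return hmac.new(key, normalized.encode(), hashlib.sha256).hexdigest()


def build_index(records: list[str], key: bytes) -> set[str]:
    """Fingerprint every stored record so the raw numbers need not be kept in the index."""
    return {fingerprint(record, key) for record in records}


def find_exact_matches(text: str, index: set[str], key: bytes) -> list[str]:
    """Return only the candidates whose fingerprint is present in the index."""
    return [
        match.group()
        for match in CANDIDATE.finditer(text)
        if fingerprint(match.group(), key) in index
    ]
```

Unlike the pattern-based approach, a lookup like this ignores an employee's personal card entirely, because that number was never in the customer database to begin with.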

To set up your DLP program for success, take your use case into account and take advantage of different detection techniques and adjustable parameters as you create your DLP policies to protect payment card information.

To learn how you can protect payment card information with Zscaler, check out this demo video.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Selina Koenig is a product marketing manager at Zscaler.