Exact Data Match: Get rid of the “fake news” in your DLP solution
Any time I turn on the news, check my Facebook feed, watch videos on YouTube, or amuse myself over the latest tweets, I have to wonder about the accuracy of the information I am ingesting. These days, with social media giants shutting down misinformation campaigns by foreign adversaries, and political decision-making processes around the world being silently influenced by those same actors, we are forced to spend too much time processing and weeding out the bad information instead of analyzing and acting upon the good.
I can’t help but sympathize with all you security analysts out there who deal with misinformation—in the form of false positives—in your jobs on a daily basis. How many alerts do you find in your inbox from your data loss prevention solution during a typical day? And how many of those are just clutter, clouding your view? Probably too many, but rest assured that you are not alone! In a study conducted by analyst firm IDC, more than a third of respondents stated that they see up to 10,000 alerts each month. But out of those, a staggering 52 percent are false positives.
To clarify, most false positives are technically real positives: the detection engine did its job and identified content that matches a policy. However, that content does not pose a risk to the business in the context in which it is being used. This is most likely to happen when a comprehensive data protection strategy is absent or a policy isn’t configured accurately.
False positives have actual consequences. They don’t cause any direct harm, but they jam up the system. While you are busy weeding out the bad (false positives), you don’t have time to investigate the good (legitimate alerts). But in comparison to identifying fake news, protecting your data and optimizing the detection rate of your DLP solution isn’t guesswork and it doesn’t require fact checking. It is about making the right decisions to reduce risk and, most importantly, it’s about using the right techniques for the right content.
Exact Data Match is a technique that can detect your unique data, such as credit card numbers, personal IDs, and account numbers. It drastically increases your detection accuracy and reduces false positives to nearly zero. While standard content matching techniques monitor the data that is leaving your network, looking for generic patterns and phrases, Exact Data Match identifies not just any “type” of data, but the specific data that needs to be protected.
Let’s take the example of a credit card number: Monitoring traffic for credit card numbers with a regular expression will trigger alerts anytime anyone in the organization is using a credit card for any reason. That is too many “any”s. With Exact Data Match, only the credit card numbers of your customers or partners that you store in your databases would trigger alerts. The finance department paying bills or employees making online purchases would not trigger alerts, as such activities don’t pose a threat to the organization.
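To make the difference concrete, here is a minimal Python sketch rather than how any particular DLP product is implemented. The pattern `CARD_PATTERN`, the set `PROTECTED_CARDS`, and both function names are hypothetical, and the card numbers are well-known test values. For simplicity the protected numbers are kept in plain text here, which a real solution would avoid.

```python
import re

# Hypothetical generic pattern: 16 digits in groups of four,
# optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b")

# Hypothetical stand-in for the customer card numbers stored in your databases.
PROTECTED_CARDS = {"4111111111111111"}


def regex_matches(text: str) -> list[str]:
    """Return every card-shaped number in the text, separators stripped."""
    return [re.sub(r"[ -]", "", m) for m in CARD_PATTERN.findall(text)]


def exact_matches(text: str) -> list[str]:
    """Return only the card-shaped numbers that are in the protected set."""
    return [n for n in regex_matches(text) if n in PROTECTED_CARDS]


outbound = (
    "Employee purchase with 5500 0000 0000 0004; "
    "export row: 4111-1111-1111-1111"
)
pattern_alerts = regex_matches(outbound)  # flags both numbers
edm_alerts = exact_matches(outbound)      # flags only the customer's number
```

The pattern-only check raises two alerts for this message, while the exact check raises one, for the number that actually belongs to a customer record.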
This sounds great! But how does it work? Exact Data Match fingerprints sensitive data from structured sources, then watches for attempts to move the fingerprinted data. It starts with plain text from a database or Excel spreadsheet, which gets obfuscated (usually by hashing) for privacy reasons. With hashing, an algorithm is applied to the data, turning each value into a fixed-length string (a hash), and those hashes are stored within the DLP solution instead of the original values. The same algorithm is then applied to all outbound traffic. So, when hashed traffic matches the hashes stored in the DLP solution, the transfer will be blocked or trigger an alert.
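The fingerprint-then-match flow described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm of any specific product: the keyed SHA-256 hash (HMAC), the key `SECRET_KEY`, the candidate pattern `TOKEN`, and all function names are hypothetical choices made for the example.

```python
import hashlib
import hmac
import re

# Hypothetical key; a keyed hash makes stored fingerprints harder to guess.
SECRET_KEY = b"demo-key-not-for-production"

# Hypothetical candidate pattern: 13 to 19 digits, optional space/hyphen separators.
TOKEN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def normalize(value: str) -> str:
    """Strip spaces and hyphens so formatting differences don't change the hash."""
    return re.sub(r"[\s-]", "", value)


def fingerprint(value: str) -> str:
    """Hash a normalized value into a fixed-length hex string."""
    data = normalize(value).encode("utf-8")
    return hmac.new(SECRET_KEY, data, hashlib.sha256).hexdigest()


def build_index(records: list[str]) -> set[str]:
    """Fingerprint the structured source; only hashes are kept."""
    return {fingerprint(r) for r in records}


def scan(text: str, index: set[str]) -> list[str]:
    """Hash each candidate in outbound text and report those found in the index."""
    return [normalize(m) for m in TOKEN.findall(text) if fingerprint(m) in index]


index = build_index(["4111 1111 1111 1111"])
hits = scan("Paid with 5500-0000-0000-0004, exported 4111111111111111", index)
```

Only the fingerprinted number is reported, and the index itself holds 64-character hashes rather than the card number. Normalizing before hashing is a design choice made here so that the same number written with or without separators produces the same fingerprint.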
Exact Data Match is a storage- and compute-intensive operation, requiring an underlying platform that accommodates its need for scale.
To conclude, if you want to increase the security posture of your organization, reduce the frustration of end users due to unnecessarily blocked transactions, and spend more of your time investigating and remediating actual incidents of data loss instead of digging through the haystack of false positives, leverage Exact Data Match.
For my part, I will continue weeding out misinformation until someone invents an algorithm that instantly identifies and blocks fake news.
Selina Koenig is a Product Marketing Manager at Zscaler