Data classification: What’s the point of brushing off 'digital dust'?

Wed, 8th Aug 2018


By Jan van Vliet, VP and GM EMEA, Digital Guardian

To a lot of us, data is boring but a necessary evil. Our organisations create documents that end up being stored somewhere (often goodness knows where), gathering the equivalent of digital dust. We're accumulating vast stores of data we have no idea what to do with, and no hope of learning anything useful from. Layer on top the increasingly stringent data protection and compliance requirements, and it's easy to see that we've got our data management work cut out for us.

For IT security professionals, this data deluge only adds to the challenge of identifying, mitigating and remediating the ever-increasing number of cyber threats out there. Much of the advice around security strategy has shifted from attempting to block threats at the perimeter, to protecting what matters most – sensitive data. But therein lies another challenge. With such high volumes of data, how is it possible to quickly identify which data is the highest priority for protection?

Data discovery and classification techniques are not new, but they have seen a recent resurgence. When implemented properly, a solid data classification process will allow the IT security team to identify where sensitive data resides, set policies for handling it, implement appropriate technical controls, and educate users about current threats to the data they work with and best practices for keeping it safe.

Best practice for implementing a data classification strategy

Data classification is not a one-size-fits-all approach. Every company has different needs to address, so a strategy must be tailored accordingly. However, the following five-point action plan can be used to create the foundation of an effective strategy:

Define what's needed

Establish what the goals, objectives and strategic intent of the policy are. Make sure all relevant employees are aware of the policy and understand why it is being put in place. An effective data policy must also balance the confidentiality and privacy of employees/users against the integrity and availability of the data being protected. A policy that's too stringent can alienate staff and impede their ability to carry out their jobs, but if it's too lax, the very data the firm is trying to protect could be at risk.

Establish the scope

It's important to establish where the boundaries will be early on; otherwise the scope can quickly grow out of control. This is particularly important when considering partners and third parties. How far into their networks will, or can, you reach? Equally important is legacy and archived data. Where is it and how will it be protected? Finally, make sure to note anything that's out of scope and ensure this is evaluated and adjusted regularly.

Discover all sensitive data that's in scope

Once data policy and scope have been established, the next task is to identify all the sensitive data that requires classification and protection. First, understand what data is being looked for. This could take many forms, ranging from confidential case files and personally identifiable information, through to client IP, source code, proprietary formulas etc.

Next, focus on where this data is likely to be found, from endpoints and servers, to on-site databases and in the cloud. Remember that discovery is not a one-time event; it should be continuously re-evaluated, taking into account data at rest, data in motion and data in use across all platforms.

Evaluate all appropriate solutions

When the time comes to identify an appropriate data classification solution, there are plenty to choose from. Many of the best solutions today are automated, and classification can be context-based (file type, location etc) and/or content-based (fingerprint, RegEx etc). The automated option can be expensive and require a high degree of fine tuning, but once up and running it is extremely fast and classification can be repeated as often as desired.
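To make the difference between context-based and content-based classification concrete, here is a minimal Python sketch. It is not how any particular product works: the labels, folder names, file extensions and the regular expressions are all hypothetical and chosen purely for illustration. Context rules look only at where a file lives and what type it is; content rules look at the text inside it. The most restrictive matching label wins.

```python
import re
from pathlib import PurePosixPath

# Classification labels, ordered from least to most restrictive (assumed scheme).
LEVELS = ["public", "internal", "confidential"]

# Hypothetical context rules: decided from location and file type alone.
SENSITIVE_DIRS = {"case_files", "hr"}
SOURCE_EXTENSIONS = {".py", ".java", ".c"}

# Hypothetical content rules: label -> regular expression run over the text.
CONTENT_RULES = {
    # Illustrative pattern for an identifier shaped like 123-45-6789.
    "confidential": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "internal": re.compile(r"\binternal use only\b", re.IGNORECASE),
}


def classify(path: str, text: str) -> str:
    """Return the most restrictive label matched by any context or content rule."""
    labels = ["public"]  # default when nothing matches
    p = PurePosixPath(path)

    # Context: any parent folder on the sensitive list.
    if SENSITIVE_DIRS & set(p.parts[:-1]):
        labels.append("confidential")
    # Context: source code files by extension.
    if p.suffix.lower() in SOURCE_EXTENSIONS:
        labels.append("internal")

    # Content: regular expression matches inside the document text.
    for label, pattern in CONTENT_RULES.items():
        if pattern.search(text):
            labels.append(label)

    return max(labels, key=LEVELS.index)
```

For example, `classify("shared/notes.txt", "ID 123-45-6789")` is labelled by its content, while `classify("case_files/smith/brief.txt", "nothing notable")` is labelled by its location alone. Real rule sets need the fine tuning described above to keep false positives manageable.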

An alternative to automated solutions is a manual approach, which allows users themselves to choose the classification of a file. This approach relies on a data expert to lead the classification process and can be time intensive, but in law firms where the correct classification is intricate and/or subjective, a manual approach can often be preferable.

A final option is to outsource the classification process to a service provider or consulting firm. This approach is rarely the most efficient or cost effective, but can provide a one-time classification of data and give any firm a good idea of where it stands in terms of compliance and risk.

Ensure feedback mechanisms are in place

The final stage is to ensure there are effective feedback mechanisms in place that allow swift reporting both up and down the firm's hierarchy. As part of this, data flow should be analysed regularly to ensure classified data isn't moving in unauthorised ways or resting in places it shouldn't be. Any issues or discrepancies should be immediately flagged for follow up.
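As a rough illustration of the regular data flow check described above, the following Python sketch compares observed movements of classified documents against a policy of permitted locations and returns anything that should be flagged. The `FlowEvent` structure, the location names and the `ALLOWED_LOCATIONS` policy are assumptions made for this example, not part of any specific tool.

```python
from dataclasses import dataclass

# Hypothetical policy: where documents with each label may rest or be sent.
ALLOWED_LOCATIONS = {
    "public": {"file_server", "cloud_share", "endpoint", "email"},
    "internal": {"file_server", "endpoint"},
    "confidential": {"file_server"},
}


@dataclass(frozen=True)
class FlowEvent:
    """One observed movement or resting place of a classified document."""
    document: str
    label: str
    destination: str


def find_violations(events):
    """Return the events whose destination is not permitted for their label.

    Events with an unknown label are also returned, so they get reviewed
    rather than silently ignored.
    """
    return [
        e for e in events
        if e.destination not in ALLOWED_LOCATIONS.get(e.label, set())
    ]
```

Running `find_violations` over a day's events and passing the result to whoever owns follow-up is one simple way to feed the reporting loop; how events are collected in the first place depends entirely on the tooling in use.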

With data now playing a pivotal role in nearly every firm around the world, the ability to track, classify and protect it is critical. An effective data classification strategy should form the cornerstone of any modern security initiative, allowing firms to quickly identify the data most valuable to them and their clients, and ensure it is safe at all times.
