Building an unstructured data protection program

Learn how to develop a holistic approach to managing the risks associated with unstructured data.

Electronic data is woven into the fabric of nearly every business process in today's financial services organizations. This data is critical to daily operations, yet it is rare for the full spectrum of associated risks to be considered, measured and addressed holistically. Sensitive data, such as personally identifying information, material non-public information, financial reports and intellectual property, poses a risk to organizations. Complicating the matter further, the majority of business data, up to 80% according to IDC, resides in an unstructured format. Examples include Microsoft Office documents, PDFs and database extracts, often stored in collaborative, shared-folder environments. Organizations often have difficulty answering the following questions related to unstructured data:

  • What data is sensitive to my organization?
  • Where is this data stored?
  • Who has access to this data, and is the access profile appropriate?

Last year, 40% of respondents to the Ernst & Young Global Information Security Survey reported that implementing or improving data loss prevention technologies would be their second-highest security priority in 2010. Data protection efforts are most successful when all the dimensions of unstructured data risk are understood, appropriate business owners for the data are identified and existing processes and systems are leveraged to drive an unstructured data protection program from concept to reality.

Defining the dimensions of unstructured data risk

The three primary dimensions of unstructured data risk are data sensitivity, sensitive data location and inappropriate access to data. Understanding and quantifying these three dimensions are the first steps toward executing a successful unstructured data protection program.

  • Data sensitivity is perhaps the most familiar of the three dimensions of unstructured data risk. The definition of sensitive data must take into account an organization's regulatory environment, risk appetite and corporate culture. The trade-off between user experience and the enforcement of an unstructured data policy must also be considered.
  • Sensitive data location, both from the perspective of physical infrastructure and logical organization, is a critical component of protecting unstructured data. It also enables the identification of proper data ownership. Management must decide which environments are appropriate for housing sensitive data and which environments are considered "open access," and thus should house only public information.
  • Inappropriate access to unstructured data, defined as users having access to sensitive data with no business justification, increases the risk of sensitive data leakage. Access to data is typically provisioned, yet seldom revoked. Over time, this practice results in an excessive number of users with unnecessary access to data. Recent technologies assist in the reduction of excessive access by using statistical methods to recommend access revocation for resources that are not actively used by a particular user. The recommendations for reductions in access can also be identified during the entitlement review process, which detects and eliminates other instances of inappropriate access.
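The idea of recommending revocation for unused access can be illustrated with a minimal sketch. It swaps the statistical methods mentioned above for a much simpler idle-time threshold, and everything in it is an assumption made for illustration: the function name `recommend_revocations`, the 90-day default, and the sample users and paths do not come from any particular product.

```python
from datetime import date, timedelta


def recommend_revocations(entitlements, last_access, today, max_idle_days=90):
    """Return (user, resource) pairs whose access has gone unused for too long.

    entitlements: iterable of (user, resource) pairs that are currently granted.
    last_access:  dict mapping (user, resource) to the date of the last use.
    The 90-day default is an illustrative assumption, not a recommended policy.
    """
    cutoff = today - timedelta(days=max_idle_days)
    candidates = []
    for user, resource in entitlements:
        last_used = last_access.get((user, resource))
        # Never used, or not used since the cutoff: flag for owner review.
        if last_used is None or last_used < cutoff:
            candidates.append((user, resource))
    return sorted(candidates)


# Hypothetical sample data.
entitlements = [
    ("alice", "/finance/reports"),
    ("bob", "/finance/reports"),
    ("carol", "/hr/payroll"),
]
last_access = {
    ("alice", "/finance/reports"): date(2010, 5, 20),
    ("bob", "/finance/reports"): date(2009, 11, 2),
}

recommendations = recommend_revocations(entitlements, last_access, today=date(2010, 6, 1))
print(recommendations)
```

In this sketch the output is only a recommendation list; the decision to revoke would still rest with whoever performs the entitlement review.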

Designing a data protection program with a thorough understanding of these three dimensions sets the stage for identifying data ownership and building a sustainable data protection program. When all the dimensions are understood holistically, the initiative can be sustained by adopting a logical approach to access control, building upon existing processes and executing change.

Adopting a logical approach to access control

One of the first steps is to identify appropriate business data owners and create a sustainable, controlled, collaborative environment. Business data owners understand the context of sensitive data and are therefore better positioned to decide which users should have access to it. IT functions have the ability to control access, yet often lack the context or understanding of the data necessary to make well-informed access decisions. Emerging technologies for data governance, combined with statistical modeling, knowledge of the organization's regulatory environment and proven experience, can accelerate the process of identifying business data owners and creating a sustainable, controlled file-share environment. Business owners can then approve or deny access requests going forward and perform periodic access reviews, which often vastly reduce excessive access to file repositories and help keep it under control.
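As a rough illustration of how usage data might help surface a candidate data owner, the sketch below suggests, for each folder, the department responsible for most of its activity. It is a deliberately simple majority count, not the statistical modeling a governance product would apply, and the function name `suggest_owner_department`, the event format and the sample names are all hypothetical.

```python
from collections import Counter


def suggest_owner_department(access_events, user_department):
    """Map each folder to (department, share of activity) for its most active department.

    access_events:   iterable of (user, folder) pairs, one per recorded access.
    user_department: dict mapping user to department; unknown users are grouped
                     under "unknown".
    """
    per_folder = {}
    for user, folder in access_events:
        department = user_department.get(user, "unknown")
        per_folder.setdefault(folder, Counter())[department] += 1

    suggestions = {}
    for folder, counts in per_folder.items():
        department, hits = counts.most_common(1)[0]
        suggestions[folder] = (department, hits / sum(counts.values()))
    return suggestions


# Hypothetical sample data.
events = [
    ("alice", "/deals"),
    ("alice", "/deals"),
    ("bob", "/deals"),
    ("carol", "/deals"),
]
departments = {"alice": "Corporate Finance", "bob": "Corporate Finance", "carol": "IT"}

suggestions = suggest_owner_department(events, departments)
print(suggestions)
```

A result like this is only a starting point for a conversation: the suggested department still has to confirm that it owns the data and name an accountable individual.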

While business data ownership is a critical component of the unstructured data protection value proposition, additional techniques can be used to maximize value while expediting the overall effort and reducing the impact to business processes. These techniques are discussed below.

Building upon existing processes

Organizations may be wary of undertaking a large unstructured data protection program due to the daunting scale of the task. However, there are several tactics that can reduce the impact of the effort. The initiative does not always have to start with a "clean-slate" approach. Existing processes and technologies can and should be leveraged to reduce the effort required to tackle unstructured data protection and to ease the transition into a controlled file-share environment.

Financial services organizations often employ a data access request and approval process. By redirecting access decisions to the business data owner, access is more likely to be limited to those with a valid business reason for it. Furthermore, there is often an access review process for high-risk applications. By expanding this entitlement review process to file shares, especially those containing sensitive data, the risk of sensitive data leakage is reduced.
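Extending an entitlement review to file shares amounts to giving each business data owner a list of who can access what. A minimal sketch follows, assuming a hypothetical catalog that records an owner and a sensitivity flag for each share; the function name, the data layout and the sample values are illustrative only.

```python
from collections import defaultdict


def build_review_worklists(shares, entitlements):
    """Group entitlements by share owner, listing sensitive shares first.

    shares:       dict mapping share path to {"owner": str, "sensitive": bool}.
    entitlements: iterable of (user, share path) pairs that are currently granted.
    Returns a dict mapping owner to a list of (share, user, sensitive) tuples.
    """
    worklists = defaultdict(list)
    for user, share in entitlements:
        info = shares[share]
        worklists[info["owner"]].append((share, user, info["sensitive"]))
    # Sort so sensitive shares come first, then by share path and user name.
    return {
        owner: sorted(items, key=lambda item: (not item[2], item[0], item[1]))
        for owner, items in worklists.items()
    }


# Hypothetical sample data.
shares = {
    "/finance/reports": {"owner": "cfo_office", "sensitive": True},
    "/finance/templates": {"owner": "cfo_office", "sensitive": False},
    "/marketing/brochures": {"owner": "marketing_lead", "sensitive": False},
}
entitlements = [
    ("alice", "/finance/templates"),
    ("bob", "/finance/reports"),
    ("carol", "/marketing/brochures"),
]

worklists = build_review_worklists(shares, entitlements)
for owner, items in worklists.items():
    print(owner, items)
```

Ordering sensitive shares first reflects the article's emphasis on shares that contain sensitive data; an organization could equally prioritize by other risk criteria.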

Finally, an organization's data lifecycle management policies and procedures may be modified to include the retirement and disposal of sensitive unstructured data. Timely disposal is particularly important for unstructured data, as much of this data goes stale soon after creation. Adjusting existing processes for access provisioning, data management and data classification to enable a sustainable controlled environment will reduce the likelihood of disruption, which is often caused by lengthy access request, approval and provisioning cycles or by a lack of transparency around who has access to what resources. Request turnaround time and visibility into access are measures that business units can understand and appreciate.
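One simple way to find candidates for retirement is to list files that have not been modified within a retention window. The sketch below uses last-modified time as a proxy for staleness, which is an assumption: a real policy might rely on last-access time, classification or legal-hold status instead. The function name `find_stale_files`, the one-year window and the demo file names are hypothetical.

```python
import os
import tempfile
import time

SECONDS_PER_DAY = 86400


def find_stale_files(root, max_age_days, now=None):
    """Return sorted paths under root whose last modification is older than max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * SECONDS_PER_DAY
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                stale.append(path)
    return sorted(stale)


# Demo on a throwaway directory: one file is backdated by two years.
with tempfile.TemporaryDirectory() as root:
    old_path = os.path.join(root, "q1_extract.csv")
    new_path = os.path.join(root, "draft.docx")
    for path in (old_path, new_path):
        with open(path, "w") as handle:
            handle.write("demo")
    now = time.time()
    two_years_ago = now - 730 * SECONDS_PER_DAY
    os.utime(old_path, (two_years_ago, two_years_ago))

    stale_names = [
        os.path.basename(p) for p in find_stale_files(root, max_age_days=365, now=now)
    ]

print(stale_names)
```

The function only reports candidates; deciding whether to archive or delete them would remain with the data owner and the organization's retention policy.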

Executing change

Data has taken center stage as companies respond to emerging regulatory requirements and set their sights on growth. While data protection has long been a high priority, efforts have historically focused on structured data. Now, however, new technologies make it easier to capture, quantify and manage unstructured data. Motivated by the rapid growth of unstructured data and the increased cost of breach notifications, companies are eager to bring clarity to this complex situation.

While an unstructured data protection program may require considerable effort from business users, process efficiencies will be gained in the audit and regulatory compliance space. For example, IT auditors often demand proof of entitlement reviews, complete with request and approval sign-offs. This request can be easily fulfilled after completion of an unstructured data protection effort.

As with any change in accountability, there will be resistance. Management must gain the buy-in of business data owners by communicating the less obvious but equally tangible value to the business. Submitting access requests to IT and justifying the request to someone with limited knowledge of the business can be time-consuming and frustrating. But as data owners, business users have the ability to serve themselves, performing access changes in real time on an as-needed basis. This capability also leads to greater confidence that only users with a valid business reason can access the business data.

With an unstructured data protection program in place, organizations will gain more certainty around what data is sensitive, where this data is located and who has access to it. By taking a holistic approach to managing the risks of unstructured data, aligning accountability with business responsibility and building upon processes already in place, organizations can take unstructured data protection from concept to reality.

About the author:

Gary Lorenz is an executive in the Financial Services Office of Ernst & Young LLP. Gary is based in Charlotte, NC and can be reached at +1 704 331 1868. The views expressed in this article are those of the author and do not necessarily reflect the views of Ernst & Young LLP.
