Rampant growth in unstructured data is not a new phenomenon; however, this type of data has typically proven challenging to manage effectively, says Chris Hathaway, director at Soarsoft International.

This challenge is only exacerbated by the risks, and ultimately the costs, of dealing with the problem in a piecemeal, reactive rather than proactive way. Organisations that cannot effectively manage their unstructured data not only open themselves up to risk, but also face the ramifications of non-compliance with laws such as the recently enacted Protection of Personal Information (POPI) Act, among other data management and governance requirements.

The reality is that manual data management practices can never cope with the challenge of managing and monitoring the unstructured data that information workers continuously generate. Organisations need to leverage file analysis technology that keeps them in line with best practices around information governance, detects risky information, and helps them develop a proactive approach to managing data and reducing costs.

It is incredible how much data organisations retain on an annual basis, simply because it cannot be properly identified to ascertain whether or not it has any business relevance. To reduce this risk and cost, organisations need to understand what data exists and where, how old it is, who owns it, who can access it and more. File analysis is an essential tool in the journey towards more effective data management, and information governance as a whole. This technology provides a comprehensive and actionable map of an organisation’s unstructured data, access rights and ownership – a foundation that is critical if this data is to be effectively managed and usefully leveraged.

File analysis tools offer a centralised view of, and easier control over, the vast volumes of unstructured data within an organisation, by creating a complete index of users and groups together with a comprehensive in-place map of unstructured data, its owners and its permissions.
Importantly, this view is obtained quickly, and without impacting end users, by managing the data “in place” where it currently resides. File analysis uses metadata – the data about the data – to deliver this visibility and control, rather than requiring the data to be copied and moved to yet another repository that would itself need to be managed in yet another costly process. File analysis shrinks costs quickly by reducing the amount of data stored, and it facilitates the classification of valuable business data so that it can be more easily found and leveraged for day-to-day activities, as well as supporting e-discovery efforts in investigations.
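To make the idea of metadata-driven, in-place analysis concrete, here is a minimal illustrative sketch in Python. It is not any vendor's actual product logic, just a demonstration of the principle: the index is built entirely from filesystem metadata (size, modification time, owner), and file contents are never read, copied or moved.

```python
import os
import time
from dataclasses import dataclass

@dataclass
class FileRecord:
    path: str
    size_bytes: int
    age_days: float
    owner_uid: int  # numeric owner; on Unix this can be resolved to a name via the pwd module

def scan_in_place(root: str) -> list:
    """Index files under `root` using metadata only -- contents are never opened or copied."""
    now = time.time()
    records = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)  # metadata lookup only, no file read
            records.append(FileRecord(
                path=full,
                size_bytes=st.st_size,
                age_days=(now - st.st_mtime) / 86400,
                owner_uid=st.st_uid,
            ))
    return records
```

A real platform would of course cover messaging and cloud repositories via their APIs rather than a local filesystem walk, but the pattern is the same: collect the data about the data, and act on that.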

This not only reduces the time to deliver value; it also reduces complexity while simultaneously decreasing storage and infrastructure demands. File analysis tools are therefore not simply a grudge purchase made to tick an information governance box, but something that can add real value to an organisation by providing visibility into its data and then allowing that data to be retained appropriately or “defensibly” deleted. The business value of such a tool is that it saves organisations money and reduces risk.

File analysis is fast becoming an essential component of effective data management, turning raw “data” into coherent “information”. Gartner lists file analysis as an “advantage” technology that “should be acted on within 12 months”.

With regard to unstructured data management, here are some key questions that a file analysis platform may be able to answer for chief information officers (CIOs), administrators and forensic teams:

* How do you identify your dark data across sources such as file shares, messaging, collaboration platforms and especially cloud-based services and storage options?
* Is this hampering the migration to, and adoption of, cloud-based services, because you don’t know what can be usefully moved to the cloud?
* Are you worried about managing the information risks within these cloud services and storage platforms once you’ve adopted them?
* Do you need to reduce obvious Redundant, Outdated and Trivial information (ROT) and properly classify your information?
* Do you need to perform deeper analysis, incorporating advanced data analytics to add meaning and context to the data so that it can become information, rather than just data?
* How do you defensibly collect data from cloud and on-site unstructured data platforms and repositories?
* Various repositories, such as SharePoint, contain vast amounts of complex data, from blogs to custom list items. How is this complex data preserved and collected in the specific way it needs to be?
* How much data is being retained that can actually be defensibly deleted and disposed of?
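The ROT question above lends itself to a simple worked example. The sketch below is purely illustrative (the three-year staleness threshold and the classification rules are assumptions, not any product's policy): it flags exact duplicates as redundant, long-unmodified files as outdated, and zero-byte files as trivial. Note that duplicate detection necessarily reads file contents to hash them, unlike a pure metadata scan.

```python
import hashlib
import os
import time

STALE_AFTER_DAYS = 365 * 3  # assumption: three years without modification counts as outdated

def find_rot(root: str) -> dict:
    """Flag candidate ROT under `root`: redundant (exact duplicates),
    outdated (stale by mtime) and trivial (zero-byte) files."""
    now = time.time()
    by_hash = {}
    rot = {"redundant": [], "outdated": [], "trivial": []}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.stat(full)
            if st.st_size == 0:
                rot["trivial"].append(full)
                continue
            if (now - st.st_mtime) / 86400 > STALE_AFTER_DAYS:
                rot["outdated"].append(full)
            with open(full, "rb") as fh:  # content hash needed to detect duplicates
                digest = hashlib.sha256(fh.read()).hexdigest()
            by_hash.setdefault(digest, []).append(full)
    for paths in by_hash.values():
        if len(paths) > 1:
            rot["redundant"].extend(paths[1:])  # keep one copy, flag the rest
    return rot
```

In practice the output of such a scan would feed a review-and-approve workflow rather than immediate deletion, which is what makes the eventual disposal “defensible”.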

Answering these questions is critical to leveraging the value of proactive unstructured data management. Achieving this requires the policies, processes, practices and tools needed to align the business value of information with the most appropriate and cost-effective IT infrastructure, from the moment information is conceived through to its final disposition.