When building software or even implementing an application, testing is vital – and good testing requires data. Ziaan Hattingh, managing director of IndigoCube, a company that enables and improves the productivity of the application life cycle in large organisations, says that the best testing data is obviously production data.
“However, the question is how to use that data intelligently and safely,” Hattingh argues. “Companies hold personal data that is sensitive and usually protected by law, and that is highly attractive to criminals.”

According to research by the Ponemon Institute, a data breach cost companies an average of $197 per compromised customer record in 2007. With company records potentially in the thousands, it’s easy to see how costly a breach in data security can be.

Hattingh says that companies thus need a solution that masks the sensitive information in production data, so that software can be tested on “real” data without the risk of exposing it to unauthorised people.
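As a rough illustration of what masking can look like in practice, the sketch below replaces sensitive fields with a keyed hash. The field names, the key and the helper functions are hypothetical and are not drawn from any particular product; a real masking tool would also need to preserve formats, handle free-text fields and keep the key out of the test environment.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would be stored outside the test environment.
MASKING_KEY = b"example-masking-key"

# Hypothetical list of sensitive columns in an illustrative customer record.
SENSITIVE_FIELDS = ("name", "id_number", "email")


def mask_value(value: str, key: bytes = MASKING_KEY) -> str:
    """Replace a value with a truncated keyed hash of itself."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:12]


def mask_record(record: dict) -> dict:
    """Return a copy of the record with the sensitive fields masked."""
    masked = dict(record)
    for field in SENSITIVE_FIELDS:
        if field in masked and masked[field] is not None:
            masked[field] = mask_value(str(masked[field]))
    return masked
```

Because the same input always produces the same masked value, a customer who appears in several tables is still recognisably the same customer after masking, which keeps relationships in the test data intact.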

“There are other issues relating to the management of test data,” Hattingh adds.
“One needs a way to extract the data relevant to the test one is running, and then to track versions. Version control will help to ensure that the right data is used for a particular test and, for example, will make it easy to use the same data for tests that take place at different times in order to compare apples with apples.”
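One simple way to picture extraction and version tracking is to filter the rows a test needs and then record a fingerprint of the result. The sketch below is only an assumption about how this might be done; the function names and the row layout are hypothetical.

```python
import hashlib
import json


def extract_subset(rows, predicate):
    """Keep only the rows relevant to the test being run."""
    return [row for row in rows if predicate(row)]


def fingerprint(rows) -> str:
    """Compute a stable identifier for the contents of a data set.

    Keys are sorted so that field order does not matter; row order does.
    """
    canonical = json.dumps(rows, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Storing the fingerprint alongside each test run makes it possible to confirm later that two runs used identical data, which is the “apples with apples” comparison Hattingh describes.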

An added complication is that some tests require the data to be deliberately corrupted in order to force specific errors. Testers require comprehensive relational editing capabilities to compose this data – and, of course, there must be no risk that it could make its way back into the production environment.
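To make the idea concrete, the sketch below corrupts one relationship in a copy of some order data so that a test can check how the software reacts to an orphaned record. The table layout and function names are hypothetical, and working on a copy is only one small safeguard; it does not by itself keep corrupted data out of production.

```python
import copy


def corrupt_foreign_key(orders, index, bad_customer_id=-1):
    """Return a deep copy of orders in which one row points at a missing customer."""
    corrupted = copy.deepcopy(orders)
    corrupted[index]["customer_id"] = bad_customer_id
    return corrupted


def find_orphans(orders, customers):
    """List the orders whose customer_id matches no customer row."""
    known_ids = {customer["id"] for customer in customers}
    return [order for order in orders if order["customer_id"] not in known_ids]
```

The original rows are left untouched, so the same clean extract can be reused for other tests.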

In fact, the growing importance of data, and the attention it attracts, means that test data falls under the auditor’s scrutiny of the entire data life cycle. Testers need to understand how this data fits into the company’s broader data policies. And, at a practical level, data governance needs to be streamlined to make it easy for testers to access the production data they need.

A solution that handled all these challenges would thus mitigate risk for companies while also streamlining the testing process to produce much better results.

But the benefits of managing test data more effectively extend beyond testing, Hattingh adds. Masked production data can be used to improve training programmes, and can also act as a failsafe when a large database migration goes wrong.

“The fact of the matter is that a company’s data is a hugely valuable resource – but companies will only be able to realise that value when they have a solution in place to manage it effectively, and cope with the exponential growth in the sheer volume of data,” concludes Hattingh. “Putting such a solution in place is vital for testing, but it’s the first step to much wider benefits.”