First it was price, then it was service, now it’s information.
We often read or hear about ‘big data’, but far less often about ‘data science’: the process of taking a huge amount of (often unstructured) information, analysing it, and using the results and insights to create something better than what existed previously, for example by developing a new product or service or by automating an existing process.
Google, LinkedIn and Amazon have led the charge, using the data generated by their services to create new offerings that their customers have come to value (like Google Maps and Amazon’s recommendation service).
Getting the data ‘to talk’ has provided these organisations with new ways to differentiate themselves. And it’s not just global giants who are benefiting. The opportunity is just as big for smaller businesses that generate a lot of data.
Some say that data science is an art; others say that it’s pure science.
“Data science is a collection of techniques and algorithms used to analyse, interpret and visualise large amounts of information,” explains Riaan Swart, Senior Data Scientist at Stone Three Venture Technology in Somerset West. “The goal is always to find ways to provide a better service to customers.”
Stone Three’s Data Science division works with a number of industries in South Africa, Europe and North America, including healthcare, media, social media and finance, where big data is standard. In these sectors, the biggest value often comes from using data science to develop automation tools that not only reduce costs and ‘see’ patterns that are not clear from raw data, but also lower the rate of human error and free up employees to focus on higher-value, more strategic work.
Stone Three recently partnered with Treparel (an acronym for Trends, Patterns and Relations), a Netherlands-based provider of big data text analytics and visualisation, to automate the classification of huge numbers of database entries, for example to check whether a similar patent already exists. Stone Three also helped MySmartFarm automate the process of collecting a wealth of agricultural information (including weather forecasts, satellite images, soil moisture monitoring and irrigation plans) and presenting it in a single online dashboard.
Swart is passionate about the difference that data science can make in the future, and believes that some of the greatest medical and social advancements of this century will come from the work of data scientists.
“We’re already doing some very interesting work with the medical sector, including rolling out a system with the CSIR that consists of a distributed database, web app and client app. The system is designed to store, retrieve and process the wealth of medical metadata gathered by health experts. We can then use this data to run image analysis and classification to identify trends quickly.”
As the amount of data that can be computed and stored continues to grow, so will the opportunities to harness that information. Swart explains that, for example, it will soon become the norm to use anonymised data to predict the spread of pathogens and so help prevent disease, and to analyse DNA patterns to identify predictors of future health problems.
But, as Francois Swanepoel, CTO at Stone Three Venture Technology, explains, data science should always be just that – scientific: “There is no black magic or crystal-ball gazing in data science, despite what some of the more ‘creative’ promoters of the discipline would have you believe. We follow a very detailed and methodical approach, where each step of a project is verifiable, and every recommendation we make is backed up by hard evidence.
“There is no art in good data science, just inquisitiveness, the determination to solve a problem, a solid understanding of scientific principles, and the commercial astuteness to turn the insights gained from patterns and probabilities into tangible business benefits.”