The way big data is defined and is being used across industries is evolving rapidly. What defines big data today, and how is it being used?
If data is a square, big data is a cube. A collection of data becomes ‘big’ when the volume is large, when the data is acquired very quickly, or when the data is complex, as with human behavioral variables. These three characteristics together complicate storage and analysis. Industry-wide, challenges are developing around privacy, regulatory development and compliance, and security, as well as bias in large data sets.
All that information is fairly well understood. But what do we do with this information? How do we make sure the conclusions drawn are accurate and just?
Businesses are using big data analytics in several areas: profitability; sustainability, including environmental stewardship; and agility, which means ensuring that the business is on its toes, ready to respond to a rapidly changing environment.
For a global company like UPS, the details matter, because their impact is magnified across the miles. UPS enlisted the ORION algorithm to help map its truck routes, because shaving even a mile a day off each truck's route, multiplied across every truck on the road, is big money. Across the global transportation industry, real-time data is being collected and analyzed to keep increasingly crowded sea lanes and roads running in an organized and efficient way. The challenges in scheduling maritime ports alone are daunting, and current big data systems can analyze and predict the comings and goings in the world's busy shipping lanes.
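To make the "mile a day per truck" arithmetic concrete, here is a minimal sketch. Every number in it is a hypothetical placeholder chosen for illustration, not a UPS figure, and the function name `annual_fleet_savings` is invented for this example.

```python
def annual_fleet_savings(trucks, miles_saved_per_truck_per_day,
                         cost_per_mile, operating_days):
    """Return yearly savings from trimming a small distance off every route."""
    return trucks * miles_saved_per_truck_per_day * cost_per_mile * operating_days


# Hypothetical inputs only: fleet size, cost per mile, and operating days
# are assumptions, not published figures.
savings = annual_fleet_savings(
    trucks=10_000,
    miles_saved_per_truck_per_day=1.0,
    cost_per_mile=2.0,
    operating_days=250,
)
print(f"Estimated annual savings: ${savings:,.0f}")
```

The point is the multiplication: a saving that looks trivial for one truck grows linearly with fleet size and with the number of days the fleet operates.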
Health care is increasingly using big data analytics to collect and study the subjective, hard-to-measure human behavioral variables that affect the accuracy of health care research. Data on human health and illness is both complex and collected very quickly, but the key to using the data is asking the machine learning platforms the right questions. Health care is taking a leap of faith by using alternative methods such as crowdsourcing, along with newly developed tools, to ask those questions.
In the criminal justice sector, police are using multiple data sets to pinpoint areas that need more police presence in an attempt to prevent violent crimes against people. Using mapping programs similar to those in the transportation sector, criminal justice agencies are combining mapping with 911 calls to focus prevention efforts. Many jurisdictions are also using big data analytics for risk assessment in sentencing programs. These programs have become known for bias in the data sets, and scientists are working to understand how unconscious bias in data sets can skew predictive analytics.
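One simple way to look for this kind of skew is to compare error rates across groups. The sketch below is an illustration only: the records are made-up toy data, the group labels and the function name `false_positive_rate_by_group` are hypothetical, and real audits of risk-assessment tools use far more careful methods.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Compute, per group, the share of people who did not reoffend
    but were still flagged as high risk.

    records: iterable of (group, predicted_high_risk, reoffended) tuples.
    Groups with no non-reoffending members are omitted from the result.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            negatives[group] += 1
            if predicted_high_risk:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items()}


# Toy data, invented for illustration; not drawn from any real program.
toy_records = [
    ("A", True, False), ("A", False, False), ("A", False, False), ("A", True, True),
    ("B", True, False), ("B", True, False), ("B", False, False), ("B", False, True),
]
rates = false_positive_rate_by_group(toy_records)
for group, rate in sorted(rates.items()):
    print(f"group {group}: false positive rate {rate:.2f}")
```

A large gap between groups does not by itself prove the model is unfair, but it is a signal that the data set and the predictions deserve closer review.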
Our challenge moving forward is to develop a set of checks and balances. We need to make sure the data sets are diverse and accurate, and we need to find a way to make sure the conclusions drawn through big data analytics can be verified and confirmed. In addition, issues of privacy, security, and regulatory development and compliance are emerging as new challenges.