Data and Analytics came together in the 16th Century, catalyzing the scientific revolution. But today, Data has moved far ahead of its Analytic counterpart. Here's how they were brought together and what's needed to keep them from pulling apart.
If you were born before the early 70's, you might have watched the original Cosmos series. Carl Sagan ("Billions and Billions") created, hosted and narrated the series that first aired on PBS in the Fall of 1980. In episode 3, Harmony of the Worlds, Sagan tells the story of Tycho Brahe and Johannes Kepler, whose encounter he describes as the birth of modern science, where observation meets theory. Their story begins in the latter half of the 16th century, when science was Astrology and fear, famine, pestilence and war ruled the day. For 1500 years Ptolemy's geocentric model of the universe, with Earth at its center, was accepted as fact by both science and religion.
Yesterday's Data Revolution
Kepler believed in the new heliocentric (sun-centered) model of the universe first published to much scorn by Nicolaus Copernicus. While teaching mathematics, he had an epiphany of a universe where the planets (only six known at the time) revolved around the sun in spherical orbits that followed the geometry of the five Platonic solids set inside each other like Russian nesting dolls. Kepler was not able to prove his model with the limited available observational data. He needed Tycho Brahe's observational data.
Brahe, a flamboyantly wealthy Danish nobleman known for both his scientific mind and his passion for partying (the 16th-century version), had decades of observational data that was five times more accurate than anyone else's at the time. The accuracy was within an arcminute (1/60 of a degree), all gathered before the telescope. However, even with the large quantity and high quality of his data, he was not able to reach the great insight. His gut believed in a hybrid model where the Sun and Moon revolve around the Earth, and the other planets revolve around the Sun. He invited the young Kepler, who was becoming well known for his mathematical abilities, to work with him and share in the data he'd amassed. However, when Kepler arrived to work, Brahe would only share some of his data with him. To Brahe, Kepler was still a rival. For eighteen months they argued, separated and reconciled before Brahe's death, thought to be related to his overindulgence in food and wine.
Following Brahe's death, Kepler convinced Brahe's family to allow him to access the data. Brahe had pointed out to Kepler that the orbit of Mars was the most peculiar and might be the one to provide some answers. After three years of calculations, Kepler found that his circular orbit of Mars matched 10 of Brahe's observations within 2 minutes of arc. The joy did not last, as two additional measurements were off by eight minutes. He firmly believed in his hypothesis, but he also believed in the accuracy of Brahe's data. He later wrote, "If I had believed that we could ignore these eight minutes, I would have patched up my hypothesis accordingly. But, since it was not permissible to ignore, those eight minutes pointed the road to a complete reformation in astronomy."
Fortunately for us, Kepler believed in data-driven decisions over his intuition. He moved away from his belief in perfect circles created by a perfect God to look for another answer. He tried less perfect oval-like curves, but due to some mathematical mistakes, he did not find the pattern. Then finally, a few months later and with some desperation, he tried an ellipse for the orbit of Mars. He discovered that an ellipse with the Sun at one of the two foci matched Brahe's data. He was able to do the same for the other planets. Based on this hypothesis and the matching data, he developed the three laws of planetary motion.
- The path of the planets about the sun is elliptical in shape, with the center of the sun being located at one focus. (The Law of Ellipses)
- An imaginary line drawn from the center of the sun to the center of the planet will sweep out equal areas in equal intervals of time. (The Law of Equal Areas)
- The ratio of the squares of the periods of any two planets is equal to the ratio of the cubes of their average distances from the sun. (The Law of Harmonies)
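The third law is easy to check numerically. The sketch below is only an illustration: it uses rounded modern values for each planet's average distance (in astronomical units) and period (in Earth years), not Brahe's own measurements, and the names PLANETS and harmony_ratio are made up for this example. In these units, the square of the period divided by the cube of the distance should come out close to 1 for every planet.

```python
# Average distance from the sun (AU) and orbital period (Earth years) for the
# six planets known in Kepler's time. These are rounded modern values used as
# an assumption for illustration; they are not Brahe's observations.
PLANETS = {
    "Mercury": (0.387, 0.241),
    "Venus": (0.723, 0.615),
    "Earth": (1.000, 1.000),
    "Mars": (1.524, 1.881),
    "Jupiter": (5.203, 11.862),
    "Saturn": (9.537, 29.457),
}


def harmony_ratio(distance_au, period_years):
    """Return T^2 / a^3, which the Law of Harmonies says is the same for every planet."""
    return period_years ** 2 / distance_au ** 3


ratios = {name: harmony_ratio(a, t) for name, (a, t) in PLANETS.items()}

for name, ratio in ratios.items():
    print(f"{name:8s} T^2/a^3 = {ratio:.3f}")
```

If the ratio is the same for every planet, then the ratio of squared periods for any two planets equals the ratio of their cubed distances, which is the law as stated above.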
A generation after Kepler's death, Isaac Newton would use Kepler's insight into planetary motion to work out why it happened, formulating his theory of universal gravitation. From data came insight that, centuries later, would result in satellite launches, moon landings, and Mars landings. The more important legacy may have been, as Sagan stated, the birth of modern science, where observation (data) and theory (insight) helped set off the scientific revolution.
Today's Data Revolution
The digital revolution is a direct result of the scientific revolution that began in the 16th Century. Once again data transformed into insight is at the heart of it. The Internet of Things (IoT) exemplifies that as much as any other part of the digital revolution. The economic opportunity from IoT is expected to be significant as highlighted in the graphic below from ParStream (Cisco).
Value creation from IoT is the main reason for the exponential growth of this market. In June 2015, the McKinsey Global Institute (MGI) published a report, The Internet of Things: Mapping the Value Beyond the Hype. MGI breaks IoT down by the settings where it will be used rather than by industry, as shown below.
McKinsey estimates the economic value of IoT by 2025 at a minimum of $3.9 trillion and a high of $11.1 trillion, where the upper estimate represents 10% of global GDP. The following graphic highlights the major applications for creating this immense value.
Has Anyone Seen Kepler?
The value comes from analyzing the data to gain insight for decision-making, improving operations and creating new revenue streams. However, one of MGI's key findings is that less than one percent of the data gathered is currently used. The particular example shown is from an IoT implementation for an oil rig, but the gap between data collected and data used seems to be the general case for most current IoT applications.
Similarly, in manufacturing the data is primarily used for controlling machines and sending alerts when tolerances are out of permissible range. Manufacturers are not yet realizing the full value that comes when data leads to optimization and prediction. The inability to fully use the data starts at the top of the funnel, where there are technical challenges in transmitting and storing the data. Aggregating the data into a usable format for analysis is equally difficult. Even as we solve technological problems at the top, the skills, people and process challenges at the lower part of the funnel, beginning with Analytics, may be the most challenging.
The MGI report provides a good example of the current state versus the future desired state. Data from sensors in a water pumping station trigger alarms when the pump overheats. The alerts are an improvement from waiting for the system to fail to fix the pump. The desired future state would be to understand the conditional inputs that would cause overheating and gather sensor data from all of those points to predict failure and correct long before any failure occurs. This future state takes many variables, data, and the algorithms to detect the subtle changes that may create the failure condition later in time.
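The difference between the two states can be sketched in a few lines. This is a deliberately simplified illustration, not anything from the MGI report: the 90 degree limit, the hourly readings, and the function names are all hypothetical, and a straight-line trend on a single sensor stands in for the many variables and algorithms a real predictive system would need.

```python
from statistics import mean

OVERHEAT_LIMIT_C = 90.0  # hypothetical alarm threshold, chosen for illustration


def threshold_alarm(reading_c, limit_c=OVERHEAT_LIMIT_C):
    """Current state: alert only once the pump has already reached the limit."""
    return reading_c >= limit_c


def hours_until_limit(readings_c, limit_c=OVERHEAT_LIMIT_C):
    """Future state (simplified): fit a least-squares line to hourly readings
    and extrapolate how many hours remain before the limit is reached.

    Returns None when there are too few readings or the temperature is not rising.
    """
    n = len(readings_c)
    if n < 2:
        return None
    hours = range(n)
    h_mean = mean(hours)
    t_mean = mean(readings_c)
    slope = sum((h - h_mean) * (t - t_mean) for h, t in zip(hours, readings_c)) / sum(
        (h - h_mean) ** 2 for h in hours
    )
    if slope <= 0:
        return None
    intercept = t_mean - slope * h_mean
    crossing_hour = (limit_c - intercept) / slope
    # Hours remaining, measured from the most recent reading.
    return max(0.0, crossing_hour - (n - 1))


# Hypothetical hourly temperatures: still below the limit, but climbing steadily.
readings = [70.0, 71.0, 72.0, 73.0, 74.0]
print("Alarm now?", threshold_alarm(readings[-1]))
print("Estimated hours until limit:", hours_until_limit(readings))
```

With these made-up readings the alarm stays silent, while the trend estimate already gives a lead time for scheduling a repair. That lead time is the value that sits unused when data only feeds alerts.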
The expected ROI from IoT and other Big Data applications will not be exclusively realized from the technology of sensors, data capture, and data management. Without the analytics, the value of data is minimal at best. Currently, the upfront technology is more mature than the downstream analytic processes and technology. The media focuses on the "talent shortage" being the main issue, but it's just one part of the problem. Analytics success requires solutions to problems from organizational, process and talent perspectives. Most organizations have gaps or even significant divides in all three areas, leading to a lot of data without much analytics - Brahe without Kepler.
Analytics and Organization
Analytics is the key to data-driven decision making, whether the result is to make human decisions or for machines and software to make decisions. Data-driven decision making needs to be integrated into the organization from the top. Leaders need to believe that data can be used to drive many of the decisions in all the key areas of the organization. This initial step can precede the implementation of advanced technology and big data, and it helps make sure that later investment in them pays off. For the Oakland A's of Moneyball fame, the belief and commitment came well before the advanced technology. They started with the standard statistics that had been around for a hundred years to move decision making away from purely gut decisions to data-driven decisions. Over the last 17 years, the technology and data have improved dramatically, making data-driven decision making a foundation of most teams in Major League Baseball. That kind of leadership and commitment needs to happen for any organization to get the full potential ROI from IoT and data.
Analytics and Process
Analytics can be defined as a process that begins with asking a question and then driving towards finding an answer and implementing the solution. The book, Behind Every Good Decision: How Anyone Can Use Business Analytics to Turn Data Into Profitable Insight, provides a framework for this process that the authors have named B.A.D.I.R. The graphic below highlights the significant steps. For a more detailed look read 5 Steps to Go from Data to Decisions.
From Behind Every Good Decision: How Anyone Can Use Business Analytics to Turn Data Into Profitable Insight
Notice that data doesn't enter the picture until step 3. Defining the business question and the analysis plan are critical to getting the ROI at the end.
Analytics and Talent
Over the last seven years, many thought leaders from leading organizations have predicted a shortage of analytics talent where demand outstrips supply. The numbers have ranged from hundreds of thousands to millions of vacant positions by 2020. Not everyone agrees with this, but the consensus is that the need will be growing fast. The market has been working to fill the need. Many universities have started analytics programs, and online education sites like Coursera and Udacity have added many analytics courses. Bootcamps have begun to churn out data analysts and scientists to fill the void.
However, to fulfill the promise of IoT and Big Data, a core group of experts within any organization that understands data is not enough. To understand data one must roll up their sleeves and use it. The more everyone within the organization has some level of data literacy, the better the ROI. As shown in the Analytics process above, data science skills are only part of the need. Moving from data to insight also requires decision science skills (soft skills like project management, communication, empathy, etc.) and domain expertise. Analytics technology is not at the stage where it just needs data and it outputs insight.
Currently, it requires both data and human knowledge to define the problem, understand the data needs and form hypotheses to be tested. AI through Machine Learning and Deep Learning will get us closer to inputting data and outputting knowledge, but that is still many years or even decades away. Thomas Davenport, a professor at Babson College, is a leading proponent of the link between management and analytics. He believes organizations will need to develop analytics skills internally, where domain expertise and data expertise can come together, rather than just recruiting for a core group of experts.
Data and Insight
When Tycho Brahe was near death his final words were "May I not seem to have lived in vain, may I not seem to have lived in vain." He spent decades gathering precise data on the movement of the planets, but he was about to die without the world understanding what the data meant. Thanks to Johannes Kepler, he did not live in vain. The data for the movement of planets was always there waiting to be captured. Even Kepler and Newton may not have known the insights they discovered from Brahe's data would one day lead to human exploration of all of those planets. Similarly today, the data for curing cancer, self-driving cars that eliminate vehicle death and injuries, reducing crime, improving health and perhaps even increasing happiness is out there waiting to be gathered. But without the Keplers to find the insights that lead to actions, and in turn to all of these improvements, the data is meaningless. We are just beginning the Analytics phase that will make use of Big Data and IoT data.
To keep moving Analytics maturity higher to match the technology maturity in gathering, storing and aggregating data, organizations need to focus on three areas.
- Develop leadership with a firm commitment to using data to make company-wide decisions at all levels.
- Define and develop an analytics process that takes a business problem to a solution using data.
- Build Analytics capability in-house through training and recruiting, ensuring that everyone has some analytics capability rather than just relying on a core group with analyst and scientist titles.