This question comes up often. It is typically asked when you are handed multiple data sources, some running to millions of rows.
Sometimes you are lucky: you may be asked to solve a very specific and well-studied problem (e.g. predict which customers are likely to cancel their subscription within the next month). But often you are simply asked to “mine the data and tell me something interesting”.
Where to start?
This is a difficult question and it doesn’t have a single, perfect answer. I am sure experienced practitioners have evolved many ways to do this. Here’s one way that I have found to be useful.
It is based on two notions:
- Every business can be thought of as a complicated system with many moving parts. Nobody really understands it 100%. Even for experienced employees, there’s a gap between their understanding of the business and how it actually works. And since the business keeps changing, this gap can never really be closed.
- Any data you have about the business describes some aspect of the behavior of this complex system.
Given this, you can think of an “insight” as anything that increases your understanding of how the system actually works. It bridges the gap between how you think the system works and how it really works. Or, to borrow an analogy, complex systems are black-boxes and an insight is like a window cut into the side of the black box that “sheds light” on what’s going on inside.
So the search for insight can be thought of as the effort to understand how something complicated really works by analyzing its data.
How to search for insight?
There are four steps you can follow to uncover insights from the data.
Step 1. Hypothesis
Using your current understanding of how the system works, make certain predictions.
Before you explore the data, write down a short list of what you expect to see in the data: the distribution of key variables, the relationships between important pairs of them, and so on. Such a list is essentially a prediction based on your current understanding of the business.
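One way to make such a list concrete is to write each expectation as a small check that the data will later pass or fail. The sketch below is only an illustration: the `Expectation` class, the `evaluate` helper, and the two example predictions about a "monthly deposits" column are all hypothetical names and assumptions of mine, not part of any particular tool.

```python
import statistics
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence


@dataclass(frozen=True)
class Expectation:
    """One written-down prediction about a numeric column."""
    name: str
    description: str
    check: Callable[[Sequence[float]], bool]


# Hypothetical predictions about a hypothetical "monthly deposits" column.
EXPECTATIONS: List[Expectation] = [
    Expectation(
        "non_negative",
        "Deposit amounts are never negative.",
        lambda xs: min(xs) >= 0,
    ),
    Expectation(
        "right_skewed",
        "A few large depositors pull the mean above the median.",
        lambda xs: statistics.mean(xs) > statistics.median(xs),
    ),
]


def evaluate(expectations: Sequence[Expectation],
             values: Sequence[float]) -> Dict[str, bool]:
    """Return, for each expectation, whether the data agrees with it."""
    return {e.name: e.check(values) for e in expectations}


if __name__ == "__main__":
    sample = [120.0, 80.0, 95.0, 2500.0]  # made-up values for illustration
    for name, ok in evaluate(EXPECTATIONS, sample).items():
        print(f"{name}: {'as expected' if ok else 'surprise, dig in'}")
```

The point of writing the predictions down before looking is that a failed check cannot be explained away after the fact.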
Step 2. Experiment
Check the data (sometimes this means setting up elaborate experiments to generate it) to see if it matches your predictions.
Analyze the data. Make plots, do summaries, whatever is needed to see if it matches your expectations.
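A first pass does not need to be fancy. As a rough sketch, assuming the variable of interest is already loaded as a plain list of numbers, a handful of summary statistics and a crude text histogram are often enough to compare against what you expected. The `summarize` and `text_histogram` helpers are names I made up for this illustration.

```python
import statistics
from typing import Dict, List, Sequence


def summarize(values: Sequence[float]) -> Dict[str, float]:
    """Basic summary statistics for one numeric variable."""
    return {
        "count": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }


def text_histogram(values: Sequence[float], bins: int = 5) -> List[str]:
    """Render equal-width bins as rows of '#' characters, one '#' per value."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # avoid zero width when all values are equal
    counts = [0] * bins
    for v in values:
        index = min(int((v - lo) / width), bins - 1)  # the maximum goes in the last bin
        counts[index] += 1
    return [f"{lo + i * width:10.1f} | {'#' * c}" for i, c in enumerate(counts)]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 10]  # made-up values for illustration
    print(summarize(data))
    print("\n".join(text_histogram(data, bins=3)))
```

In practice you would likely reach for a plotting library, but the habit is the same: look at the shape of the data and compare it with what you wrote down in Step 1.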
Step 3. Analysis
If the data does not match your expectations, dig into what’s going on and update your understanding. This may require modifying your original hypothesis.
Is there anything that doesn’t match? Anything that makes you go “That’s odd” or “That doesn’t make any sense”?
Zoom in and try to understand what in your business is making that weird thing show up in the data. This is the critical step.
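Zooming in usually means breaking the odd number down by some dimension (product, region, customer segment) to see which slice is driving it. Here is a minimal sketch of that idea, assuming the data can be reduced to (segment, period, amount) records. The `change_by_segment` helper, the segment names, and the amounts are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, str, float]  # (segment, period, amount)


def change_by_segment(records: Iterable[Record],
                      before: str, after: str) -> List[Tuple[str, float]]:
    """Return (segment, after - before) pairs, largest absolute change first."""
    totals: Dict[str, Dict[str, float]] = defaultdict(lambda: defaultdict(float))
    for segment, period, amount in records:
        totals[segment][period] += amount
    # A period with no records for a segment counts as 0.0.
    changes = [(seg, by_period[after] - by_period[before])
               for seg, by_period in totals.items()]
    return sorted(changes, key=lambda pair: abs(pair[1]), reverse=True)


if __name__ == "__main__":
    # Made-up records: totals per segment for two periods.
    records = [
        ("savings", "Oct", 100.0), ("savings", "Nov", 60.0),
        ("checking", "Oct", 80.0), ("checking", "Nov", 78.0),
        ("promo", "Oct", 0.0), ("promo", "Nov", 35.0),
    ]
    for segment, change in change_by_segment(records, "Oct", "Nov"):
        print(f"{segment:10s} {change:+.1f}")
```

A breakdown like this does not explain anything by itself. It only tells you where to point your questions when you talk to people who know the business.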
Step 4. Repeat
Make new predictions based on your updated understanding and repeat the cycle.
Each time you explain something that didn’t match, you may have just found an insight into the business and increased your understanding.
Real world example – banking sector
Here’s a real example. I was looking at customer-level data (e.g. balances, rates, geography, etc.) from a global U.S. bank. One of the fields in the dataset was ‘deposit amount per branch’.
What did I expect to see? Well, I expected the largest amounts from February to October and smaller amounts from November to January, because that is the holiday season and customers are less inclined to do any banking.
When I checked the data, it showed the trend I was expecting. However, when I compared it with benchmarks (i.e. competitors), something looked “odd”: the competitors didn’t show the same distinct seasonal dip.
So I investigated this anomaly.
It turned out that a new promotion launched in late October was cannibalizing existing products. People were taking money out of their existing accounts and moving it into this short-term promo product to take advantage of the offer.
This modest “discovery” set off a chain reaction of interesting questions about what sort of products these customers are interested in, what factors impact their purchasing decision, what promotional campaigns may be best suited to them, and even how this data can be used to inform regional expansion plans.
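The benchmark comparison in this example can be sketched in a few lines. This is not the analysis I actually ran, just an illustration of the idea: scale each month by the yearly average so that banks of different sizes are comparable, then flag the months where your seasonal pattern departs from the benchmark's. The function names, the 0.15 tolerance, and all the numbers are assumptions made up for this sketch.

```python
from typing import Dict, List

MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def seasonal_index(monthly: Dict[str, float]) -> Dict[str, float]:
    """Scale each month by the yearly average, so 1.0 means an average month."""
    average = sum(monthly.values()) / len(monthly)
    return {month: value / average for month, value in monthly.items()}


def odd_months(own: Dict[str, float], benchmark: Dict[str, float],
               tolerance: float = 0.15) -> List[str]:
    """Months where our seasonal index differs from the benchmark's by more than tolerance."""
    own_idx = seasonal_index(own)
    bench_idx = seasonal_index(benchmark)
    return [m for m in own
            if m in bench_idx and abs(own_idx[m] - bench_idx[m]) > tolerance]


if __name__ == "__main__":
    holiday = {"Nov", "Dec", "Jan"}
    # Made-up deposit amounts: we dip sharply over the holidays, the benchmark barely does.
    own = {m: (60.0 if m in holiday else 100.0) for m in MONTHS}
    benchmark = {m: (95.0 if m in holiday else 100.0) for m in MONTHS}
    print(odd_months(own, benchmark))
```

Note that the tolerance is a judgment call. Set it too tight and every month looks odd, because a deep dip in a few months also pushes the index of the remaining months above the benchmark's.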
Conclusion
Note that working back from the data to the “root cause” in the business takes time, effort, and patience. However, the more you understand the nuances of the business, the more pointed your predictions will be, ultimately allowing you to uncover better insights. So, do everything you can to get into the details of the business. Seek out colleagues who understand the business, learn from them, and if possible make them your co-conspirators.
Jason Oh is a management consultant at Novantas with expertise in scaling profitability and improving business efficiency for financial institutions.