Three steps to make an effective data-driven analysis (white paper)

In 2006 I won a scholarship to study Information Systems. At that time I didn’t know how to turn on a computer. I was the only woman in my class, and my colleagues were already programming. But I had decided to overcome my limitations. I studied and developed my skills because I simply fell in love with information. I was amazed at the power of data when it becomes knowledge for making decisions. Since then I’ve been working with data and information to add value to organizations. Over these years, I have kept updating my strategy every time I need to analyze data. Below I describe a little of how I work.

1. Self perspective (Exploratory Analysis)

As with any autonomous agent, the first step to making decisions and succeeding in a new environment is to understand it. For a Data Analyst, that means understanding the data in its business context. I usually follow these simple steps:

a) A quick and dirty analysis that can answer these questions: What are the features, and what does each one mean? What do the null and missing values in the features mean?

b) Identify Threats and Variability across the data.

c) Identify Interconnection between features.

d) Define the domain (business) data context.

e) Guarantee the quality of the data.
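To make the first two steps more concrete, here is a minimal sketch of a quick and dirty profile that counts missing values and measures variability per feature. It uses only the Python standard library. The sample records, the feature names, and the choice to treat `None` and the empty string as missing are all assumptions made for illustration; real data will need its own definition of "missing".

```python
from statistics import mean, stdev

# Hypothetical sample records; None and "" stand in for null and missing values.
rows = [
    {"age": 34, "income": 5200.0, "region": "north"},
    {"age": None, "income": 4100.0, "region": "south"},
    {"age": 29, "income": None, "region": ""},
    {"age": 41, "income": 6900.0, "region": "north"},
]

# Assumption: these two markers are what "missing" means in this dataset.
MISSING = (None, "")


def profile(records):
    """Return, per feature, the missing count and basic variability figures."""
    features = sorted({key for record in records for key in record})
    report = {}
    for feature in features:
        values = [record.get(feature) for record in records]
        present = [v for v in values if v not in MISSING]
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        summary = {
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        # Mean and spread only make sense for fully numeric features
        # with at least two observed values.
        if len(numeric) >= 2 and len(numeric) == len(present):
            summary["mean"] = mean(numeric)
            summary["stdev"] = stdev(numeric)
        report[feature] = summary
    return report


report = profile(rows)
for feature, summary in report.items():
    print(feature, summary)
```

A report like this does not say what a missing value means. It only shows where to start asking the business.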

When we know more about the data, we can figure out which questions we can answer. I usually do that in the next phase, prescriptive analysis.

2. Prescriptive Analysis (Opportunities)

Every data domain rests on 3 pillars: (i) technology, (ii) people or culture, (iii) process/roles. Those are socio-technical resources that organizations can use to create a strategy. So, after a quick exploratory analysis, it is time to look at the details. We can do that in a few steps:

  1. Identify correlations. Sometimes it is necessary to do some research to understand a bit more about a domain. That means reading documentation or talking with domain experts, executives, or anyone else who can help us understand how to create value with the data we have available.
  2. Identify opportunities to automate data processes (if it is required). We can look for data pipelines that are worth automating. If it is required, we can also run a feasibility study for a machine learning model. Let’s be honest, some business or domain issues don’t need automation or a Machine Learning solution. Sometimes a simple report, a dashboard, or even a new technique is enough to deal with archaic data processes. And it can be less expensive and more effective.
  3. Always be skeptical when choosing data questions: when we work with data science we need to be very critical. Sometimes we think we can answer a business question, but we don’t have enough data to build a data story that answers it. Sometimes we think we have found an amazing insight, only to figure out it is not a big deal. Sometimes the data surprises us and brings knowledge completely different from what was expected. Most of the time there’s nothing wrong with that. It is just that reality can be very unique. This is why we need to be very careful when establishing data questions, because maybe, just maybe, we are establishing the wrong question. Thus, being skeptical and detail-oriented can save the day.
  4. Save Know How. I usually provide several kinds of documentation about my analyses, such as data dictionaries, data storytelling, videos, articles, mental maps, taxonomies, ontologies, README markdown files and so on. It helps me track the process, makes the data analysis reproducible, and helps other researchers continue the research.
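As an illustration of identifying correlations while staying skeptical, here is a minimal sketch of a Pearson correlation that refuses to answer when there is too little data. It uses only the Python standard library. The function name `pearson`, the `min_pairs` threshold of 10, and the sample `visits` and `sales` values are all assumptions made for this example, not a recommended standard.

```python
from math import sqrt


def pearson(xs, ys, min_pairs=10):
    """Pearson correlation over the pairs where both values are present.

    Returns None when there are fewer than `min_pairs` complete pairs
    or when either feature is constant, since no meaningful coefficient
    exists in those cases. The default threshold is an arbitrary choice.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < max(min_pairs, 2):
        return None
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


# Hypothetical features: store visits and a perfectly linear sales figure.
visits = list(range(1, 13))
sales = [2 * v + 1 for v in visits]

print(pearson(visits, sales))        # close to 1.0 for this linear example
print(pearson([1, 2, 3], [2, 4, 6])) # None: too few pairs to trust
```

Even a strong coefficient only says that two features move together. Whether that relationship matters for the business is still a question for the domain experts.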

3. Optimization cycle

How do we know the analysis was successful? I believe we know it when the prescriptive analysis, built from our insights, is applied to business or domain decisions. After that, we can keep “hunting” for more data in order to build a more accurate model that represents reality even better. An easy way to do this is to use suggestions from the client and the internal team to identify the Key Performance Indicators (KPIs).

Conclusion

In summary:

There are 3 big phases in a data-driven analysis:

  1. Exploratory analysis (Picture): some actions in this phase are (i) identify the features and what their null and missing values mean, (ii) identify threats and variability in the data, (iii) identify interconnections between features, (iv) define the domain (business) data context, and (v) guarantee the data quality.
  2. Prescriptive Analysis (Opportunities): some actions in this phase are (i) identify correlations, (ii) identify opportunities to automate data processes (if it is required), (iii) always be skeptical when choosing data questions, and (iv) save know-how.
  3. Optimization cycle: optimize our chosen solution and our KPIs.
