Dive Into Data

Clifford E. D'Souza
Jun 23, 2021

Curious Joe had been hearing a lot about data lately. Instead of figuring it all out himself, he decided to talk to Genius Jane. “I want to know all about data,” Curious Joe said to Genius Jane, with the look of someone in desperate need of knowledge. The ever-knowing and supportive Genius Jane nodded and replied, “Sure, Joe, let me lay it out for you! Let’s dive right in…”

Picture of a delivery truck on a street in downtown somewhere

(Photo by Yulia Matvienko on Unsplash)

Genius Jane: Imagine for a moment that we’re talking about a door-to-door courier company. This company owns a large fleet of delivery trucks that operate within cities and across states. In a spreadsheet, the company maintains the list of trucks along with details such as the date of registration, city of registration, vehicle manufacturer and model number, maximum load-bearing capacity, number of wheels, and so on. They also keep records of the drivers working for the company: their names, identification numbers, license permit numbers, residential addresses, and phone numbers. Daily records are kept of the trips made, the number of packages loaded into and unloaded from the trucks, which driver is driving which truck, and the opening and closing mileage for the day. All this is data.

Curious Joe: Wow, that’s more data than I would have imagined. I guess they’d move all that into a database, and it looks like they need it to keep operations running smoothly. I’ve heard people say they use data for actionable insights and intelligence. What does that mean?

Genius Jane: For sure. Data is the raw material. This is where it all starts. Good quality data is an asset. It needs to be up to date, free of missing values, and consistent. Sound data management practices help here. Once you have good quality data, you can do analytics on it. Analytics is the process of discovering patterns and trends in the data. Analytics could tell you a million different stories, but insight comes from understanding the true story of what is going on with the business and its customers. Intelligence is the process that creates those insights, and data-driven decision-making means taking action based on them.
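To make Jane’s point about missing and inconsistent data a little more concrete, here is a minimal Python sketch of the kind of checks a data management routine might run on the truck list. The records, the field names, and the `quality_issues` helper are all made up for illustration; nothing here is the courier company’s actual process.

```python
from datetime import date

# Hypothetical truck records, shaped like rows in the company's spreadsheet.
trucks = [
    {"truck_id": "T1", "registered": date(2019, 5, 1), "max_load_kg": 3500},
    {"truck_id": "T2", "registered": None, "max_load_kg": 3500},
    {"truck_id": "T3", "registered": date(2031, 1, 1), "max_load_kg": -10},
]


def quality_issues(record, today):
    """Return a list of human-readable problems found in one record."""
    issues = []
    # Completeness: no field should be empty.
    missing = sorted(key for key, value in record.items() if value is None)
    if missing:
        issues.append("missing: " + ", ".join(missing))
    # Consistency: a registration date cannot lie in the future.
    registered = record.get("registered")
    if registered is not None and registered > today:
        issues.append("registration date is in the future")
    # Consistency: load capacity must be a positive number.
    load = record.get("max_load_kg")
    if load is not None and load <= 0:
        issues.append("max load must be positive")
    return issues


today = date(2021, 6, 23)  # fixed date so the example is reproducible
for truck in trucks:
    print(truck["truck_id"], quality_issues(truck, today))
```

A clean record comes back with an empty list, so anything else is a candidate for cleanup before analytics begins.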

Genius Jane explaining stuff to Curious Joe

Curious Joe: This is all good. But what are actionable insights?

Genius Jane: Once data has been processed, aggregated, and organized into a more human-friendly format that provides more context, it becomes information. Information is often delivered in the form of data visualizations, reports, and dashboards. Insights are generated by analyzing information and drawing conclusions. Not all insights are actionable.

Curious Joe: How so, Jane?

Genius Jane: Suppose the courier company in our example had a key performance indicator (KPI) of improving the number of successful first-attempt deliveries by 20% compared to the baseline at the end of the previous year. If the business received an analysis showing that the highest delivery success rates occurred on weekends and Monday mornings, that would be an insight: weekends and the start of the week are the best times for deliveries. This insight is aligned with the stated KPI, and acting on it by scheduling as many deliveries as possible in those windows could improve the KPI. Ultimately, that means lower fuel costs for re-attempts after failed deliveries. That’s business value through potentially reduced operational costs. Predictive Analytics takes this even further. Predictive analytics is the use of data, statistical algorithms, and machine learning techniques to identify the likelihood of future outcomes based on historical data. The goal is to go beyond knowing what has happened and provide the best assessment of what will happen in the future. Using these approaches, given details like delivery pin code, time of day, weekday, the value of the package, and so on, the company could predict the likelihood of a successful first-attempt delivery down to the package level. And then there’s Prescriptive Analytics :)
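The weekday insight Jane describes comes from a simple aggregation. As a rough sketch, assuming a made-up delivery log of (day, delivered on first attempt) pairs, the success rate per day could be computed like this in Python. The data and the helper names `success_rate_by_day` and `best_days` are hypothetical.

```python
from collections import defaultdict

# Hypothetical delivery log: (day of first attempt, delivered on first attempt?)
attempts = (
    [("Sat", True)] * 3 + [("Sat", False)]
    + [("Mon", True)] * 4 + [("Mon", False)]
    + [("Wed", True)] + [("Wed", False)] * 3
)


def success_rate_by_day(log):
    """Map each day to the share of deliveries that succeeded on the first attempt."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for day, delivered in log:
        totals[day] += 1
        if delivered:
            successes[day] += 1
    return {day: successes[day] / totals[day] for day in totals}


def best_days(rates, n):
    """Return the n days with the highest first-attempt success rate."""
    return sorted(rates, key=rates.get, reverse=True)[:n]


rates = success_rate_by_day(attempts)
print(rates)
print(best_days(rates, 2))
```

With this toy log, Monday and Saturday come out on top, which is the kind of pattern that becomes actionable once it is tied to the KPI.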

Curious Joe: Ah, okay, Jane. You’ve got me curious about this one now. What exactly is Prescriptive Analytics?

Genius Jane: They don’t call you Curious Joe for no reason. It’s a good question. What if, besides getting predictions, we got recommendations as well? Prescriptive analytics is a process that analyzes data and provides instant recommendations on how to optimize business practices to suit multiple predicted outcomes. In our courier company example, what if, for every package, the system could generate an outbound delivery schedule based on the date and time when the package is most likely to be delivered on the first attempt, and also give the best route and order of delivery for the day? That could save time, cost, and effort.
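A toy version of that recommendation step might look like the Python sketch below. It assumes a predictive model has already produced, for each package, a probability of first-attempt success per delivery slot. The package IDs, slot names, probabilities, and the `recommend_slots` helper are all invented for illustration, and the sketch leaves out route and delivery-order optimization entirely.

```python
# Hypothetical output of a predictive model: for each package, the estimated
# probability of a successful first-attempt delivery in each candidate slot.
predicted = {
    "PKG-001": {"Sat AM": 0.91, "Mon AM": 0.84, "Wed PM": 0.40},
    "PKG-002": {"Sat AM": 0.55, "Mon AM": 0.78, "Wed PM": 0.62},
}


def recommend_slots(predictions):
    """Pick, for every package, the slot with the highest predicted success."""
    return {
        package: max(slots, key=slots.get)
        for package, slots in predictions.items()
    }


schedule = recommend_slots(predicted)
print(schedule)
```

A real system would also have to respect truck capacity and driver availability, so picking the single best slot per package is only the simplest possible starting point.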

Curious Joe: It’s as if we’re putting the collected data into overdrive and deriving as much benefit as possible to optimize operations and delight customers, while improving efficiency and reducing cost!

Genius Jane: You’ve got it.

Curious Joe: This is quite a lot. Amazing stuff. I have one final question. I hear quite often about big data and small data. What are these?

Genius Jane: Let me start with Big Data, and then I’ll come to Small Data. Let’s take the courier company example once again. Suppose all the delivery trucks were fitted with road-facing and internal cameras. Plenty of video would be generated. Add GPS units that send each truck’s location every minute to a central server as streaming data. Customers also provide textual feedback about package delivery online, and delivery personnel on the trucks call consignees ahead of delivery. All this is data again. Much of it is unstructured, and it requires very large amounts of storage. Data like this is characterized by volume, variety, and velocity. That’s big data. Technology exists today to mine it, and from these large data sets we could extract valuable information for starters. From the camera footage, we could derive driving behavior and how often packages are being unloaded. From the GPS data, we could extract the route traveled by each truck. We could use natural language processing (NLP) techniques to derive the sentiment of the textual feedback provided by customers. From the call records provided by the telecom company, we could learn whether a customer answered the call at the time of delivery. That’s the first level of processing. Pipelines further downstream could help uncover more actionable insights.

Small data, by contrast, is data that is ‘small’ enough for human comprehension: it comes in a volume and format that make it accessible, informative, and actionable.
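Going back to the GPS pings Jane mentioned, here is one way the first processing step could be sketched in Python: order a truck’s pings by time and add up the distance between consecutive points using the haversine formula. The ping format `(minute, latitude, longitude)`, the sample coordinates, and the function names are assumptions made for this example, not a description of any real tracking system.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def route_length_km(pings):
    """Sum the distances between consecutive pings, ordered by timestamp."""
    ordered = sorted(pings)  # tuples sort by their first element, the timestamp
    return sum(
        haversine_km(a[1], a[2], b[1], b[2])
        for a, b in zip(ordered, ordered[1:])
    )


# Hypothetical pings as (minute, latitude, longitude), arriving out of order
# the way streaming data sometimes does.
pings = [(2, 0.0, 1.0), (1, 0.0, 0.0), (3, 0.0, 2.0)]
print(round(route_length_km(pings), 1))
```

Sorting first matters because streamed pings can arrive out of order; without it, the truck would appear to double back and the route length would be overstated.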

Curious Joe: Thanks, Jane, for walking me through all this. I see the world differently now, and I know a little more about data: how it originates, and that it needs good data management to maintain its quality. Other good things, such as predictive analytics, then build on it. The intelligence process helps derive insights, and when insights are aligned with KPIs, they become actionable. We can also source data from unstructured formats, and thanks to modern storage and compute, both big data and small data can be consumed. Speaking of storage and compute, where can we get these?

Genius Jane: Let’s set up some time to talk about this. There are lots of topics to cover here. Aren’t these some amazing clouds by the lake? (Photo by Quino Al on Unsplash)

Picture of a lakeside with clouds hovering above

Clifford E. D'Souza

I'm a citizen data scientist and software engineer. I love to mine insights from big data for solving business problems & for supporting decision making.