Good Data

Clifford E. D'Souza
Dec 31, 2021


Curious Joe has learned about Data, Data Lakes, Data in the cloud, and Data to Insights from Genius Jane. In fact, the last time he met with Genius Jane, he heard the case of a manufacturer that used the data it owned to boost revenues and discover new strategic partners. Genius Jane promised that she would share another case when they met again. That day came late one evening at the office.

Curious Joe: Hi Jane, thanks for the meeting.

Genius Jane: Sure Joe! I’ve got 10 minutes before I get into another meeting. I’m going to talk to you about an organic farming company that employed high tech to manage its greenhouse operations. This included the use of IoT sensors at all its greenhouses. These devices capture readings such as temperature, humidity, and atmospheric pressure, and send the measurement data every 30 seconds to the company’s cloud streaming service.

Some real-time analytics is performed automatically inside the cloud infrastructure, and actuators in the greenhouse are activated accordingly. So if the greenhouse, where the produce is grown, gets too hot, the air vents are opened to let in the cooler outside air. If the humidity gets too low, the humidifiers are turned on automatically. You could call these smart greenhouses.
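To make the idea concrete, here is a minimal sketch of the kind of rule Jane describes. It is not the company's actual system: the `Reading` class, the threshold values, and the command names are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """One sensor measurement for a greenhouse (field names are illustrative)."""
    temperature_c: float
    humidity_pct: float


# Hypothetical limits; real values would depend on the crop and the season.
MAX_TEMPERATURE_C = 28.0
MIN_HUMIDITY_PCT = 60.0


def decide_actions(reading: Reading) -> list[str]:
    """Return the actuator commands implied by a single reading."""
    actions = []
    if reading.temperature_c > MAX_TEMPERATURE_C:
        actions.append("open_air_vents")       # too hot: let in cooler outside air
    if reading.humidity_pct < MIN_HUMIDITY_PCT:
        actions.append("turn_on_humidifiers")  # too dry: add moisture
    return actions


# A hot, dry reading triggers both actuators.
print(decide_actions(Reading(temperature_c=31.0, humidity_pct=55.0)))
# ['open_air_vents', 'turn_on_humidifiers']
```

Notice that the rule acts on whatever number the sensor reports. It has no way of knowing whether that number is true.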

Picture depicting a scene in office where Curious Joe is seen discussing something with Genius Jane.
Scene rendered with Blender with the actors posed in Daz 3D Studio

Curious Joe: That’s a precise use of technology. I guess they end up with the best harvest with high-quality produce.

Genius Jane: Yes, they do, and the demand from consumers has been increasing. Not before learning the hard way, though. Initially, the automated system worked so well that everyone was happy with it and trusted it blindly.

However, after a few years, the quality of the organic produce started to degrade. Many unripe fruits and vegetables were found in the harvest baskets, unlike in previous years.

Curious Joe: I guess people cringed because of the lack of consistency and the below-optimal harvest.

Genius Jane: The people who loved the automated system were concerned and wanted the problems fixed as soon as possible. An investigation to find the root cause was ordered. After a few days, the problem was identified.

Picture of an Internet of Things (IoT) sensor that is plugged into a Raspberry Pi WiFi development board.

(Photo by Denis Cosmin on Unsplash)

Curious Joe: Well…what was it? (Sounding really eager and wide-eyed at the same time)

Genius Jane: It was the IoT sensors and the data they were producing. Every greenhouse of the company relied on a single sensor for the temperature, humidity, and atmospheric pressure readings. Due to the sensors’ age and the moist conditions in the greenhouses, some of the sensors started producing readings that were 20 to 30% off from the actual conditions.

This led the machine learning inference models hosted in the cloud to send unnecessary commands to the actuators in the greenhouse. In one case, the air vents were opened, which caused the temperature in the greenhouse to drop below the optimal level.
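A small worked example shows how a drifting sensor can cause this. The numbers below are assumptions for illustration: the story gives neither the real vent threshold nor the actual temperatures, and a simple threshold check stands in for the inference models.

```python
# Hypothetical vent threshold; the story does not give the real value.
VENT_THRESHOLD_C = 28.0


def vents_should_open(temperature_c: float) -> bool:
    """Simplified stand-in for the cloud-side decision logic."""
    return temperature_c > VENT_THRESHOLD_C


def drifted(true_value: float, relative_error: float) -> float:
    """Simulate a sensor reporting true_value off by relative_error (0.25 = 25% high)."""
    return true_value * (1.0 + relative_error)


actual_c = 24.0                       # a comfortable greenhouse
reported_c = drifted(actual_c, 0.25)  # the ageing sensor reports 30.0

print(vents_should_open(actual_c))    # False: no action is needed
print(vents_should_open(reported_c))  # True: the vents open anyway
```

The decision logic behaves exactly as designed. The only thing that went wrong is the input.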

Curious Joe: So…bad data in this case!

Genius Jane: Good data always represents the truth or fact. In this case, the measurements were different from the truth, leading to incorrect decisions, such as the premature opening of the air vents that I just mentioned.

Curious Joe: What did they do to solve their problem?

Genius Jane: The company replaced the IoT sensors with more robust and durable hardware. These were much more expensive. However, the cost was less than the revenue they had lost because of the unripe produce. Furthermore, they signed an annual maintenance contract (AMC) with the IoT sensor manufacturer.

They also did one more thing. Every company-owned greenhouse now had two sets of IoT sensors installed. This built in resiliency and better observability of the data. Both sets sent their measurements at the same time to the cloud streaming service.

In the cloud, a serverless program compared the two readings, and if there was a difference of more than a pre-defined threshold between the two, an SMS text notification was sent to the greenhouse management to alert them.
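The comparison itself can be very small. The sketch below is one possible shape for it, not the company's actual program: the metric names, the threshold values, and the `notify` callback (a stand-in for whatever SMS service is used) are all hypothetical.

```python
from typing import Callable

# Hypothetical per-metric thresholds, in the same units as the readings.
THRESHOLDS = {"temperature_c": 2.0, "humidity_pct": 5.0, "pressure_hpa": 3.0}


def find_disagreements(sensor_a: dict, sensor_b: dict,
                       thresholds: dict = THRESHOLDS) -> dict:
    """Return {metric: absolute difference} for metrics that differ by more than the threshold."""
    return {
        metric: abs(sensor_a[metric] - sensor_b[metric])
        for metric, limit in thresholds.items()
        if abs(sensor_a[metric] - sensor_b[metric]) > limit
    }


def check_greenhouse(greenhouse_id: str, sensor_a: dict, sensor_b: dict,
                     notify: Callable[[str], None]) -> bool:
    """Compare the two readings and call notify(message) if they disagree."""
    disagreements = find_disagreements(sensor_a, sensor_b)
    if disagreements:
        details = ", ".join(f"{m} differs by {d:.1f}" for m, d in disagreements.items())
        notify(f"Greenhouse {greenhouse_id}: sensors disagree ({details})")
        return True
    return False


# Two readings taken at the same time in the same greenhouse.
a = {"temperature_c": 24.1, "humidity_pct": 71.0, "pressure_hpa": 1012.0}
b = {"temperature_c": 30.2, "humidity_pct": 70.0, "pressure_hpa": 1012.5}

# print stands in for an SMS gateway here; only the temperature gap exceeds its threshold.
check_greenhouse("GH-1", a, b, notify=print)
```

A check like this can only tell that the two sensors disagree, not which one is wrong. That is why the alert goes to a person who can inspect the hardware.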

Curious Joe: So if the readings were off from one another, that would mean one of the two IoT sensors had failed, because they were taking readings of the same greenhouse. And this would need immediate attention because one of the IoT sensors had started producing bad data.

Genius Jane: That’s right. No sensor failure event has been captured yet, though. If it does happen in the future, the failure would be caught almost immediately.

Curious Joe: Fixing the problem in time saves the day. It’s interesting to see how the organic food producer ensured its data quality was good by automatically monitoring it as close to the source as possible.

Genius Jane: That’s the way to go. All the analytics that come after that are highly dependent on the data quality. You know, Garbage In, Garbage Out (GIGO).

Curious Joe: I agree 100%. And thank you for sharing this story.

Genius Jane: All good. Also, with my 10 minutes up, I’ve got to rush to my next meeting.

Curious Joe: Sure Jane. See you later.

Picture of a greenhouse with ripening produce hanging from the plants on either side of an aisle.
Greenhouse with produce

(Photo by Markus Spiske on Unsplash)

Originally published at https://www.linkedin.com.

Written by Clifford E. D'Souza

I'm a citizen data scientist and software engineer. I love to mine insights from big data for solving business problems & for supporting decision making.
