With the introduction of IoT devices and a better understanding of data collection, we can create far deeper insights.
As such, in this episode of PixieTV, Stuart will be explaining, and drawing, the architecture of a recent project that focused on obtaining and interpreting data for the benefit of everyone.
Welcome to the third episode of PixieTV: where we’ll be passing on our industry insights in bitesize videos.
This episode follows on from our previous conversation around enterprises moving VDI to the cloud. This week, our Data Science & HPC practice lead, Stuart Anderson, will walk through RedPixie’s health-related solution that turns everyday data into essential insights.
Beyond this solution, Stuart’s team gets involved in a lot of different things: from building high-performance computing stacks to re-platforming and re-engineering architectures in Azure.
Designing an IoT solution that provides insights
One of the easiest ways for us to describe what we do at RedPixie is to take you through something we’ve built and designed.
The true essence of data science insights.
It’s an end-to-end IoT architecture that takes telemetry from wrist-worn wearable devices. From there, a whole suite of technologies and services works with that data up in the cloud.
From that point, it’s about some of the interesting things you can do with that data: turning data into insight, and insight into action.
1. Data collection
This all begins with a wrist-worn wearable device, which is a Samsung Gear S3.
We find it an interesting device – almost an iPhone on your wrist.
The Gear has several interesting sensors:
- Optical heart rate
- Gyroscope and more
It’s also 3G enabled, so it can send over 3G/4G directly. More importantly, there are more of these great devices coming.
This is the start for this kind of technology.
So, we capture all the data from our devices and send it to an IoT Hub in Azure.
However, the amount of data we take from each sensor depends on the need. Heart rate is captured once a second, while temperature is only captured every 30 seconds, all intended to optimise battery life.
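As a rough sketch of that sampling logic (only the heart-rate and temperature intervals come from the episode; everything else here is an illustrative assumption):

```python
# Hypothetical per-sensor sampling intervals in seconds. Only the
# heart-rate (1 s) and temperature (30 s) figures come from the text.
SAMPLE_INTERVALS = {
    "heart_rate": 1,
    "skin_temperature": 30,
}

def due_sensors(last_sample, now):
    """Return the sensors whose sampling interval has elapsed."""
    return [sensor for sensor, interval in SAMPLE_INTERVALS.items()
            if now - last_sample.get(sensor, 0) >= interval]

# Five seconds in, only heart rate is due; at the 30-second mark, both are.
print(due_sensors({"heart_rate": 4, "skin_temperature": 0}, 5))
print(due_sensors({"heart_rate": 29, "skin_temperature": 0}, 30))
```

Sampling each sensor only as often as the use case needs is what keeps the battery alive for a full day.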
Sending the data
This data is then sent to an IoT Hub in Azure, a service capable of supporting millions of simultaneously connected devices with bi-directional communication.
The largest tier supports up to 300 million messages per day per unit, and you can just keep adding units.
As you can see, this scales to the wider population right off the bat.
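To put that 300-million figure in context, here is a back-of-the-envelope sizing sketch. The one-message-per-second rate and the absence of batching are simplifying assumptions, not details from the project:

```python
MESSAGES_PER_DAY_PER_UNIT = 300_000_000  # largest tier, per unit

def units_needed(devices, messages_per_device_per_second):
    """Ceiling of daily message volume over per-unit capacity."""
    daily = devices * messages_per_device_per_second * 86_400
    return -(-daily // MESSAGES_PER_DAY_PER_UNIT)  # ceiling division

# A million wearables, each sending one message a second:
print(units_needed(1_000_000, 1))  # 288 units
```

In practice, devices batch several readings into each message, so the real unit count would be far lower; the point is that capacity scales linearly by adding units.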
2. Processing the data
Next, several Stream Analytics jobs sit behind the IoT Hub.
These listen to all the telemetry coming through and run reasonably simple logic on it.
These SQL-like queries are looking for things like:
- Does the battery need charging?
- Is the heart rate or skin temperature raised?
- Other simple measures that are easy to catch.
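In Python pseudocode, those checks amount to something like the following. The thresholds and field names are illustrative assumptions, not values from the project:

```python
# Hypothetical thresholds; the text only says the queries check for
# low battery and raised heart rate or skin temperature.
def classify(reading):
    """Return the list of simple alerts a telemetry reading triggers."""
    alerts = []
    if reading["battery_pct"] < 20:
        alerts.append("battery_low")
    if reading["heart_rate"] > 120:
        alerts.append("heart_rate_raised")
    if reading["skin_temp_c"] > 38.0:
        alerts.append("skin_temp_raised")
    return alerts

print(classify({"battery_pct": 15, "heart_rate": 90, "skin_temp_c": 36.5}))
# ['battery_low']
```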
When one of these queries recognises that something negative has happened, it passes the event on to another service called an Event Hub.
The Event Hub is just another messaging layer.
If the battery life looks like it is getting low, we might need to contact the person to say –
“Hey, you might need to charge your device tonight.”
Equally, if the heart rate is slightly worrying, we might just send a message –
“Are you okay? Yes or no.”
The other messaging services that we’ve created can take various actions depending on the severity of the event that has taken place.
If it’s serious, we hit an API of a care provider we’re working with and route the person into a different workflow, where they can be contacted and, if it looks like they’re in serious distress, an ambulance can be sent.
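A minimal sketch of that severity-based routing; the event fields, severity levels, and endpoint names are all hypothetical:

```python
def route(event):
    """Pick a downstream action for an event, mirroring the flow above.
    Severity levels and endpoint names are illustrative assumptions."""
    if event["severity"] == "serious":
        return "care_provider_api"   # may escalate to an ambulance workflow
    if event["type"] == "battery_low":
        return "sms:charge_reminder"
    if event["type"] == "heart_rate_raised":
        return "sms:are_you_ok"
    return "no_action"

print(route({"severity": "serious", "type": "heart_rate_raised"}))
# care_provider_api
```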
Those insights can save lives!
One of the Stream Analytics jobs has another string to its bow: it sends all the data to a petabyte-scale Azure data warehouse service.
This is a distributed storage engine capable of some interesting storage techniques: data can be stored row-wise, column-wise, highly compressed, and in many other ways.
In addition, we have some visualisation technology sitting on top of the data warehouse, which means we can draw dashboards to ultimately understand what is going on.
Lastly, there is a data lake.
The warehouse supports hundreds of thousands of users; however, keeping that data indefinitely inside it isn’t the most efficient approach.
As such, we dump it into a data lake, which is just an enormous pool of blank storage you can move everything into. At some point in the future, you can come along and attach something like a Spark cluster to it.
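The warehouse-to-lake handoff can be sketched as a simple retention split. The 90-day hot-storage window is an assumption; the episode doesn’t specify one:

```python
from datetime import datetime, timedelta

RETENTION_DAYS = 90  # assumed hot-storage window; not stated in the episode

def tier(rows, now):
    """Split telemetry rows into warehouse (hot) and data-lake (cold) sets."""
    cutoff = now - timedelta(days=RETENTION_DAYS)
    hot = [r for r in rows if r["ts"] >= cutoff]
    cold = [r for r in rows if r["ts"] < cutoff]
    return hot, cold

now = datetime(2018, 6, 1)
rows = [{"ts": datetime(2018, 5, 30)}, {"ts": datetime(2018, 1, 1)}]
hot, cold = tier(rows, now)
print(len(hot), len(cold))  # 1 1
```

Recent data stays queryable in the warehouse; everything older goes to cheap lake storage where compute can be attached later.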
3. Applying machine learning
The final step of the programme is applying some machine learning.
The Azure machine learning service looks at the data warehouse and attempts to build models that apply to everyone.
Therefore, if we can capture the lead-up to a serious event like a stroke, hopefully that would be generalisable to a whole population.
The lead-up to a stroke generally looks the same across a population. Equally, we also want to build models that are specific to you.
What does normal look like for you?
Let’s model that. Let’s model your every day, and then see when you deviate from that.
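As a stand-in for that per-person model, a simple z-score against your own history captures the idea of “deviating from your normal”. The threshold and sample data are illustrative, not from the project:

```python
import statistics

def is_deviation(history, value, z_threshold=3.0):
    """Flag a reading far outside this person's own baseline.
    A z-score check stands in for the ML model described above."""
    mean = statistics.mean(history)
    sd = statistics.stdev(history)
    return abs(value - mean) > z_threshold * sd

baseline = [62, 64, 63, 65, 61, 63, 64, 62]  # this person's resting heart rate
print(is_deviation(baseline, 110))  # far outside their normal -> True
print(is_deviation(baseline, 64))   # within their normal -> False
```

The same reading can be normal for one person and alarming for another, which is exactly why population-wide models and personal models are both needed.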
Expanding this principle
Azure services are helpful in that the Stream Analytics service can natively hook into the machine-learning models, enabling a great feedback loop.
Put your data in the data warehouse and the machine learning models can learn from it, and your Stream Analytics jobs can immediately score incoming data against those models.
The ability to do this at this scale is backed by Microsoft’s world-class data centres. All of which combines to make it interesting, and revolutionary.
Applying this elsewhere
While there may seem to be lots of parts to the puzzle, it all starts with an IoT device.
The architecture is agnostic of the end device, so we’re talking to other companies about telematics boxes in cars. The input is different, but the rest of the process is the same.
The cloud scale that you get right off the bat is what makes this different to other services.
While all these components have existed in various bits and pieces for a long time, we can now turn them on and start using them cheaply.
We can also scale them up massively.
The same goes for the Stream Analytics jobs and the size of the data warehouse.
I can pause it, and I stop paying for it. I can scale it up hugely when I want to do some compute. It costs more, but it’s efficient.
Why it interests me
What’s interesting to me about this architecture is that my background is in healthcare.
I spent eight years in clinical research at a prior company, where we did lots of this kind of architecture work for different industries.
Now, with RedPixie, we can change how people’s health is managed, stopping people from just rocking up at hospital when they don’t have to.
If we can stop something from getting worse, like people falling over, we can avoid hip fractures. Hip fractures accounted for 1.4 million bed days in the NHS in England last year.
It would be great to hear your thoughts on manoeuvring data science to get great insights. Stay tuned for more videos – if you feel compelled, feel free to subscribe on YouTube.