azure, data science, machine learning in production

Building a Data Science Environment in Azure: Part 1

For the last few months, I have been looking into how to create a Data Science environment within Azure. There are multiple ways to approach this, and the right one depends on the size and needs of your team. This is just one way in a space where there are many others (e.g. using Databricks).

Over the next few months, I will be running a few posts about how to get this kind of environment up and running.

First off, let’s mention some reasons why you might be looking to set up a Data Science sandbox in Azure rather than on-premises.

Reasons why:

  • On-prem machines too slow.
  • Inappropriate (or no) tooling on-prem (and slow deployment of relevant tools to local machines).
  • Slow IT process to request increased compute.
  • On-prem machines are underutilised.
  • Different needs per user in the team. One person may be running some heavy calculations, whilst someone else just runs some small weekly reports.
  • Lack of collaboration within the company (perhaps cross-department or even cross-region).
  • No clear process for getting models into production.

Once you have clarified the why, you can start to shape the high level requirements of your environment. Key requirements of our Data Science Sandbox could be:

  • Flexibility – Allow IT to retain control whilst still giving the data scientists choice.
  • Freedom – Enable data scientists by giving them the freedom to work with the tools they feel most confident with.
  • Collaboration – Encourage collaboration, the sharing of methods, and the ability to re-use and improve models across the business.

You will want to think about who your users are, what tools they are currently using and also what they want to use going forward.

At this point, you might do a little scribble on a piece of paper to define what this might look like in principle. Here is my very simple overview of what we are going to be building over the next few posts. I’ve taken inspiration from a number of Microsoft’s own process diagrams.

Let’s take a look in more detail at the above process.

  1. We have our Data Science sandbox, which is where the model build takes place. The lab has access to production data, but users may also need to make API calls or access their own personal files located in Blob Storage, etc. This component is composed of a number of labs (via DevTest Labs), which could be split by team, subject area and so on.
  2. Once we have a model we would like to move to production, the model is version controlled, containerised and deployed via Kubernetes. This falls under the DataOps activity.
  3. The model is served in a production environment, where we capture the inputs and monitor the performance of our model. For now, I have this as ML Service, but you could also use MLflow or Kubeflow.
  4. This feeds back into the model, which can be retrained if necessary and the process starts again.

The main technology components proposed are:

  • DevTest Labs
  • Docker
  • Kubernetes
  • ML Service

In the next post, we will start setting up our Data Science environment. We will start by looking at setting up DevTest Labs in Azure.

docker, kubernetes, machine learning in production

Deploying an ML model in Kubernetes

A while back I started looking into how to deploy and scale Machine Learning models. I was recommended the book Machine Learning Logistics by Ted Dunning and Ellen Friedman and started to look into their proposed method of deployment. So far, I have only got as far as containerisation and orchestration, so there is still a whole lot more to do 🙂

I thought I would offer an easy tutorial to get started if you want to try this out. I’m not going to talk about a production-ready solution, as that would need a fair bit of refinement.

There are various options for doing this (feel free to let me know what you might be implementing) and this is just one possible approach. I guess the key is to do what fits best with your workflow process. 

All of the code is on GitHub, so if you want to follow along then head there for a more detailed run through (including all code and commands to run etc). I’m not going to put it all in this post as it would be very long 🙂 

For this project I decided to run everything from the Azure DSVM (Data Science Virtual Machine). However, you can run it locally from your own machine. I ran it from a machine with the following spec:

Standard B2ms (2 vcpus, 8 GB memory) (Linux)

You will need:

  • Jupyter Notebooks (already on the DSVM)
  • Docker (already on the DSVM)
  • A Docker hub account
  • An Azure account with AKS (Azure Kubernetes Service)

Building the model

I won’t go much into the model code, but basically I built a simple deep learning model using Keras and the open-source wine dataset. The model was created by following this awesome tutorial from DataCamp!

I followed the tutorial step by step and then saved the model. Keras has its own save function, which is recommended over using pickle. You need to save both the model and the scaler, because we will need the scaler to normalise new data in the Flask app.
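
As a rough sketch (the full code is in the repo), saving and re-loading might look like this, assuming model is the trained Keras model and scaler is the fitted StandardScaler from the tutorial, and with hypothetical file names:

import joblib  # older sklearn versions bundle this as sklearn.externals.joblib
from keras.models import load_model

# Keras’ native save writes the architecture and weights to a single HDF5 file
model.save('wine_model.h5')
# Persist the fitted scaler so the Flask app can normalise inputs the same way
joblib.dump(scaler, 'scaler.pkl')

# Later, in the Flask app, both can be loaded back:
model = load_model('wine_model.h5')
scaler = joblib.load('scaler.pkl')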

Building a Web app using Flask and Containerising it

If you are using the DSVM, then under the ‘Networking’ options you need to add an ‘Inbound Port Rule’ for port 5000. This is the port our Flask app will run on.
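
If you prefer the CLI to the portal, the same rule can be added with the Azure CLI (resource group and VM names are placeholders):

az vm open-port --resource-group <your resource group> --name <your dsvm name> --port 5000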

For building the docker container, I used this easy to follow ‘Hello Whale’ tutorial by Codefresh as a reference. 

I built a simple Flask app, which predicts whether a wine is red or white, using sliders to set values for the attributes available in the dataset. As I mentioned, the code for the app is on GitHub. It’s not the prettiest app, feel free to beautify it 🙂
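
To give a flavour of what the app does, here is a simplified sketch (not the exact app.py from the repo; the template name and form field handling are assumptions):

import joblib
import numpy as np
from flask import Flask, request, render_template
from keras.models import load_model

app = Flask(__name__)
model = load_model('wine_model.h5')   # the model saved earlier
scaler = joblib.load('scaler.pkl')    # the scaler saved earlier

@app.route('/', methods=['GET', 'POST'])
def predict():
    if request.method == 'POST':
        # One value per slider/attribute in the form, in dataset order
        features = [float(v) for v in request.form.values()]
        X = scaler.transform(np.array(features).reshape(1, -1))
        # Sigmoid output: above 0.5 means one class, below means the other
        label = 'red' if model.predict(X)[0][0] > 0.5 else 'white'
        return render_template('index.html', prediction=label)
    return render_template('index.html')

if __name__ == '__main__':
    # Bind to 0.0.0.0 so the app is reachable from outside the VM/container
    app.run(host='0.0.0.0', port=5000)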

You will also need to create a Dockerfile and a requirements.txt file (both in the GitHub repo linked above). The Dockerfile contains the commands needed to build the image, and the requirements.txt file lists all of the packages that your app needs in order to run.

You will need to make a folder called flask-app and inside place your app.py file, your Dockerfile and your requirements.txt file. 
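
For reference, here is a minimal sketch of what the Dockerfile might look like (the exact version is in the repo; the base image and setup here are assumptions):

FROM python:3.6-slim

WORKDIR /app

# Install the packages listed in requirements.txt (flask, keras, tensorflow etc.)
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy the app code into the image
COPY . .

# The Flask app listens on port 5000
EXPOSE 5000
CMD ["python", "app.py"]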

Navigate via the CLI to the flask-app folder and then run the following command:

docker build -t flask-app:latest .

Now to run the container (-d runs it detached in the background, and -p 5000:5000 maps host port 5000 to the container’s port 5000):

docker run -d -p 5000:5000 flask-app

If you want to stop a docker container then you can use the command:

docker stop <container_name>

Be sure to use the name of the container and not the image name, otherwise it won’t stop. Docker assigns its own weird and wonderful names unless you specify otherwise using the --name flag.
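
For example, using a hypothetical container name:

docker run -d -p 5000:5000 --name wine-flask flask-app

docker stop wine-flask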

Upload the image to Docker hub

You will need a Docker Hub account to do this. Log in to Docker Hub using the following command:

docker login --username username

You will then be prompted to enter your password. Then run the following commands to tag and push the image into the repo.

docker tag <your image id> <your docker hub username>/<repo name>

docker push <your docker hub name>/<repo name>
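
Filled in, these might look something like this (the image id here is a placeholder; the repo name matches the image used in the Kubernetes step below):

docker tag 1a2b3c4d5e6f josiemundi/flask-app:latest

docker push josiemundi/flask-app:latest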

We now have our image available in the Docker Hub repo.

Deploying on Kubernetes

For this part I used Azure’s AKS service. It simplifies the Kubernetes process and (for the spec I had) costs a few pounds a day. It has a dashboard UI that launches in the browser, which lets you easily see your deployments, and from there you can do most of the stuff you can do from the CLI.

I set up a low-spec cluster for Kubernetes:

Standard B2s (2 vcpus, 4 GB memory) and with only 1 node (you can scale it down from the default 3). 
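
If you would rather create a similar cluster from the CLI, something along these lines should work (resource group and cluster names are placeholders):

az aks create --resource-group <your resource group> --name <your aks cluster> --node-count 1 --node-vm-size Standard_B2s --generate-ssh-keys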

Now we can deploy the app from the Docker Hub image.

Log in to your AKS cluster with the following command:

az aks get-credentials --resource-group <your resource group> --name <your aks cluster>

Pull the image and create a container:

kubectl run wine-app --image=josiemundi/flask-app:latest --port 5000

If you type:

kubectl get pods

You can see the status of your pod. Pods are the smallest deployable unit in Kubernetes, and they are how Kubernetes groups containers; in this case our container sits alone in its pod. It can take a couple of minutes for a pod to get up and running.

Expose the app so we get an external IP address for it:

kubectl expose deployment wine-app --type=LoadBalancer --port 80 --target-port 5000

You can check the status of the external IP by using the command:

kubectl get service

This can also take a couple of minutes. Once you have an external IP, you can head on over to it and see your app running!

To delete your deployment use:

kubectl delete deployment <name of deployment>