From Fortran on the Desktop to Kubernetes in the Cloud: A Windows Migration Story
Hey, how you doing? My name is Elton. I'm a Docker Captain and a Microsoft Azure MVP, and thanks for joining me for this session here at DockerCon Live. This is actually my sixth DockerCon, and although the commute was much easier this time, it's a shame that I can't meet you in person, but I hope you'll learn a lot from this session anyway.
In my day job I'm a consultant, helping companies move their applications to containers, and this session is all about a real-world project where I helped a client migrate their existing application to containers so they can run it in the cloud. The subtitle of the session is "A Windows Migration Story", but the focus here is on migration and not specifically on Windows. The key thing about this kind of project is that you're trying to maximize reuse. You're trying to take an application that already works well, and you just want to run it in a different place and give a different way of accessing it, and that's what I'm going to be focusing on here.
Now, these real-world sessions are often interesting but hard to relate back to your own projects, so as well as explaining what we did and giving you some good tips and best practices, there are some major themes that will apply to any containerization project.
The first one is about flexibility. If you've been using Docker for a little while, you know about the flexibility to run your applications in containers on any platform. And if you're using multi-stage Dockerfiles, then you can build your apps in containers too, using whatever infrastructure you like: Jenkins or GitHub Actions or Azure DevOps, and it's all the same. But there's one more aspect of flexibility that containers give you, which is flexibility in your application design. For a long time I've been saying that containers are not just something that lives in the dev world or the ops world; they're also going to affect the security team, IT management, and architects.
Having the ability to run components of your application in containers, on potentially different platforms, gives you a lot of flexibility in how you design your app.
The next one is speed. As I walk through the stages of this project, even at a high level, you'll get an idea that we managed to do a lot of work very quickly because we were reusing so much stuff from the community. We built the new components really quickly, we had them up and running in the cloud super quickly, and because of the flexibility that we built into the design, we were able to change pretty quickly and have different ways of running our application.
The last one is about standardization. In this session I'll be using Docker and Kubernetes, and there's a learning curve for both of those. Docker is easier to pick up, but you'll get a lot more benefit from it if you invest the time to learn it properly. Then when you're comfortable with Docker, you can move on to Kubernetes, which has its own much steeper learning curve. But if you invest the time in learning Docker and Kubernetes, and moving your applications to them, you get this layer of standardization.
That standardization is in how you model your applications, using Kubernetes manifests and Dockerfiles, and in how different parties can consume your application. If it's something that third parties can run themselves, all they need to have is a Kubernetes cluster there. And if you're adding new members to the team, if they've got a good understanding of Docker and Kubernetes, they'll be up and running super quick.
OK, so let's move on to talking about the project. In its current incarnation, the whole application runs on the desktop. There's the UI component, and that passes on most of the work as job requests to a separate compute component. In the real project, that compute component was all about working out the best places to put wind turbines to harvest the maximum amount of energy, and it was doing some really hardcore computational fluid dynamics that was super compute-intensive. So if you wanted to run this stuff, although it was a desktop application, you had to have some pretty serious kit to be able to run it yourself. You construct your parameters in the UI, the computations all run in this separate binary, and the two parts communicate over the filesystem. And this is a pretty long-standing application.
All the hard work is done in the compute module, which uses Fortran, and the rest of the application was written in .NET, and parts of the app are at least 15 years old.
Now, the goal is to be able to turn this into more of a SaaS model and have a cloud offering. Our initial approach for that was to take the compute module and Dockerize it so we could run it in a container, and host that in a container platform in the cloud. That would also have a REST API, also running in a container, which acts as the control plane for the compute layer, so you can submit jobs to be computed and you can query the status of existing jobs.
We wanted the new delivery to be able to work with the existing desktop, so you would still have your desktop UI, but instead of calling the compute layer directly, it calls into a shim which looks the same but actually sends the request for the compute on to the API. The shim also takes care of moving the data around: it uploads the files to a cloud blob storage component, which is where the compute container writes the output, and then the shim copies that down into the local filesystem. So as far as the UI is concerned, it's all working in exactly the same way, but now we're hosting the whole compute layer in the cloud and we can burst that to whatever level of scale we need.
The other advantage of moving this to the cloud is that we now have a public API which we can use with other consumers. The ultimate goal is to have an alternative to the desktop UI, which is a website that works in exactly the same way and uses the same compute module that the desktop UI calls into, which is actually a wrapper around the original Fortran code.
OK, so let's see how it looks. I've got a bunch of demos to show you, but instead of doing some hardcore computational fluid dynamics stuff, which I don't really understand myself, my demo app is just going to be computing pi. Although it's a much simpler application, it has some of the same concerns, and I'll be following through the same process that we took with the real application. All the demos I'm going to show you are up on GitHub, and at the end of the session I've got a page of links that will tell you where you can find this stuff.
We're going to start by looking at the Dockerfile for the compute module. It's a pretty simple Dockerfile. I'm using Windows containers, so you can see that my base image is using the .NET Framework, and I've pinned to a specific version, so I'm using 4.8 of the .NET Framework running on Windows Server 2019. In this approach I've got a really simple Dockerfile where I'm just capturing things like the entry point and the command to run the application. This Dockerfile really just captures the API: what inputs am I expecting, and what outputs am I going to push out? The application itself comes from another Docker image.
The purpose of doing this is to make it simpler to work with the image that I'm going to build repeatedly. If I want to change the default variables or package in new configuration, I'm working with a nice small Dockerfile that's going to be quick to build. The main Dockerfile has got all the complicated installation steps. In the real application that's installing all the dependencies, installing the application itself, setting up the registry: a whole bunch of complicated stuff that I don't need to work with day-to-day. So by breaking that apart into two separate images, the core image that's got the installation steps and the day-to-day image that just has the compute API, I'm making it much easier to work with.
So let's open up my terminal, and we'll see that I'm in Windows container mode here. If I do docker version, I've got the Windows client and the Windows server. OK, so we're running Windows containers. I've got Docker Desktop installed on my Windows 10 machine and that's all I need.
Firstly, I can just run this container, which wraps up my legacy application. This could be my Fortran application that's doing my computational fluid dynamics, but in this case it's just computing pi. By default, the settings that I've got in my Dockerfile give me pi to six decimal places, but the container's set up to allow me to specify how many decimal places I want. So if I want pi to 100 decimal places, I use the same container and just pass in a different input. The beauty of that is I've wrapped up this complicated step into a single Docker container that I can run anywhere.
So if I want to run a compute job in the cloud instead, I can do that with Azure Container Instances. I can just create a new container from that same Docker image, and I'm using the same kind of API, so I can specify how many decimal places I want. Also, because I'm in the cloud now, I can specify the amount of compute that I want. In this case I can say I want one CPU and two gig of RAM, and this will create a container running up in Azure. And if I decide I want something a bit bigger so that I can run a bigger computation, I'm using the same image again and the same API. I pass in a different parameter for the number of decimal places, and I can use a bigger compute instance to run the same container image.
Now, this is the Windows container image, and that works fine with Azure Container Instances. You can spin up a container from a Windows image and it will run on a Windows server.
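To make the "default of six decimal places, override with an input" calling convention concrete, here is a minimal Python sketch of a compute entry point. It is only an illustration: the `--dp` flag name and the Machin-formula implementation are assumptions, since the talk does not show the real binary's arguments or algorithm, and this version truncates rather than rounds.

```python
import argparse


def pi_digits(dp: int = 6) -> str:
    """Return pi truncated (not rounded) to `dp` decimal places."""
    guard = 10  # extra digits to absorb integer-division error
    scale = 10 ** (dp + guard)

    def arctan_inv(x: int) -> int:
        # scale * arctan(1/x), summed with the alternating Taylor series
        term = scale // x
        total = term
        x2 = x * x
        n = 1
        sign = -1
        while term:
            term //= x2
            n += 2
            total += sign * (term // n)
            sign = -sign
        return total

    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    scaled = 4 * (4 * arctan_inv(5) - arctan_inv(239))
    digits = str(scaled // 10 ** guard)
    return digits[0] + ("." + digits[1:] if dp else "")


def main(argv=None) -> str:
    # "--dp" is a hypothetical flag name; the default mirrors the
    # six-decimal-place default described in the talk.
    parser = argparse.ArgumentParser(description="Compute pi (illustrative)")
    parser.add_argument("--dp", type=int, default=6)
    args = parser.parse_args(argv)
    return pi_digits(args.dp)
```

Calling `main([])` uses the default, and `main(["--dp", "100"])` asks for more digits through the same interface, which is the property that lets one image serve both the quick check and the long-running job.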
One of the disadvantages of having a legacy application packaged in a Windows container image is that they're pretty big. The base image I'm using has got the latest version of the .NET Framework built on top of Windows Server Core 2019, and that's a fairly big surface area. That's why my image is six and a half gigabytes, even though my own application is just a few tens of megabytes.
If I go and look at my Azure portal, I see I've got my two container instances here. Here's the one I created to compute pi to 5,000 decimal places. If I look at the containers, I see a whole bunch of events: when the image started to be pulled, when it got pulled, when the container was started. And if I look at the logs, then I see the output of pi to 5,000 decimal places. I won't go through all that, because you can probably work out it's just a big load of numbers.
If I look at the other instance, this is pi to 250,000 decimal places, and this is still running. Even though this is running on a bigger instance and there's more compute power available, it's still quite a lengthy calculation, so there'll be no logs in here yet. It'll run as long as it needs to and then the container will just end. It's the same user experience, but I've got much more compute available to me and I can scale out as much as I need.
That's all fine, but as a developer or a tester working on the system, you don't want a lengthy wait between running each container just to check that everything's working OK. That's why, when you're looking at migrating an existing system, it's a really good idea to create a stub of the application in a separate Docker image, which runs really, really quickly but has the same API.
If I close my terminal here and have a look at the stub Dockerfile, this is based on a Linux container image using Alpine, so it's going to be a super tiny container image. It's got effectively the same API, so the way you call it uses the same entry point, which is the name of the real binary. But in this case it isn't a binary that runs in the container, it's just a simple script. If I have a look at the script, all it does is echo out pi to two decimal places.
That might seem pretty useless, but what it means is I can run that container super quickly. If I'm working on some other aspect of the application, I can fire off a calculation and check that it runs correctly, and I don't need to know pi to 250,000 decimal places. I can do that with the real computation, but when I'm iterating on the project I can run something that looks like the real computation module without having to wait for it all to spin up.
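The stub idea can be sketched in a few lines of Python: something that accepts the same arguments as the real module, ignores them, and returns a canned answer, so the calling code never changes. The function names and the `--dp` argument here are hypothetical; the stub in the talk is a shell script in an Alpine image, not Python.

```python
from typing import Callable, Sequence

# Both the real module and the stub take the same arguments and return text.
Compute = Callable[[Sequence[str]], str]


def stub_compute(argv: Sequence[str]) -> str:
    """Accept the real module's arguments, ignore them, return a canned answer."""
    _ = argv  # deliberately unused: the stub does no real work
    return "3.14"


def run_job(compute: Compute, decimal_places: int) -> str:
    """Caller code stays identical whichever implementation is plugged in."""
    return compute(["--dp", str(decimal_places)])
```

With this shape, `run_job(stub_compute, 250000)` returns immediately, and swapping in a real implementation with the same signature needs no change to `run_job`.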
If I now switch to Linux containers using Docker Desktop, and open my terminal back up again, this time my stub has a different image name, but the way I use it is pretty much the same. I do a docker run and I get the output. Now it's super fast, because it's a Linux container that's just running a single echo command. If I want to ask it for more decimal places I can do that, and it will work in the same way. It doesn't give me the right result, because that's not what it does: it just ignores the input that I pass in. But it does give me an actual result, so I can use it in place of the real thing.
Similarly, if I want to try this up in Azure Container Instances, I can do the same kind of thing. I can do an az container create and the API is pretty much the same. The command line that I pass is the same as I would use for the real Windows image, but this time I'm saying that I want to run on Linux, and I can have a tiny instance with one CPU core and half a gig of RAM, and it will work in the same kind of way.
The main difference, of course, is that this is a Linux image based on Alpine, so instead of being six and a half gigabytes it's actually just five megabytes. So it's super fast to work with. It cuts down a lot of the waste in your development and testing cycles to have a stub that looks and feels like the real thing but doesn't do the complicated compute.
The other advantage of that, particularly if you're moving Windows applications and you want to run in Kubernetes, is that by having a Linux stub you can run the whole thing in a Kubernetes cluster on your laptop without having to spin up a whole bunch of virtual machines.
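As a rough illustration of how the Windows image and the Linux stub share one calling convention in Azure Container Instances, the sketch below builds the argument list for `az container create` in Python. The resource group, container names, image names and command line are all hypothetical, and the `--restart-policy Never` setting is an assumption based on the container simply ending when the job finishes.

```python
from typing import List, Union


def aci_create_args(
    name: str,
    image: str,
    os_type: str,
    cpu: Union[int, float],
    memory_gb: Union[int, float],
    command_line: str,
    resource_group: str = "pi-demo",  # hypothetical resource group
) -> List[str]:
    """Build an `az container create` invocation for a one-off compute job."""
    return [
        "az", "container", "create",
        "--resource-group", resource_group,
        "--name", name,
        "--image", image,
        "--os-type", os_type,          # "Windows" or "Linux"
        "--cpu", str(cpu),
        "--memory", str(memory_gb),    # in GB, fractions allowed
        "--restart-policy", "Never",   # assumption: run once, then stop
        "--command-line", command_line,
    ]


# Same command line for both; only the image, OS type and size differ.
real = aci_create_args("pi-win", "example/pi:windows", "Windows", 1, 2, "pi --dp 100")
stub = aci_create_args("pi-stub", "example/pi-stub:linux", "Linux", 1, 0.5, "pi --dp 100")
```

The resulting lists could be passed to `subprocess.run` on a machine with the Azure CLI installed and logged in; nothing is executed here.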
At the moment there isn't a good development experience for having a hybrid Kubernetes cluster where you can run Windows pods and Linux pods. You can run a Linux-only Kubernetes cluster super easily, though, and if you can isolate the compute components and replace them with stubs, then you can work locally in Kubernetes using just Linux containers.
So I've got that option here, using this Kubernetes manifest. This is going to create a Kubernetes job, which is like a batch job, which will create a pod to run my application container. My application container is using that same Linux container stub that's just going to echo out pi, and I configure it in the same way, using the same sort of command, passing in the decimal places that I want it to work with.
Let me clear this down to give us some space. If I have a look at my nodes, just to make sure that I'm connected to the right cluster, I've got Docker Desktop running. If I apply my manifest to create my job, that's going to submit my job for me. When jobs get submitted they create a pod, and they add a label which has the job name. So if I get the pods and filter on the label, I should see that my pi stub is there and it's completed, and if I look at the logs then I see the fake calculation of pi coming from my stub.
The reason why that's useful is I can work on my Kubernetes job definition, I can work on my manifest, I can get everything running locally, and then as soon as I'm ready I can submit that to my real cluster in the cloud, which has Windows nodes that will run the application for real. The only real difference there is that in my manifest I'm specifying the real Docker container image, which is my Windows image that computes pi.
I've also got a node selector down here that says this should run on a Windows node, and that's all I need to do. As long as my cluster has some nodes in there that can run Windows containers, I can use exactly the same kind of job definition that I use locally to run up in the cloud.
So from Docker Desktop here, I can switch to my AKS cluster, which is running up in Azure. Let's make the switch, and now if I look at the nodes, just to confirm that I have got Windows nodes in there, I've got multiple nodes in here and one of them is running Windows. If I create a Windows version of that job, it's exactly the same way of describing my application and exactly the same way of interfacing with it. I should see this job in there, and this is in the status ContainerCreating, so my Windows node is busily downloading the image. If I look at the pod here, then I will see that it successfully pulled the image, and if I now go and look at the logs, it's computing pi to 100,000 decimal places. And that's done, and the final digit is 6, and that might come up in a pop quiz at some point.
If you're running with a managed Kubernetes cluster, most of the cloud offerings let you burst workloads out from your cluster into some other container platform, and that's exactly what AKS lets you do. I can submit a similar job definition, but this one's configured to use the Virtual Kubelet, which in AKS means I can burst out and create ACI instances, just like I manually created with the az command, but Kubernetes will do that for me. So when I submit the jobs, if there's no space left on my cluster to run a new job, it will automatically burst out to ACI and I can scale out to whatever I need.
OK, so let's go back.
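The local and cloud job definitions described in that demo differ only in the image and the node selector. A minimal Python sketch of that idea follows; it builds the Job as a plain dictionary and serializes it to JSON, which `kubectl apply -f -` accepts as well as YAML. The job names, image names and `--dp` argument are hypothetical, and the manifests in the talk's repo may be structured differently; `kubernetes.io/os` is the standard node label for operating system.

```python
import json
from typing import Any, Dict, List, Optional


def pi_job(name: str, image: str, args: List[str],
           node_os: Optional[str] = None) -> Dict[str, Any]:
    """Build a batch/v1 Job; add a node selector only when an OS is required."""
    pod_spec: Dict[str, Any] = {
        "restartPolicy": "Never",
        "containers": [{"name": "pi", "image": image, "args": args}],
    }
    if node_os:
        # Standard well-known label set on every node by the kubelet.
        pod_spec["nodeSelector"] = {"kubernetes.io/os": node_os}
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {"template": {"spec": pod_spec}},
    }


# Local run with the Linux stub, cloud run with the Windows image.
local_job = pi_job("pi-stub", "example/pi-stub:linux", ["--dp", "100"])
cloud_job = pi_job("pi-win", "example/pi:windows", ["--dp", "100000"], node_os="windows")
cloud_json = json.dumps(cloud_job, indent=2)
```

Keeping the builder identical for both cases mirrors the point of the demo: the job model is worked out locally against the stub and then reused unchanged, apart from image and node selector, on the hybrid cluster.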
What we saw in that demo is effectively the POC stage of the project, where we just did the bare minimum to get this stuff up and running, to see how it looked and what it was going to give us.
Firstly, we took the original compute module and we packaged that up in a Docker container, which became the base image that has everything it needs to run the application. But we don't use that directly. Instead, we have a compute image which has a much simpler Dockerfile for us to use and iterate on day to day, and it pulls the binary that it needs out of that base image.
As we built up the compute image, we realized we were spending a lot of time moving images around and a lot of time running computations that we didn't really care about the result of. As part of developing the wider project, I don't really need to run the real calculations. That's why we have this stub. The stub looks the same as the main compute module, but it's much lighter, much faster, and much easier to work with.
What that gives us is the ability to run either of those components in any container platform. I can run it in Docker on a server or on my laptop, or I can run it in Kubernetes in the cloud or on my laptop, and it behaves in the same way wherever I run it. Obviously if I'm using the stub I don't get the real results, but I get exactly the same user experience. I can use the same Kubernetes manifests and the same Docker commands, and using the stub lets me keep up the pace of the project without constantly running the same computations.
Because of that focus on speed, we were able to get that first stage of the project done pretty quickly. We spent about seven days from the initial consultation about what we were aiming to achieve, through to getting this stuff up and running in Kubernetes in the cloud, running the real computations, and also running an equivalent stack locally on the developer machine, just using Docker Desktop with Kubernetes.
That flexibility was something that we really wanted to keep in the project as we went forward. So the next stage was building out the API that abstracted away the details of running the computation, whether it's directly in a Docker container or through a Kubernetes job. We knew that we wanted to be able to run on different platforms, so we wanted the API not to be exclusively tied to Kubernetes but to have the option to invoke the compute component in some other container platform. So we focused on maintaining that flexibility in the design, and we're going to go straight back to the next set of demos to see how that looks with the pi application.
OK, so let's clear this stuff down and go to my demo 2. I'll start by switching back to Windows containers, and I'm going to run my API. It's just a REST API which gives me access to my pi compute layer, but in a nice way. I'm going to run the container and mount the Docker socket, or in the case of Windows that's the Docker named pipe, so when this API runs it's going to be able to talk to the Docker API and create new containers to run the jobs.
So let's start this up. Inside the API container there's a config file that tells the API how to submit compute jobs, and in this case it's using the Docker processor. So it's expecting to be able to talk to the local pipe and create new containers, it's going to be using the real Windows container image that really does the computation, and it's expecting to run on Windows. That's all fine, and that's why I switched back to Windows container mode.
OK, so my API is up and running. I make a curl request to my API, and I'm just posting a job here and asking it to compute pi to 700 decimal places. That goes to the URL where my container is listening, and the response gives me the processing ID, which could be a container ID or a Kubernetes job name. With this I can grab the container ID and do a docker logs, and I get my output here.
So the API here is running locally in a Docker container. When I submit a job request, it creates another Docker container which executes the job, and then I can see the logs coming back out. So the API is decoupled from how the job gets processed.
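One way to picture an API that is decoupled from how jobs get processed is a small set of interchangeable "processors" chosen by configuration. The Python sketch below is an assumption about the shape of that design, not the project's actual code (the real API is a .NET component, and its config keys and class names are not shown in the talk). Each processor here only describes what it would do rather than calling Docker or Kubernetes.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class DockerProcessor:
    image: str

    def plan(self, decimal_places: int) -> List[str]:
        # Command the API would run against the local Docker engine.
        return ["docker", "run", "-d", self.image, "--dp", str(decimal_places)]


@dataclass
class KubernetesProcessor:
    image: str

    def plan(self, decimal_places: int) -> Dict[str, Any]:
        # Minimal Job the API would create through the Kubernetes API.
        return {
            "apiVersion": "batch/v1",
            "kind": "Job",
            "metadata": {"generateName": "pi-"},
            "spec": {"template": {"spec": {
                "restartPolicy": "Never",
                "containers": [{"name": "pi", "image": self.image,
                                "args": ["--dp", str(decimal_places)]}],
            }}},
        }


# Hypothetical config keys: "processor" selects the strategy, "image" the workload.
PROCESSORS = {"docker": DockerProcessor, "kubernetes": KubernetesProcessor}


def create_processor(config: Dict[str, str]):
    kind = config["processor"]
    if kind not in PROCESSORS:
        raise ValueError(f"unknown processor: {kind}")
    return PROCESSORS[kind](image=config["image"])
```

Under this sketch, moving from a local Docker run to a Kubernetes job is a change to the config dictionary only, which is the property the talk attributes to the API's config file.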
It's still running the same container image that I've been running up until now, so we're keeping that flexibility. We've really isolated the original compute module, the original legacy application. We've wrapped that up, and now we're moving it around, running it in different ways, and giving ourselves different ways to consume the application.
So if I switch back here: because I have my stub, which runs in Linux containers, I've got the ability to run my API in Kubernetes and have the API submit a Kubernetes job for me, and that will run my Linux container stub locally. The only difference between what I'm running locally and what I deploy to Azure is the config file for my API.
Let's switch here and make sure I'm using the right Kubernetes cluster, which is Docker Desktop. The configuration file I'm going to deploy here to Kubernetes says that we're going to use the Kubernetes processor. So when the API receives a request to compute pi, instead of creating a Docker container, it's going to create a Kubernetes job. There's custom code for all that to happen, but it's really pretty simple, and it's all up on GitHub if you want to see how that works. The Kubernetes job will start a pod, the pod will run a Docker container, and that Docker container is going to be using the stub image this time. So I'm going to be running the API locally, in Linux, in my Kubernetes cluster, and I'll be using the Linux stub when I compute my jobs in Kubernetes.
OK, so let's go back here and deploy my local configuration first.
This is the config file saying I'm going to use the stub when I compute pi, and then I deploy the whole rest of the API. I won't go through all these YAML files because I've only got half an hour, but if you want to look at them they're all up there. It creates a deployment for the API, which creates a ReplicaSet; there's a ConfigMap for the configuration details; and there's a service account in there and a whole bunch of RBAC stuff so that the API can talk to the Kubernetes API and create new jobs.
OK, so if I have a look and make sure that's all running, it all looks good. I've got my API service, which is listening on port 8083, and my pods are up and running, so all is good.
Now, it's the same API. Although it's running in Linux rather than Windows, and although it's running in Kubernetes rather than straight Docker, I still talk to it in the same way. So I can curl my localhost on port 8083 and ask for pi to 100 decimal places. In the response this time I've still got that processing ID, but this time it's a Kubernetes job name. So if I do kubectl get jobs and pass it that processing ID, I'll see it's completed. And if I look at the logs, again using that same processing ID, I get the stub response, which is pi to two decimal places, because it's not running the real computation. As a developer working on the API, I don't need the real computation. I don't need to submit pi to 500,000 decimal places and wait 10 minutes for it to compute. All I want to do is test that it gets invoked correctly and gives me some sort of output, so this is perfect.
The last thing I want to show you is the final aspect of that flexibility. When I deploy this to my real cluster up in the cloud, which has a mixture of Linux and Windows nodes, I'll be using exactly the same Kubernetes manifests. The only difference will be the configuration file for the API.
The ConfigMap that I'm using in Azure still says to use the Kubernetes processor. I'm not using ACI here, but I've got a flag in my API so that I could burst out to ACI if I wanted to. This time I'm using the Windows image and the Windows platform, so when I deploy to Kubernetes in the cloud, this will be the API configuration. I still use the API in the same way, but my API pods will be running on Linux nodes, and my actual compute pods that get created by the jobs are going to be running on Windows nodes.
OK, so let's switch out of here and go back to my production cluster, which is my hybrid cluster. Let's just check that I'm in the right place. Yeah, I've got my Windows nodes there. So I'll start by deploying the Azure configuration, and then all the rest of the API components. OK, that looks good. I do a quick get to make sure they're all there, and that's all looking good. Inside here I've got my load balancer API service, so that's the public API for my service, and I can go and call this now.
I've got a nice little format string in here for kubectl that will give me the full URL of my service. That's just kubectl get service, but I'm passing it a format string that takes the public IP address and adds on my port and my URL path, and that gives me this address here. I could have piped that straight into my curl command so I don't have to worry about what the actual IP address is. This is a LoadBalancer type of service, so I get a public IP in Azure.
So if I compute pi to 10,000 decimal places, I get my response back with my processing ID, and that's my job name. If I do kubectl get jobs with my processing ID, I see I've got my one completion. That's 18 seconds old, and it only took 6 seconds to run. If I go and look at the logs again for that same job ID, then I get my output there, and that's running in my Windows container.
So I'm using exactly the same API code and exactly the same API modeling in my Kubernetes manifests. I've got a different configuration, and I've got the flexibility inside my API to invoke my compute module in different ways depending on the platform it's running on. OK, so let's go back to the slides.
That second stage, as we built up the API, was the evaluation stage.
We called it that because we were looking at some different options, and we wanted to see what the move to the cloud was going to give us. As we saw in the demo, we ended up with a few different ways to run our application.
We can run it all locally using Docker Desktop, with Linux containers in Kubernetes, using the stub for the compute module. So we can still run the full end-to-end, but without the real results and the amount of time it takes to compute those results. That's a really nice pattern if you're trying to migrate something that's compute-heavy or I/O-intensive: you can stub that stuff out and focus on what you're building around your legacy component, and that really speeds up your delivery.
Then we had the option to run in the cloud with a hybrid Kubernetes cluster. On Azure, AWS or GCP you can run a Kubernetes cluster that has a mixture of Linux and Windows nodes if you want to run a hybrid solution like this. In our case we're running the API, which is a brand-new .NET Core component, and that invokes the original compute module by creating Kubernetes jobs which we configure to run on the Windows nodes.
What you also get with AKS is the ability to burst out from your own cluster, which has a fixed number of nodes, and run additional compute using Azure Container Instances, ACI, in a serverless deployment model, which allows us to scale to pretty much any level that we need. We can run with a fixed cluster that's good for the regular compute requirements, and know that we can burst out to whatever size we need.
As part of this evaluation phase, when we realized that we'd broken all these things into different containers that we can run in different ways, we looked at one other option, which is not having Kubernetes at all: running the API as a serverless function, using the same container image, and bursting out to ACI in the same kind of way. That lets us run the same compute job but on a scaled-down platform that doesn't require all the heavy lifting you get with Kubernetes, of having to manage all the manifests and the cluster itself.
So just by carefully designing the approach to migrate that existing application into containers, and designing the flexibility around it, we had all these options for when we went live to host the application in different ways. That stage of the project was pretty quick too. It took about ten days to get to this point of having the Kubernetes option, the local Docker Desktop option, and the Azure Functions option, all using the same core components. The API is the same everywhere and the compute module behaves the same everywhere, even though I might be using a stub when I'm running locally.
OK, so just to go back to the themes that I pointed out originally. Flexibility was really core for this project, and it should be core for any migration project. You've got something that already works and you want to move it to a different runtime platform. Moving it to containers should be the last big move that you make, because from then on you can run it on all sorts of different platforms using the same container logic that you've already invested in.
The speed part of this was really important. Simple things like focusing on how long it takes your Docker images to build and optimizing that, focusing on your dev workflow, and optimizing around the fact that the containers look and feel the same even though they're doing something different internally: those really shrink down your development cycles and help you be a lot more productive really quickly.
And that last theme is about standardization.
If you go and look at the GitHub repo and you look at all the Kubernetes manifests, and you're not familiar with Kubernetes, it's not entirely clear what all those things are doing. But if you are familiar with Kubernetes, you don't have to have any understanding of what the API does or what the compute module itself does, because you can see how the application has been modeled just by looking at the YAML files. So the benefit of going through the learning curve and really learning deeply about Docker and Kubernetes is that those skills are transferable to other projects and other organizations. And when other people come into your team, if they've already got that knowledge, they'll be up and running pretty much straight away.
OK, we're just about done. I said there'd be a bunch of links at the end, and here they are. First of all, this is where you'll find all the demo code that I went through today, so you can try running this stuff yourself. You only need Docker Desktop installed to run the stub versions. If you want to try out the hybrid versions, then you need to spin up a hybrid Kubernetes cluster, which you can do in AKS or AWS or GKE.
If you're interested in going through the learning curve, I've written a book called Learn Docker in a Month of Lunches, and I'm currently writing the sequel to that, which is Learn Kubernetes in a Month of Lunches. They're both published by Manning and they're both very task-focused. There are 20-odd chapters, and the idea is that each chapter has a whole lot of exercises that you follow through, and a lab at the end to help you cement what you've learned. Each chapter should take you about an hour, and both those books have a really clear learning journey that starts from the beginning and layers on more and more understanding until you're happy to run your own apps in Docker and Kubernetes. If you're interested in those books, there's a discount here that you can use on the Manning website.
And if you're not a book reader and you're more into videos, then most of my content is on Pluralsight, which is an online training platform. It's kind of like Netflix for geeks: you pay your monthly subscription.
And there's something like 7,000 courses on there now. I've done 20-odd courses that cover all sorts of stuff, like monitoring containers with Prometheus, using Istio with Kubernetes, and doing containerized builds with Jenkins. There's a whole lot of stuff, and I'm sure you'll find something interesting there.
So thank you again for watching. My name is Elton. I hope you found the session useful, I hope you enjoy the rest of DockerCon, and maybe next year we can meet at DockerCon in person.