Implementing Scalable Storage for Kubernetes on GCP (Cloud Next '18)
Hello, everyone. Welcome to Implementing Scalable Storage for Kubernetes on GCP: container mobility requires data mobility. I'm Jerome McFarland, VP of Marketing at Elastifile, and I'm gonna be joined today by my co-presenter, Katreena Mullican, Technology Director and Cloud Whisperer at the HudsonAlpha Institute for Biotechnology.

Now, in today's session we're gonna be talking about the critical relationship between container orchestration, as facilitated by Kubernetes, and the underlying data that many containerized workflows need to persist and manage. As a framework for this conversation, I'm gonna take you through a simple use case example involving web services implemented across a hybrid cloud. This example is going to let us discuss and illustrate some of the key considerations around data management and data mobility for these types of containerized workflows. Then Katreena's gonna join us, and she's gonna take you through a different use case and a demo illustrating how HudsonAlpha is leveraging Kubernetes to manage data-centric workflows across hybrid clouds in the scientific computing domain. So by the end of the session you'll gain a flavor for a fairly broad set of different use cases in this area. So let's get to it.

As a framework for the web services use case example, I want you to think about a multi-platform advertising campaign. In this advertising campaign we're promoting a product called Elastifizz. It's this naturally essenced, gluten-free, zero-calorie beverage that Elastifile has decided to put out. Now, in an effort to promote Elastifizz and drive traffic to our Elastifizz website, where we take orders by the caseload, we've decided to splurge a bit and purchase some commercial time during the recent World Cup final. Now, having high confidence in our marketing team and the effectiveness of the commercials that they're going to design, we expect a massive influx of traffic
to our website as a result of our new commercial. So of course we want to maximize our ROI; we want to get the best bang for our buck. So we need to make sure that the web infrastructure supporting our website can scale effectively to support this increase in load, this increase in demand. So that's the challenge before us.

Now, this type of challenge is a fairly common one for many organizations who are looking towards the cloud as a way to extend their IT. They may have some sort of bursty workload, or some increase in demand that they're expecting, and they want to leverage elastic infrastructure to be able to scale to support that demand. Of course the cloud has near-infinite elastic resources, so it seems like a perfect fit. So we should just tie everything together and make it all work, right? Should be simple. Well, unfortunately, as many of you may have experienced if you've tried to implement these kinds of workflows in the past, the making-it-work part hasn't necessarily always been so easy, right? In many cases, especially when you're dealing with data and the considerations around data, you may have found yourself implementing isolated storage islands in the cloud. You may have found yourself pushing data into the cloud with manual, forklift-type migrations. To manage the data in the cloud, you may have implemented complex schemes to copy data between storage islands in an effort to create some semblance of synchronization or data sharing, right? Now of course these types of manual methods definitely can work, but they can be extremely painful. Fortunately,
with what Kubernetes has brought to the table, and with the data management technologies and techniques that I'm going to be describing and that we'll be demoing today, implementing these sorts of workflows is easier than ever before.

Now, before we get into the implementation and some of the specifics around that, I want to take a quick step back and think about how we got here. So as you all know, no surprise, containers are everywhere. At this conference, at almost any conference in this space, Kubernetes, Docker, and containers are on everyone's lips, right? And that's been the case, honestly, for the past several years. But the thing that's changed more recently is that many workflows, core workflows and core use cases, are now being targeted for containerization that previously may not have been considered to be a containerizable, if I can make up that word, type of workflow, right? Containers and Kubernetes are no longer the exclusive domain of stateless applications and freshly written microservices. People want to use them for much, much more, and Kubernetes orchestration isn't just making these expanded use cases possible, it's really making them easy.

And when you think about these expanded use cases, data is a big part of them, right? We all know that Kubernetes and containerization offer massive benefits for packaging and mobilizing applications, and to extend those benefits to core workflows that involve data, you need a persistent data management layer, right? You need something to be able to store and manage data beyond the lifecycle of a given container or a given pod. So these types of solutions become necessary as you extend the scope of workflows and use cases that you're leveraging with containers. Lots of different applications have these requirements, right? From examples that involve web services, like we'll discuss today, to any kind of example that persists
or manages user data, to a broad range of industry-specific applications and domains like scientific computing for life sciences, or financial services, or media and entertainment, and there are many, many more examples.

Now, fortunately, Kubernetes, in anticipation of these requirements, has evolved to supply services that help you address these data-centric requirements and data-centric needs, right? With Kubernetes persistent volumes, you can seamlessly integrate a persistent storage layer into your Kubernetes deployments. With Kubernetes storage classes, you can control and manage the dynamic provisioning of storage by leveraging Kubernetes volume provisioners. And with StatefulSets, you can manage the ordering and management of certain sets of pods or containers within Kubernetes. So with all of these flexible capabilities and integration options that Kubernetes has given you, you have a lot of choices and options for what to do on the data side. So what should I deploy as a data management layer? What should it look like? Should I be deploying file storage? Should I be targeting block storage? Looking at open source solutions, or looking at vendor solutions? So next I'm going to go through some of the key considerations, at least in our view, that we think you should be taking into account as you think about what types of stateful services, and what kind of data management layers and data management platforms, you should be deploying. And then I'll introduce you to a qualifying solution from Elastifile and take you through the demo flow for the web services example.

So, the first key consideration, and there's going to be four. The first one that I want to talk about is usability. And when we think about usability, we're really talking about ease
of integration, ease of adoption, ease of use. And from the data management and storage perspective, you're really talking about how you interface with the data, and so really what kind of storage you're using: are you using file storage, using object, or using block storage? And in our view, file storage makes a lot of sense as the default storage
platform, the default storage interface, for stateful containers, for persistent storage implemented with Kubernetes, because of the broad range of applications and workflows today that rely on the file system and expect the file system, right? At last year's Next, in one of the presentations, Descartes Labs noted that 90% of the storage interface calls in open source GitHub repositories were calling fopen, right? So they were expecting a POSIX-compliant file system underneath, and those were by far the most prevalent type of storage call, by an order of magnitude over any other type of storage, including object. So really I think file is the way to go. In addition, of course, to being application-usable and application-accessible, file systems are also human-usable and human-accessible. We're all very familiar with file and directory hierarchies, what those mean, and how to traverse them. So leveraging a file system provides a high level of usability in conjunction with your containerized workflows.

Another key criterion to consider is scalability, right? As we all know, Kubernetes offers amazing capabilities for scaling application deployments and scaling your ability to handle workloads. So when you're implementing a persistent storage layer, you want to complement that. You want to have something that can scale to support the capacity and performance requirements at the storage level, to meet your needs of today for your workflows as well as your evolving needs as your workflows scale.

The third criterion is shareability, right? You want to get away from dealing with storage islands. It's a problem; it can create a lot of complexity in data management. So if you have a solution that can allow you to share data across your infrastructure, so across your nodes, across your storage devices, and present single namespaces that can scale out, that provides huge benefits for seamless collaboration across
applications, across users, and across workflows.

And the final key criterion that I want to discuss is around data mobility, right? So as we all know, Kubernetes provides extreme power for application mobility, the ability to package and mobilize applications. So having a complementary capability to package and mobilize your data can be hugely beneficial, especially when we're talking about hybrid workflows and the needs that arise to take data from on premises and be able to leverage and access that data in the cloud.

So those are the four key criteria, and now I'm going to talk a little bit about the Elastifile solution. Of course, you know, we feel that we meet these criteria fairly well, and I'll give you a sense for how you can leverage us as we go through the demo flow.

So, in a nutshell, the way to think about Elastifile is enterprise-grade, scalable, POSIX-compliant file storage for the cloud, right? You can deploy us in the GCP Marketplace, and essentially we consist of two key components. One is the core file system itself, the Elastifile Cloud File System, or ECFS as we call it for short. It's a software-defined, scale-out, distributed file system. The other key component of the Elastifile solution is what we call Elastifile Cloud Connect, and what Cloud Connect does is deliver data mobility between file systems and object storage. It really allows you to combine those two storage paradigms to get the best of both worlds, and I'll go into a little bit more detail about how Cloud Connect comes into play in these sorts of containerized
workflows and across hybrid clouds in just a moment.

But first, a bit about the file system. So ECFS is a software-only, environment-agnostic file system, right? You can run it in any cloud, and you can run it on premises as well. Essentially, what it's doing behind the scenes is aggregating the storage resources of your cloud VMs and presenting them to your applications in a single shared namespace. What it delivers to those applications is POSIX-compliant primary storage. When I say primary storage, I mean this storage is meant for applications and workflows and users to interface with directly, right? It can support transactional workloads, and it's really built for the cloud. All of the services that underlie the Elastifile solution are distributed, including the metadata management. As many of you who deal with storage may know, metadata management can often become a bottleneck to performance as a storage system scales. So our solution was designed for the cloud, and the metadata management model was implemented in a fully distributed fashion to ensure that you can scale effectively, with performance scaling linearly as you scale out.

Now, to the user, what Elastifile provides is scalable capacity and performance, right? You have the ability to add capacity to your cluster in a hot fashion, a dynamic fashion, without bringing the cluster down. You can remove it if necessary, and as I mentioned before, performance is gonna scale out linearly as you add capacity as well. There's a full enterprise feature set including snapshots, data reduction through compression and deduplication, high availability, asynchronous replication, etc. Everything that you can do with Elastifile through our GUI and through our CLI is also fully REST API controllable, so you can implement seamless workflows leveraging infrastructure as code, like Terraform
or Ansible, as well, with Elastifile. And as I mentioned, it's easily deployable at the push of a button through the GCP Marketplace.

So I'll just quickly show you a quick deployment. We're gonna spin up an Elastifile cluster here, and we're gonna call this one Google Next demo. You can do this through the Marketplace today. You're gonna select your zone and some basic networking information, and you're gonna hit deploy. Now, what's actually going to deploy here is the Elastifile management VM. That management VM is gonna give you access to our UI, and from within our UI you're gonna be able to specify the characteristics of the storage cluster that you want to create, the most important one of course being the capacity that you want to create. So here we're logging in to the Elastifile UI. When that's done, you can see our management VM's up and running. From our management console there, we're gonna select the type of storage that we want, and we're gonna add nine terabytes of capacity for this particular cluster. In this case, that's going to correspond to spinning up three instances
in GCP. And if we go to Compute Engine, you can see the management VM as well as those three storage nodes that are gonna be part of our cluster. They've all been started. And so if we go back to the Elastifile UI, you'll be able to see the deployment finishes, and now you have a cluster available with the capacity that you specified. So that's it, that's how easy it is, right? You can just spin it up at the click of a button in GCP, put in the capacity you want, and you're up and running.

Now, through that UI that you just saw a preview of, you can do lots of other things. You can manage capacity and monitor capacity. You can also manage and monitor performance. You can also look at the status of your environment, like the number of nodes in your cluster, the number of physical devices, the connections, etc. You can also scale the cluster through that same UI, right? You saw us deploy and specify a certain capacity; you can also scale it. And we're continually offering more and more options for the types of storage and the granularity of storage that you can deploy in a cluster, to give you more options to create the types of cost and performance trade-offs that you want in your environments, right? Even since the time when we recorded that previous demo clip, we've added additional options for adding capacity, and particularly you can see here we're showing SSD persistent disks and standard persistent disks.

You can also drill down into performance. You can look at it at the high level, you can drill down into certain time scales, and you can also look at performance for certain collections of data that we call Elastifile data containers. Those are segments of data within your file system that you can choose at will, as desired. And a data container is something that you can basically configure
with certain policies. So you can say, for this data container, this collection of files and directories, I want compression to be on, or I want deduplication to be on, or I want both of them to be on, and you can also set quotas. So it's a way of managing collections of data inside of your contiguous file system. And finally, you also have the system view, which enables you to see the health of your environment on a continuous basis.

Now, once you've got an Elastifile file system spun up in the cloud, you want to leverage it with Kubernetes, right? So to connect it into Kubernetes, we've also just released, just last week, the Elastifile storage provisioner, also available through the GCP Marketplace. What the storage provisioner does, calling back to those Kubernetes functions that I mentioned before, is create a storage class for Elastifile storage within GCP. Through that storage class, which you can see here, you can connect a cluster, and you can now specify persistent volume claims and persistent volumes that leverage and deploy Elastifile storage on a dynamic basis. So now you have the ability to connect Elastifile simply into any Kubernetes deployment by leveraging this provisioner.

So if we go back to our checklist really quick and see how we're doing. Usability: file interface, right? We provide NFS file storage, so we're very user and application compatible. It's POSIX compliant, and the key thing around POSIX is the strict consistency. So if an application writes data to our shared namespace and another application tries to read that data, the application that's reading is gonna get the latest data that was written, right? It's strictly consistent. It's fully REST API controllable, as I mentioned before, and we have a Kubernetes volume provisioner available in the Marketplace. I think we're doing okay on usability. Scalability:
as I mentioned, there's a fully distributed metadata model, so it allows you to scale out with linear performance scaling. You can add capacity or remove capacity as necessary, so it's highly elastic. Shareability: again, it's a shared file system, so you're unifying all these resources within a single namespace. So all VMs, all pods, all users can access the same data regardless of where they're connecting in from in the file space.

So the one piece that we haven't talked about yet is the data mobility piece, and that's where Cloud Connect comes into play. As I mentioned, Cloud Connect manages the transfer between file and object, right? It's a bidirectional transfer that it can manage, and it really blends the best of both worlds in use cases that involve both hybrid data management, so from on premises to cloud, and also in-cloud data management between object and file. So the thing to understand about Cloud Connect is,
really, what it's doing is helping you to encode the data of any given file system into object storage, so that it can be stored there in an inactive state until you're ready to use it. And then when you're ready to use it, you can pull the data out again, using Cloud Connect, into another file system. So in a hybrid mode, like what I'm highlighting here, you can start with data on-prem from any on-premises file system. It doesn't have to be Elastifile, but it could be. What Cloud Connect will do is compress the data, dedupe the data, and encode the data into object storage format, so in this case into a GCS bucket. And we're gonna retain all the file and directory hierarchy information, all the attributes, all the links, everything about the file system, but encoded into object format.

Then, when you're ready to use that data in the cloud and process it natively in your file system, in your primary storage, Cloud Connect can also allow you to check out the data. So putting it from file to object is a check-in; you can also check out the data from object to file and put it into an Elastifile file system that you're running in the cloud. You don't have to pull out the whole namespace. You can pull out certain files or directories; it's up to you what you want to pull out. So again, we try to be very efficient about how we transfer the data, because that can be a painful component for many organizations. So we compress the data, we dedupe the data, and after an initial synchronization, let's say from file to object in the cloud, any additional transfers will only ship across the diffs. So we try to be very efficient there. And Cloud Connect is also fully REST API controllable, so anything you can do through our UIs you can do using REST as well.
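The "only ship the diffs" idea described above can be illustrated with a minimal Python sketch. This is not Elastifile's implementation or API; the function names, the use of SHA-256 content digests, and the sample paths are all assumptions made for illustration. It simply compares a manifest from the previous synchronization with the current one to decide what would need to move.

```python
import hashlib
from typing import Dict, List, Tuple


def build_manifest(files: Dict[str, bytes]) -> Dict[str, str]:
    """Map each file path to a SHA-256 digest of its contents."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def plan_transfer(
    previous: Dict[str, str], current: Dict[str, str]
) -> Tuple[List[str], List[str]]:
    """Return (paths to upload, paths to delete) relative to the previous sync."""
    # Upload anything that is new or whose digest changed since last time.
    upload = sorted(p for p, digest in current.items() if previous.get(p) != digest)
    # Remove anything that existed before but is gone now.
    delete = sorted(p for p in previous if p not in current)
    return upload, delete


if __name__ == "__main__":
    # Hypothetical file contents standing in for an on-premises directory.
    v1 = build_manifest({"logos/a.png": b"A", "logos/b.png": b"B"})
    v2 = build_manifest(
        {"logos/a.png": b"A", "logos/b.png": b"B2", "characters/c.png": b"C"}
    )
    print(plan_transfer({}, v1))  # initial sync: every path is uploaded
    print(plan_transfer(v1, v2))  # later sync: only new or changed paths
```

A real system would also need to carry attributes, links, and directory structure, as the talk notes; the sketch covers only the change-detection step.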
The key thing to think about with Cloud Connect is this concept of a connection. A connection has a source and a target, and in the case of going from on-prem to the cloud, your on-prem location is going to be your source and your cloud bucket is gonna be your target.

So now let's go back to our use case example, right? So, Elastifizz: we've got this amazing product, we've got this commercial, and we're ready to scale our infrastructure. As a starting point, we've got some new data for our Elastifizz campaign that we've got on premises, and we want to get it into the cloud, right? To start off, we have a little bit of data in the cloud. As you can see here, we've got a couple of simple Elastifile logos and stuff. We want to bring over some of these new characters and some additional logos to get ready for this commercial and the influx of visitors that we're expecting to our website. So we want to stage this data and get it into the cloud. We're going to use Cloud Connect to bring the data over from our on-premises location into
GCP, into GCS, and that's what we're gonna show you. It's a pretty simple process. The first thing is, let's just look at the state of our environment. So we go to our WordPress back end, in our media library, and you can see, as I mentioned, we've got one logo. It's pretty lonely there, kind of poor. So let's go into Cloud Connect. We go to the Elastifile UI, we're gonna select Google, of course, and we're gonna call this an upload connection, as we're taking data from on-prem to the cloud. Our bucket name is gonna be GCP demo 1, and in this case, if we go to the Google Cloud back end just to show you, GCP demo 1 doesn't currently exist, right? That bucket has not yet been created. We're gonna create it as part of this process, automatically. So we're gonna select a check-in connection, and we're gonna go ahead and perform a check-in operation here.

Now, what we're doing here is we're gonna grab the source directory from our on-premises environment that has the data that we want to bring over. These are additional logos and files, etc. So now we've created the connection, and we're going to move forward and actually perform the check-in, and we'll give it a description of version one. So if you go back to the Google Cloud UI after the check-in has been created and performed, we're gonna refresh, and you'll be able to see GCP demo 1 has now been created, right? This bucket that didn't exist is now available. If we click into the bucket, you can see it's full of objects. These objects represent the file system data that was encoded leveraging Cloud Connect, again retaining all the file and directory hierarchy, all the links, all the attributes of that file system, but stored in cheap and deep object format, ready for you to use. So now we've got our additional data for our campaign staged
in GCS, in the cloud. So let's load it into the Elastifile file system, because we're gonna run our web services infrastructure on top of Elastifile, right? Just like we were running it on-prem, we're gonna run it using Kubernetes in the cloud, on top of Elastifile as our infrastructure. So let's do that.

So you can see now we're gonna create a download connection to check the data out. Again, we're gonna specify that same bucket, because we're pulling the data from GCP demo 1, and now our target is gonna actually be a file system that corresponds to a persistent volume claim that was created using the Elastifile storage provisioner, right? So we connected into Kubernetes with that storage provisioner and our storage class, and we're gonna specify that as our target. So we put that data in there, and we're gonna create our checkout connection.

So with that connection created, we're gonna actually perform the checkout. And here you can see some options for how you handle files in a checkout, if they were already existing, etc. So we'll perform the checkout. That
happens, and if we look in our in-cloud file system, you can see an image upload directory has been created that wasn't there before. And if you look, there are several files in that directory. So just to close the story, we'll go back to our media library. We'll look: okay, image upload directory, that's the one that we just created with our checkout. We're gonna select a few files, pull them into our media content library, and take a look and see what they are. And there you can see some additional logos, and we've got our cool little characters. So now we're ready, right? Our website is now primed with the additional content that we want to leverage in our scale-out infrastructure to support the influx of visitors from our commercial. So that's great, so we've got the data in the file system. Yeah, question?

So you can, you can use rsync to bring data across, but with Cloud Connect you get a lot of granular capabilities to manage the lifecycle of the data, in that you can have it stored in object until you're ready to use it. Let's say you pull it out into the file system; when you're done with the data and with processing in the file system, you can push it back to object and spin down the file system. So Cloud Connect gives you a little bit more granularity on how you manage the lifecycle of data, whether it's stored in object or in file, right? So again, trying to blend the best of both worlds there. Rsync might be characterized as one of those kind of manual methods to bring the data over that I mentioned before.

All right, so now we've got the data in our file system, so we're ready to go. So let's just do a little bit of work. Before we do the work, actually, let's come back to the data mobility piece and just make sure we close that. So we've shown hybrid cloud connectivity: we brought data from on-prem to the cloud, and we've shown the connectivity between file and object. And one thing I didn't mention is
that Cloud Connect is bidirectional. Of course, you know, I showed a flow bringing data from on-prem into object and then from object into the file system in the cloud. You can also bring data back, right? You can bring it back into object from the file system, and you can take it from object back into your on-premises environment if so desired. So I think we've covered the data mobility piece. So let's just do a little bit of work; let's make sure that everything kind of fits together and actually works properly.
Here we've already got a WordPress deployment spun up. You can see one WordPress pod. I also want to highlight that we've got this DB loader pod here, and what that's doing is actually generating synthetic I/O, reads and writes, on the back end of our storage infrastructure, just hitting the Elastifile cluster with some reads and writes. So we want to make the cluster work, in addition to the read commands that are going to be coming in through JMeter, which is what we're spinning up here, generating some workload through our web infrastructure. So we've got the web workload and we've also got some back-end I/O, and that will come into play later. So you can see everything seems to be working. We're handling about 44,000 requests per minute, so we're able to handle some traffic to our website.

So that's interesting, but really, with Kubernetes, what you want to be able to do is leverage the scalability, right? So let's scale the front end. Let's get some additional WordPress front-end pods going here in our GKE deployment. So that's what we're gonna do next. Here you can see we've got the initial WordPress pod that's there, so we're gonna scale to four replicas, and using kubectl we'll do that. You'll be able to see there are now four WordPress replicas up and running. So we scaled our front end, and you're gonna see the performance of our cluster scale up as well. We went from around 45,000, and we're gonna scale up to about 70,000 requests handled per minute.

Now, of course, scaling the front end is only one piece of the story, all right? Kubernetes provides seamless capability to do that, as we just discussed. Scaling the back end and scaling the storage can also be an important part of the story, right? What if you need additional capacity?
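The front-end scaling step described above (going from one WordPress pod to four replicas with kubectl) might look roughly like the following Python sketch, which just builds and runs a `kubectl scale` command. The Deployment name `wordpress` and the `default` namespace are assumptions; the talk does not give the actual resource names used in the demo.

```python
import subprocess
from typing import List


def scale_command(deployment: str, replicas: int, namespace: str = "default") -> List[str]:
    """Build the kubectl argument list for scaling a Deployment."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    return [
        "kubectl", "scale", f"deployment/{deployment}",
        f"--replicas={replicas}", "--namespace", namespace,
    ]


def scale(deployment: str, replicas: int, namespace: str = "default") -> None:
    """Run the command against whatever cluster kubectl is currently pointed at."""
    subprocess.run(scale_command(deployment, replicas, namespace), check=True)


if __name__ == "__main__":
    # "wordpress" is a hypothetical Deployment name, not taken from the demo.
    print(" ".join(scale_command("wordpress", 4)))
```

Keeping the command construction separate from the call that executes it makes the sketch easy to inspect without touching a cluster; calling `scale("wordpress", 4)` would require kubectl to be installed and configured.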
What if storage has become the bottleneck to the performance of your given applications, and you need more scale-out performance? So of course with Elastifile, as I mentioned previously, you can scale the file system as well. So we're gonna show you quickly how to do that too, to control all parts of your infrastructure. As I mentioned before, we had that one DB loader that was running and generating some synthetic I/O on the back end. So before we scale our cluster, we're gonna scale that DB loader. You can see we've now created four replicas for that DB loader, so we're increasing the amount
of I/O that's hitting our file system. The reason that we do that is because we wanted to be able to show, in the performance UI for Elastifile, the performance of the cluster scaling up, right, in conjunction with that additional I/O that's hitting it. We can see the additional performance growth there in the UI, while actually maintaining a fairly low latency, under two milliseconds. So you're able to see the ability to monitor the performance as your environment's workload changes.

So now let's actually add some capacity. Here we're going back to that same UI, and I'm going to add three terabytes of the type of storage I've selected. In this case, that's going to correspond to spinning up one additional node, all right? So as we saw before, that's happening in the back end on GCP. We're spinning up that additional infrastructure to supply that additional storage. So you'll be able to see now that node is active and available in our cluster. It's ready to deploy in our cluster, and we're going to deploy it, and that's to make it active. So once we hit deploy, now, seamlessly in the background, that additional capacity is being added into our cluster. There's no need to take any additional actions; your cluster performance and capacity are going to extend with those additional resources seamlessly,
simply by deploying it and adding it. And as you saw there, it's now active, and now we have four storage nodes in our cluster, giving us additional capacity and additional performance.

So we've actually done quite a few things here, right? We started with some data on premises that we wanted to get into the cloud. We spun up the Elastifile file system seamlessly, so we had that storage capacity available there. We pushed the data over to object storage. We decided, okay, we're ready to load the file system, and we pulled the data out of object into the file system. We scaled our WordPress infrastructure, and then we scaled our storage infrastructure. So we've done quite a few things in this kind of simple use case example. But really, storage and data management for Kubernetes are not topics that only relate to web services. They relate to a much broader range of applications and use cases, and with that in mind, I'll now hand over to Katreena from HudsonAlpha, who's going to take you through a different use case in the scientific computing domain.

Not in this example, no, but you can use [unclear] in conjunction with us.

All right, thank you, Jerome. It's great to be here. Hi, everyone. I am Katreena Mullican, Technology Director and Cloud Whisperer at HudsonAlpha. HudsonAlpha Institute for Biotechnology is a non-profit genomic
and genetic research institute located in Huntsville, Alabama. We leverage the power of the human genome sequence to better understand causes and early indicators for complex diseases such as cancer and Alzheimer's, and the genetic causes of rare and undiagnosed diseases as well. We are a leading genomic sequencing facility in the world for human health as well as agriscience endeavors. Researchers, entrepreneurs, and educators collaborate with each other and with over 40 associate companies located on our 160-acre campus in Huntsville. Our education program provides on-prem and virtual curriculum for adolescent students, their teachers, and lifelong learners around the world. And HudsonAlpha is also home to the first stand-alone genomic medicine clinic, which uses genome sequencing data to provide patient diagnosis. So that's a little bit about HudsonAlpha.

One of the great things about HudsonAlpha is our ability to rapidly adopt emerging technologies, and to do this we in IT have a DevOps mentality: treat infrastructure as code. Our on-prem composable infrastructure provides the backbone for a variety of use cases, including the one shown here, which is an on-prem Kubernetes cluster built on bare metal. The Kubernetes master and several worker nodes are provisioned using open source tools such as CentOS, Python, and HashiCorp Consul. Consul provides service discovery, and Datadog, which is a cloud-hosted monitoring service, provides information about our uptime and monitors application and infrastructure performance.

So what we're going to do today is talk about the power of provisioning this on-prem Kubernetes cluster as code, and then show why and how it can become a hybrid cloud Kubernetes cluster by extending onto GCP. First, let's take a look at the process of provisioning this on-prem cluster.

Okay, so here you see some Python. This
is just a way to provide
configuration for the master and worker nodes. Here you can see things like enclosure numbers, bay numbers, and template names. These are all custom attributes that are read by scripts that run as the nodes come up for the first time and provide a personalization experience, so the nodes know their role and what they're supposed to do. Consul plays in here also: the roles can read and write key-value pairs, so awareness of other nodes happens through this process.

There's a simple, common provisioning script that is used for all of our use cases. You'll see it makes a connection to OneView, in this case, for our composable infrastructure. It parses the configuration from the master and the worker files, creates server profiles, and sets custom attributes that will be read by those startup scripts. So the whole point here is: one line of code and the environment's provisioned. You write this script one time, you check it into version control, and this speaks highly to the accuracy and speed of delivering the resources. So with this one line right here, I'm going to go ahead and provision a Kubernetes master node.

All right, it's already starting to come up, so let's take a look at Consul. Consul runs as a Docker container, see right here. In addition to the key-value pairs, it also provides service discovery and the IP address, hostname, and status of every member that talks to this Consul server. Right now I just have myself, but as things are provisioned, this is a one-stop shop for me to come type this command, see what's in my environment, and be able to connect to it.

All right, looks like this one just powered on. That's the master. So we're going to rerun the same script, but this time create a worker, and I can repeat this for as many resources as I have. I can keep coming up with workers, and we do of course have multiple workers in our cluster.
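As a rough illustration of the provisioning flow just described, the sketch below parses node configuration and turns it into server-profile requests that carry custom attributes. Everything in it is hypothetical: the field names (`enclosure`, `bay`, `template`, `role`), the `build_profile_request` and `plan` helpers, the masters-first ordering, and the payload shape are assumptions for illustration, not HudsonAlpha's script or the composable infrastructure API.

```python
"""Hypothetical sketch: turn node configs into server-profile requests."""
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    name: str
    role: str        # "master" or "worker"
    enclosure: int
    bay: int
    template: str


def build_profile_request(node: NodeConfig) -> dict:
    """Build an illustrative server-profile payload.

    The custom attributes are what a first-boot script would read to
    learn the node's role. The payload shape is an assumption.
    """
    if node.role not in ("master", "worker"):
        raise ValueError(f"unknown role: {node.role!r}")
    return {
        "name": node.name,
        "template": node.template,
        "location": {"enclosure": node.enclosure, "bay": node.bay},
        "customAttributes": {"role": node.role},
    }


def plan(nodes: list[NodeConfig]) -> list[dict]:
    """Order masters first (an assumed policy), then build one request per node."""
    ordered = sorted(nodes, key=lambda n: n.role != "master")
    return [build_profile_request(n) for n in ordered]


if __name__ == "__main__":
    nodes = [
        NodeConfig("k8s-worker-1", "worker", enclosure=1, bay=4, template="k8s-worker"),
        NodeConfig("k8s-master", "master", enclosure=1, bay=3, template="k8s-master"),
    ]
    for request in plan(nodes):
        print(request["name"], request["customAttributes"])
```

Keeping the node description in data and the logic in one shared script is what makes the "one line of code" invocation possible: adding a worker means adding a config entry, not writing new code.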
While we're waiting for that to come up, I'm going to go back to Consul and actually check on things here, to see if the master is showing up. Looks like the worker just came on. There's our master, and we'll see the worker in a moment.

One of the things that the automation does is embed SSH keys, which allows me to do a passwordless connection as soon as the node's up. As soon as I verify it's up with Consul, I can do a passwordless connection into the Kubernetes master. Remember, the only thing I've typed is a Python script with an argument. I have not done any Kubernetes setup at this point; this is all automatic, part of the provisioning scripts. So here we come into the Kubernetes master. I'm going to do some sanity checks to see if it knows about the nodes. It knows about the master. It looks like the worker has been up twenty seconds and it's not quite ready, so I'm going to give it just another couple of seconds and check back. There it is.

Another thing that the automatic scripts do in this case is embed Datadog monitoring as a DaemonSet, so that's the next thing I'm going to check on, to see if this happened. It'll automatically deploy the DaemonSet per worker node, and of course Datadog, like I mentioned, is used for monitoring uptime and performance. So we see we do have a Datadog agent running on our worker.

What I'm going to do, just for the purpose of this demo, is an nginx deployment to show you. We're going to start with three replicas so I can scale it up later and see how this works. This is sanity checking for the environment, to make sure that my cluster and API are working. All right, and just as we expect, we have containers creating, and there are our pods. So far so good.

So this is all on-prem, bare-metal Kubernetes, provisioned as code on composable infrastructure at HudsonAlpha. And as a final check, we'll just check on Datadog once more. Yes.
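For reference, a three-replica nginx Deployment like the one in the demo can be described with a standard `apps/v1` manifest. The sketch below builds one as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The deployment name, labels, and unpinned `nginx` image are assumptions; the manifest actually used in the demo is not shown in the talk.

```python
"""Sketch: build an apps/v1 Deployment manifest for nginx (names are assumed)."""
import json


def nginx_deployment(replicas: int = 3, name: str = "nginx-demo") -> dict:
    """Return a Deployment manifest with the given replica count."""
    if replicas < 0:
        raise ValueError("replicas must be non-negative")
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            "replicas": replicas,
            # The selector must match the pod template labels.
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": "nginx",
                            "image": "nginx",  # assumed image; tag left unpinned
                            "ports": [{"containerPort": 80}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(nginx_deployment(replicas=3), indent=2))
```

Changing `replicas` and re-applying the manifest, or running `kubectl scale deployment nginx-demo --replicas=N`, is one way to scale such a deployment later.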
The DaemonSets are running, so I should have my master and worker, and it's coming up and starting to gather these metrics.

But what if the on-prem environment is not enough? Genomics takes a tremendous amount of data; we generate about five petabytes of data each year. One of the tricks with running in cloud is how you get your data into the cloud, and as Jerome mentioned, Elastifile is helping us do that. The amount of compute can sometimes not be sufficient in your on-prem environment either.

So let me just give you a genomic sequencing 101. When DNA and genomes are sequenced, that's represented as data. You take the data from a sequenced subject and you compare it to what we call a reference genome, and the process of figuring out all of the differences between you and the reference genome is known as variant calling. It's a very compute-intensive process. It requires a lot of scratch space and puts a lot of demand on your resources, and it's accomplished through what we call a pipeline, a set of tools that execute in a certain order. What
happens, though, is that new versions of the pipeline become available. We size our data center to handle the day-to-day, 24/7 variant calling, right? But then when these new versions come out, researchers want to go back, in addition to the day-to-day workload, and reanalyze everything with the new version, because they might glean some new knowledge from it. So in that case your compute requirements are definitely stressed beyond their normal workload.

The good news is that tools and pipelines are becoming containerized, and Kubernetes is therefore a great solution. This is what I set out to do at HudsonAlpha: figure out how to burst our on-prem Kubernetes solution into public cloud. Part of what makes that possible is things like a direct connection or VPN to public cloud; there are ways to connect the networks. But you also have to worry about connecting the storage, so that the on-prem worker nodes access the on-prem data and the cloud worker nodes access the cloud data. For us that's a matter of deploying Elastifile in the cloud and CloudConnect on-prem. I'll show a graphic in just a moment, but that was the next hurdle to overcome: how are we going to get the appropriate data into the cloud so that these worker nodes can chew on it?

The good news is that Consul, which I'm using for service discovery, can communicate with everybody because of the private networking set up between us and the cloud. And then of course your workload could be manually scaled, or ideally autoscaled, as soon as those resources become available.

So this is how this looks. On the left is Huntsville, Alabama. Here's our on-prem compute that we just talked about, with the master and workers; I showed how easy that is to spin up. Existing NAS or Elastifile on-prem can work with the CloudConnect node.
CloudConnect knows about our existing data. CloudConnect, as Jerome showed, pushes the data up to Google Cloud Storage, and then over in Google land we can push up as many Kubernetes workers as we'd like, and they all have access to that data. For the workers at the bottom here, the tool set is very similar, except I'm going with HashiCorp Terraform for my provisioning in public cloud instead of Python, and we've added Elastifile, of course, to the mix.

So let's take a look. This is my Terraform. It's pretty standard stuff: just define a Google Cloud provider. There are a few variables. Of course, at your discretion you can choose as many variables as you'd like for the environment; this just shows some of the ones I'm using here. One of the big differences here, to accomplish the automation, is that there's a scripts directory and a services directory. The scripts are Bash scripts that start up all of the applications we need and all of the automation, and the service files are systemd files.

So it really is as simple as telling Terraform how many nodes we want; in this case I just chose a number, so we're going to say three. It's purely vanilla CentOS 7. There's nothing special about that CentOS 7, and that's on purpose: I'm a big believer that you don't put the magic into the image, you put the magic into the code. The scripts and service file directories are going to be copied up, and then these commands are going to be executed on each node as it boots. Some of this is to add the Elastifile NFS mount to fstab, to set up my DNS for Elastifile, and to mount that directory.

So at this point I've written the code and pushed it up to version control. I'm going to do a sanity check here with Terraform and, if everything looks good, go ahead and apply it. We're going to watch this thing come to life in Google, and then it will automatically associate itself with my on-prem bare-metal Kubernetes cluster. And this is real time here.
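The boot-time mount step mentioned above can be pictured as appending an NFS entry to `/etc/fstab` and then mounting it. The sketch below only renders the fstab line and the commands as strings; it does not run them. The server name `elastifile.example.internal`, the export path, the mount point, and the mount options are placeholders, not values from the demo.

```python
"""Sketch: render an NFS /etc/fstab entry and boot commands (placeholder values)."""


def fstab_entry(server: str, export: str, mount_point: str,
                options: str = "defaults,_netdev") -> str:
    """Return one fstab line: <device> <mount point> <type> <options> <dump> <pass>."""
    if not export.startswith("/") or not mount_point.startswith("/"):
        raise ValueError("export and mount point must be absolute paths")
    return f"{server}:{export} {mount_point} nfs {options} 0 0"


def boot_commands(entry: str, mount_point: str) -> list[str]:
    """Shell commands a first-boot script might run; rendered, not executed."""
    return [
        f"mkdir -p {mount_point}",
        f"echo '{entry}' >> /etc/fstab",
        f"mount {mount_point}",
    ]


if __name__ == "__main__":
    entry = fstab_entry("elastifile.example.internal", "/data/root", "/mnt/elastifile")
    for command in boot_commands(entry, "/mnt/elastifile"):
        print(command)
```

The `_netdev` option tells the system the mount depends on networking, which matters for a node that mounts remote storage during boot.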
From the moment I type yes, I can immediately go check out GCP. Do a refresh, and they're coming. Here are the three nodes, of course, that I asked for. Go back, and, shaving a little bit of time off, it's been two minutes and 40 seconds and everybody's up. And my favorite command: are they here? Yes. I not only have my on-prem bare-metal Kubernetes cluster, I've got the three new worker nodes, I've got their IPs, I've got their status. So we're good to go.

I can connect into one of these if I want to, for sanity. I don't have to. I can say, you know, let's connect to it once, let's just make sure the scripts worked, let's make sure my data is there. So we're going to pick the first guy, we're going to connect to him and just do a simple df, looking for this line. So he's mounted up, and at that point this node can consume that directory as it needs to, for apps running on there.

So we exit back. We're going to repeat the SSH to our worker node, I'm sorry, our master node, and I'm going to ask it to get the nodes. We're going to see if everybody's here. Yep. So I've got my three GCP workers there, good to go. And I'm going to do a sanity check on Datadog, to make sure the DaemonSet's functioning as it should across to the cloud. I should have one, yep, one per GCP node over there.

And then as a final demonstration, instead of autoscaling it, I'm just going to force my nginx deployment to go to, like, 30, pick a nice number, now that I've got some more compute. I've got the three that I asked for initially; we're going to scale that up. Sorry about the typing, I guess I was a little tired. I don't know, it's kind of slow. I'm working on it. Okay, I scaled it up, and I'll list the pods again just to show you: everybody's
at the party now. We've got on-prem, from the bare metal, and we've got the GCP nodes, and each of these can be scaled independently based on which script you're executing. If you want more on-prem, you do the Python; if you want more in the cloud, you do the Terraform. But it's all code. It works every time, and it's repeatable. At this point you don't have to waste a lot of time on sanity checks anymore, because you know it's going to be the same every time. So here's Datadog. They've checked in, and I'm starting to gather metrics on those already. So this is a successfully scaled Kubernetes cluster, with my on-prem environment as one of the clouds, my private cloud, right, and then GCP as the public cloud.

Thanks, Katreena. Really cool stuff to see those types of workflows in action for real-world use cases, especially in domains like life sciences that can benefit so many of us. Just a couple of final thoughts, or things that we want to leave you with here.

If you remember anything, what we'd like you to really remember is that you have the capability to leverage Kubernetes stateful services to package, scale, and mobilize a broad variety of applications and workflows. Again, like I mentioned before, it's no longer the sole domain of microservices and stateless applications. You can leverage Kubernetes for much, much more, including applications that need to persist, manage, and mobilize data. We encourage you to evaluate your storage options very carefully. You have lots of options with Kubernetes, but
make sure that you're really considering the key selection criteria for your workflows, around usability, scalability, shareability, and data mobility, so that you have a solution that can provide for all the needs you're going to have, both today and in the future. And finally, of course, as we mentioned, we have a turnkey solution available from Elastifile through the GCP Marketplace, both for the file system and for the storage provisioner that provides the connectivity into Kubernetes. I also want to give a special thanks to Ronan Cohen, one of my collaborators at Elastifile, who helped develop all the demos that you saw today.

And finally, if you want to learn more about Elastifile, we're here in the South Hall of Moscone at booth S1503. We've got all the experts from our teams here, and we can talk about pretty much any angle of these types of solutions that you want to go through. So I encourage you to visit us.