Cloud OnAir: KubeFlow: Machine Learning + Kubernetes

Welcome to Cloud OnAir, live webinars from Google Cloud. Just a reminder, we are hosting these webinars every Tuesday. My name is Jay Smith, I'm a cloud customer engineer here at Google, and today I will be talking to David Aronchick.

Nice, thank you very much.

Today we're going to be talking a little bit about Kubeflow. And just as a reminder, we encourage you to ask questions anytime on the platform. We have Googlers on standby to answer them, and we'll get to some of them at the end as well. So let's get started.

Great. Thanks very much, Jay. As you mentioned, we're here to talk about Kubeflow: that's machine learning on Kubernetes, or as we like to say, cloud native ML. I think it's a really interesting time to be talking about machine learning and Kubernetes together. The reality is that everyone hears about how great ML is; they want to adopt it and immediately make changes to their business. I think the problem a lot of people have today is that it can be very complicated, and there's a big distance between the actual vision of ML and getting it off the ground. A lot of people today say ML is great. They hear about it in the news, they hear it from their colleagues and friends, and they want to go and solve a bunch of problems.

It's definitely a big buzzword, and everybody's trying to invest in it.

Absolutely. And the reason is because of things like this. This is something that Google did specifically. These are Google engineers and Google data scientists who spend their entire lives getting great at data centers, using renewable energy, using the latest technologies. Then they attached an ML solution to the problem of handling the power, cooling, and management of their data centers, and almost instantly they were able to save 40%, a huge amount of money, by attaching machine learning models to these frameworks.
They were able to make a huge difference, in data centers that were already better than almost any in the industry, just by using this machine learning technology. This is something that a lot of industries are really excited about, and they say, well, geez, I'd love to get some of that as well.

Unfortunately, machine learning is very hard, and it's not just because I'm terrible at math. It is because this is the reality of machine learning. On one side you have that magical AI goodness, where the benefits are so easy to see and so powerful. On the other side you see where most folks are today: they have old-style legacy infrastructure they need to integrate with, and they have very large data sets that they're trying to manage, maintain, and make the most use of. There's so much distance between where they are today and the magical AI goodness in the future.

Well, that's pretty extensive. There's just not a magic wand, like a lot of us think: just throw in pictures and get a result.

Yeah, no question, there really is no magic wand. And the reason there's no magic wand is that you basically have this reality: there are two sides of the world.

Side one is where you have these do-it-yourself solutions, where you set it up from scratch and integrate with legacy systems. Then once you've finally done all that work, which can take weeks, months, or years, you often have to do it all over again as you migrate, as you change your solution, as anything in your overall system changes.

The other side of the coin is hosted, and I should say this is not all hosted solutions, but certainly this is the trend. In the first five minutes it looks great: you're immediately able to make progress, change the way you do your business, and at least experiment with the model. But then in the next five years you face everything you did in the do-it-yourself scenario. You have the same customization issues, you have the same legacy systems that you need to integrate, plus now you're locked in. You've used a framework or a model that only works on a single cloud, and you're kind of out of luck.

So the question is, haven't we heard this story before? And the reality is we have. Before I came to work on Kubeflow, I was working on the Kubernetes platform, and when Kubernetes first came out in 2015 it was a very similar story. The reality was that containers and Kubernetes in 2015 and before were a highly bespoke solution. You had to come up with a lot of custom tooling, frameworks, and so on in order to get things up and running, and wiring everything together became an enormous amount of work. You had to have entire teams dedicated to that. When Kubernetes came along in 2015, you started to have a standard set of APIs and native ways to handle the complexities of distributed computing, and that really changed the game. People started to leverage the Kubernetes-native platform to go to the next level. At KubeCon Austin
2017, last December, we talked about some of the ways that we really have started to unlock that future.

Right, you were just at KubeCon EU as well.

That's absolutely right. And the way that we were able to unlock that future was with these native extension points. You have all these various components that operate in layers, from the runtimes and plugins, moving up the stack to the API layer, application layer, governance layer, and so on. Each of these had very clean lines, and they enabled people to swap out the things that made the most sense for running Kubernetes. What that ultimately resulted in is people being able to run cloud native applications: applications running in containers that used all the standard things that were provided to them, without any customization.

So what we would like to propose around machine learning is cloud native ML: a similar approach to the system, which adopts the core components of what makes containers and Kubernetes so strong, that is, composability, portability, and scalability. For each of those, I think it's worth just a moment to talk about how they apply to machine learning.

When you think about composability, first you think about building a model around your machine learning solution. That's certainly something very interesting. Whether it's TensorFlow, Caffe, CNTK, MXNet, PyTorch, you name it, there are many different ways to build a model, and people get very excited about that. But the reality is that building a machine learning solution ends up being much more than that. You end up having all these various components that you need to integrate, and each of those components is often slightly different based on the data scientist working on it.

That's definitely not magic there. That is a lot of work, a lot of knowledge, and a lot of skills needed,
right, to make any of this work.

Right, yeah. I think that's exactly right. Data scientists have skills in particular areas. They want to use the frameworks and tools that are familiar to them in each of these various stages, from data to experimentation to production and ultimately rolling it out. I think that really speaks to how you have to have these things be composable, so a data scientist in one organization can use standard techniques, and then you can swap them for a different set of techniques for a different data scientist in a different org. So that's category one: we want them to be composable and let people choose what makes sense.

Category two is around portability. Earlier we talked about all the steps involved in a machine learning framework, but even that is just the start. Then you start to talk about all the elements in your stack: OS, drivers, runtimes, and so on. Again, it's very complicated, and setting up even one of these environments can be very challenging. Yet the average data scientist often faces several of them. They might start on their laptop or on a private rig, they might move to a training cluster, and then finally into the cloud for final production and training at scale. A portable solution that works in all of these places would greatly speed up their work, instead of having them set up a custom, bespoke solution for each one.

Then finally we get to scalability. Scalability is often thought of as just the number of machines, but in machine learning it's much more than that. It's accelerators, things like GPUs, TPUs, and CPUs. It's the speed of your disk and networking, whether it's a fixed disk, SSD, and so on. And it's not just physical assets, it's actually human beings. How do you scale people with different skill sets: interns, SWEs, data scientists, researchers, IT ops? How do you scale various teams if they're geo-distributed, or even if they're sitting right next to each other but happen to have different focuses? And ultimately, how do you get more experiments, how do you run them simultaneously, and so on?

So those are the three things that we think make up cloud native ML: composability, portability, and scalability. What we think is a great solution for that is containers and Kubernetes. That's really what they were designed for: loosely coupling a number of microservices together and making them very production-ready at scale.

However, if you want to use machine learning on Kubernetes, it can be very challenging. There are a lot of core components involved: containers, packaging, service endpoints, persistent volumes. These are often very new concepts for people who are really familiar with just VMs
and disks, and it's even worse if you get into the data scientist realm, where they really just want to focus on their experiment.

Yeah, I remember when I first started getting into containers, it was very different than what I used to do, and I actually started using containers just as kind of makeshift VMs, not using them to their full extent. First you have to learn that aspect, then you have to learn the machine learning, or, as you said, data scientists just care about the data.

Yeah, absolutely.

That's why we introduced Kubeflow: a way to set up a complicated but composable, portable machine learning framework in a way that takes advantage of all the benefits of cloud native machine learning. We want to make it easy for everyone to develop, deploy, and manage portable, distributed ML on Kubernetes. And it's really important that we mention that last term, about focusing on Kubernetes. Anywhere there is a Kubernetes deployment, this runs great. Whether that's Google, or on-prem, or on VMware, or on other clouds, as long as there is a Kubernetes-conformant cluster, you can set up and run Kubeflow, and I will show you how that works in just a second.

The way to think about it is to go back to all those various stacks we had earlier. You have Kubernetes as the base layer. You can be sure that it runs everywhere; that's a core vision of the Kubernetes project. Then you describe your overall complex deployment using Kubeflow-native tooling, and you're able to stamp out Kubeflow in all of these various locations.

Specifically, as Jay mentioned, at KubeCon EU on Friday we announced Kubeflow 0.1, a significant new milestone for us to get out the door and start helping people use Kubeflow in their enterprises to solve problems today. Today, in the box, we have Jupyter notebooks, we have distributed training, and we have multi-framework model serving. I should really stress that though "flow" is in the name, we really think about that as machine learning flow and not TensorFlow. We support Caffe and PyTorch, and we have lots more frameworks coming soon. Our goal is to support whatever a data scientist needs, using standard packaging for customizing it yourself.

So with that, I'd like to show you just a very brief demo of Kubeflow. Let me just swap over here. You'll forgive me: this was a live
demo this morning, but because we have to run on this laptop, we're going to use a recorded video of the one I did earlier.

What you have here is a standard Yelp sentiment database. This is public data, and it contains all the various results from Yelp content. They have large comment databases, and you're able to page through all the various information. In this case you have a variety of different things. Looking at this middle one, it's a lot of words, but as humans we're able to read it properly: a little on the pricey end for baked goods, but the baked goods were good.

Also, their coffee products are great. These are a lot of words, but clearly, as humans, we're able to see it. However, when you apply this to a standard solution today, it may not look as obviously correct. It's often very hard, using regular expressions and analysis, to have a system go and look at this. So let's actually try: we hit "predict sentiment", and it didn't come back very well. That's not exactly what we want to see.

I think it's actually kind of funny. When you think about what you might come up with for a solution here, it's "well, maybe I'll just do some regular expression matching". Looking at the underlying code, at this regular expression matching, you're like, oh geez, someone commented out "good" and "great". We're just going to try to find things where it says "annoying", "terrible", and "bad". Not ideal. It kind of works, but obviously it missed on some of the things that we were looking at earlier.

So what I'd like to do is use machine learning to try to solve this problem. The Google Brain team, as we call them (actually, I think that is literally their name), released an open-source machine learning framework called Tensor2Tensor. It is a very well designed framework, kind of pre-built for handling a lot of these language issues. You can see here an entire set of instructions describing how to install and run these various things, which is obviously pretty complicated, but you can see some of the things that it's really good at. It's really good at things like image classification, language modeling, sentiment analysis, and so on. That sentiment analysis is what we're going to try to set up here.
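The keyword-matching baseline described in the demo can be sketched roughly as follows. This is only an illustration, not the code shown on screen: the three negative keywords come from the talk, while the function name `naive_sentiment`, the exact pattern, and the sample sentences are assumptions made for this sketch.

```python
import re

# Negative keywords named in the talk. The positive ones ("good", "great")
# were described as commented out, so this sketch only looks for negatives.
# The pattern itself is an assumption, not the demo's actual code.
NEGATIVE_PATTERN = re.compile(r"\b(annoying|terrible|bad)\b", re.IGNORECASE)


def naive_sentiment(review: str) -> str:
    """Label a review 'negative' if any negative keyword appears, else 'positive'."""
    return "negative" if NEGATIVE_PATTERN.search(review) else "positive"


if __name__ == "__main__":
    # A plainly negative review is caught.
    print(naive_sentiment("The service was terrible."))
    # A mildly positive review is mislabeled, because "bad" still matches.
    print(naive_sentiment("Not bad at all, I would go back."))
```

The second call shows why this kind of matching only "kind of works": it has no notion of negation or context, which is the gap a trained language model is meant to close.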
That's just a demo script. Everything you see here is really happening; that's just a demo script to kick everything off. What we're going to do first is set up a standard machine learning framework on your local cluster using something called minikube. Minikube is an application that runs on your laptop and creates a real-life Kubernetes cluster running locally, so that you have as little difference as possible between your laptop and ultimately what you run in production.

So first, we're going to start using minikube. Minikube is already running in the background. I'm going to switch my kubectl to point at that minikube, and then I'm going to run a deployment of Kubeflow core. That is a standard package that we have in Kubeflow. You just execute literally that command (the "ks" there stands for ksonnet) and you apply it to minikube. What happens immediately is that it deploys all of the various components that you might want for a core machine learning framework. It gives you TensorFlow, it gives you JupyterHub, it gives you Ambassador for HTTP proxying, it gives you a distributed job framework, and it gives you a dashboard. So it's really powerful right out of the box. It gets going.
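The two steps described here (point kubectl at the local cluster, then apply the Kubeflow core package with ksonnet) might look roughly like the sketch below. The exact commands typed in the demo are not in the transcript, so treat this as an assumption: the context name `minikube`, the ksonnet environment name `default`, and the component name `kubeflow-core` are illustrative, and `ks apply` is assumed to be run from inside an already initialized ksonnet application directory. By default the sketch only prints the commands.

```python
import shlex
import subprocess
from typing import List


def deploy_commands(context: str, ks_env: str, component: str) -> List[List[str]]:
    """Build the two commands: select a kubectl context, then apply a ksonnet component."""
    return [
        # Point kubectl at the target cluster (for example, the local minikube).
        ["kubectl", "config", "use-context", context],
        # Apply one ksonnet component to the chosen ksonnet environment.
        ["ks", "apply", ks_env, "-c", component],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, and execute it only when dry_run is False."""
    for argv in commands:
        print("$ " + shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # Hypothetical names; nothing is executed unless dry_run=False is passed.
    run(deploy_commands("minikube", "default", "kubeflow-core"))
```

Keeping the target cluster as an argument reflects the point made in the talk: the same package is applied, and only the place it is applied to changes.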

Like I said, it starts up with a Jupyter notebook. In this case it accepts any username and password (you don't want to expose this to the web), just to get it going. You can see here you're able to select from an entire array of various tools. We come with a number of different packages built in for running your kernels, whether it's CPU or GPU, a whole variety of TensorFlow frameworks with various notebooks. It also allows you to do things like set the CPU, set the memory, and add extra resources, including things like GPUs, which is obviously very important.

With that, we're not going to do an interactive Jupyter training today. We've already decided to use TensorFlow, and it's pretty good, so we're going to do what's called distributed training. That's where you take that Tensor2Tensor model, you put it in a container, and then with very few commands you're able to distribute it across an entire cluster. That means it's up to you to decide how many nodes you want to use as part of your training. If you wanted to go faster, you could add ten, a hundred, a thousand. If you wanted to go slowly, or you're more cost-conscious, you could use just two or three.

So we'll go and do that. To run a distributed training you need to do two things. The first thing you need to do is set some parameters, and this is really where the power of Kubeflow starts to show. You saw me set up an entire framework earlier using a very straightforward command to select the package that I want, but now I'm able to set parameters for the specific job that I'm going to run without changing any of that underlying code. These are parameters you understand: whether or not there's a GPU present, how many workers there are, where I'm storing my data, all these various things.
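As a rough illustration of changing a job through parameters rather than code, the sketch below keeps one packaged job description and derives a second configuration from it by overriding a few values. The class `TrainingParams`, its field names, the image name, the bucket path, and the worker counts are all hypothetical; they stand in for the kinds of settings mentioned in the talk (GPU present or not, number of workers, data location) and are not Kubeflow's actual parameter names.

```python
from dataclasses import dataclass, fields, replace
from typing import List


@dataclass(frozen=True)
class TrainingParams:
    """Hypothetical settings for one training job; field names are illustrative."""
    image: str        # container image that packages the model code
    num_workers: int  # how many workers take part in distributed training
    use_gpu: bool     # whether the job should request GPUs
    data_dir: str     # where training data and checkpoints live


# Local run on a laptop: CPU only, one worker, local data.
LOCAL = TrainingParams(
    image="example.com/t2t-train:demo",
    num_workers=1,
    use_gpu=False,
    data_dir="/tmp/t2t-data",
)

# Cloud run: the same image, with only parameters overridden.
CLOUD_GPU = replace(
    LOCAL,
    num_workers=8,
    use_gpu=True,
    data_dir="gs://example-bucket/t2t-data",
)


def changed_fields(a: TrainingParams, b: TrainingParams) -> List[str]:
    """Return the names of the fields whose values differ between two configs."""
    return sorted(f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name))


if __name__ == "__main__":
    # The image (the packaged code) is not in this list; only settings differ.
    print(changed_fields(LOCAL, CLOUD_GPU))
```

The design point is that the packaged code, represented here by `image`, stays identical between the two runs, so whatever was verified locally is what runs at scale.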
You're able to set them just like setting any other variable.

So I'm setting them up for distributed training, and here I am kicking it off with the same command you saw earlier, kicking off Tensor2Tensor, which I have built into a container. Just to prove that it is in fact running, here you can see that the Tensor2Tensor job is running. It's running on CPUs, because I'm only using my local laptop. And here you can see you get the exact same logs that you would get from any normal TensorFlow job, with all the appropriate information there. Again, nothing special here. I wouldn't ask you to read this wall of text; I'm just pointing out that this really is a real live job running, and you saw me execute just four or five commands to get it off the ground.

All right, so now that I've done that, it's running okay. What you saw is that there weren't any syntax errors; you know that it ran properly. Maybe it converged, maybe it didn't, that's okay. All I'm really doing on my local laptop is making sure that this does run properly without any errors.

What I'd like to do now, having seen that this runs great, is take it to a cluster in the cloud to run that training very quickly. In this case I've set up a GKE (Google Kubernetes Engine) cluster. It runs on Google Cloud, obviously, and it's going to run with two different sets of nodes. First, it's going to run with GPU nodes, and those are the absolute latest ones from Nvidia, with a whole variety of different nodes and node pools. You can see here that we're using 8 Nvidia Tesla P100 accelerators attached to those nodes. Then second, we're also going to use the brand new TPUs from Google. Let me just do that. You can see the P100s there, and second,
we're going to use the TPUs. Those are the tensor processing units, attached to the same cluster, and we're going to run those jobs simultaneously across those clusters to see which one finishes first. Sometimes P100s run better, sometimes TPUs run better. It doesn't matter to us, because it's so easy to set up multiple of these that we can run the experiment ourselves. You can see here this is what the TPUs look like, and they support TensorFlow version 1.7, so we'll make sure to support that in our nodes.

Okay, so now that we've kicked it off, we're going to switch to that cluster that I just showed you, and we're going to apply that exact same deployment that I ran locally. Again, this is part of the power of Kubeflow: I don't have to make big changes. In fact, in this case I'm making no changes. From the exact same deployment I had locally, I'm able to deploy to my GKE cluster. That makes it easy for me to get started, and easy for me to have reproducible machine learning and data science. And you can see it updating in the cluster in the cloud.

Next, I'm going to kick off my GPU training. Again, I'm going to use that same package that I had built earlier for running on my local machine. That was running on CPUs, but this time I'm going to change the parameters and say, no, in fact I do have GPUs on these nodes and I want you to pick them up. I'm also going to change where the data is stored, and again I'm doing that based on parameters, not by changing code. I apply those, and those should be kicked off momentarily.

Second, I'm also going to deploy that to TPUs. Again, I'm going to change a parameter to make sure that this new job is running against TPUs, and I'm also going to use a slightly different build. Unfortunately, today you have to build different versions based on whether you're using GPUs or TPUs. Again, for data scientists this is great, because it means you're able to package and describe exactly how you want that model to run. Okay, so we're kicking that off as well.

This time I don't have to go into my local logs, because Google Kubernetes Engine actually has a great UI for me to explore which jobs are running simultaneously. Here you can see all the logs and the current loss rate. It's happening very quickly. It's saving checkpoints into that Google Storage bucket, and you can see here all the various jobs that are running simultaneously.

Okay, so the jobs are kicked off and they're running. Again, we're not going to bore you with how long it takes to run all these. I'm going to use TensorBoard, and in this case TensorBoard also ships with Kubeflow; it's built in. It shows you how quickly things are stepping through the overall process and how quickly things are resolving. It's a great UI, and it helps your data scientists really debug and understand
what's going on under the hood. Most importantly, it ships in the box. Literally exactly what you saw earlier is what's running now. I didn't have to do any special new packaging or distributions to run this, and yet it ran great on my laptop, and it ran great on GPUs and TPUs. In this case we're able to see roughly sixteen and a half steps per second for the TPUs, and you saw earlier about three steps per second for the GPUs. So in one command I'm able to test multiple different accelerators, and in this case the TPUs came out about five times faster. Really powerful stuff, and we're really excited to show you.

Finally, like I said, the demo needs to be short, so we'll try to jump forward here a little bit. Also included in the Kubeflow box (I promise it's not unlimited, but there's a lot of stuff in there) is TensorFlow Serving. That means we're going to take that trained model and actually serve it and attach it to that Yelp review you saw earlier. We deploy that serving, and now we're going to swap over. You can see here in the Google Kubernetes Engine UI that the Tensor2Tensor serving is running and ready to receive requests.

So we'll go back to our Yelp review. This is a different web page that is now designed to use ML. Here were the original results, all wrong: one that was happy when it should have been sad, one that was sad when it should have been happy, and the final one that was happy when it should have been sad. We're going to reattach that, swap back over, go to the UI and hit "predict sentiment" again, and it now comes out correct.

So that's really the summary of Kubeflow. It's a very straightforward framework that really empowers you to make decisions about
the best machine learning framework for you, without having to worry about any of this stuff under the hood.

Yeah, that definitely seems like leaps and bounds for machine learning, because, like you showed with minikube, I can run the cluster on my machine, and whether it be servers in a colo, servers in my office, or even a cloud like Google Cloud, I can just move it and continue running my models with more power. And of course it comes packaged with all those great little tools like TensorBoard.

Absolutely, and that really is it. When you talk to almost any data scientist in the world today, they often have very dispersed environments that they need to run in. They want to move to where the data is. They may have petabytes of data that they want to process, and that may not be in this cloud or that cloud. What they're really looking to do is get to an environment where they can think clearly about the model and let that separation, and IT ops, take care of the rest.

Awesome.

What you didn't see there were some really important points. First, you didn't see any bespoke solutions. These weren't custom new ML frameworks, and they weren't brand new open-source solutions. We're using existing, very popular open source technologies that you can use or swap out for your own; things aren't insanely tightly wired together. Second, you didn't see any cloud-specific technology. Even when we were deploying to Google Kubernetes Engine, we were using the exact same software that we had running locally; all we did was swap out parameters. And finally, you didn't see any forking of the Kubernetes API, meaning anywhere you have a Kubernetes-compliant cluster, it's going to run great.

Really, this is going to empower a brand-new set of data scientists and other people in the world. There's a great story out of Sweden, where a group of scientists were able to look at their data and identify that what they were doing wrong was plowing the wrong streets at the wrong time. Stay-at-home parents and women who happened not to be at work that day ended up being injured far more frequently, because in light snow situations they would be tripping and falling on sidewalks, while cars could actually handle those conditions. So by looking at the data, and using frameworks just like Kubeflow, they were able to see that they could make a huge impact in people's lives without any extra budget, just by changing the order. That's really what we want to do with Kubeflow: empower everyone at the various levels to make smart decisions and help them make the world a better place. By focusing on giving people clean layers, by giving the data scientists the ability to make these decisions, they're
able to hand off to the next group, for them to make even better decisions and really change the world.

We are just getting started with Kubeflow. It is 0.1; it's a little early for running in production, even though all the components of Kubeflow are battle-tested. Things like TensorFlow, as I mentioned, and TensorBoard, Jupyter, and things like that have been out for many, many months. Kubeflow is still brand new, but we do have a lot of people helping. You can see a small subset of the people who are contributing to the platform today.

What's next is Kubeflow 0.2, which we're hoping to have out by this summer. We're going to make it even easier to set up, integrate with more and deeper Kubernetes features, and ship a lot better packages. But really, the feature we'd like to have is the one we haven't heard of yet. There are many data scientists out there trying to solve problems, and Kubeflow is entirely an open-source project. Though Google contributes to it significantly, we have over 70 contributors from 17 different companies working on it today, and we want to make sure it truly does run everywhere. So if there is a machine learning problem that you're running into today, we would love to hear about it. We're on GitHub.

Exactly. So with that, that's a wrap. Like I said, we have all the various places to reach out to us, and I think we're going to take just a moment to collect the questions.

All right, we're back, so let's go into some questions here. What does the adoption of Kubeflow look like in the wild?

The reality is, I wasn't joking: Kubeflow is a 0.1 release. Right now we would actually not recommend using it in too many production workloads. That said, we have heard many stories, and you can check out the kubeflow.org website (coming soon, not quite available yet) or our GitHub repo to see a number of places where people are using it today. In the examples directory you will already see a great number of examples: a solution around GitHub issue summarization,
Tensor2Tensor is obviously built in, and a number of other ones, which you can download to start experimenting with Kubeflow today.

Awesome. So does Kubeflow run on GKE?

Absolutely. The demo that you saw there was Kubeflow running on Google Kubernetes Engine. It obviously runs great, and it takes advantage of all the elements that are available in Google Cloud: node pools, preemptible VMs, GPUs, TPUs, and so on. But like I said, it really does run anywhere Kubernetes runs, so that might be another cloud or on-prem. Obviously we think we can do a pretty good job serving Kubeflow on GKE.

Of course. Oh, here's a very interesting one: when would I use Kubeflow over Google Cloud ML Engine?

Absolutely, this is a question we get a lot. The reality is that Cloud Machine Learning Engine is fantastic, and we strongly recommend it. What we have found, though, is that oftentimes a data scientist might want to swap something out, or need something in a hybrid or on-premises solution, and in those cases that's when you would want to swap. So basically the summary is: if you'd like a hosted solution where you really don't have to think about anything under the hood, where you just want to hand a model to something and get back an answer or use it for inference, Cloud Machine Learning Engine is fantastic and we would strongly recommend it. If you have customization, or you want to do on-premises or a hybrid scenario, Kubeflow would be your best bet. They really do layer over each other very nicely.

Right, so it's a right-tool-for-the-right-job kind of thing.

Absolutely.

If I am an ML data scientist, how will Kubeflow simplify my daily life, rather than complicate it by adding the Kubernetes layer?

Absolutely.
Um, you know, I think one of the biggest things here is that a lot of data scientists really want to operate at that next level up. I mean, I can't tell you how many data scientists... I was just talking with a fantastic researcher last week who literally couldn't reproduce their experiment from a year ago, because the versions of Python had changed and they weren't sure what had failed in their library dependencies. Kubeflow allows you to describe your entire framework in a containerized, hermetically sealed, dropped-in-amber kind of scenario, and really detail it forever. That's something you can share with your colleagues, you can put in a research paper, all those various things. It also means you don't have to solve all the problems under the hood: you don't have to deal with service discovery, you don't have to deal with provisioning persistent disks, you don't have to deal with attaching drivers and GPUs and things like that. Kubernetes takes care of all that stuff for you. Now, that said, if you are just experimenting and you want to use a single, you know, we call it a kitchen-sink container, that could be perfectly fine. What we've found, though, is that more often than not, immediately following that experiment a data scientist has to completely rewrite their entire model, completely rewrite all their dependencies and setup, and has a lot of trouble even reproducing the experiments they were doing in the first place, because the experiment they were running initially is not production-ready. That's where we hope to cover a lot of distance with Kubeflow. Kubernetes is very, very easy to set up: it comes in Docker, it has Minikube as a setup, and then you can describe and deploy your entire Kubeflow framework onto that local cluster. And that is a production-ready solution; that's something you can take elsewhere.

It's very good. Yes, with Kubernetes there's a little bit of a learning curve, but it's pretty easy to set up, and again it's much easier, as you mentioned, than learning how to manage a yum or apt repository and get all the right dependencies and everything.

So yeah, I mean, the demo you saw today was a totally, you know, live, real demo, and you saw no Kubernetes commands. Literally none, other than just to show that something was actually running. All the systems for running and training your model were built in at that data science layer that you're all familiar with.

All right, it looks like we've got one more question here: what about data and model versioning?

Absolutely. That's something that's very actively being discussed in the Kubeflow community right now. We
have heard this request a lot. This is something where we really do want to empower the community to make these decisions and help think about what should come in the box and what shouldn't. Kubeflow isn't trying to solve a lot of these problems itself; we're really packaging the best of breed that's out there today. And so in many of those discussions that are happening right now in the cloud, or excuse me, in the community, you are seeing folks begin to come up with some norms about what data and model versioning should look like. When those norms land, we will ask people who want to participate in those discussions to snap to those norms, which basically say: hey, if you're going to run a machine learning, or excuse me, a model versioning or a data versioning pipeline, you should do one, two, three, four. And as long as any solution comes along and does one, two, three, four, it should run fine.

Well, that's great. Well, thank you, David, so much for joining us and telling everybody about the wonders of Kubeflow. Hopefully we can get some people to start using it and start contributing. We want to see all those pull requests coming into GitHub, and also hear your success stories. It's definitely been great to have you.

Absolutely, thank you so much.

Well, thank you very much for watching, and please tune in to Tuesday's Cloud OnAir next week and every following week.
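
The Q&A mentions an experiment that could not be reproduced after Python and library versions changed, and an open question about data and model versioning. The sketch below is one hedged illustration of that idea, not part of Kubeflow and not the norms the community was discussing: it records the interpreter version, the installed package versions, and a SHA-256 fingerprint of each data or model file in a single manifest. The function names `sha256_of_file` and `build_manifest` and the manifest layout are assumptions made up for this example; it uses only the Python standard library.

```python
import hashlib
import json
import platform
import sys
from importlib import metadata
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(artifact_paths):
    """Collect interpreter, package, and artifact fingerprints in one dict.

    The layout is hypothetical; it only shows what "describing the
    environment" for a later rerun could capture.
    """
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata has no name
            packages[name] = dist.version
    return {
        "python_version": platform.python_version(),
        "platform": sys.platform,
        "packages": dict(sorted(packages.items())),
        "artifacts": {str(p): sha256_of_file(p) for p in artifact_paths},
    }


if __name__ == "__main__":
    # Hypothetical usage: pass data or model files on the command line.
    manifest = build_manifest([Path(arg) for arg in sys.argv[1:]])
    json.dump(manifest, sys.stdout, indent=2)
```

Saving the output next to an experiment gives a later reader something to compare against: if a rerun behaves differently, a changed package version or a changed file hash shows up as a difference between two manifests. A container image, as described in the talk, goes further by shipping the environment itself instead of only describing it.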

2018-07-14 15:26
