And this is us, if you want to keep the conversation going with us on Twitter. Slide credit has to go to my other partner in crime, John Starmer. John also runs the user group with me. John runs it from Maui. I know, right? Don't feel sorry for John. I do try to put him to work, like making slides and things, and he made this one for us.

Is this one working? Terry, does this work? Is there a way to turn it on that I don't know? Oh, maybe that thing right there. It's really small writing, and you would need other glasses for that one. I know how that is. When I hosted this event last year, I don't even think I had to wear glasses.

Okay. So we're going to talk to you about Kubernetes in the real world, because it's gone beyond the point where enterprises are thinking about transforming their architecture to container-oriented data centers, etc. It's happened, and it's not even POCs anymore. Oh, I knew I wasn't crazy. Well, maybe that's still debatable, but I knew that wasn't working. Okay.

So what I was saying is, a lot of people are running Kubernetes now, not just in POCs anymore, but actually in production. So can I just real quick see a show of hands of who is running Kubernetes in production? Okay, all right, we've got a little more each time. I love this community. It's like all my best friends come to these things.

Okay. So this, as you know, has given us some benefits but also caused some problems, and all the things you expected from VMs aren't necessarily there natively with the container infrastructures that are set up. So we have a panel of architects and admins and operators and customers and user group managers, and I know that's more than four people, so some of us wear multiple hats, but we
are going to break it down and tell you some more dos and don'ts, so that you too can run Kubernetes in the real world. So why don't I start with the first question: how do enterprises transform their architecture with containers without losing all the reliability, viability, and availability that they've come to love with VMs? And I apologize, it's been a really long day.

I think moving to Kubernetes is actually a big challenge for a lot of companies. The containerization process isn't just about deploying things into containers, and that's why Kubernetes, I think, is so powerful, because it's an excellent management platform. But it really does mean that your application
development community has to move to cloud native models for their applications. You probably still have to consider that some of the services might need to migrate on a longer time scale. Let's just not try to do everything all at once. I think if we can start thinking through some of those processes, we can actually get enterprises into a more cloud native mindset. It's just going to take some time.

Right. And I want Michael Richman to take us in this next direction, because I want him to talk a little bit about what he built at NIO. He did a little of that in the morning session. He has a rather large database, and he's running in Shanghai, running a lot of cores in production on Kubernetes, so he's had growing pains and he's learned a lot. So if you can explain to some of us, particularly around the area of, you know, we saw this with OpenStack, right, when it grew really fast and then maybe didn't bring the admins and the operators along as fast as we would have liked, so then a whole bunch of other stuff had to happen. Is it the same with Kubernetes? Are the docs and the tools there for the operators and the admins? If you could talk about your experience, and then I'm sure Tony will be itching to.

So, at NIO we're building electric cars as our primary product, and the Kubernetes systems that we're building are there to support the development of those vehicles. In a US data center, which is about two miles from this building, we have about 3,000 cores under Kubernetes management and another 3,000 cores under DC/OS management. In Shanghai we have about 1,500 cores in Kubernetes and 1,500 in DC/OS. We're actually running Kubernetes and DC/OS side by side, largely because we were going container native when we started this project two years ago, when I joined the company, and it wasn't clear what the right container
orchestrator looked like at that point in time for the various workloads that we've been needing to run and needing to build out. So we actually took a model of deploying both leading orchestrators side by side, and we made bets on what was best suited for DC/OS. In some cases we were correct, in some cases we were wrong, and the same for Kubernetes, so we are actively shuffling workloads around.

But to actually answer your question, one of the things that we've been finding is that Kubernetes today is, based on our experience, stable in production. We're running Kubernetes 1.9.4 in our production systems, and the 1.9 series was largely when we were finding Kubernetes was pretty solid. But the path to get to running 1.9 was a long education process and a
lot of skills building, because Kubernetes today, in my experience, is somewhat like where OpenStack was in the F and G release cycles, where the technology's there but it's actually a difficult process to get it deployed and operate it.

That's Folsom and Grizzly, for those of you keeping score at home. Not everybody's come through the OpenStack user group.

Thank you. I would like to point out to Michael and Robert that I have a lapel mic.

No, I see the hierarchy. Tony Campbell has some interesting thoughts on this.

I agree. Kubernetes went through this hype cycle, right, where everybody was trying to get their hands on it, everybody was using it, but not necessarily using it in a way that was production-ready. A lot of bumps and bruises. We saw in the last presentation that a lot of people were able to shoot themselves in the foot. But the interesting thing is, Kubernetes uses a pattern in order to make its components reliable within itself. It uses this controller loop that is watching the resources, and it's always trying to reconcile those resources' desired state to the current state. So if a pod disappears, this thing is watching, sees that pod disappear, and brings up another one. That's what Kubernetes does. The interesting thing is we can use that pattern to help build some reliability into Kubernetes itself.

Let me back into this. I'm with Red Hat right now, but I came over with the CoreOS acquisition, so I was with CoreOS, and I was with Rackspace before then. At CoreOS, we, along with a lot of other installers who started doing this as well, started to take the control plane of Kubernetes and run as much of that as we could within Kubernetes itself. So the API server, right,
was actually running within our Kubernetes cluster. That's called self-hosted; you see it all over the place now, self-hosted Kubernetes. So the API server, for example, was running as a pod in the Kubernetes cluster. So I've got my API servers running, and if I lose one of my API servers, it's just a Kubernetes pod. What does Kubernetes do? Brings it back up, right? So I think the more we figure out how to make that pattern work in different places, the more we can help to build reliability into our Kubernetes clusters.

Okay, I'm going to ask you an interesting question. We're going to go all DevOps-y here. I'm talking about IT admins versus operators. Okay, all right.

If I didn't really introduce myself, I'm Lisa, and I've run the SF Bay OpenStack user group for the last five years, and we're now sort of SF Bay cloud native open infra, but it's all the same, all one big happy family, same user group. So I know most of you, I think, since I've been around a while. I've talked to a lot of you, and I know our community consists of a lot of admins and operators, and maybe a third developers. That's probably about the ratio. In my experience, I've noticed that the IT admins take a lot of interest in the technical details of launching and maintaining Kubernetes clusters, as opposed to the developers, who just sort of think the world is a Kubernetes cluster.

Go with me here, it's a real question. So there's kind of a control thing about who's developing what, especially when you introduce, you know, GKE and EKS. So how much of the development is happening, is there a handoff, at what percentage is there a handoff, and how does that happen?

Well,
I think, for one thing, I've said for the longest time that developers are not happy with DevOps. They were happy with it because it enabled them, but they're not happy with having to deal with operating system updates. I mean, who wants to do that, right? Operators, on the other hand, are happy with it because they can automate it. They can say, I can build a model against ten thousand machines to automate this process, and it's not as much of a problem for me. So when we split the two, and that's why I say OpsDev, I say that there's an operations development process: we should be automating all the things still, and infrastructure as code is still valid. But when we get to the application, why do we want the person who's trying to figure out if they've patched the latest security bugs on their operating system also trying to figure out what next feature they need to build in the application? We have a nice divide now, and Kubernetes gives one model for what that divide looks like. That's why I think there's so much interest in it, so
that the developers can now focus on how they're consuming Kubernetes resources. And I'm actually a huge fan of the operators thing that the CoreOS folks introduced into the Kubernetes community, because you can actually use that to operate your own application. So now you're thinking about operations from the application perspective rather than the infrastructure perspective. You've moved infrastructure up a level. I think that's why people talk about NoOps, kind of like serverless. Just throw it all out there. But these are the two key technologies, right? We can break this boundary. Application developers can focus on their application. They still need to think about operating that application, how does it scale, and some of the things from even the previous session, like, oh, don't just assume that cron is going to do the right thing, or CronJob is going to do the right thing. You still have to be aware of what's going on in that new layer of infrastructure, but you can at least now let go of some of the underlying resources and make them a little bit more flexible. So that's my ten cents.

Ten cents.

Just to add a little to that: a lot of those high-level technologies that you're talking about, serverless, NoOps, etc., are largely a move towards saying there's only one set of skills that we need to operate complex distributed systems, and that's never going to be the case. It's important for everyone, no matter where they live in the stack, to have an understanding of other parts of the stack, and maybe be abstracted from the details that are not important to their layer. Application developers should be able to have an environment such as Kubernetes where they can run their apps and not worry about the infrastructure stuff, but
we still need people to worry about the infrastructure stuff. Either those people are hired by AWS, because you're running in a hosted environment, or they're your traditional IT people, or they're the operators that you hire to support your application developers. There's no free lunch. Ultimately, not everything is a completely stateless application.

And I'd actually say that nothing is a stateless application. One of my absolute most favorite quotes on the planet was said on stage during Container World: "The only truly stateless application is Hello World."

And even that has state, because it is in the code.

I came up as a software developer, and I'm going to have a real vulnerable moment right now. When I first started doing software development, I knew zero about the Linux command line. Nothing. I was hacking out Java day and night, and then somebody told me, hey, put this on my Linux box, and I'm like, yeah, how do I change directories? So, I've come a long way. But the point is, I think if developers had their way, they would be hacking on applications and not worrying about infrastructure at all. And I think with us trying to merge those two, to understand what we're doing so we weren't throwing stuff over the wall, we wanted some sympathy on both sides of the wall, but I think we may have gone a little bit too far, and now I think that's kind of pulling back a little bit. One of the things that the cloud providers have done for us is they have spoiled us, because now I don't have to worry about a NoSQL database. I can go to any cloud provider, click a button, spin one up, and connect it to my app. And there are so many cloud services out there like that, which I can just consume and build my application on top of. I think we've gotten really spoiled with that. The nice thing is we can do the same thing in Kubernetes.

But
here's the thing, right: that model of "oh, I don't have to worry about the infrastructure" leads to major outages. I mean, we've seen them even in the cloud providers. It's not that the cloud provider outage wasn't expected. Eventually something's going to break, but then you have somebody who's like, well, US East is the cheapest Amazon zone, so I'm going to put all my servers there, and then US East disappears, and what are they supposed to do? Their application doesn't just automatically spin up somewhere else, because they didn't automate that. Kubernetes is the same thing. Just because it's easy, and you see this a lot, you see a lot of developers that say, oh, I'll just deploy Kubernetes and I've sort of fixed my ops issue, except that they haven't thought about what it means to actually be running Kubernetes, running it at scale, running it in multiple sites. Resilient applications still have to exist, and that's the piece that doesn't go away.

Right. And with the data... I'm supposed to be the moderator, so rather than have my own opinions I'll quote other people. When I wandered through one of the labs earlier, maybe even Jeff's lab, I heard somebody say that when a Kubernetes container goes down, it's five minutes minimum or something to get back up. I said, five minutes? And yes. So okay, what happens to all the data? What happens to that persistent, stateful application all of a sudden when you lose that container? So you solved this problem, Michael; can you tell everybody about it? Because you're running stateful applications on Kubernetes, which is a land where people have not boldly gone so much, but they need to. How did you do it?

We're working on that. We've deployed some technologies at NIO that have simplified a lot of the choices that people would make in deploying Kubernetes, so that we don't have to directly address that challenge today. Kubernetes is great. It schedules, it runs everything as if you're running a process, because fundamentally that's what you're doing: running a whole lot of processes. Pain
points come in with container networking and container storage. You can deploy any one of many networking solutions that will either give you a whole lot of overhead and slow down your network terribly, or will give you addressing nightmares that are difficult to debug. At NIO we chose to go with a hardware-based SDN vendor, who worked with us to implement their CNI integration for Kubernetes. That has been going fabulously. That vendor is Big Switch, for anybody interested. We also deployed a storage solution for containers from Portworx, after working with a number of storage solutions in Kubernetes, and that is providing us network-addressable volumes that follow our containers around as they get relaunched in the cluster. All of that works to date, and it has been an evolutionary process to get to the stability that we have. But it's not a solved problem. It is an emerging solution, and the community needs to decide what types of workloads the community wants to run on Kubernetes. But even if you have a stateless web service that is just returning Hello World, if you have a five-minute outage when that restarts, you have a deployment problem. Goodbye, world.

I think this is also where the operations development automation and application automation sort of come together. The application developers cannot be completely blind to what lives underneath the Kubernetes environment. At the same time, the operations teams can provide resources, can provide tools to make it more resilient, but they still have to explain how the pieces are supposed to fit together. It's not just one size fits all, and away we go, it's all done. We still have to tie the pieces together, and I think that's again where some of these additional models come in. Like Tony said: start using the Kubernetes automation model, the Kubernetes control loop, as a model for managing some of these stateful applications. That's where the controller resource, or the operator resource, sorry, comes into play, and I think it's a useful tool that can be implemented to support some of those stateful environments. When your application expects
things to exist, when it expects your database to always be there, an operator can at least help handle some of the automation tasks. So it is something that comes from the ops side of the world, I think.

Robert's setting you up. Can I go? Yeah.

What you're also saying is: building monolithic applications is easy, building distributed applications is hard. If you want to build an application that scales and is in the Kubernetes mindset, or even the VM mindset, you need to be thinking about microservices architectures and replicated architectures, and none of that comes without the cost of understanding what you're doing, compared to just slapping code together in one process locally. And ultimately, I think a lot of this is that we're still trying to deliver what Java promised us. Java promised write once, run anywhere. That's actually what Kubernetes is getting us much closer to. It's what AWS got us close to with VMs. That's a 30-year-old dream that we're still trying to actually realize.

Yeah, we've talked about this in previous editions of this panel. It's about the culture shift that is necessary to move to Kubernetes. People have to change the way they think, the way they design applications, the way they think about applications. You can't just take your monolith... well, you can, but it's not going to be very successful. You can take your monolith and drop it in a container if you want to, but you're not getting all the benefits that gives you. You get your five-gig container image that was being talked about in the previous session.

So the
first piece of functionality that I built out in the microservices that are the foundation of all of the work we're doing at NIO was the /status return code, or endpoint, from my vehicle web service, and that just returns success. But that's the thing that I launched in Docker and then built the rest of the database functionality around, because having the application or service that you build for Kubernetes be able to actually say "yes, I'm here and I'm healthy" is so fundamental to being able to work in the Kube-based landscape.

So, operators. It's one of the coolest things, and I talk about it a lot. But since you're the expert... Okay, I have a lightning talk on it, but I've got more time here than for the lightning talk.

I don't know if you guys have heard of the Operator Framework, but just real quick, to flesh this out for you, since we've mentioned it a couple of times. The idea is we take that Kubernetes control loop and we use it for our own software. So if I have an application that has certain operational requirements, here's how I back this thing up, here's how I restore it, here's how I upgrade it, right now that knowledge is in the mind of an operator, a person. What we're doing with the Operator Framework will allow you to take that knowledge and put it into code. You're going to put that into Kubernetes code, and you're going to run that in your system with the control loop. So instead of me as the sysadmin watching this to see when I need to back this up, and physically doing it, the controller is watching that resource, and it says, oh, I see I need to back that thing up, and it fires it off and does the backup.

I question the need for backup. If this is truly immutable infrastructure that has been deployed and you have a deployment file somewhere, yes, there may be state behind it, and stateful things need to be backed up, but the application itself, the stuff that's deployed, should
never be backed up from Kubernetes. We should be backing up the thing that you've fed to Kubernetes in order to deploy stuff. If people are going... when you need to... recovering from them...

But the idea is, if you need to do an update, for example, if you have software out there running and you need to do an update, you can codify that into an operator that can automatically allow you to do that. That's something that we're really pushing now, that we're trying to push through the Kubernetes community. You
can see us on Kubernetes Slack, in kubernetes-operators, and at github.com/operator-framework. We have an SDK, we have an Operator Lifecycle Manager, and it's all open source. Come join us as we push the operator revolution.

I guess I saved the hardest question for last, and actually, interestingly, what you just said, Michael, may change the answer to this. But whose responsibility is the security? Is that the platform? Is that the app? How do we secure this stuff?

Yes. Everyone. Everyone. Everyone.

We have a security team that's part of our IT operations, which is separate from my team, that's looking at company security, and they have strong input into what we're doing. We have parts of our system that we are designing and implementing to address container-level or microservice-level security. And this is a great opportunity: we're also hiring for operators and security experts to join my team and help us build out this element.

And they're right across the street. You saw them if you drove into that side of the parking lot. So find him. Others from NIO are here too, by the way. So find the NIO people and let them tell their story. It's a really, really cool story. Okay, last words. We've got two minutes.

Use Kubernetes.

I think don't be scared about any of this technology, but be aware that it's not just the one solution that fixes all ills. It brings its own problems, and as long as you work through them, like with any of the technologies, I think it is maybe easier to operate in the long run than some of the things we've seen come before it. But it certainly doesn't solve all problems. And take it easy. It doesn't have to be instant.

Kubernetes native or distro? I'm going to ask you, because these two did it.

So we went native, and I'd do it again. I'd also say that
I've been fairly strong in my comments about Kubernetes being stable. I had one of my data scientists launch an analytics job last week that consumed a thousand cores for an hour of processing, and it ran to completion, and it has consistently run to completion every time that job is submitted. This is one of the data points that makes me confident about Kubernetes. I also need to spend some time with my team optimizing that processing, because that was an hour of processing to process 30 minutes of data, but that's, actually...

Kubernetes operators: the key to automated operations. Join us.

Oh my gosh, oh yes. And I'm Lisa-Marie Namphy, and it has been a pleasure to be managing and orchestrating and architecting this user group for five years.
2018-09-21
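
The control loop that the panel keeps returning to (watch the resources, compare desired state to current state, act on the difference) can be illustrated with a small sketch. This is only a toy model in Python under stated assumptions: the `Cluster` class, its fields, and the `reconcile` function are hypothetical names invented for illustration, and nothing here is the Kubernetes API or the Operator Framework SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Cluster:
    """Toy in-memory stand-in for cluster state (hypothetical, not the Kubernetes API)."""
    desired_replicas: int
    running_pods: List[str] = field(default_factory=list)
    next_id: int = 0

    def start_pod(self) -> str:
        # Create a uniquely named pod and record it as running.
        name = f"pod-{self.next_id}"
        self.next_id += 1
        self.running_pods.append(name)
        return name

    def stop_pod(self) -> str:
        # Remove the most recently started pod.
        return self.running_pods.pop()


def reconcile(cluster: Cluster) -> List[str]:
    """One pass of the loop: act until current state matches desired state."""
    actions = []
    while len(cluster.running_pods) < cluster.desired_replicas:
        actions.append(f"started {cluster.start_pod()}")
    while len(cluster.running_pods) > cluster.desired_replicas:
        actions.append(f"stopped {cluster.stop_pod()}")
    return actions


if __name__ == "__main__":
    cluster = Cluster(desired_replicas=3)
    print(reconcile(cluster))             # initial pass brings up three pods
    cluster.running_pods.remove("pod-1")  # simulate a pod disappearing
    print(reconcile(cluster))             # next pass starts a replacement
```

A real controller would run this comparison continuously against state reported by the API server rather than against an in-memory list. An operator, as described in the discussion, would apply the same shape of loop to application-specific tasks such as backup or upgrade.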