Extending the Anthos Consistent Experience to AWS (Cloud Next ‘19 UK)
Welcome. Today we're going to talk about enabling Anthos to provide you a consistent operational experience across multiple clouds. Before we get started, why don't we do a quick round of intros. My name is Alan; I work on product for Anthos.

And Kyle Bassett, from Arctiq, a Google partner. I've been working with Anthos for a couple of years now.

Yeah. So again, the objective of the session is really to demonstrate how Anthos can provide a common experience across hybrid and multi-cloud. Before we get started, let's spend a couple of minutes talking about multi-cloud and the challenges of operating in a multi-cloud environment.

As all of you know, most companies we talk to tell us that when they're moving to the public cloud, they often want to really transform the way they do things, and having a multi-cloud strategy is very important. You don't want to lock yourself into a particular cloud. So when you talk to organizations across many industries, being able to have that portability and run across multiple clouds and on-prem is one of their top priorities. And it's not just multi-cloud. Every organization I talk to typically has applications that need to run in the cloud, but then there are a lot of applications that have to run on-prem. Out of curiosity, how many people here have apps that run in pretty much every environment? Yeah, that's what I thought.

So why multi-cloud? Some of the common use cases and reasons we see are as follows. Number one, every cloud has its superpowers. At Google, we like to pride ourselves on Kubernetes, machine learning, and analytics; other cloud providers have their own superpowers. Perhaps you want to extend your application into different geographies. Maybe you want to run your application close to where the data lives. Perhaps
you don't want vendor lock-in: you want to be able to pick the right cloud, negotiate across different vendors, and move your application from one environment to the other. I've seen cases where applications are run across different clouds from a resiliency standpoint: you've got your primary in one cloud and your backup failover in another cloud. And
then there are cases where organizations acquire companies, and those companies use different technologies that run on different clouds, but you want to be able to manage these different applications without increasing your operational overhead.

Now, while we're doing all that, there are challenges. The common challenge that I see out there is that as you start running in multi-cloud environments, you start seeing cluster sprawl, and you get into a situation where you're building a platform on top of a platform to manage these various environments. You start to build siloed teams to focus on operations in each one of these environments. Kubernetes does a very good job at abstracting away the underlying runtime and infrastructure, but what about operations? What about monitoring, logging, auditing, security, networking? Oftentimes every cloud provider does things a little bit differently, so when you implement a multi-cloud strategy, all of a sudden your operational expenditure starts to go up.

So this is why we developed Anthos. Anthos is our platform that enables you to run and operate your applications across hybrid and multi-cloud in a consistent fashion. Oftentimes people think a lot about the day-one experience, getting the environment up and running, but the challenge is really around keeping the environment up and running. It's really around day two. How do you do upgrades? How do you ensure your environment is secure? How do you build a platform engineering team that can serve applications and deploy them in the right cloud or the right environment without using ten different tools? How do you simplify your environment in such a way that you're more productive?

That's what Anthos is bringing to the table. At its core is Kubernetes, through Google Kubernetes Engine, and then on top of Kubernetes we provide an ability to run config as code that
enables you to apply configuration across clusters running in any environment. And then on top of all that, we provide a very sophisticated service management capability that enables you to start building an SRE practice and offer SRE features and experience without having to use five different tools, and extend that across multiple clouds and on-premises.

So in today's talk we wanted to do something a little bit different. Rather than just show you slides, we want to show you this in action. So Kyle, why don't you show us how to onboard onto Anthos and make this real?

Yeah.
Cool. So I think what we're going to do is focus on lots of demos today, so hopefully the demo gods are good; we have some backup plans if needed. I'll walk you through what we built, and we're going to build it from the ground up: we'll have the Kubernetes layer and stack things up from there.

To tee things up, Alan talked about config management. One of the biggest parts, I think, with a lot of my customers when they're replatforming, is that they're starting to look at how operations is going to work and how changes are going to get introduced into the environment. One of the big principles we have is that everything should be done through code. You've probably heard of GitOps and the GitOps approach, so everything we're going to do on these clusters today is going to be through check-ins to Git, and it's going to push everything to the clusters. So no more jumping on clusters and running kubectl commands. We have a source of truth that's going to be in Git, and the workflow is the same regardless of whether it's on-prem, in another cloud, or in GKE: we're going to be pushing out consistent policy across all of it. We're going to deploy an application, we're going to shift a bunch of Istio rules around, we're going to manipulate an app a little bit, and the whole idea is that it's this common experience. We've got GKE, which everyone knows and loves; it's been around for a long time in the cloud. Now we're able to run this on-prem, and I have a VMware environment running it. We're also going to run this in another cloud, a fairly popular one that you'll probably recognize, and we'll show you how we can make these things look the same and start to shift traffic around.

We're going to use, if you've heard of it, the Hipster Store. Google has a microservices application that you can all use. It's
a cloud-native app made up of a bunch of microservices, and it allows us to scale different components. We're going to deploy this all through code. For the use case, you can probably tell I'm Canadian, so I've used a few different regions of Canada; hopefully we're not dealing with too much latency across the pond, but I'm pretty sure Google's got a good network that can handle that for us.

If you look at this middle cluster, this is our on-prem cluster. It's running in VMware in the States on a bare-metal service. It's going to get the entire application, so all of these microservices are going to be deployed in that cluster. The other cluster, the Canadian cluster to the left, is where we're just going to deploy the front-end services, and that one's running on GCP in a Canadian region. The cluster to the right is actually running in AWS, so it's a GKE cluster, an Anthos cluster, running in AWS. We're also deploying front-end services there. And I figured it wouldn't be right if we didn't stand up a UK cluster and push some workload to it. So the idea is that we have front-end services that we're able to scale up. Think of a retail use case where you have a lot of load coming in, but we're keeping all our data private: we're behind a private cloud, and that's where our datasets and all of those pieces are. And keep in mind, everything we're showing here is using a common set of tools.

To tee up the use cases for you: we're going to go through three different use cases. We're going to build from the ground up, then we're going to start to layer on some services, things like Istio and the service mesh, and we're going to talk about day-2 ops. So: multi-cluster, we're going to use GitOps, we're leveraging the Hipster Store. We're going to start with three clusters, and then I'm going to show you how we can add a fourth cluster with just a simple config change. We're
going to light up the Istio service mesh on every node that comes up, so every pod we scale joins the mesh and you can communicate across all these clouds. And we'll show you a little bit of the VMware stuff as well as the AWS setup.

So let's get into it. Maybe the first thing I'll do is show you what this looks like in GKE, or in GCP. One nice thing about Anthos is that we've got clusters in AWS and clusters on-prem, but we're managing them all through the GCP console. It's our single pane of glass. We're able to see what's going on in here and interact with it, so the operations teams, or people who want to contribute, don't have to learn kubectl; they can leverage the web UI and have this experience through it.

Yeah, and these are GKE clusters, so if you build your application on Google Cloud on GKE, you can basically just move it around between any of these environments.
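
The placement described above (the full application on-prem, front ends only in the cloud clusters) can be sketched as a small lookup table. This is an illustrative sketch rather than the demo's actual configuration: the cluster names, the `PLACEMENT` table, and both helper functions are hypothetical, and the service list is a partial assumption.

```python
from typing import Dict, List

# Hypothetical sketch of the demo topology: which services each cluster runs.
# Cluster and service names are illustrative assumptions, not the real config.
ALL_SERVICES: List[str] = [
    "frontend",
    "cartservice",
    "checkoutservice",
    "paymentservice",
]
FRONTEND_ONLY: List[str] = ["frontend"]

# The on-prem cluster holds the whole application (and the private data);
# the cloud clusters only run front ends that can be scaled out.
PLACEMENT: Dict[str, List[str]] = {
    "onprem": ALL_SERVICES,
    "gcp-canada": FRONTEND_ONLY,
    "aws": FRONTEND_ONLY,
    "uk": FRONTEND_ONLY,
}


def services_for(cluster: str) -> List[str]:
    """Return the services that should be deployed to the given cluster."""
    return list(PLACEMENT[cluster])


def clusters_running(service: str) -> List[str]:
    """Return every cluster that runs the given service, sorted by name."""
    return sorted(name for name, svcs in PLACEMENT.items() if service in svcs)


if __name__ == "__main__":
    for name in PLACEMENT:
        print(f"{name}: {', '.join(services_for(name))}")
```

Keeping the placement as data rather than as per-cluster scripts matches the idea of a single declarative source of truth: adding a cluster is one more entry in the table.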
So I've got a label set on these, so you can see I've got a cluster in AWS, I've got the Hipster Store in Canada, I've got a UK cluster, and I've got my on-prem cluster. If we dive into my on-prem cluster, you can see the nodes that I have. I've got five nodes, and we can scale these nodes up or down if we want. If we go into our vSphere environment, you can see this is a three-node bare-metal cluster. This is broken out into my user cluster, so these are the nodes that you just saw. This is a pretty standard vCenter setup: we're just using a regular vSwitch, we're leveraging datastores, and we can carve out persistent volumes dynamically. Everything that you have in GKE is ported into this view.

We'll dive more into the GCP view in a little bit, when we get to the day-2 operations stuff, but let me show you what the clusters look like. I'm doing a watch command on the Hipster Store namespace. There's nothing there yet because I haven't deployed it, just to show you we're not doing any smoke and mirrors. We've got our UK cluster, our AWS cluster, our Canada cluster, and then our on-prem cluster, and you can look at this and interact with it just like you would any other cluster. So if we look at our nodes, these are the same nodes you saw in VMware.

Live demo. He's got guts.

That didn't look too good, though. Let's fix that. So there's my Canadian cluster, and you'll see the same thing; we just have three nodes. I'm on a budget, so we went with the minimum size possible.

The other piece I want to explain to you, and we'll get into this later, is that everything is part of a service mesh. When I went back to the diagram earlier, we've got gateways in front of all these clusters, so if we look at the namespaces, you're going to see we have a gateway, and these things are in sync. We've got one cluster here that's not set, and
that's because we haven't authenticated it. That's the UK cluster: it's trying to join, but we haven't authenticated it, so it's in the mesh but it's not going to be participating very nicely at this point in time.

I've already deployed the Config Management operator to all these clusters. So we've got an operator, Anthos Config Management, but we haven't told it to do anything yet, so that namespace will just be empty. What we've got here is a simple shell script that I'm going to run. What you can see is that this defines my on-prem cluster. I'm just telling it: you're going to sync with this Git repo, you're going to sync with this branch, and you're going to use SSH to do the authentication. Then you can see I've got the exact same little chunk of code for Canada. I'm not using the US cluster anymore, I don't have my UK one, and there's my AWS cluster. We'll add the UK one later, but just to give you a feel for what the code looks like, it is literally this simple. All clusters are blank right now; all I've done is deploy the operator.

So if I go back here, this is just a Git repo. I pulled down that same repo you saw, and I'm going to run the script. It's going to look to see what is available to push this out to. We've got three clusters, so it goes ahead and pushes out config management to those three clusters. And if we go back and do our watch, pods start coming up already. Within a couple of seconds, the cluster says, we've got a syncer; it looks at the repo and says, what am I supposed to be? So immediately, as you can see, this is my on-prem cluster, and it's going to get every service. If we do a watch on this cluster, we just got our front ends, and the same thing with this one. So essentially we've got on-prem with everything, and front
ends in the cloud. Now, if we need to scale things, I can show you that example. I'll save myself typing and do some copy-paste. This is my AWS cluster. Let's scale that up to ten nodes. So we've scaled it, and if we do a watch on that, you'll see all these nodes come up. Now we've got ten more front-end nodes. You can also do this through the GCP console that I showed you, so you don't have to do it through kubectl.

So we brought up all these environments using Anthos Config Management. Think of the ability to bring up ephemeral clusters and then get those clusters running with the right software on top of them, all using configuration.

We might as well scale up all the clusters; we're already paying for the infrastructure. So in this case you see I've scaled up all my front ends to ten, I just threw five in the other cluster, and we don't have anything in our UK cluster yet because it was commented out in that code block we had.

We'll get into this piece later, but you can see my service mesh got very busy. Every one of these pods that came up joined the mesh, and now front end knows how to talk to cart service and knows how to talk to the back end. No DNS changes, no IPs, no worrying about any of that stuff. From a developer perspective, you can code these services, and you know that this service needs to talk to checkout service or payment service. You don't have to worry about IPs and routing and networking; you just define a service. If you scale it up, in a lot of these cases we'd probably use a Horizontal Pod Autoscaler, something that's going to recognize load and then scale up. In this case we're just showing it manually.

So that's the build side of things. Maybe we'll jump into the operations stuff a little bit, Alan, and then show some more in the demos.

Let's talk about operations. It's really important in terms of what Kyle just showcased.
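
The per-cluster sync definition described in the demo (a Git repo, a branch, and SSH authentication, with the UK cluster left out for now) might look roughly like the sketch below. It is not the script shown on stage: the repo URL, branch, and cluster names are placeholders, and the field names follow the `ConfigManagement` resource as I understand it, so they should be checked against the documentation for your version.

```python
import json
from typing import Dict, List

# Hypothetical values: the repo URL, branch, and cluster names are placeholders.
SYNC_REPO = "git@example.com:demo/hipster-config.git"
SYNC_BRANCH = "master"

# Clusters to enroll. The UK entry is disabled, mirroring the block that was
# commented out in the demo and enabled later with a single config change.
CLUSTERS: List[Dict] = [
    {"name": "onprem", "enabled": True},
    {"name": "canada", "enabled": True},
    {"name": "aws", "enabled": True},
    {"name": "uk", "enabled": False},
]


def config_management(cluster_name: str) -> Dict:
    """Build a ConfigManagement resource telling the operator what to sync."""
    return {
        "apiVersion": "configmanagement.gke.io/v1",
        "kind": "ConfigManagement",
        "metadata": {"name": "config-management"},
        "spec": {
            "clusterName": cluster_name,
            "git": {
                "syncRepo": SYNC_REPO,
                "syncBranch": SYNC_BRANCH,
                "secretType": "ssh",
            },
        },
    }


def enabled_manifests(clusters: List[Dict]) -> List[Dict]:
    """Return one manifest per enabled cluster, preserving input order."""
    return [config_management(c["name"]) for c in clusters if c["enabled"]]


if __name__ == "__main__":
    # JSON is printed here because Kubernetes tooling accepts JSON as well as YAML.
    for manifest in enabled_manifests(CLUSTERS):
        print(json.dumps(manifest, indent=2))
```

Because the output depends only on the `CLUSTERS` list, rerunning it after flipping the UK entry to enabled yields the same three manifests plus one new one, which is the declarative behaviour the demo relies on.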
Right, from a platform operator standpoint, I now have the ability to create a cluster and provision it with the right configuration. I have a lot of the cluster lifecycle management capabilities provided to me. I can take advantage of Anthos for a lot of the day-2 work: cluster upgrades, security patching, node repair, all those capabilities that are often very hard to do, Anthos brings to me as a platform operator. And when we think about some of the core benefits that Anthos provides, one concept that resonates really well with companies that are going down the path of modern, cloud-native applications is this whole notion of separation of concerns. Previously, there was a lot of friction between developers, between IT,
between security admins, and one of the core areas where we're providing value with Anthos is making this a collaborative process. The whole idea is that an application owner creates a build and pushes it over. A platform operator can now define configurations and push it into staging. A security admin comes along before it goes into production, can apply guardrails, define those guardrails as policies as code, and then have that in place before any application hits production. All of that is collaborative in nature, and we've designed Anthos in such a way that we can truly meet you where you are. If you have an existing CI/CD that you use, we can plug into it. But at the end of the day, it's really about bringing that collaboration together across all these various roles. So why don't we show how all that stuff works?

All right. So we all know that when we do new stuff, some people get to do the fun stuff and build it, and then someone's got to fix the leaky pipes when it's done. Day-2 ops is always hard; you're not alone. These new platforms that are rolling out are really changing how we ran infrastructure in the past. Sometimes I see teams, like an SRE team or a DevOps team, doing both the build and the support, but in many large organizations we still have to hand it over to the operations team, which means training, documentation, lots of stuff. If that falls down, it can be a huge challenge. You could have done a lot of great work on the build side, but if you haven't provided
the ops teams the tools, the knowledge, the ability to troubleshoot, stuff like that, it can really fall apart. I coin it the wall of confusion sometimes: you're throwing stuff over to teams and they're not enabled. If there's anything I can recommend, it's this: if you're building new platforms, bring the different groups in early. Bring security, bring networking, bring ops. Let them be part of this, and then when they receive it to support it, when the dev teams or SRE teams move on to build something new, they're going to care for it as much as you did. That's the culture conditioning side of tech. Tech is sometimes the easy part, and the people side of what we do is sometimes a little more challenging. So
what we're going to do now: I showed you a little bit of the GKE console, where we're using less kubectl, which is always good. Less YAML is probably better; I think we'd get a few cheers out of that one. We're going to scale up some pods, and we're going to push our code out to that UK cluster so it can be part of this environment. I'll show you some Stackdriver integration. These are all the tools you're going to need to manage these environments: monitoring, logging. And we'll do some destructive stuff and show you the power of config management, how it can be that source of truth to recover things. Anything you'd add to that, Alan? Other challenges we keep running into?

Yeah, we often run into situations where people underestimate the complexity of the day-2 side of the house, especially as it pertains to situations where clusters aren't configured properly. Typically, if you don't have a networking parameter configured, all of a sudden things break, but you're unable to pinpoint what went wrong. So a lot of what we're showing you here has to do with preventative controls as well, to ensure that your configuration is done the right way.

Yeah, I look at it like this: get that foundation built really solid, and then you'll be able to build up from there. So our store's up. We can buy cool hipster stuff. I didn't have time to get the tubes delivered, or else we would have had these available for people. But you can see that it's pretty snappy. Again, this is all hitting my on-prem cluster. I've scaled up my workloads in the cloud, but I haven't pushed any traffic to them yet, so they're sitting on standby. Let's go ahead and push things out to that UK cluster and let it participate in this. Normally
you would jump on the command line and run a bunch of YAML files, and probably deal with a bunch of spacing issues and things like that. What we're going to do in this case: you probably remember the look of this file when I showed it to you in VS Code. This is our definition for our clusters. I've got a plain-Jane UK cluster that I haven't really done anything with. All I'm going to do is uncomment this section and commit this. Yes, I'm committing to master; that's a no-no, but for demos it's okay. Every one of these clusters has what's called a config management controller that's running and is in sync with the Git repo, so any time there's a change, it'll automatically pick it up and apply it.

The other thing we might as well do is change some traffic around. This is my Istio configuration. For those familiar with Istio, you can do traffic shaping, redirections, and things like that. Again, this is all declarative; it's all in my Git repo. You can see here I've got 100% of my traffic going to my on-prem cluster and 0% of my traffic going to my remote cluster. I'll wait on this change. Let's go ahead and bring that UK cluster into the mix first.

The first thing I'm going to do is a git pull. I made a code change, so there's my code change. Then I'm just going to run the Hipster setup script again. Since it's declarative, it knows which clusters don't have any changes, so it doesn't change them. But you'll notice it created this other definition, and that's our UK cluster. If we do the watch on that, in the next minute or so we should see that front-end pod start to come up. There you go, it's starting to come up, so we might as well scale it up too. We'll let that finish.

While that's finishing, let's go ahead and push this traffic-shaping rule. When I showed you the Hipster Store earlier... Keep
an eye on this top bar right here, as I've got a different Dockerfile loading on my cloud clusters.
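
The weighted routing rule described above (100% of traffic to on-prem and 0% to the remote clusters, later changed to an even split) can be sketched as an Istio `VirtualService` built from two weights. This is a minimal sketch under stated assumptions, not the demo's configuration: the host names and the `frontend_split` helper are hypothetical, and how the demo actually addressed the remote clusters is not shown in the talk.

```python
import json
from typing import Dict

# Hypothetical host names for the two front-end destinations.
ONPREM_HOST = "frontend.onprem.example"
REMOTE_HOST = "frontend.remote.example"


def frontend_split(onprem_weight: int, remote_weight: int) -> Dict:
    """Build a VirtualService that splits traffic by percentage weight."""
    if onprem_weight < 0 or remote_weight < 0:
        raise ValueError("weights must be non-negative")
    # This sketch treats the weights as percentages, as in the talk.
    if onprem_weight + remote_weight != 100:
        raise ValueError("weights must sum to 100")
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": "frontend"},
        "spec": {
            "hosts": ["frontend"],
            "http": [
                {
                    "route": [
                        {
                            "destination": {"host": ONPREM_HOST},
                            "weight": onprem_weight,
                        },
                        {
                            "destination": {"host": REMOTE_HOST},
                            "weight": remote_weight,
                        },
                    ]
                }
            ],
        },
    }


if __name__ == "__main__":
    # Starting point: everything stays on-prem, the cloud front ends are on standby.
    print(json.dumps(frontend_split(100, 0), indent=2))
    # After the committed change: an even split.
    print(json.dumps(frontend_split(50, 50), indent=2))
```

Since the rule lives in Git like everything else, shifting traffic is a two-number edit in a commit rather than a command run against each cluster.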
If we go back to Git, we'll go commit this change. All right. Try again. I see. Let's go. All right, we're still at 100. Let's do the edit.

So just to summarize: we have two front ends, one running on GKE in Google Cloud and another running on AWS, and we're shifting traffic so that requests come in 50% on AWS and 50% on Google Cloud, hitting the back end that's running on VMware in the data center somewhere.

So our front ends are up now, and you should be able to scale those. Right now we've got ten there, we've got ten front ends here, and we've got another ten there. What we should be able to see now, if we go back to our Hipster Store and do some jumping around... there's our remote starting to show up. What's happening here is that we're splitting traffic between on-prem and our clusters in the cloud.

So you can start to steer traffic, and you can start to do things like A/B testing. What I see a lot of customers doing is actually staging full clusters. They get their environment ready, they get all the testing done, they do this during the day, and then they pick a time to cut over the load balancer. They enjoy their Sunday nights with their family instead of doing releases, pushing out new code, and crossing their fingers that it's working.

All right, let's do a little bit more day-2 ops, and then we'll get into the service mesh pieces. As I said, there are lots of tools in GCP, and all the clusters are going to be managed through here. The nice part is that you can go into Workloads and it's going to aggregate all this stuff for you. You can start to look at a cluster, you can start to mine through different clusters, and in this case we'll look at our UK cluster. We'll look at our namespace, if I type properly.

And while Kyle's setting that up: we also have a marketplace, an operator marketplace, so
you can basically go find the operator, lots of third-party open source software, and then point to any one of these GKE clusters, regardless of where they're running, and deploy the operator.

So, live demos. I didn't have this happen before. But again, it's pretty hard to dive in with kubectl, get logs, pod name, all of that stuff. Here I can get all my error messages and all my logs, and I can see what's going on with this front-end service. You can get all your telemetry and all your metrics through here, you can look at your labels, and you can see what's going on on the Istio side. You can see your cluster front end. With these pieces you can really use the GCP console to do this without having to be mucking around in your clusters all the time. Now, there are probably times to do that, but I would recommend doing all of it in dev, putting it into a repo, testing it, and then getting people to make all their changes through a GitOps workflow. You do a PR, you do a peer review, you push it out, and you're going to get consistency. I have four clusters I'm managing here; I could manage a hundred clusters and it would be the same amount of work for me, other than just managing that one config file.

Yeah, and with Anthos, Google is providing support. A lot of the master logs are being aggregated through Stackdriver, so that when you open a support ticket, our support team can have visibility into the issues and help proactively troubleshoot, so you're not waiting a long time.

I wasn't going to do this, Alan, but let's do it anyway. You can also deploy stuff from the marketplace, which is pretty cool.

Yeah, so this was working earlier. I think it'll work. Let's hope for the best. Are you sure? It's risky; he's taking a chance right now.

So I had a customer last week run
SonarQube. They had a hard time deploying it, and I demoed this, and they thought it was pretty cool, because now they can provide self-service to teams. They just go into the marketplace and they're able to deploy stuff into their on-prem clusters. I think my connection's a little slow, but we'll cut through it as best we can while that's loading. Well, it loaded. In this case I'm able to choose my cluster, so I'm going to choose my on-prem cluster. I'm going to say I want to create a new namespace, I'm going to deploy SonarQube, and then I'll deploy that. What that's going to do is instruct my cluster to go pull the YAMLs, pull the containers, and deploy them.

One of the things we forgot to talk about is that every cluster has an agent called GKE Connect. It runs as a set of pods, it makes an outbound connection, and it wires you up to GCP. That's how we're getting all this telemetry. So there's no inbound connection; it's a secure connection out. You can go through a proxy, put it through a firewall, whatever you want. That's your lifeline for getting all this data out and also for making API calls on the inside.

Yeah, and this is how you're able to see all the clusters show up in the GKE console: through this GKE Connect agent that's running on those target clusters.

So let that finish. It's still rolling away.

All right, other stuff you're going to care about if you're doing day-2 ops: monitoring. Everything integrates with Stackdriver by default, so we're going to send cluster logs up. That doesn't mean you have to send application logs up. A lot of people like to keep their application logs local; we can configure Fluentd or something like that to send to ELK internally. You
can send logs wherever you want, but by default you're going to get monitoring and logging in this environment. You're able to go in and look at all the things that are happening in your cluster and start to dive into these. You get a real-time stream of all your logs, so this is a great way to troubleshoot. You can aggregate by cluster, by pod, whatever you want.

The other piece that I think is really interesting, if you've never looked at Stackdriver Trace, is that you can start to get a lot of tracing capabilities with this data. You can start to dive into latency: what's the latency across microservices? We're going to get into this a little when we talk about the service mesh. Data is power, and just having that data and giving read-only tools to your teams probably gets you off the phone with a lot of people saying, my cluster's slow. You send them to go look themselves, and then when they come to you, they have some intelligence and data for you to start to troubleshoot. So you can start to dive into all these things. Why is this running at 450 milliseconds? You get all the tracing, you can really start drilling down, and you can see the exact traces all the way through every call of your microservices application. Troubleshooting this stuff through the command line is next to impossible, as you can imagine, and we don't have tools like tapping a network, like we used to, where we'd go to the network team. We're able to get this inline, all through our GCP console. I
think we should dive into the service mesh pieces, Alan.

Yeah, let's do that. The next demo we're going to show you is really around service management and observability. I'll catch up; we got our slides up there. There we go. All right.

This is where Anthos Service Mesh comes in. Anthos Service Mesh is built on top of Istio. When you configure Anthos Service Mesh, every service that you spin up has an Envoy sidecar proxy that gives you the ability to do advanced traffic management. Kubernetes does a good job of load balancing traffic across pods, but suppose you want to do more advanced patterns, like routing 80% of traffic here, spinning up a new version, and trickling requests to the newer version. A lot of that is stuff Envoy can really help with. Anthos Service Mesh also gives you observability into what's going on with all the services you have deployed. It's really hard to build services that live on top of different clusters and different environments, connect and stitch them together, and get uniform observability across all these environments. That's what we're trying to do here with Anthos: provide you that control point in the cloud that makes it easy to run open source software like Istio and Kubernetes in varying environments. And finally, as a security admin, I can define policy as code, push it as config, and define rules around how my services are supposed to talk to each other. What are my ingress and egress policies? What can't services talk to? And I can do that in a consistent fashion across all these environments. So why don't we show this in action?

All right. I'm going to steal one of Kelsey's terms: what I'm going to show you first is Istio the hard way. We've deployed Istio on our own into these clusters. We
Pretty much, we're using Envoy in the way you've seen. We've got mutual TLS set up, and we've got certificate manager wired up to do all our auto-rotation; we're integrating with Let's Encrypt. Everything's multi-site; you've seen that everything joins the service mesh as we spin nodes up and take nodes down. The application discovery pieces are looked after, so if you bring up a new microservice and you define it as, you know, Kyle's payment service, it'll find it. There's a routing mechanism inside Istio, and if we have things failing, it'll reroute traffic, so it'll look after those pieces for us. We showed you a little bit of the Istio pieces and the traffic shaping when we did the 50/50 rule earlier. Istio's really a whole bunch of capabilities and projects, and Alan kind of covered the base capabilities. The other piece that Google's bringing to the table is Anthos Service Mesh. So I'm going to show you the open source way to run this, which does come with the burden of managing it, looking after it, and care and feeding for it, and then I'm going to show you what Google's bringing with Anthos Service Mesh and how they're bringing all these tools together.
And keep in mind, actually doing everything that Kyle is showing oftentimes requires five or six different tools. We're showing you a way of doing it in one common platform that gives you all these various capabilities.
So we forgot to show the whole destructive part; it got lost in the ops part. We've got some time, so let's cover that too. Come on, Internet. One of these is gonna behave. Okay. This namespace here is the one we created through our Config Manager repo; our application is there. It's probably the most important namespace, because if it's not there we just have a cluster doing nothing. So normally you wouldn't want people to just go and delete your namespaces,
because they're pretty much deleting everything in your cluster. We're gonna do it anyway.
All right, and again, much of this is Istio the hard way. With Anthos Service Mesh, we simplify this whole process by providing a control plane that runs in Google Cloud. You no longer have to do things manually and deploy separate control planes in every environment.
Okay, so we're terminating this namespace; we're basically deleting everything that's in it. What you're gonna see is that it takes longer to terminate it than it does to recreate it. Once this thing is done terminating, the namespace is gonna go away. It's gonna... there
we go. It came all the way back up; took one second. And then if we go in and watch the pods again, there's our pod running. Yeah, and that was Anthos Config Manager in action. That's Config Manager in action. That pod's gonna join the service mesh again. If you notice anything in this, we only have one pod running. I had scaled those up to 10; these other clusters have 10. The reason is that our source of truth says you need to run one pod. We didn't override it, because we want to allow clusters to scale, but it just shows it's reading it off the Git repo, and that was one pod. So if everything gets deleted, we're gonna bring it back up to one, and then you can scale it back up. It's a good reason to use horizontal pod autoscalers and things like that. And you can see we've joined the mesh; everyone's happy again.
So let's dive in. I think we'll show off some of the Istio pieces. What do you think, Alan? So, one of the things that we're shipping with Istio... why don't we show you the config first. How are we doing on time? We're good. So we've got this namespace called istio-system. We've got a bunch of stuff put in there. Oh. There we go. No, lost power. That was interesting. All right, we're back. That's a good test.
All right, so this is my on-prem cluster. I've got a lot more Istio components here, because this is the one that's running all my telemetry and everything; if you look at the other clusters, they're just gonna have a couple of components. By default we deploy Grafana, so you're getting all the metrics as soon as we deploy Istio. We've got nice Grafana dashboards we can drill into. You can see this was me testing before the event; we deleted our clusters and we brought them back up. So you can start to slice and dice into all of this. You can also choose which dashboard you want to look at. So, lots of good stuff there.
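The namespace demo earlier in this segment comes down to a reconcile step: anything declared in the Git repo but missing from the cluster gets recreated, while a deployment that already exists keeps its live replica count. Below is a toy model of that behavior as described in the demo. It is not how Anthos Config Manager is implemented, and the dictionary layout and the `store` and `frontend` names are assumptions for illustration.

```python
import copy


def reconcile(desired, actual):
    """Recreate anything declared in `desired` that is missing from `actual`.

    Both arguments map namespace -> deployment name -> spec dict.
    Existing deployments are left untouched, mirroring the demo where
    clusters scaled to 10 pods were not forced back to one.
    Returns the list of actions taken; `actual` is updated in place.
    """
    actions = []
    for namespace, deployments in desired.items():
        if namespace not in actual:
            actual[namespace] = {}
            actions.append(f"create namespace {namespace}")
        for name, spec in deployments.items():
            if name not in actual[namespace]:
                actual[namespace][name] = copy.deepcopy(spec)
                actions.append(
                    f"create deployment {namespace}/{name} replicas={spec['replicas']}"
                )
    return actions


if __name__ == "__main__":
    # Hypothetical source of truth: one front-end pod in the "store" namespace.
    repo = {"store": {"frontend": {"replicas": 1}}}

    scaled_cluster = {"store": {"frontend": {"replicas": 10}}}
    print(reconcile(repo, scaled_cluster))  # nothing to do, stays at 10

    wiped_cluster = {}  # the namespace was deleted
    print(reconcile(repo, wiped_cluster))  # namespace and deployment come back at 1
```

Running the step a second time on the same cluster returns no actions, which is the property that lets a sync loop run continuously.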
This is the cool stuff that we're getting to. If you haven't heard of Kiali, Kiali's a mechanism that's going to give us a lot of telemetry and a lot of tracing. It's an open-source tool; it's part of Istio, and you can see it's automatically picked up our namespace here, so it knows we've got pods running. We can start to dive into all these. You can see health checks and statistics; you can also see logs. Really anything you want to be able to see here, you get, with integrated graphing. The really cool part is that, since these are in the Istio service mesh, we're getting telemetry off of this information without doing any development work. We did not have to tell our devs to change their application; all we did is insert a sidecar into every one of our pods. So when we look back at this piece here, if we look at one of these environments, you can see we have two containers running: one of them is the front end, the other one's a sidecar. All the traffic that comes to these pods is being collected, and we're getting it with Istio, which gives us lots of information. One of the coolest things you can do is turn on the traffic animation. We've got a load generator running against this environment; you can see I've got a load generator hitting the front end. You can start to dive in and see how many milliseconds, things like that: response codes, what's my service uptime. You can start to slice and dice by different times. It gives you all this information. We can also look at the Istio namespace, and we can overlay the Istio namespace onto our store. So the Istio namespace is down here. This is all just out of the box. So when people say "I'm not ready for Istio", I generally
say at least deploy sidecars. You're gonna get an awful lot of information, you're gonna be able to share this with your dev teams, they don't need to make any code changes, they don't really need to know about it, and you start to get telemetry. Yeah, one really important value you can get out of this: suppose you've got a monolith that you want to break up into microservices,
but you don't know where to start, right? Throw an Envoy proxy in front of a VM. Start to collect traffic patterns, and it gives you an idea of what parts of the monolith you can start focusing on first.
Yeah. So as you know, this information is very hard to get. With Istio, it's just in line with the network traffic, so you're gonna get all these services; it knows what the service is based on headers, things like that. So this is open source, this comes out of the box, but what you've seen is we've got lots of different tools. This is the Jaeger interface, similar to what I showed you with Stackdriver Trace. You can see we can start to dive into our microservices. I would start to look at this: why is this slow? I can start to dive in and say, there's my sidecar; what service is running, what port is it running on, what's it doing? I can start to slice and dice time zones. Everything kind of updates in real time, so you can really start to dig into what's going on. These are really great tools while you're developing. As you're building, you should be looking at this. Build a baseline; as you recode, see if it's getting better, see if it's getting worse. If you do a release, baseline it on the Monday and see if your code has affected your website. All these things can really come in handy.
All right, so you saw that it can be pretty hard to build all this and manage all this stuff. So let's show you ASM. What you're gonna recognize here is that you saw a lot of graphs, and you saw the little dots that showed latency; they were all different tools. What Google's bringing to the table with ASM is taking all those open source tools, putting them into the service, and allowing you to manage all this through a service. Alan, I know you're talking to a lot of people about ASM; what's the biggest impact people want to get out of this? You know, the ability to define error budgets, SLIs,
SLOs across the service, and get visibility around all these traffic patterns. We hear a lot from customers that are interested in building SRE functions and need the right toolset to enable it, and we find ASM really resonates in that journey.
So you can see it's automatically picked up the services that I've deployed; it knows about them. In this case we've set up a few SLOs, so we want to know: are our microservices violating any of these? We know how long it takes our front end to talk to this service. One I would probably care about is my checkout, my payment service: stuff that's gonna affect the customer and cost you revenue. What you can start to do is build budgets around this stuff. So we're really starting to see the business value of this. This is a piece where someone from the business can come in and define what's important to the business; they don't have to know the deep-level tech side. We're collecting all the telemetry, we can give them new telemetry, and we've got a baseline. So if our customers were happy last week, and then we did a code change and they're not happy this week, I'm pretty sure latency or something started to fall apart, and we've got insight into that now.
Yeah, and where this becomes super important is as you're going down this journey to cloud native, you start building this SRE function, and you start in some ways reducing OpEx costs. The question that comes up is: how do you measure that OpEx reduction? The hope is that by providing tools like this, you can actually start to measure and show value.
Yes. You can see this service is out of budget. This is a different cluster we have set up that is misbehaving
That's on purpose. One of the things we didn't mention is circuit breaking, another component of Istio that people are very interested in: being able to break services for testing and see if things are gonna route properly, fault tolerance, stuff like that. So in here you can start to see our latency's off. I can create an alerting policy and integrate it with Stackdriver. So now I could say I don't want my front-end cart service to be any more than 300 milliseconds to my back end; if it ever violates that policy, send an alert. The other thing you can start to do is aggregate stuff. You know, there are great things about monitoring systems, but when they get too noisy, everybody ignores them. So you can start to say: don't send it right after 300 milliseconds; wait a minute, do the aggregate over that minute, and then send an alert. So you can start to aggregate stuff together. It'll jump us into creating an alerting policy right in Stackdriver, so you get integrated alerting, which is kind of nice. You could certainly integrate this into a ServiceNow workflow or something like that, whatever you're using.
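The "wait a minute, aggregate over that minute, then alert" idea can be sketched as follows. This is a minimal illustration, not the Stackdriver alerting API: it assumes latency samples arrive as `(timestamp_seconds, latency_ms)` pairs and uses the mean as the aggregate, whereas a real alerting policy lets you choose the aggregation. The function name and the sample values are hypothetical.

```python
from collections import defaultdict


def windows_in_violation(samples, threshold_ms=300.0, window_s=60):
    """Return (window_start_seconds, mean_latency_ms) for each noisy-free alert.

    A window only alerts when its mean latency exceeds the threshold,
    so a single slow request does not page anyone.
    """
    buckets = defaultdict(list)
    for timestamp_s, latency_ms in samples:
        buckets[int(timestamp_s // window_s)].append(latency_ms)

    alerts = []
    for index in sorted(buckets):
        values = buckets[index]
        mean = sum(values) / len(values)
        if mean > threshold_ms:
            alerts.append((index * window_s, mean))
    return alerts


if __name__ == "__main__":
    samples = [
        (5, 120), (20, 450), (40, 130),    # one spike, minute average is fine
        (65, 400), (90, 380), (110, 420),  # sustained slowness in the next minute
    ]
    print(windows_in_violation(samples))  # [(60, 400.0)]
```

The 450 ms spike in the first minute is above the 300 ms limit on its own but does not alert, because the average for that minute stays below it; only the second minute, where every request is slow, produces an alert.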
Jump back into some of these other screens and you'll start to see it starts to look a little bit like the other pieces I was showing you, as far as being able to view the application. Let me try pulling up a new window here. Are you gonna show them the topology view? Yeah, we'll show the topology view.
So it starts to look a little bit like Kiali. We've got the topology view, but we'll also get our alerts showing up, so this would be great in an ops center on the screen. You can start to jump in and dive in. One of the really cool things you can do is enable side by side, so you can trace over time and do side-by-side comparisons of your entire environment. We did a release yesterday; why is it worse today? Things like that start to give you a lot of insight. You can do a side-by-side comparison and start to look at these services, so if it was green and then red later, I could dive into that and dive into the logs. So, lots of capabilities there. And then you've obviously got your table view; you can filter by budget. In this case we also have a couple set up that are doing well, so they're green, they're passing. The other thing is, when your customers are asking you what kind of uptime you provide, you've got real telemetry to prove to them that you're providing the uptime that you need to provide. And if someone calls and says you haven't been doing it, it's most likely their internet connection, because this is getting it straight from the source, straight from the cluster view.
We've got a couple of minutes left, but just so you know, you don't have to be an expert to use these tools. They're pretty intuitive; you can give people read-only access to them and let them start playing around. As you can see, we've got full metrics and latency reports, and we can dive into the time.
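The budgets that show services as green or out of budget rest on simple arithmetic. Here is a minimal sketch, assuming a latency SLO of the form "a target fraction of requests must finish within a threshold". The function name, the 300 ms threshold, and the targets are illustrative assumptions, and this is not how ASM computes its budgets internally.

```python
def latency_error_budget(latencies_ms, threshold_ms, target):
    """Compare slow requests against the error budget implied by an SLO.

    `target` is the fraction of requests that must finish within
    `threshold_ms`, for example 0.99. The budget is the number of
    requests allowed to be slower than that.
    """
    if not 0 < target < 1:
        raise ValueError("target must be between 0 and 1")
    total = len(latencies_ms)
    bad = sum(1 for value in latencies_ms if value > threshold_ms)
    # Round away floating-point noise such as 100 * (1 - 0.9) = 9.999999999999998.
    allowed = round(total * (1 - target), 9)
    return {
        "total": total,
        "bad": bad,
        "allowed": allowed,
        "out_of_budget": bad > allowed,
    }


if __name__ == "__main__":
    # Hypothetical sample: 95 fast requests and 5 slow ones.
    latencies = [100.0] * 95 + [500.0] * 5
    print(latency_error_budget(latencies, threshold_ms=300.0, target=0.90))
    print(latency_error_budget(latencies, threshold_ms=300.0, target=0.99))
```

With a 90% target the five slow requests fit inside a budget of ten, so the service stays green; with a 99% target the budget is one request and the same traffic is out of budget.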
We can drill down; you can start to break down by different protocols, things like that. Diagnostics will allow us to get into the logs, and we can see we've had some errors in this, so we can dive in and see what's going on there. And again, like I showed you before, you can open right into the log from this pod that's been misbehaving, bring it right back into my GCP console, into Stackdriver, and then start to really look at what's going on. I can also look at older logs, so if I want to look at what was going on yesterday, I can do that. So I think that's it for the demo. Alan, let's wrap it up; do we have any comments left? There we go.
All right, so sorry, Alan, I'll cover this first. You want to talk? Let's talk a bit about the future. This is what I'm gonna build next, or my team's gonna build next, because I surely didn't build all of this. I showed you bringing traffic into an on-prem cluster and then distributing it across the mesh. I think the real dream is you start to use things like Traffic Director and have clusters in different places. When someone from Japan calls my URL, I send them to the Japan cluster; when someone in Canada calls, send them to theirs. Why are we crossing the pond for all this stuff? The other thing is that, obviously, data and state are important. Start to use services like Spanner, and now you can actually distribute clusters and distribute datasets. So hopefully we'll give you a sneak peek at this in April, maybe in San Francisco.
Yeah, and keep in mind, a unique thing about Google Cloud is we have a global network, right? So you can expose a VIP that can route across different regions, and you don't have to poke firewall rules between regions and do all that complicated stuff.
So, to summarize: everything we showed you today is the value and experience that we're bringing forth with Anthos. Anthos is software-based, 100% software-based, so we're not selling
you hardware; we're providing you software that you can run on your existing hardware. We're working very closely with all the various OEM partners out there to come up with reference architectures. Anthos is also 100% open API. As I mentioned, everything we build on is open source, Kubernetes and Istio, and we're providing a control point running in Google Cloud that makes a lot of the day-2 operations something you don't have to worry about, right? We want to take care of that and enable you to build value on top of it. The third thing is really around flexible operations, right? Portable and consistent. We're really addressing this whole notion of being able to use a single toolset across
different environments, all the way from the Kubernetes layer up to the service management layer. And then finally, this whole notion of GitOps and configuration as code is something that's built into the DNA of Anthos. Everything we do with Anthos is declarative: this whole notion of being able to provide your desired state and have the system basically ensure that the actual state is your desired state. I'll just wrap it up with this: our goal here is to meet you where you are. Whether you're running 100% virtual machines today or you're going down the Kubernetes journey, we think Anthos is a starting point that can enable you to go down that modernization journey for cloud native. So we will be around for the next, I guess, 10 minutes; if you have any questions, come on up. There is a survey you can fill out, if you want to show it. We gave you a live demo, so give us a good score. If you give us good scores, you may get to see that other cool demo in April. Yeah, and thank you very much, everyone. Thanks.