Farming Cloud Native Platforms: Be Prepared for the Army of Chickens and Eggs - Navigate Europe 23

Well, good morning. This is nice. So, today, we're going to be talking through some scenarios that come up when you're building internal developer platforms. There are all these chicken-and-egg scenarios that happen when you're building these platforms from scratch, and we're going to talk to you a little bit about how to identify those, how to shake out the chicken-and-egg scenarios, and the right, or maybe not the right way, but a great way to orchestrate those sequences of events using Argo CD. And we're going to show you a couple of examples, live demo style, so wish us a lot of luck.

So, yeah, this is my co-founder, Jared. I'm John Dietz, and we have been building cloud-native platforms for many years, and we've kind of just gone through the wringer. We know all the patterns inside and out. So, if you're looking for a guiding voice on how to get through orchestration sync waves and how to get it all set up, we think that we're a pretty good voice of authority on the topic. Oh, thank you, buddy.

So, Kubernetes swallowed the internet. We all kind of saw it coming, maybe some of us. In fact, when I first came across Kubernetes, I was certain that it was just another buzzword, and man, was I wrong. I don't really have time to get into why it won, but it won, and it's complicated. It's a complex technology in and of itself, but then you add all the cloud-native tools that you want in that cluster in order to power your platform, and it gets all the more complex.

So, when you start building those cloud-native platforms, there are all these chicken-and-egg scenarios that you're going to have to battle. One example is secrets management. Maybe you want a secrets manager like HashiCorp Vault as part of your Kubernetes platform, and you want that to be the central source of truth for every single secret on your platform. But when you start building that platform from scratch, it doesn't exist yet. So, what do you do? Those are the types of chicken-and-egg scenarios that we're going to be talking through today. So, you want to take it in?

Yeah. So, when you're building your cloud-native platforms, these are the nine pillars that we've identified as problem points, or things that you have to take into consideration when building out your orchestration.

So, taking a look at cloud and Git providers, these are the first two pillars, and they do not have any dependencies. So, it's great, enjoy it while it lasts. These things are usually SaaS-hosted, so you're going to sign into a console and be able to create some resources, and nothing depends on those yet.

Then, you're going to get to your infrastructure as code. If you manage your Git repositories in infrastructure as code, then your infrastructure as code depends on your Git provider. If you manage your cloud resources in infrastructure as code, your infrastructure as code depends on your cloud provider. If you're running infrastructure as code in a control plane like Atlantis or Crossplane, then your GitOps driver is going to be the thing that configures that control plane. And if you're managing your users in infrastructure as code, then that means your identity provider has an IaC dependency as well.

Next is your GitOps provider. So, woohoo, GitOps. Yeah, your GitOps driver runs in your infrastructure, generally in-cluster, or it can, anyway, and it has a registry that it pulls all of your config from, which is generally a Git repository if you're using Argo CD, and it pulls from your artifact repositories to get those container images and charts running.

Next is secrets management. Secrets management, as John mentioned up front, is a good example. You can't have a secrets manager established until you have your cluster, so you have to figure out how to get through that. You're going to need secrets for your CI/CD and your applications, and that creates this weird circular dependency: how do you establish a source of truth before your source of truth exists?

Your artifact repository has a bunch of dependencies. It's going to hold things like containers, charts, packages, and CI dependencies. Your CI system will depend on pretty much everything previous: you're going to have your Git repositories, you're going to have secrets management, and you're going to have a whole bunch of things that need auth. Your auth system has to have roles in your cloud, in your cluster, and across your applications, all the tools that are running on your platform.

And your observability layer doesn't do a whole lot until you have all of this infrastructure that you need to be managing and actually monitoring.

So, all together now, we have all of these relationships that are formed throughout the provisioning process, and you have to figure out how to get them all together and working.
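
One way to make these circular relationships concrete is to write the pillars down as a dependency graph and ask for an install order. The sketch below uses Python's standard `graphlib`. The edges are an illustrative reading of the dependencies described above, not a complete model, and `bootstrap_graph` with its `deferred_edges` argument is a hypothetical helper. Removing an edge stands for installing a pillar with a temporary configuration first and wiring in the real dependency later.

```python
from graphlib import CycleError, TopologicalSorter

# Each pillar maps to the set of pillars it depends on. These edges are an
# illustrative assumption based on the talk, not an authoritative list.
STEADY_STATE = {
    "cloud": set(),
    "git": set(),
    "iac": {"cloud", "git", "gitops"},  # IaC control plane is configured by GitOps
    "gitops": {"iac", "git", "artifacts", "secrets"},
    "secrets": {"gitops"},  # the secrets manager itself is deployed by GitOps
    "artifacts": {"cloud"},
    "auth": {"iac", "secrets"},
    "ci": {"git", "secrets", "artifacts", "auth"},
    "observability": {"gitops", "auth"},
}


def provisioning_order(graph):
    """Return a valid install order, or None if the graph is circular."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError:
        return None


def bootstrap_graph(graph, deferred_edges):
    """Copy the graph with the given (pillar, dependency) edges removed.

    A removed edge models a temporary configuration: the pillar is installed
    without that dependency and reconfigured once the dependency exists.
    """
    trimmed = {node: set(deps) for node, deps in graph.items()}
    for node, dep in deferred_edges:
        trimmed[node].discard(dep)
    return trimmed


if __name__ == "__main__":
    print(provisioning_order(STEADY_STATE))  # None: chicken and egg
    deferred = [("iac", "gitops"), ("gitops", "secrets")]
    print(provisioning_order(bootstrap_graph(STEADY_STATE, deferred)))
```

With the full graph, `provisioning_order` returns `None` because no valid order exists. Deferring the two edges yields an order in which IaC comes before the GitOps driver and the GitOps driver before the secrets manager.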

It's a mess. Look at all those dotted lines pointing back at each other. I mean, it is a lot of circular dependencies, but we have some good tips to defeat the army of chickens and eggs that you're walking into. The first thing to know is that we happen to be an Argo CD shop. Flux CD is also a great GitOps driver worth looking at. All of the advice for orchestration that we're going to give you today is founded on Argo CD, so just understand that going in.

There's a configuration in Argo CD that is not turned on by default when you install it, and without it, it's impossible to orchestrate your sync waves across apps when you're following the app-of-apps pattern.

There are also some key configurations that you're going to have to get your arms around, and every platform is going to be a little bit different. On our platform, there are two main key configurations. When I say key configuration, what I'm trying to convey is that there are certain technologies on your platform where the applications installed before that technology existed need to be installed with one configuration. Perhaps it's a secrets manager: you finally install your secrets manager, you hydrate it with all your secrets, and then you want to leverage that secrets manager in the applications that you've already installed. So you end up with two variations of the same application, one from before and one from after your secrets manager existed.

That same type of scenario happens with OIDC providers when you're self-hosting your users. And again, every platform's different. One of your most critical tasks is to figure out what your key configurations are and note them, because you're going to be driving all of your decision-making about your GitOps unfurling around them: you create an app one way, then a key-configuration application gets installed to the platform, and then you have to reconfigure all the applications that were installed before it. So, you can have multiple generations of a single app during provisioning. That's really important to know. You can install Argo CD without an OIDC provider, and then once your OIDC provider exists, you can install Argo CD right over the top of itself with that permanent configuration. So that's a really helpful tip.

You can script ops the creation of secrets before a secrets manager exists. You can literally kubectl apply. We're a GitOps shop. We buy in completely to the GitOps discipline. We think that it's the right way to run complex Kubernetes infrastructure from an asset management standpoint, a security posture standpoint, and a disaster recovery standpoint. GitOps is incredible.
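
The script-ops step just described, creating a secret by hand before any secrets manager exists, can be sketched as a small manifest builder. This is a minimal illustration, assuming a plain `Opaque` Secret; the function names, the secret name, the namespace, and the value are all hypothetical. Kubernetes stores values under `data` base64-encoded, which is also why a secret has to be decoded before it can be read back.

```python
import base64
import json


def secret_manifest(name, namespace, values):
    """Build a core/v1 Secret; values under `data` must be base64-encoded."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": {
            key: base64.b64encode(value.encode("utf-8")).decode("ascii")
            for key, value in values.items()
        },
    }


def decode_secret(manifest):
    """Reverse the encoding, e.g. for `kubectl get secret -o json` output."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in manifest.get("data", {}).items()
    }


if __name__ == "__main__":
    # Hypothetical name, namespace, and value, for illustration only.
    manifest = secret_manifest("bootstrap-secret", "demo", {"secret-one": "temporary-value"})
    # kubectl accepts JSON as well as YAML, so this output can be piped
    # into `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

Setting the namespace explicitly in the manifest avoids depending on whatever namespace the current kubeconfig context defaults to.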

But when you're trying to create these GitOps platforms from scratch, you're going to need to script ops the temporary configurations of these applications and the temporary secrets that you're installing into these namespaces, so that later on in your orchestration you can lay GitOps over the top, and then GitOps is the controller of those particular configurations.

And there's one last detail, and we'll show you an example of this. This is kind of hard to describe, but when you're organizing your Argo CD sync waves and following the app-of-apps pattern, which we think is just the best pattern for Argo CD, it's important to understand that there's a difference between having a Kubernetes resource installed in a namespace or not. Sometimes you'll be building an app-of-apps, and you can technically create an app of app-of-apps, and when you do that, you can lose control of being able to monitor other sync waves and what their operational states are, unless you're deploying a Kubernetes object to do that watching for you. We'll show you exactly what that means in a live demo.

So with that said, it's demo time. Take it away, Jared.

All right. I never know where to stand. Where should I? I'll just stand right in the middle here. So real quick, I just need to make a quick change, not there, management secret.

So what we're doing here: Jared is pulling up a GitOps repository. When you have a GitOps platform, all of the configurations for your platform are, generally speaking, going to be in a Git repository, in your GitHub or in your GitLab, and then you wire up your GitOps engine to look at that Git repository for all the configurations of what it should install and in what order it should install them.

So, it's going to run through the list of all the previous sync waves, so it's going to take just a second for this to get started. But this is going to be the first demonstration of the sync waves operating in Argo CD in the app-of-apps pattern. So, if we take a look inside the sync wave 1 components here, oops, wrong button. We can see that this is just a job. It just ran a sleep, it completed, and as soon as that job completed, sync wave 2 started. And I'll click into this one because there's actually an app-of-apps in here, so it's a little bit easier to see the control of these objects.

But, basically, if you didn't have this job here at the bottom that's called 'wait for these sync waves', then as soon as the sync process started on the sync wave 2-3 components, you might get back to that main driver of your registry, which is going to start something else, which is going to fail because it had a dependency on something that was down in these lower sync waves. So, you can see there are just some more jobs in here.

Can we see how this looks in the GitOps repository itself from a code standpoint? Like, how are you organizing the three sync waves, for example?

Yeah, so, let's see if we can bump this font up a little bit for everybody. Yeah, that's good, thank you. So, each of these wave one, two, three files is actually just referencing a folder inside the same directory structure. So, when you look at this sync wave 2 that we were just looking at, we have, excuse me, wave 2, I'm in the wrong, oh, going off the rails. I was expecting to see more than just that one wait.yml, but yeah, I got nothing. I don't see the other files I was hoping to see here, no.

Yeah, sorry. So, it's sync wave one, two, and three, and sync wave 2 has an internal one, two, and three. That's all you're looking for. So, can you describe for us, in the registry, those wave one, two, three at the bottom, the YAML with the annotations, and describe how it is that we're conducting that orchestration?

Sure, sure. So, Argo CD just has a concept of a sync wave. Basically, when this is running through, you can see on line 9 here that this is the first sync wave that's going to happen. It's going to run whatever is in this directory structure here, wait for all those things to happen, and then just continue down that same chain with wave two and three.

And can you also show us the job that we had that was waiting on a particular sync wave? That's an important detail. It's basically an example of what we were describing: deploying a Kubernetes job to watch a particular Kubernetes object, to wait until it exists or whatever it is that you need in your orchestration, to hold up until something completes.

Yeah, and I can't find that job, so I'm just, oh, is that right? Yeah, I am having a hard time. But so, moving on. And I think I already actually pushed the secret overlay, so I need to comment that out really quick.
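
The registry layout described in the demo, one file per wave, each pointing at a folder and carrying a sync-wave annotation, might look roughly like the following. This is a sketch under assumptions, not the repository shown on screen: the repository URL, the folder names, and the `wave_application` and `registry` helpers are hypothetical, while the `argocd.argoproj.io/sync-wave` annotation and the `Application` fields are standard Argo CD.

```python
import json

SYNC_WAVE_ANNOTATION = "argocd.argoproj.io/sync-wave"


def wave_application(name, wave, repo_url, path):
    """Build an Argo CD Application that syncs `path` in the given wave."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {
            "name": name,
            "namespace": "argocd",
            # Annotation values are strings; lower waves are synced first.
            "annotations": {SYNC_WAVE_ANNOTATION: str(wave)},
        },
        "spec": {
            "project": "default",
            "source": {"repoURL": repo_url, "path": path, "targetRevision": "HEAD"},
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": "argocd",
            },
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


def registry(repo_url, base_path, waves):
    """One Application per wave, each pointing at its own folder."""
    return [
        wave_application(f"wave-{n}", n, repo_url, f"{base_path}/wave-{n}")
        for n in waves
    ]


if __name__ == "__main__":
    # Hypothetical repository URL and folder layout.
    for app in registry("https://example.com/org/gitops.git", "registry", [1, 2, 3]):
        print(json.dumps(app, indent=2))
```

Ordering Applications this way only holds later waves back if Argo CD can assess the health of a child Application. In current Argo CD releases that check is not built in and is typically restored through the `resource.customizations.health.argoproj.io_Application` key in the `argocd-cm` ConfigMap, which may be the default-off configuration mentioned earlier in the talk.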

Can you pull it up in Argo CD? Sure. And can we look at that job right there, the wait sync waves? Sure, as a desired manifest.

Thanks, buddy. So, this is an example of just a raw Kubernetes manifest that we're deploying to a particular namespace. Right now, what we're showing you is just wave one, two, and three, but if you close your eyes and use your imagination for a second, just imagine that this is Ingress NGINX first, then ExternalDNS second, and then maybe cert-manager third. Those are the types of sequences of operations that you're going to have to be ordering when you're trying to unfurl a GitOps ecosystem from scratch. And it's just a Kubernetes job, so you drop it in there.
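
A wait job of this kind boils down to a polling loop with a deadline. The sketch below is one possible shape for the script such a job could run, assuming `kubectl` is available in the job's image; `wait_until` and `pods_exist` are hypothetical helper names, and the real job in the demo was not shown in source form.

```python
import subprocess
import time


def wait_until(check, timeout_s=300.0, interval_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `check` until it returns True; return False once the deadline passes."""
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False  # give up so the job can fail instead of hanging
        sleep(interval_s)


def pods_exist(namespace, label_selector):
    """True if kubectl lists at least one pod matching the label selector."""
    result = subprocess.run(
        ["kubectl", "get", "pods", "-n", namespace, "-l", label_selector, "-o", "name"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.returncode == 0 and bool(result.stdout.strip())
```

Inside the job's container this could be called as `wait_until(lambda: pods_exist("ingress-nginx", "app.kubernetes.io/name=ingress-nginx"))`, exiting non-zero when it returns `False` so that the job is marked failed. The namespace and label in that call are placeholders.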

Now, if you want your Kubernetes job to be able to access certain resources, like maybe you need to be waiting until a secret exists, or waiting until a pod in some other namespace exists, or whatever it might be, you are going to have to mess with RBAC a little bit. You're going to have to create a service account and give that service account access to whatever resource it's supposed to be monitoring. But it's a relatively simple way to get a job running in Kubernetes that's going to monitor something. It could literally just be a kubectl get pods with a particular label in a loop until it's available, and then you allow it to eventually give up.

And then, can we also show an example of creating a secret via script ops, and then having the ability to overlay an Argo CD-driven external secret, so that it's managed by GitOps instead?

Yeah, let me grab this kubeconfig really quick. All right, so we'll create this secret in the overlay namespace, and this is script ops, so just a kubectl create command. And now if I want to go take a look at that secret, kubectl get secrets, man, this is going just as planned for a live demo. That must have gone into the default namespace. Yeah, so that was a bad command on my part. Let's go ahead and create that again, one more time, in the correct namespace for the rest of the demo. Okay, create a secret, singular. Yeah, that's going great. So, I think what I'd rather do is just change this to go to the other namespace. No, I can't do that either. Default. All right, so now that we have the secret back on the rails, barely, we'll take a look at the secret, and I should probably base64 decode this real quick.

Aren't live demos more exciting? They're more exciting this way, right? Yeah, I mean, it could have just been a GIF, right? That would have been so boring.

So, all we have here for the value of secret one is intro secret. So now, we'll demonstrate overlaying that with something, which is going to change the value intentionally. So, give me one second to push that change. All right, so if I pull this over just a touch, and this over a touch here, I can hit refresh on this side, and then as soon as this syncs, the secret resource is going to change the value of this. This is what we were describing earlier: sometimes you would script op a secret, but later on, you're going to want that to be in the desired state of your GitOps content. So, as soon as, did that just turn green? It did. So now, if I go to the YAML again, we can see that the value of that secret has changed. This value is actually coming out of HashiCorp Vault, and it's a different value. It's a demo example value, just to illustrate the point of overlaying.

And in this particular example, we're changing the value. That's not something that you'll do when you're creating your platform. When you create your platform, you'll create the real secret with the real value, you'll script ops it, you'll put it in your namespace before your secrets manager exists, and then once your secrets manager exists, you'll add some GitOps content to your GitOps repository to drop a secret right over the top of the secret that you script ops'd, except now it's being managed by GitOps, and you're in a better permanent state from a disaster recovery standpoint.

So, yeah, that's pretty much what we wanted to show from a demo standpoint. Jared, can you bring up the final slide, if that's all right? Yeah, so we don't want to make this a product pitch at all, but we do want to share with the world that we've been working on this open-source platform for four years now. It started as a nights-and-weekends project, and then we ran a one-year pilot with an enterprise company. We decided to open source the platform, and we've been running for the last year now with the team, and we're trying to see the rest of the vision through.

We create instant GitOps platforms, so that you can have a better starting point for Kubernetes with all of the best practices that we've collected over the course of the last four years. And it's all free and open source, so if that's valuable to you at all, we hope that you'll give it a shot. We also have a community of 250 or so engineers who have all agreed that these certain great technologies are just a better starting point for Kubernetes, and they're all buying into the same tools, working the same way, and they're all there to help each other. So, it's a really great learning environment, and it's a really great production-grade platform. We have a GitOps catalog and all kinds of cool stuff, so I hope you'll check us out. We're easy to find, and we hope that we can be helpful for you.

Does anybody have any questions about any of the stuff that we covered today? We've got a few extra minutes. And show of hands, how many of you are using GitOps right now? Okay, okay, about half, very good, very good. And one hand for Argo CD and a different hand for Flux CD? Okay, all right, thank you guys. And are you building those platforms from scratch, like from the ground up, just all the way? And is this talk resonating with you, from the standpoint of what you run into with these chicken-and-egg scenarios? Awesome, okay. I had heard it called the waves. Right on, right on. Yeah, so, anyway, we're easy to find, we're super helpful folks, and, oh yeah, please.

I do have a question. So, one of the challenges that we found is that not every tool that we deploy in the cluster has a Helm chart that lets us have declarative configuration.

Oh, okay.

And so, have you seen that, and do you have any patterns or recommendations for it? One example might be that you need to be able to access an API or click something in a UI before you can update and add the password, right?

Right, so if you have to literally click a button in order to get that operation, you would have to leverage some type of technology that could click that for you. And generally speaking, if they don't want to provide that in their Helm chart, there's usually a good reason that they want that to be a button click. So, there are just some circumstances where you have to cut your unfurling in half and say that this is the bootstrap portion, then you do your button click, and then you register the second half of your orchestration waves.

A lot of times, though, it's something else. For example, sometimes there's a Helm chart that expects you to have a secret in order to install it, but the Helm chart doesn't cover the secret at all. In that type of circumstance, it's totally okay to write scripting in your GitOps. So, you can have an Ubuntu image that's just sitting there in a Kubernetes job on a particular sync wave, and it can be responsible for generating a secret value and putting it in a namespace, and you're still guaranteed that disaster recovery posture of leveraging GitOps, even though there's a piece of your GitOps implementation that's scripting a secret. By wrapping that in a Kubernetes job, you get to turn it back into GitOps. So, for whatever help that gives you. Sure, we'll take one more. Thank you, appreciate it.

What about the situations where an image isn't available yet when you're rolling out?

Right, chicken, yeah, we do. So, we have Argo Workflows, but we're only leveraging Argo Workflows for our CI story on the platform that we've been building. How would you answer that one?

Metaphor is the closest. Yeah, yeah, good. Metaphor is an application that we deliver on the platform that's supposed to be just like your application would be on the platform. And we have to build those images and Helm charts and publish them before we can deploy them. So, there is a little bit of that, but we don't have anything in particular that holds up and waits for those images to publish. We could probably script something or put something in to do that.

Yeah, I imagine we would probably do something like a Kubernetes job, honestly, that was checking against your registry in some type of loop, to wait. The tricky bit is figuring out the condition that tells you that it's that container version. You would have to source that from your Git repository, and it would depend on your convention for how you're naming the tags of your containers. But yeah, from a GitOps standpoint, that would probably end up being another one of these script ops jobs that we add to the GitOps orchestration.
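
The registry check floated in this answer could look something like the sketch below. It is speculative, since the speakers say they have not built it: it assumes the registry exposes the OCI distribution `tags/list` endpoint with anonymous access, ignores pagination, and leaves the tag naming convention to the caller. The helper names are hypothetical.

```python
import json
import time
import urllib.request


def tags_from_response(body):
    """Extract tag names from an OCI `/v2/<name>/tags/list` JSON response."""
    return set(json.loads(body).get("tags") or [])


def fetch_tags(registry_url, repository):
    """Fetch the tag list; real registries usually require a bearer token."""
    url = f"{registry_url}/v2/{repository}/tags/list"
    with urllib.request.urlopen(url, timeout=10) as response:
        return tags_from_response(response.read())


def wait_for_tag(tag, list_tags, attempts=60, interval_s=5.0, sleep=time.sleep):
    """Return True as soon as `tag` is published, False after `attempts` polls."""
    for attempt in range(attempts):
        if tag in list_tags():
            return True
        if attempt < attempts - 1:
            sleep(interval_s)
    return False
```

The expected tag would come from the Git repository, for example a commit SHA, and `list_tags` would be something like `lambda: fetch_tags(registry_url, repository)` with values for the platform's own registry.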

Any other questions from anyone?

Maybe you answered this already, sorry if I missed it at the beginning.

Not at all.

The approach you have, are the parts replaceable? How easy is it to remove parts, to say, okay, I don't want this tool, for example, or to say there's a different, external approach I'd rather use?

Sure, yeah. So, what we're doing is, we have this upstream GitOps template repository that has all of our opinions. When you say that you want to install a platform, we pull that down, we hydrate it with the details about what cloud you want to install it to, what domain you want to attach it to, and what Git provider you want to use. And then we give it right back to you, and you host it in your GitHub or your GitLab. And that's powering all of the automation and all of the configurations of the entire platform. All your IaC is in there, all your GitOps configurations are in there. And because of that architecture, you own the GitOps repository. So if you don't like a particular tool that we're using, you just open a pull request and remove it from the platform. If you have some additional tools that you want to add, again, it's just a pull request away.

You could also take that GitOps template that he mentioned and fork it, or create your own, and then reference your GitOps template. So you could actually, with more longevity, use that replacement tool and your own personalized stack, as long as it conforms to the end result.

Any other questions? All right, well, thank you guys so much. Thank you. [Applause]

