Farming Cloud Native Platforms: Be Prepared for the Army of Chickens and Eggs! - Navigate Europe 23
Well, good morning. This is nice. So, today, we're going to be talking through some scenarios that come up when you're building internal developer platforms. There are all these chicken and egg scenarios that happen when you're building these platforms from scratch, and we're going to talk a little bit about how to identify them, how to shake them out, and the right way, or maybe not the right way but a great way, to orchestrate those sequences of events using Argo CD. And we're going to show you a couple of examples, live demo style, so wish us a lot of luck.
So, yeah, this is my co-founder, Jared. I'm John Dietz, and we have been building cloud-native platforms for many years, and we've kind of just gone through the wringer. We know all the patterns inside and out. So, if you're looking for a guiding voice on how to get through orchestration sync waves and how to get it all set up, we think that we're a pretty good voice of authority on the topic. Oh, thank you, buddy. So, Kubernetes swallowed the internet. We all kind of saw it coming, maybe some of us. In fact, when I first came across Kubernetes, I was certain that it was just another buzzword, and man, was I wrong. And I don't really have time to get into why it won, but it won, and it's complicated. It's a complex technology in and of itself, but then
you add all the cloud-native tools that you want in that cluster in order to power your platform, and it gets all the more complex. So, when you start building those cloud-native platforms, there are all these chicken and egg scenarios that you're going to have to battle. Examples of these are like secrets management. Maybe you want a secrets manager like HashiCorp Vault as part of your Kubernetes platform, and you want that to be the central source of truth for every single secret on your platform. And when you start building that platform from scratch, it doesn't exist yet. So, what do you do? Those are the types of chicken and
egg scenarios that we're going to be talking through today. So, you want to take it in? Yeah. So, when you're building your cloud-native platforms, these are the nine pillars that we've identified as problem points or things that you have to take into consideration when building out your orchestration. So, taking a look at cloud and Git providers, these are the first two pillars, and they do not have any dependencies. So, it's great,
enjoy it while it lasts. These things are usually SaaS-hosted, so you're going to sign into a console, be able to create some resources, and nothing depends on those yet. Then, you're going to get to your infrastructure as code. If you manage your Git repositories in infrastructure as code, then your infrastructure as code depends on your Git provider. If you manage your cloud resources in infrastructure as code, your infrastructure as code depends on your cloud provider. If you're running infrastructure as code in a control plane like Atlantis or Crossplane, then your GitOps driver is going to be the thing that configures that control plane. And if you're managing
your users in infrastructure as code, then that means your IdP has an IaC dependency as well. Next is your GitOps provider. So, woohoo, GitOps. Yeah, your GitOps driver runs in your infrastructure, generally in-cluster, or it can, anyway, and it has a registry that it pulls all of your config from, which is generally a Git repository if you're using Argo CD, and it pulls container images and charts from your artifact repositories to get your workloads running.
Next is secrets management. Secrets management, as John kind of mentioned upfront, is a good example. You can't have a secrets manager established until you have your cluster, so you have to figure out how to get through those. You're going to need secrets for your CI/CD, your applications, and it creates this weird circular dependency on how to establish a source of truth before your source of truth exists.
Your artifact repository has a bunch of dependencies. These are going to be for things like containers, charts, packages, CI dependencies. So, your CI system will depend on everything previous, pretty much. You're going to have your Git repositories,
you're going to have secrets management, you're going to have a whole bunch of things that need auth. Your auth system has to have roles in your cloud, in your cluster, across your applications, and across all the tools that are running on your platform. And your observability layer doesn't do a whole lot until you have all of this infrastructure that you need to be managing and actually monitoring. So, altogether now, we have all of these relationships that are formed throughout the provisioning process, and you have to figure out how to get them all together and working.
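The pillar relationships described above can be treated as a dependency graph. The following is a minimal Python sketch, not the speakers' tooling. The edge lists are an illustrative reading of the talk, and the names `BOOTSTRAP_DEPS`, `STEADY_STATE_EXTRA`, `provisioning_order`, `find_cycle`, and `merge` are hypothetical. It uses the standard library `graphlib` module to compute a provisioning order, and to surface a circular dependency once the "permanent" edges are added.

```python
from graphlib import CycleError, TopologicalSorter

# Illustrative edges only: pillar -> the pillars it needs at bootstrap time.
BOOTSTRAP_DEPS = {
    "cloud": set(),
    "git": set(),
    "iac": {"cloud", "git"},
    "gitops": {"iac", "git"},
    "secrets": {"gitops"},
    "artifacts": {"gitops"},
    "ci": {"git", "secrets", "artifacts"},
    "auth": {"iac", "secrets"},
    "observability": {"gitops", "auth"},
}

# Assumed extra edges that appear once everything is managed the permanent way.
STEADY_STATE_EXTRA = {
    "iac": {"gitops"},      # an IaC control plane configured by the GitOps driver
    "gitops": {"secrets"},  # GitOps-managed apps reading from the secrets manager
}


def merge(base, extra):
    """Return a new graph containing the edges of both inputs."""
    merged = {node: set(deps) for node, deps in base.items()}
    for node, deps in extra.items():
        merged.setdefault(node, set()).update(deps)
    return merged


def provisioning_order(deps):
    """Return the pillars ordered so that dependencies come first."""
    return list(TopologicalSorter(deps).static_order())


def find_cycle(deps):
    """Return one circular dependency as a list of nodes, or None."""
    try:
        provisioning_order(deps)
    except CycleError as err:
        return err.args[1]  # graphlib reports the cycle as the second argument
    return None


if __name__ == "__main__":
    print("bootstrap order:", provisioning_order(BOOTSTRAP_DEPS))
    print("chicken and egg:", find_cycle(merge(BOOTSTRAP_DEPS, STEADY_STATE_EXTRA)))
```

Splitting the graph into bootstrap edges and steady-state edges mirrors the idea of installing something one way first and reconfiguring it later: the bootstrap graph sorts cleanly, and the cycle only shows up when the permanent edges are layered on.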
It's a mess. Look at all those dotted lines pointing back at each other. I mean, it is a lot of circular dependencies, but we have some good tips to defeat the army of chickens and eggs that you're walking into. The first thing to know is that we happen to be an Argo CD shop. Flux CD is also a great GitOps driver worth looking at. All of the advice for orchestration that we're going to give you today is founded on Argo CD, so just understand that going in.
And there's a configuration in Argo CD that is not turned on by default when you install it, and without it, it's impossible to orchestrate your sync waves across apps when you're following the app-of-apps pattern. There are also some key configurations that you're going to have to get your arms around, and every platform is going to be a little bit different. On our platform, we have two main key configurations. When I say key configuration, what I'm trying to convey is that there are certain technologies on your platform where applications that were installed before that technology existed need to be installed with one configuration. Perhaps it's a secrets manager: you finally install your secrets manager, you hydrate it with all your secrets, and then you want to leverage that secrets manager in the applications that you've already installed. So then you have a second variation of that same application, one from before and one from after your secrets manager existed. That same type of scenario happens with OIDC providers when you're self-hosting your users. And again, every platform's different. One of your most critical tasks is to figure
out what your key configurations are and note them, because you're going to be driving all of your decision-making about your GitOps unfurling around those key configurations. You're going to have to create an app one way, then a key configuration application gets installed to the platform, and then you have to reconfigure all the applications that were installed before it. So, you can have multiple generations of a single app during provisioning. That's really important to know. You can install Argo CD without an OIDC provider, and then once your OIDC provider exists, you can install Argo CD right over the top of itself with that permanent configuration. So that's a really helpful tip. You can also script ops the creation of secrets before a secrets manager exists. You can literally kubectl apply. We're a GitOps shop. We buy in completely to the GitOps discipline. We think that it's the right way to run complex Kubernetes infrastructure from an asset management standpoint, from a security posture standpoint, and from a disaster recovery standpoint. GitOps is incredible.
And when you're trying to create these GitOps platforms from scratch, you're going to need to script ops the temporary configurations of these applications and the temporary secrets that you're installing into these namespaces, so that later on in your orchestration you can lay GitOps over the top of them, and then GitOps is the controller of those particular configurations. And there's one last detail, and we'll show you an example of this. This is kind of hard to describe, but when you're organizing your Argo CD sync wave orchestration and following the app-of-apps pattern, which we think is just the best pattern for Argo CD, it's important to understand that there's a difference between having a Kubernetes resource installed in a namespace or not. Sometimes you'll be building an app of apps, and you can technically create an app of apps of apps, and when you do that, you can lose the ability to monitor other sync waves and their operational states unless you deploy a Kubernetes object to do that watching for you. We'll show you exactly what that means in a live demo.
So with that said, it's demo time. Take it away, Jared. All right. I never know where to stand. Where should I? I'll just stand right in the middle here. So real quick, I just need to make a quick change, not there, management secret. So what we're doing here, Jared is pulling up a GitOps repository. When you have a GitOps platform, all of your configurations
for your platform are generally speaking going to be in a Git repository, in your GitHub or in your GitLab, and then you wire up your GitOps engine to look at that Git repository for all the configurations of what it wants to install and in what order it should be installing them. So, it's going to run through the list of all the previous sync waves, so it's going to take just a second for this to get started. But this is going to be the first demonstration of the sync waves operating in Argo CD in the app of apps pattern. So, if we take a look inside sync wave components one here, oops, wrong button. We can see that
this is just a job, it just ran a sleep, it completed, and as soon as that job completed, sync wave 2 started. And I'll click into this one because there's actually an app of apps in here, so it's a little bit easier to see the control of these objects. But, basically, if you didn't have this job here at the bottom that's called 'wait for these sync waves', then as soon as the sync process started on the sync wave 2-3 components, control might return to the main driver of your registry, which would start something else, which would fail because it had a dependency on something down in these lower sync waves. So, you can see there's just some more jobs in here. Can we see how this looks in the GitOps repository itself from a code standpoint, like how you're organizing the three sync waves, for example? Yeah, so, let's see if we can bump this font up a little bit for everybody. Yeah, that's good, thank you. So, each of these wave one, two, three files is actually just
referencing the folder inside the same directory structure. So, when you look at this sync wave 2 that we were just looking at, we have, excuse me, wave 2, I'm in the wrong, wave 2L, oh, going off the rails. I was expecting to see more than just that one wait.yml, but yeah, I got nothing, I don't see the other files I was hoping to see here, no. Yeah, sorry. So, it's sync wave 1, two and three, and sync wave 2 has an internal one,
two and three, that's all you're looking for. So, can you describe for us in the registry, those wave one, two, three at the bottom, the YAML with the annotations, and describe how it is that we're conducting that orchestration? Sure, sure. So, Argo CD just has a concept of sync wave, so, basically, when this is running through, you can see on line 9 here that this is the first sync wave that's going to happen, it's going to run whatever is in this directory structure here, and wait for all those things to happen, and then just continue down that same chain with wave two and three. And can you also show us the job that we had that was waiting on a particular sync wave? That's an important detail, it's basically an example of what we were describing with deploying a Kubernetes job to watch a particular Kubernetes object, to wait until it exists or whatever it is that you need in your orchestration, to hold up until something completes. Yeah, and I can't find that job, so I'm just, oh, is that right? Yeah, I am having a hard time. But so, moving on, and I think I already actually pushed the secret overlay, so I need to comment that out really quick.
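The ordering that the sync wave annotation expresses can be modelled in a few lines. This is a minimal sketch under stated assumptions, not Argo CD's implementation: it only groups manifests by the `argocd.argoproj.io/sync-wave` annotation (resources without it are treated as wave 0) and sorts the groups from lowest to highest. It does not model sync phases, health checks, or the waiting between waves. The manifest names and the `wave_of` and `plan_waves` helpers are hypothetical.

```python
from collections import defaultdict

SYNC_WAVE = "argocd.argoproj.io/sync-wave"


def wave_of(manifest):
    """Read the sync wave from a manifest's annotations, defaulting to 0."""
    annotations = manifest.get("metadata", {}).get("annotations") or {}
    return int(annotations.get(SYNC_WAVE, "0"))


def plan_waves(manifests):
    """Group resource names by wave, lowest wave first."""
    waves = defaultdict(list)
    for manifest in manifests:
        waves[wave_of(manifest)].append(manifest["metadata"]["name"])
    return [(wave, sorted(names)) for wave, names in sorted(waves.items())]


# Hypothetical registry contents, loosely shaped like the demo's three waves.
registry = [
    {"metadata": {"name": "wave-3", "annotations": {SYNC_WAVE: "3"}}},
    {"metadata": {"name": "wave-1", "annotations": {SYNC_WAVE: "1"}}},
    {"metadata": {"name": "wave-2", "annotations": {SYNC_WAVE: "2"}}},
    {"metadata": {"name": "no-annotation"}},
]

if __name__ == "__main__":
    for wave, names in plan_waves(registry):
        print(wave, names)
```

The annotation value is a string in the manifest, which is why the sketch converts it with `int` before sorting.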
Can you pull it up in Argo CD? Sure. And can we look at that job right there, the wait sync waves? Sure, as a desired manifest. Thanks, buddy. So, this is an example of just a raw Kubernetes manifest that we're deploying to a particular namespace. Right now, what we're showing you is just wave one, two, and three, but if you close your eyes and use your imagination for a second, just imagine that this is Ingress NGINX first, then ExternalDNS second, and then maybe cert-manager third. Those are the types of sequences of operations that you're going to have to be ordering when you're trying to unfurl a GitOps ecosystem from scratch. And it's just a Kubernetes job, so you drop it in there.
Now, if you want your Kubernetes job to be able to access certain resources, like maybe you need to be waiting until a secret exists or waiting until a pod in some other namespace exists or whatever it might be, you are going to have to mess with RBAC a little bit. You're going to have to create a service account and give that service account access to whatever resource it's supposed to be monitoring. But it's a relatively simple way to get a job running in Kubernetes that's going to monitor something. It could literally just run kubectl get pods with a particular label in a loop until the pod is available, and eventually give up if it never is. And then, can we also show an example of creating a secret via script ops, and then having the ability to overlay an Argo CD driven external secret, so that it's managed by GitOps instead? Yeah, let me grab this kubeconfig really quick. Alright, so we'll create this secret in the overlay namespace, and this was script ops. So, just a kubectl create command,
and now if I want to go take a look at that secret. Kubectl get secrets, man, this is going just as planned for a live demo. That must have been in the default namespace. Yeah, so that was bad on my command there, let's go ahead and create that again, again, one more time, in the correct namespace for the rest of the demo. Okay, create a secret, singular, yeah, that's
going great. So, I think what I'd rather do is just change this to go to the other namespace. No, I can't do that either, default. Alright, so now that we have the secret back on the rails, barely, but now we'll take a look at the secret, and I should probably base64 decode this real quick. Aren't live demos more exciting? They're more exciting this way, right? Yeah, I mean, it could have just been a gif, right? That would have been so boring. So, all we have here for the value of secret one is intro secret. So now, we'll demonstrate
overlaying that with something, which is going to change the value intentionally. So, give me one second to push that change. Alright, so if I pull this over just a touch, and this over a touch here, I can hit refresh on this side, and then this secret resource is going to change the value of this. This is kind of what we were describing earlier: sometimes you would script ops a secret, but later on, you're going to want that to be in the desired state of your GitOps content. So, as soon as... did that just turn green? It did. So now, if I go to YAML again, we can see that the value of that secret has changed. This value is actually coming out of HashiCorp Vault, and it's a different value,
that's a demo example value, just to illustrate the point of overlaying it. In this particular example, we're changing the value, but that's not something that you'll do when you're creating your platform. When you create your platform, you'll create the real secret with the real value, you'll script ops it, you'll put it in your namespace before your secrets manager exists, and then once your secrets manager exists, you'll add some GitOps content to your GitOps repository to drop a secret right over the top of the secret that you script ops'd, except now it's being managed by GitOps, and you're in a better permanent state from a disaster recovery standpoint. So, yeah, that's pretty much what we wanted to show from a demo standpoint. Jared, can you bring
up the final slide if that's alright? Yeah, so we don't want to make this a product pitch at all, but we do want to share with the world that we've been working on this open-source platform for four years now. It started as a nights and weekends project, and then we ran a one-year pilot with this enterprise company. We decided to open source the platform, and we've been running for the last year now with the team, and we're trying to see the rest of the vision through. We create instant GitOps platforms, so that you can have a better starting point for Kubernetes with all of the best practices that we've collected over the course of the last four years. And it's all free and open source, so if that's valuable to you at all, we hope that
you'll give it a shot. We also have a community of 250 or so engineers who have all agreed that these great technologies are just a better starting point for Kubernetes. They're all buying into the same tools, working the same way, and they're all there to help each other. So, it's a really great learning environment, and it's a really great production-grade platform. We have a GitOps catalog and all kinds of cool stuff, so I hope you'll check us out. We're easy to find at KubeFirst.io, and we hope that we can be helpful for you. Does anybody have any questions
about any of the stuff that we covered today? We've got a few extra minutes. And show of hands, how many of you are using GitOps right now? Okay, okay, about half, very good, very good. And one hand for Argo CD and a different hand for Flux CD? Okay, alright, thank you guys. And are you building those platforms from scratch, from the ground up, all the way? And is this talk resonating with you, from the standpoint of what you run into with these chicken and egg scenarios? Awesome, okay. I had heard it called the waves. Right on, right on. Yeah, so, anyway, we're easy to find, we're super helpful folks, and, oh yeah, please. I do have a question. So, one of the challenges that we found is that every tool that we deploy
they don't all have a Helm chart that lets us have declarative configuration. Oh, okay. And so, have you seen that, and do you have any patterns or recommendations? One example might be that you need to access an API or click something in a UI before you can update and add the password, right? Right, so if you have to literally click a button in order to get that operation, you would have to leverage some type of technology that could click that for you. And generally speaking,
if they don't want to provide that in their Helm chart, there's usually a good reason that they want that to be a button click. So, there are just some circumstances where you have to cut your unfurling in half and say that this is the bootstrap portion, then you do your button click, and then you register the second half of your orchestration waves. A lot of times though, there's a Helm chart that expects a secret to already exist in order to install, but the Helm chart doesn't cover creating that secret at all. In that type of circumstance, it's totally okay to write scripting in your GitOps. So, you can have like
an Ubuntu image that's just sitting there in a Kubernetes job on a particular sync wave, and it can be responsible for generating a secret value and putting it in a namespace, and then you still get that disaster recovery posture of leveraging GitOps, even though there's a piece of your GitOps implementation that's scripting a secret. By wrapping that in a Kubernetes job, you get to turn it back into GitOps. So, for whatever help that gives you. Sure, we'll take one more. Thank you, appreciate it. What about situations where an image isn't available yet when you're rolling out? Right, chicken and egg. Yeah, we do. So, we have Argo Workflows, but we're only leveraging Argo Workflows for our CI story on the platform that we've been building. How would you answer that one?
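The scripted step inside such a job can be small. Below is a minimal Python sketch of generating a secret value and wrapping it in a Kubernetes Secret manifest, assuming the job's container has Python available and something like `kubectl apply -f -` to consume the output. The function names, the secret name `demo-secret`, and the key `secret-one` are hypothetical; only the standard library is used. The `decode_secret` helper mirrors the base64 decoding step from the demo.

```python
import base64
import json
import secrets


def generate_value(nbytes=32):
    """Generate a random, URL-safe secret value."""
    return secrets.token_urlsafe(nbytes)


def build_secret_manifest(name, namespace, values):
    """Build an Opaque Secret manifest; Kubernetes expects base64 in `data`."""
    data = {
        key: base64.b64encode(value.encode("utf-8")).decode("ascii")
        for key, value in values.items()
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "Opaque",
        "data": data,
    }


def decode_secret(manifest):
    """Decode the base64 `data` field back into plain strings."""
    return {
        key: base64.b64decode(value).decode("utf-8")
        for key, value in manifest.get("data", {}).items()
    }


if __name__ == "__main__":
    # JSON is valid input for kubectl, so the output can be piped to an apply.
    manifest = build_secret_manifest(
        "demo-secret", "overlay", {"secret-one": generate_value()}
    )
    print(json.dumps(manifest, indent=2))
```

Because the job definition itself lives in the GitOps repository, rerunning the orchestration regenerates the secret, which is the disaster recovery property described above.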
Metaphor is the closest. Yeah, yeah, good. Metaphor is an application that we deliver on the platform that's supposed to be just like your application would be on the platform. And we have to build those images and Helm charts and publish them before we can deploy them. So, there is a little bit of that,
but we don't have anything in particular that holds up and waits for those images to publish. But we could probably script something to do that. Yeah, I imagine we would probably do something like a Kubernetes job, honestly, that checks against your registry in some type of a loop and waits. The tricky bit is figuring out what to check for to know that it's that container version. You would have to source that from your Git repository, and it would depend on your convention for naming your container tags. But yeah, from a GitOps standpoint, that would probably end up being another one of these script ops jobs that we add to the GitOps orchestration.
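A waiting job of this kind boils down to a polling loop with a deadline. The sketch below is a hedged Python illustration, not something the speakers showed. `wait_until` takes the actual check as a callable, so no particular registry API is assumed, and `expected_tag` encodes one hypothetical tagging convention (a short Git commit SHA). Both names are made up for this example.

```python
import time


def expected_tag(commit_sha, length=7):
    """Hypothetical convention: the image tag is the short Git commit SHA."""
    return commit_sha[:length]


def wait_until(check, timeout_s=300.0, interval_s=5.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll `check` until it returns True or the timeout expires.

    Returns True if the condition was met and False if the wait gave up.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout_s
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


if __name__ == "__main__":
    # Stand-in for a real registry lookup, which would replace this set.
    published_tags = {"abc1234"}
    tag = expected_tag("abc1234def5678")
    found = wait_until(lambda: tag in published_tags, timeout_s=10, interval_s=1)
    raise SystemExit(0 if found else 1)
```

Exiting non-zero when the wait gives up lets the Kubernetes job fail, which in turn keeps later sync waves from starting against an image that never arrived.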
Any other questions from anyone? Maybe you already answered this, sorry if I missed it at the beginning. Not at all. The approach you have, is it replaceable? How easy is it to remove parts, to say, okay, I don't want this, for example, or the inverse, to say there's a different external approach? Sure, yeah. So, what we're doing is, we have this upstream GitOps template repository that has all of our opinions. And when you say that you want to install a platform, we pull that down, we hydrate it with the details about what cloud you want to install it to, what domain you want to attach it to, and what Git provider you want to use. And then we give it right back to you, and you host it in your GitHub or your GitLab. And that's powering all of the automation, all of the
configurations of the entire platform. All your IaC's in there, all your GitOps configurations are in there. And because of that architecture, you own the GitOps repository. So if you don't like a particular tool that we're using, you just pull request and remove it from the platform. If you have some additional tools that you want to add, again, it's just a pull request away. Also, that GitOps template that he mentioned, you could fork that or create your own and then reference your own GitOps template. So you could actually, with more longevity, use that replacement tool and your own personalized stack, as long as it conforms to the end result. Any other questions? All right, well, thank you guys so much. Thank you. [Applause]
2023-11-30 13:20