GitOps & ArgoCD with Viktor Farcic

Viktor Farcic & Adam Hawkins discuss the logic behind GitOps, how to use ArgoCD, and changing the way we think of production environments.

[00:00:00] Hello and welcome. I'm your host, Adam Hawkins. In each episode, I present a small batch, with theory and practices behind building a high velocity software organization. Topics include DevOps, lean, software architecture, continuous delivery, and conversations with industry leaders. Now let's begin today's episode.

[00:00:26] Aloha everyone. Adam here coming to you from sunny and warm Hawaii, right in the middle of the Pacific Ocean. I have a few things to share with you before we get into today's episode. I've updated all the free resources I've made to support the show. You may have already seen some of these, but if you haven't, they've got a shiny new coat of paint.

[00:00:45] The first is my Project to Product email course. You've heard me talk about the book and the Flow Framework on this podcast, and even an interview with Mik Kersten to dive deeper into it. This email course goes deeper than we could in these episodes and covers parts we've not touched on so far. My favorite parts are the three epiphanies and the turning point into the age of software. Both are key points to the high-level thesis that haven't appeared on the show. You can get this one at projecttoproductsummary.com.

[00:01:13] Second is my continuous improvement pocket guide. This is really the third way of DevOps as described through Mike Rother's Toyota Kata book. The pocket guide covers the connection between DevOps and Toyota, a shareable five-point summary, and tips on bootstrapping the improvement and coaching kata. So if you're feeling stuck in the status quo, then topple it with the Toyota Kata. Get this one at toyotakatapocketguide.com.

[00:01:38] Third is my War and Peace and IT pocket guide. You could spend a week reading through why IT fails to deliver on-time, complete, and on-budget projects, and then continue on to why hypothesis-driven thinking, lean theory, and DevOps offer a way forward. Or you could just spend 20 minutes with me and only get the best bits. It's like a Small Batches episode for a book. Get this one at warpeaceanditpocketguy.com.

[00:02:04] All three of these are free. Now I've also got to give a shout-out to my flagship DevOps course. It combines the best of The DevOps Handbook and Accelerate with years of my practical software delivery experience. Get the free nine-day email course at freedevopscourse.com.

[00:02:20] If you just started listening to this podcast, then this is a great place to start. It goes much deeper into the topics than I'm able to on this podcast.

[00:02:29] Lastly, there's something new I'm trying to put together for February or March 2021. I'm running a four-week software delivery dojo. Participants will meet each Sunday for practical and theoretical exercises related to software delivery. The point of this dojo is to level up people's skills in building, deploying, and operating production systems. This is really an experiment for me, so I'm not entirely sure what we'll cover. It's more participant-driven at the moment.

[00:02:56] I'm offering this at an amazingly affordable price to listeners of this podcast and members of my list. Spots are limited, so apply now at softwaredeliverydojo.com.

[00:03:07] Of course you can find links to all of these on the show notes page at smallbatches.fm. Now let me introduce today's guest.

[00:03:16] I'm speaking with Viktor Farcic. Viktor is a principal DevOps architect at Codefresh. He's a member of the Google Developer Experts, Continuous Delivery Foundation Ambassadors, and even Docker Captains groups, and also a published author. His big passions are DevOps, containers, Kubernetes, microservices, continuous integration, delivery, and deployment pipelines, and test-driven development. Really sounds a bit like me, right? He's also published the DevOps Toolkit series of books and videos and the Test-Driven Java Development book, as well as numerous courses on Udemy. He's also the co-host of the DevOps Paradox podcast, which I recently appeared on. That was a fun episode discussing a day in the life of an SRE. There's a link to that one in the show notes too.

[00:04:02] As I mentioned, Viktor is a principal DevOps architect at Codefresh, which I am a big fan of. Recently I did a webinar on building extendable and composable deployment pipelines using Codefresh. This was an official collaboration between Skillshare and Codefresh. So if you're curious about how I think about and build pipelines, then check out that webinar. The recording is linked in the show notes too.

[00:04:26] So I'm stoked to talk to Viktor because he has such a wide range of experience. Plus he's really just a fun guy to talk to. I think that really comes out in this interview, and I don't think I've laughed so hard while talking tech in a long time.

[00:04:41] Anyway, Viktor and I discuss the GitOps workflow and how and why to use Argo CD. We don't really go too deep into the technical specifics. Instead we focus on how these tools fit into the deployment pipeline and help teams achieve fast flow.

[00:04:57] So I give you my conversation with Viktor Farcic.

[00:05:04] Adam Hawkins: Viktor, welcome to the show.

[00:05:06] Viktor Farcic: Thank you for having me.

[00:05:07] Adam Hawkins: My pleasure. I'm more than happy to talk to you today. It's actually kind of a special episode of Small Batches in a way, because I think this'll be the first episode where we talk about a specific technology. You know, typically on the podcast we talk more about high-level theory and practices rather than implementing with a specific technology.

[00:05:27] Adam Hawkins: So I'm happy to have you on today to talk about Argo CD and GitOps. So maybe we can start just at the basics and level up from there. Can you explain what GitOps is and what Argo CD is?

[00:05:43] Viktor Farcic: GitOps I can explain as a set of principles. It starts with the idea that everything is defined as code. And from there on, we can follow that line of thought. If everything is code, everything is in Git. If everything is in Git, Git defines the desired state, what we want. And then we have some processes, automated preferably, that are converging the actual state (what is happening in your clusters, or even what your clusters are) into that desired state.

[00:06:16] Viktor Farcic: I think that if we go further away from that definition, we would just enter into specific use cases, but that should cover, I think, what we call GitOps.

[00:06:27] Adam Hawkins: Yeah, so, okay. I think that some of the expectation around GitOps, at least in my experience, kind of comes down to: hey, everything is YAML. If you don't have YAML, then it's sort of weird, it doesn't fit. But in your opinion, how do other infrastructure-as-code or declarative-type things like Terraform and all of that fit into GitOps?

[00:06:51] Viktor Farcic: I think that fits perfectly, and I don't believe that everything should be YAML. Maybe it should be 90% or 20%, right? But you just mentioned Terraform. Terraform is probably the best tool we have today to manage infrastructure. It is not in YAML. It's in, I think it's called HCL, and that's okay, right? And many other things might or might not be in YAML. I would rather say that it's very helpful if that something is defined in a declarative format, because it's much easier to describe, to define the state of something in a declarative format than in an imperative one, right?

[00:07:33] Viktor Farcic: Now, hey, if you want to use JSON, that works just as well. If you want XML, I feel sorry for you, but still, that works just as well, right? It's not really that it has to be YAML. YAML just happens to be the most popular today.

[00:07:49] Adam Hawkins: Well, I think the key point of the definition is that it's declarative code. The format doesn't necessarily matter. It's JSON, YAML, HCL, whatever you want to call it, but you pass this to some other program. It analyzes the state of the world and then makes the declared state into reality, right?

[00:08:09] Viktor Farcic: Yeah. I mean, theoretically it could be some imperative format as well, but then we would really do a disservice to the machines that need to converge that, right? Because for machines, reading something defined in a declarative format makes it so much easier to figure out, you know, what the differences are, what is missing, what is different, and so on and so forth. It's not necessarily even that it has to be declarative. It's just so much easier. And I think that almost all the tools of that type, right, that define the state of something, that were not based on a declarative format are probably gone, right? There aren't many left.
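To make the "what is missing, what is different" point concrete, here is a minimal Python sketch of a converging process computing a plan from two declarative descriptions. It is an illustration only, not how Argo CD or Terraform are implemented; the `plan` function and the resource names are hypothetical.

```python
from typing import Dict, List


def plan(desired: Dict[str, dict], actual: Dict[str, dict]) -> List[str]:
    """Return the actions needed to converge `actual` into `desired`."""
    actions = []
    # Anything declared but absent must be created; anything that differs must be updated.
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(f"create {name}")
        elif actual[name] != spec:
            actions.append(f"update {name}")
    # Anything running that is no longer declared must be removed.
    for name in sorted(actual):
        if name not in desired:
            actions.append(f"delete {name}")
    return actions


# Hypothetical desired state (what is in Git) and actual state (what is running).
desired = {
    "web": {"image": "web:1.2.0", "replicas": 3},
    "worker": {"image": "worker:0.9.1", "replicas": 1},
}
actual = {
    "web": {"image": "web:1.1.0", "replicas": 3},
    "cron": {"image": "cron:0.1.0", "replicas": 1},
}

print(plan(desired, actual))
```

Because both sides are plain data, the comparison is a simple dictionary diff. With an imperative script as input, the machine would have to execute or interpret the script to learn what the end state is supposed to be.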

[00:08:51] Adam Hawkins: Just a quick sidebar into this discussion, but where does something like Ansible fall into this in your mind?

[00:09:00] Viktor Farcic: I mean, it depends how you define the question. I think that from a GitOps perspective, yes. I mean, Ansible will define the state of your infrastructure, even of your applications, and from a GitOps perspective...

[00:09:15] Viktor Farcic: Yes. That's just as valid as anything else. Now, I have a personal grudge with Ansible. What I think is that Ansible and the predecessors of Ansible, like Chef, Puppet, and CFEngine (if anyone doesn't know what CFEngine is, that will make me depressed, because it means that I'm old), all those tools are based on promise theory and were really designed with mutable logic, right?

[00:09:46] Viktor Farcic: Mutable infrastructure, mutable deployments. Now, I know full well that Ansible can work with immutable stuff, right? But it's not really designed for that. I don't think it's as good a fit for the immutable world as, let's say, Terraform, or it could even be Pulumi, and now we're already entering into the domain of it doesn't even have to be declarative, right? To be honest, there is actually no such thing as purely declarative or purely imperative either. I can find just as many examples in Terraform definitions where you have to do some loops and some if statements and whatnot, right? So it's more declarative than not, rather than the other way around.

[00:10:32] Adam Hawkins: You spend more time describing what you want rather than how to do it.

[00:10:36] Viktor Farcic: Yeah. And this conversation, I think, leads us to not the rule, not the requirement, but the nice-to-have thing about GitOps, and that is everything being immutable. It doesn't have to be, but it certainly simplifies things and helps a lot. And I have this feeling that people are still having trouble wrapping their heads around the immutable world.

[00:11:00] Viktor Farcic: Somehow with applications, we all got used to images being converted into containers, hopefully not doing any funky stuff once that becomes a container. But with infrastructure, still, when I tell people you never update anything inside a VM, or you never update your... sometimes they still look at me like: oh, what is this guy? What do you mean?

[00:11:28] Adam Hawkins: I get where you're coming from. And I think this is something that I also encounter just in my own talking to people. It's hard for me to understand, or to always remember, that there are some people who have never even heard of the concept of infrastructure as code or immutable infrastructure. You know, some of these things we take for granted, they're just assumed, and yet people have never heard of them.

[00:11:52] Adam Hawkins: And you know, if you tell somebody, hey, you never update your VM or update anything inside that image, well, that's true. But if you need to update something, you just create a new image. You still do that. It's just that how it happens is different. And I agree with you that Ansible and some of these tools predate the idea of immutability as principle zero, right?

[00:12:15] Adam Hawkins: Like, the idea of SSHing into machines, running some scripts, changing the special pets, and making the world as you want is different than: hey, I create the whole world, and if I want to create a new world, I just change it, run this script, run this program, and it builds a new world for me. I can tear it down, recreate it, modify it, whatever I want. You don't care. There's no attachment to any of these things; they don't matter. You know, that opens up all kinds of extra use cases, like: hey, you opened a branch? You can just take everything in this branch, deploy it to a preview environment, open the application, and then when you're done, tear it down.

[00:12:47] Adam Hawkins: You can create test environments, you can do, you know, all kinds of stuff like that. So now to circle back to GitOps, this is, I think, where it's really useful, because if you have the entire world or stage or application, whatever you want to call it, defined in a code repository, then you can just pass this repository through your deployment pipeline, and at the end of the day, the desired state is the final state. So how does Argo CD fit into this? Maybe you can explain what Argo CD is.

[00:13:19] Viktor Farcic: Yeah. Before that, let me just say very quickly that I hate the name. I dislike that it's called Argo CD because I have a different definition of what CD is.

[00:13:30] Adam Hawkins: I want to hear this one.

[00:13:30] Viktor Farcic: Okay. So CD, really, is about automating all the steps in the life cycle of your application, from the moment you commit to Git until it is releasable to production. Or released to production, if it's the continuous deployment model. It's the same thing, right?

[00:13:54] Adam Hawkins: Yeah. We, you and I have the same definition.

[00:13:56] Viktor Farcic: Yeah. But then there is that, you know, movement of people naming things randomly, and it's almost random, right? Okay, I'll get back to CD. Let me just reverse this. What Argo CD does is deploy applications based on definitions in Git.

[00:14:18] Viktor Farcic: Okay. And now if I circle back to CD, that's not CD, right? In my head, at least, that's a fraction of what CD is, because I cannot deploy thin air. I need to build something, I need to push it to a registry, I need to test it, hopefully, you know, and so on and so forth.

[00:14:36] Viktor Farcic: So, with the CD word out of the way, what Argo CD is really doing is monitoring Git repositories and making sure that the actual state on the application level is the same as the desired state.

[00:15:01] Viktor Farcic: But now it's more like something looking at what's going on: oh, this changed, so I should execute this command or I should execute that command. I should do whatever I need to do to make sure that the changes are applied, either because I pushed a change to Git, or maybe I didn't push anything, but the state of the cluster changed for one reason or another and it no longer complies with my picture, you know, the desired state in Git. And so from that perspective, Argo CD is almost the same as, let's say, running pipelines triggered by you making changes to the code repository and then executing kubectl apply, or whatever you would be using. And I think that the real essential difference between the two is whether it's a push- or pull-based model.

[00:15:56] Viktor Farcic: So while traditionally we would intercept those changes, triggers of changes in the Git repository, and then do this or that, either from my terminal, maybe, or from a CD pipeline, or from here or there, this is an entity running in a cluster and monitoring your Git repository. And that means that theoretically (I know that it's never going to be like that a hundred percent in practice, but theoretically) I could have my production without any incoming traffic, without people or other tools having my secrets, having access to my cluster, or having any way of communicating with the cluster, and yet the cluster would always be in the state that I define.

[00:16:42] Viktor Farcic: So, many people don't describe that as the main characteristic of Argo CD, but from my perspective, that is the main difference: that pull model, instead of the push one that most of us are using.

[00:16:59] Adam Hawkins: Yeah. So just to restate what you said to make sure I understand: let's imagine one type of system with some sort of deployment pipeline. One step, let's say you're deploying a Kubernetes-based application, would be to generate manifests and do something like kubectl apply in the pipeline. That's a direct communication to the cluster, a push to the cluster, pushing a new application to the cluster. And then in Argo, you have something running in the cluster that says: oh, I observed an external state change outside the cluster, now I will do whatever is required inside the cluster. So that's a pull model, the same type of thing that you would see in other types of configuration management systems.
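The pull model can be sketched with a toy agent that lives next to the cluster and is the only thing that ever touches it. This is a simplified Python simulation under stated assumptions, not Argo CD's implementation: `FakeGitRepo`, `FakeCluster`, and `PullAgent` are hypothetical stand-ins, and a real agent would run the reconcile step on a timer or in response to repository events.

```python
class FakeGitRepo:
    """Stands in for the Git repository that holds the desired state."""

    def __init__(self):
        self.manifests = {}

    def commit(self, manifests):
        self.manifests = dict(manifests)


class FakeCluster:
    """Stands in for the cluster; nothing outside the agent writes to it."""

    def __init__(self):
        self.resources = {}


class PullAgent:
    """Runs 'inside' the cluster and pulls the desired state from Git."""

    def __init__(self, repo, cluster):
        self.repo = repo
        self.cluster = cluster

    def reconcile_once(self):
        """Converge the cluster to the repo. Returns True if anything changed."""
        desired = self.repo.manifests
        if self.cluster.resources == desired:
            return False
        self.cluster.resources = dict(desired)
        return True


repo, cluster = FakeGitRepo(), FakeCluster()
agent = PullAgent(repo, cluster)

repo.commit({"web": "web:1.0"})           # someone pushes a change to Git
print(agent.reconcile_once())             # the agent notices and applies it
print(agent.reconcile_once())             # nothing changed, nothing to do

cluster.resources["web"] = "hand-edited"  # drift: the cluster changed, Git did not
print(agent.reconcile_once())             # the agent converges it back
print(cluster.resources)
```

Note that the pipeline side never needs cluster credentials here. It only needs write access to the repository, which is the point Viktor makes about production having no incoming traffic.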

[00:17:41] Adam Hawkins: You know, if you're running something like Puppet or Chef, you have some agent running, it notices a change and applies that to everything.

[00:17:48] Adam Hawkins: One question. This is just me not being as familiar with Argo CD as I maybe should be. But what types of applications can you deploy with Argo CD? Is this a Kubernetes-only thing or is there something else?

[00:17:59] Viktor Farcic: Kubernetes only. In the case of Argo CD itself, it's only Kubernetes. Now, of course, we could apply the same logic outside Kubernetes.

[00:18:09] Adam Hawkins: It's just easier inside Kubernetes because it already is effectively GitOps, in the sense that you throw some YAML, the manifests, at the API, or whatever makes the API calls, and eventually Kubernetes does whatever is needed to converge everything that's already there to the desired state, or creates and updates as it sees fit.

[00:18:26] Viktor Farcic: Exactly. And you know, if I were starting a project of that type from scratch right now, I would do it Kubernetes only, at least until I earn a lot of money and can hire 500 people, simply because we have one API to rule them all, right? It works in AWS. It works in Azure. It works on-prem. It works wherever Kubernetes is.

[00:18:46] Viktor Farcic: I mean, theoretically, you could have it even in, I don't know, IoT, right? Argo CD doing stuff, maybe not by running there.

[00:18:58] Adam Hawkins: Yeah, that's true. Okay. So technical question here. Let's say that I'm deploying an application with Argo CD, and me personally, I'm used to writing manifests, making Helm charts, writing all these things.

[00:19:10] Adam Hawkins: So if I wanted to deploy with Argo CD, am I writing Kubernetes manifests? Am I writing something else? How exactly am I declaring these things?

[00:19:18] Viktor Farcic: No. That's, I think, one of the major differences between Argo CD and probably the only other similar tool, which is Flux: with Argo CD, whichever way you're used to having your manifests continues working.

[00:19:33] Viktor Farcic: The only difference between what you might be doing today and what you would do if you adopted Argo CD is that you have one more definition, of the kind Application, an Argo CD Application. In essence, the only thing that Application does, in a simplified way, is provide Argo CD with the URL of the repo that it should monitor, right?

[00:19:57] Viktor Farcic: So once you create that Argo CD Application, you are essentially telling it: hey, monitor this directory in this repository. From there on, the manifests can be in any form you already have them in right now. Now, to be honest, once you adopt Argo CD, and GitOps in general, you might start seeing some patterns which might compel you to change some of the things you do, but it's not mandatory, right? It's more like: hey, look, I can do this, right?
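For readers who have not seen one, the "one more definition" can be sketched as data. The Python below builds a dictionary shaped like an Argo CD Application resource and prints it as JSON. The field names follow the Argo CD Application resource as commonly documented, but treat them as something to verify against the Argo CD docs for your version; the repository URL, the `k8s` path, the application name, and the `argocd` namespace are assumptions made for illustration.

```python
import json


def argo_application(name, repo_url, path, revision="HEAD", namespace="default"):
    """Build a dict shaped like an Argo CD Application resource."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumes Argo CD is installed in a namespace called "argocd".
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            # Which repository, directory, and revision to monitor.
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": revision,
            },
            # Where to deploy: here, the cluster Argo CD itself runs in.
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            # Sync automatically, remove deleted resources, and undo drift.
            "syncPolicy": {"automated": {"prune": True, "selfHeal": True}},
        },
    }


# Hypothetical repository and path.
app = argo_application("app-a", "https://github.com/example/app-a.git", "k8s")
print(json.dumps(app, indent=2))
```

Everything under the `k8s` directory in this sketch could be plain manifests, a Helm chart, or something else. The Application only tells Argo CD where to look, which matches the point that existing manifests keep working.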

[00:20:31] Adam Hawkins: Let's talk about that a little bit, because say, if you're coming from sort of, Hey, I saw this new technology. I think it will help me. It changes the workflow. I have to think differently. So you mentioned some patterns.

[00:20:43] Adam Hawkins: So what are the kind of like day two realizations when you adopt a workflow like this?

[00:20:48] Viktor Farcic: So what I normally do is create what we call an application of applications in Argo CD. So an application would be: here's the repository of my production, right? Now, traditionally, that would mean that in the production repository I would have all the definitions, like Helm charts or Kubernetes YAML files.

[00:21:11] Viktor Farcic: But what we normally do in that production repository, instead of copying and pasting all the files all over the place, you know, in staging, in production, is have just a single Application, application A, pointing to the repository of application A, and potentially maybe overwrite some values, like the number of replicas. So my production repository would be a collection of Argo CD Applications instead of traditional definitions. And that's just a few lines, right? Hey, application A is there, overwrite those values; application B is over there, overwrite those values; et cetera, et cetera. And then you would have maybe a staging repository, or a branch, doesn't matter, which would apply the same method, pointing to the same applications, but maybe with different tags.

[00:22:01] Viktor Farcic: Right. So that's one of the changes: you might not want to keep copying files all over the place, assuming that you have multiple environments, right?

[00:22:16] Adam Hawkins: Okay. I see. So if I understand what you're saying, imagine, say, a Helm-based deployment system. You might have a values file for each of the environments that you deploy to. If you get into this GitOps, more declarative state workflow, then instead of having multiple files where you say, hey, there's code plus configuration, and that means development or staging or production, what you're saying is it could make more sense to have, say, one repo that's just applications, whatever you want to call it, and inside that there's a branch for staging, a branch for production, or different repos or whatever. So instead of modeling your environments as different configuration files, you think of them as different branches or repositories that allow you to group all the different things that compose an environment. Is that what you're talking about?

[00:23:12] Viktor Farcic: Exactly. Exactly. So one environment is a repository, or a branch in a repository, that provides two things: the links to where the original definitions of the applications are, right, usually links to the repositories of those applications, and whatever you want to overwrite.

[00:23:32] Viktor Farcic: So those kinds of values-file variations I would probably not keep in the repository of the application itself, but in the repository that defines the environment. It's very powerful being able to say: hey, I go to this repository and I can find out everything I need to know about the desired state of my production, let's say, instead of going: hey, yeah, this file in the repository of application A, plus this file in the repository of application B, and, you know, you go running around like a chicken without a head until you figure out what it is.
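The "environment as one repository" idea can be pictured as a small piece of data per environment: a reference to each application's own repository, the tag to run, and any overrides. The Python sketch below is a hypothetical model, not an Argo CD format; the repository URLs, tags, and the `describe` and `tag_differences` helpers are made up for illustration.

```python
# Each environment lists the applications it is composed of. The application
# definitions themselves stay in the application repositories.
production = {
    "app-a": {"repo": "https://github.com/example/app-a.git",
              "tag": "1.4.2", "overrides": {"replicas": 5}},
    "app-b": {"repo": "https://github.com/example/app-b.git",
              "tag": "0.8.0", "overrides": {}},
}
staging = {
    "app-a": {"repo": "https://github.com/example/app-a.git",
              "tag": "1.5.0-rc1", "overrides": {"replicas": 1}},
    "app-b": {"repo": "https://github.com/example/app-b.git",
              "tag": "0.8.0", "overrides": {}},
}


def describe(env):
    """Answer 'what should be running here?' from a single place."""
    return [f"{name} @ {app['tag']}" for name, app in sorted(env.items())]


def tag_differences(a, b):
    """Map each application present in both environments to its differing tags."""
    return {
        name: (a[name]["tag"], b[name]["tag"])
        for name in sorted(a.keys() & b.keys())
        if a[name]["tag"] != b[name]["tag"]
    }


print(describe(production))
print(tag_differences(staging, production))
```

With this shape, the desired state of production is answered by reading one structure, and comparing staging to production is a simple diff, rather than a hunt through every application repository.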

[00:24:10] Adam Hawkins: If you could even figure out all the things that are actually a part of that environment, I think, which is a segue into the next set of topics I wanted to discuss with you.

[00:24:17] Adam Hawkins: But first I want to follow up on this point, which is one thing that I've always struggled with, and this is a problem that gets worse as you move towards more services and try to adopt things like microservices. Okay, yes, on one hand we have the ideal that teams should be able to work independently and have ownership and autonomy over their bit of code that they're responsible for. They can build it, deploy it, run it in production, all that. Now that's great for that team, purely concerned with that little slice of the application or product or service, however you want to define it.

[00:24:52] Adam Hawkins: And then there's the other set of people. These people tend to be like myself, operating either at a lower level in the platform or at a higher level in the service, who say: okay, well, now I have an environment that needs to have all these different applications or services, like service A, service B, and service C, so I can create an environment where I can test out the product or test out some infrastructure. And when you have, you know, 10 or 20 or a hundred different microservices, all spread out across X number of code bases, what actually represents a production environment? Whereas with the approach I think you're talking about here, you can just say: oh, it is the production folder in this repository, or this file that lists out the config or declares everything for all of the things that actually compose the environment. Which I think is probably one of the key differentiators or benefits of this type of approach: you can really reference all the different parts of your system in this one way.

[00:25:51] Viktor Farcic: It's like treating your production or staging or whatever as if it is an application. As if it's an uber application, right? And normal applications are not only the code of that application. They have libraries that they fetch from different places, and then they're parametrized with different values. And why not the same thing for an environment, right?

[00:26:14] Viktor Farcic: The environment is a collection of applications, just like an application is a collection of libraries plus custom code, right? It's a similar approach, I believe. And if you go back to the autonomy of those teams, in most cases that becomes irrelevant for them, because I don't want anybody to go in and start manually changing the production repository. And then you say: oh, I need to care about autonomy. No, I mean, that's all part of some process. You say: hey, I want to promote this release to production. Maybe I give you a button. Maybe I give you a script. No matter what I give you, that button, script, or whatever will make some changes to the Git repository, and Argo CD, or whatever else you might be using, will converge that change into the actual state.

[00:27:03] Viktor Farcic: So for most people that is transparent. That is simply something happening in the background. But for people who do care about what is in production, no, sorry, let me rephrase. Not what is in production. What is in production you see by looking at the cluster itself. But there are people who might want to know what should be in production. It sounds like a semantic difference, but actually it is a huge difference: what should be and what is.

[00:27:33] Adam Hawkins: Yeah. Okay. So one other question I have here: let's imagine you're working with this sort of GitOps workflow and you have this repository that's like, hey, this is what production is. What I'm imagining in that repository is some configuration of applications, which might be, you know, Docker images, Helm charts, whatever, and then some version. It could be a commit SHA, could be a semantic version, could be whatever. So let's say that I'm working on application A and I want to release application A. Given that we're talking about a source-code- or Git-driven workflow, I would go to this production repository, make a change to the version of application A that should be running, commit that, and then let the declarative stuff do its business, right?

[00:28:27] Viktor Farcic: Yes. Except that you shouldn't be doing that. You know, I think that, going back to the very beginning of this conversation, this is where our perception of the role of CD or CI or whatever-you-want-to-call-it pipelines is changing, right? You would have a pipeline that, so let's say a simplified workflow, right?

[00:28:50] Viktor Farcic: You create a pull request. When your pull request is created, it is deployed somewhere. You run some tests, whatever, you merge the pull request to master, and then it goes to production. Simplified version, right? Now, what that means from a process perspective, from a pipeline's perspective, is that when I create the pull request, it should build, it should do this or that. And at some moment, it should go to, let's call it, the pull-request environments repo. It should go there and add: hey, you should create a unique namespace, based on maybe the pull request ID or whatever, and deploy this application. Right.

[00:29:27] Viktor Farcic: It should make the change to the Git repository, and that's done by a pipeline, right? And then the pipeline would maybe receive a signal back from, let's say, Argo CD: hey, I'm finished. And then the pipeline continues and runs your tests or whatever. And then when you merge that pull request to the master branch, a similar pipeline would remove that pull-request-based application, because you don't need it anymore, but then it would go to your production repository and say: hey, you know that YAML file over there that, among other things, sets the tag Helm value? Change the tag to this. I know what the tag is. I just built it, right? There is no need for you to tell me what the tag is. So just pass that information along and make a change to Git.

[00:30:14] Viktor Farcic: Yeah. And that's not that different from what we were doing before, except that the pipeline is not touching your cluster. For the pipeline there is no cluster; there is a Git repository, right? And it just fetches information from there and changes information. Maybe it would create a pull request for the production repository instead of changing the tag directly, and wait for you to merge that pull request, right? There are different patterns.

[00:30:40] Adam Hawkins: So the takeaway for the listener is: if you're adopting this type of workflow, instead of learning to, you know, script and program around kubectl, it's time to get really good at programming with Git. Instead of making these changes directly with Git commits or doing it yourself, you will have a script that says: oh, open a PR, make a change to this file. This is where things like yq and jq and all of these text manipulation and parsing tools come into play because, as you said, humans need not do this and should not do this. It's all about making machines do all the work for us. So it's just a way of changing your thinking, in the sense that what you're going to be automating and putting into the pipeline is not state management or infrastructure management or cluster management. It's going to be committing changes to Git instead of making changes to infrastructure or something else.
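As a rough illustration of "programming with Git", the pipeline step that promotes a release boils down to editing one value in a file and committing it. The Python sketch below shows only the edit. It assumes a Helm-style values file with a single `tag:` key, which is a made-up layout, and it uses a regular expression purely to stay dependency-free; a real pipeline would more likely use a YAML-aware tool such as yq, then commit and push or open a pull request.

```python
import re


def set_image_tag(values_yaml: str, new_tag: str) -> str:
    """Replace the value of the first `tag:` key, keeping its indentation."""
    pattern = re.compile(r"^([ \t]*tag:[ \t]*).*$", re.MULTILINE)
    # A function is used as the replacement so that backslashes in the
    # tag are never interpreted as regex escapes.
    updated, count = pattern.subn(
        lambda match: match.group(1) + f'"{new_tag}"', values_yaml, count=1
    )
    if count == 0:
        raise ValueError("no 'tag:' key found")
    return updated


# Hypothetical contents of a file in the production repository.
values = """image:
  repository: example/app-a
  tag: "1.4.2"
replicas: 5
"""

print(set_image_tag(values, "1.5.0"))
```

The pipeline already knows the tag because it just built the image, so no human has to type it. The cluster is never contacted in this step; the only side effect would be a commit.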

[00:31:31] Viktor Farcic: Right. From my perspective, I think that people themselves also have trouble rewiring their brains to that declarative type of thinking, which is very strange, because we are declarative most of the time outside of the software industry. Like, if you go to a restaurant, right, you say: can I get a burger with cheese and bacon? Right. That's what you say. And now if we translated that experience to what we traditionally do in the software industry, we would say: hey, can you go outside, find a cow, cut the meat off that cow, let it fry for a while, and then, you know, add a bit of oil. We don't do that. In real life, we are declarative most of the time. We express our desires. We don't give commands, unless you're in the military, maybe. And it's that type of thinking. I'm just expressing what I want. Go figure it out.

[00:32:33] Adam Hawkins: Well, one other benefit from this approach, I think, is thinking async by default in terms of pipelines, because eventually something has to become async in that process. You do want a pipeline that says, I made a change to this application, deploy it to some pre-production environment. That's going to take who knows how much time. It could take five minutes. It could take an hour. You don't know. The point is it's going to take a long time, and it could fail or succeed for any number of reasons. And then you need to listen for that and move through the next stages of whatever kind of async thing could happen. But if you think in an async model of the pipeline by default, you open up a whole bunch of other use cases than if you assumed that everything was going to be synchronous by design.

[00:33:20] Victor Farcic: There is this recent story from within my company. You know, we do pipelines and some other things, but in this context, pipelines are what matters. We were discussing how we can marry pipelines and GitOps, right? And then some people, with the best possible intentions, came to the conclusion that, hey, we need to provide a mechanism for a pipeline to send the sync command to Argo CD and wait until Argo CD is finished doing whatever it's doing, so the pipeline can continue and maybe run tests only after your application is up and running.

[00:33:59] Victor Farcic: And that was the moment when I freaked out. No, you don't do that. You push to Git, we already established that, and then you do nothing. Then you wait until Argo CD comes back to you and says, I'm done. You don't wait and say, are you done? Are you done? Are you done? Done? Done? Yes. Okay. And then you continue. No, you're just creating a mechanism to listen to events. Everybody listens to somebody's events. Hey, it will tell you when it's done.
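The difference between asking "are you done?" in a loop and being told when the work is done can be sketched with a tiny in-process event bus. This is an illustrative Python sketch only: `EventBus`, the `sync-succeeded` event name, and the payload fields are hypothetical and do not represent Argo CD's actual notification mechanism.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List


class EventBus:
    """Tiny in-process stand-in for whatever delivers 'sync finished' events."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._handlers[event_type]:
            handler(payload)


test_runs: List[str] = []


def run_tests_after_sync(event: dict) -> None:
    # The next pipeline stage starts only when it is told the app is synced.
    test_runs.append(f"testing {event['app']} at {event['revision']}")


bus = EventBus()
bus.subscribe("sync-succeeded", run_tests_after_sync)

# The first pipeline ends with a push to Git; nothing waits or polls.
# Later, the deployment tool (simulated here) announces that it is done.
bus.publish("sync-succeeded", {"app": "my-app", "revision": "abc123"})
print(test_runs)
```

The first pipeline and the test stage never call each other. They are connected only by the event, which is what allows the time between push and sync to be five minutes or an hour without anything blocking.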

[00:34:28] Adam Hawkins: Yeah. It's like push and then pull, push and then pull. Sure, you can think conceptually of your pipeline from commit to production as one pipeline, but that need not be technically implemented as one pipeline. There could be any number of steps. Steps could execute in parallel, or not in parallel. They have a sequence. You know, there's all kinds of different sub-pipelines that give you everything you need to say that, hey, one change is ready to go to production, right?

[00:35:00] Victor Farcic: Correct.

[00:35:01] Adam Hawkins: Okay. So now let's move into the next topic. We got a little sidetracked there, but I think it was good stuff to mention. I want to bring this back to the three ways of DevOps and why we should care about technologies like this. What I talk about on Small Batches is the idea of improving velocity, increasing reliability, and building higher quality software, and doing all of that faster.

[00:35:29] Adam Hawkins: So the first way of DevOps, flow, is about optimizing to improve the speed from development to production. Right? So continuous delivery and automated deployments are all part of that.

[00:35:42] Adam Hawkins: And then the second way of DevOps is feedback, which is providing feedback from production back into development, right? So if you see that something is not working in production, you have the appropriate telemetry to notice that, either do an automated rollback or whatever, but learn from it, and then, you know, make sure that those same negative outcomes are not repeated.

[00:36:05] Adam Hawkins: And then we have the third way of DevOps, which is learning. If you can imagine the first two as sort of an inner feedback loop, so you get this nice cycle of going from development to production and then production back to development, then this is an outer feedback loop that is continually trying to learn and optimize all the processes that make all those other things happen.

[00:36:26] Adam Hawkins: So why should we care about these technologies or this way of working? Like how does it make a difference in those areas?

[00:36:36] Victor Farcic: So, I think that the second one is problematic, at least from a GitOps perspective. I'll get to that, but flow and learning to begin with. Enforcing the idea that everything is in Git is potentially, generally speaking, the easiest way we have today to enforce certain flows and facilitate learning, for a simple reason: Git might be the only tool we have in the industry today that nobody disputes, right?

[00:37:08] Victor Farcic: We can fight for a long time over whether to use, you know, this monitoring tool or that monitoring tool, this and that. But the only tool that I know of that nobody disputes, kind of like, hey, there is no discussion, is Git. The only tool that every single engineer uses, no matter whether we're talking about infrastructure, testing, writing Java, this and that, the one that everybody uses, unless you're a manager, is Git.

[00:37:34] Victor Farcic: Right? So Git becomes, in a way, that hub where information is collected one way or another. You know, we're even moving to READMEs instead of Confluence pages, right?

[00:37:49] Adam Hawkins: If only that could happen faster.

[00:37:54] Victor Farcic: I'm in a process of doing that within our company. So I'm not there yet.

[00:37:58] Victor Farcic: So I do think that, maybe I'm wrong, maybe I'm too technical and naive, but to me any flow that is based on Git will be relatively simple for people to adopt because they're using it no matter what. And that whole idea that everything is an open book, which is basically the code itself, stored in Git, might be the easiest way to facilitate learning, right? Because if I have some obscure tool that is doing something with some internals, if that's your tool that you're using as an infrastructure guy, and I'm a JavaScript developer, I'm most likely not going to bother. Why would I? It's not my tool. It has nothing to do with me. But everybody can go to a Git repo to find out information about something. And especially since we are now talking about a declarative format, which is easy for everybody to read. I might not be a Kubernetes ninja, I might not understand what a service mesh is, but seeing a change like, hey, I see that you changed something called tag from 57 to 58.

[00:39:20] Victor Farcic: Most likely you increased the release there. Right? You might not understand the details, but just by looking at YAML, normally you can understand what's going on, no matter whether that's a Kubernetes definition or, I dunno, a pipeline, or even Terraform, right? Terraform can be tricky for people to understand, but still, hey, you can somehow follow what the idea behind it is.
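To make the "tag from 57 to 58" point concrete, here is roughly what such a change looks like as a diff. This is a small Python sketch using the standard library's `difflib`; the file name and the YAML content are made up for illustration.

```python
import difflib

# Made-up before/after content of a values file in an environment repository.
before = 'image:\n  repository: example/app\n  tag: "57"\n'
after = 'image:\n  repository: example/app\n  tag: "58"\n'

diff_lines = list(difflib.unified_diff(
    before.splitlines(),
    after.splitlines(),
    fromfile="values.yaml (before)",
    tofile="values.yaml (after)",
    lineterm="",
))

# Keep only the removed and added lines, skipping the ---/+++ file headers.
changed = [
    line for line in diff_lines
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed))
```

Even a reader who has never touched Kubernetes can tell from those two lines that a release number went up by one.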

[00:39:46] Victor Farcic: And it tends to be relatively easy to make some minor changes, right? If you have a huge Terraform definition that creates 57 thousand different resources in AWS, because that's how much you need, you might not be able to write that yourself, but okay, you can follow it: it created a load balancer, you know, this and that. To me, code is the easiest way to learn. I might be naive.

[00:40:10] Adam Hawkins: It definitely solves one part of the problem, which is just making it visible, which is one of the hardest parts when it comes to facilitating learning and understanding of these complex systems that we all work with. It's like what you mentioned earlier in the conversation, which was: point to some person, let's say for the sake of argument here, pick a person in your company or team, and ask them to identify all the different source code files that compose the production environment.

[00:40:39] Adam Hawkins: See if they can do that. Right. How much do they get? How much are they missing? Like, what did you not know that they did? If you can see that all of that stuff is present in a Git repo, then you can at least start to grok the whole surface area. Right? You gain some awareness of the whole system, such that then maybe you can say, oh, this is this. Maybe now you can make a change that you weren't able to before because it was just invisible to you.

[00:41:06] Victor Farcic: And that's one of the reasons why I like that concept we discussed earlier about, let's say, having a repository for the production environment, where that repository contains production-specific values and links, in a way, to repositories that have the full definitions, let's say on an application level.

[00:41:26] Victor Farcic: Right? Because that tends to be just the level that is easy to understand, to get the picture. And then yes, if I want to know whether there is an ingress and a service and this and that, I kind of follow the links, almost like, you know, web pages: follow the link and go to that application if I'm really interested. But most likely I don't care. So those five applications are running there. Excellent. That's just the level of information I need at a glance, you know?
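One way to picture an environment repository that holds only production-specific values plus links to the full definitions is as a short list of entries. The Python sketch below is purely illustrative: the `AppEntry` fields, the application names, and the `git.example.com` URLs are assumptions, not a layout described in the episode.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AppEntry:
    name: str
    repo_url: str  # link to the repository holding the full definition
    tag: str       # production-specific value kept in the environment repo


# Hypothetical content of a production environment repository.
PRODUCTION: List[AppEntry] = [
    AppEntry("frontend", "https://git.example.com/acme/frontend", "58"),
    AppEntry("orders", "https://git.example.com/acme/orders", "12"),
]


def at_a_glance(entries: List[AppEntry]) -> List[str]:
    """Summarize what runs in the environment without any per-app detail."""
    return [f"{entry.name} @ {entry.tag}" for entry in entries]


print(at_a_glance(PRODUCTION))
```

Anyone who needs the ingress or service details follows `repo_url` into the application repository. Everyone else stops at the summary.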

[00:41:54] Adam Hawkins: Yeah, this also comes back to one other topic that's been coming up a lot on the podcast. I like it because I really haven't seen it discussed so much, but once I heard about it, I realized it's really important. It's the idea of cognitive load: you, as an engineer or as a person working in these complex systems, literally only have so much you can fit in your mind at one time. Some things require significant cognitive load. Say, imagine yourself as a CPU. If it takes, you know, 90% of your CPU cycles to just process whatever you need to do, then you have no room for anything else. You don't have room to learn new stuff or investigate things. So it comes back to this: you want to be able to operate at a high enough level of abstraction where you can do what you need to do without having to learn about everything that happens behind the scenes.

[00:42:49] Adam Hawkins: And that level of abstraction will be different for every single person in the team or value stream, wherever they are. You know, SREs like myself, like, yeah, I want to know if there's an ingress. I want to know all the infrastructure. I need to know all of these things. They're really important to the work that I do. But if I'm the person who's writing the CSS or building some, you know, small component in, say, a JavaScript front end or some small thing like that, I don't care about any of that stuff, nor should I.

[00:43:17] Adam Hawkins: I mean, that's like the exact opposite of the thing you want taking up your cognitive load. So the idea is, say, having everything in Git in some simple declarative format, and then, as you say, you can follow the links deeper and deeper and deeper into the system to gain whatever knowledge you need.

[00:43:38] Adam Hawkins: I think this point is almost self-evident in the fact that all of us who work with Kubernetes, we don't care what goes on behind the API. We just make the API calls and Kubernetes does whatever it does. What we care about is the end declared state, not how it gets there. If everybody had to concern themselves with how it got there, oh man, that's too much. It's just too much, you know?

[00:44:03] Victor Farcic: And it also solves, you know, having, in a way, code that acts as your documentation solves, in my head, an even bigger problem than the problem of people not learning. And that is the problem of learning something that is not true. Yeah, when you keep those Excel sheets and Word documents and wiki pages, and then you tell me, hey, go there and you will find the information that you need. And I spend an hour reading it, only to come back to you and say, I do not get it. Then you take a look and say, yeah, because it's obsolete. I don't believe in documentation, excluding end-user documentation. Right. Excluding that, I don't believe in documentation that is not code. It's not accurate. It's not up to date. Never. I have never seen a company or a team that keeps their wiki pages up to date all the time. Never.

[00:45:04] Adam Hawkins: Talk about a unicorn. Show me an example of a sufficiently complex system that has up-to-date documentation for the developers of that system. I have never seen it. It doesn't happen. I don't know about you, but I will happily take any number of good code comments over anything else, because like, when you said learning something that was not true.

[00:45:25] Adam Hawkins: I felt like I had been just kicked in the stomach because I just recently had gone through an experience where, you know, I'm like working on something. I'm trying to understand this. And I'm like, okay, this is like, I'm reading all this. Like, this should make sense. I'm doing it. Like this is not working. Why is this not working? You know? And the way that I think about it in my mind is like, when you're on the edge of a rabbit hole, you know, like you have to decide to take that like one step and tumble in or look, you know, stay on the outside and see if you can get somebody to help you.

[00:45:50] Adam Hawkins: It's really easy to tumble in, and you come out and you ask somebody, like, hey, why isn't this working? Like, oh, the documentation's wrong. You're like, oh God, you know, there goes all of this time and energy. And the worst part about it, though, is not even that. It's that it creates this sort of cognitive dissonance, like, hey, if I'm reading this, is it actually true? Whereas if you read the code, you're like, this is the code. I know that. But, you know, you see a Confluence page last updated a month ago, two months ago. It's like, I don't know. I don't know.

[00:46:25] Victor Farcic: Even in those cases when it's not code, right, there is always a case to write some text. But you know, if you have, let's say, a Helm value, X equals five, and you need to add a comment. If you put the comment above it, then when I change the behavior of that something, there is an infinitely higher chance that I will notice and change that comment right there than that I will go to some obscure wiki page in some random place.

[00:46:54] Victor Farcic: I mean, simply the statistical possibility of me updating a wiki page is lower, no matter how much I want to do that, even. Yeah. Simply, it already sits in my VS Code. It sits there, it's in the README, it's next to the code that I'm changing. I'm more likely to keep it updated.

[00:47:15] Adam Hawkins: You know, and the unfortunate fact of the matter is that those wiki pages tend to be in Confluence, which, I don't know about you, I don't like using. It's like, oh, I've got to go over there and use this editor. And you know, it's not Markdown, it's some other thing, it's in this other place. I didn't expect to be talking about documentation, but one thing that always bothered me about this is that you'll have a branch of code and no branch of documentation.

[00:47:38] Adam Hawkins: Like, if you're using Confluence, you know, how do you create a potential change to the documentation? You do it after, and then update the thing after the code is merged or whatever. How do you synchronize these two different versions of reality, these two different models of what truth actually is? This doesn't fit, fundamentally.

[00:47:59] Victor Farcic: It's always frustrated me. I'm much worse than you, probably, in that sense. I'm like a radical Taliban in that. So I joined a company called Trish and they asked me, can you write a blog post about this and that? Sure. Why not? Where do I write it? Well, you have your blog in WordPress. I go there and I see a Markdown plugin is not installed. No. I'm not writing anything. What do you mean? No, no, no, no, no, no. I just finished writing it. I created a repo for you. It's pushed over there. I can make an extra effort to copy and paste it into WordPress. I'm going to do that because I'm a nice guy, but I'm going to copy and paste my Markdown.

[00:48:44] Victor Farcic: Now either you're going to install the plugin for me, or people are going to read something that has hashes and asterisks all over the place.

[00:48:57] Adam Hawkins: Man, I haven't laughed like that for a while, but it's so true, because you get used to a certain level of quality or workflow that makes it easy enough to do what you need to do. And if you're a radical Taliban like you are, and like I am, about those things, then if there's a sufficient deviation from that, you will just say, nope, I'm not going to do that because it doesn't make sense. It's not right, you know, in quotes. There's a certain level of quality that you get from specific workflows. And if that changes, it's like, oh, I can't work like this.

[00:49:35] Adam Hawkins: You know, it's like, you'll go to a company and they don't have automated testing. Okay, well, you don't have automated testing? Well, I'll see you later. I'm not even going to bother with this. This is not even worth my time, you know? And I'm not the nice guy who's going to write your tests for you. I'm just going to peace out.

[00:49:50] Victor Farcic: Exactly. I think it's very important, at least if you're not an "I graduated yesterday" type of person, right? If you have certain experience, it is very important that we have certain expectations and demands when we join companies. Hey, I work like this. I can change, but you need to convince me that the change is for the better, right?

[00:50:13] Victor Farcic: I'm not going to change for the worse. Yeah, I'm all about change, but only for the better. And if you want me to, I could change to AsciiDoc.

[00:50:29] Adam Hawkins: Even Textile, we can go that way. It's fine. But just give me something like this. Okay. Nice little sidebar there. I always enjoy talking to people like that, especially when somebody says they're radical about something. I love to push on that and find out more about it because, like you say, you have to be a radical, you have to be committed to certain things. You have to have that type of principles to succeed.

[00:50:57] Adam Hawkins: Like if you're just going to be pointed in one direction because somebody says, you know, new technology, this or whatever, but without any kind of critical thinking or analysis as to why, or what this is actually going to improve for me, then you're just wasting your time. You know, I don't want to make lateral moves. I only want to make increases.

[00:51:19] Victor Farcic: I totally agree. But you know, if you don't have your opinions, if you don't have your style of working, if you can just be redirected, if I can just take you and say, move 37 degrees and go, then guess what? I don't need you.

[00:51:35] Victor Farcic: I can replace you with a script. Scripts are very good at doing exactly what I want. I don't need that. It's true. Guess what, scripts are cheaper than people. So how to fight? There is no incentive to use people who will do exactly what I want. I want people who will do things that I don't know that I want, you know, who bring something new to the table.

[00:51:58] Adam Hawkins: A little chaos engineering for your social aspects.

[00:52:02] Adam Hawkins: Okay. So why don't I ask you one last question before we go, regarding what we expect from workflows and what kind of tools we need to succeed. And we'll talk about these sort of deployment tools, not continuous delivery tools, but things that are actually about rolling out code, making changes to applications, doing these things. We need different deployment strategies, like blue-green, canaries, rolling deploys.

[00:52:27] Adam Hawkins: How, if at all, does Argo support that? Or, if not, how do you coordinate those types of things using this tool?

[00:52:36] Victor Farcic: So Argo does support that, because Argo will simply apply whatever is defined in Git, you know, a YAML file or this and that. And if you use canaries, you're probably going to use something like Argo Rollouts, a separate project, or maybe Flagger, where again everything is defined as YAML, right? If it's defined in YAML, Argo CD will apply the differences. But with canary deployments we are already getting to the subject of the difficulties that we are experiencing with GitOps.

[00:53:08] Victor Farcic: And that is this. If the desired state is in Git, and the job of tools like Argo CD is to make sure that the actual state is the same as the desired state, the problems start when your actual state starts changing on its own. Let's say canary deployments, right? You define it in Git and say, I want version 57. Argo CD applies a definition that happens to be, let's say, Argo Rollouts, which does canary deployments. It starts progressing: 10%, 20%, 30%. It is monitoring your metrics in Prometheus. And then at one moment it discovers, say, there is something wrong. The error rate is too high, for example. It gives it a couple more tries, and then it gives up and says, this is bad. I'm going to roll back to the previous version.

[00:53:57] Victor Farcic: So far, so good. That's exactly what we want. But what is happening from that moment on is that the desired state is actually not the state you want anymore, because your desired state still says version 57, but your canary deployment rolled back to 56. Whatever you continue doing in Git is likely going to mess that up, unless you're extremely fast to correct that thing in Git. You have that drift: the desired state is not what you want, and the actual state is something else.
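The drift Viktor describes, where Git still says 57 while the cluster has quietly gone back to 56, amounts to a comparison between two maps. The Python sketch below is a simplified illustration: `find_drift` and the shape of the two dictionaries are assumptions, and real tools compare full manifests rather than a single version string per application.

```python
from typing import Dict, Optional, Tuple


def find_drift(
    desired: Dict[str, str], actual: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each app whose running version differs from Git to (desired, actual)."""
    drift: Dict[str, Tuple[str, Optional[str]]] = {}
    for app, want in desired.items():
        have = actual.get(app)  # None means the app is not running at all
        if have != want:
            drift[app] = (want, have)
    return drift


# Hypothetical state: Git asks for 57, the canary tool rolled back to 56.
desired_in_git = {"my-app": "57", "other-app": "3"}
running_in_cluster = {"my-app": "56", "other-app": "3"}
print(find_drift(desired_in_git, running_in_cluster))
```

Detecting the difference is the easy part. The open question in the conversation is which side should win once the difference was created deliberately by a rollback.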

[00:54:32] Victor Farcic: The same thing can be said for, you know, scaling up and down. There are many, many things happening today in a Kubernetes cluster that are simply happening, right? Without you even knowing that it's happening. If you're not really watching the process and you don't have notifications, even you as a person will not know that it rolled back.

[00:54:56] Victor Farcic: All right. And now, you're a clever guy. You will set up some notifications, so you will receive the thing: hey, I rolled back. And you say, yeah, that's what you should do. But the whole concept of desired and actual state falls apart. I mean, it's a strong word to say falls apart, but that's one of the many things that are unsolved in GitOps. Unsolved without you tweaking the tools that you're using. Of course you can write the code that fixes everything. Everything can be done. Right. But from a tooling perspective, that's one of the many things that we haven't solved, just like observability. Right. I don't have really good ways to observe the desired state.

[00:55:39] Victor Farcic: Yes, I could go to Git, and that's great because I can see all the details, but sometimes I just want, okay, give me all the issues that were resolved between the current versions of all the applications in staging and the current versions in production. I want to know the drift between staging and production before I promote things from staging to production. Tell me that difference, what it means in terms of maybe issues, maybe all the differences in Git, but without me going through 57 pull requests and so on and so forth. I'm just mentioning two of the things, and there are many, many. I'm just a pessimist by nature.

[00:56:19] Adam Hawkins: Ah, okay. So then I just started thinking about how you would do canaries or how you do blue-green. You mentioning the diff between the declared state and the actual state, and then, you know, externalities that change the current state and create the diff, got me thinking. I'm like, well, if you want a canary, then make a commit that creates the canary and just let it run. These automated actions are not done below Git. They have to be done at the Git level to maintain that kind of consistency. So instead of having something that did an automated rollback of the canary, or terminated the Kubernetes deployment, you know, you go revert the commit that made the canary or whatever, and then the canary would go away, you know, via Argo, or whatever this sort of state manager type thing is.

[00:57:08] Adam Hawkins: So instead of thinking below the level that should coordinate this whole process, you have to break each non-atomic unit down into individual commits that can be done, you know, step by step, in the way that you would think about continuous delivery. When you think of database migrations, you have to do them in, you know, a stepwise fashion, where you add a new column, migrate all the code to use the new column, and then, when you're all done, eventually remove the old column. That's like a canary. You might think of it as one whole process, but there's, you know, three or four distinct steps in that process, which could be mapped to commits. The same thing for blue-green: one commit to create blue, one commit to create green, one commit to swap them, or, you know, whatever. You just have to think of it at that level.
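Adam's idea of mapping a blue-green rollout onto a series of small commits can be sketched as a list of state transitions, each standing in for one commit. This Python sketch is a toy model under stated assumptions: the step functions, the `live` key, and the version numbers are invented for illustration, and it assumes blue already exists when the rollout starts.

```python
from typing import Callable, Dict, List

State = Dict[str, str]


def create_green(state: State) -> State:
    # Commit 1: declare the new version alongside the old one.
    return {**state, "green": "58"}


def switch_traffic(state: State) -> State:
    # Commit 2: point live traffic at green.
    return {**state, "live": "green"}


def remove_blue(state: State) -> State:
    # Commit 3: drop the old version once green has proven itself.
    return {key: value for key, value in state.items() if key != "blue"}


COMMITS: List[Callable[[State], State]] = [create_green, switch_traffic, remove_blue]

state: State = {"blue": "57", "live": "blue"}
history: List[State] = [state]
for commit in COMMITS:
    state = commit(state)
    history.append(state)

for step in history:
    print(step)
```

Because every intermediate state is kept in `history`, undoing a step is the equivalent of a Git revert: the desired state simply becomes the previous entry again.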

[00:57:53] Adam Hawkins: Do you think that if you think in that way, then those sorts of inconsistencies or, quote, problems become less so, or at least more accurate in terms of the modeling?

[00:58:06] Victor Farcic: Yeah, it's mostly about tools adopting GitOps principles, right. And one thing that kind of makes me nervous is whether they adopt them for real, not only from a marketing perspective, like, you know, if you go to the homepage of Flagger, which is a tool I really, really like, it says GitOps, and then it just rolls back by itself. So I think that hopefully we will be getting there, that those tools are becoming Git-aware, and then Flagger or Argo Rollouts, when they decide to roll back, hey, push a commit, or send a notification to a tool that will push the commit. And if it needs to be done fast and you don't want to wait for the system to synchronize, hey, it's okay to do a workaround. You can roll back immediately. I'm not even opposed to a canary deployment type of tool rolling back just like that. A millisecond before you roll back yourself, send a push to Git. Just do that a millisecond before. It doesn't have to rely on Argo CD to apply the rollback itself. You can even do it yourself. But it really, really depends on how popular GitOps becomes. You know, it's always about how popular it is. If it's popular, then open source projects are created. People are interested in it. And then commercial companies are going to build commercial software on top of it. It always depends on how popular something is. I believe that GitOps is becoming a big thing.
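The "Git-aware" rollback Viktor hopes for could be as small as the tool recording its decision in the desired state at the moment it rolls back. The Python sketch below only models that bookkeeping: `record_rollback` is a hypothetical helper, the commit message format is invented, and neither Flagger nor Argo Rollouts is claimed to work this way.

```python
from typing import Dict, Tuple


def record_rollback(
    desired: Dict[str, str], app: str, rolled_back_to: str
) -> Tuple[Dict[str, str], str]:
    """Return the corrected desired state and a commit message describing it."""
    updated = dict(desired)  # leave the caller's copy untouched
    previous = updated.get(app)
    updated[app] = rolled_back_to
    message = (
        f"Roll back {app} from {previous} to {rolled_back_to} "
        "(automated canary rollback)"
    )
    return updated, message


# Hypothetical situation: Git asks for 57, the canary failed, 56 is running.
desired = {"my-app": "57"}
new_desired, commit_message = record_rollback(desired, "my-app", "56")
print(new_desired)
print(commit_message)
```

Whether the tool pushes that commit itself or notifies another tool to do it, the effect is the same: Git again describes what is actually running, so the next sync does not redeploy the failed version.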

[00:59:46] Adam Hawkins: Hmm. Well, that's something we'll have to watch out for in the next year or two and see what happens.

[00:59:52] Adam Hawkins: Well, Victor, it was a lot of fun talking to you. A lot of information exchanged here and definitely some good laughs, and I'm always happy to meet another member of the so-called radical Taliban.

[01:00:02] Adam Hawkins: Is there anything you would like to leave listeners with before we go?

[01:00:06] Victor Farcic: Go to codefresh.io, that's my shameless plug.

[01:00:09] Victor Farcic: So shameless plug, go to codefresh.io and try all this stuff out.

[01:00:16] Adam Hawkins: All right, Victor. Well, thank you so much for coming on the show and keep in touch and I'll have you back on again some time.

[01:00:22] Adam Hawkins: Definitely.

[01:00:24] Adam Hawkins: That wraps up this batch. Visit smallbatches.fm for the show notes.

[01:00:28] Adam Hawkins: Also find Small Batches FM on Twitter and leave your comments in the thread for this episode. More importantly, subscribe to this podcast for more episodes just like this one. If you enjoyed this episode, then tweet it or post it to your team Slack or rate this show on iTunes. It all supports the show and helps me produce more small batches.

[01:00:48] Adam Hawkins: Well, I hope to have you back again for the next episode. So until then, happy shipping.

[01:00:57] Adam Hawkins: Are you feeling stuck trying to level up your skills deploying software? Then apply for my software delivery dojo. My dojo is a four week program designed to level up your skills building, deploying, and operating production systems. Each week, participants will go through theoretical and practical exercises led by me, designed to hone the skills needed for continuous delivery.

[01:01:19] Adam Hawkins: I'm offering this dojo at an amazingly affordable price to Small Batches listeners. Spots are limited, so apply now at softwaredeliverydojo.com. Like the sound of Small Batches? This episode was produced by Podsworth Media. That's podsworth.com.

Creators and Guests

Victor Farcic
Guest
Developer Advocate and of The DevOps Toolkit series