Saltside Chronicles with Sebastian Dahlgren

Sebastian Dahlgren & Adam Hawkins continue the Saltside Chronicles by discussing the rewrite's long tail impacts on Saltside.

[00:00:00] Hello and welcome. I'm your host, Adam Hawkins. In each episode I present a small batch, with theory and practices behind building a high velocity software organization. Topics include DevOps, lean software architecture, continuous delivery and conversations with industry leaders. Now let's begin today's episode.

[00:00:26] Welcome to 2021. Let's start off with a small update on the show. I've got some amazing episodes in the pipeline for you. I've recorded an episode with Matthew Skelton. Matthew is a co-author of the Team Topologies book. It's a book I've covered numerous times on the podcast, so I'm looking forward to sharing my conversation with him with all of you.

[00:00:48] There's also my first ever combo interview, with Jeffrey Fredrick and Douglas Squirrel. They're the co-authors of the book Agile Conversations. It's a really interesting book about the human challenges of building organizations. And if you've listened to this podcast, or maybe heard me speak to some people, you'll hear me say that

[00:01:09] I don't think that the tech is the problem in the systems. It's usually the humans who are the problem. So their book speaks to how to work on the human side of building software organizations. It's a really fun book, and I'm looking forward to sharing that conversation with all of you too. And I've also recorded a conversation with Dr. Steven Spear. He's the author of the book The High-Velocity Edge. This is a really great book that I wish I had read earlier. You know, it's mentioned a bunch in The DevOps Handbook as the inspiration for the third way of DevOps. I liked this book so much that I put together a solo episode.

[00:01:52] I just didn't think it would do justice to throw you into a conversation between me and Dr. Spear without some context about the book and the ideas. So I will release this solo episode first to set the context for my conversation with Dr. Spear. I'm really stoked to share this one with you. I also have another episode with Carmen DeArdo from Tasktop on value streams. This episode is a direct response to a listener request on the call-in line, which reminds me that there's also a podcast listener request line. So call +1 833-933-1912 and leave your request in a voicemail. Preference goes to listener requests. So call in, tell me what you want, and you'll probably hear it on the show.

[00:02:43] There's also an episode with Bryan Finster on development environments in microservice architecture. This seems to be a well that I can just keep going back to. Perhaps this is just one of my hobby horses, but hey, it's an important topic. And let's see, there's another episode with Markus Schirp that I really need to get out of the backlog and into production.

[00:03:06] Markus is a really smart guy and the brains behind the Ruby mutation testing project named mutant. He's a wicked sharp and principled software developer. We talk about a bunch of different things, so look forward to that one. And that's just what's already been recorded. There's even more already scheduled. So stay tuned for updates, and hopefully there'll be some big names coming to Small Batches in 2021.

[00:03:32] So watch out. All right, so let's get on with the first episode of 2021. I hope you enjoyed all of the Saltside Chronicles. Those episodes were a challenge to produce, but I think it's a great story to revisit after all these years and completely relevant to what we talk about on the podcast. I've learned a lot since then.

[00:03:52] And of course my perspective has changed completely since going through that rewrite. Now, hopefully that story connected the relationship between velocity, quality, and tech debt for you. Plus, I just love Carmen DeArdo's framing of revenue protection versus revenue generation. It just cuts straight to the core of the problem.

[00:04:11] Anyway, let's talk about the rewrite's goals for a second. One of the rewrite's goals was to create an architecture that would support Saltside well into the future. That meant not painting ourselves into any architectural corners. In practice, that meant setting the boundaries between teams, applications, and infrastructure. With those boundaries in place,

[00:04:31] each part of the system could change independently, and that would provide the needed business agility to react to market conditions, or grow or even shrink the team. Now, I saw this play out firsthand after Saltside opened up their new development office in Bangalore. You know, the team grew and their product changed, but the architecture kept up.

[00:04:51] In fact, it actually accelerated some of the things we were able to do. So that was just really cool to see, but, you know, it was almost expected. Eventually, after a couple of years, I left the company. That's where we pick up today's episode, to see the long tail effects of the rewrite at Saltside. Today, I'm speaking with Sebastian Dahlgren. Sebastian is the CTO at Saltside and an open source contributor with broad experience in DevOps and engineering within e-commerce, classifieds, and recommendation engines. Sebastian, like myself, has a strong passion for Asia and technology in emerging markets.

[00:05:30] So I interviewed Sebastian during the tail end of the rewrite. He was sort of hired on to work directly with me. At least that was the idea initially. You know, he joined shortly after we completed the rewrite and just before Saltside opened the office in Bangalore, which coincidentally is a really interesting time in that company's history.

[00:05:50] But anyway, he quickly shifted into a manager and engineering leadership role and eventually went on to become the CTO at Saltside. Sebastian and I worked closely together while we were both at Saltside. He's continued to work there for a couple of years after I left. So he's observed firsthand how the choices made during the rewrite have helped or hindered Saltside

[00:06:17] today. Today, I give you a conversation five years in the making. Sebastian and I discuss the long-term impacts of the rewrite on Saltside today. It's a fun look into validating some of the hypotheses that drove the rewrite. But more importantly, I think it closes the loop between the work everyone did in 2015 and Saltside today.

[00:06:40] So enjoy a special bonus edition of the Saltside Chronicles with Sebastian Dahlgren.

[00:06:50] Adam Hawkins: Sebastian, welcome to the show. How are you doing?

[00:06:53] Sebastian Dahlgren: Very good. Thank you. Thank you for having me.

[00:06:54] Adam Hawkins: My pleasure. I think you and I are going to have one of the rarest and just the most entertaining and fun conversations I can think of, which is, you know, being able to talk about the long-term effects of architecture: how it impacts teams and individual engineers, what they can do in their day-to-day work, and how it relates to the success of the business.

[00:07:19] Adam Hawkins: And, you know, we're coming at this from the perspective of, like, hey, I worked at Saltside until like 2017, I think. And then you stayed on and eventually became the CTO. So this whole time that I've been away, after the rewrite, you were there. So you've seen it grow and change: how the business has changed, how the organization has changed, and how the architecture put in place in the rewrite has worked or hasn't, from actual firsthand accounts of what it's like to work in this company. So maybe let's start there. How did Saltside grow, or not, or change since that point?

[00:08:02] Sebastian Dahlgren: There are so many levels to that, right? I mean, the company has grown in the sense that we have more users.

[00:08:10] Sebastian Dahlgren: Now, we still have the three markets that we've been talking about previously on this show, right? So we're still in Sri Lanka, Bangladesh, and Ghana. And I think one of the things that the rewrite was meant to deliver on was the possibility to quickly move into another market. That we haven't really done. I mean, we had a short venture in Nigeria, but that market is closed down again. But the thing that I suppose we will focus on is the engineering aspects of this. The architecture is overall still there. It's evolved quite a bit, especially since we've seen new business needs and also more traffic, which has surfaced a few problems with the architecture that we've had to revisit, basically.

[00:09:02] Adam Hawkins: Yeah. So let me ask you some simple questions then. One of the goals of the architecture for the rewrite was to allow Saltside to exist, like grow and scale, you know, five to 10 years in the future. Do you think that architecture has succeeded in that?

[00:09:23] Sebastian Dahlgren: Yeah. The short answer is yes. I mean, the proof is that we're still around.

[00:09:28] Adam Hawkins: Right. And you had the first profitable year at some point. Right?

[00:09:33] Sebastian Dahlgren: Exactly. Exactly. And we've reached profitability a few times over, so it certainly has delivered on that, right? And it has also delivered in the sense that it's been able to scale to more and more users, but also scale in the sense that we can sort of effortlessly add more features and expand the architecture. So it's future-friendly in that sense.

[00:10:03] Adam Hawkins: Yeah, that's good news. So that's the technical side. And I think where we're picking up the story is this: the Saltside Chronicles go up to Saltside in Sweden after the rewrite, but there's another chapter of this story, which is that the company opened up an office in Bangalore in India and grew the engineering team quite a bit.

[00:10:24] Adam Hawkins: And then eventually there was only the engineering team in India. And the way that we were organized during the rewrite is not the way that we were organized when I left. When I left, we were quasi-organized around features or some sort of more product stuff, as opposed to being just pure technical, like backend, web, or mobile.

[00:10:45] Adam Hawkins: So I thought there was always a mismatch with the architecture we created in the rewrite, because that mapped to the team structure that we had when we did the rewrite, and then the team structure changed after that. And we had this mismatch of who owns what services, and the product features didn't map clearly onto the architecture.

[00:11:04] Adam Hawkins: So, you know, we were able to grow, of course. Both you and I were working at Saltside during this time when we were actively working on growing the engineering team in India. You know, we were able to onboard new engineers, get them going, launch new services, maintain new things. And I think one of the last things I remember was the introduction of a notification service or something like this, sort of collapsing some of the services that were already there.

[00:11:30] Adam Hawkins: So how has the team structure changed on top of this software architecture? Are they kind of more in sync, or what is the state of that at the moment?

[00:11:40] Sebastian Dahlgren: That's a fine question, actually. So we've been through so many iterations of team structure and who owns what, and also how they respond to different needs in the organization.

[00:11:54] Sebastian Dahlgren: So we've had teams that worked solely for a specific market and, you know, their features. We've had, like you said, backend-only teams and then mobile teams. Where we're at right now is similar to what you described, right? We have what we call cross-functional teams, which are teams where we have backend engineers, mobile engineers, and web engineers all together.

[00:12:19] Sebastian Dahlgren: And the idea is obviously that they can deliver a feature independently, and independently in the sense that they don't really need other teams to help ship this feature. That doesn't necessarily map great to the architecture, I guess, but it works pretty well. I mean, we're a small team of two engineering teams and a data team.

[00:12:42] Sebastian Dahlgren: So, yeah, everyone sort of owns everything. The backend engineers in each of the different teams work on features, but they also collaborate together in excellence groups. And they sort of together collectively own the backend.

[00:13:03] Adam Hawkins: Yeah. So that was one challenge that we had after we moved away from technical teams to some sort of product or cross-functional teams, which was, you know, we as backend engineers built all the services that compose the APIs. Like, you know, all of us were used to building applications, deploying applications, running them in production. I mean, this was sort of like our bread and butter.

[00:13:25] Adam Hawkins: That's what we did. But when the organization moved to cross-functional teams, then we had this gap where you have supposedly like a team who can, you know, own one kind of whole vertical or some area, but they don't have the backend skills or the web skills, the mobile skills to kind of pull on all the different levers required in the different pieces of the architecture to actually deliver the value that their stated reason to exist is to do that.

[00:13:54] Adam Hawkins: Right. So have you considered collapsing some of the services, or sort of changing some of the underlying technical architecture to make it easier to map on? Or is that not really so much of a problem now, because people have the skills required to pull all the levers?

[00:14:19] Sebastian Dahlgren: Yeah, I'm leaning towards the latter. We are not looking to collapse any services. Quite the contrary, actually. One of the challenges we have with the architecture that was set during the rewrite is that we have what is called the core service. And to give some context, right, you can tell from the name that it's a rather important service in the architecture.

[00:14:42] Sebastian Dahlgren: And basically what it does is it owns many of the key concepts and the key objects in the system: ads, which are obviously super important to us, right? And accounts; shops, which are essentially websites for customers; membership logic as to who has subscribed to a certain membership; and so forth.

[00:15:08] Sebastian Dahlgren: This is all compiled into that core service, which makes that service more and more difficult to work with. And what we're actually looking to do, and have been looking to do for an embarrassingly long time, is to break out some of these components into their own services so that we can more easily maintain dependencies,

[00:15:32] Sebastian Dahlgren: change programming languages if we want to, and so forth. So we're not looking to collapse anything, and I don't really see a big problem here in terms of mapping the teams to the architecture. But certainly there's something to it, right? I mean, it would be great to have an appointed person to talk to when there is a problem with a certain service, right?

[00:15:59] Sebastian Dahlgren: And instead, the answer to those discussions is that we discuss them as a team. Some people have a lot of ideas as to how a certain service should be functioning. It could be because they wrote it in the first place, or they have been working on it recently and therefore have more opinions as to what is good and bad and what needs to change in that service. So it sort of works out. It probably won't scale in the long run. I mean, if we were to double or triple the number of engineers, then we would most likely need to have some sort of ownership.

[00:16:36] Adam Hawkins: That's one of my favorite questions to ask. I like to consider what will happen in this system if we double the number of people or double the number of teams, because there are technical abstractions that hold true at a certain scale of, like, requests per second or whatever compute resources, but then there's the human aspect.

[00:16:55] Adam Hawkins: Right. And when we were doing the rewrite, this was something that, you know, Peter and I talked about a lot. All of us, actually, you know, Daniel, Volley, and everyone. We talked a lot about what would happen if we had double or triple the engineers, because it was all assumed that we're doing this so that we never, ever have to do anything like this

[00:17:20] Adam Hawkins: again. Like, nobody wants to do this, right? So if we're going to do it, let's make sure that we really don't paint ourselves into a corner for anything that we can, maybe not predict with certainty, but say with some degree of confidence could happen. So I'm curious, do you feel that the architecture painted the engineers or the business into any corners? Or is it flexible enough that, given some changed requirements, some change in, you know, whatever, there are ways it can be adapted or evolved?

[00:17:56] Sebastian Dahlgren: Yeah. Okay. So regarding your question about being painted into an architectural corner, right? I think there have been some areas where we've felt that, but the overarching answer is that the platform is flexible and future-friendly enough for us to add new things. But just to take an example of something that has not worked well: how we use Thrift.

[00:18:22] Adam Hawkins: I'm really curious what you have to say about this. So please continue.

[00:18:25] Sebastian Dahlgren: Yeah. So I have a lot to say about Thrift. Maybe let me start off with a problem. One of the problems we've found is not with Thrift itself, but with the way we have used Thrift, which is that we serialize some of the objects and store them in the database

[00:18:42] Adam Hawkins: as serialized, as thrift structures in stored in the database.

[00:18:48] Sebastian Dahlgren: And now we're looking to shift over from Thrift to Protobuf, and that poses sort of an issue: the data is rather difficult to work with and migrate away from in serialized form in the database. Maybe stepping back a bit and talking about Thrift: one of the really, really cool things, and one that I have really appreciated over a long period of time at Saltside, is Thrift. Thrift to me has been great because, you know, I'm working with multiple services at the same time, as we all are, right? And us always understanding what an ad is or what an account is in these services is super helpful, because you have the same objects to relate to even though you're working on multiple services.

[00:19:44] Sebastian Dahlgren: Right. So everyone knows what it is. We have a clear definition as to what it is. And the other very important part here is that we're not super good at documentation, but when it comes to the Thrift files, we are very rigid about documenting the object and the properties of the object. And since you also define the services in Thrift, and the RPCs, we do the same there:

[00:20:10] Sebastian Dahlgren: define the intent of the service and the intent of the different RPCs. And that is powerful documentation, because everyone has the same idea of what is what and why it's there. And you also obviously have that in Git, so you can see some history as to what has changed and why. And we have been doing a lot of good things with Thrift, but we've also made a number of mistakes.

[00:20:37] Sebastian Dahlgren: And one of those mistakes has to do with Thrift supporting properties defined as required or optional. When you define something as required, it's required forever. So if you want to change the type of something or deprecate it, then you're boxed in, right? You can't really do that in a good way. So what we've turned to doing is to always keep everything, or almost everything, as optional, because that helps in our architecture quite a lot.

[00:21:13] Sebastian Dahlgren: What that does, though, is that it moves away from that great feature of Thrift where you get validation. I mean, now we need to validate the properties in each service to make sure that the properties we want are actually there. That was previously something which was built into Thrift. But if you look at the evolution of Protobuf, they moved from a concept where they had required and optional fields to a version of Protobuf where you can't even define required fields. And I think that, to me, says something about what problems required fields can cause in the ecosystem as a whole, I guess.
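To make that trade-off concrete, here is a minimal sketch in Python rather than Thrift IDL. It assumes a record where every field is optional at the schema level, so each consuming service has to check for the fields it needs. The `Ad` record, its field names, and the `require_fields` helper are hypothetical and are not taken from Saltside's code.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Ad:
    # Every field is optional at the schema level, mirroring the
    # "keep almost everything optional" approach. Field names are made up.
    id: Optional[str] = None
    title: Optional[str] = None
    price_cents: Optional[int] = None


def require_fields(record, names):
    """Return the names from `names` that are unset (None) on `record`."""
    known = {f.name for f in fields(record)}
    unknown = [n for n in names if n not in known]
    if unknown:
        raise ValueError(f"unknown fields: {unknown}")
    return [n for n in names if getattr(record, n) is None]


ad = Ad(id="a1", title="Bike")

# A hypothetical search service only needs id and title.
missing_for_search = require_fields(ad, ["id", "title"])

# A hypothetical billing service also needs a price, so the same ad fails there.
missing_for_billing = require_fields(ad, ["id", "price_cents"])
```

The same record is valid for one consumer and invalid for another, which is why the check moves out of the shared schema and into each service.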

[00:21:53] Adam Hawkins: Yeah. So for the listener, I want to give some more context on Thrift, specifically when we're talking about it at Saltside. I spoke a little bit about this in the Saltside Chronicles, but the reason why we chose Thrift was that we didn't want to deal with implementing clients and implementing servers. All we wanted was, you know, a statically defined interface, request and response structures, with some sort of rudimentary validation baked in, such as required fields and types, so that you could not send junk across the wire one way or the other.

[00:22:31] Adam Hawkins: And what we also did with Thrift is something I'm really curious what you think about after all these years, Sebastian. Because of the nature of our system, with the core service, the Thrift stuff was really a common interface across all the services. So, because we had these core concepts, like ad and user, we did not want every single service defining its own version of what those concepts actually were.

[00:22:57] Adam Hawkins: So we decided to say: there is a common definition of ad and account and these core concepts, and all the different services will interchange these concepts. And they could decorate them in their own domain context as to what matters. So we had a repository we called platform thrift files, which contained all the Thrift IDLs for all the structs and all the services. It was all there. Any service that was going to consume any of those definitions could just clone that repo and use it. It was all there.

[00:23:30] Adam Hawkins: So I think what Sebastian is talking about here with the required fields is, yeah, we made some stuff required, because it was required at one point in time, but, you know, requirements change. If you're unfamiliar with Thrift, I mean, it's like a database schema in the sense that you declare statically defined things and they have to be there or not.

[00:23:50] Adam Hawkins: So if you consider what you have to do to evolve a database schema over time, you have to think about some of this Thrift stuff in the same way. And one of the other things that we did, which I always thought was going to be a trade-off, was this: I didn't want to spend time doing a bunch of marshaling and unmarshaling between different data formats, especially if the service was only really speaking Thrift. But I needed to store some stuff. So what we did was just dump the serialized Thrift struct in a database somewhere, read it back from the database, deserialize it in the application, and get the entity back. And now we can do some stuff with it. That was sort of a, since we already have this interface for these structs and all this stuff, why are we going to write another layer for storage? We already have this. So that's some of the context and background for this one.
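The storage pattern described here can be sketched roughly as follows. This is an illustration under stated assumptions, not Saltside's implementation: JSON stands in for Thrift's serialization, SQLite stands in for the real database, and the `Account` entity and table layout are hypothetical.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass


@dataclass
class Account:
    # Hypothetical entity; the real Thrift structs are not shown in the episode.
    id: str
    email: str


def serialize(account: Account) -> bytes:
    # Stand-in for Thrift serialization: JSON encoded as bytes.
    return json.dumps(asdict(account)).encode("utf-8")


def deserialize(blob: bytes) -> Account:
    return Account(**json.loads(blob.decode("utf-8")))


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE accounts (id TEXT PRIMARY KEY, body BLOB)")

original = Account(id="acc-1", email="user@example.com")
db.execute("INSERT INTO accounts VALUES (?, ?)", (original.id, serialize(original)))

# Reading the entity back means decoding the blob in the application;
# the database itself only sees opaque bytes in the `body` column.
(blob,) = db.execute("SELECT body FROM accounts WHERE id = ?", ("acc-1",)).fetchone()
restored = deserialize(blob)
```

Because every row holds the serialized format itself, changing that format means reading, decoding, and re-encoding each stored row, which is the migration difficulty Sebastian mentions earlier in the conversation.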

[00:24:47] Adam Hawkins: So, yeah, sorry. I just wanted to make sure listeners got that. But where were we? I think you wanted to talk a little bit more about Thrift, right?

[00:24:56] Sebastian Dahlgren: Yeah. I mean, so Thrift has been, as I said, right, super helpful for us in documenting and getting every engineer on the same page when we talk between different services.

[00:25:08] Sebastian Dahlgren: And one of the problems that we have is that Thrift has a couple of components to it. It has the serialization and deserialization, the data format, but it also has a transport layer. So you can send the Thrift messages between the different services, the clients and the servers that Adam spoke about, over different types of transport mechanisms.

[00:25:34] Sebastian Dahlgren: So it could be HTTP or TCP and so forth. And that's an area where we have run into a number of problems. I think in one of the previous episodes you've spoken about our search service, which is essentially an Elasticsearch front-end service that constructs the queries we execute against Elasticsearch. And that service is a Thrift service.

[00:26:06] Sebastian Dahlgren: It has something like three RPCs, but we've had so many different types of transport-related problems in that service. And we've invested enormous amounts of engineering time in understanding what those issues were and why they occurred. And I'm not just talking about investing time in networking and so forth.

[00:26:26] Sebastian Dahlgren: We've also been diving deep into the actual Thrift source code.

[00:26:38] Sebastian Dahlgren: We did a number of fixes. They improved a few things, but never really solved the issue. And what we tried then was to switch the transport method from one to the other. So we switched over to the HTTP version of the Node transport for Thrift. That too had the same issue, or

[00:26:59] Sebastian Dahlgren: came with a different set of issues, I guess. So what we've done now, actually, and this works a little better, is that we use Thrift for serialization and the protocol, but we use a very standard Go HTTP server and HTTP client to transport this serialized data. And with that we've moved away from most of these different types of timeout issues that we saw previously with the transport.
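The split Sebastian describes, one layer for serialization and a plain HTTP server and client for transport, can be sketched like this. It is an illustration only: it uses Python's standard library instead of Go, JSON stands in for Thrift serialization, and the `/search` path, the message shape, and the `call` helper are hypothetical.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


def encode(message: dict) -> bytes:
    # Stand-in for Thrift serialization.
    return json.dumps(message).encode("utf-8")


def decode(payload: bytes) -> dict:
    return json.loads(payload.decode("utf-8"))


class SearchHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers["Content-Length"])
        request = decode(self.rfile.read(length))
        # Hypothetical RPC: echo the query back with an empty result list.
        body = encode({"query": request["query"], "hits": []})
        self.send_response(200)
        self.send_header("Content-Type", "application/octet-stream")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def call(url: str, message: dict, timeout: float = 2.0) -> dict:
    # Plain HTTP client; an empty ProxyHandler ignores proxy environment
    # variables so the local call goes straight to the server.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    req = urllib.request.Request(url, data=encode(message), method="POST")
    with opener.open(req, timeout=timeout) as resp:
        return decode(resp.read())


server = HTTPServer(("127.0.0.1", 0), SearchHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()
response = call(f"http://127.0.0.1:{server.server_port}/search", {"query": "bike"})
server.shutdown()
server.server_close()
```

The point of the split is that timeouts, connection handling, and retries become ordinary HTTP concerns, while the serialization layer only has to turn messages into bytes and back.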

[00:27:28] Adam Hawkins: All right, let's take a quick break from today's episode so I can tell you about my other software delivery resources. First I'm opening up my own software delivery dojo. My dojo is a four week program designed to level up your skills, building, deploying and operating production systems. Each week, participants will go through theoretical and practical exercises led by me designed to hone the skills needed for continuous delivery.

[00:27:53] Adam Hawkins: I'm offering this dojo at an amazingly affordable price to Small Batches listeners. Spots are limited though, so apply now at softwaredeliverydojo.com. If you want something free instead, I've got you there too. Find links to my free email courses and eBooks on any show notes page. My courses and eBooks cover topics in much more depth than I can cover on the podcast.

[00:28:15] Adam Hawkins: They're great on their own, or even as a useful complement to topics covered on the show. Find all of my free resources at smallbatches.fm. All right, let's get back into the show.

[00:28:27] Adam Hawkins: I'm really happy you mentioned Go, because now we can pull on one of the other threads that was really important in the rewrite. Before the rewrite, Saltside was Ruby, Ruby on Rails, for pretty much most of their applications.

[00:28:43] Adam Hawkins: People who were hired were expected to have Ruby experience, and they built applications in Ruby. And this was way before there was any stuff like React or single-page apps or any of this stuff. This was, you know, server-side rendered HTML with jQuery, or maybe some stuff peppered in like that. Really different from the way apps are built now.

[00:29:05] Adam Hawkins: But when we were doing the rewrite, we had new people coming into the team. They had, you know, experience with different languages. It was no longer just Ruby. We were starting to do some more stuff in Node, and, you know, one person really wanted to use D initially, but D didn't support Thrift. And when this guy told me about D, I'm like, wow, that's a really cool language.

[00:29:28] Adam Hawkins: I hope we can use it. But it wasn't supported by Thrift, and he settled on Go. And when we were deciding to use Thrift, deciding to use Docker, all this stuff, we wanted to make it so that the developer could use whatever language they thought was the best tool for the job. Not limiting that, not painting ourselves into a corner.

[00:29:49] Adam Hawkins: And one of the things that happened, maybe over the last year or year and a half I was at Saltside, was an introduction and transition into much more Go at Saltside, which I think is a good thing, because, you know, people use it. They like it. It obviously is delivering value for the business.

[00:30:13] Adam Hawkins: And now, given the different layers in the architecture, where, hey, you can use Thrift, it speaks language X, you can use Go because it's a scaling problem or whatever, to be able to do that is great. So I think we can say that the architecture has delivered on that language flexibility. And I'm curious, you know, are there more languages in production now? What does the proliferation of languages look like?

[00:30:42] Sebastian Dahlgren: Yeah. So we are currently using Thrift to generate Node source code, Go, Ruby, Java, and Python. Java and Python are mainly used in our data platform. So for the regular production system, it's mostly Go and Ruby still, and we have Node for a small part of the system only. But Go and Ruby have been a very long story for us, and over time the trend has been that we're doing more and more things in Go. The new services we bring up are almost exclusively written in Go.

[00:31:24] Sebastian Dahlgren: And I think the reason for that is probably a number of different answers put together, I guess. One reason is that we've had a number of performance issues with Ruby, and those have been easier to solve with Go, with multithreading and so forth in Go. But I think probably the biggest answer is that most of the engineers enjoy writing Go code.

[00:31:51] Sebastian Dahlgren: And when they bring up a new service, they can choose the language that fits that service and fits the skills that we have and so forth, right? So then Go is the natural choice for us. It's really the language right now at Saltside. However, as I mentioned previously, we have that core service, which is still a super important service in our system. And that's written in Ruby.

[00:32:19] Adam Hawkins: A lot of it actually written by yours truly. I'm curious what the Git history, like the commit stats for that repository, looks like now. You know, how much of it is still Adam material at the top?

[00:32:32] Sebastian Dahlgren: It's actually surprisingly much. And I think these names from the rewrite, and actually commits from the rewrite, are still very much living components in most services. And I mean, especially when you look at configuration and stuff like that, right, then you can certainly find it. And in some of these parts where we have rather static code, written once, they are certainly alive. But also, I mean, it's not only about what lines of code are there. The more important thing is that the concepts are still there. The exact same way of working with the application, and the abstraction layers and so forth, they are generally there. They have changed, maybe matured and expanded, but mostly they're there.

[00:33:27] Adam Hawkins: Yeah. So I guess you're speaking to this sort of hexagonal architecture approach of, like, hey, there's this boundary in place in the code, there are clients and servers, separate parts of that.

[00:33:37] Adam Hawkins: This end of the boundary can vary, there are tests across the boundary, you know, these sort of what I consider basic software architecture principles, really. And of course all the tests to go with it, right? Like, always tests. Hopefully that's still there.

[00:33:51] Sebastian Dahlgren: Yeah. And actually, that's one of the things I wanted to bring up when we spoke about doing this episode. Right. Because tests are one of these success factors, probably, maybe even the most important one to us today, because what all the tests do is, I guess, two things. One is that it increases the quality of our engineering, because you can really not submit anything to our repos which is not tested.

[00:34:21] Sebastian Dahlgren: And that drives up the quality of the code in general, as people are more aware about why they implement something and how they do it and so forth, because they need to test that it makes sense. The other thing, and this is something I felt when I joined Saltside initially, because I had a background from companies where tests were partly there, partly not there.

[00:34:42] Sebastian Dahlgren: And you were in a sort of unknown state as to: if I make this change, will it break something? And do I really know in my head what I need to test to ship this? And I think this is the most important part to me, because when we make a change, we almost know that this will or will not break something else.

[00:35:05] Sebastian Dahlgren: And I think that's a great enabler for engineers, because if you're a new engineer to this code base, which is a large code base, you can, with some confidence, make a change and be ready to ship it without having to understand all the different business cases. And it's important in an organization like Saltside, where the business cases are a large set of rules, right?

[00:35:32] Sebastian Dahlgren: Business rules, but also those rules vary by market. So what is true in one market may not be true for the other markets, so you can't really expect any engineer to have all of that in their head. And, I mean, the same is true for quality engineers and so forth. Right. For anybody, really.

[00:35:51] Adam Hawkins: Yeah. So testing is important, and this is something I've said on the podcast so many times: if you do not have sufficiently high automated test coverage, you're done, you're toast. You will not be able to do anything. And when we started the rewrite, there wasn't really much test coverage for the existing system. Right. So I had already had plenty of experience working in TDD. So like, that was my natural way of working for all the new code that we were writing during this thing, or during the rewrite. And I made sure, like this was one thing that I was not going to allow any compromise on, which was, if you're writing code, you're good.

[00:36:31] Adam Hawkins: You should do TDD. If you can't or you're not familiar with that, okay, let's help you learn, but you're going to write tests. You're not going to ship any code into this thing that is not covered by tests. And I always come back to my experience at Saltside, going from there's no tests to there are tests, such that the kind of ideal I have in my mind is: you can check out the code, run the tests.

[00:36:53] Adam Hawkins: If the tests pass, you should be able to confidently push that code into production. If you don't have that assumption made, you need to do everything you can to get to that base level. Cause that's the floor. That's not the ceiling. Right? So what we put in place in the rewrite was that level of test coverage and automation of the whole pipeline for every single service that was deployed to production. Right. And now having seen things be the other way, I can tell you, I want to go back to how it was at Saltside. And I don't really say that very often. You know, it's not a knock, but like, that's just how important these... Like what testing gives you. And I think you and I have seen it firsthand too. When new people join the team, they've never made a commit to a repository.

[00:37:37] Adam Hawkins: They don't know what it's doing, and they're given some, you know, small change and they want to know if it works or not. How do they do that? They have the tests, they learn, and it's the best. It's just the best mechanism for this. You know, it's like, if you don't have it, that's where you, you know, where you got us.

[00:37:54] Adam Hawkins: I don't want to get, like, going on testing again. Cause I'm almost like religious about it and how important it is. But, um, cause you mentioned config, and there's like maybe two things I want to talk to you about remaining. So we don't just talk about this forever, but one is the config, right?

[00:38:10] Adam Hawkins: There's a whole episode in the Saltside Chronicles about config and how problematic it is. And what does it mean for Saltside? Because Saltside is just a huge ball of parameterized logic around config, effectively. That's how it was when I was there. And the way we solved this problem in the core service was we wrote this like little DSL in Ruby to create these structs.

[00:38:30] Adam Hawkins: And that's sort of like our representation of this whole thing, we have it codified. And there were like RPCs in core service to load config and get all that. And like the web app used it and the Android app used it, et cetera. That's kind of how we had solved the whole config problem. So how is the config problem? Is it still there? Like, is that still working okay? Like what's the state of the config?

[00:38:56] Sebastian Dahlgren: That's a complex question. I think the config, first and foremost, is still there pretty much in the shape it was previously. Obviously we've expanded the concepts more, and different types and so forth.

[00:39:10] Sebastian Dahlgren: We even have nested types now. So this is one of the big steps forward in terms of logic, right? Because previously we had rather basic fields, if you will. I mean, they could be complex in the sense that they could have maybe a value and a measurement and so forth, right. But now we have, for example, fields where you select maybe a brand of a car, and based on that, we get the subset of models that match that brand. Yeah.

[00:39:49] Sebastian Dahlgren: And that's, those sorts of problems, right?

[00:39:50] Adam Hawkins: Because now one depends on the other. Yeah, exactly.

[00:39:54] Sebastian Dahlgren: Yeah. Yeah. And that makes for a very much more complex implementation, both on the client side and on the server side. The config itself is rather straightforward though, because it's just, you know, a tree.
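Editor's note: the dependent-field idea described above (pick a brand, then get only the models that match it) can be sketched in Go roughly as follows. This is an illustration under stated assumptions, not Saltside's actual config format: the `Field` type, its field names, the `Validate` helper and the sample brands and models are all hypothetical.

```go
package main

import "fmt"

// Field is a hypothetical node in a config tree. A plain field lists its
// options directly; a dependent field looks its options up by the value
// chosen for its parent field.
type Field struct {
	Name      string
	Options   []string            // used when DependsOn is empty
	DependsOn string              // name of the parent field, if any
	ByParent  map[string][]string // parent value -> allowed options
}

// OptionsFor returns the options that are valid for f given the values
// selected so far. An unknown parent value yields no options.
func (f Field) OptionsFor(selected map[string]string) []string {
	if f.DependsOn == "" {
		return f.Options
	}
	return f.ByParent[selected[f.DependsOn]]
}

// Validate checks every field of a submission against its allowed options.
func Validate(fields []Field, selected map[string]string) error {
	for _, f := range fields {
		value, ok := selected[f.Name]
		if !ok {
			return fmt.Errorf("missing field %q", f.Name)
		}
		if !contains(f.OptionsFor(selected), value) {
			return fmt.Errorf("%q is not a valid %s", value, f.Name)
		}
	}
	return nil
}

func contains(options []string, value string) bool {
	for _, o := range options {
		if o == value {
			return true
		}
	}
	return false
}

// carFields returns made-up sample data for the car vertical.
func carFields() []Field {
	return []Field{
		{Name: "brand", Options: []string{"Toyota", "Honda"}},
		{
			Name:      "model",
			DependsOn: "brand",
			ByParent: map[string][]string{
				"Toyota": {"Corolla", "Yaris"},
				"Honda":  {"Civic", "Jazz"},
			},
		},
	}
}

func main() {
	fields := carFields()
	ok := map[string]string{"brand": "Toyota", "model": "Corolla"}
	bad := map[string]string{"brand": "Toyota", "model": "Civic"}
	fmt.Println(Validate(fields, ok))  // valid combination
	fmt.Println(Validate(fields, bad)) // model does not belong to the brand
}
```

The data itself stays a simple tree, as Sebastian says; the extra complexity shows up in code like `OptionsFor` and `Validate`, which every client and the server would need some version of.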

[00:40:11] Sebastian Dahlgren: However, this is important because we see now, as a trend in the market, that more and more things are to go into the config. So let's just say that we have mobile phones, we have mobile brands and we have models, right. Which is a set of data which is changing somewhat frequently. And when you expand that to more and more verticals, what it does is that the config pulls engineering back into that critical path, right?

[00:40:44] Sebastian Dahlgren: That is something we haven't solved yet. I mean, this problem is just now dawning on us, I guess. It's more and more becoming an issue. And we also have more and more complex pricing structures. So based on where you are and what you're paying for and so forth, and what the price of the item is and so forth, right?

[00:41:06] Sebastian Dahlgren: We also have a very complex system for calculating prices, and these things put together make config difficult to manage. The more important part of that is it's taking away a lot of the engineering time. So that is a problem we are yet to solve, I guess, with getting it closer to the business one way or the other.

[00:41:29] Adam Hawkins: That is a really hard problem. I guess, I don't know if that was sort of the initial, like, assumption about how the config would be managed when they initially put together the first version of the whole thing with storing the config in a database. I think that they had built it that way so that they could connect some like admin UI for it and allow people to change it.

[00:41:52] Adam Hawkins: But the problem with that approach is that it's static information that has logic around it, that you can't easily just manipulate in a database and expect every single application to just update and work. So like, that is one of the real tensions. And it was just a trade-off of building a product like this. I don't see that there's really a way around it that doesn't involve orders of magnitude of effort, such that somebody can go into a UI and add another brand to a phone, versus, as unfortunate as it is, having an engineer on the critical path to like add some, you know, entry to some JSON file, you know, add an array or whatever.

[00:42:34] Adam Hawkins: There's the kind of the dynamic versus static tension in the config all the time. And that's always been there. I don't think that's ever gonna go away.

[00:42:43] Sebastian Dahlgren: No, I, I don't think so either. It's just that we need to find a new balance.

[00:42:48] Adam Hawkins: Yeah. So the last thing I want to talk to you about is something that we talked a little bit about in the prerecording, which was the ghosts of the rewrite. You know, the rewrite completed in 2015, and it's 2021 now, as we record this. It's been, you know, four and a half, five years since then. You know, you mentioned that if you look at the source code, you'll see commits from, you know, X number of years ago, mentioning like, from the rewrite. Old code, almost like, in software terms, archaic, actually.

[00:43:20] Adam Hawkins: And of course there's the technical structure that was created in the rewrite. There's all of the product stuff. I mean, lest we forget that there was a new admin app, web app, a mobile Android app, an iOS, you know, iPhone app. All of this whole, you know, the product that's in place right now came out of that.

[00:43:38] Adam Hawkins: And of course, then there's the stories that are told in the organization about the rewrite and the people who have, I don't even, from the engineering side, I'm not even sure who's there who's experienced that, but you know, high-level management, the same CEO is there, maybe some of those people. But, you know, you came in just after we had finished it. So how do you see the story of the rewrite, if at all, manifest itself in Saltside today, not on like the technical side, but just sort of the culture and the storytelling?

[00:44:14] Sebastian Dahlgren: Yeah. So I don't think it has manifested a lot on the engineering side, right? Like you mentioned, I joined just right after the rewrite, and everyone who's on the technical team joined after that.

[00:44:29] Sebastian Dahlgren: So no one who is currently working was actually working during the rewrite. However, to us, it's in ways sort of a living ghost. We sometimes talk about the rewrite and why we did it and what the trade-offs were and so forth, and why we ended up in that place to begin with, because this has been a very living story for a long time within the company.

[00:44:53] Sebastian Dahlgren: But on the business side, you're right. Most of the management is still the same. And they obviously have a very fresh memory of the rewrite. And I think that serves us well right now, which is that management and sort of all of the business and engineering have a pretty well-balanced view of revenue generation versus revenue protection.

[00:45:25] Sebastian Dahlgren: We're focusing quite heavily on the protection side of things. And that's never really a big debate in the company. We don't need to fight to do the technical things. And I think, maybe just to illustrate that, over the last year, when COVID-19 hit the world, we restructured our team a bit. We focused a lot more on features.

[00:45:53] Sebastian Dahlgren: One of the things that we wanted to get out of the door was something we call Essentials, which is essentially groceries online. So then you can deliver groceries here to local communities with the riders we had in the different markets. And all of 2020, or a lot of it, at least more than usual, has been focused on revenue generation. However, now that we're into 2021 and have a hopefully much brighter future ahead of us, we are refocusing our engineering efforts so that we are doing revenue protection. We've said between 20 and 40% of the time should go directly to revenue protection.

[00:46:35] Sebastian Dahlgren: Whereas that previously was at around 15%. And that's done without major debates, because people understand the value of long-term functional architecture and something that can actually scale, not only in terms of number of users, that too, of course, but can scale in terms of adding new features. Because that's something we've done rather effortlessly, because the architecture has been future friendly.

[00:47:04] Adam Hawkins: Yeah. Well, I mean, it speaks to one of the big themes on this podcast, which is that the reason why we do this technical work is to deliver business agility, right? The world went through a shock when COVID-19 happened and basically changed the economic rules for many companies, and many companies needed to change their business model to suit the realities of the world, and the organizations that could adapt faster would be more successful in that environment.

[00:47:33] Adam Hawkins: And if I think back to what happened with Saltside, when they wanted to introduce a mobile app, which took, you know, effectively a year, if they were in that position now and had to adapt to COVID-19, they would not have been able to do that. Right.

[00:47:47] Adam Hawkins: So it's really just about the ability to succeed as a business and adapt and succeed in the market, which is something that we don't really talk about so much. You know, I think when we're writing code on the keyboard or whatever thing that we're doing, we're not really thinking like all the way up the stack. But the point of this podcast, and the point of like this architecture, and even the point of the rewrite, was to unlock that business agility that was not there before. You know, like we have to really look through that lens for some of these big architectural and technical choices that we put in place. Like we're not just doing this purely for fun or because we think that, you know, Thrift is the be-all and end-all of this stuff.

[00:48:29] Adam Hawkins: I know there are specific reasons why we're choosing to do all these things. I mean, I'm not trying to put words in your mouth, but I really want to sort of just give my take on this after having this conversation with you, and please like chime in after, which is: I think that overall, as painful and as difficult as the rewrite process was, and I never, ever want to have to go through that again or do that kind of project again.

[00:48:58] Adam Hawkins: But if Saltside did not undergo that project, they would not have a business right now. And that to me speaks to the ultimate success of the rewrite, in that Saltside can continue to exist and grow and thrive and ultimately become, as you say, a profitable company year after year, not a one-time thing. I think, does that jive with you? Does that sort of mirror your own thoughts on it?

[00:49:24] Sebastian Dahlgren: Yeah, it does. And I think what is also important here is that now, when we see areas of the architecture failing, they are usually abstracted away. So one example here was we have done a rewrite of the search service, migrated it over from Ruby to Go and changed a lot of logic and so forth.

[00:49:46] Sebastian Dahlgren: And that could be done because the search service was its own component. We had the RPCs. We could just do that rewrite rather independently without having to impact the rest of the platform. So now we're in a position where we find issues, and we do that of course, but when we do so, we can take smaller steps towards the better architecture. And that is the key takeaway.
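Editor's note: the point about rewriting one component behind a stable boundary can be sketched in Go as below. This is a minimal illustration, not Saltside's code: the real boundary was a set of RPCs, which is stood in for here by a hypothetical `Searcher` interface, and both implementations and the sample listings are invented.

```go
package main

import (
	"fmt"
	"strings"
)

// Searcher is the hypothetical contract callers depend on. In the system
// discussed in the episode this role was played by RPC definitions.
type Searcher interface {
	Search(query string) []string
}

// legacySearch stands in for the old implementation: exact title match only.
type legacySearch struct{ titles []string }

func (s legacySearch) Search(query string) []string {
	var out []string
	for _, t := range s.titles {
		if t == query {
			out = append(out, t)
		}
	}
	return out
}

// rewrittenSearch stands in for the new implementation with changed logic:
// case-insensitive substring match.
type rewrittenSearch struct{ titles []string }

func (s rewrittenSearch) Search(query string) []string {
	var out []string
	needle := strings.ToLower(query)
	for _, t := range s.titles {
		if strings.Contains(strings.ToLower(t), needle) {
			out = append(out, t)
		}
	}
	return out
}

// ListingPage is a caller that only knows the Searcher contract, so it does
// not change when the implementation behind the boundary is replaced.
func ListingPage(s Searcher, query string) string {
	return fmt.Sprintf("%d results for %q", len(s.Search(query)), query)
}

func main() {
	titles := []string{"Red bicycle", "Blue bicycle", "Sofa"}
	fmt.Println(ListingPage(legacySearch{titles}, "Sofa"))
	fmt.Println(ListingPage(rewrittenSearch{titles}, "bicycle"))
}
```

Because `ListingPage` depends only on the contract, the implementation can be swapped in one small step, which is the "smaller steps towards the better architecture" described above.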

[00:50:12] Adam Hawkins: Yeah. Being able to do things in small batches really makes a difference. Right? It's like being able to actually move incrementally is the only way to make consistent and sustained progress. Like I had a really different perspective on the rewrite when I was doing it as compared to now, because I can't imagine how frustrating it was for the CEO and some of these board people, or, you know, these higher level management who, honest to God, had put a complete pause on any new business development, because there were no resources for it. They had to put everything into this for like nine months basically. And you try to pitch that idea to somebody in management, and they're like, no, it can't happen. It was never intended to be that, but it grew into that of course. I don't know how he feels about it now, but it's curious to see: you took this thing, you chose to do the rewrite. It's certainly not a short-term investment. This is a big commitment for a long-term thing. You know, like if your outlook is five, 10 years, then it makes sense. If your outlook is a year, do not do that at all. And I'm just really curious what the board and the CEO kind of think about it after all this time.

[00:51:23] Adam Hawkins: I don't know. I don't expect you to comment on that. It's just really changed my thinking about, does it make sense to ever do anything like this? You know, I'm not sure if they had that timeline in mind, but if they did, it definitely paid off. Like, it's a long bet, I mean.

[00:51:38] Sebastian Dahlgren: I don't know what their thoughts were then and so forth, but I mean, one of these things that strikes me over and over with Saltside is that we have long-term goals. So we go into these underserved markets with the long-term perspective. We don't expect to make profits in a year, not in two years, rather in five or seven years. And with that perspective, I guess the rewrite also made sense, although it could have been done differently in so many ways.

[00:52:07] Sebastian Dahlgren: Right. But in the grand scheme of things, and now looking back, I think it made sense.

[00:52:13] Adam Hawkins: Yeah, I think so too. That's by far the most interesting story of my career, and why I wanted to share it on the podcast and, you know, really talk to you about it and get the other half of the story, maybe the third chapter. We've yet to tell the middle one, which is what happened in Bangalore, but maybe that will be another series of the Saltside Chronicles.

[00:52:34] Adam Hawkins: Well, Sebastian, it has been a pleasure talking to you. It's really also just great to catch up with you. You know, for the listeners, you can't see us right now, but you know, I've been smiling and having a great time just talking to Sebastian. And it was one of the great pleasures of my career to be able to work and pair with Sebastian on stuff. I hope to one day be able to maybe work with you again. I don't know if that will happen, but it would be a lot of fun. So thank you so much for coming on the show. Sebastian, is there anything you'd like to leave listeners with before we go?

[00:53:08] Sebastian Dahlgren: No, not really, but thank you for having me. It's been a great pleasure, Adam.

[00:53:13] Adam Hawkins: Well, thank you so much, Sebastian. Let's keep in touch and maybe we'll check in again in a few years when we have like Saltside in 2025. Let's see.

[00:53:21] Sebastian Dahlgren: Yeah, let's do that. Take care.

[00:53:24] Adam Hawkins: All right. Thank you for listening everybody.

[00:53:27] Adam Hawkins: You've just finished another episode of Small Batches, a podcast on building a high-performance software delivery organization. For more information, and to subscribe to this podcast, go to smallbatches.fm. I hope to have you back again for the next episode. So until then, happy shipping.

[00:53:48] Adam Hawkins: Like the sound of Small Batches? This episode was produced by Podsworth Media. That's podsworth.com.
