Saltside Chronicles #2: Technical Debt Calls

The second episode in a five-part series on the Big Bang rewrite completed at Saltside in 2014/15. This episode discusses the technical debt and architecture issues that led to the rewrite.

[00:00:00] Hello and welcome. I'm your host, Adam Hawkins. In each episode I present a small batch of theory and practices behind building a high-velocity software organization. Topics include DevOps, lean software architecture, continuous delivery, and conversations with industry leaders. Now let's begin today's episode.

[00:00:26] Adam here for the second of five episodes in the Saltside Chronicles. I outlined the entire story in the previous episode. This episode goes deep into the architectural limitations and technical debt that prevented us from launching a mobile app.

[00:00:45] First, a recap of the story so far. The timeline is 2014/2015. Saltside is a startup that makes classified sites for developing markets. They launched their first classified site as a web application, which was appropriate when Saltside launched in 2011: web apps are easy to build, plus smartphones hadn't taken hold in

[00:01:06] developing markets yet. Mobile phone users could be served a trimmed-down, low-data-use mobile web version instead of the desktop version. Eventually, cheap and affordable Android smartphones entered the developing markets. The fundamental assumption about how users would access the site shifted. Competitors picked up on this and launched mobile apps of their own. Mobile apps provided a much better experience for our classified sites

[00:01:32] than desktop computers: users could snap a photo on their smartphone and post the ad straight away. The CEO shared the news: hey, we're launching an Android app. The development team looked back in horror at this fundamentally different requirement. If there's an app, the app needs an API. The problem was that our architecture and current system were not at all suited for this, for a myriad of reasons.

[00:01:58] This is where episode two of the Saltside Chronicles really begins. First, I must clarify some of the product functionality. I mentioned earlier that Saltside launched sites in multiple markets. A market is synonymous with a country, so Saltside had sites running in three different countries. At this point, you may be thinking that this is the same product in multiple countries, just localized differently.

[00:02:24] That's only half true. Saltside had built what you could call a CMS for classified sites. Allow me to explain. Every market is different. People have different things, so they sell and buy different things. It does not make sense to have a boats category in a market without water access, but it does make sense to list boats in a country with water access.

[00:02:49] This demonstrates the category concept. Each market may have different categories in a different order. We called this the category tree, and it was configurable per market. Next, the locations were obviously different per market, and so was their structure. One market may be divided up into X number of top-level districts and Y number of subdivisions.

[00:03:13] Another market may only be divided up into X number of top-level divisions, and another may be three levels deep. We called this the location tree, and naturally it was configurable per market. This doesn't even account for variations that can occur at the category level. Here's an example: say that market A and market B both have a houses category. Market A and market B share similarities and also have differences.

[00:03:41] The house may be represented by total area, which could be in square meters or square feet, or it could be described as a three BHK, that's three bedroom, hall, and kitchen (shout out to Bangladesh and India). Or there may be a checkbox for a private garage in market A and not in market B. The point of all this is to say the ad content was customizable to what made the most sense in that market.

[00:04:07] There was no global concept of what a house meant in the product. In fact, there was no concept of a house at all. House was just an ID pointing to a database record and the relevant configuration. Product managers could craft the classified site that made the most sense for that market by defining the location tree, the category tree, and exactly what could be listed as for sale or wanted, along with housing rentals, jobs, and more. It was a highly dynamic and configurable system.
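To make that concrete, here is a minimal sketch of what such per-market config might look like. Everything here is hypothetical and invented for illustration; it is not Saltside's actual schema.

```ruby
# Hypothetical per-market config: category tree, location tree, and
# per-category field definitions all vary by market.
MARKET_CONFIG = {
  "market_a" => {
    category_tree: [
      { id: 10, name: "Vehicles", children: [{ id: 11, name: "Boats" }] },
      { id: 20, name: "Property", children: [
        { id: 21, name: "Houses",
          fields: [
            { key: :total_area,     type: :number,   unit: "sq m" },
            { key: :private_garage, type: :checkbox }
          ] }
      ] }
    ],
    location_tree: [
      { id: 1, name: "District 1", children: [{ id: 2, name: "Subdivision A" }] }
    ]
  },
  "market_b" => {
    category_tree: [
      { id: 20, name: "Property", children: [
        { id: 21, name: "Houses",
          fields: [
            { key: :bhk, type: :enum, values: %w[1BHK 2BHK 3BHK] } # bedroom, hall, kitchen
          ] }
      ] }
    ],
    # Flatter structure: top-level divisions only, and no boats category at all.
    location_tree: [{ id: 1, name: "Division 1" }]
  }
}.freeze
```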

[00:04:38] These concepts fell under the umbrella of config. Config was the system's god object. Actors in the system could not do anything without config, because config defined the rules of how to behave across the entire system. The website couldn't even display an ad without config. That's because the ad referenced a category, and the category config defined all the properties

[00:05:01] the ad had, which then required localizations in different languages. This is a simple read use case, right? Writes, like creating an ad, were much more complex. So the system was entirely ruled by this config. It was global state that permeated every aspect of the code.
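To sketch that read path (using the hypothetical MARKET_CONFIG shape above; Config.category and Config.translation are invented lookup helpers, not the real code), even rendering a single ad touches config repeatedly:

```ruby
# Rendering one ad: look up its category config, the category's field
# definitions, and a localized label for each field. Every step is a
# config lookup; none of it works without config.
def render_ad(ad, locale)
  category = Config.category(ad.market, ad.category_id)
  fields   = category.fetch(:fields, [])

  fields.map do |field|
    label = Config.translation(ad.market, field[:key], locale)
    "#{label}: #{ad.attributes[field[:key]]}"
  end.join("\n")
end
```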

[00:05:23] Now you may be thinking this sounds like a multi-tenant system. I think that's a fair assumption, but it wasn't true. In the beginning, you see, one set of applications and data stores served all markets. Applications used the HTTP Host header to identify the market, load the relevant config, and then act accordingly. This model had numerous problems, but the most prevalent one was scaling. Saltside had three active markets at this point in time. However, one was king: it accounted for most of the traffic, and thus most of the load on the servers. It was common that a rush of traffic on that site would slow down or even take the other markets offline. That's no good.
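A minimal Rack-middleware-style sketch of that host-based market resolution might look like this (the class name, hostnames, and env key are all assumptions):

```ruby
# One deployment serves every market: the Host header picks the market,
# and downstream code loads that market's config and behaves accordingly.
class MarketResolver
  HOSTS = {
    "marketa.example.com" => "market_a",
    "marketb.example.com" => "market_b",
    "marketc.example.com" => "market_c"
  }.freeze

  def initialize(app)
    @app = app
  end

  def call(env)
    market = HOSTS[env["HTTP_HOST"].to_s.split(":").first]
    return [404, { "Content-Type" => "text/plain" }, ["Unknown market"]] unless market

    env["saltside.market"] = market # downstream code loads config for this market
    @app.call(env)
  end
end
```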

[00:06:07] Also, the different functionalities of the site were all served by the same application, so searching for ads and posting ads all landed on the same application servers. This posed a problem because the scaling needs are vastly different. Say 90% of traffic was searches and 10% was ad posts. That's a difference between read and write traffic; one creates far more load than the other. These are two very different concerns, but they were not separated as such.

[00:06:29] So different usage patterns in one market would have negative implications for other markets. Plus, the system had to run at a scale that supported the needs of all markets at any point in time. I don't want to digress into the infrastructure layer too much in this episode, so suffice it to say that we had the idea of a multi-tenant system, but it had not been operationalized in that way, nor was it separable.

[00:06:55] So adding new functionality, say an API for a mobile app, could negatively impact all markets. Okay, so let's jump back up the stack to the application layer and pick up the thread regarding config. Managing config was essentially architectural. I was not present at the beginning of the company;

[00:07:13] I joined probably about two years after that. I gathered that the initial architecture strategy went something like this. The website was initially a monolithic Rails application; nothing controversial or surprising there, since starting with a monolith is the right initial approach. All the config was stored in the database.

[00:07:33] Any page view or business logic could simply hit the database, load the relevant config, and then act accordingly. No real problems here, because everything is server-side rendered and everything is effectively in the same app. Things change over time, though, and eventually there's an inflection point where there is too much entropy in the monolith, which forces the creation of something outside of it, giving birth to the tiniest possible distributed system.

[00:08:00] Saltside also received more traffic on mobile web than desktop. Saltside had built a feature phone web version as well, so they actually had three different flavors of their website: desktop, mobile, and feature phone. The complexity of maintaining all three inside the same code base grew too much, so a new, quote, "mobile web" service was born to handle the mobile and feature phone versions, leaving the desktop application in the monolith. As always, the premise was the same: introducing a new service allows the development team to iterate quickly on independent aspects of the business. I'm sure that was true then. In Saltside's case it forced the question, though: what to do about the config? Now there is this mobile web service outside the monolith. How is it going to get that config?

[00:08:51] Well, dear listener, this is where the system crossed the Rubicon. The team decided to share configuration through a shared library that made database queries. In this model, the mobile web service wasn't writing config; it was only reading it, thank God. I don't fault the team for making this decision.

[00:09:08] At that point in time, there was likely no better option. The shared library acted like a common API, and I use that term loosely, between the monolith and the mobile web service. Granted, it is better than integrating directly with the database. This architecture decision stipulated that there was little to no cost in accessing config.

[00:09:28] This is critical because, as I mentioned earlier, config was global state that permeated every aspect of the code, so it had to be accessible all the time. So the mobile web service would make hundreds of calls to the config library, which were actually DB queries across the network. This was obscured from the developers because it was behind a library, but it wasn't at all hidden from operations or end users, because hundreds of calls per request increased overhead.
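A rough sketch of that shared-library pattern, assuming a Sequel-backed config database (the module and method names are mine, not the real gem's). Each call site looks like a cheap in-process lookup, but every one is a query across the network:

```ruby
require "sequel"

# Shared between the monolith and the mobile web service. Read-only.
module Config
  def self.category(market, category_id)
    db[:categories].where(market: market, id: category_id).first
  end

  def self.translation(market, key, locale)
    db[:translations].where(market: market, key: key.to_s, locale: locale).get(:value)
  end

  def self.db
    # A shared connection to the config tables; every lookup is a round trip,
    # and a single request can make hundreds of them.
    @db ||= Sequel.connect(ENV.fetch("CONFIG_DATABASE_URL"))
  end
end
```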

[00:09:58] Features add complexity over time, so the number of config calls increased, and so the overhead increased. Complexity increased, coupling increased; really, any form of entropy increased. This was problematic for another architectural reason: there wasn't a boundary between the business logic and config. In practice, there was no way to say "load all the config in a single batch upfront, then proceed with the business logic", not even close. Config lookups could happen at any place in the code,

[00:10:26] be it a model, view, controller, or even a helper. The code was not designed to accept config as an argument. Instead, it was simply globally accessible state. I have a feeling that many of you are squirming in your seats just hearing all that.
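To illustrate the difference with invented names: compare a helper reaching into globally accessible config mid-render against config that is batch-loaded once and passed in explicitly.

```ruby
# How the code worked: any helper can reach into global config,
# so each call quietly triggers another lookup mid-render.
module AdHelper
  def price_label(ad, locale)
    currency = Config.translation(ad.market, :currency, locale) # hidden round trip
    "#{currency} #{ad.price}"
  end
end

# The missing boundary: load the market's config once, up front,
# then hand it to the code that needs it as a plain argument.
class AdPresenter
  def initialize(ad, market_config)
    @ad = ad
    @config = market_config
  end

  def price_label
    "#{@config.fetch(:currency_label)} #{@ad.price}"
  end
end
```

The first style is what the codebase had; the second shows the kind of boundary that was missing.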

[00:10:50] All right, but I have to circle back to the original business case: the mobile app and its required backing API. So how would a mobile app work? Well, it would need config. And how does it get that config? Remember the assumption that there was no overhead to get config? Well, that assumption goes out the window with the current model when considering a mobile app. The previous decision to share config via the database only worked because all the different parts of the system shared the same network; in a sense, they were all co-located.

[00:11:14] It may be undesirable to make network calls for every little thing, but at least it worked. A mobile app turned that assumption on its head: the mobile app would be running outside the network. It was totally infeasible to make API requests for config in the same way the current applications were doing

[00:11:31] it. That doesn't even account for the built-in latency in the end user's connection. Remember that Saltside targets developing countries, and this is also 2014, so it's totally expected that users have a 2G connection, not even 3G. It would have been a horrendous user experience, even if we could make it work

[00:11:50] technically. That was just the tip of the iceberg. Even having this discussion stipulated that there was an API available, and we weren't even ready for that yet. The internals of the system weren't separated cleanly enough that we could even begin to create API calls for common operations like post ad or search ads.

[00:12:10] Everything was simply too entangled. Again, all of that assumed there would be an answer to the global config problem, which we didn't have. So there we were, completely and totally stuck. The development team knew there was no way the current system could adapt to support such a drastic change in requirements. Yet

[00:12:29] there was an existential business case. So what to do? We proposed a potential solution and started working on it; more on that in episode four. This completes a summary of the high-level technical issues that blocked us at Saltside. The next episode covers the business, organizational, and situational factors that combined to create the ground-up

[00:12:52] rewrite. That wraps up this batch. Visit smallbatches.fm for the show notes. Also find Small Batches FM on Twitter and leave your comments in the thread for this episode. More importantly, subscribe to this podcast for more episodes just like this one. If you enjoyed this episode, then tweet it or post it to your team Slack, and rate this show on iTunes.

[00:13:14] It all supports the show and helps me produce more Small Batches. Well, I hope to have you back again for the next episode. So until then, happy shipping.

[00:13:27] Are you feeling stuck trying to level up your skills deploying software? Then apply for my software delivery dojo. My dojo is a four-week program designed to level up your skills building, deploying, and operating production systems. Each week, participants will go through a theoretical and practical exercise led by me, designed to hone the skills needed for continuous delivery.

[00:13:49] I'm offering this dojo at an amazingly affordable price to Small Batches listeners. Spots are limited, so apply now at softwaredeliverydojo.com. Like the sound of Small Batches? This episode was produced by Podsworth Media. That's podsworth.com.
