Boundaries

Hello and welcome to Small Batches with me, Adam Hawkins. In each episode, I share a small batch of the theory and practices behind software delivery excellence.
Topics include DevOps, lean, continuous delivery, and conversations with industry leaders. Now, let’s begin today’s episode.

First, a bit of news. John Willis’ book on Deming releases in e-book form next week on IT Revolution Press. I’ve read a draft of the book. It’s wonderful. It provides so much background and context to Deming that you don’t find in his own books. The print version drops in January 2024. I highly recommend this book. I’m stoked to share my thoughts on it soon. Hopefully I can get John back on the show to talk about his process for writing the book and about Deming himself.
Now that that’s out of the way, a short story.
A team member deployed a change that did not work as expected. The PR was appropriately reviewed. The tests passed. The problem and the fix were clear shortly after deploying.
The API we used did not behave as expected. We tested to the contract, deployed, and learned that was insufficient. We needed to make a second API call. We committed a change that updated the unit tests, pushed to master, deployed, and then, boom, problem solved.
This led to a discussion about the gaps in the test suite and how we might close them. The discussion centered around the most important concept in software design: boundaries.
I’ll share that discussion with you now.

What exactly is a “boundary”? It’s an interface. It’s a separation of concerns. It’s a hard line. It’s the line around a bounded context. It’s a contract.
Coupling happens on the boundaries.
Consider a simple create-read-update-delete service. Imagine the code to read all records in the system. Say the records are stored in a Postgres database.
One implementation imports the pg driver, creates a connection, makes an SQL query, then parses and returns the results. The boundary is the pg driver in this example.
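Here’s a minimal sketch of that first implementation in TypeScript, assuming the node-postgres (`pg`) client and an illustrative `records` table and row shape:

```typescript
// Minimal sketch: reading all records directly through the pg driver.
// The table name and RecordRow shape are illustrative assumptions.
import { Client } from "pg";

interface RecordRow {
  id: string;
  name: string;
}

export async function readAllRecords(): Promise<RecordRow[]> {
  const client = new Client({ connectionString: process.env.DATABASE_URL });
  await client.connect();
  try {
    const result = await client.query("SELECT id, name FROM records");
    return result.rows as RecordRow[];
  } finally {
    await client.end();
  }
}
```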
Now imagine one step higher on the abstraction ladder. This implementation imports an ORM library. The code calls ORM functions and returns the response. The boundary is the ORM in this example.
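A minimal sketch of the ORM version, assuming a Prisma client generated from a schema with a `record` model. Any ORM with a “find all” call looks much the same:

```typescript
// Minimal sketch: the same read, one rung up, through an ORM.
// Assumes a Prisma client generated from a schema with a `record` model.
import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

export async function readAllRecords() {
  // The ORM owns connection handling and SQL generation.
  return prisma.record.findMany();
}
```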
Again, imagine one step higher on the abstraction ladder. This implementation imports the internal repository class. The code calls a specific “fetch all” function on the repository and returns the response. The boundary is the internal repository class.
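A minimal sketch of the repository version. The `RecordRow`, `RecordRepository`, and `listRecords` names are illustrative:

```typescript
// Minimal sketch: the read behind a first-party repository boundary.
export interface RecordRow {
  id: string;
  name: string;
}

export interface RecordRepository {
  fetchAll(): Promise<RecordRow[]>;
}

// The consumer only knows about the repository interface it was given.
export async function listRecords(repo: RecordRepository): Promise<RecordRow[]> {
  return repo.fetchAll();
}
```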
Each time we move up the abstraction ladder, more is encapsulated behind the boundary.
The first example has little abstraction. It’s just making raw database calls. The second example moves higher by abstracting away the SQL with function calls. The third example completely abstracts away the data store.
Now imagine how you would test each of these implementations.
In the first example, you could mock the db driver itself. This would be cumbersome: there may be multiple functions to mock, the function signatures may be awkward, and there could be nested functions with return values that need mocking. It would work, but it is problematic. It’s also unlikely to work in the first place, because dependency injection is not possible.
In the second example, you could mock the ORM. This is less cumbersome and doable. There may be fewer mocking points and simpler function signatures. Perhaps you’re able to use dependency injection.
In the third example, you mock the repo library you created. This is trivial because you own the interfaces. Your mocking library may even automatically generate a mock. You inject a mock repo in the code. Tests for message passing are trivial.
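A minimal unit test sketch for that third example, assuming Jest and the illustrative `RecordRepository` and `listRecords` names from the earlier sketch:

```typescript
// Minimal unit test sketch. The "./records" module path is illustrative.
import { listRecords, RecordRepository, RecordRow } from "./records";

describe("listRecords", () => {
  it("returns whatever the repository returns", async () => {
    const rows: RecordRow[] = [{ id: "1", name: "first" }];
    const repo: RecordRepository = {
      fetchAll: jest.fn().mockResolvedValue(rows),
    };

    const result = await listRecords(repo);

    // Message passing: the code called its dependency as expected.
    expect(repo.fetchAll).toHaveBeenCalledTimes(1);
    // Behavior: the code returns what the dependency returned.
    expect(result).toEqual(rows);
  });
});
```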
Writing code like the third example will take you _very_ far.
The structure allows you to write very fast tests that check that your code calls its dependencies as expected and behaves correctly when those dependencies return X, Y, or Z. This, along with some simple smoke tests in the deployment pipeline, is a fantastic starting point.
The inflection point in that simple pipeline occurs when defects escape to production because of missing integration tests. This is the kanban—or pull signal—to expand the test suite.
Imagine a test like “Given I added X records, when I request all records, then I should see X records”.
The test describes the system behavior from a user’s perspective. It also aims to execute as much of the code as possible. How would you write this test in the different implementations? Let’s consider the first and third implementations.
The first implementation couples the code directly to the DB driver. The boundary is the database. So database records must be added and the tests will hit the database. Perhaps there’s some concept of fixtures available. Direct database calls enter the test suite. It would work, but it is brittle and coupled.
The third implementation uses a repository class. The boundary is the class. This creates a few options. One option is actually connecting to the DB and letting the repository do its thing against a real database. Another option is to create a “fake” data store that “writes” and “reads” records in memory. Either way the repository is constructed then injected into the code. The consumer is oblivious to what happens on the other side of the boundary. The test code is the same either way.
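A minimal sketch of that second option, an in-memory fake behind the same boundary, again assuming Jest and the illustrative `RecordRepository` interface:

```typescript
// Minimal sketch: an in-memory fake plus a behavior-level test.
// The "./records" module path is illustrative.
import { listRecords, RecordRepository, RecordRow } from "./records";

export class InMemoryRecordRepository implements RecordRepository {
  private rows: RecordRow[] = [];

  async add(row: RecordRow): Promise<void> {
    this.rows.push(row);
  }

  async fetchAll(): Promise<RecordRow[]> {
    return [...this.rows];
  }
}

it("returns every record that was added", async () => {
  // Given I added X records...
  const repo = new InMemoryRecordRepository();
  await repo.add({ id: "1", name: "first" });
  await repo.add({ id: "2", name: "second" });

  // ...when I request all records, then I should see X records.
  const result = await listRecords(repo);
  expect(result).toHaveLength(2);
});
```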
These boundaries enable us to decide just how much we want in the system-under-test at a given time.
I’m particularly fond of the repository pattern because it creates a strong boundary around the center of most applications: the data. Then I double down hard on that boundary.
I’ve used the boundary to completely punt on datastore selection or schema design. Instead, I used an in-memory store to drive the integration tests, then used the integration tests to drive unit tests. Eventually we did select a datastore, though the code didn’t change. We just added more tests to the repository. Dave Farley tells similar stories.
I’ve leveraged the boundary differently throughout the deployment pipeline. I’ve used in-memory stores for tests that run on the commit hook. I’ve used integration tests with in-memory stores as a gate to integration tests with real databases.
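A minimal sketch of how that gating might be wired up. The `TEST_DATASTORE` variable, the `PostgresRecordRepository` class, and the module paths are assumptions for illustration:

```typescript
// Illustrative sketch: one test suite, two boundaries, selected per pipeline stage.
import { RecordRepository } from "./records";
import { InMemoryRecordRepository } from "./inMemoryRecordRepository";
import { PostgresRecordRepository } from "./postgresRecordRepository";

export function buildRepository(): RecordRepository {
  // Commit-stage tests run against the in-memory fake; a later pipeline
  // stage re-runs the same tests against a real Postgres instance.
  if (process.env.TEST_DATASTORE === "postgres") {
    return new PostgresRecordRepository(process.env.DATABASE_URL!);
  }
  return new InMemoryRecordRepository();
}
```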
The boundaries are not just around databases. I place boundaries between my code and all external systems. The tests for the Small Batches Slack app interact with a fake Slack and a real DynamoDB datastore backed by a local DynamoDB. The boundaries allow me to decide just how much is included in the system-under-test at any given moment.
This is only possible by adopting two good software engineering practices.
One, use dependency injection as much as possible. Code with injected dependencies is always easier to test, thus easier to change and develop.
Two, always write to first-party interfaces. In other words, own your APIs. Don’t couple yourself to third-party code. Writing your own interfaces means less coupling and more narrowly focused interfaces. Both make code easier to test, thus easier to change and develop.
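A minimal sketch of owning the interface to an external system, in the spirit of the fake Slack mentioned above. The `Notifier` port and the adapter names are illustrative; the Slack call uses `@slack/web-api`’s `chat.postMessage`:

```typescript
import { WebClient } from "@slack/web-api";

// First-party interface: the rest of the code depends only on this.
export interface Notifier {
  notify(channel: string, message: string): Promise<void>;
}

// Adapter that couples to the third-party client in exactly one place.
export class SlackNotifier implements Notifier {
  constructor(private readonly client: WebClient) {}

  async notify(channel: string, message: string): Promise<void> {
    await this.client.chat.postMessage({ channel, text: message });
  }
}

// Test double that never touches the network.
export class FakeNotifier implements Notifier {
  public sent: Array<{ channel: string; message: string }> = [];

  async notify(channel: string, message: string): Promise<void> {
    this.sent.push({ channel, message });
  }
}
```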
If you adopt these practices then you’ll eventually find hexagonal architecture, also known as “ports and adapters”. This is basically the concept of boundaries and layers.
Once that gets going, you can establish the two core test-driven design loops: BDD at the integration level and TDD at the unit level. The first drives the app from the top down and the second drives the app from the bottom up.
All these choices reinforce and leverage each other.

All right that’s all for this batch. Visit https://SmallBatches.fm/97 for links on boundaries, hexagonal architecture, and ways to support the show.
I hope to have you back again for the next episode. So until then, happy shipping!
