Posted by Jakub Holý on September 12, 2012

Republished from blog.iterate.no with the permission of my co-authors Stig Bergestad and Krzysztof Grodzicki.

Three of us, namely Stig, Krzysztof, and Jakub, have had the pleasure of spending a week with Kent Beck during Iterate Code Camp 2012, working together on a project and learning programming best practices. We would like to share the valuable lessons that we have learnt and that made us better programmers (or so we would like to think at least).

 

Values Underlying the Programming Style

Most of the things that we have learnt stem from three basic values: communication, simplicity, and flexibility (in order of importance). We will introduce them briefly here; you can find a more detailed description in Kent’s book Implementation Patterns, along with a few fundamental practices such as the application of symmetry.

Communication

Programs are read much more often than they are written and should therefore communicate their intent clearly. Code is primarily a means of communication. (For a typical enterprise system, a lot of code will be modified by many programmers over 5 to 15 years and they’ll all need to understand it.)

Simplicity

Eliminate excess complexity. Apply simplicity at all levels. Simplicity helps communication by making programs easier to understand. Communication helps simplicity by making it easier to see what is excess complexity.

Flexibility

“Making workable decisions today and maintaining the flexibility to change your mind in the future is a key to good software development.” — Implementation Patterns, Kent Beck

Programs should be flexible in the ways they change; they should make common changes easy, or at least easier. Complexity often arises from excess flexibility, and flexibility that is not needed is waste. The best kind of flexibility comes from simplicity coupled with extensive tests. Trying to create flexibility through design phases or the like usually ends up with this kind of “complex flexibility”. Most of the time you do not have enough information to make a proper design decision about where your program will need to change, so the lean mantra of postponing decisions until the last responsible moment is a valid and useful approach. That is when you have the best grounds for any decision.

Summary

Write code that communicates well what is done and why, so that your co-workers and future maintainers can take it over without too much cost. (Yet you have to assume some level of skill and knowledge in your audience.) You cannot foresee the future, so keep your code simple and sufficiently flexible so that it can evolve.

Key Learnings

1. You Ain’t Gonna Need It!

What is today’s demo? What test to start with?

Before we arrived, we settled on a topic that we thought would be fun and challenging: prototyping a highly scalable, distributed database. We expected to spend a few hours upon arrival discussing how we should actually implement it. After all, there are a lot of things to consider: replication strategies, consistent hashing, cluster membership auto-discovery, and conflict resolution, to name a few. If you have just 5 days, you need to plan how to implement it in the simplest possible way. What to do. What to skip. Right? Wrong. Instead of any planning, Kent just asked us what demo we would like to show at the end of the day. And his next question was what test to write.

It turned out to be a very smart approach because we actually implemented only a very small subset of the possible functionality, deciding daily what to do next based on our experience with the problem so far. Any up-front discussion longer than 10 minutes would have been 90% wasted time. This doesn’t mean that planning is bad (though the resulting plans are usually useless); it only means that we usually do much more of it than we actually need. Getting real feedback and basing our decisions on that, on real data, is much more practical and valuable than any speculation.

Therefore prefer on-going planning based on feedback and experience to extensive up-front planning. Ask yourself: What am I going to present at the next demo/meeting/day? What test to write to reflect that and thus guide my development effort?

2. Write High-Level Tests to Guide the Development

The goal of our second day was replication, demonstrated by writing to one instance, killing it, and reading the (replicated) data from the second instance. We started by writing a corresponding test, which follows these steps closely, nearly literally:

List<Graft> grafts = Graft.getTwoGrafts();
Graft first = grafts.get(0);
Graft second = grafts.get(1);

// Write to the first instance, then kill it
first.createNode().put("key", "value");
first.kill();

// The data must have been replicated to the second instance
assertNotNull(second.getNodeByProperty("key", "value"));

(The API of course evolved later into something more general.)

Now the interesting thing is that this is not a unit test. It is basically an integration test or, to use a less technical term, a story test exercising a rather high-level feature, a feature of interest to the customer. While a unit test tells me “this class is working as intended,” a story test tells me “this feature works as intended.”

I used to think about TDD at the unit/class level but this is TDD at a much higher level. It has some interesting properties:

  • It helps measure the real progress of the project because it exercises something that is actually meaningful to the customer (if you permit me to use the word “customer” somewhat loosely)
  • It helps keep you focused on delivering business functionality (by being on its level)
  • It’s likely to stay mostly unchanged and live much longer than unit tests or actually most of the code base because it is on such a conceptual level

Now, according to the testing pyramid, there are of course fewer story tests than there are unit tests, and story tests do not test all possible cases. Does it mean that you need to do all these story tests and then do them again only in smaller unit tests? No, that is not the point. Getting back to the principle of flexibility and the way things change, create additional unit tests only when you need them. For example when you encounter some case where the first story test did not actually “capture the whole” properly, or when you discover a really important corner case, or when you want to focus on implementing a part of the overall solution. Speculating about failure points can be just as wasteful as speculating about design.
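To make the replication story test above concrete, here is a minimal, runnable sketch of the same write, kill, read scenario. It does not use the real Graft code: ToyReplica and its naive synchronous replication are hypothetical stand-ins, assumed only so that the example is self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for a replicated store; not the real Graft code.
class ToyReplica {
    private final Map<String, String> data = new HashMap<>();
    private final List<ToyReplica> peers = new ArrayList<>();
    private boolean alive = true;

    // Creates two replicas that forward every write to each other.
    static List<ToyReplica> twoReplicas() {
        ToyReplica a = new ToyReplica();
        ToyReplica b = new ToyReplica();
        a.peers.add(b);
        b.peers.add(a);
        return Arrays.asList(a, b);
    }

    void put(String key, String value) {
        requireAlive();
        data.put(key, value);
        for (ToyReplica peer : peers) {
            if (peer.alive) {
                peer.data.put(key, value); // naive synchronous replication
            }
        }
    }

    String get(String key) {
        requireAlive();
        return data.get(key);
    }

    void kill() {
        alive = false;
    }

    private void requireAlive() {
        if (!alive) {
            throw new IllegalStateException("replica is down");
        }
    }
}

class ReplicationStoryTest {
    // The story: write to one instance, kill it, read from the other.
    static String writeKillRead() {
        List<ToyReplica> replicas = ToyReplica.twoReplicas();
        ToyReplica first = replicas.get(0);
        ToyReplica second = replicas.get(1);

        first.put("key", "value");
        first.kill();

        return second.get("key");
    }

    public static void main(String[] args) {
        if (!"value".equals(writeKillRead())) {
            throw new AssertionError("data was not replicated");
        }
        System.out.println("replication story passes");
    }
}
```

Note that the test says nothing about how replication happens, which is why such a test can survive a complete rewrite of the replication mechanism.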

3. Best Practices for [Unit] Testing

Write Tests From the End

We normally start a test with an idea of what we want to verify, but we may not be completely sure how to get there. Therefore it is good practice to first express what we do know: the desired end result. We do this in the form of an assertion and only then shift our focus to figuring out how to get there. That’s how we started the replication test in Graft, shown above.

This is an application of the key principle of focus.

Write Implementation in Tests, Refactor Later

You know the functionality you want and so you start writing the test for it. Instead of thinking about how it should be organized (what classes to create, where to put them, whether to use a factory class or a factory method), why not initially write the code directly in the test method? You can always factor out the code later. This way you can focus on what’s really important – describing the desired functionality with a test – instead of being distracted by secondary considerations. Additionally, by postponing the decision about the internal organization of the implementation, you will have more knowledge when actually deciding it and you will likely end up with a better solution.

Key principles: Focus, avoiding premature decision-making.
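As a rough illustration of this practice, the sketch below first writes the logic directly inside the test and only afterwards extracts it, unchanged, into a method. The command-parsing task and all names (InlineFirstExample, parse, check) are hypothetical and chosen only for the example.

```java
import java.util.Arrays;
import java.util.List;

class InlineFirstExample {
    // Step 1: the "implementation" lives inside the test itself.
    static void testParsesCommandInline() {
        String input = "addComment eventId my_comment";

        // Written right here, with no class, factory, or package decided yet.
        List<String> parts = Arrays.asList(input.trim().split("\\s+"));

        check(parts.equals(Arrays.asList("addComment", "eventId", "my_comment")));
    }

    // Step 2: once the test is green, the same code is extracted unchanged.
    static List<String> parse(String input) {
        return Arrays.asList(input.trim().split("\\s+"));
    }

    static void testParsesCommandExtracted() {
        check(parse("addComment eventId my_comment")
                .equals(Arrays.asList("addComment", "eventId", "my_comment")));
    }

    static void check(boolean condition) {
        if (!condition) {
            throw new AssertionError("test failed");
        }
    }

    public static void main(String[] args) {
        testParsesCommandInline();
        testParsesCommandExtracted();
        System.out.println("both versions pass");
    }
}
```

Where parse should eventually live (its own class, a factory, somewhere else) can be decided after the behavior is pinned down by the test.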

Bottom-up Design

Avoids:

  • assuming too much, too early
  • locking yourself into a specific design prematurely
  • restricting yourself (you will usually end up with the design you first intended)

Start by implementing small parts of functionality. Then combine them to form more complex behavior. Don’t get distracted by dependencies; write simple stubs for them that you will replace later with real implementations. With this technique you are not bound to design decisions taken at the beginning, as you are in the “top-down” approach. It requires a little bit of intuition and some experience, but combined with TDD it helps produce a better design and implementation.

We found this technique quite useful as we didn’t know the final solution at the beginning. When developing Graft, we did not design the whole application up front. We picked a use case on the first day, implemented it, and continued by choosing and implementing other use cases each day.
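A small sketch of what “write simple stubs for dependencies” can look like in practice. The Replicator interface, StubReplicator, and KeyValueStore are hypothetical names invented for this example, not classes from Graft.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical dependency that has no real implementation yet.
interface Replicator {
    void replicate(String key, String value);
}

// Simple stub: only counts how often it was asked to replicate.
class StubReplicator implements Replicator {
    int calls = 0;

    @Override
    public void replicate(String key, String value) {
        calls++;
    }
}

// The small piece of functionality that is built first.
class KeyValueStore {
    private final Map<String, String> data = new HashMap<>();
    private final Replicator replicator;

    KeyValueStore(Replicator replicator) {
        this.replicator = replicator;
    }

    void put(String key, String value) {
        data.put(key, value);
        replicator.replicate(key, value);
    }

    String get(String key) {
        return data.get(key);
    }
}

class BottomUpDemo {
    public static void main(String[] args) {
        StubReplicator stub = new StubReplicator();
        KeyValueStore store = new KeyValueStore(stub);

        store.put("key", "value");

        if (!"value".equals(store.get("key")) || stub.calls != 1) {
            throw new AssertionError("store does not work against the stub");
        }
        System.out.println("store works against the stub");
    }
}
```

The store can be developed and tested today; a real Replicator can replace the stub later without changing KeyValueStore.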

Act & Assert at the Same Level of Abstraction

Our Graft DB has a telnet-like interface for receiving commands from users. Consider the following two (simplified) variations of the addComment test:

// Test 1
Graft db = ...; this.subject = new CommandProcessor(db);
subject.process("addComment eventId my_comment");

assertThat(subject.process("getComments eventId")).isEqualTo("my_comment");

// Test 2 (same setUp)
subject.process("addComment eventId my_comment");

assertThat(db.getComments("eventId")).containsOnly("my_comment");

The first test, while testing the addComment command, uses another command, getComments, to check the resulting state. It uses only a single API entry point, subject, during the whole test. The second test directly accesses the underlying database instance and its API to get the same data, i.e. besides subject it also uses the underlying db.

Thus the first test is not truly a “unit” test, as it depends on the correctness of another method of the tested class. The second test is much more focused and potentially simpler to write, as it directly accesses the target data structure and thus performs the checks right at the source.

We would argue that tests like the first one, which perform all operations at the same level, namely the level of the public API of the object under test, are better. “Better” here means easier to understand and, more importantly, much more stable and maintainable because they are not coupled to the internal implementation of the functionality being tested. The price of increased complexity of these unit-integration tests (due to relying on multiple methods in each test) is absolutely worth the gain.

Tests similar to the second one are nonetheless more common, either directly accessing the underlying layers (an object, property, database, …) or using mocks to enable direct verification of side effects. These techniques often lead to coupled, hard-to-maintain tests and should be limited to the “private unit tests,” as described and argued in Never Mix Public and Private Unit Tests!
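The following self-contained sketch shows the first style of test, where both the action and the assertion go through the public command interface. ToyCommandProcessor is a hypothetical, heavily simplified stand-in for the real CommandProcessor; its map-based storage is an assumption made only to keep the example runnable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified command processor; the real Graft class differs.
class ToyCommandProcessor {
    // Internal detail: free to change without touching the test below.
    private final Map<String, List<String>> commentsByEvent = new HashMap<>();

    String process(String command) {
        String[] parts = command.split(" ", 3);
        switch (parts[0]) {
            case "addComment":
                commentsByEvent
                        .computeIfAbsent(parts[1], k -> new ArrayList<>())
                        .add(parts[2]);
                return "ok";
            case "getComments":
                List<String> comments =
                        commentsByEvent.getOrDefault(parts[1], Collections.emptyList());
                return String.join(",", comments);
            default:
                throw new IllegalArgumentException("unknown command: " + parts[0]);
        }
    }
}

class SameLevelTest {
    // Act and assert both go through process(), the public entry point.
    static String addThenGet() {
        ToyCommandProcessor subject = new ToyCommandProcessor();
        subject.process("addComment eventId my_comment");
        return subject.process("getComments eventId");
    }

    public static void main(String[] args) {
        if (!"my_comment".equals(addThenGet())) {
            throw new AssertionError("comment was not stored");
        }
        System.out.println("addComment works as seen through getComments");
    }
}
```

Because the test never touches commentsByEvent, the storage could be replaced by a database or another data structure without the test changing.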

4. Focus!

  • Put tasks that pop up on a Later list instead of doing them at once
  • Focus on fixing the test first – however ugly and simple (and refactor later)
  • Focus on the current needs – no premature abstraction

One thing that really caught our attention is Kent’s focus on what he is doing at any moment. Being focused means concentrating on finishing the one thing you’re currently doing without getting distracted by other concerns, however important or simple to fix. (Side note: Never say never.) When you have a failing test, focus on making it pass quickly, no matter how ugly the (temporary) solution is or how many corners it cuts. If you notice something else along the way that needs to be done (giving a method a better name, removing dead code, fixing an unrelated bug), don’t do it; put it on a task list and do it later. Otherwise you risk losing your attention and the current context. Do one thing at a time. When making a test pass, focus just on that, and leave concerns such as good code until the subsequent refactoring (which should follow shortly). (This reminds me of the Mikado method for large-scale refactorings, whose main purpose is also to keep focus and avoid getting lost in many sidetracks.)

A related practice is to focus on the current needs when implementing a feature, without speculatively designing for tomorrow’s needs (possibly literally tomorrow). Focus on what is needed right now, to finish the current task, and make the solution simple so that it will be easy to refactor and extend for both known and unforeseen future needs. As Kent argues in Implementation Patterns (and others elsewhere), we’re very bad at speculative design, i.e. the future needs are usually quite different from what we expected, and therefore it’s better to create solutions that are simple and thereby also flexible. You of course need to pay some attention to future needs, but far less than we tend to do. Admit to yourself that you cannot predict the future. Even if you know what else is going to be required, how can you know that no new requirements will appear that change or delay it (possibly indefinitely)?

Some other stuff we learned

Parallel Design

Parallel design means that when changing a design, you keep the old design as long as possible while gradually adding the new one, and then gradually switch to the new design. This applies at both large and (surprisingly) small scale. Though it’s costly (you have to figure out how to run both, and keeping both takes more effort), it often pays off because it’s safer and it enables resumable refactoring, discussed below.

An example of a high-level parallel design is the replacement of an RDBMS with a NoSQL database. You’d start by implementing the code for writing into the new DB, then you would use it and write both to the old and the new one, then you would also start reading from the new one (perhaps comparing the results to the old code to verify their correctness) while still using the old DB’s data. Next you would start actually using the NoSQL DB’s data, while still writing to and reading from the old DB (so that you could easily switch back). Only when the new DB proves itself would you gradually remove the old DB.
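The migration stages described above can be sketched with a small wrapper that writes to both stores, compares reads, and lets you switch which store answers. This is only an illustration under stated assumptions: DocumentStore, InMemoryStore, and ParallelStore are hypothetical names, and two in-memory maps stand in for the RDBMS and the NoSQL database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical storage abstraction shared by the old and the new database.
interface DocumentStore {
    void save(String id, String body);
    String load(String id);
}

// In-memory stand-in used for both "databases" in this sketch.
class InMemoryStore implements DocumentStore {
    private final Map<String, String> docs = new HashMap<>();

    @Override
    public void save(String id, String body) {
        docs.put(id, body);
    }

    @Override
    public String load(String id) {
        return docs.get(id);
    }
}

// Runs the old and the new design side by side.
class ParallelStore implements DocumentStore {
    private final DocumentStore oldStore;
    private final DocumentStore newStore;
    private boolean serveFromNew = false;
    private int mismatches = 0;

    ParallelStore(DocumentStore oldStore, DocumentStore newStore) {
        this.oldStore = oldStore;
        this.newStore = newStore;
    }

    @Override
    public void save(String id, String body) {
        oldStore.save(id, body); // keep the old design fully working
        newStore.save(id, body); // gradually add the new one
    }

    @Override
    public String load(String id) {
        String fromOld = oldStore.load(id);
        String fromNew = newStore.load(id);
        if (!Objects.equals(fromOld, fromNew)) {
            mismatches++; // evidence that the new store cannot be trusted yet
        }
        return serveFromNew ? fromNew : fromOld;
    }

    // Flip once the new store has proved itself; the old one stays as a fallback.
    void switchToNew() {
        serveFromNew = true;
    }

    int mismatches() {
        return mismatches;
    }
}

class ParallelDesignDemo {
    public static void main(String[] args) {
        DocumentStore legacy = new InMemoryStore();
        legacy.save("old-doc", "written before the migration");

        ParallelStore store = new ParallelStore(legacy, new InMemoryStore());
        store.save("doc", "hello");

        System.out.println(store.load("doc"));     // same in both stores
        System.out.println(store.load("old-doc")); // only in the old store so far
        System.out.println("mismatches: " + store.mismatches());
    }
}
```

The mismatch counter is one possible way to decide when the new store “proves itself”; a real migration would also need to backfill data written before dual writes began.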

An example of a micro-level parallel design is the replacement of method parameters (message etc.) with the object they come from (an Edge), as we did for notifyComment:

- public void notifyComment(String message, String eventName, String user) {
-    notifications.add(user + ": commented on " + eventName + " " + message);
---
+ public void notifyComment(Edge target) {
+    notifications.add(target.getTo().getId() + ": commented on " + target.getFrom().getId() + " " + target.get("comment"));

The steps were:

  1. Adding the Edge as another parameter (Refactor – Change Method Signature)
  2. Replacing usages of the original parameters, one by one, with properties of the target Edge (with Infinitest running the tests automatically after each change to verify we’re still good)
  3. Finally removing all the original parameters (Refactor – Change Method Signature)

The good thing is that your code always works and you can commit or stop at any time.
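To illustrate why the code keeps working at every step, the sketch below shows one possible intermediate state (the Edge has been added and only the message has been switched over) next to the final state. Node, Edge, and Notifier here are hypothetical minimal stand-ins written for this example; only the method bodies follow the diff above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal stand-ins for Graft's node and edge types.
class Node {
    private final String id;

    Node(String id) {
        this.id = id;
    }

    String getId() {
        return id;
    }
}

class Edge {
    private final Node from;
    private final Node to;
    private final Map<String, String> properties = new HashMap<>();

    Edge(Node from, Node to) {
        this.from = from;
        this.to = to;
    }

    Node getFrom() {
        return from;
    }

    Node getTo() {
        return to;
    }

    String get(String key) {
        return properties.get(key);
    }

    Edge put(String key, String value) {
        properties.put(key, value);
        return this;
    }
}

class Notifier {
    final List<String> notifications = new ArrayList<>();

    // Intermediate state: the Edge parameter exists (step 1) and "message"
    // is already read from it (part of step 2); the rest is still old.
    void notifyComment(String message, String eventName, String user, Edge target) {
        notifications.add(user + ": commented on " + eventName + " " + target.get("comment"));
    }

    // Final state after step 3: only the Edge remains.
    void notifyComment(Edge target) {
        notifications.add(target.getTo().getId() + ": commented on "
                + target.getFrom().getId() + " " + target.get("comment"));
    }
}

class MicroParallelDemo {
    public static void main(String[] args) {
        Edge comment = new Edge(new Node("party"), new Node("alice")).put("comment", "nice");
        Notifier notifier = new Notifier();

        notifier.notifyComment("nice", "party", "alice", comment);
        notifier.notifyComment(comment);

        // Both states of the refactoring produce the same notification.
        if (!notifier.notifications.get(0).equals(notifier.notifications.get(1))) {
            throw new AssertionError("behavior changed during the refactoring");
        }
        System.out.println(notifier.notifications.get(1));
    }
}
```

In a real refactoring the two versions would not coexist as overloads; they are shown together here only so that the equivalence can be checked in one run.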

Resumable Refactoring

If you apply the practices described below when performing a larger-scale refactoring then your code will always be buildable and you will be able to stop in the middle of the refactoring and continue (or not) at any later time.

The practices are parallel design and going forward in small, safe steps, i.e. steps that provably do not break anything. In essence it’s about keeping oversight and control: at each step you know exactly what you changed, so if a test breaks you can not only quickly put the application back into a working state, but also quickly home in on what exactly caused the problem.

(The Mikado method mentioned above is a great guide for refactoring systems where every change reveals a number of other changes required to make it possible. Of course the ultimate resource for refactoring legacy systems is Michael Feathers’s Working Effectively with Legacy Code.)

Refactor on Green, at Will

The dogmatic TDD practitioners claim that you cannot change the behavior of production code unless some test forces you to do so. Thus it might be refreshing to hear that Kent doesn’t hesitate to generalize the code (e.g. by replacing fakes with a real implementation) even though there are no tests that require the generalization to pass.

On the other hand it doesn’t mean that forcing a generalization by tests is a bad thing or that you should not do it. This is basically a question of the economics of software development, of balancing the cost (of writing and maintaining tests) with the benefits (defect and regression prevention). It’s a question of the risk involved and of your confidence in your coding skills. Kent has rightfully much more confidence in his coding skills (and much more experience with it) than many of us. Our confidence is quite low based on past experiences and therefore we’ll probably go on enforcing generalizations with tests.

We’d close this topic by quoting Kent speaking about how much testing to do:

I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence (I suspect this level of confidence is high compared to industry standards, but that could just be hubris). If I don’t typically make a kind of mistake (like setting the wrong variables in a constructor), I don’t test for it. I do tend to make sense of test errors, so I’m extra careful when I have logic with complicated conditionals. When coding on a team, I modify my strategy to carefully test code that we, collectively, tend to get wrong.

Different people will have different testing strategies based on this philosophy, but that seems reasonable to me given the immature state of understanding of how tests can best fit into the inner loop of coding. Ten or twenty years from now we’ll likely have a more universal theory of which tests to write, which tests not to write, and how to tell the difference. In the meantime, experimentation seems in order.

Symmetry in the Code

Symmetry is an abstract concept, more specific than the values of communication, simplicity, and flexibility, but still rather general. In Implementation Patterns Kent refers to symmetry as a programming principle.

Code with symmetries is easier to grasp than code that is asymmetric. It’s easier to read and understand. So what, more specifically, is symmetric code? To quote Kent again:

Symmetry in code is where the same idea is expressed the same way everywhere it appears in the code.

Imagine code where the same idea, like “getting the last updated document from the DB,” is implemented several times. The code is asymmetric if the method names differ, if the methods do things in a different order, or if there are other important differences between them. When you ask yourself “what does this method do?” and arrive at pretty much the same answer for all of the methods in spite of all the differences, you have a violation of symmetry.

An example of symmetry in code is keeping the abstraction level consistent within a code block, such as a method. If the block is a mix of low-level assignments and method calls, you may want to see whether you can abstract away the assignments with a method. The astute reader has probably noticed that consistency is a large part of symmetry: being consistent with abstraction levels, consistent with method naming, and so on. But symmetry is more abstract in that it deals with ideas rather than rules (such as the rule that class and method names should be in camel case).
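A tiny sketch of the abstraction-level case: the first method mixes two method calls with a raw field update, while the second expresses all three steps the same way. The class and method names (Tally, input, incrementCount, output) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class Tally {
    int count = 0;
    final List<String> steps = new ArrayList<>();

    // Asymmetric: two named steps with a low-level field update in between.
    void processAsymmetric() {
        input();
        count++;
        output();
    }

    // Symmetric: every step is a method call at the same level of abstraction.
    void processSymmetric() {
        input();
        incrementCount();
        output();
    }

    private void input() {
        steps.add("input");
    }

    private void incrementCount() {
        count++;
    }

    private void output() {
        steps.add("output");
    }
}

class SymmetryDemo {
    public static void main(String[] args) {
        Tally a = new Tally();
        Tally b = new Tally();
        a.processAsymmetric();
        b.processSymmetric();

        // Same behavior; only the way the idea is expressed differs.
        if (a.count != b.count || !a.steps.equals(b.steps)) {
            throw new AssertionError("the two versions should behave identically");
        }
        System.out.println("count=" + b.count + ", steps=" + b.steps);
    }
}
```

Both methods do the same work; the symmetric one simply reads as three steps of the same kind.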

And What Do You Know, Even Some More …

  • Manage your energy – be aware of your energy and stop before becoming tired. Don’t forget to take breaks. A rested developer is multiple times more productive than a tired one. (J.B. Rainsberger in The Economics of Software Design shares the story of working so intensively that he became exhausted and totally unproductive).
  • Pair programming is a skill one must consciously learn (and it may be more challenging for some personality types, which should be respected)
  • Prefer the IDE’s refactorings to manual changes. For example, none of us had ever used the “inline” refactoring before, while Kent uses it all the time. Once you master the refactorings, they’ll become much more efficient than changing things manually and, more importantly, they avoid the small but non-zero probability of breaking something (remember that Murphy guy who said that what can break will break)

Code

You can find code for Iterate Code Camp 2012 on GitHub – bit.ly/codecamp2012

Conclusion

We hope that you, our dear reader, find some of these ideas interesting and have been inspired to try them in your daily practice, as we were.

Related Resources

  • Jakub’s blog post Principles for Creating Maintainable and Evolvable Tests summarizes some complementary principles for writing tests that he learnt from Kent
  • Rich Hickey: Simple Made Easy – a great talk that explains the crucial difference between “simple” (vs. complex) and “easy” and how our languages and tools aren’t as simple as they should be, often because they try to be easy

- Krzysztof, Stig, and Jakub, June 2012 -