Communicating With Code: Why, What, How?

Wednesday, August 20, 2014

Why, What, How?

I recently encountered an argument (Why Most Unit Testing is Waste) from Jim Coplien arguing forcefully against unit testing as it is generally practiced in the industry today. I generally like his thinking, but I cannot agree with this thesis, because I think unit tests fill a different role than the one Mr. Coplien (and, I think, most others in our profession) takes them to fill. To say what I mean, I'll start by taking a step back.

At root, I think the design process for a piece of software consists of answering three questions:

Why? What problem am I trying to solve, and why does this problem need a solution?
What? What would the results be if I had a solution to this problem?
How? How could the solution work?

There is a natural order in which these questions are asked during design, and it is the order in which they are listed above: you should understand "why" you need a solution before you decide "what" your solution will do, and you should understand "what" your solution will do before you decide "how" it will do it.

These questions will also often be asked hierarchically: the "why" answer characterizing why you create a product might result in a "how" answer characterizing the domain objects with which your product is concerned. But the answer to "how do I organize the concepts in my domain" is actually another "why" question: "why are these the correct concepts to model?". And this "why" question will lead to another "how" answer characterizing the roles and responsibilities of each domain object. And so on down, where "how" answers at one level of abstraction become "why" questions at the more concrete level, until one reaches implementation code, below which it is unnecessary to descend.

It's also notable that there is only one question in this list whose answer will be guaranteed to be visible in a design artifact, and will be guaranteed to be consistent with the execution model of the system: the question, "how does this work?", is ultimately answered in code. Neither of the other questions will necessarily be answered in a design artifact, and even if they are answered in a design artifact, it is likely that this artifact will become inconsistent with the design, over time, unless there is some force working against this. And as design artifacts grow stale, they become less useful. In the end (and again, in the absence of some force pulling in the other direction), the only documentation guaranteed to be useful in understanding a design is the code itself.

This is unfortunate. Because design (including implementation!) is a learning process, our understanding of why we make certain decisions can and will change significantly during design, almost guaranteeing significant drift between early design documentation and the system as built. If, in mitigating this problem, one relies primarily on the code for design documentation, then it takes significant mental effort to work out "what" the module does from "how" it does it, and still more effort to go backwards from "what" the module does to "why" it does it - that is, in relying primarily on the code for design documentation, you are two degrees removed from understanding the design motivation.

Consider, instead, code with a useful test suite. While the "how does this work?" question is answered by the system code itself, the "what does this do?" question is answered by the test code. With a good test suite, you will see the set of "what" answers that the designer thought were relevant in building the set of code under test. You do not need to work backwards from "how", and you have removed a degree of uncertainty in trying to understand the "why" behind a set of code. And, if the test suite is run with sufficient frequency, then the documentation for these "what" answers (that is, the test code itself) is much less likely to drift away from the execution model of the system. Having these two views into the execution model of the system — two orthogonal views — should help maintainers more rapidly develop a deeper understanding of the system under maintenance.
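To make the "how" versus "what" split concrete, here is a minimal sketch in Python. The function normalize_whitespace and its test class are hypothetical, invented only for this illustration. The function body answers "how does this work?", while each test name states one "what" that the designer considered relevant.

```python
import unittest


# Hypothetical unit, invented for illustration only.
def normalize_whitespace(text: str) -> str:
    """The "how": split on any run of whitespace, then rejoin with single spaces."""
    return " ".join(text.split())


class NormalizeWhitespaceTest(unittest.TestCase):
    """The "what": each test name records one behavior the designer cared about."""

    def test_collapses_runs_of_spaces_to_one(self):
        self.assertEqual(normalize_whitespace("a    b"), "a b")

    def test_strips_leading_and_trailing_whitespace(self):
        self.assertEqual(normalize_whitespace("  a b \n"), "a b")

    def test_treats_tabs_and_newlines_as_spaces(self):
        self.assertEqual(normalize_whitespace("a\tb\nc"), "a b c")

    def test_empty_input_stays_empty(self):
        self.assertEqual(normalize_whitespace(""), "")
```

Saved to a file, the suite can be run with `python -m unittest <file>`. A maintainer who reads only the four test names learns what the unit promises without reverse-engineering the one-line implementation, and the tests fail as soon as that promise and the code drift apart.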

Furthermore, on the designer's part, the discipline of maintaining documentation not just for "how does a process work?" but also for "what does the process do?" (in other words, having to maintain both the code and the tests) encourages more rigorous consideration of the system model than would otherwise be the case: if I can't figure out a reasonable way to represent "what does this do?" in code (that is, if I can't write reasonable test code), then I take it as a hint that I should reconsider my modular breakdown. Remember Dijkstra's admonition:

As a slow-witted human being I have a very small head and I had better learn to live with it and to respect my limitations and give them full credit, rather than try to ignore them, for the latter vain effort will be punished by failure.

Because it takes more cognitive effort to maintain both the tests and the code than to maintain the code alone, I must consider simpler entities if I'm to keep both the code and the tests in my head at once. When this constraint is applied throughout a system, it encourages complexity reduction in every unit of the system — including the complexity of inter-unit interactions. Since complexity is being reduced while maintaining system function, it must be the accidental complexity of the system being taken out, so that a higher proportion of the remaining system complexity is essential to the problem domain. By previous argument, this implies that the unit-testing discipline encourages increased solution elegance.
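As a small, hedged illustration of a test that is hard to write hinting at a better breakdown, consider the sketch below. Both function names are hypothetical. The first version hides a dependency on the system clock, so its "what" cannot be stated as a simple assertion. The second makes the clock an explicit input, and the "what" becomes a one-line check.

```python
from datetime import date


# Hypothetical example. The result depends on the day the code runs,
# so a test cannot state the expected value without working around the clock.
def days_until_due_tangled(due: date) -> int:
    return (due - date.today()).days


# After reconsidering the breakdown: "today" is an explicit input,
# so the unit is simpler and its behavior can be asserted directly.
def days_until_due(due: date, today: date) -> int:
    return (due - today).days


# The "what", written as a plain assertion: five days separate the two dates.
assert days_until_due(date(2014, 8, 25), date(2014, 8, 20)) == 5
```

The complexity removed here (the hidden clock) is accidental rather than essential: the problem only ever required the difference between two dates.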

This makes unit-testing discipline an example of what business types call "synergy" - a win/win scenario. On the design side, following the discipline encourages a design composed of simpler, more orthogonal units. On the maintenance side, the existence of tests provides an orthogonal view of the design, making the design of individual units more comprehensible. This makes it easier for maintainers to develop a mental model of the system (going back to the "why" question that motivated the system in the first place), so that required maintenance will be less likely to result in model inconsistency. A more comprehensible, internally consistent system is less risky to maintain than a less comprehensible or internally inconsistent one. Unit testing encourages design elegance.

4 comments:

Cope, August 21, 2014 at 12:22 PM
Thanks for your thoughts, Aidan. If you want to take a design perspective on this, I think you have centuries or millennia of design experience working against you. Tests cannot prove success, only failure; test-oriented thinking by a coder can avoid only those pitfalls that the coder conceived during design. The reason this is important is that goodness is much more than the absence of badness.

As I write in Chapter 2 (soon to be published), the software engineering data bear this out. Capers Jones notes that the efficiency of unit testing is as low as 10% and, in any case, is the least efficient way we know to remove defects from code. I think coders celebrate unit testing because it is within their sphere of control. On the other hand, so are code walkthroughs and their participation in design inspections, which are manyfold more effective.

In terms of documentation of code, we know many ways that are more effective than tests. Tests are a very low-context form of communication: they in fact communicate less information than the code, in a form inaccessible to most stakeholders, in a way that covers a minuscule set of the concerns about code behaviour and, furthermore, in a way that covers a minuscule set of the possible paths through the code. From the perspective of information theory they are one of the most inefficient ways that one can devise to convey information about the workings of an algorithm. And the working of an algorithm is a tiny fraction of the concerns one must master to use a given API. I think it will be a long time before we displace natural language as the dominant form of code documentation, and I think there are good reasons far beyond mere inertia that help it hold its dominant place as the way we communicate code's functionality.

In the end I think it's a combination of aversion for teamwork and social activities, and a favour for individual, introverted action, combined with a feeling of autonomous control over one's fate, that leads to the broad acceptance of unit testing. In such a bubble it's easy to become unaccountable to business goals and to ignore the forces that contribute to code maintainability and quality.
Cope, December 13, 2014 at 12:20 PM
And thanks for a gracious retort. I see we still have some disagreement, but I appreciate you for keeping the pot boiling on the stove. The rest, we'll have to take up over beers some day.