Wednesday, December 10, 2014

Bootstrapping Rust

There are two stories I particularly like about the problem of bootstrapping self-referential tools. This one:

Another way to show that LISP was neater than Turing machines was to write a universal LISP function and show that it is briefer and more comprehensible than the description of a universal Turing machine. This was the LISP function eval[e,a], which computes the value of a LISP expression e - the second argument a being a list of assignments of values to variables. (a is needed to make the recursion work). Writing eval required inventing a notation representing LISP functions as LISP data, and such a notation was devised for the purposes of the paper with no thought that it would be used to express LISP programs in practice. Logical completeness required that the notation used to express functions used as functional arguments be extended to provide for recursive functions, and the LABEL notation was invented by Nathaniel Rochester for that purpose. D.M.R. Park pointed out that LABEL was logically unnecessary since the result could be achieved using only LAMBDA - by a construction analogous to Church's Y-operator, albeit in a more complicated way.

S.R. Russell noticed that eval could serve as an interpreter for LISP, promptly hand coded it, and we now had a programming language with an interpreter.

And this one:

Then, one day, a student who had been left to sweep up after a particularly unsuccessful party found himself reasoning in this way: If, he thought to himself, such a machine is a virtual impossibility, it must have finite improbability. So all I have to do in order to make one is to work out how exactly improbable it is, feed that figure into the finite improbability generator, give it a fresh cup of really hot tea... and turn it on!

Rust is, I think, the programming language I've been waiting for: it takes a large proportion of the ideas I like from functional languages, applies them in a systems programming language context with zero-overhead abstractions, can run without a supporting runtime, and makes "ownership" and "lifetime" into first-class concepts. In my work trying to build highly complex and robust embedded systems, these are exactly the features I've wanted, and I'm thrilled to see a modern programming language take this application space seriously.

Building the language on a second-class platform, though, was intensely frustrating. The frustration came from several places, including that my Unix systems-administration skills have gotten (*ahem*) rusty, but I'd say the bulk of my difficulty came from a single factor: circular dependencies in the tooling.

  • Building the Rust compiler requires a working Rust compiler.
  • Building the Cargo package manager requires a working Cargo package manager.

These problems turned out to be ultimately surmountable: The rustc compiler had been previously built for a different version of FreeBSD, but once I re-compiled the required libraries on my version, I managed to get the pre-built compiler running. For the Cargo package manager, I ended up writing a hacky version of Cargo in Ruby, and using that to bootstrap the "true" build. I'm glossing over it here, but this turned out to be a lot of effort to create: Rust is evolving rapidly, and it was difficult to track new versions of Cargo (and its dependencies) that depended on new versions of the Rust compiler, while the Rust compiler itself sometimes did not work on my platform.

This was, necessarily, my first exposure to the platform in general, and it was, unfortunately, not a positive experience for me. Circular dependencies in developer tools are completely understandable, but there should be a way to break the cycle. Making bootstrap a first-order concern does not seem that difficult to me, in these cases, and would greatly enhance the portability of the tools (maybe even getting Rust to work on NetBSD, on GNU/Hurd, or on other neglected platforms):

  • The dependency of the compiler on itself can be broken by distributing the bootstrap compiler as LLVM IR code. Then use LLVM's IR assembler, and other tools, to re-link the stage0 Rust compiler on the target platform. The stage0 compiler could then be used to build the stage1 compiler, and the rest of the build could proceed as it does today. (Rust issue #19706)
  • The dependency of the package manager on itself can be broken by adding a package manager target to, say, tar up the source code of the package manager and its dependencies, along with a shell script of build commands necessary to link all that source code together to create the package manager executable. (Cargo issue #1026)

I am not suggesting that we should avoid circular dependencies in developer tools: Eating our own dogfood is an extremely important way in which our tools improve over time, and tools like this should be capable of self-hosting. But the more difficult it is to disentangle the circular dependency that results, the more difficult it becomes to bootstrap our tools, which will ultimately mean fewer people adopting our tools in new environments. Most of us aren't building Infinite Improbability Drives. Bootstrapping should be a first-order concern.

Monday, November 17, 2014


Taking a cue from Julia Evans, linking this here to remind myself to re-read it occasionally:

John Allspaw's notes On Being a Senior Engineer

Sunday, November 9, 2014

Think Big, Act Small

There are, I think, two big dangers in designing software (which probably apply to other types of creative activity, as well):

  • No up-front design.
  • Requiring complete design to begin implementation.

I call these the two "big" dangers because there are strongly seductive aspects to each, and because each approach has been followed in ways that have driven projects into the ground. I think these approaches are followed because many project planners like to frame the plan in terms of "how much design is necessary before implementation starts?". When you ask the question this way, "no up-front design" and "no implementation without a coherent design" are the simplest possible answers, and each of these answers is reasonable in its way:

  • When avoiding up-front design, you can get straight to the implementation, allowing you to show results early, ultimately allowing you to get earlier feedback from your target audience about whether you're building the correct product.
  • When avoiding implementation before the design is fleshed out, you can be more confident that parts of the implementation will fit together as intended, and that fewer parts of the system will require significant rework when requirements change late in the process. It is also much easier to involve more people in the implementation, since you can rely more on the formal design documentation to coordinate the work.

(I could also make a list of the cons, but ultimately the cons of each approach are visible in the pros of the other. For example, by avoiding up-front design, it becomes much harder to scale the implementation effort to a larger group than, say, 10 or 15 people: The cost of producing formal specifications is high, but (relatively) fixed, while the cost of informal communications starts low, but increases with the square of the number of individuals in the group.)
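The quadratic growth of informal communication can be made concrete with a small sketch. It assumes, as a simplification not stated in the text above, that every pair of people on the team is one potential communication path; the function name communication_paths is made up for this illustration.

```python
def communication_paths(team_size: int) -> int:
    """Count the distinct pairs of people in a team: n * (n - 1) / 2.

    Assumption for illustration: every pair of team members is one
    potential informal communication path.
    """
    if team_size < 0:
        raise ValueError("team_size must be non-negative")
    return team_size * (team_size - 1) // 2


# Doubling the team roughly quadruples the number of paths.
for n in (5, 10, 15, 30):
    print(f"{n:>2} people -> {communication_paths(n):>3} paths")
```

Under that assumption a team of 10 has 45 paths and a team of 15 has 105, which is one way to see why informal coordination gets expensive at about that size.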

As I write today, the dominant public thinking in the software developer community is broadly aligned against Big Design Up Front, and towards incremental, or emergent design. I generally share this bias: I consider software design to be the art of learning what the constructed system should look like, in enough detail that the system becomes computer operational (Design Is Learning), and I think learning is better facilitated when we have earlier opportunity for feedback from our design decisions. Further, if we make mistakes in our early design, it's much less expensive to fix those mistakes directly after making them than it is after they become ingrained in our resulting system. It's extremely important to get feedback about the suitability of our design decisions quickly after making them.

But the importance of early feedback does not reduce the importance of early big-picture thinking about the design. I work mostly in real-time embedded systems programming, and in this domain, you cannot ignore the structure of the execution path, or even of the data accessed, between event stimulus and time-critical response. Several operations would have been easier to implement had I ignored these real-time concerns, and the problems that would have resulted would have been invisible in early development (when the system wasn't as stressed, and real-time constraints were looser). On the other hand, we would not have been able to make that system work in that form: large amounts of code would likely have needed rewriting to meet higher load and tighter timing constraints as our system got closer to market. The extra effort put into being able to control the timing of specific parts of our execution path was critical to our system's ability to adapt to tightening real-time requirements.

Which all serves to introduce the core four words of this little essay: "Think Big, Act Small." In other words, consider the whole context of the design you are working on ("think big"), while making sure to frequently test that you are moving towards your goal ("act small"). So, if I'm working on a part of the system, I don't think only of that part, but also of all the other parts with which it will interact, of its current use and of its possible future uses, and of the constraints that this part of the system must operate under. (That is, I try to understand the part's whole context as I design and build it.) On the other hand, if I'm trying to design some large-scale feature, I think of how it breaks down into pieces, I try to think about which of these pieces I'm likely to need the soonest, what sorts of requirements changes are likely to occur, what parts of the break-down those changes are likely to affect the most, and, of the big design, how much work we actually need to do to meet our immediate needs. (That is, I try to break down the big design into the smallest steps I can that will a) demonstrate progress towards the immediate goal, and b) be consistent with likely future changes.) By the time I actually begin to write the software code, I have probably thought about an order of magnitude larger portion of the system than what I will write to complete my immediate task.

This is hard work, and it can feel like waste to spend time designing software that, often, will never be written. Patience and nerves get worn out, trying to hold a large part of the system in my head at once before the design of a feature or subsystem finally gels and implementation can start. On the other hand, I've found the designs I've created in this way have tended to be much more stable, and maintenance on individual modules tends not to disturb other parts of the system (unless the maintenance task naturally touches that other part of the system as well). In other words, I feel these designs are very well factored. It takes a lot of effort to get there, but echoing an idea made famous by Eisenhower ("plans are worthless, but planning is everything"), in the end I would rather spend up-front time thinking about how to write code that will never need to be written, than spend time at the back-end thinking about how to re-write subsystems that will never meet their design objectives.

Think big: what is the whole context of the effort you are thinking of undertaking? Act small: how can you find out if the path you are on is correct, as early as possible? Know your objectives early, test your decisions early, and adapt to difficulties early, to achieve your goals.

(PS: for more on this theme, see here.)

Wednesday, August 20, 2014

Why, What, How?

I recently encountered an argument (Why Most Unit Testing is Waste) from Jim Coplien arguing forcefully against unit testing as it is generally practiced in the industry today. I generally like his thinking, but find I cannot agree with this thesis, because I think of unit tests as filling a different role than Mr. Coplien (and, I think, most others in our profession) thinks they fill. In order to say what I mean, I'll start by taking a step back.

At root, I think the design process for a piece of software consists of answering three questions:

  1. Why? What problem am I trying to solve, and why does this problem need a solution?
  2. What? What would the results be, if I had a solution to this problem?
  3. How? How could the solution work?

There is a natural order in which these questions are asked during design, and it is the same order that they are listed above: you should understand "why" you need a solution before you decide "what" your solution will do, and you should understand "what" your solution will do before you can decide "how" it will do it.

These questions will also often be asked hierarchically: the "why" answer characterizing why you create a product might result in a "how" answer characterizing the domain objects with which your product is concerned. But the answer to "how do I organize the concepts in my domain" is actually another "why" question: "why are these the correct concepts to model?". And this "why" question will lead to another "how" answer characterizing the roles and responsibilities of each domain object. And so on down, where "how" answers at one level of abstraction become "why" questions at the more concrete level, until one reaches implementation code, below which it is unnecessary to descend.

It's also notable that there is only one question in this list whose answer will be guaranteed to be visible in a design artifact, and will be guaranteed to be consistent with the execution model of the system: the question, "how does this work?", is ultimately answered in code. Neither of the other questions will necessarily be answered in a design artifact, and even if they are answered in a design artifact, it is likely that this artifact will become inconsistent with the design, over time, unless there is some force working against this. And as design artifacts grow stale, they become less useful. In the end (and again, in the absence of some force pulling in the other direction), the only documentation guaranteed to be useful in understanding a design is the code itself.

This is unfortunate. Because design (including implementation!) is a learning process, our understanding of why we make certain decisions can and will change significantly during design, almost guaranteeing significant drift between early design documentation and the system as built. If, in mitigating this problem, one relies primarily on the code for design documentation, then it takes significant mental work to work out "what" the module does from "how" it does it, and still more work to go backwards from "what" the module does to "why" it does it - that is, in relying primarily on the code for design documentation, you are two degrees removed from understanding the design motivation.

Consider, instead, code with a useful test suite. While the "how does this work?" question is answered by the system code itself, the "what does this do?" question is answered by the test code. With a good test suite, you will see the set of "what" answers that the designer thought were relevant in building the set of code under test. You do not need to work backwards from "how", and you have removed a degree of uncertainty in trying to understand the "why" behind a set of code. And, if the test suite is run with sufficient frequency, then the documentation for these "what" answers (that is, the test code itself) is much less likely to drift away from the execution model of the system. Having these two views into the execution model of the system — two orthogonal views — should help maintainers more rapidly develop a deeper understanding of the system under maintenance.
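As a minimal sketch of this split, consider a hypothetical helper, copy_name, invented for this illustration. The function body answers "how"; each test name and assertion records one "what" that the designer considered relevant.

```python
def copy_name(name: str, max_length: int) -> str:
    """The "how": keep at most max_length characters of name."""
    return name[:max_length]


# The "what": each test states one behavior the designer cared about.
def test_short_names_are_copied_unchanged():
    assert copy_name("Ada", 10) == "Ada"


def test_long_names_are_cut_to_the_limit():
    assert copy_name("Ada Lovelace", 3) == "Ada"


def test_a_zero_limit_yields_an_empty_name():
    assert copy_name("Ada", 0) == ""


# Running the tests regularly keeps the "what" answers from drifting
# away from the code they describe.
for test in (
    test_short_names_are_copied_unchanged,
    test_long_names_are_cut_to_the_limit,
    test_a_zero_limit_yields_an_empty_name,
):
    test()
print("all stated behaviors hold")
```

A maintainer can read the three test names without reading the slice expression at all, and still learn what the function promises.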

Furthermore, on the designer's part, the discipline of maintaining documentation for not just "how does a process work" but also "what does the process do" (in other words, having to maintain both the code and the tests) encourages more rigorous consideration of the system model than would otherwise be the case: if I can't figure out a reasonable way to represent "what does this do" in code (that is, if I can't write reasonable test code), then I take it as a hint that I should reconsider my modular breakdown. Remember Dijkstra's admonition:

As a slow-witted human being I have a very small head and I had better learn to live with it and to respect my limitations and give them full credit, rather than try to ignore them, for the latter vain effort will be punished by failure.

Because it takes more cognitive effort to maintain both the tests and the code than to maintain the code alone, I must consider simpler entities if I'm to keep both the code and the tests in my head at once. When this constraint is applied throughout a system, it encourages complexity reduction in every unit of the system — including the complexity of inter-unit interactions. Since complexity is being reduced while maintaining system function, it must be the accidental complexity of the system being taken out, so that a higher proportion of the remaining system complexity is essential to the problem domain. By previous argument, this implies that the unit-testing discipline encourages increased solution elegance.

This makes unit-testing discipline an example of what business-types call "synergy" - a win/win scenario. On the design side, following the discipline encourages a design composed of simpler, more orthogonal units. On the maintenance side, the existence of tests provides an orthogonal view of the design, making the design of individual units more comprehensible. This makes it easier for maintainers to develop a mental model of the system (going back to the "why" question that motivated the system in the first place), so that required maintenance will be less likely to result in model inconsistency. A more comprehensible, internally consistent system is less risky to maintain than a less comprehensible or internally inconsistent system would be. Unit testing encourages design elegance.

Sunday, July 27, 2014

Sources of Accidental Complexity

Recalling my preferred definition of solution elegance:

TotalComplexity = EssentialComplexity + AccidentalComplexity
Elegance = EssentialComplexity / TotalComplexity

So we want to reduce accidental complexity. Which requires identifying accidental complexity. This is subjective - the perception of complexity will vary by observer (I wrote about my experience here), and the same observer will perceive the same code as differently complex at different times. It will often be the case that two types of accidental complexity will be in opposition to each other: reducing accidental complexity of type A results in accidental complexity of type B, and reducing complexity of type B results in complexity of type A. So this will be a balancing act, and choosing the right balance means knowing your audience.

So far, so abstract. This will hopefully become clearer through example. If we accept that perceived complexity is proportional to the effort that must be spent to work with a system under study, then we can look at different types of effort we might spend, and then say when that type of effort can be considered "accidental" or "essential" to the system. This may become an ongoing series, so I'll start with two such types of effort that I'll call "lookup effort" and "interpretation effort".

Lookup effort

"Lookup effort" is the effort spent referring to external documents to understand the meaning of a piece of code. I use the term "document" broadly in this context: an external document may refer to traditional published documentation, or it may refer to source code located elsewhere in the system. The important point is that it is external: having to maintain the context of the code under study in your head while you look something up feels like an extra effort, which means it makes the system feel more complex.

This type of effort can feel "essential" to understanding the system as a whole. It will feel "essential" when the external reference neatly matches a domain boundary: in this case, the fact that the lookup is external reinforces that the domain includes a separation of concerns. The fact of having to traverse to an external reference can teach you something about the domain.

On the other hand, this type of effort will feel "accidental" in just about every other scenario:

  • when reading code, having to look up a library function, or language feature, that isn't well understood by the reader (as opposed to inlining the called construct, using mechanisms that the reader already understands);
  • when debugging code, building a highly detailed model of runtime behavior (which often requires moving rapidly through several layers of abstraction, to come to complete understanding);
  • when writing code, determining the calling conventions (function name, argument order and interpretation) of a function to be called.

In fact, I'd argue that the major force preventing this type of effort from becoming overwhelming is the fact that it disappears as our system vocabulary improves: as our language vocabulary improves, we don't need to refer to the dictionary as frequently. The time spent looking concepts up disappears when the concept is already understood. For example, if I am reasonably familiar with C, and read the code:

strncpy(CustomerName, Arg, MaxNameLength);

I do not need to refer to any reference documentation to understand that we are copying at most MaxNameLength characters of the string at Arg to the memory at CustomerName: I've already internalized what strncpy means and how it works, so that reading this line of code does not cause me to break my flow. On the other hand, at the time I wrote this question on Stack Overflow, I was not prepared to understand mapM (enumFromTo 0). Since I understand strncpy very well already, it has no associated lookup effort. However, at the time I wrote that question, mapM (enumFromTo 0) had an extremely high lookup effort, as it relied on familiarity with concepts I did not yet understand, and vocabulary I had not yet developed.

Interpretation effort

"Interpretation effort" is the effort that must be spent to develop an understanding of a linear block of code. Several metrics have been developed that can give a sense of the scale of interpretation effort (some of my favorites are McCabe's cyclomatic complexity and variable live time; I couldn't easily find an on-line reference for the latter, but the metric is described in McConnell's Code Complete), but it will usually be true that a longer linear block of code will be more effort to interpret than a shorter block.

Linear blocks of code will ideally be dense with domain information, so that interpretation effort should feel essential to understanding the sub-domain. Since code blocks are the basic unit in which we read and write code, and since the problem domain should be dominant in our thinking as we read and write code, linear code blocks will naturally have a high density of information about the domain. Except when they don't. Which will happen by accident. I'm trying (failing?) to be cute, but to be more straightforward about it, what I mean is that the default state of a linear code block is to consist of problem essence. It is accident that pulls us out of this state.

But software is accident-prone. Among the sources of accidental complexity in linear code blocks are:

  • Repeated blocks of structurally identical code, AKA cut-and-paste code. This code must be re-interpreted each time it is encountered. If it were (properly) abstracted into a named external block, it would only need to be read and interpreted once, and given a name by which the common function can become part of the system vocabulary.
  • Inconsistent style. When software is maintained by multiple authors, it is unlikely that the authors will naturally have the same preferences regarding indentation style, symbol naming, or other aspects of style. To the extent that the written code does not look like the product of a single mind, there will be greater interpretation effort, as the reader must try to see the code through each maintainer's mind as she tries to understand the code in question.
  • Multiple levels of abstraction visible at once. See this.
  • Boilerplate code.
  • And much more!
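The first item in the list above can be sketched in a few lines. All names here (report_cut_and_paste, cm_to_inches, report_with_helper) are hypothetical and exist only for this illustration; the two report functions behave identically, but the second gives the repeated conversion a name that can enter the system vocabulary.

```python
def report_cut_and_paste(width_cm: float, height_cm: float) -> str:
    # The same conversion is spelled out twice, so the reader has to
    # interpret it twice and confirm that both copies really match.
    width_in = round(width_cm / 2.54, 1)
    height_in = round(height_cm / 2.54, 1)
    return f"{width_in} x {height_in} in"


CM_PER_INCH = 2.54


def cm_to_inches(length_cm: float) -> float:
    """Convert centimetres to inches, rounded to one decimal place."""
    return round(length_cm / CM_PER_INCH, 1)


def report_with_helper(width_cm: float, height_cm: float) -> str:
    # The conversion is read and understood once, then referred to by name.
    return f"{cm_to_inches(width_cm)} x {cm_to_inches(height_cm)} in"


print(report_cut_and_paste(25.4, 50.8))
print(report_with_helper(25.4, 50.8))
```

The helper version trades a little interpretation effort for a little lookup effort: the first time through, the reader has to go and find cm_to_inches.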


"Interpretation effort" forms a natural pair with "lookup effort": decreasing lookup effort (by inlining code) will naturally come at the expense of increasing interpretation effort, while decreasing interpretation effort (by relying on external code for part of the function's behavior) will tend to increase lookup effort. There are guidelines that will generally be useful in picking the right balance in your design. (Aiming for high cohesion in code is intended to reduce interpretation effort, aiming for low coupling is intended to reduce lookup effort, and aiming for high fan-in and low fan-out is intended to help minimize the required system vocabulary.) In general, I would bias towards having greater "lookup effort" than "interpretation effort" in a design, as lookup effort can be eliminated by improving system vocabulary while interpretation effort will always be present. This advice will apply in most situations, but will not necessarily apply in all. Internalizing not just the rules, but also the rationale for the rules, will make it possible for you to make the right decisions for your audiences.

Sunday, June 1, 2014

When Writing Code, Know Your Audience

It's well understood that software code is a document written primarily for two different types of audience:

  1. The platforms that will execute the code, and
  2. The developers who will maintain the code.

To support these audiences, code is written simultaneously for execution and for maintainability. Execution is always the primary concern, as execution is why we have software. Maintainers will therefore want to develop a coherent model of how the code under maintenance will execute, what its assumptions are about its environment, and what promises the code makes to its clients. Well-written code makes developing this model as easy as possible (Elegance Is Teaching). Which means, to write code well, one has to know something about the audience that will read it. And this will depend on the type of project you are working on.

Throwaway Projects

The easiest type of code to write is of the throwaway type. That is, code written to satisfy some short-term need, then disposed of. Examples include shell-scripts that you write on the command line, and experimental code written to test or develop your understanding of a platform. This is the easiest type of code to write, for two reasons:

  1. You are not concerned about anyone else reading this code.
  2. The entire model of the code's behavior lives in your working memory. (If it does not, then you might consider that this isn't throwaway code.)

As such, for this type of code (and, in my opinion, only for this type of code), you do not need to worry about making the code reflect your intentions as you are writing it.

Closed-Source Personal Projects

Then there is the project which is longer-lived, and will likely require maintenance across its life-time, but for which the only audience that matters is you, yourself. Other things being equal, this ought to be the second easiest type of code to write: it is much easier to know the audience, to know what their skills are, and what ideas they already know or can easily pick up, when you are your own audience.

Still, it's a fair bet that you'll forget parts of how it works, and if you ever need to maintain something you wrote yourself, there will certainly be times when you need to remind yourself of the design. You will want to have either explicit design documentation, or, preferably, to have your design be self-documenting.

Project Contributor

Then there is the case where you are making a contribution to a code base controlled by someone else. In this case, it's really up to the project maintainers to provide guidance on the mechanisms you should use to make your code maintainable. But practices like having good names for your symbols and creating well-factored code will be appreciated in any project.

Project Maintainer

Finally, you have the case where you are an active maintainer, in some capacity, of a project. In this case, your decisions will be a key factor in determining your intended audience, so your audience should be considered as you make decisions.

It's probably sufficiently clear that considering your audience will affect the nature of your project's documentation. But audience considerations can also have a significant effect on your project's technical decisions. For example, considering your audience may guide you towards more or less advanced techniques when making architectural decisions: Doug Simon noted (at around the 40 minute mark) that basing the Maxine JVM on continuations made it less accessible to some of his desired audience than a more mainstream stack-oriented architecture would be. On the other hand, Bryan O'Sullivan argued that using Haskell allowed his start-up to attract better developers. One wanted to expand the audience, by reducing reliance on less well-understood techniques, while the other wanted to reduce the developer audience, by increasing reliance on a less well-understood language. In either case, the nature of the potential audience informed a technical decision.

If you will be working closely with the same, small group of people on a project for a long time (for example, for a large-scale work project), then you may not need much explicit design documentation: informal communication may suffice to spread design knowledge through the team. If you have a high degree of turnover on your project, then you will probably want to invest more in introductory documentation, and your code base should not require a high level of refinement to use productively.

There is a plethora of ways that understanding the audience for your software should influence its technical design; I only scratch the surface here. As in any other form of communication, understanding your listener is critical to success. Your code can communicate to multiple audiences. To produce beautiful code, you must write with those audiences in mind.

Saturday, March 22, 2014

Elegance Is Teaching

While the design process is a learning process, an elegant design can be considered a teaching tool. In order to justify this argument, I'll have to explain what I mean by elegance.

I use an operational definition of solution elegance. Recalling Fred Brooks, solution complexity can be broken down into two components:

  • There is the essential complexity, which is complexity inherent in the solution you are trying to provide,
  • And there is the accidental complexity, which is not.

In my opinion, a solution can be called elegant when it has a high essential-complexity to total-complexity ratio. As an equation, I'd say:

TotalComplexity = EssentialComplexity + AccidentalComplexity
Elegance = EssentialComplexity / TotalComplexity

An "Elegance" value of 1 represents a perfectly elegant solution (i.e., a solution in which all the solution complexity is essential, and none is accidental), and a value close to 0 represents a big ball of mud. (A value of 0 is considered impossible, as any solution must have some essential complexity.) This is an equation, but because it describes a subjective phenomenon (the perception of elegance), it's built with subjective terms (essence, accident, complexity). Still, we're all humans, and we're all more similar than we are different, so it's likely that we'll share the same basis for deciding what is complex, what is essential, and what is not.
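To see how the ratio behaves, here is a minimal sketch. The function name elegance and the sample figures are assumptions made for illustration: complexity is subjective, so the inputs are in arbitrary units and only the ratio carries meaning.

```python
def elegance(essential: float, accidental: float) -> float:
    """Elegance = EssentialComplexity / TotalComplexity.

    Inputs are in arbitrary (assumed) units of perceived complexity.
    """
    if essential <= 0:
        # Any solution must have some essential complexity,
        # so a ratio of exactly 0 cannot occur.
        raise ValueError("essential complexity must be positive")
    if accidental < 0:
        raise ValueError("accidental complexity cannot be negative")
    total = essential + accidental
    return essential / total


print(elegance(5, 0))    # no accidental complexity at all
print(elegance(8, 2))    # mostly essential complexity
print(elegance(1, 99))   # approaching a big ball of mud
```

Removing accidental complexity while leaving the essential part alone always moves the value towards 1, which matches the intuition that cleanup makes a solution more elegant.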

This is my basis: perceived complexity is proportional to the amount of effort one must spend in developing understanding of the system under study. (I won't go further into it here, but note that the word "perceived" is used to point out that the experience of complexity will vary with the observer.) Essential complexity is recognizable as that part of the implementation that teaches you something about all possible solutions, while accidental complexity is what's left.

In other words, a design (including implementation code) is recognizable as elegant when it teaches you about the domain. An elegant design can be used as a teaching tool.

Monday, March 10, 2014

Mastery Against Generality

LtU pointed me to Design Principles Behind Smalltalk. I'll quote the first design principle named, because it illustrates what I think is a mistake in reasoning common to those of a mathematical bent:

Personal Mastery: If a system is to serve the creative spirit, it must be entirely comprehensible to a single individual.

The point here is that the human potential manifests itself in individuals. To realize this potential, we must provide a medium that can be mastered by a single individual. Any barrier that exists between the user and some part of the system will eventually be a barrier to creative expression. Any part of the system that cannot be changed or that is not sufficiently general is a likely source of impediment. If one part of the system works differently from all the rest, that part will require additional effort to control. Such an added burden may detract from the final result and will inhibit future endeavors in that area. We can thus infer a general principle of design:

Good Design: A system should be built with a minimum set of unchangeable parts; those parts should be as general as possible; and all parts of the system should be held in a uniform framework.

I strongly sympathize with the point of view outlined here. If one can master simple, general principles, then that reduces the burden for understanding some set of more specific ideas, and can potentially greatly increase the number and scope of the ideas one can understand and use at any given time -- it can improve one's intellect.

That said, it is considerably more difficult to impart understanding of general ideas than of specific ones. If this isn't immediately obvious to you, consider the order in which you learned some mathematical concepts. Take the following problems:

  1. If Joey has three apples, and gives two away, how many does he now have?
  2. Solve for x: x = 3 - 2.
  3. Prove that the set of integers modulo some constant forms a group under addition.
  4. Give an example of a non-Abelian group.
  5. What the hell is a left-adjoint functor?

I think you can be expected to gain mastery of each of these problems in the same order in which the problems are listed. Each problem is more abstract than the previous, and each successive problem is, in a sense, simpler and more general than the previous. But each successive problem is also, to my mind, more difficult than the previous: We needed to understand the more specific ideas before we could be expected to generalize. The mechanism for "good design" quoted above can probably be considered to be in some tension with the stated goal of maximizing "personal mastery" of the system.

The implications of this tension are, I think, important. In particular, under the assumption that a more abstract understanding of a problem domain can make the problem more tractable, it's usually in any given author's interest to move "up" the "abstraction ladder", in order to better solve a problem herself. To the extent that this means she happens upon a good solution faster than others, this is all to the good. But to the extent that this means she happens upon much different solutions than others would, whatever additional "elegance" her own solution has must be weighed against the cost in comprehension for the others she works with.

Tuesday, March 4, 2014

Design Is Learning

Given that, as software designers, we don't know exactly what we're doing, we must be learning what we're doing as we go. By implication, design can be (and, in my opinion, should be) approached as a learning opportunity: the design process should support learning as much as possible about the form of the system under development, as rapidly as possible. An effective design process will include an effective learning environment. (For the record, I also subscribe to Jack Reeves's idea that code is design. As such, I think this idea also applies to software construction.)

Learning is facilitated when there is rapid and easy visibility into the link from action to consequence. This is partly (primarily?) why methodologies like test-driven development, continuous integration, and continuous delivery focus on decreasing the turn-around time from making a change to the system, to seeing the results of that change: by seeing the results of a change while the thoughts that led to the change are still in working memory, it becomes much easier to see why the change did or did not have the desired effect. Sometimes, an automated test will give the fastest form of feedback from idea to consequence. More often, the fastest form of feedback comes from thinking through the consequences (thinking "if I change this loop's end-condition, it should resolve my off-by-one error" will usually be faster than making the change and re-running the test), with the automated tests acting as a check on the model of the software you maintain in your mind as you work.

Big Design Up Front will often fail because it creates a large lead-time between action (the "big design") and consequence (the built system). On the other hand, no design up front can also break down in cases where systemic consequences of early design decisions do not become visible until late in the construction process, which can give rise to issues that cannot be resolved cheaply or easily. In either case, the fundamental issue is that a decision about the correct sequence of software construction activities has cut off opportunities for determining the most natural way to build the system under development. And we get less maintainable systems as a result.

If, instead, we focus on design (in a broad sense, including requirements elicitation and coding and maintenance) as an opportunity to learn what the shape of our created system should be, then we are encouraged to ask the most significant questions first. (What do our customers want? What are the fundamental concepts in our problem space?) We are encouraged to test our ideas as soon as we can. (Following TDD to test the implementation. Building the Minimum Viable Product to test the market for our ideas.) We are encouraged to reflect on the outcome of having our ideas tested. (Refactoring towards greater insight [Evans, 2004]. Pivoting to serve a different market.) In other words, we are encouraged to train our focus where it will provide the greatest value.

In my opinion, there is no single correct answer to the question of "how much up-front design should we do?". Rather, ask "how can we learn the most about what our system should look like, as quickly as possible?". The answer to that question will be measurable in some quantity of design and implementation work, and that is where you should spend your focus. Software is operational knowledge. To produce novel software, we must gain knowledge. We must learn new things.

Tuesday, February 25, 2014

If you know what you're doing, you're doing the wrong thing.

I read something to this effect, but I can't find the source: "In software engineering, the more you know what it is that you're doing, the more likely it is that you're doing the wrong thing." I plan to come back to this theme, so I'd like to expand on it. The way I see it, this argument is about a couple of (related) things:

  • Encouraging individual growth, and
  • Avoiding obsolescence.

I'd like to end this essay on a high note, so let me start with the more negative reading:

Avoiding Obsolescence

I agree with Marc Andreessen: Software is eating the world. (Aside: when virtual reality becomes commonplace, we will be able to say "software has eaten the world".) This is happening because of a feedback loop. If we start with the idea that software is an operational representation of knowledge, then:

  1. As more information becomes software accessible, more knowledge that acts on that information can be operationalized as software.
  2. Sometimes, knowledge will result in the production of new information.
  3. When that kind of knowledge is operationalized as software, the new information it produces becomes accessible to software in turn.

By implication, if you have high explicit knowledge about how to do your work, then you know that this knowledge is now, or soon will be, possible to operationalize as software. At which point, you won't be needed to do it any more. Worse, if you don't have high explicit knowledge about how to do your work (likely because the knowledge is tacit), that does not mean that no one else does. And if someone else knows how to operationalize what you do as software, you still won't be needed to do it any more.

This seems to be a negative reading, but it doesn't have to be. The tasks about which we tend to have the highest explicit knowledge will be those tasks that are closest to drudgery. Automating that part of our work frees us to focus on the more interesting aspects of our work. It can dramatically increase our potential as individuals. Which brings us to this point:

Encouraging Personal Growth

If you know how to automate what you're doing, then automate it, and work on the part you don't know how to automate. If someone else knows how to automate what you're doing, try to use what they've done, and work on some other aspect. Orient your career away from rote work. If you want to do knowledge work, then the best way to be confident that what you're doing can't be automated is to work at some frontier of knowledge: your job should involve the production of new knowledge, since this new knowledge will not have been operationalized as software yet. In other words, the best way to avoid obsolescence in knowledge work involves continuous growth.

Software is increasing your potential. But it cannot fulfill your potential for you. Meeting your potential is hard. Growth requires that you push against boundaries that you don't yet understand. It will involve testing your theories, and re-evaluating them when evidence does not agree with their predictions. At the frontiers of knowledge, your initial ideas will usually fall short. If none of your ideas fail, at least to some extent, then you probably know what you're doing, and you more than likely aren't particularly close to a knowledge frontier. To be at a frontier of knowledge will mean that you won't always know what you're doing, and you will have individual failures. But when the right lessons are learned from failure, it can lead to new insight, and it is from insight that a knowledge worker provides the greatest value.

Fear of failure is ultimately much worse for a knowledge worker than failure is: you can learn from failure. Fear of failure will ultimately lead to intellectual stagnation, and then obsolescence. "If you know what you're doing, you're doing the wrong thing" is an encouraging statement: it helps you accept that you don't completely understand what you're doing, yet. If you're figuring it out, though, you're growing closer to your potential.

Tuesday, February 18, 2014

The Paradox of Architecture

The definition of software architecture I use most often is Chris Verhoen's, which can be paraphrased as "that which is expensive to change." Those aspects of the system that are most expensive to change are also the ones that we most want to get right. In fact, I would argue that a software architect's value to a project is manifest whenever she avoids an expensive wrong decision that would have been made in her absence. It then follows that the role of a software architect is to avoid expensive mistakes. (Note, though, that inaction will also generally be a mistake.)

You might notice that this is a negative characterization of an architect's role: an architect's role is defined by what she does not do (make expensive mistakes), rather than by what she does. I'll come back to this later, after making another observation about the above definition of "architecture".

Defining architecture as "that which is expensive to change" provides an operational basis for determining if an aspect of the system is architectural. That is, given an aspect of the system, you can ask "how expensive would it be to change this?" to determine if that aspect is architectural: if it's expensive to change, then it's architectural; if it isn't, then it's not.

And this brings us to the title of this little essay, something I'll call the Paradox of Architecture. That is:

A large part of an architect's job is to avoid the introduction of architecture to a system.

I know this sounds completely backwards, and yet I believe it's true - the architect is responsible for preventing aspects of the design from becoming architectural. Consider the canonical example of a system designed without architecture, Foote and Yoder's Big Ball of Mud:

A BIG BALL OF MUD is haphazardly structured, sprawling, sloppy, duct-tape and bailing wire, spaghetti code jungle. We’ve all seen them. These systems show unmistakable signs of unregulated growth, and repeated, expedient repair. Information is shared promiscuously among distant elements of the system, often to the point where nearly all the important information becomes global or duplicated. The overall structure of the system may never have been well defined. If it was, it may have eroded beyond recognition.

A defining characteristic of these systems is that they are extremely difficult to maintain. Every aspect of the system is difficult (read: "expensive") to change. As such, every aspect of the system is architectural. Nothing is a simple "implementation detail", because no detail can be examined in isolation from the whole system. The whole system must be understood for every change. A Big Ball of Mud is nothing but architecture, and it is un-maintainable for that very reason.

On the other hand, consider the traditional Unix architecture. Doug McIlroy summarized the underlying philosophy beautifully as:

Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.


Parts of the Unix architecture (like pipes, and untyped storage ("everything is a file")) are oriented around serving this philosophy. Nothing that did not serve this philosophy was included, leaving Unix with a limited set of primitive operations. The expressive power of this limited set is what makes us consider Unix to be a well-architected system. The rest of a Unix system (which comprises its bulk) is much easier to change, because the architectural components are small and isolated. Most of Unix's value is not in its architecture, which is what makes its architecture so good.
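
As a rough illustration of the idea, and not of how Unix itself is implemented, the sketch below composes small filters over a stream of text lines in Python. The names `grep`, `upper`, and `pipeline` are hypothetical helpers invented here; the only "architectural" commitment is that every stage consumes and produces lines of text.

```python
from functools import reduce
from typing import Callable, Iterable, Iterator

# Assumed interface: a filter maps a stream of lines to a stream of lines.
Filter = Callable[[Iterable[str]], Iterator[str]]


def grep(needle: str) -> Filter:
    """Build a filter that keeps only lines containing `needle`."""
    def run(lines: Iterable[str]) -> Iterator[str]:
        return (line for line in lines if needle in line)
    return run


def upper(lines: Iterable[str]) -> Iterator[str]:
    """A filter that upper-cases every line."""
    return (line.upper() for line in lines)


def pipeline(*stages: Filter) -> Filter:
    """Chain filters left to right, like `a | b | c` in a shell."""
    def run(lines: Iterable[str]) -> Iterator[str]:
        return iter(reduce(lambda stream, stage: stage(stream), stages, lines))
    return run


text = ["pipes are small", "files are untyped", "pipes compose"]
print(list(pipeline(grep("pipes"), upper)(text)))
# ['PIPES ARE SMALL', 'PIPES COMPOSE']
```

Swapping, adding, or removing a stage touches only that stage, which is the sense in which a small shared interface keeps everything else cheap to change.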