This post is part of my collaborative research with Shinsei Bank on highly-evolvable enterprise software. It is licensed under the Creative Commons Attribution-ShareAlike 3.0 license. I am indebted to Jay Dvivedi and his team at Shinsei Bank for supporting this research. All errors are my own.
One design rule to which Jay attaches great importance is the physical separation of software modules (i.e., Contexts) and the physical motion of information between them. According to this rule, software modules should be installed on physically separated computers.
Yesterday, I had the opportunity to discuss Shinsei’s architecture with Peter Hart, an expert on artificial intelligence and the founder and chairman of Ricoh Innovations, Inc. Peter was very intrigued by Jay’s use of design rules to impose physical constraints on software structure. I’d like to acknowledge Peter’s contribution to my thinking by introducing his perspective on possible implications of such physical constraints. Then, I’ll describe my follow-up conversation with Jay on the topic, and conclude with some of my own reflections. Of course, while I wish to give all due credit to Peter and Jay for their ideas, responsibility for any errors rests entirely with me.
Peter approached the issue from a project management perspective. Why, he asked me, are software development projects so much more difficult to manage than other large-scale engineering projects, such as building a bridge or a factory? The most plausible explanation he has found, he told me, is that software has many more degrees of freedom. In mechanical, chemical, civil, or industrial engineering, the physical world imposes numerous and often highly restrictive constraints on the design process; in software, there are hardly any physical constraints on design. The many degrees of freedom multiply complexity at every level of the system, and this combinatorial explosion of design parameters makes software design an enormously complex and extraordinarily difficult problem.
Thus, Peter suggested that artificial imposition of physical constraints similar to those found in other engineering domains could help bring complexity under control. These constraints might be designed to mimic constraints encountered when performing analogous physical tasks in the real world. There is a tradeoff, since these constraints close off large swathes of the design space; however, if the goal of the designer is to optimize maintainability or reliability while satisficing with respect to computational complexity, then perhaps the benefit of a smaller design space might outweigh possible performance losses.
After my conversation with Peter, I asked Jay why he places so much importance on physical separation and physical movement.
To begin with, he said, it is difficult to create and enforce boundaries within a single computer. Even if the boundaries are established in principle, developers with “superman syndrome” will work around them in order to “improve” the system, and these boundary violations will be difficult to detect.
Work is made easier by keeping related information together and manipulating it in isolation. Jay uses the analogy of a clean workbench stocked with only the necessary tools for a single task. Parts for a single assembly are delivered to the workbench, the worker assembles the parts, and the assembly is shipped off to the next workstation. There is never any confusion about which parts go into which assembly, or which tool should be used. Computer hardware and network bandwidth can be tuned to the specific task performed at the workstation.
Achieving this isolation requires physical movement of information into and out of the workstation. Although this could be achieved, in theory, by passing data from one module to another on a single computer, designers will be tempted to violate the module boundaries, reaching out and working on information piled up in a motionless heap (e.g., shared memory or a traditional database) instead of physically moving information into and out of the module’s workspace.
When modules are physically separated, it becomes straightforward to reconfigure modules or insert new ones, because flows of information can be rerouted without modifying the internal structures of the modules. Similarly, processes can be replicated easily by sending the output of a workstation to multiple locations.
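As a minimal sketch of the rerouting idea, the Python below keeps the routing of information outside the modules themselves. The station names (`validate`, `price`, `audit`), the message fields, and the `run` helper are all hypothetical and are not drawn from Shinsei's system; copying a dictionary merely stands in for physically moving information between machines.

```python
from collections import deque

# Hypothetical workstations: each one sees only the message delivered to it.
def validate(msg):
    return {**msg, "valid": msg["amount"] > 0}

def price(msg):
    return {**msg, "fee": round(msg["amount"] * 0.01, 2)}

def audit(msg):
    return {**msg, "audited": True}

MODULES = {"validate": validate, "price": price, "audit": audit}

def run(routes, start, msg):
    """Move copies of a message along the routes.

    Returns the results produced by stations that have no successor.
    """
    outputs = []
    queue = deque([(start, dict(msg))])
    while queue:
        name, payload = queue.popleft()
        result = MODULES[name](payload)
        successors = routes.get(name, [])
        if not successors:
            outputs.append((name, result))
        for nxt in successors:
            # A fresh copy per destination: no station shares state with another.
            queue.append((nxt, dict(result)))
    return outputs

# Original flow: validate -> price
routes_v1 = {"validate": ["price"]}
# Insert a new station by rerouting; no module code changes.
routes_v2 = {"validate": ["audit"], "audit": ["price"]}
# Replicate a process by sending one station's output to two locations.
routes_v3 = {"validate": ["price", "audit"]}

if __name__ == "__main__":
    for routes in (routes_v1, routes_v2, routes_v3):
        print(run(routes, "validate", {"amount": 200}))
```

In this sketch, inserting or replicating a station only changes the routing table; the functions that do the work are untouched, which is the property the design rule is meant to protect.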
Finally, physical separation of modules increases system-level robustness by ensuring that there is no single point of failure, and by creating opportunities to intervene and correct problems. Inside a single computer, processes are difficult to pause or examine while operating, but physical separation creates an interface where processes can be held or analyzed.
The idea of contriving physical constraints for software systems seems counterintuitive. After all, computer systems provide a way to manipulate symbols largely independent of physical constraints associated with adding machines, books, or stone tablets. The theory of computation rests on abstract, mathematical models of symbol manipulation in which physical constraints play no part. What benefit could result from voluntarily limiting the design space?
Part of the answer is merely that a smaller design space takes less time to search. Perhaps, to echo Peter’s comment, software development projects are difficult to manage because developers get lost in massive search spaces. Since many design decisions are tightly interdependent, the design space will generally be very rugged (i.e., a small change in a parameter may cause a dramatic change in performance), implying that a seemingly promising path may suddenly turn out to be disastrous (see note 1). If physical constraints can herd developers into flatter parts of the design space landscape, intermediate results may provide more meaningful signals and development may become more predictable. Of course, the fewer the interdependencies, the flatter (generally speaking) the landscape, so physical separation may provide a way to fence off the more treacherous areas.
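The link between interdependence and ruggedness can be illustrated with a toy landscape in the spirit of the NK models mentioned in the footnote. The sketch below is an assumption-laden illustration rather than a reproduction of any published model: it uses binary parameters, circular neighbours, uniformly random contributions, and hypothetical function names. It counts local peaks, that is, configurations that no single-parameter change can improve.

```python
import itertools
import random

def make_landscape(n, k, rng):
    """Return a fitness function over n binary parameters.

    Parameter i contributes a random value that depends on its own
    setting and on the settings of the next k parameters (circularly).
    """
    tables = [{} for _ in range(n)]

    def fitness(config):
        total = 0.0
        for i in range(n):
            key = tuple(config[(i + j) % n] for j in range(k + 1))
            if key not in tables[i]:
                tables[i][key] = rng.random()  # drawn once, then reused
            total += tables[i][key]
        return total / n

    return fitness

def count_local_peaks(n, k, seed=0):
    """Count configurations that beat every one-parameter-flip neighbour."""
    rng = random.Random(seed)
    fitness = make_landscape(n, k, rng)
    configs = list(itertools.product((0, 1), repeat=n))
    scores = {c: fitness(c) for c in configs}
    peaks = 0
    for c in configs:
        neighbours = (c[:i] + (1 - c[i],) + c[i + 1:] for i in range(n))
        if all(scores[c] > scores[nb] for nb in neighbours):
            peaks += 1
    return peaks

if __name__ == "__main__":
    for k in (0, 2, 7):
        print(f"K={k}: {count_local_peaks(8, k)} local peak(s)")
```

With no interdependence (K=0) each parameter can be set independently, so there is a single peak; as K grows, many local peaks typically appear, and a search that improves one parameter at a time can stall far from the best design.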
Another part of the answer may have to do with the multiplicity of performance criteria. As Peter mentioned, designers must choose where to optimize and where to satisfice. The problem is that performance criteria are not all equally obvious. Some, such as implementation cost or computational complexity, become evident relatively early in the development process. Others, such as modularity, reliability, maintainability, and evolvability, may remain obscure even after deployment, perhaps for many years.
Developers, software vendors, and most customers will tend to be relatively more concerned about those criteria that directly and immediately affect their quarterly results, annual performance reviews, and quality of life. Thus, software projects will tend to veer into those areas of the design space with obvious short-term benefits and obscure long-term costs. In many cases, especially in large and complex systems, these design tradeoffs will not be visible to senior managers. Therefore, easily verifiable physical constraints may be a valuable project management technology if they guarantee satisfactory performance on criteria likely to be sacrificed by opportunistic participants.
Finally, it is interesting to note that Simon, in The Sciences of the Artificial, emphasizes the physicality of computation in his discussion of physical symbol systems:
Symbol systems are called “physical” to remind the reader that they exist as real-world devices, fabricated of glass and metal (computers) or flesh and blood (brains). In the past we have been more accustomed to thinking of the symbol systems of mathematics and logic as abstract and disembodied, leaving out of account the paper and pencil and human minds that were required actually to bring them to life. Computers have transported symbol systems from the platonic heaven of ideas to the empirical world of actual processes carried out by machines or brains, or by the two of them working together. (22-23)
Indeed, Simon spent much of his career exploring the implications of physical constraints on human computation for social systems. Perhaps it would be no surprise, then, if the design of physical constraints on electronic computer systems (or the hybrid human-computer systems known as modern organizations) turns out to have profound implications for their behavioral properties.
Note 1: When performance depends on the values of a set of parameters, the search for a set of parameter values that yields high performance can be modeled as an attempt to find peaks in a landscape whose dimensions are defined by the parameters of interest. In general, the more interdependent the parameters, the more rugged the landscape (rugged landscapes being characterized by large numbers of peaks and troughs in close proximity to each other). For details, see the literature on NK models such as Levinthal (1997) or Rivkin (2000).