This post is part of my collaborative research with Shinsei Bank on highly-evolvable enterprise architectures. It is licensed under the Creative Commons Attribution-ShareAlike 3.0 license. I am indebted to Jay Dvivedi and his team at Shinsei Bank for sharing with me the ideas developed here. All errors are my own.
When I introduced the idea of information assembly lines, I noted Jay’s emphasis on separating work-in-progress from completed work as a distinguishing characteristic of the architecture:
Just like an assembly line in a factory, work-in-progress exists in temporary storage along the line and then leaves the line when completed.
This sounds straightforward enough, but it turns out to have some profound implications for the way we frame information in the system. In order to clearly separate work-in-progress from finished goods, we need to shift our conceptualization of information. Instead of seeing an undifferentiated store of variables that might change at any time (as in a database), we must distinguish between malleable information on the assembly line and a trail of permanent, finished information goods. We might imagine the final step of the assembly line to be a kiln that virtually fires the information goods, preventing them from ever changing again. To underscore the point, finished goods are finished; they will never be worked on or modified again, except perhaps if they become damaged and require some repair to restore them to their proper state.
Separation of work-in-progress from finished goods allows us to divide the enterprise software architecture into two separate sub-problems: managing work-in-progress, and managing completed goods. Managing work-in-progress is challenging, because we must ensure that all the products are assembled correctly, on time, and without accidental error or sabotage. Fortunately, however, on a properly designed assembly line, the volume of work-in-progress will be small relative to the volume of completed goods.
Managing completed goods is much simpler, though the volume may be extremely large. Since completed goods cannot be modified, they can be stored in a write-once, read-many data store. It’s much easier to maintain the integrity and security of such a data store, since edit operations need not even exist. Scaling is easy–when a data store fills, just add another–since no modification of existing records implies no interdependence between new and existing records (there can be only one-way dependence, from existing to new). Also, access times are likely to be much less important for completed goods than for work-in-progress.
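To make the storage argument concrete, here is a minimal in-memory sketch in Python. The names WormStore and FinishedGoodsRepository, and the fixed capacity per store, are my own illustrative assumptions rather than anything taken from Shinsei's systems. The point is only that the interface offers append and read, with no edit or delete operation, and that a full store is handled by adding a fresh one without touching existing records.

```python
from dataclasses import dataclass, field


class StoreFullError(Exception):
    """Raised when a write-once store has reached its capacity."""


@dataclass
class WormStore:
    """Hypothetical write-once, read-many store: append and read only."""
    capacity: int
    _records: list = field(default_factory=list)

    def append(self, record) -> int:
        # Records are only ever added at the end; there is no update or delete.
        if len(self._records) >= self.capacity:
            raise StoreFullError("store is full")
        self._records.append(record)
        return len(self._records) - 1

    def read(self, index: int):
        return self._records[index]


class FinishedGoodsRepository:
    """Hypothetical repository that adds a new store whenever the last one fills."""

    def __init__(self, capacity_per_store: int):
        # Assumes capacity_per_store is at least 1.
        self._capacity = capacity_per_store
        self._stores = [WormStore(capacity_per_store)]

    def ship(self, record) -> tuple[int, int]:
        # Scaling step: existing stores are never touched, a new one is added.
        try:
            index = self._stores[-1].append(record)
        except StoreFullError:
            self._stores.append(WormStore(self._capacity))
            index = self._stores[-1].append(record)
        return len(self._stores) - 1, index

    def read(self, location: tuple[int, int]):
        store_number, index = location
        return self._stores[store_number].read(index)
```

A real write-once store would enforce immutability at the storage layer rather than by omitting methods from a class, so this should be read as an illustration of the interface, not of the enforcement.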
The idea sounds attractive in principle, but how can this design cope with an ever-changing world? A simple example shows how shifting our perspective on information makes this architecture possible. Many companies have systems that keep track of customers’ postal addresses. Of course, these addresses change when customers move. Typically, addresses are stored in a database, where they can be modified as necessary. There is no separation between work-in-progress and completed goods.
Information assembly lines solve the problem differently. A customer’s postal address is not a variable, but rather our record of where the customer resides for a period of time. Registering a customer address is a manufacturing process involving a number of steps: obtaining the raw information, parsing it into fields, perhaps converting values to canonical forms, performing validity checks, etc. Once the address has been assembled and verified, it is time-stamped, packaged (perhaps together with a checksum), and shipped off to an appropriate finished goods repository. When the customer moves, the address in the system is not changed; we simply manufacture a new address.
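The manufacturing steps described above can be sketched as a small pipeline in Python. Everything specific here is an assumption made for illustration: the comma-separated input format, the field names, the canonicalization rules, and the use of SHA-256 for the checksum. The frozen dataclass stands in for the "kiln" at the end of the line.

```python
import hashlib
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: the finished record cannot be modified
class AddressRecord:
    customer_id: str
    street: str
    city: str
    postal_code: str
    manufactured_at: str
    checksum: str


def parse(raw: str) -> dict:
    # Assumed input format: "street, city, postal code".
    street, city, postal_code = [part.strip() for part in raw.split(",")]
    return {"street": street, "city": city, "postal_code": postal_code}


def canonicalize(fields: dict) -> dict:
    # Illustrative canonical forms: collapsed whitespace, title case, compact code.
    return {
        "street": " ".join(fields["street"].split()).title(),
        "city": fields["city"].title(),
        "postal_code": fields["postal_code"].replace(" ", "").upper(),
    }


def validate(fields: dict) -> dict:
    # A deliberately simple validity check: no field may be empty.
    for name, value in fields.items():
        if not value:
            raise ValueError(f"missing field: {name}")
    return fields


def package(customer_id: str, fields: dict, now=None) -> AddressRecord:
    # Time-stamp the record and attach a checksum over its contents.
    now = now or datetime.now(timezone.utc)
    body = {"customer_id": customer_id, **fields, "manufactured_at": now.isoformat()}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8"))
    return AddressRecord(checksum=digest.hexdigest(), **body)


def manufacture_address(customer_id: str, raw: str, now=None) -> AddressRecord:
    # The assembly line: each step hands its output to the next.
    return package(customer_id, validate(canonicalize(parse(raw))), now)
```

Calling manufacture_address("c1", "1 main street, tokyo, 106 0032") yields a finished record; when the customer moves, the same function is called again with the new raw address rather than editing the old record.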
The finished goods repository contains all the address records manufactured for the customer, and each record includes the date that the address became active. When retrieving the customer’s address, we simply select the address that became active most recently. If an address is manufactured incorrectly, we manufacture a corrected address. Thus instead of maintaining a malleable model of the customer’s location, we manufacture a sequence of permanent records that capture the history of the customer’s location.
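A small Python sketch shows how retrieval and correction might work over such a repository. The class and field names are hypothetical. One rule in particular is my own assumption, since the text does not specify it: when a corrected record has the same activation date as the faulty one, the record manufactured later wins.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AddressRecord:
    customer_id: str
    address: str
    active_from: date
    sequence: int  # order of manufacture, used to break ties (assumption)


class AddressRepository:
    """Hypothetical append-only repository of finished address records."""

    def __init__(self):
        self._records = []

    def ship(self, customer_id: str, address: str, active_from: date) -> AddressRecord:
        # New and corrected addresses are both added; nothing is overwritten.
        record = AddressRecord(customer_id, address, active_from, len(self._records))
        self._records.append(record)
        return record

    def history(self, customer_id: str) -> list:
        return [r for r in self._records if r.customer_id == customer_id]

    def current(self, customer_id: str, as_of: date):
        # Pick the record that became active most recently on or before as_of.
        candidates = [r for r in self.history(customer_id) if r.active_from <= as_of]
        if not candidates:
            return None
        return max(candidates, key=lambda r: (r.active_from, r.sequence))


repo = AddressRepository()
repo.ship("c1", "1 Main Street, Tokyo", date(2009, 1, 1))
repo.ship("c1", "2 River Road, Osaka", date(2010, 6, 1))
# The Osaka address was manufactured incorrectly, so a corrected one is shipped.
repo.ship("c1", "2 River Road, Kita-ku, Osaka", date(2010, 6, 1))
```

Because the faulty record stays in the repository, the history still shows what the system believed at each point in time, while current() returns the corrected address.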
In this way, seeing information from a different perspective makes it possible to subdivide the enterprise software problem into two loosely-coupled and significantly simpler subproblems. And cleverly partitioning complex problems is the first step to rendering them tractable.
Mr. David, an address change is one type of service request, similar to many others such as a signature change or a statement issue. What is not clear from the above is the difference between what you have described and what other financial institutions already do, as the one I work for follows the same principles that you have described for address changes.
This is an extremely critical concept that would simplify the design of robust systems. Sometimes the absence of clarity on what is WIP and what is finished goods may result in a situation where you try to “rework” finished goods instead of “writing them off”, passing appropriate entries in the books and remanufacturing the correct version. Evolving clear guidelines on what is WIP and what is finished goods in a given environment will ensure that there is no confusion.
Mr. Vishnu, I agree this is an important concept. That is why most organisations differentiate between what has been completed and what is still in progress. This is also a basic principle that is taught when someone attends a good school.
What I have seen in the past 30 years of my career is that this has been implemented by at least 4 banks I have worked for, because it has profound implications for control, performance, cost, etc. But it is also a double-edged sword, and if not implemented properly it can lead to added cost or a futile implementation effort.
A large number of books have been written about this and there are implementations to see, so I am just curious what is so new or great about this.
Vishnu, thanks for your comment. Guidelines may help, but I think there’s a more direct approach: do not allow workstations to hold more than one transaction at a time (the transaction it is currently working on), and force datastores to be WORM (write once, read many). This should ensure a clean separation between WIP and finished goods, since it will not be possible to mix the two. In general, I think design rules will be more effective when the architecture makes it almost impossible to break them.
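As a rough illustration of the first design rule, here is a Python sketch of a workstation that refuses a second transaction while it is still holding one. The Workstation class and its method names are hypothetical, invented for this example.

```python
class WorkstationBusyError(Exception):
    """Raised when a workstation is handed a second transaction."""


class Workstation:
    """Hypothetical workstation that can hold at most one transaction."""

    def __init__(self, name, step):
        self.name = name
        self._step = step      # the single operation this workstation performs
        self._current = None   # the one transaction slot

    def accept(self, transaction):
        # The rule is enforced structurally: there is only one slot.
        if self._current is not None:
            raise WorkstationBusyError(f"{self.name} is already holding a transaction")
        self._current = transaction

    def process(self):
        # Perform the step and release the slot, so nothing accumulates here.
        if self._current is None:
            raise RuntimeError(f"{self.name} has nothing to process")
        result = self._step(self._current)
        self._current = None
        return result
```

Since the workstation never keeps a transaction after processing it, finished work has nowhere to live except the downstream datastore.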
Takahashi, all ideas have antecedents, so establishing the degree of novelty is generally non-trivial. I’m under the impression that WORM datastores are often used for logs and other records. That said, the idea of separating work-in-progress from finished product isn’t one that I encountered during my computer science education at Stanford, Harvard, and MIT (you’d be hard pressed, I think, to argue that these aren’t good schools). Also, this idea is just part of the architectural theory that I’m developing; at this point, I don’t believe that I fully grasp the implications of separating work-in-progress from finished product, but I think it may turn out to be quite important. If there are any books that describe similar ideas, it would be very helpful if you could provide references–I’m always looking for other perspectives on these topics. Thanks!