This post is part of my collaborative research with Shinsei Bank on highly-evolvable enterprise architectures. It is licensed under the Creative Commons Attribution-ShareAlike 3.0 license. I am indebted to Jay Dvivedi and his team at Shinsei Bank for sharing with me the ideas developed here. All errors are my own.
The systems in a large bank handle huge volumes of transactions collectively worth many millions or billions of dollars, so errors or sabotage can cause serious damage in short order. For this reason, Jay often compares running bank systems to flying jumbo jets. To make matters worse, the jets must be maintained and modified in flight. Jay’s approach, which he demonstrated to great effect when he successfully migrated Shinsei from mainframes to inexpensive servers, focuses on building up new systems one component at a time in parallel with existing systems and achieving functional parity. Once parity has been reached, the old systems can be shut down.
Parity is relatively straightforward in principle. As new components (assembly line segments) become operational, they are integrated into the line in parallel with the existing line segments being superseded. Parity is reached when the line's behavior is the same whether work stays on the existing line throughout or detours from the existing line onto the new segment and then back onto the existing line.
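To make the parity condition concrete, here is a minimal sketch in Python. It is my own illustration, not Shinsei's tooling: the names `parity_reached`, `existing_fee` and `new_fee`, the dictionary-shaped transactions, and the fee logic are all hypothetical. It assumes each segment can be modeled as a function from a transaction to an output, and it treats parity as "same output for every transaction in the sample".

```python
from typing import Callable, Iterable

# Hypothetical transaction shape: a dict with an integer amount in cents.
Transaction = dict


def parity_reached(
    transactions: Iterable[Transaction],
    existing_segment: Callable[[Transaction], object],
    new_segment: Callable[[Transaction], object],
) -> bool:
    """True when both segments give the same output for every transaction."""
    return all(existing_segment(t) == new_segment(t) for t in transactions)


# Illustrative segments (assumed logic): the new one applies a one-cent minimum fee.
def existing_fee(txn: Transaction) -> int:
    return txn["amount_cents"] // 100


def new_fee(txn: Transaction) -> int:
    return max(1, txn["amount_cents"] // 100)


large = [{"amount_cents": 500}, {"amount_cents": 12_000}]
small = [{"amount_cents": 50}]

print(parity_reached(large, existing_fee, new_fee))          # True
print(parity_reached(large + small, existing_fee, new_fee))  # False
```

The second call fails only because of the sub-dollar transaction, which is the kind of disparity that shows up only for transactions with certain characteristics. A check like this says nothing about transactions outside the sample it was run on.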
In practice, achieving parity quickly on a high-volume line turns out to be challenging. Jay’s agile development methodology means that many new line segments are being developed, deployed, and iterated simultaneously, with new versions being deployed on cycles of only a few days or weeks. At the same time, thousands or millions of transactions are being processed on the line. Achieving parity requires addressing many sources of output disparities, and these disparities may occur only for transactions with certain characteristics. In this context, it becomes extremely difficult to keep track of where disparities are appearing, what factors are driving them, and which ones are in the process of being resolved.
To handle this challenge, Jay emphasizes the importance of automated reconciliation tools. In reconciliation, transaction streams are compared and disparities are detected. Then, analysts find patterns in these disparities, classify them into categories that share a common cause, and characterize the disparity categories for resolution by the development team. At the same time, the analysts create rules in the reconciliation engine that can detect and correct for disparities of each type.
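The comparison step can be sketched roughly as follows. This is an assumption-laden illustration rather than a description of any real reconciliation engine: the `Disparity` record, the `find_disparities` function, and the idea of keying each stream's outputs by a transaction ID are all hypothetical choices of mine.

```python
from typing import Dict, List, NamedTuple, Optional


class Disparity(NamedTuple):
    txn_id: str
    existing: Optional[int]  # None if the transaction is missing from this stream
    new: Optional[int]


def find_disparities(
    existing_out: Dict[str, int], new_out: Dict[str, int]
) -> List[Disparity]:
    """Compare two output streams keyed by transaction ID and list every mismatch."""
    disparities = []
    # The union of keys also catches transactions present in only one stream.
    for txn_id in sorted(existing_out.keys() | new_out.keys()):
        old, new = existing_out.get(txn_id), new_out.get(txn_id)
        if old != new:
            disparities.append(Disparity(txn_id, old, new))
    return disparities


existing_out = {"t1": 5, "t2": 0, "t3": 120}
new_out = {"t1": 5, "t2": 1, "t4": 7}

for d in find_disparities(existing_out, new_out):
    print(d)
```

In this sketch a transaction that appears in only one stream is reported as a disparity with `None` on the missing side, so dropped or duplicated work is flagged alongside value mismatches.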
Using these rules, the reconciliation engine distinguishes among the full universe of disparities, the known disparities that have already been characterized and are in the process of being resolved, and the residual disparities that have not yet been analyzed. By filtering out known disparities, the reconciliation engine helps analysts focus on disparities that require attention and facilitates the detection and deciphering of patterns that might otherwise be overlooked or misidentified, especially where multiple disparities overlap.
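One way to picture the filtering is as a partition of the detected disparities by analyst-written rules. The sketch below is hypothetical throughout: `Disparity`, `Rule`, `partition`, and the `minimum_fee_floor` rule are names I made up for illustration, and it assumes a rule is simply a predicate over a single disparity. It only classifies disparities; it does not attempt the correction step.

```python
from typing import Callable, Dict, List, NamedTuple, Optional, Tuple


class Disparity(NamedTuple):
    txn_id: str
    existing: Optional[int]
    new: Optional[int]


# A rule is a predicate that recognizes one known category of disparity.
Rule = Callable[[Disparity], bool]


def partition(
    disparities: List[Disparity], rules: Dict[str, Rule]
) -> Tuple[Dict[str, List[Disparity]], List[Disparity]]:
    """Split disparities into known categories and an unexplained residual."""
    known: Dict[str, List[Disparity]] = {name: [] for name in rules}
    residual: List[Disparity] = []
    for d in disparities:
        for name, matches in rules.items():
            if matches(d):
                known[name].append(d)
                break  # simplification: the first matching rule claims the disparity
        else:
            residual.append(d)  # no rule matched, so an analyst needs to look
    return known, residual


# Assumed example rule: the new segment applies a one-cent minimum fee.
rules = {"minimum_fee_floor": lambda d: d.existing == 0 and d.new == 1}

disparities = [
    Disparity("t2", 0, 1),
    Disparity("t3", 120, None),
    Disparity("t9", 0, 1),
]
known, residual = partition(disparities, rules)
print(known)
print(residual)
```

Here two disparities are absorbed by the known category and only `t3` is left in the residual for analysts. Letting the first matching rule claim a disparity is a simplification; overlapping causes would need a richer scheme than this.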
To the extent that reconciliation tools reduce the time and effort required to achieve parity, they facilitate rapid and safe deployment of new systems.