
The event processing perspective

This post is part of my collaborative research with Shinsei Bank on highly-evolvable enterprise software.  It is licensed under the Creative Commons Attribution-ShareAlike 3.0 license.  I am indebted to Jay Dvivedi and his team at Shinsei Bank for supporting this research.  All errors are my own.

When I adopted the information factory metaphor, I found that Caltech professor K. Mani Chandy had been there first.  Chandy and his colleagues have written a number of papers and a book[1] about system architectures, and I’d like to introduce his perspective here and draw some connections to the theory that I’m developing.  Chandy and his colleagues propose a typology of system architectures based on the way that subsystems interact.  They distinguish three modes of interaction[2]:

  • Time-driven or schedule: “Groups of components interact at scheduled times.”
  • Request-driven or pull: “A component requests information from other components, which then reply to the requests.”
  • Event-driven or push: “A component sends information to other components when it discovers state changes relevant to its listeners.”

According to Chandy, time-driven and request-driven architectures are relatively common, while event-driven architectures are relatively new to the enterprise computing landscape.  In their book, Chandy and Schulte give two reasons for the rising interest in event-driven systems: first, “an explosion of event ‘streams’ flowing over corporate networks” facilitates programmatic access to events, and, second, “Companies today are operating at a faster pace, so early notification of emerging business threats and opportunities is increasingly important” (xii).

On a practical level, event-driven systems encounter several difficulties.  First, what constitutes an event?  Chandy et al. posit that “An event is a significant change in the state of the universe. A significant state change is one for which an optimal response by the system is to take an action.”  This definition makes events sound objective, but as students of Herbert Simon we know that an organization (or any other complex system operating in the real world) can never determine an optimal response to any situation.  Significant changes are therefore in the eye of the beholder: an event is a change in the state of the universe in response to which a system believes that action should be taken.

This leads to a second difficulty.  If our systems are to be decentralized, how can we ensure consistent interpretation of changes in the universe?  One component’s event might be another’s noise.  This, I think, is why Chandy et al. note the importance of shared models:

An agent that is responsible for initiating a response takes action based on its estimates of the state of the universe. Its estimates are based on a model of the universe which, in turn, is based on models that it shares with agents that provide it with information.

If there were objective criteria for defining events, shared models would not be necessary: all (properly functioning) system components would agree on which changes in the state of the universe represent events (to which the system should respond).  Since no such objective criteria exist, components must possess models sufficiently similar to yield complementary behavior.  As enterprise systems become increasingly complex, I suspect that the problem of inconsistent models will pose ever-larger challenges, since this is essentially the old organizational problem of differentiation and integration playing out in a new domain[3].

So are Shinsei’s information assembly lines event processing systems?  I think that they may be, but they differ in an important way from the systems described by Chandy and Schulte.  Many of Shinsei’s systems are event driven, in the sense that events (e.g., customer requests for banking services) trigger the production of an information product (e.g., a credit decision or a funds transfer).  In the language of the factory metaphor, Shinsei manufactures custom products on demand rather than standard products in advance of demand.

Shinsei does not, however, seem to have a role for abstract events-as-messages that distribute information to listeners.  Instead, events propagate along the assembly line rather than through meta-level communication channels.  The manufacturing metaphor may help clarify the distinction.  Consider the management of parts inventory on an assembly line.

The event-driven architecture described by Chandy and Schulte would use a sensor to detect when the supply of parts at a workstation declines below a certain level.  The sensor would respond by generating an “almost out of parts” event and sending it to the factory’s order management department, which would respond by placing an order with the appropriate supplier.  There are thus two distinct interaction channels: a physical channel composed of trucks and parcels, and a communication channel composed of sensors and computer networks.
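
To make the two channels concrete, here is a minimal sketch of the explicit, out-of-band event channel (in Python, with entirely hypothetical names); it illustrates the general pattern rather than any particular product described by Chandy and Schulte.

    # A sensor publishes an "almost out of parts" event on a communication channel
    # that is separate from the physical flow of parts.
    class EventBus:
        def __init__(self):
            self.listeners = {}

        def subscribe(self, event_type, handler):
            self.listeners.setdefault(event_type, []).append(handler)

        def publish(self, event_type, payload):
            for handler in self.listeners.get(event_type, []):
                handler(payload)

    bus = EventBus()

    # The order management department listens for inventory events.
    bus.subscribe("almost_out_of_parts",
                  lambda event: print(f"ordering more {event['part']} from the supplier"))

    # The sensor detects a significant state change and pushes an event to its listeners.
    def inventory_sensor(part, quantity, reorder_point=10):
        if quantity < reorder_point:
            bus.publish("almost_out_of_parts", {"part": part, "quantity": quantity})

    inventory_sensor("bolts", quantity=7)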

By contrast, Shinsei’s approach resembles the Japanese kanban system.  The physical channel for movement of parts is designed to trigger the replenishment of inventory, so no “out-of-band” communication is necessary.  Events are detected and handled implicitly through the design of the process.  In this approach, there is only one interaction channel.
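
For contrast, here is a similarly hypothetical sketch of the in-band, kanban-style alternative: there is no separate event message, because the empty bin travelling back through the same channel that delivers parts is itself the replenishment signal.

    from collections import deque

    replenishment_channel = deque()              # the same channel that carries parts

    def consume_part(bin_):
        bin_["quantity"] -= 1
        if bin_["quantity"] == 0:
            replenishment_channel.append(bin_)   # send the empty bin back down the line

    def supplier_refills():
        while replenishment_channel:
            bin_ = replenishment_channel.popleft()
            bin_["quantity"] = bin_["capacity"]  # the refilled bin rejoins the flow of parts

    bolts = {"part": "bolts", "capacity": 20, "quantity": 1}
    consume_part(bolts)     # emptying the bin implicitly triggers replenishment
    supplier_refills()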

This distinction, between systems that internalize event handling and those that handle events explicitly using dedicated communication networks external to the underlying process, may be a useful dimension for classifying enterprise system architectures.

The relative effectiveness of the two approaches may be an empirical question, and it could depend on the goals of the system.  For straightforward business processes, explicitly modeling events and creating a communication network to detect and process them seems likely to add unnecessary complexity, which could render systems more difficult to modify and ultimately decrease agility.  However, if businesses want to run data fusion algorithms over event streams (Chandy and Schulte use the term “complex event processing”), then it may be necessary to have explicit event representations and dedicated event processing systems.  Hybrids of the two models may also be possible.

[1] An abridged edition is available free of charge from Progress Software.  All page number references are to the abridged edition.

[2] Definitions from Chandy et al., “Towards a Theory of Events”, 2007.

[3] When organizations develop differentiated subunits, the decision-makers in these subunits tend to focus on achieving subunit goals.  This reduces complexity, but often results in suboptimization.  I can see no reason why enterprise systems should not suffer from similar problems.

Workstations

This post is part of my collaborative research with Shinsei Bank on highly-evolvable enterprise architectures.  It is licensed under the Creative Commons Attribution-ShareAlike 3.0 license.  I am indebted to Jay Dvivedi and his team at Shinsei Bank for sharing with me the ideas developed here.  All errors are my own.

This is the first of what I intend to be a series of short posts focusing on a few important aspects of the information factory perspective that I’m starting to develop.  In the previous iteration of this work, I defined Contexts as elementary subsystems where tasks are performed.  In this iteration, in keeping with the information assembly line metaphor, I’ve decided to replace Contexts with workstations.  The basic idea doesn’t change: a workstation is an elementary subsystem where a worker, in a role, performs a task.  I’d like to add a few nuances, however.

First, at least for the time being, I’m going to rule out nesting of workstations.  Workstations can be daisy-chained, but not nested.  A hierarchical structure similar to nesting can be achieved by grouping workstations into modular sequences, but these groupings remain nothing more or less than sequences of workstations.  Conceptually, workstations divide the system into two hierarchical levels: the organization level (concerned with the configuration of workstation sequences) and the task level (concerned with the performance of tasks within specific workstations).  This conceptual divide resembles, I think, the structure of service-oriented architectures, in which the system level (integration of services) is conceptually distinct from the service level (design and implementation of specific services).

The purpose of the workstation is simply to provide a highly structured and controlled environment for performing tasks, thereby decoupling the management of task sequences (organization level) from the execution details of specific tasks (task level).  Workstations are thus somewhat analogous to web servers: they can “serve” any kind of task without knowing anything about the nature of its content.  Each workstation is provisioned with only those tools (programs, data, and personnel) required to perform the task to which it is dedicated.  The communication protocol for a workstation is a pallet interface, by which the workstation receives work-in-progress and then ships it out to the next workstation.  Pallets may also carry tools and workers to the workstation in order to provision it.

An implementation of the workstation construct requires an interface for pallets to enter and leave the workstation, hooks for loading and unloading tools and workers delivered to the station on pallets, and perhaps some very basic security features (more sophisticated security tools can be carried to the workstation on pallets and installed as needed).
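
As a rough illustration only, here is a minimal sketch of such a workstation in Python; the class and method names are my own invention and are not intended to describe Shinsei’s actual implementation.

    class Workstation:
        """An elementary, stateless environment dedicated to a single task."""

        def __init__(self, task=None):
            self.task = task      # the one task this station performs
            self.tools = {}       # programs/data delivered to the station on pallets

        # Pallet interface: work-in-progress arrives here and is shipped onward.
        def receive(self, pallet):
            if pallet.get("kind") == "provisioning":
                self.load_tools(pallet["tools"])      # hook for provisioning pallets
                return None
            return self.task(pallet, self.tools)      # perform the task; keep no state

        # Hooks for loading and unloading tools carried to the station on pallets.
        def load_tools(self, tools):
            self.tools.update(tools)

        def unload_tool(self, name):
            return self.tools.pop(name, None)

    # Provision the station from a pallet, then send it a work pallet.
    stamp = Workstation(task=lambda pallet, tools: {**pallet, "checked_with": tools["checker"]})
    stamp.receive({"kind": "provisioning", "tools": {"checker": "KYC-rules-v2"}})
    print(stamp.receive({"kind": "work", "customer": "C-1023"}))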

Information assembly lines

This post is part of my collaborative research with Shinsei Bank on highly-evolvable enterprise architectures.  It is licensed under the Creative Commons Attribution-ShareAlike 3.0 license.  I am indebted to Jay Dvivedi and his team at Shinsei Bank for sharing with me the ideas developed here.  All errors are my own.

In my previous post, I explained my (admittedly somewhat arbitrary) transition from version zero to version one of my architectural theory for enterprise software.  The design metaphor for version one of the theory is the high-volume manufacturing facility where assembly lines churn out large quantities of physical products.  Design metaphors from version zero of the theory (the zoo, the house, the city, and the railway) will probably appear at some point, but I’m not yet exactly sure how they fit.

Jay often describes business processes at Shinsei as computer-orchestrated information assembly lines.  These lines are composed of a series of virtual workstations (locations along the line where work is performed), and transactions move along the line from one workstation to the next on virtual pallets.  At each workstation, humans or robots (software agents) perform simple, repetitive tasks.  This description suggests that the salient features of the information factory[1] include linear organization, workstations, pallets, and fine-grained division of labor.
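
To fix ideas, here is a minimal sketch of such a line in Python, using an invented credit-decision example; the data and the task functions are hypothetical and greatly simplified.

    # Each workstation is a stateless step: a pallet comes in, a pallet goes out.
    def check_identity(pallet):
        pallet["checks"]["identity"] = pallet["customer"]["id"] is not None
        return pallet

    def score_credit(pallet):
        pallet["checks"]["score"] = 720 if pallet["checks"]["identity"] else 0
        return pallet

    def decide(pallet):
        pallet["decision"] = "approve" if pallet["checks"]["score"] >= 700 else "refer"
        return pallet

    line = [check_identity, score_credit, decide]     # linear organization of simple tasks

    # The pallet is a small, hierarchical record for one transaction; it carries all context.
    pallet = {"customer": {"id": "C-1023"}, "checks": {}}
    for workstation in line:
        pallet = workstation(pallet)
    print(pallet["decision"])     # "approve"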

How does this architecture differ from traditional approaches?  Here are a few tentative observations.

  • No central database. All information associated with a transaction is carried along the line on a pallet.  Information on a pallet is the only input and the only output for each workstation, and the workstation has no state information except for log records that capture the work performed.  In essence, there is a small database for each transaction that is carried along the line on a pallet.  In keeping with the house metaphor, information on the pallet is stored hierarchically.  (More thoughts about databases here.)
  • Separation of work-in-progress and completed work. Just like an assembly line in a factory, work-in-progress exists in temporary storage along the line and then leaves the line when completed.

In order to make the system robust, Jay adheres to the following design rules.

  • Information travels in its context. Since workstations have no state, the only ways to ensure that appropriate actions are taken at each workstation are to either (a) have separate lines for transactions requiring different handling or (b) have each pallet carry all context required to determine the appropriate actions to take at each workstation.  The first approach is not robust, because errors will occur if pallets are misrouted or lines are reconfigured incorrectly, and these errors may be difficult to detect.  Thus, all pallets carry information embedded in sufficient context to figure out what actions should be taken (and not taken).
  • All workstations are reversible. In order to repair problems easily, pallets can be backed up when problems are detected and re-processed.  This requires that all workstations log enough information to undo any actions that they perform; that is, they must be able to reproduce their input given their output (see the sketch after this list).  These logs are the only state information maintained by the workstations.
  • Physical separation. In order to constrain interdependencies between workstations and facilitate verification, monitoring, isolation, and interposition of other workstations, workstations are physically separated from each other.  More on this idea here.
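
Here is the sketch promised above: a minimal, hypothetical illustration of the reversibility rule, in which the only state a workstation keeps is a log sufficient to recover its input from its output.

    import copy

    class ReversibleWorkstation:
        def __init__(self, task):
            self.task = task
            self.log = []                          # the workstation's only state

        def process(self, pallet_id, pallet):
            before = copy.deepcopy(pallet)         # enough information to undo this step
            after = self.task(copy.deepcopy(pallet))
            self.log.append((pallet_id, before))
            return after

        def undo(self, pallet_id):
            # Back the pallet up to its state before this workstation acted on it.
            for logged_id, before in reversed(self.log):
                if logged_id == pallet_id:
                    return copy.deepcopy(before)
            raise KeyError(pallet_id)

    ws = ReversibleWorkstation(lambda p: {**p, "verified": True})
    processed = ws.process("txn-1", {"customer": "C-1023"})
    restored = ws.undo("txn-1")     # re-process from here after repairing the problem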

The following diagram depicts the structure of an information assembly line.  The line performs six tasks, labeled a through f.  The red arrows indicate logical interdependencies.  The output of a workstation is fully determined by the output of the preceding workstation, so the dependency structure resembles that of a Markov chain.  Information about a transaction in progress travels along the line, and completed transactions are archived for audit or analysis in a database at the end of the line.  Line behavior can be monitored by testing the output of one or more workstations.

[Figure: Information assembly line]

By contrast, here is a representation of a system designed according to the traditional centralized database architecture.  The system has modules that operate on the database to perform the same six tasks.  Although the logical interdependency structure is the same in theory, the shared database means that every module depends on every other module: if one module accidentally corrupts data in the database, the behavior of every other module can be affected.  Moreover, all transactions are interdependent through the database.  It’s difficult to verify that the system is functioning properly, since database operations by all six modules are interleaved.

[Figure: Traditional system architecture with centralized database]

Clearly, the information assembly line architecture requires more infrastructure than the traditional database approach: at a minimum, we need tools for constructing pallets and moving them between workstations, as well as a framework for building and provisioning workstations.  We also need to engineer the flow of information so that the output can be computed by a linear sequence of stateless workstations.  There are at least two reasons why this extra effort may be justified.  At this stage, these are just vague hypotheses; in future posts, I’ll try to sharpen them and provide theoretical support in the form of more careful and precise analysis.

First, the linear structure facilitates error detection and recovery.  Since each workstation performs a simple task on a single transaction and has no internal state, detecting an error is much simpler than in the traditional architecture.  The sparse interdependency matrix limits the propagation of errors, and reversibility facilitates recovery.  For critical operations, it is relatively easy to prevent errors by using parallel tracks and checking that the output matches (more on reliable systems from unreliable components here).
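
As a rough sketch of the parallel-track idea (hypothetical names, not a description of Shinsei’s tooling): run a critical step on two independently implemented workstations and hold the pallet if their outputs disagree.

    import copy

    def run_in_parallel(track_a, track_b, pallet):
        out_a = track_a(copy.deepcopy(pallet))    # each track gets its own copy of the pallet
        out_b = track_b(copy.deepcopy(pallet))
        if out_a != out_b:
            raise RuntimeError("parallel tracks disagree; hold the pallet for repair")
        return out_a

    # The same mechanism lets a modified workstation run alongside the original one,
    # e.g. run_in_parallel(score_credit, score_credit_rewrite, pallet).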

Second, the architecture facilitates modification and reconfiguration.  In the traditional architecture, modifying a component requires determining which other components depend on it and how, analyzing the likely effects of the proposed modification, and integrating the new component into the system.  If the number of components is large, this may be extremely difficult.  By contrast, in the information assembly line, the interdependency matrix is relatively sparse, even if we include all downstream dependencies.  Perhaps more importantly, the modified component can easily be tested in parallel with the original component (see the figure below).  Thus, the change cost for the system should be much lower.

[Figure: Parallel operation in an information assembly line]

[1] A search for the term “information factories” reveals that others have been thinking along similar lines.  In their paper “Enterprise Computing Systems as Information Factories” (2006), Chandy, Tian and Zimmerman propose a similar perspective.  Although they focus on decision-making about IT investments, their concept of “stream applications” has some commonalities with the assembly-line-style organization proposed here.