
Blame Susan Swart for the WikiLeaks fiasco

Recent days have seen much commotion about the WikiLeaks affair, in which about 250,000 confidential State Department cables have been leaked and publicly released.  Most of the coverage has focused on the content of the cables or on WikiLeaks, the organization that turned the cables over to the media.  A few commentators have questioned the adequacy of information security at the State Department.  No one, as far as I am aware, has put the blame where I believe it belongs: on Susan Swart, Chief Information Officer at the U.S. Department of State.  This is surprising, because the State Department profile of Swart does not mince words about her responsibilities.

Susan H. Swart, a member of the Senior Foreign Service with the rank of Minister Counselor, was appointed as the Chief Information Officer for the Department of State in February 2008. As CIO, she is responsible for the Department’s information resources and technology initiatives and provides core information, knowledge management, and technology (IT) services to the Department of State and its 260 overseas missions. She is directly responsible for the Information Resource Management (IRM) Bureau’s budget of $310 million, and oversees State’s total IT/ knowledge management budget of approximately one billion dollars. [italics mine]

Swart was appointed in February 2008, and the leaked cables are said to extend through February 2010, so she had been in office for at least two years before the leak.  She cannot fob off responsibility on her predecessors.

So why hasn’t the media put the blame where it belongs?  The answer, I suspect, is a common misconception that computer systems are inherently vulnerable and, consequently, these kinds of breaches are inevitable.  According to this line of thinking, the only recourse would be to close off access to information, hindering the functioning of the department.  Swart cannot be held responsible for flaws inherent in the technology; ergo, we must look elsewhere for the guilty party.  Although this logic is rarely stated explicitly, traces can be found in media coverage of the leaks.  For example:

In a memo circulated Monday by its Office of Management and Budget, the White House said it was ordering a review of safeguards that could shut down some users’ access to classified information.

That would further limit diplomatic communications that have been restricted in response to earlier disclosures by WikiLeaks. The Defense Department has already limited the number of computer systems that can handle classified material and made it harder to save material to removable media, such as flash drives, on classified computers.

Bryan Whitman, a Defense Department spokesman, said Monday that it was inevitable that steps like that would “compromise … efforts to give diplomatic, military, law enforcement and intelligence specialists quicker and easier access to greater amounts of data.” (“U.S. can’t let WikiLeaks limit candor, diplomats say”)

And from no less an authority than the former CIO for the Director of National Intelligence:

Dale Meyerrose, former chief information officer for the U.S. intelligence community, said Monday that it will never be possible to completely stop such breaches.

“This is a personnel security issue, more than it is a technical issue,” said Meyerrose, now a vice president at Harris Corp. “How can you prevent a pilot from flying the airplane into the ground? You can’t. Anybody you give access to can become a disgruntled employee or an ideologue that goes bad.” (“U.S. looks for way to prosecute over leaks”)

To be blunt, this is nonsense.  That anyone, employee or otherwise, can easily gain access to and abscond with over 250,000 confidential documents, and be apprehended only when turned in by a confidant, is evidence of an extremely serious technical issue.  Anyone holding such views should not be in a position with responsibility for information resources.

The alternative view, which I believe to be correct for reasons that I’ll describe below, is that systems can be engineered for security in ways that maintain usability and access, while rendering breaches like this one effectively impossible.  If so, then Swart, by failing to ensure that the systems were so engineered, is responsible for the failure and should be held accountable.

There is a simple reason why we can be confident that the engineering problem is soluble.  No individual can possibly need to access the full content of 250,000 cables in a short period of time: even at a scanning rate of one document per two seconds, more than a week of grueling 16-hour days would be required to review them all.  Furthermore, since the cables cover a wide variety of topics, it’s unlikely that many employees need access to large numbers of cables covering wide ranges of topics and dates.  To solve the problem, then, we just need to engineer a system that makes it relatively easy to access cables in ways that conform to common use cases (e.g., small numbers of cables, cables close to a particular date, or cables related to a particular topic), and progressively more difficult to access larger numbers of cables.
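The arithmetic is easy to check.  A quick sketch (the two-seconds-per-cable rate and the 16-hour day are the assumptions stated above):

```python
# Rough time for one person to skim the full cable collection.
cables = 250_000
seconds_per_cable = 2      # assumed skimming rate
hours_per_day = 16         # assumed grueling workday

total_hours = cables * seconds_per_cable / 3600
days = total_hours / hours_per_day
print(f"{total_hours:.0f} hours, about {days:.1f} sixteen-hour days")
# -> 139 hours, about 8.7 sixteen-hour days: more than a week
```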

So how would we engineer such a system for security?  Let’s consider three relatively simple design rules which could probably have prevented the WikiLeaks debacle.  Systems conforming to these design rules could, I’m reasonably certain, have been implemented by Swart within the year before the leaks occurred, especially since they could be implemented in ways that would be almost entirely transparent to end users.  With regard to all of these techniques, I acknowledge a profound intellectual debt to Jay Dvivedi, the brilliant maverick former CIO of Shinsei Bank.

First, don’t aggregate information.  If you put all your cables from around the world in a single giant repository, you’ve created a single point of failure, which inevitably becomes a giant vulnerability.  There is no need to store all these cables together.  Exactly how to separate the cables is an engineering problem that should be informed by knowledge about usage patterns, but it seems reasonable enough that systems might be divided by classification, by geographic area (at the country, subcontinent, or continent level), or by age (less than one month, two to six months, seven to twelve months, etc.).  When information needs to be aggregated–e.g., a search on the entire collection of cables for a particular term, or assembling cables for all dates for a particular country–the aggregation should take place temporarily in systems explicitly engineered for the purpose.
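As a minimal sketch of this first rule–the partitioning keys and store names below are my own illustrative assumptions, not an actual State Department scheme–each cable would be routed to one of many physically separate systems:

```python
from datetime import date

def age_bucket(cable_date: date, today: date) -> str:
    """Coarse age partitioning: current / recent / archive."""
    months = (today.year - cable_date.year) * 12 + (today.month - cable_date.month)
    if months < 1:
        return "current"
    if months <= 6:
        return "recent"
    return "archive"

def store_for(classification: str, region: str, cable_date: date) -> str:
    # Each (classification, region, age) combination names a separate
    # system, so no single store ever holds the whole collection.
    return f"cables-{classification}-{region}-{age_bucket(cable_date, date.today())}"

print(store_for("secret", "mideast", date(2009, 6, 12)))
# e.g. "cables-secret-mideast-archive"
```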

Second, create and manage differentiated access controls tuned to the sensitivity of the information being accessed.  This becomes easy when the first design rule is followed, because access controls can be developed separately for different classes of systems.  Access privileges should be granted for specific systems, each of which holds only a subset of the entire cable collection.  Many users may need direct access only to recent cables, or to cables for certain countries or geographic areas.

Carefully engineered access controls should be present on the systems that aggregate data across multiple systems as well.  The broader the aggregation and the larger the volume of data, the more approvals should be required.  In particular, extracting the entire database should be possible only using a specific, highly secure system designed to access all the subsidiary systems, and approval should be required at the highest level of the organization.
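A sketch of how approval requirements might scale with the breadth of a request (the tiers and thresholds here are illustrative assumptions, not a real policy):

```python
def required_approval(num_stores: int, num_documents: int) -> str:
    """The more stores and documents a request touches, the higher the sign-off."""
    if num_stores == 1 and num_documents <= 100:
        return "none"              # routine use
    if num_stores <= 3 and num_documents <= 1_000:
        return "supervisor"
    if num_documents <= 10_000:
        return "bureau chief"
    return "agency head"           # full-collection extracts only

print(required_approval(num_stores=1, num_documents=20))        # -> none
print(required_approval(num_stores=40, num_documents=250_000))  # -> agency head
```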

All this need not impede the work of intelligence analysts in the field: a search across all cables might return document excerpts and provide full text for several documents–perhaps only the least sensitive–without additional authorization.  Authorization from a supervisor or competent authority would be required to obtain full text of large numbers of documents, perhaps more than a hundred.
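That search behavior might look like the following sketch (the 100-document threshold comes from the paragraph above; the function and field names are hypothetical):

```python
FULL_TEXT_LIMIT = 100  # beyond this, supervisor authorization is required

def render_results(matches: list[dict], authorized: bool) -> list[dict]:
    """Always return excerpts; return full text only within the limit."""
    results = []
    for i, doc in enumerate(sorted(matches, key=lambda d: d["sensitivity"])):
        entry = {"id": doc["id"], "excerpt": doc["text"][:200]}
        if authorized or i < FULL_TEXT_LIMIT:  # least sensitive come first
            entry["full_text"] = doc["text"]
        results.append(entry)
    return results

docs = [{"id": i, "text": "(cable text)", "sensitivity": i % 3} for i in range(250)]
unauthorized = render_results(docs, authorized=False)
assert sum("full_text" in r for r in unauthorized) == FULL_TEXT_LIMIT
```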

Third, track access to all confidential material and limit access for users who exhibit suspicious activity patterns.  That confidential material can be viewed without leaving behind any record of the activity is an inexcusable system design flaw.  It should be possible to see when any user accessed any confidential document.  To ensure the completeness and integrity of these access records, Jay recommends maintaining redundant records from three different perspectives (a code sketch follows the list):

  • Document perspective: who accessed the document, and through which gateway?  Here, I use the term gateway to refer to an access channel and its physical and logical location, e.g., a document viewing application running on a specific desktop computer in a particular room or building.
  • Gateway perspective: which documents were accessed through the gateway, and by whom?
  • User perspective: what documents has this user accessed, and through which gateways?
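A minimal sketch of the three redundant records (the record layout is my own illustration; in a real system, per the first design rule, each log would live on a physically separate system so that a single compromise cannot erase an access record):

```python
import time

document_log, gateway_log, user_log = [], [], []  # three separate systems

def record_access(user: str, document: str, gateway: str) -> None:
    """Write the same access from three perspectives to three logs."""
    ts = time.time()
    document_log.append({"ts": ts, "doc": document, "user": user, "gateway": gateway})
    gateway_log.append({"ts": ts, "gateway": gateway, "doc": document, "user": user})
    user_log.append({"ts": ts, "user": user, "doc": document, "gateway": gateway})

record_access("jsmith", "09STATE12345", "viewer-desk4-vienna")
```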

Following the first design rule, these records should be generated and stored by separate systems.  Other systems should continuously reconcile the records to detect errors or evidence of tampering.  It would be very difficult for a user to conceal unauthorized access, since at least three systems would have to be compromised.
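Continuing the sketch above, reconciliation could be as simple as flagging any access that appears in one log but not the others:

```python
def reconcile(*logs) -> list:
    """Return access records missing from at least one perspective."""
    keyed = [{(r["ts"], r["user"], r["doc"], r["gateway"]) for r in log}
             for log in logs]
    union = set().union(*keyed)
    # A record absent from any one log is evidence of error or tampering.
    return [key for key in union if not all(key in k for k in keyed)]

assert reconcile(document_log, gateway_log, user_log) == []
```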

Monitoring systems should use these records to look for suspicious activity, such as rapid successions of searches that hit broad swathes of the database or attempts to extract documents from one system after another.  In such cases, it should suffice to limit access until the behavior can be reviewed by a competent authority.  In addition to precluding breaches, the knowledge that all accesses are logged and analyzed will discourage improper use of the system.
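One such tripwire, sketched below, would suspend any user who touches too many distinct stores in a short period, pending review (the window and threshold are illustrative assumptions):

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600       # assumed monitoring window
MAX_STORES_PER_WINDOW = 5   # assumed breadth threshold

recent = defaultdict(deque)  # user -> deque of (timestamp, store)

def check_access(user: str, store: str, ts: float) -> bool:
    """Allow the access unless the user's recent activity looks too broad."""
    q = recent[user]
    q.append((ts, store))
    while q and ts - q[0][0] > WINDOW_SECONDS:
        q.popleft()
    if len({s for _, s in q}) > MAX_STORES_PER_WINDOW:
        return False  # freeze access until a competent authority reviews
    return True
```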

The second and third design rules–granular access controls and monitoring of user activity–are already commonly implemented by online services and financial firms.  The first design rule has not been widely adopted, but Jay has demonstrated its effectiveness at Shinsei Bank, and my understanding is that it resembles in principle the service-oriented architectures employed at Amazon and Facebook.

All of which is to say that we should not let Swart off easy.  The State Department’s systems were clearly not designed for security, which is obviously inexcusable for an organization responsible for the nation’s diplomacy.

Contexts as elementary subsystems

This post is part of my collaborative research with Shinsei Bank on highly-evolvable enterprise architectures.  It is licensed under the Creative Commons Attribution-ShareAlike 3.0 license.  I am indebted to Jay Dvivedi and his team at Shinsei Bank for sharing with me the ideas developed here.  All errors are my own.

Contexts are the elementary building blocks in Jay’s system architecture.  I’ll define Contexts precisely below, but let me begin with a passage from The Sciences of the Artificial that provides a frame for the discussion.

By a hierarchic system, or hierarchy, I mean a system that is composed of interrelated subsystems, each of the latter being in turn hierarchic in structure until we reach some lowest level of elementary subsystem.  In most systems in nature it is somewhat arbitrary as to where we leave off the partitioning and what subsystems we take as elementary.  Physics makes much use of the concept of “elementary particle,” although particles have a disconcerting tendency not to remain elementary very long.  Only a couple of generations ago the atoms themselves were elementary particles; today to the nuclear physicist they are complex systems.  For certain purposes of astronomy whole stars, or even galaxies, can be regarded as elementary subsystems.  In one kind of biological research a cell may be treated as an elementary subsystem; in another, a protein molecule; in still another, an amino acid residue.

Just why a scientist has a right to treat as elementary a subsystem that is in fact exceedingly complex is one of the questions we shall take up.  For the moment we shall accept the fact that scientists do this all the time and that, if they are careful scientists, they usually get away with it. (Simon, 1996, 184-5)

For Jay, the Context is the elementary subsystem.  Like an atom, the Context is in fact a complex system; however, designed properly, the internal structure of the Context is invisible beyond its boundary.  Thus, system architects can treat the Context as an elementary particle that behaves according to relatively simple rules.

What is a Context?

A Context is a logical space designed to facilitate the performance of a small, well-defined set of actions by people acting in a small, well-defined set of roles.  Metaphorically, Contexts are rooms in a house: each room is designed to accommodate certain actions such as cooking, bathing, sleeping, or dining. Contexts exist to provide environments for action.  Although Contexts bear some resemblance to functions or objects in software programs, they behave according to substantially different design rules (see below).

Defining the Context as the elementary subsystem enables us, by extension, to define the elementary operation: a person, in a role, enters a Context, performs an action, and leaves the Context.  All system behavior can be decomposed into these elementary operations–I’ll label them Interacts for convenience–in which a person in a role enters, interacts with, and leaves a Context.  The tasks performed by individual Interacts are very simple, but Interacts can be daisy-chained together to yield sophisticated behavior.
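To make the construct concrete, here is a minimal sketch of a Context and an Interact in code–my own illustration of the idea, not Jay’s implementation; all names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Context:
    """A logical space permitting a small, well-defined set of (role, action) pairs."""
    name: str
    permitted: set                           # {(role, action), ...}
    log: list = field(default_factory=list)  # the Context's own record

def interact(person: str, role: str, context: Context, action: str, data=None):
    """One elementary operation: enter the Context, act, leave."""
    if (role, action) not in context.permitted:
        raise PermissionError(f"{role} may not {action} in {context.name}")
    context.log.append((person, role, action))
    return {"person": person, "action": action, "data": data}

kitchen = Context("kitchen", permitted={("cook", "prepare-meal")})
result = interact("alice", "cook", kitchen, "prepare-meal", data="ingredients")
# Daisy-chaining: the output of one Interact becomes the baggage (data)
# carried into the next Context.
```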

Design rules for Contexts

Creating Contexts that can be treated as elementary subsystems requires adhering to a set of design rules.  Below, I describe some of the design rules that have surfaced in my conversations with Jay.  These rules may not all be strictly necessary, and they are probably not sufficient; refining these design rules will likely be an essential part of developing a highly-evolvable enterprise software architecture based on Jay’s development methodology.

  1. Don’t misuse the Context. Allow only those actions that the Context was designed to handle; do not cook in the toilet or live in the warehouse, even if it is possible to do so.  Similarly, maintain the integrity of roles: allow a person to perform only those actions appropriate to his or her role.  The repairman should not cook; guests should not open desk drawers in the study.
  2. Physically separate Contexts. Locate Contexts on different machines.  Never share a database among multiple Contexts.
  3. Only Interacts connect a Context to the rest of the system. Data enter and leave a Context only through Interacts, carried in or out by a person in a role.
  4. There is no central database. Every Context maintains its own database or databases as necessary.
  5. Each Context permits only a limited set of simple, closely related actions. Contexts should be like a European or Japanese house where the toilet, bath, and washbasin are in separate rooms, rather than like a US house where all three are merged into a single room.  If a Context must handle multiple modes of operation or multiple patterns of action, it should be decomposed into multiple Contexts.
  6. Avoid building new Contexts. If a required behavior does not appear to fit in any existing Contexts, decompose it further and look for sub-behaviors that fit existing Contexts. Build new Contexts only after thorough decomposition and careful consideration.
  7. Bring into the Context only those items–those data–that are required to perform the task at hand.
  8. Control entry to the Context. Ensure that only appropriate people, in appropriate roles, with appropriate baggage (data) and appropriate intentions can enter.
  9. Log every Interact from the perspective of the person and the Context. The person logs that he or she performed the action in the Context, while the Context logs that the action was performed in the Context by the person.  This creates a mutually verifying dualism, sketched in code after this list.
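Continuing the hypothetical Context sketch above, rule 9’s dualism might look like this: the person and the Context each keep their own record of the same Interact, and the two records can be checked against each other.

```python
person_logs = {}  # person -> [(context name, role, action), ...]

def logged_interact(person, role, context, action, data=None):
    result = interact(person, role, context, action, data)  # Context logs internally
    person_logs.setdefault(person, []).append((context.name, role, action))
    return result

def verify(person, context) -> bool:
    """Mutual verification: the two perspectives must agree exactly."""
    mine = [e for e in person_logs.get(person, []) if e[0] == context.name]
    theirs = [(context.name, r, a) for (p, r, a) in context.log if p == person]
    return mine == theirs  # any mismatch is evidence of error or tampering

study = Context("study", permitted={("owner", "read-papers")})
logged_interact("alice", "owner", study, "read-papers")
assert verify("alice", study)
```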

Why bother?

The purpose of establishing the Context as an elementary subsystem is to simplify the task of system design and modification.  As Simon points out, “The fact that many complex systems have a nearly decomposable [i.e., modular], hierarchic structure is a major facilitating factor enabling us to understand, describe, and even ‘see’ such systems and their parts.” (1996, 207)  Establishing the Context as an elementary subsystem in enterprise software is a technique for rendering enterprise software visible, analyzable, and comprehensible.

Bounding and restricting the Context vastly simplifies the work of implementors, enabling them to focus on handling a small family of simple, essentially similar actions.  The Context can be specialized to these actions, thereby reducing errors and increasing efficiency.

Contexts hide the complexity associated with data and problem representations, databases, programming languages, and development methodologies, enabling system architects to focus on higher-level problems.  In our discussions, Jay almost never mentions hardware, software, or network technologies, since he can generally solve design problems without considering the internal structures of his Contexts and Interacts.

Since myriad organizational processes are assembled from a relatively small library of simple actions combined in different ways, the systems that support these processes exhibit the same redundancy.  Thus, Contexts designed to handle very simple actions can be reused widely, decreasing the cost and time required to develop new systems.

Finally, it is possible that Contexts, by explicitly associating people and roles with all actions, may help clarify accountability as organizational action develops into an increasingly complex mixture of human and computer decision-making.

Concluding thoughts

In essence, Contexts and Interacts are artificial constructs intended to allow high-level system design problems to be solved independently of low-level implementation problems.  The extent to which the constructs achieve this goal depends on the effectiveness of the design rules governing the constructs’ behavior.  Positing Contexts and Interacts as the elementary subsystems in Jay’s development methodology establishes a theoretical structure for further inquiry, but it neither guarantees their fitness for this purpose nor implies the impossibility of other, perhaps more effective, elementary subsystem constructs.

On several occasions, I’ve been asked how this approach differs from service-oriented architectures.  I’ll explore this question in a subsequent post.