When I was revising my book Architecture and Patterns for IT, I started to delve into the then-emerging topic of DevOps. One of the models I put forward in my investigations was a naive "Change vs. Stability" diagram, something along the lines of this:
The trouble with this diagram is that it is not detailed enough, and as a mental model it leads to antipatterns. If Change is the opposite of Stability, then we should change less, right? But Change is also required; there are no truly static systems, and so the result is to change infrequently, with change forced into larger and larger "batches." (More on why this is in the followup post.)
Another problem with this diagram is that it is evocative but imprecise. "Change" and "Stability" are very high-level concepts, so the diagram appeals to intuition and emotion rather than offering something measurable that could guide improvement.
The mental model that one hears around DevOps (and related Lean philosophies) is to change often, so that
- you are responsive to your context,
- change backlogs are minimized,
- no change is too large, and
- you get good at changing.
This runs contrary to the antipattern thinking that results in massively disastrous releases. But to understand why it works, a more detailed model is needed, one more easily quantified:
This was the first draft attempt, and like any architecture it needs some definition if we are going to have a sensible conversation.
First, the abstract concept "Change" becomes just a subject area, a logical grouping, but not a primary model construct. That larger oval could go away and the semantics would be unchanged.
Change frequency could be simply the tickets in your ITSM or kanban tool divided by a time period, e.g. an average of 50 changes/month. A formal, ticketed process is essential at scale because it becomes a platform for collaboration and capability maturation.
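To make the "tickets divided by a time period" idea concrete, here is a minimal Python sketch. The list of dates is made up for illustration, and `average_changes_per_month` is a hypothetical helper, not something an ITSM or kanban tool provides; it assumes you can export the completion date of each change.

```python
from datetime import date

# Hypothetical completion dates of changes, e.g. exported from a ticketing tool.
closed_changes = [
    date(2024, 1, 5), date(2024, 1, 19), date(2024, 2, 2),
    date(2024, 2, 9), date(2024, 2, 23), date(2024, 3, 15),
]

def average_changes_per_month(dates):
    """Number of changes divided by the calendar months spanned (inclusive)."""
    if not dates:
        return 0.0
    first, last = min(dates), max(dates)
    # Count every calendar month from the first change to the last one.
    months = (last.year - first.year) * 12 + (last.month - first.month) + 1
    return len(dates) / months

print(average_changes_per_month(closed_changes))
```

Dividing by the months spanned, rather than only by the months that saw a change, keeps quiet months from inflating the average.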
Measuring Change size is difficult; many metrics such as Lines of Code and Function Points have been tried and found wanting, but nevertheless there are larger and smaller changes. Recently, Scrum methodologists have proposed using T-shirt sizes or numbers from a Fibonacci sequence to avoid the error of false precision in estimation. Because of this I do think that "Change Size" can be operationalized and measured.
Change capability was hard. I have struggled for years with the concept of capability and how to distinguish it from a simple function. Driving in the car yesterday, it finally hit me how one could measure capability: it would be some combination of
- the organization's (e.g. Change Management) experience as measured by completed cycles of its primary process (e.g. Completed Changes),
- the outcome level (quality) of those process cycles (e.g. 98% Change success rate), and
- the resource base needed for and active in those cycles, penalized for turnover (e.g. Currently employed change participants/All change participants).
[Has anyone else ventured down this road of quantifying Capability?]
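The three ingredients above could be combined in many ways. The sketch below is one possibility, not a settled formula: multiplying the factors and applying a logarithm to experience (so the thousandth change teaches less than the tenth) are both assumptions made here for illustration, and `change_capability` is a hypothetical name.

```python
import math

def change_capability(completed_changes, success_rate,
                      current_participants, all_participants):
    """Illustrative capability score: experience x quality x retention."""
    # Assumption: experience has diminishing returns, hence the logarithm.
    experience = math.log10(1 + completed_changes)
    # Turnover penalty: share of all change participants still employed.
    retention = (current_participants / all_participants
                 if all_participants else 0.0)
    return experience * success_rate * retention

# 999 completed changes, 98% success rate, 8 of 10 participants still on staff.
print(change_capability(999, 0.98, 8, 10))
```

A product means that a collapse in any one factor (say, heavy turnover) drags the whole score down, which seems closer to the intuition than a simple sum would be.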
Change risk would be value at risk (e.g. revenue dependent on the system) multiplied by probability of event (e.g. unsuccessful change).
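As a small worked example of that multiplication in Python: the revenue figure is invented, and taking the probability of an unsuccessful change as one minus a 98% success rate is an assumption made for illustration.

```python
def change_risk(value_at_risk, failure_probability):
    """Expected loss: value at risk multiplied by probability of the event."""
    if not 0.0 <= failure_probability <= 1.0:
        raise ValueError("failure_probability must be between 0 and 1")
    return value_at_risk * failure_probability

# Hypothetical: 1,000,000 in revenue depends on the system,
# and 2% of changes are unsuccessful.
risk = change_risk(1_000_000, 0.02)
print(round(risk, 2))
```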
Finally, Stability here is equated to service uptime. This led to some debate, which I will discuss in the next installment.
I then took a first cut at the relationships between them. At this point they are conjectures, but I think they reflect DevOps philosophy.
Change Frequency has a positive impact on Change Capability. The more Changes, the better you are at handling Change. Note that "size" of change is not relevant to this relationship - it is a separate variable.
Change Frequency has a negative relationship to Change Size. The more often you change, the smaller the changes will be, everything else being equal.
Change Capability has a negative relationship to Change Risk. That is, if Change Capability improves, Change Risk decreases, and vice versa.
Change Size has a positive relationship to Change Risk: larger changes carry more risk, and smaller changes carry less.
Finally, Change Risk has a negative effect on Stability.
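The five conjectured relationships above form a small signed graph, and one way to check what they imply together is to multiply the signs along each path. The Python sketch below does that; the dictionary encoding and the `path_signs` helper are illustrative choices made here, and the debated link between Change Size and Change Capability is deliberately left out.

```python
# +1 means "moves in the same direction", -1 means "moves in the opposite direction".
EDGES = {
    ("Change Frequency", "Change Capability"): +1,
    ("Change Frequency", "Change Size"): -1,
    ("Change Capability", "Change Risk"): -1,
    ("Change Size", "Change Risk"): +1,
    ("Change Risk", "Stability"): -1,
}

def path_signs(source, target, edges=EDGES):
    """Map each simple path from source to target to the product of its edge signs."""
    results = {}

    def walk(node, path, sign):
        if node == target:
            results[tuple(path)] = sign
            return
        for (src, dst), edge_sign in edges.items():
            if src == node and dst not in path:  # skip nodes already visited
                walk(dst, path + [dst], sign * edge_sign)

    walk(source, [source], 1)
    return results

for path, sign in path_signs("Change Frequency", "Stability").items():
    print(" -> ".join(path), "net effect:", "+" if sign > 0 else "-")
```

Both routes from Change Frequency to Stability, one through Change Capability and one through Change Size, come out net positive. That only shows the conjectures are consistent with "change often"; it says nothing about how strong each effect is.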
I considered the relationship between Change Size and Change Capability and decided it was fraught. One might make the argument that large changes also increase Change Capability, as they arguably embed more valuable outcomes. But large changes are subject to what I call the Korey Stringer effect.
"If it doesn't kill me it makes me stronger. Oops, I died."*
Intuitively, I don't think that high-stakes, high-drama changes requiring marathon working hours actually *do* increase Change Capability. Heroes emerge, unsustainably; experienced people leave; business sponsors lose confidence; and more. So I put that line in with a (?) for further debate, and to show that I had thought about it.
So, this brings us up to when I tweeted the picture. I got a number of interesting responses and I am going to write a followup blog post addressing them. (So Rob, Majid, stay tuned.)
*Apologies if the imagery gives offense, but I can't think of a better case.