Cloudflare recently announced two additional capabilities for their “serverless” Workers: support for WebAssembly as an alternative to JavaScript, and a key-value store called Workers KV. WebAssembly will allow Workers to be written in compiled languages such as C, C++, Rust and Go. Workers KV provides an eventually consistent state storage mechanism hosted across Cloudflare’s global network of over 150 data centres.

Continue reading the full story at InfoQ.

This is one of those posts that started life on an email thread. It comes from a discussion on the topic of multi cloud governance for large enterprises.

Why cloud?

The answer is not ‘cloud is cheaper’, because it just isn’t. We know from Amazon’s financials that it’s gushing money because cloud is a high margin and very profitable business. Any halfway competent large enterprise can run its own data centres cheaper, and the mainframe and midrange legacy isn’t going anywhere so neither are those data centres.

The answer is ‘cloud is faster’. Cloud enables quicker delivery of value in response to changing user needs. ‘Digital’ is built out of continuous delivery (CD) pipelines, and CD pipelines need cloud at the end of them, because you can’t do continuous delivery without API enabled on demand infrastructure (which is lots of words for ‘a cloud’).

What kind of cloud?

Having focused our attention on the right category – applications that deliver outcomes more quickly by constructing CD pipelines – we finally come to the vexed question of what type of services their developers want to consume. Is it IaaS, CaaS, FaaS (or some combination)?

Safety First

The crucial outcome is safety[1]. The app developers need to be able to consume services without crossing lines with respect to security and compliance.

Safety costs money (but far less than unsafe practices) so we end up at a cost/benefit trade off. For every class of services that’s consumed, how much does it cost to make them safe, and what benefit is accrued (in speed to market and revenue achieved from that)? This doesn’t just apply to which cloud will be used (AWS vs Azure vs Google); it also applies at a finer grained level to which services are consumed within those clouds.

Control vs Productivity

Most large enterprises are looking down the wrong end of the telescope. They’re thinking about control before productivity rather than productivity before control; and that pattern is born of the traditional enterprise mindset of standardisation to reduce complexity and cost.

Many enterprises feel like they have these options:

  1. Make the complexity appear to go away by putting a unifying broker layer between the cloud and its consumer.
  2. Accept that there will be multiple data centres and cloud providers and use some sort of common technology (e.g. containers) to enable common approaches to application development and deployment.
  3. Some combination of (1) and (2) where elements of the infrastructure and operations are treated in a common way (e.g. billing, operational data, software delivery).

None of them are good, because all of them presuppose that we make an activity safe before we even know what the activity is. It’s a great way of spending time and treasure on something that never delivers any practical value[2].

Getting to the cost/benefit trade off

This means actually knowing the cost, and actually being able to figure out the benefit. A managed cloud service with safety built in (like you get from a services company rather than the raw cloud underneath it) will show a sticker price for cost – so we’re half way there. The other half naturally extends into the enterprise’s own investment appraisal processes (and there’s room for services companies to be facilitators for the extra work needed there).

In the end it doesn’t matter whether they use one cloud or five. Whether they use containers, functions or VMs. It only matters that engineers are safe, and the overhead of achieving that safety is part of the broader business case for the functionality.

The final issue is the classic ‘buy the restaurant to eat the first meal’ problem that so often comes up with large enterprises, where any given app can’t make the business case stand up on its own. That takes us to a portfolio management exercise where the investment in safety for a category of apps with common needs gets justified by the expected benefits of all those apps (and the apps that might follow them). So there’s a need for some degree of aggregation. What happens next is essentially paving the cow paths – early success becomes the path of least resistance for things that follow (so that they don’t have to make fresh cases for their own new safety demands).

What this brings us to is the ability to say to application developers ‘you can have anything you want provided that you can afford the safety’, and ‘you may club together to pay for safety’. That leads to an initial cohort of apps that make something happen within the confines of a given set of safety systems. It’s easy to follow them, because the safety investment has already been paid down. Apps can choose not to follow, but that choice comes with the consequence that they need to stand up their own safety and pay the premium.

The key point is ‘they come and we build it’ rather than ‘build it and they shall come’.


[1] I’ve written about safety (first) before.
[2] For more on why brokers are considered an antipattern take a look at Barclays’ Infrastructure CTO Keiran Broadfoot in his 2017 re:Invent presentation.

Our expectations for user experience are shaped by the huge consumer platforms such as Google, Apple, Facebook and Amazon, and also the devices that we access them with such as iPhones, iPads, Android phones and tablets, and the PCs or Macs we use at home.

When we use these consumer devices and services, there’s no help desk for issues that come up, but in general there’s also no need for a help desk because the devices “just work.” That’s the type of user experience organisations strive for on their journey to digital.

Head to the full piece to continue reading.


The hardware and software for my Raspberry Pi sous vide setup had remained the same for over 5 years, but a failed remote controlled socket forced me to update almost everything.


The Maplin remote control socket would turn on, and briefly supply power to the slow cooker, but then it would appear to trip. This wasn’t the first time, as the original socket failed after a few years, but this time there’s no chance of getting a replacement from Maplin as they’ve gone out of business.


After a bit of hunting around I tried ordering a Lloytron A1210WH socket set, as it looked identical to the Maplin one (and hence was likely to have come from the same original equipment manufacturer), but the package I received had the newer model A1211WH.


I could have returned them in the hope of getting some of the older ones from elsewhere, but I decided to bite the bullet and just make things work with the new ones.

Blind Alley

I had a go at using 433Utils, which was able to read different codes from the new remote with RFSniffer, but sending those codes using codesend just didn’t work (and didn’t even show the same code being played back on RFSniffer).


The author of How I Automated My Home Fan with Raspberry Pi 3, RF Transmitter and HomeBridge had a similar issue with 433Utils, and used pilight to read codes from his remote then send them out from a Pi controlled transmitter.

Here began a bit of a struggle to get things working. My ancient Raspbian didn’t have dependencies needed by pilight, so I burned a new SD card with Raspbian Stretch Lite (and then enabled WiFi and SSH for headless access).

My initial attempts to use pilight-debug crashed on the rocks of missing config:

pilight no gpio-platform configured

In retrospect the error message was pretty meaningful, but Google didn’t help much with solutions, and all of the (pre version 8) example configs I’d seen didn’t have the crucial line for:

gpio-platform: "raspberrypi1b1"

That raspberry version maps to the options for the WiringX platform that sits within pilight.

With the config.json sorted for my setup (GPIO 0 to transmitter, and GPIO 7 to receiver [as a temporary replacement for the 1Wire temperature sensor]) I was able to capture button presses from the remote control. It quickly became apparent that there was no consistency between captures, and I’m guessing the timing circuits just aren’t that accurate. But the patterns of long(er) pulses to short(er) pulses were consistent, so I extracted codes that were a mixture of 1200us, 600us and a 7000us stop bit (gist with caps, my simplification, and generated commands).
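As a rough illustration of the simplification described above, this Python sketch snaps noisy pulse timings to the nearest of the three nominal lengths. Only the 600us, 1200us and 7000us figures come from the captures; the sample capture and the function names are made up for illustration.

```python
# Nominal pulse lengths in microseconds, as described in the text.
NOMINAL_US = (600, 1200, 7000)


def snap(pulse_us):
    """Return the nominal pulse length closest to a measured one."""
    return min(NOMINAL_US, key=lambda nominal: abs(nominal - pulse_us))


def normalise(capture):
    """Snap every pulse in a capture to its nearest nominal length."""
    return [snap(pulse) for pulse in capture]


def to_raw_code(pulses):
    """Join pulses into a space separated string of timings."""
    return " ".join(str(pulse) for pulse in pulses)


# Hypothetical noisy capture (not real data from the remote).
capture = [1183, 642, 571, 1236, 1210, 588, 617, 7045]
clean = normalise(capture)
print(to_raw_code(clean))
```

Nearest-value snapping is just one way to do the clean-up, and it assumes the jitter is small compared with the gaps between nominal lengths. Whether a space separated list of timings is exactly what a given sending tool expects is also an assumption to check against that tool's documentation.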


With timings in hand I was able to turn the new sockets on and off with pilight-send commands in shell scripts for on and off. It was then just a question of updating my control script to invoke those rather than the previous strogonanoff scripts (having migrated my entire sousvide directory from old SD to new via a jump box with a bit of tar and scp).

In Plain Sight


“The future is already here — it’s just not very evenly distributed.” – William Gibson

This post is about a set of powerful management techniques that have each been around for over a decade, but that still haven’t diffused into everyday use, and that hence still appear novel to the uninitiated.

Wardley Maps

Simon Wardley developed his mapping technique whilst he was CEO at Fotango.

A Wardley map is essentially a value stream map, anchored on user need, and projected onto an X axis of evolution (from genesis to commodity) and a Y axis of visibility.

The primary purpose of Wardley maps is to provide situational awareness, but they have a number of secondary effects that shouldn’t be ignored:

  • Maps provide a communication medium within a group that has a predetermined set of rules and conventions that help eliminate ambiguity[1].
  • Activities evolve over time, so map users can determine which activities in their value chain will evolve anyway due to the actions of third parties, and which activities they choose to evolve themselves (by investment of time/effort/money).
  • Clusters of activities can be used to decide what should be done organically within an organisation, and what can be outsourced to others.

Working Backwards

Amazon’s CTO Werner Vogels wrote publicly about their technique of working backwards in 2006, and the origin stories of services like EC2 suggest that it was well entrenched in the Amazon culture before then[2].

The technique involves starting with a press release (and FAQ) in order to focus attention on the outcome that the organisation is trying to achieve. So rather than the announcement being written at the end to describe what has been built, it’s written at the start to describe what will be built, thus ensuring that everybody involved in the building work understands what they’re trying to accomplish.

A neat side effect of the technique is that achievability gets built in. People don’t tend to write press releases for fantastical things they have no idea how to make happen.

Site Reliability Engineering (SRE)

class SRE implements DevOps

SRE emerged from Google as an opinionated approach to DevOps, and was eventually published as a book. Arguably SRE is all about Ops, complementing Dev as practised by Software Engineers (SWEs); but the formalisation of error budgets and Service Level Objectives (SLOs) provides a very clean interface between Dev and Ops to create an overall DevOps approach.

SRE isn’t the only way of getting software into production and making sure it continues to meet expectations, but for organisations starting from scratch it’s a well thought through and thoroughly documented approach that’s known to work (and with a prefabricated market of practitioners); so making up an alternative seems fraught with danger. It’s no accident that Google’s using SRE at the heart of its Customer Reliability Engineering (CRE) approach, where it crosses the traditional cloud service provider shared responsibility line to work more closely with its customers[3].
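To make the error budget idea concrete, here is a minimal Python sketch of the arithmetic behind an availability SLO. The 30 day window and the function names are assumptions for illustration; teams pick their own windows and their own definitions of what counts as unavailable.

```python
def error_budget_minutes(slo, window_days=30):
    """Minutes of allowed unavailability for an availability SLO over a window."""
    window_minutes = window_days * 24 * 60
    return (1 - slo) * window_minutes


def budget_remaining(slo, downtime_minutes, window_days=30):
    """Fraction of the error budget left after some downtime (negative if overspent)."""
    budget = error_budget_minutes(slo, window_days)
    return (budget - downtime_minutes) / budget


for slo in (0.999, 0.9999):
    print(f"{slo:.2%} SLO -> {error_budget_minutes(slo):.2f} minutes per 30 days")
```

The budget is the interface: while some of it remains Dev can keep shipping changes, and once it’s spent the focus shifts to reliability. Under these assumptions a ‘4 nines’ SLO leaves a little over four minutes a month.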

Pulling it all together

These techniques don’t exist in isolation. Whilst each is powerful on its own they can be used in combination to greatly improve organisational performance. Daniel Pink’s Drive talks about Autonomy, Mastery and Purpose in terms of the individual, but at an organisation level[4] they might fit like this:

  • Autonomy – Wardley maps provide a way to focus on the evolution of a specific activity, and with that determined the team can be left to figure out their way to achieving that.
  • Mastery – SRE gives us a canned way to get software into a production environment, making it clear which skills are needed and must be brought to bear.
  • Purpose – the outcome orientation that comes from working backwards provides clarity of purpose, so nobody is in doubt about what they’re trying to accomplish.


[1] I commonly find that when I introduce Wardley mapping to senior execs their initial take is ‘but that’s obvious’, because they internally use something like the mapping technique as part of their thought process. I then ask them ‘do you think your entire team shares your views of what’s obvious?’.
[2] Arguably a common factor for many of these approaches is that they become public at a point where the companies they emerged from have determined that there’s nothing to lose by talking about them. In part that’s down to inevitable leakage as staff move on and take ways of working with them, and in part it’s because it does take so long for these techniques to find widespread use amongst potential competitors.
[3] A central argument here is that achieving ‘4 nines’ availability on a cloud platform is only possible when the cloud service provider and customer have a shared operations model, and sharing operations means having a mutually agreed upon mechanism for how operations should be done.
[4] An organisation might be an Amazon ‘2 pizza’ team, or an entire company.

Silent PC



I’ve been very happy with the silence of my passively cooled NUC for the past 4 years, but it was starting to perform poorly. So when I came across a good looking recipe for a silent PC with higher performance I put one together for myself.


I’ve been running my NUC in an Akasa silent case since shortly after I got it, and it’s been sweet, until it wasn’t. Silence is golden, but having a PC that’s constantly on the ragged edge of thermal limiting for the CPU and/or SSD[1] became pretty painful[2]. When I came across this Completely Silent Computer post a few weeks back I knew it was exactly what I wanted[3].


I pretty much followed Tim’s build, with a few exceptions:

  • I went for the black DB4 case
  • In line with his follow up Does Pinnacle Ridge change anything? I went with the Ryzen 5 2600 CPU
  • The MSI GeForce GTX 1050 Ti Aero ITX OC 4GB was available, so I went with that
  • A Samsung 970 SSD (rather than the 960)

Unfortunately I wasn’t able to get everything I needed from one place, so I ended up placing three orders:

  1. QuietPC for all the Streacom stuff (DB4 case, CPU and GPU cooling kits and PSU)
  2. Overclockers for the motherboard, SSD and CPU
  3. Scan for the RAM and GPU

By some miracle everything showed up the following day (with the Scan and Overclockers boxes coming in the same DPD van). The whole lot came to £1551.34 inc delivery, which is a bit better than the AUD3000 total mentioned in the original post. I didn’t exhaustively shop on price, so it’s possible I could have squeezed things a little more.

Testing, testing

There aren’t that many components, and they only work as a whole, so I put it all together and (of course) it didn’t work first time. The machine would power on, but there was no output from the graphics card.

There was nothing to go on for diagnostics other than that the power button and LED were apparently working.

So I had to pretty much start over, with everything laid out on a bench, and I found that the CPU wasn’t seated properly. I guess having such a complex heat pipe system attached puts a fair bit of mechanical force onto things that can dislodge what seemed like a sound fit.

As I was checking things out I also noticed that the SSD was perilously close to a raised screw hole on the motherboard holder, which I chose to drill out – better safe than sorry.

Putting it back together I retested after each stage in the construction (each side of the case in terms of heat transfer arrangements), and everything went OK through to completion.

It’s fast

Geekbench is showing a single core score of 4329 and a multi core score of 20639.

That’s way ahead of my NUC which managed 2420 and 4815 respectively. It even beats my son’s i5-6500 based gaming rig that clocked 3208 and 8045.

Cool, but not super cool

As I type the system is pretty much idle, but I’m seeing a CPU temperature in the range of 57-67C, which is nothing like the figures Tim got when measuring Passively-cooled CPU Thermals. The GPU is telling me it’s at 48C. There are a few factors that come into play here:

  • It’s baking in the UK at the moment, so my ambient temperature is 28C rather than 20C.
  • One of the Streacom heat pads was either missing or got lost during my build, so I’m waiting on another to arrive. Thus the thermal efficiency of the CPU cooling isn’t presently all it could be.

I’d also note that I went with the LH6 CPU cooling kit despite having no plans to overclock as I’d like to keep everything as cool as possible.

The case temperature is around 40C, so hot to the touch, but not burning hot. In the winter I might appreciate the warmth it radiates, but right now I’d rather have it off my desk.

Cable management

The DB4 case design has everything emerging from the bottom, which might look amazing for photos when it’s not plugged into anything, but is far from ideal for actual usage. I’ve bundled the cables together and tied them off to the stand, but this is not a machine that makes it easy to pop things into. There are a couple of USB ports on one corner (which I’ve arranged as front right), but using them is a fiddle.

I’m pleased to have USB ports on my keyboard and a little hub sat on my monitor pedestal.


After using a silent PC for over 4 years there’s no way I’d go back to the whine of fan noise, so I was pleased to find an approach that kept things quiet whilst offering better performance. The subjective user experience is amazing (this is easily the fastest PC I’ve ever used), so my fingers are crossed that it stays that way.


[1] There’s not much talk about thermal throttling of SSDs, but it is a thing, and it can badly hurt user experience when your writes get queued up. I do worry that my new M.2 drive is sat baking at the bottom of the new rig, and if I find myself taking it apart again I might stick a thermal pad in place so that it can at least conduct directly onto the motherboard tray.
[2] I suspect that over the years the Thermal Interface Material (TIM) in the CPU degraded, leading to the whole rig running hotter, leading to a spiral of poor performance. When it was new it ran quick enough, and (relatively) cool, but it seems over time things got worse.
[3] I considered another NUC, and the Hades Canyon looks like it would have met my needs, but Akasa don’t yet do a silent case for it.

#2 of jobs that should exist but don’t in most IT departments (#1 was The Application Portfolio Manager).

What’s a constraint?

From Wikipedia:

The theory of constraints (TOC)[1] is an overall management philosophy introduced by Eliyahu M. Goldratt in his 1984 book titled The Goal

It’s the idea that in a manufacturing process there will be a constraint (or bottleneck) and that:

  • there’s no point in doing any optimisation work before the constraint, because that will just make work in progress stack up even quicker
  • there’s no point in doing any optimisation work after the constraint, because the work in progress is still stuck upstream

TOC drives us towards a singular purpose – identify the present constraint and fix it.
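The two bullet points above can be sketched in a few lines of Python. This is a toy model that assumes a purely serial process, and the stage names and capacities are made up for illustration.

```python
def throughput(capacities):
    """A serial process can only move as fast as its slowest stage."""
    return min(capacities.values())


def constraint(capacities):
    """Name of the stage that currently limits throughput."""
    return min(capacities, key=capacities.get)


# Hypothetical stages with made-up capacities in units per hour.
line = {"cut": 12, "weld": 5, "paint": 9}
print(constraint(line), throughput(line))

line["cut"] = 24      # optimise before the constraint
print(constraint(line), throughput(line))

line["paint"] = 18    # optimise after the constraint
print(constraint(line), throughput(line))

line["weld"] = 20     # fix the constraint itself
print(constraint(line), throughput(line))
```

Only the last change moves the overall number, and once it does a different stage becomes the limiting one.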

Of course this becomes a game of ‘Whac-A-Mole’: as soon as one constraint is dealt with, another lies waiting to be discovered. But it’s an excellent way of ensuring that time, money and other resources are focused in the right place, and it’s the starting point for continuous improvement that takes advantage of incremental gains.

The constraint unblocker

The constraint unblocker is an individual who’s empowered to work across an organisation identifying its constraints and leading the efforts to fix them.

James Hamilton

One of my industry heroes is Amazon’s constraint unblocker – James Hamilton[2]. He has:

  • Reinvented data centre cooling (and many other aspects of data centre design)
  • Reinvented servers
  • Reinvented storage
  • Reinvented networks
  • Modified power switching equipment

Take a look at his AWS Innovation at Scale presentation for some depth, or the Wired article Why Amazon Hired a Car Mechanic to Run Its Cloud Empire.

The consequence of that list above shouldn’t be underestimated. Where Hamilton (and his like at Google, Facebook etc.) have led, the entire industry has followed.

The impact of that list also shouldn’t imply that there’s no point in doing this elsewhere. This approach isn’t just the preserve of hyperscale operators. All IT shops have their constraints, and so all IT shops should have a leader who’s focused on unblocking them.

TOC and DevOps

There’s a close relationship between TOC and DevOps. The Goal inspired The Phoenix Project and the ‘3 DevOps Ways’ of Flow, Feedback and Continuous Learning by Experimentation are all about dealing with constraints.

That isn’t however to say that organisations doing DevOps have everything covered. The 3 ways make sure that constraints are addressed in the context of a single continuous delivery pipeline for a single product, but as soon as there’s more than one product there’s most likely a global constraint that can’t be dealt with at a local level.

Amazon may be doing DevOps up and down the organisation, and they very effectively organise themselves into ‘2 pizza’ teams ‘working backwards’ building micro services to power their ever expanding service portfolio. But they still need James and his team working top down to get the big roadblocks out of their way as they spend $Billions scaling their infrastructure.

Data (science) required

Notionally this stuff was easy with manufacturing. Look down on the factory floor and you can see the workstation where the work in progress is stacking up. Pop down there and figure out how to fix it.

Of course the reality was much messier than that, which is why Goldratt quickly found himself revising The Goal, and a whole consulting industry sprang up around TOC. But with software we have to acknowledge from the outset that we’re not going to see work in progress physically piling up; and beyond DevOps it’s entirely possible that the constraint may have little to do with ‘work in progress’.

Thus in IT we need data to find our constraints, and we usually need that same data (or more) to inform the model-hypothesise-experiment process that determines what to do about a constraint. In my own work (that we now brand Bionix) that’s why we start with the data science team and their analytics.

Why bother?

My personal observations of TOC in action over the past few years have generally found a 20% improvement in efficiency/effectiveness on the first iteration. That’s not ‘move the decimal point’ disruption, but it’s a realistic first approximation of what’s achievable in a six week cycle.

Of course because this is Whac-A-Mole you never get the same pay off again. The next iteration might be 15%, then 12%, then 9% and quickly off into the weeds. But stack those gains on top of each other and you’re quickly into completely different territory.
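A quick Python sketch shows how those diminishing gains stack. It assumes that each gain multiplies on top of the previous ones, which is a simplification, and the function name is just for illustration.

```python
def stack_gains(gains):
    """Compound a sequence of fractional gains, returning the running totals."""
    total = 1.0
    running = []
    for gain in gains:
        total *= 1 + gain
        running.append(total)
    return running


# The illustrative sequence from the text: 20%, then 15%, 12% and 9%.
for iteration, total in enumerate(stack_gains([0.20, 0.15, 0.12, 0.09]), start=1):
    print(f"after iteration {iteration}: {total - 1:.0%} better than the start")
```

On that assumption four iterations compound to roughly 68% better than the starting point, which is a lot more than any single step suggests.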


As we can see from Amazon even the best organisations have constraints, and they can benefit from having a leader focused on identifying and fixing them. That way they can achieve continuous improvement and the fruits of incremental gains across the organisation, and not just in a product silo.


[1] I find it somewhat frustrating that ‘theory’ is used here as it makes the approach seem ‘academic’ and thus easily dismissed by those claiming that they only care about practical outcomes.
[2] James starred in my ScotCloud keynote last year “Our problems are easy”.

#1 of jobs that should exist but don’t in most IT departments

What should we do about all the legacy stuff?

This was a question that came up at the closing panel of the Agile Enterprise Rome conference I was at in May. The context was ‘we’ve spent a couple of days hearing about this great stuff with microservices and containers and serverless, but what should we do about our legacy?’.

I’ve heard this question, or some variant of it, many times over my career.

My answer in Rome was something like this:

The very reason that legacy exists is that it satisfies a business need at a price point that’s better than migrating to something new.

There are some important implications to that statement:

  1. You’ve actually figured out what the migration costs are
  2. Those costs are regularly re-evaluated to take account of industry changes

Those things imply a portfolio management approach where each application has a value and a cost to trade out of a given position, and where the portfolio is reappraised on a regular basis. This isn’t something I see being done in a particularly structured way in (m)any organisations[1].

Step functions and gravity wells

A big part of the problem here turns out to be non linearities in the (license) cost for many legacy systems.

How much do you need to reduce your mainframe MIPS to cut your mainframe spend by 50%?

It turns out that the answer to that isn’t anything like 50%, or even 75% or 90%. In most cases it’s essentially impossible to cut mainframe spend by reducing usage unless the mainframe is completely eliminated. The same is roughly true for many classes of legacy software, driving an ‘all or nothing’ approach.

This picture is further complicated by bundling within Enterprise License Agreements (ELAs), where account managers will hold firm on well established revenue (their cash cows) but happily throw in some of their shinier new stuff[2]. There’s also the issue of ‘where software goes to die’ vendors that acquire and hoard legacy assets, giving them multiple points of leverage when it comes to ELA (re)negotiation time – they’re good at playing the portfolio management game.

5 Rs

There are multiple options for what happens to an application when it’s moved off a legacy system. Gartner suggests the 5 Rs[3] in its ‘Five Ways to Migrate Applications to the Cloud’:

  1. Rehost
  2. Refactor
  3. Revise
  4. Rebuild
  5. Replace

Broadly this has approximately nothing to do with ‘the Cloud’. Each path implies a different cost/value trade off that needs to be assessed.

For most applications it will be simple to eliminate most of the Rs as viable potential courses of action, leaving one or two to be properly considered and priced.

Who’s your head of application portfolio management?

That becomes the pertinent question. If this isn’t somebody’s job, then it’s probably nobody’s job, and it won’t be getting done. If organisations aren’t active about this portfolio management approach then inertia will take charge of their direction.


Applications are an investment, and like any other investment they should be managed. A portfolio approach, and tools to evaluate trade offs and options, naturally follow; and of course the process has to be iterative, because the world keeps changing. If organisations aren’t active about this, then their direction gets determined by inertia.


[1] I’ve seen IT Portfolio Management tools like Alfabet (now owned by Software AG) implemented in some organisations, but even then I’ve seen little evidence of the tools being used in a rigorous way (or having much impact on overall IT strategy).
[2] Aka the ‘drug dealer model’.
[3] With thanks to Johan Minnaar for bringing my attention to the model and my colleague Jim Miller for highlighting its ubiquity.


If you can persuade people that their side is going to win without their vote, then perhaps just enough of them won’t bother to show up that you can steal the win.


The two countries that I spend most of my time in (the UK and US) continue to reel from the effects of narrowly won campaigns that didn’t turn out how the pundits predicted. Social media is credited (by which I mean blamed) for much of this. But the narrative that I’m seeing seems incomplete, and hence doesn’t ring true – no wonder there’s so much cognitive dissonance around this issue.

Activating voters

The role of social media in bringing people into a campaign first came to light during Obama’s run in 2008. Widespread use of social media itself was pretty new then, but the ability for politicians to connect with voters without intermediaries was and remains hugely powerful. I have no doubts that Trump connected better with his base as a consequence of his positive use of social media, and I also think Leave were more savvy than Remain in the Brexit referendum[1].

I use the term ‘positive’ here without any value judgement of a particular side or campaign, but rather for the ability of a politician to connect with their voters in a direct and authentic way that activates them to vote in their favour.

Depressing voters

Michael Moore used the term ‘depressed voter’ in his 5 reasons Trump is going to win:

… it will be what’s called a “depressed vote” – meaning the voter doesn’t bring five people to vote with her. He doesn’t volunteer 10 hours in the month leading up to the election. She never talks in an excited voice when asked why she’s voting for…

This becomes the negative side of influencing the electorate:

  • You’re going to win anyway – so treat yourself to that lie in
  • They’re all as bad as each other – what’s the point in voting

It doesn’t need to appeal to anything besides apathy and indifference, and it’s negative because it stops a voter from voting. Whatever their intention might have been, it doesn’t show up at the ballot box.


As we continue to pick over the outcome of these votes there’s a ton of analysis about who voted which way, and why, and how they might have been influenced by social media campaigns. And then things start getting murky over how those campaigns were orchestrated and financed.

But things get even murkier if we look at who didn’t vote, and why, and how they might have been influenced by social media campaigns. And how those campaigns were orchestrated and financed.

But wait… there’s more

The role of polls and pollsters, and the interplay with social media is only just starting to be examined. The simple lesson here seems to be that the only poll that matters is the actual vote, and anything else might well be part of a disinformation campaign or an elaborate con.

Update 5 Jul 2018 – A couple of days after I posted this Cory Doctorow published Zuck’s Empire of Oily Rags on the same topic. He doesn’t focus on the negative aspects I note above, but the general narrative is (in my opinion) spot on. The line that I expect will be quoted most is:

Cambridge Analytica didn’t convince decent people to become racists; they convinced racists to become voters.

What may also happen here is that they convinced decent people to be apathetic about voting.


[1] This observation extends to just about everything to do with modernity. Remain ran a campaign that wouldn’t have been out of place in the 19th century, and were completely outplayed by Dominic Cummings and his understanding of stochastic processes (branching histories) and OODA loops.

I first used this analogy at an Open Cloud Forum event in Zurich a couple of months back, and I just used it again in a panel discussion at DevSecOps Days London. I’ve been meaning to incorporate it into a DevOps presentation, but until then…


The ‘traditional’ Enterprise IT approach to stability is a game of Jenga – don’t touch anything in case the tower falls over. Each change feels like it brings us closer to calamity; and eventually it does all fall down and you have to pick up the pieces, put them back in place, and start over.

Riding a Bike

The agile/DevOps approach to stability is to keep moving forward, like riding a bike – if you have enough velocity, you’re stable.