The DXC Blogs – Four Pillars of Modern Infrastructure

18May17

Originally posted internally 6 Jan 2016:

I’d meant to post this before the Christmas break as a guide to things to tinker with over the break, but then I hit the point where pretty much everybody seemed to already be on leave, and it was clearly too late….

So Happy New Year, if you’re not already on top of these things then here’s a taste of what’s coming in 2016.

1) Infrastructure as code

‘Infrastructure as code’ means that infrastructure people now need to worry about source code in the same way that application developers have.

The good news is that distributed source control and collaboration is now pretty much a solved problem[1]. History might well remember Linus Torvalds more as the creator of the Git source control system than just the ‘just a hobby’ Linux kernel project it was built to support.

GitHub

Git is built around the idea of ‘local’ and ‘remote’ repositories, where local is generally a development environment. GitHub provides the remote part where source can be pulled from to avoid starting from scratch, and pushed to once changes are made.

There are alternatives to GitHub out there such as BitBucket and GitLab (there are even alternatives to Git such as Mercurial and Perforce), but we went with GitHub at DXC Technology in order to have a consistent feature set and user experience with the most popular open source projects (that are hosted on GitHub). DXC Technology has an organisation on public GitHub where we can collaborate with customers and partners, and also an enterprise GitHub (for internal ‘inner source’ code)

Fork and Pull

Like the source control systems that came before it Git supports the idea of creating code branches for developing a given feature (or fixing a given bug), with merging back into the ‘trunk’ once a branch has served its purpose. Where things get a little different, and much more powerful, is the concept of ‘fork and pull’.

A fork is a copy of a given project so that somebody can make changes without having to ask permission of the original creator. In the past forks have often been considered to be a bad thing as they lead to potentially exploding complexity as a project splinters away from its original point. This led to a point of view that open source projects that forked had governance issues that might be detrimental to the project (and its users).

Pull requests are where somebody who’s made a fork asks for their change to be incorporated back into the main project. If a fork is ‘beg forgiveness’ then a pull is ‘ask permission’. Using fork and pull together means that a project can very quickly explore the space around it, and bring back what’s of value into the main body; and it’s a process that now lies at the heart of nearly every successful open source project.

Some enterprises have struggled with fork and pull, as it doesn’t fit into traditional governance processes – it is after all a governance process (or the beginnings of one) in its own right. The good news for infrastructure people is that infrastructure as code is generally a new enough thing that there’s no need for demolition work before erecting the new building.

Learn Git/GitHub

On Codecademy

Pro Git Book (it’s worth paying particular attention to the chapter on branching and merging, a topic that’s also well covered at learn Git branching)

2) Configuration Management

The whole point of infrastructure as code is to reach a desired configuration for the infrastructure elements in an automated, programmatic, and repeatable fashion to support infrastructure at scale. That’s normally achieved using some sort of configuration management tool, with the code being what drives it. It’s worth noting that the use of such tools move administration away from people logging into individual machines and carrying out manual tasks.

There are essentially two approaches to configuration management:

  • Imperative – ‘do this, then do that, then do this other thing’. This is what scripts have been doing since the dawn of operating systems, and arguably the mantra of ‘every good systems administrator will replace themselves with a script’ means that we’ve always at least had intent to have infrastructure as code. The problem with imperative systems is that they can be very brittle. All it takes is one unexpected change and the script doesn’t do what’s expected, and potentially everything breaks.
  • Declarative – ‘achieve this outcome’. Modern configuration management tools at least aspire to being declarative, which hopefully makes them less fragile than older script based systems[2].

Ansible

There are lots of popular configuration management tools in the marketplace such as Puppet, Chef, Salt and even the original CFEngine. At DXC Technology we’ve chosen Ansible (and Ansible Tower) largely because of its SSH based agentless operating model, which eliminates the need to install stuff onto things just to begin the configuration process, and also allows it to be used with infrastructure such as networking equipment where it might be impossible to install an agent for a config management tool.

Learn Ansible

Ansible Get Started

Ansible: Up and Running

3) Containers

The use of virtual machines (VMs) to provide resource isolation and workload management has been popular now for well over a decade, and has found its way into the most conservative late adopters. Containers achieve resource isolation and workload management within an operating system rather than by using an additional hypervisor. This has multiple advantages, including quick startup time (fractions of a second) and low memory overhead (due to sharing a common kernel and libraries). The containers approach has been popular in various niches ranging from small virtual private server (VPS) hosting companies to how Google manages the million plus servers spread across its global data centres. Containers are now escaping from those niches, primarily because of the Docker project.

Docker

Docker is a set of management tools for containers that brings the mantra of ‘build, ship and run’

  • Build – a container from a ‘bill of materials’ (known as a Dockerfile) that describe what goes inside the container in a simple script.
  • Ship – a container (or just the Dockerfile that describes it) anywhere that you can copy a file
  • Run – on any environment that can support containers, ranging from a laptop to a cloud data centre.

The ‘run’ part is in many ways pretty much ubiquitous already – any Linux machine or virtual machine with a relatively new kernel can run containers, and soon the same will apply to Windows. This makes ‘build’ and ‘ship’ the more interesting part of the story, and DockerHub provides a central (GitHub like) place where things can be built and shipped from. To complement the public DockerHub DXC Technology will be deploying its own Docker Trusted Registry (which will be equivalent to our Enterprise GitHub).

Learn Docker

I already wrote about how Docker gives you installation superpowers, and put out the plea to Install Docker.

Get Started with Docker

4) Cloud Services

There are all kinds of definitions of ‘cloud’ out there[3], and our industry has suffered from a great deal of ‘cloudwashing’.

The modern pillar is the collection of services that expose an application programmer interfaces (APIs) that can be used to manage the life cycle and configuration of resources (on demand). The API allows the human interface, whether that’s a command line interface (CLI) or web interface or some other bizarre tool to be supplanted; though of course once an API is in place it’s easy to build a CLI, web interface (or even something bizarre) in front of it. It’s possible to automate without APIs (by screenscraping older interfaces), but it’s much easier to automate once APIs are in place.

Infrastructure driven by APIs, better known as Infrastructure as a Service (IaaS) was misunderstood in its early days as a cost play (despite never being *that* cheap). It’s now pretty well understood that the main point is time to market (for whatever it is that depends on the infrastructure) rather than cost per se; though many organisations have the numbers to show that they can get cost savings too.

AWS

Amazon Web Services (AWS) is the elephant in the cloud room. It’s 15x larger than its nearest competitor and 5x larger than the rest of the providers put together. Modern management techniques such as ‘two pizza’ have allowed Amazon to release new features at an exponential rate, which has been combined with huge ($Bn/qtr) infrastructure investment to force every other provider out of business or into an expensive game of catch up[4].

Learn AWS

DXC Technology is part of the Amazon Partner Network (APN), so there’s a structure programme for training and certification. Better still there are cash bonuses on offer for getting professional certifications

Where do I start to get Amazon Web Services (AWS) Training (DXC Technology staff only)

Wrapping Up

These four pillars are tied together by the use of code and the APIs that they drive. I’ll write more on how these come together with techniques like continuous integration (CI) and continuous deployment (CD), and how this fits in to the concept of ‘DevOps’.

Notes

[1] The one remaining caveat here is that GitHub has (re)introduced centralised management to a distributed system, and with it a single point of failure. GitHub (and by extension its users) have been victim to numerous distributed denial of service (DDOS) attacks over the past few years (often attributed to Chinese action against projects that allow circumvention of the ‘great firewall’). This is one reason for having a separate enterprise GitHub installation (as we have at CSC). There are some other answers – check out distributed hash tables (DHT) and interplanetary file system (IPFS) for a glimpse of how things will once again become distributed.
[2] It’s worth noting that declarative systems should easily deal with idempotence, which means that if a change has already been made then it shouldn’t be duplicated.
[3] Though NIST did everybody a favour by publishing a set of standard definitions.
[4] As we see Rackspace and CenturyLink fall by the wayside it leaves Microsoft, Google and EMC pretty much alone in terms of having the resources to compete. Google still outspends Amazon on infrastructure, which is entirely fungible between Google services and Google Cloud Services – so it’s able to do interesting things with scale and pricing. Microsoft has done a great job of upselling its existing customers into its cloud offerings, so for many it’s the cloud they already bought and paid for. It should be noted that both Apple and Facebook operate hyperscale infrastructures along similar lines to the IaaS providers, they just haven’t chosen to go into that business (yet).

Retrospective

This post very much became my manifesto for my time in Global Infrastructure Services. I presented talks based on this at TechCom 2016, and it also formed the basis of the ‘infrastructure as code boot camp’ workshops in the global delivery centres (which has now become a set of online Katacoda scenarios).

We ended up not going with Ansible Tower, for reasons I’m not going to get into here and now – that’s a story for another day.

The incentive for doing AWS Certified Solution Architect training has now been withdrawn, but it helped propel us to having the second largest community of certified people.