How long to provision a VM?

11Dec19

This question came from a colleague, and it’s one of those questions that I’m surprised and saddened that we still have to ask.

My history with this stuff

Some 15 years ago I was a beta tester for Leslie Muller’s ‘Virtual Developer Environment’ (VDE)[1], which was a web site that let me request a virtual machine (VM). I’d fill in a simple form, and about 8 minutes later I’d get an email with connection details for my VM.

There’s a technical aspect to this

It took 8 minutes to provision a VM because that’s how long it took to copy the bytes from the golden image on a file share to the VM server.

When shadow volume copy technology came along on modern storage systems, that copy time went away, and it became possible to get VMs as quickly as they could boot (which is a few seconds for a reasonable machine image on reasonably fast storage).

So technically the time to provision a VM is somewhere between a few seconds and a few minutes; and that’s been true for the whole time we’ve had VM provisioning systems.

And there’s an organisational aspect to this

Provisioning a VM takes resources (CPU, RAM, storage). Resources cost money, and organisations like to control how they spend money.

If an organisation makes an up front decision to spend money on VMs for a certain purpose, and that decision can be bound into the provisioning process, then there’s no good reason why VMs can’t be provisioned in seconds to minutes per the technical constraints above.

This is why public cloud is fast, because the decision to spend money has already been made. That decision might be implicit, but it nevertheless has been taken. Public clouds have also done some cool stuff behind the scenes to make provisioning fast (e.g. by having a pool of Windows servers pre provisioned waiting for users[2]).

If on the other hand an organisation decides to embed the spending decision as an approval workflow that presents as part of the request process for a VM then it can take a while (days to weeks) because it’s no longer a straight through automated process, but instead something waiting on (multiple) human interactions[3].

There can be further technical hurdles

The CPU, RAM and storage needed for a VM can be contained reasonably elegantly into the virtual server infrastructure (pretty much irrespective of how that’s put together). But the observant will have noticed that network is missing from that list, and VMs need things like IP addresses and DNS names.

Of course that stuff can be completely automated with things like DHCP, but many organisations have rules against that sort of thing (because some time in the mid 90s some nugget managed to exhaust a DHCP table by setting the lease duration too long or similar foolishness).

If IP addresses come from Fred’s IP address spreadsheet, then the provisioning process will bottleneck there. If server names come from Tina’s tombola, then the provisioning process will bottleneck there too.

IP Address Management (IPAM) has emerged as a product category that fits the dark arts of legacy enterprise practices into tool suites that can then be more or less integrated into automated provisioning. So there are ways of holding on to the policy debt of past approaches whilst still working with more modern needs.
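To make the spreadsheet point concrete, here is a minimal sketch of what an automated replacement for Fred’s lookup might do: pick the first unused host address in a subnet. The function name, the subnet and the list of allocated addresses are all made up for illustration; a real IPAM product would add reservations, DNS registration and an audit trail on top of this.

```python
import ipaddress

def next_free_address(subnet, allocated):
    """Return the first usable host address in `subnet` that is not in
    `allocated`, or None if the subnet is full.

    Hypothetical helper: `subnet` is a CIDR string, `allocated` is an
    iterable of address strings already handed out.
    """
    network = ipaddress.ip_network(subnet)
    taken = {ipaddress.ip_address(a) for a in allocated}
    # hosts() skips the network and broadcast addresses for us.
    for host in network.hosts():
        if host not in taken:
            return str(host)
    return None

# Example: a tiny /29 with two addresses already in use.
print(next_free_address("10.0.0.0/29", ["10.0.0.1", "10.0.0.2"]))
```

The point is not the code itself, but that the lookup is a function call rather than an email to Fred.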

Quotas and leases

It’s worth reflecting how we dealt with the spending approval in the early days of VDE.

Each department bought an allocation of capacity (usually in units of servers), and from that capacity they could hand out quotas to users. The main constraint was (and generally remains to this day[4]) RAM, so the quotas were RAM quotas. Therefore if I had a 4GB quota (an 8th of the 32GB we could pack into an HP DL380) I could choose between one 4GB VM, four 1GB VMs, or any other combination that added up to no more than 4GB.
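The quota check amounts to simple arithmetic. The sketch below is not how VDE implemented it (I’m only illustrating the rule), and the function names are invented for the example; the 32GB and one eighth figures are the ones from the paragraph above.

```python
HOST_RAM_GB = 32              # RAM in one host, per the DL380 example
QUOTA_GB = HOST_RAM_GB // 8   # one user's share of that host: 4GB

def remaining_quota(vm_sizes_gb, quota_gb=QUOTA_GB):
    """Unused RAM quota in GB, given the sizes of the VMs already held."""
    return quota_gb - sum(vm_sizes_gb)

def can_provision(existing_vm_sizes_gb, new_vm_gb, quota_gb=QUOTA_GB):
    """True if a new VM of `new_vm_gb` still fits within the quota."""
    return new_vm_gb <= remaining_quota(existing_vm_sizes_gb, quota_gb)

# One 4GB VM uses the whole quota; so do four 1GB VMs.
print(can_provision([], 4))          # a single 4GB VM fits
print(can_provision([1, 1, 1], 1))   # a fourth 1GB VM fits
print(can_provision([4], 1))         # nothing fits after a 4GB VM
```

Because the spending decision is already baked into the quota, a check like this can run inline with the request and needs no human approval.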

The leasing aspect was there to ensure that capacity was kept in active use. VMs would come with a 1 month lease, which had to be renewed by the requester to show that they still needed the VM. After 3 months that was it – the VM got deprovisioned. This ensured that we didn’t have stuff in a ‘development’ environment that was being treated as essential, and had the side benefit of ensuring that nothing remained unpatched for too long. It also forced people to automate their config management.
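The lease rule can be sketched in the same spirit. I’m assuming here that a month means 30 days, that each renewal adds one more month, and that the 3 month cut-off is measured from the day the VM was provisioned; the class and its names are invented for illustration rather than taken from VDE.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta

LEASE_LENGTH = timedelta(days=30)   # assumption: '1 month' = 30 days
MAX_LIFETIME = timedelta(days=90)   # assumption: '3 months' = 90 days

@dataclass
class Lease:
    start: date
    expires: date = field(init=False)

    def __post_init__(self):
        # A new VM gets one lease period up front.
        self.expires = self.start + LEASE_LENGTH

    @property
    def hard_limit(self):
        """Date after which the VM is deprovisioned regardless."""
        return self.start + MAX_LIFETIME

    def is_active(self, today):
        return today < self.expires

    def renew(self, today):
        """Extend the lease by one period, never past the hard limit."""
        if not self.is_active(today):
            raise ValueError("lease has lapsed; VM is due for deprovisioning")
        if self.expires >= self.hard_limit:
            raise ValueError("maximum lifetime reached; cannot renew")
        self.expires = min(self.expires + LEASE_LENGTH, self.hard_limit)
        return self.expires
```

Under these assumptions a requester can renew twice, after which the only way to keep working is to provision a fresh VM, which is what pushed people towards automating their config management.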

The quotas idea still lives on in some implementations, whilst leases never really caught on and got overtaken by the cost transparency of Infrastructure as a Service (IaaS) pricing models, whether that’s on demand, reserved instances, or spot pricing.

Conclusion

VM provisioning can be really fast. Not fraction of a second container fast[5], but fast enough where boot time is still a consideration.

VM provisioning can also be really slow, if the organisation wants it to be slow by throwing spend control and other barriers in the way of speed. That’s a choice (conscious or otherwise), and sometimes that choice needs to be properly surfaced and understood; particularly if one part of the organisation cares about speed whilst other parts have other cares.

Notes

[1] VDE became the Virtual Machine Provisioning System (VMPS), which became DynamicOps, which was acquired by VMware to become vRealize Automation (vRA). Other VM automation systems are available.
[2] See The #AWS EC2 Windows Secret Sauce for a detailed example.
[3] I once came across a bank (not one I worked at) that had built a 24h delay into its otherwise automated provisioning system so that the infrastructure services people could check that people had ‘asked for the right thing’. They felt that the delay provided a ‘cooling off’ period that prevented ill considered short term requests. To me this seemed like an act of open hostility by Ops towards their Dev colleagues.
[4] Workloads tend to get broken down into general purpose and compute intensive. The latter usually went onto dedicated High Performance Compute (HPC) environments, which came with their own provisioning and workload management (the grid). So VMs tended to be general purpose workloads that exhausted RAM before CPU.
[5] There are now systems that essentially merge container tech and VM tech that do break the sub second barrier.

 


