Silent PC

16Jul18

TL;DR

I’ve been very happy with the silence of my passively cooled NUC for the past 4 years, but it was starting to perform poorly. So when I came across a good looking recipe for a silent PC with higher performance I put one together for myself.

Background

I’ve been running my NUC in an Akasa silent case since shortly after I got it, and it’s been sweet, until it wasn’t. Silence is golden, but having a PC that’s constantly on the ragged edge of thermal limiting for the CPU and/or SSD[1] became pretty painful[2]. When I came across this Completely Silent Computer post a few weeks back I knew it was exactly what I wanted[3].

Parts

I pretty much followed Tim’s build, with a few exceptions:

  • I went for the black DB4 case
  • In line with his follow up ‘Does Pinnacle Ridge change anything?’ I went with the Ryzen 5 2600 CPU
  • The MSI GeForce GTX 1050 Ti Aero ITX OC 4GB was available, so I went with that
  • A Samsung 970 SSD (rather than the 960)

Unfortunately I wasn’t able to get everything I needed from one place, so I ended up placing three orders:

  1. QuietPC for all the Streacom stuff (DB4 case, CPU and GPU cooling kits and PSU)
  2. Overclockers for the motherboard, SSD and CPU
  3. Scan for the RAM and GPU

By some miracle everything showed up the following day (with the Scan and Overclockers boxes coming in the same DPD van). The whole lot came to £1551.34 inc delivery, which is a bit better than the AUD3000 total mentioned in the original post. I didn’t exhaustively shop on price, so it’s possible I could have squeezed things a little more.

Testing, testing

There aren’t that many components, and they only work as a whole, so I put it all together and (of course) it didn’t work first time. The machine would power on, but there was no output from the graphics card.

There was nothing to go on for diagnostics other than that the power button and LED were apparently working.

So I had to pretty much start over, with everything laid out on a bench, and I found that the CPU wasn’t seated properly. I guess having such a complex heat pipe system attached puts a fair bit of mechanical force onto things that can dislodge what seemed like a sound fit.

As I was checking things out I also noticed that the SSD was perilously close to a raised screw hole on the motherboard holder, which I chose to drill out – better safe than sorry.

Putting it back together I retested after each stage in the construction (each side of the case in terms of heat transfer arrangements), and everything went OK through to completion.

It’s fast

Geekbench is showing a single core score of 4329 and a multi core score of 20639.

That’s way ahead of my NUC which managed 2420 and 4815 respectively. It even beats my son’s i5-6500 based gaming rig that clocked 3208 and 8045.
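To put those scores in relative terms, here is a minimal sketch that divides the new build's Geekbench results by those of the other two machines. The scores are the ones quoted above; the `SCORES` layout and the `speedup` helper are illustrative names only.

```python
# Geekbench scores quoted above as (single core, multi core).
SCORES = {
    "new build (Ryzen 5 2600)": (4329, 20639),
    "NUC": (2420, 4815),
    "i5-6500 gaming rig": (3208, 8045),
}


def speedup(new: int, old: int) -> float:
    """Return how many times higher the `new` score is than the `old` one."""
    return new / old


new_single, new_multi = SCORES["new build (Ryzen 5 2600)"]
for name, (single, multi) in SCORES.items():
    if name.startswith("new build"):
        continue  # don't compare the new build with itself
    print(
        f"vs {name}: {speedup(new_single, single):.2f}x single core, "
        f"{speedup(new_multi, multi):.2f}x multi core"
    )
```

That works out at roughly 1.8x (single core) and 4.3x (multi core) against the NUC, and roughly 1.3x and 2.6x against the i5-6500 rig.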

Cool, but not super cool

As I type the system is pretty much idle, but I’m seeing a CPU temperature in the range of 57-67C, which is nothing like the figures Tim got when measuring Passively-cooled CPU Thermals. The GPU is telling me it’s at 48C. There are a few factors that come into play here:

  • It’s baking in the UK at the moment, so my ambient temperature is 28C rather than 20C.
  • One of the Streacom heat pads was either missing or got lost during my build, so I’m waiting on another to arrive. Thus the thermal efficiency of the CPU cooling isn’t presently all it could be.

I’d also note that I went with the LH6 CPU cooling kit despite having no plans to overclock as I’d like to keep everything as cool as possible.

The case temperature is around 40C, so hot to the touch, but not burning hot. In the winter I might appreciate the warmth it radiates, but right now I’d rather have it off my desk.

Cable management

The DB4 case design has everything emerging from the bottom, which might look amazing for photos when it’s not plugged into anything, but is far from ideal for actual usage. I’ve bundled the cables together and tied them off to the stand, but this is not a machine that makes it easy to pop things into. There are a couple of USB ports on one corner (which I’ve arranged as front right), but using them is a fiddle.

I’m pleased to have USB ports on my keyboard and a little hub sat on my monitor pedestal.

Conclusion

After using a silent PC for over 4 years there’s no way I’d go back to the whine of fan noise, so I was pleased to find an approach that kept things quiet whilst offering better performance. The subjective user experience is amazing (this is easily the fastest PC I’ve ever used), so my fingers are crossed that it stays that way.

Notes

[1] There’s not much talk about thermal throttling of SSDs, but it is a thing, and it can badly hurt user experience when your writes get queued up. I do worry that my new M.2 drive is sat baking at the bottom of the new rig, and if I find myself taking it apart again I might stick a thermal pad in place so that it can at least conduct directly onto the motherboard tray.
[2] I suspect that over the years the Thermal Interface Material (TIM) in the CPU degraded, leading to the whole rig running hotter, leading to a spiral of poor performance. When it was new it ran quick enough, and (relatively) cool, but it seems over time things got worse.
[3] I considered another NUC, and the Hades Canyon looks like it would have met my needs, but Akasa don’t yet do a silent case for it.


#2 of jobs that should exist but don’t in most IT departments (#1 was The Application Portfolio Manager).

What’s a constraint?

From Wikipedia:

The theory of constraints (TOC)[1] is an overall management philosophy introduced by Eliyahu M. Goldratt in his 1984 book titled The Goal

It’s the idea that in a manufacturing process there will be a constraint (or bottleneck) and that:

  • there’s no point in doing any optimisation work before the constraint, because that will just make work in progress stack up even quicker
  • there’s no point in doing any optimisation work after the constraint, because the work in progress is still stuck upstream

TOC drives us towards a singular purpose – identify the present constraint and fix it.

Of course this becomes a game of ‘Whac-A-Mole’: just as soon as one constraint is dealt with, another lies waiting to be discovered. But it’s an excellent way of ensuring that time, money and other resources are focused in the right place, and the starting point for continuous improvement that takes advantage of incremental gains.

The constraint unblocker

Is an individual who’s empowered to work across an organisation identifying its constraints and leading the efforts to fix them.

James Hamilton

One of my industry heroes is Amazon’s constraint unblocker – James Hamilton[2]. He has:

  • Reinvented data centre cooling (and many other aspects of data centre design)
  • Reinvented servers
  • Reinvented storage
  • Reinvented networks
  • Modified power switching equipment

Take a look at his AWS Innovation at Scale presentation for some depth, or the Wired article Why Amazon Hired a Car Mechanic to Run Its Cloud Empire.

The consequence of that list above shouldn’t be underestimated. Where Hamilton (and his like at Google, Facebook etc.) have led, the entire industry has followed.

The impact of that list also shouldn’t imply that there’s no point in doing this elsewhere. This approach isn’t just the preserve of hyperscale operators. All IT shops have their constraints, and so all IT shops should have a leader who’s focused on unblocking them.

TOC and DevOps

There’s a close relationship between TOC and DevOps. The Goal inspired The Phoenix Project and the ‘3 DevOps Ways’ of Flow, Feedback and Continuous Learning by Experimentation are all about dealing with constraints.

That isn’t however to say that organisations doing DevOps have everything covered. The 3 ways make sure that constraints are addressed in the context of a single continuous delivery pipeline for a single product, but as soon as there’s more than one product there’s most likely a global constraint that can’t be dealt with at a local level.

Amazon may be doing DevOps up and down the organisation, and they very effectively organise themselves into ‘2 pizza’ teams ‘working backwards’, building microservices to power their ever expanding service portfolio. But they still need James and his team working top down to get the big roadblocks out of their way as they spend $Billions scaling their infrastructure.

Data (science) required

Notionally this stuff was easy with manufacturing. Look down on the factory floor and you can see the workstation where the work in progress is stacking up. Pop down there and figure out how to fix it.

Of course the reality was much messier than that, which is why Goldratt quickly found himself revising The Goal, and a whole consulting industry sprang up around TOC. But with software we have to acknowledge from the outset that we’re not going to see work in progress physically piling up; and beyond DevOps it’s entirely possible that the constraint may have little to do with ‘work in progress’.

Thus in IT we need data to find our constraints, and we usually need that same data (or more) to inform the model-hypothesise-experiment process that determines what to do about a constraint. In my own work (that we now brand Bionix) that’s why we start with the data science team and their analytics.

Why bother?

My personal observations of TOC in action over the past few years have generally found a 20% improvement in efficiency/effectiveness on the first iteration. That’s not ‘moving the decimal point’ disruption, but it’s a realistic first approximation of what’s achievable in a six week cycle.

Of course because this is Whac-A-Mole you never get the same pay off again. The next iteration might be 15%, then 12%, then 9% and quickly off into the weeds. But stack those gains on top of each other and you’re quickly into completely different territory.
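As a rough illustration of how those gains stack, the sketch below compounds the example percentages from the paragraph above. It assumes each iteration's gain multiplies the result of the previous one; that compounding model and the `cumulative_improvement` helper are assumptions for illustration, not a measured result.

```python
from math import prod

# Illustrative per-iteration gains from the text: 20%, 15%, 12%, 9%.
GAINS = [0.20, 0.15, 0.12, 0.09]


def cumulative_improvement(gains: list[float]) -> float:
    """Compound successive fractional gains into one overall fraction.

    Assumes each gain multiplies the result of the previous iteration.
    """
    return prod(1.0 + g for g in gains) - 1.0


running = []
for i in range(1, len(GAINS) + 1):
    running.append(cumulative_improvement(GAINS[:i]))
    print(f"after iteration {i}: {running[-1]:.1%} better than the start")
```

On those numbers, four iterations compound to about 68% overall, against the 56% you’d get by simply adding the percentages together.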

Conclusion

As we can see from Amazon even the best organisations have constraints, and they can benefit from having a leader focused on identifying and fixing them. That way they can achieve continuous improvement and the fruits of incremental gains across the organisation, and not just in a product silo.

Notes

[1] I find it somewhat frustrating that ‘theory’ is used here as it makes the approach seem ‘academic’ and thus easily dismissed by those claiming that they only care about practical outcomes.
[2] James starred in my ScotCloud keynote last year, “Our problems are easy”.


#1 of jobs that should exist but don’t in most IT departments

What should we do about all the legacy stuff?

This was a question that came up at the closing panel of the Agile Enterprise Rome conference I was at in May. The context was ‘we’ve spent a couple of days hearing about this great stuff with microservices and containers and serverless, but what should we do about our legacy?’.

I’ve heard this question, or some variant of it, many times over my career.

My answer in Rome was something like this:

The very reason that legacy exists is that it satisfies a business need at a price point that’s better than migrating to something new.

There are some important implications to that statement:

  1. You’ve actually figured out what the migration costs are
  2. Those costs are regularly re-evaluated to take account of industry changes

Those things imply a portfolio management approach where each application has a value and a cost to trade out of a given position, and where the portfolio is reappraised on a regular basis. This isn’t something I see being done in a particularly structured way in (m)any organisations[1].

Step functions and gravity wells

A big part of the problem here turns out to be non linearities in the (license) cost for many legacy systems.

How much do you need to reduce your mainframe MIPS to cut your mainframe spend by 50%?

It turns out that the answer to that isn’t anything like 50%, or even 75% or 90%. In most cases it’s essentially impossible to cut mainframe spend by reducing usage unless the mainframe is completely eliminated. The same is roughly true for many classes of legacy software driving an ‘all or nothing’ approach.

This picture is further complicated by bundling within Enterprise License Agreements (ELAs), where account managers will hold firm on well established revenue (their cash cows) but happily throw in some of their shinier new stuff[2]. There’s also the issue of ‘where software goes to die’ vendors that acquire and hoard legacy assets, giving them multiple points of leverage when it comes to ELA (re)negotiation time – they’re good at playing the portfolio management game.

5 Rs

There are multiple options for what happens to an application when it’s moved off a legacy system. Gartner suggests the 5 Rs[3] in its ‘Five Ways to Migrate Applications to the Cloud’:

  1. Rehost
  2. Refactor
  3. Revise
  4. Rebuild
  5. Replace

Broadly this has approximately nothing to do with ‘the Cloud’. Each path implies a different cost/value trade off that needs to be assessed.

For most applications it will be simple to eliminate most of the Rs as viable potential courses of action, leaving one or two to be properly considered and priced.

Who’s your head of application portfolio management?

Becomes the pertinent question. If this isn’t somebody’s job, then it’s probably nobody’s job, and it won’t be getting done. If organisations aren’t active about this portfolio management approach then inertia will take charge of their direction.

Conclusion

Applications are an investment, and like any other investment they should be managed. A portfolio approach, and tools to evaluate trade offs and options naturally follows; and of course the process has to be iterative, because the world keeps changing. If organisations aren’t active about this, then their direction gets determined by inertia.

Notes

[1] I’ve seen IT Portfolio Management tools like Alfabet (now owned by Software AG) implemented in some organisations, but even then I’ve seen little evidence of the tools being used in a rigorous way (or having much impact on overall IT strategy).
[2] Aka the ‘drug dealer model’
[3] With thanks to Johan Minnaar for bringing my attention to the model and my colleague Jim Miller for highlighting its ubiquity.


TL;DR

If you can persuade people that their side is going to win without their vote, then perhaps just enough of them won’t bother to show up that you can steal the win.

Background

The two countries that I spend most of my time in (the UK and US) continue to recoil from the effects of narrowly won campaigns that didn’t turn out how the pundits predicted. Social media is credited (by which I mean blamed) for much of this. But the narrative that I’m seeing seems incomplete, and hence doesn’t ring true – no wonder there’s so much cognitive dissonance around this issue.

Activating voters

The role of social media in bringing people into a campaign first came to light during Obama’s run in 2008. Widespread use of social media itself was pretty new then, but the ability for politicians to connect with voters without intermediaries was and remains hugely powerful. I have no doubts that Trump connected better with his base as a consequence of his positive use of social media, and I also think Leave were more savvy than Remain in the Brexit referendum[1].

I use the term ‘positive’ here without any value judgement of a particular side or campaign, but rather for the ability of a politician to connect with their voters in a direct and authentic way that activates them to vote in their favour.

Depressing voters

Michael Moore used the term ‘depressed voter’ in his 5 reasons Trump is going to win:

… it will be what’s called a “depressed vote” – meaning the voter doesn’t bring five people to vote with her. He doesn’t volunteer 10 hours in the month leading up to the election. She never talks in an excited voice when asked why she’s voting for…

This becomes the negative side of influencing the electorate:

  • You’re going to win anyway – so treat yourself to that lie in
  • They’re all as bad as each other – what’s the point in voting

It doesn’t need to appeal to anything besides apathy and indifference, and it’s negative because it stops a voter from voting. Whatever their intention might have been, it doesn’t show up at the ballot box.

Conclusion

As we continue to pick over the outcome of these votes there’s a ton of analysis about who voted which way, and why, and how they might have been influenced by social media campaigns. And then things start getting murky over how those campaigns were orchestrated and financed.

But things get even murkier if we look at who didn’t vote, and why, and how they might have been influenced by social media campaigns. And how those campaigns were orchestrated and financed.

But wait… there’s more

The role of polls and pollsters, and the interplay with social media is only just starting to be examined. The simple lesson here seems to be that the only poll that matters is the actual vote, and anything else might well be part of a disinformation campaign or an elaborate con.

Update 5 Jul 2018 – A couple of days after I posted this Cory Doctorow published Zuck’s Empire of Oily Rags on the same topic. He doesn’t focus on the negative aspects I note above, but the general narrative is (in my opinion) spot on. The line that I expect will be quoted most is:

Cambridge Analytica didn’t convince decent people to become racists; they convinced racists to become voters.

What may also happen here is that they convinced decent people to be apathetic about voting.

Note

[1] This observation extends to just about everything to do with modernity. Remain ran a campaign that wouldn’t have been out of place in the 19th century, and were completely outplayed by Dominic Cummings and his understanding of stochastic processes (branching histories) and OODA loops.


I first used this analogy at an Open Cloud Forum event in Zurich a couple of months back, and I just used it again in a panel discussion at DevSecOps Days London. I’ve been meaning to incorporate it into a DevOps presentation, but until then…

Jenga

The ‘traditional’ Enterprise IT approach to stability is a game of Jenga – don’t touch anything in case the tower falls over. Each change feels like it brings us closer to calamity; and eventually it does all fall down and you have to pick up the pieces, put them back in place, and start over.

Riding a Bike

The agile/DevOps approach to stability is to keep moving forward, like riding a bike – if you have enough velocity, you’re stable.


Laser Printers

16Jun18

My family prints a lot[1] – about 1200 pages/year, which is why I made the decision almost a decade ago to switch from inkjet to laser. Inkjets weren’t just costing me a fortune in ink; they were also costing me a fortune in printers because they kept clogging up and failing in various ways. I worked my way through a variety of Epsons and Canons before giving up on the genre[2].

Black and White

My first buy back at the end of 2008 was an HP LaserJet 2420DN (the Duplex, Networked version) that was made around 2006 and that I picked up on eBay for £75. It was barely run in with a page count of just under 150,000, which is just 2 months’ usage at its advertised duty cycle. The toner that came with it had a little life left, but I lucked into a brand new HP toner on eBay for £6.26 that I’ve been using ever since – 7683 pages printed so far, with a forecast of over 2000 still to come. Over the years it’s needed some new rollers (£10.40) and a new fuser sleeve (£3.54), but it’s otherwise been a trouble free workhorse.

The ratio of simplex:duplex has worked out at around 2:3, leading to an average page cost (inc paper and the printer itself amortised over usage so far) of 1.66p/page.
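A minimal sketch of how a per-page figure like that can be put together from the numbers above. Two inputs are assumptions rather than figures from the post: paper at 0.6p per sheet (about £3 for a 500 sheet ream), and the 2:3 simplex:duplex ratio being counted in printed sides, with a duplex side using half a sheet.

```python
# Figures from the post, in pounds.
PRINTER = 75.00
TONER = 6.26
ROLLERS = 10.40
FUSER_SLEEVE = 3.54
PAGES_PRINTED = 7683

# Assumptions (not from the post): paper at 0.6p per sheet, i.e. about
# GBP 3 for a 500 sheet ream, and the 2:3 ratio applied to printed sides.
PAPER_PENCE_PER_SHEET = 0.6
SIMPLEX_PARTS, DUPLEX_PARTS = 2, 3


def sheets_per_page(simplex: int, duplex: int) -> float:
    """Average sheets used per printed side; a duplex side uses half a sheet."""
    return (simplex * 1.0 + duplex * 0.5) / (simplex + duplex)


def pence_per_page() -> float:
    """Hardware, toner and paper cost per printed side, in pence."""
    fixed_pence = (PRINTER + TONER + ROLLERS + FUSER_SLEEVE) * 100
    paper_pence = PAPER_PENCE_PER_SHEET * sheets_per_page(SIMPLEX_PARTS, DUPLEX_PARTS)
    return fixed_pence / PAGES_PRINTED + paper_pence


print(f"{pence_per_page():.2f}p/page")
```

Under those assumptions the result lands close to the 1.66p/page quoted above, but the paper price is a guess, so treat the match as indicative only.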

Colour

For a while I hung on to an inkjet just for colour printing, but inkjets hate infrequent use, and so reliability and print quality worsened. When a deal came along in 2010 for Dell’s 1320CN with extra toners for £133.90 I grabbed it.

Colour printing is a less frugal endeavour altogether, but at least the 1320CN is a popular model with a plentiful supply of cheap(er) generic toners. Sadly it only does one sided printing, which has come out at 5.5p/page over the 4000 or so pages printed so far.

If I was starting over

I’d probably go for a Color LaserJet with Duplex and Network so that I could get everything from one unit rather than running two printers. Something like the 3600dn[3] seems to fit the bill as it uses decent capacity toners.

Update 18 Jun 2018

I spent a bit more time modelling costs over the weekend. As things stand the cost per page breaks down to:

  • B&W – 75% Hardware, 4% Toner, 21% Paper – I’m obviously benefiting from ridiculously cheap toner here, but the ‘right’ printer is one with cheap consumables, and there seems to be no better way of getting that than using older laser printers that are (or have been) popular. A quick look at eBay shows that I could easily get another 6000 page toner for about £10.
  • Colour – 61% Hardware, 28% Toner, 11% Paper – once again cheap toner makes a huge difference. When I first bought a replacement toner multi-pack (CMYK) in 2012 it was £18.94, but I’ve since got them as cheaply as £9.99.

If I project usage out a bit further (2x,3x,4x) I quickly get below 1p/page for B&W and 4p/page for Colour as the hardware is amortised and the costs become dominated by toner (more so for colour) and paper (more so for B&W).

I also analysed toner usage… I seem to be getting about 10,000 pages from an HP toner rated at 6,000, which is great (though not uncommon from what I’ve seen in forums). On the other hand I’m getting more like 666 pages for Dell toners rated at 2,000, which is pretty miserable (but probably a reflection of the fact that the colour printer gets used a fair bit for photos, which obviously use tons more toner than a normal page of text with a few words in colour).
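The yield comparison is simple arithmetic, sketched below with the page counts quoted above. The helper names are illustrative, and the final line assumes that a £10 replacement HP toner would last the same 10,000 pages, which the post doesn’t confirm.

```python
def yield_ratio(actual_pages: int, rated_pages: int) -> float:
    """Actual page yield as a fraction of the rated yield."""
    return actual_pages / rated_pages


def toner_pence_per_page(toner_price_pounds: float, actual_pages: int) -> float:
    """Toner-only cost per page, in pence."""
    return toner_price_pounds * 100 / actual_pages


hp = yield_ratio(10_000, 6_000)   # HP mono toner
dell = yield_ratio(666, 2_000)    # Dell colour toner
print(f"HP: {hp:.0%} of rated yield, Dell: {dell:.0%} of rated yield")

# Assumption: a GBP 10 replacement HP toner lasts the same 10,000 pages.
print(f"HP toner: {toner_pence_per_page(10.00, 10_000):.2f}p/page")
```

So the HP toner is running at about two thirds over its rating, while the Dell toners are managing only about a third of theirs.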

Update 9 Jul 2018

Chris Neale pointed me to a Twitter thread from Paul Balze about inkjets.

Notes

[1] Hardly surprising given that my wife is a school teacher and both of my kids are still at school; though I think the bulk of the printing in the household comes from my wife.
[2] I’ve never owned an HP DeskJet myself due to the cost of consumables. Something I’d note from family members running these things is that they last pretty well, but ultimately fall victim to drivers not being available for newer versions of Windows, which has never been an issue for workhorse HP LaserJets.
[3] Or the newer 3800dn or CP3505.


I’m starting to see companies abandon Pivotal Cloud Foundry (PCF) in favour of Kubernetes distributions such as Red Hat’s OpenShift; and it’s almost certainly just a matter of time before we see traffic in the opposite direction.

My suspicion is that this is nothing to do with the technology itself[1], but rather that early implementations have failed to turn out as hoped, and people are blaming the platform rather than their inability to change the culture[2]. So they wheel in an alternative platform (and some fresh faces) and have another go.

We’ve seen this movie before with mobile development[3]. The native developers switched to cross platform frameworks just as the cross platform framework folk switched to native. It wasn’t that one approach was better or worse; as ever with these things there are trade offs that need to be balanced. It was just that v1 sucked, because the organisation that had built v1 hadn’t completed its cultural transformation; so the people making v2 wanted to change things up a bit.

Notes

[1] I could (and may) write an entirely separate post on the pros and cons of PCF and K8s, but the most important point is that they’re both platforms inspired by Google’s Borg that people can run outside of Google (or even on Google Cloud). Meanwhile this post ‘Comparing Kubernetes to Pivotal Cloud Foundry—A Developer’s Perspective‘ by Oded Shopen covers most of the key points.
[2] I’ll use ‘the way we do things around here’ as my definition for culture
[3] and NoSQL


The Spectre and Meltdown bugs have been billed as a ‘failure of imagination’, where the hardware designers simply didn’t conceive of the possibility that a performance optimisation might lead to a security vulnerability.

I personally find this a little hard to swallow. The very first time I came across side-channel attacks the first thing I thought of was CPU caches. I just naively assumed that the folk at Intel etc. were smart enough to have figured out the potential problems and already designed in the countermeasures.

Regardless of whether Spectre and Meltdown genuinely were caused by failure of imagination (and I have my doubts about ARM here given that the CSDB instruction was already in the silicon of their licensees) it’s a class of problem we collectively need to think harder about. There seem to be a few valid approaches here:

  1. Adopting a more adversarial mindset – think about how an attacker might try to exploit a new feature or performance optimisation – the ‘red team’ approach.
  2. ‘Chicken bits’[1] to allow features/optimisations to be disabled if they’re discovered to be vulnerable.
  3. Use of artificial intelligence (AI) to imagine harder/differently. When Google’s Deepmind team created AlphaGo it played Go like a human but a bit better; when they created AlphaGo Zero it came up with entirely different plays. I’d therefore expect that similar approaches could be applied to security validation.

Note

[1] Hat tip to Moritz Lipp for this term from the Q&A section of his QCon London presentation ‘How Performance Optimisations Shatter Security Boundaries’


500

01Jun18

I was just about to put virtual pen to paper when I noticed that it would be my 500th post.

Not all of those posts have made it to being published; a few linger in my drafts.

It’s been a little over a decade since my Hello world!, and I’m still tremendously grateful to JP for giving me the push to start blogging, and you all for coming along and reading.


I spent the last couple of days at the Agile Enterprise conference in Rome organised by New York Java Special Interest Group (NYJavaSIG) founder Frank Greco. It was a much more intimate event than I’m generally used to, with only thirty-something attendees.

The best part was the ‘ask me anything’ panel of all the speakers at the end, and the best question boiled down to:

I’ve spent the last couple of days learning about microservices, serverless, Kubernetes, service meshes and blockchain – how do I possibly pull all this stuff together into a coherent solution?

My answer was to repeat the advice from Accelerate and suggest that the issue at hand was one of strategy, best approached by first achieving situational awareness using Wardley Mapping. Looking back I should have also referenced Amazon’s practice of ‘working backwards’, where they start with a press release and FAQ as a way to crystallise what it is they’re planning to do that customers would actually care about.

Frank closed the panel by asking for thoughts on what might be most important over the next 2-3 years, which meant that the last words were gifted to me. Obviously everybody was there to learn about technology, and how new trends are emerging, but my suggestion was to follow the ‘3rd DevOps way’ of ‘continuous learning by experimentation’ by going and trying stuff out. The Istio Katacoda covers a lot of ground for a 10 minute beginner level course – so that was my suggested place to start.

The slides from my presentation ‘Ops and Security in a PaaS and Serverless world’ are on SlideShare (though they probably make little sense without the narrative). It’s the first time that I’ve done such a long presentation (70m), which had the advantage of allowing time for storytelling, though I fear that it asks a lot of an audience to pay attention for so long (especially when most are listening to a live translation).

This was my second trip to Rome, and for the second time I barely saw the place (not helped by some pretty atrocious weather that didn’t really encourage exploration). It’s a city I’d like to see more of, and I was also impressed by the airport (FCO), which rivals Zurich for being clean and efficient.