Is the server layer just a commodity?
Or can it ever be?
Workshop It’s been a while since Nicholas Carr wrote his polemic ‘Does IT matter?’, which documented how IT was commoditising, turning into a utility with little to differentiate itself – a theme which he continued in the book The Big Switch.
He was clearly demonstrating an economist’s grasp of technology – falling into the trap of assuming that it should all just work, and if it doesn’t yet, it will only be a matter of time before it does. Such a stance is admirable but it misses two central points: that the IT operating in most organisations doesn’t, and may never, ‘just work’ – and that the offerings of service providers such as Salesforce.com and Google are hardly sufficient to satisfy the demands of even the smallest organisations.
Having said that, Carr still had a point – that technology is not an end in itself, or to put it another way, “It’s what you do with it that counts.” Server rooms across the globe really should ‘just work’, because if they don’t, the applications they support won’t work either. This gives us another perspective on our server capabilities. Not only do we need to think about the servers themselves, but also, what’s driving the requirements on the applications? And perhaps more important still, how do we ensure that the needs of both are correctly aligned?
This raises a number of organisational and political challenges. The people who look after servers are different to the people who look after databases, applications, packages, collaboration tools and the like. We know from previous workshops that the many groups involved do not always get along.
But also, the forces on each are very different. From the server administrator’s perspective, the diversity of applications is a challenge to anybody trying to keep everything running: unexpected upgrades and new deployments foisted upon administration staff with little warning; conflicts in terms of shared libraries and system configurations; poorly defined architectures with little thought to what might go wrong; or insufficient budget allocated to backup, failover and the like. The list goes on.
From the applications perspective, the server environment is by its nature restrictive. In the ideal world, all applications would exist in their own logical silos, protected from the limitations and vagaries of other platforms. Perhaps virtualization will be the answer to this, at some point in the future when it earns its mainstream stripes.
But right now the two opposing forces remain. The hardware and platform layer tends towards Carr-like commoditisation, as the economic forces prefer locked down to opened up, while the application and service delivery layer needs to remain open to whatever new offerings developers and application software vendors throw at it.
Wheels on the bus fall off and off
Get this balance right and, while things may operate sub-optimally, they will nonetheless operate. Get it wrong and service levels can very quickly start to suffer. Many will be familiar with environments where the wheels have well and truly fallen off the pram, where users are dissatisfied, where blame is being parcelled out like the end of rationing, where the underlying causes are so ingrained into every aspect of the technical and political environment that it takes very bold management to resolve the issues. Thankfully, we know from research that such situations are the exception rather than the norm.
The general consensus (feel free to disagree) is that the ideal server layer is one which can adapt to the needs of its applications, as quickly and efficiently as possible. Various terms have been used to describe such a capability – through the years, we have heard about on-demand IT, adaptive infrastructure, dynamic IT and so on.
There’s absolutely nothing wrong with this principle, but for many organisations it remains just that – a principle. Few have the luxury of ripping everything up and starting again, and even those of you who are conducting large-scale consolidation exercises know that it’s only a matter of time before all that new-and-improved hardware will once again appear old and obsolete.
Meanwhile, we keep plugging away, making do, keeping the lights on and doing the best job we can. We’d be interested to hear about your own experiences of course – perhaps you have actually found the magic bullet, or maybe you’ve consolidated your IT environment in the past and are now watching the paint start to peel.
If you’ve found what you believe is a good middle ground, what’s the secret of your success – is it down to good communications with the applications teams, having the right responsibilities in place? Or is it just keeping ahead of the game and being very good at what you do, ‘looking after number one’ as it were, keeping focused and paying attention to what’s coming?
If you have any horror stories of train wreck IT, please feel free to get them off your chests. But most of all, please do let us know of your experiences about the part the server layer can play, and indeed how efforts should be prioritised, such that IT in general really can ‘just work’. In the future, perhaps IT will stop mattering only once we care about it enough to make it so.
COMMENTS
Requirements
Yes, BIOS updates will fail, servers can catch fire, backups could be destroyed and managers will make stupid decisions. Have an organizational process that deals with such situations!
But first, "the server layer" is quite vague IMHO. Does it mean hardware? Or hardware + OS? Or hardware + OS + apps (i.e. a webserver)? It's hard to draw a line these days where commodity stops and the specific application begins.
The trouble, however, is not the hardware or the OS or even webservers and software frameworks. Problems start much earlier, namely when a business specifies the required functionality of an IT system/solution.
Contrary to popular belief, in some cases requirements can't be captured by a static list of demands. They are continuously changing. Customer requirements captured at any given time are therefore seldom correct and have an expiration date. This is at the heart of the problem and why IT never "just works": the measurement of IT's success is a moving target.
Software products are never complete, since requirements and expectations keep changing. Unless your products and/or customers' needs (and therefore IT requirements) never change, the solution seems to be (at least to me at the moment) rapid, iterative development of functionality. The company/department that has an organizational process that can deal with these changes has a better chance of satisfying requirements.
I can imagine that decoupling the functionality (i.e. software) from the hardware is a relief. So maybe virtualization is not that bad a marketing fad, if (and only if) it makes sure software can change without having to deal with hardware restrictions and hardware can change without a need to change the software. The two opposing forces just might establish an equilibrium.
In the meantime, I think it's wise to try and apply the same decoupling principle. Have play yards where application developers can play on their own development servers and, when it's time for a release, plan a meeting with the two opposing forces and design the production environment together. But it doesn't hurt to keep in mind what the capabilities of your "server layer" (and the maintenance thereof) are when playing in the play yard ;)
The fallacy of RAID to PHBs
RAID is dangerous when PHBs think their data is safe because it is in a RAID array.
Yep, I have come upon a case where the CFO of one (non-IT) company believed that they had made their data safe because it was in a RAID-0 array.
Redundant Array of Cheap Servers
While the concept of a redundant array of cheap servers (RACS) might be fine for the bean counters, supporting them would be a complete nightmare. Having purchased cheap 1U servers, I found that provided you did not upgrade or modify them, they did actually work. However, when I needed to upgrade the BIOS in one, the server was left unusable, so the BIOS had to be downgraded to the original and the applications moved to a new server. Replacing every server in a cluster could be expensive, unless you plan to buy new servers every three years.
