Don't get mad, get even
My crappy component inferno: our first reader audio blog
World of Reg If you're mad as hell and you're not going to take this any more, if you're blissfully happy and can't wait to tell the world, or if you're just tired of listening to product marketing managers who don't know what it's like to get your hands dirty, now's your chance. Don't rant in the pub: share it with millions of Reg readers instead.
We're turning over a small bit of our site to you. We can help you to make your own audio blog. This is Your World of Reg.
In episode one, our sysadmin blogger Trevor Pott tells us how he's sick of his servers breaking down. He's blaming manufacturers who pump up profits by cheaping out on components, expecting us to pick up the pieces.
Want to tell your story? We're listening. Send your idea to WorldOfReg@theregister.co.uk. All you need is a point and 12 pictures. If it's good, we'll give you the chance to share Your World of Reg. ®
Last time I looked, Canada was part of America, North America that is, and quite a big part of it ;-)
It's all about the price
Once upon a time, back in the 1970s, you used to be able to purchase components that were certified to MIL-SPEC. Identical in every other respect to their "commercial-grade" siblings, the difference in these components was that they would operate in environments that were 25-100% out of tolerance to their base specifications. You paid more for these components, but you could sleep better at night knowing that these pieces weren't going tits-up on you.
30 years later, and we've gone from having MIL-SPEC as a choice to the Military requiring COTS components in all BOMs. Instead of maintaining lines of parts (chips, wires, fans, etc.) actually designed for long service, we've gone the other way. Parts that are meant for use in kids' toys, where destruction is sure to occur before a component failure, are now used in nearly all commercial (and MIL-SPEC) systems. Why? Because they are cheap: you can buy two or three for the price of just one long-life part, and hot-swap 'em when they inevitably fail.
Yes, there are costly alternatives out there: high-availability systems that are 3-5 times as expensive as what is "required" for the job. And that's usually what you're paying for: over-engineering to ensure that the system doesn't fail when a component falls over.
I'm not saying this is "right". What I am saying is that, while there is a market for higher-quality systems and components in the SMB market, that same market has "voted" with its wallet and chosen the inconvenience of intermittent outages due to nickel fans failing over paying the 20-25% premium to not have to worry about it.
You see the same thing with support contracts: companies have a mission-critical system that has to be available 24/7 (because the company's revenue is being collected by this system), yet only an 8-5, M-F service contract to support it. Same story: when it DOES fail, you're lucky if it isn't in the middle of the night (when your Asian or European operations are in full swing), or worse yet on a Friday evening, when the entire system will be down for nearly 3 days. Invariably someone gets fired, the system is fixed, and the "risk managers" say "Well, it happened once, so this will never happen again" and they go on the same as before.
Of course, if the actual outage cost is less than the differential in the support contract, well, then you're better off taking the hit. And THAT IS the real bottom line here: someone in the hierarchy of the company has made a decision, based on data that they have been presented, that a loss situation is acceptable in the larger scheme of things. Maybe they're correct: maybe it is acceptable to lose the use of a system, or a terminal, or a store for a period of time, so long as everyone understands the costs and is willing to sign off on them.
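To put rough numbers on that trade-off, here is a back-of-the-envelope sketch. The function name and every figure in it are hypothetical, made up purely for illustration, and it assumes the better contract simply avoids a known number of outage hours per year, which real life rarely guarantees.

```python
def better_to_take_the_hit(outage_cost_per_hour,
                           outage_hours_avoided_per_year,
                           contract_differential_per_year):
    """Return True if skipping the pricier contract is the cheaper choice.

    All inputs are assumptions the reader must supply:
    - outage_cost_per_hour: revenue lost for each hour the system is down
    - outage_hours_avoided_per_year: downtime the better contract would prevent
    - contract_differential_per_year: extra yearly cost of the better contract
    """
    expected_loss_avoided = outage_cost_per_hour * outage_hours_avoided_per_year
    return expected_loss_avoided < contract_differential_per_year


# Hypothetical small shop: downtime is cheap, so the premium isn't worth it.
print(better_to_take_the_hit(500, 10, 20_000))

# Hypothetical revenue system down over a weekend: the premium pays for itself.
print(better_to_take_the_hit(5_000, 60, 20_000))
```

With these invented figures the first case avoids only 5,000 in losses against a 20,000 premium, so taking the hit wins; the second avoids 300,000, so it doesn't. The point is only that someone should actually run the sum before signing.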
So that's my two pence worth. I don't like it, but that's the real world.
This is what happens when you don't have a sysadmin on site. I think I had been to that site a year ago, and it was iffy... but not that bad. I spent something like six hours rewiring that mess. Unfortunately, the room provided for me to put two racks in is barely larger than the two racks themselves. When I try to walk one of the sales staff through adding or removing a server (usually due to ten-cent fan issues) they are far more concerned with getting the server back on the rack and online than with cleanliness. Heck, even some sysadmins...