Sun makes midframe servers more resilient
Sun Microsystems Inc has taken some of the high-end resiliency features that it created for its "Starfire" Enterprise 10000 servers, and then improved upon in its "StarCat" Sun Fire 15000 and "StarKitty" Sun Fire 12000 servers, and rolled them into its Sun Fire midrange line, which it often refers to as its midframe line because the machines offer so-called mainframe features at a midrange scale, Timothy Prickett Morgan writes.
With the original midframe announcements back in March 2001, the dynamic domain partitioning technologies that originated in the Starfire machines were tweaked and rolled into the Sun Fire 3800 (eight-way), Sun Fire 4800 (12-way) and Sun Fire 6800 (24-way) servers. With the recent midframe announcements, Sun is adding two features, enhanced autorecovery and proactive self diagnostics, to these machines in an effort to differentiate these midframe servers from the value-class V series machines such as the V480 four-way and V880 eight-way machines, and the soon-to-be-announced V1280 12-way machine. Sun also hopes that these features will help the Sun Fire midframe line command a higher premium and sell better against competitive commercial Unix offerings from IBM Corp and Hewlett Packard Co.
The enhanced autorecovery technology, which is available as a firmware upgrade for the service processors at the heart of the Sun Fire 3800, 4800, and 6800 servers, debuted in the Starfire servers. The service processors are in charge of monitoring the performance of the Sun Fire servers' components and providing the management interface to features such as dynamic domains. In short, this new feature allows a machine with redundant service processors to fail over automatically from one service processor to the other.
While the midframe machines ship with a single service processor by default, customers who want to increase the availability of their machines (in the event that a service processor fails) can buy a second service processor. Until now, however, switching from one service processor to the other was largely a manual procedure. Enhanced autorecovery, which is a default feature in the Sun Fire 12000 and 15000 servers as well as the older Starfires, allows the failover of the service processor to happen automatically.
This feature is shipping in all new Sun Fire 3800, 4800, and 6800 servers as of last week, and is available for download for customers with existing machines. To use it, customers obviously have to buy a second service processor, which Sun says costs $12,000. Chris Kruell, group manager of Sun's Computer Systems Group, said that a lot of Sun Fire customers have these redundant service processors because they are using the machines to run mission-critical applications that cannot be offline.
The Starfire machines, said Kruell, had a rudimentary version of proactive self diagnosis, and Sun has improved on it with the Sun Fire 12000s and 15000s. This capability is being rolled down into the Sun Fire 3800, 4800, and 6800 machines now, too. This feature, which is also enabled by a firmware upgrade for the service processor in these machines, captures the history of key system components, logging performance, temperature, soft memory errors, and so forth. That history can be used to diagnose crashes when they happen and perhaps to spot a component that is about to fail before it does.
The data gathered by the proactive self diagnosis firmware can be used by Sun technicians to identify components that might fail and replace them. Proactive self diagnosis does not, by the way, mean predictive failure analysis, which is a more sophisticated technology that would allow a server to take action to prevent a crash. Nonetheless, the dynamic memory, I/O, and CPU capabilities of the Sun Fire line, coupled with this analysis, mean that companies or technicians working for them can get a machine up and running in a short time frame.
Both of these new features are supported on machines running either Solaris 8 or Solaris 9, because they function below the level of the operating system.