Bright Computing revs up cluster manager
Fast provisioning, new Linuxes, CUDA, vSMP, and Python scripts
Bright Computing, which started from scratch several years ago to create a new, integrated cluster management tool, continues to build out the capabilities of the Bright Cluster Manager with the 5.2 release of the software, announced today at the International Supercomputing Conference 2011 in Hamburg, Germany.
Matthijs van Leeuwen, CEO at Bright Computing, says that the company continues to have an advantage over rivals in the open source and proprietary cluster management area. Bright is not taking system management and monitoring tools like Nagios and Ganglia and trying to stretch them to cover clusters instead of single systems. Rather, it created a brand new cluster manager, which launched in 2009 after several years of development, and which can hook into other workload managers and grid schedulers to babysit them and make them behave.
Interestingly, Bright has no interest in creating such tools itself. "There is such a wide choice of workload managers already," van Leeuwen tells El Reg. "Writing a workload manager is a lot of work. We want to focus on cluster management and then expand into other areas, such as clouds."
The company's strategy is to borg as many of the workload managers, schedulers, and other bits of software as HPC shops want to use on their clusters, and to be the most inclusive and most integrated cluster manager out there. Bright Cluster Manager 5.2 wraps around more tools than its predecessors.
The release now supports Nvidia's CUDA 4.0 parallel programming environment. It can also extract metrics from Nvidia's Tesla GPUs and show what the GPUs on the cluster are doing through the cluster manager's GUI console.
BCM 5.2 also supports SLURM, which is short for Simple Linux Utility for Resource Management, an increasingly popular open source workload manager for Linux clusters that was created by the US Department of Energy supercomputing labs for their own use.
"Now that Grid Engine's future is uncertain, SLURM is becoming a default for people who want a free workload manager," says van Leeuwen, referring to the workload manager that Sun acquired several years ago before itself getting eaten by Oracle.
In addition to supporting SLURM in terms of integration with BCM 5.2, Bright will also provide tech support for SLURM if customers want that. (You can't exactly ring up Lawrence Livermore National Lab and ask for a bug fix.) The updated cluster manager from Bright also adds support for sometime rival Platform Computing's Load Sharing Facility (LSF) workload manager. BCM already supported PBS Professional, Torque/Moab, Torque/Maui, and Grid Engine as workload managers.
The 5.2 release from Bright also now has Python interfaces into its SOAP API set, allowing for Python script kiddies to get at all the features in the cluster manager programmatically. BCM is itself written in C++ and its initial SOAP API stack provided C++ interfaces; over time, the company has added support for Perl and PHP interfaces in the SOAP stack.
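To give a rough sense of what scripting against a SOAP interface involves, here is a minimal sketch that builds a SOAP 1.1 request envelope using only the Python standard library. It does not use Bright's actual Python bindings, which are not described here. The operation name getNodeStatus, the hostname parameter, and the urn:example:cluster-manager namespace are hypothetical placeholders; only the SOAP envelope namespace is standard.

```python
import xml.etree.ElementTree as ET

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
# Hypothetical service namespace, a placeholder and not Bright's real one.
HYPOTHETICAL_NS = "urn:example:cluster-manager"


def build_soap_request(operation, params):
    """Return a SOAP 1.1 envelope (as a string) calling `operation` with `params`."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element sits inside the Body; each parameter is a child of it.
    call = ET.SubElement(body, f"{{{HYPOTHETICAL_NS}}}{operation}")
    for name, value in params.items():
        child = ET.SubElement(call, f"{{{HYPOTHETICAL_NS}}}{name}")
        child.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical call: ask for the status of one node.
request_xml = build_soap_request("getNodeStatus", {"hostname": "node001"})
print(request_xml)
```

A real client would POST this envelope over HTTP to the cluster manager's endpoint and parse the XML response. Language bindings of the kind Bright describes exist to hide exactly that plumbing behind ordinary function calls.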
The new Bright Cluster Manager Web portal is written in PHP, and the PHP APIs allow BCM users to easily customize that portal. This portal, which is intended for users who want to see how their jobs are doing as they run on the cluster, keeps users from mucking about with the more powerful GUI that cluster administrators use to manage the clusters. Up until now, admins and users had access to all the same features, and that was, er, not too bright.