Like Platform/ROCKS (see link), Scali/Manage is a software suite for managing clusters. It appears to be very versatile. There is a lot you can do with it, but what caught my interest in a brief perusal was:

  • heterogeneous clusters (as in, manage the other clusters on campus …)
  • “golden” image capture and deployment (you can also “roll-back” to previous versions!)
  • deploys RPM installations to nodes simultaneously (so you can update entire disk images with the “images”, or update incrementally with RPM packages)
  • parallel ssh & file copy support
  • Change Management … this is a biggie. For example, if you add a node, all the other nodes need updating. With change management this becomes automatic: it detects what needs updating on the other nodes
  • Fault Handling and Root Cause Analysis … also a biggie: know when something is about to break before it actually does
  • Scali/Manage also handles other servers, server farms, grids and blade racks (so, for example, rintintin's image could have been captured and deployed elsewhere, or rolled back if an upgrade was unsuccessful)
  • Java/Eclipse-based GUI and web-based client
  • it also supports PBS Pro, MPI libraries, and MPI/HA … that is, high availability for MPI (the reasoning goes: if jobs run for 30 days, a single node fails, and MPI is not HA, then the entire job is aborted. HA provides a pathway for attempting to finish the job while the hardware underneath gets replaced).
  • And lots more.
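
To give a feel for what “parallel ssh” in the list above means in practice, here is a minimal Python sketch that runs one command on several nodes at once. This is not how Scali/Manage does it, only an illustration of the idea. The names run_on_nodes, ssh_command and local_echo are made up for this example, and it assumes key-based ssh logins are already set up on the nodes.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def ssh_command(host, remote_cmd):
    # Hypothetical helper: build the ssh command line for one node.
    # BatchMode=yes makes ssh fail instead of waiting at a password prompt.
    return ["ssh", "-o", "BatchMode=yes", host, remote_cmd]


def local_echo(host, remote_cmd):
    # Hypothetical dry-run builder: prints the host name locally,
    # so the fan-out can be tried without touching any real node.
    return [sys.executable, "-c", "import sys; print(sys.argv[1])", host]


def run_on_nodes(hosts, remote_cmd, build=ssh_command, max_workers=8, timeout=30):
    """Run remote_cmd on every host in parallel.

    Returns a list of (host, return_code, output) tuples in the same
    order as hosts. A return code of -1 means the command could not be
    started or timed out.
    """
    def run_one(host):
        try:
            proc = subprocess.run(
                build(host, remote_cmd),
                capture_output=True,
                text=True,
                timeout=timeout,
            )
            return host, proc.returncode, proc.stdout.strip()
        except (subprocess.TimeoutExpired, OSError) as exc:
            return host, -1, str(exc)

    # Threads are enough here: each worker just waits on a child process.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, hosts))
```

Calling run_on_nodes(["node1", "node2"], "uptime") would try ssh against both nodes at the same time; passing build=local_echo instead gives a dry run on the local machine. The node names here are placeholders.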


cluster/15.txt · Last modified: 2006/12/20 14:19 (external edit)