cluster:11

 +\\
 +**[[cluster:​0|Home]]**
  
 +==== What is ROCKS or Platform/​ROCKS?​ ====
 +(no idea what the acronym, if any, stands for)
 +
 +^[[http://​www.rocksclusters.org/​wordpress/​|ROCKS]]^[[http://​www.platform.com/​Products/​Platform.OCS/​|Platform Rocks]]^
 +|:-o ROCKS is an open-source software stack that enables the consistent delivery of scale-out application clusters |:-O Platform Open Cluster Stack (OCS) is a pre-integrated,​ vendor certified, software stack that enables the consistent delivery of scale-out application clusters using ROCKS|
 +|The [[http://www.rocksclusters.org/rocks-documentation/4.2.1/index.html|User Guide]] provides a good overview of what ROCKS does|Platform, the company, offers 24x7 support for its "rocks implementation". In addition, certain vendors are certified, meaning the hardware has been tested. [[http://www.dell.com/|Dell]] is certified, [[http://www.sun.com|Sun]] is not|
 +|Support is via the [[http://www.rocksclusters.org/wordpress/?page_id=6|Email Discussion List]] | Annual Cluster Care subscription: 24x7 support, regular maintenance, periodic upgrades, patches, and access to resources|
 +| the [[https://​wiki.rocksclusters.org/​|Rocks Wiki]]|US $150 per node, per year|
 +|[[http://​www.rocksclusters.org/​rocksapalooza/​2006/​tutorial-session1.pdf|Introduction to Clusters]], 200 slides, good intro| |
 +|[[http://​www.rocksclusters.org/​rocksapalooza/​2006/​tutorial-session2.pdf|Introduction to Rocks]], 200 slides, often too detailed| |
 +
 +A cluster is typically composed of a "front end" node, perhaps accompanied by an "io" node. Then there may be numerous "light weight" and "heavy weight" nodes. The difference between light and heavy nodes is the CPU speed, the number of cores per CPU, and the total memory footprint. All nodes are densely packed in a rack and connected via switches (such as gigabit ethernet). Special hardware, like [[http://www.voltaire.com/|Infiniband]] switches, provides high performance, low latency connectivity. Here is a typical cluster layout: [[http://www.rocksclusters.org/rocks-documentation/4.2.1/images/cluster.png|Image]]. So what does ROCKS do?
 +
 +  * First the "front end" node is configured using ROCKS. During this step, several "Rolls" are provided: the Kernel Roll, the Base Roll, the HPC Roll, and the Server Pack and Webserver Rolls. The ROCKS operating system Rolls are also provided. These can be substituted with any of the following, but you __must include all__ of your operating system CD-ROMs: [[http://www.centos.org/|CentOS]], Red Hat Enterprise Linux AS4, or [[https://www.scientificlinux.org/|Scientific Linux]].
 +
 +  * The next steps involve configuring the front end node (including making the front end aware of any switches on the cluster). All this information is stored in MySQL databases along with the work node information collected later. This allows the ROCKS software to generate a kickstart file, /etc/hosts entries, etc. for each node. Once that is done, you insert the Kernel Roll into the CD-ROM drive of the first work node. This node, with the default name of compute-0-0, contacts the front end node via DHCP, and the front end registers it. The work node then requests a kickstart file and an operating system is installed on it. These steps are repeated for all nodes.
 +
 +  * On the front end node, a suite of configuration files, such as /etc/passwd, is under the control of a program called "411". Once a user has been added on the front end node via //useradd//, 411 propagates the change to the work nodes on a schedule (or can be forced to do so on demand). The default ROCKS setup does not allow users to log into work nodes, but the UID/GID and accounts must still be made available to the back end nodes.
 +
 +  * front end node:/export/apps is the filesystem area that is shared with every work node. It is globally available as work node:/share/apps. This is where files that are neither under 411 control nor part of operating system RPM packages are located. It is typically the area in which to install shared applications, custom scripts, global datasets, etc., although it is preferable to install applications inside the operating system area via RPM packages.
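
The registration step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how ROCKS is implemented: the table schema, the function names (create_node_db, register_node, render_hosts), the MAC addresses and the IP addresses are all assumptions made for the example, and an in-memory SQLite database stands in for the MySQL database on the front end.

```python
import sqlite3


def create_node_db():
    """In-memory stand-in for the front end's node database (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE nodes ("
        "name TEXT PRIMARY KEY, mac TEXT UNIQUE, ip TEXT, "
        "rack INTEGER, rank INTEGER)"
    )
    return db


def register_node(db, mac, ip, rack=0, prefix="compute"):
    """Record a newly seen node and give it the next free rank in its rack."""
    count = db.execute(
        "SELECT COUNT(*) FROM nodes WHERE rack = ?", (rack,)
    ).fetchone()[0]
    name = f"{prefix}-{rack}-{count}"
    db.execute(
        "INSERT INTO nodes VALUES (?, ?, ?, ?, ?)", (name, mac, ip, rack, count)
    )
    return name


def render_hosts(db):
    """Build /etc/hosts style text from the registered nodes."""
    lines = ["127.0.0.1\tlocalhost"]
    query = "SELECT name, ip FROM nodes ORDER BY rack, rank"
    for name, ip in db.execute(query):
        lines.append(f"{ip}\t{name}")
    return "\n".join(lines) + "\n"


# Illustrative values only; real addresses come from the cluster's own setup.
db = create_node_db()
first = register_node(db, "00:11:22:33:44:55", "10.1.255.254")
second = register_node(db, "00:11:22:33:44:56", "10.1.255.253")
hosts_text = render_hosts(db)
print(hosts_text)
```

The point of the sketch is that one database of registered nodes is enough to derive per-node files, which is why the front end collects the node information before anything is installed.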
 +
 +\\
 +**That'​s the basics of it**
 +
 +\\
 +ROCKS basically manages a "​distribution"​ of one or more operating systems. ​ So, for example: ​
 +
 +  * //shoot-node compute-0-0// ... this command instructs a work node to reinstall its operating system from scratch, wiping out all local data (the reinstall should take about 10 mins)
 +  * //cluster-fork "unix command"// ... is a utility that takes any unix command and executes it on each node, or on a subset selected from the database with --sql="sql command", and collects all the results on the front end node.
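
To make the cluster-fork idea concrete, here is a minimal Python sketch of "pick nodes from a database, run one command on each, collect the results". It is not the cluster-fork source code. The nodes table, the helper names (select_nodes, ssh_runner, fork_command, fake_runner) and the use of plain ssh are assumptions for illustration. The demo uses a fake runner so it can be tried without a cluster.

```python
import sqlite3
import subprocess


def select_nodes(db, where=None):
    """Return node names, optionally narrowed by an SQL WHERE clause."""
    query = "SELECT name FROM nodes"
    if where:
        # Trusted administrator input only; never pass untrusted strings here.
        query += " WHERE " + where
    query += " ORDER BY name"
    return [row[0] for row in db.execute(query)]


def ssh_runner(node, command):
    """Run the command on one node over ssh; return (exit code, output)."""
    result = subprocess.run(
        ["ssh", node, command], capture_output=True, text=True
    )
    return result.returncode, result.stdout


def fork_command(command, nodes, runner=ssh_runner):
    """Run the command on each node in turn and collect results by node name."""
    return {node: runner(node, command) for node in nodes}


# Hypothetical node table standing in for the front end's database.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE nodes (name TEXT, rack INTEGER)")
db.executemany(
    "INSERT INTO nodes VALUES (?, ?)",
    [("compute-0-0", 0), ("compute-0-1", 0), ("compute-1-0", 1)],
)


def fake_runner(node, command):
    """Stand-in for ssh so the sketch runs anywhere."""
    return 0, f"{node}: ran '{command}'"


rack0 = select_nodes(db, "rack = 0")
results = fork_command("uptime", rack0, runner=fake_runner)
for node, (code, output) in results.items():
    print(code, output)
```

Passing the runner as a parameter keeps the node selection and result collection separate from the transport, so the same loop works with ssh on a real cluster or with a stub while testing.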
 +
 +
 +===== ROCKS also provides: =====
 +
 +  * monitoring your cluster. All information describing the nodes is saved in MySQL databases that are accessible and configurable via an Apache server. [[http://www.rocksclusters.org/rocks-documentation/4.2.1/monitoring-database.html|link]]
 +  * viewing status graphs of how work nodes are performing using Ganglia. [[http://​www.rocksclusters.org/​rocks-documentation/​4.2.1/​monitoring-ganglia.html|link]] [[http://​ganglia.sourceforge.net|source link]]
 +  * obtaining "​cluster top" output to view in detail individual process information across work nodes. [[http://​www.rocksclusters.org/​rocks-documentation/​4.2.1/​x1112.html|link]]
 +
 +\\
 +Next: Job Scheduling & Launching ... the "​PBS"​ Roll
 +
 +Next: Message Passing ... ??? (perhaps in the "​HPC"​ Roll)
 +
 +There is also a Roll for [[http://​www.rocksclusters.org/​rocksapalooza/​2006/​lab-sge.pdf|SGE]],​ the Sun Grid Engine
 +
 +
 +
 +
 +====== ​ ======
 +\\
 +prepared for the UUG meeting of [[cluster:​12|12/​13/​2006]]
 +\\
 +\\
cluster/11.txt · Last modified: 2007/02/20 09:50 (external edit)