cluster:92




A “bottoms up” page of tasks performed while inching towards deployment; dated entries run newest first.

Update (07/02/2011)

HP Towards Deployment

  • buy new intel compilers and cmkl?
  • job 100,000 milestone (and 1,000,000 on swallowtail)
  • wait for monthly snapshots to start rotating before disabling 5 TB tivoli
  • 07/02/2011 Done, no changes for matlab, stata users.
  • 15/01/2011 Done: setup scripts to log usage, gather stats
  • 14/01/2011 Done: benchmark some applications, test mpi flavors
  • 10/01/2011 Done: run some test jobs via scheduler, some benchmarks
  • 07/01/2011 install Lava (no SGE installed, so let's do this first)
    • lava is up on greentail and n1
    • need to propagate golden image, done
  • 05/01/2011 rsync over rest of home directories over break (started 20dec10)
    • without the '--delete' and '--delete-excluded' flags
    • added '--stats' flag
    • added the -k and -l flags for rsync when using /home/? where ? is a through z
    • 07/01/2011 almost there, /home to /snapshots is working
    • need to figure out mounts for user access at /snapshots/repository, done
    • set up weekly pulling from /oldhome to /home, in progress
  • 12/18/2010 set up rsnapshot
    • exercising hourly, daily, weekly, monthly
  • 12/15/2010 copy /share/apps across, see below
    • will have to rsync petaltail:/share/apps/src into /share/backups/petaltail (to pull into greentail:/share/apps/src)
  • 12/15/2010 connect filer3's home directories on /mnt
    • rsync -vac -k /oldhome/username /home (does the trick, note no trailing slashes)
    • I will rsync everything over on date X, then redo it on date X+Y, overwriting the first copy
    • anticipate a deadline for moving off netapp filer3 permanently (dell cluster reconfig dependent)
    • make sure the lack of TSM filesystem backups is understood
  • 12/15/2010 CMU license expired, turns out to be a demo license.
    • chasing around hp support for a resolution.
  • 12/15/2010 n21: missing dimms or bad dimms, open up the blade enclosure. Fixed.
  • 12/09/10 CMU backup
    • single file daily backup cmu.conf to /usr/local/backups/cmu
  • 12/09/10 Backup databases (hpsmdb for SIM using postgresql, mysql not installed)
    • not much success, not really needed, can do a rediscover
  • 12/09/10 Clean up /root and archive the training session materials.
  • 12/08/10 Linpack burn in. Stressing the hardware. Done. Results can be found here: Linpack
  • 12/05/10 Document training session materials. Done, on itsdoku wiki.
  • 12/03/10 Full backup of local hard disk (/ and /boot). Done.
  • 12/03/10 David Holton arrives for training. Part of “resource for a week”. Replaced a disk that was already flagged as “predictive failure”. Biggest change in cluster configuration is how the MSA60 volumes were set up. We destroyed all that and rearranged three volumes across the scsi cables (that is vertical versus horizontal) and applied LVM on top.
  • 11/25/10 Decommissioned two BSS racks (cluster sharptail), which will be donated to Wellesley University. That freed up six L6-30 circuits. With James, three of them were pulled to the area where the Flexible Storage array and Equallogic array are racked. The other three are dedicated to the HP cluster. Pulled one Enterprise UPS L6-30 near the HP cluster and turned it on. Three PDUs are lit up, but differ from the documentation. Changed the cabling a bit so that both head node power supplies, one side of the MSA60 power supplies, and all switches plus the KVM are on the enterprise UPS. This shall not be interruptible, yea.
  • 11/15/10 Cluster arrives. A bit overdue (first ETA was 10/01/10).
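
The home directory copy noted above (rsync -vac -k from /oldhome/username into /home, no trailing slashes, with --stats added and no delete flags) could be wrapped roughly as follows. This is only a sketch under assumptions: the helper name build_rsync_cmd and the OLDHOME/NEWHOME variables are illustrative and not taken from the original scripts, and the script prints each command rather than running it.

```shell
#!/bin/sh
# Sketch only: prints the rsync command for each user instead of running it.
# OLDHOME and NEWHOME default to the paths named in the notes above.
OLDHOME="${OLDHOME:-/oldhome}"
NEWHOME="${NEWHOME:-/home}"

# build_rsync_cmd is a hypothetical helper name, not from the original scripts.
build_rsync_cmd() {
    user="${1%/}"   # drop one trailing slash so the directory itself is copied
    # -vac: verbose, archive mode, compare by checksum
    # -k:   treat a symlink to a directory as the directory it points to
    # --stats: print a transfer summary; no --delete, so nothing is removed
    printf 'rsync -vac -k --stats %s/%s %s\n' "$OLDHOME" "$user" "$NEWHOME"
}

# Print one command per user name given on the command line.
for u in "$@"; do
    build_rsync_cmd "$u"
done
```

Saved under a name of your choosing and called with one or more user names, it prints one rsync line per user; the output can be reviewed and then piped to sh once it looks right.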

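The daily single-file backup of cmu.conf to /usr/local/backups/cmu (12/09/10 entry) might look something like the sketch below. Assumptions are labeled: the notes do not give the location of cmu.conf, so the source path is passed in as an argument; the helper names backup_name and backup_cmu_conf and the dated file name are illustrative choices, not the original script.

```shell
#!/bin/sh
# Sketch only. The target directory comes from the notes; everything else is assumed.
BACKUP_DIR="${BACKUP_DIR:-/usr/local/backups/cmu}"

# backup_name (hypothetical helper): prints the dated target path for stamp $1.
backup_name() {
    printf '%s/cmu.conf.%s\n' "$BACKUP_DIR" "$1"
}

# backup_cmu_conf (hypothetical helper): copies the file given as $1
# to a dated name in BACKUP_DIR, preserving mode and timestamps.
backup_cmu_conf() {
    src="$1"
    [ -f "$src" ] || { echo "missing: $src" >&2; return 1; }
    mkdir -p "$BACKUP_DIR" || return 1
    cp -p "$src" "$(backup_name "$(date +%Y%m%d)")"
}
```

A daily cron entry could then source this file and call backup_cmu_conf with the actual path to cmu.conf.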

cluster/92.1297191634.txt.gz · Last modified: 2011/02/08 14:00 (external edit)