cluster:97



Home

IOZone

After testing the memory performance of our clusters using Linpack (View Results), what about file system access performance? There are many variables at play in this area, so a higher-level view is more appropriate than a very detailed one.

In order to have comparative numbers, I chose the package IOZone, which seems to be commonly used for this type of activity. IOZone performs many different tests including read, re-read, write, re-write, read-and-write, random mix, backwards reads and a few others. The whole mix might then be an appropriate comparative standard. As details spin out, we could focus on the tests that best reflect our environment; probably random mix.

Setup

IOZone was compiled for x86 64-bit Linux and staged in a tarball. That tarball was copied to the disk housing the file system in question and unpacked, and IOZone was invoked with the vanilla out of the box “rule set” using 'time ./iozone -a -g 12G > output.out'. The results were then saved and graphed. The upper bound on the test file size was set to 12GB because that is the memory footprint across the board on cluster greentail. I did not raise the file size limit above the memory footprint to avoid introducing another variable. You can read all about it: External Link
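The invocation above can be wrapped in a small script so that every file system is tested the same way. This is a minimal sketch, not the procedure actually used: the function names, the default binary path and the default output file name are assumptions for illustration. Only the `-a` and `-g` flags and the 12G limit come from the text.

```python
import subprocess
import time


def build_iozone_command(binary="./iozone", max_file_size="12G"):
    """Return the argv list for an automatic run capped at max_file_size."""
    # -a: automatic mode, runs the default set of tests
    # -g: upper bound on the file size used in automatic mode
    return [binary, "-a", "-g", max_file_size]


def run_iozone(output_path="output.out", **kwargs):
    """Run IOZone, write its report to output_path, return elapsed seconds.

    The elapsed wall-clock time plays the role of the 'real' value that
    the shell's `time` prints.
    """
    command = build_iozone_command(**kwargs)
    start = time.monotonic()
    with open(output_path, "w") as report:
        subprocess.run(command, stdout=report, check=True)
    return time.monotonic() - start


# Hypothetical usage, run from the directory where the tarball was unpacked:
#   elapsed = run_iozone("local.dell.out")
```

Keeping the command construction separate from the execution makes it easy to check the exact flags before committing a node to a run that lasts several hours.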

As some of the tests IOZone performs put quite a load on the host (I observed a single invocation generating a load of 6), I ran IOZone with the LSF/Lava scheduler flag '-x', meaning exclusive use, so no other programs would interfere.
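An exclusive submission of this kind might be assembled as follows. This is a sketch under assumptions: the helper name and the log file name are hypothetical, and the options other than `-x` (`-q` for the queue, `-o` for the job output file) are standard `bsub` options that are assumed to behave the same under Lava.

```python
import subprocess


def build_bsub_command(queue, job_command, output_file="iozone.log"):
    """Return the argv list for an exclusive LSF/Lava job submission."""
    # -x: exclusive execution, so no other job shares the host
    # -q: queue to submit to
    # -o: file that receives the job's standard output
    return ["bsub", "-x", "-q", queue, "-o", output_file, job_command]


def submit(queue, job_command):
    """Submit the job; raises CalledProcessError if bsub reports failure."""
    subprocess.run(build_bsub_command(queue, job_command), check=True)


# Hypothetical usage on a head node where bsub is available:
#   submit("ehw", "./iozone -a -g 12G")
```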

Results

Let's start with cluster petaltail/swallowtail.

  • local.dell.out: real 369m1.476s (or almost 6 hours and 10 mins)

The compute nodes have a single 80GB 7.2K RPM disk containing a /localscratch Linux file system. IOZone took 6+ hours to finish all the tests. So: local disk, one spindle, 4 year old hardware, no raid. I used one of the ehw queue nodes. So how do the fast disks on queue ehwfd compare?

  • fastlocal.dell.out: real 64m36.519s (or slightly over an hour)

The compute nodes in the ehwfd queue each have a disk array directly attached via iSCSI. Each host has dedicated access to 230 GB provided by seven 36GB 15K RPM disks presented as /localscratch. So: local disks, 7 spindles, 4 year old hardware, raid 0. All seven disks work together at high speeds. This is probably the best IOZone performance we'll attain.

  • san.dell.out: real 518m11.531s (or slightly over 8 hours and 30 mins)

Our NetApp filer (filer3) provides 5 TB of home directory space, which is the same volume as /sanscratch, served up via an NFS mount. So now we have added a network component: IOZone performs its tests against a network mounted file system. The volume containing /sanscratch is composed of 24 1TB disks at 7.2K RPM speeds. The aggregate holding this volume also holds other volumes. So: network NFS volume, 24 spindles, raid 50 (I believe). No surprise, it is slow. That it is only about 1/3rd slower than the single local disk is a surprise, though.

Then let's look at cluster greentail.

  • local.hp.out: real 208m7.579s (or almost 3 hours and 30 mins)

Like those in the petaltail cluster, cluster greentail's compute nodes sport a single disk spinning at 7.2K RPM, here 160 GB in size. As above, /localscratch is a Linux file system. So: local disk, one spindle, new hardware, no raid. Performance is nearly double that of the petaltail nodes, which is probably related to disk caching.

  • san.hp.out: real 163m25.761s (or almost 2 hours and 45 mins)

The head node on cluster greentail has a direct attached smart disk array connected via iSCSI. A logical volume of 24 1TB disks, spinning at 7.2K RPM, holds a volume of 5TB presented to the compute nodes as an NFS mount, /sanscratch. To add another variable, the NFS mount is done using an infiniband switch; all previous examples used gigabit ethernet switches. This is referred to as IPoIB and operates at roughly 3x gigE, although that depends on a lot of things. So: network NFS volume over infiniband, 24 spindles, raid 6. Surprisingly, it betters the single spindle local disk example above by roughly 20%.

  • home.hp.out: real 179m46.708s (or just under 3 hours)

On cluster greentail, a separate logical volume presents /home. This volume is comprised of 12 1TB disks at 7.2K RPM speeds. The NFS mount across infiniband is the same as above. Note that the disks involved for /home are different from those for /sanscratch. As expected, it falls slightly short of the sanscratch volume performance, but not by much. However, as users exercise the /home volume this gap may become larger.
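The comparisons between runs are easier to check when the 'real' times are converted to seconds and expressed relative to a baseline. The sketch below does that with the six times reported above; the function and variable names are illustrative only.

```python
import re

# Wall-clock times exactly as reported by `time` for each run.
REAL_TIMES = {
    "local.dell.out": "369m1.476s",
    "fastlocal.dell.out": "64m36.519s",
    "san.dell.out": "518m11.531s",
    "local.hp.out": "208m7.579s",
    "san.hp.out": "163m25.761s",
    "home.hp.out": "179m46.708s",
}


def parse_real(text):
    """Convert a `time`-style 'XmY.Zs' string to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text)
    if match is None:
        raise ValueError(f"unrecognized time format: {text!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def relative_to(baseline, times=REAL_TIMES):
    """Return each run's elapsed time divided by the baseline run's time."""
    base = parse_real(times[baseline])
    return {name: parse_real(value) / base for name, value in times.items()}


# Print every run in minutes and as a multiple of the single local Dell disk.
for name, ratio in relative_to("local.dell.out").items():
    minutes = parse_real(REAL_TIMES[name]) / 60
    print(f"{name:20s} {minutes:7.1f} min  {ratio:5.2f}x")
```

With local.dell.out as the baseline, san.dell.out comes out at about 1.40 times the elapsed time. With local.hp.out as the baseline, san.hp.out comes out at about 0.79, which matches the "roughly 20%" improvement noted above.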

Graphs

IOZone generates lots of interesting graphs, whose interpretation still eludes me somewhat. But some graphs make it obvious where anomalies exist; in other words, at sudden thresholds the performance starts to nosedive.
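One way to locate such thresholds without reading the graphs by eye is to scan a single test's throughput figures, ordered by file size, for sharp drops between neighbouring sizes. This is a sketch under assumptions: the (file size, throughput) pairs are assumed to have been extracted from the IOZone report already, and the 0.5 cutoff is an arbitrary choice, not something derived from these runs.

```python
def find_throughput_drops(samples, drop_ratio=0.5):
    """Return (previous_size, size) pairs where throughput falls sharply.

    samples: iterable of (file_size, throughput) pairs for one test and
    one record length.
    drop_ratio: a step is flagged when throughput falls below this
    fraction of the previous measurement (0.5 is an assumed default).
    """
    ordered = sorted(samples)
    drops = []
    # Compare each measurement with the one at the next larger file size.
    for (prev_size, prev_rate), (size, rate) in zip(ordered, ordered[1:]):
        if prev_rate > 0 and rate < prev_rate * drop_ratio:
            drops.append((prev_size, size))
    return drops
```

Each returned pair brackets the file sizes between which performance fell off, which gives a starting point for comparing against cache and memory sizes.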



cluster/97.1297804110.txt.gz · Last modified: 2011/02/15 16:08 by hmeij