Linpack

Grabbed the Linpack source and compiled it against /opt/openmpi/1.4.2 … using the Make.Linux_PII_CBLAS makefile. We had to copy the ATLAS libraries over from another host. In the makefile we changed $HOME, pointed $MPdir and $MPlib at libmpi.so, and repointed $LAdir. After that it compiled fine.

Runs

Based on what we did for the Dell burn-in (see the HPLinpack Runs link), here are some calculations:

  • N calculation: 32 nodes with 12 GB each is 384 GB total, which holds 48 G double-precision (8-byte) elements … 48 G is 48*1024*1024*1024 = 51,539,607,552 … take the square root of that and round down: 227,023 … 80% of that is roughly 181,600
  • NB: start with 64, then 128, try 192 …
  • PxQ: perfect square of 16×16=256, the number of cores we have.
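The arithmetic in the list above can be reproduced with a short script. This is only a sketch: the variable names and the `hpl_sizes` function are illustrative, and rounding down to the nearest 100 is an assumption chosen because it matches the 181,600 figure.

```bash
#!/bin/bash
# Sketch of the HPL sizing arithmetic; names are illustrative, not from the HPL tooling.
nodes=32          # compute nodes
mem_gb=12         # memory per node in GB
cores_per_node=8  # cores per node
fraction=80       # percent of the memory-limited N to use

hpl_sizes() {
  # Total memory in bytes, then the number of 8-byte doubles that fit in it.
  local bytes=$(( nodes * mem_gb * 1024 * 1024 * 1024 ))
  local elements=$(( bytes / 8 ))
  # Largest N such that an N x N matrix of doubles fits (truncated square root).
  local n_max
  n_max=$(awk -v e="$elements" 'BEGIN { printf "%d", sqrt(e) }')
  # Take the chosen fraction and round down to the nearest 100 (assumed rounding).
  local n=$(( n_max * fraction / 100 / 100 * 100 ))
  # Total cores, which P x Q has to equal.
  local cores=$(( nodes * cores_per_node ))
  echo "elements=$elements n_max=$n_max N=$n cores=$cores"
}

hpl_sizes
```

Changing `nodes`, `mem_gb` or `fraction` at the top gives a starting N for a different cluster size or memory budget.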

Next, create the machines file, which lists the hostname once for each core.

for i in $(seq 1 32)        # 32 nodes
do
  for j in $(seq 1 8)       # 8 cores per node
  do
    echo n${i}-ib0
  done
done > machines             # overwrite, so a rerun does not append duplicates

Note that we use the hostname-ib0 names, that is, the InfiniBand interfaces. It probably does not matter, but that way we stay off the provisioning switch and should see the Voltaire switch light up.
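Before launching, it may be worth confirming that the machines file has the expected shape: 32 hosts with 8 entries each, 256 lines in total. The helper below is a hypothetical sketch, not part of HPL or Open MPI; `check_machines` and its arguments are made up for illustration.

```bash
#!/bin/bash
# check_machines FILE NODES SLOTS
# Hypothetical helper: verifies that FILE lists NODES distinct hosts,
# each exactly SLOTS times. Returns non-zero on a mismatch.
check_machines() {
  local file=$1 nodes=$2 slots=$3
  awk -v n="$nodes" -v s="$slots" '
    { count[$1]++ }                      # occurrences per hostname
    END {
      hosts = 0; bad = 0
      for (h in count) { hosts++; if (count[h] != s) bad++ }
      if (NR == n * s && hosts == n && bad == 0) {
        print "OK: " NR " slots on " hosts " hosts"
        exit 0
      }
      print "MISMATCH: " NR " lines, " hosts " hosts, " bad " hosts without " s " entries"
      exit 1
    }' "$file"
}

# Only run the check when a machines file is present in the current directory.
if [ -f machines ]; then
  check_machines machines 32 8
fi
```

A mismatch usually means the file was appended to twice or a node was left out of the loop.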

Simple script for invocation.

#!/bin/bash

export PATH=/opt/openmpi/1.4.2/bin:$PATH

export LD_LIBRARY_PATH=/opt/openmpi/1.4.2/lib:/home/hptest/test/lib64/atlas_GenuineIntel_x86_64

mpirun -n 256 --hostfile machines ./xhpl > hpl.log 2>&1 &

Results

The best results we found were with:

  • N = 181,600
  • NB of 128
  • PxQ = 16 x 16
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR00R2C4      181600   128    16    16            2642.49          1.511e+03
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.0241238 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0097160 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0016592 ...... PASSED
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR00R2R2      181600   128    16    16            2649.93          1.507e+03
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.0246131 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0099131 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0016929 ...... PASSED
============================================================================
T/V                N    NB     P     Q               Time             Gflops
----------------------------------------------------------------------------
WR00R2R4      181600   128    16    16            2644.63          1.510e+03
----------------------------------------------------------------------------
||Ax-b||_oo / ( eps * ||A||_1  * N        ) =        0.0231181 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_1  * ||x||_1  ) =        0.0093110 ...... PASSED
||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) =        0.0015901 ...... PASSED
============================================================================
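To pick the fastest variant out of a long hpl.log, and to sanity-check the reported rate, something like the following could be used. It is a sketch under two assumptions: result lines start with the T/V code (WR… or WC…) with Gflops in the seventh column, as in the output above, and the rate is approximated by the leading term (2/3)·N³ of the LU operation count divided by the run time. The function names are illustrative.

```bash
#!/bin/bash
# best_run LOG: print the HPL result line with the highest Gflops value.
# Assumes result lines look like "WR00R2C4 N NB P Q Time Gflops".
best_run() {
  awk '/^W[RC][0-9]/ && $7 + 0 > best { best = $7 + 0; line = $0 }
       END { if (line != "") print line }' "$1"
}

# approx_gflops N SECONDS: leading-order estimate, (2/3) * N^3 / time, in Gflops.
approx_gflops() {
  awk -v n="$1" -v t="$2" 'BEGIN { printf "%.0f\n", (2 / 3) * n * n * n / t / 1e9 }'
}

# Only scan the log when it is present in the current directory.
if [ -f hpl.log ]; then
  best_run hpl.log
fi

# Cross-check the first result above: N = 181600 in 2642.49 seconds.
approx_gflops 181600 2642.49
```

For the first run above the estimate comes out at about 1511 Gflops, in line with the reported 1.511e+03.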




cluster/91.1291836096.txt.gz · Last modified: 2010/12/08 14:21 by hmeij