This shows you the differences between two versions of the page.
cluster:91 [2010/12/08 19:08] hmeij |
cluster:91 [2011/01/07 20:49] hmeij |
||
---|---|---|---|
Line 6: | Line 6: | ||
Grabbed the Linpack source and compiled against / | Grabbed the Linpack source and compiled against / | ||
- | ===== Runs ===== | + | More about [[http:// |
- | So based on what we did with the Dell burn in, follow this [[cluster: | + | ===== HP ===== |
+ | |||
+ | So based on what we did with the Dell burn in, follow this [[cluster: | ||
* N calculation: | * N calculation: | ||
Line 14: | Line 16: | ||
* PxQ: perfect square of 16x16=256, the number of cores we have. | * PxQ: perfect square of 16x16=256, the number of cores we have. | ||
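The N and PxQ choices above can be sanity-checked with a small calculation. This is only a minimal sketch, assuming the common rule of thumb that the N x N matrix of 8-byte doubles should fill a fixed fraction of total memory, with N rounded down to a multiple of NB. The helper name `hpl_n` and the default 80% fraction are illustrative assumptions, not values taken from this page.

```shell
#!/bin/bash
# hpl_n NODES GB_PER_NODE NB [FRACTION]
# Hypothetical helper: largest N that is a multiple of NB such that
# N*N*8 bytes fits in FRACTION of total memory.
# FRACTION defaults to 0.80 (an assumption, tune to taste).
hpl_n() {
    awk -v nodes="$1" -v gb="$2" -v nb="$3" -v frac="${4:-0.80}" 'BEGIN {
        bytes = nodes * gb * 1024 * 1024 * 1024 * frac   # usable bytes
        n = int(sqrt(bytes / 8))                          # 8 bytes per double
        printf "%d\n", int(n / nb) * nb                   # round down to NB
    }'
}

# Example: 32 nodes with 12 GB each and NB=128
hpl_n 32 12 128
```

Under the same assumptions (32 nodes, 12 GB each), a smaller N such as 191,600 corresponds to roughly 71% of memory, so passing a lower fraction as the fourth argument gives a more conservative starting point.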
+ | Next, create the machines file, which lists the hostname once for each core. | ||
+ | |||
+ | < | ||
+ | for i in `seq 1 32` | ||
+ | do | ||
+ | for j in `seq 1 8` | ||
+ | do | ||
+ | echo n${i}-ib0 >> machines | ||
+ | done | ||
+ | done | ||
+ | </ | ||
+ | |||
+ | Note that we're running over the hostname-ib0 interface, which is the InfiniBand port. It probably does not matter, but this way we stay off the provisioning switch and should see the Voltaire switch light up. | ||
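Before launching, it can help to confirm that the machines file has the intended shape. The following is a minimal sketch, assuming the `n<i>-ib0` naming used in the loop above; the function names `make_machines` and `check_machines` are hypothetical. The generator truncates the file first, because `>>` appends on every rerun and would otherwise leave duplicate entries.

```shell
#!/bin/bash
# Hypothetical helper: write NODES x CORES hostnames to FILE, one line per core.
make_machines() {
    local nodes=$1 cores=$2 file=$3
    : > "$file"                      # truncate so reruns do not append duplicates
    for i in $(seq 1 "$nodes"); do
        for j in $(seq 1 "$cores"); do
            echo "n${i}-ib0" >> "$file"
        done
    done
}

# Hypothetical helper: succeed only if every host appears exactly CORES times
# and the number of distinct hosts equals NODES.
check_machines() {
    local nodes=$1 cores=$2 file=$3
    sort "$file" | uniq -c | awk -v nodes="$nodes" -v cores="$cores" '
        $1 != cores { bad = 1 }
        END { exit ((bad || NR != nodes) ? 1 : 0) }'
}

make_machines 32 8 machines
check_machines 32 8 machines && echo "machines file OK"
```

With 32 nodes and 8 cores each, the file should end up with 256 lines, matching the `-n 256` passed to mpirun.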
+ | |||
+ | Simple script for invocation. | ||
+ | |||
+ | < | ||
+ | #!/bin/bash | ||
+ | |||
+ | export PATH=/ | ||
+ | |||
+ | export LD_LIBRARY_PATH=/ | ||
+ | |||
+ | mpirun -n 256 --hostfile machines ./xhpl > hpl.log 2>&1 & | ||
+ | |||
+ | </ | ||
+ | |||
+ | |||
+ | ===== Results ===== | ||
+ | |||
+ | The best result we found (about 1.5 teraflops) was with: | ||
+ | |||
+ | * N = 191,600 | ||
+ | * NB of 128 | ||
+ | * PxQ = 16 x 16 | ||
+ | |||
+ | < | ||
+ | |||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00R2C4 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0016592 ...... PASSED | ||
+ | ============================================================================ | ||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00R2R2 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0016929 ...... PASSED | ||
+ | ============================================================================ | ||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00R2R4 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0015901 ...... PASSED | ||
+ | ============================================================================ | ||
+ | |||
+ | |||
+ | </ | ||
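When hpl.log holds many parameter combinations, picking out the best run by eye gets tedious. A minimal sketch, assuming the usual HPL result layout in which each result line starts with a T/V code such as WR00R2C4 and the last whitespace-separated field is Gflops; the function name `best_hpl` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: print the result line with the highest Gflops in an HPL log.
# Assumes result lines begin with W, then R or C, then a digit (e.g. WR00R2C4),
# and that Gflops is the last field on the line.
best_hpl() {
    awk '/^W[RC][0-9]/ && ($NF + 0) > max { max = $NF + 0; best = $0 }
         END { if (best != "") print best }' "$1"
}
```

Usage would be `best_hpl hpl.log`; nothing is printed if the log contains no result lines.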
+ | |||
+ | ===== Image ===== | ||
+ | |||
+ | Looks like so. | ||
+ | |||
+ | {{: | ||
\\ | \\ | ||
**[[cluster: | **[[cluster: | ||
+ | |||
+ | ===== Hmm ===== | ||
+ | |||
+ | And that revealed a host with 10 GB of memory instead of 12 GB. | ||
+ | |||
+ | < | ||
+ | |||
+ | [root@greentail Linux_PII_CBLAS]# | ||
+ | n10: MemTotal: | ||
+ | n26: MemTotal: | ||
+ | n13: MemTotal: | ||
+ | n3: MemTotal: | ||
+ | n2: MemTotal: | ||
+ | n9: MemTotal: | ||
+ | n23: MemTotal: | ||
+ | n30: MemTotal: | ||
+ | n28: MemTotal: | ||
+ | n1: MemTotal: | ||
+ | n31: MemTotal: | ||
+ | n20: MemTotal: | ||
+ | n27: MemTotal: | ||
+ | n25: MemTotal: | ||
+ | n15: MemTotal: | ||
+ | n16: MemTotal: | ||
+ | n18: MemTotal: | ||
+ | n29: MemTotal: | ||
+ | n6: MemTotal: | ||
+ | n7: MemTotal: | ||
+ | n5: MemTotal: | ||
+ | n24: MemTotal: | ||
+ | n32: MemTotal: | ||
+ | n19: MemTotal: | ||
+ | n12: MemTotal: | ||
+ | n22: MemTotal: | ||
+ | n8: MemTotal: | ||
+ | n11: MemTotal: | ||
+ | n4: MemTotal: | ||
+ | n14: MemTotal: | ||
+ | n17: MemTotal: | ||
+ | n21: MemTotal: | ||
+ | |||
+ | |||
+ | </ | ||
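Scanning 32 lines for the odd one out is easy to automate. A minimal sketch, assuming each line of the listing has the form `host: MemTotal: <value> kB` as in /proc/meminfo; the function name `low_mem_hosts` and the threshold are illustrative assumptions.

```shell
#!/bin/bash
# Hypothetical helper: read lines like "n10: MemTotal: <value> kB" on stdin
# and print the hosts whose MemTotal is below MIN_KB (first argument).
low_mem_hosts() {
    awk -v min="$1" '$2 == "MemTotal:" && ($3 + 0) < min {
        sub(/:$/, "", $1)        # strip the trailing colon from the host name
        print $1, $3, $4
    }'
}
```

Piping the listing above into `low_mem_hosts 11534336` (11 GiB expressed in kB, an arbitrary cutoff below the expected 12 GB) would print only the short-changed node.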
+ | |||
+ | ===== Dell ===== | ||
+ | |||
+ | Since the cluster will be shut down December 28th we have an opportunity to run Linpack on the Dell cluster. | ||
+ | |||
+ | * ETHERNET | ||
+ | * N calculation: | ||
+ | * NB: start with 64, then 128, try 192 ... | ||
+ | * PxQ: 10x16=160, the number of cores we have (not a perfect square, but P and Q are kept as close as the core count allows) | ||
+ | |||
+ | < | ||
+ | ============================================================================ | ||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00L2L2 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0020883 ...... PASSED | ||
+ | ============================================================================ | ||
+ | </ | ||
+ | |||
+ | |||
+ | * INFINIBAND | ||
+ | * N calculation: | ||
+ | * NB: start with 64, then 128, try 192 ... | ||
+ | * PxQ: 10x16=160, the number of cores we have (not a perfect square, but P and Q are kept as close as the core count allows) | ||
+ | |||
+ | < | ||
+ | ============================================================================ | ||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00L2L2 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0024480 ...... PASSED | ||
+ | ============================================================================ | ||
+ | </ | ||
+ | |||
+ | So a total of 2.455e+02 + 3.264e+02, or about 572 Gflops (roughly 0.57 teraflops). | ||
+ | |||
+ | |||
+ | ===== BSS ===== | ||
+ | |||
+ | |||
+ | Since the cluster will be shut down December 28th we have an opportunity to run Linpack on the sharptail cluster. | ||
+ | |||
+ | * N calculation: | ||
+ | * NB: start with 64, then 128, try 192 ... | ||
+ | * PxQ: 9x10=90, close to the 92 cores we have. | ||
+ | |||
+ | |||
+ | Hmm, unable to make this work across all the nodes at the same time. Not sure why. My estimate is that with 92 cores and 1,126 GB of memory this cluster should be able to do 500-700 Gflops. | ||
+ | |||
+ | |||
+ | \\ | ||
+ | **[[cluster: | ||
+ |