This shows you the differences between two versions of the page.
cluster:91 [2010/12/08 19:08] hmeij |
cluster:91 [2010/12/31 23:10] hmeij |
||
---|---|---|---|
Line 6: | Line 6: | ||
Grabbed the Linpack source and compiled against / | Grabbed the Linpack source and compiled against / | ||
- | ===== Runs ===== | + | More about [[http:// |
- | So based on what we did with the Dell burn in, follow this [[cluster: | + | ===== HP ===== |
+ | |||
+ | So based on what we did with the Dell burn in, follow this [[cluster: | ||
* N calculation: | * N calculation: | ||
Line 14: | Line 16: | ||
* PxQ: perfect square of 16x16=256, the number of cores we have. | * PxQ: perfect square of 16x16=256, the number of cores we have. | ||
+ | Next create the machines files that list the hostname for each core. | ||
+ | |||
+ | < | ||
+ | for i in `seq 1 32` | ||
+ | do | ||
+ | for j in `seq 1 8` | ||
+ | do | ||
+ | echo n${i}-ib0 >> machines | ||
+ | done | ||
+ | done | ||
+ | </ | ||
+ | |||
+ | Note that we're running via the hostname-ib0 port, that is, the InfiniBand port. It probably does not matter, but that way we stay off the provisioning switch and should see the Voltaire switch light up. | ||
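A quick sanity check on the machines file can catch a miscounted host before a long run. The following is only a sketch: it regenerates the same 32 x 8 list on stdout instead of appending to the file, and the function names `make_machines` and `check_machines` are hypothetical helpers, not part of the original setup.

```shell
#!/bin/bash
# Print one hostname per core: NODES hosts, CORES lines each.
# Uses the same n<i>-ib0 naming as the loop above.
make_machines() {
  local nodes=$1 cores=$2 i j
  for i in $(seq 1 "$nodes"); do
    for j in $(seq 1 "$cores"); do
      echo "n${i}-ib0"
    done
  done
}

# Read a machines list on stdin; report any host whose number of
# entries differs from CORES and return non-zero if one is found.
check_machines() {
  local cores=$1
  sort | uniq -c | awk -v c="$cores" \
    '$1 != c { print "unexpected count:", $2, $1; bad = 1 } END { exit bad }'
}

# Total entries should equal the number of MPI ranks (256 here).
echo "entries: $(make_machines 32 8 | grep -c '')"
make_machines 32 8 | check_machines 8 && echo "all hosts have 8 slots"
```

To check an existing file instead, something like `check_machines 8 < machines` should work the same way.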
+ | |||
+ | Simple script for invocation. | ||
+ | |||
+ | < | ||
+ | #!/bin/bash | ||
+ | |||
+ | export PATH=/ | ||
+ | |||
+ | export LD_LIBRARY_PATH=/ | ||
+ | |||
+ | mpirun -n 256 --hostfile machines ./xhpl > hpl.log 2>&1 & | ||
+ | |||
+ | </ | ||
+ | |||
+ | |||
+ | ===== Results ===== | ||
+ | |||
+ | The best result we found, about 1.5 teraflops, was with | ||
+ | |||
+ | * N = 191,600 | ||
+ | * NB of 128 | ||
+ | * PxQ = 16 x 16 | ||
+ | |||
+ | < | ||
+ | |||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00R2C4 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0016592 ...... PASSED | ||
+ | ============================================================================ | ||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00R2R2 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0016929 ...... PASSED | ||
+ | ============================================================================ | ||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00R2R4 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0015901 ...... PASSED | ||
+ | ============================================================================ | ||
+ | |||
+ | |||
+ | </ | ||
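To put N = 191,600 in context, the arithmetic below estimates how much memory the matrix itself occupies. It is a rough sketch under stated assumptions: 8 bytes per double-precision element, 32 nodes (as in the machines loop), and 12 GiB per node taken from the "12gb" figure on this page. The usable MemTotal on each node is somewhat lower, so the real fraction is a bit higher. The function names are hypothetical.

```shell
#!/bin/bash
# Bytes needed for an N x N matrix of 8-byte doubles.
hpl_matrix_bytes() {
  local n=$1
  echo $(( n * n * 8 ))
}

# Integer percentage of total cluster memory taken by the matrix,
# given N, the node count, and GiB per node (assumed values).
hpl_mem_percent() {
  local n=$1 nodes=$2 gib_per_node=$3
  local total=$(( nodes * gib_per_node * 1024 * 1024 * 1024 ))
  echo $(( $(hpl_matrix_bytes "$n") * 100 / total ))
}

hpl_matrix_bytes 191600        # prints 293684480000
hpl_mem_percent 191600 32 12   # prints 71
```

Under these assumptions the matrix takes about 71% of nominal memory, which leaves headroom for the operating system and MPI buffers.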
+ | |||
+ | ===== Image ===== | ||
+ | |||
+ | Looks like so. | ||
+ | |||
+ | {{: | ||
\\ | \\ | ||
**[[cluster: | **[[cluster: | ||
+ | |||
+ | ===== Hmm ===== | ||
+ | |||
+ | And that revealed a host with 10 GB of memory instead of 12 GB. | ||
+ | |||
+ | < | ||
+ | |||
+ | [root@greentail Linux_PII_CBLAS]# | ||
+ | n10: MemTotal: | ||
+ | n26: MemTotal: | ||
+ | n13: MemTotal: | ||
+ | n3: MemTotal: | ||
+ | n2: MemTotal: | ||
+ | n9: MemTotal: | ||
+ | n23: MemTotal: | ||
+ | n30: MemTotal: | ||
+ | n28: MemTotal: | ||
+ | n1: MemTotal: | ||
+ | n31: MemTotal: | ||
+ | n20: MemTotal: | ||
+ | n27: MemTotal: | ||
+ | n25: MemTotal: | ||
+ | n15: MemTotal: | ||
+ | n16: MemTotal: | ||
+ | n18: MemTotal: | ||
+ | n29: MemTotal: | ||
+ | n6: MemTotal: | ||
+ | n7: MemTotal: | ||
+ | n5: MemTotal: | ||
+ | n24: MemTotal: | ||
+ | n32: MemTotal: | ||
+ | n19: MemTotal: | ||
+ | n12: MemTotal: | ||
+ | n22: MemTotal: | ||
+ | n8: MemTotal: | ||
+ | n11: MemTotal: | ||
+ | n4: MemTotal: | ||
+ | n14: MemTotal: | ||
+ | n17: MemTotal: | ||
+ | n21: MemTotal: | ||
+ | |||
+ | |||
+ | </ | ||
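With 32 lines of output an undersized node is easy to miss by eye. The filter below is a minimal sketch that assumes input lines of the form `host: MemTotal: <kB> kB`, as in the listing above; the function name `flag_low_mem`, the threshold, and the demo values are all made up for illustration and are not the cluster's real numbers.

```shell
#!/bin/bash
# Read "host: MemTotal: <kB> kB" lines on stdin and print the hosts
# whose MemTotal is below MIN_KB, together with the value.
flag_low_mem() {
  local min_kb=$1
  awk -v min="$min_kb" \
    '$2 == "MemTotal:" && $3 + 0 < min { sub(/:$/, "", $1); print $1, $3 }'
}

# Demo with made-up values (not taken from the cluster).
printf '%s\n' 'nA: MemTotal: 12000000 kB' 'nB: MemTotal: 10000000 kB' \
  | flag_low_mem 11000000
```

Piping the per-node MemTotal output through this filter would list only the hosts that need a closer look.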
+ | |||
+ | ===== Dell ===== | ||
+ | |||
+ | Since the cluster will be shut down December 28th we have an opportunity to run Linpack on the Dell cluster. | ||
+ | |||
+ | * ETHERNET | ||
+ | * N calculation: | ||
+ | * NB: start with 64, then 128, try 192 ... | ||
+ | * PxQ: 10x16=160, the number of cores we have (not a perfect square) | ||
+ | |||
+ | < | ||
+ | ============================================================================ | ||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00L2L2 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0020883 ...... PASSED | ||
+ | ============================================================================ | ||
+ | </ | ||
+ | |||
+ | |||
+ | * INFINIBAND | ||
+ | * N calculation: | ||
+ | * NB: start with 64, then 128, try 192 ... | ||
+ | * PxQ: 10x16=160, the number of cores we have (not a perfect square) | ||
+ | |||
+ | < | ||
+ | ============================================================================ | ||
+ | T/V N NB | ||
+ | ---------------------------------------------------------------------------- | ||
+ | WR00L2L2 | ||
+ | ---------------------------------------------------------------------------- | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_1 | ||
+ | ||Ax-b||_oo / ( eps * ||A||_oo * ||x||_oo ) = 0.0024480 ...... PASSED | ||
+ | ============================================================================ | ||
+ | </ | ||
+ | |||
+ | So a total of 2.455e+02 + 3.264e+02, or about 572 Gflops (roughly 0.57 teraflops). | ||
+ | |||
+ | |||
+ | ===== BSS ===== | ||
+ | |||
+ | |||
+ | Since the cluster will be shut down December 28th we have an opportunity to run Linpack on the sharptail cluster. | ||
+ | |||
+ | * N calculation: | ||
+ | * NB: start with 64, then 128, try 192 ... | ||
+ | * PxQ: 9x10=90, close to the 92 cores we have. | ||
+ | |||
+ | |||
+ | Hmm, unable to make this work. My estimate is that with 92 cores and 1,126 GB of memory this cluster should be able to do 800 Gflops. | ||
+ | |||
+ | |||
+ | \\ | ||
+ | **[[cluster: | ||
+ |