cluster:119 [DokuWiki]


Differences

This shows you the differences between two versions of the page.

cluster:119 [2013/09/26 08:59]
hmeij
cluster:119 [2017/10/18 09:10]
hmeij07
Line 1: Line 1:
 \\ \\
 **[[cluster:0|Back]]** **[[cluster:0|Back]]**
 +
 +==== Submitting GPU Jobs ====
  
 Please leave plenty of time between multiple GPU job submissions, on the order of minutes. Please leave plenty of time between multiple GPU job submissions, on the order of minutes.
Line 7: Line 9:
  --- //[[hmeij@wesleyan.edu|Meij, Henk]] 2013/09/25 08:33//  --- //[[hmeij@wesleyan.edu|Meij, Henk]] 2013/09/25 08:33//
  
-====  Articles ====+**Articles**
  
-  * [[http://www.pgroup.com/lit/articles/insider/v5n2a1.htm]] Tesla vs. Xeon Phi vs. Radeon +  * [[http://www.pgroup.com/lit/articles/insider/v5n2a1.htm]] Tesla vs. Xeon Phi vs. Radeon: A Compiler Writer's Perspective  
- A Compiler Writer's Perspective +  * [[http://www.pgroup.com/lit/articles/insider/v5n2a5.htm]] Calling CUDA Fortran kernels from MATLAB 
  
-==== Submitting GPU Jobs ==== 
  
  
Line 51: Line 52:
 </code> </code>
  
-With ''gpu-info'' we can view our running job.  ''gpu-info'' and ''gpu-free'' are available [[http://ambermd.org/gpus/]] (I had to hard code my GPU string information as they came in at 02,03,82&83, you can use deviceQuery to find them).+With ''gpu-info'' we can view our running job.  ''gpu-info'' and ''gpu-free'' are available at <del>[[http://ambermd.org/gpus/]]</del> [[http://ambermd.org/gpus12/#Running]] (I had to hard-code my GPU string information, as the GPUs came in at 02, 03, 82 and 83; you can use deviceQuery to find them).
  
 <code> <code>
Line 65: Line 66:
 3       Tesla K20m      21 C            0 % 3       Tesla K20m      21 C            0 %
 ==================================================== ====================================================
 +
 +[hmeij@sharptail sharptail]$ ssh n33 gpu-free
 +1,3,0
 +
 +
  
 </code> </code>
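
The comma-separated ''gpu-free'' output above can be consumed by a job wrapper. The following is only a minimal sketch: it assumes the string (''1,3,0'' here) lists the IDs of currently idle GPUs, and the function name ''pick_gpus'' is hypothetical and not part of the Amber scripts.

```shell
#!/bin/bash
# Hypothetical helper (not from the Amber tools): print the first N GPU IDs
# from a comma-separated list such as the "1,3,0" shown above.
# Assumption: the list contains the IDs of idle GPUs.
pick_gpus() {
    local free_list="$1"   # e.g. "1,3,0"
    local want="${2:-1}"   # number of GPUs requested, default 1
    local -a ids
    # Split the list on commas into an array.
    IFS=',' read -r -a ids <<< "$free_list"
    # Fail if fewer GPUs are listed than requested.
    if [ "${#ids[@]}" -lt "$want" ]; then
        return 1
    fi
    # Re-join the first $want IDs with commas.
    local IFS=','
    echo "${ids[*]:0:$want}"
}

# Possible use inside a job script (assumes gpu-free is on the PATH):
#   export CUDA_VISIBLE_DEVICES=$(pick_gpus "$(gpu-free)" 1) || exit 1
```

Returning a non-zero status when too few IDs are listed lets the calling script stop before launching on a busy GPU.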
Line 143: Line 149:
 /cm/shared/apps/cuda50/toolkit/5.0.35/open64/lib:/cm/shared/apps/cuda50/toolkit/5.0.35/extras/CUPTI/lib:\ /cm/shared/apps/cuda50/toolkit/5.0.35/open64/lib:/cm/shared/apps/cuda50/toolkit/5.0.35/extras/CUPTI/lib:\
 /cm/shared/apps/mvapich2/gcc/64/1.6/lib /cm/shared/apps/mvapich2/gcc/64/1.6/lib
-# signal that this is charmrun/namd job+
  
 ## leave sufficient time between job submissions (30-60 secs) ## leave sufficient time between job submissions (30-60 secs)
Line 181: Line 187:
  
 # NAMD  # NAMD 
-# from greentail we need to recreate module env+# signal that this is charmrun/namd job
 export CHARMRUN=1 export CHARMRUN=1
 # stage the data # stage the data
Line 439: Line 445:
  
 exit $? exit $?
 +
 +
 +</code>
 +
 +
 +===== elim code =====
 +
 +<code>
 +
 +#!/usr/bin/perl
 +
 +while (1) {
 +
 +        $gpu = 0;
 +        $log = '';
 +        if (-e "/usr/local/bin/gpu-info" ) {
 +                $tmp = `/usr/local/bin/gpu-info | egrep "Tesla K20"`;
 +                @tmp = split(/\n/,$tmp);
 +                foreach $i (0..$#tmp) {
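 +                        # gpu-info columns: id, "Tesla", "K20m", temperature, "C", utilization, "%"
 +                        ($a,$b,$c,$d,$e,$f,$g) = split(/\s+/,$tmp[$i]);
 +                        # a GPU at 0 % utilization counts as idle
 +                        if ( $f == 0 ) { $gpu = $gpu + 1; }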
 +                        #print "$a $f $gpu\n";
 +                        $log .= "$f,";
 +                }
 +        }
 +        # nr_of_args name1 value1 
 +        $string = "1 gpu $gpu";
 +
 +        # chomp strips only the trailing newline (chop would remove any last character)
 +        $h = `hostname`; chomp($h);
 +        $d = `date +%m/%d/%y_%H:%M:%S`; chomp($d);
 +        foreach $i ('n33','n34','n35','n36','n37') {
 +                if ( "$h" eq "$i" ) {
 +                        `echo "$d,$log" >> /share/apps/logs/$h.gpu.log`;
 +                }
 +        }
 +
 +        # you need the \n to flush -hmeij
 +        # you also need the space before the line feed -hmeij
 +        print "$string \n"; 
 +        # or use
 +        #syswrite(OUT,$string,1);
 +
 +        # smaller than specified in lsf.shared
 +        sleep 10;
 +
 +}
  
  
cluster/119.txt · Last modified: 2021/06/17 15:32 by hmeij07