User Tools

Site Tools


cluster:119

Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revision Previous revision
Next revision
Previous revision
cluster:119 [2013/10/09 12:41]
hmeij
cluster:119 [2021/06/17 19:32] (current)
hmeij07
Line 6: Line 6:
 Please plenty of time between multiple GPU job submissions.  Like minutes. Please plenty of time between multiple GPU job submissions.  Like minutes.
  
-Jobs need to be submitted to the scheduler on host sharptail itself for now and will be dispatched to nodes n33-n37 in queue mwgpu. They can also be submitted from host greentail but remember any output will be in shraptail's /home+Jobs need to be submitted to the scheduler via cottontail to queues mwgpu, amber128, exx96. 
- --- //[[hmeij@wesleyan.edu|Meij, Henk]] 2013/09/25 08:33//+ 
 +This page is old, the gpu resource ''gpu4'' should be used, a more recent page can be found [[cluster:173|K20 Redo Usage]]. Although there might some useful information on this page explaining gpu jobs
 + --- //[[hmeij@wesleyan.edu|Henk]] 2021/06/17 15:29//
  
 **Articles** **Articles**
Line 52: Line 54:
 </code> </code>
  
-With ''gpu-info'' we can view our running job.  ''gpu-info'' and ''gpu-free'' are available [[http://ambermd.org/gpus/]] (I had to hard code my GPU string information as they came in at 02,03,82&83, you can use deviceQuery to find them).+With ''gpu-info'' we can view our running job.  ''gpu-info'' and ''gpu-free'' are available <del>[[http://ambermd.org/gpus/]]</del> [[http://ambermd.org/gpus12/#Running]](I had to hard code my GPU string information as they came in at 02,03,82&83, you can use deviceQuery to find them).
  
 <code> <code>
Line 66: Line 68:
 3       Tesla K20m      21 C            0 % 3       Tesla K20m      21 C            0 %
 ==================================================== ====================================================
 +
 +[hmeij@sharptail sharptail]$ ssh n33 gpu-free
 +1,3,0
 +
 +
  
 </code> </code>
Line 440: Line 447:
  
 exit $? exit $?
 +
 +
 +</code>
 +
 +
 +===== elim code =====
 +
 +<code>
 +
 +#!/usr/bin/perl
 +
 +while (1) {
 +
 +        $gpu = 0;
 +        $log = '';
 +        if (-e "/usr/local/bin/gpu-info" ) {
 +                $tmp = `/usr/local/bin/gpu-info | egrep "Tesla K20"`;
 +                @tmp = split(/\n/,$tmp);
 +                foreach $i (0..$#tmp) {
 +                        ($a,$b,$c,$d,$e,$f,$g) = split(/\s+/,$tmp[$i]);
 +                        if ( $f == 0 ) { $gpu = $gpu + 1; }
 +                        #print "$a $f $gpu\n";
 +                        $log .= "$f,";
 +                }
 +        }
 +        # nr_of_args name1 value1 
 +        $string = "1 gpu $gpu";
 +
 +        $h = `hostname`; chop($h);
 +        $d = `date +%m/%d/%y_%H:%M:%S`; chop($d);
 +        foreach $i ('n33','n34','n35','n36','n37') {
 +                if ( "$h" eq "$i" ) {
 +                        `echo "$d,$log" >> /share/apps/logs/$h.gpu.log`;
 +                }
 +        }
 +
 +        # you need the \n to flush -hmeij
 +        # you also need the space before the line feed -hmeij
 +        print "$string \n"; 
 +        # or use
 +        #syswrite(OUT,$string,1);
 +
 +        # smaller than specified in lsf.shared
 +        sleep 10;
 +
 +}
  
  
cluster/119.1381322498.txt.gz · Last modified: 2013/10/09 12:41 by hmeij