Gaussian never fixed the connectivity with Linda so it can not be run across multiple nodes. — Meij, Henk 2010/12/09 10:45
(I wrote this up for a user so am sharing it here until we get clarification from Gaussian.com)
Hi Anthony,
I observed your job below on sharptail. This must be running with the standard g09 executable. Gaussian is a program that forks itself on the same host for as many threads as you define, in your case 16. You’ll notice below that the scheduler allocates 16 job slots across 8 nodes. You’ll also notice that Gaussian forks itself 16 times, but all on the same host (bss110, load average about 16), while the other seven nodes sit idle. This will seriously slow down your job.
The solution is to use the Gaussian build compiled with Linda, which provides the communication between the nodes. In that case Gaussian will fork itself only twice on each host. To make this work you must specify the target hosts in your Gaussian job. Here are some tips. In your submit script…
# Point to target queue with predefined hosts
#BSUB -q bss12g16
# This must be 16 always
#BSUB -n 16
# Use the correct g09
export g09root="/share/apps/gaussian/g09root_amd64_linda"
# linda stuff
export GAUSS_LFLAGS="-nodefile $LSB_HOSTS -opt Tsnet.Node.lindarsharg: ssh"
Then in your Gaussian .com file …
%mem=12gb
%nprocshared=2
%lindaworkers=bss011,bss012,bss013,bss014,bss015,bss016,bss017,bss018
And that should be it. However, there is a big problem. We’re trying to resolve this with Gaussian support but have not received any feedback. It appears the Linda build of Gaussian will throw an error, something like:
‘execfile error; could not locate file’
But you could try it and see if you receive a different response.
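The %lindaworkers line above lists each host once, while the scheduler reports one hostname per job slot. If the host list ever has to be generated rather than typed by hand, a sketch like the following may help. It assumes LSF's LSB_HOSTS variable holds the space-separated slot list; unique_hosts is a hypothetical helper name, not part of LSF or Gaussian, and the fallback value is just sample data taken from the job listing on this page.

```shell
#!/bin/sh
# Sketch only: build a %lindaworkers line from a per-slot host list.
# unique_hosts is a hypothetical helper, not an LSF or Gaussian command.
unique_hosts() {
    # $1: space-separated hostnames, repeated once per job slot
    # Prints each hostname once, comma-separated, in first-seen order.
    # $1 is left unquoted on purpose so the shell splits it into words.
    printf '%s\n' $1 | awk '!seen[$0]++' | paste -sd, -
}

# Inside a running job LSF sets LSB_HOSTS; the fallback is sample data only.
LSB_HOSTS="${LSB_HOSTS:-bss110 bss110 bss070 bss070}"
echo "%lindaworkers=$(unique_hosts "$LSB_HOSTS")"
```

The resulting line could then be written at the top of the .com file before g09 is started. Whether the Linda build accepts it is a separate question, given the error described above.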
4722 adavis0 RUN bss24 sharptail bss110 Fry_Lab Nov 17 11:09
                                 bss110
                                 bss070
                                 bss070
                                 bss080
                                 bss080
                                 bss091
                                 bss091
                                 bss102
                                 bss102
                                 bss072
                                 bss072
                                 bss092
                                 bss092
                                 bss123
                                 bss123

[root@sharptail tmp]# pdsh uptime | egrep '110|070|080|091|102|072|092|123'
bss070: 10:47:53 up 69 days, 20:10, 0 users, load average: 0.00, 0.00, 0.00
bss072: 10:47:40 up 69 days, 19:59, 0 users, load average: 0.00, 0.00, 0.00
bss091: 10:48:04 up 61 days, 20:37, 0 users, load average: 0.00, 0.00, 0.00
bss092: 10:47:58 up 61 days, 20:28, 0 users, load average: 0.00, 0.00, 0.00
bss080: 10:47:57 up 69 days, 1:07, 0 users, load average: 0.00, 0.00, 0.00
bss102: 10:47:53 up 61 days, 19:25, 0 users, load average: 0.00, 0.00, 0.00
bss123: 10:50:49 up 61 days, 39 min, 0 users, load average: 0.00, 0.00, 0.00
bss110: 10:49:08 up 61 days, 1:12, 0 users, load average: 16.51, 16.35, 16.25
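The idle nodes in the uptime listing above can also be picked out automatically. This is a minimal sketch, assuming the input lines have the "host: ... load average: a, b, c" shape shown on this page; idle_hosts and its default threshold of 0.5 are hypothetical choices, not part of pdsh or LSF.

```shell
#!/bin/sh
# Sketch only: list hosts whose 1-minute load average is below a threshold.
# idle_hosts is a hypothetical helper; it reads `pdsh uptime` style lines on stdin.
idle_hosts() {
    # $1: optional threshold, defaults to 0.5
    awk -v limit="${1:-0.5}" '
        /load average:/ {
            host = $1
            sub(/:$/, "", host)                    # drop the trailing colon pdsh adds
            split($0, parts, "load average: ")     # parts[2] holds "a, b, c"
            split(parts[2], loads, ", ")           # loads[1] is the 1-minute average
            if (loads[1] + 0 < limit + 0) print host
        }'
}
```

For example, piping the same pdsh uptime command used above into idle_hosts would print the allocated hosts that are doing no work, one per line.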