When installing galario (https://mtazzari.github.io/galario/) via miniconda3, note that the web site states: “Due to technical limitations, the conda package does not support GPUs at the moment. If you want to use a GPU, you have to build galario by hand.”
A compilation by hand yields two standalone libraries and, presumably, GPU functionality. There is an example of an invocation using -llibgalario_cuda.
But we have Python code, so my thought was to pass these two standalone libraries to the compiler via environment variables.
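Before relying on environment variables, it helps to confirm which galario libraries actually sit in the directories those variables name. For gcc, CPATH and LIBRARY_PATH only add search directories at compile and link time, and LD_LIBRARY_PATH does the same for the run-time loader. None of them links a library by itself. The following is a minimal sketch; the function name `find_shared_libs` and the `libgalario*` file pattern are my own assumptions, not something taken from the galario build.

```python
import glob
import os


def find_shared_libs(pattern="libgalario*",
                     env_vars=("LIBRARY_PATH", "LD_LIBRARY_PATH"),
                     environ=None):
    """List files matching `pattern` in each directory named by `env_vars`.

    `pattern` is an assumed file name pattern; adjust it to whatever the
    hand-built galario install actually produced.
    """
    environ = os.environ if environ is None else environ
    found = {}
    for var in env_vars:
        hits = []
        # Both variables hold a colon-separated directory list on Linux.
        for directory in environ.get(var, "").split(os.pathsep):
            if directory:
                hits.extend(sorted(glob.glob(os.path.join(directory, pattern))))
        found[var] = hits
    return found


# Report what the current shell environment exposes.
for var, hits in find_shared_libs().items():
    print(var, hits if hits else "no matching libraries found")
```

An empty list for a variable means no directory on that path holds a matching file, so the compiler or the loader would not find the library through that variable.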
The Nuitka source (https://nuitka.net/) was downloaded and used directly.
    cd /usr/local/src/tmp/
    tar zxvf /zfshomes/hmeij/amhughes/Nuitka-0.6.16.1.tar.gz
    cd Nuitka-0.6.16.1/
    ln -s /zfshomes/hmeij/amhughes/galario_SS-6.py

    # make sure pip installed cpython in the python of choice
    # miniconda3's python 3.8 (source the code initialization block)

    # be sure to get a recent gcc compiler
    export PATH=/share/apps/CENTOS6/gcc/9.2.0/bin:$PATH
    export LD_LIBRARY_PATH=/share/apps/CENTOS6/gcc/9.2.0/lib64:$LD_LIBRARY_PATH
    export LD_LIBRARY_PATH=/share/apps/CENTOS6/gcc/9.2.0/lib:$LD_LIBRARY_PATH

    # then generate a centos 7 binary
    python bin/nuitka --follow-imports galario_SS-7.py
That all worked, and a sample job scheduler script is listed below. It is claimed that the compiled C binary will run 300% faster than the Python code. We'll see what the users say.
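One way to check the speed claim without waiting for user reports is to time both invocations directly. The following is a minimal sketch; the helper names `best_wall_time` and `speedup` are my own, and the command timed at the end is only a placeholder to be replaced with the real script and binary invocations.

```python
import subprocess
import sys
import time


def best_wall_time(cmd, repeats=3):
    """Return the shortest wall-clock time in seconds over `repeats` runs of cmd."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        # check=True raises if the command fails, so a crash is not timed as a success.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(t_python, t_binary):
    """Ratio of run times; read literally, "300% faster" would be a ratio of 4."""
    return t_python / t_binary


# Placeholder command: substitute the Python script and the compiled binary,
# each with the same arguments, and compare the two results with speedup().
print(best_wall_time([sys.executable, "-c", "pass"], repeats=1))
```

Taking the minimum over a few runs reduces noise from other jobs on the node. For an MPI job the timing should be taken around the full launcher command.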
Next, to try to pass the standalone libraries to the compiler, I tried this:
    # cuda 9.2
    export CUDAHOME=/usr/local/n37-cuda-9.2
    export PATH=/usr/local/n37-cuda-9.2/bin:$PATH
    export LD_LIBRARY_PATH=/usr/local/n37-cuda-9.2/lib64:$LD_LIBRARY_PATH
    which nvcc

    # fftw3
    export FFTW_HOME=/share/apps/CENTOS7/fftw/3.3.8
    export PATH=/share/apps/CENTOS7/fftw/3.3.8/bin:$PATH
    export LD_LIBRARY_PATH=$FFTW_HOME/lib:$LD_LIBRARY_PATH

    # for gpu we have the c++ galario libraries
    export LD_LIBRARY_PATH=/share/apps/CENTOS7/galario/1.2.2/usr/local/lib:$LD_LIBRARY_PATH

    # attempt at linking those libraries
    CPATH=/share/apps/CENTOS7/galario/1.2.2/usr/local/include \
    LIBRARY_PATH=/share/apps/CENTOS7/galario/1.2.2/usr/local/lib \
    python bin/nuitka --follow-imports galario_SS-7gpu.py
It compiled, but when invoked it does not find any CUDA routines. That may be because I cannot find any CUDA Python code in the git repository either.
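A quick check of the Python side can narrow this down, since the compiled binary can only use the GPU routines that the imported galario package itself exposes. The following is a minimal sketch. It assumes the package is importable as `galario` and that a build with CUDA sets a module-level `HAVE_CUDA` flag; both are my assumptions from memory of the galario documentation, not verified against this install. The function name `galario_cuda_status` is hypothetical.

```python
import importlib


def galario_cuda_status(package="galario"):
    """Report whether `package` imports and whether it advertises CUDA support.

    The HAVE_CUDA attribute is an assumption; a missing attribute is
    treated the same as "no CUDA".
    """
    try:
        pkg = importlib.import_module(package)
    except ImportError as exc:
        return {"installed": False, "have_cuda": False, "detail": str(exc)}
    return {
        "installed": True,
        "have_cuda": bool(getattr(pkg, "HAVE_CUDA", False)),
        # The file path shows which install (conda or hand-built) was picked up.
        "detail": getattr(pkg, "__file__", ""),
    }


print(galario_cuda_status())
```

If this reports the conda install, or `have_cuda` is false, then the Python being compiled has no GPU code path, regardless of what is on LIBRARY_PATH at compile time.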
    #!/bin/bash
    rm -f err-mpi-7c out-mpi-7c
    # need conda initialization code block in ~/.bashrc, or next lines

    # queues mwgpu, amber128 and exx96
    #BSUB -q exx96
    #BSUB -o out-mpi-7c
    #BSUB -e err-mpi-7c
    #BSUB -J test
    #BSUB -R "span[hosts=1]"
    #BSUB -n 8

    # miniconda3
    source /share/apps/CENTOS7/amber/miniconda3/etc/profile.d/conda.sh
    export PATH=/share/apps/CENTOS7/amber/miniconda3/bin:$PATH
    export LD_LIBRARY_PATH=/share/apps/CENTOS7/amber/miniconda3/lib:$LD_LIBRARY_PATH
    which mpirun python conda

    # wrapper handles mpirun invocation
    openmpi-mpirun \
    ./galario_SS-7.bin 'Sz_46.txt' \
    1.28e-3 36.7 0.339 -1.0 1.0 10.9 61.7 0.46 -0.47 5 12 \
    --outdir "output"