This shows you the differences between two versions of the page.
Next revision | Previous revision | ||
cluster:211 [2022/02/28 14:47] hmeij07 created |
cluster:211 [2022/03/01 16:50] (current) hmeij07 [DMTCP CRAC] |
||
---|---|---|---|
Line 9: | Line 9: | ||
* http:// | * http:// | ||
- | CRAC consists of the plugin on top of DMTCP. | + | CRAC consists of the plugin on top of DMTCP.\\ |
+ | This software runs in the original directory | ||
Compilation needs '' | Compilation needs '' | ||
+ | < | ||
+ | # env on node n79 CRAC-early-development-master.zip | ||
+ | # download | ||
+ | wget https:// | ||
+ | # unzip | ||
+ | unzip ../ | ||
+ | mv CRAC-early-development-master / | ||
+ | # gcc | ||
+ | | ||
+ | | ||
+ | / | ||
+ | / | ||
+ | $LD_LIBRARY_PATH | ||
+ | |||
+ | |||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | |||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | |||
+ | |||
+ | # make in place | ||
+ | cd / | ||
+ | ./configure | ||
+ | make # no errors | ||
+ | $ ls bin | ||
+ | dmtcp_command | ||
+ | dmtcp_coordinator | ||
+ | |||
+ | make check # all tests failed, msg: checkpoint error (cause unknown) | ||
+ | make check2 # /bin/sh: -c: line 11: syntax error near unexpected token `&' | ||
+ | make check3 # /bin/sh: -c: line 11: syntax error near unexpected token `&' | ||
+ | |||
+ | cd contrib/ | ||
+ | # edit Makefile set to gcc/g++ in PATH | ||
+ | make # builds without errors, but one shared library is unresolved (see ldd below) | ||
+ | |||
+ | $ ls | ||
+ | libdmtcp_split-cuda.so | ||
+ | kernel-loader.exe | ||
+ | libcuda_wrappers.so | ||
+ | |||
+ | # -lcuda -lcusparse -lcusolver -lcublas | ||
+ | # the local CUDA 10.2 toolkit does not include cublas v11, | ||
+ | # so link against the lowest version available in hpc_sdk | ||
+ | |||
+ | # relinking seems to have worked | ||
+ | $ ldd kernel-loader.exe | ||
+ | # before | ||
+ | libcublas.so.11 => not found | ||
+ | # now | ||
+ | libcublas.so.11 => / | ||
+ | |||
+ | |||
+ | </ | ||
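The `ldd` check above can be repeated for every binary the build produces. A minimal sketch, assuming a POSIX shell with `awk`; the helper name `missing_libs` is hypothetical and not part of DMTCP or CRAC:

```shell
# Hypothetical helper (not part of DMTCP/CRAC): read `ldd` output on stdin
# and print the name of every shared library reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example use against the files listed above (run from the plugin directory):
#   ldd kernel-loader.exe | missing_libs
#   ldd libcuda_wrappers.so | missing_libs
```

Empty output means every dependency resolved. Any name printed points at a library directory that still has to be added to the link line or to `LD_LIBRARY_PATH`.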
+ | |||
+ | Next, cobble together a GPU program like lammps/ | ||
+ | |||
+ | < | ||
+ | |||
+ | # oh well, nice try | ||
+ | cudaGetDeviceCount failed CUDA driver version is insufficient for CUDA runtime version | ||
+ | |||
+ | # manual with amber20 example | ||
+ | |||
+ | dmtcp_launch --new-coordinator \ | ||
+ | --coord-port 0 --port-file / | ||
+ | --ckptdir / | ||
+ | -O -o mdout.$LSB_JOBID -inf mdinfo.1K10 -x mdcrd.1K10 -r restrt.1K10 -ref inpcrd | ||
+ | |||
+ | |||
+ | </ | ||
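Because the launch line is long and its paths change per job, it can help to assemble and print it before running it. A minimal sketch that uses only the `dmtcp_launch` options shown above; the helper name `dmtcp_launch_cmd` and the example paths are hypothetical:

```shell
# Hypothetical helper: print (do not run) a dmtcp_launch command line.
# Arguments: checkpoint directory, port file, then the program and its arguments.
# With --coord-port 0 the new coordinator picks a free port and writes it
# to the port file, so concurrent jobs on one node do not collide.
dmtcp_launch_cmd() {
    ckptdir=$1
    portfile=$2
    shift 2
    printf 'dmtcp_launch --new-coordinator --coord-port 0 --port-file %s --ckptdir %s' \
        "$portfile" "$ckptdir"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

# Example with placeholder paths and a placeholder program name:
#   dmtcp_launch_cmd /tmp/ckpt /tmp/ckpt/port ./my_gpu_app -O
```

Printing the command first makes it easy to check the per-job paths in a scheduler script before committing to a real run.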
\\ | \\ |