This shows you the differences between two versions of the page.
cluster:124 [2013/10/31 17:31] hmeij created |
cluster:124 [2013/10/31 18:21] hmeij |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | BLCR | + | \\ |
+ | **[[cluster: | ||
- | are modules loaded (done via / | + | ==== BLCR ==== |
+ | So we need a day of downtime to switch file server functionality from greentail to sharptail. It would be nice if nobody lost any computational progress. | ||
+ | |||
+ | I've decided to support one checkpoint/ | ||
+ | |||
+ | BLCR consists of two kernel modules, some user-level libraries, and several command-line executables. No kernel patching is required. Modules are loaded at boot via / | ||
+ | |||
+ | * [[https:// | ||
+ | * [[https:// | ||
+ | |||
+ | First, let's test on a node to grasp the concept. | ||
+ | |||
+ | < | ||
+ | |||
+ | # are modules loaded | ||
[hmeij@n33 blcr]$ lsmod | grep blcr | [hmeij@n33 blcr]$ lsmod | grep blcr | ||
blcr 115529 | blcr 115529 | ||
blcr_imports | blcr_imports | ||
- | set env | ||
- | | ||
- | | ||
+ | # set env | ||
+ | export PATH=/ | ||
+ | export LD_LIBRARY_PATH=/ | ||
+ | # is it all working | ||
[hmeij@n33 blcr]$ cr_checkpoint --help | [hmeij@n33 blcr]$ cr_checkpoint --help | ||
Usage: cr_checkpoint [options] ID | Usage: cr_checkpoint [options] ID | ||
Line 23: | Line 39: | ||
... | ... | ||
+ | # and here is our application and output (one extra character per second) | ||
+ | [hmeij@n33 blcr]$ ./ | ||
+ | * | ||
+ | ** | ||
+ | *** | ||
+ | **** | ||
+ | ***** | ||
+ | ****** | ||
+ | ... | ||
+ | </ | ||
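The module check in the session above can be scripted. This is a minimal sketch, assuming `lsmod`-style text arrives on standard input; the function name `blcr_modules_loaded` is hypothetical and not part of BLCR. Only the module names `blcr` and `blcr_imports` come from the session.

```shell
#!/bin/sh
# Hypothetical helper (not part of BLCR): reads lsmod-style output on stdin
# and succeeds only if both module names seen in the session are present.
blcr_modules_loaded() {
    awk '
        $1 == "blcr"         { core = 1 }
        $1 == "blcr_imports" { imports = 1 }
        END { exit !(core && imports) }
    '
}
```

On a node it could be used as `lsmod | blcr_modules_loaded && echo "BLCR modules loaded"`. Matching on the first column avoids a false positive when `blcr` only appears in another module's "used by" list.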
+ | |||
+ | So now let's run this under BLCR and observe what happens. | ||
+ | |||
+ | < | ||
+ | |||
+ | # start application | ||
[hmeij@n33 blcr]$ cr_run ./ | [hmeij@n33 blcr]$ cr_run ./ | ||
[1] 12789 | [1] 12789 | ||
+ | # observe PID | ||
[hmeij@n33 blcr]$ ps | [hmeij@n33 blcr]$ ps | ||
PID TTY TIME CMD | PID TTY TIME CMD | ||
Line 32: | Line 65: | ||
28257 pts/ | 28257 pts/ | ||
+ | # wait, then checkpoint and terminate process | ||
[hmeij@n33 blcr]$ sleep 30 | [hmeij@n33 blcr]$ sleep 30 | ||
[hmeij@n33 blcr]$ cr_checkpoint --term 12789 | [hmeij@n33 blcr]$ cr_checkpoint --term 12789 | ||
[1]+ Terminated | [1]+ Terminated | ||
+ | # save the output | ||
[hmeij@n33 blcr]$ mv context context.save | [hmeij@n33 blcr]$ mv context context.save | ||
+ | </ | ||
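The wait-then-checkpoint steps above could be wrapped in a small function. This is a sketch under stated assumptions: the name `checkpoint_and_stop` and its optional delay argument are hypothetical, and only the `cr_checkpoint --term PID` invocation is taken from the session.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of BLCR) around the manual steps above:
# wait, then checkpoint and terminate the process with the given PID.
# Only `cr_checkpoint --term PID` comes from the session itself.
checkpoint_and_stop() {
    pid=$1
    delay=${2:-30}   # seconds to let the job run first; 30 mirrors the session
    if [ -z "$pid" ]; then
        echo "usage: checkpoint_and_stop PID [DELAY_SECONDS]" >&2
        return 2
    fi
    sleep "$delay"
    cr_checkpoint --term "$pid"
}
```

After starting a job in the background under `cr_run`, the shell's `$!` holds the PID to pass in, which matches the job number and PID pairing shown in the session.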
+ | |||
+ | Ok. Next we use '' | ||
+ | |||
+ | < | ||
+ | |||
+ | # restart in background | ||
[hmeij@n33 blcr]$ cr_restart ./ | [hmeij@n33 blcr]$ cr_restart ./ | ||
[1] 13579 | [1] 13579 | ||
+ | |||
+ | # wait and terminate the restart | ||
[hmeij@n33 blcr]$ sleep 30 | [hmeij@n33 blcr]$ sleep 30 | ||
[hmeij@n33 blcr]$ kill %1 | [hmeij@n33 blcr]$ kill %1 | ||
[1]+ Terminated | [1]+ Terminated | ||
+ | </ | ||
+ | |||
+ | So what we're interested in is the boundary between first termination and subsequent restart. | ||
+ | |||
+ | < | ||
[hmeij@n33 blcr]$ tail context.save | [hmeij@n33 blcr]$ tail context.save | ||
Line 68: | Line 117: | ||
************************************************************ | ************************************************************ | ||
- | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ | + | # pretty nifty! |
- | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ | + | # but be forewarned that there are binary characters lurking at this boundary |
- | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ | + | # you can strip them out with '' |
- | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ | + | # it looks like this |
- | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ | + | |
- | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ | + | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@*************************************************** |
- | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ | + | |
- | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@ | + | </ |
- | ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@*************************************************** | + | |
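The command the page names for stripping these characters is cut off here. One possible approach, which is an assumption and not necessarily the author's command, is to delete NUL bytes (displayed as `^@`) with `tr`. The function name `strip_nuls` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: copy a file to stdout with NUL bytes (shown as ^@) removed.
# Uses tr -d with the octal escape for NUL; this is an assumed approach,
# not necessarily the command referred to in the page.
strip_nuls() {
    tr -d '\000' < "$1"
}
```

For example, `strip_nuls context.save > context.clean` would write a cleaned copy and leave the saved output untouched; the name `context.clean` is only illustrative.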
Line 133: | Line 169: | ||
lrwxrwxrwx 1 hmeij its 40 Oct 31 11:06 1383231850.62322.shell -> | lrwxrwxrwx 1 hmeij its 40 Oct 31 11:06 1383231850.62322.shell -> | ||
/ | / | ||
+ | |||
+ | |||
+ | \\ | ||
+ | **[[cluster: | ||