cluster:51 [2007/09/28 14:57] |
cluster:51 [2007/09/28 14:57] (current) |
+ | \\ | ||
+ | **[[cluster: | ||
+ | This is for experimental purposes only, as a proof of concept. \\ | ||
+ | --- // | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ====== The Story Of NAT ====== | ||
+ | |||
+ | The cluster is served file systems from our **[[cluster: | ||
+ | |||
+ | So what happens when you have another file system that you would like to make available on the back end compute nodes? | ||
+ | |||
+ | Note that: | ||
+ | |||
+ | * I'm not endorsing this approach until we have tested it further | ||
+ | * any " | ||
+ | * any " | ||
+ | * I had no idea how this worked until Scott Knauert put it together | ||
+ | |||
+ | We start by grabbing a surplus computer and installing Linux on it.\\ | ||
+ | We add two NICs (in our case 100 Mbit/s Ethernet, "100e", not gigE).\\ | ||
+ | We run a CAT6 cable from a router port to the cluster (this link is gigE).\\ | ||
+ | And we named this new host **'' | ||
+ | |||
+ | < | ||
+ | [root@NAT: | ||
+ | Linux NAT 2.6.18-5-686 #1 SMP Fri Jun 1 00:47:00 UTC 2007 i686 GNU/Linux | ||
+ | </ | ||
+ | |||
+ | |||
+ | ===== Interfaces ===== | ||
+ | |||
+ | The NAT box will have two interfaces. | ||
+ | |||
+ | * eth1: 129.133.1.225 | ||
+ | * eth2: 10.3.1.10 | ||
+ | |||
+ | This is defined in (Debian) ''/ | ||
+ | |||
+ | < | ||
+ | # This file describes the network interfaces available on your system | ||
+ | # and how to activate them. For more information, | ||
+ | |||
+ | # The loopback network interface | ||
+ | auto lo | ||
+ | iface lo inet loopback | ||
+ | |||
+ | # Wesleyan | ||
+ | auto eth1 | ||
+ | iface eth1 inet static | ||
+ | address 129.133.1.225 | ||
+ | netmask 255.255.255.0 | ||
+ | gateway 129.133.1.1 | ||
+ | |||
+ | # Cluster | ||
+ | auto eth2 | ||
+ | iface eth2 inet static | ||
+ | address 10.3.1.10 | ||
+ | netmask 255.255.255.0 | ||
+ | </ | ||
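With a netmask of 255.255.255.0 on both interfaces, each address sits on its own /24 network. A minimal sketch of that arithmetic, assuming a POSIX shell; the helper name `network_of` is ours and not part of the original setup:

```shell
#!/bin/sh
# Hypothetical helper (not from the original page): print the network
# address for a dotted-quad IP and netmask by ANDing the octets pairwise.
# The function body runs in a subshell so the IFS change does not leak.
network_of() (
    IFS=.
    # Unquoted on purpose: split both arguments on dots into $1..$8.
    set -- $1 $2
    echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
)

network_of 129.133.1.225 255.255.255.0   # eth1, Wesleyan side
network_of 10.3.1.10 255.255.255.0       # eth2, cluster side
```

The two results differ, which is the reason a box with a leg on each network is needed to pass traffic between them.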
+ | |||
+ | |||
+ | |||
+ | ===== ipTables ===== | ||
+ | |||
+ | Since we are opening up the backend of the cluster' | ||
+ | |||
+ | But the whole intent of the NAT host is to provide a bridge between separate networks. | ||
+ | |||
+ | * file ''/ | ||
+ | |||
+ | < | ||
+ | #!/bin/bash | ||
+ | |||
+ | #EXTERNAL is the interface to the outside network. | ||
+ | EXTERNAL=" | ||
+ | #INTERNAL is the interface to the local network. | ||
+ | INTERNAL=" | ||
+ | |||
+ | / | ||
+ | / | ||
+ | / | ||
+ | iptables --flush | ||
+ | iptables --table nat --flush | ||
+ | iptables --delete-chain | ||
+ | iptables --table nat --delete-chain | ||
+ | |||
+ | # added source and destination -hmeij | ||
+ | iptables --table nat --source 10.3.1.0/24 --destination 129.133.90.207 \ | ||
+ | | ||
+ | iptables --source 129.133.90.207 --destination 10.3.1.0/24 \ | ||
+ | | ||
+ | |||
+ | echo " | ||
+ | </ | ||
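A Linux box only relays packets between its interfaces when kernel IPv4 forwarding is switched on, so it is worth checking that before debugging the iptables rules. Several lines of the script above are cut off on this page, so we make no claim about whether it already handles this. The sketch below is a separate, read-only check; the helper name `forwarding_enabled` is ours.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the given file contains "1".
# The default path is the standard Linux procfs switch for IPv4 forwarding.
forwarding_enabled() {
    file=${1:-/proc/sys/net/ipv4/ip_forward}
    [ "$(cat "$file" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
    echo "IPv4 forwarding is on"
else
    echo "IPv4 forwarding is off (or not readable)"
fi
```

The optional file argument exists only so the check can be tried against a scratch file instead of the live kernel setting.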
+ | |||
+ | We can now test the setup by contacting the remote host and attempting to mount the remote file system: | ||
+ | |||
+ | < | ||
+ | [root@NAT: | ||
+ | PING vishnu.phys.wesleyan.edu (129.133.90.207) 56(84) bytes of data. | ||
+ | 64 bytes from vishnu.phys.wesleyan.edu (129.133.90.207): | ||
+ | 64 bytes from vishnu.phys.wesleyan.edu (129.133.90.207): | ||
+ | 64 bytes from vishnu.phys.wesleyan.edu (129.133.90.207): | ||
+ | |||
+ | --- vishnu.phys.wesleyan.edu ping statistics --- | ||
+ | 3 packets transmitted, | ||
+ | rtt min/ | ||
+ | |||
+ | [root@NAT: | ||
+ | [root@NAT: | ||
+ | Filesystem | ||
+ | vishnu.phys.wesleyan.edu:/ | ||
+ | 4.6T 1.5T 3.2T 31% /mnt | ||
+ | [root@NAT: | ||
+ | </ | ||
+ | |||
+ | |||
+ | |||
+ | ===== Routes ===== | ||
+ | |||
+ | On the compute nodes we now need to change the routing of the packets. | ||
+ | |||
+ | < | ||
+ | # add for nat box on administrative network | ||
+ | route add -net 192.168.1.0 netmask 255.255.255.0 gw 192.168.1.254 dev eth0 | ||
+ | # change default route set by platform/ | ||
+ | route add -net default netmask 0.0.0.0 gw 10.3.1.10 | ||
+ | route del -net default netmask 0.0.0.0 gw 10.3.1.254 dev eth1 | ||
+ | </ | ||
+ | |||
+ | And now the routing table on the compute node looks like this: | ||
+ | |||
+ | < | ||
+ | [root@compute-1-1 ~]# route | ||
+ | Kernel IP routing table | ||
+ | Destination | ||
+ | 255.255.255.255 * | ||
+ | 192.168.1.0 | ||
+ | 192.168.1.0 | ||
+ | 10.3.1.0 | ||
+ | 169.254.0.0 | ||
+ | 224.0.0.0 | ||
+ | default | ||
+ | </ | ||
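After the route changes, the default gateway on a compute node should be the NAT box (10.3.1.10) and no longer 10.3.1.254. A small sketch for checking that on many nodes, assuming numeric `route -n` style output where the default route shows destination and genmask 0.0.0.0; the helper name and the sample rows are ours, not captured output:

```shell
#!/bin/sh
# Hypothetical helper: read `route -n` style output on stdin and print the
# gateway of every default route (destination and genmask both 0.0.0.0).
default_gateways() {
    awk '$1 == "0.0.0.0" && $3 == "0.0.0.0" { print $2 }'
}

# Illustrative input using the addresses from this page; the interface
# name and the rows themselves are assumptions, not real output.
default_gateways <<'EOF'
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.3.1.0        0.0.0.0         255.255.255.0   U     0      0        0 eth1
0.0.0.0         10.3.1.10       0.0.0.0         UG    0      0        0 eth1
EOF
```

On a live node the same function would be fed with `route -n | default_gateways`; more than one line of output means a stale default route is still present.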
+ | |||
+ | We should now be able to '' | ||
+ | |||
+ | |||
+ | ===== AutoFS ===== | ||
+ | |||
+ | The whole point of the NAT box is to make the remote home directories available to certain users. | ||
+ | |||
+ | < | ||
+ | |||
+ | [root@swallowtail ~]# egrep ' | ||
+ | |||
+ | hmeij localhost:/ | ||
+ | sknauert vishnu.phys.wesleyan.edu:/ | ||
+ | |||
+ | [root@swallowtail ~]# make -C /var/411 | ||
+ | [root@swallowtail ~]# / | ||
+ | |||
+ | </ | ||
+ | |||
+ | Once autofs is restarted on both the head node and the compute node compute-1-1, | ||
+ | < | ||
+ | [root@compute-1-1 ~]# cd ~sknauert | ||
+ | [root@compute-1-1 sknauert]# df -h . | ||
+ | Filesystem | ||
+ | vishnu.phys.wesleyan.edu:/ | ||
+ | 4.6T 1.5T 3.2T 31% / | ||
+ | </ | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ===== Tests ===== | ||
+ | |||
+ | So let's write some files on compute node compute-1-1 in the remotely mounted home directory. | ||
+ | |||
+ | < | ||
+ | #for i in 1024 10240 102400 1024000; do echo $i; time dd if=/ | ||
+ | # ls -lh | ||
+ | -rw-r--r-- | ||
+ | -rw-r--r-- | ||
+ | -rw-r--r-- | ||
+ | -rw-r--r-- | ||
+ | </ | ||
+ | |||
+ | ^ Where ^ 1024 ^ 10240 ^ 102400 ^ 1024000 ^ | ||
+ | |vishnu.phys:/ | ||
+ | |/ | ||
+ | |/ | ||
+ | |/ | ||
+ | |||
+ | These timings will vary widely depending on what else is competing for the same resources, of course. | ||
+ | |||
+ | The connection in our test setup is limited by the 100 Mbit/s ("100e") NICs in the NAT box. The remote host also has a 100e link to VLAN 90. We should move both to gigE. | ||
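For repeating this kind of write test on other mounts, a rough sketch follows. The original command line is truncated on this page, so the 1024-byte block size, the output file names, the helper name `write_test`, and the use of `/dev/zero` as input are all our assumptions, not the command that produced the numbers above.

```shell
#!/bin/sh
# Hypothetical helper: write COUNT blocks of zeros into DIR and report the
# elapsed wall-clock time in whole seconds. Block size (1024 bytes) and
# file naming are assumptions; the original command is not fully shown.
write_test() {
    dir=$1
    count=$2
    out="$dir/out.$count"
    start=$(date +%s)
    dd if=/dev/zero of="$out" bs=1024 count="$count" 2>/dev/null
    end=$(date +%s)
    echo "$count blocks: $((end - start))s"
}

# Small counts so the sketch is quick to try; point target at the mount
# under test and raise the counts for a real measurement.
target=$(mktemp -d)
for n in 16 64; do
    write_test "$target" "$n"
done
ls -l "$target"
```

Whole-second resolution is coarse for the small sizes, and client-side caching can flatter the result, so treat the output as a rough comparison between mounts only.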
+ | |||
+ | |||
+ | ===== Errors ===== | ||
+ | |||
+ | '' | ||
+ | |||
+ | < | ||
+ | eth1: Transmit error, Tx status register 82. | ||
+ | Probably a duplex mismatch. See Documentation/ | ||
+ | Flags; bus-master1, | ||
+ | Transmit list 00000000 vs. c7c30b60. | ||
+ | 0: @c7c30200 length 800005ea status 000105ea | ||
+ | ... | ||
+ | </ | ||
+ | |||