General

After you have logged in and read the User Guides & Manuals, you should be able to get some work done.

If you have large compilations to perform, please use one of the login nodes. You will also speed up your compilations if you use the localscratch area. You are welcome to stage data and programs in your home directory.
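A minimal sketch of that workflow, assuming /localscratch is available on the login node you are using (the package name and paths are examples only):

<code bash>
# Build in /localscratch on a login node, install into your home directory.
# "myprogram-1.0" and the paths are placeholders -- adjust to your own software.
mkdir -p /localscratch/$USER/build
cd /localscratch/$USER/build
tar xzf ~/src/myprogram-1.0.tar.gz
cd myprogram-1.0
./configure --prefix=$HOME/apps/myprogram-1.0
make -j4
make install
cd ~ && rm -rf /localscratch/$USER/build    # clean up when done
</code>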

Programs and software can either be installed in the operating system (via the RPM package manager) or staged in /share/apps/.
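If your software should be visible to all nodes, the same build can be pointed at /share/apps/ instead; the directory name below is just an example, so check with us before claiming one:

<code bash>
# Stage a self-built package under /share/apps/ so all nodes can see it.
# "mytool-2.1" is a placeholder name.
./configure --prefix=/share/apps/mytool-2.1
make && make install
export PATH=/share/apps/mytool-2.1/bin:$PATH    # add it to your PATH
</code>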

Data can similarly reside in your home directory or be staged.

NetApp FAS 3050c

Our filesystems are provided by our new NetApp FAS 3050c, a Fabric Attached Storage device.

5 terabytes of filesystem space will be made available. Please read the paragraphs below; it is important to understand the implications. The entire filesystem is shared, in chunks, by all users. There are currently no quotas other than what is implied below.

/sanscratch is a LUN of size 1 TB, thin provisioned. There are other LUNs, each of size 1 TB and thin provisioned, that hold the home directories. At the current time there are 9 such LUNs.

That implies 10 TB of filesystem space has been allocated, but this is a fairy tale. Thin provisioning means the storage device will allocate disk space to each LUN as needed, up to the 1 TB limit. All home directory LUNs are contained in a single volume of size 4 TB. If the volume fills up, all LUNs are moved offline.

There are several reasons for splitting the users across multiple LUNs.

  • If a LUN is moved offline (for resizing for example), only those users on that LUN are affected.
  • If the filesystem becomes corrupted, a file system consistency check needs to be performed. This can imply quite the downtime when performed on very large file systems. Again, this will now only affect the users on the LUN in question.
  • Multiple LUNs allow us to group users together; for example, we have a LUN for students performing class work and a LUN for postdocs/grads/visitors.

<hi yellow>A df command will display the available disk space to users, but be forewarned: it is the Linux client reporting the disk usage, while the actual disk blocks reside on the storage device. The space available depends on the disk usage of all LUNs together, so it is possible to misinterpret this number. The only assurance you have is that if the other LUNs have not used their share, you can use the space available in your assigned LUN.</hi>

Sorry, but that's life with fiber channel and LUNs. I'm working on an idea to periodically post disk space usage by LUN somewhere on the web.
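In the meantime you can at least look at the client-side numbers yourself; the mount points below are assumptions, and a plain df -h will list whatever is actually mounted on your node:

<code bash>
# Client-side view only: the real blocks live on the NetApp, so free space
# here depends on what the other LUNs have consumed.
df -h $HOME /sanscratch
</code>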

/tmp

Do not use it please. Pretty please.

Consider this filesystem as dedicated to the operating system processes.

/localscratch

Dell was so kind as to ship each of our compute nodes with dual disks on board (except for the heavyweight nodes). Since Platform/OCS only uses the first disk, we brought the second disks up as /localscratch. These are single-spindle, slow disks with about 70 GB available.

These disks can be used for multiple purposes. File locking will work here (these filesystems are local, not NFS mounted). Also, if your programs produce frequent output that needs to be written to logs, write it in this area instead of your home directory; then, at the end of your job, copy the files to your home directory in a single step.
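A sketch of that pattern is shown below; the directory layout and program name are placeholders, and the cleanup at the end is your responsibility:

<code bash>
#!/bin/bash
# Write chatty output to /localscratch, copy it home once at the end of the job.
WORKDIR=/localscratch/$USER/$$        # $$ (shell PID) keeps concurrent runs apart
mkdir -p $WORKDIR
cd $WORKDIR
~/bin/myprogram > run.log 2>&1        # frequent log writes hit the local disk
mkdir -p $HOME/results
cp run.log $HOME/results/             # a single copy back to the home directory
cd ~ && rm -rf $WORKDIR               # clean up the local scratch area
</code>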

On the heavyweight nodes /localscratch is provided by the MD1000 disk arrays. We have two MD1000s, each with a split backplane. Seven disks spinning at 15K RPM are striped at RAID level 0. This provides about 230 GB of /localscratch for each of the heavyweight nodes. These filesystems are local to each node; use /sanscratch (or your home directory) if you need to share temporary data amongst the heavyweight nodes.

/sanscratch

This file system is shared amongst all compute nodes. Its permissions are like those of /tmp, meaning users could delete or overwrite other users' data, or their own data, if running multiple copies of one program on a node.

Unique subdirectories for each submitted job will automatically be created for you by the scheduler. <hi yellow>Please use those directories.</hi> Those directories will also be removed automatically when your job finishes. More information can be found on the Queues page, under the topic 'PRE_EXEC and POST_EXEC'.

Because this filesystem is NFS mounted, file locking may not be reliable.
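As an illustration of the per-job directories mentioned above, here is a hedged sketch; the exact environment variable holding the job ID depends on the scheduler (see the Queues page), and $LSB_JOBID, the program, and the file names are assumptions:

<code bash>
#!/bin/bash
# Use the per-job /sanscratch directory created by the scheduler (PRE_EXEC)
# and removed when the job ends (POST_EXEC) -- copy results home before then.
SCRATCH=/sanscratch/$LSB_JOBID        # assumed naming; check the Queues page
cd $SCRATCH
cp $HOME/input/data.in .              # stage input into shared scratch
~/bin/myprogram data.in > data.out
mkdir -p $HOME/results
cp data.out $HOME/results/            # results must leave /sanscratch before cleanup
</code>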

