This is the template that the primary share owner will receive to start the process of migrating off Flexstorage to the Rstore platform (with custom edits in the bottom section). It is not a storage policy draft, but information could be gleaned from it.
Your Flexstorage storage share will be migrated to our new platform, dubbed Rstore; details are below. Once you have tested the new environment, we will need to coordinate a final update from Flexstorage to Rstore before you formally switch over.
Our current platform to provide disk storage capacity for users and groups is called Flexstorage. In this model users and groups purchase storage as needed for backup, replication and/or snapshots. Our next platform will be called Rstore (which stands for remote or research storage), and in this model disk space is allocated to users and groups upon request. Both platforms are made up of many slow spinning disks and should not be relied on for performance. A storage advisory group will evaluate the requests, possibly twice a year.
You will use your Wesleyan Active Directory (AD) credentials, that is, the username/password combination used to access all other Wesleyan services. There will be two service addresses: rstore0.wesleyan.edu and rstore2.wesleyan.edu. Each address is backed by a pair of integrated storage and server modules (rstoresrv[0&1].wesleyan.edu for rstore0 and rstoresrv[2&3].wesleyan.edu for rstore2). The service addresses point to the primary member of each pair. During a failover event the secondary becomes the primary and handles the traffic for that service address. The failed primary, once repaired, then resurfaces as the secondary member.
There are several ways to access your contents: by logging in or by mounting the shares. More information is provided below with specifics for your share.
Each primary/secondary pair contains a /home directory that is kept in sync. This is solely for SSH access. Users' /home quotas are very small (10 MB) and should only be used for small items such as scripts. Each primary/secondary's disk array is carved into four data filesystems (/data/1, /data/2, /data/3 and /data/4). Shares will be distributed across the four 26 TB data filesystems, allowing all shares room to grow.
Replication keeps the data on the secondary member in sync with the data on the primary member. In the event of a failover, the contents are available on short notice. Note that if a file is corrupt or missing on the primary, it will be so on the secondary following a replication event. The secondary members within each pair replicate any new data from the primary nightly via a pulling action. The frequency of "pulling" is described as nightly, but depending on the volume of new data/changes a run can span more than 24 hours. This happens for all four data filesystems within each pair.
Only the shares located on the third data filesystem (/data/3) will, in addition to replication, receive snapshots. Snapshots are point-in-time copies of the share contents, taken nightly and locally on both primary and secondary, for example from /data/3/share_name/ to /data/4/snapshots/share_name/. Snapshots will be kept on this schedule: 6 daily, 4 weekly, 2 monthly. Thus if a file is deleted or corrupted it can be restored. This only happens for shares on /data/3, and only while we can sustain such disk usage.
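Restoring from a snapshot is just a copy back out of the snapshot tree. The sketch below simulates the layout in a temporary directory; the daily.0 subfolder name and the results.txt file are hypothetical, and actual snapshot folder names may differ.

```shell
# Hedged sketch: simulate the snapshot layout locally, then restore a file.
# Real paths would be /data/3/share_name and /data/4/snapshots/share_name.
base=$(mktemp -d)
mkdir -p "$base/data/3/share_name" \
         "$base/data/4/snapshots/share_name/daily.0"

# Pretend yesterday's snapshot still holds a file that was deleted today.
echo "important results" > "$base/data/4/snapshots/share_name/daily.0/results.txt"

# Restore: copy the file from the snapshot back into the live share.
cp "$base/data/4/snapshots/share_name/daily.0/results.txt" \
   "$base/data/3/share_name/results.txt"

cat "$base/data/3/share_name/results.txt"
```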
Note: Because of replication, large reorganizations of content areas (250+ GB or 10,000+ files) cause a lot of deletions first, then recopying. Please describe in detail what needs to be moved and we can perform those actions on both primary and secondary members, avoiding this.
Note: For very large filesystems whose contents do not change, replication can take a long time and is typically unnecessary after the first copy event. Share owners can control what gets replicated, or not, by staging a file in the top-level share folder named rsync.incl or rsync.excl. These files contain one absolute path per line naming the folders to be included or excluded during replication, for example: /data/1/share_name/projects/2005 (do not use unusual characters or spaces). An include file replicates only the listed folders; an exclude file replicates the whole share but skips the listed folders.
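Creating one of these control files is a one-liner from the share owner's side. The sketch below writes an rsync.excl into a temporary directory standing in for the top-level share folder; the listed project paths are hypothetical.

```shell
# Hedged sketch: stage an rsync.excl in the top level of a share so that
# replication skips the listed folders. $share stands in for a real share
# folder such as /data/1/share_name.
share=$(mktemp -d)

# One absolute path per line; no spaces or unusual characters.
cat > "$share/rsync.excl" <<'EOF'
/data/1/share_name/projects/2005
/data/1/share_name/scratch
EOF

cat "$share/rsync.excl"
```

Naming the file rsync.incl instead would invert the meaning: only the listed folders would be replicated.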
Note: During an initial seeding of share contents, or when adding large volumes of content, it is possible to skip replication, avoiding any race conditions. If a file named rsync.skiprepl is found in the top-level share folder, replication (and snapshots) will not be performed.
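In practice that means dropping an empty marker file before the bulk copy and removing it afterwards, as in this sketch ($share again stands in for the top-level share folder):

```shell
# Hedged sketch: pause replication while seeding a share with content.
share=$(mktemp -d)

touch "$share/rsync.skiprepl"   # replication/snapshots skipped while present

# ... perform the large initial copy into the share here ...

rm "$share/rsync.skiprepl"      # re-enable replication when the copy is done
```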
Your content is protected using these methods:
[TEMPLATE]
[TEMPLATE]
Note: UID/GID name space. Every user has a uid/gid identification which determines permissions on files and directories. For this to work properly, they should be the same everywhere; hence we are adopting the AD-assigned uid/gid combinations. You can find your primary uid/gid with the Unix command id username on the servers rintintin.wesleyan.edu and chloe.wesleyan.edu. The shares will have the proper share-owner uids, but we are forcing gids to the share's group designation (which will be created in AD for you). The access methods differ slightly in this regard and it is important to understand them. If using a Samba mount to access, uids/gids are set properly.
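The id command can also break the output down, which is handy when checking that your uid and the share's gid look right. A quick sketch (run without a username, id reports your own identity):

```shell
# Hedged sketch: inspect your own uid/gid mapping. On rintintin or chloe you
# would run `id username`; with no argument, id reports the calling user.
id -u    # numeric uid only
id -gn   # primary group name only
id       # full uid/gid/supplementary-groups line
```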
SSH: log in to a service address with your AD credentials; you will land in /home/username. Change directory (cd /location/to/share) to your share location, list its contents (ls -l), etc. Before creating files, switch your active group to the share's group (newgrp SHARE_GROUP_NAME).
Windows (SMB): \\service_address.wesleyan.edu\SHARE_NAME
Mac (SMB): smb://service_address.wesleyan.edu/SHARE_NAME
NFS: mount -t nfs rstoresrv[0|1].wesleyan.edu:/data/[1-4]/SHARE_NAME /mnt/foo
Linux CIFS (e.g., an /etc/fstab entry): //rstoresrv0.wesleyan.edu/SHARE_NAME /mnt/foo cifs username=WESLEYAN/username,password=password,domain=wesleyan