**[[cluster:0|Back]]**
  
==== RSTORE Update ====

The rstore0/2 access points will go into read-only mode in early 2019. These access points will be replaced by a similar but new platform, rstore4/6. Each share owner will be contacted, and content will be copied if needed (we have two copies of everything on the old platform, so hopefully most of it can remain there). The new platform is 2x220 TB, fully backed with an rsync replication engine.

 --- //[[hmeij@wesleyan.edu|Henk]] 2019/04/11 09:07//
  
==== RSTORE FAQ ====
  
Your Flexstorage storage share will be migrated to our new platform dubbed Rstore.
Details are below; not all may apply to your share.
Once you have tested the new environment, we need to coordinate a final update from Flexstorage to Rstore before you formally switch over.
  
=== What is it? ===
  
Our current platform to provide disk storage capacity for users and groups is called Flexstorage (2x28 TB).  In this model users and groups purchase storage as needed for backup, replication and/or snapshots. Our next platform will be called Rstore (which stands for remote or research storage, 2x112 TB with full backup), and in this model disk space is allocated to users and groups upon request.  Both platforms are made up of lots of slow spinning disks and should not be relied upon for performance. A storage advisory group will evaluate the requests, possibly twice a year.
  
=== How do I access it? ===
You will use your Wesleyan Active Directory (AD) credentials, that is, the username/password combination used to access all other Wesleyan services.  There will be two service addresses: rstore0.wesleyan.edu and rstore2.wesleyan.edu.  These addresses are each backed with a pair of integrated storage and server modules (rstoresrv[0&1].wesleyan.edu for rstore0 and rstoresrv[2&3].wesleyan.edu for rstore2). The service addresses will point to the primary member of each pair.  During a fail over event the secondary will become the primary and handle the traffic for that service address. The failed primary, after being fixed, will then resurface as the secondary member.
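If you are curious which physical member currently answers a service address, you can simply log in through it and ask; a minimal sketch (the username is a placeholder):
<code>
# log in via the service address and print the host that actually answered
ssh username@rstore0.wesleyan.edu hostname
# normally one member of the rstoresrv[0&1] pair; after a fail over the other member answers
</code>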
  
There are several ways to access your contents: either by logging in or mounting the shares.
More information is provided below.
  
=== How is it configured? ===
  
Each primary/secondary pair contains a /home directory that is kept in sync.  This is solely for SSH access. Users' /home quotas are very small (10 MB) and should only be used for scripts, for example.  Each primary/secondary's disk array is carved up into four data filesystems (/data/1, /data/2, /data/3 and /data/4).  Shares will be distributed across the four 26 TB data filesystems, allowing all shares room to grow.
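Once logged in over SSH you can see this layout for yourself; a quick sanity check of the filesystems (output omitted):
<code>
# the four data filesystems on the member you landed on
df -h /data/1 /data/2 /data/3 /data/4
# /home is meant for scripts only (10 MB quota per user)
df -h /home
</code>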
  
Replication duplicates the data on the primary member so that it stays in sync with the data on the secondary member.  In the event of a fail over, the contents are available on short notice. If a file is corrupt or missing on the primary, it will be so on the secondary following the next replication event. The secondary members within each pair nightly replicate any new data from the primary via a pulling action. The frequency of "pulling" is described as nightly but, depending on the volume of new data/changes, can span more than 24 hours. This happens for all four data filesystems within each pair.
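Conceptually, the nightly pull behaves like an rsync run on the secondary member; the sketch below is purely illustrative (the exact invocation and options used by the replication engine are not documented here):
<code>
# run on the secondary: pull one data filesystem from the primary, propagating deletions
rsync -a --delete rstoresrv0.wesleyan.edu:/data/1/ /data/1/
</code>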
  
Only the shares located on the third data filesystem (/data/3), in addition to replication, will perform snapshots. This happens on both primary and secondary, locally, from, for example, /data/3/share_name/ to /data/4/snapshots/share_name/, and also happens nightly.  Snapshots are point-in-time copies of the share contents. Snapshots will be kept for: 6 daily, 4 weekly, 2 monthly. Thus if a file is deleted or corrupt it can be restored.  This only happens for shares on /data/3 while we can sustain such disk usage.
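Restoring a file is then just a copy out of the snapshot area back into the live share; a sketch, where the ''daily.0'' subfolder and file names are hypothetical (contact us for the actual snapshot layout):
<code>
# copy a deleted file back from the most recent daily snapshot into the live share
cp -a /data/4/snapshots/share_name/daily.0/projects/notes.txt /data/3/share_name/projects/
</code>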
  
Note: Because of replication, large reorganizations of content areas (250+ GB or 10,000+ files) cause a lot of deletions first, then recopying.  Please describe in detail what needs to be moved where, and we can perform those actions for you on both primary and secondary members, avoiding the traffic.
  
Note: For very large filesystems whose contents do not change, the replication actions can take a long time and are typically unnecessary after the first copy event.  Share owners can control what gets replicated, or not, by staging a file in the top level share folder named rsync.incl or rsync.excl.  These files contain lines of absolute paths to the folders to be either excluded or included during replication. For example: ''/data/1/share_name/projects/2005'' (do not use weird characters or spaces). An include file replicates only the listed folders (preferred method), while an exclude file replicates the whole share but skips the listed folders (less efficient). See the sketch below for an example.

Note: During an initial seeding of share contents, or when adding large volumes of new content, it is possible to skip replication, avoiding any race conditions.  If a file named rsync.skiprepl is found in the top level share folder, replication will not be performed. Similarly, a file named rsync.skipsnap prevents snapshots from happening.
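For example, to replicate only two project folders and to pause replication/snapshots while a big upload is in progress (share and folder names are placeholders):
<code>
# include file: only these folders will be replicated
cat > /data/1/share_name/rsync.incl <<EOF
/data/1/share_name/projects/2005
/data/1/share_name/projects/2006
EOF

# skip replication and snapshots while a large copy is in progress
touch /data/1/share_name/rsync.skiprepl /data/1/share_name/rsync.skipsnap

# re-enable them once the copy is done
rm /data/1/share_name/rsync.skiprepl /data/1/share_name/rsync.skipsnap
</code>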
  
=== Is my content safe? ===
  * When the content was transferred from Flexstorage to Rstore, the ''-c'' option of rsync was used, meaning a checksum comparison was performed guaranteeing that both copies of a file have the same unique fingerprint.
  * Once copied to the Rstore disk arrays, the content is protected with redundant array of independent disks technology, in our case RAID 60. This technique stores redundant parity information so that the data on failed disks can be rebuilt.  RAID 60 can withstand two simultaneous disk failures per RAID 6 set.
  * RAID cards managing the arrays continually scrub and test the disks for signs of probable near-future failure events, thus reducing actual failures.
  * In the event of a catastrophic failure of the primary member in a pair, the content is available on the secondary member in a state defined by the last replication event.
  * In case of an accidental deletion of content in shares on /data/3 (ONLY), there will be snapshots available for restoration (daily, weekly, monthly).
  * Both primary/secondary members reside in the same data center but not in the same rack, location-wise (racks still to be purchased).
  * Both primary/secondary members are dual-source powered (utility power and enterprise UPS).
  * Upon deployment of your share, checksums will be calculated for each file on both primary and secondary members.  These "hashdeep" signatures (file size, two different checksums (MD5, SHA-256), and absolute path to file) will be stored in a text file.  That text file can be used in an audit to find out if anything about these files has changed, and a report can be provided; see the sketch after this list. (Whether this gets automated will need to be assessed in the future.)
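A minimal sketch of how such a baseline and audit can be produced with the hashdeep tool (the baseline file name and share path are placeholders):
<code>
# record size, MD5 and SHA-256 signatures for every file in the share
hashdeep -c md5,sha256 -r /data/1/SHARE_NAME > SHARE_NAME.hashdeep.txt

# later, audit the share against that baseline and report any differences
hashdeep -c md5,sha256 -r -a -k SHARE_NAME.hashdeep.txt /data/1/SHARE_NAME
</code>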
  
=== How do I test? ===
  
[TEMPLATE]

Hello,

Your Rstore share has been created. Although most folks will connect to their shares via Samba, SSH/SFTP access is also allowed. Please consult the details at https://dokuwiki.wesleyan.edu/doku.php?id=cluster:135 because the "paths" differ between the two access modes.

In general, you can access your share in the following manner on your desktop:

For Windows, map a network drive
<code>
\\service_address\share_name
</code>

For Macs, mount the share (Finder: Go > Connect to Server)
<code>
smb://service_address/share_name
</code>
  * Your share's service address: rstore[0|2].wesleyan.edu
  * Your share's location: /data/[1-4]
  * Your share's name: SHARE_NAME (owner SHARE_OWNER)
  * Your share's group name: SHARE_GROUP_NAME
  * Your share's default permissions are:
    * share owner rwx, share group rwx, others none
    * share members own their own folders and files they create
    * share members have group access to folders and files created
      * This default policy can be changed to a certain extent
  * Your share's quota: X G (used Y G)
  * Please let us know if you wish to receive nightly replication/snapshot reports.

The Rstore Team

[TEMPLATE]
  
Note: UID/GID name space. Every user has a uid/gid identification number which determines permissions on files and directories.  For this to work properly, they should be the same everywhere; hence we are adopting the AD-assigned uid/gid combinations.  You can find out your primary uid/gid with the unix command ''id username'' on the servers rintintin.wesleyan.edu and chloe.wesleyan.edu. The shares will have the proper share owner uids, but we force gids to the share's group designation (which will be created in AD for you). The access methods differ slightly in this regard, and it is important to understand them.  If you use a Samba mount, uids/gids are set properly; Samba access is the easiest way to keep everything consistent.
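For example (the username and numeric ids below are made up):
<code>
$ id username
uid=12345(username) gid=100(s15) groups=100(s15),20001(SHARE_GROUP_NAME)
</code>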
  
  * SSH access
    * open any ssh client and connect to the service address; you end up in ''/home/username''
    * then change directories (''cd /location/to/share'') to your share location
    * view contents (''ls -l''), etc.
      * These are not computational servers, but running maintenance scripts etc. is allowed
      * If you will be running programs that take a long time, please notify us
  * Your uid/gid comes from AD, but your gid may not be the share group gid (for example, it can be ''s15'')
    * to force the correct gid, type ''newgrp SHARE_GROUP_NAME'' (see the sketch below)
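A typical SSH session might look like this (the share location, name and group are the placeholders from the template above):
<code>
ssh username@rstore0.wesleyan.edu
cd /data/1/SHARE_NAME           # your share's location and name
newgrp SHARE_GROUP_NAME         # switch to the share's group gid
ls -l                           # check contents and group ownership
</code>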
  
  * Samba access (Preferred)
   * Windows users, map a network drive and use your AD credentials
     <code>\\service_address.wesleyan.edu\SHARE_NAME</code>
   * Mac users, mount a samba share
     <code>smb://service_address.wesleyan.edu/SHARE_NAME</code>
   In both cases, the uid/gid will be forced by Samba to be correct.
  
  * NFS (only for HPCC environment)
    * On the high performance compute cluster head nodes (the "tails")
      * Shares are mounted at /home/SHARE_NAME
      * Like SSH access, you may have to change your group gid

  * CIFS (mount the Samba share on remote clients)
    * This can be done, but the remote client is responsible for proper uid/gid settings
    * SFTP/SCP is probably easier (see the sketch below)
    * Via /etc/fstab: <code>//rstoresrv0.wesleyan.edu/SHARE_NAME /mnt/foo  cifs username=WESLEYAN/username,password=password,domain=wesleyan </code>
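For a one-off mount or copy from a Linux client, something like this should work (the mount point and local paths are placeholders; the CIFS mount requires the cifs-utils package):
<code>
# one-time CIFS mount from the command line
sudo mkdir -p /mnt/foo
sudo mount -t cifs //rstore0.wesleyan.edu/SHARE_NAME /mnt/foo -o username=username,domain=WESLEYAN

# or just copy data over SSH without mounting anything
scp -r ./results username@rstore0.wesleyan.edu:/data/1/SHARE_NAME/
</code>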
  
\\