  
===== Structure and History of HPCC =====

As promised at the CLAC HPC Mindshare event at Swarthmore College, Jan 2020, here are the Funding and Priority Policies with some context around them.
  
==== History ====
  
In 2006, 4 Wesleyan faculty members approached ITS with a proposal to centrally manage a high performance computing center (HPCC), seeding the effort with an NSF grant (about $190K). ITS offered 0.5 FTE for a dedicated "hpcadmin". An Advisory Group was formed by these faculty plus the hpcadmin (5 members, not necessarily our current "power users"). Another NSF grant award was added in 2010 (about $105K). An alumni donation followed in 2016 (about $10K). In 2018 the first instance of "faculty startup monies" was contributed to the HPCC (about $92K, see "Priority Policy" below). In 2019, a TrueNAS/ZFS appliance was purchased (about $40K, [[cluster:186|Home Dir Server]]) followed in 2020 by a GPU expansion project (about $96K, [[cluster:181|2019 GPU Models]]). The latter two were self-funded expenditures, see "Funding Policy" below. To view the NSF grants visit [[cluster:169|Acknowledgement]].

The Advisory Group meets with the user base yearly during the reading week of the Spring semester (early May), before everybody scatters for the summer. At this meeting, the hpcadmin reviews the past year and previews the coming year, and the user base contributes feedback on progress and problems.
  
==== Structure ====
  
The Wesleyan HPCC is part of the **Scientific Computing and Informatics Center** ([[https://www.wesleyan.edu/scic/|SCIC]]). The SCIC project leader is appointed by the Director of the **Quantitative Analysis Center** ([[https://www.wesleyan.edu/qac/|QAC]]). The Director of the QAC reports directly to the Associate Provost. The hpcadmin reports directly to the ITS Deputy Director and indirectly to the QAC Director.
  
The QAC has an [[https://www.wesleyan.edu/qac/apprenticeship/index.html|Apprenticeship]] Program in which students are trained in Linux and several programming languages of their choice, among other options (like SQL or GIS). From this pool of students, the hope is that some become the QAC and SCIC help desk staff and tutors.
  
==== Funding Policy ====
  * >3125 = 0.000048
A cpu usage of 3,125,000 hours/year would cost $2,400.00 \\
A gpu hour of usage is 3x the cpu hourly rate. \\
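The charge-back above can be sketched as a small calculation. Only the top-tier rate ($0.000048 per cpu hour above 3,125 thousand hours/year) and the "gpu hour = 3x the cpu hourly rate" rule come from this page; the function name is made up and the flat-rate simplification below omits the lower tiers of the schedule, so this is a sketch, not the actual billing script.

```python
# Hypothetical sketch of the HPCC usage charge-back. Only the top-tier
# cpu rate and the 3x gpu multiplier come from the policy text; the
# lower tiers of the rate schedule are not modeled here (assumption).
TOP_TIER_CPU_RATE = 0.000048  # dollars per cpu hour (top tier, >3125K hours/year)
GPU_MULTIPLIER = 3            # a gpu hour is billed at 3x the cpu hourly rate

def estimate_charge(cpu_hours, gpu_hours=0, cpu_rate=TOP_TIER_CPU_RATE):
    """Flat-rate estimate of a yearly charge in dollars
    (tiers below the top bracket are not modeled)."""
    return cpu_hours * cpu_rate + gpu_hours * cpu_rate * GPU_MULTIPLIER

print(round(estimate_charge(1_000_000), 2))          # cpu hours only
print(round(estimate_charge(1_000_000, 10_000), 2))  # cpu plus gpu hours
```

Because the gpu multiplier applies to the cpu hourly rate, 10,000 gpu hours cost the same as 30,000 cpu hours at the same tier.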
  
We currently have about 1,450 physical cpu cores, 60 gpus, 520 GB of gpu memory and 8,560 GB of cpu memory, provided by about 120 compute nodes and login nodes. Scratch spaces are provided local to compute nodes (2-5 TB) or over the network via NFS (55 TB). Home directories are under quota (10 TB) but these quotas will disappear in the future with the TrueNAS/ZFS appliance (190 TB, 475 TB effective assuming a compression rate of 2.5x). A guide can be found here: [[cluster:126|Brief Guide to HPCC]], and the software is located here: [[cluster:73|Software Page]].
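The effective-capacity figure quoted for the appliance is just the raw capacity scaled by the assumed compression ratio; a quick check of the arithmetic:

```python
# Check the effective-capacity figure for the TrueNAS/ZFS appliance:
# 190 TB raw, with the page's assumed 2.5x compression ratio.
raw_tb = 190
compression_ratio = 2.5  # assumption stated on this page, not a measurement
effective_tb = raw_tb * compression_ratio
print(effective_tb)  # 475.0
```

The real ratio depends on how compressible the stored data turns out to be, so the 475 TB figure is an estimate.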
  
  
==== Priority Policy ====
  
This policy was put in place about 3 years ago to deal with issues surrounding infusions of new monies, for example new faculty "startup monies", new grant monies, or donations to the HPCC.
  
There are a few principles in this Priority Access Policy:
  - Priority access is granted for 3 years starting at the date of deployment (user access).
  - Only applies to newly purchased resources, which should be under warranty in the priority period.

**The main objective is to build an HPCC community resource for all users with no (permanent) special treatment of any subgroup.**
  
The first principle implies that all users have access to the new resources immediately when deployed. Root privilege is for the hpcadmin only; sudo privilege may be used if/when necessary to achieve some purpose. The hpcadmin will maintain the new resource(s), while configuration of the new resource(s) will be done by consent of all parties involved. Final approval by the Advisory Group initiates deployment activities.
cluster/189.txt · Last modified: 2024/02/12 16:47 by hmeij07