Usage Survey (circa early Nov 2006)

Brief synopsis of emerging themes
  • some commercial software will have to be bought outside of the grant: Matlab, Linda and Portland compilers
  • most current code is “coarse grain” parallel (meaning a big problem is split into tiny independent pieces) rather than “fine grain” parallel (meaning the code itself is actually parallel, implying internode communication)
  • work node requirements in general top out at 32 nodes per job, each node with 1 gb ram per cpu (2 gb / node)
  • some work nodes need a larger memory footprint: perhaps up to 4 nodes with dual quad-core cpus (16-32 gb ram per node)
  • 64-bit compilations will be required with 32-bit backward compatibility for the selected operating system
  • few jobs will be data intensive, most will be cpu intensive
  • similarly, few jobs will be integer intensive, most will be floating point intensive
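The memory figures in the synopsis can be turned into rough aggregate numbers. The following is only a back-of-the-envelope sketch: the constants are copied from the bullets above, the two CPUs per standard node are inferred from "1 gb ram per cpu (2 gb / node)", and the function names are hypothetical.

```python
# Figures copied from the synopsis bullets; nothing here is measured.
MAX_NODES_PER_JOB = 32          # "top out at 32 nodes per job"
RAM_PER_CPU_GB = 1              # "1 gb ram per cpu"
CPUS_PER_NODE = 2               # inferred from "2 gb / node"

LARGE_MEM_NODES = 4             # "perhaps up to 4 nodes"
LARGE_MEM_RANGE_GB = (16, 32)   # "16-32 gb ram per node"
LARGE_MEM_CORES_PER_NODE = 8    # dual quad-core cpus


def standard_job_ram_gb(nodes=MAX_NODES_PER_JOB):
    """Aggregate RAM available to one job spread over standard work nodes."""
    return nodes * CPUS_PER_NODE * RAM_PER_CPU_GB


def large_mem_pool_gb():
    """(low, high) total RAM across all large-memory nodes."""
    low, high = LARGE_MEM_RANGE_GB
    return LARGE_MEM_NODES * low, LARGE_MEM_NODES * high


def large_mem_ram_per_core_gb():
    """(low, high) RAM per core on a single large-memory node."""
    low, high = LARGE_MEM_RANGE_GB
    return low / LARGE_MEM_CORES_PER_NODE, high / LARGE_MEM_CORES_PER_NODE


if __name__ == "__main__":
    print("largest standard job:", standard_job_ram_gb(), "GB")
    print("large-memory pool:", large_mem_pool_gb(), "GB")
    print("large-memory RAM per core:", large_mem_ram_per_core_gb(), "GB")
```

Under these assumptions the largest standard job sees 64 GB in total, while the four large-memory nodes together offer between 64 and 128 GB.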
Some configuration scenarios
  • one head node for ssh access (also runs the scheduler)
  • one I/O node (no user access), mounts home and scratch filesystems from SAN via fiber
  • work nodes NFS mount filesystems from I/O node
  • up to 60 work nodes, lightweight, little on-board disk space
    • 16 of 60 work nodes connected via infiniband switch for low latency
    • all other work nodes connected via gigabit ethernet
    • 4 of those other work nodes with large memory footprint
  • Red Hat Enterprise Linux AS 4 (2.6 kernel for large memory access), perhaps CentOS?
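The work node split described in this scenario can be written down as a small inventory to check that the counts add up. This is a minimal sketch only: the node names (`node01` and so on) and the choice of which gigabit nodes carry the large memory are hypothetical, and only the counts come from the list above.

```python
from collections import Counter

# Counts taken from the configuration scenario above.
TOTAL_WORK_NODES = 60
INFINIBAND_NODES = 16
LARGE_MEMORY_NODES = 4   # drawn from the gigabit ethernet nodes


def build_inventory():
    """Return one dict per work node; names are placeholders."""
    nodes = []
    for i in range(TOTAL_WORK_NODES):
        interconnect = "infiniband" if i < INFINIBAND_NODES else "gigabit"
        nodes.append({
            "name": f"node{i + 1:02d}",
            "interconnect": interconnect,
            "large_memory": False,
        })
    # Arbitrary choice for illustration: the last four nodes, which are
    # all on gigabit ethernet, get the large memory footprint.
    for node in nodes[-LARGE_MEMORY_NODES:]:
        node["large_memory"] = True
    return nodes


def summarize(nodes):
    """Count nodes per (interconnect, large_memory) combination."""
    return Counter((n["interconnect"], n["large_memory"]) for n in nodes)


if __name__ == "__main__":
    for (interconnect, large), count in sorted(summarize(build_inventory()).items()):
        label = "large memory" if large else "standard memory"
        print(f"{count:2d} nodes: {interconnect}, {label}")
```

With these counts, 16 nodes sit on infiniband and 44 on gigabit ethernet, of which 4 are large-memory nodes.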

QUESTION: what about the speed and size of the scratch filesystem? Is NFS ok, or do we need a global/parallel/clustered filesystem? Current thinking is to have one massive scratch space and another space for home directories, both served from the SAN underneath all nodes. Basically we fiber mount both on one node, the I/O node, and NFS mount from there to all the others.

Generic responses of similar users.

What applications do you plan to run on the cluster? Are they home grown or third party? Please list some typical application names. | Accelrys InsightII | Matlab $$$ | Matlab and homegrown applications. AMBER, NAMD C code. Also Gromacs, CHARMM, AMBER | “Gaussian”
For each application, is the application written in C, C++, or f90? Any particular compiler or packages you would need to use? | mex, mcc | C, C++ or Fortran $$$ | C Fortran + Linda $$$
Is your code parallel? | no | yes | 10% | yes
Does the application require high interprocessor communication (i.e. does your code benefit from using a low latency interconnect, or are these independent calculations that need to run on many processors)? | yes, low latency required | low latency, yes | 4 8-core nodes + SMP
Does the application use MPI? | yes | yes | no (Linda)
If MPI, how many nodes can the code take advantage of, with what interconnects (IB, Myrinet or GE), and what is the memory requirement per core? | typical 16 to 32 nodes (scales to 100's), myrinet, 1 GB per node | <200MB | 4 nodes with 2 quad-core processors and 16 - 32 GB ram per node
Does the application use OpenMP? If yes, how many CPUs does the application scale to? Does it use large amounts of shared memory? How large? | no | up to 32 cpus (6 nodes), up to 128 gb ram
Is the application data intensive or CPU intensive? | both | some data, some cpu | cpu | cpu | cpu, some arrays > 100 gb
Is the application 64-bit? | yes | no | 32 or 64 | 32 or 64 | 64
Is the application integer or FP intensive? | some, FP | yes | FP | FP | FP
RAM needed per processor? Amount of storage space required? | 1GB (storage or ram?) | 512 MB/cpu, 2 GB storage/run | Usually <200 MB, some 4 GB (ram?) | min 2 gb/core (?), scratch files 1,000 GB
Which scheduling software does the user prefer: LSF, PBS or SGE? | any | lsf or pbs, no pref


cluster/3.txt · Last modified: 2017/02/14 14:07 by hmeij07