Hadoop Summary

Our production Hadoop cluster is based on Cloudera's CDH3u6 repository. Here are some details:

  • namenode (also the login node): whitetail.wesleyan.edu
  • resources: 600 GB of memory and 1.75 TB of Hadoop Distributed File System (HDFS) storage
    • could be doubled in the near future if needed
  • HDFS is not backed up!
  • You must request a writable work area, /userdata/username
  • Be sure to download your results to /home/username, since HDFS is not backed up
  • Data to be shared (dictionaries, anagrams, etc.) can be posted in /shareddata
    • ask to have such items posted there
  • Basic tools (request other tools to be installed)
    • shell scripting
    • python
    • perl (Hadoop::Streaming)
    • R+RHadoop (rmr2, rhdfs, rhbase)
    • HBase (NoSQL database)
    • MySQL
      • request a database to be set up
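
Since python is among the basic tools, a word count job for Hadoop Streaming could be sketched as below. This is only an illustration, not a script installed on the cluster: the file name wordcount.py, the function names, and the "map"/"reduce" command-line argument are all assumptions made for the example.

```python
#!/usr/bin/env python
"""Word count sketch for Hadoop Streaming (hypothetical file: wordcount.py).

Run with the argument "map" as the mapper and "reduce" as the reducer.
"""
import sys
from itertools import groupby


def map_lines(lines):
    """Emit one 'word<TAB>1' record per whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield "%s\t1" % word.lower()


def reduce_lines(lines):
    """Sum the counts per word.

    Assumes the input is sorted by key, which Hadoop Streaming does
    between the map and reduce phases.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield "%s\t%d" % (word, sum(int(count) for _, count in group))


if __name__ == "__main__":
    # Do nothing unless a phase is named explicitly on the command line.
    steps = {"map": map_lines, "reduce": reduce_lines}
    if len(sys.argv) == 2 and sys.argv[1] in steps:
        for record in steps[sys.argv[1]](sys.stdin):
            print(record)
```

Such a script would be passed to the Hadoop Streaming jar through its -mapper and -reducer options, with input and output paths under your /userdata/username work area. The location of the streaming jar depends on the installation and is not given here. The logic can be checked locally first by piping a text file through the map step, sort, and the reduce step.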

Other useful pages

cluster/121.1379339093.txt.gz · Last modified: 2013/09/16 09:44 by hmeij