cluster:121

Differences

This shows you the differences between two versions of the page.

cluster:121 [2013/09/11 19:43]
hmeij [Hadoop Summary]
cluster:121 [2013/09/16 15:09] (current)
hmeij
Line 4: Line 4:
 ==== Hadoop Summary ====
  
-Our production Hadoop Cluster is based on [[http://www.cloudera.com/content/cloudera/en/home.html|Cloudera]]'s CD3U6 repository.  Here are some details:
 +Our production Hadoop Cluster is based on [[http://www.cloudera.com/content/cloudera/en/home.html|Cloudera]]'s CD3U6 repository.  Here are some details :
  
-  * namenode (that is login node) is whitetail.wesleyan.edu (our first mammal)
-    * whitetail also runs the Hadoop scheduler and Health Monitor
 +  * namenode (that is login node)whitetail.wesleyan.edu
 +    * whitetail also runs the Hadoop Scheduler and Health Monitor
 +      *  [[http://whitetail.wesleyan.edu:50070|Health Status]] 
 +      *  [[http://whitetail.wesleyan.edu:50030|Job Tracker]]
     * ssh to it directly or from any of our other tails
   * resources: access to 600 GB of memory and 1.75 TB of Hadoop's Distributed File System (HDFS)
     * could be doubled in near future if needed
-  * Some useful links 
-    * Health Status [[http://whitetail.wesleyan.edu:50070]] 
-    * Job Tracker [[http://whitetail.wesleyan.edu:50030]] 
  
 +  * HDFS is not backed up!
 +    * You must request a writable work area /userdata/username
 +    * Be sure to download your results to /home/username (that is the regular filesystem)
  
 +  * Data to be shared (dictionaries, anagrams, etc.) can be posted in /shareddata
 +    * request such items to be posted there
  
 +  * Basic tools (request other tools to be installed)
 +    * shell scripting
 +    * python
 +    * perl (Hadoop::Streaming)
 +    * java (both Oracle in /usr/java and openJDK)
 +    * R+RHadoop (rmr2, rhdfs, rhbase)
 +    * HBase (NoSQL database)
 +      * [[http://whitetail.wesleyan.edu:60010|Master & Zookeepers]]
 +      * [[http://whitetail.wesleyan.edu:9095|Thrift server]]
 +    * MySQL 
 +      * request a database to be set up for you (limited space)
 +
 +  * Note: the permissions are a bit weird in HDFS but I think it is sorted out.
 +    * If this turns into a problem we'll let everybody run as user hdfs ...
 +  * Note: some http links will not work because they point to the private network
 +    * If you wish to view them launch firefox from whitetail ...
  
 Other useful pages
Line 23: Line 43:
   * [[cluster:115|Use Hadoop Cluster]]
  
 +\\ 
 +**[[cluster:0|Back]]**
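
Since the revision above notes that HDFS is not backed up and that results should be copied to /home/username, here is a minimal Python sketch of that copy step. It assumes the `hadoop` command is on the PATH on whitetail and that the work area /userdata/username has already been granted. The function names and the example path `wordcount-out` are hypothetical, chosen only for illustration.

```python
import posixpath
import subprocess


def build_get_command(username, relative_path):
    """Build the `hadoop fs -get` argument list for one result path."""
    # HDFS work area named on this page (assumed to have been requested already).
    hdfs_source = posixpath.join("/userdata", username, relative_path)
    # Destination on the regular filesystem named on this page.
    local_target = posixpath.join("/home", username, relative_path)
    return ["hadoop", "fs", "-get", hdfs_source, local_target]


def download_results(username, relative_path):
    """Run the copy; raises subprocess.CalledProcessError if hadoop exits non-zero."""
    subprocess.check_call(build_get_command(username, relative_path))
```

For example, `download_results("username", "wordcount-out")` would run `hadoop fs -get /userdata/username/wordcount-out /home/username/wordcount-out`. Keeping the command construction separate from its execution makes it possible to inspect the paths before anything is copied.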
cluster/121.1378928606.txt.gz · Last modified: 2013/09/11 19:43 by hmeij