cluster:121

Differences

This shows you the differences between two versions of the page.

cluster:121 [2013/09/11 14:15]
hmeij
cluster:121 [2013/09/16 15:09] (current)
hmeij
Line 4: Line 4:
 ==== Hadoop Summary ====
  
-Our production Hadoop CLuster is based on [[http://www.cloudera.com/content/cloudera/en/home.html|Cloudera]]'s CD3U6 repository.  Here are some details:
 +Our production Hadoop Cluster is based on [[http://www.cloudera.com/content/cloudera/en/home.html|Cloudera]]'s CD3U6 repository.  Here are some details:
  
-  * namenode (that is login node) is whitetail.wesleyan.edu (our first mammal)
-    * whitetail also runs the Hadoop scheduler and Monitor
 +  * namenode (that is, the login node) is whitetail.wesleyan.edu
 +    * whitetail also runs the Hadoop Scheduler and Health Monitor
 +      * [[http://whitetail.wesleyan.edu:50070|Health Status]]
 +      * [[http://whitetail.wesleyan.edu:50030|Job Tracker]]
     * ssh to it directly or from any of our other tails
   * resources: access to 600 GB of memory and 1.75 TB of Hadoop's Distributed File System (HDFS)
-  Some useful links
-    * Health Status [[http://whitetail.wesleyan.edu:50070]]
-    * Job Tracker [[http://whitetail.wesleyan.edu:50030]]
 +    could be doubled in the near future if needed
  
 +  * HDFS is not backed up!
 +    * You must request a writable work area, /userdata/username
 +    * Be sure to download your results to /home/username (that is, the regular filesystem)
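
As an illustration of the copy-out step above, here is a minimal Python sketch that builds the `hadoop fs -get` command for moving a path from the HDFS work area to the regular filesystem. The helper name `hdfs_get_command` and the `results` directory are hypothetical, and the sketch assumes /userdata/username is the HDFS-side path and /home/username the local one.

```python
def hdfs_get_command(username, relative_path):
    """Build the argument list for copying one path out of HDFS.

    Assumption: the HDFS work area is /userdata/<username> and the
    destination is the same relative path under /home/<username>.
    """
    src = "/userdata/%s/%s" % (username, relative_path)
    dst = "/home/%s/%s" % (username, relative_path)
    # `hadoop fs -get <src> <localdst>` copies from HDFS to the local filesystem.
    return ["hadoop", "fs", "-get", src, dst]


if __name__ == "__main__":
    # "username" and "results" are placeholders, not real names on the cluster.
    print(" ".join(hdfs_get_command("username", "results")))
```

The sketch only builds the command so it can be inspected first; on whitetail the list could then be handed to `subprocess.check_call` to run the copy.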
  
 +  * Data to be shared (dictionaries, anagrams, etc.) can be posted in /shareddata
 +    * request such items to be posted there
  
 +  * Basic tools (request other tools to be installed)
 +    * shell scripting
 +    * python
 +    * perl (Hadoop::Streaming)
 +    * java (both Oracle in /usr/java and OpenJDK)
 +    * R+RHadoop (rmr2, rhdfs, rhbase)
 +    * HBase (NoSQL database)
 +      * [[http://whitetail.wesleyan.edu:60010|Master & Zookeepers]]
 +      * [[http://whitetail.wesleyan.edu:9095|Thrift server]]
 +    * MySQL 
 +      * request a database to be set up for you (limited space)
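
To make the python entry in the tools list concrete, here is a rough sketch of a Hadoop Streaming job that groups anagrams from a word list, such as one posted in /shareddata. The script name `anagram.py`, the `map`/`reduce` argument convention, and the function names are all hypothetical; the code is written to run on Python 2.7 or later, which is an assumption about what is installed.

```python
import sys
from itertools import groupby


def map_words(lines):
    """Mapper: emit 'sorted-letters<TAB>word' for every word in the input."""
    for line in lines:
        for word in line.split():
            word = word.lower()
            yield "%s\t%s" % ("".join(sorted(word)), word)


def reduce_anagrams(lines):
    """Reducer: input arrives sorted by key; join the words sharing a key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for key, group in groupby(pairs, key=lambda kv: kv[0]):
        words = sorted(set(word for _, word in group))
        if len(words) > 1:  # a key with a single word has no anagram partner
            yield "%s\t%s" % (key, ",".join(words))


# Hypothetical convention: run as `anagram.py map` or `anagram.py reduce`.
if __name__ == "__main__" and len(sys.argv) == 2:
    step = {"map": map_words, "reduce": reduce_anagrams}[sys.argv[1]]
    for out in step(sys.stdin):
        print(out)
```

With the Hadoop Streaming jar (its location depends on the installation), the script would be shipped with `-file anagram.py` and named in the `-mapper` and `-reducer` options, with `-input` and `-output` pointing at HDFS paths. The same logic can be tried locally first by piping a word list through the map step, `sort`, and the reduce step.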
 +
 +  * Note: the permissions are a bit weird in HDFS, but I think they are sorted out.
 +    * If this turns into a problem we'll let everybody run as user hdfs ...
 +  * Note: some HTTP links will not work because they point to the private network
 +    * If you wish to view them, launch Firefox from whitetail ...
  
 Other useful pages
Line 22: Line 43:
   * [[cluster:115|Use Hadoop Cluster]]
  
 +\\ 
 +**[[cluster:0|Back]]**
cluster/121.1378908930.txt.gz · Last modified: 2013/09/11 14:15 by hmeij