cluster:115

Differences

This shows you the differences between two versions of the page.

cluster:115 [2013/05/28 09:54]
hmeij [Rhadoop]
cluster:115 [2013/09/10 15:04] (current)
hmeij [Rhadoop]
Line 2: Line 2:
 **[[cluster:0|Back]]** **[[cluster:0|Back]]**
  
-===== Use Hadoop (test) Cluster =====
+===== Use Hadoop Cluster =====
  
 [[cluster:114|Build Hadoop Cluster]] [[cluster:114|Build Hadoop Cluster]]
Line 275: Line 275:
 R CMD INSTALL rmr-2.2.0.tar.gz R CMD INSTALL rmr-2.2.0.tar.gz
 R CMD INSTALL rhdfs_1.0.5.tar.gz R CMD INSTALL rhdfs_1.0.5.tar.gz
 +</code>
 +
 +Verify
 +
 +<code>
 +Type 'q()' to quit R.
 +
 +> library(rmr2)
 +Loading required package: Rcpp
 +Loading required package: RJSONIO
 +Loading required package: digest
 +Loading required package: functional
 +Loading required package: stringr
 +Loading required package: plyr
 +Loading required package: reshape2
 +> library(rhdfs)
 +Loading required package: rJava
 +
 +HADOOP_CMD=/usr/bin/hadoop
 +
 +Be sure to run hdfs.init()
 +> sessionInfo()
 +R version 3.0.0 (2013-04-03)
 +Platform: x86_64-redhat-linux-gnu (64-bit)
 +
 +locale:
 + [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 + [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 + [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 + [7] LC_PAPER=C                 LC_NAME=C
 + [9] LC_ADDRESS=C               LC_TELEPHONE=C
 +[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
 +
 +attached base packages:
 +[1] stats     graphics  grDevices utils     datasets  methods   base
 +
 +other attached packages:
 + [1] rhdfs_1.0.5    rJava_0.9-4    rmr2_2.2.0     reshape2_1.2.2 plyr_1.8
 + [6] stringr_0.6.2  functional_0.4 digest_0.6.3   RJSONIO_1.0-3  Rcpp_0.10.3
 +
 </code> </code>
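
The interactive check above can also be scripted, for example to repeat it on every node. A minimal sketch, assuming ''Rscript'' is on the PATH; the function names ''r_load_expr'' and ''check_r_packages'' are hypothetical helpers, not part of rmr2 or rhdfs:

```shell
#!/bin/sh
# Build an R expression that loads each named package in turn.
# library() raises an error for a missing package, so Rscript exits non-zero.
r_load_expr() {
    r_expr=""
    for pkg in "$@"; do
        r_expr="${r_expr}library(${pkg}); "
    done
    printf '%s' "${r_expr}cat('ok\\n')"
}

# Run the check only when Rscript is available; otherwise report and skip.
# The return code 2 for "skipped" is an arbitrary choice for this sketch.
check_r_packages() {
    if command -v Rscript >/dev/null 2>&1; then
        Rscript -e "$(r_load_expr "$@")"
    else
        echo "Rscript not found, skipping check" >&2
        return 2
    fi
}

# Example: check_r_packages rmr2 rhdfs
```

Calling ''check_r_packages rmr2 rhdfs'' should end with ''ok'' when both packages load, and fail at the first package that does not.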
  
Line 294: Line 334:
 </code> </code>
  
 +Then HBase for rhbase:
  
 +[[http://hbase.apache.org/book/configuration.html]]
  
 +But first Thrift, the language interface to the HBase database:
 +
 +<code>
 +yum install openssl098e
 +</code>
 +
 +Download Thrift: [[http://thrift.apache.org/download/]]
 +
 +<code>
 +yum install byacc -y
 +yum install automake libtool flex bison pkgconfig gcc-c++ boost-devel libevent-devel zlib-devel python-devel ruby-devel
 +
 +./configure
 +make
 +make install
 +export PKG_CONFIG_PATH=$PKG_CONFIG_PATH:/usr/local/lib/pkgconfig/
 +pkg-config --cflags thrift
 +cp -p /usr/local/lib/libthrift-0.9.0.so /usr/lib/
 +
 +HBASE_ROOT/bin/hbase thrift start &
 +lsof -i:9090   # Thrift server port; port 9095 is the monitor
 +
 +</code>
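
The ''export PKG_CONFIG_PATH=...'' line above leaves an empty leading entry when the variable starts out unset, and appends a duplicate each time it is run again. A small sketch of one way to avoid both; ''path_append_once'' is a hypothetical helper name, not a pkg-config feature:

```shell
#!/bin/sh
# Append a directory to a colon-separated list only if it is not already there.
# Prints the resulting list and never emits an empty leading entry.
path_append_once() {
    list="$1"
    dir="${2%/}"          # drop one trailing slash from the new entry
    case ":${list}:" in
        *":${dir}:"*) printf '%s\n' "$list" ;;            # already present
        "::")         printf '%s\n' "$dir" ;;             # list was empty
        *)            printf '%s\n' "${list}:${dir}" ;;   # append at the end
    esac
}

PKG_CONFIG_PATH="$(path_append_once "${PKG_CONFIG_PATH:-}" /usr/local/lib/pkgconfig/)"
export PKG_CONFIG_PATH
```

After this, ''pkg-config --cflags thrift'' can be used as before to confirm that the Thrift ''.pc'' file is found.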
 +
 +Configure for distributed environment: [[http://hbase.apache.org/book/standalone_dist.html#standalone]]
 +
 +  * used 3 zookeepers with quorum, see config example online
 +  * start with rolling_restart, the start & stop have a timing issue
 +  * /hbase owned by root:root
 +  * permissions reset on /hdfs, not sure why
 +  * also use /sanscratch/zookeepers
 +  * some more notes below
 +
 +
 +<code>
 +
 +
 +install.packages('rJava')
 +install.packages("int64")
 +install.packages(c("Rcpp", "RJSONIO", "bitops", "digest", "functional", "stringr", "plyr", "reshape2"))
 +
 +wget http://cran.r-project.org/src/contrib/Archive/Rcpp/Rcpp_0.9.8.tar.gz
 +wget -O rmr-2.2.0.tar.gz http://goo.gl/bhCU6
 +wget -O rhdfs_1.0.5.tar.gz https://github.com/RevolutionAnalytics/rhdfs/blob/master/build/rhdfs_1.0.5.tar.gz?raw=true
 +
 +R CMD INSTALL Rcpp_0.9.8.tar.gz
 +R CMD INSTALL rmr-2.2.0.tar.gz
 +R CMD INSTALL rhdfs_1.0.5.tar.gz
 +R CMD INSTALL rhbase_1.2.0.tar.gz
 +
 +yum install openssl098e openssl openssl-devel flex boost ruby ruby-libs ruby-devel php php-libs php-devel \
 +automake libtool flex bison pkgconfig gcc-c++ boost-devel libevent-devel zlib-devel python-devel ruby-devel
 +
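 +# Boost (run in the unpacked Boost source directory)
 +b2 install --prefix=/usr/local
 +
 +# Thrift (run in the unpacked Thrift source directory)
 +./configure --prefix=/usr/local --with-boost=/usr/local; make
 +make install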
 +
 +cp -p /usr/local/lib/libthrift-0.9.0.so /usr/lib/
 +cd /usr/lib; ln -s libthrift-0.9.0.so libthrift.so
 +
 +SKIP (nasty replaced with straight copy, could go to nodes)
 +http://www.cpan.org
 +'o conf commit'
 +cpan> install Hadoop::Streaming 
 +
 +whitetail only, unpack hbase, edit conf/hbase-site.xml, add to /etc/rc.local
 +also edit conf/regionservers
 +copy /usr/local/hbase-version-dir to nodes:/usr/local
 +
 +  <property>
 +    <name>hbase.zookeeper.quorum</name>
 +    <value>example1,example2,example3</value>
 +    <description>Comma-separated list of servers in the ZooKeeper ensemble.
 +    </description>
 +  </property>
 +  <property>
 +    <name>hbase.zookeeper.property.dataDir</name>
 +    <value>/export/zookeeper</value>
 +    <description>Property from ZooKeeper's config zoo.cfg.
 +    The directory where the snapshot is stored.
 +    </description>
 +  </property>
 +
 +
 +</code>
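
The two ''<property>'' entries above can be generated from a host list instead of being typed by hand, which helps keep the same values on every node. A minimal sketch; ''join_hosts'' and ''hbase_property'' are hypothetical helpers, and the host names are the placeholders from the notes, not real machines:

```shell
#!/bin/sh
# Join host names with commas, the form used for hbase.zookeeper.quorum above.
join_hosts() {
    out=""
    for h in "$@"; do
        out="${out:+${out},}${h}"
    done
    printf '%s\n' "$out"
}

# Print one <property> element in the layout used by hbase-site.xml.
# Values are written as given; no XML escaping is attempted in this sketch.
hbase_property() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

hbase_property hbase.zookeeper.quorum "$(join_hosts example1 example2 example3)"
hbase_property hbase.zookeeper.property.dataDir /export/zookeeper
```

The output is meant to be pasted inside the ''<configuration>'' element of ''conf/hbase-site.xml''; it does not write the file itself.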
  
  
Line 487: Line 615:
  
 ==== Perl Hadoop::Streaming ==== ==== Perl Hadoop::Streaming ====
 +
 +  * All nodes
 +
  
   * [[http://search.cpan.org/~spazm/Hadoop-Streaming-0.122420/lib/Hadoop/Streaming.pm]]   * [[http://search.cpan.org/~spazm/Hadoop-Streaming-0.122420/lib/Hadoop/Streaming.pm]]
cluster/115.1369749269.txt.gz · Last modified: 2013/05/28 09:54 by hmeij