Hadoop Cluster Installation & HBase Experiment Environment Setup

Tags: hadoop cluster hbase | Posted: 2013-04-09 23:36 | Author: chessloveyou
Source: http://blog.csdn.net
1. Install the Ubuntu 10.04 operating system
Install and configure telnet
1) Install:
# apt-get install xinetd telnetd
2) After the installation succeeds (the system prompts accordingly), run sudo vi /etc/inetd.conf and add the following line:
telnet stream tcp nowait telnetd /usr/sbin/tcpd /usr/sbin/in.telnetd
3) Run sudo vi /etc/xinetd.conf and add the following:
# Simple configuration file for xinetd
#
# Some defaults, and include /etc/xinetd.d/

defaults
{

# Please note that you need a log_type line to be able to use log_on_success
# and log_on_failure. The default is the following :
# log_type = SYSLOG daemon info

instances = 60
log_type = SYSLOG authpriv
log_on_success = HOST PID
log_on_failure = HOST
cps = 25 30
}

includedir /etc/xinetd.d
4) Run sudo vi /etc/xinetd.d/telnet and add the following:
# default: on
# description: The telnet server serves telnet sessions; it uses \
# unencrypted username/password pairs for authentication.
service telnet
{
disable = no
flags = REUSE
socket_type = stream
wait = no
user = root
server = /usr/sbin/in.telnetd
log_on_failure  = USERID
}
5) Reboot the machine or restart the service: sudo /etc/init.d/xinetd restart
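
Once xinetd is back up, it is worth checking that the telnet service is actually listening (a quick sanity check; port 23 is telnet's default):
# netstat -tln | grep :23
# telnet localhost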

Install vim
sudo apt-get remove vim-common && sudo apt-get install vim

2. Configure update sources (not strictly necessary)
Configure the NetEase (163) mirror for Ubuntu 10.04:
vi /etc/apt/sources.list

deb http://mirrors.163.com/ubuntu/ lucid main universe restricted multiverse
deb-src http://mirrors.163.com/ubuntu/ lucid main universe restricted multiverse
deb http://mirrors.163.com/ubuntu/ lucid-security universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ lucid-security universe main multiverse restricted
deb http://mirrors.163.com/ubuntu/ lucid-updates universe main multiverse restricted
deb http://mirrors.163.com/ubuntu/ lucid-proposed universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ lucid-proposed universe main multiverse restricted
deb http://mirrors.163.com/ubuntu/ lucid-backports universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ lucid-backports universe main multiverse restricted
deb-src http://mirrors.163.com/ubuntu/ lucid-updates universe main multiverse restricted
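
After saving the file, refresh the package index so the new mirrors take effect:
sudo apt-get update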

3. Install Java 1.6
TIP: To fix the classic Ubuntu problem of the arrow keys typing letters in vi, reinstall vim as shown earlier:
sudo apt-get remove vim-common && sudo apt-get install vim

Installation steps:
1) Make the JDK installer executable:    # chmod +x jdk-6u43-linux-x64.bin
2) Run the installer:    # ./jdk-6u43-linux-x64.bin
3) Add the JAVA_HOME environment variable (adjust for the actual Java install location):
   [root@test src]# vi /etc/profile
   Append at the end:
   #set java environment
    export JAVA_HOME=/usr/java/jdk1.6.0_43
    export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
    export PATH=$PATH:$JAVA_HOME/bin
    export JAVA_HOME CLASSPATH PATH
   Save and exit.
4) Make the newly added environment variables take effect:
     # source /etc/profile
5) Create symlinks in /usr/bin (the JDK directory must match the installed version, jdk1.6.0_43 here):
        # cd /usr/bin
        # ln -s -f /usr/java/jdk1.6.0_43/jre/bin/java
        # ln -s -f /usr/java/jdk1.6.0_43/bin/javac
6) Verify on the command line:
root@ubuntu:/usr/bin# java -version
java version "1.6.0_43"
Java(TM) SE Runtime Environment (build 1.6.0_43-b01)
Java HotSpot(TM) 64-Bit Server VM (build 20.14-b01, mixed mode)
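
As an alternative to the hand-made symlinks in step 5, Ubuntu's update-alternatives can manage the java/javac links (a sketch, assuming the same install path as above):
# update-alternatives --install /usr/bin/java java /usr/java/jdk1.6.0_43/bin/java 1
# update-alternatives --install /usr/bin/javac javac /usr/java/jdk1.6.0_43/bin/javac 1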

Download Hadoop 1.0.4 (the stable release at the time): http://mirrors.cnnic.cn/apache/hadoop/common/

vi /etc/hosts
IP address plan:
192.168.123.11  master
192.168.123.12  slave1
192.168.123.13  slave2
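
After editing /etc/hosts on every machine, verify that the names resolve and the nodes can reach one another, for example from master:
$ ping -c 1 slave1
$ ping -c 1 slave2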

Install lrzsz (makes file transfer with SecureCRT convenient).
Add users:
Add the dedicated Hadoop system user hduser in group hadoop:
$ sudo addgroup hadoop
$ sudo adduser --ingroup hadoop hduser

Clone two more virtual machines (slave1 and slave2); this takes about 40-50 minutes.
4. Configure the SSH service
Run on every machine: ssh-keygen -t rsa -P ""
Configure SSH (Hadoop manages its nodes over SSH).
Master node:
hduser@master:~$ scp .ssh/authorized_keys root@slave1:/home/root
hduser@master:~$ scp .ssh/authorized_keys root@slave2:/home/root
Slave1 node:
root@ubuntu:/home/hduser# chown -R hduser:hadoop authorized_keys
hduser@ubuntu:~$ cat .ssh/id_rsa.pub >> authorized_keys
hduser@ubuntu:~$ scp authorized_keys hduser@slave2:/home/hduser
Slave2 node:
hduser@ubuntu:~$ cat .ssh/id_rsa.pub >> authorized_keys
hduser@ubuntu:~$ cp authorized_keys .ssh/
hduser@ubuntu:~$ scp authorized_keys hduser@master:/home/hduser/.ssh
hduser@ubuntu:~$ scp authorized_keys hduser@slave1:/home/hduser/.ssh
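
Before moving on, confirm that passwordless login works in every direction the start scripts will use (the master must also be able to reach itself):
hduser@master:~$ ssh slave1 hostname
hduser@master:~$ ssh slave2 hostname
hduser@master:~$ ssh master hostname
On systems that ship it, ssh-copy-id hduser@slave1 is a one-command alternative to the scp/cat steps above.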

This completes the SSH configuration.

5. Disable IPv6
vi /etc/sysctl.conf
在/etc/sysctl.conf中添加:
# disable ipv6
net.ipv6.conf.all.disable_ipv6  = 1
net.ipv6.conf.default.disable_ipv6  = 1
net.ipv6.conf.lo.disable_ipv6  = 1

Then reboot the machine.
Check whether IPv6 is disabled:
$ cat /proc/sys/net/ipv6/conf/all/disable_ipv6
0 means IPv6 is still enabled; 1 means it is disabled.
Alternatively, just stop Hadoop itself from using IPv6 by adding to hadoop-env.sh:
export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true
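
If a full reboot is inconvenient, the sysctl settings above can also be applied in place:
$ sudo sysctl -p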

6.Hadoop安装:
将上传到/home/root的文件修改所属为hduser(或者直接在hduser用户上传)
root@ubuntu:~# chown -R hduser:hadoop *.gz
(使用hduser用户)将Hadoopxx.xx.gz在/usr/local文件夹中解压
~$sudo tar xzf hadoop-1.0.4.tar.gz
This fails with:
hduser is not in the sudoers file.  This incident will be reported.
Fix:
1) Locate the sudoers file:
root@master:~# whereis sudoers
sudoers: /etc/sudoers.d /etc/sudoers /usr/share/man/man5/sudoers.5.gz
2) Make it writable (sudoers is read-only by default; restore it to mode 0440 afterwards):
root@slave2:~# chmod u+w /etc/sudoers
3) Grant hduser sudo rights:
root@slave2:~# vi /etc/sudoers
Add: hduser  ALL=(ALL:ALL) ALL
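
A safer route than loosening permissions on /etc/sudoers is visudo, which locks the file and syntax-checks the edit before saving:
root@slave2:~# visudo
then add the same hduser line there.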

Extract Hadoop:
With sudo now working, retry the extraction:
hduser@master:/usr/local$ sudo tar xzf hadoop-1.0.4.tar.gz
Rename the directory:
hduser@master:/usr/local$ sudo mv hadoop-1.0.4 hadoop
Change its ownership:
hduser@master:/usr/local$ sudo chown -R hduser:hadoop hadoop

Update $HOME/.bashrc (required on every machine):
# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop

# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/java/jdk1.6.0_43

# Some convenient aliases and functions for running Hadoop-related commands
unalias fs &> /dev/null
alias fs="hadoop fs"
unalias hls &> /dev/null
alias hls="fs -ls"

# If you have LZO compression enabled in your Hadoop cluster and
# compress job outputs with LZOP (not covered in this tutorial):
# Conveniently inspect an LZOP compressed file from the command
# line; run via:
#
# $ lzohead /hdfs/path/to/lzop/compressed/file.lzo
#
# Requires installed 'lzop' command.
#
lzohead () {
    hadoop fs -cat "$1" | lzop -dc | head -1000 | less
}

# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin
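
Reload the file so the new variables take effect in the current shell, and sanity-check that the hadoop script is found on the PATH:
$ source ~/.bashrc
$ hadoop version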

7. Hadoop configuration:
1) Edit conf/hadoop-env.sh
Set JAVA_HOME:
export JAVA_HOME=/usr/java/jdk1.6.0_43

2) On the master node only:
Set the contents of conf/masters to (in Hadoop 1.x this file lists the host that runs the SecondaryNameNode):
master

Set the contents of conf/slaves to (master is listed too, so it also runs a DataNode and a TaskTracker):
master
slave1
slave2

3) Edit conf/*-site.xml on all nodes.
conf/core-site.xml (all machines):
<property>
  <name>fs.default.name</name>
  <value>hdfs://master:54310</value>
  <description>The name of the default file system.  A URI whose
  scheme and authority determine the FileSystem implementation.  The
  uri's scheme determines the config property (fs.SCHEME.impl) naming
  the FileSystem implementation class.  The uri's authority is used to
  determine the host, port, etc. for a filesystem.</description>
</property>
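
Note that nothing here sets hadoop.tmp.dir, so (as the format output in section 8 shows) HDFS data ends up under /tmp/hadoop-hduser and is lost on reboot. A common addition to core-site.xml on all machines is a persistent base directory; /app/hadoop/tmp below is only an example path:
<property>
  <name>hadoop.tmp.dir</name>
  <value>/app/hadoop/tmp</value>
  <description>A base for other temporary directories.</description>
</property>
Create the directory first: sudo mkdir -p /app/hadoop/tmp && sudo chown hduser:hadoop /app/hadoop/tmp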

conf/mapred-site.xml (all machines):
<property>
  <name>mapred.job.tracker</name>
  <value>master:54311</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

conf/hdfs-site.xml (all machines):
<property>
  <name>dfs.replication</name>
  <value>2</value>
  <description>Default block replication.
  The actual number of replications can be specified when the file is created.
  The default is used if replication is not specified in create time.
  </description>
</property>


8. Format HDFS
1)hduser@master:/usr/local/hadoop$ bin/hadoop namenode -format
13/04/09 06:50:03 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/127.0.1.1
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
13/04/09 06:50:04 INFO util.GSet: VM type       = 64-bit
13/04/09 06:50:04 INFO util.GSet: 2% max memory = 19.33375 MB
13/04/09 06:50:04 INFO util.GSet: capacity      = 2^21 = 2097152 entries
13/04/09 06:50:04 INFO util.GSet: recommended=2097152, actual=2097152
13/04/09 06:50:05 INFO namenode.FSNamesystem: fsOwner=hduser
13/04/09 06:50:05 INFO namenode.FSNamesystem: supergroup=supergroup
13/04/09 06:50:05 INFO namenode.FSNamesystem: isPermissionEnabled=true
13/04/09 06:50:05 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
13/04/09 06:50:05 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
13/04/09 06:50:05 INFO namenode.NameNode: Caching file names occuring more than 10 times 
13/04/09 06:50:07 INFO common.Storage: Image file of size 112 saved in 0 seconds.
13/04/09 06:50:07 INFO common.Storage: Storage directory /tmp/hadoop-hduser/dfs/name has been successfully formatted.
13/04/09 06:50:07 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/127.0.1.1
************************************************************/
The HDFS name table is stored on the NameNode's local filesystem, in the directory given by dfs.name.dir; the NameNode uses it to track the filesystem metadata and, together with block reports, the data held on the DataNodes.

9. Testing
1. Start the cluster
1) Start the HDFS daemons: the NameNode on the master and DataNodes on the worker nodes (since master is also listed in conf/slaves, it runs a DataNode too).
2) Then start the MapReduce daemons: the JobTracker on the master and TaskTrackers on the worker nodes.

HDFS daemons:
Run bin/start-dfs.sh (on the master node):
hduser@master:/usr/local/hadoop$ bin/start-dfs.sh 
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-master.out
The authenticity of host 'master (127.0.1.1)' can't be established.
ECDSA key fingerprint is ec:c2:e2:5f:c7:72:de:4f:7a:c0:f1:e7:2b:eb:84:3f.
Are you sure you want to continue connecting (yes/no)? slave1: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-slave1.out
slave2: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-slave2.out

master: Host key verification failed.
The authenticity of host 'master (127.0.1.1)' can't be established.
ECDSA key fingerprint is ec:c2:e2:5f:c7:72:de:4f:7a:c0:f1:e7:2b:eb:84:3f.
Are you sure you want to continue connecting (yes/no)? yes
master: Warning: Permanently added 'master' (ECDSA) to the list of known hosts.
master: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-secondarynamenode-master.out
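
The host-key prompts and the "Host key verification failed" message above appear because the nodes' keys are not yet in known_hosts. They can be avoided by pre-populating the file before the first start (a sketch; assumes an ssh-keyscan that supports these key types):
hduser@master:~$ ssh-keyscan -t rsa,ecdsa master slave1 slave2 >> ~/.ssh/known_hosts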

On the master node, the Java processes should now look like this:
hduser@master:/usr/local/hadoop$ jps
3217 NameNode
3526 SecondaryNameNode
4455 DataNode
4697 Jps

The slave nodes should look like this:
hduser@slave2:/usr/local/hadoop/conf$ jps
3105 DataNode
3743 Jps

Run on the master node:
bin/start-mapred.sh
Then check a TaskTracker log (the master's own here, since it also runs a TaskTracker; the slaves have analogous logs):
hduser@master:/usr/local/hadoop/logs$ cat hadoop-hduser-tasktracker-master.log 
2013-04-09 07:27:15,895 INFO org.apache.hadoop.mapred.TaskTracker: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting TaskTracker
STARTUP_MSG:   host = master/127.0.1.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.4
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1393290; compiled by 'hortonfo' on Wed Oct  3 05:13:58 UTC 2012
************************************************************/
2013-04-09 07:27:17,558 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2013-04-09 07:27:17,681 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2013-04-09 07:27:17,683 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2013-04-09 07:27:17,684 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: TaskTracker metrics system started
2013-04-09 07:27:18,445 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2013-04-09 07:27:18,459 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2013-04-09 07:27:23,961 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2013-04-09 07:27:24,178 INFO org.apache.hadoop.http.HttpServer: Added global filtersafety (class=org.apache.hadoop.http.HttpServer$QuotingInputFilter)
2013-04-09 07:27:24,305 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1
2013-04-09 07:27:24,320 INFO org.apache.hadoop.mapred.TaskTracker: Starting tasktracker with owner as hduser
2013-04-09 07:27:24,327 INFO org.apache.hadoop.mapred.TaskTracker: Good mapred local directories are: /tmp/hadoop-hduser/mapred/local
2013-04-09 07:27:24,345 INFO org.apache.hadoop.util.NativeCodeLoader: Loaded the native-hadoop library
2013-04-09 07:27:24,397 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source jvm registered.
2013-04-09 07:27:24,406 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source TaskTrackerMetrics registered.
2013-04-09 07:27:24,492 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcDetailedActivityForPort39355 registered.
2013-04-09 07:27:24,493 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source RpcActivityForPort39355 registered.
2013-04-09 07:27:24,498 INFO org.apache.hadoop.ipc.Server: Starting SocketReader
2013-04-09 07:27:24,509 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2013-04-09 07:27:24,519 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 39355: starting
2013-04-09 07:27:24,519 INFO org.apache.hadoop.ipc.Server: IPC Server handler 1 on 39355: starting
2013-04-09 07:27:24,520 INFO org.apache.hadoop.ipc.Server: IPC Server handler 2 on 39355: starting
2013-04-09 07:27:24,521 INFO org.apache.hadoop.mapred.TaskTracker: TaskTracker up at: localhost/127.0.0.1:39355
2013-04-09 07:27:24,521 INFO org.apache.hadoop.mapred.TaskTracker: Starting tracker tracker_master:localhost/127.0.0.1:39355
2013-04-09 07:27:24,532 INFO org.apache.hadoop.ipc.Server: IPC Server handler 3 on 39355: starting
2013-04-09 07:27:24,533 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 39355: starting
2013-04-09 07:27:24,941 INFO org.apache.hadoop.mapred.TaskTracker: Starting thread: Map-events fetcher for all reduce tasks on tracker_master:localhost/127.0.0.1:39355
2013-04-09 07:27:24,998 INFO org.apache.hadoop.util.ProcessTree: setsid exited with exit code 0
2013-04-09 07:27:25,028 INFO org.apache.hadoop.mapred.TaskTracker:  Using ResourceCalculatorPlugin : org.apache.hadoop.util.LinuxResourceCalculatorPlugin@74c6eff5
2013-04-09 07:27:25,030 WARN org.apache.hadoop.mapred.TaskTracker: TaskTracker's totalMemoryAllottedForTasks is -1. TaskMemoryManager is disabled.
2013-04-09 07:27:25,034 INFO org.apache.hadoop.mapred.IndexCache: IndexCache created with max memory = 10485760
2013-04-09 07:27:25,042 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ShuffleServerMetrics registered.
2013-04-09 07:27:25,045 INFO org.apache.hadoop.http.HttpServer: Port returned by webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the listener on 50060
2013-04-09 07:27:25,046 INFO org.apache.hadoop.http.HttpServer: listener.getLocalPort() returned 50060 webServer.getConnectors()[0].getLocalPort() returned 50060
2013-04-09 07:27:25,046 INFO org.apache.hadoop.http.HttpServer: Jetty bound to port 50060
2013-04-09 07:27:25,046 INFO org.mortbay.log: jetty-6.1.26
2013-04-09 07:27:25,670 INFO org.mortbay.log: Started [email protected]:50060
2013-04-09 07:27:25,670 INFO org.apache.hadoop.mapred.TaskTracker: FILE_CACHE_SIZE for mapOutputServlet set to : 2000

At this point the master node should show:
hduser@master:/usr/local/hadoop$ jps
3217 NameNode
3526 SecondaryNameNode
4455 DataNode
5130 Jps
4761 JobTracker
4988 TaskTracker
hduser@master:/usr/local/hadoop$ 

And the slave nodes should show:
hduser@slave2:/usr/local/hadoop/conf$ jps
3901 TaskTracker
3105 DataNode
3958 Jps

To stop the MapReduce daemons, run on the master node:
hduser@master:/usr/local/hadoop$ bin/stop-mapred.sh 
stopping jobtracker
slave2: stopping tasktracker
master: stopping tasktracker
slave1: stopping tasktracker
The Java processes on the master afterwards:
hduser@master:/usr/local/hadoop$ jps
3217 NameNode
3526 SecondaryNameNode
4455 DataNode
5427 Jps
4761 JobTracker
4988 TaskTracker
And on the slave nodes afterwards:
hduser@slave2:/usr/local/hadoop/conf$ jps
3105 DataNode
4140 Jps
3901 TaskTracker

Stop the HDFS layer (on the master):
hduser@master:/usr/local/hadoop$ bin/stop-dfs.sh 
stopping namenode
slave1: stopping datanode
slave2: stopping datanode
master: stopping datanode
localhost: stopping secondarynamenode
hduser@master:/usr/local/hadoop$ jps
5871 Jps

The slave nodes at this point:
hduser@slave2:/usr/local/hadoop/conf$ jps
4305 Jps


10. MapReduce test:
hduser@master:/usr/local/hadoop$ mkdir /tmp/test
hduser@master:/usr/local/hadoop$ cd /tmp/test
hduser@master:/tmp/test$ rz
rz waiting to receive.
Starting zmodem transfer. Press Ctrl+C to cancel.
  100%     615 KB  615 KB/s 00:00:01       0 Errors
  100%     502 KB  502 KB/s 00:00:01       0 Errors
  100%     813 KB  813 KB/s 00:00:01       0 Errors

hduser@master:/tmp/test$ /usr/local/hadoop/bin/start-all.sh 
starting namenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-namenode-master.out
slave2: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-slave2.out
slave1: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-slave1.out
master: starting datanode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-datanode-master.out
localhost: starting secondarynamenode, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-secondarynamenode-master.out
starting jobtracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-jobtracker-master.out
slave1: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-tasktracker-slave1.out
slave2: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-tasktracker-slave2.out
master: starting tasktracker, logging to /usr/local/hadoop/libexec/../logs/hadoop-hduser-tasktracker-master.out


hduser@master:~$ /usr/local/hadoop/bin/hadoop dfs -copyFromLocal /tmp/test /user/hduser/test
hduser@master:~$ /usr/local/hadoop/bin/hadoop dfs -ls /user/hduser
Found 1 items
drwxr-xr-x   - hduser supergroup          0 2013-04-09 08:06 /user/hduser/test
hduser@master:~$ /usr/local/hadoop/bin/hadoop dfs -ls /user/hduser/test
Found 3 items
-rw-r--r--   2 hduser supergroup     630010 2013-04-09 08:06 /user/hduser/test/1.epub
-rw-r--r--   2 hduser supergroup     514824 2013-04-09 08:06 /user/hduser/test/2.epub
-rw-r--r--   2 hduser supergroup     832882 2013-04-09 08:06 /user/hduser/test/3.epub


hduser@master:~$ cd /usr/local/hadoop/
hduser@master:/usr/local/hadoop$ bin/hadoop jar hadoop-examples-1.0.4.jar wordcount /user/hduser/test /user/hduser/test-output
13/04/09 08:18:45 INFO input.FileInputFormat: Total input paths to process : 3
13/04/09 08:18:45 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/04/09 08:18:45 WARN snappy.LoadSnappy: Snappy native library not loaded
13/04/09 08:18:45 INFO mapred.JobClient: Running job: job_201304090758_0001
13/04/09 08:18:46 INFO mapred.JobClient:  map 0% reduce 0%
13/04/09 08:19:10 INFO mapred.JobClient:  map 66% reduce 0%
13/04/09 08:19:19 INFO mapred.JobClient:  map 100% reduce 0%
13/04/09 08:19:31 INFO mapred.JobClient:  map 100% reduce 100%
13/04/09 08:19:36 INFO mapred.JobClient: Job complete: job_201304090758_0001
13/04/09 08:19:37 INFO mapred.JobClient: Counters: 29
13/04/09 08:19:37 INFO mapred.JobClient:   Job Counters 
13/04/09 08:19:37 INFO mapred.JobClient:     Launched reduce tasks=1
13/04/09 08:19:37 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=42658
13/04/09 08:19:37 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/04/09 08:19:37 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/04/09 08:19:37 INFO mapred.JobClient:     Launched map tasks=3
13/04/09 08:19:37 INFO mapred.JobClient:     Data-local map tasks=3
13/04/09 08:19:37 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=20867
13/04/09 08:19:37 INFO mapred.JobClient:   File Output Format Counters 
13/04/09 08:19:37 INFO mapred.JobClient:     Bytes Written=3216032
13/04/09 08:19:37 INFO mapred.JobClient:   FileSystemCounters
13/04/09 08:19:37 INFO mapred.JobClient:     FILE_BYTES_READ=3421949
13/04/09 08:19:37 INFO mapred.JobClient:     HDFS_BYTES_READ=1978040
13/04/09 08:19:37 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=6930267
13/04/09 08:19:37 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=3216032
13/04/09 08:19:37 INFO mapred.JobClient:   File Input Format Counters 
13/04/09 08:19:37 INFO mapred.JobClient:     Bytes Read=1977716
13/04/09 08:19:37 INFO mapred.JobClient:   Map-Reduce Framework
13/04/09 08:19:37 INFO mapred.JobClient:     Map output materialized bytes=3421961
13/04/09 08:19:37 INFO mapred.JobClient:     Map input records=14841
13/04/09 08:19:37 INFO mapred.JobClient:     Reduce shuffle bytes=2449921
13/04/09 08:19:37 INFO mapred.JobClient:     Spilled Records=64440
13/04/09 08:19:37 INFO mapred.JobClient:     Map output bytes=3685555
13/04/09 08:19:37 INFO mapred.JobClient:     CPU time spent (ms)=10830
13/04/09 08:19:37 INFO mapred.JobClient:     Total committed heap usage (bytes)=496644096
13/04/09 08:19:37 INFO mapred.JobClient:     Combine input records=36177
13/04/09 08:19:37 INFO mapred.JobClient:     SPLIT_RAW_BYTES=324
13/04/09 08:19:37 INFO mapred.JobClient:     Reduce input records=32220
13/04/09 08:19:37 INFO mapred.JobClient:     Reduce input groups=31501
13/04/09 08:19:37 INFO mapred.JobClient:     Combine output records=32220
13/04/09 08:19:37 INFO mapred.JobClient:     Physical memory (bytes) snapshot=614944768
13/04/09 08:19:37 INFO mapred.JobClient:     Reduce output records=31501
13/04/09 08:19:37 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=3845349376
13/04/09 08:19:37 INFO mapred.JobClient:     Map output records=36177
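
The word counts can also be inspected directly in HDFS before copying them out (part-r-00000 is the standard reducer output file name):
hduser@master:/usr/local/hadoop$ bin/hadoop dfs -cat /user/hduser/test-output/part-r-00000 | head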


hduser@master:/usr/local/hadoop$ mkdir /tmp/test-output
hduser@master:/usr/local/hadoop$ bin/hadoop dfs -getmerge /user/hduser/test-output /tmp/test-output
13/04/09 08:22:48 INFO util.NativeCodeLoader: Loaded the native-hadoop library

 $ head /tmp/test-output/test-output
Inspect the contents.
The output shows up garbled: the input was Chinese, and in any case .epub files are ZIP-packaged binary data, so line-oriented WordCount output over them will look like junk regardless of terminal encoding. This is left for later investigation.
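
To confirm the pipeline itself works, rerun the job on a small plain-text UTF-8 file (a minimal sketch; sample.txt is a made-up name):
hduser@master:/usr/local/hadoop$ echo "hello hadoop hello hbase" > /tmp/sample.txt
hduser@master:/usr/local/hadoop$ bin/hadoop dfs -copyFromLocal /tmp/sample.txt /user/hduser/sample
hduser@master:/usr/local/hadoop$ bin/hadoop jar hadoop-examples-1.0.4.jar wordcount /user/hduser/sample /user/hduser/sample-output
hduser@master:/usr/local/hadoop$ bin/hadoop dfs -cat /user/hduser/sample-output/part-r-00000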
Stopping here for now; HBase is not installed yet. That will be covered separately.

