
Hadoop pseudo-distributed setup

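Overview

One master and two slaves, with a single NameNode.

Note: the installation directory and configuration must be identical on all 3 machines. You can configure one machine first and then copy everything to the others.

Prepare the 3 machines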

# Install the Java environment
yum install -y java-1.8.0-openjdk java-1.8.0-openjdk-devel

vim /etc/profile

export JAVA_HOME=/usr/lib/jvm/java-1.8.0-openjdk-1.8.0.242.b08-0.el7_7.x86_64
# Directory where Hadoop was extracted
export HADOOP_HOME=/root/hbase/hadoop-2.6.0
export CLASSPATH=.:$JAVA_HOME/jre/lib/rt.jar:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
# Apply the changes
source /etc/profile
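
The JAVA_HOME path above names one specific OpenJDK build, so it stops matching after a package update. As a minimal sketch, the path can instead be derived from the resolved location of the java binary. The function name java_home_from_bin is made up for illustration and is not part of any JDK tooling.

```shell
# Derive a JDK home directory from the full path of a java binary.
# java_home_from_bin is a hypothetical helper name.
java_home_from_bin() {
    local bin="$1"
    bin="${bin%/bin/java}"   # strip the trailing /bin/java
    bin="${bin%/jre}"        # OpenJDK 8 packages place java under <jdk>/jre/bin
    printf '%s\n' "$bin"
}

# Example: resolve the symlink chain behind `java` and export the result.
# export JAVA_HOME="$(java_home_from_bin "$(readlink -f "$(command -v java)")")"
```

The commented example line could replace the hard-coded export in /etc/profile, assuming `java` is already on the PATH when the profile is read.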

Download the binary packages

# ZooKeeper cluster installation is omitted here; see the Docker-based installation.
wget http://archive.apache.org/dist/zookeeper/zookeeper-3.4.5/zookeeper-3.4.5.tar.gz
 
wget http://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
 
wget http://archive.apache.org/dist/hbase/1.2.0/hbase-1.2.0-bin.tar.gz

Add the hadoop user

# Create the account first; usermod fails if the user does not exist yet
useradd hadoop
usermod -a -G hadoop hadoop
passwd hadoop

vim /etc/sudoers

root	ALL=(ALL) 	ALL
hadoop	ALL=(ALL) 	ALL

Passwordless SSH setup is omitted.
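
Since that step is skipped, here is a minimal sketch of the usual approach. It assumes OpenSSH's ssh-keygen and ssh-copy-id are installed and that the key is created on data1 for the account that starts Hadoop (root in the logs later in this post). The helper name push_key_cmds is made up for illustration; it only prints the commands so they can be reviewed and then run by hand.

```shell
# Print the commands that set up key-based SSH from this machine to each host.
# push_key_cmds is a hypothetical helper; it prints commands instead of running them.
push_key_cmds() {
    local user="$1"; shift
    # Generate a key pair without a passphrase, if none exists yet.
    echo "[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa"
    local host
    for host in "$@"; do
        # ssh-copy-id appends the public key to the remote authorized_keys file.
        echo "ssh-copy-id ${user}@${host}"
    done
}

# Example: include data1 itself, because the start scripts also SSH to the local node.
# push_key_cmds root data1 data2 data3
```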

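Create the name, data, and tmp directories

# Run from /root/hbase so the paths match the configuration that follows
mkdir -p dfs/name
mkdir -p dfs/data
mkdir -p tmp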

Key configuration

All configuration files are under hadoop-2.6.0/etc/hadoop/:

<!-- hadoop-env.sh and yarn-env.sh: set JAVA_HOME -->

<!-- core-site.xml  -->
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://data1:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/root/hbase/tmp</value>
    </property>
</configuration>

<!-- hdfs-site.xml -->
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>2</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/root/hbase/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/root/hbase/dfs/data</value>
    </property>
</configuration>

<!-- mapred-site.xml -->
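<configuration>
   <!-- Run MapReduce jobs on YARN; without this, Hadoop 2.x jobs default to local mode -->
   <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
   </property>
   <!-- Hadoop 1.x JobTracker address; not used when jobs run on YARN -->
   <property>
      <name>mapred.job.tracker</name>
      <value>data1:9001</value>
   </property>
</configuration>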

<!-- yarn-site.xml -->

<configuration>
     <property>
         <name>yarn.nodemanager.aux-services</name>
         <value>mapreduce_shuffle</value>
     </property>
     <property>
         <name>yarn.resourcemanager.address</name>
         <value>data1:8032</value>
     </property>
     <property>
         <name>yarn.resourcemanager.scheduler.address</name>
         <value>data1:8030</value>
     </property>
     <property>
         <name>yarn.resourcemanager.resource-tracker.address</name>
         <value>data1:8031</value>
     </property>
     <property>
         <name>yarn.resourcemanager.admin.address</name>
         <value>data1:8033</value>
     </property>
     <property>
         <name>yarn.resourcemanager.webapp.address</name>
         <value>data1:8088</value>
     </property>
</configuration>
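
Because all three machines must carry identical configuration, it helps to read a value back from the files and compare it across nodes. The sketch below is a rough line-based reader that assumes the one-tag-per-line layout used in the files above; it is not a general XML parser, and get_prop is a made-up helper name rather than a Hadoop command. On a working installation, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.

```shell
# Print the <value> of a named property from a Hadoop *-site.xml file.
# get_prop is a hypothetical helper; it expects <name> and <value> on separate lines.
get_prop() {
    local file="$1" name="$2"
    awk -v want="$name" '
        /<name>/  { n = $0; gsub(/.*<name>|<\/name>.*/, "", n) }
        /<value>/ { v = $0; gsub(/.*<value>|<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$file"
}

# Example:
# get_prop /root/hbase/hadoop-2.6.0/etc/hadoop/core-site.xml fs.defaultFS
```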

<!-- slaves: remove localhost -->
[root@data1 hadoop]# cat slaves 
data2
data3
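
With the configuration finished on data1, copying it to the other machines can be sketched as follows. This assumes rsync is installed on every node, passwordless SSH already works, and /root/hbase exists on each target. The helper name sync_cmds is made up for illustration; it prints one command per host listed in the slaves file rather than running anything.

```shell
# Print one rsync command per host in a slaves file.
# sync_cmds is a hypothetical helper; review the output, then run the commands.
sync_cmds() {
    local slaves_file="$1" dir="$2"
    local host
    # The extra test keeps a final line that lacks a trailing newline.
    while read -r host || [ -n "$host" ]; do
        # Skip blank lines and comments.
        case "$host" in ''|'#'*) continue ;; esac
        # The trailing slashes make rsync mirror the directory contents in place.
        echo "rsync -a ${dir}/ ${host}:${dir}/"
    done < "$slaves_file"
}

# Example:
# sync_cmds /root/hbase/hadoop-2.6.0/etc/hadoop/slaves /root/hbase/hadoop-2.6.0
```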

Startup

# Format the NameNode on the master node (hdfs namenode -format is the non-deprecated form)
hadoop namenode -format
# Start the cluster
[root@data1 hbase]# start-all.sh 
This script is Deprecated. Instead use start-dfs.sh and start-yarn.sh
Starting namenodes on [data1]
data1: starting namenode, logging to /root/hbase/hadoop-2.6.0/logs/hadoop-root-namenode-data1.out
data3: starting datanode, logging to /root/hbase/hadoop-2.6.0/logs/hadoop-root-datanode-data3.out
data2: starting datanode, logging to /root/hbase/hadoop-2.6.0/logs/hadoop-root-datanode-data2.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /root/hbase/hadoop-2.6.0/logs/hadoop-root-secondarynamenode-data1.out
starting yarn daemons
starting resourcemanager, logging to /root/hbase/hadoop-2.6.0/logs/yarn-root-resourcemanager-data1.out
data3: starting nodemanager, logging to /root/hbase/hadoop-2.6.0/logs/yarn-root-nodemanager-data3.out
data2: starting nodemanager, logging to /root/hbase/hadoop-2.6.0/logs/yarn-root-nodemanager-data2.out

# Check the status of each node
[root@data1 hbase]# jps
24048 ResourceManager
24307 Jps
23893 SecondaryNameNode
23711 NameNode

[root@data2 tmp]# jps
12341 DataNode
12442 NodeManager
12570 Jps

[root@data3 tmp]# jps
5187 DataNode
5288 NodeManager
5416 Jps
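
Checking the jps listings by eye gets tedious once the cluster is restarted a few times. As a minimal sketch, the comparison against the expected daemons can be scripted; missing_daemons is a made-up helper name, and the daemon names are the ones shown in the output above.

```shell
# Read `jps` output on stdin and print every expected daemon that is not listed.
# missing_daemons is a hypothetical helper name.
missing_daemons() {
    local listing name
    listing="$(cat)"
    for name in "$@"; do
        # jps prints "<pid> <main class>"; compare column 2 exactly so that
        # SecondaryNameNode is not mistaken for NameNode.
        printf '%s\n' "$listing" \
            | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
            || echo "$name"
    done
}

# Example on the master:
# jps | missing_daemons NameNode SecondaryNameNode ResourceManager
# Example on data2 or data3:
# jps | missing_daemons DataNode NodeManager
```

Empty output means every expected daemon is running on that node.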

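Error 1

The following error appeared at startup. The main cause is that the clusterID recorded in the underlying VERSION files does not match. Delete the contents of the name, data, and tmp directories, then reformat.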

Starting secondary namenodes [0.0.0.0]
The authenticity of host '0.0.0.0 (0.0.0.0)' can't be established.
ECDSA key fingerprint is SHA256:/bqliO4L8XYMMr/5wVDufH9IjldwXwLWEol3eAEjuzc.
ECDSA key fingerprint is MD5:92:8e:24:a9:a1:e8:a9:55:8d:20:0f:4e:3d:34:dd:f0.
Are you sure you want to continue connecting (yes/no)? yes
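
To confirm a clusterID mismatch before deleting anything, the IDs can be read from the VERSION files directly. The sketch below assumes the standard layout in which each storage directory keeps a current/VERSION file containing a clusterID= line; cluster_id is a made-up helper name.

```shell
# Print the clusterID recorded in a Hadoop VERSION file.
# cluster_id is a hypothetical helper name.
cluster_id() {
    # VERSION is a Java properties file; print what follows "clusterID=".
    sed -n 's/^clusterID=//p' "$1"
}

# Example: run the first line on data1 and the second on data2 or data3, then compare.
# cluster_id /root/hbase/dfs/name/current/VERSION
# cluster_id /root/hbase/dfs/data/current/VERSION
```

If the two values differ, the DataNode was initialized against an earlier format of the NameNode, which is the situation the cleanup and reformat resolves.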

Test

hadoop fs -mkdir -p /test
hadoop fs -ls /test
hadoop fs -put test.txt /test/
hadoop fs -cat /test/test.txt
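
The same commands can be wrapped into a repeatable round-trip check that uploads a file, reads it back, and compares the bytes. This is only a sketch: hdfs_roundtrip is a made-up helper name, and the -f flag on -put (overwrite an existing file) is an addition to the commands above.

```shell
# Upload a local file to HDFS, read it back, and compare it with the original.
# hdfs_roundtrip is a hypothetical helper built on the hadoop fs commands above.
hdfs_roundtrip() {
    local src="$1" dir="$2"
    local name
    name="$(basename "$src")"
    hadoop fs -mkdir -p "$dir" || return 1
    # -f overwrites an existing copy so the check can be repeated.
    hadoop fs -put -f "$src" "$dir/" || return 1
    # cmp reads the HDFS copy from stdin ("-") and exits non-zero on any difference.
    hadoop fs -cat "$dir/$name" | cmp -s - "$src"
}

# Example:
# hdfs_roundtrip test.txt /test && echo "round trip OK"
```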