Main topics
- Building a hadoop HA cluster
- Preparation
- zookeeper
- hadoop
Preparation
Hardware and software planning
Hardware
All servers: 4-core CPU, 16 GB RAM
Software
CentOS 7.9
OpenJDK 11
hadoop 3.3.1
Directory layout
# hadoop software installation directory
/home/hadoop/
├── hadoop-3.3.1
├── source
└── apache-zookeeper-3.7.0
# data directory
/data
└── hadoop
    ├── dfs
    │   ├── data
    │   └── name
    ├── hdfs
    ├── history
    │   ├── done
    │   └── done_intermediate
    ├── tmp
    ├── var
    ├── yarn
    │   └── nm
    └── zk
        ├── data
        ├── journaldata
        └── logs
Check the system
uname -a
yum repository
Use the Aliyun yum mirror.
Run on all servers:
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.backup
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
sed -i -e '/mirrors.cloud.aliyuncs.com/d' -e '/mirrors.aliyuncs.com/d' /etc/yum.repos.d/CentOS-Base.repo
yum clean all
yum makecache
JDK
Run on all servers:
# remove the default JDK
yum -y remove java
# install OpenJDK 11
yum install -y java-11-openjdk-devel.x86_64
# switch the java version if needed
建立目录
Run on all servers:
mkdir -p /data/hadoop/zk/data
mkdir -p /data/hadoop/zk/journaldata
mkdir -p /data/hadoop/zk/logs
mkdir -p /data/hadoop/dfs/data
mkdir -p /data/hadoop/dfs/name
mkdir -p /data/hadoop/history/done
mkdir -p /data/hadoop/history/done_intermediate
mkdir -p /data/hadoop/yarn/nm
mkdir -p /data/hadoop/yarn/staging
mkdir -p /data/hadoop/tmp
mkdir -p /data/hadoop/var
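The mkdir commands above can also be collapsed into a single loop. A sketch, using a BASE variable so it can be tried safely under /tmp first; on the real servers set BASE=/data:

```shell
# create the whole data tree with one loop; BASE points at /tmp here for a
# safe dry run -- on the actual servers use BASE=/data
BASE=/tmp/hadoop-layout-demo
for d in zk/data zk/journaldata zk/logs dfs/data dfs/name \
         history/done history/done_intermediate yarn/nm yarn/staging tmp var; do
  mkdir -p "$BASE/hadoop/$d"
done
```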
环境变量
Adjust to your environment. Run on all servers: vim /etc/profile
# hadoop env -----------
export ZK_HOME=/home/hadoop/apache-zookeeper-3.7.0
export HADOOP_HOME=/home/hadoop/hadoop-3.3.1
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.12.0.7-0.el7_9.x86_64/
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
# export HIVE_HOME=/home/hadoop/hive-3.1.2
# export SPARK_HOME=/home/hadoop/spark-3.1.2-bin-hadoop3.2
export CLASSPATH=$JAVA_HOME/lib:$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH
# export PATH=$PATH:$ZK_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin/:$SPARK_HOME/bin:$SPARK_HOME/sbin:$HIVE_HOME/bin
export PATH=$PATH:$ZK_HOME/bin:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin/
Apply immediately: source /etc/profile
Set hostnames
Run on all servers.
- hostname
Set the hostname per the plan; it takes effect after reboot:
hostnamectl set-hostname xxx
- hostname-to-IP mapping
Either edit the local /etc/hosts on each machine,
or configure the mappings on your local DNS server:
# hadoop
192.168.5.20 hadoop-master-a
192.168.2.26 hadoop-master-b
192.168.5.21 hadoop-data-1
192.168.5.22 hadoop-data-2
192.168.5.23 hadoop-data-3
# zookeeper
192.168.5.21 zk01
192.168.5.22 zk02
192.168.5.23 zk03
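After editing /etc/hosts (or DNS), a quick resolution check catches typos early. This sketch only reads, changing nothing, so it is safe to run anywhere; unresolved names are simply reported:

```shell
# report whether each planned hostname resolves; nothing is modified, and
# names that are not yet set up just show as UNRESOLVED
report=""
for h in hadoop-master-a hadoop-master-b hadoop-data-1 hadoop-data-2 hadoop-data-3; do
  if getent hosts "$h" >/dev/null 2>&1; then
    report="$report$h: resolved
"
  else
    report="$report$h: UNRESOLVED
"
  fi
done
printf '%s' "$report"
```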
Create the hadoop user
Run on all servers:
useradd hadoop
passwd hadoop
chmod u+w /etc/sudoers
vim /etc/sudoers  # below "root ALL=(ALL) ALL" add: hadoop ALL=(ALL) ALL
chmod u-w /etc/sudoers
chown -R hadoop:hadoop /data/hadoop/
Disable the firewall
Run on all servers.
In a test environment we simply disable the firewall; in production, decide case by case:
setenforce 0
sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
systemctl stop firewalld
systemctl disable firewalld
Passwordless SSH login
All of the following is done as the hadoop user.
On hadoop-master-a:
## 1) generate the key pair
ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
....
## 2) this creates the public key id_rsa.pub and private key id_rsa under /home/hadoop/.ssh; append the public key to authorized_keys
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 0600 ~/.ssh/authorized_keys
Run on the other 4 machines:
## 1) generate the key pair
ssh-keygen -t rsa
ssh-copy-id -i hadoop-master-a  # you can see authorized_keys change on hadoop-master-a
Sync hadoop-master-a's authorized_keys to the other machines.
Be sure to run this on hadoop-master-a as the hadoop user:
scp /home/hadoop/.ssh/authorized_keys hadoop-master-b:/home/hadoop/.ssh/
scp /home/hadoop/.ssh/authorized_keys hadoop-data-1:/home/hadoop/.ssh/
scp /home/hadoop/.ssh/authorized_keys hadoop-data-2:/home/hadoop/.ssh/
scp /home/hadoop/.ssh/authorized_keys hadoop-data-3:/home/hadoop/.ssh/
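The four scp commands can be collapsed into a loop over the target hosts. Shown as a dry run with echo (remove the echo to actually copy; requires the SSH setup above):

```shell
# dry run: print the copy command for each target host; drop 'echo' to
# actually distribute authorized_keys across the cluster
targets="hadoop-master-b hadoop-data-1 hadoop-data-2 hadoop-data-3"
for h in $targets; do
  echo scp /home/hadoop/.ssh/authorized_keys "$h:/home/hadoop/.ssh/"
done
```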
zookeeper
Download
Per the plan, download zookeeper on hadoop-data-1 from the official site.
Install
Extract
tar xvf apache-zookeeper-3.7.0-bin.tar.gz
mv apache-zookeeper-3.7.0-bin apache-zookeeper-3.7.0
# result ==>
/home/hadoop
├── hadoop-3.3.1
├── source
├── spark-3.1.2-bin-hadoop3.2
└── apache-zookeeper-3.7.0
Directories
Create zookeeper's data and log directories:
/data
└── hadoop
    ├── dfs
    │   ├── data
    │   └── name
    ├── hdfs
    ├── history
    │   ├── done
    │   └── done_intermediate
    ├── tmp
    ├── var
    ├── yarn
    │   └── nm
    └── zk
        ├── data
        ├── journaldata
        └── logs
Configuration
$ cd /home/hadoop/apache-zookeeper-3.7.0/conf/
$ cp zoo_sample.cfg zoo.cfg
$ ls
configuration.xsl log4j.properties zoo.cfg zoo_sample.cfg
$ vim zoo.cfg
# edit zoo.cfg as follows
dataDir=/data/hadoop/zk/data/
dataLogDir=/data/hadoop/zk/logs
server.1=zk01:2888:3888
server.2=zk02:2888:3888
server.3=zk03:2888:3888
# the port at which the clients will connect
clientPort=2181
quorumListenOnAllIPs=true
Distribute the configuration
scp -r /home/hadoop/apache-zookeeper-3.7.0 hadoop-data-2:/home/hadoop/
scp -r /home/hadoop/apache-zookeeper-3.7.0 hadoop-data-3:/home/hadoop/
# set up the myid file on hadoop-data-1, hadoop-data-2, and hadoop-data-3 respectively
[hadoop@hadoop-data-1 hadoop]$ echo 1 > /data/hadoop/zk/data/myid
[hadoop@hadoop-data-2 hadoop]$ echo 2 > /data/hadoop/zk/data/myid
[hadoop@hadoop-data-3 hadoop]$ echo 3 > /data/hadoop/zk/data/myid
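Instead of typing a different echo on each node, the myid can be derived from the hostname. A sketch writing under /tmp for illustration (on the real nodes, substitute "$(hostname)" for the stand-in and write to /data/hadoop/zk/data/myid):

```shell
# derive the zookeeper server id from the hostname; /tmp paths are used here
# for a safe demonstration -- the real file is /data/hadoop/zk/data/myid
host="hadoop-data-2"   # stand-in for "$(hostname)" on a real node
case "$host" in
  hadoop-data-1) zkid=1 ;;
  hadoop-data-2) zkid=2 ;;
  hadoop-data-3) zkid=3 ;;
  *) echo "not a zk node: $host"; zkid=0 ;;
esac
mkdir -p /tmp/zk-demo/data
echo "$zkid" > /tmp/zk-demo/data/myid
cat /tmp/zk-demo/data/myid
```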
Start
Start zookeeper on hadoop-data-1, hadoop-data-2, and hadoop-data-3 respectively:
[hadoop@hadoop-data-1 hadoop]$ /home/hadoop/apache-zookeeper-3.7.0/bin/zkServer.sh start
[hadoop@hadoop-data-2 hadoop]$ /home/hadoop/apache-zookeeper-3.7.0/bin/zkServer.sh start
[hadoop@hadoop-data-3 hadoop]$ /home/hadoop/apache-zookeeper-3.7.0/bin/zkServer.sh start
hadoop
Configuration
Log in to hadoop-master-a; the configuration files live in the
/home/hadoop/hadoop-3.3.1/etc/hadoop directory.
Environment variables: hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk-11.0.12.0.7-0.el7_9.x86_64/
hdfs-site.xml
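The hdfs-site.xml contents did not survive in this copy. A minimal HA sketch consistent with the planning above; the nameservice name mycluster, the nn1/nn2 ids, and the RPC port 9000 are assumptions to adjust for your cluster:

```xml
<configuration>
  <property><name>dfs.nameservices</name><value>mycluster</value></property>
  <property><name>dfs.ha.namenodes.mycluster</name><value>nn1,nn2</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn1</name><value>hadoop-master-a:9000</value></property>
  <property><name>dfs.namenode.rpc-address.mycluster.nn2</name><value>hadoop-master-b:9000</value></property>
  <property><name>dfs.namenode.http-address.mycluster.nn1</name><value>hadoop-master-a:50070</value></property>
  <property><name>dfs.namenode.http-address.mycluster.nn2</name><value>hadoop-master-b:50070</value></property>
  <property><name>dfs.namenode.name.dir</name><value>/data/hadoop/dfs/name</value></property>
  <property><name>dfs.datanode.data.dir</name><value>/data/hadoop/dfs/data</value></property>
  <property><name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop-data-1:8485;hadoop-data-2:8485;hadoop-data-3:8485/mycluster</value></property>
  <property><name>dfs.journalnode.edits.dir</name><value>/data/hadoop/zk/journaldata</value></property>
  <property><name>dfs.ha.automatic-failover.enabled</name><value>true</value></property>
  <property><name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value></property>
  <property><name>dfs.ha.fencing.methods</name><value>sshfence</value></property>
  <property><name>dfs.ha.fencing.ssh.private-key-files</name><value>/home/hadoop/.ssh/id_rsa</value></property>
</configuration>
```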
core-site.xml
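The core-site.xml was likewise lost. A minimal sketch pointing HDFS clients at the nameservice and ZKFC at the zk quorum; the mycluster name must match whatever hdfs-site.xml declares:

```xml
<configuration>
  <property><name>fs.defaultFS</name><value>hdfs://mycluster</value></property>
  <property><name>hadoop.tmp.dir</name><value>/data/hadoop/tmp</value></property>
  <property><name>ha.zookeeper.quorum</name><value>zk01:2181,zk02:2181,zk03:2181</value></property>
</configuration>
```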
yarn-site.xml
Note: adjust yarn.nodemanager.resource.memory-mb,
yarn.scheduler.minimum-allocation-mb, and
yarn.scheduler.maximum-allocation-mb to the servers' memory.
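The yarn-site.xml body is also missing here. A minimal ResourceManager-HA sketch matching the plan (rm1 on hadoop-master-b, where the primary resourcemanager is started below); the cluster-id and the memory figures are placeholders to size against the 16 GB servers:

```xml
<configuration>
  <property><name>yarn.resourcemanager.ha.enabled</name><value>true</value></property>
  <property><name>yarn.resourcemanager.cluster-id</name><value>yarn-ha</value></property>
  <property><name>yarn.resourcemanager.ha.rm-ids</name><value>rm1,rm2</value></property>
  <property><name>yarn.resourcemanager.hostname.rm1</name><value>hadoop-master-b</value></property>
  <property><name>yarn.resourcemanager.hostname.rm2</name><value>hadoop-master-a</value></property>
  <property><name>yarn.resourcemanager.zk-address</name><value>zk01:2181,zk02:2181,zk03:2181</value></property>
  <property><name>yarn.nodemanager.aux-services</name><value>mapreduce_shuffle</value></property>
  <property><name>yarn.nodemanager.local-dirs</name><value>/data/hadoop/yarn/nm</value></property>
  <!-- size these three to the servers' memory; the values are only examples -->
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>12288</value></property>
  <property><name>yarn.scheduler.minimum-allocation-mb</name><value>1024</value></property>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>12288</value></property>
</configuration>
```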
mapred-site.xml
<?xml version="1.0"?>
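Only the XML declaration of mapred-site.xml survived. A minimal sketch wired to the history directories created earlier; the jobhistory host is an assumption:

```xml
<configuration>
  <property><name>mapreduce.framework.name</name><value>yarn</value></property>
  <property><name>mapreduce.jobhistory.address</name><value>hadoop-master-a:10020</value></property>
  <property><name>mapreduce.jobhistory.webapp.address</name><value>hadoop-master-a:19888</value></property>
  <property><name>mapreduce.jobhistory.done-dir</name><value>/data/hadoop/history/done</value></property>
  <property><name>mapreduce.jobhistory.intermediate-done-dir</name><value>/data/hadoop/history/done_intermediate</value></property>
</configuration>
```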
workers
In the workers file on the hadoop-master-a node, delete localhost and add:
hadoop-data-1
hadoop-data-2
hadoop-data-3
Distribute
Copy /home/hadoop/hadoop-3.3.1 to the other machines in the cluster:
$ scp -r /home/hadoop/hadoop-3.3.1 hadoop-master-b:/home/hadoop/
$ scp -r /home/hadoop/hadoop-3.3.1 hadoop-data-1:/home/hadoop/
$ scp -r /home/hadoop/hadoop-3.3.1 hadoop-data-2:/home/hadoop/
$ scp -r /home/hadoop/hadoop-3.3.1 hadoop-data-3:/home/hadoop/
Start
Format the hadoop-ha znode in zookeeper
# format on hadoop-master-a
Success looks something like this.
# verify: check that the Hadoop HA znode now exists in zookeeper, on any zk node
journalnode
Start journalnode, the namenode edit-log sync service, on all ZooKeeper nodes.
The journalnode listens on port 8485; namenode -format connects to it:
[hadoop@hadoop-data-1 ~]$ $HADOOP_HOME/bin/hdfs --daemon start journalnode
WARNING: /home/hadoop/hadoop-3.3.1/logs does not exist. Creating.
[hadoop@hadoop-data-1 ~]$ jps
27429 Jps
8233 QuorumPeerMain
27372 JournalNode
# the other zk nodes
[hadoop@hadoop-data-2 root]$ $HADOOP_HOME/bin/hdfs --daemon start journalnode
[hadoop@hadoop-data-3 root]$ $HADOOP_HOME/bin/hdfs --daemon start journalnode
Start hadoop
Primary namenode
On the primary namenode hadoop-master-a, format the namenode and start it:
[hadoop@hadoop-master-a ~]$ $HADOOP_HOME/bin/hdfs namenode -format
[hadoop@hadoop-master-a ~]$ $HADOOP_HOME/bin/hdfs --daemon start namenode
[hadoop@hadoop-master-a ~]$ jps
29461 NameNode
29541 Jps
Standby namenode
On the standby namenode, sync the metadata and start the namenode service. The primary namenode must be started first:
[hadoop@hadoop-master-b ~]$ $HADOOP_HOME/bin/hdfs namenode -bootstrapStandby
Start:
[hadoop@hadoop-master-b ~]$ $HADOOP_HOME/bin/hdfs --daemon start namenode
[hadoop@hadoop-master-b ~]$ jps
24882 NameNode
24966 Jps
ZKFC
Start DFSZKFailoverController on all namenode nodes.
Primary namenode:
[hadoop@hadoop-master-a ~]$ $HADOOP_HOME/bin/hdfs --daemon start zkfc
[hadoop@hadoop-master-a ~]$ jps
1045 Jps
984 DFSZKFailoverController
462 NameNode
Standby namenode:
[hadoop@hadoop-master-b ~]$ $HADOOP_HOME/bin/hdfs --daemon start zkfc
[hadoop@hadoop-master-b ~]$ jps
24882 NameNode
25171 DFSZKFailoverController
25212 Jps
datanode service
Start from any node in the cluster:
[hadoop@hadoop-master-a ~]$ $HADOOP_HOME/bin/hdfs --workers --daemon start datanode  # start all datanodes
# $HADOOP_HOME/bin/hdfs --daemon start datanode  # start a single datanode
Verify on a datanode:
[hadoop@hadoop-data-3 etc]$ jps
6050 JournalNode
8579 Jps
15398 QuorumPeerMain
8471 DataNode
The DataNode process is running.
yarn
Primary resourcemanager
[hadoop@hadoop-master-b ~]$ $HADOOP_HOME/bin/yarn --daemon start resourcemanager
[hadoop@hadoop-master-b ~]$ jps
25393 ResourceManager
24882 NameNode
25171 DFSZKFailoverController
25637 Jps
Standby resourcemanager
[hadoop@hadoop-master-a ~]$ $HADOOP_HOME/bin/yarn --daemon start resourcemanager
[hadoop@hadoop-master-a ~]$ jps
1331 Jps
984 DFSZKFailoverController
1261 ResourceManager
462 NameNode
nodemanager
# start NodeManager
[hadoop@hadoop-master-a ~]$ $HADOOP_HOME/bin/yarn --workers --daemon start nodemanager
# on a NodeManager node
[hadoop@hadoop-data-3 etc]$ jps
6050 JournalNode
8724 NodeManager
15398 QuorumPeerMain
8471 DataNode
8844 Jps
Verification
- primary namenode: http://hadoop-master-a:50070/
- standby namenode: http://hadoop-master-b:50070/
- yarn Applications: http://hadoop-master-b:8088/cluster
Miscellaneous
# 1) datanode report