The installation archive used here is hadoop-3.1.3.tar.gz; everything below is based on this version.
1. Prerequisites
Hadoop requires a JDK to run; I installed OpenJDK 11.
[root@tcloud ~]# java -version
openjdk version "11" 2018-09-25
OpenJDK Runtime Environment 18.9 (build 11+28)
OpenJDK 64-Bit Server VM 18.9 (build 11+28, mixed mode)
2. Configure Passwordless SSH
Hadoop components communicate with one another over SSH.
2.1 Configure the hostname mapping
Map the server's IP address to its hostname. This step matters: look up the machine's IP address with ifconfig first, because an incorrect mapping here will cause node problems later.
vim /etc/hosts
xxx.xx.x.x tcloud tcloud
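As a quick sanity check, the entry's shape can be validated before it goes into /etc/hosts. A minimal sketch (the IP below is a stand-in for whatever address ifconfig reports):

```shell
# Validate the "IP hostname" shape of a hosts entry before appending it.
# 192.168.0.10 is a placeholder; use your machine's real private IP.
entry="192.168.0.10 tcloud"
if printf '%s\n' "$entry" | grep -qE '^[0-9]{1,3}(\.[0-9]{1,3}){3}[[:space:]]+[A-Za-z0-9.-]+$'; then
    echo "entry looks valid"
fi
```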
2.2 Generate a key pair
Run the following command to generate the public and private keys:
[root@tcloud ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
/root/.ssh/id_rsa already exists.
Overwrite (y/n)? y
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:BtWqdvRxf90QPhg5p2OOIBwgEGTu4lxAd92icFc5cwE root@tcloud
The key's randomart image is:
+---[RSA 2048]----+
|+*...o. +Eo... |
|+ .o...= =..+ o |
| o o.+...+ B . |
|. . .o.+ . * + |
|.. . +So * o oo|
|+ . o.. o . . +|
| o . . . |
| |
| |
+----[SHA256]-----+
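The prompts above can also be skipped entirely. A non-interactive sketch, writing to a temporary path here so it cannot overwrite the real /root/.ssh/id_rsa:

```shell
# Generate an RSA key pair without prompts: -N "" sets an empty passphrase,
# -q silences output, -f chooses the destination (a temp path in this sketch).
ssh-keygen -t rsa -b 2048 -N "" -q -f /tmp/demo_rsa
ls -l /tmp/demo_rsa /tmp/demo_rsa.pub
```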
2.3 Authorize the key
Go to /root/.ssh/, check the generated keys, and append the public key to the authorized-keys file:
[root@tcloud .ssh]# ls -l
total 16
-rw------- 1 root root 786 Jul 6 11:57 authorized_keys
-rw-r--r-- 1 root root 0 Jul 5 11:06 config
-rw-r--r-- 1 root root 0 Jul 5 11:06 iddummy.pub
-rw------- 1 root root 1679 Jul 27 17:42 id_rsa
-rw-r--r-- 1 root root 393 Jul 27 17:42 id_rsa.pub
-rw-r--r-- 1 root root 1131 Jul 6 13:31 known_hosts
[root@tcloud .ssh]# cat id_rsa.pub >> authorized_keys
[root@tcloud .ssh]# chmod 600 authorized_keys
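sshd is picky about permissions: passwordless login silently fails unless ~/.ssh is 700 and authorized_keys is 600. A sketch of the expected bits, using a temporary directory rather than the real ~/.ssh:

```shell
# Show the permission bits sshd expects on the key material.
dir=$(mktemp -d)                      # stand-in for /root/.ssh
chmod 700 "$dir"
echo "ssh-rsa AAAA...dummy root@tcloud" >> "$dir/authorized_keys"
chmod 600 "$dir/authorized_keys"
stat -c '%a' "$dir/authorized_keys"   # prints 600
```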
3. Set Up HDFS
3.1 Download and extract
tar -zxvf hadoop-3.1.3.tar.gz
mv ./hadoop-3.1.3 /usr/local/
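Before extracting, the tarball can be checked against the .sha512 checksum file that Apache publishes next to each release (format permitting, sha512sum -c verifies it directly). A sketch with a stand-in file so it runs anywhere:

```shell
# Verify a download against its SHA-512 checksum file.
printf 'stand-in payload\n' > /tmp/hadoop-3.1.3.tar.gz    # stand-in, not the real tarball
sha512sum /tmp/hadoop-3.1.3.tar.gz > /tmp/hadoop-3.1.3.tar.gz.sha512
sha512sum -c /tmp/hadoop-3.1.3.tar.gz.sha512 && echo "checksum OK"
```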
3.2 Configure environment variables
There are several ways to set environment variables; here they are all kept in /etc/profile.d/my_env.sh.
vim /etc/profile.d/my_env.sh
export HADOOP_HOME=/usr/local/hadoop-3.1.3
export PATH=${HADOOP_HOME}/bin:$PATH
source /etc/profile
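To confirm the variables took effect, you can source the script and echo them. The sketch below writes to a temp path instead of /etc/profile.d so it has no side effects:

```shell
# Write the env script to a temp path, source it, and inspect the result.
cat > /tmp/my_env.sh <<'EOF'
export HADOOP_HOME=/usr/local/hadoop-3.1.3
export PATH=${HADOOP_HOME}/bin:$PATH
EOF
. /tmp/my_env.sh
echo "$HADOOP_HOME"    # prints /usr/local/hadoop-3.1.3
```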
3.3 Edit the Hadoop configuration
Go to ${HADOOP_HOME}/etc/hadoop/ and edit the following files:
- hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-11-openjdk   # point this at the JDK 11 install from step 1
# Hadoop 3.x refuses to start as root unless the service users are declared:
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
- core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://tcloud:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/hadoop/tmp</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>/home/hadoop/nndata</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>/home/hadoop/dndata</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>root</value>
    </property>
    <property>
        <name>dfs.permissions.enabled</name>
        <value>false</value>
    </property>
</configuration>
- hdfs-site.xml
Specify the replication factor (the temporary-file locations were already set in core-site.xml above):
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.datanode.http.address</name>
        <value>tcloud:9864</value>
    </property>
</configuration>
For a multi-node deployment the service bind addresses can be pinned explicitly. These properties also go inside <configuration>; Master1-IP, Master2-IP, and ${local.bind.address} are placeholders for your own hosts. Note that Hadoop 3.x renamed several keys and changed the default ports (NameNode UI 50070 → 9870, DataNode UI 50075 → 9864):
<property>
    <name>dfs.namenode.http-address</name>
    <value>Master1-IP:9870</value>
</property>
<property>
    <name>dfs.namenode.https-address</name>
    <value>Master1-IP:9871</value>
</property>
<property>
    <name>dfs.namenode.secondary.http-address</name>
    <value>Master2-IP:9868</value>
</property>
<property>
    <name>dfs.datanode.address</name>
    <value>${local.bind.address}:9866</value>
</property>
<property>
    <name>dfs.datanode.ipc.address</name>
    <value>${local.bind.address}:9867</value>
</property>
<property>
    <name>dfs.datanode.https.address</name>
    <value>${local.bind.address}:9865</value>
</property>
- workers (this file was named slaves in Hadoop 2.x)
hadoop001
3.4 Turn off the firewall
If the firewall stays up, the Hadoop web UIs may be unreachable:
sudo firewall-cmd --state
sudo systemctl stop firewalld.service
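On an internet-facing server, an alternative to stopping firewalld outright is opening only the ports you need; the ports below assume the Hadoop 3.x defaults (9870 for the NameNode UI, 8088 for the YARN UI):

```shell
# Open just the web UI ports instead of disabling the firewall entirely.
sudo firewall-cmd --permanent --add-port=9870/tcp
sudo firewall-cmd --permanent --add-port=8088/tcp
sudo firewall-cmd --reload
```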
3.5 Initialize
The first time Hadoop starts, the NameNode must be formatted. Go to ${HADOOP_HOME}/bin/ and run:
[root@hadoop001 bin]# ./hdfs namenode -format
3.6 Start HDFS
Go to ${HADOOP_HOME}/sbin/ and start HDFS:
[root@hadoop001 sbin]# ./start-dfs.sh
3.7 Verify the startup
Option 1: run jps and check that the NameNode and DataNode services are running:
[root@hadoop001 hadoop-3.1.3]# jps
9137 DataNode
9026 NameNode
9390 SecondaryNameNode
Option 2: open the web UI; in Hadoop 3.x the NameNode UI listens on port 9870 (it was 50070 in 2.x)
4. Set Up YARN
4.1 Edit the configuration
Go to ${HADOOP_HOME}/etc/hadoop/ and edit the following files. Hadoop 3.x ships mapred-site.xml directly; only on 2.x did it have to be copied from the template first:
cp mapred-site.xml.template mapred-site.xml
- mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
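One Hadoop 3.x gotcha worth noting here: MapReduce jobs submitted to YARN can fail with "Could not find or load main class org.apache.hadoop.mapreduce.v2.app.MRAppMaster" unless the MapReduce home is passed through to the containers. If you hit that, a common fix is to also add the following to mapred-site.xml (the path is this guide's install location, adjust to yours):

```xml
<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.1.3</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.1.3</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=/usr/local/hadoop-3.1.3</value>
</property>
```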
- yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
4.2 Start the services
Go to ${HADOOP_HOME}/sbin/ and start YARN:
./start-yarn.sh
4.3 Verify the startup
Option 1: run jps and check that the NodeManager and ResourceManager services are running:
[root@hadoop001 hadoop-3.1.3]# jps
9137 DataNode
9026 NameNode
12294 NodeManager
12185 ResourceManager
9390 SecondaryNameNode
Option 2: open the web UI; the ResourceManager UI listens on port 8088
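With HDFS and YARN both up, a useful end-to-end smoke test is the example job that ships with the release (the jar path assumes the 3.1.3 layout and the ${HADOOP_HOME} variable set earlier):

```shell
# Submit the bundled pi estimator to YARN: 2 map tasks, 10 samples each.
# It exercises HDFS staging, YARN scheduling, and MapReduce in one shot.
hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.3.jar pi 2 10
```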