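Hadoop 3.0.0-alpha2 Installation (Part 1)
1. Cluster Deployment Overview
1.1 Introduction to Hadoop
Our developers need a Hadoop environment for data mining and statistics, which prompted this test installation. Only three virtual machines were used for the test. A general introduction to Hadoop is omitted here, since plenty of material is available elsewhere.
From that material the roles can be summarized as follows: the NameNode and JobTracker dispatch work, while the DataNodes and TaskTrackers store data and run the computation. A cluster can therefore consist of one NameNode+JobTracker machine and any number of DataNode+TaskTracker machines. (JobTracker and TaskTracker are the Hadoop 1.x names; since Hadoop 2.x the YARN ResourceManager and NodeManager fill these roles, but this document keeps the original labels.)
### Note: this text was copied straight from a Word document into the blog editor, so watch out for missing spaces and similar problems!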
1.2 Version Information
Table 1-1 lists the software versions used for this test installation.
Table 1-1: Software versions
Name              Version
Operating system  CentOS-6.8-x86_64-bin-DVD1.iso
Java              jdk-8u121-linux-x64.tar.gz
Hadoop            hadoop-3.0.0-alpha2.tar.gz
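1.3 Test Environment
The environment is installed and tested on virtual machines. The Hadoop cluster has 1 master and 2 slaves, and the nodes can reach one another over the internal network. Table 1-2 lists the host names and IP addresses of the virtual machines.
Table 1-2: Host names and IP addresses
Host name  Simulated external IP (eth1)  Notes
master     192.168.24.15                 NameNode+JobTracker
slave1     192.168.24.16                 DataNode+TaskTracker
slave2     192.168.24.17                 DataNode+TaskTracker
### Note: content that appeared with gray shading in the original document is file content being edited or command output.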
2. Operating System Settings
1. Install common software
### The operating system was a minimal install, so install some commonly used packages
# yum install gcc gcc-c++ openssh-clients vim make ntpdate unzip cmake tcpdump openssl openssl-devel lzo lzo-devel zlib zlib-devel snappy snappy-devel lz4 lz4-devel bzip2 bzip2-devel wget
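A quick check that the key tools are on the PATH after the install can catch a typo in the package list. The following is a minimal sketch; the helper name `missing_cmds` is hypothetical and the command list in the usage comment is only an example, not something the original post prescribes.

```shell
# missing_cmds: print every command from the argument list that is not on PATH.
# Hypothetical helper for illustration; not part of the original post.
missing_cmds() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            printf '%s\n' "$cmd"
        fi
    done
}

# Example usage (prints nothing when everything is installed):
#   missing_cmds gcc make cmake wget unzip ntpdate tcpdump
```

An empty result means every listed command was found; any name printed points at a package that still needs installing.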
2. Change the host name
# vim /etc/sysconfig/network # on the other two nodes use slave1 and slave2 respectively
NETWORKING=yes
HOSTNAME=master
3. Configure the hosts file
# vim /etc/hosts # add the following entries on the master and on both slaves
10.0.24.15 master
10.0.24.16 slave1
10.0.24.17 slave2
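Because the same three entries go onto every node, it can help to add them in a way that is safe to repeat. The sketch below assumes a hypothetical helper, `add_host_entry`, which appends a line only when the host name is not already mapped; it takes the file path as a parameter so it can be tried on a copy before touching /etc/hosts.

```shell
# add_host_entry FILE IP NAME
# Append "IP NAME" to FILE unless some non-comment line already maps NAME.
# Hypothetical helper for illustration; not part of the original post.
add_host_entry() {
    file=$1
    ip=$2
    name=$3
    [ -e "$file" ] || : > "$file"      # create the file if it does not exist
    # Look for NAME as a whole field after the address column.
    if ! awk -v n="$name" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }' "$file"; then
        printf '%s %s\n' "$ip" "$name" >> "$file"
    fi
}

# Example usage, run as root on each node:
#   add_host_entry /etc/hosts 10.0.24.15 master
#   add_host_entry /etc/hosts 10.0.24.16 slave1
#   add_host_entry /etc/hosts 10.0.24.17 slave2
```

Running the same commands a second time leaves the file unchanged, which makes the step easy to script across all three machines.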
4. Create the account
# useradd hadoop
5. File handle limits
# vim /etc/security/limits.conf
* soft nofile 65000
* hard nofile 65535
$ ulimit -n # check the current limit
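Limits set in limits.conf apply to new login sessions, so the value printed by `ulimit -n` only changes after logging in again. As a minimal sketch, the check can be wrapped in a function that compares the current soft limit with an expected minimum; the name `nofile_at_least` is hypothetical.

```shell
# nofile_at_least MIN
# Succeed if the current soft open-file limit is at least MIN.
# Hypothetical helper for illustration; not part of the original post.
nofile_at_least() {
    current=$(ulimit -n)
    # "unlimited" always satisfies the requirement.
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$1" ]
}

# Example usage as the hadoop user, after logging in again:
#   nofile_at_least 65000 && echo "limit ok" || echo "limit too low"
```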
6. Kernel parameter tuning (sysctl.conf)
net.ipv4.ip_forward = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.accept_source_route = 0
kernel.sysrq = 0
kernel.core_uses_pid = 1
net.ipv4.tcp_syncookies = 1
kernel.msgmnb = 65536
kernel.msgmax = 65536
kernel.shmmax = 68719476736
kernel.shmall = 4294967296
net.ipv4.tcp_max_tw_buckets = 60000
net.ipv4.tcp_sack = 1
net.ipv4.tcp_window_scaling = 1
net.ipv4.tcp_rmem = 4096 87380 4194304
net.ipv4.tcp_wmem = 4096 16384 4194304
net.core.wmem_default = 8388608
net.core.rmem_default = 8388608
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.core.netdev_max_backlog = 262144
net.core.somaxconn = 262144
net.ipv4.tcp_max_orphans = 3276800
net.ipv4.tcp_max_syn_backlog = 262144
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 1
net.ipv4.tcp_syn_retries = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_mem = 94500000 915000000 927000000
net.ipv4.tcp_fin_timeout = 1
net.ipv4.tcp_keepalive_time = 1200
net.ipv4.tcp_max_syn_backlog = 65536
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_tw_recycle = 1
#net.ipv4.tcp_tw_len = 1
net.ipv4.tcp_tw_reuse = 1
#net.ipv4.tcp_fin_timeout = 30
#net.ipv4.tcp_keepalive_time = 120
net.ipv4.ip_local_port_range = 1024 65535
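The listing above sets several keys more than once (for example net.ipv4.tcp_syn_retries and net.ipv4.tcp_max_syn_backlog). When the file is loaded with `sysctl -p`, lines are applied in order, so the last assignment of a key is the one that takes effect. The sketch below is one possible way to list such keys together with their effective value; the function name `sysctl_duplicates` is hypothetical, and it assumes values that do not themselves contain "=".

```shell
# sysctl_duplicates FILE
# Print each key that is assigned more than once, with its last value.
# Hypothetical helper for illustration; not part of the original post.
sysctl_duplicates() {
    awk -F= '
        /^[ \t]*$/    { next }     # skip blank lines
        /^[ \t]*[#;]/ { next }     # skip comment lines
        {
            key = $1
            val = $2
            gsub(/^[ \t]+|[ \t]+$/, "", key)    # trim surrounding whitespace
            gsub(/^[ \t]+|[ \t]+$/, "", val)
            if (!(key in count)) order[++n] = key   # remember first-seen order
            count[key]++
            last[key] = val
        }
        END {
            for (i = 1; i <= n; i++)
                if (count[order[i]] > 1)
                    printf "%s = %s (set %d times)\n", order[i], last[order[i]], count[order[i]]
        }' "$1"
}

# Example usage:
#   sysctl_duplicates /etc/sysctl.conf
```

Reviewing this output before running `sysctl -p` makes it clear which of the conflicting values will actually be in force.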
7. Disable SELinux
# vim /etc/selinux/config
#SELINUX=enforcing
#SELINUXTYPE=targeted
SELINUX=disabled
# reboot # reboot the server for the change to take effect
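Before rebooting, it may be worth confirming that the file really contains one active SELINUX= line with the intended value. A minimal sketch follows; the function name `selinux_config_mode` is hypothetical, and the path defaults to /etc/selinux/config when no argument is given.

```shell
# selinux_config_mode [FILE]
# Print the value of the last uncommented SELINUX= line in FILE.
# Hypothetical helper for illustration; not part of the original post.
selinux_config_mode() {
    # "SELINUX=" must be followed directly by the value, so SELINUXTYPE=
    # and commented-out lines are ignored.
    sed -n 's/^[[:space:]]*SELINUX=[[:space:]]*\([A-Za-z]*\).*/\1/p' \
        "${1:-/etc/selinux/config}" | tail -n 1
}

# Example usage:
#   selinux_config_mode          # reads /etc/selinux/config
```

After the reboot, `getenforce` reports the mode that is actually in effect.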
8. Configure ssh
# vim /etc/ssh/sshd_config # uncomment the following lines by removing the leading "#"
HostKey /etc/ssh/ssh_host_rsa_key
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
# /etc/init.d/sshd restart
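The same uncommenting has to be repeated on all three nodes, so a scripted version can save some editing. The sketch below assumes a hypothetical helper, `uncomment_directive`, that strips the leading "#" from a named directive; it takes the file path as a parameter so it can be tried on a copy of sshd_config first.

```shell
# uncomment_directive FILE KEYWORD
# Remove the leading "#" from lines of the form "#KEYWORD value" in FILE.
# Hypothetical helper for illustration; not part of the original post.
uncomment_directive() {
    file=$1
    key=$2
    tmp=$(mktemp) || return 1
    # Write the edited text to a temp file, then copy it back so the
    # original file keeps its owner and permissions.
    sed "s/^#[[:space:]]*\(${key}[[:space:]]\)/\1/" "$file" > "$tmp" \
        && cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example usage, run as root:
#   for d in HostKey RSAAuthentication PubkeyAuthentication AuthorizedKeysFile; do
#       uncomment_directive /etc/ssh/sshd_config "$d"
#   done
```

Note that the default file may contain several commented HostKey lines, and this sketch would uncomment all of them, so review the result. Running `sshd -t` before the restart checks the edited file for syntax errors.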
9. Configure passwordless login between the master and the slaves
(1) Generate a key pair on the master and on each slave
# su - hadoop
$ ssh-keygen -b 1024 -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
/etc/profile.d/java.sh> /etc/profile.d/hadoop.sh