
Quickly Setting Up a Basic Hadoop, ZK, and HBase Cluster

2024-10-17 · Source: Personal Tech Collection

1. Installing the ZK, Hadoop, and HBase Clusters

Nodes: Linux121, Linux122, Linux123
Components to install: Hadoop, MySQL, ZK, HBASE

1.1 Install VMware and Build the VM Cluster

1.1.1 Install VMware (VMware-workstation-full-15.5.5-16285975)

License key:

UY758-0RXEQ-M81WP-8ZM7Z-Y3HDA

1.1.2 Install CentOS 7

Password: 123456

1.1.3 Configure a Static IP

vi /etc/sysconfig/network-scripts/ifcfg-ens33

:wq
systemctl restart network
ip addr

ping www.baidu.com
Take a snapshot.
Install the JDK:
mkdir -p /opt/lagou/software    # directory for installation packages
mkdir -p /opt/lagou/servers     # directory for installed software
rpm -qa | grep java
# Remove the packages listed above
sudo yum remove java-1.8.0-openjdk
# Upload jdk-8u421-linux-x64.tar.gz
chmod 755 jdk-8u421-linux-x64.tar.gz
# Extract into /opt/lagou/servers
tar -zxvf jdk-8u421-linux-x64.tar.gz -C /opt/lagou/servers
cd /opt/lagou/servers
ll
Configure the environment:
vi /etc/profile
export JAVA_HOME=/opt/lagou/servers/jdk1.8.0_421
export CLASSPATH=.:${JAVA_HOME}/jre/lib/rt.jar:${JAVA_HOME}/lib/dt.jar:${JAVA_HOME}/lib/tools.jar
export PATH=$PATH:${JAVA_HOME}/bin

source /etc/profile
java -version

1.1.4 Install Xmanager

Connect to 192.168.49.121:22, password: 123456

1.1.5 Clone Two More Machines and Configure Them

vi /etc/sysconfig/network-scripts/ifcfg-ens33

systemctl restart network
ip addr
hostnamectl
hostnamectl set-hostname linux121

Disable the firewall:
systemctl status firewalld
systemctl stop firewalld
systemctl disable firewalld

Disable SELinux:
vi /etc/selinux/config    # set SELINUX=disabled

Passwordless SSH between the three machines:
vi /etc/hosts

192.168.49.121 linux121
192.168.49.122 linux122
192.168.49.123 linux123

Step 1: run ssh-keygen -t rsa on centos7-1, centos7-2, and centos7-3 to generate a key pair on each:
ssh-keygen -t rsa
Step 2: on each of the three machines, copy the public key to every node:
ssh-copy-id linux121    # copy the key to centos7-1
ssh-copy-id linux122    # copy the key to centos7-2
ssh-copy-id linux123    # copy the key to centos7-3
Step 3: on centos7-1, distribute the merged authorized_keys:
scp /root/.ssh/authorized_keys linux121:$PWD
scp /root/.ssh/authorized_keys linux122:$PWD
scp /root/.ssh/authorized_keys linux123:$PWD
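The key exchange above can be verified with a short loop. BatchMode makes ssh fail instead of prompting, so any node that still asks for a password shows up immediately. This is a sketch, assuming the three hostnames from /etc/hosts and a root login:

```shell
# Check passwordless SSH to each node; prints one OK/NOT-working line per host.
check_ssh() {
  for h in linux121 linux122 linux123; do
    # BatchMode=yes: fail instead of prompting for a password
    if ssh -o BatchMode=yes -o ConnectTimeout=3 "$h" true 2>/dev/null; then
      echo "$h: OK"
    else
      echo "$h: passwordless login NOT working"
    fi
  done
}
check_ssh
```

Run it on each of the three nodes; every line should read OK before moving on.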
Clock synchronization across the three machines:
sudo cp -a /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
sudo curl -o /etc/yum.repos.d/CentOS-Base.repo http://mirrors.aliyun.com/repo/Centos-7.repo
sudo yum clean all
sudo yum makecache
sudo yum install ntpdate
ntpdate us.pool.ntp.org
crontab -e
*/1 * * * * /usr/sbin/ntpdate us.pool.ntp.org
Take a snapshot.

1.2 Install the ZK, Hadoop, and HBase Clusters, and MySQL

1.2.1 Install the Hadoop Cluster

Create the directories under /opt:
mkdir -p /opt/lagou/software    # directory for installation packages
mkdir -p /opt/lagou/servers     # directory for installed software
Upload the Hadoop archive to /opt/lagou/software:
https://archive.apache.org/dist/hadoop/common/hadoop-2.9.2/hadoop-2.9.2.tar.gz

On node linux121:
tar -zxvf hadoop-2.9.2.tar.gz -C /opt/lagou/servers
ll /opt/lagou/servers/hadoop-2.9.2
yum install -y vim
Add the environment variables:
vim /etc/profile
##HADOOP_HOME
export HADOOP_HOME=/opt/lagou/servers/hadoop-2.9.2
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
source /etc/profile
hadoop version
HDFS cluster configuration:
cd /opt/lagou/servers/hadoop-2.9.2/etc/hadoop
vim hadoop-env.sh
export JAVA_HOME=/opt/lagou/servers/jdk1.8.0_421

vim core-site.xml
<!-- NameNode address in HDFS -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://linux121:9000</value>
</property>
<!-- Directory for files Hadoop generates at runtime -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/lagou/servers/hadoop-2.9.2/data/tmp</value>
</property>

vim slaves
linux121
linux122
linux123

vim mapred-env.sh
export JAVA_HOME=/opt/lagou/servers/jdk1.8.0_421

mv mapred-site.xml.template mapred-site.xml
vim mapred-site.xml
<!-- Run MapReduce on YARN -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

Add the following to the same file (vi mapred-site.xml):
<!-- JobHistory server address -->
<property>
    <name>mapreduce.jobhistory.address</name>
    <value>linux121:10020</value>
</property>
<!-- JobHistory web UI address -->
<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>linux121:19888</value>
</property>

vim yarn-env.sh
export JAVA_HOME=/opt/lagou/servers/jdk1.8.0_421

vim yarn-site.xml
<!-- YARN ResourceManager address -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>linux123</value>
</property>
<!-- How reducers fetch data -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

Add the following to the same file (vi yarn-site.xml):
<!-- Enable log aggregation -->
<property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
</property>
<!-- Retain logs for 7 days -->
<property>
    <name>yarn.log-aggregation.retain-seconds</name>
    <value>604800</value>
</property>
<property>
    <name>yarn.log.server.url</name>
    <value>http://linux121:19888/jobhistory/logs</value>
</property>
chown -R root:root /opt/lagou/servers/hadoop-2.9.2
Distribute the configuration.
On all three machines:
sudo yum install -y rsync

touch rsync-script
vim rsync-script

#!/bin/bash
# 1. Get the number of arguments; exit immediately if zero
paramnum=$#
if ((paramnum == 0)); then
    echo no params
    exit
fi
# 2. Get the file name from the argument
p1=$1
file_name=$(basename "$p1")
echo fname=$file_name
# 3. Get the absolute path of the argument
pdir=$(cd -P "$(dirname "$p1")"; pwd)
echo pdir=$pdir
# 4. Get the current user name
user=$(whoami)
# 5. Loop over the nodes and rsync
for ((host = 121; host < 124; host++)); do
    echo ------------------- linux$host --------------
    rsync -rvl "$pdir/$file_name" "$user@linux$host:$pdir"
done

chmod 777 rsync-script
./rsync-script /home/root/bin
./rsync-script /opt/lagou/servers/hadoop-2.9.2
./rsync-script /opt/lagou/servers/jdk1.8.0_421
./rsync-script /etc/profile
Format the NameNode on linux121:
hadoop namenode -format
ssh localhost
Start the cluster:
stop-dfs.sh
stop-yarn.sh
sbin/start-dfs.sh

If a DataNode fails to come up:
sudo rm -rf /opt/lagou/servers/hadoop-2.9.2/data/tmp/*
hadoop namenode -format
sbin/start-dfs.sh

Note: the NameNode and ResourceManager are on different machines, so do not start YARN on the NameNode; start it on the machine running the ResourceManager:
sbin/start-yarn.sh
On linux121:
sbin/mr-jobhistory-daemon.sh start historyserver

Web UIs:
HDFS: http://linux121:50070/dfshealth.html#tab-overview
Job history: http://linux121:19888/jobhistory
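Once the daemons are started, the two web endpoints above can be probed from the command line instead of a browser. A sketch, assuming the cluster is running and curl is installed (it ships with CentOS 7):

```shell
# Probe the HDFS NameNode and JobHistory web UIs; one status line per URL.
check_web() {
  for url in http://linux121:50070 http://linux121:19888; do
    # -sf: silent, and fail on HTTP errors; short timeout so a down host returns fast
    if curl -sf -o /dev/null --connect-timeout 3 "$url" 2>/dev/null; then
      echo "$url: up"
    else
      echo "$url: not reachable"
    fi
  done
}
check_web
```

Both lines should report "up" before proceeding to the test job.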

cd /opt/lagou/servers/hadoop-2.9.2

sbin/mr-jobhistory-daemon.sh stop historyserver

stop-yarn.sh

stop-dfs.sh

Test

hdfs dfs -mkdir /wcinput
cd /root/
touch wc.txt
vi wc.txt
hadoop mapreduce yarn hdfs hadoop mapreduce mapreduce yarn lagou lagou lagou
Save and quit: :wq!
hdfs dfs -put wc.txt /wcinput
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.2.jar wordcount /wcinput /wcoutput
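Before running the job, you can predict what wordcount should emit by counting the sample words locally with a plain shell pipeline (a sanity check only; the actual MapReduce output lands in /wcoutput as word/count pairs):

```shell
# Recreate the sample wc.txt content locally and count each word,
# the same grouping the wordcount job performs.
printf 'hadoop mapreduce yarn hdfs hadoop mapreduce mapreduce yarn lagou lagou lagou\n' > /tmp/wc.txt
# Split on spaces into one word per line, then count duplicates
tr -s ' ' '\n' < /tmp/wc.txt | sort | uniq -c
```

The counts printed here (e.g. mapreduce and lagou appearing three times each) should match the job's output read back with hdfs dfs -cat /wcoutput/part-r-00000.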

1.2.2 Install the ZK Cluster

Upload and extract zookeeper-3.4.14.tar.gz:
tar -zxvf zookeeper-3.4.14.tar.gz -C ../servers/

Modify the config file and create the data and log directories:
# Create the zk data directory
mkdir -p /opt/lagou/servers/zookeeper-3.4.14/data
# Create the zk log directory
mkdir -p /opt/lagou/servers/zookeeper-3.4.14/data/logs
# Edit the zk config
cd /opt/lagou/servers/zookeeper-3.4.14/conf
# Rename the sample file
mv zoo_sample.cfg zoo.cfg
vim zoo.cfg
# Update dataDir
dataDir=/opt/lagou/servers/zookeeper-3.4.14/data
# Add dataLogDir
dataLogDir=/opt/lagou/servers/zookeeper-3.4.14/data/logs
# Add the cluster configuration
## server.<server-id>=<server-ip>:<peer-communication-port>:<leader-election-port>
server.1=linux121:2888:3888
server.2=linux122:2888:3888
server.3=linux123:2888:3888
# Uncomment: ZK can auto-purge transaction logs and snapshots; this sets the purge interval in hours
autopurge.purgeInterval=1
cd /opt/lagou/servers/zookeeper-3.4.14/data
echo 1 > myid

Distribute the package and adjust myid:
cd /opt/lagou/servers/hadoop-2.9.2/etc/hadoop
./rsync-script /opt/lagou/servers/zookeeper-3.4.14
Set myid on linux122:
echo 2 > /opt/lagou/servers/zookeeper-3.4.14/data/myid
Set myid on linux123:
echo 3 > /opt/lagou/servers/zookeeper-3.4.14/data/myid

Start the three zk instances in turn (run on all three nodes):
/opt/lagou/servers/zookeeper-3.4.14/bin/zkServer.sh start
Check zk status:
/opt/lagou/servers/zookeeper-3.4.14/bin/zkServer.sh status

Cluster start/stop script:
vim zk.sh

#!/bin/sh
echo "start zookeeper server..."
if (($# == 0)); then
    echo "no params"
    exit
fi
hosts="linux121 linux122 linux123"
for host in $hosts; do
    ssh $host "source /etc/profile; /opt/lagou/servers/zookeeper-3.4.14/bin/zkServer.sh $1"
done

chmod 777 zk.sh
./zk.sh start
./zk.sh stop
./zk.sh status
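Each myid must match that host's server.N line in zoo.cfg; since the hostnames end in 121, 122, and 123, the id can be derived from the hostname instead of typed by hand on each node, which avoids the easy mistake of copying the wrong value. A sketch, assuming the linux12N naming used above:

```shell
# Map a hostname of the form linux121..linux123 to its ZooKeeper server id.
# server.1=linux121, server.2=linux122, server.3=linux123 per zoo.cfg above.
myid_for() {
  case "$1" in
    linux121) echo 1 ;;
    linux122) echo 2 ;;
    linux123) echo 3 ;;
    *) echo "unknown host: $1" >&2; return 1 ;;
  esac
}
# On a real node, write the id for this machine (uncomment to use):
# myid_for "$(hostname)" > /opt/lagou/servers/zookeeper-3.4.14/data/myid
myid_for linux122
```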

1.2.3 Install the HBase Cluster (Hadoop and ZK must be started before HBase)

Extract the archive hbase-1.3.1-bin.tar.gz to the planned directory:
tar -zxvf hbase-1.3.1-bin.tar.gz -C /opt/lagou/servers

Modify the configuration files

Link Hadoop's core-site.xml and hdfs-site.xml into the conf folder under the HBase install directory:

ln -s /opt/lagou/servers/hadoop-2.9.2/etc/hadoop/core-site.xml /opt/lagou/servers/hbase-1.3.1/conf/core-site.xml
ln -s /opt/lagou/servers/hadoop-2.9.2/etc/hadoop/hdfs-site.xml /opt/lagou/servers/hbase-1.3.1/conf/hdfs-site.xml

Edit the config files under the conf directory:

cd /opt/lagou/servers/hbase-1.3.1/conf
vim hbase-env.sh
# Add the Java environment variable
export JAVA_HOME=/opt/lagou/servers/jdk1.8.0_421
# Use the external zk cluster instead of the bundled one
export HBASE_MANAGES_ZK=FALSE

vim hbase-site.xml
<configuration>
    <!-- HBase storage path on HDFS -->
    <property>
        <name>hbase.rootdir</name>
        <value>hdfs://linux121:9000/hbase</value>
    </property>
    <!-- Run HBase in distributed mode -->
    <property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
    </property>
    <!-- zk addresses, comma-separated -->
    <property>
        <name>hbase.zookeeper.quorum</name>
        <value>linux121:2181,linux122:2181,linux123:2181</value>
    </property>
</configuration>

vim regionservers
linux121
linux122
linux123
vim backup-masters
linux122

vim /etc/profile
export HBASE_HOME=/opt/lagou/servers/hbase-1.3.1
export PATH=$PATH:$HBASE_HOME/bin
Distribute the hbase directory and environment variables to the other nodes:
cd /opt/lagou/servers/hadoop-2.9.2/etc/hadoop
./rsync-script /opt/lagou/servers/hbase-1.3.1
./rsync-script /etc/profile
Apply the hbase environment variables on every node:
source /etc/profile

Starting and stopping the HBase cluster (prerequisite: the hadoop and zk clusters are running):
cd /opt/lagou/servers/hbase-1.3.1/bin
Start HBase: start-hbase.sh
Stop HBase: stop-hbase.sh

HBase web management UI: once the cluster is up, visit <HMaster hostname>:16010, i.e. linux121:16010
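Beyond the web UI, a quick smoke test from the HBase shell confirms the cluster can actually create, write, and read a table. The table name smoke_test and column family f1 below are illustrative, not from the original steps:

```shell
# Write an HBase shell script that creates a table, puts one cell,
# scans it back, then cleans up. Names are made up for illustration.
cat > /tmp/hbase-smoke.txt <<'EOF'
create 'smoke_test', 'f1'
put 'smoke_test', 'row1', 'f1:c1', 'hello'
scan 'smoke_test'
disable 'smoke_test'
drop 'smoke_test'
EOF
# On a node with the cluster running, feed the commands to the shell:
# hbase shell /tmp/hbase-smoke.txt
```

The scan step should print row1 with value hello; the disable/drop pair removes the test table afterwards.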

1.2.4 Install MySQL

Remove the distribution's bundled mysql:
rpm -qa | grep mysql
rpm -e --nodeps mysql-libs-5.1.73-8.el6_8.x86_64
Install mysql-community-release-el6-5.noarch.rpm:
rpm -ivh mysql-community-release-el6-5.noarch.rpm
Install the mysql server:
yum -y install mysql-community-server
Start the service:
service mysqld start
If "service: command not found" appears, install it:
yum install initscripts

Configure the database.
# Set the password
/usr/bin/mysqladmin -u root password '123'
# Log in to mysql
mysql -uroot -p123
# Empty the mysql config file
>/etc/my.cnf
# Edit it
vi /etc/my.cnf
[client]
default-character-set=utf8
[mysql]
default-character-set=utf8
[mysqld]
character-set-server=utf8
# Restart, verify, and grant remote access
service mysqld restart
mysql -uroot -p123
show variables like 'character_set_%';
# Grant root both local and remote access
grant all privileges on *.* to 'root'@'%' identified by '123' with grant option;
# Refresh privileges (optional)
flush privileges;
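After the grant, it is worth checking from another node that remote access and the utf8 settings really took effect. A sketch that writes the check statements to a file; the hostname and credentials are the ones used in the steps above:

```shell
# SQL statements to verify the remote grant and the utf8 configuration.
cat > /tmp/mysql-check.sql <<'EOF'
SELECT user, host FROM mysql.user WHERE user = 'root';
SHOW VARIABLES LIKE 'character_set_%';
EOF
# From any other node (e.g. linux122), run against the mysql host:
# mysql -h linux121 -uroot -p123 < /tmp/mysql-check.sql
```

The first query should list a root row with host '%', and the variables should report utf8 for the server and client character sets.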
Take a snapshot.

Missing images are no great loss; they were only illustrations, and all the steps are here.
