Posted by qweewq123qwe on 2018-9-27 11:01:40

mysql hive

  Environment: centos-6.5_x86_64
  I. MySQL installation
  1. Check whether the system already has MySQL installed
  # rpm -q mysql
  package mysql is not installed
  2. Install the server, client, and development packages via yum
  # yum install -y mysql-server mysql mysql-devel
  # rpm -qi mysql-server
  # service mysqld start
  Initializing MySQL database: Installing MySQL system tables...
  OK
  Filling help tables...
  OK
  To start mysqld at boot time you have to copy
  support-files/mysql.server to the right place for your system
  PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
  To do so, start the server, then issue the following commands:
  /usr/bin/mysqladmin -u root password 'new-password'
  /usr/bin/mysqladmin -u root -h slave2.hadoop password 'new-password'
  Alternatively you can run:
  /usr/bin/mysql_secure_installation
  which will also give you the option of removing the test
  databases and anonymous user created by default. This is
  strongly recommended for production servers.
  See the manual for more instructions.
  You can start the MySQL daemon with:
  cd /usr ; /usr/bin/mysqld_safe &
  You can test the MySQL daemon with mysql-test-run.pl
  cd /usr/mysql-test ; perl mysql-test-run.pl
  Please report any problems with the /usr/bin/mysqlbug script!
  
  Starting mysqld:                                          [  OK  ]
  # /usr/bin/mysqladmin -u root password '111111'
  # /usr/bin/mysqladmin -u root -h slave2.hadoop password '111111'
  # /usr/bin/mysql_secure_installation
  --- Run the commands above to apply the security-related hardening
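  mysql_secure_installation walks through a few interactive prompts; the same hardening can be sketched non-interactively with plain SQL. This is only an illustration of what the script does on a fresh 5.1 install, assuming the root password '111111' set above:

  ```shell
  # Manual equivalent of the main mysql_secure_installation steps
  # (sketch only; assumes the root password '111111' set earlier)
  mysql -u root -p111111 <<'SQL'
  -- remove the anonymous accounts created by the default install
  DELETE FROM mysql.user WHERE User='';
  -- disallow remote root login
  DELETE FROM mysql.user WHERE User='root'
    AND Host NOT IN ('localhost','127.0.0.1','::1');
  -- drop the test database and any privileges pointing at it
  DROP DATABASE IF EXISTS test;
  DELETE FROM mysql.db WHERE Db='test' OR Db='test\_%';
  FLUSH PRIVILEGES;
  SQL
  ```

  Running the interactive script remains the safer choice; the SQL above just shows which changes it is making.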
  ---------------
  # cat /etc/my.cnf
  [mysqld]
  datadir=/var/lib/mysql
  socket=/var/lib/mysql/mysql.sock
  user=mysql
  # Disabling symbolic-links is recommended to prevent assorted security risks
  symbolic-links=0

  [mysqld_safe]
  log-error=/var/log/mysqld.log
  pid-file=/var/run/mysqld/mysqld.pid
  ----------------
  # which mysqld
  /usr/bin/which: no mysqld in (/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
  # find / -name mysqld
  /usr/libexec/mysqld
  /var/lock/subsys/mysqld
  /var/run/mysqld
  /etc/logrotate.d/mysqld
  /etc/rc.d/init.d/mysqld
  # /usr/libexec/mysqld --verbose --help | grep -A 1 'Default options'
  Default options are read from the following files in the given order:
  /etc/mysql/my.cnf /etc/my.cnf ~/.my.cnf
  As the output above shows, the server reads the option files in that order: /etc/mysql/my.cnf first, then /etc/my.cnf, then ~/.my.cnf. Files that do not exist are simply skipped, and settings read from later files override earlier ones.
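  You can also ask the server which option values it would actually end up with, without starting it (a quick check, using the binary path located by the find above):

  ```shell
  # Print the options mysqld would read from the files above;
  # this only parses the option files, it does not start the server
  /usr/libexec/mysqld --print-defaults

  # To bypass the search order and use one specific file instead:
  # /usr/libexec/mysqld --defaults-file=/etc/my.cnf --print-defaults
  ```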
  # chkconfig --list | grep mysqld
  mysqld         0:off 1:off 2:off 3:off 4:off 5:off 6:off
  # chkconfig mysqld on
  # chkconfig --list | grep mysqld
  mysqld         0:off 1:off 2:on 3:on 4:on 5:on 6:off
  # mysql -u root -p
  Enter password:
  Welcome to the MySQL monitor.  Commands end with ; or \g.

  Your MySQL connection id is ...
  Server version: 5.1.73 Source distribution
  Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
  Oracle is a registered trademark of Oracle Corporation and/or its
  affiliates. Other names may be trademarks of their respective
  owners.
  Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
  mysql> show tables;
  ERROR 1046 (3D000): No database selected
  mysql> show databases;
  +--------------------+
  | Database           |
  +--------------------+
  | information_schema |
  | mysql              |
  +--------------------+
  2 rows in set (0.01 sec)
  mysql>
  --------------
  II. Hive
  1. Download, install, and configure the environment
  $ wget http://mirror.bit.edu.cn/apache/hive/stable/apache-hive-0.13.0-bin.tar.gz
  $ tar xvf apache-hive-0.13.0-bin.tar.gz
  $ vi .bash_profile
  export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin
  export MAVEN_HOME=/home/hadoop/apache-maven-3.1.1
  export PATH=/home/hadoop/apache-maven-3.1.1/bin:$PATH
  export HADOOP_PREFIX=/home/hadoop/hadoop-2.2.0
  export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
  export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
  export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
  export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
  export HADOOP_YARN_HOME=${HADOOP_PREFIX}
  export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
  export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
  export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
  export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
  export YARN_HOME=${HADOOP_PREFIX}
  export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
  export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
  export SCALA_HOME=/home/hadoop/scala-2.10.1
  export PATH=$PATH:$SCALA_HOME/bin
  export SPARK_HOME=/home/hadoop/spark-0.9.1-bin-hadoop2
  export FLUME_HOME=/home/hadoop/apache-flume-1.4.0-bin
  export FLUME_CONF_DIR=$FLUME_HOME/conf
  export PATH=.:$PATH:$FLUME_HOME/bin
  export HIVE_HOME=/home/hadoop/apache-hive-0.13.0-bin
  export PATH=$HIVE_HOME/bin:$PATH
  $ . .bash_profile
  $ hdfs dfs -mkdir /tmp
  $ hdfs dfs -mkdir -p /user/hive/warehouse
  $ hdfs dfs -chmod +w /tmp
  $ hdfs dfs -chmod +w /user/hive/warehouse
  $ cd apache-hive-0.13.0-bin/conf/
  $ cp hive-default.xml.template hive-site.xml
  $ cp hive-env.sh.template hive-env.sh
  $ cp hive-exec-log4j.properties.template hive-exec-log4j.properties
  $ cp hive-log4j.properties.template hive-log4j.properties
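  hive-env.sh is copied above but left at its template defaults. If Hive cannot locate the Hadoop install on its own, it may need the paths set explicitly; a minimal sketch, with paths taken from the .bash_profile above:

  ```shell
  # Append to conf/hive-env.sh (paths assumed from the .bash_profile above)
  export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
  export HIVE_CONF_DIR=/home/hadoop/apache-hive-0.13.0-bin/conf
  ```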
  $ hive
  hive> add jar /home/hadoop/apache-hive-0.13.0-bin/lib/hive-contrib-0.13.0.jar;
  hive> exit;
  $ vi .bash_profile
  Add:
  export ...
  $ source .bash_profile
  $ mysql -u root -p
  Enter password:
  mysql> create database hadoop;
  mysql> create user 'hive'@'slave2.hadoop' identified by '111111';
  Query OK, 0 rows affected (0.07 sec)
  mysql> GRANT ALL PRIVILEGES ON hadoop.* TO 'hive'@'slave2.hadoop' WITH GRANT OPTION;
  Query OK, 0 rows affected (0.00 sec)
  mysql> flush privileges;
  mysql> exit
  Bye
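  Before editing hive-site.xml it is worth confirming that the new account can actually reach the metastore database over the network. A quick check, using the hostname and password created above:

  ```shell
  # Verify the hive account can log in from the network and see its database
  mysql -h slave2.hadoop -u hive -p111111 -e 'show databases;'
  # The output should list information_schema and hadoop if the grant took effect
  ```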
  Edit the configuration file:
  $ pwd
  /home/hadoop/apache-hive-0.13.0-bin/conf
  $ vi hive-site.xml
  Delete:
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
    <description>controls whether to connect to remote metastore server or open a new metastore server in Hive Client JVM</description>
  </property>
  --- Before this block was deleted, running hive emitted the following warning:
  ~~~~~~~~~
  WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
  This warning is caused by a Hive bug; see https://issues.apache.org/jira/browse/HIVE-6159 for details. The bug is in fact fixed in 0.13, but the Hive tarball from the official site is compiled against the hadoop 0.20 platform, so Hive would need to be recompiled for this hadoop 2.2 environment.
  ~~~~~~~~~~
  Modify:
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://slave2.hadoop:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivepasswd</value>
    <description>password to use against metastore database</description>
  </property>
  --
  Add the JDBC jar:
  Copy the driver jar into $HIVE_HOME/lib/.
  I am currently running mysql-server-5.1.73.
  Download mysql-connector-java-5.1.30.tar.gz from http://dev.mysql.com/downloads/connector/j/
  $ tar xvf mysql-connector-java-5.1.30.tar.gz
  $ cp ~/mysql-connector-java-5.1.30/mysql-connector-java-5.1.30-bin.jar ~/apache-hive-0.13.0-bin/lib/
  $ hive
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
  14/05/22 11:08:24 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
  Logging initialized using configuration in file:/home/hadoop/apache-hive-0.13.0-bin/conf/hive-log4j.properties
  hive> CREATE TABLE maptile (ipaddress STRING,time STRING,method STRING,request STRING,protocol STRING,status STRING,size STRING,referer STRING,agent STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ("input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) (\"[^ ]*) ([^ ]*) ([^ ]*\") (-|*) (-|*)(?: ([^ \"]*| \".*\") ([^ \"]*|\".*\"))?","output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s %10$s %11$s")STORED AS TEXTFILE;
  OK
  Time taken: 0.18 seconds
  hive> load data inpath '/flume/172.20.105.133_server/20140520_date/FlumeData.1400629779172' overwrite into table maptile;
  Loading data to table default.maptile
  rmr: DEPRECATED: Please use 'rm -r' instead.
  Deleted hdfs://master.hadoop:9000/user/hive/warehouse/maptile
  Table default.maptile stats:
  OK
  Time taken: 1.435 seconds
  hive> create table result (ip string,num int) partitioned by (dt string);
  OK
  Time taken: 0.107 seconds
  hive> insert overwrite table result partition (dt='2014-5-20') select ipaddress,count(1) as numrequest from maptile group by ipaddress sort by numrequest desc;
  Total jobs = 2
  Launching Job 1 out of 2

  Number of reduce tasks not specified. Estimated from input data size: 1
  In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
  In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
  In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
  Starting Job = job_1400782075030_0001, Tracking URL = http://master.hadoop:8088/proxy/application_1400782075030_0001/
  Kill Command = /home/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1400782075030_0001
  Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
  2014-05-22 11:59:26,803 Stage-1 map = 0%,  reduce = 0%
  2014-05-22 11:59:42,986 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.15 sec
  2014-05-22 12:00:02,351 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.59 sec
  MapReduce Total cumulative CPU time: 4 seconds 590 msec
  Ended Job = job_1400782075030_0001
  Launching Job 2 out of 2

  Number of reduce tasks not specified. Estimated from input data size: 1
  In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
  In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
  In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
  Starting Job = job_1400782075030_0002, Tracking URL = http://master.hadoop:8088/proxy/application_1400782075030_0002/
  Kill Command = /home/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1400782075030_0002
  Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
  2014-05-22 12:00:32,149 Stage-2 map = 0%,  reduce = 0%
  2014-05-22 12:00:41,397 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 1.05 sec
  2014-05-22 12:01:23,745 Stage-2 map = 100%,  reduce = 67%, Cumulative CPU 2.77 sec
  2014-05-22 12:01:27,819 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 3.92 sec
  MapReduce Total cumulative CPU time: 3 seconds 920 msec
  Ended Job = job_1400782075030_0002
  Loading data to table default.result partition (dt=2014-5-20)
  : Skipping stats aggregation by error org.apache.hadoop.hive.ql.metadata.HiveException: : Stats aggregator of type counter cannot be connected to
  Partition default.result{dt=2014-5-20} stats:
  MapReduce Jobs Launched:
  Job 0: Map: 1  Reduce: 1   Cumulative CPU: 4.59 sec   HDFS Read: 40138523 HDFS Write: 117 SUCCESS
  Job 1: Map: 1  Reduce: 1   Cumulative CPU: 3.92 sec   HDFS Read: 486 HDFS Write: 10 SUCCESS
  Total MapReduce CPU Time Spent: 8 seconds 510 msec
  OK
  Time taken: 206.925 seconds
  ~~~~~~~~~~~~~~~
  --- I don't yet know what causes the "Skipping stats aggregation by error org.apache.hadoop.hive.ql.metadata.HiveException: : Stats aggregator of type counter cannot be connected to" message here; I'll look into it later.
  ~~~~~~~~~~~~~~~~~~~~~~
  hive> show tables;
  OK
  maptile
  result
  Time taken: 0.349 seconds, Fetched: 2 row(s)
  hive> select * from result;
  OK
  NULL    138265    2014-5-20
  Time taken: 0.766 seconds, Fetched: 1 row(s)
