Posted by qweewq123qwe on 2018-9-27 11:01:40

mysql hive

  Environment: centos-6.5_x86_64
  I. MySQL installation
  1. Check whether the system already has MySQL installed
  # rpm -q mysql
  package mysql is not installed
  2. Install the server, client, and development packages via yum
  # yum install -y mysql-server mysql mysql-devel
  # rpm -qi mysql-server
  # service mysqld start
  Initializing MySQL database: Installing MySQL system tables...
  OK
  Filling help tables...
  OK
  To start mysqld at boot time you have to copy
  support-files/mysql.server to the right place for your system
  PLEASE REMEMBER TO SET A PASSWORD FOR THE MySQL root USER !
  To do so, start the server, then issue the following commands:
  /usr/bin/mysqladmin -u root password 'new-password'
  /usr/bin/mysqladmin -u root -h slave2.hadoop password 'new-password'
  Alternatively you can run:
  /usr/bin/mysql_secure_installation
  which will also give you the option of removing the test
  databases and anonymous user created by default. This is
  strongly recommended for production servers.
  See the manual for more instructions.
  You can start the MySQL daemon with:
  cd /usr ; /usr/bin/mysqld_safe &
  You can test the MySQL daemon with mysql-test-run.pl
  cd /usr/mysql-test ; perl mysql-test-run.pl
  Please report any problems with the /usr/bin/mysqlbug script!
  
  Starting mysqld:                                          [  OK  ]
  # /usr/bin/mysqladmin -u root password '111111'
  # /usr/bin/mysqladmin -u root -h slave2.hadoop password '111111'
  # /usr/bin/mysql_secure_installation
  --- Run the commands above to apply the security-related hardening
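  mysql_secure_installation walks through a few interactive prompts; the same hardening can be sketched non-interactively with plain SQL. This is only an illustration of what the script does on a fresh 5.1 install, assuming the root password '111111' set above:

  ```shell
  # Manual equivalent of the main mysql_secure_installation steps
  # (sketch only; assumes the root password '111111' set earlier)
  mysql -u root -p111111 <<'SQL'
  -- remove the anonymous accounts created by the default install
  DELETE FROM mysql.user WHERE User='';
  -- disallow remote root login
  DELETE FROM mysql.user WHERE User='root'
    AND Host NOT IN ('localhost','127.0.0.1','::1');
  -- drop the test database and any privileges pointing at it
  DROP DATABASE IF EXISTS test;
  DELETE FROM mysql.db WHERE Db='test' OR Db='test\_%';
  FLUSH PRIVILEGES;
  SQL
  ```

  Running the interactive script remains the safer choice; the SQL above just shows which changes it is making.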
  ---------------
  # cat /etc/my.cnf
  [mysqld]
  datadir=/var/lib/mysql
  socket=/var/lib/mysql/mysql.sock
  user=mysql
  # Disabling symbolic-links is recommended to prevent assorted security risks
  symbolic-links=0

  [mysqld_safe]
  log-error=/var/log/mysqld.log
  pid-file=/var/run/mysqld/mysqld.pid
  ----------------
  # which mysqld
  /usr/bin/which: no mysqld in (/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin)
  # find / -name mysqld
  /usr/libexec/mysqld
  /var/lock/subsys/mysqld
  /var/run/mysqld
  /etc/logrotate.d/mysqld
  /etc/rc.d/init.d/mysqld
  # /usr/libexec/mysqld --verbose --help | grep -A 1 'Default options'
  Default options are read from the following files in the given order:
  /etc/mysql/my.cnf /etc/my.cnf ~/.my.cnf
  As the output above shows, the server reads the option files in that order: /etc/mysql/my.cnf first, then /etc/my.cnf, then ~/.my.cnf. Files that do not exist are simply skipped, and settings read from later files override earlier ones.
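  You can also ask the server which option values it would actually end up with, without starting it (a quick check, using the binary path located by the find above):

  ```shell
  # Print the options mysqld would read from the files above;
  # this only parses the option files, it does not start the server
  /usr/libexec/mysqld --print-defaults

  # To bypass the search order and use one specific file instead:
  # /usr/libexec/mysqld --defaults-file=/etc/my.cnf --print-defaults
  ```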
  # chkconfig --list | grep mysqld
  mysqld         0:off 1:off 2:off 3:off 4:off 5:off 6:off
  # chkconfig mysqld on
  # chkconfig --list | grep mysqld
  mysqld         0:off 1:off 2:on 3:on 4:on 5:on 6:off
  # mysql -u root -p
  Enter password:
  Welcome to the MySQL monitor.  Commands end with ; or \g.

  Your MySQL connection id is ...
  Server version: 5.1.73 Source distribution
  Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
  Oracle is a registered trademark of Oracle Corporation and/or its
  affiliates. Other names may be trademarks of their respective
  owners.
  Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
  mysql> show tables;
  ERROR 1046 (3D000): No database selected
  mysql> show databases;
  +--------------------+
  | Database           |
  +--------------------+
  | information_schema |
  | mysql              |
  +--------------------+
  2 rows in set (0.01 sec)
  mysql>
  --------------
  II. Hive
  1. Download, install, and configure the environment
  $ wget http://mirror.bit.edu.cn/apache/hive/stable/apache-hive-0.13.0-bin.tar.gz
  $ tar xvf apache-hive-0.13.0-bin.tar.gz
  $ vi .bash_profile
  export PATH=$PATH:$JAVA_HOME/bin:$JAVA_HOME/jre/bin
  export MAVEN_HOME=/home/hadoop/apache-maven-3.1.1
  export PATH=/home/hadoop/apache-maven-3.1.1/bin:$PATH
  export HADOOP_PREFIX=/home/hadoop/hadoop-2.2.0
  export PATH=$PATH:$HADOOP_PREFIX/bin:$HADOOP_PREFIX/sbin
  export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
  export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
  export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
  export HADOOP_YARN_HOME=${HADOOP_PREFIX}
  export HADOOP_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
  export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
  export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
  export HADOOP_OPTS="$HADOOP_OPTS -Djava.library.path=$HADOOP_HOME/lib/native"
  export YARN_HOME=${HADOOP_PREFIX}
  export YARN_CONF_DIR=$HADOOP_HOME/etc/hadoop
  export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
  export SCALA_HOME=/home/hadoop/scala-2.10.1
  export PATH=$PATH:$SCALA_HOME/bin
  export SPARK_HOME=/home/hadoop/spark-0.9.1-bin-hadoop2
  export FLUME_HOME=/home/hadoop/apache-flume-1.4.0-bin
  export FLUME_CONF_DIR=$FLUME_HOME/conf
  export PATH=.:$PATH:$FLUME_HOME/bin
  export HIVE_HOME=/home/hadoop/apache-hive-0.13.0-bin
  export PATH=$HIVE_HOME/bin:$PATH
  $ . .bash_profile
  $ hdfs dfs -mkdir /tmp
  $ hdfs dfs -mkdir -p /user/hive/warehouse
  $ hdfs dfs -chmod +w /tmp
  $ hdfs dfs -chmod +w /user/hive/warehouse
  $ cd apache-hive-0.13.0-bin/conf/
  $ cp hive-default.xml.template hive-site.xml
  $ cp hive-env.sh.template hive-env.sh
  $ cp hive-exec-log4j.properties.template hive-exec-log4j.properties
  $ cp hive-log4j.properties.template hive-log4j.properties
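  hive-env.sh is copied above but left at its template defaults. If Hive cannot locate the Hadoop install on its own, it may need the paths set explicitly; a minimal sketch, with paths taken from the .bash_profile above:

  ```shell
  # Append to conf/hive-env.sh (paths assumed from the .bash_profile above)
  export HADOOP_HOME=/home/hadoop/hadoop-2.2.0
  export HIVE_CONF_DIR=/home/hadoop/apache-hive-0.13.0-bin/conf
  ```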
  $ hive
  hive> add jar /home/hadoop/apache-hive-0.13.0-bin/lib/hive-contrib-0.13.0.jar;
  hive> exit;
  $ vi .bash_profile
  Add:
  export ...
  $ source .bash_profile
  $ mysql -u root -p
  Enter password:
  mysql> create database hadoop;
  mysql> create user 'hive'@'slave2.hadoop' identified by '111111';
  Query OK, 0 rows affected (0.07 sec)
  mysql> GRANT ALL PRIVILEGES ON hadoop.* TO 'hive'@'slave2.hadoop' WITH GRANT OPTION;
  Query OK, 0 rows affected (0.00 sec)
  mysql> flush privileges;
  mysql> exit
  Bye
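  Before editing hive-site.xml it is worth confirming that the new account can actually reach the metastore database over the network. A quick check, using the hostname and password created above:

  ```shell
  # Verify the hive account can log in from the network and see its database
  mysql -h slave2.hadoop -u hive -p111111 -e 'show databases;'
  # The output should list information_schema and hadoop if the grant took effect
  ```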
  Edit the configuration file:
  $ pwd
  /home/hadoop/apache-hive-0.13.0-bin/conf
  $ vi hive-site.xml
  Delete:
  <property>
    <name>hive.metastore.local</name>
    <value>true</value>
    <description>controls whether to connect to remote metastore server or open a new metastore server in Hive Client JVM</description>
  </property>
  --- Before this block was deleted, running hive emitted the following warning:
  ~~~~~~~~~
  WARN conf.HiveConf: DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
  This warning is caused by a Hive bug; see https://issues.apache.org/jira/browse/HIVE-6159 for details. The bug is in fact fixed in 0.13, but the Hive tarball from the official site is compiled against the hadoop 0.20 platform, so Hive would need to be recompiled for this hadoop 2.2 environment.
  ~~~~~~~~~~
  Modify:
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://slave2.hadoop:3306/hive?createDatabaseIfNotExist=true</value>
    <description>JDBC connect string for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
    <description>Driver class name for a JDBC metastore</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>hive</value>
    <description>username to use against metastore database</description>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>hivepasswd</value>
    <description>password to use against metastore database</description>
  </property>
  --
  Add the JDBC jar:
  Copy the driver jar into $HIVE_HOME/lib/.
  I am currently running mysql-server-5.1.73.
  Download mysql-connector-java-5.1.30.tar.gz from http://dev.mysql.com/downloads/connector/j/
  $ tar xvf mysql-connector-java-5.1.30.tar.gz
  $ cp ~/mysql-connector-java-5.1.30/mysql-connector-java-5.1.30-bin.jar ~/apache-hive-0.13.0-bin/lib/
  $ hive
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
  14/05/22 11:08:24 INFO Configuration.deprecation: mapred.committer.job.setup.cleanup.needed is deprecated. Instead, use mapreduce.job.committer.setup.cleanup.needed
  14/05/22 11:08:24 WARN conf.HiveConf: DEPRECATED: hive.metastore.ds.retry.* no longer has any effect. Use hive.hmshandler.retry.* instead
  Logging initialized using configuration in file:/home/hadoop/apache-hive-0.13.0-bin/conf/hive-log4j.properties
  hive> CREATE TABLE maptile (ipaddress STRING,time STRING,method STRING,request STRING,protocol STRING,status STRING,size STRING,referer STRING,agent STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ("input.regex" = "([^ ]*) ([^ ]*) ([^ ]*) (-|\\[[^\\]]*\\]) (\"[^ ]*) ([^ ]*) ([^ ]*\") (-|*) (-|*)(?: ([^ \"]*| \".*\") ([^ \"]*|\".*\"))?","output.format.string" = "%1$s %2$s %3$s %4$s %5$s %6$s %7$s %8$s %9$s %10$s %11$s")STORED AS TEXTFILE;
  OK
  Time taken: 0.18 seconds
  hive> load data inpath '/flume/172.20.105.133_server/20140520_date/FlumeData.1400629779172' overwrite into table maptile;
  Loading data to table default.maptile
  rmr: DEPRECATED: Please use 'rm -r' instead.
  Deleted hdfs://master.hadoop:9000/user/hive/warehouse/maptile
  Table default.maptile stats:
  OK
  Time taken: 1.435 seconds
  hive> create table result (ip string,num int) partitioned by (dt string);
  OK
  Time taken: 0.107 seconds
  hive> insert overwrite table result partition (dt='2014-5-20') select ipaddress,count(1) as numrequest from maptile group by ipaddress sort by numrequest desc;
  Total jobs = 2
  Launching Job 1 out of 2

  Number of reduce tasks not specified. Estimated from input data size: 1
  In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
  In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
  In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
  Starting Job = job_1400782075030_0001, Tracking URL = http://master.hadoop:8088/proxy/application_1400782075030_0001/
  Kill Command = /home/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1400782075030_0001
  Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
  2014-05-22 11:59:26,803 Stage-1 map = 0%,  reduce = 0%
  2014-05-22 11:59:42,986 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 3.15 sec
  2014-05-22 12:00:02,351 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.59 sec
  MapReduce Total cumulative CPU time: 4 seconds 590 msec
  Ended Job = job_1400782075030_0001
  Launching Job 2 out of 2

  Number of reduce tasks not specified. Estimated from input data size: 1
  In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
  In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
  In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
  Starting Job = job_1400782075030_0002, Tracking URL = http://master.hadoop:8088/proxy/application_1400782075030_0002/
  Kill Command = /home/hadoop/hadoop-2.2.0/bin/hadoop job -kill job_1400782075030_0002
  Hadoop job information for Stage-2: number of mappers: 1; number of reducers: 1
  2014-05-22 12:00:32,149 Stage-2 map = 0%,  reduce = 0%
  2014-05-22 12:00:41,397 Stage-2 map = 100%,  reduce = 0%, Cumulative CPU 1.05 sec
  2014-05-22 12:01:23,745 Stage-2 map = 100%,  reduce = 67%, Cumulative CPU 2.77 sec
  2014-05-22 12:01:27,819 Stage-2 map = 100%,  reduce = 100%, Cumulative CPU 3.92 sec
  MapReduce Total cumulative CPU time: 3 seconds 920 msec
  Ended Job = job_1400782075030_0002
  Loading data to table default.result partition (dt=2014-5-20)
  : Skipping stats aggregation by error org.apache.hadoop.hive.ql.metadata.HiveException: : Stats aggregator of type counter cannot be connected to
  Partition default.result{dt=2014-5-20} stats:
  MapReduce Jobs Launched:
  Job 0: Map: 1  Reduce: 1   Cumulative CPU: 4.59 sec   HDFS Read: 40138523 HDFS Write: 117 SUCCESS
  Job 1: Map: 1  Reduce: 1   Cumulative CPU: 3.92 sec   HDFS Read: 486 HDFS Write: 10 SUCCESS
  Total MapReduce CPU Time Spent: 8 seconds 510 msec
  OK
  Time taken: 206.925 seconds
  ~~~~~~~~~~~~~~~
  --- I don't yet know what causes the "Skipping stats aggregation by error org.apache.hadoop.hive.ql.metadata.HiveException: : Stats aggregator of type counter cannot be connected to" message here; I'll look into it later.
  ~~~~~~~~~~~~~~~~~~~~~~
  hive> show tables;
  OK
  maptile
  result
  Time taken: 0.349 seconds, Fetched: 2 row(s)
  hive> select * from result;
  OK
  NULL    138265    2014-5-20
  Time taken: 0.766 seconds, Fetched: 1 row(s)
