Running Your First Hive Program

Lab Objective

Use Hive to count how many times each word occurs in a text file.

Lab Steps

Write the Hive script wordcount.hql and save it in the /opt directory:

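-- Start from a clean table so the script can be rerun
drop table if exists words;
-- One row per line of input text
create table words(textline string);
load data local inpath '/opt/test.txt' into table words;
-- Show column names in the query result
set hive.cli.print.header=true;
-- Split each line on spaces, emit one row per word, then count rows per word
select word,count(*) as wordcount from (select explode(split(textline," ")) as word from words) tmp group by word;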

Move the data file test.txt into the /opt directory, which is where the script's load statement expects it.
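The contents of test.txt are not shown on this page. As a hedged reconstruction, the three-line file below is consistent with the recorded run (48 bytes, seven distinct words), but the real lab file may differ. DATA_DIR is an assumed variable that defaults to a scratch directory; set it to /opt to match the path in wordcount.hql.

```shell
# Assumed sample input: the original test.txt is not given in the lab.
# DATA_DIR is a hypothetical variable; export DATA_DIR=/opt to match wordcount.hql.
DATA_DIR="${DATA_DIR:-$(mktemp -d)}"

cat > "$DATA_DIR/test.txt" <<'EOF'
I am a student
I learn hadoop
I learn MapReduce
EOF

# Print the byte count; the load step in the recorded log reports totalSize=48.
wc -c < "$DATA_DIR/test.txt"
```

Words are separated by single spaces here because split(textline," ") would turn consecutive spaces into empty-string tokens.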

Run the hive command (from /opt, since the script path is relative):

hive -f wordcount.hql
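Because the script loads /opt/test.txt by absolute path while the command refers to wordcount.hql by relative path, a missing file is an easy mistake. One possible pre-flight check is sketched below; preflight and LAB_DIR are hypothetical names introduced for illustration and are not part of Hive.

```shell
# Hypothetical helper: succeeds only if the script and its input are both readable.
preflight() {
    dir="$1"
    for f in wordcount.hql test.txt; do
        if [ ! -r "$dir/$f" ]; then
            echo "missing: $dir/$f" >&2
            return 1
        fi
    done
    return 0
}

# LAB_DIR is an assumed variable; /opt matches the paths used in this lab.
LAB_DIR="${LAB_DIR:-/opt}"

# Run the job only when both files are present and the hive command is on PATH.
if preflight "$LAB_DIR" && command -v hive >/dev/null 2>&1; then
    (cd "$LAB_DIR" && hive -f wordcount.hql)
fi
```

The subshell around cd keeps the caller's working directory unchanged after the job finishes.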

Output:


Logging initialized using configuration in jar:file:/usr/local/apache-hive-1.2.1-bin/lib/hive-common-1.2.1.jar!/hive-log4j.properties
OK
Time taken: 0.945 seconds
OK
Time taken: 1.252 seconds
Loading data to table default.words
Table default.words stats: [numFiles=1, totalSize=48]
OK
Time taken: 1.058 seconds
Query ID = root_20200627101421_cb24470f-8105-4dd2-8fc5-bb8e32c014e1
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1593205940394_0003, Tracking URL = http://master:8088/proxy/application_1593205940394_0003/
Kill Command = /usr/local/hadoop-2.6.5/bin/hadoop job  -kill job_1593205940394_0003
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2020-06-27 10:14:33,980 Stage-1 map = 0%,  reduce = 0%
2020-06-27 10:14:43,523 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.52 sec
2020-06-27 10:14:51,954 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 4.07 sec
MapReduce Total cumulative CPU time: 4 seconds 70 msec
Ended Job = job_1593205940394_0003
MapReduce Jobs Launched:
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 4.07 sec   HDFS Read: 7404 HDFS Write: 52 SUCCESS
Total MapReduce CPU Time Spent: 4 seconds 70 msec
OK
word    wordcount
I       3
MapReduce       1
a       1
am      1
hadoop  1
learn   2
student 1
Time taken: 31.612 seconds, Fetched: 7 row(s)
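As a rough cross-check that does not need a cluster, the same counts can be reproduced with standard Unix tools. The sketch below uses an assumed three-line input that is consistent with the counts in the log; the original test.txt is not shown on this page.

```shell
# Stand-in input (an assumption; substitute the real test.txt if you have it).
input=$(mktemp)
printf 'I am a student\nI learn hadoop\nI learn MapReduce\n' > "$input"

# tr puts each space-separated token on its own line, like explode(split(textline," "));
# sort brings equal words together and uniq -c counts them, like group by with count(*).
# LC_ALL=C sorts by byte value, so capitalized words come first.
# awk swaps the columns so each line reads word<TAB>count.
counts=$(tr ' ' '\n' < "$input" | LC_ALL=C sort | uniq -c | awk '{print $2 "\t" $1}')
printf '%s\n' "$counts"

rm -f "$input"
```

With this input the pipeline prints seven rows, with the same words and counts as the Hive result above.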