查看“Hive离线选品推荐”的源代码
←
Hive离线选品推荐
跳转至:
导航
,
搜索
因为以下原因,您没有权限编辑本页:
您所请求的操作仅限于该用户组的用户使用:
用户
您可以查看与复制此页面的源代码。
== Hive离线选品推荐 == 1.我们先展示一条源数据,以逗号分割<br> 字段意思:商品页数,一级分类,二级分类,三级分类,商品图片,商品名称,商品价格,商品价格范围,商品单位,公司名称,公司网址,地址,"几年老店",标签 字段名:productId,productPage,categoryOne,categoryTwo,categoryThr,productImg,productName,productPrice,productPrices,productUnit,companyName,companyUrl,address,year,tags 1,1,Apparel_Textiles & Accessories,Apparel,Apparel Design Services,//s.alicdn.com/@sc01/kf/H50cfdfc8a6c4445ab6f3a926a2542c00t.jpg_300x300.jpg,embroidery digitizing 24h on line,1,$1.00-$2.00,Unit,Shen Zhen Happitoo Textile Co._ Ltd.,//happitoo.en.alibaba.com/company_profile.html#top-nav-bar,China,1,Apparel Design Apparel_Textessories Services; 首先在hive创建表名称为"data" create table data( productId string, productPage int, categoryOne string, categoryTwo string, categoryThr string, productImg string, productName string, productPrice float, productPrices string, productUnit string, companyName string, companyUrl string, address string, year int, tags string) row format delimited fields terminated by ','; [[文件:2020-08-26 090216.png]] 然后我们将本地数据导入到hive中: load data local inpath '/root/sj.txt' into table data; 查看导入的数据: select * from data; 创建一个Grade表,用于等级分类调用: create table Grade() insert into Grade values("low","middlw","high"); 接下来写shell脚本对数据价格进行分类: #!/usr/bin/env bash #统计data表中数据总数 #test1=`hive -S -e "select count(*) from data;"` #echo ${test1} #对数据进行排名从低到高并且生成到新表aaa_data test3=`hive -S -e "create table aaa_data as select *,rank() over(order by productPrice asc) as pm from data;"` echo ${test3} #计算总数中前百分之30和百分之60的数 test4=`hive -S -e "select count(*)*0.3 from aaa_data;"` test5=`hive -S -e "select count(*)*0.6 from aaa_data;"` #将价格分类名称输出到变量 testlow=`hive -S -e "select low from Grade;"` testmiddlw=`hive -S -e "select middlw from Grade;"` testhigh=`hive -S -e "select high from Grade;"` #打印输出 echo ${testlow} echo ${testmiddlw} echo ${testhigh} echo ${test1} echo ${test4} echo ${test5} #test6=`hive -S -e "select a,b,h,case # when pm<=${test4} then ${testlow} # when pm>=${test4} and pm<${test5} then ${testmiddlw} # when pm>=${test5} then ${testhigh} # else null end as Praaa from aaa_info,Grade;"` #echo ${test6} #创建新表data_output,判断前百分之30为low,百分之30到百分之60为middiw,百分之60以上为high<br> output=`hive -S -e "create table data_output as select productId,productPage,categoryOne,categoryTwo,categoryThr,productImg,productName,productPrice,productPrices,productUnit,companyName,companyUrl,address,year,tags,case when pm<=${test4} then ${testlow} when pm>=${test4} and pm<${test5} then ${testmiddlw} when pm>=${test5} then ${testhigh} else null end as Praaa from aaa_data,Grade;"` echo ${output} hive查询结果导入到数据库 add jar /root/software/apache-hive-2.1.1-bin/lib/hive-contrib-2.1.1.jar; add jar /root/mysql-connector-java-5.1.32-bin.jar; CREATE TEMPORARY FUNCTION dboutput AS 'org.apache.hadoop.hive.contrib.genericudf.example.GenericUDFDBOutput'; Mysql数据库创建 create table commodity(productId int(10),productPage int(10),categoryOne varchar(100),categoryTwo varchar(100),categoryThr varchar(100),productImg varchar(100),productName varchar(100),productPrice float,productPrices varchar(100),productUnit varchar(100),companyName varchar(100),companyUrl varchar(100),address varchar(100),year varchar(100),tags varchar(100),pricelevel varchar(100)); 导入到数据库 CREATE TEMPORARY FUNCTION dboutput AS 'org.apache.hadoop.hive.contrib.genericudf.example.GenericUDFDBOutput'; select dboutput('jdbc:mysql://master/mncs','root','root','insert into output(productId,productPage,categoryOne,categoryTwo,categoryThr,productImg,productName,productPrice,productPrices,productUnit,companyName,companyUrl,address,year,tags,pricelevel) values (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)',productId,productPage,categoryOne,categoryTwo,categoryThr,productImg,productName,productPrice,productPrices,productUnit,companyName,companyUrl,address,year,tags,pricelevel) from output; 计算后的格式: productId,productPage,categoryOne,categoryTwo,categoryThr,productImg,productName,productPrice,productPrices,productUnit,companyName,companyUrl,address,year,tags,pricelevel 字段意思:商品页数,一级分类,二级分类,三级分类,商品图片,商品名称,商品价格,商品价格范围,商品单位,公司名称,公司网址,地址,"几年老店",标签,价格分类 数据展示: 1,1,Apparel_Textiles & Accessories,Apparel,Apparel Design Services,//s.alicdn.com/@sc01/kf/H50cfdfc8a6c4445ab6f3a926a2542c00t.jpg_300x300.jpg,embroidery digitizing service 24h on line,1,$1.00-$2.00,Unit,Shen Zhen Happitoo Textile Co._ Ltd.,//happitoo.en.alibaba.com/company_profile.html#top-nav-bar,China,1,Apparel Design Apparel_Textiles Accessories Services,Low 2,1,Apparel_Textiles & Accessories,Apparel,Apparel Design Services,//s.alicdn.com/@sc01/kf/He32d24711ba34240aa7f12fa9150ca32Y.jpg_300x300.jpg,women t shirts graphic custom t shirt printing with 100%cotton and can printed logo,3,$3.90-$4.90,Piece,Shandong Shanke Clothing Co._ Ltd.,//sdshanke.en.alibaba.com/company_profile.html#top-nav- bar,China,1,Apparel Design Apparel_Textiles Accessories Services,Low 3,1,Apparel_Textiles & Accessories,Apparel,Apparel Design Services,//s.alicdn.com/@sc01/kf/UTB8pn_mX22JXKJkSanrq6y3lVXab.jpg_300x300.jpg,waist coat,16,$16.00-$25.00,Piece,RUBY ENTERPRISES,//rubythobes.trustpass.alibaba.com/company_profile.html#top-nav-bar,India,4,Apparel Design Apparel_Textiles Accessories Services,Low [[文件:2020-09-16 163559.png]] [[文件:2020-08-26 090139.png]]
返回至
Hive离线选品推荐
。
导航菜单
个人工具
登录
命名空间
页面
讨论
变种
视图
阅读
查看源代码
查看历史
更多
搜索
导航
首页
最近更改
随机页面
帮助
工具
链入页面
相关更改
特殊页面
页面信息