丝路通: Importing product category data
== Generating the CSV files ==
=== DHgate (敦煌网) ===
Input: the existing crawler data.
 <nowiki>
import time

category_file = 'dh_category_data.csv'

def read_category_file():
    cat_list = ""  # CSV rows buffered since the last write
    count = 0
    seen = set()  # category names already emitted
    with open('dh_sub_category.csv', "rt") as fp:  # open the crawler CSV file
        for line in fp:  # a file object can be iterated directly
            count += 1
            data = line.split(',')
            d = {'line_num': data[0], 'class1': data[1], 'class2': data[2], 'url': data[3]}
            now_time = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
            # Columns: id, name, code, desc, category_type, is_tab, add_time, parent_category_id
            for name, level in ((d['class1'], 1), (d['class2'], 2)):
                if name not in seen:  # only add categories not seen before
                    cat_list += ",".join(["NULL", name, name, "NULL",
                                          str(level), "0", now_time, "NULL"]) + "\n"
                    seen.add(name)  # record the category name in the set
            if count % 100 == 0:  # flush the buffer every 100 input lines
                with open(category_file, "a", encoding="utf-8") as fw:
                    fw.write(cat_list)
                cat_list = ""
    return cat_list  # rows from the final partial batch, not yet written

if __name__ == '__main__':
    cat_list = read_category_file()
    # Write the rows left over after the last full batch of 100 lines
    with open(category_file, "a", encoding="utf-8") as fw:
        fw.write(cat_list)
</nowiki>
== Uploading the CSV files to the database ==
=== DHgate (敦煌网) ===
 MariaDB [mxshop]> load data infile '/opt/dh_category_data.csv' into table goods_goodscategory fields terminated by ',' optionally enclosed by '"' escaped by '"' lines terminated by '\r\n';
 <nowiki>Query OK, 415 rows affected, 431 warnings (0.06 sec)
Records: 415  Deleted: 0  Skipped: 0  Warnings: 431</nowiki>
 MariaDB [mxshop]> select * from goods_goodscategory limit 0,10;
 <nowiki>+----+--------------+------+------+---------------+--------+---------------------+--------------------+
| id | name         | code | desc | category_type | is_tab | add_time            | parent_category_id |
+----+--------------+------+------+---------------+--------+---------------------+--------------------+
|  1 | 生鲜食品     | sxsp |      |             1 |      0 | 2020-06-24 16:34:11 |               NULL |
|  2 | 精品肉类     | jprl |      |             2 |      0 | 2020-06-24 16:34:11 |                  1 |
|  3 | 羊肉         | yr   |      |             3 |      0 | 2020-06-24 16:34</nowiki>
=== Alibaba (阿里巴巴) ===
 MariaDB [mxshop]> load data infile '/opt/ali_category_data.csv' into table goods_goodscategory fields terminated by ',' optionally enclosed by '"' escaped by '"' lines terminated by '\r\n';
 <nowiki>Query OK, 2022 rows affected, 997 warnings (0.04 sec)
Records: 2022  Deleted: 0  Skipped: 0  Warnings: 997</nowiki>
 MariaDB [mxshop]> select count(*) from goods_goodscategory;
 <nowiki>+----------+
| count(*) |
+----------+
|     2557 |
+----------+
1 row in set (0.00 sec)</nowiki>
=== Made-in-China.com (中国制造网) ===
 MariaDB [mxshop]> load data infile '/opt/china_category_data.csv' into table goods_goodscategory fields terminated by ',' optionally enclosed by '"' escaped by '"' lines terminated by '\r\n';
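Before running any of the load data statements above, it may help to check that every row of a generated CSV file has the eight fields that goods_goodscategory expects. The following is only a minimal sketch, not part of the original procedure: the function name check_category_csv and the constant EXPECTED_FIELDS are hypothetical, and the field count of 8 is an assumption taken from the column header of the select output on this page.

```python
import csv

# Assumed from the select output: id, name, code, desc,
# category_type, is_tab, add_time, parent_category_id
EXPECTED_FIELDS = 8


def check_category_csv(path, expected_fields=EXPECTED_FIELDS):
    """Return (row_count, bad_rows) for the CSV file at path.

    bad_rows is a list of (row_number, field_count) tuples for rows
    whose field count differs from expected_fields.
    """
    row_count = 0
    bad_rows = []
    # newline="" lets the csv module handle both \n and \r\n line endings
    with open(path, newline="", encoding="utf-8") as fp:
        for row_number, row in enumerate(csv.reader(fp), start=1):
            row_count += 1
            if len(row) != expected_fields:
                bad_rows.append((row_number, len(row)))
    return row_count, bad_rows
```

Calling check_category_csv('/opt/dh_category_data.csv') would return the number of rows together with any rows that have an unexpected field count; an empty second value suggests the file at least has the expected shape. It does not explain the warnings MariaDB reports, which can have other causes.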