Django REST Elasticsearch: Integrating Elasticsearch with Django
Latest revision as of 08:46, 16 August 2019 (Friday)
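Introduction to Django REST Elasticsearch
Django REST Elasticsearch provides an easy way to integrate Django REST Framework with Elasticsearch. The library is built on the Elasticsearch DSL library (elasticsearch-dsl-py), a high-level library that sits on top of the official low-level client.
Let's walk through a quick example of building a simple application with Django REST Elasticsearch. In this example, we will build a simple blog system.
Requirements
- Django REST Framework 3.5 or later
- elasticsearch-dsl >= 5.0.0, < 7.0.0 (Elasticsearch 5.x)
Steps
Installation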
pip3 install django-rest-elasticsearch -i http://mirrors.aliyun.com/pypi/simple --trusted-host mirrors.aliyun.com
Create the project
[root@localhost ~]# django-admin startproject tutorial
[root@localhost ~]# ls
anaconda-ks.cfg                CentOS-Sources.repo      jdk1.8.0_211
apache-maven-3.5.4-bin.tar.gz  centos.tar               jdk-8u211-linux-x64.tar.gz
big_data                       CentOS-Vault.repo        mongodb-linux-x86_64-rhel62-4.0.0.tgz
CentOS-Base.repo               docker-ce.repo           Python-3.6.5
CentOS-CR.repo                 Dockerfile               Python-3.6.5.tgz
CentOS-Debuginfo.repo          elasticsearch-5.5.1      tutorial
CentOS-fasttrack.repo          elasticsearch-5.5.1.zip
CentOS-Media.repo              HelloWorld
Create the app
[root@localhost ~]# cd tutorial
[root@localhost tutorial]# python3 manage.py startapp blog
Configure the app
Add the app to INSTALLED_APPS in tutorial/settings.py:
[root@localhost tutorial]# vi tutorial/settings.py
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'blog',
]
Configure the URLs
Configure the URLs in tutorial/urls.py:
from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path('admin/', admin.site.urls),
    path('blog/', include('blog.urls')),  # delegate /blog/ to the app's own routes
]
Write the app-level routing file:
blog/urls.py:
from django.conf.urls import url
from . import views

# Left empty for now; the search route is added in a later step.
urlpatterns = [
]
Create the model
blog/models.py:
from django.db import models
from django.utils.translation import gettext_lazy as _
from django.contrib.postgres.fields import ArrayField  # requires a PostgreSQL database


# Create your models here.
class Blog(models.Model):
    title = models.CharField(_('Title'), max_length=1000)  # blog title
    created_at = models.DateTimeField(_('Created at'), auto_now_add=True)  # creation date
    body = models.TextField(_('Body'))  # blog body
    tags = ArrayField(models.CharField(max_length=200), blank=True, null=True)  # blog tags
    is_published = models.BooleanField(_('Is published'), default=False)  # whether the post is public

    def __str__(self):
        return self.title
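The `ArrayField` used in the model comes from `django.contrib.postgres`, so the project has to run on a PostgreSQL database (with a driver such as psycopg2 installed). A minimal sketch of a matching `DATABASES` entry for tutorial/settings.py follows; the database name, user, password, host, and port are placeholders assumed for illustration and do not come from the original tutorial.

```python
# tutorial/settings.py (fragment)
# All connection values below are hypothetical placeholders; replace them
# with the details of your own PostgreSQL instance.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "tutorial",
        "USER": "tutorial_user",
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "5432",
    }
}
```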
Create the mapping in Elasticsearch
Create a DocType to represent our blog model.
blog/search_indexes.py:
from elasticsearch_dsl import Boolean, Date, DocType, Integer, Keyword, Text
from elasticsearch_dsl.connections import connections

# Define a default Elasticsearch client
connections.create_connection(hosts=['localhost'])


class BlogIndex(DocType):
    pk = Integer()
    title = Text(fields={'raw': Keyword()})  # 'raw' sub-field allows exact sorting
    created_at = Date()
    body = Text()
    tags = Keyword(multi=True)
    is_published = Boolean()

    class Meta:
        index = 'blog'


# The mapping has to exist in Elasticsearch before documents are indexed.
# Here it is created directly by calling the init class method.
if __name__ == "__main__":
    BlogIndex.init()
Run this program from the command line: python3 search_indexes.py
Then run a curl command to check whether the blog index has been created in ES:
[root@localhost blog]# curl -X GET 'http://localhost:9200/_cat/indices?v'
health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
...
yellow open   blog  4VV7_m7bT-aarL7URl2XzQ   5   1          0            0       810b           810b
...
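The code in the "Create the mapping in Elasticsearch" step plays much the same role as the database migration command (python3 manage.py migrate) in an ordinary Django REST framework project, except that the "database" here is Elasticsearch.
Note: the document life cycle documentation (http://elasticsearch-dsl.readthedocs.io/en/latest/persistence.html#document-life-cycle) gives a full description of how to work with documents manually.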
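For orientation, the field declarations in BlogIndex correspond roughly to the Elasticsearch `properties` block below. This is a hand-written sketch derived from the field types in search_indexes.py, not output captured from a cluster; the name `BLOG_PROPERTIES` is made up for this example, and the exact mapping (including the document type name) depends on the elasticsearch-dsl version in use.

```python
import json

# Hand-written approximation of the mapping properties declared by BlogIndex.
# BLOG_PROPERTIES is a hypothetical name used only in this sketch.
BLOG_PROPERTIES = {
    "pk": {"type": "integer"},
    "title": {
        "type": "text",
        # The 'raw' keyword sub-field is what the view later sorts on (title.raw).
        "fields": {"raw": {"type": "keyword"}},
    },
    "created_at": {"type": "date"},
    "body": {"type": "text"},
    "tags": {"type": "keyword"},
    "is_published": {"type": "boolean"},
}

print(json.dumps(BLOG_PROPERTIES, indent=2))
```

The real mapping can be compared against this sketch with `curl -X GET 'http://localhost:9200/blog/_mapping?pretty'`.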
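Network configuration
Configure the client instance that connects to ES:
blog/config.py:
from elasticsearch import Elasticsearch, RequestsHttpConnection

# Shared Elasticsearch client used by the signal handlers and the search view.
es_client = Elasticsearch(
    hosts=['10.0.0.30:9200/'],
    connection_class=RequestsHttpConnection
)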
Write the serializer module
A Serializer converts complex data types such as querysets and model instances into native Python data structures, which can then be rendered into JSON, XML, or other content types. Serializers also work in reverse: after validating incoming data, they parse it back into complex types.
blog/serializers.py:
from rest_framework_elasticsearch.es_serializer import ElasticModelSerializer
from .models import Blog
from .search_indexes import BlogIndex


class ElasticBlogSerializer(ElasticModelSerializer):
    class Meta:
        model = Blog          # Django model the data comes from
        es_model = BlogIndex  # Elasticsearch document it is written to
        fields = ('pk', 'title', 'created_at', 'tags', 'body', 'is_published')
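The serialize/deserialize round trip described in this step can be illustrated without any framework. The sketch below uses only the standard library; `BlogRecord`, `serialize`, and `deserialize` are hypothetical names invented for this example and are not part of Django REST Framework or django-rest-elasticsearch.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime
from typing import List


@dataclass
class BlogRecord:
    """Plain stand-in for the Blog model (not the Django class itself)."""
    pk: int
    title: str
    body: str
    created_at: datetime
    tags: List[str] = field(default_factory=list)
    is_published: bool = False


def serialize(record: BlogRecord) -> str:
    """Complex object -> native dict -> JSON text."""
    data = asdict(record)
    data["created_at"] = record.created_at.isoformat()  # datetime is not JSON-native
    return json.dumps(data)


def deserialize(text: str) -> BlogRecord:
    """JSON text -> validated native data -> complex object."""
    data = json.loads(text)
    if not isinstance(data.get("title"), str) or not data["title"]:
        raise ValueError("title is required")
    data["created_at"] = datetime.fromisoformat(data["created_at"])
    return BlogRecord(**data)
```

A record passed through `serialize` and then `deserialize` compares equal to the original, while input with a missing or empty title is rejected during validation.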
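Add the signal dispatcher
(Purpose: keep the index up to date when data is added, updated, or deleted.)
We want the data in Elasticsearch to stay consistent, which is why a document must be created, updated, or deleted whenever anything in the model changes. The best way to do this is to add a Django signal dispatcher. The serializer created in the previous step is what the signal handlers use to create, update, and delete Elasticsearch documents.
Now we need to create the signals.py file:
from .config import es_client
from django.db.models.signals import pre_save, post_delete
from django.dispatch import receiver

from .serializers import Blog, ElasticBlogSerializer


@receiver(pre_save, sender=Blog, dispatch_uid="update_record")
def update_es_record(sender, instance, **kwargs):
    # Create or update the Elasticsearch document when a Blog is saved.
    obj = ElasticBlogSerializer(instance)
    obj.save(using=es_client)


@receiver(post_delete, sender=Blog, dispatch_uid="delete_record")
def delete_es_record(sender, instance, *args, **kwargs):
    # Remove the Elasticsearch document; a missing document (404) is ignored.
    obj = ElasticBlogSerializer(instance)
    obj.delete(using=es_client, ignore=404)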
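The pattern behind these handlers (every save or delete on the model is mirrored into the search index) can be sketched without Django or Elasticsearch. In the example below the signal registry, the `send` function, and the in-memory `search_index` dict are all hypothetical stand-ins written for illustration; they are not Django's signal implementation.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical in-memory stand-ins: a signal registry and a "search index".
_receivers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)
search_index: Dict[int, dict] = {}


def receiver(signal: str):
    """Register the decorated function as a handler for a named signal."""
    def decorator(func: Callable[[dict], None]) -> Callable[[dict], None]:
        _receivers[signal].append(func)
        return func
    return decorator


def send(signal: str, instance: dict) -> None:
    """Call every handler registered for the signal."""
    for func in _receivers[signal]:
        func(instance)


@receiver("saved")
def update_record(instance: dict) -> None:
    # Create or overwrite the indexed copy of the record.
    search_index[instance["pk"]] = dict(instance)


@receiver("deleted")
def delete_record(instance: dict) -> None:
    # A record that is already absent is ignored, like ignore=404 above.
    search_index.pop(instance["pk"], None)
```

Because the handlers are attached to the events rather than called by hand, any code path that saves or deletes a record keeps the index in step automatically.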
Load the signals by adding the following to blog/__init__.py:
default_app_config = 'blog.apps.BlogConfig'
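Override the ready method in blog/apps.py:
from django.apps import AppConfig


class BlogConfig(AppConfig):
    name = 'blog'

    def ready(self):
        # Importing the module registers the signal handlers.
        import blog.signals  # noqa
Explanation: once INSTALLED_APPS contains the path to an application module, Django checks that module for a default_app_config variable. If the variable is defined, it should be the path to the application's AppConfig subclass, and get_app_config returns an instance of that subclass. If there is no default_app_config, Django uses the base AppConfig class, and get_app_config returns an instance of the base class. (https://blog.csdn.net/weixin_34268169/article/details/94320415)
Reference: https://www.jianshu.com/p/4f1bf0905c69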
Write the view
Finally, let's create a simple search view that returns posts filtered by tag and searches by words in the title:
blog/views.py:
from .config import es_client

from rest_framework_elasticsearch import es_views, es_pagination, es_filters
from .search_indexes import BlogIndex


class BlogView(es_views.ListElasticAPIView):
    es_client = es_client
    es_model = BlogIndex
    es_pagination_class = es_pagination.ElasticLimitOffsetPagination

    es_filter_backends = (
        es_filters.ElasticFieldsFilter,
        es_filters.ElasticSearchFilter,
        es_filters.ElasticOrderingFilter,
    )
    es_ordering_fields = (
        "created_at",
        ("title.raw", "title")
    )
    # The 'tag' query parameter filters on the 'tags' field.
    es_filter_fields = (
        es_filters.ESFieldFilter('tag', 'tags'),
        es_filters.ESFieldFilter('is_published', 'is_published')
    )
    es_search_fields = (
        'tags',
        'title',
    )
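To make the view's behaviour more concrete, the sketch below builds the kind of Elasticsearch request body that a full-text search over `tags` and `title` combined with a tag filter could translate to. This is an assumption made for illustration: `build_search_body` is a hypothetical helper, and the body it returns is not taken from the library's source and may differ from what django-rest-elasticsearch actually sends.

```python
from typing import Any, Dict, List, Optional


def build_search_body(search: Optional[str] = None,
                      tags: Optional[List[str]] = None,
                      limit: int = 10,
                      offset: int = 0) -> Dict[str, Any]:
    """Hypothetical helper: combine a text search and a tag filter in one query."""
    bool_query: Dict[str, Any] = {}
    if search:
        # Full-text match over the fields listed in es_search_fields.
        bool_query["must"] = [
            {"multi_match": {"query": search, "fields": ["tags", "title"]}}
        ]
    if tags:
        # Exact-value filter on the keyword field; does not affect scoring.
        bool_query["filter"] = [{"terms": {"tags": tags}}]
    query = {"bool": bool_query} if bool_query else {"match_all": {}}
    # Limit/offset pagination maps onto Elasticsearch's size/from parameters.
    return {"query": query, "from": offset, "size": limit}
```

Keeping the tag condition in the `filter` clause rather than `must` means it narrows the result set without changing relevance scores, which is the usual choice for exact-value fields such as keywords.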
Write the routes
blog/urls.py:
from django.conf.urls import url
from . import views

urlpatterns = [
    url(r'^api/list$', views.BlogView.as_view(), name='blog-list'),
]
Register the model
The built-in admin site is one of Django's most distinctive features and lets us manage data quickly and easily. It is controlled through the admin.py file of each app.
blog/admin.py:
from django.contrib import admin

from .models import Blog


# Register your models here.
@admin.register(Blog)
class BlogAdmin(admin.ModelAdmin):
    list_display = ('title', 'is_published')
Testing
curl http://10.0.0.30:8000/blog/api/list?search=elasticsearch
curl http://10.0.0.30:8000/blog/api/list?tag=opensource
curl http://10.0.0.30:8000/blog/api/list?tag=opensource,aws
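The same request URLs can be assembled programmatically, which avoids quoting mistakes when a search term contains spaces or other special characters. A minimal sketch using only the standard library; the helper name `list_url` is made up for this example, and the base URL is the one used in the curl commands above.

```python
from urllib.parse import urlencode

BASE_URL = "http://10.0.0.30:8000/blog/api/list"


def list_url(**params: str) -> str:
    """Build a URL for the list endpoint from query parameters.

    Commas are left unescaped so that multi-tag filters keep the
    tag=a,b form used in the curl examples.
    """
    if not params:
        return BASE_URL
    return f"{BASE_URL}?{urlencode(params, safe=',')}"


print(list_url(search="elasticsearch"))
print(list_url(tag="opensource,aws"))
```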
Or open the URLs directly in a browser:
References:
[1] https://www.jianshu.com/p/cd3d60da3128
[2] https://www.helplib.com/GitHub/article_154027
[3] https://github.com/myarik/django-rest-elasticsearch
[4] https://www.jianshu.com/p/7179229eeac1
[5] https://blog.csdn.net/weixin_34018169/article/details/88469999