Django REST Elasticsearch: Integrating Django with ES

Latest revision as of 08:46, 16 August 2019 (Friday)

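Introduction to Django REST Elasticsearch

Django REST Elasticsearch provides an easy way to integrate Django REST Framework with Elasticsearch. The library is built on the Elasticsearch DSL library (elasticsearch-dsl-py), a high-level library that sits on top of the official low-level client.

Let's walk through a quick example of building a simple application with Django REST Elasticsearch. In this example we will build a simple blog system.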


Requirements

  • Django REST Framework 3.5 or later
  • elasticsearch-dsl >= 5.0.0, < 7.0.0 (Elasticsearch 5.x)
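The version range above can be checked in code before going further. The following is a minimal sketch, not part of the library: the helper names parse_version and dsl_version_supported are made up for illustration, and it only handles plain numeric versions such as 6.4.0.

```python
def parse_version(text):
    """Turn a dotted version string such as '6.4.0' into a tuple of ints."""
    return tuple(int(part) for part in text.split(".")[:3])


def dsl_version_supported(version_text):
    """Return True if the version satisfies >= 5.0.0 and < 7.0.0."""
    return (5, 0, 0) <= parse_version(version_text) < (7, 0, 0)


if __name__ == "__main__":
    # Hypothetical candidate versions, used only to exercise the check.
    for candidate in ("5.4.0", "6.4.0", "7.0.0"):
        print(candidate, dsl_version_supported(candidate))
```

In practice the string passed in would be whatever version pip reports for the installed elasticsearch-dsl package.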

Steps

Installation

pip3 install django-rest-elasticsearch -i http://mirrors.aliyun.com/pypi/simple --trusted-host mirrors.aliyun.com

Create the project

[root@localhost ~]# django-admin startproject tutorial

[root@localhost ~]# ls

anaconda-ks.cfg                CentOS-Sources.repo      jdk1.8.0_211
apache-maven-3.5.4-bin.tar.gz  centos.tar               jdk-8u211-linux-x64.tar.gz
big_data                       CentOS-Vault.repo        mongodb-linux-x86_64-rhel62-4.0.0.tgz
CentOS-Base.repo               docker-ce.repo           Python-3.6.5
CentOS-CR.repo                 Dockerfile               Python-3.6.5.tgz
CentOS-Debuginfo.repo          elasticsearch-5.5.1      tutorial
CentOS-fasttrack.repo          elasticsearch-5.5.1.zip
CentOS-Media.repo              HelloWorld

Create the app

[root@localhost ~]# cd tutorial

[root@localhost tutorial]# python3 manage.py startapp blog

Configure the app

Add the app to tutorial/settings.py:

[root@localhost tutorial]# vi tutorial/settings.py

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'rest_framework',
    'blog',
]

Configure URLs

Configure the URLs in tutorial/urls.py:

from django.contrib import admin
from django.urls import include, path

urlpatterns = [
    path('admin/', admin.site.urls),
    path('blog/', include('blog.urls')),
]


Write the app's own URL configuration:

blog/urls.py:

from django.urls import re_path
from . import views

urlpatterns = [
    # Routes are added later, once the view exists.
]

Create the model

blog/models.py:

from django.db import models
from django.utils.translation import gettext_lazy as _
# ArrayField is PostgreSQL-specific, so the project must use a PostgreSQL database.
from django.contrib.postgres.fields import ArrayField


# Create your models here.
class Blog(models.Model):
    title = models.CharField(_('Title'), max_length=1000)  # post title
    created_at = models.DateTimeField(_('Created at'), auto_now_add=True)  # creation date
    body = models.TextField(_('Body'))  # post body
    tags = ArrayField(models.CharField(max_length=200), blank=True, null=True)  # post tags
    is_published = models.BooleanField(_('Is published'), default=False)  # whether the post is public

    def __str__(self):
        return self.title

Create the mapping in ES

Create a DocType to represent our blog model.

blog/search_indexes.py:

from elasticsearch_dsl import Boolean, Date, DocType, Integer, Keyword, Text
from elasticsearch_dsl.connections import connections
# Define a default Elasticsearch client
connections.create_connection(hosts=['localhost'])

class BlogIndex(DocType):

    pk = Integer()
    title = Text(fields={'raw': Keyword()})
    created_at = Date()
    body = Text()
    tags = Keyword(multi=True)
    is_published = Boolean()

    class Meta:
        index = 'blog'

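# Create the mapping in Elasticsearch.
# The mapping is created directly by calling the init class method:
if __name__ == "__main__":
    BlogIndex.init()

Run this program from the command line: python3 search_indexes.py

Then run a curl command to check whether the new index blog has been created in ES: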

[root@localhost blog]# curl -X GET 'http://localhost:9200/_cat/indices?v'

health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
...
yellow open   blog       4VV7_m7bT-aarL7URl2XzQ   5   1          0            0       810b           810b
...

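The code in this "Create the mapping in ES" section plays much the same role as the database migration command (python3 manage.py migrate) in an ordinary Django REST framework project, except that the "database" here is Elasticsearch.

Note: the document life cycle documentation (http://elasticsearch-dsl.readthedocs.io/en/latest/persistence.html#document-life-cycle) gives a complete description of how to work with documents manually.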
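To make the idea of a mapping more concrete, here is a rough sketch of the field properties that the BlogIndex fields correspond to, written as a plain Python dict. It is an illustration under assumptions: the function name blog_properties is invented here, and the mapping Elasticsearch actually stores is nested under the index and document type and may carry extra settings that are not shown.

```python
import json


def blog_properties():
    """Approximate 'properties' section for the fields declared on BlogIndex."""
    return {
        "pk": {"type": "integer"},
        # A text field with an extra un-analyzed 'raw' sub-field for sorting.
        "title": {"type": "text", "fields": {"raw": {"type": "keyword"}}},
        "created_at": {"type": "date"},
        "body": {"type": "text"},
        # multi=True only affects the Python side; any ES field can hold a list.
        "tags": {"type": "keyword"},
        "is_published": {"type": "boolean"},
    }


if __name__ == "__main__":
    print(json.dumps({"properties": blog_properties()}, indent=2))
```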

Network configuration

Configure the client instance used to connect to ES:

blog/config.py:

from elasticsearch import Elasticsearch, RequestsHttpConnection

es_client = Elasticsearch(
    hosts=['10.0.0.30:9200/'],
    connection_class=RequestsHttpConnection
)
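The host address is hard-coded above. One possible alternative, shown only as a sketch, is to read the host list from an environment variable so that the address lives in one place. The variable name ES_HOSTS and the helper es_hosts_from_env are assumptions made for this example and are not part of elasticsearch-py or django-rest-elasticsearch.

```python
import os


def es_hosts_from_env(environ=None, default="localhost:9200"):
    """Read a comma-separated host list from the hypothetical ES_HOSTS variable."""
    environ = os.environ if environ is None else environ
    raw = environ.get("ES_HOSTS", default)
    # Drop surrounding whitespace and any trailing slash from each entry.
    return [host.strip().rstrip("/") for host in raw.split(",") if host.strip()]


if __name__ == "__main__":
    print(es_hosts_from_env({"ES_HOSTS": "10.0.0.30:9200/"}))
```

The resulting list could then be passed as the hosts argument when the client is created.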

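Write the serializer module

A Serializer converts complex data types such as querysets and model instances into native Python data structures, which can then be rendered as JSON, XML, or other content types. Serializers also work in reverse: after validating the incoming data, they parse it back into complex types.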
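The round trip described above can be illustrated without Django REST Framework at all. The following sketch uses only the standard library; BlogRecord, serialize, and deserialize are illustrative names, not the framework's API.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class BlogRecord:
    """Stand-in for a model instance (the 'complex type')."""
    pk: int
    title: str
    body: str
    tags: List[str] = field(default_factory=list)
    is_published: bool = False


def serialize(record):
    """Complex object -> native Python dict -> JSON text."""
    return json.dumps(asdict(record), sort_keys=True)


def deserialize(text):
    """JSON text -> validated data -> complex object."""
    data = json.loads(text)
    if not isinstance(data.get("title"), str) or not data["title"]:
        raise ValueError("title is required")
    return BlogRecord(**data)


if __name__ == "__main__":
    post = BlogRecord(pk=1, title="Hello", body="First post", tags=["aws"])
    print(serialize(post))
```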

blog/serializers.py :

from rest_framework_elasticsearch.es_serializer import ElasticModelSerializer
from .models import Blog
from .search_indexes import BlogIndex

class ElasticBlogSerializer(ElasticModelSerializer):
    class Meta:
        model = Blog
        es_model = BlogIndex
        fields = ('pk', 'title', 'created_at', 'tags', 'body', 'is_published')

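Add a signal dispatcher

(Purpose: keep the index up to date when data is added, updated, or deleted.)

We want the data in Elasticsearch to stay consistent, which is why a document must be created, updated, or deleted whenever anything in the model changes. A convenient way to do this is to add a Django signal dispatcher. The serializer created in the previous step is what creates, updates, and deletes the Elasticsearch documents.

Now we need to create the blog/signals.py file: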

from .config import es_client
from django.db.models.signals import pre_save, post_delete
from django.dispatch import receiver

from .serializers import Blog, ElasticBlogSerializer


@receiver(pre_save, sender=Blog, dispatch_uid="update_record")
def update_es_record(sender, instance, **kwargs):
    obj = ElasticBlogSerializer(instance)
    obj.save(using=es_client)


@receiver(post_delete, sender=Blog, dispatch_uid="delete_record")
def delete_es_record(sender, instance, *args, **kwargs):
    obj = ElasticBlogSerializer(instance)
    obj.delete(using=es_client, ignore=404)
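The pattern in the handlers above (one receiver writes to the index on save, another removes the document on delete) can be sketched with a toy in-memory index. Everything in this example is hypothetical: InMemoryIndex and SignalBus stand in for Elasticsearch and Django's signal machinery and do not reproduce either API.

```python
class InMemoryIndex:
    """Toy replacement for a search index, keyed by document id."""

    def __init__(self):
        self.docs = {}

    def save(self, doc_id, body):
        self.docs[doc_id] = dict(body)

    def delete(self, doc_id):
        # Missing documents are ignored, similar in spirit to ignore=404.
        self.docs.pop(doc_id, None)


class SignalBus:
    """Toy dispatcher: receivers are called whenever a named signal is sent."""

    def __init__(self):
        self.receivers = {}

    def connect(self, name, func):
        self.receivers.setdefault(name, []).append(func)

    def send(self, name, instance):
        for func in self.receivers.get(name, []):
            func(instance)


index = InMemoryIndex()
bus = SignalBus()
bus.connect("saved", lambda post: index.save(post["pk"], post))
bus.connect("deleted", lambda post: index.delete(post["pk"]))

if __name__ == "__main__":
    post = {"pk": 1, "title": "Hello"}
    bus.send("saved", post)
    print(index.docs)
    bus.send("deleted", post)
    print(index.docs)
```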


Load the signals by adding the following to blog/__init__.py:

default_app_config = 'blog.apps.BlogConfig'

Override the ready method in blog/apps.py:

from django.apps import AppConfig

class BlogConfig(AppConfig):
    name = 'blog'
    def ready(self):
        import blog.signals  # noqa

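Explanation: when INSTALLED_APPS contains the path to an application module, Django checks that module for a default_app_config variable. If the variable is defined, it should be the dotted path to the application's AppConfig subclass, and get_app_config returns an instance of that subclass. If there is no default_app_config, Django uses the base AppConfig class, and get_app_config returns an instance of the base class. (https://blog.csdn.net/weixin_34268169/article/details/94320415)

Reference: https://www.jianshu.com/p/4f1bf0905c69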
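A value such as 'blog.apps.BlogConfig' is a dotted path: a module path followed by an attribute name. The sketch below shows one generic way such a string can be turned into a class with the standard library. It illustrates the idea only and is not Django's own loading code; resolve_dotted_path is a name invented for this example.

```python
from importlib import import_module


def resolve_dotted_path(path):
    """Import the module part of 'pkg.module.Name' and return the named attribute."""
    module_path, _, attribute = path.rpartition(".")
    return getattr(import_module(module_path), attribute)


if __name__ == "__main__":
    # A standard-library class is used so the example runs without a Django project.
    print(resolve_dotted_path("collections.OrderedDict"))
```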

Write the view

Finally, let's create a simple search view that finds all posts filtered by tag and searches by words in the title:

blog/views.py:

#from config.es_client import es_client
from .config import es_client

from rest_framework_elasticsearch import es_views, es_pagination, es_filters
from .search_indexes import BlogIndex


class BlogView(es_views.ListElasticAPIView):
    es_client = es_client
    es_model = BlogIndex
    es_pagination_class = es_pagination.ElasticLimitOffsetPagination

    es_filter_backends = (
        es_filters.ElasticFieldsFilter,
        es_filters.ElasticSearchFilter,
        es_filters.ElasticOrderingFilter,
    )
    es_ordering_fields = (
        "created_at",
        ("title.raw", "title")
    )
    es_filter_fields = (
        es_filters.ESFieldFilter('tag', 'tags'),
        es_filters.ESFieldFilter('is_published', 'is_published')
    )
    es_search_fields = (
        'tags',
        'title',
    )
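As a rough picture of what filtering by tag and searching by word mean on the Elasticsearch side, the sketch below builds a query body by hand as a plain dict. This is an assumption-laden illustration: build_search_body is a hypothetical helper, and the queries that the library's filter backends actually generate may be structured differently.

```python
def build_search_body(search=None, tags=None, is_published=None, limit=10, offset=0):
    """Build an Elasticsearch bool query: exact-match filters plus full-text search."""
    filters = []
    if tags:
        # Match documents whose 'tags' field contains any of the given tags.
        filters.append({"terms": {"tags": list(tags)}})
    if is_published is not None:
        filters.append({"term": {"is_published": is_published}})

    bool_query = {"filter": filters}
    if search:
        # Full-text search over the same fields listed in es_search_fields.
        bool_query["must"] = [
            {"multi_match": {"query": search, "fields": ["tags", "title"]}}
        ]
    # 'from' and 'size' play the role of limit/offset pagination.
    return {"query": {"bool": bool_query}, "from": offset, "size": limit}


if __name__ == "__main__":
    print(build_search_body(search="elasticsearch", tags=["opensource", "aws"]))
```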

Write the routes

blog/urls.py:

from django.urls import re_path
from . import views

urlpatterns = [
    re_path(r'^api/list$', views.BlogView.as_view(), name='blog-list'),
]

Register the model

Django's built-in admin site is one of its most distinctive features and gives us a quick, convenient way to manage data. Each app controls what appears in the admin through its own admin.py file.

blog/admin.py

from django.contrib import admin

from .models import Blog


# Register your models here.
@admin.register(Blog)
class BlogAdmin(admin.ModelAdmin):
    list_display = ('title', 'is_published')


Testing

curl http://10.0.0.30:8000/blog/api/list?search=elasticsearch

curl http://10.0.0.30:8000/blog/api/list?tag=opensource

curl http://10.0.0.30:8000/blog/api/list?tag=opensource,aws


Or open these URLs directly in a browser.
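The request URLs above can also be assembled in code, which takes care of escaping the query parameters. This is a small sketch using only the standard library; list_url is a hypothetical helper, and the base URL simply repeats the address used in the curl commands.

```python
from urllib.parse import urlencode

# Same endpoint as in the curl commands above.
BASE_URL = "http://10.0.0.30:8000/blog/api/list"


def list_url(**params):
    """Return the list endpoint URL with the given query parameters encoded."""
    query = urlencode(params)
    return f"{BASE_URL}?{query}" if query else BASE_URL


if __name__ == "__main__":
    print(list_url(search="elasticsearch"))
    print(list_url(tag="opensource,aws"))
```

Note that urlencode escapes the comma in the last example as %2C, which is the percent-encoded form of the same value.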


References:

[1] https://www.jianshu.com/p/cd3d60da3128

[2] https://www.helplib.com/GitHub/article_154027

[3] https://github.com/myarik/django-rest-elasticsearch

[4] https://www.jianshu.com/p/7179229eeac1

[5] https://blog.csdn.net/weixin_34018169/article/details/88469999

[6] https://www.jianshu.com/p/131ac72557a3

[7] https://mp.weixin.qq.com/s?__biz=MzA4MjEyNTA5Mw==&mid=2652567887&idx=1&sn=ebe895705a5e8062f982535fed048c9a&chksm=8464d105b3135813590f310fab05791cddad7602f921606a683ffc5a8a5ca0796b38467c74c5&mpshare=1&scene=23&srcid=0621jEcEuhggQoLCPan7mnhp#rd