The mongo-hadoop project (https://github.com/mongodb/mongo-hadoop) connects Hadoop and Spark to MongoDB. You can download it from its releases page (https://github.com/mongodb/mongo-hadoop/releases).
You must have at least version 3.0.0 of the MongoDB Java Driver installed in order to use the Hadoop connector.
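As a quick sanity check on the version requirement, the sketch below compares the version embedded in a driver jar's file name against the 3.0.0 minimum. It assumes the jar follows the usual `mongo-java-driver-X.Y.Z.jar` naming. The helper names `parse_driver_version` and `driver_is_supported` are hypothetical and not part of mongo-hadoop or the driver.

```python
import re

# Minimum MongoDB Java Driver version required by the Hadoop connector.
MIN_DRIVER_VERSION = (3, 0, 0)


def parse_driver_version(jar_name):
    """Return (major, minor, patch) parsed from a jar file name, or None.

    Assumes a name such as 'mongo-java-driver-3.0.4.jar'.
    """
    match = re.search(r"mongo-java-driver-(\d+)\.(\d+)\.(\d+)", jar_name)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def driver_is_supported(jar_name):
    """Return True if the jar name carries a version of at least 3.0.0."""
    version = parse_driver_version(jar_name)
    return version is not None and version >= MIN_DRIVER_VERSION
```

For example, `driver_is_supported("mongo-java-driver-2.13.0.jar")` is false, because tuples compare element by element and 2 is less than 3.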
PySpark in Practice: Installing the MongoDB Java Driver