We recently worked with MongoDB and its developer team to compare their Hadoop-based connector with our native connector solution. The resulting paper highlights how Stratio’s connector for Apache Spark implements the PrunedFilteredScan API instead of the TableScan API. This lets Spark hand the required columns and filters to the data source, so you can avoid scanning the entire collection.
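To make the difference between the two scan styles concrete, here is a minimal pure-Python sketch. It is not the connector's code and does not use Spark: `ToyCollection`, `table_scan`, `pruned_filtered_scan` and the `fields_shipped` counter are hypothetical names made up for illustration. It only shows why receiving the required columns and filters lets the source return far less data.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

Document = Dict[str, Any]
# Hypothetical stand-in for a pushed-down filter: (field name, predicate).
Filter = Tuple[str, Callable[[Any], bool]]


class ToyCollection:
    """In-memory stand-in for a collection that counts the fields it returns."""

    def __init__(self, docs: Iterable[Document]) -> None:
        self.docs: List[Document] = list(docs)
        self.fields_shipped = 0

    def table_scan(self) -> List[Document]:
        # TableScan style: every document with every field is handed to the
        # engine, which filters and projects afterwards.
        out = [dict(d) for d in self.docs]
        self.fields_shipped += sum(len(d) for d in out)
        return out

    def pruned_filtered_scan(
        self, required_columns: List[str], filters: List[Filter]
    ) -> List[Document]:
        # PrunedFilteredScan style: the source receives the projection and the
        # filters, so only matching documents and requested fields come back.
        # A real database could answer this with a query (and indexes); this
        # toy simply loops over its in-memory list.
        out = []
        for d in self.docs:
            if all(f in d and pred(d[f]) for f, pred in filters):
                out.append({c: d.get(c) for c in required_columns})
        self.fields_shipped += sum(len(d) for d in out)
        return out


students = ToyCollection([
    {"name": "Ana", "age": 17, "city": "Madrid"},
    {"name": "Luis", "age": 19, "city": "Sevilla"},
    {"name": "Eva", "age": 21, "city": "Madrid"},
])

# Full scan: the engine gets everything and filters on its own side.
full = students.table_scan()
adults_from_full = [d["name"] for d in full if d["age"] >= 18]
shipped_by_full_scan = students.fields_shipped

# Pruned, filtered scan: the same question, answered at the source.
students.fields_shipped = 0
pruned = students.pruned_filtered_scan(["name"], [("age", lambda a: a >= 18)])
shipped_by_pruned_scan = students.fields_shipped
```

Both approaches give the same names, but the second returns only the fields of the matching documents.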

Our connector supports the Spark Catalyst optimizer for both rule-based and cost-based query optimization. To operate against multi-structured data, the connector infers the schema by sampling documents from the MongoDB collection. This process is controlled by the samplingRatio parameter. If the schema is known, the developer can provide it to the connector, avoiding the need for any inference. Once data is stored in MongoDB, Stratio provides an ODBC/JDBC connector for integrating results with any BI tool.
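The idea of inferring a schema from a sample can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the connector's implementation: `infer_schema` and its `seed` argument are hypothetical, and the sketch simply records every type seen per field, whereas a real connector would resolve conflicting types into a single compatible one.

```python
import random
from typing import Any, Dict, Iterable, List


def infer_schema(
    docs: Iterable[Dict[str, Any]],
    sampling_ratio: float = 1.0,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Return {field: sorted type names} seen in a random sample of docs."""
    if not 0.0 < sampling_ratio <= 1.0:
        raise ValueError("sampling_ratio must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    seen: Dict[str, set] = {}
    for doc in docs:
        # Keep each document with probability sampling_ratio.
        if rng.random() >= sampling_ratio:
            continue
        for field, value in doc.items():
            seen.setdefault(field, set()).add(type(value).__name__)
    return {field: sorted(types) for field, types in seen.items()}


docs = [
    {"name": "Ana", "age": 17},
    {"name": "Luis", "age": "19", "email": "luis@example.com"},
]
schema = infer_schema(docs, sampling_ratio=1.0)
```

A lower ratio reads fewer documents, but fields that appear in only a few documents may be missed. That is the case where providing a known schema up front helps.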

The connector can be downloaded from the community Spark Packages repository. Installation is simple: the connector can be included in a Spark application with a single command. One of the main advantages of implementing Spark's DataFrame API is that you can integrate different data sources. For example, you could join a MongoDB collection with an Elasticsearch index.
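Why a common tabular abstraction makes cross-source joins possible can be shown with a small sketch. It assumes both sources have already been read into lists of row dictionaries; `hash_join` and the sample rows are hypothetical and stand in for what Spark does on DataFrames at much larger scale.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def hash_join(left: List[Row], right: List[Row], key: str) -> List[Row]:
    """Inner-join two lists of rows on `key`; left values win on name clashes."""
    # Build an index over the right-hand rows, grouped by join key.
    index: Dict[Any, List[Row]] = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Probe the index with each left-hand row.
    joined: List[Row] = []
    for l_row in left:
        for r_row in index.get(l_row[key], []):
            joined.append({**r_row, **l_row})
    return joined


# Rows as they might come from two different stores (made-up data).
from_mongodb = [{"student_id": 1, "name": "Ana"}, {"student_id": 3, "name": "Eva"}]
from_elasticsearch = [{"student_id": 1, "grade": 9}, {"student_id": 2, "grade": 7}]

result = hash_join(from_mongodb, from_elasticsearch, "student_id")
```

Once each source is exposed as rows with named columns, the join logic no longer needs to know where the rows came from.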

Many thanks to Mat Keep and Sam Weaver from MongoDB, and to our team of developers, for carrying out the analysis. Download the whitepaper here.

4 Comments

  1. Such a useful article, and very interesting to read. I would like to thank you for the effort you put into writing it.

  2. Thanks for the thorough explanation. Good summary of a simple but often misunderstood topic. This blog is fascinating and makes me want to know more about it.
     Thanks for sharing this post, and keep on sharing these kinds of useful posts.

  3. Hiya, I’m really glad I have found this information.
     Today bloggers publish only gossip, and this is really annoying. A good web site with interesting content is what I need. Thanks for keeping this web site, I’ll be visiting it.

  4. Thanks for the mention, and your site looks great! This is a great explanation, and you are so thorough. Much appreciated!
