Big Data Spain 2014 summary

Jan 22, 2015 | Business, Stratio

Now that the Data Sources API has been released, we wanted to take advantage of its new features, so we have developed a Spark-MongoDB library. With this new connector we hope to help the growing MongoDB community by simplifying interaction with this data source via Spark.

This library provides a mechanism for accessing MongoDB collections in a structured way from SparkSQL, accessible from the Python and Scala APIs. Since MongoDB is a leading open-source document database among NoSQL databases and is widely used in many projects [http://www.mongodb.com/leading-nosql-database], we find this connection, with all the operations permitted by SparkSQL, not only useful but necessary.
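
To make "structured access" to a document collection more concrete, here is a minimal plain-Python sketch. It is not the library's code and does not use Spark or MongoDB: it only assumes that structured access involves deriving a tabular schema from schemaless documents. The names `infer_schema`, `to_rows` and the `students` sample data are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def infer_schema(documents: List[Dict[str, Any]]) -> Dict[str, str]:
    """Collect every top-level field seen in the documents with its type name."""
    schema: Dict[str, str] = {}
    for doc in documents:
        for field, value in doc.items():
            type_name = type(value).__name__
            if field in schema and schema[field] != type_name:
                # Conflicting types for the same field: fall back to "str".
                schema[field] = "str"
            else:
                schema[field] = type_name
    return schema


def to_rows(documents: List[Dict[str, Any]],
            schema: Dict[str, str]) -> List[Tuple[Any, ...]]:
    """Project each document onto the schema columns (sorted by name).

    Fields missing from a document become None, much like NULL in SQL.
    """
    columns = sorted(schema)
    return [tuple(doc.get(col) for col in columns) for doc in documents]


# Hypothetical sample documents, standing in for a MongoDB collection.
students = [
    {"name": "Ana", "age": 17},
    {"name": "Luis", "age": 18, "enrolled": True},
]

schema = infer_schema(students)
rows = to_rows(students, schema)
```

Once documents are projected onto a fixed set of columns like this, they can be treated as a table, which is the kind of view a SQL engine needs in order to query them.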

Our library uses version 2.13.0 of the MongoDB Java Driver (which supports the newest MongoDB versions). We use the Casbah toolkit to better integrate our Scala implementation with MongoDB. As a result, the project is cleaner and less verbose, and development is simpler and more intuitive.

SparkSQL is being developed rapidly, adding support for reading data from other formats (Apache Hive, Parquet, …) and for performing many operations on that data. Our library extends these possibilities by adding another data source, whose data the user can combine with existing data in other formats.

We are looking forward to the new Spark 1.3 so that we can keep updating and evolving our library.
