In this post we will show how to use the different SQL contexts for data query on Spark. We will begin with Spark SQL and follow up with HiveContext. In addition to this, we will conduct queries on various NoSQL databases and analyze the advantages / disadvantages of using them.
Proud to share the press release announcing Stratio as Huawei’s technological partner and looking forward to working together.
When we first started using Spark, we were twenty people. Twenty Stratians. We took a risk and adopted Spark very early on, but with a lot of teamwork and a lot of mistakes, we managed to create the first pure Spark platform.
Stratio has just added top-k queries support to its Lucene based implementation of the Cassandra’s secondary indexes. This implementation was originally designed to allow embedded full-text and multivariable search in Apache Cassandra.
Once the Data Sources API has been released, we’ve wanted to take advantage of these new features and, for this reason, we have developed a Spark-MongoDB library. With this new connector we help the growing MongoDB community to simplify the interaction with this datasource via Spark.