Spark Streaming is one of the most widely used frameworks for real-time processing, alongside Apache Flink, Apache Storm and Kafka Streams. Compared to the others, however, Spark Streaming tends to have more performance issues, and because it processes data in micro-batches over time windows rather than event by event, it introduces additional latency.
In this post we will show how to use the different SQL contexts to query data in Spark. We will begin with Spark SQL and then move on to HiveContext. We will also run queries against several NoSQL databases and analyze the advantages and disadvantages of each.