Implicit parameters and conversions are powerful Scala features, increasingly used to build concise, versatile tools such as DSLs, APIs, libraries… When used correctly, they reduce the verbosity of Scala programs and make the code easier to read.
When surfing the internet, it is quite easy to find sites comparing the most popular machine learning toolkits. These sites give you a lot of information about the strengths and weaknesses of the libraries and how they work, along with examples that show how easy each tool is to use.
In this post we will show how to use the different SQL contexts to query data on Spark. We will begin with Spark SQL and follow up with HiveContext. We will also run queries against various NoSQL databases and analyze the advantages and disadvantages of using them.
When working with Big Data, you often need to aggregate data in real time, whether it comes from a specific service, such as social networks (Twitter, Facebook…), or from more diverse sources, like a weather station.
When working with Big Data, sometimes it’s useful to remember that powerful products wouldn’t work properly without the tools that build them.
Thanks to the changes proposed in CASSANDRA-8717, CASSANDRA-7575 and CASSANDRA-6480, Stratio is glad to present its Lucene-based implementation of Cassandra secondary indexes as a plugin that can be attached to the Apache distribution.
This post contains the winning solution for the Stratio challenge 2015 developed by Marco Piva, Leonardo Biagioli, Fabio Fantoni and Andrea De Marco (BitBang).
If you really want to learn and soak up every bit of Scala’s powerful functional features, try not to learn them all at once. Pick one and think about the parts of your current code where it might fit.
Security is often a forgotten concern in Big Data environments. However, as these technologies are being embraced by companies with sensitive data (think, for example, about banks or insurance companies), security is a growing requirement.
Stratio has just added support for top-k queries to its Lucene-based implementation of Cassandra secondary indexes. This implementation was originally designed to allow embedded full-text and multivariable search in Apache Cassandra.