December 2nd, 2013 was a great day for the Spark community: the first Spark Summit took place in San Francisco. The event confirmed Spark as one of the Big Data tools with the highest adoption rates in recent years. It hosted 450 professionals interested in the technology and laid out the roadmap for the coming months.
The event was organized in two tracks in which more than 30 talks were given. An excited audience made the event unforgettable.
The technical level of the talks was remarkably high, and the community presented both use cases and tools developed around this technology.
The event started with a keynote by Matei Zaharia (@matei_zaharia), the creator of Spark and one of the founders of Databricks (http://www.databricks.com). Matei’s talk covered the past, present and future of Spark and presented figures showing how much the technology has evolved since 2010.
Ion Stoica, CEO of Databricks, co-founder of Conviva and a professor at UC Berkeley, gave the second talk. In this session Ion offered a more business-oriented view of the technology, explaining the added value it brings, which is the main reason for its success over the last few years. Ion drew an apt analogy between the history of mobile devices and the Big Data tools we have today. In the beginning there were only rudimentary devices; later came a proliferation of devices, each specialized for a specific use case. In the end, the advent of the smartphone unified all of them in a single device. Just as the smartphone did, Spark aims to unify the Big Data tools into a single tool.
After that, Michael Franklin spoke about the AMPLab at UC Berkeley. The lab was created to build tools able to process massive amounts of information, and it is one of the most active contributors to Spark. Michael listed the main research projects currently active at the AMPLab: BlinkDB, a database able to query huge amounts of data in a very short time by letting the user tune the error level of the answer; and MLbase, a distributed library of machine learning algorithms built on top of Spark. In addition, the lab is working on more than a dozen other projects.
After the AMPLab, it was Yahoo’s turn to talk about the work they are currently doing with Spark.
Eric Baldeschwieler (former CEO of Hortonworks) closed the morning talks, giving the most inspiring speech of the whole event.
After lunch, the event was split into two tracks of talks. Among them, we’d like to highlight the talk given by Patrick Wendell of Databricks, who explained how to configure and tune a Spark installation, and the talk given by Kay Ousterhout, a PhD student at Berkeley, who presented Sparrow, an alternative task scheduler for Spark.
In conclusion, the technical level of all of the talks was very high, which added value to the whole event. The organisation was excellent, and we’d really like to thank Andy of Databricks and Courtney of Geeksession.
The event brought together a lot of really awesome people. Stratio intends to keep in touch with them and to collaborate with them in the near future. A special mention goes to the teams from Tuplejump, Ooyala and Mesosphere. Thanks to all of them.
We’ve got the feeling that it won’t be long before work starts on the next event, which will be held when Spark 1.0 is announced. We hope the community will keep growing along with this amazing technology.