{"id":664,"date":"2016-02-18T12:52:31","date_gmt":"2016-02-18T12:52:31","guid":{"rendered":"http:\/\/blog.stratio.com\/?p=664"},"modified":"2023-09-20T13:43:45","modified_gmt":"2023-09-20T13:43:45","slug":"how-to-aggregate-data-in-real-time-with-stratio-sparta","status":"publish","type":"post","link":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/","title":{"rendered":"How to aggregate Data in Real-Time with Stratio Sparta"},"content":{"rendered":"<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">When working with Big Data, we often need to aggregate data in real time, whether it comes from a specific service, such as social networks (Twitter, Facebook&#8230;) or even from more diverse sources, like a weather station<\/span><span style=\"font-weight: 400;\">.<\/span><span style=\"font-weight: 400;\">\u00a0A good way to process these large amounts of information is with\u00a0<\/span><strong><a href=\"http:\/\/spark.apache.org\/streaming\/\" target=\"_blank\" rel=\"noopener\">Spark Streaming<\/a><\/strong><span style=\"font-weight: 400;\"><strong>.<\/strong>\u00a0It gives us all the data in real time, but it has one drawback: you have to program it yourself.<\/span><!--more--><\/p>\n<h2><b>How to process and aggregate data<\/b><\/h2>\n<p><a href=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/2016-02-17-1.jpg\"><br \/>\n<img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-690 size-full\" title=\"Aggregate Data in Real-Time\" src=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/2016-02-17-1.jpg\" alt=\"Aggregate Data in Real-Time\" width=\"730\" height=\"312\" srcset=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17-1.jpg 730w, https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17-1-300x128.jpg 300w\" sizes=\"(max-width: 730px) 100vw, 730px\" \/><\/a><\/p>\n<p style=\"text-align: justify;\"><span 
style=\"font-weight: 400;\">To avoid this, we can use\u00a0<\/span><strong><a href=\"https:\/\/stratio.atlassian.net\/wiki\/display\/SPARKTA0x8\/1.+Introduction\" target=\"_blank\" rel=\"noopener\">Stratio Sparta<\/a><\/strong><span style=\"font-weight: 400;\">\u00a0to process and aggregate data. It helps us by reducing raw information and making it useful after it\u2019s processed. Later we\u2019ll persist the result in a\u00a0<\/span><strong><a href=\"https:\/\/www.mongodb.org\/\" target=\"_blank\" rel=\"noopener\">MongoDB<\/a><\/strong><span style=\"font-weight: 400;\">\u00a0cluster and provide the streaming through WebSockets with\u00a0<\/span><strong><a href=\"https:\/\/nodejs.org\/en\/\" target=\"_blank\" rel=\"noopener\">Node.js<\/a>.<\/strong><\/p>\n<p style=\"text-align: justify;\">Stratio Sparta is a very flexible, simple to use tool that allows us to transform raw information with\u00a0<strong>Morphline,<\/strong>\u00a0through its web interface, and aggregate data in different dimensions and time ranges.<\/p>\n<p><a href=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/4.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-683 size-full\" title=\"Aggregate Data in Real-Time\" src=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/4.jpg\" alt=\"Aggregate Data in Real-Time\" width=\"730\" height=\"312\" srcset=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/4.jpg 730w, https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/4-300x128.jpg 300w\" sizes=\"(max-width: 730px) 100vw, 730px\" \/><\/a><\/p>\n<h2><b>First Steps with Stratio Sparta<\/b><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">First of all, we must\u00a0<\/span><strong><a href=\"https:\/\/stratio.atlassian.net\/wiki\/display\/SPARKTA0x8\/4.+Get+Started\" target=\"_blank\" rel=\"noopener\">install Stratio Sparta<\/a><\/strong><span style=\"font-weight: 400;\">\u00a0as indicated in Stratio\u2019s 
documentation. For this example we\u2019ve used the\u00a0<\/span><strong><a href=\"https:\/\/stratio.atlassian.net\/wiki\/display\/MANAGER1x5\/1.+Introduction\" target=\"_blank\" rel=\"noopener\">Stratio Manager<\/a><\/strong><span style=\"font-weight: 400;\">\u00a0and installed MongoDB and Stratio Sparta on our 3-machine cluster. The Manager also lets us monitor the resources of our different services.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Once Sparta is installed and running, it serves its web interface on port 9090, where we can start to set up the different parameters:<\/span><\/p>\n<ul style=\"text-align: justify;\">\n<li style=\"font-weight: 400;\"><b>Input:\u00a0<\/b><span style=\"font-weight: 400;\">It\u2019s the origin of the data. It can come from Flume, Kafka, RabbitMQ, from a Socket, from a WebSocket or directly from Twitter.<\/span><\/li>\n<li style=\"font-weight: 400;\"><b>Output:\u00a0<\/b><span style=\"font-weight: 400;\">We can persist or show the information in different ways, such as MongoDB, Elasticsearch, Cassandra, Parquet, Redis, a CSV file or on the screen.<\/span><\/li>\n<li style=\"font-weight: 400;\"><b>Policies:\u00a0<\/b><span style=\"font-weight: 400;\">It\u2019s the most complex part of the set-up. We can do the following:<\/span>\n<ul>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Configure the input.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Configure the outputs.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Make transformations with Morphlines by type of data and date.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Make aggregations of different dimensions and time granularity. 
We can also apply common functions to the grouped data, such as\u00a0<\/span><span style=\"font-weight: 400;\"><code>count<\/code><\/span><span style=\"font-weight: 400;\">,\u00a0<\/span><span style=\"font-weight: 400;\"><code>sum<\/code><\/span><span style=\"font-weight: 400;\">,\u00a0<\/span><span style=\"font-weight: 400;\"><code>max<\/code><\/span><span style=\"font-weight: 400;\">,\u00a0<\/span><span style=\"font-weight: 400;\"><code>min<\/code><\/span>\u00a0and others.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">In this example we\u2019ll use the data API from meetup.com, which provides a WebSocket that streams the information being created or updated in the service. We\u2019ll then aggregate it by country and compute an hourly count.<\/span><\/p>\n<p><a href=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/Screenshot_1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-669 size-full\" title=\"Aggregate Data in Real-Time\" src=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/Screenshot_1.png\" alt=\"Aggregate Data in Real-Time\" width=\"730\" height=\"465\" srcset=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/Screenshot_1.png 730w, https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/Screenshot_1-300x191.png 300w\" sizes=\"(max-width: 730px) 100vw, 730px\" \/><\/a><\/p>\n<p>Afterwards we\u2019ll add the 3 MongoDB nodes as an output and set the name of the collection that it will generate.<\/p>\n<p><a href=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/Screenshot_2.png\"><img loading=\"lazy\" decoding=\"async\" class=\" aligncenter wp-image-670\" title=\"Aggregate Data in Real-Time\" src=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/Screenshot_2.png\" alt=\"Aggregate Data in Real-Time\" width=\"726\" height=\"369\" \/><\/a><\/p>\n<p style=\"text-align: justify;\"><span 
style=\"font-weight: 400;\">Of all the raw information provided by the WebSocket, we\u2019ll only take 3 of the attributes and give them the proper format. We\u2019ll establish the output of the transformation as\u00a0<\/span><span style=\"font-weight: 400;\"><code>country<\/code><\/span><span style=\"font-weight: 400;\">,\u00a0<\/span><span style=\"font-weight: 400;\"><code>response<\/code><\/span><span style=\"font-weight: 400;\">\u00a0and\u00a0<\/span><span style=\"font-weight: 400;\"><code>modified<\/code><\/span><span style=\"font-weight: 400;\">. The last one lets us easily retrieve the updates from Node.js.<\/span><\/p>\n<pre class=\"theme:monokai lang:js decode:true\">{\n  \"morphline\": {\n    \"id\": \"morphline1\",\n    \"importCommands\": [\n      \"org.kitesdk.**\"\n    ],\n    \"commands\": [\n      {\n        \"readJson\": {}\n      },\n      {\n        \"extractJsonPaths\": {\n          \"paths\": {\n            \"response\": \"\/response\",\n            \"country\": \"\/group\/group_country\",\n            \"modified\": \"\/mtime\"\n          }\n        }\n      },\n      {\n        \"removeFields\": {\n          \"blacklist\": [\n            \"literal:_attachment_body\"\n          ]\n        }\n      }\n    ]\n  }\n}<\/pre>\n<p style=\"text-align: justify;\">Once we\u2019ve defined the structure of the processed information, we\u2019ll set up an aggregation as follows:<\/p>\n<ul style=\"text-align: justify;\">\n<li style=\"font-weight: 400;\"><b>Time dimension:<\/b><span style=\"font-weight: 400;\">\u00a0name of the attribute that will hold the start of each aggregation period (day, hour\u2026).<\/span><\/li>\n<li style=\"font-weight: 400;\"><b>Granularity:<\/b><span style=\"font-weight: 400;\">\u00a0time interval over which the information will be aggregated. 
In this case it\u2019s hourly.<\/span><\/li>\n<li style=\"font-weight: 400;\"><b>Dimensions:\u00a0<\/b><span style=\"font-weight: 400;\">We\u2019ll group by\u00a0<\/span><span style=\"font-weight: 400;\"><code>country<\/code>,<\/span><span style=\"font-weight: 400;\">\u00a0though it\u2019s also possible to aggregate by\u00a0<\/span><span style=\"font-weight: 400;\"><code>country<\/code>\u00a0<\/span><span style=\"font-weight: 400;\">and<\/span><span style=\"font-weight: 400;\">\u00a0<code>response<\/code><\/span><span style=\"font-weight: 400;\">.<\/span><\/li>\n<li style=\"font-weight: 400;\"><b>Operators:<\/b><span style=\"font-weight: 400;\">\u00a0We\u2019ll apply two operators. The first counts the events (<\/span><span style=\"font-weight: 400;\"><code>count<\/code><\/span><span style=\"font-weight: 400;\">); the second keeps the last value of the\u00a0<\/span><span style=\"font-weight: 400;\"><code>modified<\/code><\/span><span style=\"font-weight: 400;\">\u00a0field, which was taken from the\u00a0<\/span><span style=\"font-weight: 400;\"><code>mtime<\/code>\u00a0<\/span><span style=\"font-weight: 400;\">attribute of the WebSocket message (with\u00a0<\/span><span style=\"font-weight: 400;\"><code>lastValue<\/code><\/span><span style=\"font-weight: 400;\">).<\/span><\/li>\n<\/ul>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">When all of this is set up, we only have to select an output (in this case MongoDB) and run the policy.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">With this configuration the aggregations should start as follows:<\/span><\/p>\n<p><a href=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/BUENO.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-699 size-full\" title=\"Aggregate Data in Real-Time\" src=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/BUENO.jpg\" alt=\"Aggregate Data in Real-Time\" 
width=\"730\" height=\"312\" srcset=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/BUENO.jpg 730w, https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/BUENO-300x128.jpg 300w\" sizes=\"(max-width: 730px) 100vw, 730px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400;\">We\u2019ll get the result in real time in our MongoDB cluster, and we can make queries to it from any other technology.<\/span><\/p>\n<p><a href=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/sparta.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-672 size-full\" title=\"Aggregate Data in Real-Time\" src=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/sparta.png\" alt=\"sparta\" width=\"478\" height=\"193\" srcset=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/sparta.png 478w, https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/sparta-300x121.png 300w\" sizes=\"(max-width: 478px) 100vw, 478px\" \/><\/a><\/p>\n<h2><b>WebSocket Server with Node.js<\/b><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">To set up our MongoDB connected WebSocket server, we need to have previously installed both modules. 
We can then read the database records that will later be sent over the WebSocket:<\/span><\/p>\n<pre class=\"theme:monokai lang:default decode:true\">var MongoClient = require('mongodb').MongoClient;\n\n\/\/ Connect to the database where Sparta persists its aggregations\nMongoClient.connect('mongodb:\/\/localhost:27017\/sparta', function(err, db) {\n  if (err) throw err;\n\n  db\n    .collection('id_country_hour')\n    .find()\n    .toArray(function(err, docs){\n      if (err) throw err;\n      console.log(docs);\n    });\n});<\/pre>\n<p><span style=\"font-weight: 400;\">When we run the script with Node.js, it prints the MongoDB content to the console.<\/span><\/p>\n<p><a href=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/Screenshot_4.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-673 size-full\" title=\"Aggregate Data in Real-Time\" src=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/Screenshot_4.png\" alt=\"Aggregate Data in Real-Time\" width=\"543\" height=\"202\" srcset=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/Screenshot_4.png 543w, https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/Screenshot_4-300x112.png 300w\" sizes=\"(max-width: 543px) 100vw, 543px\" \/><\/a><\/p>\n<p style=\"text-align: justify;\">After we check the access to the information, we\u2019ll create the WebSocket server in order to transmit the changes produced by Stratio Sparta in real time.<\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">To serve all the streaming data we must do the following:<\/span><\/p>\n<ul style=\"text-align: justify;\">\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">First, when a new client connects, we must store the connection generated by the WebSocket API so that we can broadcast to it later on. 
We must also remove the connection once the client disconnects.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">It&#8217;s also important to send the new client all the records that exist to date.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">Every so often we send the changes that have taken place in MongoDB, using the\u00a0<\/span><span style=\"font-weight: 400;\"><code>modified<\/code>\u00a0<\/span><span style=\"font-weight: 400;\">attribute to find them.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">For the example, we\u2019ll filter the countries to reduce the amount of data that will be sent.<\/span><\/li>\n<li style=\"font-weight: 400;\"><span style=\"font-weight: 400;\">We\u2019ll also filter the attributes that get sent, since we\u2019ll only be using\u00a0<\/span><span style=\"font-weight: 400;\"><code>country<\/code><\/span><span style=\"font-weight: 400;\">,\u00a0<\/span><span style=\"font-weight: 400;\"><code>hour<\/code><\/span><span style=\"font-weight: 400;\">\u00a0and\u00a0<\/span><span style=\"font-weight: 400;\"><code>count<\/code><\/span><span style=\"font-weight: 400;\">.<\/span><\/li>\n<\/ul>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">The example is brief and you can consult it\u00a0<\/span><a href=\"https:\/\/gist.github.com\/PedroGutierrezStratio\/5955bb97afb89f4c00ce\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">here<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<h2><b>WebSocket Client with JavaScript<\/b><\/h2>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Once we have all the data in our front-end we can show it in a table, chart or console. 
In this case we\u2019ll show the information in a table with\u00a0<\/span><span style=\"font-weight: 400;\"><code>console.table<\/code><\/span><span style=\"font-weight: 400;\">\u00a0to simplify the visualization.<\/span><\/p>\n<pre class=\"theme:monokai lang:default decode:true\">var ws = new WebSocket('ws:\/\/127.0.0.1:8008\/');\nvar data = {};\n\nws.onmessage = function(response){\n  var responseData = JSON.parse(response.data);\n\n  responseData\n    .map(function(row){\n      row.hour = new Date(row.hour).toISOString().slice(0, 16).replace('T',' ');\n      return row;\n    })\n    .forEach(function(row){\n      if(!data[row.hour]){\n        data[row.hour] = {};\n      }\n      data[row.hour][row.country] = row.count;\n    });\n\n  console.clear();\n  console.log(\"Last update:\", new Date().toLocaleString());\n  console.table(data);\n}<\/pre>\n<p>The result is simple but functional:<\/p>\n<p><a href=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/Screenshot_5.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-674 size-full\" title=\"Aggregate data in real-time\" src=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/Screenshot_5.png\" alt=\"Aggregate Data in Real-Time\" width=\"700\" height=\"133\" srcset=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/Screenshot_5.png 700w, https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/Screenshot_5-300x57.png 300w\" sizes=\"(max-width: 700px) 100vw, 700px\" \/><\/a><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">We can also make a more aesthetic visualization with charts. 
A quick way to make them is with\u00a0<\/span><span style=\"font-weight: 400;\"><code>Chart.js<\/code>.\u00a0<\/span><span style=\"font-weight: 400;\">We simply have to format the data to fit the\u00a0<code>Chart.js<\/code>\u00a0schema and we\u2019ll get the following result.<\/span><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">You can find this simple data visualization example\u00a0<\/span><a href=\"https:\/\/gist.github.com\/PedroGutierrezStratio\/98a2c6d5298ec92e64ef\" target=\"_blank\" rel=\"noopener\"><span style=\"font-weight: 400;\">here<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><a href=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/2016-02-17-4.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-693 size-full\" title=\"Data visualization of aggregated data\" src=\"http:\/\/blog.stratio.com\/wp-content\/uploads\/2016\/02\/2016-02-17-4.jpg\" alt=\"Aggregate Data in Real-Time data visualization\" width=\"730\" height=\"312\" srcset=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17-4.jpg 730w, https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17-4-300x128.jpg 300w\" sizes=\"(max-width: 730px) 100vw, 730px\" \/><\/a><\/p>\n<p style=\"text-align: justify;\"><span style=\"font-weight: 400;\">Handling information from JavaScript is simple, and there are many data visualization libraries. This makes it easy to use and visualize the data generated by Stratio Sparta aggregations and to build statistical reports, which would be very costly without a tool that processes data in real time.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>When working with Big Data, it&#8217;s frequent to have the need to aggregate data in real-time, whether it comes from a specific service, such as social networks (Twitter, Facebook&#8230;) or even from more diverse sources, like a 
weather station.\u00a0<\/p>\n","protected":false},"author":1,"featured_media":691,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[686],"tags":[19],"ppma_author":[795],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v22.9 (Yoast SEO v22.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>How to aggregate Data in Real-Time with Stratio Sparta - Stratio Blog<\/title>\n<meta name=\"description\" content=\"Spark Streaming is a good way to aggregate Data in real-time, but if you want to avoid programming it yourself, you can use Stratio Sparta.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"How to aggregate Data in Real-Time with Stratio Sparta\" \/>\n<meta property=\"og:description\" content=\"Spark Streaming is a good way to aggregate Data in real-time, but if you want to avoid programming it yourself, you can use Stratio Sparta.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/\" \/>\n<meta property=\"og:site_name\" content=\"Stratio\" \/>\n<meta property=\"article:published_time\" content=\"2016-02-18T12:52:31+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-09-20T13:43:45+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"730\" \/>\n\t<meta property=\"og:image:height\" content=\"312\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta 
name=\"author\" content=\"Stratio\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@stratiobd\" \/>\n<meta name=\"twitter:site\" content=\"@stratiobd\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Stratio\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/\"},\"author\":{\"name\":\"Stratio\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/d0377b199cd052b17e15c9ba44c45ab7\"},\"headline\":\"How to aggregate Data in Real-Time with Stratio Sparta\",\"datePublished\":\"2016-02-18T12:52:31+00:00\",\"dateModified\":\"2023-09-20T13:43:45+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/\"},\"wordCount\":967,\"publisher\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg\",\"keywords\":[\"Big Data\"],\"articleSection\":[\"Product\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/\",\"url\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/\",\"name\":\"How to aggregate Data in Real-Time with Stratio Sparta - Stratio 
Blog\",\"isPartOf\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg\",\"datePublished\":\"2016-02-18T12:52:31+00:00\",\"dateModified\":\"2023-09-20T13:43:45+00:00\",\"description\":\"Spark Streaming is a good way to aggregate Data in real-time, but if you want to avoid programming it yourself, you can use Stratio Sparta.\",\"breadcrumb\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#primaryimage\",\"url\":\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg\",\"contentUrl\":\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg\",\"width\":730,\"height\":312},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.stratio.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"How to aggregate Data in Real-Time with Stratio Sparta\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#website\",\"url\":\"https:\/\/www.stratio.com\/blog\/\",\"name\":\"Stratio Blog\",\"description\":\"Corporate 
blog\",\"publisher\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.stratio.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#organization\",\"name\":\"Stratio\",\"url\":\"https:\/\/www.stratio.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/stratio.com\/blog\/wp-content\/uploads\/2020\/06\/stratio-web-logo-1.png\",\"contentUrl\":\"https:\/\/stratio.com\/blog\/wp-content\/uploads\/2020\/06\/stratio-web-logo-1.png\",\"width\":260,\"height\":55,\"caption\":\"Stratio\"},\"image\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/stratiobd\",\"https:\/\/es.linkedin.com\/company\/stratiobd\",\"https:\/\/www.youtube.com\/c\/StratioBD\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/d0377b199cd052b17e15c9ba44c45ab7\",\"name\":\"Stratio\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/image\/bb38888f58c2bb664646155f78ae6ccc\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/e3387ad00609f34a56d6796400eb8191?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/e3387ad00609f34a56d6796400eb8191?s=96&d=mm&r=g\",\"caption\":\"Stratio\"},\"description\":\"Stratio guides businesses on their journey through complete #DigitalTransformation with #BigData and #AI. Stratio works worldwide for large companies and multinationals in the sectors of banking, insurance, healthcare, telco, retail, energy and media.\"}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"How to aggregate Data in Real-Time with Stratio Sparta - Stratio Blog","description":"Spark Streaming is a good way to aggregate Data in real-time, but if you want to avoid programming it yourself, you can use Stratio Sparta.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/","og_locale":"en_US","og_type":"article","og_title":"How to aggregate Data in Real-Time with Stratio Sparta","og_description":"Spark Streaming is a good way to aggregate Data in real-time, but if you want to avoid programming it yourself, you can use Stratio Sparta.","og_url":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/","og_site_name":"Stratio","article_published_time":"2016-02-18T12:52:31+00:00","article_modified_time":"2023-09-20T13:43:45+00:00","og_image":[{"width":730,"height":312,"url":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg","type":"image\/jpeg"}],"author":"Stratio","twitter_card":"summary_large_image","twitter_creator":"@stratiobd","twitter_site":"@stratiobd","twitter_misc":{"Written by":"Stratio","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#article","isPartOf":{"@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/"},"author":{"name":"Stratio","@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/d0377b199cd052b17e15c9ba44c45ab7"},"headline":"How to aggregate Data in Real-Time with Stratio Sparta","datePublished":"2016-02-18T12:52:31+00:00","dateModified":"2023-09-20T13:43:45+00:00","mainEntityOfPage":{"@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/"},"wordCount":967,"publisher":{"@id":"https:\/\/www.stratio.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#primaryimage"},"thumbnailUrl":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg","keywords":["Big Data"],"articleSection":["Product"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/","url":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/","name":"How to aggregate Data in Real-Time with Stratio Sparta - Stratio Blog","isPartOf":{"@id":"https:\/\/www.stratio.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#primaryimage"},"image":{"@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#primaryimage"},"thumbnailUrl":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg","datePublished":"2016-02-18T12:52:31+00:00","dateModified":"2023-09-20T13:43:45+00:00","description":"Spark Streaming is a good way to aggregate Data in real-time, but if you want to avoid programming 
it yourself, you can use Stratio Sparta.","breadcrumb":{"@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#primaryimage","url":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg","contentUrl":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2016\/02\/2016-02-17.jpg","width":730,"height":312},{"@type":"BreadcrumbList","@id":"https:\/\/www.stratio.com\/blog\/how-to-aggregate-data-in-real-time-with-stratio-sparta\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.stratio.com\/blog\/"},{"@type":"ListItem","position":2,"name":"How to aggregate Data in Real-Time with Stratio Sparta"}]},{"@type":"WebSite","@id":"https:\/\/www.stratio.com\/blog\/#website","url":"https:\/\/www.stratio.com\/blog\/","name":"Stratio Blog","description":"Corporate blog","publisher":{"@id":"https:\/\/www.stratio.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.stratio.com\/blog\/?s={search_term_string}"},"query-input":"required 
name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.stratio.com\/blog\/#organization","name":"Stratio","url":"https:\/\/www.stratio.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/stratio.com\/blog\/wp-content\/uploads\/2020\/06\/stratio-web-logo-1.png","contentUrl":"https:\/\/stratio.com\/blog\/wp-content\/uploads\/2020\/06\/stratio-web-logo-1.png","width":260,"height":55,"caption":"Stratio"},"image":{"@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/stratiobd","https:\/\/es.linkedin.com\/company\/stratiobd","https:\/\/www.youtube.com\/c\/StratioBD"]},{"@type":"Person","@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/d0377b199cd052b17e15c9ba44c45ab7","name":"Stratio","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/image\/bb38888f58c2bb664646155f78ae6ccc","url":"https:\/\/secure.gravatar.com\/avatar\/e3387ad00609f34a56d6796400eb8191?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/e3387ad00609f34a56d6796400eb8191?s=96&d=mm&r=g","caption":"Stratio"},"description":"Stratio guides businesses on their journey through complete #DigitalTransformation with #BigData and #AI. 
Stratio works worldwide for large companies and multinationals in the sectors of banking, insurance, healthcare, telco, retail, energy and media."}]}},"authors":[{"term_id":795,"user_id":1,"is_guest":0,"slug":"stratioadmin","display_name":"Stratio","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/e3387ad00609f34a56d6796400eb8191?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/posts\/664"}],"collection":[{"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/comments?post=664"}],"version-history":[{"count":9,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/posts\/664\/revisions"}],"predecessor-version":[{"id":13932,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/posts\/664\/revisions\/13932"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/media\/691"}],"wp:attachment":[{"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/media?parent=664"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/categories?post=664"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/tags?post=664"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=664"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}