{"id":1378,"date":"2014-03-14T04:26:13","date_gmt":"2014-03-14T04:26:13","guid":{"rendered":"http:\/\/stratio.tumblr.com\/post\/79547024299"},"modified":"2023-09-20T13:54:19","modified_gmt":"2023-09-20T13:54:19","slug":"paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1","status":"publish","type":"post","link":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/","title":{"rendered":"Paper of the week: &#8220;BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data&#8221; [1]"},"content":{"rendered":"<p>This paper was presented at the <a href=\"http:\/\/eurosys2013.tudos.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">EuroSys 2013 conference<\/a>\u00a0and is available for\u00a0<a href=\"http:\/\/eurosys2013.tudos.org\/wp-content\/uploads\/2013\/papers-alternative\/Agarwal.pdf\" target=\"_blank\" rel=\"noopener noreferrer\">download<\/a>\u00a0at the conference website. 
The paper presents BlinkDB, which, despite its name, is not a database but a query engine built on top of\u00a0<a href=\"http:\/\/hive.apache.org\/\" target=\"_blank\" rel=\"noopener noreferrer\">Hive<\/a>\u00a0and\u00a0<a href=\"http:\/\/shark.cs.berkeley.edu\/\" target=\"_blank\" rel=\"noopener noreferrer\">Shark<\/a> for running interactive SQL queries on large volumes of data using data samples.<\/p>\n<p><!--more--><\/p>\n<div style=\"text-align: justify;\">BlinkDB is built on two key ideas: an adaptive optimization framework that builds and maintains stratified samples, and a dynamic sample selection strategy that selects an appropriately sized sample based on a query\u2019s accuracy or response time requirements.<\/div>\n<div style=\"text-align: justify;\"><\/div>\n<div style=\"text-align: justify;\">This paper offers an interesting introduction to applying statistical inference techniques to Big Data and makes clear that there is always a trade-off between accuracy and performance. In that regard, BlinkDB reports information about query accuracy so that the user can make informed decisions. Although it is not clear what the cost of maintaining stratified samples is, the paper provides a good seed for future work in the area.<\/div>\n<div style=\"text-align: justify;\"><\/div>\n<div style=\"text-align: justify;\">[1]\u00a0<em>Agarwal, Sameer, et al. \u201cBlinkDB: queries with bounded errors and bounded response times on very large data.\u201d\u00a0Proceedings of the 8th ACM European Conference on Computer Systems. ACM, 2013.<\/em><\/div>\n","protected":false},"excerpt":{"rendered":"<p>This paper was presented at the\u00a0EuroSys 2013 conference\u00a0and is available for\u00a0download\u00a0at the conference website. 
The paper presents BlinkDB, which, despite its name, is not a database but a query engine built on top of\u00a0Hive\u00a0and\u00a0Shark.<\/p>\n","protected":false},"author":2,"featured_media":128,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[686],"tags":[19],"ppma_author":[794],"yoast_head":"<!-- This site is optimized with the Yoast SEO Premium plugin v22.9 (Yoast SEO v22.9) - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Paper of the week: &quot;BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data&quot; [1] - Stratio<\/title>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Paper of the week: &quot;BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data&quot; [1]\" \/>\n<meta property=\"og:description\" content=\"This paper was presented at the\u00a0EuroSys 2013 conference\u00a0and is available for\u00a0download\u00a0at the conference website. 
The paper presents BlinkDB, which, despite its name, is not a database but a query engine built on top of\u00a0Hive\u00a0and\u00a0Shark.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/\" \/>\n<meta property=\"og:site_name\" content=\"Stratio\" \/>\n<meta property=\"article:published_time\" content=\"2014-03-14T04:26:13+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-09-20T13:54:19+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"730\" \/>\n\t<meta property=\"og:image:height\" content=\"312\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"admin\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@stratiobd\" \/>\n<meta name=\"twitter:site\" content=\"@stratiobd\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"admin\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/\"},\"author\":{\"name\":\"admin\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/af4f5fbbeb95bd7d55f79d9a677e615d\"},\"headline\":\"Paper of the week: &#8220;BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data&#8221; [1]\",\"datePublished\":\"2014-03-14T04:26:13+00:00\",\"dateModified\":\"2023-09-20T13:54:19+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/\"},\"wordCount\":209,\"publisher\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/#organization\"},\"image\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg\",\"keywords\":[\"Big Data\"],\"articleSection\":[\"Product\"],\"inLanguage\":\"en-US\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/\",\"url\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/\",\"name\":\"Paper of the week: \\\"BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data\\\" [1] - 
Stratio\",\"isPartOf\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#primaryimage\"},\"image\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#primaryimage\"},\"thumbnailUrl\":\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg\",\"datePublished\":\"2014-03-14T04:26:13+00:00\",\"dateModified\":\"2023-09-20T13:54:19+00:00\",\"breadcrumb\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#primaryimage\",\"url\":\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg\",\"contentUrl\":\"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg\",\"width\":730,\"height\":312},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.stratio.com\/blog\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Paper of the week: &#8220;BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data&#8221; 
[1]\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#website\",\"url\":\"https:\/\/www.stratio.com\/blog\/\",\"name\":\"Stratio Blog\",\"description\":\"Corporate blog\",\"publisher\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.stratio.com\/blog\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#organization\",\"name\":\"Stratio\",\"url\":\"https:\/\/www.stratio.com\/blog\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/stratio.com\/blog\/wp-content\/uploads\/2020\/06\/stratio-web-logo-1.png\",\"contentUrl\":\"https:\/\/stratio.com\/blog\/wp-content\/uploads\/2020\/06\/stratio-web-logo-1.png\",\"width\":260,\"height\":55,\"caption\":\"Stratio\"},\"image\":{\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/x.com\/stratiobd\",\"https:\/\/es.linkedin.com\/company\/stratiobd\",\"https:\/\/www.youtube.com\/c\/StratioBD\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/af4f5fbbeb95bd7d55f79d9a677e615d\",\"name\":\"admin\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/image\/589aaf4b404b1fe099b09564062c4563\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/9b181ae4395243dccaf1c3e3a4749d81?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/9b181ae4395243dccaf1c3e3a4749d81?s=96&d=mm&r=g\",\"caption\":\"admin\"}}]}<\/script>\n<!-- \/ Yoast SEO Premium plugin. 
-->","yoast_head_json":{"title":"Paper of the week: \"BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data\" [1] - Stratio","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/","og_locale":"en_US","og_type":"article","og_title":"Paper of the week: \"BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data\" [1]","og_description":"This paper was presented at the\u00a0EuroSys 2013 conference\u00a0and is available for\u00a0download\u00a0at the conference website. The paper presents BlinkDB, which, despite its name, is not a database but a query engine built on top of\u00a0Hive\u00a0and\u00a0Shark.","og_url":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/","og_site_name":"Stratio","article_published_time":"2014-03-14T04:26:13+00:00","article_modified_time":"2023-09-20T13:54:19+00:00","og_image":[{"width":730,"height":312,"url":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg","type":"image\/jpeg"}],"author":"admin","twitter_card":"summary_large_image","twitter_creator":"@stratiobd","twitter_site":"@stratiobd","twitter_misc":{"Written by":"admin","Est. 
reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#article","isPartOf":{"@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/"},"author":{"name":"admin","@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/af4f5fbbeb95bd7d55f79d9a677e615d"},"headline":"Paper of the week: &#8220;BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data&#8221; [1]","datePublished":"2014-03-14T04:26:13+00:00","dateModified":"2023-09-20T13:54:19+00:00","mainEntityOfPage":{"@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/"},"wordCount":209,"publisher":{"@id":"https:\/\/www.stratio.com\/blog\/#organization"},"image":{"@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#primaryimage"},"thumbnailUrl":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg","keywords":["Big Data"],"articleSection":["Product"],"inLanguage":"en-US"},{"@type":"WebPage","@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/","url":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/","name":"Paper of the week: \"BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data\" [1] - 
Stratio","isPartOf":{"@id":"https:\/\/www.stratio.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#primaryimage"},"image":{"@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#primaryimage"},"thumbnailUrl":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg","datePublished":"2014-03-14T04:26:13+00:00","dateModified":"2023-09-20T13:54:19+00:00","breadcrumb":{"@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#primaryimage","url":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg","contentUrl":"https:\/\/www.stratio.com\/blog\/wp-content\/uploads\/2014\/03\/stratio.jpg","width":730,"height":312},{"@type":"BreadcrumbList","@id":"https:\/\/www.stratio.com\/blog\/paper-of-the-week-blinkdb-queries-with-bounded-errors-and-bounded-response-times-on-very-large-data-1\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.stratio.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Paper of the week: &#8220;BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data&#8221; [1]"}]},{"@type":"WebSite","@id":"https:\/\/www.stratio.com\/blog\/#website","url":"https:\/\/www.stratio.com\/blog\/","name":"Stratio 
Blog","description":"Corporate blog","publisher":{"@id":"https:\/\/www.stratio.com\/blog\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.stratio.com\/blog\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/www.stratio.com\/blog\/#organization","name":"Stratio","url":"https:\/\/www.stratio.com\/blog\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/logo\/image\/","url":"https:\/\/stratio.com\/blog\/wp-content\/uploads\/2020\/06\/stratio-web-logo-1.png","contentUrl":"https:\/\/stratio.com\/blog\/wp-content\/uploads\/2020\/06\/stratio-web-logo-1.png","width":260,"height":55,"caption":"Stratio"},"image":{"@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/x.com\/stratiobd","https:\/\/es.linkedin.com\/company\/stratiobd","https:\/\/www.youtube.com\/c\/StratioBD"]},{"@type":"Person","@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/af4f5fbbeb95bd7d55f79d9a677e615d","name":"admin","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.stratio.com\/blog\/#\/schema\/person\/image\/589aaf4b404b1fe099b09564062c4563","url":"https:\/\/secure.gravatar.com\/avatar\/9b181ae4395243dccaf1c3e3a4749d81?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/9b181ae4395243dccaf1c3e3a4749d81?s=96&d=mm&r=g","caption":"admin"}}]}},"authors":[{"term_id":794,"user_id":2,"is_guest":0,"slug":"admin","display_name":"admin","avatar_url":"https:\/\/secure.gravatar.com\/avatar\/9b181ae4395243dccaf1c3e3a4749d81?s=96&d=mm&r=g","0":null,"1":"","2":"","3":"","4":"","5":"","6":"","7":"","8":""}],"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/posts\/1378"}],"collection":[{"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www
.stratio.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/comments?post=1378"}],"version-history":[{"count":23,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/posts\/1378\/revisions"}],"predecessor-version":[{"id":13920,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/posts\/1378\/revisions\/13920"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/media\/128"}],"wp:attachment":[{"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/media?parent=1378"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/categories?post=1378"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/tags?post=1378"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/www.stratio.com\/blog\/wp-json\/wp\/v2\/ppma_author?post=1378"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}