Putting value at the core of your business: Data-driven vs Data-centric
A data-driven or data-centric company?
We’re always hearing technology companies claim that the future lies in becoming a data-driven organisation: “Five steps to becoming a data-driven company”, “Build a data-driven business”, “Becoming a data-driven enterprise takes dedication”, “Apache Hadoop and big data platforms for a data-driven enterprise.”…
In a Forbes Insight and EY information brief, Marc Altshuller, IBM General Manager for Business Analytics, emphasized how important it is for companies to be data-driven if they want to grow revenue and profits.
But what about the data-centric approach? Is the journey the same as towards a data-driven approach? Is the result the same?
The answer is very simple: No. The two are often confused, and companies that are working towards becoming data-driven think they are on the road to a data-centric paradise.
But is a data-driven company even close to data-centricity? It’s a good start, and in terms of culture, it helps get across the message of the value of data. But there is still a lot of work to be done.
In this post, we will analyze the differences between a data-driven and a data-centric company.
Towards a data-driven approach
Many books discuss data-drivenness, and how to put data at the heart of a business’s culture. Discussions center around achieving better data quality and more valuable information to base decision-making on.
This is where Big Data comes into the picture. Companies are told to ingest all their data, clean it and let the magic happen! Easier said than done, especially when there is such a lack of skills in the sector. Personally, I get very excited about Big Data and the immense possibilities. I truly think that these new advanced analytics methods and technologies are amazing. But when I’m told to “let the magic happen”, I have to admit I feel a bit like a muggle.
The reality is that most projects like these fail. In 2015, Gartner predicted that, through 2017, 60 percent of big data projects would fail to go beyond piloting and experimentation, and would be abandoned. The projects tend to end up as yet another Proof of Concept and maintain an application-centric focus – that is to say, they are oriented towards a specific use case in which each application has its own data lake and is responsible for collecting and storing the data it needs.
This is a long way from being a data-centric company, in which data sits at the heart of the company’s IT systems rather than being a by-product of a business process implemented through an application built for a specific functionality.
There are, of course, many success stories of Big Data and data-driven companies creating high-quality, high-value data lakes. But let me pose a question: what is the difference between the relatively new concept of data lakes and traditional data warehouses? Data lakes are typically designed to complement data warehouses, but it is basically the same concept with different technologies. While the data lake has delivered some improvements, you won’t make a data revolution by creating new big data silos. The key issue remains: in an application-centric model for IT, the applications still own the data.
A data-driven company improves its analytical capacities and business intelligence, bringing new insights to the company, but it is nowhere near the data-centric revolution that we implement here at Stratio.
Adding data-centricity to the mix
As we explain in our whitepaper (Data-centric Architecture: A model for embracing the machine age), the information-silo problem has a single root: the application-centric approach. Business Units create projects or buy products to respond to a specific need with a deadline, ignoring other KPIs such as data control, quality or a reusable architecture. After decades of this approach, we all know how the story ends: applications own the data, products own the data, or even worse, vendors own the data. Imagine how frustrating it must be not to have your most valuable asset under your control.
A data-centric architecture has a permanent and primary core: Data. Applications and services are ephemeral; they live only as long as they are useful. But Data is always there.
Sounds easy, right? So, where is the problem? It’s not that simple…
Businesses want new functionalities fast, and it’s easier to build new data models or to buy new products to cover these functionalities than to design and maintain a single data model. When I say “a single data model”, I mean a unified and integrated vision of the data, not a “single database”. This means that you can model your data for a specific use case, but always in a centralized way, with governed data and processes to ensure data accuracy, integrity, and timeliness.
In this scenario, all applications and services read from and write to the same data model.
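To make the idea concrete, here is a minimal Python sketch of two applications sharing one governed model. Everything in it is hypothetical and for illustration only: `UnifiedDataModel`, `CustomerRecord`, the application names `crm` and `billing`, and the toy email rule are assumptions, not part of any real product.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CustomerRecord:
    """One entity in the shared model (fields are illustrative)."""
    customer_id: str
    email: str
    updated_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class UnifiedDataModel:
    """Hypothetical governed store that every application goes through."""

    def __init__(self):
        self._customers = {}
        self.audit_log = []  # entries are (application, action, customer_id)

    def write_customer(self, app, record):
        # A governance rule is enforced once, for every writer.
        if "@" not in record.email:
            raise ValueError(f"invalid email for {record.customer_id}")
        self._customers[record.customer_id] = record
        self.audit_log.append((app, "write", record.customer_id))

    def read_customer(self, app, customer_id):
        self.audit_log.append((app, "read", customer_id))
        return self._customers[customer_id]


# Two applications, one model: neither of them owns the data.
model = UnifiedDataModel()
model.write_customer("crm", CustomerRecord("c-1", "ana@example.com"))
seen_by_billing = model.read_customer("billing", "c-1")
print(seen_by_billing.email)
print(model.audit_log)
```

The point of the sketch is that validation and auditing live with the data rather than inside each application, so replacing the `crm` application would not change who owns the customer record.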
There are so many benefits to a data-centric approach:
- No data misunderstandings
- A simplified systems and applications map
- No silos
- No data owners outside of your organization
- The unified governance of data and processes
Evolving from a data-driven to a data-centric company
First of all, you need to define, implement, maintain and add the unified data model to your core business. If you are starting in a data-driven scenario, you will probably have to generate a data catalog of almost all your data. With this catalog, it will be easier to generate a map of your data and to then create a data model, rather than doing it from scratch.
This is a bottom-up approach to data-centric governance that starts at the technical level and builds up to the business level. In the latest release of our product, Stratio Data Centric, we included the first increment of our Governance module with automatic metadata autodiscovery. This feature provides the aforementioned data catalog out-of-the-box, integrated with a Data Dictionary in which you can see all your technical metadata and include new business metadata.
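Independently of any product, the bottom-up idea can be sketched in a few lines of Python: ask an existing database what it contains and collect the technical metadata into a catalog. This is only an illustration under stated assumptions. It uses an in-memory SQLite database with two made-up tables, and the helper `discover_catalog` is hypothetical; it is not how Stratio's Governance module works internally.

```python
import sqlite3


def discover_catalog(conn):
    """Return {table: [(column, declared_type), ...]} for a SQLite connection."""
    catalog = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        catalog[table] = [(col[1], col[2]) for col in columns]
    return catalog


# Made-up source system standing in for an existing application database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT)")
conn.execute(
    "CREATE TABLE invoices (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)

catalog = discover_catalog(conn)
for table, columns in catalog.items():
    print(table, columns)
```

A real catalog would span many sources and formats, but the principle is the same: the technical metadata is read from the systems themselves instead of being documented by hand.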
At first, your catalog will look like a mess. This is because you still need to add business tags to your data and unify everything with, for example:
- Functional aliases
- Levels of sensitivity
- Quality levels
- Business terms
- Associated KPIs
In Stratio’s Data Dictionary it is possible to tag all your business metadata over your autodiscovered catalog.
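As a rough illustration of what tagging means in practice, the Python sketch below attaches the kinds of business metadata listed above to discovered columns and reports what is still untagged. The `CatalogEntry` structure, the field names and the sample values are all assumptions made for this example; they do not reflect the actual schema of Stratio's Data Dictionary.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    # Technical metadata, as a discovery step might produce it.
    table: str
    column: str
    data_type: str
    # Business metadata, added by people who know the domain.
    functional_alias: str = ""
    sensitivity: str = "unclassified"
    quality_level: str = "unknown"
    business_terms: list = field(default_factory=list)
    kpis: list = field(default_factory=list)


def untagged(entries):
    """Entries that still lack a functional alias or a sensitivity level."""
    return [
        e for e in entries
        if not e.functional_alias or e.sensitivity == "unclassified"
    ]


# Cryptic technical names are typical of a freshly discovered catalog.
entries = [
    CatalogEntry("cust_tbl", "eml_addr", "TEXT"),
    CatalogEntry("inv_tbl", "tot_amt", "REAL"),
]

# Tag the first column with business meaning.
entries[0].functional_alias = "Customer email"
entries[0].sensitivity = "personal"
entries[0].business_terms.append("Customer")
entries[0].kpis.append("Contactable customers")

remaining = [(e.table, e.column) for e in untagged(entries)]
print(remaining)
```

Tracking what remains untagged is what makes the work incremental: the catalog starts out messy and becomes more useful with each column that gains a business meaning.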
The secret is that you don’t have to do all this mapping from scratch. As I mentioned before, a data-driven company already has enough mapping to at least evaluate the data needed for new advanced analytics projects. This is a great first model to work with in an incremental way. In cases where you still don’t have enough information, a Data Discovery tool could be useful. Stratio Data Centric has a module that covers this feature: Stratio Discovery. This module allows you to ask questions, check your data profiling, create dashboards and share them.
What comes next depends on you! It’s critical to maintain and enrich your unified data model, and the only way to ensure this is to include the model in your business assets. Every new data project should have KPIs and/or use and enrich this core model.
Conclusion: Data-Driven experiences to become a Data-Centric Company
Becoming a data-driven company is a useful first step, but it is based on building tools, abilities, and a culture that acts on data, rather than on a real internal transformation around data.
Use your data-driven experiences to move up to a higher level and become a data-centric company, putting your data at the core of your organization.
Stratio Data Centric is a unique product that does just that. It puts your most valuable asset at the core of your business: YOUR DATA.
In a future post I’ll discuss the differences between top-down and bottom-up approaches to GDPR compliance in a data-centric approach.
Alfonso Fernández is Product Owner at Stratio, where he currently leads the Governance and Discovery modules of Stratio Data Centric. He holds degrees in Computer Science from the Universidad Autónoma de Madrid and has extensive experience in IT project development. During the last 4 years, he was Senior Manager of the Big Data unit at EY and KPMG, where he led the development of Big Data solutions for Santander Bank. Previously, he worked at everis and coordinated the 1st Open Datathon in Madrid.