“In God we trust, all others must bring data”
This quote, attributed to William Edwards Deming, sets the tone for businesses around the world today. What is data, and why is it so important? What is the buzz around data all about? These questions come not just from novice outsiders but from established enterprises as well. Before we get to the buzz, we need to answer two questions that precede it: what a data strategy is and why it is important. In short, a data strategy is the 5Ws and H of enterprise data: why, what, where, when, who and how.
The importance of a data strategy is underscored by Keith B. Carter in his book, Actionable Intelligence: A Guide to Delivering Business Results with Big Data Fast!, where he states that the biggest mistakes made by enterprises are indiscriminate data acquisition and indiscriminate data categorization[1]. Both mistakes indicate the lack of an effective data strategy. Another viewpoint, on TechRepublic[2], emphasizes that the biggest challenges for today’s data librarians are that they can’t keep every byte that comes in and that they need a business continuity plan for data. This also highlights why enterprises must invest in understanding what their data requirements are and take a strategic approach to aggregating, managing and utilizing this data.
Data Strategy
What is data strategy? What does it comprise? In broad terms, data strategy can be defined as:
The approach that businesses take in order to gain insights into how past failures can be avoided, successes repeated and most importantly, how lessons from history can be utilized to shape a better future.
It is the comprehensive methodology for data collection, management, storage, security, governance and retention, among other things. A good data strategy adapts to changing business objectives, seeks the most suitable technologies, considers the requirements of all departments and links with the corporate strategy of the organization.
What goes into a Data Strategy?
First, we ought to ask why data, and why the data strategy – this forms the basis of the strategy itself.
“If you don’t know how to ask the right questions, you discover nothing.” That is William Edwards Deming again, as he seems to put things into perspective quite nicely. Perspective is what drives the data strategy. If you’re collecting data, you first need to ask why you are doing so.
Why: The answer to this question, when analysed logically, gives you the business case for acquiring data and, when crystallized, the scope of data acquisition and the data management strategies and processes. It is the first and most critical question. A data strategy should clearly show the tangible business benefits of investing in data technologies.
What: This covers what data you want to acquire and for what purpose. The answers allow you to determine whether your internal data sources are sufficient and, if you have multiple internal sources, which to start with, how to scale up, and how to choose among them. They also allow you to define the external data you would use and to limit the indiscriminate aggregation of data, something that has caused many a data strategy to fail. This is where the existing data sources, technologies, procedures and types of data are analysed, and where new data architecture and systems are set up after due diligence.
Where: Asking where your data comes from and where it resides lets you visualize how your data is distributed. That in turn gives you a strategy for integrating it as enterprise data, which improves the ability of the data to answer questions consistently.
When: There are two perspectives to ‘when’ when it comes to data: how far back the historical data you aggregate should go, and when your data becomes obsolete.
Identifying the spectrum of historical data you would like to aggregate gives you a picture of both the size of your data, which defines your storage and retrieval systems, and the approach to data acquisition. Identifying when your data becomes obsolete lets you keep in check the inherent tendency of data to confuse and confound its users. It also improves the timeliness and reliability of the lessons data can teach you, and lowers the cost of arriving at them, because the cost of managing and using data grows with its size.
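As a rough illustration of the obsolescence side of ‘when’, the sketch below separates records into current and obsolete using a cutoff date. The three-year retention window, the `recorded_on` field and the sample records are assumptions made for this example only; a real retention period would come out of the data strategy itself.

```python
from datetime import date, timedelta

# Hypothetical retention window (about three years); an assumption for
# illustration, not a recommendation.
RETENTION_DAYS = 365 * 3


def split_by_age(records, today, retention_days=RETENTION_DAYS):
    """Separate records into (current, obsolete) using a cutoff date."""
    cutoff = today - timedelta(days=retention_days)
    current = [r for r in records if r["recorded_on"] >= cutoff]
    obsolete = [r for r in records if r["recorded_on"] < cutoff]
    return current, obsolete


# Made-up sample records, each stamped with the date it was captured.
records = [
    {"id": 1, "recorded_on": date(2012, 5, 1)},
    {"id": 2, "recorded_on": date(2014, 11, 20)},
    {"id": 3, "recorded_on": date(2015, 2, 3)},
]

current, obsolete = split_by_age(records, today=date(2015, 6, 1))
```

Only the `current` list would then feed analysis, while the `obsolete` list could be archived or dropped, which keeps both storage size and cost in check.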
Who: This is a very important question because, in today’s knowledge economy, proprietary information is the basis for competitive differentiation. It also defines the stakeholders who will ultimately use the data. Finance, Operations, Top Management, Business Development and Marketing all use the same data at the end of the day, but they do so in vastly different ways, and the data must be provided in a manner appropriate for each purpose. This in turn defines the storage and transformation strategies that are crucial to data management. The answer also provides the data access and security strategies and processes that help maintain data integrity and data governance and provide the required level of security for the crown jewels of the enterprise.
How: This essentially defines the way the data will be accessed, utilized and modified by the stakeholders. As we saw above, data has to be presented in different dimensions and perspectives, to different applications, through different access modes and on different information delivery platforms. The answer to the ‘how’ defines the architecture of the data consumption solution and thus also helps define the security, the transformation strategies, and the storage and retrieval systems.
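The ‘who’ and ‘how’ questions above can be made concrete with a small sketch in which the same record is presented differently to different stakeholders. The role names, field names and sample order are hypothetical, and a real enterprise would enforce access in its data platform rather than in application code like this.

```python
# Hypothetical mapping of stakeholder groups to the fields each may read.
ROLE_FIELDS = {
    "finance": {"order_id", "revenue", "cost"},
    "marketing": {"order_id", "region", "channel"},
}


def view_for(role, record):
    """Return only the fields that the given role is allowed to read."""
    allowed = ROLE_FIELDS.get(role)
    if allowed is None:
        # Roles without a defined view get no access at all.
        raise PermissionError(f"no access defined for role: {role}")
    return {key: value for key, value in record.items() if key in allowed}


# Made-up sample record shared by all stakeholders.
order = {
    "order_id": 42,
    "revenue": 1200.0,
    "cost": 800.0,
    "region": "APAC",
    "channel": "email",
}

finance_view = view_for("finance", order)
marketing_view = view_for("marketing", order)
```

Both views come from the same underlying record, so the figures stay consistent across departments while each group sees only what its purpose requires.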
“The greatest value of a picture is when it forces us to notice what we never expected to see,” as John Tukey put it. This is the real value delivered by data, and the data strategy is a crucial tool in making sure this happens as often as we would like.
[1] http://searchbusinessanalytics.techtarget.com/feature/Managing-big-data-the-two-biggest-mistakes-companies-make
[2] http://www.techrepublic.com/article/big-data-trends-in-2015-reflect-strategic-and-operational-goals/