Data has a lifecycle. Every part of that lifecycle matters to a different type of user, and every business will have its own rules for what is considered “hot”, “warm”, and “cold” data.

Meeting these rules consistently is a process built into every data warehouse effort. But when those business rules change, how flexible is the architecture you have chosen at adapting to them? Let’s look at a scenario where the business rules change, and at how using a data vault architecture as the foundation of your data warehouse effort can mitigate risk and facilitate the rapid deployment of new standards to meet your business goals.

Consider an organization with three primary applications: Sales, Service, and Inventory Management. Sales data arrives via a low-latency feed into a data vault; Service and Inventory are loaded by a nightly batch process. The Sales data is stored on low-latency SSD drives, while the other two systems’ data sits on 15k-RPM spindles. The data vault feeds multiple data marts.

The business rules under which this architecture was originally deployed stated that all data sourced from Sales that is less than two weeks old is stored on SSD drives.

Business evolves: regulatory requirements change, or the business wants to review data sooner than was previously acceptable. Suppose it is decided to migrate all Service data less than six weeks old onto SSD drives, both in the data vault and in the data marts the vault feeds.
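Expressed as a policy, the old and new rules differ only in parameters, not in shape. A minimal sketch in Python (the source names and thresholds mirror the scenario above; nothing here is a standard data vault API):

```python
from datetime import datetime, timedelta

# Hypothetical tiering policy: maximum age of data kept on SSD, per source system.
# Originally only Sales was tiered to SSD (two weeks); the new business rule
# adds Service with a six-week window. Unlisted sources stay on spinning disk.
SSD_RETENTION = {
    "sales": timedelta(weeks=2),
    "service": timedelta(weeks=6),  # the new business rule
}

def storage_tier(source: str, load_date: datetime, now: datetime) -> str:
    """Return 'ssd' or 'spindle' for a record, based on its source and age."""
    max_age = SSD_RETENTION.get(source.lower())
    if max_age is not None and now - load_date <= max_age:
        return "ssd"
    return "spindle"
```

Because the rule is data about the warehouse rather than structure within it, changing it means editing a policy table, not remodeling tables.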

What architectural changes are required to meet this new business goal?

Answer: 0

No architectural change is required to either the data vault or the data marts. The data vault architecture itself is resilient enough to absorb this type of change without modification. Some changes are required at the physical layer where the data is stored; once those are made, the data vault structures, and the rules governing how data is loaded into them, tell you exactly which data needs to migrate from one physical storage area to another.
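The point about the load rules telling you where the data is can be made concrete: every data vault row carries a load date recorded at load time, so finding the rows to move is a simple filter on that audit column. A sketch using plain dictionaries to stand in for satellite rows (no particular database API is assumed):

```python
from datetime import datetime, timedelta

def rows_to_migrate(satellite_rows, now, max_age=timedelta(weeks=6)):
    """Select rows newer than the cutoff; these belong on SSD.

    Each row is assumed to be a dict carrying a 'load_date' key, the
    standard data vault audit column stamped when the row was loaded.
    """
    cutoff = now - max_age
    return [row for row in satellite_rows if row["load_date"] >= cutoff]

# Example: two Service satellite rows, one recent, one old.
now = datetime(2024, 6, 1)
rows = [
    {"hub_key": "CASE-100", "load_date": datetime(2024, 5, 20)},
    {"hub_key": "CASE-007", "load_date": datetime(2024, 1, 5)},
]
recent = rows_to_migrate(rows, now)  # only CASE-100 qualifies
```

In practice this filter would drive a partition move or tablespace migration; the structures themselves never change.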

A logical abstraction layer above your physical storage, combined with breaking the data apart into business keys, the relationships between those keys, and the contextual information describing either the keys or the relationships, gives you the most flexibility in adapting to new business rules that affect the performance of your data warehouse’s integration layer.
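That three-way decomposition is the data vault’s hub/link/satellite split. A minimal sketch of the three structures as Python dataclasses (field names are illustrative, not a formal standard):

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Hub:
    """A business key, e.g. a sales order number."""
    business_key: str
    load_date: datetime
    record_source: str

@dataclass
class Link:
    """A relationship between business keys, e.g. order-to-service-case."""
    hub_keys: tuple          # business keys of the hubs this link connects
    load_date: datetime
    record_source: str

@dataclass
class Satellite:
    """Contextual attributes describing a hub or a link, versioned by load date."""
    parent_key: str          # the hub or link this context describes
    load_date: datetime
    record_source: str
    attributes: dict
```

Because context lives only in satellites, moving hot data between storage tiers touches satellite rows without disturbing the keys or relationships.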
