There are a number of different ‘types’ of data warehouses. I say this with trepidation because a data warehouse is usually not referred to as having a type. Perhaps I should rephrase my statement; there are a number of different types of data marts. A data warehouse can contain multiple data marts, each with a different aim. For the purposes of this discussion, I’m talking about a data warehouse as if it has a primary objective, which in my experience most do.
Read any book on data warehousing, look at a blog with a sample star schema, or take an introductory data warehouse design class, and you will see the same pattern. You will see a fact table representing a well-defined transaction surrounded by several dimensions. The fact table will be easy to understand; it usually reflects a common transaction that has been mapped from an application in a straightforward manner. It will look something like this:
In the diagram above, the fact table is derived from a sales transaction that occurred in an OLTP database. This star schema is a good learning tool and it’s a pattern that can be reused for many transactional processes, but let’s talk about business process workflows.
Examples of Business Process Workflows
Business Process Workflows happen over periods of time, are vague in their scope, and acquire new or changing data as the process moves forward. Also, it can be uncertain when they are actually complete. These are a few examples of the kinds of business processes I’m referring to:
– A financial services company receives a request from a customer to open a new 401k account.
– A social worker places a child in a foster home.
– A laptop in sent in to the manufacturer for warranty repair.
What’s different about the Business Process Workflow data warehouse vs. the Transactional data warehouse? The following are significant distinctions:
1. Workflows span long periods of time, some take hours, but others months or even years.
2. Workflows don’t have definite time periods and some may not end.
3. Workflows almost always require accumulating facts that must be updated after they are stored when more data arrives.
4. Workflow data can be removed or hidden due to incorrect data entry, legal issues, or duplicate data entry.
5. There are no standard practices for data as there are with financial, accounting and sales applications.
6. Data is often much more poorly entered and verified.
In the next several days, I will discuss each of these points in more detail, review why they make the Business Process Workflow data warehouse a true challenge, and discuss solutions for each. There is a paradigm shift that must be grasped and some deep-seated data warehouse modeling principles must be relinquished.