In part 2 of The Business Process Workflow Data Warehouse I will review some of the data quality problems encountered when developing this type of warehouse. Data quality is often a sizeable problem the data architect needs to solve before deploying a data warehouse. Poor data quality in any data warehouse can be a showstopper that causes major delays and even outright failure. Be prepared for these problems to be worse in the Business Process Workflow-oriented Data Warehouse. The points numbered 4-6 below are the second half of the six points I made earlier, and they address the quality of your data in different ways:
4. Workflow data can be removed or hidden due to incorrect data entry, legal issues, or duplicate data entry.
5. There are no standard practices for data as there are with financial, accounting and sales applications.
6. Data is often entered and verified much less carefully.
The central theme of these three statements is that workflows are vague and imprecise when compared with their transactional counterparts. Businesses are dependent on their transactional data. An order can’t be processed unless the transaction has been entered into a software application. However, a child can be visited by a social worker without any data being entered into the computer system. If the social worker visits the child and enters the wrong date for the visit, the work itself was still done; the only consequence is that the measure is reported incorrectly. Two rules of thumb to keep in mind when working with workflows:
o Data entry for workflows can be wrong while the work performed was correct.
o Data entry often occurs after the work has been performed.
Supporting this ‘flexible’ type of data entry requires a change in the data architect’s way of thinking. Your design must allow for data to be entered after the fact. Your design must allow for data to be corrected. Your design must allow for partial data to be entered for an activity, and then updated later with more data for the same activity.
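As a rough illustration of these three requirements, here is a minimal Python sketch. It is one possible approach and not a prescribed design. The names ActivityVersion, ActivityStore, record, performed_on and recorded_at are hypothetical, chosen only for this example. It assumes that each entry or correction is stored as a new version of the activity instead of overwriting the old one, and that the date the work was performed is kept separate from the date the data was entered.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import List, Optional


@dataclass
class ActivityVersion:
    """One recorded state of a workflow activity (hypothetical schema)."""
    activity_id: str
    performed_on: Optional[date]  # when the work happened; may be unknown or wrong at first
    outcome: Optional[str]        # may be filled in by a later entry
    recorded_at: datetime         # when the data was entered, often after the work
    is_current: bool = True


class ActivityStore:
    """Append-only store: every entry or correction adds a new version."""

    def __init__(self) -> None:
        self.versions: List[ActivityVersion] = []

    def current(self, activity_id: str) -> Optional[ActivityVersion]:
        """Return the latest version of an activity, or None if never recorded."""
        for version in self.versions:
            if version.activity_id == activity_id and version.is_current:
                return version
        return None

    def record(self, activity_id: str, recorded_at: datetime,
               performed_on: Optional[date] = None,
               outcome: Optional[str] = None) -> ActivityVersion:
        """Add a new version; fields left as None carry over from the previous one."""
        previous = self.current(activity_id)
        if previous is not None:
            previous.is_current = False  # keep the old version for audit, but retire it
            if performed_on is None:
                performed_on = previous.performed_on
            if outcome is None:
                outcome = previous.outcome
        version = ActivityVersion(activity_id, performed_on, outcome, recorded_at)
        self.versions.append(version)
        return version


store = ActivityStore()
# Entered after the fact, with partial data and the wrong visit date.
store.record("visit-1", datetime(2024, 3, 7, 9, 0), performed_on=date(2024, 3, 6))
# More data for the same activity arrives later.
store.record("visit-1", datetime(2024, 3, 8, 14, 0), outcome="completed")
# The visit date is corrected; the earlier versions are still kept.
store.record("visit-1", datetime(2024, 3, 12, 10, 0), performed_on=date(2024, 3, 5))
```

Reports would read only the current version of each activity, while the retired versions remain available to show what was known at the time of an earlier report.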
This approach can be hard to accept if you’ve developed a transactional data warehouse before. You will break rules that you previously thought were untouchable, but accepting these ‘broken’ rules will put more accurate measurable data into your data warehouse.