Firstly, I must make these qualifications before you read further:
- I consider Bill Inmon and Ralph Kimball the pioneers of data warehousing;
- Data warehousing has been a relatively recent concept, since 1983 when Teradata introduced a database decision support system, and most current day concepts we understand about data warehousing have been taking shape over the past 10 years;
- I am new to this as well, but am very satisfied with the results I am obtaining.
- Subject-oriented - data is organized so that all data elements relating to the same real-world event or object are linked together.
- Time-variant -changes to the data in the database are tracked and recorded so that reports can be produced showing changes over time.
- Non-volatile - data in the database is never over-written or deleted - once committed, the data is static, read-only, and retained for future reporting.
- Integrated - database contains data from most or all of an organization's operational systems and this data is made consistent.
Instead, you can establish what your basis for your data is through fact tables and dimension tables.
A fact table is the "primary table where the numerical performance measurements of the business are stored" (The Data Warehouse Toolkit, second edition, p.16). The most useful facts of this table are numeric and additive.
Daily Sales Fact TableA dimension table, the companion to the fact table, is the user interface to the data warehouse.
Date Key (FK)
Product Key (FK)
Store Key (FK)
Quantity Sold
Dollar Sales Amount
Production Dimension Table
Product Key (PK)
Product Description
SKU Number (not to be used as a key in the warehouse)
Brand Description
Category Description
Department Description
...more
Bringing together facts and dimensions becomes easier across all levels of business, as users agree that the dimensional model is their business. In our examples above, store operators use the store dimension table to relate to the number of products sold daily; marketing and product development refer to the attributes of the product dimension table to make conclusions about which products to promote and develop further.
Next I will relate common pitfalls known in the early stages of data warehousing and how to overcome them.
Until Next Time,
David
No comments:
Post a Comment