Tuesday 12 April 2011

Why Data Vault will diminish the Kimball architectures

Hi, in 2010 I started to build more in-depth knowledge of Data Vault modeling. Most Data Vault folks know Dan Linstedt's TDAN articles, his blog, the LinkedIn Data Vault discussions and his technical book about Data Vault ("Super Charge Your Data Warehouse"). This year I passed the Data Vault certification of the Genesee Academy. Still, I have many areas of Data Vault modeling (and anchor modeling) left to uncover before I understand it thoroughly.

After years and years of building Kimball data warehouses, I'm convinced that materialized star models will become more and more obsolete due to hardware improvements like SSDs and Fusion techniques. In my opinion the only business case left for (materialized) star modeling will be the simplicity of the model (and perhaps a technical one: it's easier to build cubes on pure star models (with dummy records)).

Because of the Self Service BI movement, user-friendly tooling will enable users to build reports more easily and quickly with products like QlikView and Microsoft's new tool, codenamed "Crescent" (and of course other SSBI tooling). Building (easy) reports will become less and less important for the BI consultant. The easy and medium-complexity reports will be built by the end (power) users. There are three reasons for the BI consultant to come into play: reports need scalability (enterprise-wide), need standardization, and/or need a certain amount of skill (the three S's). As the Kimball star models are positioned at the user side, less and less usage of Kimball data warehouses is inevitable. The end-user BI tooling will be so powerful that sometimes a Kimball warehouse is not needed at all.

The main information delivery of an enterprise (and beyond) will focus on integrating data in a flexible way. (3NF) data models and star models were inflexible when changes were needed quickly. Changing the models sometimes caused errors, problems in the front-end application, problems in other databases that depended on the data model, etc. The Data Vault model achieves more flexibility and agility. It is more error-proof and will be the base for storing your enterprise data. Because of the principle "store facts, not truth (one version...)", ALL data is loaded into the Data Vault, resulting in an auditable architecture: whenever the source changes, KPIs can be recalculated over history (the Type II principle).
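To make the "store facts, keep history" idea a bit more concrete, here is a minimal sketch in Python with an in-memory SQLite database. The table and column names (`hub_customer`, `sat_customer`, `segment`, `load_date`) and the helper `segment_as_of` are hypothetical and only serve as illustration; a real Data Vault has more columns and loading rules than shown here. The point is that the satellite is insert-only, so any attribute can be read as it was known on a given date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hub: one row per business key (hypothetical layout)
CREATE TABLE hub_customer (
    customer_hk   INTEGER PRIMARY KEY,
    customer_bk   TEXT NOT NULL UNIQUE,
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL
);
-- Satellite: insert-only history of descriptive attributes
CREATE TABLE sat_customer (
    customer_hk   INTEGER NOT NULL REFERENCES hub_customer(customer_hk),
    load_date     TEXT NOT NULL,
    record_source TEXT NOT NULL,
    segment       TEXT,
    PRIMARY KEY (customer_hk, load_date)
);
""")

conn.execute("INSERT INTO hub_customer VALUES (1, 'C-001', '2011-01-01', 'crm')")
# A source change adds a new satellite row; the old row is never updated.
conn.executemany(
    "INSERT INTO sat_customer VALUES (?, ?, ?, ?)",
    [
        (1, "2011-01-01", "crm", "retail"),
        (1, "2011-03-01", "crm", "wholesale"),
    ],
)


def segment_as_of(conn, customer_bk, as_of):
    """Return the segment as it was known on the ISO date `as_of`, or None."""
    row = conn.execute(
        """
        SELECT s.segment
        FROM hub_customer h
        JOIN sat_customer s ON s.customer_hk = h.customer_hk
        WHERE h.customer_bk = ? AND s.load_date <= ?
        ORDER BY s.load_date DESC
        LIMIT 1
        """,
        (customer_bk, as_of),
    ).fetchone()
    return row[0] if row else None
```

Calling `segment_as_of(conn, "C-001", "2011-02-15")` reads the older satellite row, while a later date picks up the changed value. Dates are stored as ISO strings here so that plain string comparison orders them correctly.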

The next step is building the star models, not physically but virtually. The starting point should be to build them virtually and to materialize them only when performance is bad. This way you stay very flexible in building your star models. Perhaps a new manager wants another KPI or another calculation; it's possible to recalculate it for the past (because you have kept all the information and used a Type II approach).
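A virtual star model can be sketched as plain database views on top of the Data Vault tables. The example below uses Python with an in-memory SQLite database; all table, view and function names (`hub_product`, `sat_product`, `link_sale`, `sat_sale`, `dim_product`, `fact_sale`, `revenue_by_category`) are made up for illustration, and the views simply take the latest satellite row per key. It is one possible approach under those assumptions, not a prescribed Data Vault pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical Data Vault tables, reduced to the bare minimum
CREATE TABLE hub_product (product_hk INTEGER PRIMARY KEY, product_bk TEXT NOT NULL UNIQUE);
CREATE TABLE sat_product (
    product_hk INTEGER NOT NULL, load_date TEXT NOT NULL, category TEXT,
    PRIMARY KEY (product_hk, load_date));
CREATE TABLE link_sale (sale_hk INTEGER PRIMARY KEY, product_hk INTEGER NOT NULL);
CREATE TABLE sat_sale (
    sale_hk INTEGER NOT NULL, load_date TEXT NOT NULL, amount REAL,
    PRIMARY KEY (sale_hk, load_date));

-- Virtual dimension: hub joined to its most recent satellite row
CREATE VIEW dim_product AS
SELECT h.product_hk AS product_key, h.product_bk, s.category
FROM hub_product h
JOIN sat_product s ON s.product_hk = h.product_hk
WHERE s.load_date = (SELECT MAX(s2.load_date) FROM sat_product s2
                     WHERE s2.product_hk = h.product_hk);

-- Virtual fact: link joined to its most recent satellite row
CREATE VIEW fact_sale AS
SELECT l.sale_hk AS sale_key, l.product_hk AS product_key, s.amount
FROM link_sale l
JOIN sat_sale s ON s.sale_hk = l.sale_hk
WHERE s.load_date = (SELECT MAX(s2.load_date) FROM sat_sale s2
                     WHERE s2.sale_hk = l.sale_hk);
""")

conn.executemany("INSERT INTO hub_product VALUES (?, ?)", [(1, "P1"), (2, "P2")])
conn.executemany("INSERT INTO sat_product VALUES (?, ?, ?)", [
    (1, "2011-01-01", "bikes"),
    (1, "2011-03-01", "e-bikes"),   # recategorized later; old row is kept
    (2, "2011-01-01", "helmets"),
])
conn.executemany("INSERT INTO link_sale VALUES (?, ?)", [(1, 1), (2, 1), (3, 2)])
conn.executemany("INSERT INTO sat_sale VALUES (?, ?, ?)", [
    (1, "2011-02-01", 100.0),
    (2, "2011-02-02", 50.0),
    (3, "2011-02-03", 20.0),
])


def revenue_by_category(conn):
    """Query the virtual star exactly like a materialized one."""
    return conn.execute("""
        SELECT d.category, SUM(f.amount)
        FROM fact_sale f
        JOIN dim_product d ON d.product_key = f.product_key
        GROUP BY d.category
        ORDER BY d.category
    """).fetchall()
```

Because `dim_product` and `fact_sale` are only views, a changed KPI definition means redefining a view rather than reloading a table. If such a view turns out to be too slow, it can still be materialized into a physical table with the same columns.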

What do you think? Let me know.


3 comments:

  1. Interesting read, Hennie! I don't really have any input to give on your question (since you are the one with the DV certification ;) ), but I hope you will write some more interesting blog posts about this subject!
    Keep it up!

  2. Hi Hennie,

    Congrats on your Data Vault certification. I did not succeed.

    Luckily I will get a second shot, and I might have built a DV by then, hoping I understood it a bit and did not build a completely wrong DV model :-)


  3. Greetings Hennie,

    I would agree that DBMSs such as Netezza improve the feasibility of virtual star schemas. I would be interested in your thoughts around denormalization of code descriptions. I can't see materializing the joins to get dozens of descriptions for each dimension at run time.

    Best Regards,
    Chris Busch