Friday, 21 January 2011

Master data and reference data

I am currently reading David Loshin's book "Master Data Management" and came across his discussion of reference data and master data. One might question the difference between the two. Loshin defines reference data "as collections of values that are used to populate the existing application data stores as well the master data model." I have some trouble with this definition because it does not say what reference data is; it only describes how it is used (in master data and transactional data).

Today I read Stefan Frost's article on ITToolbox. He also discusses the differences between reference data, master data and transactional data, and I saw a resemblance between the two writers. Stefan defines referential data as: "Reference structures with descriptions and names for types and codes that describe something in transactional or master data." This definition, too, describes how referential data is used and not what it is.

Linstedt's DV (Data Vault) specifications and blog posts say the following: "Reference data is also known as: cross-reference (XREF), or lookup tables, they may or may not contain HISTORY – and if they contain history, they are to be modeled in their own Hub/Link/Sat structures." Hmmm.

Another distinction is made on Wikipedia, which defines reference data as "data describing a physical or virtual object and its properties. Reference data are usually described with nouns" and master reference data as: "these are reference data shared over a number of systems. Some master reference data are universal like country".

Yet another article on MDM, by Malcolm Chisholm, also gives a viewpoint on the difference between reference and master data. I quote an interesting sentence: "Reference data is any kind of data that is used solely to categorize other data found in a database, or solely for relating data in a database to information beyond the boundaries of the enterprise".

It seems to me that a clear definition is quite difficult to give. Below I describe the most interesting characteristics of reference data (most of them from Malcolm Chisholm's article):
  • Reference data has fixed key values, whereas master data is identified by different keys in different systems, lists, etc.
  • Reference data is stored at a higher level than master data.
  • Reference data usually has fewer records than master data.
  • Reference data carries more meaning (metadata) than master data. For instance, the code NL and the name "the Netherlands" mean something in themselves, whereas a row of master data has no meaning beyond itself: Customer A is just Customer A, and Product X is just Product X.
In my current job (at an academic hospital) we have so-called "AGB codes": national codes for care providers. DBC codes, used for diagnosis-treatment coding, are another example of reference data.

Other examples of reference data are:
  • A country list (ISO country codes).
  • National postal code tables.
  • Internal product categories.
  • Classification systems.
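
To make the distinction a bit more concrete, here is a minimal Python sketch of the characteristics listed above. It is only an illustration under my own assumptions: the `Customer` class, the `system_keys` field, the system names and all identifiers such as "C-001" are hypothetical and do not come from any of the quoted authors. Only the ISO country codes are real.

```python
from dataclasses import dataclass, field

# Reference data: a small, slowly changing code list with fixed, shared keys.
# These are ISO 3166-1 alpha-2 codes; the selection of countries is illustrative.
COUNTRIES = {
    "NL": "Netherlands",
    "BE": "Belgium",
    "DE": "Germany",
}


@dataclass
class Customer:
    """Master data: one real-world customer, known under different keys per system."""
    name: str
    country_code: str  # points into the reference data
    # Hypothetical source-system identifiers (assumption, for illustration only).
    system_keys: dict = field(default_factory=dict)


def describe(customer):
    """Use the reference code to give the master record its context."""
    return f"{customer.name} ({COUNTRIES[customer.country_code]})"


def count_by_country(customers):
    """Categorize master records by their reference code."""
    counts = {}
    for customer in customers:
        counts[customer.country_code] = counts.get(customer.country_code, 0) + 1
    return counts


customers = [
    Customer("Customer A", "NL", {"crm": "C-001", "billing": "98431"}),
    Customer("Customer B", "NL", {"crm": "C-002"}),
    Customer("Customer C", "DE", {"crm": "C-003", "billing": "10077"}),
]

print(describe(customers[0]))
print(count_by_country(customers))
```

The code "NL" is the same everywhere and carries meaning on its own, while "Customer A" has a different key in each system and only becomes meaningful through attributes such as the country code.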


1 comment:

  1. "Reference data is a close cousin of master data. While master data is challenged with problems of unique identification, may be more rapidly changing, requires consensus building across stakeholders and lends structure to business transactions, reference data is simpler, more slowly changing, but has semantic content that is used to categorize or group other information assets – including master data – and gives them contextual value.
    Reference data types may include types and codes, business taxonomies, complex relationships & cross-domain mappings or standards.
    Reference data carries contextual value and meaning and therefore its use can drive business logic that helps execute a business process, create a desired application behavior or provide meaningful segmentation to analyze transaction data. Further, mapping reference data often requires human judgment" (https://blogs.oracle.com/mdm/entry/reference_data_management_and_master)

    "For organizations with large commercial exposures, well understood and shared mastered data is key, and in highly evolved financial markets, "common" reference data is so critical that the emergence of mastered data shared services is starting to become a feature of everyday life." (http://www.accenture.com/us-en/Pages/insight-technology-master-data-management-summary.aspx)

    and something about metadata in general: https://blogs.oracle.com/IanT/entry/is_metadata_important