When is Different Not Different?
An MDM solution must be able to tell the difference between how data is represented and what it represents. This is just another of the complications that MDM architects must deal with when designing a solution.
By Martin Dunn
Consider the following two addresses. There are many differences between the two records, but none represents a true conflict in the business information. A key to successful MDM is being able to distinguish genuine conflicts from omissions and from differences in standards.

Case
Many legacy systems literally shout I’M OLD MAINFRAME DATA by storing text in uppercase only, while modern systems prefer the more pleasing Title Case. Case differences don’t represent a difference in business data, so we treat fields like Street, City and State as identical values.
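A minimal sketch of a case-tolerant comparison is shown below. The function name and the sample street values are hypothetical, and collapsing repeated whitespace is an extra assumption that goes slightly beyond case handling.

```python
def same_ignoring_case(a: str, b: str) -> bool:
    """Return True when two field values differ only by case or spacing."""
    def normalize(value: str) -> str:
        # Collapse runs of whitespace (an assumption), then fold case.
        return " ".join(value.split()).casefold()

    return normalize(a) == normalize(b)


# Hypothetical sample values, not taken from the records discussed above.
print(same_ignoring_case("123 MAIN ST", "123 Main St"))  # case only
print(same_ignoring_case("123 MAIN ST", "125 Main St"))  # different number
```

Using `casefold()` rather than `lower()` is a small design choice: it also handles the non-ASCII characters whose lowercase and uppercase forms do not round-trip cleanly.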
Name
The Name field is clearly different between the two records but means the same thing. The mainframe system holds a contracted version of the university name so that it fits into 30 bytes. This kind of contraction is typical of mainframe data, where each byte was valuable. To compare the source record to the Hub, we first need to transform the mainframe data by expanding the contractions. The expanded Name1 data can then be considered equivalent to the hub data, but not identical.
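The expansion step might be sketched as follows. The contraction table, the function names and the sample names are all hypothetical; in practice the table would come from profiling the legacy data rather than from a hard-coded dictionary.

```python
import re

# Hypothetical contraction table (an assumption, not from any real system).
EXPANSIONS = {
    "UNIV": "UNIVERSITY",
    "INST": "INSTITUTE",
    "TECH": "TECHNOLOGY",
}


def expand_contractions(name: str) -> str:
    """Uppercase the name, drop punctuation, and expand known contractions."""
    tokens = re.findall(r"[A-Za-z0-9]+", name.upper())
    return " ".join(EXPANSIONS.get(token, token) for token in tokens)


def names_equivalent(source_name: str, hub_name: str) -> bool:
    """Equivalent means equal after transformation, not byte-for-byte identical."""
    return expand_contractions(source_name) == expand_contractions(hub_name)


# Hypothetical sample values.
print(names_equivalent("EXAMPLE INST OF TECH.", "Example Institute of Technology"))
```

Both sides are passed through the same transformation, so the comparison stays symmetric and the original source value is left untouched.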
Street, City and State
Street and City are identical between the two systems (ignoring case). The equivalent Name, together with the identical Street and City data, is enough for us to consider these two records a good match pair.
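One way to express such a rule is sketched below. The field keys (`name`, `street`, `city`), the function name and the sample records are assumptions made for illustration; real match rules are usually configured in the MDM platform rather than hand-coded.

```python
from typing import Callable, Mapping


def is_match_pair(
    source: Mapping[str, str],
    hub: Mapping[str, str],
    name_equivalent: Callable[[str, str], bool],
) -> bool:
    """Match when Name is equivalent and Street and City agree ignoring case."""
    def same(field: str) -> bool:
        return source[field].casefold() == hub[field].casefold()

    return (
        name_equivalent(source["name"], hub["name"])
        and same("street")
        and same("city")
    )


# Hypothetical records and a deliberately naive name comparison.
legacy = {"name": "EXAMPLE UNIV", "street": "1 CAMPUS DR", "city": "SPRINGFIELD"}
crm = {"name": "Example University", "street": "1 Campus Dr", "city": "Springfield"}
naive = lambda a, b: a.upper().replace("UNIVERSITY", "UNIV") == b.upper().replace("UNIVERSITY", "UNIV")
print(is_match_pair(legacy, crm, naive))
```

Passing the name comparison in as a parameter keeps the rule independent of how contractions are expanded.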
Zip
It is not uncommon for older systems to limit Zip codes to 5 digits. The “plus-four” digits were introduced only in 1983 and have never been considered mandatory for address validation. In the MDM process, an address should be checked against the USPS database to validate its components, including the Zip+4 code. While MDM survivorship should prefer the CRM Zip+4 value for the golden record, this does not mean that the Legacy system is wrong.
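The idea that a 5-digit Zip and a Zip+4 can agree can be sketched as below. The function names and sample values are hypothetical, and the sketch only compares formats; it does not perform the USPS validation described above.

```python
import re
from typing import Optional, Tuple

# Five digits, optionally followed by four more (with or without a hyphen).
ZIP_PATTERN = re.compile(r"(\d{5})(?:-?(\d{4}))?")


def parse_zip(value: str) -> Tuple[str, Optional[str]]:
    """Split a Zip code into its 5-digit base and optional plus-four part."""
    match = ZIP_PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"not a Zip code: {value!r}")
    return match.group(1), match.group(2)


def zips_compatible(a: str, b: str) -> bool:
    """A missing plus-four is an omission, not a conflict."""
    base_a, plus4_a = parse_zip(a)
    base_b, plus4_b = parse_zip(b)
    if base_a != base_b:
        return False
    return plus4_a is None or plus4_b is None or plus4_a == plus4_b


def more_complete_zip(a: str, b: str) -> Optional[str]:
    """Return the value carrying plus-four digits, or None on a true conflict."""
    if not zips_compatible(a, b):
        return None
    return a if parse_zip(a)[1] is not None else b


# Hypothetical sample values.
print(more_complete_zip("12345", "12345-6789"))
```

Returning `None` for a genuine conflict leaves the decision to a survivorship rule or a data steward instead of silently picking a side.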
Summary
Data does not need to be identical to be the same. Ensure that your MDM strategy is tolerant of differences in representing business data to avoid sending needless changes to connected systems.
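As a rough illustration of this tolerance, a field-level comparison can label each difference as a conflict, an omission or a standards difference. The enum, the function name and the default normalizer are assumptions; which categories should trigger an update to a connected system is a policy decision left out of the sketch.

```python
from enum import Enum
from typing import Callable, Optional


class Difference(Enum):
    NONE = "none"            # values are identical
    STANDARDS = "standards"  # same meaning, different representation
    OMISSION = "omission"    # one side has no value
    CONFLICT = "conflict"    # the business information disagrees


def classify(
    source: Optional[str],
    hub: Optional[str],
    normalize: Callable[[str], str] = str.casefold,
) -> Difference:
    """Classify the difference between a source value and the hub value."""
    if not source and not hub:
        return Difference.NONE
    if not source or not hub:
        return Difference.OMISSION
    if source == hub:
        return Difference.NONE
    if normalize(source) == normalize(hub):
        return Difference.STANDARDS
    return Difference.CONFLICT


# Hypothetical sample values.
print(classify("SPRINGFIELD", "Springfield"))
```

The `normalize` parameter is where field-specific rules, such as contraction expansion for names, could be plugged in.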
Martin Dunn
Author
Martin Dunn was the co-founder of Delos Technology which developed the MDM technology marketed under the Siperian brand. The Delos MDM technology introduced many MDM concepts that are now widespread within the MDM discipline including a data steward console to adjudicate match results, opt-in synchronization, cell level delta detection and the concept of measuring trust.
Martin is now a partner with Gaine Solutions and continues to advance the techniques by which enterprise Master Data is managed.