Data Standards: The Devil Is in the Details

Mergers, acquisitions and joint ventures are expected to continue in 2020 as a way for provider organizations to deal with industry challenges, according to Bloomberg Law. With them comes the need to consolidate databases, and the very real danger of winding up with dirty data when standardization is lacking.

Whether it’s abbreviations run amok or the inability to capture full dates of birth (DOBs) or Social Security numbers (SSNs), a lack of data standards and naming conventions can have broad and long-term impacts on the quality of the post-conversion data and on the facilities with which it is shared. Consider what we found in just five reconciliation projects: among confirmed duplicate pairs, the data captured in the middle name field differed in 45% of pairs and the data in the SSN field differed in 64%.
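
As a rough illustration of how a discrepancy rate like this could be measured, the sketch below counts, field by field, how often the two records in a confirmed duplicate pair disagree. The function name, field names and sample records are hypothetical and are not taken from the reconciliation projects described above.

```python
from typing import Dict, Iterable, List, Tuple

Record = Dict[str, str]


def field_discrepancy_rates(
    pairs: Iterable[Tuple[Record, Record]], fields: List[str]
) -> Dict[str, float]:
    """Return, per field, the share of duplicate pairs whose values differ.

    Values are compared after trimming whitespace and ignoring case only,
    so differences in format (for example a full SSN versus the last four
    digits) still count as discrepancies.
    """
    pair_list = list(pairs)
    rates: Dict[str, float] = {}
    for field in fields:
        differing = sum(
            1
            for a, b in pair_list
            if (a.get(field) or "").strip().upper()
            != (b.get(field) or "").strip().upper()
        )
        rates[field] = differing / len(pair_list) if pair_list else 0.0
    return rates


# Hypothetical confirmed duplicate pairs (made-up values, not real patients).
sample_pairs = [
    (
        {"middle_name": "Ann", "ssn": "000-00-1234"},
        {"middle_name": "A", "ssn": "1234"},
    ),
    (
        {"middle_name": "Lee", "ssn": "000-00-5678"},
        {"middle_name": "LEE", "ssn": "000-00-5678"},
    ),
]

print(field_discrepancy_rates(sample_pairs, ["middle_name", "ssn"]))
# {'middle_name': 0.5, 'ssn': 0.5}
```

In this made-up sample, one of the two pairs disagrees in each field, so both rates come out at 50%.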

The reality is that while we are seeing improvements, standardization remains an issue both within and across healthcare organizations. And when a merger or acquisition is on the horizon, the problems created by loose or non-existent data standards multiply exponentially. While some problems are created by system limitations, more often than not the root of the problem can be traced back to policies and procedures—or lack thereof. 

Problem Areas

When it comes to post-merger data conversion and consolidation, a few of the biggest problem areas are:

  • Capitalization:  Are upper- and lower-case letters used or all caps? 
  • Naming conventions:  Are middle names captured or just an initial? Are hyphenations and other punctuation used?
  • SSNs: Are all nine digits captured or just the last four?
  • DOBs: Is the year captured? If so, is it four digits or two? 
  • Addresses: Are abbreviations used or are streets spelled out in full? Are apartment numbers captured? Is USPS address matching used to verify the information? 
  • Newborns: When are newborns required to have a name, if at all?
  • Alias names: Are incorrect names retained as an alias once corrected? Are nicknames included in the alias name field?
  • Sexual orientation and gender identity: Is this information being captured?
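
To make a few of these questions concrete, here is a minimal sketch of how name, SSN and DOB values might be brought to one convention before records are compared. The target conventions (all-caps names, nine-digit SSNs, ISO dates) and the function names are assumptions chosen for illustration; each organization would substitute its own agreed standards.

```python
import re
from datetime import datetime
from typing import Optional


def normalize_name(raw: str) -> str:
    """Uppercase a name, keep letters, hyphens and spaces, collapse whitespace."""
    kept = "".join(ch for ch in raw if ch.isalpha() or ch in "- ")
    return " ".join(kept.upper().split())


def normalize_ssn(raw: str) -> Optional[str]:
    """Return the nine digits of an SSN, or None if the value is partial."""
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) == 9 else None


def normalize_dob(raw: str) -> Optional[str]:
    """Return an ISO date (YYYY-MM-DD), or None if the input does not match.

    Only two input layouts are assumed here, both with a four-digit year;
    a two-digit year is rejected rather than guessed at.
    """
    for fmt in ("%m/%d/%Y", "%Y-%m-%d"):
        try:
            return datetime.strptime(raw.strip(), fmt).date().isoformat()
        except ValueError:
            continue
    return None


print(normalize_name("  o'brien-smith "))  # OBRIEN-SMITH
print(normalize_ssn("000-00-1234"))        # 000001234
print(normalize_ssn("1234"))               # None
print(normalize_dob("03/07/1985"))         # 1985-03-07
print(normalize_dob("03/07/85"))           # None
```

Returning None for partial SSNs and two-digit years, rather than padding or guessing, keeps incomplete values visible so they can be reviewed instead of silently matched.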


Few facilities are going to have identical data standard policies, and some policies are dictated by system limitations. However, if the differences aren’t identified and a plan of action put in place to eliminate them post-merger, the end result will be long-term contamination of the enterprise master patient index (EMPI). In fact, even when facilities aren’t merging but are in a joint venture in which master patient indexes (MPIs) are shared, lack of standardized data and naming conventions will have the same result.

Mitigating the Risks

The first step before any data is merged is to conduct an MPI cleanup to deal with any duplicates, overlays and shell records that might be lurking in the system. At the same time, an audit of standards should be conducted to determine what policies are in place and how closely they align with those of the other facilities involved. It’s important to involve IT in this process, especially if it’s possible to automate some standardization.
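
One way IT might support the standards audit is to profile each facility's extract and report how often records follow a given convention, so that differences between facilities show up as numbers rather than anecdotes. The sketch below is an assumption-laden example: the field names (last_name, middle_name, ssn) and the three checks are hypothetical, and a real audit would cover whatever fields the policies address.

```python
import re
from typing import Dict, List

Record = Dict[str, str]


def profile_conventions(records: List[Record]) -> Dict[str, float]:
    """Return the share of records that follow each checked convention."""
    total = len(records)
    if total == 0:
        return {}
    counts = {
        # Last name stored entirely in capital letters.
        "last_name_all_caps": sum(
            1 for r in records if r.get("last_name", "").isupper()
        ),
        # Middle name longer than a single initial (ignoring periods).
        "middle_name_full": sum(
            1 for r in records if len(r.get("middle_name", "").strip(". ")) > 1
        ),
        # SSN containing all nine digits, whatever the punctuation.
        "ssn_nine_digits": sum(
            1 for r in records if len(re.sub(r"\D", "", r.get("ssn", ""))) == 9
        ),
    }
    return {name: count / total for name, count in counts.items()}


# Hypothetical extract from one facility (made-up values).
facility_a = [
    {"last_name": "SMITH", "middle_name": "ANN", "ssn": "000-00-1234"},
    {"last_name": "Jones", "middle_name": "B.", "ssn": "5678"},
]

print(profile_conventions(facility_a))
# {'last_name_all_caps': 0.5, 'middle_name_full': 0.5, 'ssn_nine_digits': 0.5}
```

Running the same profile against each facility's data and comparing the results side by side gives a quick view of where conventions diverge most.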

Once the immediate problem is solved and the MPIs can be joined cleanly, policies and procedures must be in place across all facilities to keep the MPI clean going forward. Employees need to be trained on the standards and take annual refresher courses. Conduct ongoing quality assurance to identify problem areas and intervene with education and training when necessary.

Again, ensure everyone impacted by the policies is represented in their development, including HIM, scheduling/registration, patient billing and IT. 

Taking the steps necessary to establish data standardization protocols before MPIs are merged will go a long way toward improving data integrity and minimizing MPI errors going forward.