How to deliver a complex healthcare data migration: Interview with Julie Waters

Healthcare data migration projects have to be some of the most critical initiatives to get right. Ensuring the highest levels of patient data quality is obviously key, but delays can often impact the delivery of healthcare services, so a robust data migration delivery strategy is vital.

In this interview I wanted to find out what it takes to deliver large and complex healthcare data migrations by speaking to Julie Waters, Operations Manager at Avoca and a highly experienced practitioner with 30+ healthcare migrations behind her.

Julie has recently helped the Queen Elizabeth and Queen Mary’s Hospitals of Lewisham complete a major data migration to Cerner Millennium®.

In the interview Julie provides a wealth of practical advice for any data migration project leader or stakeholder about to embark on a large healthcare data migration initiative.

The summary of success factors at the end makes for particularly compelling reading.


Dylan Jones : Congratulations on the recent news regarding the data migration at Queen Elizabeth Hospital and Queen Mary’s Hospital in Lewisham, London. Can you share a brief outline of what has been achieved at these two large hospitals?

Julie Waters : Our main deliverable was a full complement of the typical acute load files for the Cerner Millennium® system:

  • Master Patient Index (MPI)
  • Admitted Patient Care (APC)
  • Outpatient History (OP)
  • Outpatient Waiting List (OPWL)
  • Inpatient Waiting List (IPWL)
  • Outpatient Future Appointments (OPFAP)
  • Choose & Book (C&B)
  • Referral to Treatment (RTT)

That was a total of 29 data files across bulk and delta, encompassing 1,147 columns and 2 million rows of data... a lot of data! To get load statistics of 99% or more across the board with this volume of data was simply fantastic!

Merging patients and their records from the two legacy systems was a large challenge. We took Queen Elizabeth’s master patient index of 1,444,504 patients, eliminated 18,326 duplicate registrations and removed 543,767 other records (test patients, prisoners, out-of-date records). We then matched the remaining patients against Queen Mary’s Sidcup (732,837 patients) to produce a fully merged MPI, with 326,725 existing QMS patients updated and 555,686 new patients added from QE.
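The MPI figures quoted here reconcile exactly, which a few lines of arithmetic can confirm. This sketch uses only the numbers given in the answer; the variable names are illustrative and not taken from any Avoca tool.

```python
# Figures quoted in the interview for the Queen Elizabeth (QE) master patient index.
qe_source_patients = 1_444_504
qe_duplicates_eliminated = 18_326
qe_other_records_removed = 543_767  # test patients, prisoners, out-of-date records

qe_remaining = (
    qe_source_patients - qe_duplicates_eliminated - qe_other_records_removed
)

# Outcome of matching the remaining QE patients against Queen Mary's Sidcup (QMS).
matched_existing_qms = 326_725  # existing QMS patients updated
new_from_qe = 555_686           # QE patients with no QMS match

# Every remaining QE patient should be either matched or added as new.
assert qe_remaining == matched_existing_qms + new_from_qe
print(f"{qe_remaining:,}")  # 882,411
```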

In addition we improved NHS number population from 65.6% traced on QE and 93.5% on QMS in source to a combined 97.4% traced, so it’s clear that we’ve added real value to the quality of QMS/QE patient data and therefore reduced clinical risk significantly.

Extraction is always a problem. Although McKesson provided standard extracts, these are hierarchical and so not suitable for migration as they stand. We had overcome this on prior projects by developing a tool specifically for McKesson extracts. We used this at Lewisham to quickly transform the extracts into a set of relational text files for migration, thus reducing the overall transformation timeline.
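To illustrate the general idea of turning a hierarchical extract into relational text files, here is a minimal sketch. It is not the Avoca tool and does not reflect the real McKesson extract layout: the nested structure, the field names and the pipe delimiter are all hypothetical.

```python
import csv
import io

# Hypothetical hierarchical extract: each patient record nests its own episodes.
hierarchical = [
    {"hospital_number": "H001", "surname": "SMITH",
     "episodes": [{"episode_no": 1, "specialty": "100"},
                  {"episode_no": 2, "specialty": "300"}]},
    {"hospital_number": "H002", "surname": "JONES", "episodes": []},
]


def flatten(records):
    """Split nested records into a parent table and a child table.

    Each child row carries the parent's key so the relationship survives.
    """
    patients, episodes = [], []
    for rec in records:
        patients.append({"hospital_number": rec["hospital_number"],
                         "surname": rec["surname"]})
        for ep in rec["episodes"]:
            episodes.append({"hospital_number": rec["hospital_number"], **ep})
    return patients, episodes


def to_delimited(rows, fieldnames):
    """Render rows as pipe-delimited text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames,
                            delimiter="|", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


patients, episodes = flatten(hierarchical)
patient_file = to_delimited(patients, ["hospital_number", "surname"])
episode_file = to_delimited(episodes,
                            ["hospital_number", "episode_no", "specialty"])
```

Repeating the parent key on every child row is what lets the two flat files be joined again downstream.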

Another key aspect of a Cerner deployment is ensuring that the configuration of clinics keeps in sync with the operational data migration workstream and the changes that happen every day in busy hospitals. To assist in this we produced fully integral ‘DEFAULT SCHEDULES – TEMPLATES’ and ‘APPOINTMENT SLOTS’ in the ESM (Enterprise Scheduling Management) DCW data collection. Those who have worked on these spreadsheets will know how difficult this can be; a team of six full-time staff on this activity is not unusual.

Post migration, the client is using our healthcare archive to review how individual records were transformed and to view legacy archive data.

Dylan Jones : What is your role at Avoca and what was your role on the project?

Julie Waters : As Operations Manager for Avoca Systems, I’m responsible for managing our migration service and tools to ensure exceptional levels of accuracy with patient data and delivery to deadlines, and to manage project risk. Over 15 years and 30+ projects I have been involved in the development of our tools, methods and deliverables, and I am proud that Avoca have grown to be market leaders for healthcare data migration, especially for Cerner projects.

For Lewisham I ensured that our plans tied in with the overall project and joined in on the many conference calls and planning meetings that these projects generate. These can be hard work as the various vendors, systems integrators and client staff try to keep plans on track and deal with the many issues that arise.

Behind the scenes I was managing our team to meet operational milestones and ensuring that staff were trained in the complexities of health data, so that we could deliver exceptional levels of accuracy.

Dylan Jones : Was this considered a large migration by industry standards?

Julie Waters : Yes, absolutely. For the first 14 months there were three legacy systems and three hospitals involved: Queen Elizabeth, Queen Mary and Princess Royal. A triple migration involving 1,444,504 QE patients, 732,837 QMS patients and 752,402 PRU patients... that’s a lot of patient data!

For hospitals of this size it is unheard of, as far as I’m aware.

Dylan Jones : Patient data is of course very sensitive, how did you deal with this?

Julie Waters : Not really a problem. Our strategy is for all data and our tools to remain at the client’s data centre; we then connect our central team of analysts through secure networks to provide our services. We use this approach successfully on Lewisham and all our global projects.

Dylan Jones : Can you think of one element of the project in particular that was pivotal to success?

Julie Waters : I think the key thing is experience. The client team may only do one large migration in their career, but we have been involved in 15 acute hospital migrations to Cerner Millennium® and 4 migrations from McKesson systems, so we have an excellent understanding of both source and destination systems. We’ve really nailed the business rules to migrate to Cerner Millennium®, so any new projects we migrate get all the value of past projects.

Dylan Jones : I understand the data migration strategy had to change midway through. How did you deal with that?

Julie Waters : Yes, after the second attempt at dress rehearsal, operational staff at the hospital looked at the ‘referral to treatment’ (RTT) reports in more detail and discovered a number of issues that didn’t work for them; the reports did not fully reflect the collection of RTT open pathways.

For example, there were a number of data quality issues in the McKesson RTT Pathway Extract, such as poor population of Episode Numbers in the source file STAR_RTT_PTHWY_DATA_REF, and issues around the determination of a Stopped Pathway and the Current RTT status.

RTT migration was already a complex area, so the issues identified prompted a focussed review of RTT in which a team of four from Avoca went down to Lewisham to discuss the complexities face to face. This review produced a 17-page document on how differently the RTT load specification and other event files should be generated. Through collaboration with the hospital, we established that the source files should join in a different way and have different filters applied. Rather than drive the cohort of RTT open pathways from the source data, we totally overhauled how we generated the RTT IFF so that it was driven from the transformed event data sets and merely collected the appropriate additional RTT information from the McKesson RTT source files, where it existed. There was also a revised set of Cerner load specifications to reflect the required changes for RTT in the target system.

The outcome was that the files delivered in trial load 6 had 560,408 records compared to 12,979 in dress rehearsal and, for example, the Future Appointment IFF had 32,228 pseudo RTT Pathway IDs generated, versus 2,221 at go live. So overall, not only was the volume of records being migrated more accurate, but so was the quality of the data within the IFF. This presented the trust with a far truer reflection of their RTT open pathways than they had had previously, and hence a successful migration.

Dylan Jones : I read that there was also a major restructure during the migration, what was the impact of this and how did you keep the momentum going?

Julie Waters : Yes, there were a number of changes to the scope and strategy for the data migration during the project. Many of these were down to the organisational changes that were happening around the system implementation period, and many of them couldn’t have been foreseen at the outset, like South London Healthcare becoming the first NHS organisation to be put into administration in July 2012.

Although the Client made the decision to continue migration to Cerner Millennium® in Nov 2012, the first change was that Princess Royal University Hospital migration was aborted at this time, 14 months from data extraction.

Responsibility shifted from South London Healthcare to Lewisham and Greenwich. This involved a change of Director of IT so around this time there were also changes in their project team. In total, we have dealt with four different client Data Migration Leads over the course of the project, so one of the key benefits Avoca brought was continuity between these. Fortunately our migration engine is capable of handling things like inactivating data sets and changes to baseline, so we didn’t cause the project unnecessary delays in uncoupling Princess Royal from Queen Elizabeth.

Dylan Jones : How did you deliver the actual load strategy – was it a Big-Bang? Phased? Parallel load?

Julie Waters : The project used a bulk-delta strategy which we fully support. The idea of this is to migrate the bulk of your patients and activity about a week prior to go live and then the more recent history up to the point the legacy system is turned off, in a ‘delta’ load – the actual cutover. This means that during bulk, your legacy system remains on and users can continue using the operational system. The load into the new system can be done during a quiet period (if you’re loading into a live domain) such as a weekend, and there is more control over the bulk load.

This just leaves the delta, when your legacy system will be properly turned off (or made read only) and the final data extract taken to ensure you get all the remaining patients and associated activity. It’s only at this point that the client have to go into manual paper collection while the legacy system is down. Overall, the impact of the operational system transition on end users is kept to a minimum.

The tricky part of this type of strategy is ensuring that the right data is processed at either bulk or delta and that no data is missed. Record keys such as PATIENT IDENTIFIER, HOSPITAL NUMBER, APPOINTMENT ID and SPELL NUMBER also have to remain integral between bulk and delta, so that everything hangs together correctly once both loads are complete.
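The two risks described here, data being missed and keys not hanging together across the two loads, can be expressed as simple set checks. The sketch below is an illustration under stated assumptions rather than Avoca’s migration engine: the function names, record shapes and the `patient_id` / `appointment_id` fields are hypothetical.

```python
def orphaned_children(parents_bulk, parents_delta, children, key):
    """Return child rows whose parent key appears in neither load."""
    known = ({row[key] for row in parents_bulk}
             | {row[key] for row in parents_delta})
    return [row for row in children if row[key] not in known]


def missed_keys(source_keys, bulk_keys, delta_keys):
    """Return source keys that were processed in neither bulk nor delta."""
    return set(source_keys) - set(bulk_keys) - set(delta_keys)


# Hypothetical example: patients loaded at bulk and delta,
# and appointments arriving in the delta extract.
bulk_patients = [{"patient_id": "P1"}, {"patient_id": "P2"}]
delta_patients = [{"patient_id": "P3"}]
delta_appointments = [
    {"appointment_id": "A10", "patient_id": "P2"},  # parent loaded at bulk
    {"appointment_id": "A11", "patient_id": "P3"},  # parent loaded at delta
    {"appointment_id": "A12", "patient_id": "P9"},  # parent in neither load
]

orphans = orphaned_children(bulk_patients, delta_patients,
                            delta_appointments, key="patient_id")
missed = missed_keys(["P1", "P2", "P3", "P4"], ["P1", "P2"], ["P3"])
```

In this example `orphans` holds the single appointment A12 and `missed` holds P4, which are the two kinds of problem that would need resolving before cutover.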

Dylan Jones : One thing I notice on a lot of migrations is that the focus is often on the transactional records so did this project have a big requirement for reference data collection too?

Julie Waters : Yes, absolutely, we ensured the strategy tackled the huge amounts of reference data right from the outset in two ways.

The first was managing the enormous number of code level mappings (over half a million code mappings across nearly a hundred mapping tables) that have to happen between codes in the old system and codes in the new system.

After the initial code mapping, it’s really important that the client can see and edit the mappings as day-to-day changes in the hospital occur. But this change has to be controlled, so the client used our Reference Data Manager product to ensure that version control and a full audit trail exist on these mappings.

The second way we helped was by helping them to populate their Cerner DCWs (Data Collection Workbooks), which are formatted Excel spreadsheets; this is where the client has to collect the reference data that will sit behind their new PAS/EHR. We helped them collate and present their clinic template information in a suitable way for Cerner to load and configure in Millennium. The hospital had been struggling for over a year to populate their relevant Cerner DCWs.

I spent a few days on-site with the client’s team, where I had to quickly establish credibility in a team involving multiple directorates, layers of seniority and also with other suppliers.

The end result was that Avoca produced an integral ESM DCW for the client through a cleanse and transform data migration of their legacy clinic set up information.

Dylan Jones : Obviously healthcare data has to be correct so how was data quality coordinated throughout the project?

Julie Waters : Excellent data quality is of course what it is all about, but it’s not a binary measure, so throughout our processes we’re testing the data quality and concerning ourselves with fixing it in the best way, because there are often choices and compromises to be made.

To do this we try to consider at all times what’s best for the patient and the true integrity of the patient record and its relationships to others. So, for example, we would think deeply about the consequences of a particular data-cleansing business rule for the patient and the client, together with the usability of the data in the new system, as opposed to a quick blanket “oh we’ll just default these records”, which some suppliers might do.

Sometimes we find things in the data that don’t specifically trip a load validation i.e. they wouldn’t cause the record to fail to load into the new system, but we are concerned, for example when a business rule change has had unexpected consequences on volumes of data produced. Here, we would give the client Data Migration Lead a call. Sometimes the Data Migration Lead will have an explanation for it, say clinic closures in the run up to go live. Other times, the concern that we’ve raised will prompt further investigation and testing by the client.

Avoca’s Data Healthcheck process is the first stage of data quality coordination, and it is where we helped the client understand how their source data related to the target system’s data specification, rules and validations. This really helps with planning because it provides, upfront, that elusive insight into the types of data quality problems and how much clinical risk the client face. With this information upfront, planning could be far more realistic and decisions could be taken about how the data quality problems were going to be overcome (fix through the cleanse and transform, fix at source on the legacy PAS, or not at all). At the end of the Data Healthcheck process the client had a clear action plan of the data quality issues to address on the legacy system and in what order of priority. This is all before trial load 1, so this is where the client gets confidence on the timeline... they’re not having to wait until the end of trial load 1 to see which records fail to load and only then start taking action on them.

Each time we receive a subsequent extract we check to see the progress of the cleansing that the client has agreed to do on the legacy data, so the hospital always has up to date information about their progress against that aspect of quality. This allows them to take further decisions about resourcing levels and likely readiness for the extract and trial load.

The second key area where we add real value (aside from the continued data quality checks that Avoca do prior to delivery, which I talk about below) is in how we report on the quality of code mapping tables (where one value is transformed to another) as part of the trial load process. This report determines the suitability of mapping tables and provides the client with information about the problems identified, so that these may be rectified or taken into account prior to trial load. This is useful to the client because they can decide whether the trial load will be of value if they run with the code mappings they have, so a trial load isn’t wasted. The report may flag, for example, problems with the build, so the client could choose to rectify their build or proceed to trial load.
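As a rough illustration of what a mapping-table quality report of this kind might measure, the sketch below flags source codes with no mapping and mappings that point at a target code missing from the build. It is a sketch only: the function, its output fields and the sample codes are hypothetical and are not Avoca’s report.

```python
from collections import Counter


def mapping_report(source_values, mapping, target_build):
    """Summarise how well a code mapping table covers the source data.

    source_values: legacy codes as they occur, one per source row
    mapping:       legacy code -> target code
    target_build:  target codes that exist in the new system's build
    """
    counts = Counter(source_values)
    unmapped = {code: n for code, n in counts.items() if code not in mapping}
    not_in_build = {code: mapping[code] for code in counts
                    if code in mapping and mapping[code] not in target_build}
    usable_rows = sum(n for code, n in counts.items()
                      if code in mapping and mapping[code] in target_build)
    total = sum(counts.values())
    usable_pct = round(100 * usable_rows / total, 1) if total else 100.0
    return {"unmapped": unmapped,
            "not_in_build": not_in_build,
            "usable_row_pct": usable_pct}


# Hypothetical example: a small gender-code mapping table.
report = mapping_report(
    source_values=["M", "F", "M", "U", "X", "F"],
    mapping={"M": "Male", "F": "Female", "U": "Unknown"},
    target_build={"Male", "Female"},
)
```

Here code `X` has no mapping at all and `U` maps to a value absent from the build, so only four of the six rows would transform cleanly. That is the kind of information that helps decide whether a trial load is worth running yet.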

This genuine pursuit of excellence and attention to detail pays dividends when you apply that over the life of the project as proven by supplier load statistics on our data.

Dylan Jones : What kind of testing strategy did you adopt to ensure the final migrated data was fit-for-purpose?

Julie Waters : All of the data we cleanse and transform undergoes rigorous testing. We have a tool called ‘Validator’ that ensures our files meet the requirements of the load specification, but it also checks the general fitness of the data, encompassing all of our accrued knowledge and understanding of healthcare data, to ensure the data is usable. This includes cross-data integrity checks and checks on the validity of the patient journey. This means that we don’t need to wait to hear the results of loading; we already know what it will flag up before the data leaves Avoca. We also give the tool to the client, which allows them to drill through to the row-level patient data and find the hospital number if they want to correct data at source prior to the following trial load.

We perform further checks as we produce the Reconciliation Report. This is an accounting-style balance sheet of the number of rows we received in source data and the number delivered, including a detailed breakdown of how the transform has got from number of records A to number of records B. Here there is a high degree of sanity checking: is this the number of records we’re expecting given what we know has been filtered out, is the number of records similar to previous trial loads, and how have the business rule changes in the current load affected the volume of rows?
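The balance-sheet idea can be sketched in a few lines. This is an illustration only, not the actual Reconciliation Report: the function names and the row counts are made up, and it assumes the transform only removes rows, whereas a real transform may also merge or generate records.

```python
def reconcile(source_rows, filtered_by_reason, delivered_rows):
    """Check that source rows minus explained exclusions equals delivered rows."""
    expected = source_rows - sum(filtered_by_reason.values())
    return {"expected": expected,
            "delivered": delivered_rows,
            "unexplained": delivered_rows - expected,
            "balanced": expected == delivered_rows}


def volume_shift_pct(previous_rows, current_rows):
    """Percentage change in delivered rows against the previous trial load."""
    return round(100 * (current_rows - previous_rows) / previous_rows, 1)


# Hypothetical counts for one load file.
result = reconcile(
    source_rows=1000,
    filtered_by_reason={"test patients": 40, "duplicates": 10},
    delivered_rows=950,
)
shift = volume_shift_pct(previous_rows=900, current_rows=950)
```

A non-zero `unexplained` figure, or a volume shift larger than the known business rule changes can account for, is the cue to investigate before delivery.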

Finally, the analysts perform a full complement of manual checks. This is the labour-intensive part of the process, but the human interaction with the data is well worth it to catch any issues that the first two checks would otherwise miss. Overall it’s a thorough process, but it gives us absolute confidence about the quality of the data when we deliver the load files to the client.

Dylan Jones : For healthcare program leaders about to embark on a massive data migration such as this, what final advice would you give them based on your experiences?

Julie Waters : From my experience of over 30 hospital migrations, the two that went the smoothest were the Cerner implementations at Winchester and Eastleigh Healthcare in 2006 and at Barnet & Chase Farm Hospitals.

There were a number of factors that I think led to this:

  1. They had Data Migration Experts with experience migrating to Cerner Millennium® delivering a fully managed service (ourselves!). I’m not just being biased here, I’ve worked on projects that we’ve done ourselves, been involved in those that are migrated by other suppliers, and those that are migrated by the client themselves. I’ve seen the consequences of the latter two options and know that the mistakes that have been made have already been solved by ourselves and could have been mitigated. Believe me I’ve been there when a go-live with one of the last two options is in live cutover, with system down and the go-live has to be aborted and the hospital have to resume with the old system because the cleanse and transform has gone that wrong. It’s not pleasant. You don’t want that level of risk.
  2. Get yourself a really good Data Migration Lead acting on the client’s behalf (Avoca can help out here). The devil is in the detail with healthcare system implementations, so they need to have an excellent understanding of health informatics, but they also need to understand risk. This is where I’ve seen other projects go wrong: they appoint their Senior Information Analyst to cover the role, who has the SQL skills and knowledge of the data to deal with the detail but lacks the wider awareness of stakeholder, project and change management that’s necessary. These skills are needed to recognise when there’s a problem that needs escalating and to communicate effectively with directors so they can make the right decisions for the organisation at the right time.
  3. The more you can make each trial load really count, the fewer you will need and the quicker you will go live. The quicker you can go live, the less it will cost you in contractors, trial loads, time and so on. A hospital is always changing its staff, its clinics, its services... it’s an operational system. The problem if the timeline starts to slip is that not only do you have this operational change to contend with, but there is also a greater likelihood of more significant organisational changes and decisions being made that will incur even more work for the data migration project to keep up with. Once you get into this realm, the project can slip further and further from its original intentions, and as it does, the original requirements of the system you’re migrating to get further and further away from why you chose to buy that system. You want to go live quickly, get the true benefits of the system for the reasons you chose to buy it in the first place and start to maximise the cost-benefit. There are plenty of news stories about hospitals that have finally migrated and are questioning whether the costs they incurred migrating warrant the benefits they’re getting from their new system.
  4. At Avoca we can get to trial load one in 17 weeks from the source data extract arriving, and 26 weeks for go live to Cerner Millennium® so we’re really able to give your project the traction it needs. You can see from this that the effort is in getting trial load one, but when you do, we’ve really made it count from the thorough data quality work and robust business rules we’ve implemented. Trial load 1 MPI for the combined PRU, QE and QMS for bulk had 100% load rate for example.
  5. Finally, don’t forget about reporting and interfaces. It’s really important that your data migration and your reporting and interfacing streams don’t work as silos and you properly test migrated data in all streams. Again, there have been lots of horror stories on London migrations regarding financial penalties as a result of reporting problems once live. You don’t want to find out about these problems that late.

The proof that this advice works is in the success of these two projects. Winchester and Eastleigh Healthcare went live with 8 loads in 12 months and Barnet & Chase Farm Hospitals (two PASs migrating into one) went live with 5 loads in 8 months, so it is possible!


About the Author – Julie Waters

Julie is an experienced project and programme manager with specialisms in people management, health informatics and process evaluation/improvement with an excellent track record for success and a high level of credibility within the healthcare sector.

Julie has been involved with over 30 healthcare system implementations, including 11 successfully delivered ‘live’ acute hospital PAS data migration projects, and has 20 years’ health informatics experience working with and for health organizations.

Avoca Systems are a healthcare data management solutions provider who specialise in supplying technology and expertise to help deliver complex data migration initiatives globally.