by Michael Ochs on April 27, 2017
A little over a year ago we began development of a new technology for migrating data from on-premises instances of Dynamics CRM to Dynamics 365 (then referred to as CRM Online). What we knew was the current tools were inadequate due to their broad integration capabilities that didn’t necessarily translate to a simple one-to-one migration from on-premises to the cloud without a lot of work to set up. So we set out to create a purpose-built tool that would do one thing and do it well. We never intended to build a complicated integration and mapping application that would allow users to put data from the source system into different areas of the destination system. We didn’t set out to make the tool infinitely configurable for different source and destination data stores. All we wanted to do was create a tool that would allow us at Cobalt and anyone else to move data from one instance of Dynamics to another.
What seemed like a simple enough task, given the assumption that the schemas of both the source and destination were the same, ended up being a lot more complicated than you’d think.
Here’s the thing about Dynamics; it’s a finicky beast:
If you were moving data from one SQL database to another, for instance, there are some known constraints that can create roadblocks for copying data. Foreign keys, for instance, can cause problems if you don’t move the data in the right order or exclude specific fields that look up to non-existent records in other tables.
Think about the relationship in Dynamics between a contact and an account. An account can have a primary contact and a contact can have a parent account. If, for instance, Satya Nadella were the primary contact for Microsoft and Satya’s parent account is Microsoft, we’re left with a chicken and egg situation to decide which record to move first. The quick fix for this in SQL is to get rid of the foreign key constraints while migrating. Unfortunately, this isn’t possible when moving data to Dynamics 365 using the supported web APIs.
The scenario above is well known and there are lots of methods for getting around the problem, like excluding one or the other relationship on the first pass and then “re-parenting” during a second pass of the data. While this works fine as a way to solve the issue, these types of relationships can exist all over the place in CRM/365 and you’ll need to know what they are to get the order for migrating records correct.
Okay, let’s say you only have a handful of circular relationships like contact and account and you can, through trial and error or schema analysis, determine the correct order to move your data to ensure you don’t have orphaned records. Assuming that you have more than one user in your on-premises Dynamics system you’re going to need some way to map your users since their records will have different identifiers in the source and destination organization. Additionally, you may have teams of users that also need to be mapped. Most integration tools out there have a mapping component that will allow for this, but what about disabled users?
A common scenario when migrating data from on-prem to Dynamics 365 online is the need to maintain the owner, created by, and modified by of a record for auditing purposes. The problem here is there’s only a couple of supported methods for updating these fields, most of which require that you have an active user to “impersonate” while recreating the record. However, if you have 50 ex-employees in your on-prem organization, you don’t want to have to pay for 50 extra active online licenses just to move the data owned, created, or modified by those users. Ideally, you’d like to be able to create corresponding inactive users in the target system and still be able to maintain the historical data or potentially map an inactive user to an active user.
There are a bunch of other rules that exist in Dynamics that make simply moving data a difficult task not for the faint of heart. In the interest of avoiding a TLDR post (if I haven’t already). Here is a not exhaustive list of them.
Finally, what about the mechanism / architecture for migrating the data and monitoring the migration process or for error reporting? That is, where does the migration take place? Running the migration from your workstation or from the source server means only someone with the ability to login and check on that server can monitor the progress of the migration. It also means you better hope there’s not a scheduled restart pending while the process is running. Also, what about error logging? Where do those logs end up? On the hard drive of the server or workstation of the person doing the migration?
In my next post, I’ll talk about the decisions we made for the questions above. Hint: Azure is Awesome!