Merging two large RM databases with some of the same people in them

I have a dilemma that I have been meaning to deal with for a long time, but I have been putting it off because I am nervous and downright scared to do it.

As the title states, I need to merge two large RM databases that have some of the same people in them.
I know this is possible, but I don’t want one database’s information on a person to overwrite the other’s, because one may have something the other does not and vice versa.

Here are my two questions:

  1. Where is the merge feature in RM, or should I export one of the databases to a GEDCOM and then import it into the other as a GEDCOM?
  2. What is the best approach to merging two RM databases so that nothing gets lost out of either when they get merged?

Thank you

Rick

There is not a “merge two databases” feature in quite the sense you are thinking about. Something like the following is going to be your best strategy.

  • Let’s suppose you have databases A and B. They will be left completely untouched by the merging process that I’m going to describe. If the merging process goes completely off the rails, just delete the new database C we are about to make and you will be back where you started.
  • Copy database A to a new database C.
  • Drag and drop database B to the new database C. All the merging will take place in the new database C, leaving the original database A and database B untouched as previously mentioned.
  • From now on work only in the new database C, with the original database A and database B being closed.
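If you would rather make the copy of database A outside of RM, the step can be sketched in a few lines of Python. This is only a sketch, assuming each database is a single file on disk and that RM is closed while the copy is made. The function name and the file names in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def copy_database(source: Path, target: Path) -> Path:
    """Copy database A to a new file C without touching A.

    Refuses to overwrite an existing target, so an earlier database C
    (or database A itself) is never replaced by accident.
    """
    source = Path(source)
    target = Path(target)
    if not source.is_file():
        raise FileNotFoundError(f"source database not found: {source}")
    if target.exists():
        raise FileExistsError(f"target already exists: {target}")
    shutil.copy2(source, target)  # copies the contents and the timestamps
    return target
```

For example, `copy_database(Path("A.rmtree"), Path("C.rmtree"))` would leave A as it was and create C next to it. If the merge goes wrong, delete C and run the copy again.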

At this point, your two files have been “merged” in the sense that everybody who was in database A or in database B is now in database C, but no duplicate people between database A and database B have yet been merged. So we will start merging duplicate people in database C. That’s what I mean by the fact that there is not a tool to merge two databases. What there is instead are tools to merge duplicate people within a single database. So we have combined the databases without merging the duplicate people, and now we are about to merge the duplicate people in the new database.

There are three basic ways to merge the duplicate people in the new database. There are several different places to find the merge tools. The place I usually use is on the main People tab on the left side of the screen and then in the Tools icon in the upper right corner of the screen.

So click on that Tools icon and choose one of the Merge options.

Automatic Merge

This option is completely automatic and it is very exacting. It only merges people if their data is exactly the same or almost exactly the same. Many RM users swear by this option, but I don’t like it. The reason I don’t like it is that it just runs very quickly and after it is done I have no idea what it did. I ran the tool just now in a copy of my main RM database, and it said it merged over 50 duplicate people. But I have no idea who those duplicate people were or if they really should have been merged.

Manual Merge

This merges two duplicate people, period, and you have to find the two people to merge. I use the option quite a bit, but I don’t think it’s a very good fit for what you are trying to do. It would be too hard and too slow for you to find the possible duplicate people that need to be merged. I would only use this option when there are an extremely small number of duplicate people needing to be merged, and when you already know who those people are. That may fit what you are trying to do, but it sounds like your problem is a little bigger than that.

Duplicate Search Merge

This one is called Merge duplicates in the Tools menu, but historically it has been called Duplicate search merge. First of all, it has some options about how close the match has to be between two possible duplicate people. You will probably want to play with these options.

Second of all, it steps through your possible duplicate people, pausing at each possible duplicate pair. Then you make a decision to merge or not to merge based on what you see. After you make your merge or not merge decision, it steps on to the next possible duplicate pair.

Actually, even if you make a decision to merge, it remains paused on the merged person until you tell it to move on. The reason it remains paused is that no data whatsoever is lost by the merge process. For example, suppose one John Doe was born 12 Sep 1912 in Tennessee and another John Doe was born 12 Sep 1912 in Jefferson County, Tennessee and you decide to merge them. The combined person will now have two Birth facts, one born in Tennessee and one born in Jefferson County, Tennessee. So you can edit the person, decide which Birth fact to keep, and delete the other. Plus, you can further edit the Birth fact you decide to keep.

As you do these edits of the merged people, you need to look at everything - names, facts, places, dates, citations, and media files. No data is lost by the merge operation, but you need to decide what to keep and what to delete when the data is mostly duplicate. If the data is exactly duplicate, then the exactly duplicate data is fully merged and you don’t have to do anything. It’s only when the data is approximately duplicate that you need to edit the merged person and clean up any residual duplicate data. For example, the people could be the same but the names might be spelled slightly differently, etc.

In theory, you could just go through the merged duplicate people without cleaning them up as you go. But in my experience, it can be very difficult to go back after the fact and find out which people still need cleanup, so I find it better to clean up as I go.

Depending on your data, merging all the duplicate people in database C could go very fast or it could take a long time. There is a big difference between having a dozen duplicate people or a hundred duplicate people or a thousand duplicate people. If you have a very large number, the good thing is that you can stop the Duplicate Search Merge process at any time, and then start it up again an hour later or a day later or a week later.

And finally, if this gets all screwed up the first time you try it, just delete database C and start over again. Database A and database B will still be there and will not have been changed at all.

Thank you for getting back to me on this, Jerry.
I am still nervous about pulling the trigger on this.

I have three questions about what you’re explaining.

  1. You said drag and drop database B into Database C.
    Where is the option to drag and drop one database into the other one?
    I understand a lot about RM, but I did not know you could drag and drop one database into another one. Please explain the drag and drop process a little more.

  2. If I understand you correctly, whether I do an Automatic, Manual, or Duplicate Search Merge of people, it will put all the data from one into the other. That means if I have a field that is different by a date or one single character, it will create a duplicate entry in that person’s file. Is that correct?

  3. If I do the Duplicate Search Merge, you said it will stop at each possible duplicate and give me the option to merge. If there is an entry or two from the duplicate person in database B that is wrong or that I wish not to merge, will I be able to exclude one or both of those entries from being merged and have the rest of the entries of that person continue to be merged? (I hope what I said makes sense.)

Thank you

Rick

  • To drag and drop from database B into database C, get them both open at the same time. When you first get them both open, they probably will both be in full screen windows. You can switch back and forth between the two full screen windows at that point by clicking on F5. Reduce the size of both windows enough so that both of them can be on the screen at the same time. In theory, this ought to be a pretty easy process. But I find it a little tricky in practice.
  • Click and hold one person in database B, drag the cursor over to database C, and release the cursor. That sounds like it’s only going to drag and drop one person from database B over to database C. But it will stop and ask you who else should be copied. Choose the Everybody in the database option. By the way, if you know for sure at this point one person in database B who will need to be merged into somebody in database C, you can drop the person from database B on top of the same person in database C. But you are going to have lots of people to merge. So just drop the person from database B onto some empty space on the database C window.
  • Click OK and wait for the drag and drop to complete. Everybody from both database A and database B is now in database C.
  • I always arrange my windows so that I’m dragging from the window on the left and dropping onto the window on the right. But that’s totally unnecessary. You can drag and drop between the two windows in either direction. Just be sure you know which is database A, which is database B, and which is database C. The name of the database is at the top left of each screen. Plus I color code each database a different color. In my screen capture, I’m dragging and dropping from the green database on the left to the red database on the right.

That is correct.

If one of the people had a Burial fact and the duplicate person didn’t have a Burial fact, the Burial fact would be kept in the merged person and there would be nothing to clean up. If both people had identical Burial facts, one of the Burial facts would be kept in the merged person and there would be nothing to clean up. If both people had Burial facts that differed ever so slightly, the merged person would have two Burial facts that would have to be cleaned up, one Burial fact from one of the original people and the other Burial fact from the other original person.
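The three cases above can be modelled in a few lines of Python. This is a toy model of the behavior as described in this thread, not RM’s actual code. The `Fact` type and both function names are made up for illustration, and “identical” is assumed to mean that every field matches exactly.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    kind: str   # e.g. "Birth" or "Burial"
    date: str
    place: str


def merge_facts(primary: list[Fact], duplicate: list[Fact]) -> list[Fact]:
    """Keep every primary fact, then add each duplicate fact that is
    not already present as an exact match."""
    merged = list(primary)
    for fact in duplicate:
        if fact not in merged:
            merged.append(fact)
    return merged


def needs_cleanup(facts: list[Fact]) -> set[str]:
    """Return the fact kinds that occur more than once after a merge."""
    seen: set[str] = set()
    repeated: set[str] = set()
    for fact in facts:
        if fact.kind in seen:
            repeated.add(fact.kind)
        seen.add(fact.kind)
    return repeated


john_a = [Fact("Birth", "12 Sep 1912", "Tennessee")]
john_b = [
    Fact("Birth", "12 Sep 1912", "Jefferson County, Tennessee"),
    Fact("Burial", "1980", "Tennessee"),  # made-up Burial fact
]
merged = merge_facts(john_a, john_b)
print(needs_cleanup(merged))  # the Birth fact is now doubled; Burial is not
```

Note that some fact types can legitimately appear more than once for a person, so a repeated kind is only a hint that the person needs a look, not proof of a problem.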

Here is what the Merge screen looks like.

The person on the right is always considered to be the duplicate person and the person on the left is always considered to be the primary person. The person on the right will be merged into the person on the left. If the people are truly identical, it will not matter which is primary and which is duplicate unless you care about preserving the person’s ID number. In this case, the person on the left is person 40237 and the person on the right is person 88308. If you wish, you can always swap the left person and the right person before doing the merge.

I chose a duplicate person where the person on the left had a Birth place and the person on the right did not have a Birth place. So if we did a merge at this point, there would be two Birth facts after the merge, one with a birth place and one without.

In this particular case, the two duplicate people also differ in having different parents because the parents have also been duplicated. I don’t know how much of this sort of thing you will encounter with your project. In my experience, it’s better just to deal with one person at a time rather than trying to deal with the duplicate parents at the same time. In fact, the duplicate parents will also show up in their own right to be merged as the Duplicate Search Merge process continues.

In any case, the Merge does not take place until you are ready. You make the Merge happen by clicking on the Merge duplicate into primary button at the bottom left of the Merge screen. That’s the point at which the Duplicate Search Merge process moves on to the next duplicated person.

I don’t have reason to do this kind of merging very often, so I had to review a bit before posting this message. My review caused me to remember an unfortunate behavior of the Duplicate Search Merge process. Namely, after you click on the Merge duplicate into primary button, it goes on immediately to the next pair of duplicates to merge, even if there are things to clean up in the pair you just merged. I think there should be one more click required before moving on to the next duplicate pair if cleanup is required. The extra click wouldn’t be required when there was nothing to clean up. After the Duplicate Search Merge process is completed, it can be very difficult to go back and find the people that need the cleanup. This argues for making the people identical before doing the merge.

Perhaps some other experienced RM user can think of a solution to this problem, but I can’t think of one right now.

That’s not quite the way it works. You either do the merge or you don’t. If you do the merge, then identical items are not duplicated in the newly merged person but items that are not identical do become duplicated in the newly merged person.

What you can do is to edit either duplicate person or both before the merge to make them identical before the merge.

What it won’t do is pause after a merge to let you clean up the newly merged person before moving on to the next duplicate pair. I’m still trying to remember how I have worked around that problem in the past. One thing I may have done is to exit Duplicate Search Merge immediately when I know cleanup is required, do the cleanup, and then resume the Duplicate Search Merge. Again I suggest that perhaps other experienced RM users can help with this issue.

These days, I have so few duplicate people left in my database that when I notice one of them, I just do it with a Manual Merge. Doing it that way, it lets me do the cleanup. But I’m pretty sure you will need to do the Duplicate Search Merge to find all your duplicate pairs.

Rick, you will find that many of your questions can be answered by exploring the online Help pages. Pressing F1 in the application opens the Help page related to whichever screen is currently in focus. More can be answered by trying things. Don’t be afraid to try. Jerry’s A, B -> C approach, plus making a backup before doing anything you’re unsure of, minimises the risk.

Thanks once again Jerry.
One last question, I hope.
Once I start the Duplicate Search Merge am I obligated to complete it, or can I stop at a point and go back and finish it later?
The two databases are large, and I really don’t want to be doing the merge for many hours if I don’t have to.

You can finish it later. For a large number of people needing to be merged, I would definitely recommend multiple sessions.

It doesn’t remember where you left off. Instead, it starts over again each time. However, when it “starts over again” it doesn’t need to merge again those people who have already been merged. So it has the effect of picking up where you left off.
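Here is a toy Python sketch of why “starting over” ends up picking up where you left off. It is not how RM finds duplicates. It assumes, purely for illustration, that two records are possible duplicates when their name and birth date match exactly, and the function names and sample people are made up.

```python
from collections import defaultdict


def find_duplicate_pairs(people: dict[int, tuple[str, str]]) -> list[tuple[int, int]]:
    """Pair up record numbers that share the same (name, birth date).

    `people` maps a record number to (name, birth date). Exact matching
    is a simplification; the real tool has options for looser matches.
    """
    groups: dict[tuple[str, str], list[int]] = defaultdict(list)
    for rin in sorted(people):
        groups[people[rin]].append(rin)
    pairs = []
    for rins in groups.values():
        primary = rins[0]
        pairs.extend((primary, dup) for dup in rins[1:])
    return sorted(pairs)


def merge(people: dict[int, tuple[str, str]], primary: int, duplicate: int) -> None:
    """Merge `duplicate` into `primary`: the duplicate record disappears."""
    del people[duplicate]


people = {
    1: ("John Doe", "12 Sep 1912"),
    2: ("Jane Roe", "3 Mar 1920"),
    3: ("John Doe", "12 Sep 1912"),
    4: ("Jane Roe", "3 Mar 1920"),
}
first_session = find_duplicate_pairs(people)   # both pairs are found
merge(people, *first_session[0])               # merge one pair, then stop for the day
second_session = find_duplicate_pairs(people)  # only the unmerged pair is left
```

Because the search always runs against the database as it stands now, anyone already merged simply no longer has a twin to be paired with.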

I’m still bummed out that I cannot remember how I used to deal with the cleanup needed for a person after a merge during a Duplicate Search Merge. It really shouldn’t move on to the next duplicate person without giving you a chance to do the cleanup first. I may have run the Duplicate Search Merge to find the duplicates, but then cancelled out of the Duplicate Search Merge after each merge. That would have given me the chance I needed to do the cleanup.

I do remember that it was really hard for me to go back after the fact to do the needed cleanup on a large number of people. There was simply no easy way to find them. That’s why I liked doing the cleanup as I went. I think some RM users would simply write down on a piece of paper the names of the people they had merged, or something like that. I also can’t remember if there is an official RM video on Duplicate Search Merge that deals with how best to do the needed cleanup of each duplicate pair after they are merged. And I ask again for other experienced RM users to share their ideas.

@Attroll Unless Jerry has changed his mind after playing with duplicate merges, he said you could do so many and stop-- start again an hour, a day or a week later

Since once you merge 2 people, it immediately goes to the next, you could do one of 2 things

  1. Take a screenshot of a batch of your duplicates, say about 20, work through the list merging them, and then use the screenshot to clean up the ones you merged.
  2. Write the person’s name (and probably the RIN #) down in a note or Word doc before you merge them, and then go back later and clean them up.

Actually there is a 3rd option-- once you merge the 2 people, the person does NOT fall off the list UNTIL you CLOSE OUT of the merge duplicates–so you could click back on him and edit him before moving to the next-- if you do it this way, just make sure you edit all of them before closing the merge duplicates…
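If you go with the written list, it does not have to be a Word doc. As one possible way to do it, here is a small Python sketch that appends each merged person to a CSV file you can open in a spreadsheet later. The file name, column names, and function names are my own choices, not anything RM provides, and you would still type in the RIN and name yourself.

```python
import csv
from datetime import date
from pathlib import Path


def log_merge(log_path: Path, rin: int, name: str, note: str = "") -> None:
    """Append one merged person to a CSV so they can be found for cleanup later."""
    log_path = Path(log_path)
    is_new = not log_path.exists()
    with log_path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        if is_new:
            # Write the header row only when the file is first created.
            writer.writerow(["date", "rin", "name", "note"])
        writer.writerow([date.today().isoformat(), rin, name, note])


def pending_cleanup(log_path: Path) -> list[dict[str, str]]:
    """Read the log back as a list of rows, one dict per merged person."""
    with Path(log_path).open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

For example, `log_merge(Path("merge_log.csv"), 40237, "John Doe", "two Birth facts")` records the primary person’s RIN, since that is the record that survives the merge.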

Just a thought

If I can add my two cents…

If you are nervous/worried/concerned about doing anything in RootsMagic, then I’d guess that you do not have complete confidence in your backups.

Having confidence in your backups includes understanding how to restore them.
Once you know that, and you have multiple backups at different locations, you can try anything in RM and if you’re not happy, just restore from a backup.
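One small piece of that confidence is knowing that a backup file copied to a second location arrived intact. A rough Python sketch of that check, comparing SHA-256 checksums of the two copies, might look like this. The function names are hypothetical, and this only compares two copies of the same backup file with each other. It does not replace doing a real test restore.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_contents(original: Path, copy: Path) -> bool:
    """True when both files have identical contents."""
    return sha256_of(original) == sha256_of(copy)
```

If `same_contents` returns False for a backup and its off-site copy, one of the two is damaged or out of date, and it is better to find that out before you need it.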

Of course, there is a caveat. How do you know that “you’re happy”, i.e. that the operation you performed gave the result you wanted?
That’s where testing comes in. Check out the result to see if it’s what you wanted and then leave the result database alone for a while.
Using Jerry’s databases A, B and C notation, don’t add or edit research information in database C. That way, you can keep thinking about new ways to test the results of the operation (merge) that you did and you’re not invested in using database C.

Keep adding research to A and B.

After a while, once you’re sure C is OK, follow the same steps that you did to generate C the first time, using the latest A and B databases.
Test again. If happy, use C for your research.