Beware of merging sources and/or citations

jbmccall47 · December 20, 2023, 3:05pm

I recently made a resolution to move entirely from RM7 to RM9, largely because the search features in RM9 make it much easier to work with names and sources, reducing duplication since I can find what I want in the lists. I worked for ten days and cleaned up a lot, including more than 500 separate sources from Find-a-Grave. Each uses the built-in (RM7) template and is listed separately in my source list like this: “Prescott, Sarah Hayward (1645-1709) - FindaGrave #37130889.”

I went on to other clean-up tasks and then ran the tools to merge duplicate sources and duplicate citations. I thought it was amazing that the source count went from 2330 to 1520, and the citations list went from 29370 to 4640. I thought – “wow, what an improvement!” I spent the next five days working on all sorts of other details to freshen my main file.

I should have been thinking “OMG, what a disaster,” because I just now found only FOUR FindaGrave citations left in the file after I ran the merge tools. It doesn’t appear other sources were affected like this. Now I need to choose whether to rebuild yesterday’s file (without nearly 500 FindaGrave sources) or last week’s backup (with all the FindaGrave sources intact but without five days of other editing).
So, before I do anything I’m asking for help and advice!

thejerrybryan · December 20, 2023, 3:32pm

There is a rock and a hard place involved in your question.

On the one hand, it’s almost mandatory that you run the Merge Duplicate Citations tool after you import your RM7 database into RM9. Otherwise, duplicate footnotes and endnotes will not merge in any way when you run reports. For me at least, that’s pretty awful.

On the other hand, it’s almost mandatory that you not run the Merge Duplicate Citations tool after you import your RM7 database into RM9. Otherwise, certain citations that are not really duplicate will merge anyway creating the kind of disaster you have just encountered. That’s also pretty awful.

The situation that I’m most familiar with concerning false merges of citations that are not really duplicate is when the citations differ only in having a different image file. That can happen when downloading from Ancestry using TreeShare when the Ancestry collection providing the source is only partially indexed and when the citations in Ancestry differ only by the image file.

I’m not familiar with the same thing happening with FindaGrave citations. Your example of “Prescott, Sarah Hayward (1645-1709) - FindaGrave #37130889” sounds on its face like it should be ok because there seems to be a lot of information that should be different for every citation. However, my RM7 does not have a built-in source template for FindaGrave, so your RM7 doesn’t either. That template has to have come into your RM7 database some other way. Maybe it was put there by TreeShare?

I enter all my sources completely by hand, so I don’t run into the problem of citations that differ only by the media file. Citations that differ only by a WebTag could have the same problem.

I don’t understand your situation well enough to provide a complete solution. But I do think you are going to need to delete your current RM9 database and import it all over again from RM7. But before you do, I think you are going to need to look into your RM7 database and look at that FindaGrave template to see what’s going on. Also, look at some of the FindaGrave citations in RM7 to see how they do or do not differ.

rzamor1 · December 20, 2023, 5:07pm

If the Citation Details are blank and the only things you have added are items like Media or WebTags, RM won’t see the citations as unique and they will be merged. To prevent those merges, you need to give each Citation a unique Citation Name.
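A minimal sketch of that idea, not anything RM provides: the `Citation` class below is a hypothetical stand-in for a citation record (it is not RootsMagic's schema), and `fill_blank_names` is an illustrative helper that gives every blank-named citation a distinct name, taken from its first media file when there is one.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    # Hypothetical stand-in for a citation record; not RootsMagic's real schema.
    name: str = ""
    details: str = ""
    media: list = field(default_factory=list)


def fill_blank_names(citations):
    """Give every blank-named citation a name that no other citation uses."""
    used = {c.name for c in citations if c.name}
    for i, c in enumerate(citations, start=1):
        if c.name:
            continue  # already named; leave it alone
        # Prefer the first attached media file name as the basis for the name.
        base = c.media[0] if c.media else f"citation {i}"
        candidate, n = base, 2
        while candidate in used:
            candidate = f"{base} ({n})"
            n += 1
        c.name = candidate
        used.add(candidate)
    return citations
```

In a real file the names would have to be entered in RM itself; the sketch only shows what "unique" needs to mean before the merge tool is run.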

TomH · December 20, 2023, 6:11pm

To preclude such false merges, the RM application needs to take into account whether there are differences among WebTags and among Media tags. This has been a requested fix for the user trap created by the Merge All Duplicate Citations feature since it was first reported years ago.

jbmccall47 · December 20, 2023, 6:44pm

Thank you for helping here. I’ve been digging deeper and see that four of these FindaGrave records did survive – four out of 542, but I don’t see why. Three are based on my RM7 template (which I think came with an update) and one on the ee-FindaGrave template. Here’s a sample of one that didn’t survive – I don’t have anything in the “citation detail text, media” box, but obviously I have unique information in each “citation details” panel.

Maybe the more important question is “what now?” Do I re-enter every single FindaGrave source using the ee-FindaGrave template, and then put something unique in the “citation detail text, etc.” panel? I guess I need to experiment. Maybe someone can suggest a shortcut to ease the pain!

thejerrybryan · December 20, 2023, 7:09pm

One suggestion I might make is to go back to your last RM9 backup before the Auto Merge rather than going all the way back to RM7 as a part of recovering.

I have some puzzlement about why so many of the FindaGrave citations merged. The only way to tell would be to look at the citation in RM7 or to restore to RM9 from before the merge and look at the citations there. I don’t think there is any way to figure out what happened just by looking at your RM9 database after the merge.

Looking at your screen shot, any differences in the Citation Name or Name or Born-Died or Comments or Memorial No. or Research Note or Detail Comment should prevent the merge. Differences in Media and Citation Web tags will not prevent the merge, but this citation doesn’t have any Media or Citation Web tags anyway.
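As a rough illustration of that rule, here is a toy model in Python. It is only a sketch of the behavior described in this thread, not RM's actual code: the field names in `COMPARED` are hypothetical labels for the fields listed above, and citations are plain dictionaries assumed to belong to the same source.

```python
from collections import defaultdict

# Fields this toy model compares. The names are hypothetical labels for the
# fields mentioned in the post; they are not RootsMagic's internal names.
COMPARED = ("citation_name", "name", "born_died", "comments",
            "memorial_no", "research_note", "detail_comment")
# "media" and "web_tags" are deliberately absent from COMPARED.


def merge_key(citation):
    """Build the tuple of compared fields; missing fields count as blank."""
    return tuple(citation.get(f, "") for f in COMPARED)


def would_merge(citations):
    """Return the groups of citations that share the same merge key."""
    groups = defaultdict(list)
    for c in citations:
        groups[merge_key(c)].append(c)
    return [g for g in groups.values() if len(g) > 1]
```

In this model, two citations with blank fields and different `media` entries land in the same group, while a citation with its own `memorial_no` stays separate.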

If you do decide to restore your “before the merge” RM9 database, be aware that it will be restored with the same database name as before. There is no way to rename it as a part of the restore process. That means that by default, the restore will destroy your “after the merge” RM9 database. You can solve this problem either by restoring to a different folder or else by renaming the “after the merge” RM9 database before the restore. The rename is probably simplest. Just rename it to something like “after_the_merge” without the quotes and then do the restore.
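The rename can be done in any file manager. For anyone who prefers a script, a small Python sketch follows; it assumes the database is closed in RM at the time, and the function name `set_aside` and the file name `MyTree.rmtree` in the usage line are made up for illustration.

```python
from pathlib import Path


def set_aside(db_path, suffix="_after_the_merge"):
    """Rename a database file so a restore under the old name cannot overwrite it."""
    src = Path(db_path)
    # Keep the extension and add the suffix to the base name.
    dst = src.with_name(src.stem + suffix + src.suffix)
    if dst.exists():
        raise FileExistsError(f"{dst} already exists; choose another suffix")
    src.rename(dst)
    return dst
```

Calling `set_aside("MyTree.rmtree")` would leave `MyTree_after_the_merge.rmtree` in the same folder, so the restored file can take the original name.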

Then look very closely at some of the FindaGrave citations that merged that shouldn’t have. What you see there is going to guide the “what do I do now?” process.

kevinm · December 20, 2023, 10:39pm

I am confused by the attached image. The “Citation Used”= 1 so this doesn’t appear to show a merged citation. Also, the source name indicates that you’re a source splitter so I wouldn’t expect this source to have many citations under it. I’m struggling to see this screenshot as an example of a merge problem.

[edit: added the following additional text]
Typically, when people post about automerge problems, they stem from Ancestry TreeShare. Some Ancestry collections are severely lacking in the source and citation information that gets sent across the API to RM. (Ancestry’s US Directory collection is a prime example of this issue.) If the RM user doesn’t review and edit the citation after it’s been TreeShared, then the only unique item will be the attached media. Neither the media field nor the WebTag field is considered in RM’s merge process, and the result is a mess. This is the issue that Jerry and Tom were raising. The problem is avoided if the user adds any unique information to the citation. The image you posted shows fields that are not in the source template that gets created via TreeShare, and it shows source and citation details that I assume you entered manually and that would have made the citation unique. It shouldn’t have caused a problem when merging.

TomH · December 20, 2023, 11:03pm

I am inclined to agree and suggest that its loss is the result of manually merging the different FindAGrave Sources. There still may be another unwanted outcome from the flawed Merge All Duplicate Citations but that example looks to be a unique Source that should have persisted through a Merge All Duplicate Sources type operation.

kevinm · December 21, 2023, 3:44pm

Yes, that would make sense. I reread the original post and it speaks of merge vs automerge. When I saw the numbers involved I assumed automerge was the culprit.

jbmccall47 · December 21, 2023, 6:39pm

All the merges for sources and citations have been from the “merge all duplicate” buttons. I’ve thought of various tests and will get to them after the holidays. I created a group in the RM9 file (the one that lost all but four of the FindaGrave sources) that includes all 150 people I had added to the file since I had imported from RM7. Then I started over: 1) importing into RM9 the original RM7 file that has all 550 FindaGrave sources but lacks the 150 later edits, 2) running “merge all duplicate citations” then “merge all duplicate sources,” and rejoicing when it worked without any apparent loss of sources, 3) importing a GEDCOM of the 150 names I had edited since I had closed the original RM7 file.

Going forward I’m doing only ONE file operation at a time, then backing-up with a unique name and making a note so I’m clear what I have done and when.
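A sketch of how that routine could be scripted, for anyone who wants to automate it. This is a plain file copy made while the database is closed, not RM's built-in backup, and the function name `snapshot`, the `backups` folder, and the `backup_log.txt` file are all names invented for this example.

```python
import shutil
from datetime import datetime
from pathlib import Path


def snapshot(db_path, note, backup_dir="backups"):
    """Copy the database under a timestamped name and log what was done."""
    src = Path(db_path)
    dest_dir = Path(backup_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = dest_dir / f"{src.stem}_{stamp}{src.suffix}"
    if dest.exists():
        # Two snapshots within the same second would collide; refuse to overwrite.
        raise FileExistsError(f"{dest} already exists")
    shutil.copy2(src, dest)
    # One line per snapshot: when, which file, and what operation preceded it.
    with open(dest_dir / "backup_log.txt", "a", encoding="utf-8") as log:
        log.write(f"{stamp}\t{dest.name}\t{note}\n")
    return dest
```

A call such as `snapshot("MyTree.rmtree", "before merge all duplicate citations")` would record one copy and one log line per file operation.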

In the meantime, thank you all for your helpful ideas. If I find the gremlin, or can duplicate my error, I’ll give an update.

MardeeB · March 30, 2025, 7:01pm

Ugh, I just learned this the hard way. I had a lot of duplicate sources and citations but did not realize that the system would not look at the name of the attached media file or webtags. In one case, I had almost 90 media files attached to one citation.

Thanks to another one of your posts, I figured out that I could kind of ‘unmerge’ these by exporting to GEDCOM and starting a new database by importing my data back in. That was a start, but I’m still stuck manually updating hundreds of citations. It seems like adding code to check the name of the attached media file and the webtags would not be that hard, but then I was a project manager, not a developer. Is there some way that we can upvote this feature request so we prevent this happening to other users?

I did check out the SQLite scripts that are available, but again, I’m a PM and not a developer, so it was a bit over my head.

Thanks for your posts (even the ones that are 2 years old!) and the help you’re giving us new users. It’s much appreciated.

thejerrybryan · March 31, 2025, 1:39am

Which version of RM are you on? There is a fix in version 10.0.5 that appears as if it might address this problem.

10.0.4 - 29 Jan 2025

 Fixed: Merge all duplicate citations no longer merges citations with blank names and fields

It’s not totally clear to me if this particular fix actually is intended to address this particular problem. Typically, citations that differ only in their image files or Web tags have blank citation names. But if the citation names are not blank and are identical, then citations that differ only in their image files and Web tags would still be merged. If that’s your current situation, then the problem still is not fixed for your use case.

MardeeB · March 31, 2025, 2:31am

Thanks for the heads up. I’m currently running 10.0.5.0, so it appears that I would have that fix, but it might not have been released yet when I first caused this problem. I’m not sure how long it was occurring before I realized what was happening.

I think part of the problem might also be with downloading sources from Ancestry and/or FamilySearch. I have stopped downloading sources for now, until I figure out how I want to handle it.

Thanks again for taking time to reply and also for letting me know about the bug fix.

thejerrybryan · March 31, 2025, 2:51am

Yep, downloading images from Ancestry and/or FamilySearch using RM’s tools can be a major cause of this problem.

I don’t have the problem because I enter all my sources by hand from Ancestry or FamilySearch rather than by using RM’s tools to download them. But I didn’t choose to enter my sources by hand because of this problem. This problem didn’t start until RM8 and I was entering sources by hand long before RM8. I chose to enter sources by hand so that the citations would appear the way I want them to appear and so the images would be named the way I want them to be named and so the images would be stored in a sub-folder structure that makes sense to me.

MardeeB · March 31, 2025, 10:34pm

Thanks Jerry. I’m definitely seeing the benefits to manually entering my sources and citations, as well as downloading and naming the media files. It seemed so convenient to just download them via the interface but then I ended up with kind of a kludgy mess. Once I figure out how to unwind the mess, I will probably start doing this manually for the exact same reasons that you cite. I did consider just flushing my first tree and starting over, but I think the research I did is good enough, I just need to clean up the data.

Thanks again for the advice. It’s so much appreciated.
