Conclusions Regarding the Merge Duplicate Citations Utility in RM11

Experimenting with the merge duplicate citation utility in RM11, I conclude the following:

1-Citation objects with blank Citation Name fields are exempt from the duplicate citation merge utility.
2-Removing a “citation” (link) from a fact does not delete the underlying Citation Object.
3-Multiple links between the same fact and the same citation object appear multiple times in the “citation” list for that fact. Deleting one of these duplicate entries from the fact removes them ALL from the fact.

Does this behavior comport with other users’ experience?

This is not merely an academic discussion. TreeShare and Ancestry are creating an explosion of duplicate, identical citations, many with empty Citation Name fields, some with no links to any facts in the database, some linked multiple times to the same fact.

Due to the limitations above, the citation merge utility does not help with many of these duplicate citations. I observe that an export of my DB to Gedcom and import back to RM cleans up most of these citation issues. I believe the merge utility needs tweaking to deal with these issues.

At the very least, identical citations with empty Citation Name fields should be merged.

There is some history here. Originally, the merge citation utility did merge citations with blank citation names. But doing so created a major problem. Namely, citations that actually were different were being merged.

The problem is that there are some collections at Ancestry where the indexing is not complete. I’m not sure that “not complete” is really the best description, but what happens is that citations for such collections come down to RM via TreeShare differing only in their media file or in a Web tag. And those differences did not prevent them from being merged even though they were really different.

The workaround implemented in RM was not to merge citations with blank Citation names. The Citation names were blank because the main body of the citation data was blank and the information about the media files and/or Web tags did not go into the Citation name.

That’s still the current status. Citations with blank citation names are not merged.

Thank you for the background. So instead of providing a way to detect the difference in media attached to the citation, they chose to simply not merge citations that do not contain a name?

Have you heard if perhaps a real fix might be in the works?

I guess if I have two identically named citations that only differ in media, a merge WILL take place? When that happens, do you end up with a single citation object that contains both media?

I honestly do not know what happens in that case. Somebody will need to do a test in a test database.

If citations have identical citation names and fields they will merge, media and webtags are not taken into consideration. It is only looking at what would be included in a footnote.

Thank you for the information.

I assume that the media and webtags associated with each merged citation are merged into the surviving citation object?

Do you have any insight as to why citations with empty Citation Name fields are not merged?

Obviously, computers and programmers are always faced with the difficulty of determining human “intention” and thus try to avoid “assumption”. One possibility, for an empty Citation Name, is that the field was left blank in (human) error. If that were the case and a merge were allowed to occur, several additional mitigations would need to be incorporated for potential recovery, both right after the user realizes what happened and for later detection (if possible). All that to say, it’s “complicated” LOL.

When citations with blank names were merged, all the attached media ended up in one citation. For me that led to 154 media items on a single citation. A mess I’m still working to fix.

Thank you for the reply.

The trouble we are having is that TreeShare sometimes creates multiple citations that do not have citation names. Many of these citations are identical in all respects, including media and web links. Often these identical, unnamed citations appear linked to the same fact.

As it stands now, merge duplicate citations does not deal with this (because it skips unnamed citations). If I export my entire database and re-import back to RM, these duplicates are gone. Obviously, I should not be required to resort to this extreme measure.

It may sound crazy to some, but all I’m asking is that merge duplicate citations do exactly that: merge duplicate citations. I suppose, if the community is split on this, RM should simply offer a setting that toggles merging unnamed citations on or off.

If you give the blank citations a citation name they will merge.

Yes, I suppose that is true, but it is also quite a bit of work to manually locate the empty citations and give each one a name.

I’d really like to see RM provide an optional setting that allows users to enable the merging of nameless citations.

I don’t totally agree and I don’t totally disagree.

It seems to me that the problem with merging duplicate citations automatically is merging them when they really aren’t duplicates. The current criteria are that the citation names have to be non-blank and the citation data that creates a footnote/endnote sentence has to be the same. The problem with that approach is that there can be important data as a part of the citation that is not a part of the data that generates a footnote/endnote.

I think I know how I would want it done. But maybe I don’t. Maybe there need to be some user options - not just the citation name being blank or not, but taking citation notes into account or not, taking citation media into account or not, and taking citation Web tags into account or not. That sort of thing. I realize that that approach could seem to be overwhelmingly complex. But I don’t think any simple solution would meet the needs of all RM users.

Is there any official documentation about how the merge duplicate citations utility actually works?

Some here seem to be saying that if two citations are identical and only differ by the media and/or weblinks attached to them, they will merge. Are you saying there are other data fields within the citation that can also differ and a merge will occur in spite of differences in these fields?

Personally, I would like an option for the merge utility to merge only citations that are truly identical in all respects (including media and web links) regardless of whether the citation name is empty.

Perhaps the most flexible approach would be to offer users an array of check boxes for which parameters the merge should ignore, and which it should consider.
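
To make the check box idea concrete, here is a rough Python sketch of how such options could feed the duplicate test. Everything in it (the `Citation` and `MergeOptions` classes, the option names, the `are_duplicates` function) is made up for illustration; it is not how RM stores or compares citations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    # Hypothetical model, not RootsMagic's actual data structure.
    name: str
    fields: tuple                      # values that feed the footnote
    media: frozenset = frozenset()
    webtags: frozenset = frozenset()
    note: str = ""


@dataclass
class MergeOptions:
    # Each flag stands for one imagined check box; defaults mimic the
    # behavior described in this thread (blank names skipped, extras ignored).
    merge_blank_names: bool = False
    compare_media: bool = False
    compare_webtags: bool = False
    compare_notes: bool = False


def match_key(c, opts):
    """Return the comparison key, or None if the citation must not be merged."""
    if not c.name.strip() and not opts.merge_blank_names:
        return None
    key = [c.name, c.fields]
    if opts.compare_media:
        key.append(c.media)
    if opts.compare_webtags:
        key.append(c.webtags)
    if opts.compare_notes:
        key.append(c.note)
    return tuple(key)


def are_duplicates(a, b, opts):
    ka, kb = match_key(a, opts), match_key(b, opts)
    return ka is not None and ka == kb


# Three unnamed citations: a and b are identical, c has different media.
a = Citation("", (), media=frozenset({"x.jpg"}))
b = Citation("", (), media=frozenset({"x.jpg"}))
c = Citation("", (), media=frozenset({"y.jpg"}))

strict = MergeOptions(merge_blank_names=True, compare_media=True,
                      compare_webtags=True, compare_notes=True)
```

With the default options none of the three would merge, because their names are blank. With the `strict` options, `a` and `b` count as duplicates but `c` does not, which is the "truly identical in all respects" behavior I am asking for.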

I could be wrong about one or two of them, but my understanding from testing is that the following citation data items are ignored for merging citations. Looking at it from the other direction, I think the only citation data that is taken into account is the citation name and the citation variables. So I think the following are not considered, even if they are different.

  • Research Note
  • Detail Comment
  • Detail Ref#
  • Citation Media
  • Citation Web tags
  • Quality

Quality is shown on the Edit Person screen as being associated with the citation. But it really isn’t associated with the citation, and shouldn’t be counted. Quality is really associated with an internal citation link. That’s because the same citation can have a different quality depending on the fact with which it is linked. For example, a death certificate citation may be linked to both a Birth fact and a Death fact. But it’s usually a better quality citation for the Death fact than it is for the Birth fact.
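
A tiny Python sketch of that idea, in case it helps. The class names, field names, and the quality labels "primary" and "secondary" are placeholders I made up, not RM's internal names or values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    # Hypothetical model: the citation itself carries no quality.
    citation_id: int
    name: str


@dataclass(frozen=True)
class CitationLink:
    # Hypothetical model: quality lives on the link between a citation and a fact.
    citation_id: int
    fact: str
    quality: str


death_cert = Citation(1, "Death certificate, John Smith")

# One citation object, two links, two different qualities.
links = [
    CitationLink(death_cert.citation_id, "Death", "primary"),
    CitationLink(death_cert.citation_id, "Birth", "secondary"),
]

quality_by_fact = {
    link.fact: link.quality
    for link in links
    if link.citation_id == death_cert.citation_id
}
```

Because the quality sits on the link, two citations can be merged without having to decide whose quality "wins": each link would keep its own value.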

When you create a new master source and citation, all the fields available are taken into consideration when merging all duplicates. The additional items you add later, i.e. repositories, media and webtags, are not. All of those items can support more than one link attached to them. The merge is designed to include all of them in the surviving citation. It will not take them into consideration when trying to determine if citations are duplicates or not. We no longer merge blank citations due to the chance the media and webtags are unrelated.
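
For anyone who wants to reason about it, the behavior described above can be sketched roughly like this in Python. This is only a model of what has been described in this thread, not the actual RM code; the `Citation` class, its field names, and the choice to include the source in the match key are assumptions.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Citation:
    # Hypothetical model; these are not RootsMagic's real table or column names.
    source_id: int
    name: str
    fields: tuple                      # citation field values that feed the footnote
    media: set = field(default_factory=set)
    webtags: set = field(default_factory=set)


def merge_duplicates(citations):
    """Merge citations whose source, name and fields match; skip blank names."""
    groups = defaultdict(list)
    untouched = []
    for c in citations:
        if not c.name.strip():
            untouched.append(c)        # blank names are never merged
        else:
            groups[(c.source_id, c.name, c.fields)].append(c)

    merged = []
    for dupes in groups.values():
        survivor = dupes[0]
        for other in dupes[1:]:
            # Media and web tags are ignored for matching but carried over.
            survivor.media |= other.media
            survivor.webtags |= other.webtags
        merged.append(survivor)
    return merged + untouched


citations = [
    Citation(1, "1900 Census, John Smith", ("p. 4",), media={"a.jpg"}),
    Citation(1, "1900 Census, John Smith", ("p. 4",), media={"b.jpg"}),
    Citation(2, "", (), media={"c.jpg"}),
    Citation(2, "", (), media={"c.jpg"}),
]
result = merge_duplicates(citations)
```

In this toy run the two named census citations collapse into one that holds both media items, while the two unnamed citations are left alone even though they are identical.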

Personally, I have been going through my blank citations looking for those with media and webtags so I can give them a unique citation name. It’s not just so they will merge. It’s so I can find what I’m looking for when going through my citations.

Do you know when they changed the duplicate merge to exclude blank citations? When I tried it in an earlier version of RM11 they did indeed merge citations that should have been different - to my dismay as I then had to disentangle all these merged citations and give them proper citation names which was an absolute pain. OK, I should have reverted to an earlier backup, but by the time I noticed what was wrong, it was much later and indeed too late. What I need now, unfortunately, is an option to separate references with blank names. I’m not holding my breath.

Do you recall which fields/info were different between the citations that merged?

Yes, this is true. The workaround, and it is not that bad, is to open the “Citation Used” window on the “Edit Citation” screen; from there you can delete the duplicates without causing them all to delete. Optionally, you can copy the citation just before you delete them all and then paste the citation back.

10.0.4 - 29 Jan 2025
Fixed: Merge all duplicate citations no longer merges citations with blank names and fields

This was an intentional change, so I do not see us going back to merging blank citations again. If you want to merge blank citations you will need to do so manually.

As Renee posted, the update was a little over a year ago. Finding the date of fixes is pretty easy. From the Home Menu, click on your current version # under “Updates”. It will open the Update History web page, which is pretty straightforward to read.