Problems when using RM8's Merge Duplicate Citations tool

This message is a little technical and very long. It concerns conditions under which RM8 cannot merge duplicate citations automatically. In some cases, the only way to handle the duplicates is to delete them manually.

The story starts with my RM7 database. I discovered a few dozen “almost duplicate” citations in it that I didn’t know were there until I imported the database into RM8 and ran RM8’s tools to merge duplicate sources and merge duplicate citations. The RM8 tools had no trouble merging “totally duplicate” citations, but they would not merge the “almost duplicate” ones.

The duplicate citations in my RM7 database that would not merge automatically in RM8 arose in one of two ways, which were really the same issue. A long time ago, I went through a phase where I would Drag and Drop a few families from my production RM database into a temporary database, work on the families there, and then Drag and Drop them back into my production database. After that, I would merge the people manually, and all would be well. Sort of.

I still had a problem because Drag and Drop will not merge identical custom source templates and I use only custom source templates rather than RM’s built-in templates. After I realized what was going on, I manually merged the identical source templates and identical sources with a little assistance from SQLite because RM will not merge source templates. So far, so good except that I didn’t realize at the time that some of the citations based on the source templates and sources I had merged were not quite identical. They looked identical and it didn’t matter much prior to RM8 because citations prior to RM8 could not be reused and hence were never really duplicate even when they were identical. I also quit doing the Drag and Drop thing.

Much more recently, I deleted a few hundred people from my RM7 database over a period of several days. This was quite intentional and was done for good and valid reasons. It was done very carefully one person at a time and no SQLite was involved. But I changed my mind about a few of them and wanted to put them back. This wasn’t really a mistake. I just changed my mind. I had made a good backup before starting the project to delete the people, so I restored the backup to a different folder and dragged and dropped those few people back into my production database, and all seemed well. Except that I had once again introduced duplicate source templates and duplicate sources into my database. Grrrr. I considered just doing a full restore from my good backup, but I didn’t want to lose the several days of work I had spent deleting those people, so I cleaned up the duplicate source templates and sources again, assisted by SQLite. I still didn’t realize that this whole process had left a problem for RM8.

RM stores the data you enter into source templates in database columns called Fields. The SourceTable has a Fields column that holds the data you enter into the Master Source portion of a source template. The CitationTable has a Fields column that holds the data you enter into the Citation Details portion of a source template. My source templates place all the data into the Master Source, so the Citation Details portion is null.

Here’s the really technical part. When you type sourcing data into RM using my templates, the null Citation Details is encoded as follows. It’s a string of XML.

<Root><Fields/></Root>

After a Drag and Drop using my templates, in addition to the template having been duplicated, the null Citation Details is encoded as follows.

<Root><Fields><Field><Name>Page</Name><Value/></Field></Fields></Root>

Both encodings in effect mean that there are no Source Details. But RM8’s Merge Duplicate Citations tool does not recognize that the two different encodings are both null and therefore will not merge otherwise duplicate citations. I’m again cleaning up this problem in RM7 so that duplicate citations are mergeable in RM8.
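For anyone who wants to check this outside RM, here is a minimal sketch in Python of how the two encodings above can be tested for “nullness”. It treats a Fields value as empty when no Field element carries a non-blank Value. The function name is my own invention for illustration, and the third sample string (with a page number filled in) is made up. Values in a real database may be stored as bytes rather than text, which this sketch does not handle.

```python
import xml.etree.ElementTree as ET


def fields_are_empty(fields_xml: str) -> bool:
    """Return True when no <Field> in the XML carries a non-blank <Value>."""
    root = ET.fromstring(fields_xml)
    for field in root.iter("Field"):
        # findtext gives "" for <Value/> and the default when <Value> is absent
        value = field.findtext("Value", default="")
        if value.strip():
            return False
    return True


# The two encodings quoted in the post
typed = "<Root><Fields/></Root>"
dropped = "<Root><Fields><Field><Name>Page</Name><Value/></Field></Fields></Root>"
# Hypothetical example of a citation that really does have a Page value
filled = "<Root><Fields><Field><Name>Page</Name><Value>p. 12</Value></Field></Fields></Root>"

for label, xml_text in (("typed", typed), ("dropped", dropped), ("filled", filled)):
    print(label, fields_are_empty(xml_text))
```

A comparison that went through a check like this would treat the typed and dragged-and-dropped encodings as equal, while a plain string comparison would not.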

The final case is the following. Suppose in RM7 (or RM8, for that matter) you accidentally memorized and pasted the exact same citation to the same person or fact twice. RM8’s Merge Duplicate Citations tool does not fix this problem. I’m not sure if it should be expected to do so. The trouble I have encountered is that Drag and Drop can create the exact same situation in either RM7 or RM8 if ever you Drag and Drop somebody out of your database and then back in and if you use custom source templates. That’s because the Drag and Drop changes the encoding of a null Fields value in the CitationTable.
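The situation just described can be illustrated with a small sketch. It groups citation records by owner, source, and a canonical form of the Fields value, and reports any group with more than one member. The record layout (citation id, owner id, source id, Fields) and the sample ids are assumptions for illustration only. They are not RM’s actual schema, and the function names are hypothetical.

```python
from collections import defaultdict
import xml.etree.ElementTree as ET

EMPTY = "<Root><Fields/></Root>"  # the encoding RM writes when data is typed in


def canonical_fields(fields_xml):
    """Collapse any all-blank encoding to EMPTY; leave real data untouched."""
    root = ET.fromstring(fields_xml)
    has_value = any((f.findtext("Value") or "").strip() for f in root.iter("Field"))
    return fields_xml if has_value else EMPTY


def repeated_citations(rows):
    """Return lists of citation ids that share owner, source, and canonical Fields."""
    groups = defaultdict(list)
    for citation_id, owner_id, source_id, fields_xml in rows:
        key = (owner_id, source_id, canonical_fields(fields_xml))
        groups[key].append(citation_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Hypothetical rows: 101 was typed in, 102 came back through Drag and Drop
rows = [
    (101, 7, 3, "<Root><Fields/></Root>"),
    (102, 7, 3, "<Root><Fields><Field><Name>Page</Name><Value/></Field></Fields></Root>"),
    (103, 8, 3, "<Root><Fields/></Root>"),
]
print(repeated_citations(rows))  # [[101, 102]]
```

Citations 101 and 102 are attached to the same owner and source and differ only in how the null Fields value is encoded, so they land in the same group. Citation 103 belongs to a different owner and is left alone.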

Why would a user Drag and Drop in that manner? Well, for example, they might use Drag and Drop to split their database into two databases and later use Drag and Drop to combine the two databases back together. Or they might use Drag and Drop as I did, to recover a few people from a restored database without having to revert to the entire restored database and thereby lose all intervening work.

As I said, this problem crept into my RM7 database and is now being carried forward into RM8. That’s why I’m fixing it in my RM7 database. But a new user who has only ever used RM8 could encounter the same problem if they used Drag and Drop. I think any solution to this problem is going to require the ability to merge duplicate source templates in any Drag and Drop operation, which really means in any GEDCOM import operation.
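To give a rough idea of what such a cleanup could look like with SQLite, here is a sketch in Python that rewrites every all-blank Fields value to the typed-in empty encoding. It runs against a small stand-in table created in memory, not against a real RM file. The CitationID column name and the TEXT storage type are assumptions, and the helper names are hypothetical. Anything like this should only ever be tried on a copy of a database, never on the production file.

```python
import sqlite3
import xml.etree.ElementTree as ET

EMPTY = "<Root><Fields/></Root>"
DROPPED = "<Root><Fields><Field><Name>Page</Name><Value/></Field></Fields></Root>"


def is_blank(fields_xml):
    """True when no <Field> carries a non-blank <Value>."""
    root = ET.fromstring(fields_xml)
    return not any((f.findtext("Value") or "").strip() for f in root.iter("Field"))


def normalize_empty_fields(conn):
    """Rewrite all-blank Fields values to EMPTY and return the number of rows changed."""
    changed = 0
    rows = conn.execute("SELECT CitationID, Fields FROM CitationTable").fetchall()
    for citation_id, fields_xml in rows:
        if fields_xml != EMPTY and is_blank(fields_xml):
            conn.execute(
                "UPDATE CitationTable SET Fields = ? WHERE CitationID = ?",
                (EMPTY, citation_id),
            )
            changed += 1
    conn.commit()
    return changed


# Stand-in table with assumed column names; a real RM table has more columns
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CitationTable (CitationID INTEGER PRIMARY KEY, Fields TEXT)")
conn.executemany(
    "INSERT INTO CitationTable (Fields) VALUES (?)",
    [(EMPTY,), (DROPPED,)],
)
changed = normalize_empty_fields(conn)
print(changed)
```

After the update, both rows in the stand-in table carry the same encoding, so a comparison based on the raw Fields text would see them as identical.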

If you are still awake and reading after all this, you might notice that the null value for the Fields column in the CitationTable after a Drag and Drop is actually the way a null Page value would be coded using the free form template. That might or might not be an accident, and it didn’t matter in RM7 because duplicate citations couldn’t be merged anyway. But it does matter in RM8 because you really do want duplicate citations to be merged unless they are those weird ones from TreeShare that are really different but which only differ in their media file. That’s because if you don’t merge duplicate citations, RM8 will include all the duplicated citations as distinct footnotes and endnotes in reports. Those duplicate footnotes and endnotes cannot be combined by any reporting options. If you run reports with footnotes or endnotes, RM8 therefore forces you to merge your duplicate citations from RM7 and to use Paste/Reuse when you Memorize and Paste citations.
