Notes are duplicated. The duplicate begins with MERGED NOTE

My RootsMagic 8 file has about a bazillion records with the Note field populated. The note is followed by
– MERGED NOTE ------------
and then a repeat of the original note.

I can find all of them with Search Everywhere but I hate to have to individually edit each record to remove the extraneous data. I’m looking for an easier way to get rid of this stuff.

Search and Replace can find and remove the “– MERGED NOTE ------------” text, but if the information following it also needs to be removed, you will have to do a manual cleanup.

Thanks. That’s exactly what I was afraid of. I’m sure this is a result of importing and converting years ago from PAF or whatever other format I was using.

It is a consequence of merging people. RM Inc preserves the Note values from both people and flags it with “– MERGED NOTE ----” so each instance can be easily found. In early RM4, it responded to the complaint that the flag was being inserted even if one of the values was blank. I don’t remember and have not checked of late to see if duplicate Note values are cleanly handled (i.e., only one copy and no flag).

The only way I can imagine your case to be resolved en masse is by a procedure using sqlite on the database or a text processing tool on a GEDCOM from it. That’s assuming you have confidence that you can delete everything in the Note after the flag.

It’s amusing to see how many RM generated public websites have the – MERGED NOTE ---- flag:
https://www.google.com/search?q=rootsmagic+“--+MERGED+NOTE+----”

If I can get rid of the header, the note left behind would be an exact duplicate of the note I want to keep. I need to do some research into removing duplicated notes.

Don’t get rid of the flag/header until after you have gotten rid of the duplicates. It’s your key to parsing the first note in the Note value from the rest to test for duplication. In the search results, I found examples of more than one flag, cases where the first note was duplicated but there was at least one more note that was different. These you would want to leave untouched for manual inspection and correction.
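
A minimal sketch of that test in Python, for anyone who wants to try it on note text pulled out of the database or a GEDCOM. The flag pattern (two plain hyphens, "MERGED NOTE", then a run of hyphens) and the function name `dedupe_merged_note` are assumptions for illustration, not anything RootsMagic provides. It only trims a note when there is exactly one flag and the text after it repeats the text before it, ignoring whitespace differences; everything else is returned unchanged for manual inspection.

```python
import re

# Assumed flag shape: "--", "MERGED NOTE", then one or more hyphens.
FLAG = re.compile(r"\s*--\s*MERGED NOTE\s*-+\s*")


def dedupe_merged_note(note: str) -> str:
    """Return only the first note when a single flag is followed by a repeat of it."""
    parts = FLAG.split(note)
    if len(parts) != 2:
        # No flag, or more than one flag: leave untouched for manual review.
        return note
    first, second = parts[0].strip(), parts[1].strip()
    # Compare word by word so differing blank lines or spacing do not matter.
    if first and first.split() == second.split():
        return first
    return note


if __name__ == "__main__":
    sample = (
        "1910 Census: Ohio, Logan, Bellefontaine, ED 132.\n\n"
        "-- MERGED NOTE ------------\n\n"
        "1910 Census: Ohio, Logan, Bellefontaine, ED 132."
    )
    print(dedupe_merged_note(sample))
```

Because the flag is what separates the first note from the rest, this only works while the flag is still in place.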

I guess something is not working quite right since you deleted this post, but I did see it and was going to say “Wow!” anyway after I came back to try and figure out how your regex works, and to see if I might be able to adapt it to a SQLite query.

Tom’s latter recommendation can be accomplished by exporting a full GEDCOM, opening it in Notepad++, and applying this form of regex:

Of course, for insurance, always back up before doing anything with your database.
Find what : ([0-5]\ CONT – MERGED)[\s\S]+?(?=(\R[^3]\ ))
Replace with : LEAVE THIS FIELD EMPTY

I thought it looked pretty cool, too. It brings up 2 other questions. 1. Does anyone have sample code for manipulating an rmtree file with sqlite? 2. If I export to GEDCOM and fix the data, do I just re-import it, or should I import it to a new rmtree file?

That is a great solution. Thanks!

Haha, I don’t have my computer with me. My hijinks are on someone else’s tablet with which I’m unfamiliar and inept.

I’d re-import to a new database first. If all looks well, back up your original, import the .GED and ~try~ the AutoMerge route.
EDIT: Actually I guess you’d have to switch to the new one. Having never done it… AutoMerge isn’t smart enough to dump the duplicates, (I don’t think) and manual merge or drag n’ drop (between databases) would be tedious.

Visit SQLiteToolsForRootsMagic.com, the descendant of a wiki I launched with a few others shortly after RM4 was released.
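
As a rough illustration of the sqlite route, here is a hedged Python sketch using the standard `sqlite3` module. `PersonTable` is mentioned elsewhere in this thread, but the `PersonID` and `Note` column names, the idea that notes are stored as plain text, and the function name `clean_notes` are all assumptions to check against your own file. Other tables may carry notes too. Run it only on a copy of the database.

```python
import sqlite3

FLAG = "-- MERGED NOTE"  # assumed start of the flag, written with two plain hyphens


def clean_notes(conn, table="PersonTable", key="PersonID", col="Note"):
    """Trim notes where the text after a single flag repeats the text before it.

    Table and column names are assumptions; returns the number of rows changed.
    """
    rows = conn.execute(
        f"SELECT {key}, {col} FROM {table} WHERE {col} LIKE ?", (f"%{FLAG}%",)
    ).fetchall()
    changed = 0
    for row_id, note in rows:
        if not isinstance(note, str) or note.count(FLAG) != 1:
            continue  # not text, or several flags: leave for manual review
        first, rest = note.split(FLAG, 1)
        second = rest.lstrip("- \r\n\t")  # drop the flag's trailing hyphens
        # Compare word by word so spacing and blank lines are ignored.
        if first.strip() and first.split() == second.split():
            conn.execute(
                f"UPDATE {table} SET {col} = ? WHERE {key} = ?",
                (first.strip(), row_id),
            )
            changed += 1
    conn.commit()
    return changed
```

Usage would look something like `clean_notes(sqlite3.connect("copy-of-my-file.rmtree"))`, where the file name is a placeholder. Notes whose second half differs from the first are deliberately skipped.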

The regex didn’t find anything. Here’s a piece of the gedcom:

1 NOTE 1910 Census: Ohio, Logan, Bellefontaine, ED 132.
2 CONT
2 CONT
2 CONT – MERGED NOTE ------------
2 CONT
2 CONT 1910 Census: Ohio, Logan, Bellefontaine, ED 132.

Great reference. Thanks very much!

Hmmm, as I could not find any useful examples in my GED files from various sources all the way back to RM4, I attempted to create some. Neither ShareMerge nor SmartMerge created any after a database was copied onto a copy of itself… that’s a good thing, I think. So I tried creating some controlled test people and Manual Merge which led me into freeze-ups and other mysteries that I don’t understand and to the usual level of frustration that deters me from using RM8.

  • Adding a new person seemed to automatically add it to the only group I had just created.
  • Selecting a duplicate person from the Manual Merge window if it is still open after having completed a merge freezes the system.
  • Somehow, a person disappeared from the main views and the side Index but would appear in the Select a Duplicate list of names. I discovered the name record in the NameTable with a non-zero OwnerID but there was no longer a corresponding record in the PersonTable which illustrates, I think, a procedure that got aborted midway - probably one of those Merges that froze.

Spaces are critical. There are none before the opening parenthesis or after the closing double parentheses… but there are (and must be) four spaces between them: one after each backslash, and one each before and after the dash mark. Find should highlight several successive lines with each press, and Replace All should delete all those highlighted little groups.

Aren’t there two hyphens preceding MERGED in the GEDCOM and so there should be in the REGEX to match? I think Discourse converts the two hyphens (–) into a ‘longdash’ character (my made-up name for it).

([0-5]\ CONT -- MERGED)[\s\S]+?(?=(\R[^3]\ ))

That said, NotePad is only finding the one line in the sample, nothing after.

Does the GEDCOM always have a “3” level line following the end of the 2 CONT series? If so then this modified version of Kevin’s regexp seems to work for me in Notepad:

([0-5]\ CONT -- MERGED)[\s\S]+?(?=(\R^3\ ))

Not in his example(s), so I just added that to a random GEDCOM downloaded from the web; I’m away from home and my .GEDs. If there are two, you would do as you’ve shown. EDIT: my goof… the "\R^3\ " is flawed, now that you mention it. The concept is to find any CONT preceded by 0-5 and a space, followed by the MERGED stuff, then include everything from the beginning of that through to where a new tag begins (signaled by a change in the level number that precedes the tag). I mistakenly used ^3 to detect a tag level number change because that’s the level my test CONT tag was at, but it can actually be any number 0-5 other than what it just was. I’ll have to look into it tomorrow. Sorry.
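
For what it’s worth, the same concept can be sketched outside Notepad++ in Python. This is only an illustration under assumptions: the flag is written with two plain hyphens in the file, the duplicate text continues only on CONT/CONC lines at the same level as the flag line, and the function name `strip_merged_blocks` is made up. A backreference to the captured level number stands in for "until the level changes". Python’s `re` has no `\R`, so line breaks are matched as `\r?\n`.

```python
import re

# The flag's CONT line, plus every CONT/CONC line that follows at the same level.
MERGED_BLOCK = re.compile(
    r"^(\d+) CONT -- MERGED NOTE[^\r\n]*"   # the flag line; group 1 is its level
    r"(?:\r?\n\1 CON[TC]\b[^\r\n]*)*"       # continuation lines at that same level
    r"\r?\n?",                              # the line break ending the block
    re.MULTILINE,
)


def strip_merged_blocks(gedcom_text: str) -> str:
    """Delete each flag line and the continuation lines after it."""
    return MERGED_BLOCK.sub("", gedcom_text)


if __name__ == "__main__":
    sample = (
        "1 NOTE 1910 Census: Ohio, Logan, Bellefontaine, ED 132.\n"
        "2 CONT\n"
        "2 CONT\n"
        "2 CONT -- MERGED NOTE ------------\n"
        "2 CONT\n"
        "2 CONT 1910 Census: Ohio, Logan, Bellefontaine, ED 132.\n"
        "1 BIRT\n"
    )
    print(strip_merged_blocks(sample))
```

Note that this removes everything after the flag without checking that it really is a duplicate, and it swallows any later flags in the same note as well, so it is only safe if you are confident nothing after the flag is worth keeping. The blank CONT lines before the flag are left in place.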