It is a consequence of merging people. RM Inc preserves the Note values from both people and flags the join with “– MERGED NOTE ----” so each instance can be easily found. Back in early RM4, RM Inc responded to a complaint that the flag was being inserted even when one of the values was blank. I don’t remember, and have not checked of late, whether duplicate Note values are cleanly handled (i.e., only one copy and no flag).
The only way I can imagine your case to be resolved en masse is by a procedure using sqlite on the database or a text processing tool on a GEDCOM from it. That’s assuming you have confidence that you can delete everything in the Note after the flag.
Don’t get rid of the flag/header until after you have gotten rid of the duplicates. It’s your key to parsing the first note in the Note value from the rest to test for duplication. In the search results, I found examples of more than one flag, cases where the first note was duplicated but there was at least one more note that was different. These you would want to leave untouched for manual inspection and correction.
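Here is a minimal Python sketch of that duplicate test, in case it helps. It assumes the flag is stored exactly as `-- MERGED NOTE ----` (what Discourse displays may not be what is in the database, so check your own data first). The function name `dedupe_merged_note` is made up for illustration. A note with no flag, or with more than one flag, is returned untouched so it can be inspected by hand.

```python
# Assumed flag text; verify it against a real Note value before relying on it.
FLAG = "-- MERGED NOTE ----"


def dedupe_merged_note(note, flag=FLAG):
    """Return the note with an exact duplicate after the flag removed.

    Only the simple case is handled: exactly one flag, and the text after
    it is identical (ignoring surrounding whitespace) to the text before it.
    Anything else is returned unchanged for manual inspection.
    """
    parts = note.split(flag)
    if len(parts) != 2:
        # No flag at all, or several flags: leave it alone.
        return note
    first, second = parts
    if first.strip() == second.strip():
        # True duplicate: keep one copy and drop the flag with it.
        return first.rstrip()
    # The two notes differ, so both are kept along with the flag.
    return note
```

The same function could be applied to Note values read from the database or to note text reassembled from a GEDCOM; either way, run it on a copy first.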
I guess something is not working quite right since you deleted this post, but I did see it and was going to say “Wow!” anyway after I came back to try to figure out how your regex works, and to see if I might be able to adapt it to a SQLite query.
I thought it looked pretty cool, too. It brings up 2 other questions. 1. Does anyone have sample code for manipulating an rmtree file with sqlite? 2. If I export to GEDCOM and fix the data, do I just re-import it, or should I import it into a new rmtree file?
I’d re-import to a new database first. If all looks well, back up your original, import the .GED and ~try~ the AutoMerge route. EDIT: Actually, I guess you’d have to switch to the new one. Having never done it… AutoMerge isn’t smart enough to dump the duplicates (I don’t think), and manual merge or drag n’ drop (between databases) would be tedious.
Hmmm, as I could not find any useful examples in my GED files from various sources all the way back to RM4, I attempted to create some. Neither ShareMerge nor SmartMerge created any after a database was copied onto a copy of itself… that’s a good thing, I think. So I tried creating some controlled test people and Manual Merge which led me into freeze-ups and other mysteries that I don’t understand and to the usual level of frustration that deters me from using RM8.
Adding a new person seemed to automatically add it to the only group I had just created.
Selecting a duplicate person from the Manual Merge window if it is still open after having completed a merge freezes the system.
Somehow, a person disappeared from the main views and the side Index but would appear in the Select a Duplicate list of names. I discovered the name record in the NameTable with a non-zero OwnerID but there was no longer a corresponding record in the PersonTable which illustrates, I think, a procedure that got aborted midway - probably one of those Merges that froze.
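For anyone wanting to look for the same kind of orphan, a read-only check might look something like the sketch below. The table names NameTable and PersonTable and the OwnerID column are the ones mentioned above; the PersonID key column and the function name `find_orphan_owner_ids` are my assumptions, so check them against your own file and only ever run this against a copy.

```python
import sqlite3


def find_orphan_owner_ids(conn):
    """List OwnerID values in NameTable that have no matching person.

    Assumes PersonTable has a PersonID key column (an assumption, not
    something confirmed in this thread). The query only reads data.
    """
    sql = """
        SELECT DISTINCT n.OwnerID
        FROM NameTable AS n
        LEFT JOIN PersonTable AS p ON p.PersonID = n.OwnerID
        WHERE n.OwnerID <> 0
          AND p.PersonID IS NULL
        ORDER BY n.OwnerID
    """
    return [row[0] for row in conn.execute(sql)]
```

Usage would be along the lines of `find_orphan_owner_ids(sqlite3.connect("copy-of-my-file.rmtree"))`, where the file name is just a placeholder.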
Spaces are critical. There are none before the beginning parentheses or after the ending double parentheses… but there are (and must be) four spaces between them: one each after the backslashes and one each before and after the dash mark. Find should highlight several successive lines on each press, and Replace All should delete all those highlighted little groups.
Aren’t there two hyphens preceding MERGED in the GEDCOM and so there should be in the REGEX to match? I think Discourse converts the two hyphens (–) into a ‘longdash’ character (my made-up name for it).
([0-5]\ CONT -- MERGED)[\s\S]+?(?=(\R[^3]\ ))
That said, NotePad is only finding the one line in the sample, nothing after.
Not in his example(s), so I just added that to a random GEDCOM downloaded from the web; I’m away from home and my .GEDs. If there are two, you would do as you’ve shown. EDIT: my goof … the "\R^3\ " is flawed, now that you mention it. The concept is to find any CONT preceded by 0-5 and a space and followed by the MERGED stuff, then include everything from the beginning of that through to where a new tag begins (signaled by a change in the level number that starts the line). I mistakenly used ^3 to detect a tag level number change because that was the level my test CONT tag was at, but it can actually be any number 0-5 other than what it just was. I’ll have to look into it tomorrow. Sorry.
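One way to express "any level number other than what it just was" is a backreference to the captured level. The Python sketch below is only an illustration of that concept, not a tested fix for the Notepad expression: it assumes two hyphens before MERGED, single `\n` line endings (Python's text mode gives you that), and that you really do want everything after the flag deleted without checking whether it is a duplicate. The names `MERGED_BLOCK` and `strip_merged_blocks` are made up.

```python
import re

# Group 1 captures the level digit of the flagged CONT line. The match then
# runs lazily until the next line that starts with a different level number
# (the negative lookahead on \1), or until the end of the text.
MERGED_BLOCK = re.compile(
    r"\n([0-5]) CONT -- MERGED[\s\S]*?(?=\n(?!\1 )\d+ |\Z)"
)


def strip_merged_blocks(gedcom_text):
    """Delete each flagged CONT line and the same-level lines that follow it."""
    return MERGED_BLOCK.sub("", gedcom_text)
```

Because it deletes unconditionally, notes with more than one flag or with differing text would lose content, so it is safest on a GEDCOM copy that can be compared against the original afterwards.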