–
… unless the string is in Preformatted text format: </> or (Ctrl+E)
- -- ---
Ah! Thanks. EDIT: my goof… the "\R^3\ " is flawed, now that you mention it. The idea is to find any CONT preceded by a digit 0-5 and a space and followed by the MERGED text, then include everything from the beginning of that line until a new tag begins (signaled by a change in the level number that starts the line). I mistakenly used ^3 to detect a tag level change because that was the level my test CONT tag was at… but it can actually be any number 0-5 other than what it just was. I’ll have to look into it tomorrow. Sorry.
My quickest thought this late is to run a series of regexes one after the other:
([1]\ CONT -- MERGED)[\s\S]+?(?=(\R^1\ ))
([2]\ CONT -- MERGED)[\s\S]+?(?=(\R^2\ ))
([3]\ CONT -- MERGED)[\s\S]+?(?=(\R^3\ ))
([4]\ CONT -- MERGED)[\s\S]+?(?=(\R^4\ ))
([5]\ CONT -- MERGED)[\s\S]+?(?=(\R^5\ ))
to screen NOTE tags at all possible tag levels
Do you see the MERGED NOTE flag line at any level other than 2? I thought it might only happen for a person's General Note, and that always starts at 1 under the 0 INDI tag. Maybe it happens for merging of Sources and Places, too, which would require other levels. I don’t remember, having last looked at it a decade ago.
I trust your understanding… I’m away from RootsMagic and not even sure I understand where the MERGED duplication originates. I only know that NOTE tags can be at different “levels” (I guess 1-5 can have NOTE tags) and thusly my latest was a broad brush that hopefully wouldn’t “corrupt”.
EDIT: Ah, you were likely aiming that question at rjmolloy and I now understand that RootsMagic merges have this potential, yipes!
Here’s an example of a GEDCOM with multiple -- MERGED NOTE ... flags: http://sites.rootsmagic.com/Worland/download.php?f=gedcom . It has them only at level 2 and, since some individuals have multiple flags, there have been multiple merges. Your
([2]\ CONT -- MERGED)[\s\S]+?(?=(\R^2\ ))
regex just finds the flag line. Revised:
^2\ CONT -- MERGED[\s\S]+?(?=(\R^[^2]\ ))
finds the flag line and everything down to the next change in level.
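For anyone who would rather test this outside a text editor, here is a rough Python sketch of the same idea as the revised regex. Python's `re` has no `\R`, so `\r?\n` stands in for it, and `\Z` is added so a flag block at the very end of the file is still caught. The sample GEDCOM lines and the `@I60@` id are made up for illustration, and the level is assumed to be a single digit.

```python
import re

def merged_block_pattern(level=2):
    """Regex for a '-- MERGED' flag line at `level` and every following line
    up to (not including) the first line that starts with a different digit."""
    return re.compile(
        rf"^{level} CONT -- MERGED[\s\S]+?(?=\r?\n[^{level}] |\Z)",
        re.MULTILINE,
    )

# Hypothetical GEDCOM fragment, for illustration only.
SAMPLE = "\n".join([
    "0 @I60@ INDI",
    "1 NOTE 6 Mendon St. when married.",
    "2 CONT Katherine Daley, sister, Maid of Honor",
    "2 CONT -- MERGED NOTE ------------",
    "2 CONT 6 Mendon St. when married.",
    "2 CONT Katherine Daley, sister, Maid of Honor",
    "1 BIRT",
    "2 DATE 1 JAN 1890",
])

def find_merged_blocks(text, level=2):
    """Return each flag line together with the lines that follow it at `level`."""
    return [m.group(0) for m in merged_block_pattern(level).finditer(text)]
```

Calling `find_merged_blocks(SAMPLE)` returns the flag line plus the two `2 CONT` lines after it and stops before `1 BIRT`. Passing another `level` covers flags at other depths, under the same single-digit assumption.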
What stands out is that there is not a lot of duplication within the NOTE so one really must inspect and compare the text preceding the flag to the text following it before deleting. I was anticipating that in thinking about how to proceed using sqlite…
I’ve been noodling in SQLite on this GEDCOM file, which brought out many exceptions to your case because of multiple merges with as many as five “-- MERGED NOTE ------------” flags. In some cases, the duplication might be between the 2nd and 4th segment or there might be triplicates. I have developed a way to keep just the first note when the rest is just one duplicate, which is what you described. The procedure could work only on those, leaving the rest to be manually inspected and edited.
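To make the "simple case only" rule concrete, here is a minimal Python sketch. It is not the SQLite procedure described above, just an illustration of the same rule. It assumes the flag is a line of the form `-- MERGED NOTE` followed by hyphens, and it treats two segments as duplicates when they match after collapsing whitespace. The function names are hypothetical.

```python
import re

# Assumed flag format: a whole line reading "-- MERGED NOTE" plus a run of hyphens.
FLAG = re.compile(r"^[ \t]*-- MERGED NOTE -+[ \t]*$", re.MULTILINE)

def normalize(segment):
    """Collapse all whitespace so trivially different copies compare equal."""
    return " ".join(segment.split())

def dedupe_simple(note):
    """Keep the first segment when the note is exactly: text, one flag, same text.
    Any other shape (several flags, differing text) is returned unchanged
    so it can be inspected and edited by hand."""
    segments = FLAG.split(note)
    if len(segments) == 2 and normalize(segments[0]) == normalize(segments[1]):
        return segments[0].rstrip()
    return note
```

Notes with two or more flags, or with text that differs on either side of the flag, come back untouched, which matches the plan of leaving those for manual review.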
Is that of interest to you?
I am on the verge of having a C# application working that will scan each of the 9 tables in the .rmtree SQLite database which contain a “Note” column. I’m only looking for the first “MERGED NOTE” and either removing the duplicate or, if not a duplicate, appending it to the primary note. I’m thinking that, if there are multiple “MERGED NOTE” entries, rerunning the program will pick up the additional one.
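The C# program itself is not shown in the thread, so the following Python sketch only illustrates the approach as described: handle the first flag, drop the merged text if it duplicates the primary note, otherwise append it, and rerun for any further flags. Finding tables by looking for a column named `Note` is an assumption about how one might locate the nine tables, and the sketch has only been checked against a plain in-memory table, not a real .rmtree file.

```python
import re
import sqlite3

# Assumed flag format; the match consumes the whole flag line and its newline.
FLAG_LINE = re.compile(r"^[ \t]*-- MERGED NOTE -+[ \t]*\r?\n?", re.MULTILINE)

def resolve_first_merge(note):
    """Process only the first flag; later flags are left for a rerun."""
    first = FLAG_LINE.search(note)
    if first is None:
        return note
    primary, rest = note[:first.start()], note[first.end():]
    nxt = FLAG_LINE.search(rest)
    merged, tail = (rest, "") if nxt is None else (rest[:nxt.start()], rest[nxt.start():])
    if " ".join(merged.split()) == " ".join(primary.split()):
        return primary + tail            # duplicate: drop the merged copy
    return primary + merged + tail       # not a duplicate: append to the primary note

def note_tables(conn):
    """Names of tables that have a column called Note."""
    names = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
    return [t for t in names
            if any(col[1] == "Note" for col in conn.execute(f'PRAGMA table_info("{t}")'))]

def clean_notes(conn):
    """Run one pass over every Note column; returns the number of rows changed."""
    changed = 0
    for table in note_tables(conn):
        rows = conn.execute(
            f'SELECT rowid, Note FROM "{table}" WHERE Note LIKE ?', ("%MERGED NOTE%",)
        ).fetchall()
        for rowid, note in rows:
            new = resolve_first_merge(note)
            if new != note:
                conn.execute(f'UPDATE "{table}" SET Note = ? WHERE rowid = ?', (new, rowid))
                changed += 1
    return changed
```

Calling `clean_notes` repeatedly until it returns 0 works through notes that carry several flags. One consequence of appending is that a later segment gets compared against the combined primary note, so a duplicate of an already-appended segment would not be recognised.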
Sounds good. From my limited exposure, I think the only table in which you will find the “MERGED NOTE” flag is PersonTable. However, covering all the possible cases is a good general solution.
I have created a second sqlite script that addresses quasi-duplication among the merged components in any order. I’ll post it tomorrow to sqlitetoolsforrootsmagic.com. It works only on the PersonTable and executes in milliseconds on a 25,000-person database, of which some 400 have (had) the merged note flag.
That’s great. Thanks very much!
Can I have the name of the script and will you let me know when it is posted?
Thanks.
Here’s my sqlite script that should address your problem:
https://sqlitetoolsforrootsmagic.com/remove-duplicate-merged-notes/
Please let me know how it works out!
I’m not sure it’s doing anything. In person #60, I have this note and it doesn’t change after running the script. I’ll attach my rmtree in case you want to look at it.
6 Mendon St. when married.
Katherine Daley, sister, Maid of Honor
5 Field St. new address Dec. 1913
Lot 87 Section St. James
Retired 1955 U.S. Equipment Offices
– MERGED NOTE ------------
6 Mendon St. when married.
Katherine Daley, sister, Maid of Honor
5 Field St. new address Dec. 1913
Lot 87 Section St. James
Retired 1955 U.S. Equipment Offices
(Attachment Molloy.rmtree is missing)
Bob, I pm’d you yesterday with a response and my email address to send your file to because the settings for this Discourse platform preclude attachments.
This morning, I copied the Note from your post and, of course, my script could not find the -- MERGED NOTE ------------ flag because Discourse converted the leading pair of hyphens to a dash: – MER… Corrected the dash and all ok.
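If text has been copied out of a forum post and the flag no longer matches, a small Python sketch like this can put the hyphens back before testing. It assumes the only damage is a single en dash or em dash where the leading `--` used to be. The function name is hypothetical.

```python
import re

# A line starting with an en dash (U+2013) or em dash (U+2014), then "MERGED NOTE".
MANGLED_FLAG = re.compile(r"^[\u2013\u2014][ \t]*(?=MERGED NOTE)", re.MULTILINE)

def restore_flag_hyphens(text):
    """Replace the forum-rendered dash with the original leading '-- '."""
    return MANGLED_FLAG.sub("-- ", text)
```

Flag lines that still start with two hyphens are left untouched.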
Question is whether you followed the instructions. Executing the whole script (F9) does not change the data in the PersonTable; it presents you with a table to compare the draft new Note to the original|current Note. You have to select|highlight the UPDATE statement in the comments at the bottom of the script and execute just it (Ctrl+F9) to apply the new Note to the PersonTable. I set it up that way so that unwanted outcomes might be found and mitigated by editing the original Note BEFORE the database is changed.
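The two-step design (look first, update second) can be sketched in Python as below. This is not the posted script, only an illustration of the same workflow. It assumes PersonTable has `PersonID` and `Note` columns, and `transform` stands for whatever function produces the draft note.

```python
import sqlite3

def preview_changes(conn, transform):
    """Return (PersonID, current_note, draft_note) for each flagged row that
    `transform` would change. Nothing is written to the database."""
    rows = conn.execute(
        "SELECT PersonID, Note FROM PersonTable WHERE Note LIKE '%MERGED NOTE%'"
    ).fetchall()
    previews = []
    for person_id, note in rows:
        draft = transform(note)
        if draft != note:
            previews.append((person_id, note, draft))
    return previews

def apply_changes(conn, previews):
    """Write the drafts back. Meant to be called only after reviewing the preview."""
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "UPDATE PersonTable SET Note = ? WHERE PersonID = ?",
            [(draft, person_id) for person_id, _current, draft in previews],
        )
```

Keeping the preview separate from the update means an unwanted draft can be spotted, and the original Note edited, before anything in the database changes.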
I think that’s what I did. It’s my first time running one of these SQLite scripts so there is every possibility that I screwed it up.
I’ve updated the page with more explanation and revised the script to facilitate the comparison when the original Note has leading whitespace (the new Note does not).