RM9 for Mac crashes with Unicode quotes in citation research notes

TL;DR: If you use the Unicode quotes “ and ” (U+201C and U+201D) in a citation research note, RM9 for Mac will crash. This behaviour doesn’t occur with RM9 for Windows. The workaround is to use the search and replace tool to replace these Unicode quotes with the straight " quote mark.

First, I have raised this rather obscure RM9 for Mac crash (along with the search and replace workaround) with the RootsMagic Support team. I thought I would also post it in this forum in case other people run into a similar problem.

Over many years, I have been adding transcriptions from emails, family tree bible extracts, newspaper articles, etc. as citation transcription text in Ancestry. When I downloaded my Ancestry family tree into RM9 for Mac, clicking on the source icon for a few citations caused the program to crash.

For example, if I enter a citation detail research note of “some text” (including the quotation marks) into RM9 for Mac, the program will crash. Changing the research note to use straight quotes, e.g. "some text" (including the quotation marks), resolves the problem.
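For anyone who would rather clean up a transcription before pasting it into RM, here is a minimal Python sketch of the same replacement. The function name `straighten_double_quotes` is made up for illustration and none of this is part of RootsMagic. It only swaps U+201C and U+201D for the plain " character and leaves the single curly quotes alone.

```python
# Hypothetical helper, not a RootsMagic feature: map the two curly
# double quotes (U+201C, U+201D) to the plain ASCII double quote.
CURLY_TO_STRAIGHT = str.maketrans({"\u201c": '"', "\u201d": '"'})


def straighten_double_quotes(note: str) -> str:
    """Return the note with curly double quotes replaced by straight ones."""
    return note.translate(CURLY_TO_STRAIGHT)


if __name__ == "__main__":
    print(straighten_double_quotes("\u201csome text\u201d"))
```

The single quotation marks (U+2018, U+2019) are deliberately not touched here, since only the double ones were reported as a problem.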

Anyhow, I thought I would post this in case it helps someone having a strange crash with citations. 🙂

Are Unicode Left & Right single quotation marks (U+2018, U+2019) working?

This wild guess is more for the RM support team than for the user community, but here goes anyway.

It is well known that RM uses a special collating sequence called RMNOCASE for its database columns which contain text data. There have been reports that the RMNOCASE collating sequence on a Mac is different than the collating sequence on Windows. See “Sharing a File Between Mac and Windows”.

So there must be something different about RM’s RMNOCASE on a Mac and RM’s RMNOCASE on Windows, and I wonder if that difference is the root cause of the Unicode quotes causing crashes on a Mac.

Quotes do not inherently have case the way letters of the alphabet do, so it’s hard to picture why RM’s RMNOCASE would be involved in this problem. It’s just that the Unicode quotes problem differs between a Mac and a PC, RM’s RMNOCASE differs between a Mac and a PC, and RM’s RMNOCASE deals with Unicode characters. And a collating sequence has to deal with all characters, not just letters of the alphabet.

Thanks for posting. I can recreate this issue and crash RM9 by adding the Unicode U+201C and U+201D quotes. RM seems to crash when these characters are added in any Edit Note pane for sources or citations. Surprisingly, I can add them in the Fact Note field without causing a crash, and I can also add just one of the two quotes without causing a crash. It is the two characters together that crash RM9 when it tries to write the edit to the database (upon leaving the Edit Note pane). I don’t keep older versions of RM9, but RM8 behaves the same way, so perhaps this issue is due to something that changed on the macOS side. I am running Sonoma 14.1.2 and RM9.1.2/RM8.5.0. Unicode left and right single quotation marks (U+2018, U+2019) do not cause an issue for me.

Finally, after doing some additional testing, I notice that both RM9 and RM8 now crash when trying to view some of my existing source citations. This is new behavior so I wonder if it’s more than these 2 chars because I typically only use standard keyboard characters. Strange indeed.

That’s a really good thing. That surely means that the developers can also recreate it.

I think that may be a red herring. Note fields are not indexed in the SQLite database so RMNOCASE does not come into play. Rather, I would suspect the application is getting the delineation of the Note variable confused by these quotation marks. That may result in overwriting of other variables or possibly even some codespace resulting in the crash.

Yes. Also, I just opened the RM8 and RM9 files under Windows and confirmed that the issue I noted, where viewing some of my existing source citations crashed the program, was caused by the same characters. Apparently they came across via copy/paste from the source document. As Teddy noted, using Search and Replace on source and citation fields is a valid workaround until this issue gets sorted.

[edit] True confessions: I had to manually edit the RM8 version in Windows, as search and replace does not work on source fields in RM8.
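For anyone who wants to hunt for these characters outside RM, here is a rough Python sketch that walks every table in a SQLite file and reports text values containing both U+201C and U+201D. It deliberately assumes nothing about RM's table or column names; `find_curly_double_quotes` and the `DemoNotes` table are made up for illustration, and the demo runs on an in-memory database, not a real RM file. I have not tried it on an actual RM database, where the RMNOCASE collation or tables without a rowid could get in the way, so treat it as a starting point and only ever point it at a copy.

```python
import sqlite3

CURLY_DOUBLE = ("\u201c", "\u201d")


def find_curly_double_quotes(conn):
    """Return (table, column, rowid) for text values holding both U+201C and U+201D."""
    hits = []
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'")]
    for table in tables:
        # Column names are the second field of each PRAGMA table_info row.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        for column in columns:
            query = f'SELECT rowid, "{column}" FROM "{table}"'
            for rowid, value in conn.execute(query):
                # Only plain text values are checked; BLOBs are skipped.
                if isinstance(value, str) and all(ch in value for ch in CURLY_DOUBLE):
                    hits.append((table, column, rowid))
    return hits


# Demo on a throwaway in-memory database (hypothetical table name).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE DemoNotes (Note TEXT)")
demo.executemany(
    "INSERT INTO DemoNotes (Note) VALUES (?)",
    [("\u201csome text\u201d",), ('"some text"',), ("\u2018test\u2019",)],
)
print(find_curly_double_quotes(demo))
```

The check looks for both characters in the same value because that is the combination reported above as triggering the crash.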

Upon further reflection, I’m sure you are correct. But at least it seems possible to recreate the problem so maybe it won’t be too hard to fix.

Confirming issue has been reported to development.

Are Unicode Left & Right single quotation marks (U+2018, U+2019) working?

@kbens0n I added the wording ‘test’ (including the single quotation marks of U+2018 and U+2019) to the citation research note with no issue.

In which fields do the fancy quotes cause a crash?
Am I right in saying Source Text, Source Comment, Research Note, and Detail Comment (using the names from the RM UI)?

If so, then it’s not XML processing, since those are simple UTF-8 encoded TEXT columns in the database. All the XML is in the Fields column.
Maybe the problem is in HTML processing in the current note editor. I’m willing to blame everything on it…
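As a quick reference on the encoding side, here is a small Python sketch that prints the UTF-8 bytes for the four quote characters discussed in this thread. The helper name `utf8_hex` is just for illustration. All four come out as three-byte sequences that differ only in the last byte, so the encoding by itself does not obviously explain why the double quotes crash and the single ones do not.

```python
import unicodedata

# The two double quotes that crash, and the two single quotes that do not.
QUOTES = ["\u201c", "\u201d", "\u2018", "\u2019"]


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 encoding of a character as space-separated hex bytes."""
    return ch.encode("utf-8").hex(" ")


for ch in QUOTES:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch):<28}  {utf8_hex(ch)}")
```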

@RichardOtter It occurred in research notes from my family tree, but after reading your post I created a new freeform source citation and pasted in the fancy double quotes. It crashes on the source fields Footnote, Short Footnote, Bibliography, Source Text, and Source Comment.

These all show the Edit Note form, and it is on exiting this form that the crash occurs: “Notes: PC register does not match crashing frame (0x0 vs 0x7FF8A41C6A78)”.

The single quotation marks (U+2018, U+2019) mentioned previously all work fine.

I agree with that possibility. What I can’t reconcile is that apparently the problem doesn’t happen with fact notes and the fact notes use the same note editor as do the notes in RM’s sourcing system. Is there something different about RM’s note editor when it is editing fact notes than when it is editing notes in RM’s sourcing system?

I don’t think we can rule out XML as the culprit. Quotes are delimiters and XML has to be parsed. Perhaps the XML parser that RM is using is becoming confused by the Unicode quotes. I don’t know how many special characters an XML parser would need to recognize, but surely there would need to be some mechanism to escape any text that looks like XML text. And perhaps that escape mechanism would involve the use of the Unicode quotes as special characters to encode the escape mechanism.
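For what it's worth, in standard XML the characters that need escaping are `<`, `>`, `&`, and straight quotes inside attribute values; the curly quotes are ordinary character data. The Python sketch below shows that with the standard library. It says nothing about what RM's own XML library does, and the `Note` element name is made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

note = "\u201csome text\u201d"

# escape() only rewrites &, < and > by default; curly quotes pass through.
escaped = escape(note)

# Wrap the text in a hypothetical element and parse it back.
xml_text = f"<Note>{escaped}</Note>"
round_tripped = ET.fromstring(xml_text).text

print(escaped == note)
print(round_tripped == note)
```

So a conforming parser should have no reason to treat U+201C or U+201D specially, although a bug in a particular library is of course still possible.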

Bruce will figure it out, but the problem will be more difficult to solve if the error is in the XML parsing library that RM is using. Or for that matter, the problem will be more difficult to solve if instead of the XML processing, the error is in the library RM is using to support the note editor.

That is strange. I tried entering “ and ” (U+201C and U+201D) in a fact note after reading your post and, as you indicated, there is no crash on exiting the Edit Note form.

@rzamor1 - To add some additional detail… I just confirmed that the same RM8 db that crashes RM8.5.0 running on the latest macOS Sonoma 14.1.2 does not crash RM8.5.0 running on an older macOS Big Sur 11.7.6.

This explains why I was initially able to add the Unicode quotes (U+201C and U+201D) into the source text field without experiencing this crash. (By the way, the source was the 1950 Census that was released in 2022. The offending Unicode characters were copied from the census’s About page.)