Copying of Facts

Hello from Norway.

I would like to thank you for your commitment to the case with the letters Æ, Ø and Å. As far as I have heard, RM said many years ago that they should go international.

The program supports UNICODE, and we hope it will become available in several languages. We have offered to help translate it into Norwegian if possible. TMG was once translated into Norwegian and it worked excellently.

I am the leader of the Norwegian user group for RM. Although I am not a computing expert, we have members who work, or have worked, in the field professionally.

Hilse fra Tennessee. Jeg bodd i Norge fra 1985 til 1987, og jeg huske liten Norske, men bare liten. (Greetings from Tennessee. I lived in Norway from 1985 until 1987, and I remember a little Norwegian, but only a little.) I still remember attending a conference in Germany and trying to write an email back to my employer in Norway. I was composing in my poor Norwegian while using a German keyboard that had all the letters needed for German and none of the letters needed for Norwegian.

I later served on a committee that evaluated the original draft of the proposal for ISO10646 - a character set to support all languages and character sets in the world. It was a committee of thousands, so I certainly learned far more than I contributed. In any case, the very same people who developed the ISO10646 proposal ended up voting it down because of its complexity. A committee had tried to invent a horse and had ended up with a camel.

Between living in Norway and serving on the committee, I learned a lot about alphabets and character sets. For example, we talk in English about “dotting all the i’s and crossing all the t’s”. But there are languages that have both a dotted i and an undotted i, and they are different letters. Genealogy software is going to have to understand things like that.
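To make the dotted and undotted i point concrete, here is a small Python sketch (Python is used only for illustration; it is not what RM is written in). It assumes Turkish-style rules, where i pairs with İ and ı pairs with I. The helper name upper_turkish is hypothetical and is not part of any library.

```python
# Default (language-neutral) upper-casing maps both the dotted i and the
# dotless i (U+0131) to the same capital I, so the distinction is lost.
DOTLESS_SMALL_I = "\u0131"    # ı
DOTTED_CAPITAL_I = "\u0130"   # İ

def upper_turkish(text: str) -> str:
    """Hypothetical helper: upper-case text using Turkish-style i rules."""
    # Handle the two special letters first, then let the default
    # mapping take care of every other character.
    text = text.replace("i", DOTTED_CAPITAL_I)
    text = text.replace(DOTLESS_SMALL_I, "I")
    return text.upper()

default_dotted = "i".upper()                # capital I
default_dotless = DOTLESS_SMALL_I.upper()   # also capital I
turkish_dotted = upper_turkish("i")         # İ (U+0130)
turkish_dotless = upper_turkish(DOTLESS_SMALL_I)  # capital I
```

With the default mapping the two lower-case letters collapse into one capital; with the language-aware helper they stay distinct.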

Shortly after the ISO10646 proposal was voted down, two guys in Silicon Valley invented UNICODE at a restaurant where they wrote the basic design down on a cocktail napkin. UNICODE became a success, and RM uses UNICODE - actually RM uses the UTF-8 encoding of UNICODE.

The upshot of all this is that I have a lot of interest in character sets and internationalization of software. It’s a very non-trivial problem. RM has stated that they have future plans for multi-language support in their product. I don’t know what such support will look like. But I do know that such support will have to be able to handle collation issues appropriately and upper case and lower case issues appropriately. It can’t simply treat a letter in one language as if it’s a different letter in a different language. And it can’t do case insensitive ordering for English letters and not do case insensitive ordering for non-English letters.


Agree with most, but setting a language for a database file seems worrisome. My database is about half English and half German. Which would I choose?

It’s a good question with perhaps not an easy answer.

How would you wish for Ü and ü to work in RM? Suppose it’s a perfect world and all you had to do was to wave a magic wand to get Ü and ü to work the way you wish. How would you wish them to work?

I would wish for Ü and ü to be treated as unequal in a case sensitive search or compare and to be treated as equal in a case insensitive search or compare. That’s independent of ordering. That should work correctly if you choose for your database to be English or German. That’s just my opinion, and I think it would be hard to find people who would disagree on this one.

I wish for Ü and ü never to be treated as equal to U and u in any search or compare. I think that should be true in English or German. That’s just my opinion, and there would probably be many people who would disagree. Perhaps this is a case where Ü and ü would be treated as equal to U and u in English and not equal to U and u in German.
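The two wishes above (Ü and ü differ only by case, and neither is ever equal to U or u) can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not a description of how RM compares text. The function names are hypothetical; unicodedata.normalize and str.casefold are standard Python.

```python
import unicodedata

def _nfc(text: str) -> str:
    # Normalize so that "U" + combining diaeresis equals the single letter Ü.
    return unicodedata.normalize("NFC", text)

def equal_case_sensitive(a: str, b: str) -> bool:
    """Hypothetical helper: exact match after normalization."""
    return _nfc(a) == _nfc(b)

def equal_case_insensitive(a: str, b: str) -> bool:
    """Hypothetical helper: match ignoring case but keeping diacritics."""
    fold_a = _nfc(_nfc(a).casefold())
    fold_b = _nfc(_nfc(b).casefold())
    return fold_a == fold_b

upper_u_umlaut = "\u00dc"  # Ü
lower_u_umlaut = "\u00fc"  # ü

sensitive_match = equal_case_sensitive(upper_u_umlaut, lower_u_umlaut)      # False
insensitive_match = equal_case_insensitive(upper_u_umlaut, lower_u_umlaut)  # True
umlaut_vs_plain = equal_case_insensitive(upper_u_umlaut, "U")               # False
```

Treating Ü as equal to U in an English database would be a separate, deliberate step (stripping diacritics before comparing), which this sketch leaves out on purpose.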

I wish for Ü and ü to sort after Z if the database is set to English and to sort wherever they sort in German if the database is set to German. That’s just my opinion, and there would probably be many people who would disagree. Perhaps this is a case where Ü and ü should sort with U and u in English and should sort wherever Ü and ü sort naturally in German if the database is set to German.
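A per-language sort order like the one wished for above could be sketched as a sort key that depends on a database language setting. Everything here is an assumption for illustration: the function name sort_key, the language codes "en" and "de", and the choice to sort ü with u for German (one of the conventions in use, not the only one).

```python
import unicodedata

def sort_key(name: str, language: str):
    """Hypothetical helper: build a case-insensitive sort key per language."""
    folded = unicodedata.normalize("NFC", name).casefold()
    if language == "de":
        # Assumed German rule: compare on the base letters first (ü as u),
        # and use the accented form only to break ties.
        decomposed = unicodedata.normalize("NFD", folded)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        return (base, folded)
    # Assumed English rule: plain code point order of the folded text,
    # which places ü (U+00FC) after z (U+007A).
    return (folded,)

names = ["Zimmer", "\u00dcbel", "Ulrich", "ulm"]  # "\u00dcbel" is Übel

english_order = sorted(names, key=lambda n: sort_key(n, "en"))
german_order = sorted(names, key=lambda n: sort_key(n, "de"))
# english_order: ulm, Ulrich, Zimmer, Übel
# german_order:  Übel, ulm, Ulrich, Zimmer
```

In both orders upper and lower case are treated alike, for accented and unaccented letters equally; only the position of Ü changes with the language.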

Those suggestions would certainly satisfy my needs.
I guess my main issue was the upper/lower case problem.

It seems to me that RM could fix that issue with existing tools. I certainly could be wrong. But it seems to me that if a collating sequence that sorts Ü and ü the same as U and u can be created, and if a collating sequence that sorts A and B the same as a and b respectively can be created, then a collating sequence that sorts and searches and compares Ü the same as ü can be created.

I’m a C++ programmer, so I’m not sure about other languages. The C++ tools I’m aware of for comparing character strings don’t have functions that will compare Ü and ü the same. Such libraries only work for upper and lower case English letters. Essentially the tools just work for traditional ASCII characters and not for UNICODE characters. But these tools are just simple functions. New functions can be written. The C++ tools I’m aware of won’t treat Ü and ü the same as U and u, either. Yet RM did something to make it happen.
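To show the difference between an ASCII-only comparison and a new function that knows about UNICODE case, here is a rough sketch in Python rather than C++, purely because it is short. The names ascii_lower and compare_nocase are hypothetical; the sketch is not RM's code and makes no claim about how RM does it.

```python
def ascii_lower(text: str) -> str:
    """Hypothetical helper: lower-case only A-Z, like classic ASCII routines."""
    return "".join(chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in text)

def compare_nocase(a: str, b: str, fold) -> int:
    """Hypothetical three-way compare: negative, zero, or positive.

    `fold` is the case-folding function to apply before comparing.
    """
    folded_a, folded_b = fold(a), fold(b)
    return (folded_a > folded_b) - (folded_a < folded_b)

upper_name = "M\u00dcLLER"  # MÜLLER
lower_name = "m\u00fcller"  # müller

# The ASCII-only fold leaves Ü untouched, so the names compare as different.
ascii_result = compare_nocase(upper_name, lower_name, ascii_lower)

# A UNICODE-aware fold maps Ü to ü, so the names compare as equal.
unicode_result = compare_nocase(upper_name, lower_name, str.casefold)
```

The comparison function itself is the same in both cases; only the folding step changes, which is why a replacement function is a small piece of work.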