Excessive Blank Lines in *.docx Files for RM's Descendant Narrative Reports

Windows 10 Professional, RM 10.0.2.0 (64 bit)

This topic has come up before, but I wanted to focus on it again. Here is a short snippet of a Descendant Narrative (NEHGS) saved as an *.pdf file.

Here is the same short snippet saved as an *.docx file. There are extra blank lines. These kinds of blank lines appear throughout the entire report. They are not there when you print directly from RM and they are not there when you save the file as an *.pdf file.

If you click on the pilcrow (backwards P) in the Microsoft Word ribbon, you can see extra paragraph marks or extra new line marks. It’s difficult to know exactly which are extra and which are the ones that are supposed to be there. You can also see the XE records (index records) which are not printed but which cause entries to be made in the Name Index and in the Place Index.

Frustrating though they may be, these particular extra blank lines and many similar blank lines throughout the report can be eliminated in just a few seconds with a global search and replace. For example, this particular sequence of extra blank lines can be fixed in MS Word by replacing ^l^p^l^p with ^p^p, where ^p and ^l are the MS Word search-and-replace codes for the paragraph mark and the new line mark. Dozens if not hundreds of the extra blank lines throughout the report can be fixed in one fell swoop. I would like to see the problem fixed, but I can live with it if it is not fixed.
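For anyone who finds it easier to read as code, here is a minimal sketch of what that replacement does. It is not Word automation; it assumes a hypothetical plain-text model of the document in which "\v" stands in for the new line mark (^l) and "\r" for the paragraph mark (^p). The sample sentence is made up.

```python
# Hypothetical plain-text model of the document (an assumption, not Word's API):
# "\v" stands in for the new line mark (^l), "\r" for the paragraph mark (^p).
LINE_BREAK = "\v"
PARA_MARK = "\r"


def collapse_blank_lines(text):
    """Replace every ^l^p^l^p sequence with ^p^p."""
    target = LINE_BREAK + PARA_MARK + LINE_BREAK + PARA_MARK
    replacement = PARA_MARK + PARA_MARK
    return text.replace(target, replacement)


# Made-up sample text with one run of extra marks in the middle.
sample = "John Doe was born in 1900.\v\r\v\rHe married Jane Roe.\r"
fixed = collapse_blank_lines(sample)
```

Every occurrence is replaced in a single pass, which is why the global search and replace clears so many of the extra blank lines at once.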

However, here is another example that’s not so easy to fix.

The reason this one is not so easy to fix is not so easy to see, but here is the same text using the pilcrow again to show the paragraph marks and new line marks.

The problem is that there is an XE record embedded in the middle of the paragraph marks and new line marks. It’s the XE record for Tennessee:Loudon County:Lenoir City. The issue is not just that this particular XE record is in the wrong place, although it is. Rather, it shouldn’t be there at all, because the same XE record appears where it’s supposed to be a few lines lower as part of the Marriage record.

The reason the XE record that is embedded in the middle of the paragraph marks and new line marks is such a problem is that it makes it impossible to fix the extra blank lines with a global search and replace. I can fix them, but I have to fix them one at a time as a manual process.

This particular problem happens every single time a person has more than one spouse and has children with any of the spouses. The extra XE record that is so much of a problem appears between the last child for one spouse and the marriage record for the next spouse.

The process of fixing this problem is not 100% manual. I can find the locations in the *.docx file that have the problem by turning the pilcrow off and searching for the string ^l^p^l^p ^l^l. Notice that there is a blank in the search string where the XE entry is located. I have to delete the found text with the Delete key and then press the Enter key one time to enter one paragraph mark. This would all be much easier except that MS Word does not handle the XE records very well in searches and replaces. In any case, what I have done for my report for this year’s family reunion is to do the global search and replace for as many extra blank lines as possible. For the rest, I fixed them one at a time, finding them with the ^l^p^l^p ^l^l search string.
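To illustrate the pattern I am hunting for, here is a rough sketch in Python. It does not operate on a real *.docx file. It assumes a hypothetical text model in which "\v" is the new line mark (^l), "\r" is the paragraph mark (^p), and the index field is written out literally as { XE "..." }. The sample text is made up apart from the place name.

```python
import re

# Hypothetical text model (an assumption, not how Word stores the file):
# "\v" = new line mark (^l), "\r" = paragraph mark (^p),
# and an index field appears literally as { XE "..." }.
PATTERN = re.compile(r'\v\r\v\r\{ XE "[^"]*" \}\v\v')


def fix_spouse_gaps(text):
    """Replace each run of marks plus the stray XE field with one paragraph mark.

    Returns the fixed text and the number of places that were fixed.
    """
    return PATTERN.subn("\r", text)


# Made-up sample: last child of one spouse, then the marriage to the next spouse.
sample = (
    "iii. Last Child, born 1930.\v\r\v\r"
    '{ XE "Tennessee:Loudon County:Lenoir City" }\v\v'
    "He married second Mary Roe in Lenoir City.\r"
)
fixed, count = fix_spouse_gaps(sample)
```

The stray XE field is dropped along with the extra marks. That is only safe here because the same XE record appears again in the Marriage record that follows.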

The core problem is not the extra spaces or carriage returns; it is that they won’t use styles. No serious publishing tool should ignore styles. They’ve been in use for decades, work for both RTF and Word (and every other real publication platform) and frankly make the exported text smaller.

So assign a style to each type, let you set a default, and if the person wants two lines after each so and so, then they change the style in Word and everything in the entire document changes.

Why they didn’t do this a decade ago still baffles me.

We may or may not be meaning the same thing by “using styles”. But I think that RM10’s *.docx files are using styles and style sheets in a way that RM7 didn’t quite use styles and style sheets in its *.rtf files.

For example, here is a snippet that shows a few endnotes from a Descendant Narrative report. When I was creating the report in RM, I specified Consolas font with 9pt font size. I understand that this particular font and point size is a very poor choice in general, but I was doing an experiment and here is a sample of the results. (For a lot of these images I have posted, I seem to have to right click them, open them in a new tab, and then increase their size to be able to see them properly with the forum software. Otherwise, they are too small.)

Having created this report, it appears to me that the *.docx file created by RM10 actually is using style sheets. For example, I can click anywhere in the endnotes and then type Ctrl+Shift+S. This will bring up a popup window with style sheet information. I think the MS Word user interface for bringing up the popup window is terrible and that the popup window for the style sheet itself is poorly done, but it is there. We are ready to look at and modify the style sheet information for the Endnotes style sheet.

So we click on Modify, and the following pops up. From here, I can easily change the font to something else, and it will apply to all the endnotes in the entire document. So just for demo purposes, I will change it to Arial 10 point. I will also click on the Format button for additional options.

Under format, I will change After from 2 point to 10 point. This creates much more space between each endnote. I’m not saying this much space in the list of endnotes is desirable, because it isn’t. But it is just a demo of what you can do with styles from within MS Word.

So after changing After from 2 point to 10 point, changing the font from Consolas 9 point to Arial 10 point, and clicking OK a bunch of times, here is what I get. Is this what you mean by being able to have style sheets and being able to change them?

The point of using styles is that, instead of relying on extra carriage returns or changing individual elements hundreds or potentially thousands of times, you change the style once and it applies throughout the entire document. So there shouldn’t be ANY extra carriage returns or line feeds anywhere in the document; all formatting should be controlled by styles. It’s how real publishers have been doing it for half a century, since computer generated text became popular.
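Here is a toy sketch of the idea, in case it helps. This is not how Word or RM store anything; the class names and the effective_format helper are made up for illustration. The only point is that each paragraph carries a style name rather than its own copy of the formatting. The numbers are the ones from the endnote demo above.

```python
class Style:
    """Formatting shared by every paragraph that names this style."""

    def __init__(self, font, size_pt, space_after_pt):
        self.font = font
        self.size_pt = size_pt
        self.space_after_pt = space_after_pt


class Paragraph:
    """A paragraph stores only its text and the NAME of its style."""

    def __init__(self, text, style_name):
        self.text = text
        self.style_name = style_name


# One style definition, many paragraphs that refer to it.
styles = {"Endnotes": Style(font="Consolas", size_pt=9, space_after_pt=2)}
paragraphs = [Paragraph("Endnote %d" % n, "Endnotes") for n in range(1, 4)]


def effective_format(paragraph):
    """Look up the formatting a paragraph gets from its style."""
    return styles[paragraph.style_name]


# One change to the style...
styles["Endnotes"].font = "Arial"
styles["Endnotes"].size_pt = 10
styles["Endnotes"].space_after_pt = 10

# ...and every endnote picks it up, with no blank lines used for spacing.
fonts_after_change = [effective_format(p).font for p in paragraphs]
```

Spacing between paragraphs comes from space_after_pt on the style, so no empty paragraphs or line breaks are needed to create it.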

I’m not sure why you struggled with the UI and had to hit OK a zillion times. I normally just click on the style in the dropdown at the top, make my change, say OK, and I’m done. I will confess I’ve been using style sheets in several different programs since the 80’s so my comfort level is obviously much different than yours.

In any case, RM uses styles when it has to, such as for the TOC and Index, and apparently for endnotes to take advantage of Word’s functionality, but it doesn’t use them consistently throughout reports and publications. Every element in the document should be controlled by a paragraph or character style, so if I want all the chapter headers to be 18 point bold, that’s one change. If I want all primary names to be bold and alternate names italic, I make one change and my entire document reformats. (I’ll admit I haven’t checked with 10.0.2, so maybe they fixed this.)

I also need to reiterate that styles aren’t simply a Word feature. They existed in RTF and WordPerfect, as well as in modern publishing tools like InDesign and FrameMaker, so a .docx created by RM can be imported into InDesign and, if RM used styles properly, could be quickly reformatted there.