Very Slow Advanced Search with Very Simple Search Criterion

I thought the best way to show this was to make a screencast. It’s a very simple Advanced Search looking for people without a Birth fact. The screencast is 3:17. About half the time is me introducing the situation and the other half is me waiting for the search to complete. As I say in the video, I have about 55,000 people in my database. The search gets to the 99% mark almost instantly, and then takes about 90 seconds to get the last 1% done. I have a fast Windows 10 machine with 16GB memory and an SSD. The link to the video is at Screencast of very slow Advanced Search with Simple Search Criterion

Monitor Task Manager to see if it provides any clues. Could be that the result set is complete at 99% and the rest is taken up rendering it for display.

I have run the test before while monitoring Task Manager and I should have reported the results. During the first short period where the percent completed is increasing rapidly, it is CPU intensive and is using an entire processor core’s worth of CPU. For maybe 20 or 30 seconds after that, it is disk intensive and is running at a data transfer rate that varies between about 4MB to 6MB per second. The disk utilization then goes down essentially to zero and RM8 becomes CPU intensive again until it’s done.

I have also monitored with a tool that’s a free download from Microsoft called Sysinternals Process Monitor which in some cases provides a bit more detail than Task Manager. It reports that all that disk I/O activity consists of read operations. I can’t tell for sure, but I assume it’s reading the RM8 database.

I have a second machine that’s my backup machine. It’s identical to the first machine except that it has an HDD instead of an SSD. The HDD is much larger and the SSD is much faster. I don’t presently have RM installed on the second machine, but I will install it and run the same test. Because an HDD is so much slower than an SSD, I expect the test to run much longer on the second machine.

The HDD test is completed. It took a little over ten minutes to run the same test. The HDD was reading at about 0.2MB to 0.3MB per second, which was about 20 to 30 times slower than the SSD. So that’s where most of the extra time went on the HDD machine.
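As a rough consistency check on the figures above, here is a small sketch of the arithmetic. It assumes (and this is only an assumption) that both machines read the same volume of data during the disk-intensive phase, and it pairs the 20 to 30 seconds of SSD reading with the 4MB to 6MB per second rate reported earlier.

```python
# Figures quoted in the thread; pairing them this way is an assumption.
SSD_READ_SECONDS = (20, 30)   # length of the disk-intensive phase on the SSD machine
SSD_MB_PER_S = (4.0, 6.0)     # observed SSD transfer rate
HDD_MB_PER_S = (0.2, 0.3)     # observed HDD transfer rate


def estimate_hdd_read_seconds():
    """Return (fastest, slowest) estimates for the HDD read phase, in seconds."""
    # Smallest and largest plausible volume of data read, in MB.
    volume_low = SSD_READ_SECONDS[0] * SSD_MB_PER_S[0]
    volume_high = SSD_READ_SECONDS[1] * SSD_MB_PER_S[1]
    # Same volume read at the HDD's observed rates.
    fastest = volume_low / HDD_MB_PER_S[1]
    slowest = volume_high / HDD_MB_PER_S[0]
    return fastest, slowest


if __name__ == "__main__":
    fastest, slowest = estimate_hdd_read_seconds()
    print(f"Estimated HDD read phase: {fastest:.0f} to {slowest:.0f} seconds")
```

Under those assumptions the estimate comes out at roughly four and a half to fifteen minutes, which brackets the observed run of a little over ten minutes.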

I also made a group using the same criterion of Birth => Exists => Is False. It was so fast it was hard to time, but it took about three seconds or a little less to make the group. That compares to the 90 seconds to run the Advanced Search on the SSD machine and a little over 10 minutes to run the Advanced Search on the HDD machine, all using the same very simple search criterion.
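For anyone curious why finding the people can be so cheap, the criterion Birth => Exists => Is False amounts to an anti-join. Here is a minimal sketch using Python's sqlite3 module. The `person` and `event` tables and their columns are hypothetical and are not RM's actual schema; the sketch only shows the shape of such a query.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database with a hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE event (
            event_id   INTEGER PRIMARY KEY,
            person_id  INTEGER REFERENCES person(person_id),
            event_type TEXT
        );
        CREATE INDEX idx_event_person ON event(person_id, event_type);
    """)
    conn.executemany("INSERT INTO person VALUES (?, ?)",
                     [(1, "Ann"), (2, "Bob"), (3, "Cal")])
    conn.executemany("INSERT INTO event (person_id, event_type) VALUES (?, ?)",
                     [(1, "Birth"), (1, "Death"), (3, "Death")])
    return conn


def people_without_birth(conn):
    """Return the IDs of people who have no event of type 'Birth'."""
    rows = conn.execute("""
        SELECT p.person_id
        FROM person AS p
        WHERE NOT EXISTS (
            SELECT 1 FROM event AS e
            WHERE e.person_id = p.person_id AND e.event_type = 'Birth'
        )
        ORDER BY p.person_id
    """).fetchall()
    return [row[0] for row in rows]


if __name__ == "__main__":
    print(people_without_birth(build_demo_db()))
```

With an index on the event table, a query of this shape only has to probe once per person, which would be consistent with the group being built in a few seconds.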

I’m beginning to suspect the following. When doing the Advanced Search, that first CPU-bound part is very fast, taking about the same three seconds or so as making the group. So what the Advanced Search is presumably doing after that, and what is taking so much time, is gathering the data to populate the other columns. To test that theory, I removed all the columns from People List View, except that you can’t remove the actual list of people. Upon rerunning the Advanced Search on the SSD machine, the time was reduced from 90 seconds to about 75 seconds. That’s a little better, but it’s still a far cry from the 3 seconds to make the group using the same criterion.

I tend to think that an Advanced Search should be able to run just as fast as making a group. But I would still rather just have my Find Next back because it’s so instantaneous and I don’t have to go back and forth between the People tab and the Search tab to progress to the next person.

I ran the same test on my 11k database. It took 5 seconds on my Surface 2, about 10 seconds on my Dell laptop, and 10 seconds on my HP desktop. Just so you know.

I tried the test also. I have a little over 16400 people and it completed in about 14 seconds. I have an HP Ryzen 5000 series laptop 15-ef2126wm with 8 GB memory and a 256 GB SSD. I am using Windows 11. By the way I have ordered a 1TB SSD and 16 GB memory but have not received it yet. I was running low on memory in some applications and I like to keep a lot of tabs open in Firefox resulting in a lot of paging.

Curious if you happen to know the results of the rows. My “old” laptop with an SSD and 16 GB RAM ran it in under 10 seconds. Also curious whether you happened to test using SQLite, and whether there was a time difference. I would not think links to people and facts should have much impact on that. One other thing: would customizing the People List View have any impact? (PS: I have over 10K people and 30,000 facts.)
Kevin

I have tested a little more, but I have no particular new insights. If we may extrapolate in proportion to the number of people in the database, my 55,000 person database would probably take about 55 seconds on your machine. But the time may go up more than proportionally if the process is doing something like sorting. In that case the estimate for your machine would be somewhat more than 55 seconds, which is not so far off from my 90 seconds.
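To put rough numbers on that extrapolation, here is a small sketch. It assumes a baseline of 10,000 people in 10 seconds (rounded from the "over 10K people" and "under 10 seconds" quoted above) and compares plain proportional scaling with the n log n scaling typical of a comparison sort. Whether RM actually sorts at this stage is a guess.

```python
import math


def extrapolate_linear(base_seconds, base_people, target_people):
    """Scale the time in direct proportion to the number of people."""
    return base_seconds * target_people / base_people


def extrapolate_n_log_n(base_seconds, base_people, target_people):
    """Scale the time as n * log(n), as for a typical comparison sort."""
    scale = (target_people * math.log(target_people)) / (
        base_people * math.log(base_people))
    return base_seconds * scale


if __name__ == "__main__":
    # Assumed baseline: 10,000 people in 10 seconds.
    base_seconds, base_people, target_people = 10.0, 10_000, 55_000
    linear = extrapolate_linear(base_seconds, base_people, target_people)
    n_log_n = extrapolate_n_log_n(base_seconds, base_people, target_people)
    print(f"linear:  {linear:.1f} s")
    print(f"n log n: {n_log_n:.1f} s")
```

With these assumed inputs the proportional estimate is 55 seconds and the n log n estimate is about 65 seconds. That is still short of 90 seconds, though the two machines are not the same hardware, so the comparison is only indicative.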

It does make a relatively small difference if there are more columns or fewer columns. It’s slower for more columns and faster for fewer columns.

It makes a huge difference about how many hits there are. If there are a small number of hits in the search, then it’s very fast and vice versa. For example, if there are just a few dozen hits then it’s under 3 seconds on my machine with the 55,000 people. But remember that the first pass to find the hits is always fast, no matter if there are few or many hits. It’s whatever happens after the first pass to find the hits is completed that can take so long.

Another thing that happens is that after it gets past the initial fast phase and the counter has gone to 99%, it can no longer be cancelled. You are locked out until it is finished.

The inability to cancel also applies to reports, so that is a good callout. I suspect the simple initial query gathers all the indexes and RINs or similar, and the final query or queries pull the details, so the 99% is very misleading. Still, your example seems to take much, much longer than one would expect.
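The pattern suspected here, a fast first query that collects IDs followed by separate queries that pull the details, is often called the N+1 query pattern. Here is a minimal sketch of it next to a batched alternative, using Python's sqlite3 module. The `person` table and the temporary `hits` table are hypothetical and say nothing certain about what RM does internally.

```python
import sqlite3


def make_db(n_people):
    """Build an in-memory database with a hypothetical person table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO person VALUES (?, ?)",
                     [(i, f"Person {i}") for i in range(1, n_people + 1)])
    return conn


def fetch_per_row(conn, ids):
    """Pull details with one query per hit; returns (names, detail query count)."""
    names, queries = [], 0
    for pid in ids:
        row = conn.execute(
            "SELECT name FROM person WHERE person_id = ?", (pid,)).fetchone()
        queries += 1
        names.append(row[0])
    return names, queries


def fetch_batched(conn, ids):
    """Pull details with a single join against a temporary table of hits."""
    conn.execute(
        "CREATE TEMP TABLE IF NOT EXISTS hits (person_id INTEGER PRIMARY KEY)")
    conn.execute("DELETE FROM hits")
    conn.executemany("INSERT INTO hits VALUES (?)", [(pid,) for pid in ids])
    rows = conn.execute("""
        SELECT p.name
        FROM person AS p
        JOIN hits AS h ON h.person_id = p.person_id
        ORDER BY p.person_id
    """).fetchall()
    return [row[0] for row in rows], 1


if __name__ == "__main__":
    conn = make_db(1000)
    hit_ids = list(range(2, 1001, 2))
    print(fetch_per_row(conn, hit_ids)[1], "detail queries, one per hit")
    print(fetch_batched(conn, hit_ids)[1], "detail query, batched")
```

In the per-row version the number of detail queries grows with the number of hits, which would fit the observation that searches with few hits finish quickly while searches with many hits do not.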

I did the search “birth date exists” = false.
77K individuals took one minute flat.
macOS, 24GB memory, HDD. I did notice that RM sucked up all of the processor time and used all of the remaining memory for cache.

It would seem that the query might not be efficient and is pulling way more info than it’s displaying.

The query reads every individual to see if it matches your criteria. It appears that each match is stored in memory or a work file. Once that is done, it executes a sort by name, once again to memory or a work file. The results are then displayed. I am on a 24GB Intel Mac with an HDD. It takes about one minute to search and display the results for a 77K-individual search. When I look at Activity Monitor, it sucks up all available memory for disk cache.

I have some new information about this problem. A part of the problem actually seems to have nothing to do with Advanced Search being slow. Rather, it seems to have to do with People List View being slow when applying filters.

I was trying to develop a workaround for the problem. I already mentioned that when I was doing an Advanced Search on Birth => Exists => False there seemed to be three phases to the search.

  1. The marking of the people without a birth fact. It takes only a second or two, and is actually faster on RM8 than on RM7. The only bad thing about the way RM8 does it is that it doesn’t have a progress bar and RM7 does have a progress bar.
  2. About 60 seconds where RM8 appears to be locked up and where it is reading the RM database like crazy for the whole time.
  3. About 30 seconds where RM8 still appears to be locked up and where it is CPU bound, maxing out a single processor core for the whole time.

So my planned workaround was to eliminate phase #2 and phase #3 from the workflow. Instead, I would make a group of people without a birth fact and I would work back and forth between People List View and the other views with People List View filtered by the group.

Making the group of people without a birth date was very fast, about a second or so. It was just like phase #1 of the Advanced Search that was so slow overall. What I had not yet tried was using the group of people without a birth date to filter People List View. When I did, RM8 locked up for about 30 seconds until the filter was applied. At that point, RM8 was unlocked and I could proceed normally. During the whole 30 seconds, RM8 was CPU bound, maxing out a single processor core for the whole time. It was just like phase #3 of the Advanced Search that was so slow overall.

A bit of good news is that filtering by making a group instead of by using the Advanced Search directly seemed to avoid the 60 seconds of waiting in phase #2 while RM8 was disk bound reading something out of the RM8 database. And this was on my machine with the very fast SSD. On my machine with the more typical HDD, phase #2 took over 600 seconds.

A bit of bad news is that if I then went from People List View to one of the other Views such as Pedigree View or Family View and then returned to People List View, RM8 locked up for 30 seconds again. The lockup occurs every time I leave People List View for another View and then return to People List View. If I remove the filter, People List View responds instantaneously. If I re-apply the filter while in People List View, it locks up for 30 seconds. So it’s obviously the application of the filter that’s the problem, not changing views. It’s just that changing into People List View can require RM8 to re-apply the filter.

A final bit of bad news is that I tried the same filter on the Index tab of the Left Side Panel. RM8 also locks up for the same 30 seconds of being CPU bound when the group filter is applied to the Left Side Panel.

So I still don’t understand the 60 seconds of RM8 being disk bound during the Advanced Search. But for the 30 seconds of being CPU bound during this Advanced Search, it’s as if Advanced Search is making a temporary, unnamed group and applying that group as a filter for the search. And it’s taking 30 seconds of CPU time to apply the filter to this particular group.

I thought about making another video to show how it is slow with this alternative workflow, and I can do so if necessary. But I think this word description is adequate to describe the problem.

There seems to be a general issue in either building a ‘list’ of data items and/or then formatting them for display, one or both of which seem incredibly CPU intensive.

I am a Mac user, but have seen such symptoms in other areas, even when the ‘list’ should be quite small. For example, note editing gets more and more CPU intensive as the note lengthens. Similarly, when importing an RM7 file, the scanning of the disk for RM7 files results in an increasingly long CPU-bound period as the number of files that can be found increases from a single-digit number to more than 40. Scanning the rest of the disk is very fast, but handling the 40 file names found can be processor bound for about 70 seconds. It is as though the processing required expands almost exponentially as the number of ‘items’ (lines of a note, number of files, etc.) increases by relatively low multiples.
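One common way to get this kind of growth, offered purely as a guess and not as a description of RM's code, is a list that gets reprocessed in full every time an item is added. The work then grows with the square of the number of items rather than in proportion to it. A small sketch that just counts the steps:

```python
def steps_single_pass(n_items):
    """Each item is handled once: work grows in proportion to n."""
    return n_items


def steps_rescan_on_every_add(n_items):
    """Hypothetical pattern: after each add, all items so far are handled again."""
    steps = 0
    for items_so_far in range(1, n_items + 1):
        steps += items_so_far   # re-handle everything collected so far
    return steps                # equals n * (n + 1) / 2


if __name__ == "__main__":
    for n in (4, 40):
        print(n, steps_single_pass(n), steps_rescan_on_every_add(n))
```

Going from 4 items to 40 is a tenfold increase in items, but the rescanning version goes from 10 steps to 820 steps, which is the sort of disproportionate jump described above.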

I have reported these 2 particular ‘problems’ as they are very easy to reproduce and they have been accepted, but I haven’t seen any improvement so far. There MUST be some weird common processing going on that is used in a number of different areas within RM. All we can do is try to find examples that are easily repeatable and keep reporting them to support.