Reset treeshare tool

Various people have asked what ‘Reset Treeshare’ does (for example in this post). As I have just used it successfully, I thought I would share my experience.

I have an Ancestry tree with almost 65,000 people, updated via Ancestry and copied to RM via Treeshare. This results in an RM 10 file of almost 1gb and a huge set of media files. My PC died on Monday, so I bought a new one and have been re-installing everything. My backup software keeps a copy of each version of each user file; saving a 1gb RM file every time I made an update was excessive, so I excluded it from the backup and made a copy to external media roughly once a week.

Everything loaded properly on the new PC. As my RM file was about a week old, I expected to have a few Treeshare changes to update from Ancestry. There were only three, but when I browsed through the family I had recently worked on, I found several people who needed updates but were marked as having no changes. More worryingly, quite a large proportion of people on the main Treeshare page showed as matched (little icons for RM and Ancestry next to each other) but with no data for the Ancestry person, even though the people still existed on Ancestry. I ran all the database tools, and then ‘Reset Treeshare’. When I next ran Treeshare, it took a little longer than usual but did not have the desired effect: all the people in the tree now appeared on the main Treeshare page as matched but with no Ancestry data.

So I tried running ‘reset Treeshare’ again and re-ran Treeshare itself. This time Treeshare took well over an hour to run. When it had finished, it showed rather more changed people (although still not all that I would have expected), and all the data on the main Treeshare page looked good. A quick check showed exactly the same number of people in Ancestry and RM.

I am still not sure why my backup file introduced an error, but I have learnt a little more about how Treeshare works and what re-setting it does. I have always wondered how Treeshare managed to run so quickly while populating a fair bit of data on all the people in my fairly large tree. Clearly Ancestry generally only sends changed data; RM stores the previous version and adds/deletes/amends based on what it gets from Ancestry. I guess that RM has a parameter to show when it last updated its Ancestry data and asks Ancestry for data on all people with a changed date after that (presumably including deleted people). ‘Reset Treeshare’ seems at least to re-build RM’s stored copy of the Ancestry data in full. I still don’t understand why I had to run ‘Reset Treeshare’ twice for it to work.
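To make that guess concrete, here is a minimal Python sketch of syncing by change date. It is not RootsMagic's or Ancestry's actual code: every name in it (`RemotePerson`, `LocalCache`, `sync`, `reset`) is hypothetical, and the integer change stamp stands in for whatever date or version value is really used. It only illustrates the idea that the local copy remembers how far it has synced, asks for records changed after that point, and that a reset clears the marker so the next run has to fetch everything.

```python
from dataclasses import dataclass, field


@dataclass
class RemotePerson:
    person_id: str
    name: str
    changed: int           # hypothetical change stamp on the remote side
    deleted: bool = False  # deletions are assumed to arrive as flagged records


@dataclass
class LocalCache:
    people: dict = field(default_factory=dict)  # person_id -> RemotePerson
    last_sync: int = 0                          # highest change stamp seen so far

    def reset(self):
        """Forget the cached copy so the next sync fetches everything."""
        self.people.clear()
        self.last_sync = 0


def sync(cache, remote_people):
    """Apply only records changed since the last sync; return how many were fetched."""
    changed = [p for p in remote_people if p.changed > cache.last_sync]
    for p in changed:
        if p.deleted:
            cache.people.pop(p.person_id, None)   # delete
        else:
            cache.people[p.person_id] = p         # add or amend
    if changed:
        cache.last_sync = max(p.changed for p in changed)
    return len(changed)
```

In this model a normal run is quick because only a handful of records come back, while the first run after a reset is slow because every record does, which would fit the hour-long run I saw.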

RM plainly also holds a value linking people in its database to Ancestry; it doesn’t seem that ‘Reset Treeshare’ changes these values in any way.

I hope this is useful background for others who wondered how it works.


I was trying to do something similar, although my DB isn’t anywhere near as big as yours! This was very helpful. Thanks for sharing! Kudos. -Ken

The 1 gb sounds way too large.

Is that the size of an RM backup file which includes the media files? If so, then the size sounds about right. But if it’s actually the size of the *.rmtree file, then it sounds about 10 times too large. I would have expected about 100 mb with 65,000 people.

It’s a very useful message. Many thanks.

I work directly with the RM database using SQLite all the time. You are correct that RM has a table which links each RM person to an Ancestry person. So most of the people are matched automatically all the time. It’s not like RM’s interface with FamilySearch where you have to go through a matching process.
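For anyone who wants to look inside the file themselves, here is a minimal sketch using Python's built-in `sqlite3` module. It makes no assumption about RM's table names: it reads them from SQLite's own `sqlite_master` catalogue and counts the rows in each table. The function name `table_row_counts` is just an illustration, and it is safest to point it at a copy of the database rather than the live file.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in an SQLite file."""
    # Build a file: URI and open read-only so the file cannot be changed by accident.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        names = [row[0] for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
        counts = {}
        for name in names:
            # Quote the table name in case it contains unusual characters.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(
                f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        conn.close()
```

Calling `table_row_counts("copy_of_my_tree.rmtree")` (the file name is a placeholder) returns a dictionary you can print or sort to see which tables hold the most rows.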

That being said, I find certain aspects of how TreeShare works a little mysterious, even though I understand the RM database structure very well. It seems to me that TreeShare does very little comparing of RM data with Ancestry data. Rather, it seems to rely primarily on change dates in RM and Ancestry: it only seems to compare the actual data when the change dates have been modified. That’s why comparisons can seem to go so fast.

I certainly wondered about the file size. In fact I hadn’t recently compressed the database, and it came down from 1.2gb to just under 1gb when I did (taking about 10 seconds to do so).

The media files themselves are enormously bigger than that. My file’s stats are people 64k, families 25k, events 180k, places 24k, sources 22k, citations 40k, media items 69k and media links 227k. I don’t know how much data each of these takes, but if you guessed roughly 200 bytes for each of 600k items (excluding the media files) then you would get to roughly 100mb as you say (and 200 bytes seems an awful lot for an average).

However, I think I read somewhere that the RM database itself contains a thumbnail for each of the media files. If so, this might account for roughly 5kb per thumbnail x 69k media items = approx 350mb. Again, I am only guessing at the size of the thumbnails.

Plainly adding the two together gets me to about 0.5gb, only half the size I really have, but more than your estimate.
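The back-of-envelope estimate can be checked with a few lines of Python. The record counts are the ones quoted from my file's stats; the 200 bytes per record and 5kb per thumbnail are guesses, and one thumbnail per media item (69k) is assumed.

```python
# Record counts quoted from the file's stats, in thousands.
counts_k = {
    "people": 64, "families": 25, "events": 180, "places": 24,
    "sources": 22, "citations": 40, "media items": 69, "media links": 227,
}

BYTES_PER_RECORD = 200        # guessed average size of one record
BYTES_PER_THUMBNAIL = 5_000   # guessed size of one thumbnail (5kb)

records = sum(counts_k.values()) * 1_000
record_mb = records * BYTES_PER_RECORD / 1_000_000
# Assumes one stored thumbnail per media item.
thumb_mb = counts_k["media items"] * 1_000 * BYTES_PER_THUMBNAIL / 1_000_000

print(f"records:    {records:,} -> {record_mb:.0f} MB")
print(f"thumbnails: {thumb_mb:.0f} MB")
print(f"total:      {record_mb + thumb_mb:.0f} MB")
```

That gives about 130 MB for the records and 345 MB for the thumbnails, so roughly 475 MB in total, which is still only about half of the actual file size.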

I would welcome comments on this.

Edit.

On reflection, I would expect the vast majority of my media files and links to be for source citations - I do add media items independently, but not a vast number. Given that, I don’t really see how 40k source citations lead to 69k media items or 227k media links.

You are certainly right that Treeshare relies on a last changed date or something like it to produce its changed list (and, as I have now worked out, to populate its summary table of data from Ancestry).

However, it does also do some comparison, if only for the person highlighted on the details screen at any one time. It tries rather successfully to line up events that match, highlights those with more or less important differences, and only lets you insert, delete or amend those which it thinks are different. This can cause problems; for example, it does not appear to compare the ‘living’ flag and will not directly let you amend it from Treeshare even when you know it differs.