Posting for friend -- odd database issue - duplicate filename

A friend of mine, @sbankscharles, who uses RootsMagic has been having odd issues with his database.
Examining a copy of his database, I have noticed that there are many duplicate filenames (an indexed field). He has ~70 media items, and at least 7 appear under more than one MediaID with the same filename/path. One is triplicated. I am trying to gather more info before sending this to RM support.

I can’t figure out what steps a user would take to cause this. Database tools have been run, so possibly something corrupted the db before the tools were run. Here are a few examples:

[Two screenshots of table rows; from the left, the columns are MediaID, MediaType, MediaFile.]

Not sure whether the issues are related only to media.

A common cause of the duplicate file names is adding a file with Add New File when the file is already linked into RM9. In that case, the user should add the file with Select Existing Media.

RM9 does not prevent the user from doing this. Also, there is no Merge Duplicate Media feature as there is with Sources and Citations. The only way I know of to fix the problem is to delete the duplicates and to add them back in with the Select Existing Media feature.

I have seen reports of this problem sometimes arising from TreeShare. But the file names in your screenshot did not come from TreeShare. So it seems that adding new media instead of using existing media is the most likely cause of the problem.

@kevync1985 what other odd issues are you having?

That is very interesting, considering I have always used ‘Add New Media’ in version 7 and have never encountered this problem that I recall. Maybe something changed in RM8 and 9?

My most common use case is adding Census images to each of the people involved. I don’t share the Census fact but instead give each their own and attach the photo to each person’s fact.

Personally I almost always “add” media via the drag N drop method and have not encountered duplicates or similar issues. However, CSB adds media via a different method. Since the media file name should be an index and not allow duplicates, shouldn’t RM check and simply add tags? I am trying to figure out exactly what steps he is using to create this issue. To his point, he is just using the program as it should be able to be used. With TreeShare I might partly understand, as that media is usually in a different folder/path. The regular RM media should be in the regular media path, and he is not using multiple folders for media. Most of the issues appear to occur during the same session.

If you use Add New Media repeatedly for the same file, it does add a new record to the MultimediaTable each time. There’s no harm in it other than table bloat, which might slow some operations and increase db file size (thumbnails are a major contributor, as there will be one for each duplicate). It also clutters the Media Gallery.

I answered this in some detail in

as that is where I saw your duplicate post first.

I don’t know what duplicate post you speak of, however as I stated, I don’t seem to find any such problem. I am in the MultiMediaTable as we speak and I can not find a single instance of duplication.

To further that statement, I am looking at one Sarah Jones and an 1860 Census that I added using the default Census fact. I also added an 1860 fact to a grandson and two children, all of whom were in Sarah’s household. I added the image, named Jones_Sarah-1860FederalCensus.jpg.

I have precisely one entry by that name in the MediaFile column. I have precisely one MediaPath to that file location and precisely one MediaID for that file. Every time I entered the Census fact for these people, I used the ‘Add New Media’ option as I have done for years.

How this worked out this way I have no idea. However in RM7, it is not reflected as multiple in my database.

Maybe later I will try figuring out a query to determine what files are linked to who, but right now I am tired so I will do it after I nap.
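A query along those lines could be sketched as below, using Python's built-in sqlite3 module against toy in-memory tables rather than a real RM file. The MediaLinkTable columns shown (LinkID, MediaID, OwnerType, OwnerID) and all sample values are assumptions for illustration; check them against your own database before relying on the result.

```python
import sqlite3

# Toy stand-ins for the two tables; the column lists are assumptions,
# trimmed to what the join needs.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MultimediaTable (MediaID INTEGER PRIMARY KEY, MediaPath TEXT, MediaFile TEXT);
CREATE TABLE MediaLinkTable (LinkID INTEGER PRIMARY KEY, MediaID INTEGER,
                             OwnerType INTEGER, OwnerID INTEGER);
INSERT INTO MultimediaTable VALUES (1, 'media/', 'Jones_Sarah-1860FederalCensus.jpg');
-- Four facts tagged with the same image (OwnerType/OwnerID values are made up).
INSERT INTO MediaLinkTable (MediaID, OwnerType, OwnerID)
VALUES (1, 2, 101), (1, 2, 102), (1, 2, 103), (1, 2, 104);
""")

def tags_per_media(conn):
    """Return (MediaID, MediaFile, tag count) for every media record."""
    return conn.execute("""
        SELECT m.MediaID, m.MediaFile, COUNT(l.LinkID) AS Tags
        FROM MultimediaTable AS m
        LEFT JOIN MediaLinkTable AS l ON l.MediaID = m.MediaID
        GROUP BY m.MediaID
        ORDER BY m.MediaFile, m.MediaID
    """).fetchall()

rows = tags_per_media(conn)
print(rows)
```

One media record carrying several tags, as in this toy data, is the pattern described above for RM7. A file that had been duplicated would instead show up as several MediaIDs with the same MediaFile.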

My reply was to the OP who duplicated his post to the Forum at Sqlitetoolsforrootsmagic.com

I use Add from gallery on RM7 when adding a media to a fact or a person or a citation when the media already exists in RM7’s media gallery. Add from gallery is RM7’s equivalent of RM9’s Select Existing Media.

However, I concur with your observation that in RM7, using Add new media does not result in a duplicate media record as does using Add New Media in RM9. The practical result is that you can choose always to use Add new media in RM7 without any problems in RM7 but you cannot use Add New Media in RM9 for existing media files without creating duplicate media items.

This behavior may be a bug in RM9. It occurs to me that it might also be a design feature. I haven’t played with it, but it might provide a way to link the same media file into RM9 in such a way that each time the media file is linked into RM9 it can have a different caption. That might be useful for a group photo where the caption could be different for each person for whom the photo is linked. For John Doe, the caption might be something like “John Doe, second from left” and for Elizabeth Doe the caption might be something like “Elizabeth Doe, fourth from left”. But that’s only a wild guess on my part.

At one point, early versions of RM supported multiple captions for media files which was useful for group photos. I made heavy use of this feature. Then in RM5 I think it was, RM dropped the ability to have multiple captions for media files. I lost a bunch of my captions at which point I made a promise to myself never to trust RM with my captions again. I am now doing my captions in a totally different way that does not depend on RM. But my experience with RM’s captions is why I’m making my wild guess about why RM9 is apparently supporting the linking of the same media file multiple times. Obviously, wild guesses may be completely wrong.

It seems to me that depending on the intentions of the RM9 design, adding a duplicate media file should be prevented similar to the way RM7 does it or else it should be supported while raising a warning. Or as another option, merging of duplicate media files should be supported. I have no idea what to do about duplicate media files if they are coming into RM via TreeShare.
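For anyone curious what a merge might involve at the table level, here is a rough sketch in Python's sqlite3 on toy in-memory tables. It is not an RM feature and not a tested procedure for a real RM file: the MediaLinkTable columns are assumed, the helper name merge_duplicate_media is hypothetical, and any captions stored on the deleted duplicate records would be lost. If you try anything like it, do so only on a copy of the database.

```python
import sqlite3

def merge_duplicate_media(conn):
    """Repoint tags to the lowest MediaID of each path/name group, then delete the extras."""
    groups = conn.execute("""
        SELECT MIN(MediaID), MediaPath, MediaFile
        FROM MultimediaTable
        GROUP BY MediaPath, MediaFile
        HAVING COUNT(*) > 1
    """).fetchall()
    with conn:  # commit on success, roll back if anything fails
        for keep_id, path, name in groups:
            extras = [row[0] for row in conn.execute(
                "SELECT MediaID FROM MultimediaTable "
                "WHERE MediaPath = ? AND MediaFile = ? AND MediaID <> ?",
                (path, name, keep_id))]
            for extra_id in extras:
                conn.execute("UPDATE MediaLinkTable SET MediaID = ? WHERE MediaID = ?",
                             (keep_id, extra_id))
                conn.execute("DELETE FROM MultimediaTable WHERE MediaID = ?", (extra_id,))
    return len(groups)

# Toy data: MediaID 1 and 2 are the same file; MediaID 3 is unrelated.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MultimediaTable (MediaID INTEGER PRIMARY KEY, MediaPath TEXT, MediaFile TEXT);
CREATE TABLE MediaLinkTable (LinkID INTEGER PRIMARY KEY, MediaID INTEGER, OwnerID INTEGER);
INSERT INTO MultimediaTable VALUES (1, 'media/', 'group_photo.jpg'),
                                   (2, 'media/', 'group_photo.jpg'),
                                   (3, 'media/', 'portrait.jpg');
INSERT INTO MediaLinkTable (MediaID, OwnerID) VALUES (1, 101), (2, 102), (3, 103);
""")

merged_groups = merge_duplicate_media(conn)
remaining_ids = [r[0] for r in conn.execute("SELECT MediaID FROM MultimediaTable ORDER BY MediaID")]
link_targets = [r[0] for r in conn.execute("SELECT MediaID FROM MediaLinkTable ORDER BY LinkID")]
print(merged_groups, remaining_ids, link_targets)
```

The sketch keeps the lowest MediaID only because that is the simplest rule to state; which record should survive in a real database is a separate question.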


Ken, something like this will show duplicate media. For this part I am concerned with the duplicate media itself, not how or where it is attached. The problem, I believe, is that the gallery will only show one of them. Then if you attach media, which MediaID is being attached? The user could be misled if enhanced tools show used media, etc.
Query corrected to avoid confusion (I was exporting to Excel):
SELECT
Count(MediaFile) as Qty, MediaType, MediaFile, MediaPath, UTCModDate
FROM MultimediaTable
GROUP BY MediaFile
HAVING Count(MediaFile) > 1
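
Since the concern in this thread is the same filename in the same path, a variant that groups on both MediaPath and MediaFile may be worth trying, so that identically named files in different folders are not flagged. The sketch below uses Python's built-in sqlite3 on a toy in-memory table; the sample rows and the helper name find_duplicate_files are made up for illustration.

```python
import sqlite3

def find_duplicate_files(conn):
    """Return (Qty, MediaPath, MediaFile) for every path/name stored more than once."""
    return conn.execute("""
        SELECT COUNT(*) AS Qty, MediaPath, MediaFile
        FROM MultimediaTable
        GROUP BY MediaPath, MediaFile
        HAVING COUNT(*) > 1
        ORDER BY MediaPath, MediaFile
    """).fetchall()

# Toy table with only the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE MultimediaTable (
    MediaID INTEGER PRIMARY KEY, MediaType INTEGER, MediaPath TEXT, MediaFile TEXT)""")
conn.executemany(
    "INSERT INTO MultimediaTable (MediaType, MediaPath, MediaFile) VALUES (?, ?, ?)",
    [
        (1, "media/", "census_1860.jpg"),
        (1, "media/", "census_1860.jpg"),
        (1, "media/", "census_1860.jpg"),  # a triplicate
        (1, "media/", "portrait.jpg"),
        (1, "other/", "portrait.jpg"),     # same name, different folder
    ],
)

duplicates = find_duplicate_files(conn)
print(duplicates)
```

In this toy data only the triplicated census image is reported; the two portrait.jpg rows live in different folders and are left alone.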


In my view of proper design, when a duplicate file/path exists RM should add additional tags, not add a duplicate record. This should hold for whatever method the user used via the RM interface, including TreeShare.

Hmm, interesting on captions. I could see how that could cause issues, as well as being a possible explanation for what is occurring in CSB’s db. What puzzles me about this: how is the media gallery supposed to select and link the correct media item if there are 2 or more?

Kevin

One possible theory/explanation — Maybe @rzamor1 can comment on

I am not sure of the logic RM is using (or is supposed to be using) to decide whether to add a duplicate MediaFile or add additional tags. Maybe when Add Media is used, a file is selected, and it has no tags (tag count = 0), RM adds a duplicate MediaFile (a new record in the table). If tags are present (>= 1), it adds additional tags instead. I am not sure whether this has to do with changes being made during the same session.

Either way, why would RM allow duplicate MediaFile/paths in any scenario? That should be a unique index, and the same should be true for user actions and TreeShare.
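
To illustrate what a unique index would do at the database level, here is a small sketch using Python's built-in sqlite3 on a toy in-memory table. It says nothing about how RM's actual schema is defined; the index name idxMediaPathFile and the sample file are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE MultimediaTable (
    MediaID INTEGER PRIMARY KEY, MediaPath TEXT, MediaFile TEXT)""")
# Hypothetical index: at most one record per path/name combination.
conn.execute("""CREATE UNIQUE INDEX idxMediaPathFile
                ON MultimediaTable (MediaPath, MediaFile)""")

insert_sql = "INSERT INTO MultimediaTable (MediaPath, MediaFile) VALUES (?, ?)"
conn.execute(insert_sql, ("media/", "census_1860.jpg"))

try:
    # Adding the same path/name a second time violates the unique index.
    conn.execute(insert_sql, ("media/", "census_1860.jpg"))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM MultimediaTable").fetchone()[0]
print(rejected, row_count)
```

With such an index in place the database itself refuses the second record, so the application would have to catch the error and add a tag to the existing record instead.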

…and what happens appears to me to be just that, a multiple tag gets added in RM 7. So in light of what @thejerrybryan said, this may be a change in versions later than 7. I have never troubled myself with captions as I really don’t have any desire to print scrapbooks or anything, however I can see why the change was maybe made.

I don’t use version 9 for many reasons, and handling of the media files may just be another of those reasons.


For me - Captions would be mostly relevant when I export for my website (or next website currently)

Your sqlite query might give someone the impression that MediaID is duplicated when really what is shown is that of just one record in the group. MediaID is the Primary Key for that table and SQLite enforces it to be unique.
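
One way to avoid that impression is to list every record in each duplicate group on its own row, so that each distinct MediaID is visible. A minimal sketch in Python's built-in sqlite3, on a toy in-memory table with made-up rows and a hypothetical helper name:

```python
import sqlite3

def list_duplicate_records(conn):
    """Return one (MediaID, MediaPath, MediaFile) row per record in a duplicate group."""
    return conn.execute("""
        SELECT m.MediaID, m.MediaPath, m.MediaFile
        FROM MultimediaTable AS m
        JOIN (
            SELECT MediaPath, MediaFile
            FROM MultimediaTable
            GROUP BY MediaPath, MediaFile
            HAVING COUNT(*) > 1
        ) AS d ON d.MediaPath = m.MediaPath AND d.MediaFile = m.MediaFile
        ORDER BY m.MediaFile, m.MediaID
    """).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE MultimediaTable (
    MediaID INTEGER PRIMARY KEY, MediaPath TEXT, MediaFile TEXT)""")
conn.executemany(
    "INSERT INTO MultimediaTable (MediaID, MediaPath, MediaFile) VALUES (?, ?, ?)",
    [
        (1, "media/", "census_1860.jpg"),
        (2, "media/", "census_1860.jpg"),
        (3, "media/", "census_1860.jpg"),
        (4, "media/", "portrait.jpg"),
    ],
)

records = list_duplicate_records(conn)
media_ids = [row[0] for row in records]
print(records)
```

Each duplicate appears with its own MediaID, which makes it clear that the file name is repeated while the primary key is not.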

I’m surprised by the change reported by @thejerrybryan in filename duplication from RM7 to RM9. What surprises me is that there was any prevention of duplication in RM7 as I do not recall it and it certainly is not constrained in the table definition. I do not know when the application software imposed it. When captions were moved from the MediaLinkTable to the MultimediaTable in RM5, I made a pre-upgrade sqlite procedure that preserved the tag captions by using multiple records for the same Media File. It continued to be possible to add the same file and apply a different caption through RM6, iirc.

Edit: I went back to RM7 and RM5 and now know that the software constraint was added in RM5 and persisted through RM7 and still does in RM9 for drag’n’drop (added after RM5) but not for Add New Media. My query that converted multiple captions for the same file record in RM4 to multiple file records with unique captions (and distributed the tags accordingly) worked because there was no UNIQUE constraint on the path and filename as stored in the table. However, the RM5+ application software prevented further such replication of unique path/name combinations until RM9, which has loosened it up for Add New Media (probably unintentionally).

True I should have removed that col

I know Drag n Drop media will reuse existing media with the same pathname. I can’t recall if “Add New Media” is supposed to or not, so I reported it to development.


Yes, I agree with that in my experience. Drag N Drop has not created duplicates (for me).
Maybe it is intentional by design to add duplicate media. However, that will cause problems: only one of them appears in the Gallery list etc., from what I can tell. What about the MediaID records that are not (or are no longer) displayed in the gallery?

Kevin