Feature Request - Direct Export to JSON

Going from GEDCOM to JSON, as it turns out, is not as trivial as I had initially thought it would be.

This morning I took a second crack at it and learned a good bit thanks to ChatGPT, which is, by the way, a wonderful teacher. After more than 3 hours in a single unpaid ChatGPT session, we ended up writing a GEDCOM to JSON parser from scratch. It worked well, but only after we had plumbed the numerous shortcomings of python-gedcom.

The Phase 1 - Source-Aware GEDCOM to JSON Parser we created today worked great. Tomorrow we’ll have a go at Phase 2 - Enhanced Parser for GEDCOM Citations.

Along the way, the idea presents itself - could RootsMagic directly export a .json file?

For anybody interested, here’s today’s Phase 1 Python script gedcom_to_json.py:

import json
from gedcom.parser import Parser
from gedcom.element.individual import IndividualElement
from gedcom.element.family import FamilyElement

GEDCOM_FILE = "David Howe.ged"
OUTPUT_JSON = "david_howe_family.json"

parser = Parser()
parser.parse_file(GEDCOM_FILE)

elements = parser.get_root_child_elements()

individuals = {}
families = {}

# First pass: collect individuals and families
for elem in elements:
    if isinstance(elem, IndividualElement):
        pointer = elem.get_pointer()
        individuals[pointer] = {
            "id": pointer,
            "name": elem.get_name(),
            "gender": elem.get_gender(),
            "birth_date": elem.get_birth_data()[0],
            "death_date": elem.get_death_data()[0],
            "spouses": [],
            "children": [],
            "parents": [],
            "famc": None,  # Family child (parents)
            "fams": []     # Family spouse (marriages)
        }
        # Look for FAMC and FAMS in sub-elements
        for child in elem.get_child_elements():
            if child.get_tag() == "FAMC":
                individuals[pointer]["famc"] = child.get_value()
            elif child.get_tag() == "FAMS":
                individuals[pointer]["fams"].append(child.get_value())

    elif isinstance(elem, FamilyElement):
        pointer = elem.get_pointer()
        family = {
            "id": pointer,
            "husband": None,
            "wife": None,
            "children": []
        }
        for child in elem.get_child_elements():
            tag = child.get_tag()
            value = child.get_value()
            if tag == "HUSB":
                family["husband"] = value
            elif tag == "WIFE":
                family["wife"] = value
            elif tag == "CHIL":
                family["children"].append(value)
        families[pointer] = family

# Second pass: connect relationships
for ind in individuals.values():
    # Parents
    famc = ind.get("famc")
    if famc and famc in families:
        parents = families[famc]
        for parent_id in [parents.get("husband"), parents.get("wife")]:
            if parent_id and parent_id in individuals:
                ind["parents"].append(individuals[parent_id]["name"])

    # Spouses and children
    for fam_id in ind.get("fams", []):
        if fam_id in families:
            fam = families[fam_id]
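            # The spouse is whichever partner is not this individual (also works when gender is missing)
            spouse_id = fam.get("wife") if fam.get("husband") == ind["id"] else fam.get("husband")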
            if spouse_id and spouse_id in individuals:
                ind["spouses"].append(individuals[spouse_id]["name"])
            for child_id in fam.get("children", []):
                if child_id in individuals:
                    ind["children"].append(individuals[child_id]["name"])

# Clean up unused keys
for ind in individuals.values():
    ind.pop("famc", None)
    ind.pop("fams", None)

# Save to JSON
with open(OUTPUT_JSON, "w", encoding="utf-8") as f:
    json.dump(list(individuals.values()), f, indent=2, ensure_ascii=False)

print(f"Done! JSON saved as {OUTPUT_JSON}")
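For anyone curious what the "from scratch" part boils down to, here is a minimal sketch using only the standard library. It assumes the basic GEDCOM line layout (level, optional @pointer@, tag, optional value); the function names and the sample lines are made up for illustration and are not part of python-gedcom or of the script above. It does not handle CONT/CONC continuation lines or nesting below level 0.

```python
import re

# Assumed GEDCOM line layout: LEVEL [@POINTER@] TAG [VALUE]
GEDCOM_LINE = re.compile(r"^\s*(\d+)\s+(?:(@[^@]+@)\s+)?(\w+)(?: (.*))?$")


def parse_gedcom_line(line):
    """Split one GEDCOM line into level, pointer, tag, and value."""
    match = GEDCOM_LINE.match(line.rstrip("\r\n"))
    if match is None:
        raise ValueError(f"Not a GEDCOM line: {line!r}")
    level, pointer, tag, value = match.groups()
    return {"level": int(level), "pointer": pointer, "tag": tag, "value": value or ""}


def parse_gedcom_records(lines):
    """Group lines into level-0 records, each holding a flat list of its sub-lines."""
    records = []
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        parsed = parse_gedcom_line(raw)
        if parsed["level"] == 0:
            records.append({**parsed, "children": []})
        elif records:
            records[-1]["children"].append(parsed)
    return records


if __name__ == "__main__":
    # Hypothetical sample lines, not taken from a real file
    sample = ["0 HEAD", "0 @I1@ INDI", "1 NAME John /Doe/", "1 FAMC @F1@", "0 TRLR"]
    for record in parse_gedcom_records(sample):
        print(record["tag"], record["pointer"], len(record["children"]))
```

When reading a real .ged file, opening it with encoding="utf-8-sig" should keep a leading byte-order mark from tripping the first line, assuming the file is UTF-8.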

I’m wondering if just calling SQLite’s CLI (via the Windows command line) could get You the JSON (table by table):
sqlite3 -json databasename.rmtree "select * from rmtablename" > rmtablename.json


Good question, great idea, and I’ve been trying to do this, though for me this morning it’s been a bit like reading Braille for the first time. I’ve managed to install SQLite, but I’m not sure about the bits in the command line beginning at rmtablename:

sqlite3 -json "David Howe.rmtree" "select * from rmtablename" > rmtablename.json

For example, in an RM database (.rmtree), does the table name identically mirror the name of the .rmtree file? In this example there is a space between David and Howe; does that then require the use of quotes, or is it handled in some other way?

Comically, in the dark I tried to discover the rmtablename with the command .tables without success. I don’t really know SQLite, though have been exposed to it in different surveying-related applications for several years. Uncle!
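One way to discover the table names without the sqlite3 shell is to ask the database file itself. This is only a sketch using Python's built-in sqlite3 module; list_tables is a made-up helper name, and it assumes the .rmtree file is an ordinary SQLite database, as the command above implies. Since the path is passed as a plain Python string, a space in the file name needs no special quoting here.

```python
import os
import sqlite3


def list_tables(db_path):
    """Return the names of all tables in a SQLite file, sorted alphabetically."""
    # sqlite3.connect() silently creates an empty database when the file is
    # missing, which would look like "no tables", so check the path first.
    if not os.path.isfile(db_path):
        raise FileNotFoundError(db_path)
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    return sorted(name for (name,) in rows)


if __name__ == "__main__":
    # File name taken from this thread; adjust to the real location
    for table in list_tables("David Howe.rmtree"):
        print(table)
```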

One of the reasons I want to apply your approach (assuming it will produce a .json file) is to help evaluate the .json that’s produced using the Python script gedcom_to_json.py.

And if you’re wondering why the feature request for an export from RM to .json in the first place, it’s largely tied to defining the predicate for experiments in AI-generated Descendant Narratives; e.g., screenshot of example request to ChatGPT below.

Your PowerShell session shows having typed sqlite3 (in yellow) without the accompanying database to be opened, so it instead opened a BLANK transient in-memory database (NO tables).

Let’s just give You a table name to work with and see the file created:

sqlite3 -json "David Howe.rmtree" "select * from NameTable" > NameTable.json

I’m not home at my computer, so I had only used a Windows CMD (Command Prompt) window… not sure how PowerShell might need anything tweaked.
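If the shell turns out to be the sticking point, the same export can be sketched in Python, which sidesteps quoting and redirection differences between CMD and PowerShell (as far as I know, Windows PowerShell 5.1 writes UTF-16 when redirecting with >). The function name export_table_to_json is made up for this example, and decoding BLOB columns as UTF-8 text is an assumption, not something the sqlite3 CLI or RootsMagic promises.

```python
import json
import sqlite3


def _blob_to_text(value):
    """json.dump fallback: assumption that BLOB columns hold UTF-8 text."""
    if isinstance(value, bytes):
        return value.decode("utf-8", errors="replace")
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def export_table_to_json(db_path, table_name, json_path):
    """Write every row of one table to a JSON file; return the row count."""
    conn = sqlite3.connect(db_path)
    conn.row_factory = sqlite3.Row  # lets each row be converted to a dict
    try:
        # Table names cannot be bound as SQL parameters, so verify the name
        # against the database's own table list before building the query.
        known = {
            row["name"]
            for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        }
        if table_name not in known:
            raise ValueError(f"No table named {table_name!r}")
        rows = [dict(row) for row in conn.execute(f'SELECT * FROM "{table_name}"')]
    finally:
        conn.close()
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(rows, f, indent=2, ensure_ascii=False, default=_blob_to_text)
    return len(rows)


if __name__ == "__main__":
    count = export_table_to_json("David Howe.rmtree", "NameTable", "NameTable.json")
    print(f"Exported {count} rows")
```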

Thank you Kevin.
That worked to produce 154 lines; e.g., the last line:
{"NameID":161,"OwnerID":154,"Surname":"Whiting","Given":"Charles","Prefix":"","Suffix":"","Nickname":"","NameType":0,"Date":".","SortDate":9223372036854775807,"IsPrimary":0,"IsPrivate":0,"Proof":0,"Sentence":"","Note":"","BirthYear":1814,"DeathYear":1890,"Display":0,"Language":"","UTCModDate":45757.72148973379808,"SurnameMP":"Whiting","GivenMP":"Charles","NicknameMP":""}]

The output NameTable.json doesn’t appear to include linked source citations for any of the 154 lines.

Those are in different tables → SourceTable, CitationTable, the CitationLink table, etc.

Look up Table Names and Table Fields HERE

Should work (from within sqlite3’s shell)

Thank you Bob & Kevin for the help and sorry not to have acknowledged it sooner.

FWIW - This thread on community.RM unfolded while I was playing with ChatGPT. The 4-day-long conversation with ChatGPT was broken into multiple sessions, which took place from 20250413 through 20250416. On Friday, I added some notes on my experiments. Even though I tried to share the file, it’s not provided here at community.RM because I’m not allowed to use more than 32k characters and can’t attach any file types other than images (.pdf, .zip, etc. are not accepted).

Therefore, the file is temporarily available from:
