Script for getting model_region-to-iso mapping

Open jkikstra opened this issue 11 months ago • 1 comments

Some time ago, @dc-almeida wrote a script, which I adapted slightly, that builds a dataframe listing each registered region together with the countries defined under it, adds the corresponding list of ISO3 codes as more standardized names, and writes the result to "region_df.csv".

See https://github.com/iiasa/emissions_harmonization_historical/issues/25; the script is copied here for convenience:

from nomenclature import countries
from nomenclature.definition import DataStructureDefinition
from nomenclature.processor import RegionProcessor
import pandas as pd

# a DSD stores the definitions info for regions, variables, scenarios
dsd = DataStructureDefinition("definitions")

# dsd.region stores a dictionary of <region name>: <RegionCode object>
# a RegionCode object stores the info for a region defined in the YAML files, such as the model (
region_df = pd.DataFrame(
    [(r.name, r.hierarchy, r.countries, r.iso3_codes) for r in dsd.region.values()],
    columns=["name", "hierarchy", "countries", "iso3"]
)
# the iso3 column is currently empty, so derive it from the country names
def get_iso3_list(country_list):
    """Return the ISO3 code for each country name (None where no match is found)."""
    if not country_list:
        return None
    iso3_list = []
    for country in country_list:
        try:
            match = countries.get(name=country)
            iso3_list.append(match.alpha_3 if match else None)
        except Exception:
            iso3_list.append(None)  # lookup failed, keep a placeholder
    return iso3_list


# fill the iso3 column
region_df["iso3"] = region_df["countries"].apply(get_iso3_list)


# the RegionProcessor creates the mappings object
# the mappings are defined at the model level in a RegionAggregationMapping object
# each RAM contains, among other things, the list of "common regions" (regions resulting
# from aggregation) and their constituents
rp = RegionProcessor.from_directory("mappings", dsd)
rows = []
for ram in rp.mappings.values():
    rows.extend(
        [
            (
                ram.model,
                common_region.name,
                [ram.rename_mapping[nr] for nr in common_region.constituent_regions],
            )
            for common_region in ram.common_regions
        ]
    )
mappings_df = pd.DataFrame(
    rows,
    columns=["model(s)", "common_region", "constituent_regions"],
)


region_df.to_csv("region_df.csv")
mappings_df.to_csv("mappings_df.csv")
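The script writes the two tables separately. To get from a model's common regions all the way to ISO3 codes, they still need to be combined. A minimal sketch of that step, assuming the names in "constituent_regions" match the "name" column of region_df (the sample rows, `region_iso3` and `resolve_iso3` are made up for illustration and are not part of nomenclature):

```python
# Hypothetical sample data shaped like the two tables above.
# region name -> iso3 list, as produced by get_iso3_list()
region_iso3 = {
    "Model A|Western Europe": ["AUT", "BEL"],
    "Model A|Nordics": ["DNK", "SWE", None],  # None marks an unmatched country name
    "Model A|Other": None,  # region without a country list
}
# rows of (model(s), common_region, constituent_regions)
mapping_rows = [
    ("Model A", "Europe", ["Model A|Western Europe", "Model A|Nordics"]),
    ("Model A", "World", ["Model A|Western Europe", "Model A|Nordics", "Model A|Other"]),
]


def resolve_iso3(mapping_rows, region_iso3):
    """Return (model, common_region, sorted ISO3 codes) for every mapping row."""
    resolved = []
    for model, common_region, constituents in mapping_rows:
        codes = set()
        for region in constituents:
            # Skip regions without a country list and codes that failed to match.
            codes.update(code for code in (region_iso3.get(region) or []) if code)
        resolved.append((model, common_region, sorted(codes)))
    return resolved


model_region_iso3 = resolve_iso3(mapping_rows, region_iso3)
```

The model entry is passed through untouched, so it does not matter whether it is a single name or a list of models. Unmatched countries are dropped silently here; a stricter version could collect them and report them instead.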

We use this in a different package and are currently thinking of putting the script there and pointing it at someone's local installation.

However, it may be worth having "user scripts" that create files based on this package for easy use elsewhere. Would you want to host this code yourselves, so that users can simply run the script?
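One thing to keep in mind when the files are consumed elsewhere: `to_csv` writes list-valued cells as their string representation, so the "countries" and "iso3" columns come back from `read_csv` as plain strings. A rough sketch of a round trip that restores the lists, using a small made-up stand-in for region_df (the sample values and the `parse_list` helper are assumptions for illustration):

```python
import ast
import io

import pandas as pd

# Hypothetical stand-in for region_df; the real one comes from the script above.
region_df = pd.DataFrame(
    {
        "name": ["Western Europe", "World"],
        "hierarchy": ["common", "common"],
        "countries": [["Austria", "Belgium"], None],
        "iso3": [["AUT", "BEL"], None],
    }
)

# Same call as in the script, but written to memory instead of "region_df.csv".
buffer = io.StringIO()
region_df.to_csv(buffer)
buffer.seek(0)


def parse_list(cell):
    """Turn a cell like "['AUT', 'BEL']" back into a list; missing cells become None."""
    return ast.literal_eval(cell) if isinstance(cell, str) else None


loaded = pd.read_csv(buffer, index_col=0)
for column in ("countries", "iso3"):
    loaded[column] = loaded[column].apply(parse_list)
```

`ast.literal_eval` only evaluates Python literals, so it is a safer choice than `eval` for files that other people produce. A format that keeps lists natively, such as JSON, would avoid the parsing step altogether.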

Jan 16 '25 12:01 jkikstra

See also the related issue on nomenclature: https://github.com/IAMconsortium/nomenclature/issues/403

Jan 23 '25 13:01 jkikstra