multi-vector-simulator icon indicating copy to clipboard operation
multi-vector-simulator copied to clipboard

Storing json files needs to be reworked

Open smartie2076 opened this issue 4 years ago • 10 comments

We need to call F0.store_as_json() after the csvs are read from file, so that we can (a) provide ICOM with the things we expect from EPA more easily in the future and (b) update our pytests more easily. In my case, I used json_processed.json to replace/update mvs_config.json , and now I get error messages because I am actually providing the json file as it is after applying C to it:

_____________________________________ test_input_folder_json_file_have_required_parameters[D:\\PycharmProjects\\mvs_eland\\tests\\inputs] ______________________________________

input_folder = 'D:\\PycharmProjects\\mvs_eland\\tests\\inputs'

    @pytest.mark.parametrize("input_folder", TEST_JSON_INPUT_FOLDERS)
    def test_input_folder_json_file_have_required_parameters(input_folder):
        """
        Browse all folders which contains a {CSV_ELEMENTS} folder and a {JSON_FNAME} json file
        (defined as json input folders) and verify that this json file have all required
        parameters
        """
        if os.path.exists(os.path.join(input_folder, JSON_FNAME)):
            comparison = compare_input_parameters_with_reference(input_folder, ext=JSON_EXT)
>           assert MISSING_PARAMETERS_KEY not in comparison, (
                f"In path {input_folder}/{JSON_FNAME}, the following parameters are missing:"
                + str(comparison[MISSING_PARAMETERS_KEY])
            )
E           AssertionError: In path D:\PycharmProjects\mvs_eland\tests\inputs/mvs_config.json, the following parameters are missing:{'energyConsumption': ['type_asset', 'dsm', '
type_asset', 'dsm', 'type_asset', 'dsm', 'type_asset', 'dsm', 'type_asset', 'dsm', 'type_asset', 'dsm']}
E           assert 'missing_parameters' not in {'extra_parameters': {'constraints': ['maximum_emissions', 'minimal_degree_of_autonomy'], 'economic_data': ['CRF', 'an.....}, 'mis
sing_parameters': {'energyConsumption': ['type_asset', 'dsm', 'type_asset', 'dsm', 'type_asset', 'dsm', ...]}}

tests\test_input_folder_parameters.py:75: AssertionError

(The error is connected to functions integrated with PR #761)

smartie2076 avatar Jan 20 '21 08:01 smartie2076

The excess sinks added in C0 result in the error message printed above (excess sinks dont have dsm and asset_type, but should also not be in the json file provided in the MVS).

"PV plant (mono) bus_excess": {
            "Specific_replacement_costs_of_installed_capacity": {
                "unit": "currency/unit",
                "value": 0
            },
            "Specific_replacement_costs_of_optimized_capacity": {
                "unit": "currency/unit",
                "value": 0
            },
            "age_installed": {
                "unit": "year",
                "value": 0
            },
            "annuity_of_specific_investment_costs_and_specific_annual_om": {
                "unit": "currency/unit/year",
                "value": 0.0
            },
            "development_costs": {
                "unit": "currency",
                "value": 0
            },
            "dispatch_price": {
                "unit": "currency/unit",
                "value": 0
            },
            "energyVector": "Electricity",
            "inflow_direction": "PV plant (mono) bus",
            "installedCap": {
                "unit": "unit",
                "value": 0.0
            },
            "label": "PV plant (mono) bus_excess_sink",
            "lifetime": {
                "unit": "year",
                "value": 20
            },
            "lifetime_price_dispatch": {
                "unit": "?/hour",
                "value": 0.0
            },
            "lifetime_specific_cost": {
                "unit": "currency/unit",
                "value": 0
            },
            "lifetime_specific_cost_om": {
                "unit": "currency/ye",
                "value": 0.0
            },
            "optimizeCap": {
                "unit": "bool",
                "value": true
            },
            "simulation_annuity": {
                "unit": "currency/unit/evaluated_period",
                "value": 0.0
            },
            "specific_costs": {
                "unit": "currency/unit",
                "value": 0
            },
            "specific_costs_om": {
                "unit": "currency/year",
                "value": 0
            },
            "type_oemof": "sink",
            "unit": "?"
        },

smartie2076 avatar Jan 20 '21 08:01 smartie2076

Actually, this already takes place! B0.load_json() stores the input values into mvs_config.json, located in [OUTPUT_PATH]/inputs. It seems I overlooked this :D

@Bachibouzouk @SabineHaas were you aware of this? Should we store the json file that results from reading the csv files one level up, directly [OUTPUT_PATH]?

smartie2076 avatar Jan 27 '21 15:01 smartie2076

I wanted to avoid confusion between an original mvs_config.json and a mvs_csv_config.json, the input values are in, but not the timeseries, so it is not analog to json_processed.json

Bachibouzouk avatar Jan 27 '21 15:01 Bachibouzouk

@Bachibouzouk @SabineHaas were you aware of this? Should we store the json file that results from reading the csv files one level up, directly [OUTPUT_PATH]?

I wasn't aware of that.. so there are three different jsons in the end? The original, mvs_csv_config.json directly read from csvs, that is used as input for the simulation and json_process.json including everything (added time series, added vales and parameters) ?

SabineHaas avatar Jan 27 '21 15:01 SabineHaas

To summarize:

  • mvs_csv_config.json is created via json.dump in A1 , and the name of this file is generated from CSV_FNAME, a path is generated here.
  • json_input_processed.json is created here in .cli, using F0.store_as_json with filename stored in JSON_PROCESSED.
  • json_with_results.json is created here in F0, using F0.store_as_json with filename stored in JSON_WITH_RESULTS.
  • mvs_config.json is defined with JSON_FNAME = "mvs_config.json", which then is used to define JSON_PATH. ... but the actual file is generated no-where. I have just deleted the mvs_config.json from our tests/inputs and the MVS_outputs/inputs did not contain the file anymore! I suspect that before, it was always just copied from our tests/inputs but not generated anymore.

So, I guess, we have different things to tackle here.

  • Should mvs_csv_config.json be stored without the F0.store_as_json function? I would think it should use F0.store_as_json.
  • I would like to have mvs_config.json to be generated before we apply C0 (not sure if this means "after A1" or "after B0), as that is the file closest to the EPA and also it includes all needed data for the simulation (timeseries etc).
  • Should we store two seperate files mvs_csv_config.json and mvs_config.json? I would think that actually mvs_config.json is enough. If mvs_csv_config.json is needed for A1 to work, can we delete it after that is done?
  • Where should we store the files? I would say not in MVS_outputs/inputs but in MVS_outputs directly.
  • Is there an argument for not generating all json files directly from .cli? I would like to be consistent with this.

smartie2076 avatar Jan 28 '21 20:01 smartie2076

@Piranias could you please check whether @smartie2076 's proposals/questions do affect pvcompare in a way you could not handle well?

SabineHaas avatar Feb 01 '21 16:02 SabineHaas

  • Should mvs_csv_config.json be stored without the F0.store_as_json function? I would think it should use F0.store_as_json.

You mean one of the files starting with json_ right?

  • I would like to have mvs_config.json to be generated before we apply C0 (not sure if this means "after A1" or "after B0), as that is the file closest to the EPA and also it includes all needed data for the simulation (timeseries etc).

  • Should we store two seperate files mvs_csv_config.json and mvs_config.json? I would think that actually mvs_config.json is enough. If mvs_csv_config.json is needed for A1 to work, can we delete it after that is done?

I also don't see the advantage of having these two files; but maybe @Bachibouzouk knows more.

  • Where should we store the files? I would say not in MVS_outputs/inputs but in MVS_outputs directly.

  • Is there an argument for not generating all json files directly from .cli? I would like to be consistent with this.

SabineHaas avatar Feb 01 '21 16:02 SabineHaas

@Piranias could you please check whether @smartie2076 's proposals/questions do affect pvcompare in a way you could not handle well?

We don't do anything with json files at all. Neither one ;) So I don't think storing it in outputs or not changes anything in pvcompare.

Piranias avatar Feb 01 '21 16:02 Piranias

  • Should mvs_csv_config.json be stored without the F0.store_as_json function? I would think it should use F0.store_as_json. Yes otherwise it will not write the pandas objects to the json in a fashion where we can then re-read the file with B0.read_json
  • I would like to have mvs_config.json to be generated before we apply C0 (not sure if this means "after A1" or "after B0), as that is the file closest to the EPA and also it includes all needed data for the simulation (timeseries etc).

The timeseries are loaded within C0 though, maybe move that part in B0, or in a new module B1 ? (it is relatively easy to move, if there is a FILENAME field in an asset as well as a TIME_SERIES fied, raise a warning and try to load from the file, otherwise fallback on the TIME_SERIES, if there is only FILENAME, try to load the file. if there is none of them raise an error) At this stage one could also check the length of the TIME_SERIES compared to simulation settings and issue relevant warnings (shorter or longer than expected, datetimeindex not matching, etc)

  • Should we store two seperate files mvs_csv_config.json and mvs_config.json? I would think that actually mvs_config.json is enough. If mvs_csv_config.json is needed for A1 to work, can we delete it after that is done?

mvs_csv_config.json is needed for A1, but it could be deleted and we could use mvs_config.json instead. The idea was that the simulation parameter could be provided by either csv or json. We are 100% sure mvs_csv_config.json is matching the content of csv_elements, but never sure whether a mvs_config.json does or not (it could not have been produced by A1) If we replace something I would only do so in the copied "inputs" folder in the output" folder, not in the original folder. If we save mvs_config.json before we apply C0 anyway, we could not save mvs_csv_config in A1.create_input_json, but instead returning a python dict and passing it to B0.load_json (simply then modify B0.load_json to check whether argument is a path to a file or already a python dict and skip the file loading in second case)

  • Where should we store the files? I would say not in MVS_outputs/inputs but in MVS_outputs directly.

I like that MVS_outputs/inputs is a self contained folder I can zip and send to someone to try a simulation with same input parameters. I would keep inputs or rename it used_inputs or inputs_copy

  • Is there an argument for not generating all json files directly from .cli? I would like to be consistent with this. You mean calls to save files not hidden in functions but explicitly written in cli.main ? I think some of the tests rely on file generation being done in submodules (otherwise tests would require whole run of the cli), so I would only move the calls not affecting tests too much ;)

Bachibouzouk avatar Feb 01 '21 21:02 Bachibouzouk

Puuh, there is a lot to unpack here.

  • Should mvs_csv_config.json be stored without the F0.store_as_json function? I would think it should use F0.store_as_json.

You mean one of the files starting with json_ right?

@SabineHaas No, I do mean the other ones. The function that parses the output function is currently located in F0 maybe that is what`s confusing.

  • Should mvs_csv_config.json be stored without the F0.store_as_json function? I would think it should use F0.store_as_json. Yes otherwise it will not write the pandas objects to the json in a fashion where we can then re-read the file with B0.read_json

Okay makes sense @Bachibouzouk

  • I would like to have mvs_config.json to be generated before we apply C0 (not sure if this means "after A1" or "after B0), as that is the file closest to the EPA and also it includes all needed data for the simulation (timeseries etc).

The timeseries are loaded within C0 though, maybe move that part in B0, or in a new module B1 ?

I think we should move all functions that read timeseries from files from C0 to A1 (or B1) - otherwise we might have trouble with the EPA, which does not supply us with any filenames.

I did not understand what you proposed clearly.

  • Should we store two seperate files mvs_csv_config.json and mvs_config.json? I would think that actually mvs_config.json is enough. If mvs_csv_config.json is needed for A1 to work, can we delete it after that is done?

mvs_csv_config.json is needed for A1, but it could be deleted and we could use mvs_config.json instead. [...] If we save mvs_config.json before we apply C0 anyway, we could not save mvs_csv_config in A1.create_input_json, but instead returning a python dict and passing it to B0.load_json

Yes, we should change it that way.

  • Where should we store the files? I would say not in MVS_outputs/inputs but in MVS_outputs directly.

I like that MVS_outputs/inputs is a self contained folder I can zip and send to someone to try a simulation with same input parameters. I would keep inputs or rename it used_inputs or inputs_copy

True, but when storing mvs_config.json in the input folder, exactly that is not possible anymore: We have two sets of complete input data. You can not know if you changed one of them manually. So we should store all jsons out of inputs/inputs_copy. However, I think then one must move mvs_config.json to be able to restart a simulation (no multiple json files supported in input folder)

  • Is there an argument for not generating all json files directly from .cli? I would like to be consistent with this. You mean calls to save files not hidden in functions but explicitly written in cli.main ? I think some of the tests rely on file generation being done in submodules (otherwise tests would require whole run of the cli), so I would only move the calls not affecting tests too much ;)

Yeah, those. In that case, we could move the file generation consistently into the modules?

Following tasks are open:

  • [ ] Move F0.store_to_json into utils/helpers.py
  • [ ] Move all functions and function calls that read timeseries from csv to A1/B1
  • [ ] Avoid storing to mvs_csv_config.json, change return values/function calls in A1 and B0
  • [x] Enable storing mvs_config.json again
  • [ ] Change path to store mvs_config.json to MVS_outputs (attention, this may not work)
  • [ ] Rename inputs in MVS_outputs to inputs_copy
  • [ ] Refactor: json_processed.json to mvs_processed.json and json_with_results.json to mvs_with_results.json (proposal)

I was thinking of doing all this now, but honestly this is more of an add-on and contingency thing, so I will only take care that mvs_config.json is created again and leave it at that right now.

smartie2076 avatar Feb 03 '21 14:02 smartie2076