Limit read_excel() to existing elements for set 'year' etc.
read_excel allows the upload of data for years not defined in the scenario, this creates issues and therefore either the data-upload should be rejected or a new scenario with additional timesteps using the function add_year should be created.
If a scenario has the years defined:
scen.set('year').tolist()
# This yields
[1960, 1970, 1980, 1990, 1995, 2000, 2005, 2010, 2015, 2020, 2025, 2030, 2035, 2040, 2045, 2050, 2060, 2070, 2080, 2090, 2100, 2110]
And an data is added using the function read_excel:
scen.read_excel('<name_of_my_file.xlsx>, add_units=True, init_items=True)
# And then the duration_period parameter is retrieved, the following is the result
year value unit
0 1970 10.0 y
1 1980 10.0 y
2 1990 10.0 y
3 1995 5.0 y
4 2000 5.0 y
5 2005 5.0 y
6 2010 5.0 y
7 2015 5.0 y
8 2020 5.0 y
9 2025 5.0 y
10 2030 5.0 y
11 2035 5.0 y
12 2040 5.0 y
13 2045 5.0 y
14 2050 5.0 y
15 2060 10.0 y
16 2070 10.0 y
17 2080 10.0 y
18 2090 10.0 y
19 2100 10.0 y
20 2110 10.0 y
21 1960 10.0 y
22 1955 5.0 y
23 1965 10.0 y
24 1975 10.0 y
25 1985 10.0 y
26 2055 70.0 y
27 1950 -160.0 y
Some thoughts:
ixmp.Scenario.read_excel()was originally written to support reading all data (sets and parameters) for a scenario into an empty Scenario object. The newer implementation inixmp.backend.ioand tests are for this behaviour.- It's nice that the code is flexible enough so that it works in updating a Scenario that already contains data. But this is a novel usage: there was never any discussion of how it should work, and there are no tests for it.
- The code is in ixmp, and it doesn't know how to give preferential treatment to different sets and parameters that are only defined in message_ix, e.g.
year.
We can imagine many different use-cases like (2):
A. The one described above: add or modify parameter values, but do not add values to year (or any set? Only some sets?)
B. The user has a file that describes one technology to be added to an existing model. The technology does not exist in the model. The sheet for the technology set contains the ID of the new technology to be added; other sheets contain parameter data that reference this new set element.
We could consider some enhancements:
-
A new option to
ixmp.backend.io.s_read_excelto read only parameter data, and no elements for any set. -
Enhance
ixmp.backend.base.Backend.read_file,s_read_excel, andScenario.read_excelto accept/pass/handle a filters= keyword argument that allows to specify, e.g.filters=dict(year=self.set("year"), …)and filters the data read from file before adding it.In this case, further enhancements could be added by overriding
Scenario.read_excel()inmessage_ixto apply defaults to this argument, e.g. foryear,year_vtg,year_act, etc.
In any case, the code can't distinguish between (A) and (B) vs. (1); the user would always need to indicate the desired behaviour somehow.
In the meantime, the docs for ixmp and/or message_ix could be improved to explain (1), and also link to this issue.
Maybe the slightly related issue #163 could also be handled simultaneously with this one.