oemof-solph
Saving results
Is pickle the most efficient way to store the results?
@oemof/oemof-solph Would it make sense to provide something like:
outputlib.processing.save(results, 'results/my_results.pkl')
results = outputlib.processing.load('results/my_results.pkl')
@oemof/oemof-solph Would it make sense to provide something like:
outputlib.processing.save(results, 'results/my_results.pkl')
results = outputlib.processing.load('results/my_results.pkl')
I would leave the pickling to the user since it's only two more lines of code than providing a function/method.
One note: The results can only be pickled/accessed if the dict-keys are converted to strings (a function for this already exists), because the keys are energy system objects which differ between sessions (even if created identically). So if the "old" objects no longer exist due to a new session, deletion etc., one cannot access the results anymore if the keys have not been converted to strings.
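For illustration, the "two more lines" could look like this (assuming the already existing key-conversion helper is processing.convert_keys_to_strings):

import pickle

from oemof.outputlib import processing

# convert node-object keys to plain strings so the file can be loaded
# in a fresh session where the original energy system objects are gone
my_results = processing.convert_keys_to_strings(processing.results(om))

with open('results/my_results.pkl', 'wb') as f:
    pickle.dump(my_results, f)

with open('results/my_results.pkl', 'rb') as f:
    my_results = pickle.load(f)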
Is pickle the most efficient way to store the results?
I have also thought about that. At the moment, I am thinking about providing functionality for writing all results to Excel spreadsheets in a generic way.
@uvchik: what do you think about this functionality?
I think to_something functions are really helpful and important for a results structure because everybody wants to store results.
One note: The results can only be pickled/accessed if the dict-keys are converted to strings (a function for this already exists), because the keys are energy system objects which differ between sessions (even if created identically). So if the "old" objects no longer exist due to a new session, deletion etc., one cannot access the results anymore if the keys have not been converted to strings.
This makes me think that a to_pickle function might be helpful. This function raises an error if you have not converted your keys to strings. Or it raises a warning, converts the keys to strings and pickles them afterwards: "Keys will be converted to strings. Set convert_to_string=True to avoid this warning".
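A rough sketch of what such a to_pickle could look like (the function itself is hypothetical and again assumes a convert_keys_to_strings helper in processing):

import pickle
import warnings

from oemof.outputlib import processing


def to_pickle(results, path, convert_to_string=False):
    # hypothetical helper: pickle a results dict with string keys
    if not convert_to_string:
        warnings.warn("Keys will be converted to strings. "
                      "Set convert_to_string=True to avoid this warning.")
    results = processing.convert_keys_to_strings(results)
    with open(path, 'wb') as f:
        pickle.dump(results, f)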
I think to_something functions are really helpful and important for a results structure because everybody wants to store results.
Definitely! We just have to think about the structure/naming as more formats might follow.
One note: The results can only be pickled/accessed if the dict-keys are converted to strings (a function for this already exists), because the keys are energy system objects which differ between sessions (even if created identically). So if the "old" objects no longer exist due to a new session, deletion etc., one cannot access the results anymore if the keys have not been converted to strings. This makes me think that a to_pickle function might be helpful. This function raises an error if you have not converted your keys to strings. Or it raises a warning, converts the keys to strings and pickles them afterwards: "Keys will be converted to strings. Set convert_to_string=True to avoid this warning".
You are right. This would really speak for such a function.
I think we should distinguish between dump functions that make it possible to reload the results in the same form and export functions that may drop some information.
Let us create an i/o-scheme at the meeting and afterwards everybody is free to add export, dump, load etc. functions.
I think we should distinguish between dump functions that make it possible to reload the results in the same form and export functions that may drop some information.
Let us create an i/o-scheme at the meeting and then everybody is free to add export, dump, load etc. functions.
:+1:
My proposal:
- processing.py -> creates/preserves/(re)stores the original result object, e.g. by (un)pickling
- views.py -> creates subsets of the original result object, e.g. by slicing for specific nodes
- conversion.py (or another name) -> exports/imports the original result object or views, e.g. to xls, csv, json, ...
So it could look like:
from oemof.outputlib import processing, views, conversion
my_results = processing.results(om)
processing.store(my_results, 'my_results.oemof') # to file system with nodes internally converted to strings. returns a boolean
my_results = processing.restore('my_results.oemof') # from file system
node_data = views.node(my_results, 'my_node')
conversion.to_xls(node_data, 'my_chp.xlsx') # returns a boolean
Sometimes I think about having a result class, again. But using functions also provides a clear API and we have single modules with clearly defined functions which do not explode ;-)
@oemof/oemof-developer-group: What do you think?
By the way, is there a nice way to get all inputs of a bus?
Something like this:
my_results = processing.results(om)
my_results[(, bus)]['sequences']  # pseudocode: sequences of all flows into the bus
I wrote a function to divide the columns into input and output. Now I think it might be better to pass the DataFrame and get back the part of the DataFrame.
Maybe somebody knows an easier way.
my_results = processing.results(om)
df = views.node(my_results, 'my_bus')['sequences']
divided_columns = divide_bus_columns('my_bus', df.columns)
input_flows = df[divided_columns['in_cols']]
output_flows = df[divided_columns['out_cols']]
def divide_bus_columns(bus_label, columns):
    """
    Divide columns into input columns and output columns. This function
    depends on the API of the oemof outputlib. Last changes (v0.2.0).

    Parameters
    ----------
    bus_label : str
        Label of the bus.
    columns : iterable
        Column labels of the bus' sequences DataFrame.

    Returns
    -------
    dict
        Keys 'in_cols' and 'out_cols' with the respective column labels.
    """
    return {
        'in_cols': [
            c for c in columns if (len(c[0]) > 1 and c[0][1] == bus_label)],
        'out_cols': [
            c for c in columns if (len(c[0]) > 1 and c[0][0] == bus_label)]}
@simnh I have not found a nice way yet... For now I do it like this:
from itertools import zip_longest

bus = energysystem.groups['b_el2']
bus_flows = zip(*zip_longest(*results))
bus_input_flows = [
    (input_flow, output_flow)
    for (input_flow, output_flow) in bus_flows if output_flow == bus
]
for flow in bus_input_flows:
    print(results[flow]['sequences'])
Thereby, the double zipping (with zip_longest filling up tuples to the longest length with Nones) is needed in case a single-node flow is present (i.e. in case of GenericStorage). Otherwise unpacking of results would cause an error, as (GenericStorage, ) cannot be unpacked into two variables...
I would appreciate easier suggestions to do the above!
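One simpler variant that might do the same, assuming flow keys are 2-tuples and only single-node entries such as (GenericStorage, ) are shorter:

bus = energysystem.groups['b_el2']

# keep only 2-tuple keys whose second entry (the target node) is the bus
bus_input_flows = [k for k in results if len(k) == 2 and k[1] == bus]

for flow in bus_input_flows:
    print(results[flow]['sequences'])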
Maybe then it would be nice to add a shortcut for that for the API?
outputs_bel = results.outputs(bel)
Sometimes I think about having a result class, again.
I think you did a good job with the new results API. After releasing v0.3, people can bring together their experience and think about creating a results class or doing something else 😄
I agree with your store/restore API, but if we end up with more than one way to dump/restore, we need a way to distinguish them from each other.
processing.store(my_results, 'my_results.oemof', to='pickle')
processing.store_to_pickle(my_results, 'my_results.oemof')
I like export more than conversion, and I would say that import does not make sense if the object cannot be recovered. If the object can be recovered it should not be export but store/restore as described above.
The export function will have the problem that you can export different things.
my_results = processing.results(om)
export.to_xls(my_results, 'my_file.xlsx')
# OR
node_data = views.node(my_results, 'my_node')
conversion.to_xls(node_data, 'my_chp.xlsx') # returns a boolean
In the future there could be even more views.
Maybe then it would be nice to add a shortcut for that for the API?
outputs_bel = results.outputs(bel)
Yes. And it should be provided as a view within my proposed logic. We could use @uvchik's logic or something similar. It's not a big deal...
@henhuy's approach would, of course, also work!
Sometimes I think about having a result class, again. I think you did a good job with the new results API. After releasing v0.3, people can bring together their experience and think about creating a results class or doing something else
I agree with your store/restore API, but if we end up with more than one way to dump/restore, we need a way to distinguish them from each other.
processing.store(my_results, 'my_results.oemof', to='pickle')
processing.store_to_pickle(my_results, 'my_results.oemof')
I like export more than conversion, and I would say that import does not make sense if the object cannot be recovered. If the object can be recovered it should not be export but store/restore as described above.
:+1: I agree with the name.
The export function will have the problem that you can export different things.
my_results = processing.results(om)
export.to_xls(my_results, 'my_file.xlsx')
# OR
node_data = views.node(my_results, 'my_node')
conversion.to_xls(node_data, 'my_chp.xlsx') # returns a boolean
In the future there could be even more views.
I don't see a problem here. It could be export.to_csv(*args) or export.to_xls(*args), etc. Or did I miss something?
Maybe then it would be nice to add a shortcut for that for the API?
node_data = views.node(my_results, 'my_node', split='output')
node_data = views.node_outputs(my_results, 'my_node')
or something like this....
Or did I miss something?
This is defined by the main API:
my_results = processing.results(om)
export.to_xls(my_results, 'my_file.xlsx')
...but if you want to export views, you need a definition of how a view should look if other views are added; otherwise it will get complicated.
my_results = processing.results(om)
node_data = views.node(my_results, 'my_node')
# This function works only with views from the views.node function
export.to_xls(node_data, 'my_file.xlsx')
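Just to illustrate, such an export for a views.node result could look roughly like this (assuming the view is a dict with a 'sequences' DataFrame and a 'scalars' Series; the function name is only a placeholder):

import pandas as pd


def to_xls(node_data, path):
    # placeholder export: write one sheet per part of the node view
    with pd.ExcelWriter(path) as writer:
        node_data['sequences'].to_excel(writer, sheet_name='sequences')
        node_data['scalars'].to_frame('value').to_excel(
            writer, sheet_name='scalars')
    return True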
I like any solution, as long as it provides a one-liner for getting inputs, outputs...
Actually, I think it would be great if the export could transfer the tuple to a two-row format, don't you think?
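As a sketch with pandas (the column labels here are made up): turning the (from, to) tuples into a MultiIndex makes to_excel write them as two header rows.

import pandas as pd

# made-up flow sequences; column labels are (from, to) tuples
columns = [('my_chp', 'my_bus'), ('my_bus', 'demand')]
df = pd.DataFrame([[1.0, 0.5], [2.0, 1.5]])

# a MultiIndex column header is written by to_excel as two header rows
df.columns = pd.MultiIndex.from_tuples(columns, names=['from', 'to'])
df.to_excel('two_row_header.xlsx')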
Or did I miss something? This is defined by the main API:
my_results = processing.results(om)
export.to_xls(my_results, 'my_file.xlsx')
...but if you want to export views, you need a definition of how a view should look if other views are added; otherwise it will get complicated.
my_results = processing.results(om)
node_data = views.node(my_results, 'my_node')
# This function works only with views from the views.node function
export.to_xls(node_data, 'my_file.xlsx')
Actually I have some ideas for this but would prefer to discuss them next week. Discussing it here is too exhausting and it won't take long ;-)
Actually, I think it would be great if the export could transfer the tuple to a two-row format, don't you think?
Let's also discuss it next week. It's just about decisions and easy to implement...
Actually I have some ideas for this but would prefer to discuss them next week. Discussing it here is too exhausting and it won't take long ;-)
Agree :+1:
Connected to #380
At the developer meeting @ckaldemeyer agreed to add a proposal for the export/store structure.
It might be affected by #420.
Is there something new, especially with regards to the release v0.2.1?
And generally: How far have we come with implementing the functionality and structure of storing results when comparing it to the results of your discussion? Do we need to collect more feedback from users and developers using oemof on this?
Feedback from users would always be great. But as @uvchik mentioned, this will be related to #420. So let's see how we deal with it in the future. But this should stay open so we don't forget it...
Self-assignment due to oemof dev meeting - issue auction
Don't see this happening before v0.3.0. Postponed.
As nobody has interacted with this suggestion for a very long time, I do not see this happening. Closed.