oemof-solph

Saving results

Open uvchik opened this issue 7 years ago • 26 comments

Is pickle the most efficient way to store the results?

@oemof/oemof-solph Would it make sense to provide something like:

outputlib.processing.save(results, 'results/my_results.pkl')
results = outputlib.processing.load('results/my_results.pkl')

uvchik avatar Oct 27 '17 19:10 uvchik

@oemof/oemof-solph Would it make sense to provide something like:

outputlib.processing.save(results, 'results/my_results.pkl')
results = outputlib.processing.load('results/my_results.pkl')

I would leave the pickling to the user since it's only two more lines of code than providing a function/method.

One note: The results can only be pickled and accessed later if the dict keys are converted to strings (a function for this already exists), because energy system objects are not identical across sessions, even if they are created identically. So if the "old" objects no longer exist due to a new session, deletion, etc., one cannot access the results anymore unless the keys have been converted to strings.
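The key problem described above can be illustrated with a small sketch. The `Node` class here is a hypothetical stand-in with Python's default identity-based equality, not the actual oemof class, and the key conversion is written inline rather than using the existing oemof function.

```python
import pickle


class Node:
    """Hypothetical stand-in for an energy system node (not the oemof class)."""

    def __init__(self, label):
        self.label = label

    def __str__(self):
        return self.label


bus, chp = Node("bus"), Node("chp")
results = {(chp, bus): {"sequences": [1.0, 2.0]}}

# Nodes created identically (e.g. in a new session) are still different
# objects, so they do not match the old keys.
new_key = (Node("chp"), Node("bus"))
found_with_new_objects = new_key in results

# With string keys, the lookup no longer depends on the original objects,
# and the dictionary survives a pickle round trip unchanged.
string_results = {
    tuple(str(node) for node in key): value for key, value in results.items()
}
restored = pickle.loads(pickle.dumps(string_results))
found_with_strings = ("chp", "bus") in restored
```

Under these assumptions, `found_with_new_objects` is `False` and `found_with_strings` is `True`.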

Is pickle the most efficient way to store the results?

I have also thought about that. At the moment, I am thinking about providing a functionality of writing all results to excel spreadsheets in a generic way.

@uvchik: what do you think about this functionality?

ckaldemeyer avatar Nov 20 '17 09:11 ckaldemeyer

I think to_something functions are really helpful and important for a results structure because everybody wants to store results.

One note: The results can only be pickled and accessed later if the dict keys are converted to strings (a function for this already exists), because energy system objects are not identical across sessions, even if they are created identically. So if the "old" objects no longer exist due to a new session, deletion, etc., one cannot access the results anymore unless the keys have been converted to strings.

This makes me think that a to_pickle function might be helpful. This function raises an error if you have not converted your keys to strings. Or it raises a warning, converts the keys to strings, and pickles them afterwards: "Keys will be converted to strings. Set convert_to_string=True to avoid this warning".
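A minimal sketch of the warning variant of this idea could look as follows. The function name `to_pickle` and the `convert_to_string` parameter come from the suggestion above; everything else (the key check, the tuple-shaped keys, the `Node` stand-in) is an assumption for illustration and not part of oemof.

```python
import os
import pickle
import tempfile
import warnings


class Node:
    """Hypothetical stand-in for a node object used as part of a result key."""

    def __init__(self, label):
        self.label = label

    def __str__(self):
        return self.label


def to_pickle(results, path, convert_to_string=False):
    """Pickle `results` after making sure that all key parts are strings."""
    has_object_keys = any(
        not isinstance(part, str) for key in results for part in key
    )
    if has_object_keys:
        if not convert_to_string:
            warnings.warn(
                "Keys will be converted to strings. "
                "Set convert_to_string=True to avoid this warning."
            )
        results = {
            tuple(str(part) for part in key): value
            for key, value in results.items()
        }
    with open(path, "wb") as handle:
        pickle.dump(results, handle)


path = os.path.join(tempfile.mkdtemp(), "my_results.pkl")
to_pickle({(Node("chp"), Node("bus")): {"scalars": 3.0}}, path,
          convert_to_string=True)

with open(path, "rb") as handle:
    loaded = pickle.load(handle)
```

Calling `to_pickle` without `convert_to_string=True` on object keys writes the same file but emits the warning first.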

uvchik avatar Nov 20 '17 10:11 uvchik

I think to_something functions are really helpful and important for a results structure because everybody wants to store results.

Definitely! We just have to think about the structure/naming as more formats might follow.

One note: The results can only be pickled and accessed later if the dict keys are converted to strings (a function for this already exists), because energy system objects are not identical across sessions, even if they are created identically. So if the "old" objects no longer exist due to a new session, deletion, etc., one cannot access the results anymore unless the keys have been converted to strings.

This makes me think that a to_pickle function might be helpful. This function raises an error if you have not converted your keys to strings. Or it raises a warning, converts the keys to strings, and pickles them afterwards: "Keys will be converted to strings. Set convert_to_string=True to avoid this warning".

You are right. This would really speak for such a function.

ckaldemeyer avatar Nov 21 '17 10:11 ckaldemeyer

I think we should distinguish between dump functions that make it possible to reload the results in the same form and export functions that may drop some information.

Let us create an i/o-scheme at the meeting and afterwards everybody is free to add export, dump, load etc. functions.
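The distinction between dumping and exporting can be sketched like this. The function names `dump`, `load` and `export_sequences_csv` are hypothetical, and the results dictionary is a simplified stand-in with string keys.

```python
import csv
import io
import pickle

results = {
    ("chp", "bus"): {"sequences": [1.0, 2.0], "scalars": {"invest": 5.0}},
}


def dump(results):
    """Lossless: the returned bytes can be loaded back in the same form."""
    return pickle.dumps(results)


def load(blob):
    return pickle.loads(blob)


def export_sequences_csv(results):
    """Lossy: only the sequences are written, the scalars are dropped."""
    keys = sorted(results)
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["/".join(key) for key in keys])
    writer.writerows(zip(*(results[key]["sequences"] for key in keys)))
    return buffer.getvalue()


round_trip = load(dump(results))
csv_text = export_sequences_csv(results)
```

Here `round_trip` equals the original dictionary, whereas `csv_text` no longer contains the scalar values and cannot be turned back into the full result object.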

uvchik avatar Nov 24 '17 11:11 uvchik

I think we should distinguish between dump functions that make it possible to reload the results in the same form and export functions that may drop some information.

Let us create an i/o-scheme at the meeting and then everybody is free to add export, dump, load etc. functions.

:+1:

ckaldemeyer avatar Nov 27 '17 13:11 ckaldemeyer

My proposal:

  • processing.py -> creates/preserves (re)stores original result object e.g. by (un)pickling
  • views.py -> creates subsets of the original result object e.g. by slicing for specific nodes
  • conversion(or another name).py -> exports/imports original result object or views e.g. to xls, csv, json, ...

So it could look like:

from oemof.outputlib import processing, views, conversion

my_results = processing.results(om)
processing.store(my_results, 'my_results.oemof') # to file system with nodes internally converted to strings. returns a boolean
my_results = processing.restore('my_results.oemof')  # from file system
node_data = views.node(my_results, 'my_node')
conversion.to_xls(node_data, 'my_chp.xlsx') # returns a boolean

Sometimes I think about having a result class, again. But using functions also provides a clear API and we have single modules with clearly defined functions which do not explode ;-)

@oemof/oemof-developer-group: What do you think?

ckaldemeyer avatar Nov 29 '17 10:11 ckaldemeyer

By the way, is there a nice way to get all inputs of a bus?

Something like that:


my_results = processing.results(om)
my_results[(,bus)]['sequences']

simnh avatar Nov 29 '17 10:11 simnh

I wrote a function to divide the columns into input and output. Now I think it might be better to pass the DataFrame and get back the part of the DataFrame.

Maybe somebody knows an easier way.

def divide_bus_columns(bus_label, columns):
    """
    Divide columns into input columns and output columns. This function
    depends on the API of the oemof outputlib. Last changes (v0.2.0).

    Parameters
    ----------
    bus_label : str
        Label of the bus.
    columns : iterable of tuple
        Column labels of the 'sequences' DataFrame. The first element of
        each label is expected to hold the node labels of the flow.

    Returns
    -------
    dict
        Lists of columns under the keys 'in_cols' and 'out_cols'.
    """
    return {
        'in_cols': [
            c for c in columns if (len(c[0]) > 1 and c[0][1] == bus_label)],
        'out_cols': [
            c for c in columns if (len(c[0]) > 1 and c[0][0] == bus_label)]}


bus_label = 'my_bus'
my_results = processing.results(om)
df = views.node(my_results, bus_label)['sequences']
divided_columns = divide_bus_columns(bus_label, df.columns)
input_flows = df[divided_columns['in_cols']]
output_flows = df[divided_columns['out_cols']]

uvchik avatar Nov 29 '17 11:11 uvchik

@simnh I have not found a nice way yet... For now I do it like this:

from itertools import zip_longest
bus = energysystem.groups['b_el2']
bus_flows = zip(*zip_longest(*results))
bus_input_flows = [
    (input_flow, output_flow)
    for (input_flow, output_flow) in bus_flows if output_flow == bus
]
for flow in bus_input_flows:
    print(results[flow]['sequences'])

Here, the double zipping (with zip_longest padding shorter tuples with None) is needed in case a single-node flow is present (i.e. in the case of GenericStorage). Otherwise, unpacking the results keys would raise an error, as (GenericStorage, ) cannot be unpacked into two variables... I would appreciate easier suggestions for doing the above!
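One possibly simpler variant is to check the key length instead of unpacking, which avoids the double zipping. This is only a sketch: it assumes that result keys are tuples of length one or two, as described above, and uses a hypothetical `Node` stand-in instead of real oemof objects.

```python
class Node:
    """Hypothetical stand-in for an energy system node."""

    def __init__(self, label):
        self.label = label


bus = Node("b_el2")
chp = Node("chp")
demand = Node("demand")
storage = Node("storage")

results = {
    (chp, bus): {"sequences": [1.0]},
    (bus, demand): {"sequences": [0.5]},
    (storage,): {"sequences": [3.0]},  # single-node entry
}

# Indexing instead of unpacking, so one-element keys are simply skipped.
bus_input_flows = [
    key for key in results if len(key) == 2 and key[1] is bus
]
bus_output_flows = [
    key for key in results if len(key) == 2 and key[0] is bus
]
```

With this data, `bus_input_flows` contains only `(chp, bus)` and `bus_output_flows` only `(bus, demand)`.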

henhuy avatar Nov 29 '17 11:11 henhuy

Maybe then it would be nice to add a shortcut for that for the API?

outputs_bel = results.outputs(bel)

simnh avatar Nov 29 '17 11:11 simnh

Sometimes I think about having a result class, again.

I think you did a good job with the new results API. After releasing v0.3, people can pool their experience and think about creating a results class or doing something else 😄

I agree with your store/restore API, but if we end up with more than one way to dump/restore, we need a way to distinguish them from each other.

processing.store(my_results, 'my_results.oemof', to='pickle')
processing.store_to_pickle(my_results, 'my_results.oemof')
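Both spellings can coexist if the named function is a thin wrapper around a dispatching `store`. The following is a sketch under assumptions: the writer registry, the JSON backend and the key joining with `|` are hypothetical and only illustrate how a `to` argument could select a format.

```python
import json
import os
import pickle
import tempfile


def _store_pickle(results, path):
    with open(path, "wb") as handle:
        pickle.dump(results, handle)


def _store_json(results, path):
    # JSON object keys must be strings, so tuple keys are joined here.
    serializable = {"|".join(key): value for key, value in results.items()}
    with open(path, "w") as handle:
        json.dump(serializable, handle)


_WRITERS = {"pickle": _store_pickle, "json": _store_json}


def store(results, path, to="pickle"):
    """Write `results` to `path` using the writer registered under `to`."""
    try:
        writer = _WRITERS[to]
    except KeyError:
        raise ValueError(
            f"Unknown format {to!r}; choose from {sorted(_WRITERS)}"
        ) from None
    writer(results, path)


def store_to_pickle(results, path):
    """Convenience wrapper equivalent to store(..., to='pickle')."""
    store(results, path, to="pickle")


target_dir = tempfile.mkdtemp()
pickle_path = os.path.join(target_dir, "my_results.oemof")
json_path = os.path.join(target_dir, "my_results.json")
results = {("chp", "bus"): [1.0, 2.0]}
store_to_pickle(results, pickle_path)
store(results, json_path, to="json")
```

A registry like this keeps adding a new format to a single place, while an unknown `to` value fails early with a `ValueError`.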

I like export more than conversion and I would say that import does not make sense if the object can not be recovered. If the object can be recovered it should not be export but (store/restore) as described above.

The export function will have the problem that you can export different things.

my_results = processing.results(om)
export.to_xls(my_results, 'my_file.xlsx')
# OR
node_data = views.node(my_results, 'my_node')
conversion.to_xls(node_data, 'my_chp.xlsx') # returns a boolean

In the future there could be even more views.

uvchik avatar Nov 29 '17 11:11 uvchik

Maybe then it would be nice to add a shortcut for that for the API?

outputs_bel = results.outputs(bel)

Yes. And it should be provided as a view within my proposed logic. We could use @uvchik's logic or something similar. It's not a big deal.

@henhuy's approach would, of course, also work!

ckaldemeyer avatar Nov 29 '17 11:11 ckaldemeyer

Sometimes I think about having a result class, again.

I think you did a good job with the new results API. After releasing v0.3, people can pool their experience and think about creating a results class or doing something else

I agree with your store/restore API, but if we end up with more than one way to dump/restore, we need a way to distinguish them from each other.

processing.store(my_results, 'my_results.oemof', to='pickle')
processing.store_to_pickle(my_results, 'my_results.oemof')

I like export more than conversion and I would say that import does not make sense if the object can not be recovered. If the object can be recovered it should not be export but (store/restore) as described above.

:+1: I agree with the name.

The export function will have the problem that you can export different things.

my_results = processing.results(om)
export.to_xls(my_results, 'my_file.xlsx')
# OR
node_data = views.node(my_results, 'my_node')
conversion.to_xls(node_data, 'my_chp.xlsx') # returns a boolean

In the future there could be even more views.

I don't see a problem here. It could be export.to_csv(*args) or export.to_xls(*args), etc.

Or did I miss something?

ckaldemeyer avatar Nov 29 '17 11:11 ckaldemeyer

Maybe then it would be nice to add a shortcut for that for the API?

node_data = views.node(my_results, 'my_node', split='output')
node_data = views.node_outputs(my_results, 'my_node')
or something like this....
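Both variants could be built on the same filter. The sketch below is not the existing `views.node` function: it assumes string keys of the form `(source, target)` and returns a plain dictionary subset, with the `split` parameter and `node_outputs` taken from the suggestion above.

```python
def node(results, label, split=None):
    """Return the entries attached to `label`, optionally only one side."""
    if split not in (None, "input", "output"):
        raise ValueError("split must be None, 'input' or 'output'")
    selected = {}
    for (source, target), value in results.items():
        is_output = source == label
        is_input = target == label
        if (split in (None, "output") and is_output) or (
            split in (None, "input") and is_input
        ):
            selected[(source, target)] = value
    return selected


def node_outputs(results, label):
    """One-liner shortcut for the output side of a node."""
    return node(results, label, split="output")


my_results = {
    ("chp", "bus"): 1,
    ("bus", "demand"): 2,
    ("wind", "bus"): 3,
    ("storage", "None"): 4,
}
all_flows = node(my_results, "bus")
inputs = node(my_results, "bus", split="input")
outputs = node_outputs(my_results, "bus")
```

In this example `inputs` holds the two flows into `bus` and `outputs` the single flow from `bus` to `demand`.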

uvchik avatar Nov 29 '17 13:11 uvchik

Or did I miss something?

This is defined by the main API:

my_results = processing.results(om)
export.to_xls(my_results, 'my_file.xlsx')

...but if you want to export views, you need a definition of what a view should look like; otherwise it will get complicated once other views are added.

my_results = processing.results(om)
node_data = views.node(my_results, 'my_node')
# This function works only with views from the views.node function
export.to_xls(node_data, 'my_file.xlsx')

uvchik avatar Nov 29 '17 13:11 uvchik

I like any solution, as long as it provides a one-liner for getting inputs, outputs...

simnh avatar Nov 29 '17 13:11 simnh

Actually, I think it would be great if the export could transfer the tuple to a two-row format, don't you think?
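One reading of this idea is a header that spreads each key tuple over two rows, with the source node in the first row and the target node in the second. The function name and the input layout below are hypothetical and only illustrate that reading.

```python
import csv
import io


def to_csv_two_row_header(sequences):
    """Write flows as columns with a two-row header (source, then target).

    `sequences` maps (source, target) string tuples to equally long lists.
    """
    keys = list(sequences)
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([source for source, _ in keys])
    writer.writerow([target for _, target in keys])
    # Transpose the per-flow lists into one row per time step.
    writer.writerows(zip(*(sequences[key] for key in keys)))
    return buffer.getvalue()


csv_text = to_csv_two_row_header(
    {("chp", "bus"): [1.0, 2.0], ("bus", "demand"): [0.5, 0.7]}
)
```

For this input the first two lines of `csv_text` are `chp,bus` and `bus,demand`, followed by one line per time step.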

simnh avatar Nov 29 '17 13:11 simnh

Or did I miss something?

This is defined by the main API:

my_results = processing.results(om)
export.to_xls(my_results, 'my_file.xlsx')

...but if you want to export views, you need a definition of what a view should look like; otherwise it will get complicated once other views are added.

my_results = processing.results(om)
node_data = views.node(my_results, 'my_node')
# This function works only with views from the views.node function
export.to_xls(node_data, 'my_file.xlsx')

Actually I have some ideas for this but would prefer to discuss them next week. Discussing it here is too exhausting and it won't take long ;-)

ckaldemeyer avatar Nov 29 '17 15:11 ckaldemeyer

Actually, I think it would be great if the export could transfer the tuple to a two-row format, don't you think?

Let's also discuss it next week. It is just about decisions and is easy to implement.

ckaldemeyer avatar Nov 29 '17 15:11 ckaldemeyer

Actually I have some ideas for this but would prefer to discuss them next week. Discussing it here is too exhausting and it won't take long ;-)

Agree :+1:

uvchik avatar Nov 29 '17 16:11 uvchik

Connected to #380

uvchik avatar Dec 01 '17 10:12 uvchik

At the developer meeting @ckaldemeyer agreed to add a proposal for the export/store structure.

It might be affected by #420.

uvchik avatar Jan 09 '18 14:01 uvchik

Is there something new, especially with regards to the release v0.2.1?

And generally: How far have we come with implementing the functionality and structure of storing results when comparing it to the results of your discussion? Do we need to collect more feedback from users and developers using oemof on this?

jnnr avatar Mar 05 '18 16:03 jnnr

Feedback from users would always be great. But as @uvchik mentioned, this will be related to #420. So let's see how we deal with it in the future. But this should stay open so that we do not forget it...

simnh avatar Mar 06 '18 12:03 simnh

Self-assignment due to oemof dev meeting - issue auction

busiing avatar May 17 '19 15:05 busiing

Don't see this happening before v0.3.0. Postponed.

p-snft avatar May 30 '19 12:05 p-snft

As nobody interacted with this suggestion for a very long time, I do not see this coming. Closed.

p-snft avatar Nov 10 '22 05:11 p-snft