
Support additional training information

Open javdrher opened this issue 8 years ago • 1 comments

Following the goals in #44, I took a step back on this issue. It got me thinking about the role of the data object. Right now I think the fundamental rule should be: the data object brings the data (X/Y and anything extra) from the expensive functions to the models.

This is the idea I currently have in mind:

  • If a model expects more than only X/Y, it should inform the data object. As the core models are defined in GPflow, we cannot add logic there to inform us. However, we could scan for all dataholders in a model and use their name property to look for an entry in the data object, or we could follow the decorator pattern to let users implement additional mappings.
  • If expensive objective functions return additional information, they should return a Data object directly. If no additional information is returned, objective values can be returned as they are now. We then combine all these data objects and call set_data, which visits all models and performs the updates.
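To make the two points above concrete, here is a minimal sketch of how this could look. It is not GPflowOpt or GPflow code: the `Data` class, its `wrap` and `combine` helpers, the `StubModel` stand-in and its `holders` dictionary are all hypothetical names chosen for illustration, and only `set_data` is a name taken from the proposal. The sketch assumes each model exposes its dataholders by name, so an entry in the data object is matched to a dataholder with the same name.

```python
import numpy as np


class Data:
    """Hypothetical container: X and Y plus any named extra arrays."""

    def __init__(self, X, Y, **extra):
        self.fields = {"X": np.atleast_2d(X), "Y": np.atleast_2d(Y)}
        self.fields.update({k: np.atleast_2d(v) for k, v in extra.items()})

    @classmethod
    def wrap(cls, X, result):
        # Objectives may return plain objective values or a Data object.
        return result if isinstance(result, cls) else cls(X, result)

    @staticmethod
    def combine(items):
        # Stack every field row-wise; all items must carry the same fields.
        items = list(items)
        keys = set(items[0].fields)
        if any(set(d.fields) != keys for d in items):
            raise ValueError("cannot combine Data objects with different fields")
        merged = {k: np.vstack([d.fields[k] for d in items]) for k in keys}
        return Data(merged.pop("X"), merged.pop("Y"), **merged)


class StubModel:
    """Stand-in for a model; `holders` mimics its named dataholders."""

    def __init__(self, *names):
        self.holders = {name: None for name in names}


def set_data(models, data):
    # Visit all models and update each dataholder whose name has an entry.
    for model in models:
        for name in model.holders:
            if name in data.fields:
                model.holders[name] = data.fields[name]


def objective_with_noise(X):
    # Hypothetical expensive function that also reports a noise estimate.
    return Data(X, X ** 2, noise=np.full_like(X, 0.1))


X1 = np.array([[0.0], [1.0]])
X2 = np.array([[2.0]])
batches = [Data.wrap(X, objective_with_noise(X)) for X in (X1, X2)]
models = [StubModel("X", "Y", "noise"), StubModel("X", "Y")]
set_data(models, Data.combine(batches))
```

In this sketch a model that only knows X/Y simply ignores the extra `noise` entry, which keeps the current behaviour intact for objectives that return plain values.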

I have a few different versions in mind. We could automatically construct a pipeline with, for instance, Apache Beam, but that would be overkill and might introduce a lot more coupling between the objects. It would also make the learning curve for contributors a lot steeper. Besides, I think the logic in the Data object can be straightforward.

javdrher avatar Jul 29 '17 12:07 javdrher

I think supporting additional training data needs more thought and is a lot of work. Let's not fix this on a release version yet.

icouckuy avatar Nov 11 '17 18:11 icouckuy