Surprise icon indicating copy to clipboard operation
Surprise copied to clipboard

Fix Dataset builder pattern. Decouple Dataset and reader.

Open Sklavit opened this issue 7 years ago • 6 comments

Sklavit avatar Jan 24 '18 13:01 Sklavit

I agree that design for this part could be improved a lot. Could you please give details about the PR? Also all tests seem to be failing?

NicolasHug avatar Jan 27 '18 07:01 NicolasHug

  1. What do you mean by "the PR"?
  2. Yes, I will fix tests.

Sklavit avatar Jan 29 '18 08:01 Sklavit

PR = Pull Request. The question is: what changes are you proposing?

NicolasHug avatar Jan 29 '18 08:01 NicolasHug

Fastfix reader call in PredefinedKFold.split. Actually, reader should not be called from PredefinedKFold.split because it leads to extra coupling. So I think a bit more redesign is needed. I will check it in a bt later.

Sklavit avatar Jan 29 '18 09:01 Sklavit

If the changes only affect Dataset.split() then it's probably not worth spending too much time on it: the split() method is deprecated and replaced by the use of CV iterators.

NicolasHug avatar Jan 29 '18 09:01 NicolasHug

  1. Move data preprocessing for building data sets into the dedicated functions (load_from_df and load_from_file) away from initializer (so DatasetAutoFolds.__init__ will become trivial).
  2. Move read_ratings to Reader because it is Reader functionality to read rating, not DataSet.

Sklavit avatar Jan 30 '18 11:01 Sklavit