SDV
Support sample weight in data?
Problem Description
Currently, all models treat every row equally. Would it be possible (at least for some models) to support sample weights, so that some rows are weighted more heavily? For instance, this could be helpful for data whose distribution shifts over time.
Expected behavior
In the model's fit method, add the option to supply sample weights, with one weight per row of the data (so the number of weights equals the number of rows). This is similar to how sample weights work in the sklearn API.
Hi @903124, thanks for filing the feature request. We can keep this open as we think about it more and use it to track any updates.
I'm curious what kind of data you are working with? Have you thought about creating multiple models for old vs. new data?
Workaround
One manual workaround may be to modify your input data. You can duplicate the rows that seem most important, essentially encoding a "weight" in terms of volume.
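A minimal sketch of the duplication workaround, assuming the input is a pandas DataFrame and the weights are non-negative integers. The helper name `duplicate_by_weight` and the `games` example table are hypothetical, made up for illustration; they are not part of SDV.

```python
import numpy as np
import pandas as pd


def duplicate_by_weight(data: pd.DataFrame, weights) -> pd.DataFrame:
    """Repeat each row of `data` as many times as its integer weight.

    Hypothetical helper for illustration; not an SDV function.
    """
    counts = np.asarray(weights)
    if counts.shape != (len(data),):
        raise ValueError("weights must have exactly one entry per row")
    if not np.issubdtype(counts.dtype, np.integer) or (counts < 0).any():
        raise ValueError("weights must be non-negative integers")
    # Build row positions such as [0, 1, 1, 2, 2, 2] and select them in order.
    positions = np.repeat(np.arange(len(data)), counts)
    return data.iloc[positions].reset_index(drop=True)


# Made-up example: more recent seasons get a larger weight.
games = pd.DataFrame({
    "season": [2019, 2020, 2021],
    "points": [12, 18, 25],
})
weighted = duplicate_by_weight(games, [1, 2, 3])
```

The resulting `weighted` table would then be passed to the model's fit method in place of the original data. Note that this only handles whole-number weights and increases the size of the training data; a weight of 0 drops the row entirely.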
Hi, I'm working on sports data where, for example, when weighting player performance, more recent data is more valuable. In general, I think it would be a useful feature to be able to increase or decrease the occurrence of rows with certain features without using an external sampler, as mentioned.