
sanity check whether the input data contains duplicated entities

Open jbao opened this issue 9 years ago • 5 comments

The ExperimentData class can handle 2 kinds of input data as the metrics argument:

  • aggregated metrics: this should always be aggregated on the entity level
  • time-resolved metrics: this requires an additional column time_since_treatment in the input, and should always be aggregated per unique entity and time point

jbao avatar Jun 28 '16 15:06 jbao

I am willing to work on this. Maybe you can explain a bit more.

piyush0609 avatar Aug 20 '16 18:08 piyush0609

hey @piyush0609, glad that you volunteered. I just added some description; feel free to reach out if it's still unclear ;-)

jbao avatar Aug 22 '16 08:08 jbao

So what do we want to do here, @jbao? From your earlier comment, what I understood is that we have to make some changes to how time-resolved metrics are handled.

piyush0609 avatar Aug 22 '16 08:08 piyush0609

basically we need to distinguish the two metric types:

  • if aggregated, we need to add a check to ensure all entities are unique
  • if time-resolved, we need to ensure each combination of entity and time_since_treatment is unique
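
As a rough illustration of the two checks above, here is a minimal sketch in plain Python. It is not the expan implementation: the function names, the `entity` column name, and the representation of the metrics as a list of dicts are all assumptions made for the example; only `time_since_treatment` comes from the description above. With a pandas DataFrame, the same idea could be expressed with `DataFrame.duplicated(subset=...)`.

```python
from collections import Counter


def find_duplicate_keys(rows, entity_col="entity",
                        time_col="time_since_treatment"):
    """Return the sorted list of keys that occur more than once.

    `rows` is a list of dicts, one per input record (a hypothetical
    stand-in for the metrics table). `entity_col` is an assumed name.
    """
    rows = list(rows)
    # Treat the input as time-resolved only if every row has the time column.
    time_resolved = bool(rows) and all(time_col in row for row in rows)
    if time_resolved:
        # Uniqueness is required per (entity, time point) combination.
        keys = [(row[entity_col], row[time_col]) for row in rows]
    else:
        # Aggregated metrics: uniqueness is required per entity.
        keys = [(row[entity_col],) for row in rows]
    counts = Counter(keys)
    return sorted(key for key, n in counts.items() if n > 1)


def check_no_duplicates(rows, **kwargs):
    """Raise ValueError if the input contains duplicated keys."""
    duplicates = find_duplicate_keys(rows, **kwargs)
    if duplicates:
        raise ValueError("duplicated entries in metrics: %r" % (duplicates,))
```

Returning the offending keys separately from raising keeps the error message informative and makes the check easy to test on its own.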

jbao avatar Aug 22 '16 09:08 jbao

I am not sure that I can do it, but I will definitely try, and I will need your help.

piyush0609 avatar Aug 22 '16 19:08 piyush0609