original outcome is not same as query instance.

Open dayongfu opened this issue 4 years ago • 6 comments

I have the following query instance, but in the output the instance shown under "original outcome" is not the same as the query instance I passed in. I assume they should be the same; otherwise the output is useless.

dice_exp = exp.generate_counterfactuals({'five_star_rate': 3.5, 'nights_booked': 1.0}, total_CFs=4, desired_class="opposite")
dice_exp.visualize_as_dataframe()

Query instance (original outcome : 1)

 # five_star_rate nights_booked label
1 0.0 127.7 0.564298

Diverse Counterfactual set (new outcome : 0)

  # five_star_rate nights_booked label
1 0.0 66.2 0.215
2 0.0 66.2 0.215
3 0.0 77.0 0.280
4 0.0 62.5 0.196

dayongfu avatar Jan 26 '21 19:01 dayongfu

Hi @dayongfu,

You are generating the counterfactuals with desired_class="opposite":

dice_exp = exp.generate_counterfactuals({ 'five_star_rate': 3.5, 'nights_booked': 1.0 }, total_CFs = 4, desired_class="opposite")

Hence the counterfactuals are generated for the scenario in which the class of the query instance is flipped.

Does that answer your question?

Regards, Gaurav

gaugup avatar Feb 04 '21 02:02 gaugup

Thanks, @gaugup!

Yes, I want to generate the opposite scenarios, and I believe these scenarios are listed in the table under "Diverse Counterfactual set (new outcome : 0)". I'm curious about the table under "Query instance (original outcome : 1)". I assume it is the query instance that I passed to the generate_counterfactuals method, but it looks very different from my inputs. Why?

dayongfu avatar Feb 04 '21 05:02 dayongfu

May I know how you initialized the data object, d? Make sure to pass five_star_rate and nights_booked as continuous features to the data object:

d = dice_ml.Data(dataframe=dataset, continuous_features=['five_star_rate', 'nights_booked'], outcome_name='label')

If the above does not work, could you share how your model is trained? Internally, by default, DiCE min-max normalizes the continuous features and feeds them to the ML model. So if your model expects these features in a different format, there might be issues.

We are working on generalizing this data-transformation function so that users can specify their own methods, and will update the code shortly.
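For readers unfamiliar with min-max normalization, here is a minimal sketch of the transformation and its inverse. It is not DiCE's internal code; the feature ranges and helper names are hypothetical and chosen only for illustration. It shows how a value in the original data space and its scaled counterpart can look very different, which is one way a mismatch between input and displayed values could arise.

```python
# Illustrative sketch only, not DiCE's implementation.
# The feature ranges below are hypothetical, not taken from the thread's dataset.

def min_max_normalize(value, lo, hi):
    """Map a raw value from the range [lo, hi] onto [0, 1]."""
    if hi == lo:
        raise ValueError("feature range must have non-zero width")
    return (value - lo) / (hi - lo)


def min_max_denormalize(scaled, lo, hi):
    """Map a scaled value in [0, 1] back to the original range [lo, hi]."""
    return scaled * (hi - lo) + lo


# Hypothetical (min, max) observed for each continuous feature in training data.
ranges = {"five_star_rate": (0.0, 5.0), "nights_booked": (1.0, 200.0)}

# The query instance from the thread, in the original data space.
query = {"five_star_rate": 3.5, "nights_booked": 1.0}

# Scaled values are what a model trained on normalized features would see.
scaled = {name: min_max_normalize(v, *ranges[name]) for name, v in query.items()}

# Inverting with the same ranges recovers the original values.
restored = {name: min_max_denormalize(v, *ranges[name]) for name, v in scaled.items()}

print(scaled)
print(restored)
```

If the model was trained on raw values but receives scaled ones (or the reverse), or if the ranges used for the inverse differ from those used for scaling, the values shown in the output will not match the input.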

raam93 avatar Feb 16 '21 16:02 raam93

@dayongfu As Ram mentioned, that bug was likely due to an encoding mismatch. We have released a new version (0.5) on PyPI that does not have this encoding issue: generate_counterfactuals can now take input in the original data space, without requiring any encoding. Can you try your example with the new version? Hope that solves the issue.

If not, we would appreciate it if you could provide a small working example that we can debug.

amit-sharma avatar Mar 16 '21 05:03 amit-sharma

Thank you all. I would definitely like to try it out and will keep you posted. I also want to know whether DiCE can be used on an LSTM-based classification model.

dayongfu avatar Mar 17 '21 21:03 dayongfu

Sorry, I missed your message, @dayongfu. DiCE can be used for any PyTorch/TensorFlow model. Although we haven't tested it on an LSTM-based model, it should work as long as the model is differentiable.
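To illustrate why differentiability matters, here is a toy sketch of a gradient-based counterfactual search. It is not DiCE's actual optimization. The logistic model, its weights, and all function names are made up for illustration. The point is only that the search needs the gradient of the model output with respect to the input features.

```python
import math

# Toy differentiable model: logistic regression with fixed, made-up weights.
W = [1.5, -2.0]
B = 0.25


def predict(x):
    """Probability of class 1 under the toy model."""
    z = sum(w * xi for w, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))


def grad_wrt_input(x):
    """Gradient of predict(x) with respect to each input feature."""
    p = predict(x)
    return [p * (1.0 - p) * w for w in W]


def search_counterfactual(x, step=0.1, max_iters=1000):
    """Follow the input gradient until the predicted class flips."""
    target_high = predict(x) < 0.5  # aim for the opposite of the current class
    sign = 1.0 if target_high else -1.0
    cf = list(x)
    for _ in range(max_iters):
        if (predict(cf) >= 0.5) == target_high:
            break  # the predicted class has flipped
        g = grad_wrt_input(cf)
        cf = [c + sign * step * gi for c, gi in zip(cf, g)]
    return cf


query = [0.7, 0.0]  # hypothetical, already-scaled feature values
counterfactual = search_counterfactual(query)
print(predict(query), predict(counterfactual))
```

With a non-differentiable model, grad_wrt_input would have no meaningful definition, so this kind of search could not be applied directly.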

amit-sharma avatar May 06 '21 07:05 amit-sharma