PublicData.get_decimal_precisions fails for very small or large numerical values
If the mode of a float column is small or large enough that Python prints it in scientific notation (roughly below 1e-4 or from 1e16 upward in magnitude), this line in get_decimal_precisions can fail, because the string representation of the mode contains no decimal point when the mantissa is a whole number (for example, str(1e-9) is '1e-09'): https://github.com/interpretml/DiCE/blob/8027ebbf696e8b6c9344a889fb1ba4e90ea448d9/dice_ml/data_interfaces/public_data_interface.py#L396
I have encountered this issue as well. Unlike with other issues, it is generally not possible to avoid it without modifying the training data in a way that changes its meaning. I do not see a good workaround; some datasets simply cannot be used with DiCE until this bug is fixed.
I too encountered this.
Hi, would you mind adding a minimal reproducible example so that one can investigate this?
I do not have time to provide a MWE that involves DiCE, but here is a MWE that makes the same mistake as DiCE:
import pandas as pd
x = pd.DataFrame({'col1': [1e-9]})
modes = x['col1'].mode()
str(modes[0]).split('.')[1]  # IndexError: str(modes[0]) is '1e-09', which contains no '.'
Any dataset with a column whose mode's string representation does not contain a dot will cause the same issue. Line 396 (linked above) assumes that any value returned by mode(), when converted to a string, contains a dot. That assumption is incorrect. The fix is to find a different way to calculate maxp.
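One possible way to count decimal places without relying on a dot in the string is to go through the standard library's decimal module. This is only a sketch, not DiCE's code: decimal_places is a hypothetical helper name, and it assumes the precision wanted is the number of digits after the decimal point in Python's shortest repr of the float.

```python
from decimal import Decimal


def decimal_places(value):
    """Count digits after the decimal point in the repr of a float.

    Hypothetical helper, not part of DiCE. Works for scientific notation
    because Decimal parses strings such as '1e-09' directly.
    """
    exponent = Decimal(repr(float(value))).as_tuple().exponent
    # nan and inf have a non-integer exponent marker; treat them as 0 places.
    if not isinstance(exponent, int):
        return 0
    # A negative exponent of -n means n digits after the decimal point.
    return max(0, -exponent)


print(decimal_places(1e-9))   # repr is '1e-09'
print(decimal_places(0.25))   # repr is '0.25'
print(decimal_places(1e16))   # repr is '1e+16'
```

For values such as 3.0 this returns 1, which matches what the existing split('.') approach gives for '3.0'.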