Use a List instead of a Bool
The method transform of the BayesGMMTransformer is the following: def transform(self, data, drop=True): where drop is a bool according to the doc. However, here during the transformation we use a list instead of a bool (transformed = gm.transform(data, [column_name])).
Thus, since the transforming use if bool it will always be true, thus leading to a potential bug if the intent was not to drop it for that specific case.
I will propose a PR with the fix depending on the original intent since it is not clear to me.
Hello, thanks for catching this. Note that the RDT API is changing as we integrate with the new RDT 1.0+
Seems like the default for all transformers is drop=True, so this may be working as intended even though the usage isn't quite correct. However, we'll clarify the original intent first in case this is a bug.
Hello, I believe this issue may be stale now, as we've significantly update the RDT library and its abstractions.
If it is still relevant, please feel free to reply and we can reopen to investigate. Alternatively, we can file a new issues to continue this discussion post-RDT 1.0. Thanks!
I must say that I am a little bit disappointed by your approach against "outsider" PR. It has been (I think) my third PR that I have been told is a good/interesting fix, but in the end, my code is denied. I've put some of my free time into this, and the only reward I was hoping for was at least to get some line of code in this OSP.
You do what you want with this comment, and I will not put more effort into trying to help this OSP.