`pseudolog` transformer
Request for an inverse hyperbolic sine (a.k.a. asinh or pseudolog) transformer. x -> arcsinh(x/2) behaves like ln(x) for large positive x (and like -ln|x| for large negative x), but is approximately linear (about x/2) for x near 0. This behavior is very useful for variables that are almost lognormal but take on both positive and negative values (e.g. net worth).
Ideally, this should provide location and scale parameters that can be either tuned or fixed at 0 and 1 respectively, making the transformation x -> arcsinh((x + loc) / (2 * scale)).
Hi Carlos,
Thanks for the suggestion. I am not familiar with this transformation, so I don't understand what you mean by location and scale parameters, or by tuning them versus setting them to 0/1.
Do you have a resource with more details about this transformation that you could share? For example, when is it used and who developed it? Whatever you have at hand would help. We would need that in any case to create the documentation.
Thank you!
> Thanks for the suggestion. I am not familiar with this transformation, so I don't understand what you mean by location and scale parameters and tune to 0/1.
You can find more information here or here.
By location and scale parameters, I just mean that the transformation is of the form:
x -> asinh( (x + loc) / scale / 2)
Which has 2 parameters, loc and scale, which need to be estimated (usually by maximum likelihood).
However, people will sometimes set loc to 0, giving a simplified transform of the form:
x -> asinh(x / scale / 2)
Which only has one estimated parameter (scale).
Some people will even set scale to 1, just giving asinh(x/2).
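To make the three variants above concrete, here is a minimal sketch using only the standard library. The names `pseudolog` and `inverse_pseudolog` are hypothetical, not an existing API, and a real transformer would presumably vectorize this over arrays or DataFrame columns.

```python
import math

def pseudolog(x, loc=0.0, scale=1.0):
    """Apply x -> asinh((x + loc) / (2 * scale)) to a single number.

    loc=0 and scale=1 give the simplest form, asinh(x / 2).
    """
    if scale <= 0:
        raise ValueError("scale must be positive")
    return math.asinh((x + loc) / (2.0 * scale))

def inverse_pseudolog(y, loc=0.0, scale=1.0):
    """Invert pseudolog: y -> 2 * scale * sinh(y) - loc."""
    return 2.0 * scale * math.sinh(y) - loc

# Works for negative, zero, and positive inputs alike.
values = [-1000.0, -1.0, 0.0, 0.01, 1.0, 1000.0]
transformed = [pseudolog(v) for v in values]
```

With the defaults, large inputs land close to `math.log(x)` and inputs near zero land close to `x / 2`, and the transform is odd, so negative values mirror positive ones.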
This one is interesting, but is it numerically stable?
> This one is interesting, but is it numerically stable?
Yes, there shouldn't be any problems with it. The only possible numerical problem is if the data aren't scaled and mean-centered, you may have problems with fitting loc and scale. This should probably be mentioned in the docs.
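As a rough illustration of both points, the sketch below checks that asinh itself stays finite on very large inputs, then standardizes the data before any fitting would take place. The `standardize` helper and the sample values are made up for this example.

```python
import math
import statistics

def standardize(values):
    """Mean-center and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("cannot standardize constant data")
    return [(v - mean) / std for v in values]

# Made-up values spanning many orders of magnitude, with both signs.
raw = [-2.5e9, -1.0e6, 0.0, 3.0e4, 7.5e8, 1.2e10]

# Evaluating the transform is fine even on unscaled data.
all_finite = all(math.isfinite(math.asinh(v / 2.0)) for v in raw)

# Standardizing first keeps loc and scale in a sensible range for an optimizer.
z = standardize(raw)
```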
Hey guys! Thank you for the links and discussion. It looks good to me. Would you like to give it a go at drafting a class?
> Hey guys! Thank you for the links and discussion. It looks good to me. Would you like to give it a go at drafting a class?
I think so, but I'm a bit stuck on how to do fitting, in that there are two approaches:
1. Choose a fit to maximize the normality of the predictor variable. (Easy, but not as accurate)
2. Maximum likelihood/minimum loss estimation, where we estimate the `scale` parameter by minimizing the loss in the predictions. (More principled + more accurate)
I think I've worked out how to do 1, but not how to do 2, or whether it's even possible to do using the sklearn API.
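One possible reading of approach 1 is sketched below: fix loc at 0 and pick the scale that maximizes a normal profile log-likelihood of the transformed values, including the Jacobian of the transform (the same idea used to fit a Box-Cox lambda). This is an assumption about what approach 1 means, not a settled design; `profile_loglik`, `fit_scale`, the grid search, and the synthetic data are all hypothetical. Approach 2 is not covered here.

```python
import math
import random
import statistics

def profile_loglik(xs, scale):
    """Normal profile log-likelihood (up to a constant) of
    y = asinh(x / (2 * scale)), with loc fixed at 0."""
    ys = [math.asinh(x / (2.0 * scale)) for x in xs]
    var_y = statistics.pvariance(ys)
    if var_y <= 0.0:
        return -math.inf
    # dy/dx = 1 / sqrt(4 * scale**2 + x**2), so its log is -0.5 * log(...)
    log_jacobian = -0.5 * sum(math.log(4.0 * scale * scale + x * x) for x in xs)
    return -0.5 * len(xs) * math.log(var_y) + log_jacobian

def fit_scale(xs, candidates):
    """Return the candidate scale with the highest profile log-likelihood."""
    return max(candidates, key=lambda s: profile_loglik(xs, s))

# Synthetic data: normal on the asinh scale, pushed back through sinh.
rng = random.Random(0)
true_scale = 5.0
data = [2.0 * true_scale * math.sinh(rng.gauss(0.0, 1.5)) for _ in range(2000)]

# Log-spaced grid of candidate scales from 0.01 to 1000.
grid = [10.0 ** (k / 10.0) for k in range(-20, 31)]
best = fit_scale(data, grid)
```

A grid search keeps the sketch dependency-free; a real implementation would more likely hand `profile_loglik` to a one-dimensional optimizer and store the result as a fitted attribute.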
@solegalli do you know how I can add a new transformer to the existing tests? I'm not sure where I can find the tests.
You'd probably create a new .py with your transformer within the transformation folder.
Then, you need to create another script within this folder where you'd add the tests.
Plus, you'd need to add your transformer to this file for the generic tests. Those may fail, but then I can help you troubleshoot.
Hey @ParadaCarleton! I wonder if you have a template of this function/class that we could use as a starter to create this transformer? Did you do any work on this?