eli5 dependency and categorical data: it is not described in user manual doc what data and how generated

Open Sandy4321 opened this issue 5 years ago • 3 comments

The docs do not describe what data is generated, or how, for black-box LIME: https://eli5.readthedocs.io/en/latest/blackbox/lime.html The page only says: "eli5 supports dataset generation using Kernel Density Estimation, to ensure that generated dataset looks similar to the original dataset".

1. Can you generate dependency in the data when the data is a mixture of categorical and continuous variables? For example:
feature 1: green, black, red, green, white, green
feature 2: 2.1, 34.7, 22, 2.0, 8.2, 2.2
so that for green we get values around 2.1 +- 0.1.

2. Can you generate a univariate independent and identically distributed sample for non-Gaussian distributions over non-continuous (categorical) data, where there is no "smoothness or continuity" in the pdf? It seems not! Then how does your package deal with non-continuous (categorical) data?

As we see from https://en.wikipedia.org/wiki/Kernel_density_estimation: "Let (x1, x2, …, xn) be a univariate independent and identically distributed sample drawn from some distribution with an unknown density ƒ. We are interested in estimating the shape of this function ƒ." So this kernel density estimation method does not seem to be meant for the case where there is dependency between variables (features).
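To make question 1 concrete, here is a minimal sketch of the kind of generation being asked about. It is not eli5's implementation, only an illustration under stated assumptions: the category is drawn from its empirical frequency, and the continuous value is then drawn from a Gaussian kernel centred on one of the values observed for that category. The helper names `group_by_category` and `sample_mixed`, and the bandwidth of 0.1, are hypothetical choices made for this example.

```python
import random
from collections import defaultdict

# Example data from the question: one categorical and one continuous feature.
colors = ["green", "black", "red", "green", "white", "green"]
values = [2.1, 34.7, 22.0, 2.0, 8.2, 2.2]


def group_by_category(cats, nums):
    """Collect the continuous values observed for each category (hypothetical helper)."""
    groups = defaultdict(list)
    for cat, num in zip(cats, nums):
        groups[cat].append(num)
    return dict(groups)


def sample_mixed(groups, n, bandwidth=0.1, seed=0):
    """Draw n (category, value) pairs (hypothetical helper).

    The category follows its empirical frequency. The value is a draw from a
    Gaussian kernel density estimate fitted only to that category's values:
    pick one observed value, then add Gaussian noise with the given bandwidth.
    """
    rng = random.Random(seed)
    cats = list(groups)
    weights = [len(groups[cat]) for cat in cats]
    samples = []
    for _ in range(n):
        cat = rng.choices(cats, weights=weights)[0]
        center = rng.choice(groups[cat])
        samples.append((cat, rng.gauss(center, bandwidth)))
    return samples


groups = group_by_category(colors, values)
samples = sample_mixed(groups, 1000)
green_values = [x for cat, x in samples if cat == "green"]
print(min(green_values), max(green_values))
```

With this scheme the generated "green" rows stay close to 2.0 to 2.2, so the dependency between the two features is kept. Whether eli5's samplers behave this way for categorical columns is exactly what the docs would need to state.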

Oct 28 '19 20:10 Sandy4321

Hello community! This is Vyom Goel. I am very new to open source but would love to work on this issue. Can anyone guide me through the process?

Feb 18 '20 19:02 Vyom16

Great. eli5 team, can you help us understand what to do?

Feb 19 '20 12:02 Sandy4321

Hi @Vyom16. You can start by understanding the code at https://github.com/TeamHG-Memex/eli5/tree/master/eli5/lime. If you think the docs need extra information, as Sandy suggests, you can create a pull request (one tutorial is at https://opensource.com/article/19/7/create-pull-request-github). Maybe one of the maintainers can help you.

Feb 20 '20 12:02 teabolt
