Ability to run distribution analysis with respect to a target feature

Open borisRa opened this issue 3 years ago • 3 comments

Hi,

Is it possible to run distribution analysis with respect to a target feature? For example, in the Titanic data, to show how "Survived" is affected by each variable.

For example, here we can see how "Survived" is affected by "Age", per train/test:

image

Thanks, Boris

borisRa avatar Feb 22 '22 15:02 borisRa

Hi @borisRa, thanks for the suggestion! Yes, we do plan to add more support for the scenario you mentioned. May I know what the line in the figure means? Is it the survival rate?

jinglinpeng avatar Feb 24 '22 03:02 jinglinpeng

Yes, this is the survival rate.
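
For anyone following along, a line like this can be approximated by binning the numeric feature and averaging the target within each bin. The snippet below is only a minimal sketch in plain Python, not dataprep's API: the helper name `rate_by_bin`, the bin width, and the toy rows are all made up for illustration.

```python
from collections import defaultdict

def rate_by_bin(rows, feature, target, bin_width):
    """Return {bin_start: mean of target} using fixed-width bins of feature.

    Hypothetical helper for illustration; not part of dataprep.
    """
    totals = defaultdict(lambda: [0, 0])  # bin_start -> [sum of target, row count]
    for row in rows:
        value = row.get(feature)
        if value is None:  # skip rows where the feature is missing
            continue
        bin_start = int(value // bin_width) * bin_width
        totals[bin_start][0] += row[target]
        totals[bin_start][1] += 1
    return {b: s / n for b, (s, n) in sorted(totals.items())}

# Made-up toy rows, not real Titanic records
passengers = [
    {"Age": 4, "Survived": 1},
    {"Age": 8, "Survived": 1},
    {"Age": 22, "Survived": 0},
    {"Age": 25, "Survived": 1},
    {"Age": 29, "Survived": 0},
    {"Age": None, "Survived": 0},
    {"Age": 61, "Survived": 0},
]

rates = rate_by_bin(passengers, "Age", "Survived", bin_width=10)
print(rates)
```

Plotting these per-bin rates on top of the Age histogram would give a chart similar in spirit to the one in the screenshot.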

borisRa avatar Feb 24 '22 09:02 borisRa

@jinglinpeng @borisRa I'd like to help with this feature. Should this be an ensemble choice at the start of the mathematical process, or should it just start with a single target feature?

Given PassengerId and/or Survived, find P(Y | X1)? Here Y = Survived, X1 = Pclass, X2 = Name, and so on through Xn (Sex, Age, SibSp, Parch, Ticket, Fare, Cabin, Embarked)?
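
To make the single-target option concrete, here is a rough sketch of estimating P(Y = 1 | Xi = x) one feature at a time by simple counting. It assumes a binary target and categorical features; the function name `conditional_rate` and the toy rows are hypothetical and not taken from dataprep.

```python
from collections import Counter

def conditional_rate(rows, feature, target="Survived"):
    """Estimate P(target = 1 | feature = x) for each observed value x.

    Hypothetical helper for illustration; assumes target is 0 or 1.
    """
    counts = Counter()     # number of rows per feature value
    positives = Counter()  # number of rows with target == 1 per feature value
    for row in rows:
        x = row[feature]
        counts[x] += 1
        positives[x] += row[target]
    return {x: positives[x] / counts[x] for x in counts}

# Made-up toy rows, not real Titanic records
rows = [
    {"Pclass": 1, "Sex": "female", "Survived": 1},
    {"Pclass": 1, "Sex": "male", "Survived": 0},
    {"Pclass": 3, "Sex": "male", "Survived": 0},
    {"Pclass": 3, "Sex": "female", "Survived": 1},
    {"Pclass": 3, "Sex": "male", "Survived": 0},
]

# One conditional table per feature, all against the same target
per_feature = {f: conditional_rate(rows, f) for f in ("Pclass", "Sex")}
print(per_feature)
```

With a pandas DataFrame `df`, the same per-feature table is `df.groupby("Pclass")["Survived"].mean()`. Free-text or near-unique columns such as Name and Ticket would likely need grouping or feature extraction first, since almost every value appears only once.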

datatalking avatar Jun 20 '22 18:06 datatalking