wrong interpretation of "n_redundant" parameter

Open alvcoelho opened this issue 2 years ago • 1 comment

Hi, Chris. Your comment on the "n_redundant" parameter seems to be inaccurate. It says:

five features that are random and unrelated to the output's classes

Actually, from what I understand of the method's description, these attributes are random linear combinations of the "informative" ones. So they are redundant (in the sense that they are statistically dependent on the informative features), but they cannot be considered unrelated to the class labels. Apart from the "informative" and "redundant" attributes, the remaining ones really are random noise generated by the method. Since, in your case, "n_redundant" + "n_informative" = "n_features", no noisy features are inserted. Please take a look and check whether I am correct. Best.
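A minimal sketch of how this could be checked, assuming scikit-learn's make_classification is the method in question. The sample size, class count, and random_state below are illustrative assumptions, not necessarily the tutorial's values. With shuffle=False the columns stay in generation order (informative first, then redundant), and with the default shift and scale the redundant block should be reproducible from the informative block by a linear map:

```python
import numpy as np
from sklearn.datasets import make_classification

# Illustrative values (assumptions, not taken from the tutorial).
n_informative, n_redundant = 5, 5

# shuffle=False keeps columns in generation order:
# informative features first, then redundant ones.
X, y = make_classification(
    n_samples=1000,
    n_features=n_informative + n_redundant,
    n_informative=n_informative,
    n_redundant=n_redundant,
    n_classes=2,
    shuffle=False,
    random_state=0,
)

informative = X[:, :n_informative]
redundant = X[:, n_informative:]

# Fit a linear map B such that informative @ B approximates redundant.
B, *_ = np.linalg.lstsq(informative, redundant, rcond=None)
max_residual = np.abs(informative @ B - redundant).max()

# If the redundant columns are linear combinations of the informative
# ones, the full matrix has rank n_informative, not n_features.
rank = np.linalg.matrix_rank(X)

# Features left over for pure noise.
n_noise = X.shape[1] - n_informative - n_redundant

print("max residual:", max_residual)
print("rank of X:", rank)
print("noise features:", n_noise)
```

If the residual is numerically zero and the rank equals n_informative, the "redundant" columns carry the same class information as the informative ones, and no pure-noise column is present.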

Apr 12 '22 18:04 alvcoelho

I forgot to reference the specific page in your set of tutorials: https://chrisalbon.com/code/machine_learning/basics/make_simulated_data_for_classification/

Apr 12 '22 19:04 alvcoelho
