lumping, prep, bake do not work as per example

Open guidothekp opened this issue 3 years ago • 0 comments

In Section 3.6.1, titled "Lumping", the following code:

# Lump levels for two features
lumping <- recipe(Sale_Price ~ ., data = ames_train) %>%
  step_other(Neighborhood, threshold = 0.01, 
             other = "other") %>%
  step_other(Screen_Porch, threshold = 0.1, 
             other = ">0")

# Apply this blue print --> you will learn about this at 
# the end of the chapter
apply_2_training <- prep(lumping, training = ames_train) %>%
  bake(ames_train)

results in the following error:

Error in `step_other()`:
Caused by error in `prep()`:
! All columns selected for the step should be string, factor, or ordered.
Run `rlang::last_trace()` to see where the error occurred.
rlang::last_trace()
<error/recipes_error_step>
Error in `step_other()`:
Caused by error in `prep()`:
! All columns selected for the step should be string, factor, or ordered.
---
Backtrace:
    ▆
 1. ├─prep(lumping, training = ames_train) %>% bake(ames_train)
 2. ├─recipes::bake(., ames_train)
 3. ├─recipes::prep(lumping, training = ames_train)
 4. └─recipes:::prep.recipe(lumping, training = ames_train)
 5.   ├─recipes:::recipes_error_context(...)
 6.   │ ├─base::withCallingHandlers(...)
 7.   │ └─base::force(expr)
 8.   ├─recipes::prep(x$steps[[i]], training = training, info = x$term_info)
 9.   └─recipes:::prep.step_other(x$steps[[i]], training = training, info = x$term_info)

The error message says that every column selected for the step must be a string, factor, or ordered factor.

class(ames_train$Neighborhood)
[1] "factor"

class(ames_train$Screen_Porch)
[1] "integer"

Screen_Porch is the problem: it is an integer column, not a string or factor. We can verify this by removing the step_other() call for Screen_Porch from the lumping recipe. Without that step, prep() and bake() run without the error.
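A possible workaround, offered only as a sketch and not confirmed as the book's intended fix, is to convert Screen_Porch to a factor inside the recipe before lumping it. The example below uses a small made-up data frame (toy_train, a hypothetical stand-in for ames_train) so that it runs on its own. step_mutate() is an existing recipes step, and the values and row count are assumptions chosen so that the two non-zero levels each fall below the 0.1 threshold.

```r
library(recipes)  # attaches dplyr, which provides %>%

# Hypothetical stand-in for ames_train: 18 rows with no screen porch,
# 2 rows with distinct non-zero areas (each 5% of rows, below threshold)
toy_train <- data.frame(
  Sale_Price   = seq(100, 290, by = 10),
  Screen_Porch = c(rep(0L, 18), 120L, 200L)
)

# Convert the integer column to a factor first, then lump rare levels
lumping <- recipe(Sale_Price ~ ., data = toy_train) %>%
  step_mutate(Screen_Porch = factor(Screen_Porch)) %>%
  step_other(Screen_Porch, threshold = 0.1, other = ">0")

baked <- prep(lumping, training = toy_train) %>%
  bake(new_data = toy_train)

# Inspect the lumped levels
table(baked$Screen_Porch)
```

With the real data, the same idea would be to add the step_mutate() line before the Screen_Porch step_other() call in the original recipe. Whether the resulting levels match what the book shows is untested here.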

guidothekp, Apr 09 '23 04:04