Roadmap: hardhat support for sparse tibbles

Open EmilHvitfeldt opened this issue 1 year ago • 1 comments

Right now I'm imagining that {hardhat} won't be used directly with sparse matrices, and will instead handle the more internal pieces.

  • [ ] make sure mold() and forge() work with sparse tibbles
  • [x] make sure recompose() works https://github.com/tidymodels/hardhat/pull/259
  • [ ] more things

EmilHvitfeldt avatar Jun 13 '24 01:06 EmilHvitfeldt

mold() issues

Reprex:

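library(hardhat)

# 10 x 10 matrix of 0s and 1s, roughly 90% zeros
mat <- matrix(sample(0:1, 100, TRUE, c(0.9, 0.1)), nrow = 10)
colnames(mat) <- letters[1:10]
# Convert to a sparse matrix, then to a tibble of sparse vectors
sparse_mat <- Matrix::Matrix(mat, sparse = TRUE)
sparse_mat <- sparsevctrs::coerce_to_sparse_tibble(sparse_mat)
mold(a ~ b, sparse_mat)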

Gives us the following traceback:

    ▆
 1. ├─hardhat::mold(a ~ b, sparse_mat)
 2. └─hardhat:::mold.formula(a ~ b, sparse_mat)
 3.   ├─hardhat::run_mold(blueprint, data = data)
 4.   └─hardhat:::run_mold.default_formula_blueprint(blueprint, data = data)
 5.     └─hardhat:::mold_formula_default_process(...)
 6.       └─hardhat:::mold_formula_default_process_predictors(...)
 7.         └─hardhat::model_matrix(terms = framed$terms, data = framed$data)

model_matrix() then calls model.matrix() followed by tibble::as_tibble(), which breaks sparsity.
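To illustrate why that step loses sparsity, here is a minimal sketch that uses only base R and {Matrix}, not hardhat's internals. It assumes a plain data frame stands in for the sparse tibble, and it shows `Matrix::sparse.model.matrix()` purely for comparison; nothing here implies that hardhat will adopt it.

```r
library(Matrix)

set.seed(1)  # assumption: seed added only to make the sketch reproducible
mat <- matrix(sample(0:1, 100, TRUE, c(0.9, 0.1)), nrow = 10)
colnames(mat) <- letters[1:10]
df <- as.data.frame(mat)

# stats::model.matrix() always returns a dense base R matrix,
# so any sparsity in the input is gone at this point.
dense_mm <- stats::model.matrix(a ~ b, df)
print(class(dense_mm))

# For comparison only: {Matrix} can build the same design matrix
# while keeping a sparse representation.
sparse_mm <- Matrix::sparse.model.matrix(a ~ b, df)
print(class(sparse_mm))
```

Both objects have the same dimensions and columns; only the storage differs, which is the property a sparse-aware mold() would need to preserve.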

EmilHvitfeldt avatar Sep 11 '24 19:09 EmilHvitfeldt

We will not be doing mold() and forge() this time around.

EmilHvitfeldt avatar Nov 15 '24 20:11 EmilHvitfeldt