
Roadmap: parsnip support for sparse tibbles

Open EmilHvitfeldt opened this issue 1 month ago • 0 comments

What we need:

  • [ ] fit() to take sparse tibbles as data
  • [ ] fit() to take {Matrix} sparse matrix as data
    • turn them into sparse tibbles early
  • [ ] Have sparse tibbles turned into an appropriate object before they are passed to the engine's fit function
    • {Matrix} sparse matrix if model supports it
    • back to normal tibble/matrix if not
  • [ ] predict() to take sparse tibbles as data
  • [ ] predict() to take {Matrix} sparse matrix as data
    • turn them into sparse tibbles early
  • [ ] Have sparse tibbles turned into an appropriate object before they are passed to the engine's predict function
    • {Matrix} sparse matrix if model supports it
    • back to normal tibble/matrix if not
  • [ ] look into whether we document which engines are sparse friendly
  • [ ] special cases for some model types
    • {xgboost} with xgboost::xgb.DMatrix()
    • etc
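
A minimal sketch of the "appropriate object" step from the checklist, assuming the engine's sparse support is known as a simple flag. `to_engine_format()` and its `supports_sparse` argument are hypothetical names for illustration only and are not part of {parsnip}; only {Matrix} and base R calls are used.

```r
library(Matrix)

# Hypothetical helper (not part of parsnip): decide which object the
# engine receives, based on whether the engine can consume sparse input.
to_engine_format <- function(x, supports_sparse) {
  is_sparse <- methods::is(x, "sparseMatrix")
  if (is_sparse && !supports_sparse) {
    # The engine cannot take sparse input, so fall back to a dense matrix.
    return(as.matrix(x))
  }
  # Either the data is already dense or the engine accepts sparse input.
  x
}

# A 3 x 4 sparse matrix with two non-zero entries.
x <- Matrix::sparseMatrix(
  i = c(1, 3), j = c(2, 4), x = c(1.5, 2), dims = c(3, 4)
)

kept  <- to_engine_format(x, supports_sparse = TRUE)
dense <- to_engine_format(x, supports_sparse = FALSE)
```

Here `kept` stays a {Matrix} sparse matrix, while `dense` is a base matrix with the same values. Special cases such as {xgboost} would need their own branch and are left out of this sketch.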

I think we could use an option() of some kind to unit test that the data is passed around in a way that keeps the sparsity.
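
One way the option() idea might look, as a rough sketch: the densifying step checks an option and errors when it is set, so a test can turn the option on and confirm that no conversion to dense happens along the way. The option name `sketch.error_on_densify` and the `densify()` helper are made up for this example and do not exist in {parsnip}.

```r
library(Matrix)

# Hypothetical densifying step: errors instead of converting when the
# (made-up) option `sketch.error_on_densify` is TRUE.
densify <- function(x) {
  if (isTRUE(getOption("sketch.error_on_densify", FALSE))) {
    stop("sparse data was about to be converted to a dense matrix")
  }
  as.matrix(x)
}

x <- Matrix::sparseMatrix(i = 1, j = 1, x = 1, dims = c(2, 2))

# In a unit test: switch the option on, run the code path, restore the option.
old <- options(sketch.error_on_densify = TRUE)
result <- tryCatch(densify(x), error = function(e) "error raised")
options(old)
```

With the option on, any code path that reaches `densify()` fails loudly, which is what a sparsity-preservation test would assert against. With the option off, behaviour is unchanged.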

Adding all of this will give us

  • standalone usage of sparse matrices in {parsnip}
  • everything it needs to work with the rest of {tidymodels} with regard to sparse tibbles

EmilHvitfeldt, Jun 13 '24 00:06