BPCells
BPCells copied to clipboard
Normalizations, Feature Selections, DimReduction S3 class, (Iterative) Lsi Implementation, Partial function framework, Clustering changes and Framework
Meta-PR that describes the culmination of efforts of multiple features.
Included and previously described features:
- Iterative LSI design docs (#167)
- Log, tf-idf normalization functions (#168)
- Feature selection by variance, dispersion, mean accessibility (#169)
- LSI, var feature selection (#156)
- Rewrite of LSI to utilize normalizations (#181)
- Current efforts of iterative LSI
EDIT: oops, didn't spot that the LSI code had all been collapsed in the diff view. Will update below in a bit
I'm assuming this is still in-progress since the LSI and IterativeLSI implementations aren't currently in. Just leaving a few quick comments on what's in there so far:
- We should discuss whether to add RcppHNSW to "Imports" or put it in "Suggests". Previously, I've left most of the clustering packages out of "Imports" to avoid taking on big dependencies for non-core BPCells functionality. We can revisit that, but we'll also need to cover how to handle RcppAnnoy, Seurat, and igraph.
- In
normalize_tfidfpartial, I'm not sure if it makes sense to havescale_factorlisted in themissing_args, unless there's a real use-case where we need to distinguish being left as default or being set explicitly to a default. I was originally thinking this would mostly apply to justthreadsandverbosearguments, though there might also be a justification for something likefeature_means - Small suggestion, but could the call of
return(create_partial(...))be collected into fewer lines? I think a 1-liner is probably not crazy long for many cases, but even splitting up like this would condense a lot of boiler-plate lines:return(create_partial(mising_args = list( ... )))