QFeatures
Refactor: QFeatures subsetting
We could refactor the subsetting and filtering of `QFeatures` objects. Here are some points I think would be worth improving for the maintainability of the code:
- In my opinion, filtering and subsetting are related concepts that should be considered together. I would therefore suggest combining all the code related to one or the other in a single file, which would help centralize the subsetting implementation. More specifically, I would suggest moving the following code into `QFeatures-subset.R`:
  - `[` (in `QFeatures-class.R`)
  - `filterNA()` (in `QFeatures-missing-data.R`)
  - `subsetByFeature()` (in `subsetBy-methods.R`)
  - `filterFeatures()` (in `QFeatures-filter.R`)
- The functions/methods listed above rely on different subsetting backends instead of a single centralized backend that provides consistency and ensures validity. More specifically:
  - `[` (in `QFeatures-class.R`): this should be the centralized backend, i.e. `x[i, j, k]`
  - `filterNA()`: currently uses `x[i, j]`, but should use `x[i, j, k]`
  - `subsetByFeature()`: reconstructs a `QFeatures` object using its constructor
  - `filterFeatures()`: uses `x[i, j, k]`, so OK
- `filterFeaturesWithAnnotationFilter()` and `filterFeaturesWithFormula()` have a lot of duplicated code, which requires parallel maintenance. I'm sure we could use a common internal function to handle this.
- The implementation of `.subsetByFeature()` is too long and difficult to understand (my bad, sorry).
Bonus: subsetting can be very slow for large datasets (cf. SCP). A proper refactoring may help identify and resolve resource bottlenecks.
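To make the "common internal function" and "centralized backend" points more concrete, here is a rough sketch of the shape I have in mind. It uses a plain named list of data.frames as a stand-in for a `QFeatures` object, and every function name in it (`.subset_rows()`, `.filter_with()`, `filter_by_formula()`, `filter_by_max_na()`) is hypothetical, not existing package code. The real version would delegate to `x[i, j, k]` instead of subsetting data.frames.

```r
## Stand-in for a QFeatures object: a named list of data.frames, one per
## assay. This is an assumption for illustration only.
toy <- list(
    psms = data.frame(score = c(0.9, 0.2, 0.7), n_na = c(0, 2, 1)),
    peptides = data.frame(score = c(0.4, 0.8), n_na = c(0, 0))
)

## Hypothetical centralized backend: every filter ends up here, so the
## validity checks live in a single place.
.subset_rows <- function(x, keep) {
    stopifnot(identical(names(x), names(keep)))
    Map(function(assay, idx) {
        stopifnot(is.logical(idx), length(idx) == nrow(assay))
        assay[idx, , drop = FALSE]
    }, x, keep)
}

## Hypothetical shared helper: `test` maps one assay to a logical vector
## with one element per row.
.filter_with <- function(x, test) {
    keep <- lapply(x, test)
    .subset_rows(x, keep)
}

## Two thin front ends that only differ in how the row test is built.
filter_by_formula <- function(x, f) {
    stopifnot(inherits(f, "formula"), length(f) == 2L)
    .filter_with(x, function(assay) eval(f[[2]], assay, environment(f)))
}

filter_by_max_na <- function(x, max_na) {
    .filter_with(x, function(assay) assay$n_na <= max_na)
}

res_formula <- filter_by_formula(toy, ~ score > 0.5)
res_na <- filter_by_max_na(toy, 0)
```

With this layout, the front ends contain no subsetting logic of their own, so a bug fix or a performance improvement in the backend benefits all of them at once.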