QFeatures Refactor: QFeatures subsetting

Refactor: QFeatures subsetting

Open cvanderaa opened this issue 1 year ago • 1 comments

We could refactor subsetting and filtering of Qfeatures object. Here are some points I think would be worth improving for maintainability of the code:

In my opinion, filtering and subsetting are related concepts that should be considered together. Therefore I would suggest to combine all code related to one or the other in a single file. This would facilitate the centralization of the subsetting implementation. More specifically, I would suggest to combine the following code into QFeatures-subset.R:

[ (in QFeatures-class.R)
filterNA() (in QFeatures-missing-data.R)
subsetByFeature() (in subsetBy-methods.R)
filterFeatures() (in QFeatures-filter.R)

The functions/methods listed above make use of different subsetting backend instead of a centralized backend that provides consistency and ensures validity. More specifically:

[ (in QFeatures-class.R): this should be the centralized backend, ie x[i, j, k]
filterNA(): currently uses x[i, j], but should be x[i, j, k]
subsetByFeature(): it reconstructs a QFeatures using its constructor.
filterFeatures(): uses x[i, j, k], so OK.

filterFeaturesWithAnnotationFilter() and filterFeaturesWithFormula() have a lot of duplicated code what requires parallel maintenance. I'm sure we could make use of a common internal function to handle this.
The implementation of .subsetByFeature() is too long and difficult to understand (my bad sorry).

bonus: subsetting can be very slow for large datasets (cf SCP). Proper refactoring may help identify and solve resource bottlenecks.

Apr 18 '23 09:04 cvanderaa

QFeatures QFeatures copied to clipboard

Refactor: QFeatures subsetting

QFeatures
QFeatures copied to clipboard