QFeatures icon indicating copy to clipboard operation
QFeatures copied to clipboard

Refactor: QFeatures subsetting

Open cvanderaa opened this issue 1 year ago • 1 comments

We could refactor subsetting and filtering of Qfeatures object. Here are some points I think would be worth improving for maintainability of the code:

  1. In my opinion, filtering and subsetting are related concepts that should be considered together. Therefore I would suggest to combine all code related to one or the other in a single file. This would facilitate the centralization of the subsetting implementation. More specifically, I would suggest to combine the following code into QFeatures-subset.R:
  • [ (in QFeatures-class.R)
  • filterNA() (in QFeatures-missing-data.R)
  • subsetByFeature() (in subsetBy-methods.R)
  • filterFeatures() (in QFeatures-filter.R)
  1. The functions/methods listed above make use of different subsetting backend instead of a centralized backend that provides consistency and ensures validity. More specifically:
  • [ (in QFeatures-class.R): this should be the centralized backend, ie x[i, j, k]
  • filterNA(): currently uses x[i, j], but should be x[i, j, k]
  • subsetByFeature(): it reconstructs a QFeatures using its constructor.
  • filterFeatures(): uses x[i, j, k], so OK.
  1. filterFeaturesWithAnnotationFilter() and filterFeaturesWithFormula() have a lot of duplicated code what requires parallel maintenance. I'm sure we could make use of a common internal function to handle this.
  2. The implementation of .subsetByFeature() is too long and difficult to understand (my bad sorry).

bonus: subsetting can be very slow for large datasets (cf SCP). Proper refactoring may help identify and solve resource bottlenecks.

cvanderaa avatar Apr 18 '23 09:04 cvanderaa