ToDo: Refactor SyncedRepo class and its subclasses ModulesRepo and WorkflowRepo
This issue is a reminder that somebody (maybe at the Boston Hackathon?) should look into the three classes and check for unnecessary code duplications?
At the moment, we have two use cases of locally cached Git repositories that are cloned/updated/synced when needed:
- The Module repository for the
nf-core modulesandnf-core subworkflowsfunctionalities. - Workflow repositories if pipelines are downloaded with
nf-core downloadas--platformdownloads.
Conceptually, the idea was that most static and utility functions needed could be shared by the two and thus two subclasses ModulesRepo and WorkflowRepo were defined that inherit from their SyncedRepo superclass.
Because both subclasses define their own __init__() functions, it for example went entirely unnoticed that the __init__() function of the SyncedRepo calls self.setup_local_repo(), which is only defined for the subclasses.
Also, the SyncedRepo instances don't have a self.repo attribute, but several associated class methods use it nonetheless. This has not caused noticeable bugs yet, because the superclass is never initiated itself, but only the subclasses, which have a self.setup_local_repo() method that creates the self.repo attribute on the subclass instances.
Apart from making mypy unhappy when touching the old code and trying to commit, this is evidently far from ideal.
Hence, I think it would be good to look into those three classes. The bare minimum would be to sort out the __init__() function of the SyncedRepo class and the missing self.repo attribute, but likely this is only the tip of the iceberg.
So it would likely be required to comprehensively assess how many common and distinct methods each subclass has and either consolidate them better in the superclass or strip the superclass down in favor of the subclasses.
This might serve as inspiration for some additional functionality if needed.