rdflib.js

Identify core areas of code base

Open michielbdejong opened this issue 5 years ago • 3 comments

In order to maximise the impact of code cleanup, I want to find out which areas of the code base are "core" and which areas are more expendable (i.e. it would be possible to use rdflib.js without ever touching them).

michielbdejong, Apr 07 '20 09:04

Under src/ there are 5891 lines of JS and 7259 + 446 + 135 = 7840 lines of TS. Some things we'll definitely need (line counts in parentheses):

  • src/blank-node.ts (101)
  • src/fetcher.ts (2124)
  • src/formula.ts (935)
  • src/index.ts (119)
  • src/literal.ts (187)
  • src/n3-parser.js (1570)
  • src/named-node.ts (113)
  • src/serializer.js except statementsToXML (965 - 306)
  • src/store.ts except applyPatch (1137 - 95)
  • src/statement.ts (132)
  • src/serialize.ts (102)
  • src/uri.ts (214)

Total: 101 + 2124 + 935 + 119 + 187 + 1570 + 113 + (965 - 306) + (1137 - 95) + 132 + 102 + 214 = 7298

So that's 7298 / (5891 + 7840) ≈ 53%.

michielbdejong, Apr 07 '20 09:04

Sorry, as @megoth pointed out, I forgot about src/update-manager.ts (1142), so the estimate would be (7298 + 1142) / (5891 + 7840) ≈ 61%.

There may also be more parts we can skip, even in these files. For example, I already excluded statementsToXML and applyPatch from the count, but there may be more like that, so consider 61% to be an upper-limit estimate.
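For anyone who wants to re-run the numbers, here is a minimal sketch of the tally in TypeScript. The line counts are the ones quoted above; the `FileCount` shape and the helper names `coreLineCount` and `sharePercent` are made up for illustration and are not part of rdflib.js.

```typescript
// Line counts as quoted in this thread. `excluded` is the part of a file
// left out of the "core" count (statementsToXML, applyPatch).
interface FileCount {
  path: string;
  lines: number;
  excluded?: number;
}

const coreFiles: FileCount[] = [
  { path: "src/blank-node.ts", lines: 101 },
  { path: "src/fetcher.ts", lines: 2124 },
  { path: "src/formula.ts", lines: 935 },
  { path: "src/index.ts", lines: 119 },
  { path: "src/literal.ts", lines: 187 },
  { path: "src/n3-parser.js", lines: 1570 },
  { path: "src/named-node.ts", lines: 113 },
  { path: "src/serializer.js", lines: 965, excluded: 306 }, // minus statementsToXML
  { path: "src/store.ts", lines: 1137, excluded: 95 }, // minus applyPatch
  { path: "src/statement.ts", lines: 132 },
  { path: "src/serialize.ts", lines: 102 },
  { path: "src/uri.ts", lines: 214 },
];

// Added in the follow-up comment.
const updateManager: FileCount = { path: "src/update-manager.ts", lines: 1142 };

// JS lines + TS lines under src/.
const totalLines = 5891 + 7840;

// Sum the counted lines, subtracting any excluded portion of a file.
function coreLineCount(files: FileCount[]): number {
  return files.reduce((sum, f) => sum + f.lines - (f.excluded || 0), 0);
}

// Share of the code base as a whole-number percentage.
function sharePercent(core: number, total: number): number {
  return Math.round((core / total) * 100);
}

const firstEstimate = coreLineCount(coreFiles);
const revisedEstimate = coreLineCount(coreFiles.concat(updateManager));

console.log(firstEstimate, sharePercent(firstEstimate, totalLines)); // 7298 53
console.log(revisedEstimate, sharePercent(revisedEstimate, totalLines)); // 8440 61
```

Adding further `excluded` values for parts that turn out to be skippable would lower the percentage, which is why 61% is an upper limit.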

michielbdejong, Apr 09 '20 14:04

I won't go through it here: we already have the write-up, so we know the recommendations we need to follow. In general, though, these are my thoughts on refactoring in this context:

There are many strategies that can be used, but some of them are not on the cards for us at the moment because of time constraints: we have dates by which we need to deliver dependent products.

Some of the files are way too big and need to be broken up. We need to consider cohesiveness, loose coupling, testability and separation of concerns. I would not be in favour of refactoring a file at a time because:

    1. We do not know that the individual files are cohesive, and doing a file at a time risks not breaking them up properly.
    2. We most likely do not need all the code in every file right now, so let's refactor only what we need first.

As we create the current product we should refactor rdflib like this:

Start:

  • Pull out a function or class that we need to use into a new library
  • Determine the next dependency that function has
  • Goto Start

This way we only need to refactor the code that we need right now. We will eventually refactor it all but we don't have the need or the time to do it all at once.

The new library will end up containing just the code we actually need.
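The Start loop above can be sketched as a work-queue walk over a dependency graph. This is only an illustration under stated assumptions: the graph, the symbol names (`loadProfile`, `fetchDocument`, and so on) and the function `extractionOrder` are all hypothetical and do not describe rdflib.js's real dependencies.

```typescript
// Hypothetical dependency graph: symbol -> symbols it uses directly.
// The names are illustrative only and do not describe rdflib.js itself.
type DependencyGraph = Record<string, string[]>;

const dependencies: DependencyGraph = {
  loadProfile: ["fetchDocument", "parseTurtle"],
  fetchDocument: ["normalizeUri"],
  parseTurtle: ["normalizeUri", "makeStatement"],
  normalizeUri: [],
  makeStatement: [],
  toRdfXml: ["makeStatement"], // not reachable from loadProfile, so never extracted
};

// Returns the symbols to pull into the new library, in visiting order.
function extractionOrder(graph: DependencyGraph, needed: string[]): string[] {
  const extracted: string[] = [];
  const seen: Record<string, boolean> = {};
  const pending = needed.slice(); // work queue, seeded with what the product needs

  while (pending.length > 0) {
    const symbol = pending.shift() as string; // "Start": take the next symbol
    if (seen[symbol]) continue; // already pulled out earlier
    seen[symbol] = true;
    extracted.push(symbol); // pull it out into the new library
    for (const dep of graph[symbol] || []) {
      if (!seen[dep]) pending.push(dep); // queue its dependencies, then go to Start
    }
  }
  return extracted;
}

const order = extractionOrder(dependencies, ["loadProfile"]);
console.log(order);
```

In this sketch, anything not reachable from the symbols the product needs (here `toRdfXml`) is never visited, so it is left for a later round.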

emmettownsend, Apr 09 '20 14:04