code-base-investigator

Refactor codebase dictionary into CodeBase class

Open • Pennycook opened this issue 1 year ago • 0 comments

Related issues

  • Closes #86. A CodeBase tracks all source files in the directories.

  • Closes #66. Since all tests had to be rewritten to use the CodeBase class, I updated the paths.

  • Progress towards #58. CodeBase uses pathlib internally to store and manipulate paths. The external interfaces still accept and return strings for now, but this is only temporary. We can move away from strings entirely once all interfaces accept Path.

Proposed changes

  • Add a new CodeBase class storing all information about which directories make up a code base and which files should be excluded from analysis.
  • Move legacy functionality into CodeBase where appropriate: the check for whether a file should be excluded from analysis is now implemented via __contains__, and listing the contents of a code base is now implemented via __iter__.
  • Rewrite all tests to use CodeBase. Note that most of the changes here are actually related to casting between Path and strings, required because some legacy internals do not consider these representations to be equivalent.
  • Update the documentation and worked example to highlight that tracking all source files in a directory may result in unexpected files being included in the analysis.
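
To illustrate the shape of the interface described above, here is a minimal sketch of a class that answers membership via __contains__ and lists files via __iter__. It is not the codebasin implementation: the constructor signature, the `exclude_patterns` parameter, and the use of fnmatch on file names are all assumptions made for illustration. It also does not filter by source-file extension, which the real class may do.

```python
import fnmatch
import os
from pathlib import Path


class CodeBase:
    """Illustrative sketch only; names and behavior are assumptions."""

    def __init__(self, *directories, exclude_patterns=()):
        # Paths are stored as pathlib.Path internally, even though the
        # external interface accepts and returns strings.
        self._directories = [Path(d).resolve() for d in directories]
        self._excludes = list(exclude_patterns)

    def __contains__(self, path):
        # A path belongs to the code base if it lies under one of the
        # tracked directories and matches no exclude pattern.
        p = Path(path).resolve()
        in_tree = any(p.is_relative_to(d) for d in self._directories)
        excluded = any(fnmatch.fnmatch(p.name, pat) for pat in self._excludes)
        return in_tree and not excluded

    def __iter__(self):
        # Walk every tracked directory and yield each file that passes
        # the membership test, converted back to a string.
        for d in self._directories:
            for root, _, files in os.walk(d):
                for name in sorted(files):
                    candidate = str(Path(root) / name)
                    if candidate in self:
                        yield candidate
```

With this shape, `"src/main.cpp" in codebase` replaces a separate exclusion helper, and `for path in codebase` replaces a separate file-listing helper. Because every file under a tracked directory is yielded unless a pattern excludes it, stray files (backups, generated sources) show up in the iteration, which is the behavior the documentation update warns about.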

Note that although most uses of CodeBase here only use a single directory, the intent is to enable a list of explicit directories to be passed in the future as:

[codebase]
directories = [
  "src1/",
  "src2/",
]

...in order to support analysis of disjoint codebases, and to allow codebasin to be run from any directory.

Pennycook, May 01 '24 18:05