warehouse
warehouse copied to clipboard
Code organization proposal: `views`
Two files warehouse/manage/views.py and tests/unit/manage/test_views.py are some of the largest Python files in the codebase today.
As reported by scc (trimmed):
$ scc warehouse/ tests/ --by-file --format wide
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Language Files Lines Blanks Comments Code Complexity Complexity/Lines
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Python 616 113270 11295 11440 90535 2443 2059.67
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
tests/unit/manage/test_views.py 10503 274 27 10202 37 0.36
tests/unit/email/test_init.py 5728 11 11 5706 0 0.00
warehouse/manage/views.py 4966 353 149 4464 224 5.02
tests/unit/accounts/test_views.py 3876 343 20 3513 43 1.22
tests/unit/forklift/test_legacy.py 3668 95 26 3547 31 0.87
warehouse/accounts/views.py 1572 129 58 1385 88 6.35
warehouse/forklift/legacy.py 1495 59 75 1361 56 4.11
tests/unit/accounts/test_services.py 1435 129 14 1292 7 0.54
~alware/checks/setup_patterns/test_check.py 1158 21 11 1126 1 0.09
warehouse/email/__init__.py 1016 7 13 996 0 0.00
...
Larger files are harder for humans to parse and keep track of where things should be, make sense of structure.
pylint sets their default to 1000 for some 18 years now - possibly longer, went digging for historical context and reasoning, didn't find much. When triggered, it raises too-many-lines
The additional context reads:
When a module has too many lines it can make it difficult to read and understand. There might be performance issue while editing the file because the IDE must parse more code. You need more expertise to navigate the file properly (go to a particular line when debugging, or search for a specific code construct, instead of navigating by clicking and scrolling)
I tend to agree - larger module sizes make navigating the complexity of a given module a little harder, increasing friction.
flake8 does not seem to have a similar setting, and PEP8 doesn't specify a maximum either.
Instead of asking the general question of "what should we target as our maximum module size", since that opens a general discussion, instead I propose targeting the views files for ~~refactoring~~ reorganization.
Proposal
As a way to prove this idea, here's one path I considered:
- Convert
warehouse/manage/views.pyto a packagewarehouse/manage/views/__init_.py - Create
warehouse/manage/views/organization.pyand move theManageOrganization*Viewsclasses there - Create
tests/unit/manage/views/test_organization.pyand moveTest*Organization*test classes there
Evaluate where we're at file-size-wise and decide if we want to continue organization this way.
Alternately, depending on how we feel about the manage namespace, we can move anything organization-related to warehouse/organizations/vews.py (does not yet exist). I'm leaning against that, since the docs currently state that warehouse/manage/:
logged-in user functionality (i.e., manage account & owned/maintained projects)
I think I'd prefer that as well!
Post- #13187 , here's the same stats table:
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Language Files Lines Blanks Comments Code Complexity Complexity/Lines
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
Python 624 114656 11411 11552 91693 2443 2096.54
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
tests/unit/manage/test_views.py 7833 161 23 7649 22 0.29
tests/unit/email/test_init.py 5776 11 11 5754 0 0.00
tests/unit/accounts/test_views.py 3884 343 20 3521 43 1.22
tests/unit/forklift/test_legacy.py 3668 95 26 3547 31 0.87
warehouse/manage/views/__init__.py 3271 167 75 3029 106 3.50
~ts/unit/manage/views/test_organizations.py 2949 180 15 2754 10 0.36
warehouse/manage/views/organizations.py 1623 151 70 1402 95 6.78
warehouse/accounts/views.py 1581 130 60 1391 89 6.40
warehouse/forklift/legacy.py 1497 59 75 1363 56 4.11
tests/unit/accounts/test_services.py 1461 129 14 1318 7 0.53
~alware/checks/setup_patterns/test_check.py 1158 21 11 1126 1 0.09
tests/unit/accounts/test_forms.py 1081 41 11 1029 12 1.17
warehouse/email/__init__.py 1022 7 13 1002 0 0.00
tests/unit/oidc/test_services.py 920 120 26 774 8 1.03
...
Nice dent. ~The next carveout will be teams and follow a similar approach.~ I'll try and make smaller changes next instead.
Is this issue useful? If so I'm fine keeping it open, but it feels like the kind of meta issue that never actually gets closed because there's no actual end state or acceptance criteria for it.
I think maybe we should just close it, and say that we're totally fine refactoring any file that is "too long" to be more coherently organized, without being prescriptive about what exactly too long is.
Was curious, so looked at the stats today. As at main commit da33fa0a8a9, lines >1K py files are:
Python 701 130655 12135 13164 105356 2636 2011.75
─────────────────────────────────────────────────────────────────────────────────────────────────────────────
tests/unit/manage/test_views.py 6959 137 18 6804 15 0.22
tests/unit/email/test_init.py 5852 9 11 5832 0 0.00
tests/unit/accounts/test_views.py 4789 296 19 4474 44 0.98
tests/unit/forklift/test_legacy.py 3880 8 11 3861 7 0.18
warehouse/manage/views/__init__.py 2995 199 77 2719 113 4.16
~ts/unit/manage/views/test_organizations.py 2960 183 18 2759 10 0.36
warehouse/accounts/views.py 1867 139 74 1654 96 5.80
warehouse/manage/views/organizations.py 1616 152 74 1390 98 7.05
tests/unit/accounts/test_services.py 1506 135 14 1357 8 0.59
warehouse/forklift/legacy.py 1230 43 85 1102 85 7.71
tests/unit/accounts/test_forms.py 1185 40 11 1134 16 1.41
tests/unit/manage/views/test_teams.py 1069 122 11 936 4 0.43
warehouse/email/__init__.py 1047 7 13 1027 0 0.00
tests/unit/oidc/test_services.py 1010 117 32 861 12 1.39
tests/unit/manage/test_forms.py 1007 15 11 981 3 0.31
The only comment I can make as an outsider is that the test_views.py is doing a pile of mocking, which I usually dislike for tests as it can be brittle. For tests like this I often just commit things to the db during setup, and check db/whatever state ... this makes the tests slower, but fewer mocks = more like prod tests. Sorry if this is obvious, I don't have your extensive history with the code.
Closing, as the main refactors are done to reduce the largest files. As other refactors occur, reorganizing the code should be considered prior to new implementation.