[DOC] Docstring coverage in NVTabular
Coverage improved:
- From 36.6% to 38.4% by https://github.com/NVIDIA-Merlin/NVTabular/pull/1475
Modules with missing docstrings:
- [ ] ops
- [ ] tools
- [x] workflow
Modules to be deprecated:
- [ ] framework_utils
- [ ] loader
- [ ] inference
| Name | Total | Miss | Cover | Cover% |
|---|---|---|---|---|
| _version.py | 19 | 0 | 19 | 100% |
| ops/add_metadata.py | 8 | 7 | 1 | 12% |
| ops/bucketize.py | 2 | 1 | 1 | 50% |
| ops/categorify.py | 14 | 11 | 3 | 21% |
| ops/clip.py | 2 | 1 | 1 | 50% |
| ops/column_similarity.py | 5 | 3 | 2 | 40% |
| ops/data_stats.py | 5 | 4 | 1 | 20% |
| ops/difference_lag.py | 3 | 2 | 1 | 33% |
| ops/drop_low_cardinality.py | 3 | 2 | 1 | 33% |
| ops/dropna.py | 2 | 1 | 1 | 50% |
| ops/fill.py | 10 | 7 | 3 | 30% |
| ops/filter.py | 2 | 1 | 1 | 50% |
| ops/groupby.py | 5 | 4 | 1 | 20% |
| ops/hash_bucket.py | 3 | 2 | 1 | 33% |
| ops/hashed_cross.py | 3 | 2 | 1 | 33% |
| ops/join_external.py | 5 | 4 | 1 | 20% |
| ops/join_groupby.py | 8 | 7 | 1 | 12% |
| ops/lambdaop.py | 3 | 2 | 1 | 33% |
| ops/list_slice.py | 2 | 1 | 1 | 50% |
| ops/logop.py | 2 | 1 | 1 | 50% |
| ops/normalize.py | 10 | 8 | 2 | 20% |
| ops/operator.py | 4 | 1 | 3 | 75% |
| ops/reduce_dtype_size.py | 6 | 5 | 1 | 17% |
| ops/rename.py | 3 | 2 | 1 | 33% |
| ops/stat_operator.py | 5 | 0 | 5 | 100% |
| ops/target_encoding.py | 8 | 7 | 1 | 12% |
| ops/value_counts.py | 5 | 4 | 1 | 20% |
| tools/data_gen.py | 25 | 22 | 3 | 12% |
| tools/dataset_inspector.py | 4 | 2 | 2 | 50% |
| tools/inspector_script.py | 4 | 3 | 1 | 25% |
| workflow/node.py | 1 | 0 | 1 | 100% |
| workflow/workflow.py | 9 | 0 | 9 | 100% |
| TOTAL | 190 | 117 | 73 | 38.4% |
Since many of the missing docstrings are on operators, we might consider using a tool for docstring inheritance in Python (instead of Sphinx), such as custom_inherit.
Since many of the missing docstrings are on operators, we might consider using a tool for docstring inheritance in Python (instead of Sphinx), such as custom_inherit.
Does custom_inherit work with interrogate? My understanding of interrogate was that it just reads in the AST - so I'm guessing that custom_inherit won't work out here either.
fwiw, There is a feature request here for inherited docstrings in interrogate https://github.com/econchick/interrogate/issues/12 - but it sounds tricky to implement using the AST.
For nvt, we're not relying on using sphinx inherit docstring options - but are explicitly copying the docstrings from parent classes, which interrogate doesn't pick up on. For instance, interrogate marks the fit/transform methods for TE as missing docstrings, but we have set them here:
https://github.com/NVIDIA-Merlin/NVTabular/blob/4e877bfa5968700ca9adc98065ef66847fd12bce/nvtabular/ops/target_encoding.py#L415-L417
but interrogate marks them (and others) as missing since its not checking what methods have docstrings at runtime, but only if there is a docstring in the AST:
interrogate -vvv target_encoding.py
================================= Coverage for /home/ben/code/nvtabular/nvtabular/ops/ =================================
-------------------------------------------------- Detailed Coverage ---------------------------------------------------
| Name | Status |
|----------------------------------------------------------------------------|-----------------------------------------|
| target_encoding.py | |
| TargetEncoding (L36) | COVERED |
| TargetEncoding.fit (L165) | MISSED |
| TargetEncoding.fit_finalize (L209) | MISSED |
| TargetEncoding.compute_selector (L219) | MISSED |
| TargetEncoding.column_mapping (L229) | MISSED |
| TargetEncoding.set_storage_path (L282) | MISSED |
| TargetEncoding.clear (L286) | MISSED |
| TargetEncoding.transform (L379) | MISSED |
|----------------------------------------------------------------------------|-----------------------------------------|
------------------------------------------------------- Summary --------------------------------------------------------
| Name | Total | Miss | Cover | Cover% |
|----------------------------------|--------------------|-------------------|--------------------|---------------------|
| target_encoding.py | 8 | 7 | 1 | 12% |
|----------------------------------|--------------------|-------------------|--------------------|---------------------|
| TOTAL | 8 | 7 | 1 | 12.5% |
------------------------------------ RESULT: FAILED (minimum: 35.0%, actual: 12.5%) -----------------------------------
pylint manages to pick up that fit/transform etc have docstrings (with the missing-function-docstring options that we have disabled now) - but isn't configurable like interrogate =(