NVTabular icon indicating copy to clipboard operation
NVTabular copied to clipboard

[DOC] Docstring coverage in NVTabular

Open karlhigley opened this issue 3 years ago • 2 comments

Coverage improved:

  • From 36.6% to 38.4% by https://github.com/NVIDIA-Merlin/NVTabular/pull/1475

Modules with missing docstrings:

  • [ ] ops
  • [ ] tools
  • [x] workflow

Modules to be deprecated:

  • [ ] framework_utils
  • [ ] loader
  • [ ] inference
Name Total Miss Cover Cover%
_version.py 19 0 19 100%
ops/add_metadata.py 8 7 1 12%
ops/bucketize.py 2 1 1 50%
ops/categorify.py 14 11 3 21%
ops/clip.py 2 1 1 50%
ops/column_similarity.py 5 3 2 40%
ops/data_stats.py 5 4 1 20%
ops/difference_lag.py 3 2 1 33%
ops/drop_low_cardinality.py 3 2 1 33%
ops/dropna.py 2 1 1 50%
ops/fill.py 10 7 3 30%
ops/filter.py 2 1 1 50%
ops/groupby.py 5 4 1 20%
ops/hash_bucket.py 3 2 1 33%
ops/hashed_cross.py 3 2 1 33%
ops/join_external.py 5 4 1 20%
ops/join_groupby.py 8 7 1 12%
ops/lambdaop.py 3 2 1 33%
ops/list_slice.py 2 1 1 50%
ops/logop.py 2 1 1 50%
ops/normalize.py 10 8 2 20%
ops/operator.py 4 1 3 75%
ops/reduce_dtype_size.py 6 5 1 17%
ops/rename.py 3 2 1 33%
ops/stat_operator.py 5 0 5 100%
ops/target_encoding.py 8 7 1 12%
ops/value_counts.py 5 4 1 20%
tools/data_gen.py 25 22 3 12%
tools/dataset_inspector.py 4 2 2 50%
tools/inspector_script.py 4 3 1 25%
workflow/node.py 1 0 1 100%
workflow/workflow.py 9 0 9 100%
TOTAL 190 117 73 38.4%

karlhigley avatar Mar 23 '22 18:03 karlhigley

Since many of the missing docstrings are on operators, we might consider using a tool for docstring inheritance in Python (instead of Sphinx), such as custom_inherit.

karlhigley avatar Mar 30 '22 21:03 karlhigley

Since many of the missing docstrings are on operators, we might consider using a tool for docstring inheritance in Python (instead of Sphinx), such as custom_inherit.

Does custom_inherit work with interrogate? My understanding of interrogate was that it just reads in the AST - so I'm guessing that custom_inherit won't work out here either.

fwiw, There is a feature request here for inherited docstrings in interrogate https://github.com/econchick/interrogate/issues/12 - but it sounds tricky to implement using the AST.

For nvt, we're not relying on using sphinx inherit docstring options - but are explicitly copying the docstrings from parent classes, which interrogate doesn't pick up on. For instance, interrogate marks the fit/transform methods for TE as missing docstrings, but we have set them here:

https://github.com/NVIDIA-Merlin/NVTabular/blob/4e877bfa5968700ca9adc98065ef66847fd12bce/nvtabular/ops/target_encoding.py#L415-L417

but interrogate marks them (and others) as missing since its not checking what methods have docstrings at runtime, but only if there is a docstring in the AST:

interrogate -vvv target_encoding.py 
================================= Coverage for /home/ben/code/nvtabular/nvtabular/ops/ =================================
-------------------------------------------------- Detailed Coverage ---------------------------------------------------
| Name                                                                       |                                  Status |
|----------------------------------------------------------------------------|-----------------------------------------|
| target_encoding.py                                                         |                                         |
|   TargetEncoding (L36)                                                     |                                 COVERED |
|     TargetEncoding.fit (L165)                                              |                                  MISSED |
|     TargetEncoding.fit_finalize (L209)                                     |                                  MISSED |
|     TargetEncoding.compute_selector (L219)                                 |                                  MISSED |
|     TargetEncoding.column_mapping (L229)                                   |                                  MISSED |
|     TargetEncoding.set_storage_path (L282)                                 |                                  MISSED |
|     TargetEncoding.clear (L286)                                            |                                  MISSED |
|     TargetEncoding.transform (L379)                                        |                                  MISSED |
|----------------------------------------------------------------------------|-----------------------------------------|

------------------------------------------------------- Summary --------------------------------------------------------
| Name                             |              Total |              Miss |              Cover |              Cover% |
|----------------------------------|--------------------|-------------------|--------------------|---------------------|
| target_encoding.py               |                  8 |                 7 |                  1 |                 12% |
|----------------------------------|--------------------|-------------------|--------------------|---------------------|
| TOTAL                            |                  8 |                 7 |                  1 |               12.5% |
------------------------------------ RESULT: FAILED (minimum: 35.0%, actual: 12.5%) -----------------------------------

pylint manages to pick up that fit/transform etc have docstrings (with the missing-function-docstring options that we have disabled now) - but isn't configurable like interrogate =(

benfred avatar Apr 01 '22 20:04 benfred