PyRIT icon indicating copy to clipboard operation
PyRIT copied to clipboard

TASK: improve docstrings completeness and quality across PyRIT modules

Open paulinek13 opened this issue 1 month ago • 9 comments

For people who want to contribute/help with this issue:

  1. Follow the general guidelines for contributing to PyRIT: https://azure.github.io/PyRIT/contributing/README.html
  2. Pick one or more modules from the list below and comment below which module(s) you're working on to avoid duplication of effort
  3. Remove the module(s) from tool.ruff.lint.per-file-ignores in pyproject.toml so that docstring issues are no longer ignored there
  4. Run ruff check or pre-commit run ruff-check --all-files to see what issues to fix
  5. Fix the issues
  6. Verify completion: ruff check should report "All checks passed!"
  7. Submit a PR referencing this issue

Here's a list of modules with the number of docstring issues in each:

  • [x] pyrit/analytics: 5 @riyosha
  • [x] pyrit/auth: 12 @riyosha
  • [x] pyrit/chat_message_normalizer: 13 @riyosha
  • [x] pyrit/cli: 2 @riyosha
  • [x] pyrit/common: 47 @paulinek13
  • [x] pyrit/datasets: 34 @paulinek13
  • [ ] pyrit/embedding: 5 @riyosha
  • [ ] pyrit/exceptions: 26 @riyosha
  • [ ] pyrit/executor: 92
  • [ ] pyrit/memory: 154
  • [ ] pyrit/models: 146
  • [ ] pyrit/prompt_converter: 387
  • [ ] pyrit/prompt_normalizer: 15 @paulinek13
  • [ ] pyrit/prompt_target: 209 @paulinek13
  • [ ] pyrit/scenarios: 9
  • [ ] pyrit/score: 117
  • [ ] pyrit/setup: 18

The issues are mostly related to missing docstrings for public classes/methods/functions as well as missing parameter/return documentation.


Original post from 07XI25 "SUGGESTION: enforce docstring consistency in PyRIT with Ruff":

Related issue: #653

In short, this issue proposes adding Ruff as a dev dependency to enforce documentation consistency in PyRIT.

Why

PyRIT currently lacks automated verification of docs consistency. The proposed solution will automatically check for the docstring requirements defined in the project's style guide (.github/instructions/style-guide.instructions.md), including:

- Google-style docstring format
- Type information in parameter descriptions
- Documented return values
- Documented exceptions (Raises sections)
- Triple quotes for all docstrings

It will also ensure that, for example, new public classes/methods/functions actually are documented to prevent missing entries in the docs.

Of course, things like that can be manually reviewed during PR reviews, but this could be easily automated + this change makes it now easier in fixing #653, I think.

Proposed Solution

As stated above, the proposal is to add Ruff as a new dev dependency, configure it properly to match as much of the existing style as possible, and add ruff check pyrit/ to the .pre-commit-config.yaml.

https://pypi.org/project/ruff/ https://docs.astral.sh/ruff/

In future ruff could replace flake8/isort/pylint in PyRIT, but for now, I'm proposing to add it just for docs checks.

More info

ruff check pyrit\ takes about 200-600 ms, so it's super fast.

Currently, there are about 1700 "errors":

Image

config I've used:

[tool.ruff.lint.pydocstyle]
convention = "google"

[tool.ruff.lint]
preview = true
select = [
    "D",  # https://docs.astral.sh/ruff/rules/#pydocstyle-d
    "DOC"  # https://docs.astral.sh/ruff/rules/#pydoclint-doc
]
ignore = [
    "D100",  # Missing docstring in public module
    "D200",  # One-line docstring should fit on one line
    "D205",  # 1 blank line required between summary line and description
    "D212",  # Multi-line docstring summary should start at the first line
    "DOC502",  # Raised exception is not explicitly raised: {id}
]
extend-select = [
    "D204",  # 1 blank line required after class docstring
    "D213",  # Multi-line docstring summary should start at the second line
    "D401",  # First line of docstring should be in imperative mood: "{first_line}"
    "D404",  # First word of the docstring should not be "This"
]

paulinek13 avatar Nov 07 '25 13:11 paulinek13

I'm all for this. It should be part of the pre commit config. We should at the same time evaluate which parts of the existing config are no longer useful. However, fixing 1800 issues is probably going to take a little while. I wonder if we can do it in batches?

romanlutz avatar Nov 08 '25 22:11 romanlutz

Thanks for the feedback. We can certainly do this in batches.

Comment content:

I've divided the docstring errors into two main categories:

  1. formatting / whitespace / style -- easy to fix in one go across the repo
    118     D202    [*] blank-line-after-function
    105     D213    [*] multi-line-summary-second-line
     44     D411    [*] no-blank-line-before-section
     20     D410    [*] no-blank-line-after-section
      3     D209    [*] new-line-after-last-paragraph
      1     D214    [*] overindented-section
      6     D301    [ ] escape-sequence-in-docstring
      2     D300    [*] triple-single-quotes
      3     D403    [*] first-word-uncapitalized
      3     D416    [*] missing-section-name-colon
     80     D415    [ ] missing-terminal-punctuation
  1. public API coverage / completeness / language style -- these will be addressed in separate PRs per module(s) to keep changes manageable
    244     D102    [ ] undocumented-public-method
     71     D107    [ ] undocumented-public-init
     62     D101    [ ] undocumented-public-class
     44     D417    [ ] undocumented-param
     35     D103    [ ] undocumented-public-function
     33     D104    [ ] undocumented-public-package
     27     D105    [ ] undocumented-magic-method
      2     D106    [ ] undocumented-public-nested-class
    191     DOC501  [ ] docstring-missing-exception
    167     DOC201  [ ] docstring-missing-returns
     11     DOC202  [ ] docstring-extraneous-returns
      1     DOC402  [ ] docstring-missing-yields
     21     D404    [ ] docstring-starts-with-this
      8     D418    [ ] overload-with-docstring
    471     D401    [ ] non-imperative-mood

The plan is as follows:

  1. I will start with a PR that adds Ruff as a dev dependency along with the initial doc-related config. The same PR will also fix all errors from the first group across the repo (as they are mostly "mechanical" changes and easy to fix in one go).

  2. Subsequent PRs would address the second group of errors broken down by modules to keep the PRs manageable. These PRs would need to ensure that all Ruff docstring errors in the targeted module(s) are fixed (ruff check pyrit/module1 pyrit/module2 -> "All checks passed!")

  3. Once all the above is done, we can add Ruff to the precommit so that future PRs are blocked if they introduce docs issues.

As for the second point, I think we could track the overall progress of this effort. Maybe a checklist of modules to be done, so (new) contributors can pick/claim one (or more) and work on it.


Does this sound like a good approach?

paulinek13 avatar Nov 11 '25 14:11 paulinek13

Sounds fantastic. I think you can just add a "list" to the original post and I'll add a tag with "help wanted" so people can find this.

romanlutz avatar Nov 12 '25 14:11 romanlutz

@romanlutz Do you think we could ignore docstring issues in pyrit/ui and pyrit/auxiliary_attacks since they are not exposed in the public API Reference?

paulinek13 avatar Nov 12 '25 15:11 paulinek13

They should be in the API reference. It's mostly an oversight that they're not... You can ignore them for the moment (if it's possibly to exclude directories) and I'll get to that at some point.

romanlutz avatar Nov 13 '25 22:11 romanlutz

Hi I'd like to contribute! I'm working on 2bmodules initially:

  • pyrit/analytics: 5

  • pyrit/auth: 12

I'm waiting for 1181 to be merged before I raise a PR. Thanks!

riyosha avatar Nov 16 '25 22:11 riyosha

@riyosha I just merged #1181 so feel free to pull the changes and get started šŸ˜‰

romanlutz avatar Nov 17 '25 21:11 romanlutz

@riyosha I just merged #1181 so feel free to pull the changes and get started šŸ˜‰

Thanks! Just raised a PR. I will also take up modules

  • pyrit/chat_message_normalizer
  • pyrit/cli
  • pyrit/embedding
  • pyrit/exceptions

if the PR looks okay!

riyosha avatar Nov 18 '25 03:11 riyosha

Everything looks fantastic! I reviewed and merged the open PRs. Excellent work, thank you both for your contributions šŸ™‚ Looking forward to move more of the validation to ruff.

romanlutz avatar Nov 19 '25 14:11 romanlutz

Hi! I’m a student and interested in contributing to this issue. I can work on pyrit/scenarios: 9.

eve1ne avatar Nov 30 '25 21:11 eve1ne

@eve1ne please go ahead!

romanlutz avatar Dec 01 '25 01:12 romanlutz

@eve1ne please go ahead!

Thanks! pyrit/scenarios already had all checks passing, so I moved on to pyrit/setup (18 issues) and just opened a PR for those fixes. If everything looks good with that PR, I can take on the pyrit/executor (92 issues) next.

eve1ne avatar Dec 02 '25 22:12 eve1ne

Sounds good! I would only recommend holding off on "prompt_target" until I'm done with #1194 because a lot will change there.

romanlutz avatar Dec 02 '25 22:12 romanlutz

Hi, i'm a student and I would like to help with pyrit/memory. I can start working on the docstrings for this module.

hatimoua avatar Dec 04 '25 14:12 hatimoua

Hi! I just raised a PR for pyrit/executor, if everything looks good with that I can move on to pyrit/models.

eve1ne avatar Dec 05 '25 19:12 eve1ne

It looks great! Feel free to move on to the next one.

romanlutz avatar Dec 06 '25 21:12 romanlutz