forest icon indicating copy to clipboard operation
forest copied to clipboard

Extend cargo-spellcheck to cover entire repository including .github directory

Open Signor1 opened this issue 3 months ago • 25 comments

Summary of changes

Changes introduced in this pull request:

  • Extended cargo-spellcheck configuration to include .github directory for repository-wide markdown spellchecking
  • Added explicit markdown file inclusion pattern to ensure all .md files are scanned across the repository
  • Updated forest.dic dictionary with legitimate technical terms, contributor names, and project-specific terminology
  • Fixed actual spelling errors in CHANGELOG.md: applicabile→applicable, facilate→facilitate, daeamon→daemon, advistory→advisory etc
  • Added security identifiers (RUSTSEC-*) and version codes to project dictionary

The motivation was to expand spellcheck coverage from only docs/docs directory to the entire repository, ensuring consistent spelling standards across all documentation including GitHub workflows, README files, and other markdown content.

Reference issue to close (if applicable)

Closes #[5958]

Other information and links

  • Uses existing cargo-spellcheck infrastructure rather than introducing redundant tooling
  • Maintains consistency with project's existing spellcheck workflow and Hunspell dictionary format
  • Configuration changes ensure CI will catch spelling issues in .github workflows, README files, and other repository documentation
  • All legitimate technical terms were identified through comprehensive analysis and added to forest.dic with appropriate Hunspell flags

Change checklist

  • [x] I have performed a self-review of my own code,
  • [x] I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • [x] I have added tests that prove my fix is effective or that my feature works (if possible),
  • [x] I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Summary by CodeRabbit

  • Documentation

    • Expanded spellcheck coverage to include markdown files across the repository.
    • Updated documentation formatting for consistency and clarity.
    • Corrected spelling and terminology throughout guides and references.
  • Style

    • Standardized inline code formatting for technical identifiers and terms.
    • Improved capitalization and naming conventions in documentation.
  • Chores

    • Enhanced spellcheck configuration with additional pattern rules and emoji support.
    • Simplified build workflows by removing redundant spellcheck scripts.

Signor1 avatar Sep 10 '25 19:09 Signor1

Walkthrough

This PR consolidates documentation and configuration updates including spelling corrections, code formatting consistency (backtick-enclosed identifiers), spellcheck tooling enhancements, and one functional API change. Approximately 90% of changes are editorial; one functional modification updates the bundle-loading function signature.

Changes

Cohort / File(s) Summary
Documentation & README spelling/formatting fixes
CHANGELOG.md, CONTRIBUTING.md, README.md, docs/README.md, f3-sidecar/README.md, scripts/devnet-curio/README.md, scripts/devnet/README.md
Corrected spelling ("behaviour" → "behavior", "applicabile" → "applicable", "everyting" → "everything", "overriden" → "overridden"), fixed typos, adjusted markdown formatting, and standardized capitalization (e.g., "Arch Linux", "Cloudflare Pages").
Developer documentation code formatting
documentation/src/developer_documentation/chain_index_spike.md, documentation/src/developer_documentation/rpc_api_compatibility.md, docs/docs/developers/guides/network_upgrades.md, docs/docs/developers/deep_dives/chain_follower.md
Added backticks around code identifiers (SQLx, NV23, NV24, etc.) and type references for consistency in inline documentation.
User-facing documentation inline formatting
docs/docs/users/getting_started/install.md, docs/docs/users/guides/gc.md, docs/docs/users/guides/interacting_with_wallets.md, docs/docs/users/guides/running_with_gateway.md, docs/docs/users/knowledge_base/docker_tips.md, docs/docs/users/knowledge_base/network_upgrades_state_migrations.md, docs/docs/users/reference/env_variables.md, docs/docs/users/reference/json_rpc.mdx, docs/docs/users/reference/json_rpc_overview.md, docs/docs/users/reference/cli.md
Wrapped code-like terms (F3, CIDs, WSL2, M1/M2, NV22/NV23, localhost, etc.) in backticks; reworded descriptions for clarity (e.g., "~80 GiB" → "80 GiB", "It's" → "It is").
User-facing reference files
docs/docs/users/openrpc.json
Updated TipSetKey description to format CIDs with backticks.
Source code doc comments — CID references
src/blocks/tipset.rs, src/chain/mod.rs, src/chain/store/chain_store.rs, src/chain_sync/validation.rs, src/cid_collections/hash_map.rs, src/cid_collections/mod.rs, src/cid_collections/small_cid_vec.rs, src/cli/subcommands/chain_cmd.rs, src/cli/subcommands/state_cmd.rs, src/cli_shared/cli/client.rs, src/db/car/forest.rs, src/db/car/forest/index/mod.rs, src/db/mod.rs, src/documentation.rs, src/interpreter/fvm...rs, src/ipld/util.rs, src/rpc/error.rs, src/rpc/methods/f3/types.rs, src/rpc/methods/mpool.rs, src/rpc/methods/state.rs, src/rpc/registry/actors_reg.rs, src/shim/machine/manifest.rs, src/tool/subcommands/api_cmd/api_compare_tests.rs, src/tool/subcommands/archive_cmd.rs, src/tool/subcommands/shed_cmd.rs, src/utils/db/car_stream.rs, src/utils/db/mod.rs, src/utils/encoding/cid_de_cbor.rs, src/utils/encoding/fallback_de_ipld_dagcbor.rs
Consistently wrapped CIDs references in backticks throughout doc comments and inline comments.
Source code doc comments — other code terms
src/shim/actors/builtin/miner/mod.rs, src/blocks/block.rs, src/blocks/chain4u.rs, src/chain/snapshot_format.rs, src/chain_sync/consensus.rs, src/cli/humantoken.rs, src/cli/subcommands/f3_cmd.rs, src/cli/subcommands/mod.rs, src/db/car/any.rs, src/db/car/mod.rs, src/db/car/plain.rs, src/eth/eip_1559_transaction.rs, src/eth/eip_155_transaction.rs, src/eth/homestead_transaction.rs, src/f3/mod.rs, src/health/endpoints.rs, src/libp2p/chain_exchange/message.rs, src/libp2p/discovery.rs, src/libp2p_bitswap/message.rs, src/lotus_json/ipld.rs, src/lotus_json/mod.rs, src/lotus_json/sector_size.rs, src/message/signed_message.rs, src/rpc/channel.rs, src/rpc/reflect/mod.rs, src/rpc/types/mod.rs, src/shim/crypto.rs, src/shim/version.rs, src/tool/subcommands/api_cmd.rs, src/tool/subcommands/api_cmd/stateful_tests.rs, src/utils/io/progress_log.rs, src/utils/io/writer_checksum.rs, src/utils/misc/env.rs, src/utils/multihash.rs, src/utils/proofs_api/paramfetch.rs
Added backticks around code terms (F3, SECP, struct, async, enum, CAR/CARv1/CARv2, EIP155/EIP-1559, Mutex, TTY, hasher, truthy, Cloudflare, R2, etc.) in doc comments for consistency.
GitHub templates and workflows
.github/ISSUE_TEMPLATE/1-bug_report.md, .github/ISSUE_TEMPLATE/6-network-upgrade.md, .github/CARGO_ADVISORIES_ISSUE_TEMPLATE.md, .github/DOCKER_ISSUE_TEMPLATE.md, .github/INTEGRATION_TESTS_ISSUE_TEMPLATE.md, .github/RPC_PARITY_ISSUE_TEMPLATE.md, .github/SNAPSHOT_PARITY_ISSUE_TEMPLATE.md
Updated spelling to American English ("behaviour" → "behavior"), adjusted template formatting, wrapped code terms in backticks (e.g., env.WORKFLOW_URL), and minor heading formatting adjustments.
Build configuration & tooling
.cargo/config.toml, .config/forest.dic, .config/spellcheck.toml, Makefile
Fixed typo in cargo config ("overriden" → "overridden"); expanded forest.dic with new tokens; significantly enhanced spellcheck.toml with improved regex patterns, emoji support, and new sections ([nlprules], [reflow]); added markdown-focused spellcheck pass to Makefile.
Workflow and documentation build
.github/workflows/docs-check.yml, docs/Makefile, docs/package.json
Removed spellcheck dictionary formatting targets and yarn spellcheck script; removed redundant workflow steps.
Dictionary cleanup
docs/dictionary.txt
Removed broad set of dictionary entries.
Functional API change
src/daemon/bundle.rs
Updated load_actor_bundles_from_server() signature: now accepts bundles: &[ActorBundleInfo] parameter and returns Vec<Cid> instead of (); refactored control flow to process bundles concurrently using FuturesUnordered, filter to bundles not in DB, and return loaded root CIDs.
Bootstrap script
scripts/tests/bootstrapper/README.md
Reformatted "Teardown" section heading to use inline code formatting.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~15 minutes

  • High repetition, low complexity: ~95% of changes follow consistent, mechanical patterns (backtick formatting, spelling corrections, typo fixes across multiple files).
  • One functional change requiring attention: src/daemon/bundle.rs modifies the public API of load_actor_bundles_from_server() to accept bundle parameters and return CID results. Review should verify:
    • Caller sites properly handle the new Vec<Cid> return value
    • Concurrent processing with FuturesUnordered correctly handles error cases and bundle filtering logic
    • Thread-safety of shared database access during concurrent downloads

Pre-merge checks and finishing touches

âś… Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check âś… Passed The pull request title clearly summarizes the primary objective: extending cargo-spellcheck to cover the entire repository, particularly the .github directory.
Docstring Coverage âś… Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • [ ] 📝 Generate docstrings
đź§Ş Generate unit tests (beta)
  • [ ] Create PR with unit tests
  • [ ] Post copyable unit tests in a comment

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b506c918b939dff77a7463722428aebfeb332c71 and 7de371a0f97775b382f7172fd0ab8395d66cd1bb.

đź“’ Files selected for processing (3)
  • .config/spellcheck.toml (1 hunks)
  • docs/docs/users/knowledge_base/docker_tips.md (1 hunks)
  • src/utils/proofs_api/paramfetch.rs (4 hunks)
âś… Files skipped from review due to trivial changes (1)
  • src/utils/proofs_api/paramfetch.rs
đźš§ Files skipped from review as they are similar to previous changes (1)
  • docs/docs/users/knowledge_base/docker_tips.md
🔇 Additional comments (4)
.config/spellcheck.toml (4)

27-53: Transform regex patterns look appropriate.

The expanded set of skip patterns (operators, file extensions, import paths, version identifiers, emojis, etc.) are well-anchored and targeted to avoid false positives without over-matching legitimate prose. Patterns appropriately balance between filtering false positives and preserving spellcheck coverage of actual content.


25-25: Pattern correctly handles documentation by design—no changes needed.

The pattern ^[a-z]+_[a-z]+$ with both anchors is working as intended. It matches only lines that consist entirely of an underscore-separated identifier, which is appropriate for YAML configuration keys. The empty search result across all .md files confirms the pattern does not skip legitimate prose or inline code references—those are longer than a bare identifier and won't match the anchored pattern. The comment "YAML frontmatter - skip underscore_words common in config" confirms the intended scope aligns with the implementation.


9-9: The tokenization change is safe; cargo-spellcheck uses CommonMark parsing for markdown.

cargo-spellcheck uses a CommonMark parser (pulldown-cmark/doc-chunks) to parse docs and extract tokens from the AST rather than relying on regex tokenization. Angle brackets in markdown autolinks (https://...) are recognized by the parser and treated as link nodes, with the URL not tokenized as ordinary prose for spellchecking.

Removing <> from tokenization_splitchars does not affect markdown link handling because markdown syntax is parsed at the AST level by the CommonMark parser, independent of tokenization configuration. The change is intentional and safe.


59-62: Configuration is correct and intentionally minimal.

The [reflow] section with max_line_length = 80 is a supported configuration option. The [nlprules] section has no documented per-section configuration in cargo-spellcheck; nlprules is a checker feature (enable/disable) and configuration is handled by the nlprule library's data/binaries rather than a dedicated TOML section. The empty [nlprules] section is therefore expected and correct. No conflicts or version incompatibilities found.

[!TIP]

📝 Customizable high-level summaries are now available!

You can now customize how CodeRabbit generates the high-level summary in your pull requests — including its content, structure, tone, and formatting.

  • Provide custom instructions to shape the summary (bullet lists, tables, contributor stats, etc.).
  • Use high_level_summary_in_walkthrough to move the summary from the description to the walkthrough section.

Example:

"Create a concise high-level summary as a bullet-point list. Then include a Markdown table showing lines added and removed by each contributing author."


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot] avatar Sep 10 '25 19:09 coderabbitai[bot]

@Signor1 CI is failing can you check it.

akaladarshi avatar Sep 11 '25 03:09 akaladarshi

Hi @akaladarshi, I'm having challenges installing cargo-spellcheck locally.

error: failed to run custom build command for `hunspell-sys v0.3.1`

So I had to use CI to check.

Signor1 avatar Sep 11 '25 11:09 Signor1

Hey @Signor1 could you run make spellcheck and make sure it passes locally?

hanabi1224 avatar Sep 11 '25 13:09 hanabi1224

Hey @hanabi1224, make spellcheck is passing successfully now. Thanks for the support

Signor1 avatar Sep 11 '25 21:09 Signor1

Hey @hanabi1224, make spellcheck is passing successfully now. Thanks for the support

Also, you can let me know if the spellcheck implementation inside the docs folder should be discarded.

Signor1 avatar Sep 11 '25 21:09 Signor1

Hey @hanabi1224 , there's an unrelated CI failure in the docs deployment workflow where yarn dependencies aren't being installed before Wrangler runs.

Signor1 avatar Sep 12 '25 00:09 Signor1

Commits must have verified signatures.

@Signor1 All commits have to to signed to unblock merge. Could you sign all of them?

Also, you can let me know if the spellcheck implementation inside the docs folder should be discarded.

That's fine, we can keep it.

there's an unrelated CI failure in the docs deployment workflow

Retrying, if could still fail due to insufficient permission to deploy from a fork but never mind.

hanabi1224 avatar Sep 12 '25 01:09 hanabi1224

@hanabi1224 , Alright.

Meanwhile, I just signed this PR #6062

Signor1 avatar Sep 12 '25 02:09 Signor1

Hey @Signor1 it requires all commits to be signed before we can merge this PR, please check and make sure every commit has a checkmark https://github.com/ChainSafe/forest/pull/6062/commits

hanabi1224 avatar Sep 12 '25 08:09 hanabi1224

Hey @Signor1 it requires all commits to be signed before we can merge this PR, please check and make sure every commit has a checkmark https://github.com/ChainSafe/forest/pull/6062/commits

Hey @hanabi1224, they're all signed now

Signor1 avatar Sep 12 '25 17:09 Signor1

Hey @Signor1 Thanks for signing all the commits! Please resolve the merge conflict in Cargo.lock and I think we are ready to merge this one.

hanabi1224 avatar Sep 15 '25 08:09 hanabi1224

Hey @Signor1 Thanks for signing all the commits! Please resolve the merge conflict in Cargo.lock and I think we are ready to merge this one.

Hey @hanabi1224 , the conflict has been resolved.

Signor1 avatar Sep 15 '25 09:09 Signor1

@Signor1 There are still a lot of entries that shouldn't be there. Types and variables should be in backticks (it should prevent the spellcheck from complaining about them). Some short forms shouldn't be short. F1..F28 - it's not an entry for the dictionary. Same for v0..v14 and others like this. Also, some names have bad casing, e.g., Devops should be DevOps.

LesnyRumcajs avatar Sep 16 '25 14:09 LesnyRumcajs

Also - feel free to add https://github.com/ChainSafe/forest/tree/main/documentation to exceptions, i.e., don't run checks there. This is outdated and unmaintained; the pages there will be either migrated manually to new docs or removed.

LesnyRumcajs avatar Sep 16 '25 14:09 LesnyRumcajs

@Signor1 Could you please address coderabbitai's and Hubert's comments, resolve conflicts, and ensure the CI tests pass?

elmattic avatar Oct 03 '25 10:10 elmattic

@Signor1 Could you please address coderabbitai's and Hubert's comments, resolve conflicts, and ensure the CI tests pass?

Alright

Signor1 avatar Oct 03 '25 11:10 Signor1

@Signor1 Could you please address coderabbitai's and Hubert's comments, resolve conflicts, and ensure the CI tests pass?

Hi @elmattic , you can check it out.

Signor1 avatar Oct 06 '25 10:10 Signor1

@elmattic, I've resolved the typos.

Signor1 avatar Oct 10 '25 21:10 Signor1

@Signor1 Please make sure the CI check is passing

akaladarshi avatar Oct 16 '25 09:10 akaladarshi

@akaladarshi CI check is fixed. Can this PR be duly checked and merged ? It has taken so much time.

Signor1 avatar Oct 16 '25 12:10 Signor1

@Signor1 CI is still failing you can run the check locally once and verify

akaladarshi avatar Oct 16 '25 12:10 akaladarshi

@akaladarshi Fixed now. All good. Was an oversight

Signor1 avatar Oct 16 '25 12:10 Signor1

@akaladarshi please review. I've fixed the issue

Signor1 avatar Oct 18 '25 15:10 Signor1

@hanabi1224

Signor1 avatar Nov 15 '25 21:11 Signor1