
[Submission] LangLint — String-Level Translation and Linting of Code Comments and Docstrings for Research Software

Status: Open. HzaCode opened this issue 2 months ago • 3 comments

  • Submitting Author: @HzaCode
  • All current maintainers: @HzaCode
  • Package Name: LangLint
  • One-Line Description of Package: A Rust-powered, code-aware toolkit that extracts, validates, and translates multilingual strings (comments, docstrings, string literals) in scientific software.
  • Repository Link: https://github.com/HzaCode/Langlint
  • Version submitted: v1.0.0
  • EiC: @yeelauren
  • Editor: TBD
  • Reviewer 1: TBD
  • Reviewer 2: TBD
  • Archive: TBD
  • JOSS DOI: TBD
  • Version accepted: TBD
  • Date accepted (month/day/year): TBD


Code of Conduct & Commitment to Maintain Package

  • [x] I agree to abide by pyOpenSci's Code of Conduct during the review process and in future interactions in spaces supported by pyOpenSci should it be accepted.
  • [x] I have read and will commit to package maintenance after the review as per the pyOpenSci Policies Guidelines.

Description

LangLint enforces multilingual consistency at the string level inside code (comments, docstrings, and string literals) rather than across full documents. It extracts human-language units from parsed code, validates their language consistency, and optionally translates them while leaving executable syntax untouched. The core is implemented in Rust (PyO3/maturin); we observed 10–50× speedups over a prior Python implementation. The test suite runs offline by default, using a mock translator, for reproducibility. Remote providers (e.g., OpenAI, DeepL, Google Cloud Translate, LibreTranslate) are opt-in and require explicit user configuration, and the documentation describes how to use each in compliance with its Terms of Service. LangLint integrates with CI/CD (GitHub Actions, pre-commit), so teams can enforce multilingual consistency much as they enforce style with Ruff.
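To make the extract-then-validate idea concrete, here is a minimal sketch using only the Python standard library. It is not LangLint's implementation or API (the actual core is written in Rust). The names `extract_text_units`, `non_ascii_units`, and `SAMPLE` are hypothetical, and the non-ASCII check is only a crude stand-in for real language detection.

```python
import ast
import io
import tokenize


def extract_text_units(source: str) -> list[tuple[str, int, str]]:
    """Return (kind, line, text) for each comment and docstring in `source`."""
    units = []

    # Comments never reach the AST, so read them from the token stream.
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            units.append(("comment", tok.start[0], tok.string.lstrip("#").strip()))

    # A docstring is the leading string expression of a module, class, or function.
    documented = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, documented):
            doc = ast.get_docstring(node)
            if doc:
                units.append(("docstring", node.body[0].lineno, doc))

    return sorted(units, key=lambda unit: unit[1])


def non_ascii_units(units: list[tuple[str, int, str]]) -> list[tuple[str, int, str]]:
    """Crude consistency check (an assumption): flag units containing non-ASCII text."""
    return [unit for unit in units if not unit[2].isascii()]


# Hypothetical input: English docstring, one Chinese comment, one English comment.
SAMPLE = '''\
def area(r):
    """Return the area of a circle."""
    # 半径的平方乘以 pi
    return 3.14159 * r * r  # uses a rounded value of pi
'''

if __name__ == "__main__":
    for kind, line, text in non_ascii_units(extract_text_units(SAMPLE)):
        print(f"line {line}: {kind} may not be in English: {text}")
```

Reading comments from tokens and docstrings from the AST means a `#` inside a string literal is never mistaken for a comment, which is the sense in which such a tool is code-aware.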


Scope

  • [x] Data extraction
  • [x] Data processing/munging
  • [x] Data validation and testing
  • [x] Workflow automation

Domain Specific

  • [ ] Geospatial
  • [ ] Education

Community Partnerships

  • [ ] Astropy
  • [ ] Pangeo

Why it fits the scope

  • Target audience: Research software engineers and scientists maintaining multilingual codebases where comments/docstrings/strings are not all in English.
  • Scientific applications: Improves readability, reproducibility, and FAIR compliance of research code by standardizing the linguistic layer that explains methods and assumptions.
  • Other packages: Linters (Ruff/Flake8) focus on syntax/style; translation libraries focus on free text. LangLint uniquely bridges both—it is code-aware, extracting only translatable string units and validating/transforming them without altering code semantics.
  • Pre-submission enquiry: N/A (initial submission).
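The claim that string units can be transformed without altering code semantics can be checked mechanically. The sketch below is an illustration under stated assumptions, not LangLint's code: `rewrite_comments` and `same_ast` are hypothetical helpers, and the lambda stands in for a real translator. It rewrites only comment tokens and then confirms that the parsed AST is unchanged.

```python
import ast
import io
import tokenize
from typing import Callable


def rewrite_comments(source: str, transform: Callable[[str], str]) -> str:
    """Replace the text of every comment, leaving all other characters untouched."""
    # Split lines the same way the tokenizer reads them, so row numbers agree.
    lines = io.StringIO(source).readlines()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.COMMENT:
            continue
        row, col = tok.start
        line = lines[row - 1]
        body = line.rstrip("\r\n")
        ending = line[len(body):]
        new_text = transform(tok.string.lstrip("#").strip())
        # A comment always runs to the end of its line, so everything before
        # `col` is code and is copied through verbatim.
        lines[row - 1] = body[:col] + "# " + new_text + ending
    return "".join(lines)


def same_ast(before: str, after: str) -> bool:
    """True if both sources parse to the same tree (positions are ignored)."""
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


if __name__ == "__main__":
    # The '#' inside the URL string is a STRING token, not a comment.
    src = 'url = "http://x/#frag"  # 链接\n'
    out = rewrite_comments(src, lambda text: "link")  # stand-in for a translator
    print(out, end="")
    print("AST unchanged:", same_ast(src, out))
```

Note that this check covers comments only. Docstrings and string literals are part of the AST, so translating them changes the tree by design and would need a comparison that ignores those string values.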

Technical checks

This package:

  • [x] does not violate the Terms of Service of any service it interacts with. Notes: Tests/CI run offline using a mock translator. Any remote translators are opt-in, require explicit configuration (e.g., API keys), and are documented to be used in accordance with each provider’s ToS.
  • [x] uses an OSI-approved license (MIT).
  • [x] contains a README with instructions for installing the development version.
  • [x] includes documentation with examples for all functions.
  • [x] contains a tutorial with examples of its essential functions and uses.
  • [x] has a test suite.
  • [x] has continuous integration setup (GitHub Actions).

Benchmarking: We provide (or will provide) a simple, reproducible benchmark script (e.g., bench/) to substantiate performance claims.
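The offline-by-default testing mentioned in the Terms of Service note above can be pictured with a small sketch. It assumes a translator interface; `Translator`, `MockTranslator`, and `translate_units` are hypothetical names chosen for illustration and are not taken from the LangLint codebase.

```python
from typing import Protocol


class Translator(Protocol):
    """Anything that can translate one text unit into a target language."""

    def translate(self, text: str, target_lang: str) -> str: ...


class MockTranslator:
    """Deterministic stand-in that never touches the network."""

    def translate(self, text: str, target_lang: str) -> str:
        # Tag the text instead of translating it, so tests stay reproducible.
        return f"[{target_lang}] {text}"


def translate_units(
    units: list[str], translator: Translator, target_lang: str = "en"
) -> list[str]:
    """Translate each unit with whichever backend the caller passes in."""
    return [translator.translate(unit, target_lang) for unit in units]


if __name__ == "__main__":
    print(translate_units(["你好", "bonjour"], MockTranslator()))
```

Because the backend is passed in explicitly, a remote provider would only be used when a user constructs and supplies one, which matches the opt-in behaviour described in the checklist.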

Publication Options

Note: JOSS accepts pyOpenSci’s review. We will link this issue in the JOSS submission and indicate that the package has undergone pyOpenSci review.


Are you OK with Reviewers Submitting Issues and/or pull requests to your Repo Directly?

  • [x] Yes I am OK with reviewers submitting requested changes as issues/PRs to my repo. Reviewers will then link to the issues in their submitted review.

Confirm each of the following:

  • [x] I have read the [author guide](https://www.pyOpenSci.org/software-peer-review/how-to/author-guide.html).
  • [x] I expect to maintain this package for at least 2 years and can help find a replacement for the maintainer (team) if needed.

HzaCode, Oct 20 '25 01:10

Thanks for your submission(s), @HzaCode. I will follow up with the EiC checks soon!

yeelauren, Oct 24 '25 18:10

@yeelauren Thanks so much! Really looking forward to the review process 🙌

HzaCode, Oct 25 '25 00:10

Hi @yeelauren,

Thanks for the heads-up last week! Is there any update on the pre-review checks?

Let me know if you need anything from me. Thanks!

HzaCode, Nov 03 '25 05:11