Use pospell
https://pypi.org/project/pospell/
There are Polish dictionaries available for hunspell, which pospell uses; we could leverage them to improve the quality of the translation. It would require some configuration (an extra custom dictionary and skipping code blocks). We could look at the other languages' setups.
% pospell --language pl tutorial/*.po
…
tutorial/stdlib2.po:701:heappop
tutorial/stdlib2.po:778:wywnioskowując
tutorial/stdlib2.po:778:Decimal
tutorial/stdlib2.po:791:modulo
tutorial/venv.po:35:Pythonowe
tutorial/venv.po:146:bash
tutorial/venv.po:187:deaktywować
tutorial/venv.po:199:pragramu
tutorial/venv.po:210:podkomend
tutorial/venv.po:210:install
tutorial/venv.po:210:uninstall
tutorial/venv.po:210:freeze
tutorial/venv.po:239:podajac
tutorial/whatnow.po:43:tutorial
tutorial/whatnow.po:77:Szegółowe
tutorial/whatnow.po:100:Cheese
tutorial/whatnow.po:111:Cookbook
tutorial/whatnow.po:111:Wydawnicto
tutorial/whatnow.po:111:Reilly
tutorial/whatnow.po:130:Scientific
tutorial/whatnow.po:172:Cheese
It looks good, though it may get annoying with Polonized words like Pythonowe and identifiers like heappop. I will look into the other repos.
python-docs-es has a nice solution: a Python script that merges (at runtime) a base dictionary (with words common to all docs) and a per-doc dictionary. This reduces duplication if you want one dictionary file per doc, and avoids a huge single-file dictionary.
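A rough sketch of that merging idea, not the actual python-docs-es script. The file layout and the function names are made up for illustration, and I am assuming the merged word list is passed to pospell through its personal dictionary option (-p):

```python
from pathlib import Path


def merge_dictionaries(base: Path, per_doc: Path, merged: Path) -> list[str]:
    """Write the sorted union of two word lists (one word per line) to `merged`."""
    words: set[str] = set()
    for source in (base, per_doc):
        if not source.exists():  # a doc may not have its own dictionary yet
            continue
        for line in source.read_text(encoding="utf-8").splitlines():
            word = line.strip()
            if word:
                words.add(word)
    ordered = sorted(words)
    merged.write_text("\n".join(ordered) + "\n", encoding="utf-8")
    return ordered


def pospell_command(po_file: Path, merged: Path, language: str = "pl") -> list[str]:
    """Build (but do not run) a pospell call using the merged dictionary."""
    # Assumption: -p is pospell's personal dictionary option.
    return ["pospell", "--language", language, "-p", str(merged), str(po_file)]
```

The per-doc file only needs the words specific to that document (say, heappop for stdlib2), while names like Decimal live once in the base file.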
Bigger issue: pospell crashes on a code block. How do we exclude them?
<rst-doc>:7: (ERROR/3) Unexpected indentation. while parsing: class Parrot:
    def __init__(self):
        self._voltage = 100000
    @property
    def voltage(self):
        """Uzyska aktualne napięcie.""
        return self._voltage
Traceback (most recent call last):
<rst-doc>:3: (ERROR/3) Unexpected indentation. while parsing: # punkt to dwukrotka (x, y)
match point:
    case (0, 0):
        print("Początek")
    case (0, y):
        print(f"Y={y}")
    case (x, 0):
        print(f"X={x}")
    case (x, y):
        print(f"X={x}, Y={y}")
    case _:
        raise ValueError("Nie punkt")
  File "/opt/hostedtoolcache/Python/3.13.3/x64/bin/pospell", line 8, in <module>
    sys.exit(main())
             ~~~~^^
  File "/opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/pospell.py", line 480, in main
    errors = spell_check(
        args.po_file,
    ...<4 lines>...
        args.jobs,
    )
  File "/opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/pospell.py", line 384, in spell_check
    errors = flatten(
        pool.map(
    ...<2 lines>...
        )
    )
  File "/opt/hostedtoolcache/Python/3.13.3/x64/lib/python3.13/site-packages/pospell.py", line 342, in flatten
    return [element for a_list in list_of_lists for element in a_list]
                                                               ^^^^^^
TypeError: 'int' object is not iterable
I believe Sphinx should add a code-block flag to msgids generated from code blocks in the gettext builder. pospell could then let us exclude those msgids from checking.
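To make the idea concrete, here is a minimal sketch of what such filtering could look like on the consumer side. Neither Sphinx nor pospell does this today; the flag name code-block and the function name are hypothetical, and the parser is a simplification that only relies on PO entries being separated by blank lines and flags living in "#," comments:

```python
import re


def entries_without_flag(po_text: str, flag: str = "code-block") -> list[str]:
    """Return the raw PO entries that do NOT carry `flag` in a '#,' comment."""
    kept = []
    # Entries in a PO file are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", po_text.strip()):
        flags: set[str] = set()
        for line in block.splitlines():
            if line.startswith("#,"):  # e.g. "#, fuzzy, python-format"
                flags.update(part.strip() for part in line[2:].split(","))
        if flag not in flags:
            kept.append(block)
    return kept
```

A spell checker would then only look at the msgstr of the entries that come back, so code blocks like the Parrot class would never reach the reST parser.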
Bigger issue: pospell crashes on a code block. How do we exclude them?
In python-docs-pt-br, when I was getting tons of sphinx-lint errors because literal blocks were being extracted, my workaround was the following: 1) run make gettext with literal blocks disabled, to generate POT files without them; 2) run 'sphinx-intl update' to update the PO files from the newly generated POT files; 3) run pospell; 4) discard the changes to the PO files (or simply don't commit them).
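The four steps above could be scripted roughly like this. This is a sketch under assumptions, not the actual python-docs-pt-br setup: the directory names are placeholders, I am assuming literal blocks can be switched off by overriding gettext_additional_targets with sphinx-build -D, and a real repo may need extra sphinx-intl options (such as the locale directory) to match its layout:

```python
import subprocess


def workaround_commands(doc_dir: str, pot_dir: str, po_files: list[str],
                        language: str = "pl") -> list[list[str]]:
    """Return the commands for steps 1-4, in order, without running them."""
    return [
        # 1) Generate POT files without literal blocks (assumed override).
        ["sphinx-build", "-b", "gettext",
         "-D", "gettext_additional_targets=index", doc_dir, pot_dir],
        # 2) Update the PO files from the freshly generated POT files.
        ["sphinx-intl", "update", "-p", pot_dir, "-l", language],
        # 3) Spell-check the updated PO files.
        ["pospell", "--language", language, *po_files],
        # 4) Throw away the PO changes so they are never committed.
        ["git", "checkout", "--", "."],
    ]


def run_workaround(commands: list[list[str]], dry_run: bool = True) -> None:
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=False: pospell exits non-zero when it finds typos,
            # and step 4 should still run afterwards.
            subprocess.run(command, check=False)
```

With dry_run left at True this only prints the commands, which is a safe way to check them against your own repo before letting it touch the PO files.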
I believe Sphinx should add a code-block flag to msgids generated from code blocks in the gettext builder.
@m-aciek Do you know whether there is an issue filed for this in Sphinx?
There isn't one yet, as far as I know.
I could not find anything either. pospell has recently been improved to display multi-line messages better, though there is still a lot to be done.
Christmas came early this year: gettext is implementing a custom sticky flag, which could be used to say "this is a code block". See https://lists.gnu.org/archive/html/bug-gettext/2025-06/msg00018.html
I have already asked Transifex to support it. Should it be reported to pybabel, Sphinx, or both, so that they add support for this new feature?
Thanks for sharing. Hm, so the motivation for introducing a second prefix is to increase reliability in tooling. I think it's definitely worth sharing with those projects! Sphinx writes PO files directly and, as far as I know, uses gettext, so it probably doesn't need changes to support it. Pybabel parses the files to update them, so it probably should be changed.
By the way, current tooling should also handle our custom flag correctly, even without this new syntax. Edit: hm, or not?
Do you know whether there is an issue filed for this in Sphinx?
For the reference: https://github.com/sphinx-doc/sphinx/issues/13722