paper-checker
Find simple grammar mistakes in scientific documents.
PaperCheck
PaperCheck is a Python script that searches for simple grammar mistakes in scientific English texts. Unlike other grammar checkers, it is free and tailored for scientific texts, such as papers. It can find words that pass a spell check but are most likely not intended in a scientific context, such as "angel" vs. "angle".
Getting Started
git clone https://github.com/emareg/paper-checker.git
cd paper-checker
make setup
Afterwards, you can use the script in two ways:
1. Run the Python file
python3 papercheck.py -sgy example/testfile.tex
2. Compile as a stand-alone executable (Unix only)
make
./papercheck -sgy example/testfile.tex
Supported file types: .tex .txt .md .pdf
Any issues found are displayed in the terminal and also written to papercheck_report.html
System-wide installation
make install
This will copy the stand-alone executable to ~/.local/bin
Install as a Python package
pip3 install .
cd example
python3 -m papercheck -sgy testfile.tex
Features
Spell Checker (-s option)
Will highlight spelling errors. The script uses a small basic dictionary plus some additional self-made dictionaries for terms such as
- technical: “microcontroller”, “superframe”, “bitmask”
- mathematical: “eigenvector”, “linearization”
- chemical: todo
The larger standard dictionaries are unsuitable because they
- contain errors such as “longitudianl” or “schemati”
- mask informal plural forms such as “vertexes” which should be “vertices”
- include obsolete forms such as “latence” which should be “latency”
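As an illustration of the dictionary approach described above, here is a minimal sketch of a lookup against a small base word list combined with domain-specific lists. It is not PaperCheck's actual implementation: the function name `misspelled` and the tiny word sets are assumptions made up for this example.

```python
import re

# Hypothetical, tiny word lists for illustration only.
BASE_WORDS = {"the", "a", "an", "of", "is", "uses"}
TECHNICAL_WORDS = {"microcontroller", "superframe", "bitmask"}
MATHEMATICAL_WORDS = {"eigenvector", "linearization"}


def misspelled(text, *dictionaries):
    """Return the words in `text` that appear in none of the given word sets."""
    known = set().union(*dictionaries)
    words = re.findall(r"[A-Za-z]+", text)
    return [word for word in words if word.lower() not in known]


print(misspelled("The microcontroller uses a bitmsak", BASE_WORDS, TECHNICAL_WORDS))
```

Keeping the base list small means that a rare or obsolete form is reported instead of being silently accepted, which is the motivation given above for avoiding the larger standard dictionaries.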
Grammar Checker (-g option)
Will highlight simple grammar mistakes such as
- misuse of “a” or “an”
- doubled auxiliary verbs (e.g. “is are”)
- doubled determiners (e.g. “this the”)
- confused “then” vs. “than”
- confused “to” vs. “too”
- wrong person-verb combination (e.g. “This were”)
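Checks such as doubled auxiliary verbs or doubled determiners can be approximated with simple pattern matching. The sketch below is only an assumption about how such a rule might look, not the script's real rule set. The word lists and the function name `find_doubled` are hypothetical.

```python
import re

# Hypothetical word lists for illustration; not taken from PaperCheck.
AUXILIARIES = {"is", "are", "was", "were", "be", "has", "have"}
DETERMINERS = {"a", "an", "the", "this", "these", "those"}


def find_doubled(text, words):
    """Return each pair of adjacent words from `words` separated only by whitespace."""
    alternation = "|".join(sorted(words))
    pattern = re.compile(rf"\b({alternation})\s+({alternation})\b", re.IGNORECASE)
    return [match.group(0) for match in pattern.finditer(text)]


sentence = "The result is are shown in this the figure."
print(find_doubled(sentence, AUXILIARIES))
print(find_doubled(sentence, DETERMINERS))
```

A rule this naive will produce false positives (for example, "Is this a problem?"), so a real checker would need exceptions or additional context.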
Style Checker (-y option)
Will highlight language that could be improved such as
- wrong words in scientific context (e.g. “angle” vs. “angel”)
- unexplained acronyms
- less formal words (e.g. use “entire” instead of “whole”)
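To make the acronym check more concrete, here is a rough sketch under the assumption that an acronym counts as "explained" when it appears somewhere in parentheses, typically right after its long form. The function name `unexplained_acronyms` is hypothetical, and the real script may use a different criterion.

```python
import re


def unexplained_acronyms(text):
    """Return acronyms that never appear in parentheses, in order of first use."""
    acronyms = re.findall(r"\b[A-Z]{2,}\b", text)
    defined = set(re.findall(r"\(([A-Z]{2,})\)", text))
    result = []
    for acronym in acronyms:
        if acronym not in defined and acronym not in result:
            result.append(acronym)
    return result


text = "The media access control (MAC) layer uses TDMA. MAC frames are short."
print(unexplained_acronyms(text))
```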
Plagiarism Checker (-p option)
experimental!
The script will try to find significant sentences, which are then compared to Google search results. This is a very poor approach but useful as a minimal effort with zero cost.
TeX checker
When you run the script on .tex files, it will also check for certain TeX problems such as
- unused labels
- missing periods in figure/table captions
- unused math operators in math mode, e.g. $sin$ instead of $\sin$
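As an example of what a TeX check might involve, the following sketch finds labels that are never referenced. It is an illustration under stated assumptions, not the script's actual code: the function name `unused_labels` is hypothetical, and only a handful of common reference commands are recognized.

```python
import re

# Reference commands considered here; an assumption, the list is not exhaustive.
REF_PATTERN = re.compile(r"\\(?:ref|eqref|autoref|pageref|cref|Cref)\{([^}]+)\}")
LABEL_PATTERN = re.compile(r"\\label\{([^}]+)\}")


def unused_labels(tex):
    """Return labels defined with \\label that no reference command mentions."""
    labels = LABEL_PATTERN.findall(tex)
    referenced = set()
    for match in REF_PATTERN.finditer(tex):
        # \cref{a,b} style commands may list several keys separated by commas.
        referenced.update(key.strip() for key in match.group(1).split(","))
    return [label for label in labels if label not in referenced]


source = r"\section{A}\label{sec:a} See Fig.~\ref{fig:x}. \label{fig:x} \label{eq:y}"
print(unused_labels(source))
```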
Related Work
- LanguageTool: Grammar, Style and Spell Checker written in Java
- textidote: uses LanguageTool on .tex files
So why not use LanguageTool? It is large, slow, and not tailored for scientific/technical texts. However, I recommend using LanguageTool in addition.