Fixing misspellings in test vectors and other files
Hello,
While searching for possible misspellings in a project (secp256k1), I encountered:
src/wycheproof/ecdsa_secp256k1_sha256_bitcoin_test.json:13: occurences ==> occurrences
src/wycheproof/ecdsa_secp256k1_sha256_bitcoin_test.json:50: knowning ==> knowing
src/wycheproof/ecdsa_secp256k1_sha256_bitcoin_test.json:60: knowning ==> knowing
src/wycheproof/ecdsa_secp256k1_sha256_bitcoin_test.json:98: implemenation ==> implementation
This file was copied from the Wycheproof project (testvectors_v1/ecdsa_secp256k1_sha256_bitcoin_test.json) so I am opening an issue here to get the misspellings fixed.
While at it, I ran codespell on the whole project and found some similar misspellings in several test vector files:
testvectors_v1/aes_wrap_test.json:13: becames ==> becomes, became
testvectors_v1/aria_wrap_test.json:13: becames ==> becomes, became
testvectors_v1/camellia_wrap_test.json:13: becames ==> becomes, became
testvectors_v1/dsa_2048_224_sha224_p1363_test.json:12: occurences ==> occurrences
testvectors_v1/dsa_2048_224_sha224_p1363_test.json:22: knowning ==> knowing
testvectors_v1/dsa_2048_224_sha224_test.json:11: occurences ==> occurrences
testvectors_v1/dsa_2048_224_sha224_test.json:37: knowning ==> knowing
testvectors_v1/dsa_2048_224_sha224_test.json:47: knowning ==> knowing
testvectors_v1/dsa_2048_224_sha256_p1363_test.json:12: occurences ==> occurrences
testvectors_v1/dsa_2048_224_sha256_p1363_test.json:22: knowning ==> knowing
testvectors_v1/dsa_2048_224_sha256_test.json:11: occurences ==> occurrences
testvectors_v1/dsa_2048_224_sha256_test.json:37: knowning ==> knowing
testvectors_v1/dsa_2048_224_sha256_test.json:47: knowning ==> knowing
This tool also identified several misspellings in documentation and other files:
BUILD.bazel:67: specifing ==> specifying
BUILD.bazel:87: specifing ==> specifying
BUILD.bazel:140: specifing ==> specifying
BUILD.bazel:160: specifing ==> specifying
doc/aesgcm.md:13: featur ==> feature
doc/aesgcm.md:15: authencation ==> authentication
doc/aesgcm.md:77: returing ==> returning
doc/bib.md:4: thrid ==> third
doc/bib.md:196: condidtions ==> conditions
doc/dh.md:69: possiblity ==> possibility
doc/dh.md:71: chould ==> should, could
doc/ecdh.md:95: compuatation ==> computation
doc/ecdh.md:97: compuation ==> computation
doc/ecdsa.md:22: messag ==> message
doc/files.md:601: inlcude ==> include
doc/formats.md:105: verion ==> version
doc/index.md:9: individial ==> individual
doc/json_web_crypto.md:73: stringly ==> strongly, stringy
doc/rsa.md:23: initalization ==> initialization
doc/rsa.md:296: certifiates ==> certificates
doc/rsa.md:297: distinguised ==> distinguished
doc/rsa.md:432: makeing ==> making
doc/spongycastle.md:18: possbile ==> possible, possibly
doc/spongycastle.md:33: sepcify ==> specify
doc/spongycastle.md:36: initialze ==> initialize
doc/spongycastle.md:37: resuse ==> reuse, refuse, resume
doc/spongycastle.md:125: taht ==> that
doc/types.md:236: unamed ==> unnamed
doc/types.md:886: hasing ==> hashing
doc/types.md:922: ore ==> or
doc/types.md:923: ore ==> or
doc/types.md:931: alternatve ==> alternative
doc/types.md:943: corretly ==> correctly
java/com/google/security/wycheproof/AccpAllTests.java:28: accidential ==> accidental
java/com/google/security/wycheproof/EcUtil.java:56: becuase ==> because
java/com/google/security/wycheproof/EcUtil.java:240: explicitely ==> explicitly
java/com/google/security/wycheproof/EcUtil.java:340: alogorithm ==> algorithm
java/com/google/security/wycheproof/EcUtil.java:340: preferrable ==> preferable
java/com/google/security/wycheproof/EcUtil.java:547: genrator ==> generator
java/com/google/security/wycheproof/TestResult.java:67: compuation ==> computation
java/com/google/security/wycheproof/TestUtil.java:126: acces ==> access
java/com/google/security/wycheproof/TestVectors.java:55: intentially ==> intentionally
...
What would be the best approach to get these misspellings fixed? If it helps, I can submit a Pull Request.
Thanks a lot for pointing out these misspellings.
The best place to fix the misspellings in the test vector files would be the generation code. Unfortunately, I don't have access to that code. Some of the documentation was generated from the same source code (e.g. types.md and files.md). These files have not been updated and are out of sync with the test vectors. The other files were written in a Google-specific Markdown variant and then converted. So unless we get access to the project, updating these files seems to be more trouble than it is worth.
About the test vectors themselves: Cryptocurrencies often use slightly modified primitives with different key formats, signature formats, or signature verification. One particular novelty is the use of public key recovery in some protocols. Public key recovery adds new edge cases and hence new potential for mistakes. I have some new test vector generation code and could potentially generate new test vectors for such protocols.
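For readers unfamiliar with public key recovery, here is a minimal sketch in pure Python of how it works on secp256k1. It is not Wycheproof code and not constant-time. The function names (`add`, `mul`, `sign`, `recover`) and the `recid` encoding (bit 0 is the parity of R.y, bit 1 is set when R.x is at least the group order) are assumptions made for illustration, and the private key and nonce are toy values.

```python
import hashlib

# secp256k1 domain parameters (public constants).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def add(a, b):
    """Add two affine points; None stands for the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def mul(k, pt):
    """Double-and-add scalar multiplication (illustrative, not constant-time)."""
    k %= N
    result = None
    while k:
        if k & 1:
            result = add(result, pt)
        pt = add(pt, pt)
        k >>= 1
    return result

def sign(z, d, k):
    """Toy ECDSA signing with an explicit nonce k; returns (r, s, recid)."""
    R = mul(k, G)
    r = R[0] % N
    s = pow(k, -1, N) * (z + r * d) % N
    recid = (R[1] & 1) | (2 if R[0] >= N else 0)
    return r, s, recid

def recover(z, r, s, recid):
    """Return the public key Q = r^-1 * (s*R - z*G), or None if R cannot be rebuilt."""
    if not (0 < r < N and 0 < s < N):
        return None
    x = r + (recid >> 1) * N          # edge case: R.x may have wrapped past N
    if x >= P:
        return None
    y_sq = (pow(x, 3, P) + 7) % P
    y = pow(y_sq, (P + 1) // 4, P)    # square root works because P % 4 == 3
    if y * y % P != y_sq:
        return None                   # edge case: x is not on the curve
    if (y & 1) != (recid & 1):
        y = P - y
    r_inv = pow(r, -1, N)
    return add(mul(-z * r_inv, G), mul(s * r_inv, (x, y)))

# Usage with toy values (never use a fixed nonce in real code).
d = 0x1F2E3D4C5B6A79788796A5B4C3D2E1F0
z = int.from_bytes(hashlib.sha256(b"example message").digest(), "big")
r, s, recid = sign(z, d, k=0xC0FFEE)
assert recover(z, r, s, recid) == mul(d, G)
assert recover(z, r, s, recid ^ 1) != mul(d, G)  # wrong parity gives another key
```

The last assertion illustrates one of the edge cases: the same (r, s) pair is consistent with more than one public key, so the recovery id has to be handled correctly.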
What would be the best approach to get these misspellings fixed? If it helps, I can submit a Pull Request.
I think a PR fixing these would be welcome. Fixing them in the original generator code would be the ideal approach, but doing the straightforward direct fixes seems like an acceptable way to make immediate progress.
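As a rough sketch of how the direct fixes could be scripted, the snippet below parses report lines in the `path:line: wrong ==> suggestion` form shown in this issue and applies only the unambiguous ones, leaving entries with several suggestions (e.g. `chould ==> should, could`) for manual review. The function names `parse_report` and `apply_fixes` are hypothetical, and the script assumes the reported line numbers still match the files on disk.

```python
import re
from pathlib import Path

# Matches report lines such as "doc/dh.md:69: possiblity ==> possibility".
REPORT_LINE = re.compile(
    r"^(?P<path>[^:]+):(?P<line>\d+): (?P<wrong>\S+) ==> (?P<fixes>.+)$"
)

def parse_report(report):
    """Yield (path, line_no, wrong, [suggestions]) for each well-formed line."""
    for raw in report.splitlines():
        m = REPORT_LINE.match(raw.strip())
        if m:
            fixes = [f.strip() for f in m["fixes"].split(",")]
            yield m["path"], int(m["line"]), m["wrong"], fixes

def apply_fixes(report, root="."):
    """Apply single-suggestion fixes in place; return entries needing review."""
    skipped = []
    for path, line_no, wrong, fixes in parse_report(report):
        if len(fixes) != 1:
            skipped.append((path, line_no, wrong, fixes))
            continue
        target = Path(root) / path
        # newline="" keeps the file's original line endings untouched.
        with open(target, encoding="utf-8", newline="") as f:
            lines = f.read().split("\n")
        pattern = rf"\b{re.escape(wrong)}\b"
        lines[line_no - 1] = re.sub(pattern, fixes[0], lines[line_no - 1])
        with open(target, "w", encoding="utf-8", newline="") as f:
            f.write("\n".join(lines))
    return skipped
```

Calling `apply_fixes(report_text, root="path/to/checkout")` rewrites the affected lines and returns the ambiguous entries. Restricting each substitution to the reported line avoids touching identical strings elsewhere in a file, such as test vector data.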