[Multiple assignees possible] Add missing tests to Scribe-Data

Open andrewtavis opened this issue 5 months ago • 15 comments

Description

In #598 we made some progress on the tests for Scribe-Data 🧪🚀 This issue would continue the effort to increase the testing coverage of Scribe-Data 😊

The current pytest-cov coverage report is (updated September 20th, 2025):

Name                                                    Stmts   Miss  Cover   Missing
-------------------------------------------------------------------------------------
setup.py                                                   17      3    82%   8-9, 62
src/scribe_data/check/check_project_metadata.py            76      1    99%   219
src/scribe_data/check/check_pyicu.py                       79     64    19%   30-36, 51-54, 69-83, 103-112, 135, 154-220, 224
src/scribe_data/check/check_query_forms.py                226     58    74%   308-309, 313, 321-322, 350-352, 476-479, 523, 545-623, 626-631, 638
src/scribe_data/cli/cli_utils.py                           78      4    95%   83-84, 179, 197
src/scribe_data/cli/contracts/check.py                     66     15    77%   80, 88, 108-113, 125-126, 130, 137-155, 159
src/scribe_data/cli/contracts/filter.py                   110     13    88%   102-130, 135-140
src/scribe_data/cli/convert.py                            147     35    76%   69, 78-79, 104, 122, 153-155, 161-162, 168-170, 225, 236-237, 243-245, 251, 263-264, 314-329, 361-363, 418, 429
src/scribe_data/cli/download.py                           111      9    92%   83, 183-185, 235, 277-283
src/scribe_data/cli/get.py                                 85      8    91%   129-137, 212-215, 222, 290
src/scribe_data/cli/interactive.py                        215     58    73%   155-156, 182-183, 194, 211-226, 255-256, 290-291, 320-321, 328, 372-377, 437-475, 486, 495, 510-513, 589-613, 621-622, 626
src/scribe_data/cli/list.py                                76      2    97%   69, 85
src/scribe_data/cli/main.py                               162     68    58%   413-414, 417-418, 428, 439-441, 448-617, 621
src/scribe_data/cli/total.py                              159     29    82%   86, 95, 104-107, 128, 135-138, 171-189, 194-197, 245, 255, 319-320, 440-443
src/scribe_data/cli/upgrade.py                             36      6    83%   49-51, 56-58, 84
src/scribe_data/cli/version.py                             30      1    97%   75
src/scribe_data/load/data_to_sqlite.py                    167     41    75%   50, 199, 201, 213, 216, 254-256, 277-288, 296, 302-305, 345-361, 372-455
src/scribe_data/load/send_dbs_to_scribe.py                 23      0   100%
src/scribe_data/unicode/generate_emoji_keywords.py         22      1    95%   43
src/scribe_data/unicode/process_unicode.py                 59     46    22%   15, 48-203
src/scribe_data/unicode/unicode_utils.py                    5      0   100%
src/scribe_data/utils.py                                  242     59    76%   51-52, 59-60, 66-67, 73-74, 88, 99, 180-184, 296-301, 324-331, 366-376, 727-740, 746-764, 793-801, 866-877
src/scribe_data/wikidata/check_query/check.py             117     32    73%   291-339, 382
src/scribe_data/wikidata/check_query/query.py              17      0   100%
src/scribe_data/wikidata/check_query/sparql.py             26     15    42%   32-36, 74, 79-91
src/scribe_data/wikidata/parse_dump.py                    365    129    65%   139-140, 152-155, 179, 192, 196, 205, 209, 263-269, 312, 316, 363-374, 388-401, 442-452, 493-494, 501-536, 546, 565-586, 631-632, 634, 638, 653-654, 682-683, 705-706, 722-723, 725, 729, 746-751, 823, 829-874, 877-878, 881-882, 901-903, 910-918
src/scribe_data/wikidata/query_data.py                     95     33    65%   174, 185-193, 217-264
src/scribe_data/wikidata/wikidata_utils.py                 38      2    95%   98, 112
src/scribe_data/wikipedia/extract_wiki.py                 178     78    56%   103, 135-136, 161-170, 192-197, 226-313, 368-371, 374-376, 380, 417-423, 439, 443, 446-447, 461-465
src/scribe_data/wikipedia/generate_autosuggestions.py      26      2    92%   56-59
src/scribe_data/wikipedia/process_wiki.py                 104     55    47%   57, 62, 122-131, 138, 259, 336-437
src/scribe_data/wiktionary/parse_mediaWiki.py              61      3    95%   63, 115-117
-------------------------------------------------------------------------------------
TOTAL                                                    3218    870    73%
Required test coverage of 70% reached. Total coverage: 72.96%
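
For anyone checking the numbers above, here is a minimal sketch of how the Cover column and the total can be reproduced from the Stmts and Miss columns. The helper name `coverage_percent` and the 70% threshold variable are illustrative assumptions, not part of Scribe-Data or pytest-cov; the report rounds to whole percentages, while the final line shows two decimals.

```python
def coverage_percent(statements: int, missed: int) -> float:
    """Return the covered share of statements as a percentage.

    Hypothetical helper for illustration only. A file with no statements
    is treated as fully covered (an assumption of this sketch).
    """
    if statements == 0:
        return 100.0
    return 100.0 * (statements - missed) / statements


# Values taken from the TOTAL row of the report above.
total = coverage_percent(statements=3218, missed=870)

# Threshold taken from the "Required test coverage of 70%" line.
required = 70.0

print(f"Total coverage: {total:.2f}%")   # two decimals, as in the final line
print(f"Rounded for the table: {total:.0f}%")
print(f"Threshold reached: {total >= required}")
```

The same function applied to a single row, for example `coverage_percent(79, 64)` for check_pyicu.py, gives the per-file value before rounding.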

Contribution

Happy to work with people on PRs for this and potentially open a few myself! 🚀

andrewtavis avatar Jun 27 '25 20:06 andrewtavis

Ping @angrezichatterbox on this one 👋 Do you think that this could be something that people at the hackathon would have interest in? If need be we could make sub-issues for this 😊

andrewtavis avatar Jun 27 '25 20:06 andrewtavis

> Ping @angrezichatterbox on this one 👋 Do you think that this could be something that people at the hackathon would have interest in? If need be we could make sub-issues for this 😊

We could have people interested in it. I will ping in Element for further discussion.

angrezichatterbox avatar Jun 28 '25 09:06 angrezichatterbox

I'd like to be assigned to this issue.

GraceBocek avatar Aug 04 '25 22:08 GraceBocek

Great to have you on the project, @GraceBocek! Please let us know which files you'd like to focus on for a first PR :) No need for our approval of the files to get started though. Feel free to look into whatever seems fitting 😊

andrewtavis avatar Aug 05 '25 08:08 andrewtavis

> Please let us know which files you'd like to focus on for a first PR :)

Is it okay for me to start with testing check_missing_forms.py, the first file on the above list?

GraceBocek avatar Aug 09 '25 18:08 GraceBocek

No stress if you've already started on it, @GraceBocek, but the check_missing_* files would actually be ones that I'd avoid for now as #626 from @harikrishnatp will change them :) No other files above would be ones you'd need to avoid 😊

andrewtavis avatar Aug 10 '25 13:08 andrewtavis

@andrewtavis I will work on testing check_project_metadata.py then, which is in the check folder but outside of the check_missing_forms folder.

GraceBocek avatar Aug 12 '25 15:08 GraceBocek

Fantastic, @GraceBocek! Looking forward to the PR 😊

andrewtavis avatar Aug 12 '25 16:08 andrewtavis

Hi. I would like to work on src/scribe_data/check/check_missing_forms/get_forms.py and src/scribe_data/check/check_missing_forms/normalize_forms.py 😊

catreedle avatar Sep 10 '25 03:09 catreedle

Thanks for offering to support here, @catreedle! Let's wait on the check directory for now as we're in the midst of a rewrite, but anything in src/scribe_data/cli/, src/scribe_data/load/, src/scribe_data/unicode/, src/scribe_data/utils.py or src/scribe_data/wikidata/ would be fine!

andrewtavis avatar Sep 10 '25 04:09 andrewtavis

OK, then. I'll start with src/scribe_data/load/. Thank you for the heads up, @andrewtavis 😊

catreedle avatar Sep 10 '25 14:09 catreedle

Thanks so much for picking this up, @catreedle! 😊

andrewtavis avatar Sep 10 '25 14:09 andrewtavis

Quick update here: I've updated the issue text with the most recent coverage report given the work that's been done to add tests for check_project_metadata and src/scribe_data/load/ 😊 We've been able to improve the coverage threshold to 70%! Happy to discuss another file to focus on :)

andrewtavis avatar Sep 20 '25 13:09 andrewtavis

I'd like to work on further tests for src/scribe_data/wikidata/query_data.py.

GraceBocek avatar Nov 21 '25 23:11 GraceBocek

Thanks so much for writing in, @GraceBocek! 😊

andrewtavis avatar Nov 22 '25 11:11 andrewtavis

I will be working next on further tests for wikidata/parse_dump.py.

GraceBocek avatar Dec 02 '25 16:12 GraceBocek