Metabug: Improving C-level coverage
[edit by @encukou, May 2024] The coverage report and checklist are outdated. Run coverage locally before contributing.
This bug is going to be used to track work in other bugs to improve the C-level coverage of the CPython test suite.
There is a set of ~~baseline coverage results on main~~ [edit: outdated, see above] that can be used to find coverage gaps.
The plan, discussed on discuss.python.org, is as follows:
- Read through the coverage report and record any notable gaps in the checklist below. The goal is not 100% coverage, and each area of improvement will probably require some judgement calls. For example, covering all cases where memory exhaustion can occur is probably not worth the effort. On the other hand, detailed coverage in the eval loop may be worth the effort.
- When someone has "read through" a particular source file and created subitems for any interesting gaps, they should check it off on the list below and add links to any issues created.
Related work:
There is related work to publish coverage results from CPython on a regular basis, but this issue is concerned with using those results to actually reduce our gaps in coverage.
List of source files:
- [x] `Include/internal/pycore_asdl.h`
- [ ] `Include/internal/pycore_bitutils.h`
- [ ] `Include/internal/pycore_call.h`
- [ ] `Include/internal/pycore_code.h`
- [ ] `Include/internal/pycore_frame.h`
- [ ] `Include/internal/pycore_moduleobject.h`
- [ ] `Include/internal/pycore_object.h`
- [ ] `Include/internal/pycore_pymath.h`
- [ ] `Include/internal/pycore_pymem.h`
- [ ] `Include/internal/pycore_pystate.h`
- [ ] `Include/object.h`
- [ ] `Include/pydtrace.h`
- [x] `Objects/abstract.c`
  - [ ] Buffer related functions: `PyBuffer_FromContiguous`, `PyObject_CopyData`, `PyBuffer_FillContiguousStrides`
  - [ ] `PyNumber_Check` doesn't test `complex`
  - [ ] `PySequence_Repeat` and `PySequence_InPlaceRepeat` have no coverage
  - [x] `PySequence_SetItem` with a negative index is untested
  - [ ] `PySequence_SetSlice` and `PySequence_DelSlice` are untested
  - [ ] `PyMapping_HasKey` and `PyMapping_HasKeyString` are untested
- [x] `Objects/accu.c`
- [x] `Objects/boolobject.c`
  - [x] #94859
- [x] `Objects/bytearrayobject.c`
  - [x] #95802
- [x] `Objects/bytes_methods.c`
- [x] `Objects/bytesobject.c`
  - [x] ~In `PyBytes_FromFormatV`, the special handling of `%p` isn't tested.~ It's only the case where the underlying libc is broken that isn't tested.
  - [x] #95895
- [ ] `Objects/call.c`
  - [ ] `PyEval_CallObjectWithKeywords` has no coverage
  - [ ] `_PyObject_CallMethodId_SizeT` has no coverage
- [x] `Objects/capsule.c`
- [x] `Objects/cellobject.c`
- [x] `Objects/classobject.c`
- [x] `Objects/codeobject.c`
  - [x] #94814
  - [x] #96609
  - [ ] #94816
- [ ] `Objects/complexobject.c`
- [x] `Objects/descrobject.c`
- [ ] `Objects/dictobject.c`
  - [ ] In `dictresize`, "convert split table into new combined table" is uncovered.
  - [ ] `_PyDict_GetItemHint` has no coverage
- [x] `Objects/enumobject.c`
- [x] `Objects/exceptions.c`
- [ ] `Objects/fileobject.c`
  - [ ] `PyFile_FromFd` has no coverage
  - [ ] `PyFile_GetLine` over `bytes` input has no coverage
- [x] `Objects/floatobject.c`
  - [x] #94860
- [x] `Objects/frameobject.c`
  - [ ] frame_setlineno has poor coverage in its helper functions get_arg and mark_stacks.
  - [ ] _PyFrame_GetState has a switch statement where only the default case is covered.
  - [ ] In _PyFrame_FastToLocalsWithError there is no test that exercises the COPY_FREE_VARS case.
- [x] `Objects/funcobject.c`
  - [ ] A bunch of API is untested: `PyFunction_GetCode`, `PyFunction_GetGlobals`, `PyFunction_GetModule`, `PyFunction_GetDefaults`, `PyFunction_SetDefaults`, `PyFunction_GetKwDefaults`, `PyFunction_SetKwDefaults`, `PyFunction_GetClosure`, `PyFunction_SetClosure`, `PyFunction_GetAnnotations`, `PyFunction_SetAnnotations`
  - [x] #98449
  - [x] #98317
- [x] `Objects/genericaliasobject.c`
- [x] `Objects/genobject.c`
  - [ ] `gen_new_with_qualname` and API `PyGen_NewWithQualName` and `PyGen_New` have no coverage.
  - [ ] `PyCoro_New` has no coverage
  - [ ] `PyAsyncGen_New` has no coverage
  - [ ] `async_gen_athrow_send` has poor coverage
- [x] `Objects/interpreteridobject.c`
- [x] `Objects/iterobject.c`
  - [x] #95923
- [x] `Objects/listobject.c`
- [x] `Objects/longobject.c`
  - [ ] `_PyLong_Size_t_Converter` has no coverage
  - [ ] `long_format_binary` doesn't test outputting to UCS2 or UCS4
  - [ ] `int_bit_length_impl` and `int_bit_count_impl` don't cover the case where the expression overflows
- [x] `Objects/memoryobject.c`
  - [ ] `init_slice` is not well-covered
- [x] `Objects/methodobject.c`
- [x] `Objects/moduleobject.c`
  - [ ] `PyModule_GetFilename` has no coverage
- [x] `Objects/namespaceobject.c`
- [x] `Objects/object.c`
  - [ ] `PyObject_Print` has no coverage
  - [x] `PyObject_Bytes` does not test the case where there is a `__bytes__`
  - [x] #96627
  - [ ] `PyObject_SetAttrString` doesn't test when object has a `tp_setattr`
  - [ ] `PyObject_GetAttrString` doesn't test when object has a `tp_getattr`
  - [ ] `_PyObject_LookupAttr` doesn't test when object has a `tp_getattr`
- [ ] `Objects/obmalloc.c`
- [x] `Objects/odictobject.c`
- [x] `Objects/picklebufobject.c`
  - [ ] `PyPickleBuffer_FromObject`, `PyPickleBuffer_Release` have no coverage
- [x] `Objects/rangeobject.c`
- [x] `Objects/setobject.c`
- [x] `Objects/sliceobject.c`
  - [ ] `PySlice_GetIndices`/`PySlice_GetIndicesEx` have no coverage
- [ ] `Objects/stringlib/codecs.h`
- [ ] `Objects/stringlib/count.h`
- [ ] `Objects/stringlib/ctype.h`
- [ ] `Objects/stringlib/eq.h`
- [ ] `Objects/stringlib/fastsearch.h`
  - [x] #96760
- [ ] `Objects/stringlib/find.h`
- [ ] `Objects/stringlib/find_max_char.h`
- [ ] `Objects/stringlib/join.h`
- [ ] `Objects/stringlib/localeutil.h`
- [ ] `Objects/stringlib/partition.h`
- [ ] `Objects/stringlib/replace.h`
- [ ] `Objects/stringlib/split.h`
- [ ] `Objects/stringlib/transmogrify.h`
- [ ] `Objects/stringlib/undef.h`
- [ ] `Objects/stringlib/unicode_format.h`
- [x] `Objects/structseq.c`
- [x] `Objects/tupleobject.c`
- [x] `Objects/typeobject.c`
  - [ ] `wrap_sq_setitem` has no coverage
- [x] `Objects/unicodectype.c`
- [x] `Objects/unicodeobject.c`
  - [ ] `xmlcharrefreplace` doesn't test for codepoints < 100 (This seems almost impossible to occur).
  - [ ] `resize_inplace` has no coverage
  - [ ] `unicode_kind_name` when `!PyUnicode_IS_COMPACT` isn't covered -- low priority, used by consistency check only
  - [ ] `unicode_write_cstr` doesn't test writing into UCS2 or UCS4
  - [x] #96677
  - [ ] `PyUnicode_AsDecodedObject`, `PyUnicode_AsDecodedUnicode`, `PyUnicode_AsEncodedObject`, `PyUnicode_AsEncodedUnicode` have no coverage
  - [ ] `_Py_DecodeUTF8Ex` and `_Py_EncodeUTF8Ex` have no coverage for `error == surrogateescape`
  - [ ] `PyUnicode_BuildEncodingMap` doesn't handle the `need_dict` case
  - [ ] `ucs1lib_find_slice` and `ucs1lib_rfind_slice` aren't covered.
  - [x] `PyUnicode_Count` has no coverage
  - [x] #98228
  - [x] `PyUnicode_CompareWithASCIIString` has no coverage for comparing with UCS2 or UCS4
  - [ ] `_PyUnicode_EqualToASCIIId` has no coverage
- [x] `Objects/unicodetype_db.h`
- [x] `Objects/unionobject.c`
- [x] `Objects/weakrefobject.c`
- [x] `Parser/action_helpers.c`
  - [ ] `_PyPegen_set_expr_context` doesn't cover "starred kind"
  - [ ] `_PyPegen_get_expr_name` switch statement coverage is non-exhaustive
- [x] `Parser/myreadline.c` (N/A Windows-only)
- [x] `Parser/parser.c`
- [x] `Parser/peg_api.c`
- [x] `Parser/pegen.c`
- [x] `Parser/pegen.h`
- [x] `Parser/pegen_errors.c`
  - [x] #94926
- [x] `Parser/string_parser.c`
  - [x] #95925
- [ ] `Parser/tokenizer.c`
  - [x] #94823
  - [ ] tokenizer.c seems to have no coverage for a few functions related to interactive usage, e.g. tok_underflow_interactive and tok_concatenate_interactive_newline.
- [x] ~`Python/Python-ast.c`~ Generated code
- [x] `Python/Python-tokenize.c`
- [x] `Python/_warnings.c`
  - [ ] `show_warning` doesn't cover the case where there is a `sourceline`.
  - [ ] `PyErr_WarnExplicit` has no coverage
- [ ] `Python/asdl.c`
- [x] `Python/ast.c`
  - [ ] `ensure_literal_*` functions aren't covered
  - [ ] `validate_pattern_match_value` doesn't cover all elements of switch
- [x] `Python/ast_opt.c`
  - [ ] `check_complexity` doesn't cover the `frozenset` case
  - [ ] `ast_foldbody` isn't covered
- [ ] `Python/ast_unparse.c`
- [ ] `Python/bltinmodule.c`
- [ ] `Python/bootstrap_hash.c`
- [x] `Python/ceval.c`
  - [ ] `PyEval_AcquireLock` and `PyEval_ReleaseLock` are uncovered
  - [x] #95932
  - [ ] `STORE_ATTR_WITH_HINT` doesn't cover the case where the dictionary doesn't have Unicode keys
  - [x] `CALL_FUNCTION_EX` doesn't cover the case where kwargs is not an exact dict
  - [ ] `PyEval_EvalCodeEx` doesn't cover the case where kwargs are passed in
  - [ ] `PyEval_GetFrame` has no coverage
  - [x] #98300
- [ ] `Python/ceval_gil.h`
- [ ] `Python/codecs.c`
- [x] `Python/compile.c`
  - [ ] write_instr is not handling the case where ilen > 2. It might be that those are never seen in practice...? If so, feel free to close this bug.
  - [ ] check_ann_subscr doesn't have any coverage for slice or tuple kinds.
  - [ ] optimize_basic_block has some opcodes that aren't covered in the JUMP_IF_FALSE_OR_POP and the JUMP_IF_TRUE_OR_POP cases.
- [ ] `Python/condvar.h`
- [ ] `Python/context.c`
  - [ ] `PyContext_Copy`, `PyContext_Enter`, `PyContext_Exit` have no coverage
- [ ] `Python/deepfreeze/deepfreeze.c`
- [ ] `Python/dtoa.c`
- [ ] `Python/dup2.c`
- [x] `Python/dynamic_annotations.c`
- [ ] `Python/errors.c`
- [x] `Python/fileutils.c`
  - [ ] `is_valid_wide_char` doesn't test error branches
  - [ ] `encode_ascii`/`decode_ascii` have no coverage (probably very low priority -- comment says only for platforms with a broken mbstowcs (FreeBSD, OpenIndiana))
  - [ ] `_Py_stat` has no coverage
- [ ] `Python/formatter_unicode.c`
- [x] `Python/frame.c`
- [ ] `Python/frozenmain.c`
- [ ] `Python/future.c`
- [ ] `Python/getargs.c`
- [ ] `Python/getopt.c`
- [ ] `Python/hamt.c`
- [ ] `Python/hashtable.c`
- [ ] `Python/import.c`
- [ ] `Python/importdl.c`
- [ ] `Python/initconfig.c`
- [ ] `Python/marshal.c`
- [ ] `Python/modsupport.c`
- [ ] `Python/mysnprintf.c`
- [ ] `Python/mystrtoul.c`
- [ ] `Python/pathconfig.c`
- [ ] `Python/preconfig.c`
- [ ] `Python/pyarena.c`
- [x] `Python/pyfpe.c`
- [ ] `Python/pyhash.c`
- [ ] `Python/pylifecycle.c`
- [ ] `Python/pystate.c`
- [ ] `Python/pystrcmp.c`
- [ ] `Python/pystrhex.c`
- [ ] `Python/pystrtod.c`
- [ ] `Python/pythonrun.c`
- [ ] `Python/pytime.c`
- [x] `Python/specialize.c`
- [ ] `Python/structmember.c`
- [ ] `Python/suggestions.c`
- [x] `Python/symtable.c`
- [ ] `Python/sysmodule.c`
- [ ] `Python/thread.c`
- [x] `Python/traceback.c`
  - [x] tracebacks with angle-bracketed filenames https://github.com/python/cpython/issues/95259
  - [ ] tb_printinternal with `depth > limit`
  - [ ] `_PyTraceBack_Print_Indented` with overflowing `tracebacklimit`
  - [ ] No coverage for `_Py_DumpDecimal`, `_Py_DumpHexadecimal`, `_Py_DumpASCII`, `dump_frame`, `dump_traceback`, `_Py_DumpTraceback`, `write_thread_id`, `_Py_DumpTracebackThreads` -- possibly they have tests which are disabled under some circumstances.
- PR: gh-98749
- PR: gh-96767
- PR: gh-99091
- PR: gh-99123
- PR: gh-98300
- PR: gh-98809
- PR: gh-99126
- PR: gh-99133
- PR: gh-99196
- PR: gh-99319
- PR: gh-99429
- PR: gh-99472
- PR: gh-97672
- PR: gh-100483
- PR: gh-100484
- PR: gh-100619
- PR: gh-102469
- PR: gh-111445
- PR: gh-117421
- PR: gh-119222
- PR: gh-119227
- PR: gh-119263
- PR: gh-119264
@mdboom: I think a better approach would be to create sub-checklists or a single issue per file with another checklist for the paths that need coverage, instead of opening dozens of issues for each individual path.
Even having an issue per file will result in 548 new issues though, so I'm not sure if we want to do that preemptively for each file. I'd say it's better to keep the checklist with the files and paths here, and then directly create PRs for each (or multiple) paths.
Another option is to create a project, and handle it there. You can add a new custom field to specify the file, and create draft issues for each path without creating actual issues here (let me know if you need more help with that).
> I think a better approach would be to create sub-checklists or a single issue per file with another checklist for the paths that need coverage, instead of opening dozens of issues for each individual path.
I thought about that, but most of the uncovered areas will have independent fixes, and bug-per-area sets up that work to happen.
I'm starting with files that have seen a lot of changes lately, so we're seeing quite a few issues in them. I suspect most files will not be that way -- most will probably have no issues, and we can just check the box here and not create a flood of issues where they aren't needed.
> Even having an issue per file will result in 548 new issues though, so I'm not sure if we want to do that automatically for each file. I'd say it's better to keep this checklist here, and create PRs.
Agreed.
> Another option is to create a project, and handle it there (let me know if you need help with that).
I'm happy to use a project instead if you'd prefer. IIUC, it's pretty easy to move the existing issues already created into it. I know I can't create a project, but once it's created, I don't know what my limited permissions will allow me to do.
Maybe it would be better to start by trimming down the list to remove files that have no issues, and see how many files are left first.
The remaining files and their paths could be listed here, and if someone starts working on them and wants to discuss the approach, they could create the issues lazily or directly create PRs that refer to this meta issue if the fix is straightforward enough.
I don't think having a lot of almost empty issues (like https://github.com/python/cpython/issues/94817) will help this effort.
> Maybe it would be better to start by trimming down the list to remove files that have no issues, and see how many files are left first.
I programmatically removed files that have 100% coverage, and then also (manually) removed anything platform-specific and all of Modules (which can be handled easily in separate waves later -- they aren't a priority now). This gets us down to 136 tasks from 500+, which is a lot more manageable.
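The programmatic filtering step could be sketched roughly as below. This is only an illustration, not the script that was actually used: it assumes the coverage data is an lcov tracefile, whose per-file records carry `SF:` (source file), `LF:` (lines found) and `LH:` (lines hit) entries. The function name `files_with_gaps` and the sample numbers are made up for the example.

```python
def files_with_gaps(tracefile_text):
    """Return {source_file: (lines_hit, lines_found)} for files below 100% line coverage."""
    gaps = {}
    current, found, hit = None, 0, 0
    for line in tracefile_text.splitlines():
        line = line.strip()
        if line.startswith("SF:"):
            # Start of a new per-file record.
            current, found, hit = line[3:], 0, 0
        elif line.startswith("LF:"):
            found = int(line[3:])
        elif line.startswith("LH:"):
            hit = int(line[3:])
        elif line == "end_of_record" and current is not None:
            # Keep only files that still have uncovered lines.
            if hit < found:
                gaps[current] = (hit, found)
            current = None
    return gaps


# Hypothetical tracefile fragment; the counts are invented for illustration.
SAMPLE = """\
SF:Objects/boolobject.c
LF:40
LH:40
end_of_record
SF:Objects/call.c
LF:300
LH:270
end_of_record
"""

print(files_with_gaps(SAMPLE))
```

Platform-specific files and everything under Modules would still need to be dropped separately, as described above.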
> The remaining files and their paths could be listed here, and if someone starts working on them and wants to discuss the approach, they could create the issues lazily or directly create PRs that refer to this meta issue if the fix is straightforward enough.
My thinking was this checklist was to say "someone has read through the file and identified all of the potential issues". Some of the issues will have simple tests that can be written, some are dead code, some might reveal bugs, but we need a place to have those discussions and deal with them individually (not here, ideally). For really simple ones, if it's ok to just reference this bug, that's fine by me, but we still need a way to keep track of which files have been vetted to track progress.
> I don't think having a lot of almost empty issues (like #94817) will help this effort.
Maybe not in general. The ones I've filed so far are directly tied to the faster CPython work, and the ability to move with more confidence there. So there are motivated people who want to close these bugs. But I empathize with the concern of just creating bugs for the sake of creating them.
> This gets us down to 136 tasks from 500+, which is a lot more manageable.
This is much better, thanks for doing this!
> My thinking was this checklist was to say "someone has read through the file and identified all of the potential issues".
My idea was to do something like:
- [ ] `Python/compile.c`
  - [ ] write_instr is not handling the case where ilen > 2. It might be that those are never seen in practice...? If so, feel free to close this bug.
  - [ ] check_ann_subscr doesn't have any coverage for slice or tuple kinds.
  - [ ] optimize_basic_block has some opcodes that aren't covered in the JUMP_IF_FALSE_OR_POP and the JUMP_IF_TRUE_OR_POP cases.
- [ ] `Python/condvar.h`
- [ ] `Python/context.c`
- [ ] ...
Once people start working on these items, there are different options. They can:
- create a single PR to fix all 3, and add it to the checklist (maybe next to the filename)
- create multiple PRs to fix them individually, and add them to the checklist (as sub-items)
- create a single issue for the file with its own checklist where to discuss all 3, add it to the checklist, and then create PRs linked to the issue
- create multiple issues for each of the problems in case they need to be discussed individually (what you were doing)
The first two options work for straightforward cases, the third works if the problems are similar, and the last for more complex cases that require discussions. You can also do this incrementally, i.e. start with a list like the one above, then convert some to PRs and some to issues as needed.
In addition, if you hover over a checklist item, a ⨀ icon will appear to the right that lets you quickly convert the item to an actual issue if/when you decide to open a discussion about a specific file or problem.
> So there are motivated people who want to close these bugs.
That's great to hear, I just wanted to avoid creating a bunch of issues that might end up sitting there indefinitely :)
Thanks. The revised plan you suggested should work just fine. I didn't know about the feature to automatically open an issue from a checklist item.
@pablogsal, should we be backporting these coverage-improving tests to 3.11? I think we should be consistent in our handling of all of them.
I don't think we usually backport test-only PRs, but perhaps these are sort of a special case.
Test-only PRs are generally OK to backport.
> When someone has "read through" a particular source file and created subitems for any interesting gaps, they should check it off on the list below and add links to any issues created.
Hi, do you need some more eyes on the coverage report? I just read the Python Developer's Guide and I'm looking for issues with the "easy" tag to fix ;)
I am not sure why, when I run ./python Lib/test/test_funcattrs.py, test__builtins__ fails.
@ofey404 wrote:
> Hi, do you need some more eyes on the coverage report? I just read the Python Developer's Guide and I'm looking for issues with the "easy" tag to fix
Yes. We could use help either finding coverage holes (by taking a file without a checkmark above and adding entries beneath it for anything not well-covered) or taking any of the holes and trying to write a test to cover it.
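As a rough illustration of what covering one of these holes might look like, here is a sketch that exercises `PySequence_SetItem` with a negative index. It is an assumption-laden stand-in: tests in the CPython suite normally go through the `_testcapi` helper module, whereas this sketch calls the C function through `ctypes.pythonapi` so that it runs without a CPython build tree. The helper names `check_negative_index` and `check_out_of_range` are made up.

```python
import ctypes

# int PySequence_SetItem(PyObject *o, Py_ssize_t i, PyObject *v)
# Returns 0 on success, or -1 with an exception set on failure.
set_item = ctypes.pythonapi.PySequence_SetItem
set_item.argtypes = [ctypes.py_object, ctypes.c_ssize_t, ctypes.py_object]
set_item.restype = ctypes.c_int


def check_negative_index():
    """Assign through index -1, which should address the last element."""
    items = [1, 2, 3]
    result = set_item(items, -1, "x")
    return result, items


def check_out_of_range():
    """A negative index beyond the length should raise IndexError."""
    try:
        # ctypes.pythonapi re-raises the exception set by the C call.
        set_item([1, 2, 3], -10, "x")
    except IndexError:
        return True
    return False


print(check_negative_index())
print(check_out_of_range())
```

A real contribution would add equivalent assertions to the existing C API tests rather than ship a standalone script.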
@fatelei wrote:
> I am not sure why, when I run ./python Lib/test/test_funcattrs.py, test__builtins__ fails.
If you want to run a single test file, you should use:
./python -m test test_funcattrs
Hi @mdboom, I tried to improve the test coverage of PyObject_HasAttrString. I'm new to the C API of Python and I hope that my PR can help you.
Working on a PR for "gen_new_with_qualname and API PyGen_NewWithQualName and PyGen_New have no coverage".
Question from https://github.com/python/cpython/pull/98545
Do we care to cover deprecated API?
No, just like we don't care to fix bugs in deprecated code (security excepted).
Thank you, https://github.com/python/cpython/pull/98545 is now closed.
@mdboom I have several pending PRs to get merged, and after that I propose to update the meta information / coverage data. There will be a lot of changes in coverage.
Plus, there are new changes to the source code.
What do you think? :)
@sobolevn could you request review from me on these PRs? I'll see if I can work through some of them.
WIP:
- Rebasing existing PRs to fix merge conflicts with new file structure
- Working on `PySlice_GetIndices`/`PySlice_GetIndicesEx`
Hi, I tried to improve coverage for Objects/dictobject.c with this PR: #100619. BTW, it seems like _PyDict_GetItemHint (which is mentioned in the list) no longer exists.
The list is quite outdated :)
The checklist above says
> `PyNumber_Check` doesn't test `complex`
but Objects/abstract.c:905 currently has
return nb && (nb->nb_index || nb->nb_int || nb->nb_float || PyComplex_Check(o));
PyComplex_Check itself seems well-developed in Objects/complexobject.c.
Is this checklist item obsolete?
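One quick way to probe the behaviour in question from Python is sketched below. It is not a replacement for a test in the C API test suite: it assumes a CPython interpreter recent enough to contain the `PyComplex_Check(o)` clause quoted above, and it calls `PyNumber_Check` through `ctypes.pythonapi` purely for convenience. The name `number_check_results` is made up.

```python
import ctypes

# int PyNumber_Check(PyObject *o): returns 1 if o provides numeric protocols, else 0.
number_check = ctypes.pythonapi.PyNumber_Check
number_check.argtypes = [ctypes.py_object]
number_check.restype = ctypes.c_int


def number_check_results(values):
    """Map each value's type name to the result of PyNumber_Check."""
    return {type(value).__name__: number_check(value) for value in values}


print(number_check_results([1, 1.5, 1j, "abc"]))
```

Whether the `complex` branch is actually exercised by the test suite is a separate question that only a fresh coverage run can answer.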
The checklist above says
> `PyMapping_HasKey` and `PyMapping_HasKeyString` are untested

This appears not to be the case. The baseline coverage report linked in the bug description says they are uncovered, but that report is from 2022. I ran a full lcov today, and the functions appear to be fully covered as of commit 16c9415fba4 from August 2023. For example:
Lib/test/test_capi/test_abstract.py lines 440–528 appear to test these functions extensively.
Is this checklist item obsolete?
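A per-function check like this could be scripted against the lcov output. The sketch below assumes the tracefile contains `FNDA:<execution count>,<function name>` records, and the helper name `function_hit_counts` and the sample counts are invented for illustration.

```python
def function_hit_counts(tracefile_text, names):
    """Sum the FNDA execution counts recorded for each requested function name."""
    counts = {name: 0 for name in names}
    for line in tracefile_text.splitlines():
        line = line.strip()
        if line.startswith("FNDA:"):
            count, _, name = line[5:].partition(",")
            if name in counts:
                # Same-named static functions in different files are summed together.
                counts[name] += int(count)
    return counts


# Hypothetical tracefile fragment; the counts are not real measurements.
SAMPLE = """\
SF:Objects/abstract.c
FNDA:12,PyMapping_HasKey
FNDA:7,PyMapping_HasKeyString
FNDA:0,PySequence_Repeat
end_of_record
"""

print(function_hit_counts(SAMPLE, ["PyMapping_HasKey", "PyMapping_HasKeyString", "PySequence_Repeat"]))
```

A count of zero means the function was never entered. A non-zero count says nothing about branch coverage inside the function, so the line-level report is still worth reading.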
Yes, it's quite possible the list is obsolete at this point.
Perhaps edit the comment:
- Check off the items I indicated, to prevent future maintainers from trying to follow them up
- Remove the link to the coverage report from 2022
If you like, I can check into the other Objects/abstract.c items and let you know if they can all be checked off.
I've added a warning that the coverage report and checklist are outdated. There is a PR for publishing automatic coverage reports here: https://github.com/python/cpython/pull/94760 @mdboom, do you need help pushing that forward?
@encukou: Yeah, I haven't really thought about #94760 in a while. It still seems like a worthwhile thing to do, but I'm not sure I will have time in the near future. Feel free to take it over if you are interested.