GitPython icon indicating copy to clipboard operation
GitPython copied to clipboard

Updating regex pattern to handle unicode whitespaces.

Open jcole-crowdstrike opened this issue 1 year ago • 3 comments

Replacing the \s whitespace characters with normal spaces (" ") to prevent breaking on unicode whitespace characters (e.g., NBSP). Without this, if a branch name contains a unicode whitespace character that falls under \s, then the branch name will be truncated.

jcole-crowdstrike avatar Mar 01 '24 20:03 jcole-crowdstrike

Thanks Byron! I will update the PR to accommodate the lint-related fix today.

jcole-crowdstrike avatar Mar 01 '24 20:03 jcole-crowdstrike

The failing Windows test jobs are actually related: the test case that fails is TestRemote.test_fetch_unsafe_branch_name, which this pull request introduces.

For example, for the test in Python 3.12, the summary lists exactly one test with failing status, whose full details can be read higher in the report.

(I have checked that this is the case for all six versions of Python tested, and in both runs of the test on 5ca0fba. The same should be expected in the test runs on 827e986 since that change is just to fix the lint checks.)

@Byron I am wondering if the reason it had seemed unrelated is that pytest output now contains details of tests with xfail (expected failure) status, which are unrelated, and that may have distracted from the related test's failure. This changed in major version 8 of pytest (which GitPython is using automatically except on Python 3.7 which doesn't support it). Overall, for most projects, it seems to me that this is an improvement, since it's useful to be able to see the specific details of xfailing tests without extra effort (and check if they really match up with the messages written in the xfail decorators). However, it may be that for GitPython's tests it makes common tasks more cumbersome. If so, then the pytest configuration can be changed (this does not require downgrading pytest).

EliahKagan avatar Mar 01 '24 22:03 EliahKagan

Thanks for double-checking, @EliahKagan! Indeed, when looking again I realized that the test introduced here is failing in the very last line of the output. All the XFail warnings are printed in red, which seem to exaggerate their importance in this case.

Now that I know where to look I can probably point to the exact line of test failure in future, also to help PR authors to find what matters to them. However, if the xfail information isn't always important or examined, maybe it can also be tuned down by default, but tuned up in PRs that need it, or benefit from it. If in doubt, everything could stay as is as I think it can be handled.

Byron avatar Mar 02 '24 09:03 Byron

I did a lot of tracing and eventually found: stdout <_io.TextIOWrapper name=7 encoding='cp1252'> I think we have an encoding issue whereby stdout is being encoded as Windows-1252, instead of UTF-8, due to passing universal_newlines=True (thanks StackOverflow) to subprocess. I have a commit with a couple changes and the new test that I added now passes when I run it in Windows. I will push that and you can see what you think.

jcole-crowdstrike avatar Mar 05 '24 03:03 jcole-crowdstrike

The failure of the Cygwin test job in 8b8c76a (currently the tip of this PR) is unrelated to the changes here. It is instead due to a problem downloading and verifying the authenticity of Cygwin. See the Signature files missing thread on the Cygwin mailing list, and the comment discussion in https://github.com/egor-tensin/setup-cygwin/issues/10.

This prevented that job from running any tests, so there are no results here for Cygwin tests. @Byron I think directly causing them to rerun in the PR is something only you can do. If the workflow is rerun, Cygwin should be able to install with no problems now and the unit tests should be able to run.

EliahKagan avatar Mar 11 '24 07:03 EliahKagan

My pleasure. Thanks so much everybody.

jcole-crowdstrike avatar Mar 11 '24 20:03 jcole-crowdstrike