python-unidiff

parse github patch UnidiffParseError: Unexpected trailing newline character

Open liziwl opened this issue 4 years ago • 2 comments

test case

    def test1(self):
        import urllib.request
        from unidiff import PatchSet
        diff = urllib.request.urlopen('https://patch-diff.githubusercontent.com/raw/eXist-db/exist/pull/2644.patch')
        encoding = diff.headers.get_charsets()[0]
        patch = PatchSet(diff, encoding=encoding)

log

Traceback (most recent call last):
  File "c:\users\liziq\appdata\local\programs\python\python37\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "c:\users\liziq\appdata\local\programs\python\python37\lib\unittest\case.py", line 628, in run
    testMethod()
  File "D:\A-work\test\test.py", line 92, in test1
    patch = PatchSet(diff, encoding=encoding)
  File "D:\A-work\test\venv\lib\site-packages\unidiff\patch.py", line 421, in __init__
    self._parse(data, encoding=encoding, metadata_only=metadata_only)
  File "D:\A-work\test\venv\lib\site-packages\unidiff\patch.py", line 513, in _parse
    current_file._append_trailing_empty_line()
  File "D:\A-work\test\venv\lib\site-packages\unidiff\patch.py", line 351, in _append_trailing_empty_line
    raise UnidiffParseError('Unexpected trailing newline character')
unidiff.errors.UnidiffParseError: Unexpected trailing newline character

liziwl, Aug 23 '20 03:08

Note that passing the full patch data won't work: the .patch output contains extra metadata and information that the lib does not parse or ignore. But if you use the URL 'https://patch-diff.githubusercontent.com/raw/eXist-db/exist/pull/2644.diff' instead, it should work, since that one serves only the plain/raw diff from the patch, which is what the library supports.
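A minimal sketch of that suggestion, assuming the only change needed is swapping the `.patch` suffix for `.diff` before fetching. The helper names `to_diff_url` and `load_patchset` are hypothetical and not part of unidiff; only `PatchSet` comes from the library.

```python
import urllib.request


def to_diff_url(url):
    """Return the .diff variant of a GitHub pull request .patch URL."""
    suffix = ".patch"
    if url.endswith(suffix):
        return url[: -len(suffix)] + ".diff"
    return url


def load_patchset(url):
    """Fetch the plain diff for a pull request and parse it with unidiff."""
    # Imported here so to_diff_url() stays usable without unidiff installed.
    from unidiff import PatchSet

    with urllib.request.urlopen(to_diff_url(url)) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        encoding = response.headers.get_content_charset() or "utf-8"
        # PatchSet reads the response line by line and decodes each line.
        return PatchSet(response, encoding=encoding)
```

With this in place, `for patched_file in load_patchset(url): print(patched_file.path)` should list the changed paths, whichever of the two URLs is passed in.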

Makes sense?

matiasb, Aug 28 '20 13:08

Yes, that works if I just want to get the paths of the changed files.

However, I want to get both the commit hash and the paths from the patch. Maybe I can use a regex to get the hash from 'https://patch-diff.githubusercontent.com/raw/eXist-db/exist/pull/2644.patch' and get the paths from the diff URL 'https://patch-diff.githubusercontent.com/raw/eXist-db/exist/pull/2644.diff'.
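A rough sketch of the regex half of that idea, assuming the `.patch` URL serves `git format-patch` style output, where each commit starts with a line of the form `From <40-hex-digit hash> Mon Sep 17 00:00:00 2001`. The names `commit_hashes` and `fetch_text` are hypothetical helpers; nothing here uses unidiff.

```python
import re
import urllib.request

# Matches the separator line that opens each commit in format-patch output.
# The "From: author" header does not match, because a hash must follow "From ".
_COMMIT_LINE = re.compile(r"^From ([0-9a-f]{40}) ", re.MULTILINE)


def commit_hashes(patch_text):
    """Return the commit hashes found in patch_text, in order of appearance."""
    return _COMMIT_LINE.findall(patch_text)


def fetch_text(url):
    """Download url and return its body as text."""
    with urllib.request.urlopen(url) as response:
        # Fall back to UTF-8 if the server does not declare a charset.
        encoding = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(encoding)
```

Calling `commit_hashes(fetch_text(patch_url))` would then give one hash per commit in the pull request, while the paths can still come from parsing the `.diff` URL with `PatchSet`.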

liziwl, Aug 29 '20 08:08