python-unidiff
Parsing a GitHub .patch raises UnidiffParseError: Unexpected trailing newline character
test case
def test1(self):
    import urllib.request
    from unidiff import PatchSet
    diff = urllib.request.urlopen('https://patch-diff.githubusercontent.com/raw/eXist-db/exist/pull/2644.patch')
    encoding = diff.headers.get_charsets()[0]
    patch = PatchSet(diff, encoding=encoding)
log
Traceback (most recent call last):
  File "c:\users\liziq\appdata\local\programs\python\python37\lib\unittest\case.py", line 59, in testPartExecutor
    yield
  File "c:\users\liziq\appdata\local\programs\python\python37\lib\unittest\case.py", line 628, in run
    testMethod()
  File "D:\A-work\test\test.py", line 92, in test1
    patch = PatchSet(diff, encoding=encoding)
  File "D:\A-work\test\venv\lib\site-packages\unidiff\patch.py", line 421, in __init__
    self._parse(data, encoding=encoding, metadata_only=metadata_only)
  File "D:\A-work\test\venv\lib\site-packages\unidiff\patch.py", line 513, in _parse
    current_file._append_trailing_empty_line()
  File "D:\A-work\test\venv\lib\site-packages\unidiff\patch.py", line 351, in _append_trailing_empty_line
    raise UnidiffParseError('Unexpected trailing newline character')
unidiff.errors.UnidiffParseError: Unexpected trailing newline character
Note that passing the full patch data won't work: the .patch output contains extra metadata that the lib neither parses nor ignores. If you use the URL 'https://patch-diff.githubusercontent.com/raw/eXist-db/exist/pull/2644.diff' instead, it should work, since that gives you only the plain/raw diff from the patch, which is what the library supports.
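As a rough sketch of that suggestion (not an official recipe), the snippet below swaps the `.patch` suffix for `.diff` and parses the result. The helper names `diff_url_for` and `changed_paths` are made up for illustration, and the UTF-8 fallback is an assumption for the case where the response declares no charset.

```python
import urllib.request


def diff_url_for(patch_url):
    """Return the '.diff' variant of a GitHub '.patch' URL (hypothetical helper)."""
    suffix = ".patch"
    if patch_url.endswith(suffix):
        return patch_url[: -len(suffix)] + ".diff"
    return patch_url


def changed_paths(diff_url):
    """Download a raw diff and return the paths of the files it touches."""
    # Imported here so diff_url_for() stays usable without unidiff installed.
    from unidiff import PatchSet

    with urllib.request.urlopen(diff_url) as response:
        # Assumption: fall back to UTF-8 when no charset is declared.
        charset = response.headers.get_content_charset() or "utf-8"
        text = response.read().decode(charset)
    patch = PatchSet.from_string(text)
    return [patched_file.path for patched_file in patch]
```

Usage would look like `changed_paths(diff_url_for('https://patch-diff.githubusercontent.com/raw/eXist-db/exist/pull/2644.patch'))`.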
Does that make sense?
Yes, that works if I only want the paths of the changed files.
However, I want both the commit hashes and the paths from the patch. Maybe I can use a regex to get the hashes from 'https://patch-diff.githubusercontent.com/raw/eXist-db/exist/pull/2644.patch' and get the paths from the diff URL 'https://patch-diff.githubusercontent.com/raw/eXist-db/exist/pull/2644.diff'.
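A minimal sketch of the regex idea, assuming the `.patch` output is in the `git format-patch` mbox layout, where each commit starts with a `From <40-hex-sha> Mon Sep 17 00:00:00 2001` separator line. The function name `commit_hashes` is hypothetical and this is not part of unidiff; the paths would still come from parsing the `.diff` URL.

```python
import re

# Each commit in git format-patch output begins with this separator line.
# The "From: Author <mail>" header does not match because of the colon.
COMMIT_LINE = re.compile(r"^From ([0-9a-f]{40}) Mon Sep 17 00:00:00 2001", re.MULTILINE)


def commit_hashes(patch_text):
    """Return the commit hashes found in .patch text, in order of appearance."""
    return COMMIT_LINE.findall(patch_text)
```

The function takes the already-downloaded and decoded text of the `.patch` URL, so it can be tried on a saved file without touching the network.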