Incorrect parsing of multiline strings within lists
Source:
list = [
"first",
"""multi\
line""",
"last",
]
Expected:
{"list": ["first", "multiline", "last"]}
Actual:
{"list": ["first", "mu"]}
I see the same behavior.
Here are more examples, all on a single line. The problem appears to be a comma inside a """ string that also contains double-quote characters.
These are OK:
toml.loads('''a=[' "a","b" ']''')
{'a': [' "a","b" ']}
toml.loads('''a=[""" "a" "b" """]''')
{'a': [' "a" "b" ']}
toml.loads('''a=[""" 'a' 'b' , """]''')
{'a': [" 'a' 'b' , "]}
toml.loads('''a=[""" 'a','b' """]''')
{'a': [" 'a','b' "]}
These are not OK:
toml.loads('''a=[""" "a" "b" , """]''')
{'a': [' "a" ', '']}
toml.loads('''a=[""" "a","b" """]''')
*** toml.decoder.TomlDecodeError: Found tokens after a closed string. Invalid TOML. (line 1 column 1 char 0)
Traceback (most recent call last):
File "/path/.venv/lib/python3.7/site-packages/toml/decoder.py", line 512, in loads
multibackslash)
File "/path/.venv/lib/python3.7/site-packages/toml/decoder.py", line 778, in load_line
value, vtype = self.load_value(pair[1], strictly_valid)
File "/path/.venv/lib/python3.7/site-packages/toml/decoder.py", line 880, in load_value
return (self.load_array(v), "array")
File "/path/.venv/lib/python3.7/site-packages/toml/decoder.py", line 1026, in load_array
nval, ntype = self.load_value(a[i])
File "/path/.venv/lib/python3.7/site-packages/toml/decoder.py", line 849, in load_value
raise ValueError("Found tokens after a closed " +
ValueError: Found tokens after a closed string. Invalid TOML.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/path/.venv/lib/python3.7/site-packages/toml/decoder.py", line 514, in loads
raise TomlDecodeError(str(err), original, pos)
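The failures above are consistent with array items being separated at commas without fully tracking which string delimiter is currently open. As a rough illustration only, here is a minimal sketch of a quote-aware splitter. The name split_array_items is hypothetical and this is not the toml library's code; it assumes a flat array of strings and ignores nested arrays, inline tables, and the rule that a multi-line string may end with extra quote characters.

```python
def split_array_items(body):
    """Split the text between '[' and ']' on top-level commas only.

    Hypothetical helper for illustration; not taken from the toml library.
    """
    items = []
    current = []
    quote = None  # delimiter of the string we are inside, or None
    i = 0
    while i < len(body):
        ch = body[i]
        if quote is None:
            if ch in "\"'":
                # A tripled quote opens a multi-line string.
                quote = ch * 3 if body.startswith(ch * 3, i) else ch
                current.append(quote)
                i += len(quote)
                continue
            if ch == ",":
                # Only a comma outside every string separates items.
                items.append("".join(current).strip())
                current = []
                i += 1
                continue
        elif ch == "\\" and quote[0] == '"' and i + 1 < len(body):
            # Basic strings: keep an escaped character with its backslash.
            current.append(body[i:i + 2])
            i += 2
            continue
        elif body.startswith(quote, i):
            # The matching delimiter closes the current string.
            current.append(quote)
            i += len(quote)
            quote = None
            continue
        current.append(ch)
        i += 1
    tail = "".join(current).strip()
    if tail:  # ignore the empty remainder after a trailing comma
        items.append(tail)
    return items
```

With this approach, the body of the failing array `""" "a","b" """` stays a single item, because the inner double quotes never match the open `"""` delimiter and the comma is seen while a string is still open.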
Hi, just to add another example:
>>> t1 = "foo = [\n\t'a',\n\t'''\\\n\tb\\\n\t''',\n]"
>>> t2 = "foo = [\n\t'''\\\n\ta\\\n\t''',\n\t'''\\\n\tb\\\n\t''',\n]"
>>> t3 = "foo = [\n\t'''\\\n\ta\\\n\t''',\n\t'b',\n\t'''\\\n\tc\\\n\t'''\n]"
>>> print(t1) # One-line + multi-line
foo = [
'a',
'''\
b\
''',
]
>>> print(t2) # Multi-line only
foo = [
'''\
a\
''',
'''\
b\
''',
]
>>> print(t3) # Multi-line + mixed
foo = [
'''\
a\
''',
'b',
'''\
c\
'''
]
>>> toml.loads(t1) # Wrong
{'foo': ['a', '']}
>>> toml.loads(t2) # OK
{'foo': ['a', 'b']}
>>> toml.loads(t3) # OK
{'foo': ['a', 'b', 'c']}
It seems that string arrays that contain multiline strings but start with a one-liner trip the parser.
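In case it helps whoever picks this up, the basic-string cases reported in this thread could be collected into a small regression check. This is only a sketch: CASES and find_failures are names made up here, and the parser is passed in as a callable so the same check can be run against any loads function.

```python
# Regression cases from this thread: (TOML source, expected result).
CASES = [
    ('list = [\n"first",\n"""multi\\\nline""",\n"last",\n]',
     {"list": ["first", "multiline", "last"]}),
    ('a=[""" "a" "b" , """]', {"a": [' "a" "b" , ']}),
    ('a=[""" "a","b" """]', {"a": [' "a","b" ']}),
]


def find_failures(loads, cases=CASES):
    """Return (source, expected, outcome) for every case `loads` gets wrong.

    Hypothetical helper; `loads` is any callable that maps a TOML string
    to a dict.
    """
    failures = []
    for source, expected in cases:
        try:
            outcome = loads(source)
        except Exception as exc:  # a decode error also counts as a failure
            outcome = exc
        if outcome != expected:
            failures.append((source, expected, outcome))
    return failures
```

Calling find_failures(toml.loads) on an affected version should presumably list the cases shown above as wrong; an empty list would mean all of them parse as expected.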