difftastic
Allow single-quoted strings to be the same as double-quoted strings
Great tool!
(1) A description of the issue. A screenshot is often helpful too.
I'm comparing two Python files where the majority of the changes are swaps between single and double quotes, which should be ignored. However, difftastic shows the quote changes as differences.
(2) A copy of what you're diffing. I'm diffing files.
Before:
def my_func():
    print("Hello World")
After:
def my_func():
    print('Hello World')
Expected result: no differences.
(3) The version of difftastic you're using (see difft --version) and your operating system.
Difftastic 0.58.0 (0c92771 2024-05-10, built with rustc 1.65.0) Running on Windows 11.
This is intentional, I'm afraid. Difftastic is a syntactic differ, and semantic information about which string literals are equivalent is out of scope. In some languages (e.g. bash or PHP), single-quoted and double-quoted strings are different because they have different interpolation rules.
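To illustrate the syntactic/semantic distinction for the Python example from the report, here is a minimal sketch using only the standard library. It is not how difftastic works internally; the helper names string_tokens and same_ast are made up for this illustration. The raw string tokens differ, while the parsed ASTs are identical:

```python
import ast
import io
import tokenize

# The two files from the report, as source text.
BEFORE = 'def my_func():\n    print("Hello World")\n'
AFTER = "def my_func():\n    print('Hello World')\n"


def string_tokens(source):
    """Return the raw text of every string token in the source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.STRING]


def same_ast(a, b):
    """Compare two sources by their parsed ASTs (positions are ignored)."""
    return ast.dump(ast.parse(a)) == ast.dump(ast.parse(b))


# Syntactic view: the token text includes the quotes, so the tokens differ.
syntactic_match = string_tokens(BEFORE) == string_tokens(AFTER)

# Semantic view: both literals evaluate to the same value.
semantic_match = same_ast(BEFORE, AFTER)
```

A syntactic differ compares something closer to the first view, which is why the quote swap shows up as a change.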
Fair. Any chance you can point me to a place where I can make this change in my forked repo? Something like a place where difftastic gets a list of changes to output? I want to manually remove single/double quote differences.
@eduard93 you should be able to change what's considered the content of the Atom when it's a string here:
https://github.com/Wilfred/difftastic/blob/b88b4056203cdd3075cd341595411195671a163b/src/parse/syntax.rs#L411
If it's an AtomKind::String, drop the first and last characters of content.
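The idea can be sketched in Python for illustration. This is not the Rust code in syntax.rs, and naive_string_content is a hypothetical name; it assumes the literal has exactly one delimiter character at each end and no prefix:

```python
def naive_string_content(atom_text):
    """Drop the first and last characters of a string literal's text.

    Assumes a one-character delimiter at each end and no prefix.
    """
    return atom_text[1:-1]


double_quoted = '"Hello World"'
single_quoted = "'Hello World'"

# With the delimiters dropped, the two literals compare equal.
contents_match = (
    naive_string_content(double_quoted) == naive_string_content(single_quoted)
)
```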
Thinking about this some more, I think this is a worthwhile addition, but language configurations should opt in to it.
Dropping just the first and last characters is not very clean, I am afraid. Python itself has string delimiters of variable length. Multiline strings (syntactically different from the usual " or ' strings, but not from each other) are delimited by three of the same quote character:
my_string = """Hello, World!
I love multiline strings.
"""
other_string = '''Hello, World!
I love multiline strings.
'''
assert my_string == other_string
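As a rough sketch of why the delimiter length matters, the snippet below compares dropping one character from each end with a delimiter-aware split. It is a pure-Python illustration under simplifying assumptions (a hypothetical split_string_literal helper, prefixes limited to at most two of the letters r, b, f, u), not difftastic's implementation:

```python
import re

# Optional prefix letters, then the opening delimiter (triple quotes first).
_STRING_START = re.compile(r'^([rbfu]{0,2})("""|\'\'\'|"|\')', re.IGNORECASE)


def split_string_literal(text):
    """Split a Python string literal into (prefix, delimiter, body)."""
    match = _STRING_START.match(text)
    if match is None:
        raise ValueError(f"not a string literal: {text!r}")
    prefix, delim = match.group(1), match.group(2)
    start = match.end()
    # The literal must be long enough to hold a separate closing delimiter.
    if len(text) < start + len(delim) or not text.endswith(delim):
        raise ValueError(f"unterminated string literal: {text!r}")
    return prefix, delim, text[start:len(text) - len(delim)]


# Source text of the two multiline literals above.
DOUBLE = '"""Hello, World!\nI love multiline strings.\n"""'
SINGLE = "'''Hello, World!\nI love multiline strings.\n'''"

# Dropping one character per side leaves two quote characters behind.
naive_match = DOUBLE[1:-1] == SINGLE[1:-1]

# Splitting off the whole delimiter makes the bodies compare equal.
aware_match = split_string_literal(DOUBLE)[2] == split_string_literal(SINGLE)[2]
```

Comparing bodies this way still treats "it's" and 'it\'s' as different, because the escape lives in the body; it only removes the delimiter noise.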
The good news is that tree-sitter (TS) understands this, so maybe we can opt in to taking the left and right nodes around the string? Here is the generated tree; note the "string_start" and "string_end" nodes.