Add a `--diff` flag to show changes without modifying the source file
Description / Summary
A nice-to-have feature would be for mdformat to show the changes it would make to the file by writing the output to stdout rather than to the original file. This is a feature that black has, using the --diff flag on the command line.
Value / benefit
At present, the only way to find out the changes that mdformat will make (or has made) is to create a copy of the file to be formatted, run mdformat on the copy and compare the two files, or, if the file is under version control, to compare against the previous version (e.g. with git diff). The extra effort is a bit of a pest, but this feature is really just a nice-to-have.
Implementation details
Black's implementation is in src/black/output.py#L55-L73:
def diff(a: str, b: str, a_name: str, b_name: str) -> str:
    """Return a unified diff string between strings `a` and `b`."""
    import difflib
    a_lines = [line for line in a.splitlines(keepends=True)]
    b_lines = [line for line in b.splitlines(keepends=True)]
    diff_lines = []
    for line in difflib.unified_diff(
        a_lines, b_lines, fromfile=a_name, tofile=b_name, n=5
    ):
        # Work around https://bugs.python.org/issue2142
        # See:
        # https://www.gnu.org/software/diffutils/manual/html_node/Incomplete-Lines.html
        if line[-1] == "\n":
            diff_lines.append(line)
        else:
            diff_lines.append(line + "\n")
            diff_lines.append("\\ No newline at end of file\n")
    return "".join(diff_lines)
There's a short color_diff() function directly after, if anyone wants to be fancy.
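As a rough illustration of how a diff helper like this could sit in front of a formatter, here is a minimal sketch using only the standard library. The names `preview_diff`, `format_fn` and `strip_trailing_spaces` are hypothetical: `format_fn` stands in for whatever callable does the actual formatting, and the toy formatter exists only to make the example runnable. This is not how mdformat or Black wire things up, and it leaves out the missing-newline workaround from Black's function above.

```python
import difflib
from typing import Callable


def preview_diff(source: str, format_fn: Callable[[str], str], name: str) -> str:
    """Return a unified diff of what `format_fn` would change, or "" if nothing."""
    formatted = format_fn(source)
    if formatted == source:
        return ""
    diff_lines = difflib.unified_diff(
        source.splitlines(keepends=True),
        formatted.splitlines(keepends=True),
        fromfile=name,
        tofile=f"{name} (formatted)",
    )
    return "".join(diff_lines)


def strip_trailing_spaces(text: str) -> str:
    # Toy stand-in for a real formatter: strip trailing whitespace on each
    # line and end every line with a single newline.
    return "".join(line.rstrip() + "\n" for line in text.splitlines())
```

Calling `preview_diff("# Title  \n\ntext\n", strip_trailing_spaces, "foo.md")` returns a diff that could be printed to stdout, while the source text itself is left untouched.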
Tasks to complete
No response
Thanks for the issue!
I'm not too excited about this feature as
- it's pretty easy to print the diff by doing something like the following already:
  diff foo.md <(mdformat - < foo.md)
  That is, let the diff tool do diffs and formatting tool do formatting. Simply by being a good Unix citizen, we already enable this use case. As a bonus, the user gets to choose whatever diff tool and configuration they prefer.
- I find it more useful to know if a file is formatted or not, rather than what exact parts are not formatted. The best way to fix is by running mdformat anyways, I don't need to concern myself with the formatting details.
I'll leave this open for now so we can reconsider if there's demand and real-life use cases from other users.
Well, I'm not too excited about the --check feature for exactly the same reasoning. YMMV. Black's implementation of this feature appears to be straightforward and I've added it to the Implementation details section above.
In passing, I note that nbqa now supports running mdformat over notebook Markdown cells, including an --nbqa-diff switch that generates a preview rather than changing content in-place.
I think a real use case could be that people submit a PR for documents in Markdown files via the GitHub web editor. If they make an error, CI will fail, but there is no way to tell the user what the formatting problem is. In many cases these people cannot use a development environment, but it is possible to fix their PRs with the diff output.
@hukkin
> it's pretty easy to print the diff by doing something like the following already
That assumes you are using 1) a shell and 2) one that has the process substitution feature. Neither is the case if you are running it via an exec call on a Docker container that uses Ash for its shell, as is common in many CI flavors out there.
Lots of formatters have this feature exactly for this reason: so that there's an easy way to show the diff in CI logs.
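For environments without process substitution, one possible workaround is to let a small script do the piping instead of the shell. The sketch below is only an illustration under stated assumptions, not an mdformat feature: `diff_against_command` and `main` are hypothetical helpers, an `mdformat` executable is assumed to be on PATH, and the script relies on `mdformat -` reading stdin and writing the result to stdout, as in the command quoted earlier in the thread.

```python
import difflib
import subprocess
import sys
from typing import Sequence


def diff_against_command(text: str, cmd: Sequence[str], name: str) -> str:
    """Pipe `text` through `cmd` and return a unified diff of the result."""
    # No shell is involved: the command is executed directly.
    result = subprocess.run(
        cmd, input=text, capture_output=True, text=True, check=True
    )
    return "".join(
        difflib.unified_diff(
            text.splitlines(keepends=True),
            result.stdout.splitlines(keepends=True),
            fromfile=name,
            tofile=f"{name} (formatted)",
        )
    )


def main(paths: Sequence[str]) -> int:
    """Print a diff for each path; return 1 if any file would change."""
    changed = False
    for path in paths:
        with open(path, encoding="utf-8") as f:
            text = f.read()
        # Assumption: `mdformat -` formats stdin to stdout.
        diff = diff_against_command(text, ["mdformat", "-"], path)
        if diff:
            changed = True
            sys.stdout.write(diff)
    return 1 if changed else 0
```

In a CI job this could be invoked from a script entry point with something like `sys.exit(main(sys.argv[1:]))`, so that the log shows the diff and the job fails when any file is unformatted.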