Support Dolt table diffs

Open addisonklinke opened this issue 3 years ago • 4 comments

Dolt is a versionable MySQL database that can commit, branch, push, and pull just like a git repository. The diff output is similar enough to git's that PatchSet is able to parse it. However, the number of lines (or in Dolt's case the number of table rows) added/removed does not seem to get tracked correctly.

For a basic Dolt setup, see this minimal example I made. Taking the diff string of the objects table from my example, I tried unsuccessfully to parse the additions/deletions with unidiff:

from io import StringIO
from textwrap import dedent
from unidiff import PatchSet

dolt_diff = dedent("""
    diff --dolt a/objects b/objects
    --- a/objects @ 73hiqmiduef0sqtecba4fav7vuuvdk2l
    +++ b/objects @ 1hq161cev9kkt6eukvap0jmrfeedvt9j
    +-----+----+---------+------------------+
    |     | id | label   | bbox             |
    +-----+----+---------+------------------+
    |  <  | 1  | cat     | [1, 2, 3, 4]     |
    |  >  | 1  | cat     | [3, 4, 5, 6]     |
    |  <  | 2  | dog     | [10, 20, 30, 40] |
    |  >  | 2  | poodle  | [10, 20, 30, 40] |
    |  <  | 3  | dog     | [5, 6, 7, 8]     |
    |  >  | 3  | bulldog | [5, 6, 7, 8]     |
    +-----+----+---------+------------------+
""")

patch_set = PatchSet(StringIO(dolt_diff))
for t, table in enumerate(patch_set):
    table_name = table.path.split('@')[0].strip()
    print(f'Dolt table {t}={table_name}: {table.added} additions / {table.removed} deletions')

This outputs

Dolt table 0=objects: 0 additions / 0 deletions  

The expected result is 3 additions / 3 deletions, based on the < and > markers in the first ASCII column of the diff. Is there a way to support Dolt's table diffs?
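For reference, here is a minimal sketch of how those counts could be derived from the diff text without unidiff. It assumes that `<` marks the old version of a row (counted as a deletion) and `>` the new version (counted as an addition), as in the example above. The name `count_dolt_rows` is hypothetical and is not part of unidiff or Dolt.

```python
import re

# Matches a table row whose first cell holds only a `<` or `>` marker.
# Assumption: `<` is the old version of a row, `>` is the new version.
ROW_MARKER = re.compile(r'^\s*\|\s*([<>])\s*\|')


def count_dolt_rows(diff_text):
    """Return (added, removed) row counts for a Dolt table diff string."""
    added = removed = 0
    for line in diff_text.splitlines():
        match = ROW_MARKER.match(line)
        if match is None:
            continue  # header, border, or column-name line
        if match.group(1) == '>':
            added += 1
        else:
            removed += 1
    return added, removed


sample = """
diff --dolt a/objects b/objects
--- a/objects @ 73hiqmiduef0sqtecba4fav7vuuvdk2l
+++ b/objects @ 1hq161cev9kkt6eukvap0jmrfeedvt9j
+-----+----+---------+------------------+
|     | id | label   | bbox             |
+-----+----+---------+------------------+
|  <  | 1  | cat     | [1, 2, 3, 4]     |
|  >  | 1  | cat     | [3, 4, 5, 6]     |
|  <  | 2  | dog     | [10, 20, 30, 40] |
|  >  | 2  | poodle  | [10, 20, 30, 40] |
|  <  | 3  | dog     | [5, 6, 7, 8]     |
|  >  | 3  | bulldog | [5, 6, 7, 8]     |
+-----+----+---------+------------------+
"""

print(count_dolt_rows(sample))  # (3, 3)
```

This only inspects the first cell of each row, so any other row markers Dolt may emit are ignored.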

addisonklinke avatar Apr 07 '22 20:04 addisonklinke

@matiasb Curious if you have any update on this?

addisonklinke avatar Aug 15 '22 19:08 addisonklinke

Hi! I think this exceeds the original goal of the project, so I'm not really sure how it would work as part of unidiff, although I can see it would be useful to have something like that for your use case. Having said that, it seems it shouldn't be complex to implement using the existing code as a base. Maybe we can get a branch started and see how that looks? Alternatively, it could be a fork and become something independent?

matiasb avatar Aug 17 '22 21:08 matiasb

Is there a standard way you could see external projects writing a plugin to support their diff format? That seems like it could be a good solution. If so, I can take a look or raise the issue with the Dolt maintainers to get that written.

addisonklinke avatar Aug 22 '22 15:08 addisonklinke

Right now there isn't an easy, pluggable way to specify a custom diff format (it wasn't considered previously either), and adding one would require some work. I think the simpler path to get something working, given that this seems to be a specific scenario and scope, would be to fork and adapt the existing code as a separate project. I can try to help and answer questions as time permits.

matiasb avatar Aug 31 '22 01:08 matiasb