lookout icon indicating copy to clipboard operation
lookout copied to clipboard

DataService: changes in a tree that is a sub-module

Open bzz opened this issue 7 years ago • 7 comments

For now, to avoid failures as in #144 in DataService we just skip all requested files that are not part of the repository. One case when this is not an error is when the changes were done in some sub-module.

Rationale is: most probably that would not be a place user will want to see the feedback on his PR anyway. https://github.com/src-d/lookout/issues/144#issuecomment-413534313

Supporting getting files/changes from sub-modules with DataService may make sense in some cases, so this issue is about:

  1. finding such cases
  2. if found, implementing sub-modules support
  3. if not found, may be instead of #150 we should not event detect these as a change and just skip inside DiffTreeScanner

bzz avatar Aug 16 '18 16:08 bzz

Using https://github.com/src-d/style-analyzer/compare/c99dcdff172f1cb5505603a45d054998cb4dd606...src-d:3a9d78 example from https://github.com/src-d/lookout/issues/144#issuecomment-413528797 it's not immediately clear why changes related to submodule in core/server are even detected by DiffTreeScanner.

After they are detected in a Trees, of course File/ChangeBlogScanners fail to get blobs for such paths.

On manual inspection of those hashes, there are no trees related to a path lookout/core/server, that is a sub-module.

From Tree

$ git cat-file -p 5e636bfd1263b4669e2973282294dcc1606b78d9
100644 blob 4fd343b5cd43471af35aaf62466e0c6070630f8b	__init__.py
100644 blob 35561554e4a074d279c5d1eb0cd68c10821fb580	__main__.py
040000 tree 2e1d2fee0927fc5c4a8f5562efb5d4033d3b6f59	core
040000 tree 11f294824d517bc746f0fb00a1a4e10943b8fda4	style

To Tree

$ git cat-file -p 581c5c0e21e619095a87a7e953af3474d2c7ee36
100644 blob 4fd343b5cd43471af35aaf62466e0c6070630f8b	__init__.py
100644 blob 35561554e4a074d279c5d1eb0cd68c10821fb580	__main__.py
040000 tree 3007fdce337099fb6d2a3ed1cc00765aaea89c6d	core
040000 tree 7476473b904019468d313869ddd31c11185934cc	style

bzz avatar Aug 16 '18 17:08 bzz

  1. and 2. is preliminary done - App and ML teams have consensus that we do not want to get submodules changes as changes in lookout.

This is only about 3.

bzz avatar Aug 21 '18 15:08 bzz

After a bit more investigation - indeed, a submodule seems to have a entry that is different - it's a "Commit" (!) under core/server

$ git diff-tree --raw -r --find-renames 5e636bfd1263b4669e2973282294dcc1606b78d9 581c5c0e21e619095a87a7e953af3474d2c7ee36

:100644 100644 4bf07dffc8bd1f219aae52aee6cf6a26c6d19f5e 58da5abdc82e88ed05b8b39b032afce3e56aa658 M	core/api/event_pb2.py
:100644 100644 16ec768e998a96ed24bcb5a850f49eb7107beb93 f51c11f2241ec0077975b2905fc9fc77dbb42873 M	core/api/service_data_pb2.py
:100644 100644 fc1c3e17404d7ac391cf289ea7ed4f55a79eeaac fc82aa1ab0a6282dc52219f8c3841fd90f18dab8 M	core/manager.py
:160000 160000 8a8b89a466aa9e87dff57ebe180fc8569314a75d 5da09179136f01a51f73bd5cbc2b576af6f48aad M	core/server
:100644 100644 6facdca17895101377ff952b5977d915e27ab736 61815491be9f65f3c77c9f44d228e0638ab095b4 M	style/format/format_analyzer.py
$ git cat-file -p 3007fdce337099fb6d2a3ed1cc00765aaea89c6d

100644 blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391	__init__.py
040000 tree bc2e0680a32215a82a9eeabc2888860121790756	api
100644 blob c499bbb316279de75a5d6bdec4ace90cd6467d22	model_repository.py
 ...
160000 commit 5da09179136f01a51f73bd5cbc2b576af6f48aad	server
100644 blob fe0a961b66337d2727c21134fec6271cd1a81f7b	slogging.py

That simplifies 3 - DiffTreeScanner should check if a change is a Commit and skip it.

bzz avatar Aug 23 '18 06:08 bzz

Was able to reproduce it on https://github.com/bzz/submodule/commit/f46a1e3c7c90d63c07bf47beb13c9ffe5fab3124 🎉

Funny enough, this does not happen on git-fixtures/submodule repo, as it does not include any commit that does both:

  • modifies regular file
  • modifies submodule (just adding submodule does not trigger this!)

bzz avatar Aug 23 '18 09:08 bzz

putting back to backlog due to priority of SDK tasks

bzz avatar Aug 29 '18 11:08 bzz

It seems that DataService no longer crash if there are changes in a tree that is a git sub-module; @bzz suggested that DiffTreeScanner should not even detect these as a change, instead of relying in the workaround done by #150.

should we proceed as suggested, or it is not really needed?

dpordomingo avatar Jan 11 '19 16:01 dpordomingo

It looks like the workaround in #150 will not apply if FilesRequest.want_changes is false. So I'd say yes, it makes sense to make it more robust and skip it before it gets to the blobAdder. Otherwise two different FilesRequest will get a different list of files, depending if the blob contents are needed or not.

carlosms avatar Jan 14 '19 10:01 carlosms