kart icon indicating copy to clipboard operation
kart copied to clipboard

Add support for version-controlled point clouds

Open olsen232 opened this issue 3 years ago • 8 comments

Tracking bug - will add PRs here, and make notes of functionality that is missing. Later when point clouds gets closer to launch, this can be used to file individual point cloud issues.

Right now all functionality is missing - point clouds can't be imported, checked out, diffed, modified, committed, etc.

olsen232 avatar Mar 02 '22 01:03 olsen232

If you use a recent build of Kart, and get PDAL and Git LFS installed, such that they are linked in properly with their existing dependencies such as sqlite, and if you set environment variable X_KART_POINT_CLOUDS=1 then you should be able to do the following:

  • import point clouds, including automatically converting to COPC
  • push and pull
  • show, diff and commit tile changes As of #645, branch switches and resets and the working copy side of things is more robust.

Not done:

  • meta changes - these are not detected and not supported
  • merges + conflicts
  • spatial filtering
  • the build (PDAL and Git LFS)
  • copy-on-write in order to save disk space (?)

olsen232 avatar Jun 08 '22 02:06 olsen232

Meta

  • Detect meta changes

Right now we don’t store LAZ version or COPC version except in the tiles themselves, and in the pointer files to those tiles (unlike the schema, which we store per-dataset in schema.json). This means that a dataset with no tiles (eg if you delete every tile it has) still has a schema, but no longer has a well-defined LAS version or COPC version.

  • Commit meta changes o Commit tile changes and discard meta changes o Commit tile and meta changes: update dataset meta items to match new tile(s)

Make sure we support adding non-copc files to copc dataset, since, we auto convert non-copc to copc during import, and it will confuse users if they then can't commit additional non-copc files to the imported dataset.

Merges/conflicts

Make sure repo is correctly moved into merge-state in the case of conflicts. Add a way to resolve conflicts. (For vector, this is only possible by providing the resolution to the CLI using a stdin or a file. There is no way to provide the resolution using a working copy. For PC, it would probably make most sense to provide it using the working copy. We could - if we wanted - add support for providing vector resolutions at the working copy level too).

Spatial filtering

Support spatial filtering.

Build issues

PDAL Git LFS client

Copy-on-write

Not sure what this will involve

Odds and ends

  • ~Some more work on reset()~ #652
  • ~Naming bug - don't name things .copc.copc.laz~
  • Changes to kart status output
  • kart create-workingcopy only supports tabular working copies - need some PC equivalent
  • Rename detection? If you rename a tile, right now it is treated as a delete + insert.
  • ~Fix the bug where we extract only the horizontal CRS~
  • Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?)
  • Calling git lfs fetch to fetch from HEAD won't work for a partial reset from another branch
  • git lfs fetch seems to be fetching missing+promised blobs, which breaks spatial filtering.
  • ~Putting LFS tiles into workdir-index is also causing them to be copied to the ODB, which is a huge waste of space.~
  • ~Handle finer-grained soft-resets than entire datasets~

olsen232 avatar Jun 13 '22 02:06 olsen232

Meta changes - done ✅

  • Detect and commit meta changes ✅
  • Commit meta changes ✅
  • Commit new tiles with --convert-to-dataset-format ✅

Merges/conflicts

  • ~Make sure repo is correctly moved into merge-state in the case of conflicts.~ ✅
  • ~Make sure conflicts can be resolved (in a basic way)~ ✅
  • Write all versions conflicting tiles to the workdir somewhere, to let the user view them and figure out how to merge them.
  • Let the user merge conflicting tiles by reading them from the workdir
  • Relatedly: Let the user merge conflicting vector features by reading from the working copy
  • Clean up any conflicts that are in the workdir when the merge is complete.

Spatial filtering

Support spatial filtering.

Build issues

PDAL Git LFS client

Copy-on-write

Not sure what this will involve

Odds and ends

  • Changes to kart status output
  • kart create-workingcopy only supports tabular working copies - need some PC equivalent
  • Rename detection? If you rename a tile, right now it is treated as a delete + insert.
  • Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?)
  • Calling git lfs fetch to fetch from HEAD won't work for a partial reset from another branch
  • git lfs fetch seems to be fetching missing+promised blobs, which breaks spatial filtering.

olsen232 avatar Jul 19 '22 02:07 olsen232

Spatial filtering vs Git LFS

(this is mentioned briefly in Odds and Ends above but makes more sense as as part of Spatial Filtering) Git LFS doesn't seem to currently support missing+promised blobs - it seems to deal with them in two different ways, both of which are undesirable for us:

  • it fetches them, which makes things simple from its point of view - now it can check if they are pointer files - but this means that content that is outside the spatial filter can leak into your repo, causing it to grow in size towards the repo's unfiltered size, which is undesirable
  • it doesn't fetch them and crashes instead, complaining that they are missing

Probably we will need to patch it. Possibly such patches can be upstreamed, since partial-clones are an official git concept which other LFS users might want to use without Git LFS crashing.

A Kart repo need not have any LFS tiles to hit problems with Git LFS - it is enough that the repo was created with experimental flag X_KART_POINT_CLOUDS=1 in the environment, which causes the LFS hooks to be installed, and the repo has missing+promised blobs.

olsen232 avatar Aug 01 '22 02:08 olsen232

Meta changes - done ✅

  • Detect and commit meta changes ✅
  • Commit meta changes ✅
  • Commit new tiles with --convert-to-dataset-format ✅

Merges/conflicts - done ✅

  • Make sure repo is correctly moved into merge-state in the case of conflicts. ✅
  • Make sure conflicts can be resolved (in a basic way) ✅
  • Write all versions conflicting tiles to the workdir somewhere, to let the user view them and figure out how to merge them. ✅
  • Let the user merge conflicting tiles by reading them from the workdir ✅
  • Relatedly: Let the user merge conflicting vector features by reading from the working copy ✅
  • Clean up any conflicts that are in the workdir when the merge is complete. ✅

Spatial filtering

  • Part 1 - don't accidentally fetch missing+promised blobs when scanning for pointer files, as this breaks existing spatial filtering
  • Part 2 - don't fetch LFS tiles that are outside the spatial filter

Build issues

PDAL Git LFS client

Copy-on-write

Not sure what this will involve

Odds and ends

  • Changes to kart status output
  • kart create-workingcopy only supports tabular working copies - need some PC equivalent
  • Rename detection? If you rename a tile, right now it is treated as a delete + insert.
  • Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?)

olsen232 avatar Aug 09 '22 02:08 olsen232

Meta changes - done ✅

Merges/conflicts - done ✅

Spatial filtering ✅

  • Part 1 - don't accidentally fetch missing+promised blobs when scanning for pointer files, as this breaks existing spatial filtering ✅
  • Part 2 - don't fetch LFS tiles that are outside the spatial filter ✅

Build issues

PDAL Git LFS client

Copy-on-write

Not sure what this will involve

Odds and ends

  • Changes to kart status output
  • kart create-workingcopy only supports tabular working copies - need some PC equivalent ✅
  • Rename detection? If you rename a tile, right now it is treated as a delete + insert.
  • Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?) ❌ no success so far
  • kart lfs+ push is fetching a potentially large number of files using command-line arguments right now - use stdin instead so as not to overflow the terminal.

olsen232 avatar Aug 26 '22 00:08 olsen232

Bugs

  • Test on windows - pre-push script is not working there:

olsen232 avatar Sep 08 '22 03:09 olsen232

there's an import bug:

  • [ ] we need to preserve classifications with flags from PDRF<6 at import time (ie we need to fix or workaround PDAL/PDAL#1701 )

craigds avatar Oct 13 '22 20:10 craigds

Build issues ❌/✅ Entire build system is reworked. Done but still not merged https://github.com/koordinates/kart/tree/rc-vendor-vcpkg-pyremove

Copy-on-write Should work fairly simply (on macos and linux only) using reflink package. Waiting for build issues to settle. (Maybe need a windows copy-on-write solution for launch?)

Odds and ends Changes to kart status output format Rename detection? If you rename a tile, right now it is treated as a delete + insert. Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?) ❌ no success so far. Can launch without this kart lfs+ push is fetching a potentially large number of files using command-line arguments right now - use stdin instead so as not to overflow the terminal. ✅ Merge import command with point-cloud-import command Remove X_KART_POINT_CLOUDS feature flag / default it on. Better error recovery if operations are interrupted or fail partway through - see https://github.com/koordinates/kart/issues/741

Docs Write point cloud specific docs Skim through existing docs and websites, make sure they are still up-to-date

olsen232 avatar Dec 01 '22 00:12 olsen232

Build issues ✅ - https://github.com/koordinates/kart/pull/752

Copy-on-write ✅ - https://github.com/koordinates/kart/pull/757 Not done on windows (or on any platform that doesn't support reflink) - see https://github.com/koordinates/kart/issues/772

Odds and ends Changes to kart status output format ✅ Rename detection? If you rename a tile, right now it is treated as a delete + insert. ❌ skipped. Merge import command with point-cloud-import command ✅ - https://github.com/koordinates/kart/pull/746 Remove X_KART_POINT_CLOUDS feature flag / default it on. ✅ Better error recovery if operations are interrupted or fail partway through - ✅ https://github.com/koordinates/kart/pull/751

Docs ✅ just finishing these up

Will close out this tracking issue since it's not being used any more

olsen232 avatar Jan 23 '23 02:01 olsen232