kart
kart copied to clipboard
Add support for version-controlled point clouds
Tracking bug - will add PRs here, and make notes of functionality that is missing. Later when point clouds gets closer to launch, this can be used to file individual point cloud issues.
Right now all functionality is missing - point clouds can't be imported, checked out, diffed, modified, committed, etc.
If you use a recent build of Kart, and get PDAL and Git LFS installed, such that they are linked in properly with their existing dependencies such as sqlite, and if you set environment variable X_KART_POINT_CLOUDS=1 then you should be able to do the following:
- import point clouds, including automatically converting to COPC
- push and pull
- show, diff and commit tile changes As of #645, branch switches and resets and the working copy side of things is more robust.
Not done:
- meta changes - these are not detected and not supported
- merges + conflicts
- spatial filtering
- the build (PDAL and Git LFS)
- copy-on-write in order to save disk space (?)
Meta
- Detect meta changes
Right now we don’t store LAZ version or COPC version except in the tiles themselves, and in the pointer files to those tiles (unlike the schema, which we store per-dataset in schema.json). This means that a dataset with no tiles (eg if you delete every tile it has) still has a schema, but no longer has a well-defined LAS version or COPC version.
- Commit meta changes o Commit tile changes and discard meta changes o Commit tile and meta changes: update dataset meta items to match new tile(s)
Make sure we support adding non-copc files to copc dataset, since, we auto convert non-copc to copc during import, and it will confuse users if they then can't commit additional non-copc files to the imported dataset.
Merges/conflicts
Make sure repo is correctly moved into merge-state in the case of conflicts. Add a way to resolve conflicts. (For vector, this is only possible by providing the resolution to the CLI using a stdin or a file. There is no way to provide the resolution using a working copy. For PC, it would probably make most sense to provide it using the working copy. We could - if we wanted - add support for providing vector resolutions at the working copy level too).
Spatial filtering
Support spatial filtering.
Build issues
PDAL Git LFS client
Copy-on-write
Not sure what this will involve
Odds and ends
- ~Some more work on reset()~ #652
- ~Naming bug - don't name things .copc.copc.laz~
- Changes to
kart statusoutput kart create-workingcopyonly supports tabular working copies - need some PC equivalent- Rename detection? If you rename a tile, right now it is treated as a delete + insert.
- ~Fix the bug where we extract only the horizontal CRS~
- Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?)
- Calling
git lfs fetchto fetch from HEAD won't work for a partial reset from another branch git lfs fetchseems to be fetching missing+promised blobs, which breaks spatial filtering.- ~Putting LFS tiles into
workdir-indexis also causing them to be copied to the ODB, which is a huge waste of space.~ - ~Handle finer-grained soft-resets than entire datasets~
Meta changes - done ✅
- Detect and commit meta changes ✅
- Commit meta changes ✅
- Commit new tiles with --convert-to-dataset-format ✅
Merges/conflicts
- ~Make sure repo is correctly moved into merge-state in the case of conflicts.~ ✅
- ~Make sure conflicts can be resolved (in a basic way)~ ✅
- Write all versions conflicting tiles to the workdir somewhere, to let the user view them and figure out how to merge them.
- Let the user merge conflicting tiles by reading them from the workdir
- Relatedly: Let the user merge conflicting vector features by reading from the working copy
- Clean up any conflicts that are in the workdir when the merge is complete.
Spatial filtering
Support spatial filtering.
Build issues
PDAL Git LFS client
Copy-on-write
Not sure what this will involve
Odds and ends
- Changes to
kart statusoutput kart create-workingcopyonly supports tabular working copies - need some PC equivalent- Rename detection? If you rename a tile, right now it is treated as a delete + insert.
- Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?)
- Calling
git lfs fetchto fetch from HEAD won't work for a partial reset from another branch git lfs fetchseems to be fetching missing+promised blobs, which breaks spatial filtering.
Spatial filtering vs Git LFS
(this is mentioned briefly in Odds and Ends above but makes more sense as as part of Spatial Filtering) Git LFS doesn't seem to currently support missing+promised blobs - it seems to deal with them in two different ways, both of which are undesirable for us:
- it fetches them, which makes things simple from its point of view - now it can check if they are pointer files - but this means that content that is outside the spatial filter can leak into your repo, causing it to grow in size towards the repo's unfiltered size, which is undesirable
- it doesn't fetch them and crashes instead, complaining that they are missing
Probably we will need to patch it. Possibly such patches can be upstreamed, since partial-clones are an official git concept which other LFS users might want to use without Git LFS crashing.
A Kart repo need not have any LFS tiles to hit problems with Git LFS - it is enough that the repo was created with experimental flag X_KART_POINT_CLOUDS=1 in the environment, which causes the LFS hooks to be installed, and the repo has missing+promised blobs.
Meta changes - done ✅
- Detect and commit meta changes ✅
- Commit meta changes ✅
- Commit new tiles with --convert-to-dataset-format ✅
Merges/conflicts - done ✅
- Make sure repo is correctly moved into merge-state in the case of conflicts. ✅
- Make sure conflicts can be resolved (in a basic way) ✅
- Write all versions conflicting tiles to the workdir somewhere, to let the user view them and figure out how to merge them. ✅
- Let the user merge conflicting tiles by reading them from the workdir ✅
- Relatedly: Let the user merge conflicting vector features by reading from the working copy ✅
- Clean up any conflicts that are in the workdir when the merge is complete. ✅
Spatial filtering
- Part 1 - don't accidentally fetch missing+promised blobs when scanning for pointer files, as this breaks existing spatial filtering
- Part 2 - don't fetch LFS tiles that are outside the spatial filter
Build issues
PDAL Git LFS client
Copy-on-write
Not sure what this will involve
Odds and ends
- Changes to
kart statusoutput kart create-workingcopyonly supports tabular working copies - need some PC equivalent- Rename detection? If you rename a tile, right now it is treated as a delete + insert.
- Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?)
Meta changes - done ✅
Merges/conflicts - done ✅
Spatial filtering ✅
- Part 1 - don't accidentally fetch missing+promised blobs when scanning for pointer files, as this breaks existing spatial filtering ✅
- Part 2 - don't fetch LFS tiles that are outside the spatial filter ✅
Build issues
PDAL Git LFS client
Copy-on-write
Not sure what this will involve
Odds and ends
- Changes to
kart statusoutput kart create-workingcopyonly supports tabular working copies - need some PC equivalent ✅- Rename detection? If you rename a tile, right now it is treated as a delete + insert.
- Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?) ❌ no success so far
- kart lfs+ push is fetching a potentially large number of files using command-line arguments right now - use stdin instead so as not to overflow the terminal.
Bugs
- Test on windows - pre-push script is not working there:
there's an import bug:
- [ ] we need to preserve classifications with flags from PDRF<6 at import time (ie we need to fix or workaround PDAL/PDAL#1701 )
Build issues ❌/✅ Entire build system is reworked. Done but still not merged https://github.com/koordinates/kart/tree/rc-vendor-vcpkg-pyremove
Copy-on-write Should work fairly simply (on macos and linux only) using reflink package. Waiting for build issues to settle. (Maybe need a windows copy-on-write solution for launch?)
Odds and ends Changes to kart status output format Rename detection? If you rename a tile, right now it is treated as a delete + insert. Figure out what non-determinism is happening during import (are timestamps being embedded somewhere?) ❌ no success so far. Can launch without this kart lfs+ push is fetching a potentially large number of files using command-line arguments right now - use stdin instead so as not to overflow the terminal. ✅ Merge import command with point-cloud-import command Remove X_KART_POINT_CLOUDS feature flag / default it on. Better error recovery if operations are interrupted or fail partway through - see https://github.com/koordinates/kart/issues/741
Docs Write point cloud specific docs Skim through existing docs and websites, make sure they are still up-to-date
Build issues ✅ - https://github.com/koordinates/kart/pull/752
Copy-on-write ✅ - https://github.com/koordinates/kart/pull/757 Not done on windows (or on any platform that doesn't support reflink) - see https://github.com/koordinates/kart/issues/772
Odds and ends Changes to kart status output format ✅ Rename detection? If you rename a tile, right now it is treated as a delete + insert. ❌ skipped. Merge import command with point-cloud-import command ✅ - https://github.com/koordinates/kart/pull/746 Remove X_KART_POINT_CLOUDS feature flag / default it on. ✅ Better error recovery if operations are interrupted or fail partway through - ✅ https://github.com/koordinates/kart/pull/751
Docs ✅ just finishing these up
Will close out this tracking issue since it's not being used any more