gitbutler icon indicating copy to clipboard operation
gitbutler copied to clipboard

branch details with `gix`

Open Byron opened this issue 1 year ago • 4 comments

This PR is the first step towards providing branch details with gitoxide.

Tasks

  • [x] git2 performance optimizations for 'apples-to-apples' comparison.
  • [x] Fill in missing details in understanding
  • [x] Add a couple of tests (validate merge-base and diff information)
  • [x] Add benchmarks - ideally also with larger repositories
  • [x] Use gix for commit-walk - more than twice as fast. Screenshot 2024-08-12 at 11 56 56, and it can certainly get even faster.
  • [x] Use gix for tree-tree diff Screenshot 2024-08-13 at 10 24 34 - first results are promising, at 11 times faster on a single core, pure diffing performance. It will be slower if the diffed objects have to be retrieved/aren't cached, so in reality on gitlab it's just about 1.6x of git2.
  • [x] make it fast in the real world
  • [ ] improve rev-walk API
  • [ ] improve tree-tree diff API
  • [x] don't forget origin/478475-model-registry-markdown-support-for-version-description has 2 commit (itself) above the merge-base, but lists commit in thegix`
    • This is correct, actually, as there is only one commit and git2 for some reason counts the first parent, even though it's 'underneath' the mergebase.
  • [ ] Don't forget to re-enable all benchmarks!

Follow-Ups

  • replace merge-base - will need some research to see how complex the algorithm is.
  • Add support for CC based zlibng to flate2

Notes for the Reviewer

  • This PR also optimizes git2 for performance, which also levels the playing field as certain validations cost time, but aren't performed (yet) by gix.
  • gix is able to be 1.6x of git2 when it comes to tree-diffing, and it's noticeable. however, for now it's more like 1.5x as stock zlib has to be used until cmake is either available or flate2 can be configured to not use cmake at all.

Notes

  • When having a virtual branch named main it becomes 'entangled' with refs/heads/main. However, when refs/heads/main receives a new commit, main (the vbranch) knows nothing about it and also won't offer to rebase on top of it. It's still tied to origin/main. It's probably OK, even though I would have expected to be able to incorporate the changes somehow.
  • Eventually, we will want to make max-performance work on CI - shouldn't be too hard and would already work if docker wasn't used.
  • I have added sorting (lexicographically, increasing) to make both versions more comparable, and also because git branch sorts like that (but using the full ref name. In the end, the UI should probably sort and group like it wants, then the backend doesn't have to do that anymore. Then I realized the UI is already sorting by last-modified date.
  • For clean profiling, use gitbutler-cli branch details 478475-model-registry-markdown-support-for-version-description "pedropombeiro/454313/3-deprecate-fields" "mc/overwrite-jobs-instead-of-uniqueness-enforcement" "backfill-multiple-desired-sharding-key-small-table-compliance_framework_security_policies" "jay_mccure-master-patch-68663" "work-items-widgets-migration" "ah-dedup-records-migration" "455903-limits" "477889-implement-additional-telemetry-for-added-context-code-completion" "psi-ff-summduo" "450705-policies-mvp-combined-mode" "ia-arkose-token-verification-consitent-return" "hustewart-add-build-snippet-service" "bw-add-subscribed-filter" "adjust-new-user-signup-caps-validation" "bk/473748-backfill-owasp-top-10-null" "cwoolley-gitlab-master-patch-150b" "application-settings-pattern-poc-haml" "470051-document-backup-of-object-storage-data" "j_lar/add_redis_keys_to_monitor" "477306-glql-table-of-issues" "rlehmann1-fix-lint-issues-policies" "ban-ai-duo-settings-admin-group-form" "handle-empty-ff-merge-in-from-train-ref-strategy" "bw-run-markdown-benchmark-on-tag" "477813-follow-up-from-refactor-ai-ga-policies" "gitaly-ci-jobs-52550" "443369-update-self-managed-billing" "433082-iteration-rest-api-does-not-properly-filter-out-iterations-from-groups-with-no-access" "jkhabie-add-reachability-column-sbom" "jreporter-master-patch-d091" "vmairet-add-token-associations" "vij-remove-member-default-value" "autoflow/refactor-naming" "thutterer-danger-demo-settings" "443369-update-billed-shared-group-users" "fix_audit_event_flow" "16-11-stable-ee" "477712-health-check-repeated-click" "470701-branch-rule-editing-minimum-required-approvals-per-target-branch" "ensure_prepared_worker_automation" "janis-add-pages-to-usage-quotas" "quarantine-flaky-tests-spec-lib-gitlab-import_export-project-relation_factory_spec-rb-81" "440857-glql-integration" "472352-transition-dast-site-profiles-builds" "tachyons-pat-sharding-key-full" "368542-container-registry-metadata" "design-management-widget-migration" "ag/477091-seat-controls-fe-validation" "458831-remove-required_instance_ci_template-column" "455903-placeholder-limit" "gmh-add-approver-to-mr-graphql" "quarantine-flaky-tests-spec-lib-banzai-pipeline-plain_markdown_pipeline_spec-rb-106" "463258-create-common-solution-on-how-to-count-and-track-feature-flag-enabled-value" "mk/fix-empty-replication-details-view" "rodrigo/integrate-improved-user-mapping-in-direct-transfer" "16-9-stable-ee" "jmc-16.9-backport-shm-qa" "mattkasa/refactor-query-analyzers" "application-settings-pattern-poc" "464587-support-merge-requests-as-context-for-duo-chat-responses" "449139-fe-update-the-list-and-the-counter-after-inline-role-change-2" "jivanvl-update-permissions-container-expiration-policy" "hustewart-snippets-refactor" "477547_nullify_organization_id_migration" "slashmanov/add-release-filter-to-mr-list-app" "dagron1-vulnerability-resolution-supported-scanners" "464132-work-items-prevent-adding-more-than-10-items-to-hierarchy-widget-in-one-batch" "md-add-feature-flag-headers" "form-issue-jira-setting" "ph/mrDashboardNavCount" "mc_rocha-poc-ingest-license-sbom-security-policy""462398-refactor-pb-fe" "bk/473748-modify-finder" "471726-ignore-invalid-project-CI-with-pep-override-strategy" "autoflow/issue-created-event" "466718-add-api-endpoint-for-bulk-usage_data-sending" "vi-display-closed-status-for-branches" "docs/bprescott/20240618-dbmisc" "eduardosanz-master-patch-19280" "461761-refactor-clean-widgets-to-only-use-id-iid-and-fullpath-to-fetch-work-item-in-individual" "andrey-move-gdk-build" "reintroduce-create-jira-issue-form" "seed-placeholder-users-task" "ck3g-test-xray-job-do-not-merge" "463822-create-service-to-generate-metadata-for-group-instance-level"

stderr with cache-efficiency-debug enabled, showing decent object cache performance, but bad pack-cache performance. The object cache is the most important.

Byron avatar Aug 10 '24 15:08 Byron

@Byron is attempting to deploy a commit to the GitButler Team on Vercel.

A member of the Team first needs to authorize it.

vercel[bot] avatar Aug 10 '24 15:08 vercel[bot]

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
gitbutler-web ✅ Ready (Inspect) Visit Preview 💬 Add feedback Aug 14, 2024 0:05am

vercel[bot] avatar Aug 14 '24 11:08 vercel[bot]

the tree to tree diffing being 11 times faster is really impressive! looking forward to using that in other hot paths of the app :)

krlvi avatar Aug 15 '24 09:08 krlvi

the tree to tree diffing being 11 times faster is really impressive! looking forward to using that in other hot paths of the app :)

Unfortunately it's really only like that in the given microbenchmark, which just diffs a single-line changes of the same two blobs, 10k times. As gitoxide has a cache for this, it doesn't do much work after extracting both objects, whereas git2 doesn't seem to have this optimization (an optimization really coming from implementing rename-tracking which is disabled here).

In practice, on a real GitLab repo it still clocks in at 1.6x faster, which is something I can feel when using the UI.

Byron avatar Aug 16 '24 08:08 Byron