hyrax
Freyja and Frigg Adapters for Migrating Collections, AdminSets and Works to Postgres or Fcrepo 6
Overview
First, apologies. The nature of the work has been that we've been upgrading Hyku and testing Hyku against this branch. As such, it's a larger PR than we might normally prefer.
Type of change (for release notes)
This PR includes two primary considerations:
- Code changes that allow for a "lazy migration" of data (see discussion below).
- Refactorings that allow downstream applications to configure behavior instead of overriding entire files.
Further Discussion
The primary work introduces the Frigg and Freyja metadata adapters; these adapters are configured to:
- Read first from one storage location and, failing to find something, read from another (e.g. storage_first.find || storage_second.find, if you will).
- Write to the first storage location, ignoring the second storage location.
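The read and write rules above can be sketched with in-memory stand-ins. This is only an illustration of the behavior described, not the adapters' actual code; MemoryStore, LazyMigrationAdapter, and NotFound are hypothetical names that do not exist in Hyrax or Valkyrie.

```ruby
# Hypothetical in-memory stand-ins; these are NOT Valkyrie or Hyrax classes.
class NotFound < StandardError; end

class MemoryStore
  def initialize(records = {})
    @records = records.dup
  end

  # Return the stored record or raise NotFound on a miss.
  def find(id)
    @records.fetch(id) { raise NotFound, id }
  end

  def save(id, record)
    @records[id] = record
  end

  def key?(id)
    @records.key?(id)
  end
end

class LazyMigrationAdapter
  def initialize(first:, second:)
    @first = first
    @second = second
  end

  # Read from the first location; fall back to the second on a miss.
  def find(id)
    @first.find(id)
  rescue NotFound
    @second.find(id)
  end

  # Write only to the first location; the second is never touched.
  def save(id, record)
    @first.save(id, record)
  end
end

postgres = MemoryStore.new
fedora4  = MemoryStore.new({ "work-a" => { title: "Work A" } })
adapter  = LazyMigrationAdapter.new(first: postgres, second: fedora4)

found = adapter.find("work-a") # miss in the first store, hit in the second
adapter.save("work-a", found.merge(title: "Work A (edited)"))
```

After the save, later finds are served by the first store, while the legacy copy in the second store is left as it was.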
Through configuration and convention, the Frigg and Freyja adapters write similar-looking Solr documents as their ActiveFedora-based counterparts.
The Frigg adapter's first storage location is Postgres via Valkyrie, and its second storage location is fcrepo via ActiveFedora.
The Freyja adapter's first storage location is fcrepo via Valkyrie, and its second storage location is an fcrepo via ActiveFedora. Note that these are likely Fedora 6 and Fedora 4, respectively.
What this means is that these adapters provide a pathway for minimal downtime during a storage migration; something demonstrated by GBH's AMS project and presented on at Samvera Connect 2023 by @orangewolf.
Guidance for testing, such as acceptance criteria or new user interface behaviors:
There are no known new user behaviors. Instead we're relying on the test suite as well as downstream testing of Hyku migrations to help verify.
@samvera/hyrax-code-reviewers
The goal of the double_combo approach is to minimize downtime by allowing for in-place migrations.
Note: In this scenario, I’m using Fedora4 as the name for the legacy persistence layer, and Postgres as a symbolic name for the metadata adapter’s persistence layer; we could just as easily use Fedora4 but for clarifying purposes I’m separating these concepts.
This is done by leveraging a copy on write (CoW) type strategy. Below are two examples:
Given I wrote Work A to Fedora4 via ActiveFedora
And wrote Work A to Solr via ActiveFedora
And did not write Work A to Postgres
When I use Valkyrie to find Work A
Then Valkyrie will first check Postgres for Work A
And Valkyrie will next check Fedora4 for Work A
From the above scenario:
Given I have found Work A via Valkyrie
When I use Valkyrie to save Work A
Then Valkyrie will write to Postgres
And Valkyrie will update the Solr document originally written by ActiveFedora
At the point of completing the write scenario, when we go to find “Work A” via Valkyrie, we’ll first check and find the record in Postgres.
Consider that the Fedora4 repository also has works B, C, D, and E. And all of those are part of Collection Alpha. When rendering the contents of the collection, we leverage the Solr document representations of each of the works. This is done by creating presenters. To avoid presentation drift, we would like for A, B, C, D, and E to have consistent presenters regardless of whether they were persisted via ActiveFedora or Valkyrie.
Put another way, the Presenter for Work A’s solr document written by ActiveFedora and the Presenter for Work A’s solr document written by Valkyrie should have the same methods and each of those methods used for rendering should return the same values.
To reduce duplication, it would be ideal for both of those presenters to in fact be the same presenter. It then follows that we would want to minimize branching logic between documents written via ActiveFedora and those written by Valkyrie; the greatest minimization would be for ActiveFedora and Valkyrie to generate logically equivalent solr documents. I say logically equivalent in that there are administrative benefits for tracking in Solr which documents have been valkyrized (see valkyrie_bsi).
This leads to the aspiration that the has_model_ssim should be stable between a document written via ActiveFedora and one written via Valkyrie.
The has_model_ssim is a foundational attribute in that presenters use this for switching logic and we use it for Solr querying logic. See Hyrax::SolrDocumentBehavior#hydra_model and its uses.
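To make the stability argument concrete, here is a minimal sketch of presenter selection keyed on has_model_ssim. The documents, the "GenericWork" model name, the title_tesim field, GenericWorkPresenter, and presenter_for are all hypothetical and chosen for illustration; only has_model_ssim and valkyrie_bsi come from the discussion above.

```ruby
# Hypothetical Solr documents: one as ActiveFedora might write it, one as
# Valkyrie might write it. They differ only in the valkyrie_bsi flag.
AF_DOC = {
  "id" => "work-a",
  "has_model_ssim" => ["GenericWork"],
  "title_tesim" => ["Work A"]
}.freeze
VAL_DOC = AF_DOC.merge("valkyrie_bsi" => true).freeze

# Hypothetical presenter; not a Hyrax class.
class GenericWorkPresenter
  def initialize(doc)
    @doc = doc
  end

  def title
    @doc.fetch("title_tesim").first
  end
end

PRESENTERS = { "GenericWork" => GenericWorkPresenter }.freeze

# Switch on has_model_ssim only, so the persistence path never matters.
def presenter_for(doc)
  model = doc.fetch("has_model_ssim").first
  PRESENTERS.fetch(model).new(doc)
end

af_presenter  = presenter_for(AF_DOC)
val_presenter = presenter_for(VAL_DOC)
```

Because both documents carry the same has_model_ssim value, both resolve to the same presenter class and render the same title; valkyrie_bsi remains available for administrative queries without affecting presentation.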
Concerning Freyja::CustomQueryContainer#method_missing.
When we attempt to call the find_access_control_for custom query, the first service checked raises a Valkyrie::Persistence::ObjectNotFoundError, which is then rescued in Hyrax::AccessControl.for by instantiating a new Hyrax::AccessControl object.
If in the Freyja::CustomQueryContainer we rescue Valkyrie::Persistence::ObjectNotFoundError, then the second service successfully finds the resource in Postgres. However, we then encounter a problem in a subsequent call; namely that the first service will now raise ActiveFedora::IllegalOperation.
I am leaving this here for the end of week, sharing the code and discovery. One possible avenue for next week’s exploration is to create an explicitly defined method for #find_access_control_for.
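As a starting point for that exploration, an explicitly defined method might look roughly like the sketch below. It uses stand-in classes so it runs on its own: ObjectNotFoundError stands in for Valkyrie::Persistence::ObjectNotFoundError, and StubCustomQueries, StubService, and ExplicitCustomQueryContainer are hypothetical. It only shows the per-service fallthrough and makes no attempt to resolve the ActiveFedora::IllegalOperation described above.

```ruby
# Stand-in for Valkyrie::Persistence::ObjectNotFoundError.
class ObjectNotFoundError < StandardError; end

# Hypothetical stub for one underlying service's custom queries.
class StubCustomQueries
  def initialize(acls)
    @acls = acls
  end

  def find_access_control_for(resource:)
    @acls.fetch(resource) { raise ObjectNotFoundError }
  end
end

StubService = Struct.new(:custom_queries)

class ExplicitCustomQueryContainer
  def initialize(services)
    @services = services
  end

  # Explicit method rather than method_missing: ask each service in order
  # and move on when one reports the ACL as missing.
  def find_access_control_for(resource:)
    @services.each do |service|
      begin
        return service.custom_queries.find_access_control_for(resource: resource)
      rescue ObjectNotFoundError
        next
      end
    end
    # No service had it; let the caller decide (e.g. build a new ACL).
    raise ObjectNotFoundError
  end
end

first     = StubService.new(StubCustomQueries.new({}))
second    = StubService.new(StubCustomQueries.new({ "work-a" => :acl_for_work_a }))
container = ExplicitCustomQueryContainer.new([first, second])
acl       = container.find_access_control_for(resource: "work-a")
```

Re-raising when every service misses keeps the existing rescue in Hyrax::AccessControl.for meaningful, since that caller already handles the not-found case by building a new object.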
Below are the two pieces of code followed by the backtrace for ActiveFedora::IllegalOperation; of note in the backtrace is that we attempt to call ActiveFedora::Core#initialize, a bit of a surprise.
# frozen_string_literal: true
module Freyja
  class CustomQueryContainer < Valkyrie::Persistence::CustomQueryContainer
    def method_missing(method_name, *args, **opts)
      query_service.services.each do |service|
        return service.custom_queries.send(method_name, *args, **opts) if service.custom_queries.respond_to?(method_name)
      end
      super
    end

    def respond_to_missing?(method_name, _include_private = false)
      query_service.services.each do |service|
        return true if service.custom_queries.respond_to?(method_name)
      end
      false
    end
  end
end
module Hyrax
  class AccessControl < Valkyrie::Resource
    ##
    # A finder/factory method for getting an appropriate ACL for a given
    # resource.
    #
    # @param resource [Valkyrie::Resource]
    # @param query_service [#find_inverse_references_by]
    #
    # @return [AccessControl]
    # @raise [ArgumentError] if the resource is not persisted
    def self.for(resource:, query_service: Hyrax.query_service)
      query_service.custom_queries.find_access_control_for(resource: resource)
    rescue Valkyrie::Persistence::ObjectNotFoundError
      new(access_to: resource.id)
    end
  end
end
ActiveFedora::IllegalOperation:
Attempting to recreate existing ldp_source:
`http://fcrepo:8080/rest/hykudemo/6d/90/14/a6/6d9014a6-06e7-4516-86be-87d899e62176'
# gems/active-fedora-14.0.1/lib/active_fedora/core.rb:36:in `initialize'
# gems/valkyrie-3.1.3/lib/valkyrie/persistence/postgres/orm_converter.rb:27:in `new'
# gems/valkyrie-3.1.3/lib/valkyrie/persistence/postgres/orm_converter.rb:27:in `resource'
# gems/valkyrie-3.1.3/lib/valkyrie/persistence/postgres/orm_converter.rb:19:in `convert!'
# gems/valkyrie-3.1.3/lib/valkyrie/persistence/postgres/resource_factory.rb:20:in `to_resource'
# gems/valkyrie-3.1.3/lib/valkyrie/persistence/postgres/query_service.rb:134:in `block in run_query'
# bundler/gems/hyrax-21dd6652f5f4/app/services/hyrax/custom_queries/find_access_control.rb:21:in `each'
# bundler/gems/hyrax-21dd6652f5f4/app/services/hyrax/custom_queries/find_access_control.rb:21:in `each'
# bundler/gems/hyrax-21dd6652f5f4/app/services/hyrax/custom_queries/find_access_control.rb:21:in `each'
# bundler/gems/hyrax-21dd6652f5f4/app/services/hyrax/custom_queries/find_access_control.rb:21:in `each'
# bundler/gems/hyrax-21dd6652f5f4/app/services/hyrax/custom_queries/find_access_control.rb:21:in `find'
# bundler/gems/hyrax-21dd6652f5f4/app/services/hyrax/custom_queries/find_access_control.rb:21:in `find_access_control_for'
# gems/valkyrie-3.1.3/lib/valkyrie/persistence/custom_query_container.rb:55:in `block (2 levels) in register_query_handler'
Test Results
17 files ±0, 17 suites ±0, duration 2h 21m 6s (+3m 13s)
6 704 tests (+454): 6 407 passed (+243), 297 skipped (+211), 0 failed (±0)
13 175 runs (+539): 12 780 passed (+297), 395 skipped (+242), 0 failed (±0)
Results for commit 731ec367. ± Comparison against base commit 72e570ac.
This pull request removes 305 and adds 759 tests. Note that renamed tests count towards both.
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f553e262588>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007fd4a9e82300>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f553e960790>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007fd4ac234618>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy AdminSet: 25282049-8e9f-4b71-835a-2ed2caa8f9b5
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy Hyrax::AdministrativeSet: 63690170-c710-422b-91c0-c1dab1d3de0c
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to edit AdminSet: c0d0f83d-e271-457d-a7fc-c798b84c2a9f
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to edit Hyrax::AdministrativeSet: b9da9683-ed9e-47c1-8dd7-b13b42a7e711
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to update AdminSet: f39a24c2-b092-40c4-ac1a-a8087db912bd
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to update Hyrax::AdministrativeSet: f0bd7955-d903-4242-adc7-6eac1926e1c5
…
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007f066e4cb690>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplate:0x00007ff6ce252e58>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007f066e527940>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to create #<Hyrax::PermissionTemplateAccess:0x00007ff6d143c090>
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy AdminSet: 084add89-94b4-4852-b30d-2f84863a43a2
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to destroy Hyrax::AdministrativeSet: d5b08e9a-0937-43da-a318-d23518658bdc
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to edit AdminSet: 3d1ac15f-0e88-4e43-9a2c-240b114ce432
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to edit Hyrax::AdministrativeSet: 8557897b-1a94-4683-a51f-2b46a4106980
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to update AdminSet: 22018dfc-fdc1-4187-af8d-451b79a7f37a
spec.abilities.ability_spec ‑ Hyrax::Ability AdminSets and PermissionTemplates a user without edit access is expected not to be able to update Hyrax::AdministrativeSet: 05ade203-916e-4c8d-9227-232a5f74eb2e
…
This pull request skips 10 tests.
spec.actors.hyrax.actors.collections_membership_actor_spec ‑ Hyrax::Actors::CollectionsMembershipActor create adds it to the collection
spec.actors.hyrax.actors.collections_membership_actor_spec ‑ Hyrax::Actors::CollectionsMembershipActor create updates env when share applies to works and only one collection removes member_of_collections_attributes and adds collection_id to env
spec.actors.hyrax.actors.collections_membership_actor_spec ‑ Hyrax::Actors::CollectionsMembershipActor create updates env when share applies to works when more than one collection removes member_of_collections_attributes and does NOT add collection_id
spec.actors.hyrax.actors.collections_membership_actor_spec ‑ Hyrax::Actors::CollectionsMembershipActor create updates env when share does NOT apply to works and only one collection removes member_of_collections_attributes and does NOT add collection_id
spec.actors.hyrax.actors.collections_membership_actor_spec ‑ Hyrax::Actors::CollectionsMembershipActor create when multiple membership checker returns a non-nil value adds an error and returns false
spec.actors.hyrax.actors.collections_membership_actor_spec ‑ Hyrax::Actors::CollectionsMembershipActor create when work is in another user's collection doesn't remove the work from the other user's collection
spec.actors.hyrax.actors.collections_membership_actor_spec ‑ Hyrax::Actors::CollectionsMembershipActor create when work is in user's own collection and destroy is passed removes the work from that collection
spec.actors.hyrax.actors.collections_membership_actor_spec ‑ Hyrax::Actors::CollectionsMembershipActor create when working through Rails nested attribute scenarios removes the work from that collection
spec.actors.hyrax.actors.collections_membership_actor_spec ‑ Hyrax::Actors::CollectionsMembershipActor the next actor does not receive the member_of_collections_attributes
spec.features.work_show_spec ‑ work show view in ActiveFedora as the work owner allows adding work to a collection
:recycle: This comment has been updated with latest results.
I'm continuing to review this and make sure we have a dassie config that uses Freyja.
I think we're getting close. I feel like the changes in dassie to enable Freyja need an on/off env toggle.