hyrax icon indicating copy to clipboard operation
hyrax copied to clipboard

valkyrie: use storage adapters for derivatives

Open dunn opened this issue 2 years ago • 7 comments

Currently when derivatives are generated for uploaded files, Hyrax tries to save them directly to the filesystem, and subsequent access also assumes they will exist in a conventional local location: https://github.com/samvera/hyrax/blob/9ebda1a948fcec5e74aafc0d57da8e5dda9ad8ea/app/services/hyrax/derivative_path.rb#L7

When an application is using a different Valkyrie storage adapter, such as valkyrie-shrine for S3, these assumptions fail, and derivative storage doesn't work. This PR teaches Hyrax to use the defined derivatives_storage_adapter instead of directly accessing the filesystem.

  • Valkyrie-based upload tooling is abstracted into ValkyrieUpload in order to be used for original and derivative file upload
  • a new controller pattern is used to create routes for derivatives; these are used instead of the ThumbnailIndexer for FileSets (future work should replace all uses of ThumbnailIndexer in Valkyrie contexts)

@samvera/hyrax-code-reviewers

dunn avatar May 12 '22 17:05 dunn

not sure why we're seeing these errors again; the storage_adapter#upload is definitely happening first: https://app.circleci.com/pipelines/github/samvera/hyrax/7673/workflows/3d3461d7-7092-4999-9d7a-943421b3f1c1/jobs/61138

dunn avatar May 19 '22 20:05 dunn

it might be that #upload is silently failing beforehand?

From: /app/samvera/hyrax-engine/lib/wings/valkyrie/persister.rb:41 Wings::Valkyrie::Persister#save:

    26: def save(resource:)
    27:   af_object = resource_factory.from_resource(resource: resource)
    28:
    29:   check_lock_tokens(af_object: af_object, resource: resource)
    30:
    31:   # the #save! api differs between ActiveFedora::Base and ActiveFedora::File objects,
    32:   # if we get a falsey response, we expect we have a File that has failed to save due
    33:   # to empty content
    34:   af_object.save! ||
    35:     raise(FailedSaveError.new("#{af_object.class}#save! returned non-true. It might be missing required content.", obj: af_object))
    36:
    37:   resource_factory.to_resource(object: af_object)
    38: rescue ActiveFedora::RecordInvalid, RuntimeError => err
    39:   require 'pry'; binding.pry
    40:   raise MissingOrUnsavedFileError.new(err.message, obj: af_object) if
 => 41:     err.message == 'Save the file first'
    42:
    43:   raise FailedSaveError.new(err.message, obj: af_object)
    44: end

[6] pry(#<Wings::Valkyrie::Persister>)> resource.file_identifier
=> #<Valkyrie::ID:0x00007fc931752ef0 @id="fedora://fcrepo:8080/rest/test/4c/14/17/d3/4c1417d3-8d3c-459b-9927-22c9906add91/files/b1af2514-f8d9-43ac-8213-f7745a1275e8">
[7] pry(#<Wings::Valkyrie::Persister>)> resource.file_identifier.id
=> "fedora://fcrepo:8080/rest/test/4c/14/17/d3/4c1417d3-8d3c-459b-9927-22c9906add91/files/b1af2514-f8d9-43ac-8213-f7745a1275e8"
[8] pry(#<Wings::Valkyrie::Persister>)> Hyrax.query_service.find_by(id: resource.file_identifier.id)
Ldp::BadRequest: Path contains empty element! /test/fe/do/ra/:/fedora://fcrepo:8080/rest/test/4c/14/17/d3/4c1417d3-8d3c-459b-9927-22c9906add91/files/b1af2514-f8d9-43ac-8213-f7745a1275e8
from /usr/local/bundle/gems/ldp-1.0.3/lib/ldp/client/methods.rb:118:in `block in check_for_errors'
Caused by RuntimeError: Save the file first
from /usr/local/bundle/gems/active-fedora-13.2.7/lib/active_fedora/with_metadata/metadata_node.rb:49:in `save'

dunn avatar May 19 '22 21:05 dunn

back to this error:

Wings::Valkyrie::Persister::MissingOrUnsavedFileError:
  Failed to save object {obj}.
  Wings tried to save metadata for a file which has not been saved. Fedora creates a metadata node when the file is created, so it's not possible to add metadata for a file until the file contents are persisted.
    Use the Hyrax.storage_adapter to save the file before trying to save metadata.
  Save the file first
./lib/wings/valkyrie/persister.rb:39:in `rescue in save'
./lib/wings/valkyrie/persister.rb:26:in `save'
./app/services/hyrax/valkyrie_upload.rb:51:in `upload'
./app/services/hyrax/valkyrie_persist_derivatives.rb:33:in `call'
./spec/services/hyrax/valkyrie_persist_derivatives_spec.rb:18:in `block (4 levels) in <top (required)>'

https://app.circleci.com/pipelines/github/samvera/hyrax/7742/workflows/3d3d6833-fa41-449a-90bc-383be21406b2/jobs/61662

dunn avatar May 31 '22 20:05 dunn

this looks to be where the File ID gets mangled by AF:

From: /usr/local/bundle/gems/active-fedora-13.2.7/lib/active_fedora/ldp_resource_service.rb:30 ActiveFedora::LdpResourceService#to_uri:

    29: def to_uri(klass, id)
 => 30:   klass.id_to_uri(id)
    31: end

[10] pry(#<ActiveFedora::LdpResourceService>)> id
=> "disk:///app/samvera/hyrax-webapp/derivatives/cb/f3/a3/cbf3a38cacd74715bd86816ad7a89636/3-thumbnail.jpeg"
[11] pry(#<ActiveFedora::LdpResourceService>)> klass.id_to_uri(id)
=> "http://fcrepo:8080/rest/test/di/sk/:/disk:///app/samvera/hyrax-webapp/derivatives/cb/f3/a3/cbf3a38cacd74715bd86816ad7a89636/3-thumbnail.jpeg"

Or more specifically by Noid::Rails.treeify

dunn avatar May 31 '22 21:05 dunn

one error left

Failure/Error:
  Wings::ActiveFedoraConverter::FileMetadataNode(resource.class)
                              .new(file_identifier: Array(resource.file_identifier)
    .map(&:to_s))

ArgumentError:
  You attempted to set the property `has_model' of  to a scalar value. However, this property is declared as being multivalued.
./vendor/bundle/ruby/2.7.0/gems/active-fedora-13.2.7/lib/active_fedora/attributes/property_builder.rb:20:in `has_model='
./vendor/bundle/ruby/2.7.0/gems/active-fedora-13.2.7/lib/active_fedora/core.rb:111:in `assert_content_model'
./vendor/bundle/ruby/2.7.0/gems/active-fedora-13.2.7/lib/active_fedora/core.rb:38:in `initialize'
./lib/wings/active_fedora_converter/instance_builder.rb:34:in `new'
./lib/wings/active_fedora_converter/instance_builder.rb:34:in `build'
./lib/wings/active_fedora_converter.rb:105:in `instance'
./lib/wings/active_fedora_converter.rb:69:in `convert'
./lib/wings/valkyrie/resource_factory.rb:41:in `from_resource'
./lib/wings/valkyrie/persister.rb:27:in `save'

dunn avatar Jul 05 '22 23:07 dunn

now hang on a second

From: /usr/local/bundle/gems/active-fedora-13.2.7/lib/active_fedora/core.rb:113 ActiveFedora::Core#assert_content_model:

    110: def assert_content_model
    111:   require 'pry'; binding.pry
    112:
 => 113:   self.has_model = self.class.to_rdf_representation
    114: end

[1] pry(#<Wings(Wings(Hyrax::FileMetadata))>)> self.class.to_rdf_representation
=> "Wings(Wings(Hyrax::FileMetadata))"

dunn avatar Jul 06 '22 01:07 dunn

Rebased on main and got specs passing and files ingesting.

dlpierce avatar Sep 16 '22 16:09 dlpierce

@dlpierce your additions all look good!

dunn avatar Oct 21 '22 20:10 dunn