mik icon indicating copy to clipboard operation
mik copied to clipboard

Move 'include_migrated_from_uri' config option from metadata parser class to a metadata manipulator

Open mjordan opened this issue 9 years ago • 23 comments

include_migrated_from_uri setting in CONTENTdm to MODS metadata parser uses a hard-coded hostname, and the form of the MODS element created is specific to SFU's requirements. We have a pattern already for adding identifiers via metadata manipulators in the AddUuidToMods. We should create a metadata manuiplator that adds the migrated from URL.

mjordan avatar Feb 29 '16 15:02 mjordan

@MarcusBarnes I can probably find time to take this on, over the long weekend at the latest.

mjordan avatar May 17 '16 20:05 mjordan

@mjordan Great - that would be helpful. Thank you.

MarcusBarnes avatar May 17 '16 21:05 MarcusBarnes

@MarcusBarnes can you test branch issue-176? You will need to run composer dump-autoload as I have added a new class file. If you run it on newspapers, please make sure that both the issue and page level MODS contain the migrated from URL.

Wiki page for this new manipulator is at https://github.com/MarcusBarnes/mik/wiki/Metadata-manipulator:-AddCdmMigratedFromUrl.

mjordan avatar May 18 '16 14:05 mjordan

@mjordan Awesome - I will test.

MarcusBarnes avatar May 18 '16 17:05 MarcusBarnes

@mjordan My test on a newspaper has the migrated from URL appearing in the issue level MODS, but not the page level MODS.

MarcusBarnes avatar May 18 '16 19:05 MarcusBarnes

OK, thanks for testing, I'll take a look.

mjordan avatar May 18 '16 19:05 mjordan

@mjordan Since the page level MODS is not generated using the mapping, we'll have to conditionally check for the specific manipulator and add some code like that for adding the UUID here:

https://github.com/MarcusBarnes/mik/blob/master/src/metadataparsers/mods/CdmToMods.php#L246

MarcusBarnes avatar May 18 '16 19:05 MarcusBarnes

OK, I hadn't notice that. Thanks for pointing it out.

mjordan avatar May 18 '16 23:05 mjordan

Perhaps we can use the InsertXmlFromTemplate to add this, using a template like:

<identifier type='uri' displayLabel='Migrated from'>http://example.com/item/{{ somefield }}</identifier>

We could document this in a cookbook entry.

mjordan avatar Nov 10 '16 04:11 mjordan

@mjordan Thanks. Are you able to test your suggested approach?

MarcusBarnes avatar Nov 10 '16 14:11 MarcusBarnes

Yup, just tested and works as expected. Should I document how to use the InsertXmlFromTemplate manipulator for this, and then remove the hard-wired code in the Cdm* classes?

mjordan avatar Nov 10 '16 14:11 mjordan

@mjordan Yes please. This is a better approach.

MarcusBarnes avatar Nov 10 '16 14:11 MarcusBarnes

Will do. I only have to modify the CdmToMods.php metadata parser. If you don't mind, I'll delete the issue-176 branch and cut a new one to do this work in.

mjordan avatar Nov 10 '16 14:11 mjordan

@mjordan That's fine. Thank you.

MarcusBarnes avatar Nov 10 '16 14:11 MarcusBarnes

We still have the problem of applying manipulators to page-level metadata. I'll see if I can address that as well in this branch.

mjordan avatar Nov 10 '16 15:11 mjordan

@jonathangreen and @MarcusBarnes, looking at this, I'd like to recommend that we create a new metadatamanipulator that uses a Twig template to add the XML fragment to the MODS in the CDM toolchains for the "migrated from" value. The template can take two values, the collection alias and the object or page pointer, both of which are available in the function where the respective MODS is generated. The same template could be used for both the parent and page level fragments. A typical template would look like this:

<identifier type="uri" invalid="yes" displayLabel="Migrated From">http://contentdm.org.net/cdm/ref/collection/{{ collection_alias }}/id/{{ pointer }}</identifier>

This is not as generalized as the InsertXmlFromTemplate manipulator, but I don't think that is suitable for page-level MODS since within createPageLevelModsXML() only $page_pointer and $page_title are available. We already check specifically for the AddUuidToMods manipulator so checking for one more isn't the worst thing we can do. I'd personally like to see more flexibility here for generating custom page-level MODS but without good use cases I'd rather not build that in. If we had some good use cases, however, we should consider doing that.

I'll note that at the time we wrote the CdmToMods metadata parser, we hadn't introduced Twig into MIK. Now that it's there, adding another Twig-based manipulator isn't that hard.

In the .ini file, we'd deprecate include_migrated_from_uri and instead register the new metadata manipulator like any other, in the [MANIPULATORS] section like

metadatamanipulators[] = "AddContentdmSourceUrl|null4|/tmp/twigtemplates/migrated_from.xml"

Thoughts?

mjordan avatar May 01 '18 03:05 mjordan

Seems very reasonable to me!

jonathangreen avatar May 01 '18 12:05 jonathangreen

I'm not sure how many people use the 'include_migrated_from_uri', but just in case, we should tag a point release before merging the result of work on this. That way, if they need to, they can get a copy of the tagged release so that people who are currently using that in a workflow can still use their current workflow until they have the chance to modify their workflows to use the new suggested metadatamanipulator.

MarcusBarnes avatar May 01 '18 13:05 MarcusBarnes

Sounds good. @MarcusBarnes if you want to cut a point release, I can work on this.

mjordan avatar May 01 '18 14:05 mjordan

I've drafted a point release (visible to MIK maintainers) at https://github.com/MarcusBarnes/mik/releases/tag/untagged-50ba6f22dabb99534a02. Once we have a pull-request that addresses this issue, we can publish the point v0.9.1 point release and merge the reviewed pull-request.

MarcusBarnes avatar May 01 '18 18:05 MarcusBarnes

Making progress on this, but the new template XML is overwriting other <identifier> elements. Will continue.

This work is making me remember that the CdmBooks toolchain is probably the least polished of our toolchains. I'm going to do a wee bit of cleanup while I'm in there if that's OK.

mjordan avatar May 02 '18 13:05 mjordan

Thank you for the update and thank you in advance for your work on this.

MarcusBarnes avatar May 02 '18 14:05 MarcusBarnes

Update: the new metadatamanipluator works, but it interferes with other metadatamanipulators. Now all I have to do is figure out why it's doing that and make it stop :face_with_head_bandage:

mjordan avatar May 03 '18 02:05 mjordan