active_fedora
id should be consistent with to_param for autogenerated pids
This seems odd:
p @asset1.to_param
"87/c9/ad/e0/87c9ade0-c84b-4a0b-9458-a83c2f7ac55b"
p @asset1.id
"/87/c9/ad/e0/87c9ade0-c84b-4a0b-9458-a83c2f7ac55b"
@cbeer @awoods @escowles The ids that fcrepo 4 autogenerates are a real pain to work with in a webapp. Is there any plan to make this subnode structure transparent to the outside world?
There is a branch that does this (creating the hierarchy internally but hiding it):
https://github.com/fcrepo4/fcrepo4/tree/hierarchy
But there are a lot of places in the fcrepo4 code that assume a 1-to-1 mapping between the URL and the underlying Modeshape structure.
Another option would be to build the mapping into AF or LDP, so that fcrepo4 creates the subnode structure but it gets translated back and forth somewhere before the webapp sees it.
Of course, the ideal solution would be improving the performance of Modeshape with a large number of child nodes so we don't have to have the subnode structure at all. @awoods has talked about exploring this, but I don't know if we've filed a Modeshape bug or done any work on that front.
@mohideen is currently working on exploring the ModeShape performance issues in the context of having a large number of child nodes from a single parent node: https://www.pivotaltracker.com/story/show/75459836
As noted, there is a branch that is also exploring the feasibility of hiding an auto-generated hierarchy from the user. I know @cbeer has reservations about such an approach.
The earlier work that @ajs6f did relating to internal/external identifier translation was tickling this issue, but does not support the various places in the code that @escowles mentions where such a translation would be required.
Can you be more specific, @jcoyne, about the struggles you are having?
@awoods The struggle I'm having is mostly cognitive. I want an id to be '123456'. I shouldn't be forced into using an id like /12/34/56/123456 just so the performance isn't terrible.
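To make the mapping under discussion concrete, here is a minimal sketch of translating between a flat id and a hierarchical path at the application level. `PairTreeId` and its methods are hypothetical names used for illustration only; they are not part of ActiveFedora or fcrepo4, and the segment length and depth are assumptions chosen to match the examples in this thread.

```ruby
# Hypothetical helper, NOT part of ActiveFedora or fcrepo4.
# Translates between a flat id and a pair-tree style path.
module PairTreeId
  SEGMENT_LENGTH = 2 # assumed: two characters per path segment

  # Build a hierarchical path from the leading characters of the id.
  # `segments` is the number of intermediate nodes to create.
  def self.to_path(id, segments: 3)
    prefix = id.scan(/.{1,#{SEGMENT_LENGTH}}/).first(segments)
    "/" + (prefix + [id]).join("/")
  end

  # Recover the flat id, which is always the last path segment.
  def self.to_id(path)
    path.split("/").last
  end
end

puts PairTreeId.to_path("123456")
puts PairTreeId.to_id("/12/34/56/123456")
```

With `segments: 4` the same helper reproduces the shape of the autogenerated path shown at the top of this issue. Note that this only hides the hierarchy from the webapp; the repository still stores the object at the long path.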
@jcoyne I completely understand. We've filed a Modeshape bug (https://issues.jboss.org/browse/MODE-2109) and it's actively being worked on. It's targeted at the next release (beta1). Hopefully that will resolve this issue and allow a flat object hierarchy to perform well.
@jcoyne, you may be interested to know that the "auto-generated" hierarchy is a pluggable feature. You can also have F4 create the id without a hierarchy... although if you create a flat structure, performance will suffer until the ModeShape bug is resolved.
See: https://github.com/fcrepo4/fcrepo4/blob/master/fcrepo-webapp/src/main/resources/spring/minter.xml#L8
and a simple alternative: https://github.com/fcrepo4/fcrepo4/blob/master/fcrepo-kernel-impl/src/main/java/org/fcrepo/kernel/impl/identifiers/UUIDPidMinter.java
@awoods Is there a way to configure the minter without modifying the source and rebuilding the war file?
Yes @jcoyne, you can specify a System Property that points to your own "minter.xml" that defines which minter to use: https://github.com/fcrepo4/fcrepo4/blob/master/fcrepo-webapp/src/main/resources/spring/master.xml#L11
Property = "fcrepo.minter.config"
@awoods Actually, I'm not sure that will help. I already have a set of IDs for my objects. I need to map the existing ids into new URIs that distribute them uniformly, then I need to transform all of the RDF assertions that I want to move into Fedora 4, changing them from the old (flat) ids to hierarchical ids. So, while it's a solvable problem, it's a pain to have to do this in order to get acceptable performance.
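The migration described above could be sketched roughly as follows. Everything here is an assumption for illustration: `OLD_BASE` and `NEW_BASE` are placeholder URIs, `hashed_path` and `rewrite_ntriples` are hypothetical helpers, and hashing the id with MD5 is just one way to spread existing (possibly sequential) ids evenly across subnodes. It also assumes the RDF is serialized as N-Triples so that a textual rewrite of URIs is safe.

```ruby
require "digest"

# Placeholder base URIs; substitute the real ones for your data.
OLD_BASE = "http://example.org/objects/"
NEW_BASE = "http://localhost:8080/rest/"

# Derive the intermediate nodes from a hash of the id rather than from
# the id itself, so that sequential ids do not pile up under one parent.
def hashed_path(id, depth: 3)
  prefix = Digest::MD5.hexdigest(id).scan(/../).first(depth).join("/")
  "#{prefix}/#{id}"
end

# Rewrite every URI under OLD_BASE (subject or object position) so that
# it points at the hierarchical location under NEW_BASE.
def rewrite_ntriples(ntriples)
  ntriples.gsub(%r{<#{Regexp.escape(OLD_BASE)}([^>/]+)>}) do
    flat_id = Regexp.last_match(1)
    "<#{NEW_BASE}#{hashed_path(flat_id)}>"
  end
end

triple = "<#{OLD_BASE}123456> <http://purl.org/dc/terms/isPartOf> <#{OLD_BASE}123457> .\n"
puts rewrite_ntriples(triple)
```

Because the path is derived deterministically from the id, the same function can be used later to locate an object from its flat id without keeping a lookup table.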
@awoods That system property is great. Is there a list of settable system properties documented somewhere? When I googled for fcrepo.minter.config I only saw the source and a conversation from IRC match.
@jcoyne: https://wiki.duraspace.org/display/FF/Configuring+an+External+PID+Minter
@jcoyne, I think there is some debate on the relationship between internal F4 identifiers (the path inside the repository at which objects are actually stored) and external F4 identifiers (the path or ID that shows up in the repository URL). Where we stand right now is that the internal and external identifiers are the same. The issue, of course, is that a flat structure performs poorly. One approach is to handle the identifier translation at the Hydra level. The other is to wait for ModeShape to resolve and release fixes for these tickets: https://issues.jboss.org/browse/MODE-2271 and https://issues.jboss.org/browse/MODE-2109
There is also a claim (and I support it) that a strict separation between internal and external identifiers is essential for flexibility and long-term durability. That's why we introduced machinery to that effect (identifier translators and so forth).
@ajs6f, the start of that internal/external identifier translations work holds a lot of promise. What we have since run into is that much of Fedora's interaction with the underlying ModeShape goes through direct ModeShape/JCR calls which do not include any of the id-translation. We likely need a complete abstraction over the JCR objects in order to properly support the separation of internal/external identifiers.
Oh, the hell with it. Let's just get rid of Modeshape.
@awoods, @ajs6f: can we get an update on this issue?
@atz If you mean you would like to have shorter IDs and not have to deal with the PairTree hierarchy, I think the AppleTree module might be useful: https://gitlab.amherst.edu/acdc/acrepo-apple-trees
@awoods I wasn't able to find any docs on how to actually use this, beyond the links from that module's README to some config files. Would it be possible to add a profile to webapp-plus to make it easy to enable this?
@escowles : The apple-trees dependency is available in the current master branch of fcrepo-webapp-plus as of: https://github.com/fcrepo4-exts/fcrepo-webapp-plus/commit/49c8d14747b6db4659e48a4a6046efc21aba3647
Configuration of that component is still required in order to enable its use.
There are caveats related to apple-trees that include the inability to use COPY and the fact that existing repository resources would need to be migrated to new paths.
Although this capability shows promise along the lines of this GitHub issue, it would require stakeholder interest and testing to move it forward.