Support indexing of EAD3
Discussed on the 31 July call. Currently, the ArcLight indexer is tightly coupled to EAD 2002 specifically. To take advantage of what EAD3 offers in the future, our indexer should support it. #528 and #529 are probably prereqs for this ticket.
@anarchivist, as discussed on the 13 August call, one target might be support for ArchivesSpace-generated EAD3.
The current ArchivesSpace to EAD3 export mappings are defined in an EAD Import / Export mapping spreadsheet available here: https://archivesspace.org/using-archivesspace/migration-tools-and-data-mapping. The EAD3 exporter as currently implemented in AS may not be identical to what's described in the mapping document, but it should be very close.
Trevor Thornton (NCSU) and I worked on developing the spec and building this ASpace EAD3 exporter a few years back. Trevor is probably most familiar with the exporter code. The goal was to build an MVP EAD3 exporter so we could say ASpace supports EAD3. There are many areas where the EAD3 exported from ASpace doesn't support the full expressiveness of the EAD3 schema (e.g. granularity of encoding for dates, extents, name headings, etc.). In these areas, the ASpace-generated EAD3 is very similar to EAD2002.
Duke is not currently using ASpace-generated EAD3, but would like to if ArcLight could support it. There are a few institutions using ASpace EAD3 in the wild (Yale definitely...)
OK - thanks @noahgh221; let's discuss on an upcoming PO call.
Chatting with @mejackreed about this - @noahgh221, could you and/or @seanaery get some fixtures to help move this forward? Even though this is relatively low priority, could we create a spike once the Traject spike (#558) to address #528 is done?
@anarchivist, @seanaery: adding a few EAD2002 and EAD3 files for the same collections to the fixture data Google folder now. I'll try to include EADs with digital objects.
Thanks @noahgh221 - I've updated the fixture spreadsheet - once you're back could you fill out any additional details? I took a pass, but I figure you know your description better than I do.
@anarchivist, added a few more notes to the spreadsheet and one more file. Going hunting for an EAD with 1000+ sibling components. I'm sure we have this. Lemme know if there are other fixture needs.
I think focusing on better ASpace integration should be a higher priority than EAD3. I feel like it's unlikely that there are many real-world institutions that use EAD3 that are not also using ASpace to generate that EAD3.
In response to gwiedeman - I would probably agree were it not for the fact that at CAO, we work with a large institution that has chosen not to allow us to harvest from its many ASpace repositories. Instead, they manage their own export of EAD2002 files and transfer them to our ArcLight data directories. A wrinkle here is that they already export all their resources in EAD3, for backup, into a harvestable location that we could use instead. So, to work with us, they have to export their many thousands of resource records twice. While ours may be an outlier use case, it does raise the point that if EAD is the portable data format, and EAD3 is the latest version of that format, then maybe ArcLight should still consider a config that allows for indexing of EAD3.
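For what it's worth, a config like that would need some way to tell which flavor of EAD a file is before picking the indexing rules. Here is a rough sketch of one way to do it, in Python just for illustration (this is not ArcLight code, and `detect_ead_version` is a made-up name). It assumes the files declare the schema namespace on the root `<ead>` element. DTD-based EAD2002 files with no namespace would come back as "unknown" and need separate handling.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by the two schema versions.
EAD2002_NS = "urn:isbn:1-931666-22-9"
EAD3_NS = "http://ead3.archivists.org/schema/"


def detect_ead_version(xml_text: str) -> str:
    """Hypothetical helper: return 'ead3', 'ead2002', or 'unknown'
    based on the namespace of the root element."""
    root = ET.fromstring(xml_text)
    # ElementTree renders a namespaced tag as '{namespace}localname'.
    if root.tag.startswith("{"):
        namespace, local_name = root.tag[1:].split("}", 1)
    else:
        namespace, local_name = "", root.tag
    if local_name != "ead":
        return "unknown"
    if namespace == EAD3_NS:
        return "ead3"
    if namespace == EAD2002_NS:
        return "ead2002"
    # No namespace (e.g. DTD-based EAD2002) or an unrecognized one.
    return "unknown"
```

An indexer could use a check like this to choose between an EAD2002 and an EAD3 set of field mappings per file, rather than requiring a whole data directory to be one version.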