creates PREMIS CSV implementation scripts

Open kieranjol opened this issue 8 years ago • 0 comments

Not ready to merge, but sending a pull request for visibility. I added a lot of information to the readme and to the docstrings within the functions, so here's a copy-paste of the docstring output as generated by pydoc:

Help on module premisobjects:

NAME
    premisobjects

FILE
    ifigit/ifiscripts/premisobjects.py

DESCRIPTION
    Creates a somewhat PREMIS compliant CSV file describing objects in a package.
    A separate script will need to be written in order to transform these
    CSV files into XML.
    As the flat CSV structure prevents maintaining some of the complex
    relationships between units, some semantic units have been merged, for example:
    relation_structural_includes is really a combination of the
    relationshipType and relationshipSubType units, which each have the values:
    Structural and Includes respectively.
    
    todo:
    Document identifier assignment for files and IE. Probably in events sheet?
    Allow for derivation to be entered
    Link with events sheet
    Link mediainfo xml in /metadata to the objectCharacteristicsExtension field.
    
    
    Assumptions for now: representation UUID already exists as part of the
    SIP/AIP folder structure. Find a way to supply this, probably via argparse.

FUNCTIONS
    file_description(source, manifest, representation_uuid)
        Generate PREMIS descriptions for items and write to CSV.
    
    find_representation_uuid(source)
        This extracts the representation UUID from a directory name.
        This should be moved to ififuncs as it can be used by other scripts.
    
    get_checksum(manifest, filename)
        Extracts checksum from manifest, rather than generating a fresh one.
    
    intellectual_entity_description()
        Generate PREMIS descriptions for Intellectual Entities and write to CSV.
    
    main()
        Launches all the other functions when run from the command line.
    
    make_skeleton_csv()
        Generates a CSV with PREMIS-esque headings. Currently it's just called
        'cle.csv' but it will probably be called:
        UUID_premisobjects.csv
        and sit in the metadata directory.
    
    representation_description(representation_uuid, item_ids)
        Generate PREMIS descriptions for a representation and write to CSV.


Help on module premiscsv:

NAME
    premiscsv

FILE
    ifigit/ifiscripts/premiscsv.py

DESCRIPTION
    Extracts preservation events from an IFI plain text log file and converts
    them to a CSV using the PREMIS data dictionary.

FUNCTIONS
    find_events(logfile)
        A very hacky attempt to extract the relevant preservation events from our
        log files.
    
    main()
        Launches all the other functions when run from the command line.
    
    make_events_csv()
        Generates a CSV with PREMIS-esque headings. Currently it's just called
        'bla.csv' but it will probably be called:
        UUID_premisevents.csv
        and sit in the metadata directory.


kieranjol, Jul 30 '17 13:07