Ross Spencer

Results 135 issues of Ross Spencer

We used to be able to output DROID CSV again from the SQLite DB... do we re-add this functionality? One idea might be to create a dump per namespace. See: https://github.com/exponential-decay/droid-sqlite-analysis/blob/master/libs/ExportDBClass.py

question
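A minimal sketch of the "dump per namespace" idea, using only the Python standard library. The table name `iddata` and the column name `namespace` are hypothetical placeholders, not the actual schema, and the output is a plain CSV of the table rows rather than a faithful DROID CSV layout.

```python
import csv
import io
import sqlite3


def dump_per_namespace(conn, table="iddata", ns_column="namespace"):
    """Return a dict mapping each namespace value to a CSV dump of its rows.

    `table` and `ns_column` are assumed names for illustration only. They are
    interpolated into the SQL as identifiers, so they must come from trusted
    code, never from user input.
    """
    cursor = conn.cursor()
    cursor.execute(f'SELECT DISTINCT "{ns_column}" FROM "{table}" ORDER BY 1')
    namespaces = [row[0] for row in cursor.fetchall()]

    dumps = {}
    for namespace in namespaces:
        cursor.execute(
            f'SELECT * FROM "{table}" WHERE "{ns_column}" = ?', (namespace,)
        )
        # Column names come from the cursor, so the header follows the table.
        header = [column[0] for column in cursor.description]
        buffer = io.StringIO()
        writer = csv.writer(buffer, lineterminator="\n")
        writer.writerow(header)
        writer.writerows(cursor.fetchall())
        dumps[namespace] = buffer.getvalue()
    return dumps
```

Each value in the returned dict could then be written to its own file, one per namespace.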

* https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-citation-files

Consider an exception to handle an export that is not identified, or at least handle that output more gracefully. If an export isn't identified, that's the most important piece of info.

It looks like there are a couple of remaining issues using Siegfried data. Possibly restricted to Windows? 1. Number of folders is incorrect in the report below, there are many...

bug

Some CSVs can still interrupt processing. We handle this now with an error log, but we may want to find a way of proactively identifying the issue when reading the...

bug
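One possible way to spot a troublesome CSV up front is a quick pre-scan before the real import. The sketch below assumes the problem shows up as either a parser error or a row whose field count differs from the header. The issue text does not say what the problematic CSVs actually look like, so treat both checks as guesses, and note that some exports may legitimately contain rows wider than the header. The function name `find_malformed_rows` is made up for this example.

```python
import csv


def find_malformed_rows(handle):
    """Return a list of (line_number, message) for suspicious CSV rows.

    `handle` is any iterable of text lines, such as an open file.
    Scanning stops at the first hard parser error.
    """
    reader = csv.reader(handle)
    try:
        header = next(reader)
    except StopIteration:
        return [(0, "empty file")]

    expected = len(header)
    problems = []
    try:
        for row in reader:
            if len(row) != expected:
                problems.append(
                    (
                        reader.line_num,
                        f"expected {expected} fields, found {len(row)}",
                    )
                )
    except csv.Error as err:
        # The csv module could not parse this line at all.
        problems.append((reader.line_num, f"parse error: {err}"))
    return problems
```

An empty result means the pre-scan found nothing to report; it does not guarantee that later processing will succeed.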

Question from Tyler on Twitter - can we detect resource forks better in these reports? How does DROID/SF report on them? Follow up again with Tyler's work on this and...

Given an 8-million-line SF YAML (a 631,286-row database), [`create_id_breakdown`](https://github.com/exponential-decay/demystify/blob/3452ac7c77e8b67420708a84f331968b6d91f9b4/libs/DemystifyAnalysisClass.py#L326-L335) is taking too long. It is largely unoptimized and not brilliantly written. Any rewrite I believe should bring pretty...

CUL blog with nice visualizations over a lot of data: https://digitalpreservation-blog.lib.cam.ac.uk/identification-and-analysis-of-our-research-repository-file-formats-using-droid-fbb0d7d86222 While demystify won't replicate Grafana - some more tutorials around this and how to use the database could be...

Audit will output all queries used in an analysis and their timings. Example output: https://gist.github.com/ross-spencer/1f6cf16c9f966b980db277f7a932d81b Connected to #75

In-progress
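As a rough illustration of what an audit of queries and timings could look like, the sketch below wraps a `sqlite3` connection and records each SQL string with its elapsed time. The `AuditedConnection` class and its methods are hypothetical names for this example. They are not taken from the project, and the report format does not try to match the linked example output.

```python
import sqlite3
import time


class AuditedConnection:
    """Hypothetical wrapper that logs every query and how long it took."""

    def __init__(self, conn):
        self.conn = conn
        self.log = []  # list of (sql, elapsed_seconds)

    def query(self, sql, params=()):
        """Run a query, fetch all rows, and record the elapsed time."""
        start = time.perf_counter()
        rows = self.conn.execute(sql, params).fetchall()
        elapsed = time.perf_counter() - start
        self.log.append((sql, elapsed))
        return rows

    def report(self):
        """Return one line per logged query: elapsed seconds, then the SQL."""
        return "\n".join(f"{elapsed:.6f}s  {sql}" for sql, elapsed in self.log)
```

Because the timing covers `execute` plus `fetchall`, it measures the full cost of retrieving the rows, which is usually what matters when hunting for a slow step in an analysis.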