analysispreservation.cern.ch
Add files uploaded during republishing to the Record bucket
Steps to reproduce:
- Create a draft and upload a file.
- Publish it.
- We can see the uploaded file.
- Change back to draft mode and add a new file.
- We can see the newly added file in draft mode.
- Publish the new version: the newly added file is missing.
Expected outcome:
The newly added files should be there in the published version.
Dump
My plan is to try both ways, as we discussed earlier:
- create a new bucket every time the record is published.
- use the same bucket, but copy the file instances into the previous bucket with some distinction (like a revision number or version number).
--
I tried both of these approaches, and neither seems to work.
For 1, the record is already linked to a bucket when it is first created, so if I create a new bucket when republishing after editing, I can't link it to the record because of AttributeError: 'CAPRecord' object has no attribute '_sa_instance_state'
For 2, I tried to copy the new files into the previous bucket, but this fails: if the bucket already has files, the new one is not added and the code hits raise RuntimeError('Can not update existing files.')
--
I did not manage to get this working, but I have some questions, as I am trying to understand what the expected behaviour should be.
- What is the expected behaviour?
- In `publish_edited`, we create a new class (Record) instance and update the data in it. [Since we snapshot a bucket and add files to the record during the first publishing, this does not happen when publishing the edited version, and there is already a bucket associated with the record holding the previous content.]
- Ideally, as I understand it, when we put a record in draft mode we currently don't index it either, but why not? [It has new (updated) metadata and files, so it should be indexed and appear as another record; that said, we can have a relationship between record versions.]
- What I propose is having an RFC for this; whatever we decide there should be the expected behaviour.
The expected behavior:
- Create draft - `b1 = f1v1 + f2v1`
- Publish - `b2 = f1v1 + f2v1`
- Record to Draft mode - `b1 = f1v2 + f2v1 + f3v1`
- Publish edited - `b2 = f1v2 + f2v1 + f3v1`
The fourth step is not happening right now. The steps to make it happen are as follows:
- Unlock the bucket `b2` before publishing.
- Add the files not present in the bucket `b2`.
- In the end, `b1` and `b2` should have the same file objects.
The expected behavior happens after the addition of PR #2806.
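
To make the unlock-and-sync steps concrete, here is a minimal sketch in plain Python. It is not the code from PR #2806 and does not use the real bucket API: the `Bucket` class, its `locked` flag, and `sync_published_bucket` are hypothetical stand-ins, and a file is modelled only as a key mapped to a version id.

```python
from dataclasses import dataclass, field


@dataclass
class Bucket:
    """Hypothetical stand-in for a file bucket: maps file key -> version id."""
    files: dict = field(default_factory=dict)
    locked: bool = False

    def put(self, key, version):
        # A locked bucket rejects writes, like a published record's bucket.
        if self.locked:
            raise RuntimeError("bucket is locked")
        self.files[key] = version


def sync_published_bucket(draft, published):
    """Copy new or changed files from the draft bucket into the published one."""
    published.locked = False  # unlock b2 before publishing
    try:
        for key, version in draft.files.items():
            if published.files.get(key) != version:
                published.put(key, version)  # add what b2 is missing
    finally:
        published.locked = True  # re-lock so the published bucket stays read-only
    return published


# State before "Publish edited": b1 was edited in draft mode, b2 is stale.
b1 = Bucket({"f1": "v2", "f2": "v1", "f3": "v1"})
b2 = Bucket({"f1": "v1", "f2": "v1"}, locked=True)
sync_published_bucket(b1, b2)
```

After the call, `b2` holds the same key-to-version mapping as `b1`. The sketch only adds and updates files; whether files deleted in draft mode should also disappear from `b2` is not covered here and would need to be decided separately.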
Next steps:
- [x] Check with a script whether the output (number of affected records) remains the same.
- [ ] Make a script to update the affected records.
- [x] Add the tests in #2806
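
As a rough illustration of the counting script mentioned in the list above, the sketch below assumes that "affected" means a published record whose bucket lacks a file (or file version) present in its draft bucket. The input layout, the record ids, and the `find_affected` name are all hypothetical; a real script would read these mappings from the database.

```python
def find_affected(records):
    """Return ids of records whose published bucket lacks files the draft has.

    `records` maps a record id to a (draft_files, published_files) pair,
    each a dict of file key -> version id. This layout is an assumption.
    """
    affected = []
    for record_id, (draft_files, published_files) in records.items():
        missing = {
            key: version
            for key, version in draft_files.items()
            if published_files.get(key) != version
        }
        if missing:
            affected.append(record_id)
    return sorted(affected)


records = {
    "rec-1": ({"f1": "v1"}, {"f1": "v1"}),              # in sync
    "rec-2": ({"f1": "v2", "f3": "v1"}, {"f1": "v1"}),  # republished without new files
}
affected = find_affected(records)
print(len(affected), affected)
```

Running the same check before and after a fix gives a simple way to compare the number of affected records.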