Ability for squirrels to change project images via script
One of the most common tasks a squirrel does is replace proofing images, or add, delete or replace illustration (non-proofing) images. Sometimes just a handful, but often in bulk. Currently, this must be done via command line.
After discussing the idea with @cpeel , it seems that the best place to delete non-proofing images would be to extend the existing functionality in tools/proofers/images_index.php to allow squirrels to delete non-proofing images either selected or in bulk in any project state (obviously other than deleted or archived). Or to replace images of the same name. (This should be restricted just to the non-proofing images.)
The other functionality that is needed is the ability to replace existing proofing images in bulk. This could mean replacing all proofing images, or just a partial replacement, but more than is easily replaced using handle_bad_page one at a time. This can probably also be done by adding admin functionality to an existing script, but I'm not sure where.
This can probably also be done by adding admin functionality to an existing script, but I'm not sure where.
Yeah, there's not a great existing place for these to go, just a handful of not-great-but-maybe-not-terrible places:
add_files.phpis what does the initial load and does some handling of bulk images, but this also deals a lot with text and I'm dubious that this is the right place.remote_file_manager.phpis perhaps a logical place as it deals with file management, but right now it's all outside of the project directory (by design) and also only deals with individual files.
It is probably worth investigating images_index.php for the illustrations case and seeing if there's an opportunity for the page images case as well.
This can probably also be done by adding admin functionality to an existing script, but I'm not sure where.
Yeah, there's not a great existing place for these to go, just a handful of not-great-but-maybe-not-terrible places:
add_files.phpis what does the initial load and does some handling of bulk images, but this also deals a lot with text and I'm dubious that this is the right place.
Yeah, that's the problem with using add_files.php
remote_file_manager.phpis perhaps a logical place as it deals with file management, but right now it's all outside of the project directory (by design) and also only deals with individual files.
Right, so far, remote_file_manager.php deals just with the dpscans uploads directory.
It is probably worth investigating
images_index.phpfor the illustrations case and seeing if there's an opportunity for the page images case as well.
Sounds reasonable.
This can probably also be done by adding admin functionality to an existing script, but I'm not sure where.
Yeah, there's not a great existing place for these to go, just a handful of not-great-but-maybe-not-terrible places:
add_files.phpis what does the initial load and does some handling of bulk images, but this also deals a lot with text and I'm dubious that this is the right place.
After mulling this over for a while, I'm not sure that add_files.php is necessarily a bad place for this, iff it's reasonable to check what round a project is in, and only allow squirrels access to it if it's past the point where PMs can load projects.
The down-side would be that add_files.php also can add text, though in this case, we don't want it too, so for anything after P1 Unavailable, it would have to be blocked.
It is probably worth investigating
images_index.phpfor the illustrations case and seeing if there's an opportunity for the page images case as well.
Maybe this is the best place, after all. We don't want the option of deleting page images; just replacing them. The only place that the page images can be deleted is from the page details page, because it has to get text and image. So, if the individual replace could be made available for squirrels as a bulk replace for proofing images, and bulk deletion and replacement/addition for illustration images, I think that's what's needed.
I think the proper place for this is add_files.php. This is the current mechanism to add files as well as replace images in bulk for PMs. It makes sense a site_administrator should be able to use this experience to accomplish the same. The steps would be:
- The site_administrator would have to ensure the image file names are matching the images they want to replace.
- Upload a zip of the images to be bulk replaced
- Go to the project
- Fill out the file name in the add/replace text and image textbox.
- Replace the images
I think this is all there already. The only necessary change would be dhowing site administrators the add replace text/image textbox on the project page when appropriate. Do we know what states we want to enable this behavior for (for site administrators only)?
I think this is all there already. The only necessary change would be showing site administrators the add replace text/image textbox on the project page when appropriate. Do we know what states we want to enable this behavior for (for site administrators only)?
It's all there, yes, but I think we want an additional check that the squirrel isn't accidentally trying to replace the text. Even squirrels shouldn't be replacing text in a project that is in the later rounds. As it currently is, when text/image pairs are loaded, the text goes in the master_text slot (OCR). For adding new or replacing existing images, all stages (at least, I don't see a reason to exclude PP/PPV/SR if a request to replace illustration images is made by the PPer, for instance). And, yes, for site-admins only.
I'm cool with expanding add_files.php to be useful outside of initial load. I would love for the functionality in that page to be pulled out into functions we could use (and test!) separately -- specifically use in the API.
PR #968 takes care of the ability for squirrels to delete illustration images at any time in a project's pre-archiving time.
Still to be done: add the ability for squirrels to load replacement images at any time. The current thinking is to modify add_files.php to allow
- squirrels, only, to bulk-replace images, only (not text)[1] at any time
- continue to allow PFs, squirrels and the projects' PMs to replace images and text if the project is in the new project state or in P1 Unavailable.
- continue to deny everyone else the ability to add or replace images or text at any time.
[1: This is why I was wondering whether it would be easier to treat this the same way we do uploads for SR/backing up PP or PPV files; there's nothing that needs to be done in the database at all. All we need to do is allow squirrels to replace any images that are there already, or to add new illustration images. That is, move the images into the project directory. If there's already an image existing, it will be overwritten, if not, the new image will be copied into the project directory. If the squirrel copies a proofing image in where one doesn't already exist, it will correctly go in as a non-page image).]
This can probably also be done by adding admin functionality to an existing script, but I'm not sure where.
Yeah, there's not a great existing place for these to go, just a handful of not-great-but-maybe-not-terrible places:
add_files.phpis what does the initial load and does some handling of bulk images, but this also deals a lot with text and I'm dubious that this is the right place.
remote_file_manager.phpis perhaps a logical place as it deals with file management, but right now it's all outside of the project directory (by design) and also only deals with individual files.It is probably worth investigating
images_index.phpfor the illustrations case and seeing if there's an opportunity for the page images case as well.
Going back to Casey's initial response to the issue description, I've been trying to read through add_files.php off and on, and getting at least a sense of what it does, and I'm less and less convinced that it's the right place for the bulk replacement of images.
Consider what we have now: a project's PM, PFs and squirrels can bulk replace/add text and images in the new project state, and in P1 Unavailable. If the image has a corresponding text file, the two will be paired, and the image identified as a proofing image. If it doesn't have a corresponding text file, it's identified as a non-page image. Either way, the text goes into the database, and the images go into the project directory with the proper permissions and ownership. This, like initial project load, is very properly handled by add_files.php.
Any round beyond P1 Unavailable, we don't allow, and don't want to allow easy bulk replacement of the text in the database.
What we want to do here is only allow squirrels (and squirrels only) to bulk replace proofing images via script, or bulk add/replace illustration (non-page) images. This involves only images. So, I think all that's needed is:
- The usual checks to make sure it's a valid check
- Check that the zip contains only
.pngorjpg. Reject it if it contains anything else (though I think that the others check for and either ignore or delete Mac-specific invisible files?) - Apply the correct permissions and file ownership
- Move images into the project directory.
There's no way the script can or should be able to check to see if it's the correct project. It doesn't need to care whether the images are correct. The squirrel is responsible for doing that. Every image that goes in will be placed properly. If it's replacing an image, it will overwrite it whether it's a page image or a non-page image. If the image doesn't already exist in the project directory, it'll go in as a non-page image.
I have not grokked add_files.php in its entirety, so I don't know if it's still suitable for that functionality, or if the code that's already in add_files.php can be reorganized to make it work. If adding it requires a lot of new code, I would think that it might make more sense to add the functionality to images_index.php, instead. I don't know if it would touch fewer files, though.
Coming at this from the squirrel perspective (and without having looked at any of the code), it feels like add_files.php is currently closer in functionality to what I'd be looking for than images_index.php and add_files.php is what I would currently use, if the project is in a state that lets me do that. I completely agree though that this should only allow adding/replacing images and it should not be possible to affect the text in any way. (Very, very occasionally, I encounter cases where it would be helpful to replace text for multiple pages at once in later rounds but I'm happy that those are special cases that should be handled manually.)
Having said that, if it's easier to to add some of the add_files.php functionality to images_index.php such that something similar to the "Add/Replace text and images from directory or zip file:" appears on the images index (for those with appropriate permission given the project's current state) and allows adding/replacing images, that would be fine with me too.
I agree with srjfoo's comments above (29 November 2023) about what's needed. The bullets describe the process I follow from the command line (other than I move the images and then apply permissions/ownership).
One other case has just come up that would be good to cover: rarely, when fixing a project, jpg proofing images are replaced with png proofing images (even more rarely, but possible, the other way around). This does involve updating the database, so perhaps it would be better as a separate script, but it seemed that what's needed could go into add_files. Or we can continue doing it manually. It doesn't come up very often.
After looking at this I'm pretty sure that the add_files.php script has all of the validation and functionality that is needed, we just need to make it available to SAs in any state and add in some guardrails.
Separately I want to extract the load functionality out of this script so we can use it in the API, but that shouldn't be necessary to address this issue.