moodle-tool_objectfs
moodle-tool_objectfs copied to clipboard
unlink doesn't seem to delete files in S3 compatible storages
We are using GCP S3 compatible interface and files are not being deleted.
On https://github.com/catalyst/moodle-tool_objectfs/blob/master/classes/local/store/s3/client.php#L173 unlink is used to delete the file for S3 storage:
/**
* Deletes a file in S3 storage.
*
* @path string full path to S3 file.
*/
public function delete_file($fullpath) {
unlink($fullpath);
}
On https://github.com/catalyst/moodle-tool_objectfs/blob/master/classes/local/store/azure/client.php#L130 the azure client library call is used to delete the file for Azure storage:
/**
* Deletes file (blob) in azure blob storage.
*
* @param string $fullpath path to azure blob.
*/
public function delete_file($fullpath) {
$container = $this->container;
$relativepath = $this->get_relative_path_from_fullpath($fullpath);
$this->client->deleteBlob($container, $relativepath);
}
Shouldn't the function delete_file for S3 storage use the S3 library call to delete the file? This is used when testing deletion permissions in https://github.com/catalyst/moodle-tool_objectfs/blob/master/classes/local/store/s3/client.php#L276
if ($testdelete) {
try {
$result = $this->client->deleteObject(array('Bucket' => $this->bucket, 'Key' => 'permissions_check_file'));
$permissions->messages[get_string('settings:deletesuccess', 'tool_objectfs')] = 'warning';
$permissions->success = false;
} catch (\Aws\S3\Exception\S3Exception $e) {
$errorcode = $e->getAwsErrorCode();
// Something else went wrong.
if ($errorcode !== 'AccessDenied') {
$details = $this->get_exception_details($e);
$permissions->messages[get_string('settings:deleteerror', 'tool_objectfs') . $details] = 'notifyproblem';
$permissions->success = false;
}
}
}
So I think delete_file for S3 storage should be:
/**
* Deletes a file in S3 storage.
*
* @path string full path to S3 file.
*/
public function delete_file($fullpath) {
$this->client->deleteObject(array('Bucket' => $this->bucket, 'Key' => $fullpath));
}
We have intentionally not fully implemented deletes in S3 Buckets, It was deemed dangerous due to us sharing S3 buckets across environments And to often people want things back after deleting them, and finally object storage being so cheap we we haven't historically saved a lot of money from deleting from S3. Having said all that, there is provision in README.md for deleting S3 files and your changes would need to respect this flag if you want to send through a merge request. CFG->tool_objectfs_delete_externally = 1;
It not about the money, it is about GDPR, and not retaining unnecessary copies of the files after they have been deleted by the user as the law mandates.
The current code (https://github.com/catalyst/moodle-tool_objectfs/blob/master/classes/local/store/s3/client.php#L173) calls unlink (delete) without checking tool_objectfs_delete_externally. If I were to check tool_objectfs_delete_externally then I will be changing the behaviour of the code.
What I'm proposing is to use the S3 native API for deletion rather than the PHP unlink function, the same as you do with the Azure library (https://github.com/catalyst/moodle-tool_objectfs/blob/master/classes/local/store/azure/client.php#L126). The Azure library doesn't call unlink but uses the native Azure API.
Ok, the delete_file function is a red-herring as what should be happening is a copy to trash and then delete of the original (using the StreamWrapper's rename).
We have:
$CFG->alternative_file_system_class = '\tool_objectfs\s3_file_system';`
$CFG->forced_plugin_settings['tool_objectfs'] = [
'filesystem' => '\tool_objectfs\s3_file_system',
's3_key' => 'redacted',
's3_secret' => 'redacted',
's3_bucket' => 'bucketname',
's3_region' => 'europe-west2',
's3_base_url' => 'https://storage.googleapis.com',
];
$CFG->tool_objectfs_delete_externally = 1;
What works
- uploaded file gets created in
/opt/moodle/data/filedir/XX/YY/XXYY...local file - when push_objects_to_storage task runs, file gets copied to bucket in
/XX/YY/XXYY...object - when delete_local_objects task runs, file get deleted from
/opt/moodle/data/filedir/...local folder (folder also cleaned up if only file in it) - if signed-urls enabled, browsers successfully gets a 303 to bucket
What should happen (I believe)
- use Moodle's file 'popup' to delete file
- browser POSTS to
/repository/draftfiles_ajax.php?action=delete - /repository/draftfiles_ajax.php uses:
$fs = get_file_storage()$stored_file = $fs->get_file(...)$stored_file->delete()
- /lib/filestorage/stored_file.php calls:
$this->filesystem->remove_file(...)
- objectfs /classes/local/store/object_file_system.php calls:
$this->delete_object_from_hash(...)- which looks at where the file is (should be OBJECT_LOCATION_EXTERNAL) so then call
$this->move_external_file_to_trashdir_from_hash(...) - which works out the trash path (presumably
s3://bucket/trash/XX/YY/XXYY...) and calls$this->rename_external_file(...) - which calls
$this->externalclient->rename_file(...)
- object fs /classes/local/store/s3/client.php uses:
- PHP built-in
renamefunction but should have already registered the StreamWrapper from moodle-local_aws plugin
- PHP built-in
- aws plugin StreamWrapper rename() function, SHOULD:
- copy file to trash directory with
$this->getClient()->copy(...) - then delete with
$this->getClient()->deleteObject(...)
- copy file to trash directory with
Working out exactly where in this chain the actual failure is happening is going tricky (especially as this is running in containers in GCP).
The browser's request to draftfiles_ajax.php does get a success response.