qpixel
Automatically clean up tags
meta:278030
Automatically clean up tags with zero posts when they have no excerpt and no wiki. Currently such tags pollute the tag list and require a moderator to delete them.
Suggestions for Implementation
1. Add a script with which tags can be cleaned up.
   - We could schedule this to run daily so that the cleanup repeats.
2. Add an after-commit callback on posts so that, when the tags on a post are altered, the removed tags are checked and cleaned up if eligible.
- Option 1 has the advantage of being simple to implement, and will be necessary for the initial cleanup even if option 2 is chosen.
- Option 2 has the advantage that it does not depend on server setup/configuration and should ensure that tags are cleaned up as soon as they become unused.
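For the script approach (the first suggestion), here is a rough plain-Ruby sketch of the selection logic only. `TagRecord`, its fields and `cleanup_tags` are hypothetical stand-ins I made up for illustration; the real script would work on the `Tag` model and its associations instead.

```ruby
# Plain-Ruby model of the batch cleanup. TagRecord and its fields are
# hypothetical stand-ins for the real Tag model, not QPixel code.
TagRecord = Struct.new(:name, :excerpt, :wiki, :post_count, :child_count, keyword_init: true) do
  # Eligible when there is no description, no children and no posts.
  def cleanable?
    excerpt.to_s.strip.empty? && wiki.to_s.strip.empty? &&
      child_count.zero? && post_count.zero?
  end
end

# Splits the tags into [kept, removed] without mutating the input.
def cleanup_tags(tags)
  removed, kept = tags.partition(&:cleanable?)
  [kept, removed]
end

tags = [
  TagRecord.new(name: 'typo-tag', excerpt: nil, wiki: '', post_count: 0, child_count: 0),
  TagRecord.new(name: 'ruby', excerpt: 'The Ruby language', wiki: nil, post_count: 12, child_count: 0),
  TagRecord.new(name: 'languages', excerpt: nil, wiki: nil, post_count: 0, child_count: 3)
]

kept, removed = cleanup_tags(tags)
puts "removed: #{removed.map(&:name).join(', ')}"
puts "kept: #{kept.map(&:name).join(', ')}"
```

Because the function only partitions the list, it is safe to rerun on a schedule: a second pass over the kept tags removes nothing new unless something changed in between.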
I think the implementation of 2 could be done like this:
app/models/post.rb:
  # Run after commit so that the removed tags no longer show this post among their associated posts
  after_commit :cleanup_tags, on: :update, if: :saved_change_to_tags_cache?
  # TODO: Technically destroy is possible (through console); it should also trigger a cleanup, but needs a different method because all the associated tags are then eligible.
  ...
  def cleanup_tags
    # saved_change_to_tags_cache returns [old_value, new_value]
    old_tags, new_tags = saved_change_to_tags_cache
    removed_tags = (old_tags || []) - (new_tags || [])
    Tag.where(name: removed_tags).select(&:can_be_cleaned_up?).each(&:destroy)
  end
app/models/tag.rb:
  # A tag can be cleaned up if it has no detailed description (wiki/excerpt), no children and no associated posts.
  def can_be_cleaned_up?
    excerpt.blank? && wiki.blank? && wiki_markdown.blank? && children.blank? && posts.blank?
  end
But this still needs to be tested to confirm that it works correctly.
Good idea. We've been letting tags do their own thing for a while and it's piling up.
One problem is that some tags are present in history, for example if an incorrect tag was initially created and added to a post, then replaced with the correct one by a moderator. In that case, there is a history event recording that the specific tag was removed, so the tag can no longer be safely deleted.
If we want to support cleaning up those tags we should consider changing how we store tags in the history (perhaps also by name), such that the diffs remain intact.
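To illustrate what storing tags "by name" in history could look like, here is a small plain-Ruby sketch. `HistoryEntry` and its fields are hypothetical and are not how post history is stored today; the point is only that a diff computed from name snapshots does not depend on the tag records still existing.

```ruby
# Hypothetical history entry that snapshots tag names instead of referencing
# tag records, so the diff survives deletion of the tags themselves.
HistoryEntry = Struct.new(:before_tags, :after_tags, keyword_init: true) do
  # Names present after the edit but not before it.
  def added
    after_tags - before_tags
  end

  # Names present before the edit but not after it.
  def removed
    before_tags - after_tags
  end
end

entry = HistoryEntry.new(before_tags: %w[rubby rails], after_tags: %w[ruby rails])
live_tags = %w[ruby rails] # 'rubby' has since been cleaned up

puts "added: #{entry.added.inspect}"
puts "removed: #{entry.removed.inspect}"
puts "removed and no longer in the tag list: #{(entry.removed - live_tags).inspect}"
```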
Would this problem be avoided if deleted tags were still present in the database but marked as "deleted"? Then they could be available for display in the history, but not available for use.
If a user creates a new tag with that name, we could undelete the old one rather than creating a new entry.
Does this cause any problems?
@trichoplax That's also a good idea. Perhaps we should call it hide/hidden and automatically unhide the tag when it is added to a post.
I'm a fan of soft-deletes as opposed to hard-deletes in general, and this sounds like it would solve some problems. This would allow us to show soft-deleted tags in history with some special formatting to indicate status, as opposed to having things just disappear.
Currently an admin can delete a tag and it disappears from the history of all posts it was ever present on. If we introduce soft deletion for this clean up issue, can we also introduce a soft deletion option for admins at the same time? That would include deletion of a tag that is still in use (has posts still tagged with it).
There may be rare cases where an admin does want to remove a tag from showing in the edit history of all posts, but I would expect that in such cases the admin would probably also want to prevent the tag from being recreated the next time someone makes a new post and tries to tag it. So there doesn't seem to be a reason to keep the current approach, which is the worst of both: it loses the history but also doesn't prevent recreation of the tag.
This reworking of the admin tag deletion may be better as a separate issue, but it's worth bearing it in mind during work on this issue in case it affects which approach is decided on.
Do we need to tidy up tags that just happen to have zero posts? Could these simply be hidden in the tags list page? Perhaps with a check box to include tags with zero posts?
We could also make the tag suggestions during creating/editing a post show only tags with 1 or more existing posts.
Tags which happen to drop to zero posts would then appear to be gone to most users, but would still be available for admins to apply special measures to, such as making them synonyms so they don't get used in future, or marking them as not to be used (when that feature is introduced).
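A rough sketch of that filtering in plain Ruby, to make the idea concrete. `ListedTag`, `visible_tags`, `suggest_tags` and the `include_empty` option are names invented for this example, not existing QPixel code.

```ruby
# Hypothetical minimal tag record: a name plus the number of posts using it.
ListedTag = Struct.new(:name, :post_count, keyword_init: true)

# Tags shown on the tag list page; zero-post tags appear only when requested
# (for example via a check box on the page).
def visible_tags(tags, include_empty: false)
  include_empty ? tags : tags.select { |tag| tag.post_count.positive? }
end

# Suggestions while creating/editing a post: prefix match over used tags only.
def suggest_tags(tags, prefix)
  visible_tags(tags).select { |tag| tag.name.start_with?(prefix) }.map(&:name)
end

all_tags = [
  ListedTag.new(name: 'ruby', post_count: 12),
  ListedTag.new(name: 'rubby', post_count: 0),
  ListedTag.new(name: 'rails', post_count: 4)
]

puts "default list: #{visible_tags(all_tags).map(&:name).inspect}"
puts "with empty tags: #{visible_tags(all_tags, include_empty: true).map(&:name).inspect}"
puts "suggestions for 'ru': #{suggest_tags(all_tags, 'ru').inspect}"
```

Nothing is deleted here, so the zero-post tags stay available for admins to turn into synonyms or otherwise handle.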
> Currently an admin can delete a tag and it disappears from the history of all posts it was ever present on. If we introduce soft deletion for this clean up issue, can we also introduce a soft deletion option for admins at the same time? That would include deletion of a tag that is still in use (has posts still tagged with it).
Soft deletes for all tag deletions would be great. If that's too much to do here then we can spin it off, but I'd like "delete" to mean "soft delete" no matter how it happens, unless an admin takes special steps.
> Do we need to tidy up tags that just happen to have zero posts? Could these simply be hidden in the tags list page? Perhaps with a check box to include tags with zero posts?
A tag that has zero posts but has children should still show up -- it exists to provide structure to a hierarchy, its description might be meaningful, and people might want to follow it and its children.
Hiding zero-use child tags by default (with a way to see them anyway) seems reasonable, in the flat list. Again, for the hierarchy view I think we'd want to show them anyway.
This might be getting complicated.
Ah. I can see now why this is more complicated than I expected.
If we take this complicated approach (some zero-post tags are hidden in some circumstances), would that remove the need for automatically cleaning up (soft deleting) tags?
Now that I've reread the entire thread and not just the new comments I got email about...
Proposed amendment: Automatically soft-delete tags with zero posts when they have no excerpt, no wiki, and no children or parent (are not part of a hierarchy). If the same tag gets recreated, undelete it instead of making a new one.
Soft-deleted tags can be visible in history and should probably be visible to tag managers when we create that ability, so we'd need to give them some different styling, not hide them entirely. In particular, let's not add confusion to histories ("I could have sworn I added/saw another tag here..."). This is analogous to deleted posts -- they're there, for those who can see them, and clearly marked.
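To check that the amended rule hangs together, here is a minimal in-memory sketch in plain Ruby. `SoftTag`, `TagStore`, `sweep` and the `deleted` flag are all hypothetical names for illustration; in the application this would presumably be a column on the tags table plus a callback or scheduled job.

```ruby
# Hypothetical tag record with a soft-delete flag.
SoftTag = Struct.new(:name, :excerpt, :wiki, :parent, :children, :post_count, :deleted,
                     keyword_init: true) do
  # True when the tag has neither a parent nor children.
  def outside_hierarchy?
    parent.nil? && children.empty?
  end

  # Zero posts, no excerpt, no wiki, and not part of a hierarchy.
  def auto_deletable?
    post_count.zero? && excerpt.to_s.strip.empty? &&
      wiki.to_s.strip.empty? && outside_hierarchy?
  end
end

# Hypothetical in-memory store keyed by tag name.
class TagStore
  def initialize
    @tags = {}
  end

  # Returns the existing tag, undeleting it if needed, or creates a new one.
  def find_or_create(name)
    tag = @tags[name] ||= SoftTag.new(name: name, parent: nil, children: [],
                                      post_count: 0, deleted: false)
    tag.deleted = false
    tag
  end

  # Marks eligible tags as deleted; nothing is removed from the store.
  def sweep
    @tags.each_value { |tag| tag.deleted = true if tag.auto_deletable? }
  end
end

store = TagStore.new
typo = store.find_or_create('rubby')
parent = store.find_or_create('languages')
parent.children << 'ruby' # part of a hierarchy, so it must survive the sweep

store.sweep
puts "rubby deleted after sweep: #{typo.deleted}"
puts "languages deleted after sweep: #{parent.deleted}"

again = store.find_or_create('rubby')
puts "same record reused: #{again.equal?(typo)}"
puts "rubby deleted after recreation: #{again.deleted}"
```

Since the record is never removed, history can keep pointing at it and the display layer can style it by its deleted flag.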
How about adding a new «retain/persist» attribute for tags, so that you can specify if a certain tag is to be kept alive indefinitely, regardless of its containment in a tag hierarchy?
I’d also like to take this on if possible. Thanks.
@peterelbert sounds good. Note there's been a lot of discussion, and I believe https://github.com/codidact/qpixel/issues/1177#issuecomment-1684559665 represents where we landed (or at least no one's objected). In particular, (a) soft deletes and (b) only tags that are leaf nodes (because we have tag hierarchies).