openverse-api
Remove legacy `tags_list` field
Problem
The media models use an unnecessary legacy tags_list field.
Description
When the machine-generated tags were added, the data model for tags changed from ~~simple array of strings~~ a list of ForeignKeys pointing to a separate ImageTags table, to an array of objects with properties such as `provider` and `accuracy` (only for machine-generated tags). The `tags_list` property was deprecated, but not removed from the database: https://github.com/cc-archive/cccatalog-api/pull/182
We should remove it.
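For illustration, here is a rough sketch of why the legacy field is redundant once tags are stored as objects. The `name` key, the provider names, and the sample values are assumptions made up for this example and are not taken from the actual schema; only `provider` and `accuracy` are mentioned above.

```python
# Hypothetical record in the array-of-objects shape.
# "name", "source_provider" and "machine_tagger" are assumed for illustration.
current_record = {
    "tags": [
        {"name": "cat", "provider": "source_provider"},
        # Machine-generated tags additionally carry an accuracy score.
        {"name": "animal", "provider": "machine_tagger", "accuracy": 0.97},
    ]
}


def tag_names(record):
    """Return the plain tag names from the array-of-objects shape."""
    # Treat a missing or null "tags" value as an empty list.
    return [tag["name"] for tag in record.get("tags") or []]


# A flat list of names can be derived on demand, so storing a separate
# tags_list column adds nothing.
names = tag_names(current_record)
print(names)
```

If a flat list of strings is ever needed again, it could be computed this way instead of being kept as its own column.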
Alternatives
We could also convert all tags to a simple string array, since we are not really using machine-generated tags. But I think we will want to use them in the future, and removing them now only to re-add them later is an unnecessary complication.
Additional context
Implementation
- [ ] 🙋 I would be interested in implementing this feature.
If we drop this from the Django models, will this need any change in the catalog or the ingestion server?
The catalog will not need to change anything because the tags column in the catalog is not changing. But the ingestion server will probably need to be updated.
Since the ingestion server makes no explicit reference to the columns and only picks the overlapping ones, removing `tags_list` from the API DB would be enough for the ingestion server to stop copying it. I think this can be marked as ready for work then.
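A minimal sketch of the "overlapping columns" idea, not the ingestion server's actual code. The function name and the column lists are hypothetical, and the upstream table is assumed to still have a `tags_list` column purely to show the worst case.

```python
def overlapping_columns(upstream_columns, api_columns):
    """Return the columns present in both tables, in upstream order."""
    api = set(api_columns)
    return [col for col in upstream_columns if col in api]


# Hypothetical column lists for illustration only.
upstream = ["identifier", "title", "tags", "tags_list"]
api_before = ["identifier", "title", "tags", "tags_list"]
api_after = ["identifier", "title", "tags"]  # after dropping tags_list

# While the API table has the column, it is part of the overlap and is copied.
print(overlapping_columns(upstream, api_before))

# Once the API table drops it, it falls out of the overlap without any
# change to the copy logic itself.
print(overlapping_columns(upstream, api_after))
```

Under that assumption, dropping the column on the API side is the only change needed for it to stop being copied.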