Add S3-based optimization cache support
We use kiwix_storagelib to implement an S3-based optimization cache in the scrapers. However, every scraper ends up duplicating the same code. We always store a version of the file along with the optimizer version as metadata. So this would be better implemented once, in scraperlib. For a start, we could have a caching module with three functions (or maybe a class with equivalent methods). The three primary things we need are:
- download_from_cache()
- upload_to_cache()
- check_credentials()
There are several ways to do this, but it should at least do the following:
- Compare optimizer_version
- Compare file_version
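To make the two comparisons above concrete, here is a minimal sketch. It does not use kiwix_storagelib; the `metadata` mapping and the key names `optimizer_version` and `file_version` are assumptions standing in for whatever metadata is stored with the cached object.

```python
from typing import Mapping, Optional


def is_cache_hit(
    metadata: Optional[Mapping[str, str]],
    optimizer_version: str,
    file_version: str,
) -> bool:
    """Return True only if the cached object matches both versions.

    `metadata` is the key/value metadata stored with the cached object
    (hypothetical key names). None means nothing is cached for this key.
    """
    if metadata is None:
        return False
    return (
        metadata.get("optimizer_version") == optimizer_version
        and metadata.get("file_version") == file_version
    )


# Example: a cached entry produced by optimizer "v1" for file version "abc"
cached = {"optimizer_version": "v1", "file_version": "abc"}
print(is_cache_hit(cached, "v1", "abc"))  # same versions: reuse the cache
print(is_cache_hit(cached, "v2", "abc"))  # optimizer changed: re-optimize
```

A `download_from_cache()` built on this would only fetch the object when the check passes, and `upload_to_cache()` would write both values as metadata.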
Optionally, we could also check the file's upload date and discard the cached copy if it is older than a specified amount of time. If we go for a class-based approach, we can also explore ways to improve performance.
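The optional age check could look roughly like this. The function name `is_too_old` and its parameters are hypothetical, and it assumes the upload date is available as a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_too_old(
    uploaded_at: datetime,
    max_age: timedelta,
    now: Optional[datetime] = None,
) -> bool:
    """Return True if the cached object is older than `max_age`.

    `uploaded_at` must be timezone-aware. `now` can be injected for
    testing and defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - uploaded_at > max_age


# Example: discard anything uploaded more than 30 days ago
uploaded = datetime(2024, 1, 1, tzinfo=timezone.utc)
today = datetime(2024, 3, 1, tzinfo=timezone.utc)
print(is_too_old(uploaded, timedelta(days=30), now=today))
```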
This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.