Optimize resource checksums
Once resources are downloaded, checksum validation holds up the process for at least 1 or 2 minutes. This is also true when no resources actually need to be downloaded. (Tested on Android)
I've looked into how resources are acquired and cached. My findings are as follows:
Requirements
As far as I can tell and with the most liberal interpretation, the requirements are twofold:
- Make sure only unedited resources are being used in a benchmark.
- Make sure no download is corrupted or modified between the CDN and the client.
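Both requirements come down to comparing a file's digest against an expected value. A minimal sketch of that check follows. SHA-256 and the function names are assumptions made for illustration, since the hash algorithm and the app's actual API are not stated here.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading in chunks to bound memory use.

    SHA-256 is an assumption; substitute whatever the task config specifies.
    """
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_valid(path: Path, expected_hex: str) -> bool:
    """True only if the file exists and its digest matches the expected value."""
    return path.is_file() and file_checksum(path) == expected_hex.lower()
```

Hashing reads every byte of every file, so the cost grows with total resource size. That is why repeating the check needlessly is expensive.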
Layers
There are 3 major layers for handling resources (at least where downloads and checksums are involved):
- State
- Resource Manager
- Cache Manager
State
This is the interface to the app itself. It receives the order to "load" resources for any number of benchmarks.
Resource Manager
Responsible for identifying downloadable resources and for checksum verification.
Cache Manager
This is where the actual file handling occurs. This layer downloads files and provides paths to the downloaded files.
Resource Loading Flows
The app requests resources to be loaded through 5 flows:
- As soon as the app starts (all benchmarks & all modes)
- When a user clicks GO (active benchmarks & active modes only, only checksum validation)
- When a user clicks on the download dialog (active benchmarks & all modes)
- When a user clicks on a download button (1 or all benchmarks & all modes)
- If the task config changes (all benchmarks & all modes)
The Loading Process
With the exception of the GO button flow, every flow goes through state.loadResources. This tells the resource manager to "handle" the provided resources, which in turn tells the cache manager to "cache" or "load" any internet resources.
The terms "cache" and "load" here simply mean finding where the resources are in the filesystem, and downloading them if requested. "Handle" refers to sorting resources, "caching", and checksum validation.
Every time resources are "cached", the cache manager's resource map is overridden. Because of that, the state has to request resource "handling" twice: once for the selected benchmarks and once again for all benchmarks (regardless of selection). This makes sure the cache manager's map contains all available resources.
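The difference between overriding and editing the resource map can be illustrated with a small sketch. The class and method names below are hypothetical stand-ins, not the app's actual cache manager API.

```python
class CacheManager:
    """Hypothetical stand-in for the cache manager's resource map."""

    def __init__(self) -> None:
        # Maps a resource URL to its local path.
        self.resource_map = {}

    def cache_overriding(self, resources: dict) -> None:
        # Behaviour described above: each call replaces the whole map,
        # so entries from earlier calls are lost.
        self.resource_map = dict(resources)

    def cache_merging(self, resources: dict) -> None:
        # Alternative: keep existing entries and add or update the new ones.
        self.resource_map.update(resources)


selected = {"https://example.invalid/a": "/cache/a"}
others = {"https://example.invalid/b": "/cache/b"}

overriding = CacheManager()
overriding.cache_overriding(selected)
overriding.cache_overriding(others)  # entry "a" is gone after this call

merging = CacheManager()
merging.cache_merging(selected)
merging.cache_merging(others)  # both "a" and "b" are still present
```

With a merging map, a second "handle" pass over all benchmarks would no longer be needed just to repopulate the map.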
Problematic Areas
There are 4 main problems I see with the process above:
- The cache map is overridden rather than edited, and the resulting double handle calls cause files that have already been verified to be verified again. In 3 of the 5 flows, verification happens twice for all benchmarks.
- According to the requirements section above, performing checksum validation on a file that was already downloaded in any flow other than the GO button flow is redundant, which causes delays with no tangible benefit.
- Resources are loaded and checksum verified for all modes regardless of what mode is actually selected. While this is convenient for potentially switching modes, it causes more files to be downloaded and validated than needed. This is only a big issue when compounded with the 2 points above.
- For the resource screen, there is no user-facing distinction between "loading" and checksum-validating resources. Simply put, the user has no idea resources are being validated: the app simply shows "Downloading
" (or only "Downloading" if all resources are already downloaded) for the entirety of the validation process and the "handling" for all benchmarks that follows.
NOTE: The GO button flow validating only active modes is the reason the delay in #1026 is so long only on SubmissionRun.
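One possible way to act on the points above is to decide per flow which files actually need a checksum pass. This is a rough sketch under the stated requirements, not the app's implementation. The Resource type, the mode labels, and the function name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    path: str
    mode: str  # hypothetical mode label


def resources_to_validate(resources, *, is_go_flow, active_modes, freshly_downloaded):
    """Pick the resources whose checksums need verifying in this flow."""
    if is_go_flow:
        # Requirement 1: everything the run will use is verified right
        # before the benchmark starts, limited to the active modes.
        return [r for r in resources if r.mode in active_modes]
    # Requirement 2: outside the GO flow, only files downloaded just now
    # could have been corrupted in transit, so only those are verified.
    return [r for r in resources if r.path in freshly_downloaded]


all_resources = [
    Resource("/cache/a", "mode_a"),
    Resource("/cache/b", "mode_b"),
    Resource("/cache/c", "mode_a"),
]

# Startup flow where only "/cache/c" had to be downloaded.
startup = resources_to_validate(
    all_resources, is_go_flow=False, active_modes={"mode_a"},
    freshly_downloaded={"/cache/c"},
)

# GO flow with "mode_a" active.
go = resources_to_validate(
    all_resources, is_go_flow=True, active_modes={"mode_a"},
    freshly_downloaded=set(),
)
```

In this sketch, a flow that downloads nothing validates nothing, while the GO flow still checks every file the run depends on.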