ckanext-qa
QA rating without downloading files
I would like to suggest that the extension be fundamentally redesigned so that it is not necessary to download complete files.
Downloading complete files is not ideal: some files are several gigabytes in size, and for other resources, serving them costs the source system a lot of computing time.
Therefore, I suggest several levels for determining the file format:
- Trust the format specification made by the user.
- Do an HTTP HEAD request and trust the web server's response.
- Download a few bytes of the resource and use file magic numbers.
- Download the complete file and do the analysis.
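To make the proposal concrete, here is a rough sketch of how the four levels could fall through from one to the next. It is not taken from ckanext-qa: all function names, the two lookup tables, and the level labels are hypothetical, and the tables are deliberately tiny. It assumes the server honours `Range` requests where possible, and it caps the read in case the server does not.

```python
import urllib.request

# Hypothetical lookup tables for illustration; real ones would need many more entries.
MIME_TO_FORMAT = {
    "text/csv": "CSV",
    "application/json": "JSON",
    "application/pdf": "PDF",
    "application/zip": "ZIP",
}
MAGIC_TO_FORMAT = [
    (b"%PDF-", "PDF"),
    (b"PK\x03\x04", "ZIP"),  # also matches ZIP-based formats such as XLSX
    (b"\x89PNG\r\n\x1a\n", "PNG"),
]


def format_from_content_type(content_type):
    """Map a Content-Type header value to a format name, or None."""
    if not content_type:
        return None
    # Drop parameters such as "; charset=utf-8" before the lookup.
    mime = content_type.split(";", 1)[0].strip().lower()
    return MIME_TO_FORMAT.get(mime)


def format_from_magic(first_bytes):
    """Map the leading bytes of a file to a format name, or None."""
    for magic, fmt in MAGIC_TO_FORMAT:
        if first_bytes.startswith(magic):
            return fmt
    return None


def head_content_type(url, timeout=10):
    """Level 2 helper: send an HTTP HEAD request and return the Content-Type."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type")


def fetch_first_bytes(url, count=512, timeout=10):
    """Level 3 helper: ask for only the first `count` bytes of the resource."""
    headers = {"Range": f"bytes=0-{count - 1}"}
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # read(count) caps the transfer even if the server ignores Range.
        return response.read(count)


def detect_format(declared_format=None, content_type=None, first_bytes=b""):
    """Return (format, level) from the first level that yields an answer."""
    if declared_format:
        return declared_format.upper(), "declared"
    fmt = format_from_content_type(content_type)
    if fmt:
        return fmt, "http-head"
    fmt = format_from_magic(first_bytes)
    if fmt:
        return fmt, "magic-bytes"
    # Nothing cheaper worked; only now would a full download be justified.
    return None, "full-download-needed"
```

For example, `detect_format(None, "application/octet-stream", b"PK\x03\x04...")` falls through the first two levels and answers from the magic bytes. A caller would only invoke `head_content_type` and `fetch_first_bytes` when the earlier level gave no answer, so most resources would never be downloaded in full.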
You are welcome to contribute. This extension is on minimal maintenance, so no major refactoring will happen any time soon.