asv
asv copied to clipboard
file names should not use externally generated strings
Currently result files are named using the hash that was benchmarked. For repos which use an actual hash as an identifier this has been working, however, if the repo impl includes path characters in the commit hash publish will fail to find the results.
While incredibly unlikely in this use case, its always good security practice to ensure that external strings are not passed to the shell or OS in which a program is running.
All of the information about the commit is stored in the json data. I was not able to find anywhere where the commit hash was pulled from the filename in processing results.
The implementation for this feature should be pretty simple. If we change the the impl of results.py#get_filename( ...)
to look something like
def get_filename(machine, commit_hash, env_name):
"""
Get the result filename. Hash of commit_hash and
environment name in a folder named for the machine.
"""
fname=hashlib.md5( "{}/{}".format(commit_hash,env_name).encode( 'UTF-8')).hexdigest()
return os.path.join( machine, "{}.json".format(fname) )
it simplifies the method and has the side effect of grouping results on the FS by machine name without impacting behavior or tests.