allennlp
save git status when run commands
Sometimes, after the code has gone through many versions, I can no longer tell which version produced a given result. It would be nice if allennlp could log the current git status to serialization_dir when running the train command.
Here is an example of such a record from transformers (git_log.json):
{
    "repo_id": "<git.repo.base.Repo '/data/wts/transformers/.git'>",
    "repo_sha": "b01ddc9577b87f057e163d49563ee3f74f4810cf",
    "repo_branch": "master",
    "hostname": "XXX-GPUSERVER-144"
}
Something like this would be very good to have.
Here is some toy code for reference:
import json
import os
import socket

import git  # GitPython


def save_git_info(folder_path: str) -> None:
    # Find the repository that contains the current working directory.
    repo = git.Repo(search_parent_directories=True)
    repo_infos = {
        "repo_id": str(repo),
        "repo_sha": str(repo.head.object.hexsha),
        # Note: active_branch raises TypeError when HEAD is detached.
        "repo_branch": str(repo.active_branch),
        "hostname": str(socket.gethostname()),
    }
    # Write the record next to the other run outputs.
    with open(os.path.join(folder_path, "git_log.json"), "w") as f:
        json.dump(repo_infos, f, indent=4)
@tshu-w would you be interested in making a PR for this? It would also be good to include the AllenNLP version in this metadata.
Another detail: we should make sure this metadata is included in the model archive.
@epwalsh I would be glad to make a pull request. I will take a look when I have time.
Now that we're saving metadata in model archives, adding this should be pretty straightforward. We'd just need to add it to the Meta class.
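As a rough illustration of what adding fields to such a class might look like, here is a self-contained stand-in. It is not AllenNLP's actual Meta class: MetaSketch, its git_sha, git_branch and hostname fields, and the to_file and from_path helpers are all assumptions made for this sketch. The new fields default to None so that metadata files written before the change would still load.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import Optional


@dataclass
class MetaSketch:
    """Hypothetical stand-in for a metadata record stored in a model archive."""

    version: str
    # New, optional fields: older files without them still deserialize.
    git_sha: Optional[str] = None
    git_branch: Optional[str] = None
    hostname: Optional[str] = None

    def to_file(self, path: str) -> None:
        """Serialize all fields to a JSON file."""
        with open(path, "w") as f:
            json.dump(asdict(self), f, indent=4)

    @classmethod
    def from_path(cls, path: str) -> "MetaSketch":
        """Load from JSON, ignoring keys this class does not know about."""
        with open(path) as f:
            data = json.load(f)
        known = {field.name for field in fields(cls)}
        return cls(**{k: v for k, v in data.items() if k in known})
```

Keeping the git fields optional and ignoring unknown keys is one way to stay compatible in both directions: old archives load with the new class, and files with extra keys do not cause a failure.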
@epwalsh Awesome, I think it's a more common solution.
@tshu-w we're going to keep this open until the feature is added. Do you still want to make a PR?
@epwalsh I would love to, but I cannot promise.
@epwalsh Sorry, I probably won't get around to submitting a PR. I hope someone else is interested.
@epwalsh I am interested in this PR. If @tshu-w does not have any problem, I would like to look into it.
@Shreyz-max No problem, of course.