weight_diff AssertionError: Naive integrity check failed. This could imply that some of the checkpoint files are corrupted.

Open abdoelsayed2016 opened this issue 1 year ago • 8 comments

Traceback (most recent call last):
  File "/gpfs/gpfs1/scratch/c7031420/stanford_alpaca/weight_diff.py", line 158, in <module>
    fire.Fire(main)
  File "/.local/lib/python3.9/site-packages/fire/core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/.local/lib/python3.9/site-packages/fire/core.py", line 475, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "/.local/lib/python3.9/site-packages/fire/core.py", line 691, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/stanford_alpaca/weight_diff.py", line 154, in main
    globals()[task](**kwargs)
  File "/.conda/envs/llama_2/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/stanford_alpaca/weight_diff.py", line 130, in recover
    assert torch.allclose(
AssertionError: Naive integrity check failed. This could imply that some of the checkpoint files are corrupted.

python weight_diff.py recover --path_raw './PR_7B' --path_diff './output' --path_tuned './recover'

abdoelsayed2016 avatar May 05 '23 20:05 abdoelsayed2016

I have the same question.

shiyanlou-015555 avatar May 06 '23 06:05 shiyanlou-015555

Is this check necessary? As long as we make sure the weights we download are the ones from Hugging Face, there should be no problem.

shiyanlou-015555 avatar May 06 '23 06:05 shiyanlou-015555

I am facing the same issue. Is there any solution to this? I downloaded the llama-2-7b-hf weights and wdiff-7b-alpaca from Hugging Face, and the code exited with the aforementioned error...

omamaatautolabs avatar Oct 31 '23 10:10 omamaatautolabs

Same issue as above...

woody8657 avatar Nov 02 '23 08:11 woody8657

@woody8657, one possible solution I found while skimming weight_diff.py in github.com/tatsu-lab/stanford_alpaca is to set the boolean check_integrity_naively at line 77 to False. That way, the check below, which starts at line 127,

if check_integrity_naively:
    # This is not a rigorous, cryptographically strong integrity check :)
    allsum = sum(state_dict_recovered[key].sum() for key in state_dict_recovered)
    assert torch.allclose(
        allsum, torch.full_like(allsum, fill_value=50637.1836), atol=1e-2, rtol=0
    ), "Naive integrity check failed. This could imply that some of the checkpoint files are corrupted."

does not execute, and the weights are restored successfully.
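
To make it clearer what the check quoted above is testing, here is a minimal torch-free sketch. It assumes the recover step conceptually adds the diff to the raw weights. The helper names and toy numbers are hypothetical; only the constant 50637.1836 and the tolerance 1e-2 come from the quoted code. It suggests that any base checkpoint other than the one the diff was computed against (an earlier comment mentions llama-2-7b-hf, for example) would plausibly fail the check even if no file is corrupted.

```python
import math

# Expected total and tolerance copied from the quoted check; all data below is made up.
EXPECTED_SUM = 50637.1836
ATOL = 1e-2


def recover(raw, diff):
    """Hypothetical stand-in for the recover step: element-wise raw + diff."""
    return {key: [r + d for r, d in zip(raw[key], diff[key])] for key in diff}


def naive_integrity_ok(state_dict, expected=EXPECTED_SUM, atol=ATOL):
    """Sum every value of every 'tensor' and compare with an absolute tolerance."""
    total = sum(sum(values) for values in state_dict.values())
    return math.isclose(total, expected, rel_tol=0.0, abs_tol=atol)


# Toy "checkpoints": one matches the base the diff was built against, one does not.
diff = {"w": [30.0, 7.1836]}
raw_matching = {"w": [50000.0, 600.0]}
raw_other_base = {"w": [50000.0, 601.0]}

print(naive_integrity_ok(recover(raw_matching, diff)))
print(naive_integrity_ok(recover(raw_other_base, diff)))
```

In this toy setup the first call passes and the second fails, because the recovered weights only sum to the expected constant when the raw weights are the ones the diff was built from. Disabling the check therefore also hides a mismatched base model, not just corrupted files.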

omamaatautolabs avatar Nov 02 '23 10:11 omamaatautolabs

Same problem

Ki-Seki avatar Jan 31 '24 12:01 Ki-Seki

I bypassed the integrity check without modifying the source code by using the CLI argument --nocheck_integrity_naively. Simply run the command as follows:

python weight_diff.py recover --nocheck_integrity_naively --path_raw <path_to_step_1_dir> --path_diff <path_to_step_2_dir> --path_tuned <path_to_store_recovered_weights>
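
For readers wondering why a flag prefixed with "no" turns the check off: the traceback shows weight_diff.py is driven by Python Fire, which accepts a --no<name> form to set a boolean argument to False. The toy parser below only illustrates that convention. It is not Fire's implementation, and parse_bool_flags is a hypothetical helper.

```python
def parse_bool_flags(argv, known_flags):
    """Toy parser: --name maps to True and --noname maps to False for known names."""
    parsed = {}
    for arg in argv:
        if not arg.startswith("--"):
            continue  # skip positional arguments such as the task name
        name = arg[2:]
        if name in known_flags:
            parsed[name] = True
        elif name.startswith("no") and name[2:] in known_flags:
            parsed[name[2:]] = False
    return parsed


flags = parse_bool_flags(
    ["recover", "--nocheck_integrity_naively"],
    known_flags={"check_integrity_naively"},
)
print(flags)  # {'check_integrity_naively': False}
```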

Ki-Seki avatar Jan 31 '24 12:01 Ki-Seki

Thank you so much, @Ki-Seki. Your solution works!!

baotruyenthach avatar Jun 05 '24 18:06 baotruyenthach