
Eval script fails on CPU on model generated by ExecuTorch

agunapal opened this issue 6 months ago • 2 comments

🐛 Describe the bug

I am using ExecuTorch (ET) and generating the quantized version of the model as shown in the README:

python torchchat.py export llama3.1 --quantize config/data/mobile.json --output-pte-path llama3.1.pte
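For reference, the input shapes baked into the exported PTE can be checked from Python. A minimal sketch, assuming the pybindings in this ExecuTorch build expose `method_meta()` (recent builds do; the entry point is the same one torchchat's `build/model_et.py` uses):

```python
# Sketch: inspect the static input shapes baked into the exported PTE.
# Assumes these pybindings expose method_meta() (recent ExecuTorch builds do).
from executorch.extension.pybindings.portable_lib import _load_for_executorch

module = _load_for_executorch("llama3.1.pte")
meta = module.method_meta("forward")
for i in range(meta.num_inputs()):
    print(i, meta.input_tensor_meta(i).sizes())
# For a decode-only export I'd expect something like:
#   0 [1, 1]   (tokens)
#   1 [1]      (input_pos)
```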

Then, when I tried to evaluate the model using the Python runtime on desktop, it failed:

python torchchat.py eval llama3.1 --pte-path llama3.1.pte --limit 5
NumExpr defaulting to 16 threads.
PyTorch version 2.5.0.dev20240716+cpu available.
Warning: checkpoint path ignored because an exported DSO or PTE path specified
Using device=cpu
Loading model...
Time to load model: 0.05 seconds
Loading custom ops library: /home/ubuntu/anaconda3/envs/torchchat/lib/python3.10/site-packages/executorch/examples/models/llama2/custom_ops/libcustom_ops_aot_lib.so
I 00:00:00.004209 executorch:program.cpp:133] InternalConsistency verification requested but not available
-----------------------------------------------------------
Using device 'cpu'
/home/ubuntu/anaconda3/envs/torchchat/lib/python3.10/site-packages/huggingface_hub/file_download.py:1150: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
  warnings.warn(
[Task: wikitext] metric word_perplexity is defined, but aggregation is not. using default aggregation=weighted_perplexity
[Task: wikitext] metric word_perplexity is defined, but higher_is_better is not. using default higher_is_better=False
[Task: wikitext] metric byte_perplexity is defined, but aggregation is not. using default aggregation=weighted_perplexity
[Task: wikitext] metric byte_perplexity is defined, but higher_is_better is not. using default higher_is_better=False
[Task: wikitext] metric bits_per_byte is defined, but aggregation is not. using default aggregation=bits_per_byte
[Task: wikitext] metric bits_per_byte is defined, but higher_is_better is not. using default higher_is_better=False
Downloading builder script: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 10.7k/10.7k [00:00<00:00, 46.8MB/s]
Downloading readme: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7.78k/7.78k [00:00<00:00, 39.3MB/s]
Repo card metadata block was not found. Setting CardData to empty.
Repo card metadata block was not found. Setting CardData to empty.
Downloading data: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4.72M/4.72M [00:00<00:00, 64.4MB/s]
Generating test split: 62 examples [00:00, 1903.53 examples/s]
Generating train split: 629 examples [00:00, 5131.04 examples/s]
Generating validation split: 60 examples [00:00, 7172.82 examples/s]
Building contexts for wikitext on rank 0...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 5/5 [00:00<00:00, 796.73it/s]
Running loglikelihood_rolling requests
  0%|                                                                                                                                                               | 0/5 [00:00<?, ?it/s]E 00:00:31.679303 executorch:tensor_impl.cpp:86] Attempted to resize a static tensor to a new shape at dimension 1 old_size: 1 new_size: 1263
E 00:00:31.679320 executorch:method.cpp:824] Error setting input 0: 0x10
  0%|                                                                                                                                                               | 0/5 [00:00<?, ?it/s]
Time to run eval: 6.75s.
Traceback (most recent call last):
  File "/home/ubuntu/torchchat/torchchat.py", line 92, in <module>
    eval_main(args)
  File "/home/ubuntu/torchchat/eval.py", line 252, in main
    result = eval(
  File "/home/ubuntu/anaconda3/envs/torchchat/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/home/ubuntu/torchchat/eval.py", line 198, in eval
    eval_results = evaluate(
  File "/home/ubuntu/anaconda3/envs/torchchat/lib/python3.10/site-packages/lm_eval/utils.py", line 288, in _wrapper
    return fn(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/torchchat/lib/python3.10/site-packages/lm_eval/evaluator.py", line 373, in evaluate
    resps = getattr(lm, reqtype)(cloned_reqs)
  File "/home/ubuntu/anaconda3/envs/torchchat/lib/python3.10/site-packages/lm_eval/models/huggingface.py", line 840, in loglikelihood_rolling
    string_nll = self._loglikelihood_tokens(
  File "/home/ubuntu/anaconda3/envs/torchchat/lib/python3.10/site-packages/lm_eval/models/huggingface.py", line 1033, in _loglikelihood_tokens
    self._model_call(batched_inps, **call_kwargs), dim=-1
  File "/home/ubuntu/torchchat/eval.py", line 146, in _model_call
    logits = self._model_forward(x, input_pos)
  File "/home/ubuntu/torchchat/eval.py", line 240, in <lambda>
    model_forward = lambda x, input_pos: model(x, input_pos)  # noqa
  File "/home/ubuntu/anaconda3/envs/torchchat/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1716, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/home/ubuntu/anaconda3/envs/torchchat/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1727, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/ubuntu/torchchat/build/model_et.py", line 23, in forward
    logits = self.model_.forward(forward_inputs)
RuntimeError: method->set_inputs() for method 'forward' failed with error 0x12
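
The first error line is the telling one: the PTE seems to have been exported with a static token-input shape of `[1, 1]` (one token per decode step), while lm_eval's `loglikelihood_rolling` path pushes the entire tokenized context (1263 tokens here) through `_model_call` in a single forward. Error `0x12` appears to be `InvalidArgument` in ExecuTorch's `Error` enum, and `0x10` `NotSupported` (from the refused resize). A minimal sketch of the mismatch, assuming the pybindings entry point that torchchat's `build/model_et.py` wraps:

```python
# Sketch of the mismatch, assuming the pybindings entry point that
# torchchat's build/model_et.py wraps. Shapes mirror the log above.
import torch
from executorch.extension.pybindings.portable_lib import _load_for_executorch

module = _load_for_executorch("llama3.1.pte")

# Decode-style call: one token at a time, matching the exported [1, 1] input.
tokens = torch.tensor([[128000]], dtype=torch.long)  # single example token id
input_pos = torch.tensor([0], dtype=torch.long)
module.forward((tokens, input_pos))  # works

# Eval-style call: lm_eval's _model_call sends the whole context at once.
tokens = torch.zeros(1, 1263, dtype=torch.long)
input_pos = torch.arange(1263, dtype=torch.long)
module.forward((tokens, input_pos))
# -> E tensor_impl.cpp: Attempted to resize a static tensor ... new_size: 1263
# -> RuntimeError: method->set_inputs() ... failed with error 0x12
```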

Versions

Collecting environment information...
PyTorch version: 2.5.0.dev20240716+cpu
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A

OS: Ubuntu 22.04.4 LTS (x86_64)
GCC version: (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0
Clang version: Could not collect
CMake version: version 3.30.2
Libc version: glibc-2.35

Python version: 3.10.14 (main, May  6 2024, 19:42:50) [GCC 11.2.0] (64-bit runtime)
Python platform: Linux-6.5.0-1014-aws-x86_64-with-glibc2.35
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture:                       x86_64
CPU op-mode(s):                     32-bit, 64-bit
Address sizes:                      46 bits physical, 48 bits virtual
Byte Order:                         Little Endian
CPU(s):                             16
On-line CPU(s) list:                0-15
Vendor ID:                          GenuineIntel
Model name:                         Intel(R) Xeon(R) Platinum 8375C CPU @ 2.90GHz
CPU family:                         6
Model:                              106
Thread(s) per core:                 2
Core(s) per socket:                 8
Socket(s):                          1
Stepping:                           6
BogoMIPS:                           5799.93
Flags:                              fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single ssbd ibrs ibpb stibp ibrs_enhanced fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves wbnoinvd ida arat avx512vbmi pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg tme avx512_vpopcntdq rdpid md_clear flush_l1d arch_capabilities
Hypervisor vendor:                  KVM
Virtualization type:                full
L1d cache:                          384 KiB (8 instances)
L1i cache:                          256 KiB (8 instances)
L2 cache:                           10 MiB (8 instances)
L3 cache:                           54 MiB (1 instance)
NUMA node(s):                       1
NUMA node0 CPU(s):                  0-15
Vulnerability Gather data sampling: Unknown: Dependent on hypervisor status
Vulnerability Itlb multihit:        Not affected
Vulnerability L1tf:                 Not affected
Vulnerability Mds:                  Not affected
Vulnerability Meltdown:             Not affected
Vulnerability Mmio stale data:      Mitigation; Clear CPU buffers; SMT Host state unknown
Vulnerability Retbleed:             Not affected
Vulnerability Spec rstack overflow: Not affected
Vulnerability Spec store bypass:    Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:           Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2:           Mitigation; Enhanced / Automatic IBRS, IBPB conditional, RSB filling, PBRSB-eIBRS SW sequence
Vulnerability Srbds:                Not affected
Vulnerability Tsx async abort:      Not affected

Versions of relevant libraries:
[pip3] executorch==0.4.0a0+c757499
[pip3] numpy==1.26.4
[pip3] torch==2.5.0.dev20240716+cpu
[pip3] torchao==0.3.1
[pip3] torchaudio==2.4.0.dev20240716+cpu
[pip3] torchsr==1.0.4
[pip3] torchvision==0.20.0.dev20240716+cpu
[conda] executorch                0.4.0a0+c757499          pypi_0    pypi
[conda] numpy                     1.26.4                   pypi_0    pypi
[conda] torch                     2.5.0.dev20240716+cpu          pypi_0    pypi
[conda] torchao                   0.3.1                    pypi_0    pypi
[conda] torchaudio                2.4.0.dev20240716+cpu          pypi_0    pypi
[conda] torchsr                   1.0.4                    pypi_0    pypi
[conda] torchvision               0.20.0.dev20240716+cpu          pypi_0    pypi

agunapal · Aug 08 '24 02:08