AAnirudh07
Thank you!
Hi @akashdhamasia12, here's the inference code:

```python
image = image.to('cpu')
logits_mask = int8_model(image)
prob_mask = logits_mask.sigmoid()
pred_mask = (prob_mask > 0.5).float()
```

The `QuantizedCPU` error occurs even without using DataLoader...
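For what it's worth, errors mentioning the `QuantizedCPU` backend often come from an op that isn't supported in the quantized domain, so the usual pattern is to bracket the model with `QuantStub`/`DeQuantStub` before converting it. Below is only a rough sketch of that pattern under the static-quantization flow; the `QuantWrapper` class and the `fp32_model` name are placeholders, not my actual code:

```python
import torch
import torch.nn as nn

# Sketch: wrap an fp32 model so PyTorch inserts quantize/dequantize steps
# at the input and output boundaries during static quantization.
class QuantWrapper(nn.Module):
    def __init__(self, model):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> int8 at the input
        self.model = model
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> fp32 at the output

    def forward(self, x):
        x = self.quant(x)
        x = self.model(x)
        return self.dequant(x)

# Typical static quantization flow (fp32_model and calibration data assumed):
# wrapped = QuantWrapper(fp32_model).eval()
# wrapped.qconfig = torch.quantization.get_default_qconfig("fbgemm")
# torch.quantization.prepare(wrapped, inplace=True)
# ... run a few calibration batches through `wrapped` ...
# int8_model = torch.quantization.convert(wrapped)
```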
Thanks @kitrak-rev!
Can you try reducing the batch size and see if it works?
Hmm, in that case, these links might be helpful (assuming you haven't tried int8 or int4 implementations yet):

1. [https://github.com/facebookresearch/llama/issues/79#issuecomment-1460464011](https://github.com/facebookresearch/llama/issues/79#issuecomment-1460464011) - This GitHub user was able to run llama-65B using...
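If you go the int8 route, a minimal sketch along the lines of that issue is loading the checkpoint in 8-bit through `transformers` with `bitsandbytes` installed. The model id below is just a placeholder, not a recommendation from the linked thread:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "decapoda-research/llama-7b-hf"  # hypothetical checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    load_in_8bit=True,   # requires bitsandbytes; weights are quantized to int8
    device_map="auto",   # shards layers across the available GPUs/CPU
)

inputs = tokenizer("Hello, my name is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```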
> in reward json file, the score is None, so i change it into "0.01".

Hey @lonelydancer, as far as I can tell, the reward dataset doesn't play a role...
I also got the assertion error. I ran `python artifacts/main.py artifacts/config/config.yaml --type ACTOR` and got the following error (I truncated some of the assertion errors):

```
../aten/src/ATen/native/cuda/Indexing.cu:1141: indexSelectLargeIndex: block: [334,0,0],...
```
^ Update (just for more context): `load_model_test` works! I believe this method uses a HF GPT2 tokenizer. However, if I use the `load_model` method, I get this assertion error. I...
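In case it's useful, here's a quick diagnostic sketch, under the assumption that the `indexSelectLargeIndex` assertion comes from token ids exceeding the model's embedding size (a common symptom of a tokenizer/vocab mismatch). `tokenizer` and `model` stand in for whatever `load_model` returns:

```python
# Compare the largest token id produced by the tokenizer with the number of
# rows in the model's input embedding table.
ids = tokenizer("some sample text", return_tensors="pt")["input_ids"]
vocab_size = model.get_input_embeddings().num_embeddings

print("max token id:", ids.max().item(), "| embedding rows:", vocab_size)
assert ids.max().item() < vocab_size, "token id out of range -> CUDA index assert"
```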
From the official paper:

> We tokenize all corpora using the GPT-2 byte level BPE tokenizer (Sennrich et al., 2016; Radford et al., 2019; Brown et al., 2020). Our final corpus...
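So, assuming the goal is to match the paper's tokenization, a minimal sketch would be to load the standard GPT-2 byte-level BPE tokenizer from Hugging Face (just an illustration, not necessarily what `load_model` does internally):

```python
from transformers import GPT2Tokenizer

# GPT-2 byte-level BPE tokenizer, as referenced in the paper.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
print(tokenizer.tokenize("Hello world"))          # byte-level BPE pieces
print(tokenizer("Hello world")["input_ids"])      # corresponding token ids
```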