Mohamad Zamini

Results: 16 issues by Mohamad Zamini

In both the training and evaluation loops, there are unnecessary calls to `loss.detach().item()`. You can calculate the loss without detaching it and only detach it if necessary at a later...
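The excerpt above is truncated, so the loop it refers to is not shown. As a minimal sketch of the general point: `loss.item()` already returns a plain Python float that carries no autograd graph, so a preceding `.detach()` adds nothing. The function `train_one_epoch`, the toy `nn.Linear` model, and the random batches below are illustrative assumptions, not code from the repository.

```python
import torch
from torch import nn


def train_one_epoch(model, batches, optimizer, loss_fn):
    """Run one pass over `batches` and return the mean loss as a float."""
    model.train()
    running_loss = 0.0
    for inputs, targets in batches:
        optimizer.zero_grad()
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        optimizer.step()
        # .item() yields a Python float, so no .detach() is needed for logging.
        running_loss += loss.item()
    return running_loss / len(batches)


# Hypothetical toy setup, used only to exercise the loop.
torch.manual_seed(0)
model = nn.Linear(4, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
batches = [(torch.randn(8, 4), torch.randn(8, 1)) for _ in range(3)]
mean_loss = train_one_epoch(model, batches, optimizer, nn.MSELoss())

# The two spellings give the same number.
check = nn.MSELoss()(model(batches[0][0]), batches[0][1])
same_value = check.item() == check.detach().item()
```

If the loss tensor itself needs to be kept (for example, stored in a list for later plotting), `loss.detach()` is the relevant call, since holding the undetached tensor keeps its graph alive.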

Move the model and input tensors to the GPU for faster computation. If you have multiple input samples, you can process them in batches using PyTorch's DataLoader to parallelize computations...
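A minimal sketch of this suggestion follows, assuming a toy `nn.Linear` model and random inputs in place of the repository's own (which the truncated excerpt does not show). It falls back to the CPU when no GPU is available.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Pick the GPU when present; otherwise stay on the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical model and data, standing in for the real ones.
model = nn.Linear(4, 2).to(device)
model.eval()
dataset = TensorDataset(torch.randn(10, 4))
loader = DataLoader(dataset, batch_size=4)

outputs = []
with torch.no_grad():  # inference only, so skip gradient tracking
    for (batch,) in loader:
        # Move each batch to the same device as the model.
        outputs.append(model(batch.to(device)).cpu())

predictions = torch.cat(outputs)
```

The model is moved once, before the loop, while each batch is moved as it is consumed; this keeps only one batch on the device at a time.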

Hi, in Run_gqn.py I get an error on this line: ` handler=checkpoint_handler, to_save={'model': model.state_dict, 'optimizer': optimizer.state_dict, 'annealers': (sigma_scheme.data, mu_scheme.data)}) ` Traceback (most recent call last): File "../run-gqn.py", line 181, in...

Hi. Thanks for sharing the code. Do you know how to train on other datasets, such as FB15K? How can we create `facts.txt`?

In the `paint()` function, you are using the `torch.autocast()` function to cast the batch tensor to the `cuda` device. However, the `batch` tensor is already on the `cuda` device, so...

### Describe the issue Issue: after pretraining Phi-3 with/without VIP, I am getting tokenization mismatch. Command: ``` bash scripts/finetune_vip_llava_phi3_stage2.sh ``` Log: ``` WARNING: tokenization mismatch: 185 vs. 195. (ignored) WARNING:...