Matthew Lyon
> where V is some integer (not necessarily 1)

So what purpose does `V` serve? If the typical "batch" dimension (i.e. the size of the mini batch during training) will...
I've made the following edits:

```python
# backpack.core.derivates.model.py
from typing import List, Tuple, Optional
from itertools import count

import torch

from backpack.core.derivatives.basederivatives import BaseDerivatives
from backpack.hessianfree.lop import transposed_jacobian_vector_product


class ArbitraryModelDerivatives(BaseDerivatives):...
```
For those viewing this issue, a workaround is to call the `.wrapResolve` method on the returned resolver, e.g.

```typescript
UserTC.mongooseResolvers.createOne().wrapResolve((next) => async (rp) => {
  // extend resolve params with...
```
> hi @m-lyon , i am trying to looking into the problem. I wonder if it is possible to remove `${CUDA_VISIBLE_DEVICES} equal to "0,1,2,3"` setup in the problem? I did...
> what is ur ray version? can u try with the nightly wheel

Using `ray==1.13.0`. I can try with the nightly wheel of ray and see how I go.

>...
> ```
> if len(cuda_visible_list) == 1:
>     device_id = cuda_visible_list[0]
> else:
>     device_id = cuda_visible_list.index(gpu_id)
> ```
>
> Your change fails due to `RuntimeError: CUDA error: invalid...
I can confirm the above code has fixed this issue. The question, then, given the code below, is: if each trial should only see `CUDA_VISIBLE_DEVICES=0` (because `ray` sets...
As far as I can tell, setting each device to `torch.device('cuda', 0)` within the if statement has fixed the issue. However, I can't confirm the GPU activity through `ray status`...
After some time training with the aforementioned code fix, I ran into this error for some of the trials:

```
Failure # 1 (occurred at 2022-08-14_19-33-21)
ray::ImplicitFunc.train() (pid=33261, ip=10.10.8.10, repr=train_model_raytune)...
```
So the above error hinted at what the problem was here. From the [raytune GPU documentation](https://docs.ray.io/en/master/ray-core/tasks/using-ray-with-gpus.html):

> a call to ray.get_gpu_ids() will return a list of strings indicating which GPUs...
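
For anyone following the index confusion in these comments, here is a minimal sketch of the remapping, assuming the usual CUDA behaviour that devices listed in `CUDA_VISIBLE_DEVICES` are renumbered from 0 inside the process. `local_cuda_index` is a hypothetical helper written for illustration; it is not part of `ray` or `torch`.

```python
from typing import Optional, Union


def local_cuda_index(gpu_id: Union[int, str], cuda_visible_devices: Optional[str]) -> int:
    """Map a physical GPU ID to the index the process sees (hypothetical helper).

    `cuda_visible_devices` is the raw value of the CUDA_VISIBLE_DEVICES
    environment variable, or None if the variable is unset.
    """
    gpu_id = str(gpu_id)
    if cuda_visible_devices is None:
        # Variable unset: no renumbering, so physical and local IDs coincide.
        return int(gpu_id)
    # Split the comma-separated list, ignoring stray whitespace and empty entries.
    visible = [d.strip() for d in cuda_visible_devices.split(",") if d.strip()]
    if gpu_id not in visible:
        raise ValueError(f"GPU {gpu_id} is not in CUDA_VISIBLE_DEVICES={cuda_visible_devices!r}")
    # The position in the visible list is the index to use inside the process.
    return visible.index(gpu_id)
```

Inside a trial, the second argument would presumably come from `os.environ.get("CUDA_VISIBLE_DEVICES")`, and the result is what would be passed as the index to `torch.device('cuda', index)`. With a single visible device the result is always 0, which matches the fix described above.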