Anditty
Maybe because of this:

```python
#worker_str = f"-H {hostfile} "
worker_str = ""
```

in server.py, lines 175-176.
> Maybe because of this:
>
> ```
> #worker_str = f"-H {hostfile} "
> worker_str = ""
> ```
>
> in server.py 175-176

Use `worker_str = f"-H {hostfile}`...
I don't think it should return NaN when the backdoor is null. If there is no backdoor, then we don't need to control for anything. That is to say, we can just...
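
To make the point above concrete, here is a minimal sketch of backdoor adjustment in plain Python. It is not any library's implementation. The function name `adjusted_probability`, the column keys `"X"`, `"Y"`, `"Z"`, and the toy rows are all hypothetical. It assumes that with an empty adjustment set the estimate should fall back to the ordinary conditional P(Y | X) instead of NaN.

```python
from collections import Counter


def adjusted_probability(rows, x, y, adjustment_set):
    """Estimate P(Y=y | do(X=x)) from a list of dict rows via backdoor adjustment.

    Hypothetical helper: with an empty adjustment_set, every row falls into the
    single stratum (), so the result is the plain conditional P(Y=y | X=x).
    """
    def stratum_of(row):
        return tuple(row[z] for z in adjustment_set)

    strata = Counter(stratum_of(r) for r in rows)
    total = len(rows)
    result = 0.0
    for stratum, count in strata.items():
        matching = [r for r in rows if stratum_of(r) == stratum and r["X"] == x]
        if not matching:
            continue  # no rows with X=x in this stratum; skipped in this sketch
        p_y = sum(r["Y"] == y for r in matching) / len(matching)
        result += p_y * count / total  # weight by P(stratum)
    return result


# Toy data, made up for illustration only.
rows = [
    {"X": 1, "Y": 1, "Z": 0},
    {"X": 1, "Y": 0, "Z": 0},
    {"X": 1, "Y": 1, "Z": 1},
    {"X": 0, "Y": 0, "Z": 1},
]
no_adjustment = adjusted_probability(rows, x=1, y=1, adjustment_set=[])
with_z = adjusted_probability(rows, x=1, y=1, adjustment_set=["Z"])
```

With an empty list the loop runs over one stratum and returns a finite number, which is the behaviour argued for here.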
@ankurankan Thanks for your answer! But I don't think \sum_{i={1, 2, 3}} P(Y | do (X), Z = i) = P(Y | do (X), Z = 1 or Z =...
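
A small numeric check may help here. It assumes the right-hand side conditions on Z taking any of the listed values, and it uses made-up probabilities. Since P(· | do(X)) is itself a probability distribution, the usual rule applies: conditioning on the event "Z in S" gives a weighted average of the per-value conditionals, not their plain sum.

```python
from fractions import Fraction

# Hypothetical numbers: P(Z = i | do(X)), uniform on {1, 2, 3}.
p_z = {1: Fraction(1, 3), 2: Fraction(1, 3), 3: Fraction(1, 3)}
# Hypothetical values of P(Y = 1 | do(X), Z = i) for each i.
p_y_given_z = {1: Fraction(1, 2), 2: Fraction(3, 5), 3: Fraction(7, 10)}

# Plain, unweighted sum of the conditionals.
unweighted_sum = sum(p_y_given_z.values())

# Conditioning on "Z in {1, 2, 3}" weights each term by P(Z = i | Z in S).
p_s = sum(p_z.values())
pooled = sum(p_y_given_z[i] * p_z[i] / p_s for i in p_z)

print(unweighted_sum, pooled)
```

With these numbers the unweighted sum exceeds 1, so it cannot be a probability, while the pooled value stays within [0, 1].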
I found that if I use the `--deepspeed ds_config.json` option, then `print(trainer.model.state_dict()['model.layers.30.mlp.gate_proj.weight'])` prints `tensor([], device='cuda:0', dtype=torch.float16)`. The README.md also mentions that FSDP full_shard mode is used, but FSDP...
Which setup would have given you a score of 84? Swarm or multi-mode?