ValeGian
> Fix code format

Done with https://github.com/sgl-project/SpecForge/pull/208/commits/06cdfeb6af6c6cd661479762a004767bd4b521de
> May I ask if you ran the training on the device mentioned above? When I use your script to train the model on 8×H20 GPUs (96 GB each), it...
@ZhengHSI note that the training run whose curves I reported was tuned to run on the 8xH200 node. The complete set of parameters recorded in MLflow was ``` Name...
> In addition, I tried training several times, but the loss and accuracy have always remained at 0. I saw in your previous answer that you also encountered this situation....
@ZhengHSI I confirmed that recent merges from main broke the PR. You can find the fixes in commit https://github.com/sgl-project/SpecForge/pull/208/commits/ab36686db3a8aeb2aebc55d8f8a04d6b05e58122. I verified that it works correctly using [visualize_loss_mask](https://github.com/sgl-project/SpecForge/blob/07157bd2957eb4c417af0070b2a8e36679690d1d/tests/test_preprocessing.py#L12). I also updated the...
@ZhengHSI any update about this?
@ZhengHSI it seems the latest merge from main broke the tests.
@ZhengHSI is there anything I need to do on my side so that this PR can be closed?