Fix: Remove duplicate W&B initialization in offline mode (#3818)

Open shantanugupta2004 opened this issue 1 month ago • 0 comments

What does this PR do?

This PR addresses an issue where using accelerate with Weights & Biases (W&B) in offline mode initialized duplicate W&B runs. As a result, two "offline-run-..." directories were created for a single training process, which is unintended and redundant.

The problem stemmed from the WandBTracker.store_init_configuration method. In offline mode, this method explicitly called wandb.init() again to include the run's configuration, even though wandb.init() had already been called by WandBTracker.start(). This redundant initialization created a new W&B run, duplicating the logging process.

This PR resolves the issue by removing the second, offline-mode-specific wandb.init() call from WandBTracker.store_init_configuration. The method now always uses wandb.config.update(values, allow_val_change=True) to update the run's configuration, which adds the configuration to the existing W&B run without triggering a new initialization.

With this fix, only a single W&B run is initialized and maintained throughout the training process in offline mode, so only one "offline-run-..." directory is created per training process.
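The change described above can be illustrated with a minimal sketch. It does not reproduce the accelerate source: FakeWandb, FakeConfig, SketchTracker, count_offline_runs, and the fixed flag are hypothetical stand-ins introduced only to show why a second init() call yields a second run while a config update does not. Only the method names start and store_init_configuration and the update(values, allow_val_change=True) call shape are taken from the description.

```python
class FakeConfig(dict):
    """Hypothetical stand-in for a W&B run config."""

    def update(self, values, allow_val_change=False):
        for key, value in values.items():
            changed = key in self and self[key] != value
            if changed and not allow_val_change:
                raise ValueError(f"config key {key!r} would change")
            self[key] = value


class FakeRun:
    def __init__(self, config=None):
        self.config = FakeConfig(config or {})


class FakeWandb:
    """Hypothetical stand-in for the wandb module; records every init() call."""

    def __init__(self):
        self.runs = []

    def init(self, project, config=None):
        run = FakeRun(config)
        self.runs.append(run)  # each init() creates one run (one offline-run dir)
        return run


class SketchTracker:
    """Simplified tracker; `fixed` toggles between the old and new behavior."""

    def __init__(self, backend, project, offline, fixed):
        self.backend = backend
        self.project = project
        self.offline = offline
        self.fixed = fixed
        self.run = None

    def start(self):
        # The run is initialized once here.
        self.run = self.backend.init(project=self.project)

    def store_init_configuration(self, values):
        if self.offline and not self.fixed:
            # Old behavior as described: re-initialize in offline mode.
            self.run = self.backend.init(project=self.project, config=values)
        else:
            # New behavior as described: update the existing run's config.
            self.run.config.update(values, allow_val_change=True)


def count_offline_runs(fixed):
    backend = FakeWandb()
    tracker = SketchTracker(backend, project="demo", offline=True, fixed=fixed)
    tracker.start()
    tracker.store_init_configuration({"lr": 1e-3})
    return len(backend.runs), dict(tracker.run.config)


print(count_offline_runs(fixed=False)[0])  # 2 runs before the change
print(count_offline_runs(fixed=True)[0])   # 1 run after the change
```

In both cases the configuration ends up attached to the active run; the difference is only in how many runs were created along the way.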

Fixes #3818

Before submitting

  • [x] Did you read the contributor guideline?
  • [x] Was this discussed/approved via a Github issue or the forum? (#3818)
  • [ ] Did you write any new necessary tests?

shantanugupta2004, Dec 14 '25 07:12