
Implement validation loss using FID scores and add corresponding documentation

Open IndigoDosSantos opened this issue 1 year ago • 4 comments

Description

This pull request implements validation loss using FID (Fréchet Inception Distance) scores and provides comprehensive documentation for the feature (intended for the wiki). The implementation includes:

  • Calculation of FID scores at regular intervals (currently after each epoch) using a separate validation image set
  • Storage of FID scores and generated images in the "epochs" folder within the workspace directory
  • Logging of FID scores to TensorBoard for visualization and monitoring
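The core of an FID computation is the Fréchet distance between two Gaussians fitted to Inception feature activations of the real (validation) and generated image sets. As a minimal sketch of that final step only, here is the distance on precomputed statistics; in practice the PR would first extract Inception features, and the function name `frechet_distance` is illustrative, not taken from the PR:

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Fréchet distance between N(mu1, sigma1) and N(mu2, sigma2):
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrt(sigma1 @ sigma2))."""
    diff = mu1 - mu2
    # Matrix square root of the covariance product; may pick up a tiny
    # imaginary component from numerical error, which we discard.
    covmean, _ = linalg.sqrtm(sigma1 @ sigma2, disp=False)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return float(diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean))
```

Identical distributions yield a distance of 0; lower scores mean the generated distribution is closer to the validation set.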

The accompanying documentation covers the following aspects:

  • Explanation of validation loss and its implementation using FID scores
  • Description of how validation loss complements other training metrics
  • Guidance on interpreting validation loss and its benefits for monitoring model performance
  • Details on the relationship between FID scores, model generalization, and overfitting
  • Recommendations for the size of the validation set, with 15% of the total dataset being a good middle ground
  • Implementation considerations for effectively utilizing validation loss, including:
    • Creating a separate validation image set
    • Configuring the "validation_images" concept in concepts.json
    • Storing FID scores and generated images in the "epochs" folder
    • Calculating FID scores after each epoch and logging them to TensorBoard
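The "validation_images" concept described above could be located in concepts.json along these lines. This is a hedged sketch: the concept name "validation_images" is the convention this PR proposes, and `load_validation_concept` is a hypothetical helper, not part of OneTrainer's API:

```python
import json

def load_validation_concept(concepts_path, name="validation_images"):
    """Return the concept entry used for validation images, or None.

    Assumes concepts.json is a JSON list of concept objects, each with
    at least a "name" field (the PR's convention, not a stable schema).
    """
    with open(concepts_path) as f:
        concepts = json.load(f)
    for concept in concepts:
        if concept.get("name") == name:
            return concept
    return None
```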

IndigoDosSantos · Jun 07 '24 17:06

I like the idea of adding validation loss. But there are several issues with your implementation that would need to be resolved before it can be merged.

Just naming a few here, but there are definitely more problems.

  1. There is no configuration and everything is hidden from the user. This will lead to a lot of confusion, and could even break future models that aren't generative image models.
  2. You hard code multiple paths and file name patterns.
    1. The workspace directory is configurable by the user, but you hard code it as epochs_dir = "workspace/run/epochs".
    2. You assume that the concepts are always stored in os.path.join(os.path.dirname(__file__), "..", "..", "training_concepts"), but that file can be stored anywhere, or even inside the TrainConfig object.
    3. The sample file names might change in the future, but you assume they always follow a fixed naming scheme.
    4. You assume that there is a special concept for validation images instead of adding a configuration option for it.
    5. ctypes.windll.kernel32.SetFileAttributesW(os.path.join(self.config.workspace_dir, "epochs"), 0x02): this only works on Windows. And why would you do this anyway?
  3. calculate_fid_scores.py is saved in scripts, but it's not executable. Scripts should all have the same structure; you can compare with any of the existing ones.
  4. What happens if the user doesn't sample regularly? Then the validation score can't be calculated.
  5. sys.path.append(scripts_dir): why? Just use an import statement.
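Points 2.1–2.4 all come down to deriving paths and names from user configuration instead of hard coding them. A minimal sketch of that direction, assuming a hypothetical config object (OneTrainer's actual TrainConfig has a different shape, and these field names are illustrative):

```python
import os
from dataclasses import dataclass

@dataclass
class ValidationConfig:
    # Hypothetical fields, not OneTrainer's real TrainConfig.
    workspace_dir: str          # user-configurable, never assumed
    concepts_path: str          # wherever the user stores concepts.json
    validation_concept: str = "validation_images"  # configurable, not special-cased

    def epochs_dir(self) -> str:
        # Derived from the configured workspace instead of the
        # hard-coded "workspace/run/epochs".
        return os.path.join(self.workspace_dir, "epochs")
```

With this shape, the FID code never needs to guess where the workspace or the concepts file lives; everything flows from the config the user already controls.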

Nerogar · Jun 08 '24 12:06

@Nerogar All accurate concerns, and I agree with all of them.

FurkanGozukara · Jun 08 '24 22:06

@Nerogar

Thanks for reviewing the PR and providing feedback. I'll carefully consider it and work on the necessary changes as time permits.

IndigoDosSantos · Jun 09 '24 12:06

Addressing point 1:

  • Add configuration options as requested, possibly for(?):
    • [ ] Validation image set
  • [ ] Ensure all relevant information is visible to the user, not hidden
  • Regarding compatibility with non-generative image models:
    • FID is specifically designed for comparing image distributions and is not suitable for non-generative models
    • Other metrics would be needed to evaluate non-generative models
    • This implementation was never intended to introduce a validation loss metric for models beyond generative image models
    • Similar concepts can be applied to other domains such as text or audio, but FID itself only compares image distributions
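One way to keep FID from breaking non-generative models, as a hedged sketch (the registry and model-type names here are hypothetical, not OneTrainer's): dispatch the validation metric by model type, so model types without a registered metric simply skip validation instead of failing.

```python
# Hypothetical registry mapping model type to a validation metric name.
# FID only applies to generative image models; other model types would
# register their own metrics here, or none at all.
VALIDATION_METRICS = {
    "generative_image": "fid",
}

def validation_metric_for(model_type):
    """Return the metric name for a model type, or None if validation
    is unsupported for that type (in which case validation is skipped)."""
    return VALIDATION_METRICS.get(model_type)
```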

The current pull request places the validation image configuration within the concepts tab, which is not ideal. To improve the user experience and address organizational issues, I propose the following changes:

  1. Create a dedicated tab for training data, including both concepts and validation images[^1]
  2. Rename the "data" tab to "data processing" to clarify its purpose
  3. Reorganize the UI to group related settings and remove any confusion

Example for the new structure:

    General
    Model
    Data Processing
    ├── Aspect Ratio Bucketing
    ├── Latent Caching
    └── Clear Cache Before Training
    Training Data
    ├── Concepts
    └── Validation Images
    Training
    Sampling
    Backup
    Tools
    Additional Embeddings

@Nerogar : What do you think about this?

[^1]: Concepts and validation images are both subsets of the training data. Concepts are curated sets of images used to guide the model's adaptation towards specific subjects or styles, while validation images are randomly selected from the training data to evaluate the model's performance on unseen data.

IndigoDosSantos · Jun 09 '24 14:06

Closing this as there hasn't been any activity in the last few months. We already have a different validation method now that should work even better for the usual training objectives.

Nerogar · Dec 28 '24 14:12