rinongal

113 comments by rinongal

Hi, can you please check the log directory for the images named samples_scaled and post the last one here? Prompting with "a photo of " should produce pretty much the...
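Grabbing the latest of those images is a one-liner if the files sort by step. A minimal sketch, assuming the log images are numbered so that lexicographic order matches training order (the exact naming scheme is an assumption, not confirmed by the comment):

```python
from pathlib import Path

def latest_scaled_sample(log_dir):
    """Return the most recent samples_scaled image in log_dir, or None.

    Assumes filenames embed a zero-padded step count (e.g.
    samples_scaled_000200.png), so sorted() order is training order.
    """
    files = sorted(Path(log_dir).glob("samples_scaled*"))
    return files[-1] if files else None
```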

@genekogan That's indeed the case. A remnant from the LDM code where training usually just wouldn't finish. On cartoon styles: I've seen some people manage to learn them, for example...

@hopibel I would certainly expect identity to fail, yes, but here it is not even managing to copy the style. My first guess would be that the use of "*tom"...

Hey! The most likely candidate is just that our SD version isn't officially released yet because it's not behaving well under new prompts :) It's placing too much weight on...

Hi, first of all, regarding implementations of the evaluation script: you can find the relevant code to run similar evaluations [here](https://github.com/rinongal/textual_inversion/blob/main/evaluation/clip_eval.py) and [here](https://github.com/rinongal/textual_inversion/blob/main/scripts/evaluate_model.py). I just ran the latter file with...
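At its core, that evaluation is an average cosine similarity in CLIP embedding space, between generated images and either the concept's training images (reconstruction) or the prompt text (editability). A minimal sketch of the metric itself, with the CLIP embedding step left out (the real script uses a CLIP model to produce these vectors):

```python
import numpy as np

def avg_cosine_similarity(gen_embeds, target_embeds):
    """Mean pairwise cosine similarity between two embedding sets.

    gen_embeds:    (N, D) array of generated-image embeddings.
    target_embeds: (M, D) array of reference embeddings (training
                   images, or prompt text for text alignment).
    """
    a = gen_embeds / np.linalg.norm(gen_embeds, axis=1, keepdims=True)
    b = target_embeds / np.linalg.norm(target_embeds, axis=1, keepdims=True)
    return float((a @ b.T).mean())
```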

Hi, on the question of the LR controlling the tradeoff: the paper refers to the base_learning_rate, yes. However, please note that this isn't the only way to control the tradeoff....

Learning rate: The unfrozen yaml has two different learning rates, both of which are being used. The first is the base_learning_rate, which is used as before to optimize the model...
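The two-rate idea above amounts to optimizing different parameter groups with different learning rates. A minimal pure-Python sketch of that mechanism (the group structure here is illustrative; the repo's actual optimizer setup may differ):

```python
def sgd_step(param_groups):
    """One plain SGD step over parameter groups, each with its own LR.

    Illustrates the unfrozen config's two rates: e.g. one group for the
    learned embedding, another for the rest of the model weights.
    """
    for group in param_groups:
        lr = group["lr"]
        group["params"] = [p - lr * g
                           for p, g in zip(group["params"], group["grads"])]
    return param_groups
```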

Sorry for the late response. Busy week :) For guidance_scale_factor you probably want to keep it somewhere near the original values (so 7.0-10.0). Higher will likely give you saturation issues...
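For context on why large guidance values cause saturation: classifier-free guidance extrapolates past the conditional prediction by the scale factor, so big scales push outputs toward extreme values. A minimal sketch of the combination step (variable names are mine, not the repo's):

```python
import numpy as np

def cfg_combine(uncond, cond, guidance_scale=7.5):
    """Classifier-free guidance: move from the unconditional prediction
    toward (and past) the conditional one by guidance_scale.

    Larger scales follow the prompt harder, at the cost of saturation
    and artifacts; scale 1.0 recovers the plain conditional output.
    """
    uncond = np.asarray(uncond, dtype=float)
    cond = np.asarray(cond, dtype=float)
    return uncond + guidance_scale * (cond - uncond)
```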

If your image set is small and you are not training for long, you could also try enabling `per_image_tokens` in the config (note that this flag appears more than once)....

@oppie85 LDM's latent space is actually larger, and likely more expressive both due to the increased dimension and due to the fact that its encoder was jointly trained with the...