stable-diffusion-webui

[Bug]: Textual Inversion keeps getting "AssertionError: No inf checks were recorded for this optimizer."

Open silverhammer751 opened this issue 3 years ago • 17 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues and checked the recent builds/commits

What happened?

Earlier today I successfully trained an embedding. Now when I try to train an embedding I keep getting this error:

Traceback (most recent call last):
  File "C:\Users----\stable-diffusion-webui\modules\textual_inversion\textual_inversion.py", line 335, in train_embedding
    continue
  File "C:\Users----\stable-diffusion-webui\venv\lib\site-packages\torch\cuda\amp\grad_scaler.py", line 336, in step
    assert len(optimizer_state["found_inf_per_device"]) > 0, "No inf checks were recorded for this optimizer."
AssertionError: No inf checks were recorded for this optimizer.

Steps to reproduce the problem

  1. Go to Train tab
  2. Create an embedding
  3. Set up the training with deterministic latent sampling method
  4. Press Train Embedding

What should have happened?

It should start training the embedding.

Commit where the problem happens

bb11bee22ab02aa2fb5b96baa9be8103fff19e6a

What platforms do you use to access the UI?

Windows

What browsers do you use to access the UI?

Brave

Command Line Arguments

--medvram --opt-split-attention --share --gradio-auth ----:----

Additional information, context and logs

No response

silverhammer751 Nov 27 '22 23:11

I solved this by running source venv/bin/activate in the webui directory, followed by pip install -r requirements.txt.

ExponentialML Nov 28 '22 16:11

I solved this by running source venv/bin/activate in the webui directory, followed by pip install -r requirements.txt.

Tried this but I'm still getting the error. Do you know what got updated in the venv when you ran pip?

ADBoulding Nov 28 '22 16:11

I get the same error when using [filewords] in the prompt template file.

Works with [name].

Valiante99 Nov 29 '22 16:11

Seems to most often be due to a torch NaN error. What version of Python are you currently using and what is your GPU?

skdursh Nov 29 '22 18:11

Seems to most often be due to a torch NaN error. What version of Python are you currently using and what is your GPU?

I'm running 3.10.7 and an RTX 2080.

Valiante99 Nov 29 '22 19:11

Seems to most often be due to a torch NaN error. What version of Python are you currently using and what is your GPU?

I am running 3.10.6 and a 3090.

I get the same error when using [filewords] in the prompt template file.

Works with [name].

That's interesting, I'll test mine.

ADBoulding Nov 30 '22 01:11

I solved my problem by switching to Python 3.10.6.

silverhammer751 Dec 02 '22 03:12

I ran into the problem again, and the issue was that the name of the embedding wasn't found in the prompt (I use filenames). Renamed the files and it worked.

silverhammer751 Dec 02 '22 21:12

Can you explain the fix? Do I need to rename the dataset's files or something? I was trying to resume training an embedding and got this same error, even though the initial training went well with all the same parameters and paths.

LaikaSa Dec 03 '22 05:12

I ran into the problem again, and the issue was that the name of the embedding wasn't found in the prompt (I use filenames). Renamed the files and it worked.

I too would like a more detailed breakdown. The embedding is referred to in the prompt as [Name], and the text file associated with the dataset pictures is [filenames]. Am I renaming the text files so that webui identifies them better?

ADBoulding Dec 03 '22 23:12

I ran into the problem again, and the issue was that the name of the embedding wasn't found in the prompt (I use filenames). Renamed the files and it worked.

I too would like a more detailed breakdown. The embedding is referred to in the prompt as [Name], and the text file associated with the dataset pictures is [filenames]. Am I renaming the text files so that webui identifies them better?

OK, so I did this and it seems to be fixed: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/5383#issuecomment-1336357833

LaikaSa Dec 04 '22 11:12

For me the problem was the name of my embedding. It was _example_, and the underscores don't work if immediately followed by a comma. The solution was to use [name] , [filewords] (notice the space before and after the comma).

Signorlimone Dec 04 '22 13:12

When I ran into the issue after switching to 3.10.6, I discovered that my embedding name and the name I used in the filewords weren't lining up (e.g. my files were titled "a photo of subject" but the embedding was "subj3ct"). Renaming the files to "a photo of subj3ct" resolved the issue.
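A minimal sketch of the mismatch described above, assuming the training prompt is taken from each image's filename. The helper `files_missing_name` is hypothetical and not part of webui; it only lists the files whose name does not contain the embedding name, so they can be renamed before training.

```python
from pathlib import Path


def files_missing_name(filenames, embedding_name):
    """Return the filenames whose stem does not contain the embedding name.

    Hypothetical pre-flight check, not a webui function.
    """
    return [f for f in filenames if embedding_name not in Path(f).stem]


# Example data taken from the comment above.
files = ["a photo of subject.png", "a photo of subj3ct.png"]
print(files_missing_name(files, "subj3ct"))
```

Any file this reports would need the embedding name added to its filename for the prompt to mention the embedding.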

silverhammer751 Dec 04 '22 16:12

For me the problem was the name of my embedding. It was _example_, and the underscores don't work if immediately followed by a comma. The solution was to use [name] , [filewords] (notice the space before and after the comma).

I just put the comma right after [name], like [name], [filewords]. It still works just fine.

LaikaSa Dec 04 '22 23:12

I'm sorry, could you really dumb this down for me by telling me exactly which files I should be renaming/editing? I downloaded a bin file from a repo on https://huggingface.co/sd-concepts-library/trigger-studio and I thought all I had to do was drag, drop, and train.

P0rtF1ssure Dec 05 '22 06:12

When you train the embedding, you choose a "Prompt template file." The template file you use needs to include [name]. If it uses only [filewords] instead, the names of the files you are training with need to include the name of the embedding you are training.
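The rule above can be sketched as a quick check on a template line. This assumes the template is filled in by plain text substitution of [name] and [filewords]; the functions `expand_template` and `prompt_mentions_embedding` are hypothetical illustrations, not webui's actual code.

```python
def expand_template(line, name, filewords):
    """Fill in a prompt template line by simple text substitution (assumed behavior)."""
    return line.replace("[name]", name).replace("[filewords]", filewords)


def prompt_mentions_embedding(line, name, filewords):
    """Return True if the expanded prompt contains the embedding name."""
    return name in expand_template(line, name, filewords)


# A template with [name] always mentions the embedding.
print(prompt_mentions_embedding("a photo of [name], [filewords]", "subj3ct", "a dog"))

# A template with only [filewords] depends on what the filename contributes.
print(prompt_mentions_embedding("[filewords]", "subj3ct", "a photo of subject"))
print(prompt_mentions_embedding("[filewords]", "subj3ct", "a photo of subj3ct"))
```

If the check returns False for a template line and file pair, the embedding never appears in that prompt, which matches the situation reported in this thread.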

silverhammer751 Dec 08 '22 20:12

You have to rename the prompt template file.

Works with [name].

Just like Valiante99 said. Fixed mine!

toolhicks Apr 15 '23 03:04