[bug]: textual_inversion: not found on Mac M1
Is there an existing issue for this?
- [X] I have searched the existing issues
OS
macOS
GPU
mps
VRAM
16
What happened?
On InvokeAI 2.3, when launching option 3 from the command line, this error is printed:
Starting Textual Inversion:
./invoke.sh: line 50: exec: textual_inversion: not found
Screenshots
No response
Additional context
Fresh install of Invoke 2.3, did not update from 2.2.5.
Contact Details
No response
Sorry @ebr, but it still doesn't work in RC4. Here are the logs:
Starting Textual Inversion:
Traceback (most recent call last):
  File "/Users/jay/invokeai/.venv/bin/invokeai-ti", line 8, in <module>
    sys.exit(main())
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/ldm/invoke/training/textual_inversion.py", line 437, in main
    do_front_end(args)
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/ldm/invoke/training/textual_inversion.py", line 409, in do_front_end
    myapplication.run()
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/npyscreen/apNPSApplication.py", line 30, in run
    return npyssafewrapper.wrapper(self.__remove_argument_call_main)
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/npyscreen/npyssafewrapper.py", line 41, in wrapper
    wrapper_no_fork(call_function)
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/npyscreen/npyssafewrapper.py", line 97, in wrapper_no_fork
    return_code = call_function(_SCREEN)
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/npyscreen/apNPSApplication.py", line 25, in __remove_argument_call_main
    return self.main()
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/npyscreen/apNPSApplicationManaged.py", line 148, in main
    self.onStart()
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/ldm/invoke/training/textual_inversion.py", line 354, in onStart
    self.main = self.addForm(
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/npyscreen/apNPSApplicationManaged.py", line 55, in addForm
    fm = FormClass(parentApp=self, *args, **keywords)
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/ldm/invoke/training/textual_inversion.py", line 48, in __init__
    super().__init__(parentApp, name)
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/npyscreen/fmFormMultiPage.py", line 16, in __init__
    super(FormMultiPage, self).__init__(*args, **keywords)
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/npyscreen/fmForm.py", line 70, in __init__
    self.create()
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/ldm/invoke/training/textual_inversion.py", line 54, in create
    self.model_names, default = self.get_model_names()
  File "/Users/jay/invokeai/.venv/lib/python3.9/site-packages/ldm/invoke/training/textual_inversion.py", line 298, in get_model_names
    return (model_names, defaults[0])
IndexError: list index out of range
Here is a screen cap for better readability.
Hey, I have the same thing happening on an M1. The weird thing is that I've been able to access the textual-inversion frontend "randomly" twice today after upgrading to RC4, but I can't do it consistently (right now I'm stuck on this error again when just attempting to open the textual-inversion frontend).
Hi @ebr, now I can launch textual inversion, but an error occurs on Mac saying the hardware is unsupported. Is that normal?
Steps:   0%|          | 0/3000 [00:00<?, ?it/s]
[W NNPACK.cpp:53] Could not initialize NNPACK! Reason: Unsupported hardware.
** An exception occurred during training. The exception was: "slow_conv2d_cpu" not implemented for 'Half'
I solved this by forcing FP32 instead of the default FP16 (set mixed precision to "no").
edit: the Mixed Precision help text reads: Select the floating point precision for the embedding. "no" will result in full 32-bit precision, "fp16" will provide 16-bit precision, and "bf16" will provide mixed precision (only available when XFormers is used).
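For anyone wondering why fp16 blows up here, a minimal sketch (plain PyTorch, not InvokeAI code) that reproduces the same error: the CPU conv kernels in the PyTorch builds discussed in this thread don't implement half precision, while fp32 works. On newer PyTorch builds the half-precision CPU path may exist and the try block may succeed.

```python
# Minimal sketch, not InvokeAI code: reproduces the fp16-on-CPU failure
# reported above. fp32 ("mixed precision: no") takes the working path.
import torch

x = torch.randn(1, 3, 64, 64)
conv = torch.nn.Conv2d(3, 8, kernel_size=3)

try:
    conv.half()(x.half())  # -> RuntimeError: "slow_conv2d_cpu" not implemented for 'Half'
except RuntimeError as e:
    print("fp16 on CPU failed:", e)

out = conv.float()(x.float())  # fp32 path runs fine
print(out.shape)
```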
Thank you very much @jere76 for your solution. It now works for me too. However, all the calculations run on the CPU. Is that normal? Is there a way to run it via MPS for faster acceleration? If only we could make it run through the Neural Engine!
@ebr Maybe you should set FP32 by default when running on M1 Macs / arm64, since launching in FP16 results in an error.
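A hypothetical sketch of what that default could look like (this is not the actual InvokeAI code, and `default_mixed_precision` is a made-up helper name):

```python
# Hypothetical sketch of the suggested default, not InvokeAI's implementation:
# fall back to full fp32 ("no") on Apple Silicon, where fp16 currently errors.
import platform
import torch

def default_mixed_precision() -> str:
    on_apple_silicon = platform.machine() == "arm64" or torch.backends.mps.is_available()
    return "no" if on_apple_silicon else "fp16"

print(default_mixed_precision())
```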
I've been searching for a clear answer to that too, but I'm far from understanding it clearly. I've been following this because I thought it could be related to patchmatch not running (I now have that working), but it still runs on the CPU. See #728 and #262.
The index out of range errors are fixed in the current release candidate. What symptoms are you experiencing that tell you the MPS acceleration is not being used?
I believe you have to use FP32 to run on MPS, but I’m not a Mac guy.
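One quick sanity check, independent of InvokeAI, for whether PyTorch can reach the Apple GPU at all (assumes PyTorch >= 1.12):

```python
# Check that the MPS backend is built, available, and actually usable.
import torch

print("MPS built:    ", torch.backends.mps.is_built())
print("MPS available:", torch.backends.mps.is_available())
if torch.backends.mps.is_available():
    print(torch.ones(1, device="mps").device)  # expect "mps:0"
```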
@lstein Thanks for your answer. It's because the logs say in plain text that it is using the CPU. Here are the detailed logs from a quick 500-step run; I've put the CPU message in bold below:
Starting Textual Inversion:
DEBUG: args = {'model': 'stable-diffusion-1.5', 'resolution': 512, 'lr_scheduler': 'constant', 'mixed_precision': 'no', 'learnable_property': 'object', 'initializer_token': '★', 'placeholder_token': '
`logging_dir` is deprecated and will be removed in version 0.18.0 of 🤗 Accelerate. Use `project_dir` instead.
  warnings.warn(
02/08/2023 11:23:56 - INFO - ldm.invoke.training.textual_inversion_training - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
**Device: cpu**
Mixed precision type: no
{'prediction_type', 'variance_type'} was not found in config. Values will be initialized to default values.
{'class_embed_type', 'upcast_attention', 'mid_block_type', 'num_class_embeds', 'dual_cross_attention', 'only_cross_attention', 'use_linear_projection', 'resnet_time_scale_shift'} was not found in config. Values will be initialized to default values.
02/08/2023 11:23:58 - INFO - ldm.invoke.training.textual_inversion_training - ***** Running training *****
02/08/2023 11:23:58 - INFO - ldm.invoke.training.textual_inversion_training - Num examples = 400
02/08/2023 11:23:58 - INFO - ldm.invoke.training.textual_inversion_training - Num Epochs = 39
02/08/2023 11:23:58 - INFO - ldm.invoke.training.textual_inversion_training - Instantaneous batch size per device = 8
02/08/2023 11:23:58 - INFO - ldm.invoke.training.textual_inversion_training - Total train batch size (w. parallel, distributed & accumulation) = 32
02/08/2023 11:23:58 - INFO - ldm.invoke.training.textual_inversion_training - Gradient Accumulation steps = 4
02/08/2023 11:23:58 - INFO - ldm.invoke.training.textual_inversion_training - Total optimization steps = 500
Checkpoint 'latest' does not exist. Starting a new training run.
Steps:   0%|          | 0/500 [00:00<?, ?it/s]
@lstein @jere76 Thank you! Setting "Mixed Precision" to "no" (i.e. FP32) worked.
The software also tells me "Device: cpu". I also have full CPU load but hardly any load on the GPU.
Please enter 1, 2, 3, 4, 5, 6 or 7: [2] 3
Starting Textual Inversion:
DEBUG: args = {'model': 'stable-diffusion-1.5', 'resolution': 512, 'lr_scheduler': 'constant', 'mixed_precision': 'no', 'learnable_property': 'object', 'initializer_token': '★', 'placeholder_token': '', 'train_data_dir': '/Users/luis/Pictures/AI-Stable-Diffusion/text-inversion-training-data/samkoulkoso', 'output_dir': '/Users/luis/Pictures/AI-Stable-Diffusion/text-inversion-output/samkoulkoso', 'scale_lr': True, 'center_crop': False, 'enable_xformers_memory_efficient_attention': False, 'train_batch_size': 8, 'gradient_accumulation_steps': 4, 'num_train_epochs': 100, 'max_train_steps': 3000, 'lr_warmup_steps': 0, 'learning_rate': 0.0005, 'resume_from_checkpoint': 'latest', 'only_save_embeds': True}
/Users/l/Pictures/AI-Stable-Diffusion/.venv/lib/python3.10/site-packages/accelerate/accelerator.py:231: FutureWarning: `logging_dir` is deprecated and will be removed in version 0.18.0 of 🤗 Accelerate. Use `project_dir` instead.
  warnings.warn(
02/12/2023 16:02:09 - INFO - ldm.invoke.training.textual_inversion_training - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cpu
Mixed precision type: no
{'variance_type', 'prediction_type'} was not found in config. Values will be initialized to default values.
{'mid_block_type', 'upcast_attention', 'only_cross_attention', 'class_embed_type', 'dual_cross_attention', 'resnet_time_scale_shift', 'num_class_embeds', 'use_linear_projection'} was not found in config. Values will be initialized to default values.
02/12/2023 16:02:23 - INFO - ldm.invoke.training.textual_inversion_training - ***** Running training *****
02/12/2023 16:02:23 - INFO - ldm.invoke.training.textual_inversion_training - Num examples = 500
02/12/2023 16:02:23 - INFO - ldm.invoke.training.textual_inversion_training - Num Epochs = 188
02/12/2023 16:02:23 - INFO - ldm.invoke.training.textual_inversion_training - Instantaneous batch size per device = 8
02/12/2023 16:02:23 - INFO - ldm.invoke.training.textual_inversion_training - Total train batch size (w. parallel, distributed & accumulation) = 32
02/12/2023 16:02:23 - INFO - ldm.invoke.training.textual_inversion_training - Gradient Accumulation steps = 4
02/12/2023 16:02:23 - INFO - ldm.invoke.training.textual_inversion_training - Total optimization steps = 3000
Checkpoint 'latest' does not exist. Starting a new training run.
Fresh install of InvokeAI 2.3.0, macOS 13.2, M1 Max.
I'm seeing the same issues as @kayzen-ml and @MrKurtzmann. I'm on an M2 Max running macOS 13.2.
I also see the following when running `invokeai-ti` via the CLI (note the "Device: cpu"):
invokeai-ti \
--model=stable-diffusion-1.5 \
--resolution=512 \
--learnable_property=style \
--initializer_token='*' \
--placeholder_token='<test>' \
--train_data_dir=/home/lstein/invokeai/training-data/test \
--output_dir=/home/lstein/invokeai/text-inversion-training/test \
--scale_lr \
--train_batch_size=8 \
--gradient_accumulation_steps=4 \
--max_train_steps=3000 \
--learning_rate=0.0005 \
--resume_from_checkpoint=latest \
--lr_scheduler=constant \
--mixed_precision=no \
--only_save_embeds
/Users/christopherwinch/invokeai/.venv/lib/python3.9/site-packages/accelerate/accelerator.py:231: FutureWarning: `logging_dir` is deprecated and will be removed in version 0.18.0 of 🤗 Accelerate. Use `project_dir` instead.
warnings.warn(
02/17/2023 22:01:37 - INFO - ldm.invoke.training.textual_inversion_training - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: cpu
Mixed precision type: no
╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
│ /Users/christopherwinch/invokeai/.venv/bin/invokeai-ti:8 in <module> │
│ │
│ 5 from ldm.invoke.training.textual_inversion import main │
│ 6 if __name__ == '__main__': │
│ 7 │ sys.argv[0] = re.sub(r'(-script\.pyw|\.exe)?$', '', sys.argv[0]) │
│ ❱ 8 │ sys.exit(main()) │
│ 9 │
│ │
│ /Users/christopherwinch/invokeai/.venv/lib/python3.9/site-packages/ldm/invoke/training/textual_i │
│ nversion.py:441 in main │
│ │
│ 438 │ │ if args.front_end: │
│ 439 │ │ │ do_front_end(args) │
│ 440 │ │ else: │
│ ❱ 441 │ │ │ do_textual_inversion_training(**vars(args)) │
│ 442 │ except widget.NotEnoughSpaceForWidget as e: │
│ 443 │ │ if str(e).startswith("Height of 1 allocated"): │
│ 444 │ │ │ print( │
│ │
│ /Users/christopherwinch/invokeai/.venv/lib/python3.9/site-packages/ldm/invoke/training/textual_i │
│ nversion_training.py:622 in do_textual_inversion_training │
│ │
│ 619 │ │ │ │ if "epoch_*" not in gitignore: │
│ 620 │ │ │ │ │ gitignore.write("epoch_*\n") │
│ 621 │ │ elif output_dir is not None: │
│ ❱ 622 │ │ │ os.makedirs(output_dir, exist_ok=True) │
│ 623 │ │
│ 624 │ models_conf = OmegaConf.load(os.path.join(Globals.root, "configs/models.yaml")) │
│ 625 │ model_conf = models_conf.get(model, None) │
│ │
│ /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib │
│ /python3.9/os.py:215 in makedirs │
│ │
│ 212 │ │ head, tail = path.split(head) │
│ 213 │ if head and tail and not path.exists(head): │
│ 214 │ │ try: │
│ ❱ 215 │ │ │ makedirs(head, exist_ok=exist_ok) │
│ 216 │ │ except FileExistsError: │
│ 217 │ │ │ # Defeats race condition when another thread created the path │
│ 218 │ │ │ pass │
│ │
│ /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib │
│ /python3.9/os.py:215 in makedirs │
│ │
│ 212 │ │ head, tail = path.split(head) │
│ 213 │ if head and tail and not path.exists(head): │
│ 214 │ │ try: │
│ ❱ 215 │ │ │ makedirs(head, exist_ok=exist_ok) │
│ 216 │ │ except FileExistsError: │
│ 217 │ │ │ # Defeats race condition when another thread created the path │
│ 218 │ │ │ pass │
│ │
│ /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib │
│ /python3.9/os.py:215 in makedirs │
│ │
│ 212 │ │ head, tail = path.split(head) │
│ 213 │ if head and tail and not path.exists(head): │
│ 214 │ │ try: │
│ ❱ 215 │ │ │ makedirs(head, exist_ok=exist_ok) │
│ 216 │ │ except FileExistsError: │
│ 217 │ │ │ # Defeats race condition when another thread created the path │
│ 218 │ │ │ pass │
│ │
│ /Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib │
│ /python3.9/os.py:225 in makedirs │
│ │
│ 222 │ │ if tail == cdir: # xxx/newdir/. exists if xxx/newdir exists │
│ 223 │ │ │ return │
│ 224 │ try: │
│ ❱ 225 │ │ mkdir(name, mode) │
│ 226 │ except OSError: │
│ 227 │ │ # Cannot rely on checking for EEXIST, since the operating system │
│ 228 │ │ # could give priority to other errors like EACCES or EROFS │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
Yeah, I'm having the same issue with TI only running on the CPU. It's basically not worth using; it gets about 8 steps in an hour. By the way @kayzen-ml, just letting you know: you can use triple backticks to add multi-line code on GitHub.
``` This would appear as a multi-line code block ```
This (^) becomes this (⌄)
This would appear
as a multi-line code block
I also have the same problem. Does anyone have a solution?
Steps: 0%| | 1/3000 [12:08<484:22:11, 581.44s/it, loss=0.122, lr=0.016]
There has been no activity in this issue for 14 days. If this issue is still being experienced, please reply with an updated confirmation that the issue is still being experienced with the latest release.
Same problem as above. Happy to support the people working on this if it helps!
Textual inversion training
/Users/[...]/InvokeAI/.venv/lib/python3.10/site-packages/accelerate/accelerator.py:249: FutureWarning: `logging_dir` is deprecated and will be removed in version 0.18.0 of 🤗 Accelerate. Use `project_dir` instead.
warnings.warn(
04/04/2023 16:05:14 - INFO - ldm.invoke.training.textual_inversion_training - Distributed environment: NO
Num processes: 1
Process index: 0
Local process index: 0
Device: mps
Mixed precision type: no
{'variance_type', 'clip_sample_range'} was not found in config. Values will be initialized to default values.
{'mid_block_type', 'time_cond_proj_dim', 'conv_out_kernel', 'projection_class_embeddings_input_dim', 'timestep_post_act', 'time_embedding_type', 'resnet_time_scale_shift', 'class_embed_type', 'conv_in_kernel'} was not found in config. Values will be initialized to default values.
04/04/2023 16:05:23 - INFO - ldm.invoke.training.textual_inversion_training - ***** Running training *****
04/04/2023 16:05:23 - INFO - ldm.invoke.training.textual_inversion_training - Num examples = 1200
04/04/2023 16:05:23 - INFO - ldm.invoke.training.textual_inversion_training - Num Epochs = 25
04/04/2023 16:05:23 - INFO - ldm.invoke.training.textual_inversion_training - Instantaneous batch size per device = 3
04/04/2023 16:05:23 - INFO - ldm.invoke.training.textual_inversion_training - Total train batch size (w. parallel, distributed & accumulation) = 12
04/04/2023 16:05:23 - INFO - ldm.invoke.training.textual_inversion_training - Gradient Accumulation steps = 4
04/04/2023 16:05:23 - INFO - ldm.invoke.training.textual_inversion_training - Total optimization steps = 2500
Checkpoint 'latest' does not exist. Starting a new training run.
Steps:   0%|          | 0/2500 [00:00<?, ?it/s]
/Users/[...]/InvokeAI/.venv/lib/python3.10/site-packages/torch/autograd/__init__.py:197: UserWarning: The operator 'aten::native_group_norm_backward' is not currently supported on the MPS backend and will fall back to run on the CPU. This may have performance implications. (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/mps/MPSFallback.mm:11.)
Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
Steps: 79%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▎ | 1967/2500 [17:41:28<1:36:10, 10.83s/it, loss=nan, lr=0.006]
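A side note on the `aten::native_group_norm_backward` warning in the log above: PyTorch only silently falls back to the CPU for missing MPS kernels when `PYTORCH_ENABLE_MPS_FALLBACK=1` is set (whether the launcher sets it here is an assumption on my part, not something stated in this thread). A quick way to check what a given run is doing:

```python
# Hedged sketch: see whether the CPU fallback for missing MPS kernels is
# enabled in this environment, and where a test tensor actually lands.
import os
import torch

print("PYTORCH_ENABLE_MPS_FALLBACK =", os.environ.get("PYTORCH_ENABLE_MPS_FALLBACK"))
if torch.backends.mps.is_available():
    print(torch.ones(1, device="mps").device)  # expect "mps:0" when MPS is in use
```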