OneTrainer icon indicating copy to clipboard operation
OneTrainer copied to clipboard

Dataset Tools Rework

Open O-J1 opened this issue 9 months ago • 12 comments

Given the ongoing complaints/requests about captioning, I decided it was time for a comprehensive (and final) overhaul. The PR is currently marked as a draft since I fully expect Johnny to point out numerous issues or questionable decisions that deserve attention. To users reading this; OT will never be as focused as other dedicated tools, this is at the limit.

Additions:

  1. Visual redesign
  2. New captioning models added: Moondream2, WD EVA02 v3, WD Swinv2 v3, and JoyTag. Removed Blip1 due to it being garbage. Consider upgrading Blip2 to Blip3 once released, if feasible.
  3. Integrated Moondream+SAM2.1 for a grounded SAM style approach. Moondream performed notably better for object detection in my tests (though this is somewhat subjective). Additionally, its repository was significantly easier to work with compared to GS.
  4. Added common and practical image operations
  5. Implemented useful caption management operations
  6. The window now properly resizes, including img
  7. Added functionality including undo, redo, clear current caption, save button, and corresponding shortcuts, along with additional shortcut improvement.
  8. Multi-line caption support added (fixes previous issues with losing multi-line input)
  9. Fixed bugs related to samples handling
  10. File list now includes filtering capabilities
  11. Enabled opening of the file browser directly at the current directory for all platforms (previously Windows-only)
  12. Adds JXL support with a Pillow plugin (rust) as the PIL team does not seem to be moving to support it anytime soon and it offers lossless JPEG transcoding and significant space savings.
  13. A cursor indicating whether you have brush or fill on!

image

Initially, this PR was also supposed to include a samples rework, but the effort involved was beyond my expectations just reaching the current stage. I've self-reviewed it as best I can, but after looking at it for so long, I'm certain Ive become blind to some things.

If someone knows a well tested, lightweight ish photoreal replacement for Blip2, then I am open to outright replacing it but you have to provide lots of examples (preferably a peer reviewed paper)

P.S After this update, aside from major model improvements or truly groundbreaking developments (not incremental tweaks), I personally won't be addressing further data tool requests—and based on Nero's recent comments, I doubt he will either.

O-J1 avatar Mar 19 '25 15:03 O-J1

For masking, I currently I find it difficult to reason what the result of a auto-mask run will be as there is no preview or easy way to test what the end result will look like for any particular image without running it and seeing. I'd love the ability to easily "auto mask" just the current image, allowing the user to test/iterate on their prompt/settings without having to go through the "mask all data" flow.

maedtb avatar Mar 20 '25 21:03 maedtb

For masking, I currently I find it difficult to reason what the result of a auto-mask run will be as there is no preview or easy way to test what the end result will look like for any particular image without running it and seeing. I'd love the ability to easily "auto mask" just the current image, allowing the user to test/iterate on their prompt/settings without having to go through the "mask all data" flow.

That sounds like a really good idea. But maybe a bit much for this PR since it's already pretty big. Unless @O-J1 want's to implement it, I think it's better to submit a new feature request.

Nerogar avatar Mar 20 '25 22:03 Nerogar

For masking, I currently I find it difficult to reason what the result of a auto-mask run will be as there is no preview or easy way to test what the end result will look like for any particular image without running it and seeing. I'd love the ability to easily "auto mask" just the current image, allowing the user to test/iterate on their prompt/settings without having to go through the "mask all data" flow.

That sounds like a really good idea. But maybe a bit much for this PR since it's already pretty big. Unless @O-J1 want's to implement it, I think it's better to submit a new feature request.

@Nerogar I'll implement a basic toggle that just does the currently selected image (so just 1) but nothing more than this

O-J1 avatar Mar 20 '25 23:03 O-J1

I tried a full checkout of the new code, to a fresh directory: Ubuntu24, under WSL

ran ./install.sh

then tried ./start_ui.sh

It errored with

[OneTrainer] Using Python Venv environment in "venv"...
[OneTrainer] + python scripts/util/version_check.py 3.10 3.13
[OneTrainer] + python scripts/train_ui.py
Traceback (most recent call last):
  File "/home/phil/git/onetrainer-newtest/scripts/train_ui.py", line 10, in <module>
    from modules.ui.TrainUI import TrainUI
  File "/home/phil/git/onetrainer-newtest/modules/ui/TrainUI.py", line 7, in <module>
    from customtkinter import filedialog
  File "/home/phil/git/onetrainer-newtest/venv/lib/python3.12/site-packages/customtkinter/__init__.py", line 5, in <module>
    from tkinter import Variable, StringVar, IntVar, DoubleVar, BooleanVar
ModuleNotFoundError: No module named 'tkinter'

ppbrown avatar Mar 26 '25 19:03 ppbrown

@ppbrown, tkinter is a[n optional] core python library; it doesn't get installed by PIP or third-party apps. Instead, it should be provided by whoever gives you your Python install. On Ubuntu, that means you need to run something like apt install python3-tk as a super user. Hope that helps!

maedtb avatar Mar 26 '25 19:03 maedtb

Ooops! Thanks for reminding me of that. Woulid be nice if install.sh called that out, yeah?

ppbrown avatar Mar 26 '25 20:03 ppbrown

March 28th update. When trying to install today, SAM was able to build correctly (no updates have been done since on my side or the branch side). Scipy still does not install automatically for me, even though it is in the requirements file. I will try to see if I can troubleshoot this over the weekend.

Calamdor avatar Mar 28 '25 09:03 Calamdor

I played around with this and had some feedback:

  • Bug/Feature Request: The undo button doesn't seem to undo usages of the "clear" button (at least after clicking save, maybe always, I didn't test). I didn't notice I lost my caption and I couldn't recover the data. :(

  • Feedback: The "Clear" button clearing the caption was confusing to me. Clearing all data about an image was confusing, as my read of the top bar was "Mask Options and Save". I would prefer if this button only cleared the mask, otherwise I'd probably never use this button.

  • Feature Request: I am not sure what the use-case for the (existing) Recurse Subdirectories button is, but if this option is useful for someone, can it instead be defaulted on, or the setting saved so I don't have to click it every single time?

  • Feature Request: The Mask Dialog options do not save, which is very annoying since you pin updating the other UI to the Mask Dialog closing. Can you please make these settings persist through openings of the masking dialog? (Even if it's just per app-run)

  • Bug: The file list (ctk.CTkScrollableFrame) breaks when you open directory with a large amount of images. I forget what the limit is, but that UI element does not support even a modest amount of items. The current UI could be replaced with a tk.Listbox (wrapped in a ctk.CTkScrollbar) which does support scaling to a large amount of items.

I am happy to move these into new issues, out of the scope of this PR, but wanted to have the feedback out there while this is still being worked on.

maedtb avatar Jun 12 '25 21:06 maedtb

  • The undo button doesn't seem to undo usages of the "clear" button (at least after clicking save, maybe always, I didn't test).
    I didn't notice I lost my caption and I couldn't recover the data. :(

~~This is currently intended, the buffer is cleared after clicking*. I need multiple opinions on this.~~ Got them. Making it undoable

"Clear" button .. confusing

I can rename it to "Reset", definitely want to keep it, can try and add a mask only clear one. Unsure what to name it. image

Recurse Subdirectories

Was renamed to 'Include Subdirs' awhile back and is very useful for people. You dont need to enable if you dont need it.

The Mask Dialog options do not save ... maybe per app-run?

I will consider it.

Bug: The file list .. lags

Yep, its only been tested with small currently so that I could move quickly (with the amount of opens and closes... for dev). Will look into soon.

O-J1 avatar Jun 16 '25 17:06 O-J1

Everything you have mentioned @maedtb has been implemented or sufficiently addressed, please test and let me know. I only tested up to 1000 files this time. No idea how it will perform on some ridiculous amount.

O-J1 avatar Jun 17 '25 20:06 O-J1

Well that was rubbish.

O-J1 avatar Jun 18 '25 12:06 O-J1

Its absolutely not perfect but it works satisfactorily now. Marking ready for review. Many changes and refactors will probably have to happen but I am committed to this being merged at some point.

O-J1 avatar Jun 30 '25 12:06 O-J1