
Export Boxes

Open apt42541 opened this issue 1 year ago • 5 comments

Is your feature request related to a problem? Please describe. No.

Describe the solution you'd like I'm not sure whether PanelCleaner can train a bubble detector, but it would be good if it could. I want to export the boxes it generates, plus the ones I draw myself for undetected bubbles, as images. This is the result I want to get:

image

Describe alternatives you've considered

Additional context

apt42541 avatar Jul 26 '24 14:07 apt42541

This isn't something Panel Cleaner can do or will ever do out of the box, but that said, if you're training your own model, you surely know a bit of Python. In that case, yes, it can do it for you if you follow these simple steps:

  • Install Panel Cleaner so you can run it from source. That means cloning the repo, creating a virtual environment, installing the dependencies from the requirements.txt with pip and then launching the pcleaner/main.py or the pcleaner/gui/launcher.py file to start pcleaner. If that works, move on to step 2.
  • Insert the 4 lines marked with + into pcleaner/image_ops.py, as shown in the patch below (also available as a patch file: cutout_generator.txt). Don't type the + symbols themselves, of course; they only mark the new lines to add.
diff --git a/pcleaner/image_ops.py b/pcleaner/image_ops.py
index 5ffa74a..97bc6fc 100644
--- a/pcleaner/image_ops.py
+++ b/pcleaner/image_ops.py
@@ -505,6 +505,10 @@ def pick_best_mask(

     # Replace all given images with their corresponding cutouts.
     base = cut_out_box(base, reference_box)
+    cutouts_directory = analytics_page_path.parent / "cutouts"
+    cutout_file_name = analytics_page_path.stem + f"_{reference_box}.png"
+    cutouts_directory.mkdir(exist_ok=True)
+    base.save(cutouts_directory / cutout_file_name)
     precise_mask_cut = cut_out_mask(precise_mask, masking_box, base.size, x_offset, y_offset)
 
     # Check that the precise mask is not blank.

You can either do this manually or download the cutout_generator.txt, place it inside the PanelCleaner directory and then run the command git apply cutout_generator.txt to have git do it for you (if you cloned the repo, rather than just downloading a zip of the source code).

Then just run pcleaner from source. Every time you clean something, it'll now generate the bubble cutouts; you don't even need to save the exports. By default it'll produce results like this, exactly as you wanted: Screenshot_20240727_001342

You can adjust the padding via the regular old box padding in the preprocessor section of your current profile.
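For reference, the path logic in those four lines can be tried on its own. This is only a standalone sketch: the function name cutout_path and the box_label string are made up for illustration, and the real patch formats the box with whatever reference_box prints as.

```python
from pathlib import Path


def cutout_path(analytics_page_path: Path, box_label: str) -> Path:
    """Return <parent>/cutouts/<stem>_<box_label>.png, creating the folder.

    Mirrors the path handling of the patch; box_label is a stand-in for
    the string form of reference_box (an assumption, not the real format).
    """
    cutouts_directory = analytics_page_path.parent / "cutouts"
    # exist_ok=True makes repeated calls harmless once the folder exists.
    cutouts_directory.mkdir(exist_ok=True)
    return cutouts_directory / f"{analytics_page_path.stem}_{box_label}.png"
```

So the cutouts end up in a "cutouts" folder next to the page's analytics path, one PNG per box, named after the page and the box.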

Hope it helps, that's the best I can do for you.

Note that you will need to curate the results. Otherwise you'll be training an AI on AI data, which means garbage in, garbage out: you'll only be able to produce worse results.

VoxelCubes avatar Jul 26 '24 22:07 VoxelCubes

It works very well in the CLI, but I need it in the GUI, to create cutouts after adding more boxes in OCR mode. Can that be achieved?

apt42541 avatar Jul 27 '24 02:07 apt42541

So you want to do the OCR review and add new boxes using that review GUI? The images aren't processed again after doing the review, so you can't just hack it in there. In that case, I'd recommend a different workflow: Perform an OCR run with the output in CSV format and do the review for it. Manually check everything that way. Then, the OCR output in CSV format will be saved to a file. This file contains the image file path (relative to where the OCR output was saved) as well as the box coordinates and the text (but I guess you don't need the text). This means you have coordinates for all the text boxes in a file.

Just write your own tiny script that loads the CSV file, then loads each image from the path in the file, crops it to the box coordinates, and saves that as your desired output. It's so easy that ChatGPT could write it if you give it an example of the CSV file, like this:

filename,startx,starty,endx,endy,text
img1.jpg,923,73,1011,336,some text perhaps
img1.jpg,534,275,592,414,or nothing at all
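A minimal sketch of such a script, assuming the columns are exactly as in the example above and that the coordinates are pixel positions measured from the top-left corner. The function names and the output naming scheme are made up for illustration, and cropping relies on Pillow being installed:

```python
import csv
from pathlib import Path


def read_boxes(csv_path):
    """Yield (image_path, (startx, starty, endx, endy)) for each CSV row."""
    csv_path = Path(csv_path)
    with csv_path.open(newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            box = tuple(int(row[key]) for key in ("startx", "starty", "endx", "endy"))
            # Image paths are relative to where the OCR output was saved.
            yield csv_path.parent / row["filename"], box


def export_cutouts(csv_path, out_dir):
    """Crop every box out of its image and save it as a PNG in out_dir."""
    from PIL import Image  # Pillow; imported here so read_boxes works without it

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for image_path, box in read_boxes(csv_path):
        with Image.open(image_path) as image:
            # Pillow's crop takes (left, upper, right, lower), same order as the columns.
            cutout = image.crop(box)
            name = f"{image_path.stem}_{'_'.join(map(str, box))}.png"
            cutout.save(out_dir / name)
```

Calling export_cutouts("ocr_output.csv", "cutouts") (both names are placeholders) would then write one PNG per row, e.g. img1_923_73_1011_336.png for the first row of the example.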

I'd recommend reverting the 4-line patch from the previous comment if you're going to do this, as you'll now be generating cutouts a different way, making the previous method unnecessary.

This approach doesn't require running pcleaner from source, but if you do run it that way, you can apply this patch:

diff --git a/pcleaner/preprocessor.py b/pcleaner/preprocessor.py
index 0f43ac3..a785e38 100644
--- a/pcleaner/preprocessor.py
+++ b/pcleaner/preprocessor.py
@@ -286,7 +286,7 @@ def ocr_check(
     for i, box in enumerate(candidate_small_bubbles):
         cutout = base_image.crop(box.as_tuple)
         # cutout.save(outpath / f"{img_path.stem}_cutout_{i}.png")
-        text = mocr(cutout)
+        text = ""
         remove = is_not_worth_cleaning(text, ocr_blacklist_pattern)
         box_sizes.append(box.area)
         if remove:

to skip running OCR, which will save a lot of time. It just makes it so the text is always blank. That's not a problem if you don't need the text anyway; the boxes will be saved in the output either way.

Hope that helps!

VoxelCubes avatar Jul 28 '24 01:07 VoxelCubes

It looks like PanelCleaner 2.8.1 creates cutouts automatically when cleaning?

apt42541 avatar Jul 28 '24 02:07 apt42541

That's just your modified version.

VoxelCubes avatar Jul 28 '24 03:07 VoxelCubes