
change batch_size

Open MDD-0928 opened this issue 1 year ago • 3 comments

Dear author: DiffAttack is such a novel and meaningful work for the community! Thanks for your contribution!

I would like to ask whether I can change the batch_size when using the code, for example setting batch_size to 32 in order to generate 32 images at a time.

If so, which line of code in diff_attack_latent.py should I modify?

MDD-0928 avatar Sep 21 '24 02:09 MDD-0928

Hi @MDD-0928 ,

The most straightforward way to implement batch parallelism is by using multiprocessing to handle the image batches. You can modify the code starting at https://github.com/WindVChen/DiffAttack/blob/d66b1e79f1cd706b8deeedbe83e07925b7216e53/main.py#L162 so that the images to be processed are split across different processes. This should be fairly simple and won't require changes to the core code in diff_attack_latent.py.
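For readers who want a starting point, here is a minimal sketch of that idea using Python's standard multiprocessing module. It is not the repository's code: `attack_one_image`, `worker`, `run_parallel`, and `split_round_robin` are hypothetical names, and `attack_one_image` is only a placeholder for whatever per-image work main.py performs. The sketch also assumes one GPU per worker process, selected through the `CUDA_VISIBLE_DEVICES` environment variable.

```python
import multiprocessing as mp
import os


def split_round_robin(items, n_chunks):
    """Deal items into at most n_chunks lists, round-robin, dropping empty ones."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    chunks = [items[i::n_chunks] for i in range(n_chunks)]
    return [chunk for chunk in chunks if chunk]


def attack_one_image(image_path):
    """Hypothetical placeholder for the per-image work done in main.py."""
    return image_path  # replace with the real per-image call


def worker(job):
    """Process one chunk of images inside its own process."""
    gpu_id, image_paths = job
    # Assumption: one GPU per worker; set before any CUDA initialisation.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return [attack_one_image(path) for path in image_paths]


def run_parallel(image_paths, n_workers):
    """Run each chunk in a separate process and flatten the results."""
    jobs = list(enumerate(split_round_robin(image_paths, n_workers)))
    if not jobs:
        return []
    ctx = mp.get_context("spawn")  # spawn is the safe start method with CUDA
    with ctx.Pool(processes=len(jobs)) as pool:
        per_chunk = pool.map(worker, jobs)
    # Results come back grouped by chunk, not in the original image order.
    return [result for chunk in per_chunk for result in chunk]
```

Because the sketch uses the spawn start method, `run_parallel(...)` should be called from under an `if __name__ == "__main__":` guard in the launching script. The number of workers should be chosen with the per-image memory requirement in mind.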

Also, please make sure you have enough computing resources, as processing a single image currently requires about 16GB of memory.

Hope this helps!

WindVChen avatar Sep 21 '24 17:09 WindVChen

Thanks for your reply! I would like to know why you define "prompt" as prompt = [imagenet_label.refined_Label[label.item()] + " " + target_prompt] * 2. Why does the prompt need the " * 2"? And why does the text_encoder first take prompt[0] as input at line 294, and then take prompt as input at line 341?

MDD-0928 avatar Sep 22 '24 08:09 MDD-0928

Hi @MDD-0928 ,

Sorry for the delayed reply.

The first prompt[0] is used to obtain the optimized uncond_embeddings for an empty prompt (i.e., ""), so we only need one latent for the calculation, corresponding to a single prompt text [text].

On the other hand, the second prompt, which contains two identical texts [text, text], is used for calculating the structure loss (refer to Section 3.4 of our paper). In this case, we require two latents: one is the original latent, and the other is the optimized latent. That's why we use a prompt with two text embeddings instead of prompt[0].
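As a toy illustration of the batch sizes involved (not the repository's code), the sketch below shows what the `* 2` does to the prompt list and how each use lines up with the number of latents. `toy_encode` is a hypothetical stand-in for the real text encoder, the label and target strings are made-up values, and the latents are represented by placeholder names rather than tensors.

```python
def toy_encode(texts):
    """Toy stand-in for a text encoder: returns one embedding per input string."""
    return [[float(ord(ch)) for ch in text] for text in texts]


label_text = "tabby cat"  # hypothetical label text
target_prompt = "photo"   # hypothetical value
prompt = [label_text + " " + target_prompt] * 2  # list repetition: [text, text]

# First use: a single text, paired with a single latent.
single = toy_encode([prompt[0]])

# Second use: both texts, paired with a batch of two latents.
latents = ["original_latent", "optimized_latent"]  # placeholders for real tensors
pair = toy_encode(prompt)

# One text embedding per latent in each case.
batch_sizes = (len(single), len(pair), len(latents))
```

The two entries of `pair` are identical because the texts are identical; the duplication exists only so that the text batch has the same size as the latent batch.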

Hope this clears things up!

WindVChen avatar Sep 23 '24 20:09 WindVChen