small input image dimensions fail

Open Manoa1911 opened this issue 6 months ago • 3 comments

CUDA: 11.8
Windows: 10
Card: GA102
cuDNN: 8.9.7 (latest)

'''
python test_ccsr_tile.py ^
--image_path X:\input-CCSR2 ^
--pretrained_model_path ....\SD21-base\stable-diffusion-2-1-base ^
--controlnet_model_path ....\CCSR-CCSR-v2.0\preset\models\controlnet ^
--vae_model_path ....\CCSR-CCSR-v2.0\preset\models\vae ^
--sample_method ddpm --num_inference_steps 15 --t_min 0.0 --start_point lr --start_steps 0 --guidance_scale 1.0 --sample_times 1 --use_vae_encode_condition --upscale 4 ^
--output_dir X:\CCSR2-i15-g1.0-s0 --conditioning_scale 1.0 --tile_vae --tile_diffusion
'''

'''
parser.add_argument("--tile_diffusion", action="store_true", help="Optionally! Enable tile-based diffusion")
parser.add_argument("--tile_diffusion_size", type=int, default=4096, help="Tile size for diffusion")
parser.add_argument("--tile_diffusion_stride", type=int, default=2048, help="Stride size for diffusion tiles")
parser.add_argument("--tile_vae", action="store_true", help="Optionally! Enable tiling for VAE")
parser.add_argument("--vae_decoder_tile_size", type=int, default=224, help="Tile size for VAE decoder")
parser.add_argument("--vae_encoder_tile_size", type=int, default=2496, help="Tile size for VAE encoder")
'''

'''
[Tiled VAE]: the input size is tiny and unnecessary to tile.
Traceback (most recent call last):
  File "X:\CCSR-CCSR-v2.0\test_ccsr_tile.py", line 297, in <module>
    main(args)
  File "X:\CCSR-CCSR-v2.0\test_ccsr_tile.py", line 202, in main
    inference_time, image = pipeline(
  File "X:\WPy64-31090\python-3.10.9.amd64\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "X:\CCSR-CCSR-v2.0\pipelines\pipeline_ccsr.py", line 1031, in __call__
    latents, x0_T = self._initial_step(do_classifier_free_guidance, latent_model_input, t, t_tao, prompt_embeds, image, vae_encode_condition_hidden_states, tile_diffusion, tile_size, tile_stride)
  File "X:\CCSR-CCSR-v2.0\pipelines\pipeline_ccsr.py", line 887, in _initial_step
    noise_pred = self._predict_noise(latents, t, image, prompt_embeds, None, vae_conditions, tile_diffusion, tile_size, tile_stride, 1.0, False)
  File "X:\CCSR-CCSR-v2.0\pipelines\pipeline_ccsr.py", line 833, in _predict_noise
    noise_pred = self._tile_predict(latent_model_input, t, image, prompt_embeds, cross_attention_kwargs, vae_conditions, tile_size, tile_stride, conditioning_scale, guess_mode)
  File "X:\CCSR-CCSR-v2.0\pipelines\pipeline_ccsr.py", line 876, in _tile_predict
    noise_pred[:, :, hi:hi_end, wi:wi_end] += tile_noise * tile_weight
RuntimeError: The size of tensor a (64) must match the size of tensor b (512) at non-singleton dimension 3
'''

input image size: 128x256

Manoa1911, Jun 05 '25 12:06

Thanks for your feedback. Is the input image size 128x256x3? You may be able to resolve the issue by removing the '--tile_vae' and '--tile_diffusion' flags when the input size is small. Here's an example command you can try:

'''
python test_ccsr_tile.py ^
--image_path X:\input-CCSR2 ^
--pretrained_model_path ....\SD21-base\stable-diffusion-2-1-base ^
--controlnet_model_path ....\CCSR-CCSR-v2.0\preset\models\controlnet ^
--vae_model_path ....\CCSR-CCSR-v2.0\preset\models\vae ^
--sample_method ddpm --num_inference_steps 15 --t_min 0.0 --start_point lr --start_steps 0 --guidance_scale 1.0 --sample_times 1 --use_vae_encode_condition --upscale 4 ^
--output_dir X:\CCSR2-i15-g1.0-s0 --conditioning_scale 1.0
'''

csslc, Jun 10 '25 02:06

There are almost 10,000 images in the directory :( and each one is a different size. Is any code modification possible to make it skip tiling for small files?

Maybe in here?

    def _predict_noise(self, latent_model_input, t, image, prompt_embeds, cross_attention_kwargs, vae_conditions, tile_diffusion, tile_size, tile_stride, conditioning_scale, guess_mode):
        if not tile_diffusion:
            noise_pred = self._unet_predict(latent_model_input, t, image, prompt_embeds, cross_attention_kwargs, vae_conditions)
        else:
            noise_pred = self._tile_predict(latent_model_input, t, image, prompt_embeds, cross_attention_kwargs, vae_conditions, tile_size, tile_stride, conditioning_scale, guess_mode)
        return noise_pred

Thanks :)

Manoa1911, Jun 21 '25 18:06
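
One way to approach the question above is a per-image check that decides whether tiling makes sense before the pipeline is called. The sketch below is only an illustration: `needs_tiling` and its parameters are hypothetical names, not part of CCSR. It assumes the tile size is given in pixels and that the VAE downsamples by a factor of 8 (as in Stable Diffusion 2.1). Under those assumptions a 128x256 input at 4x upscale gives a 64x128 latent against a 512-wide tile, which would be consistent with the 64 vs. 512 mismatch in the error, though that reading has not been confirmed against the CCSR code.

```python
VAE_SCALE = 8  # assumption: the SD 2.1 VAE reduces each spatial side by 8


def needs_tiling(width, height, upscale, tile_size_px, vae_scale=VAE_SCALE):
    """Hypothetical helper: return True only when the upscaled latent
    covers at least one full tile in both dimensions."""
    latent_w = width * upscale // vae_scale
    latent_h = height * upscale // vae_scale
    tile_latent = tile_size_px // vae_scale
    # Conservative rule: if either side is smaller than a tile, skip tiling.
    return latent_w >= tile_latent and latent_h >= tile_latent


if __name__ == "__main__":
    # 128x256 input, 4x upscale, 4096-pixel tile: too small to tile.
    print(needs_tiling(128, 256, upscale=4, tile_size_px=4096))
    # 2048x2048 input, 4x upscale, 4096-pixel tile: large enough to tile.
    print(needs_tiling(2048, 2048, upscale=4, tile_size_px=4096))
```

The returned flag could then be applied per image in place of the global '--tile_diffusion' setting; where exactly to hook it into test_ccsr_tile.py is not shown in this thread.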

You can try:

'''
python test_ccsr_tile.py ^
--image_path X:\input-CCSR2 ^
--pretrained_model_path ....\SD21-base\stable-diffusion-2-1-base ^
--controlnet_model_path ....\CCSR-CCSR-v2.0\preset\models\controlnet ^
--vae_model_path ....\CCSR-CCSR-v2.0\preset\models\vae ^
--sample_method ddpm --num_inference_steps 15 --t_min 0.0 --start_point lr --start_steps 0 --guidance_scale 1.0 --sample_times 1 --use_vae_encode_condition --upscale 4 ^
--output_dir X:\CCSR2-i15-g1.0-s0 --conditioning_scale 1.0
'''

However, please note that CCSR cannot process images smaller than 512×512 pixels. I recommend resizing any images below this size to at least 512×512 to achieve better SR results.
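
For a directory with many differently sized images, the resize target could be computed per image. The following is a minimal sketch, not CCSR code: `min_side_target` is a hypothetical helper, and it assumes the 512 limit applies to the shorter side of the image and that the aspect ratio should be preserved.

```python
def min_side_target(width, height, min_side=512):
    """Hypothetical helper: return (new_width, new_height) so that the
    shorter side is at least `min_side`, keeping the aspect ratio.
    Images that are already large enough are returned unchanged."""
    short = min(width, height)
    if short >= min_side:
        return width, height
    # Integer ceiling division avoids floating-point rounding surprises.
    new_w = -(-width * min_side // short)
    new_h = -(-height * min_side // short)
    return new_w, new_h


if __name__ == "__main__":
    print(min_side_target(128, 256))
    print(min_side_target(600, 800))
```

The resulting size can be passed to any image library's resize call (for example Pillow's `Image.resize`) before the images are handed to the pipeline.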

csslc, Jul 17 '25 07:07