
Feature: Circular Padding

Open gartia opened this issue 1 year ago • 6 comments

This pull request introduces the circular padding feature into the ggml library.

struct ggml_tensor * padded_output = ggml_circular_padding(ctx, tensor_input, padding_amount);

I believe I wrote the code to be efficient and parallelizable. I'm not entirely familiar with C, so if anyone can review it and make sure it's not going to interfere with anything or grind things to a halt, that would be appreciated. = )

Example input and output with padding 1 (see attached image).

gartia avatar Dec 06 '23 08:12 gartia

@FSSRepo would you like to review this PR? I think you are planning to add ggml_pad - see if the 2 things play along together

ggerganov avatar Dec 07 '23 09:12 ggerganov

Honestly, that makes more sense now that I think about it. I'm sure someone in the future will probably want another type of padding as well, like torus padding, so I'll change it to align with that name.

gartia avatar Dec 07 '23 10:12 gartia

I'm going to follow up on this PR, although I'd prefer to at least understand the use case for this type of padding. According to @gartia , it's used in stable diffusion and is related to conv2d. The issue is that im2col already inherently performs the padding operation with zeros. It would be necessary to see if applying this ggml_pad_circular operation before conv2d has any impact on the way images are generated.

FSSRepo avatar Dec 07 '23 10:12 FSSRepo

It's mainly for generating textures, or repeating images such as panoramic ones. I have seen that there are also better implementations, such as wrapping the tensor like a torus, but I would rather keep it simple for now.

gartia avatar Dec 07 '23 10:12 gartia

If this function proves to be useful and easy to integrate with the conv2d operation, it will be added. Having an operation that no ggml project needs doesn't make sense to me, so its functionality must first be tested in stable diffusion. Also, I don't think it's a good idea to add this operation separately; it should be a mode of the ggml_pad operation, something like GGML_CIRCULAR_PADDING, GGML_ZERO_PADDING, to generalize and avoid code duplication.

For now, this PR could be a draft until I can dedicate time to thoroughly review this feature, as I still need to work on my PR #621.

FSSRepo avatar Dec 07 '23 14:12 FSSRepo

We could also have a single pad function instead of multiple, using an enum or something similar. That way, if someone needs to add more padding functionality in the future, it won't create a bunch of different functions doing almost the same operation. I believe PyTorch has four different padding modes for conv2d.
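A minimal sketch of what that enum-mode API could look like (the wrapper name `ggml_pad_ex` and its exact signature are my assumption, not an existing ggml symbol; the enum values match the ones used in the forward implementation below):

```c
// Hypothetical API sketch -- ggml_pad_ex and ggml_pad_type are assumed
// names, not existing ggml symbols.
enum ggml_pad_type {
    GGML_PAD_ZERO     = 0,  // pad with zeros (the default behavior)
    GGML_PAD_CIRCULAR = 1,  // wrap indices around the tensor edges
};

// one entry point, dispatching on the mode instead of one function per mode
struct ggml_tensor * ggml_pad_ex(
        struct ggml_context * ctx,
        struct ggml_tensor  * src,
        int                   padding,
        enum ggml_pad_type    type);
```

Adding a new mode would then only mean a new enum value and a new case in the compute switch, rather than a new public function.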

static void ggml_compute_forward_pad_f32(
        const struct ggml_compute_params * params,
        const struct ggml_tensor * src,
        struct ggml_tensor * dst) {

    if (params->type == GGML_TASK_INIT || params->type == GGML_TASK_FINALIZE) {
        return;
    }

    const int padding = dst->op_params[0];
    // the padding mode lives in its own op_params slot so it does not
    // alias the padding amount
    const enum ggml_pad_type pad_type = (enum ggml_pad_type) dst->op_params[1];
    const int ith = params->ith;
    const int nth = params->nth;

    const int64_t orig_height = src->ne[0];
    const int64_t orig_width = src->ne[1];
    const int64_t new_height = orig_height + 2 * padding;
    const int64_t new_width = orig_width + 2 * padding;

    float * src_data = (float *) src->data;
    float * dst_data = (float *) dst->data;

    int64_t total_elements = new_height * new_width;
    int64_t start_index = ith * (total_elements / nth);
    int64_t end_index = (ith + 1) == nth ? total_elements : (ith + 1) * (total_elements / nth);

    switch (pad_type) {
        case GGML_PAD_CIRCULAR:
            for (int64_t idx = start_index; idx < end_index; ++idx) {
                int64_t i = idx / new_width;
                int64_t j = idx % new_width;

                int64_t orig_i = (i - padding + orig_height) % orig_height;
                int64_t orig_j = (j - padding + orig_width) % orig_width;

                dst_data[i * new_width + j] = src_data[orig_i * orig_width + orig_j];
            }
            break;

        case GGML_PAD_ZERO:
        default:
            for (int64_t idx = start_index; idx < end_index; ++idx) {
                int64_t i = idx / new_width;
                int64_t j = idx % new_width;

                int64_t orig_i = i - padding;
                int64_t orig_j = j - padding;

                if (orig_i < 0 || orig_i >= orig_height || orig_j < 0 || orig_j >= orig_width) {
                    dst_data[i * new_width + j] = 0;
                } else {
                    dst_data[i * new_width + j] = src_data[orig_i * orig_width + orig_j];
                }
            }
            break;
    }
}

gartia avatar Dec 07 '23 18:12 gartia