"semiplanar" YUV images support

Open chaplin89 opened this issue 1 year ago • 14 comments

Is your feature request related to a problem? Please describe. Disclaimer: I barely know what I'm talking about. I played a lot with the configuration of YUView, and it seems I can't find a configuration that correctly renders "semiplanar" YUV images. I have a set of YUV frames with 4:2:0 chroma subsampling. In these frames, Y is a plane on its own, but U and V share a single plane.

The frames are 1984x1080 (that's 1080p with a stride), and the memory layout is:

  • Y, which is a 1984x1080 bytes matrix, followed by
  • U/V together in a single 1984x540 bytes matrix. On each row, first 992 bytes are U and the rest is V.
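
The layout described above can be sketched with NumPy as follows. This is only a minimal sketch, assuming 8-bit samples and a buffer that holds exactly one frame; `split_half_row_frame` and its parameters are hypothetical names, not part of YUView.

```python
import numpy as np

def split_half_row_frame(buf, stride=1984, height=1080, width=1920):
    """Split one frame (Y plane, then a half-row U|V plane) into Y, U, V arrays.

    Assumes 8-bit samples. `stride` is the stored row length in bytes,
    `width` the visible width in pixels.
    """
    data = np.frombuffer(buf, dtype=np.uint8)
    y_size = stride * height
    uv_rows = height // 2
    y = data[:y_size].reshape(height, stride)
    uv = data[y_size:y_size + stride * uv_rows].reshape(uv_rows, stride)
    half = stride // 2
    u = uv[:, :half]   # first half of each row holds U
    v = uv[:, half:]   # second half of each row holds V
    # Crop the padding columns beyond the visible width
    return y[:, :width], u[:, :width // 2], v[:, :width // 2]
```

With the default arguments this yields a 1080x1920 Y array and two 540x960 chroma arrays.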

Is there a way to render these frames in the current implementation of YUView or is it something that can be added?

Thanks!

Describe the solution you'd like Render a frame described above.

Describe alternatives you've considered n.a.

chaplin89 avatar Feb 27 '23 15:02 chaplin89

Hi! That sounds like a YUV format that I have not encountered yet. But there is a wild number of specialized YUV formats out there. Do you have any specification on this YUV format? Or if not, can you share where this came from? It would be great if you could provide a file in this format for me to test for implementation. Or a way how I can create one.

ChristianFeldmann avatar Feb 28 '23 08:02 ChristianFeldmann

Hi Christian, I didn't find any spec about this, I just found some reference here but I'm not even sure it's the same thing.

This format is used internally by Chromium. Sample. The format that chromium assigns is this: PIXEL_FORMAT_I420, 12bpp YUV planar 1x1 Y, 2x2 UV samples, a.k.a. YU12.

Not sure this is correct or makes sense, though. The following Python script shows the image correctly. Sorry if it's not polished; I bet there are tons of better ways to do this with numpy, but it's the first time I'm using it:

from PIL import Image
import numpy

def getyuv():
    y = []
    uv = []

    with open('single_frame.yuv', 'rb') as f:
        for i in range(0,1080):
            row = list(f.read(1984))
            y.append(row)
        for i in range(0,540):
            row = list(f.read(1984))
            uv.append(row)
    return y,uv

def convert_input(y,uv):
    output = numpy.full((1080, 1984,3), (0,0,0), dtype=numpy.uint8)
    for row in range(0,len(uv)):
        for column in range(0,int(len(uv[row])/2)-1):
            output[row*2][column*2] = (y[row*2][column*2], uv[row][column], uv[row][column+992])
            output[row*2][column*2+1] = (y[row*2][column*2+1], uv[row][column], uv[row][column+992])
            output[row*2+1][column*2] = (y[row*2+1][column*2], uv[row][column], uv[row][column+992])
            output[row*2+1][column*2+1] = (y[row*2+1][column*2+1], uv[row][column], uv[row][column+992])
    return output

y, uv = getyuv()
out = convert_input(y,uv)

# Trim the last 64 px of each row (garbage)
out_2 = numpy.full((1080, 1920,3), (0,0,0), dtype=numpy.uint8)
for row in range(0,len(out)):
    out_2[row] = out[row][:-64]

img = Image.fromarray(out_2, mode='YCbCr')
img.show()

chaplin89 avatar Feb 28 '23 08:02 chaplin89

Oddly enough, the memory dump of this image contains 541 rows in the UV matrix, which means it can't be opened directly in YUView in any case. The frame I shared in the previous comment does not contain these extra bytes. Maybe a feature to discard XX bytes from the beginning or the end of each frame (and maybe also from the beginning of a file) would be useful when dealing with this kind of raw data.
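
The idea of discarding bytes after each frame can be sketched in a few lines of Python. This is a hypothetical helper (`iter_frames` and `trailing_bytes` are made-up names, not a YUView feature), assuming the extra data always sits at the end of each frame.

```python
import io

def iter_frames(stream, frame_size, trailing_bytes=0):
    """Yield frame_size-byte frames, skipping trailing_bytes after each one."""
    while True:
        frame = stream.read(frame_size)
        if len(frame) < frame_size:
            return  # end of stream, or a truncated last frame
        yield frame
        stream.read(trailing_bytes)  # discard the per-frame padding

stride, height = 1984, 1080
# Y plane plus the half-height U/V plane, without the extra row
frame_size = stride * height + stride * (height // 2)
```

For the dump described here, the call would be something like `iter_frames(open('dump.yuv', 'rb'), frame_size, trailing_bytes=stride)`, where `dump.yuv` is a placeholder file name.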

chaplin89 avatar Feb 28 '23 08:02 chaplin89

UPDATE: I tried many different pix_fmt values in ffmpeg and none of them can decode the image either. It seems this is not a common format; maybe Chromium just uses it internally and it's not meant to be shared or stored on disk. After all, I was just trying to fix an issue in Chromium, so that may well be the case.

In any case, I built my own tooling for this. Probably we can close the issue as I don't think implementing this will bring any value to the project.

Here's a better script in the unlikely case someone else should run into the same issue:

from PIL import Image
import numpy
import os

class Convert:
    def __init__(self, column, row, stride, filename) -> None:
        self.column = column
        self.row = row
        self.stride = stride
        self.filename = filename
        self.fpos = 0
        if os.path.exists(self.get_destination()):
            os.unlink(self.get_destination())


    def get_single_frame(self):
        y = []
        uv = []

        with open(self.filename, 'rb') as f:
            if f.seek(self.fpos) == -1:
                return None,None
            y = list(f.read(self.stride*self.row))
            if len(y)==0:
                return None, None
            y = numpy.array(y)
            y = y.reshape(self.row, self.stride)

            half_row = int(self.row/2)
            uv = list(f.read(self.stride*half_row))
            if len(uv)==0:
                return None, None
            uv = numpy.array(uv)
            uv = uv.reshape(half_row, self.stride)
            # Skip the extra trailing line after each frame (contains garbage)
            self.fpos = f.tell() + self.stride

        return y,uv

    def trim(self, frame):
        output = numpy.full((self.row,self.column,3), (0,0,0), dtype=numpy.uint8)
        for row in range(0,self.row):
            # Keep only the visible columns (also works when stride == column)
            trimmed_row = frame[row][:self.column]
            output[row] = trimmed_row
        return output

    def show(self, frame):
        image = Image.fromarray(frame, mode='YCbCr')
        image.show()

    def merge_yuv(self,y,uv):
        output = numpy.full((self.row, self.stride, 3), (0,0,0), dtype=numpy.uint8)
        half_stride = int(self.stride/2)
        uv_rows = len(uv)
        for row in range(0,uv_rows):
            uv_columns = len(uv[row])
            # Each row holds half_stride U samples followed by half_stride V samples
            half_uv_columns = uv_columns // 2
            for column in range(0,half_uv_columns):
                output[row*2][column*2] = (y[row*2][column*2], uv[row][column], uv[row][column+half_stride])
                output[row*2][column*2+1] = (y[row*2][column*2+1], uv[row][column], uv[row][column+half_stride])
                output[row*2+1][column*2] = (y[row*2+1][column*2], uv[row][column], uv[row][column+half_stride])
                output[row*2+1][column*2+1] = (y[row*2+1][column*2+1], uv[row][column], uv[row][column+half_stride])
        return output

    def get_destination(self):
        return self.filename[:self.filename.rfind('.')] + ".converted.yuv"
    
    def save(self, y,uv):
        with open(self.get_destination(), 'ab') as f:
            f.write(bytes(y.reshape(y.shape[0]*y.shape[1]).tolist()))
            u = uv[:,0:int(uv.shape[1]/2)]
            v = uv[:,int(uv.shape[1]/2):]
            f.write(bytes(u.reshape(u.shape[0]*u.shape[1]).tolist()))
            f.write(bytes(v.reshape(v.shape[0]*v.shape[1]).tolist()))


target1 = 'your_file.yuv'

# change me according to target1 file spec
a = Convert(848,480,960, target1)

i=0
y, uv = a.get_single_frame()
while y is not None:
    a.save(y,uv)
    # Show an image every 20 frame for debug purposes
    if (i+1)%20 == 0:
        frame = a.merge_yuv(y,uv)
        frame = a.trim(frame)
        a.show(frame)
    y,uv = a.get_single_frame()
    i=i+1

This will take your_file.yuv in the format described above and generate `your_file.converted.yuv`, which can be encoded into H.264 with a command like this:

ffmpeg -f rawvideo -pixel_format yuv420p -video_size 960x480 -framerate 24 -i ./your_file.converted.yuv ./your_file.mp4

The only things that need to be changed according to the video spec are the parameters of the Convert constructor:

  • Cols, Rows -> The size of the visible portion of the frame
  • Stride -> The real number of columns in the frame (can be greater than or equal to Cols)

Cheers!

chaplin89 avatar Feb 28 '23 15:02 chaplin89

Ah wait a second! I think we do have support for these semi-planar files. At least for some of them. In the link you provided I saw the name NV12 and that rang a bell. We do support that. So you can open the YUV file and go to YUV Format ... custom. In the dialog you have to select the UV(A) interleaved checkbox. Can you try that? It may be what you are looking for.

Alternatively you can put nv12 into the name of the file and YUView should apply the format based on that.

ChristianFeldmann avatar Mar 01 '23 15:03 ChristianFeldmann

I think at this point I may have tried every combination of custom and non-custom decoding options, but the image is never displayed correctly. I think the option you're suggesting expects alternating Cb and Cr samples on each row of the UV matrix. The format I'm talking about has the first half of each row for Cb values and the second half for Cr values.
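
The difference between the two chroma arrangements can be illustrated with NumPy slicing. This is just a sketch with hypothetical function names, assuming `uv` is a 2-D array holding one chroma row per array row.

```python
import numpy as np

def chroma_from_interleaved(uv):
    """NV12-style rows (U V U V ...): samples alternate within each row."""
    return uv[:, 0::2], uv[:, 1::2]

def chroma_from_half_rows(uv):
    """Rows as described in this issue (U U ... U V V ... V)."""
    half = uv.shape[1] // 2
    return uv[:, :half], uv[:, half:]
```

Reading a half-row file with the interleaved rule mixes U and V samples into both output planes, which would explain why the image never looks right with the NV12 setting.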

chaplin89 avatar Mar 01 '23 15:03 chaplin89

Ah sorry, then that is not exactly the format that you are looking for. That would have been too easy anyway. All of the info you provided is already super helpful. But can you please somehow share a file in that format with me? It only has to be a few frames. That will already do.

ChristianFeldmann avatar Mar 02 '23 08:03 ChristianFeldmann

Yup, I already shared it here: https://github.com/IENT/YUView/issues/518#issuecomment-1447773144

Pasting the link again here: https://github.com/IENT/YUView/files/10848045/single_frame.zip

chaplin89 avatar Mar 02 '23 08:03 chaplin89

Ah sorry my bad I was blind. Got it!

ChristianFeldmann avatar Mar 02 '23 09:03 ChristianFeldmann

No worries, YW!

chaplin89 avatar Mar 02 '23 09:03 chaplin89

Ok so I looked through all the data and files and it's still a bit strange:

  • Are you sure that this format is used in chrome as PIXEL_FORMAT_I420? Because from the chromium code it looks like this is a "normal" planar format with separate planes for Y Cb Cr.
  • The file looks like it's as you described, with interlaced Cb Cr lines. I have never seen a format like that. But there are some extra bytes that I cannot account for. So if the resolution is 1984x1080, then there is one line of 1984 bytes that is not accounted for.

I have still not found any documentation of a format like this anywhere, even though there are all sorts of strange YUV formats out there.

ChristianFeldmann avatar Mar 02 '23 21:03 ChristianFeldmann

Are you sure that this format is used in chrome as PIXEL_FORMAT_I420? Because from the chromium code it looks like this is a "normal" planar format with separate planes for Y Cb Cr.

Yes, I'm sure, and yes, the Chromium code "apparently" supports normal planar YUV files. However, if you debug the media part, you'll find out that this is just illusory. A YUV frame in Chrome is represented as a contiguous memory area in which:

  • The first plane is starting from offset 0
  • The second plane starts from offset stride*height
  • The third plane starts from offset stride*height + half stride

This means that in order to render the image correctly you still have to take into account that the UV plane is interleaved in this way (half row U, half row V).
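
As a rough sketch of what these offsets imply, the helpers below compute where each plane and each chroma sample would start. The function names are hypothetical, and the Y plane is taken to span stride times the number of rows, as in the 1984x1080 layout given at the start of the thread.

```python
def plane_offsets(stride, height):
    """Byte offsets of the Y, U and V planes inside one contiguous frame."""
    y_offset = 0
    u_offset = stride * height        # right after the Y plane
    v_offset = u_offset + stride // 2  # half a row further in
    return y_offset, u_offset, v_offset

def chroma_sample_offsets(stride, height, row, col):
    """Byte offsets of the U and V samples at chroma position (row, col).

    Both chroma planes advance by a full stride per row, which is why each
    stored row ends up as half U followed by half V.
    """
    _, u0, v0 = plane_offsets(stride, height)
    return u0 + row * stride + col, v0 + row * stride + col
```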

But there are some extra bytes that I can not account for.

Yup, this is where I'm talking about that https://github.com/IENT/YUView/issues/518#issuecomment-1447782387

I thought I removed this extra line, but perhaps I'm wrong. And yeah, I agree it's strange. ffmpeg doesn't even support it, which is why I was saying it probably doesn't add much value to the project.

chaplin89 avatar Mar 02 '23 21:03 chaplin89

Can you point me to the code where this happens in the Chromium media part, please? I checked out the code and to me it looks like the I420 format has 3 separate planes. E.g. here is the code from video_frame.cc:

    case PIXEL_FORMAT_I420: {
      int uv_width = (coded_size.width() + 1) / 2;
      int uv_height = (coded_size.height() + 1) / 2;
      int uv_stride = uv_width;
      int uv_size = uv_stride * uv_height;
      planes = std::vector<ColorPlaneLayout>{
          ColorPlaneLayout(coded_size.width(), 0, coded_size.GetArea()),
          ColorPlaneLayout(uv_stride, coded_size.GetArea(), uv_size),
          ColorPlaneLayout(uv_stride, coded_size.GetArea() + uv_size, uv_size),
      };
      break;
    }

The offset for the V plane here is coded_size.GetArea() + uv_size, which indicates that the 3 planes are completely separate. The PIXEL_FORMAT_NV12 format seems to have 2 planes, with U and V packed together. But there, the UV values are interleaved per sample (UVUVUV) and not per line.
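
For comparison with the half-row layout, the quoted I420 case can be mirrored in Python. This is a sketch only, assuming the three `ColorPlaneLayout` arguments are stride, offset, and size; `i420_plane_layout` is a hypothetical name.

```python
def i420_plane_layout(coded_width, coded_height):
    """Return (stride, offset, size) per plane, mirroring the quoted C++ case."""
    uv_width = (coded_width + 1) // 2
    uv_height = (coded_height + 1) // 2
    uv_stride = uv_width
    uv_size = uv_stride * uv_height
    area = coded_width * coded_height
    return [
        (coded_width, 0, area),                # Y plane
        (uv_stride, area, uv_size),            # U plane
        (uv_stride, area + uv_size, uv_size),  # V plane, after the whole U plane
    ]
```

For a 1984x1080 coded size, the V plane starts a full 992x540 U plane after the Y plane, not half a row after it.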

I keep pressing on this because if we find out the name of this format, then we can also use it. I don't want to invent a new name for this, as there must be one if it is used in Chromium.

I think I found one reference to a format like this in the Microsoft docs: https://learn.microsoft.com/en-us/windows/win32/medfound/recommended-8-bit-yuv-formats-for-video-rendering#imc2 . They call it IMC2.

ChristianFeldmann avatar Mar 03 '23 10:03 ChristianFeldmann

In terms of data arrangement, we can divide the YUV formats into 3 basic types: planar (3 planes, or 4 with alpha), semi-planar (2 planes), and interleaved (also called packed, only 1 plane). The semi-planar type can then be divided into 2 subclasses: uv_interleaved (UVUV...UVUV) and uv_followed (UU...UUVV...VV). I believe this would be enough to distinguish between YUV formats.
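
One way to write this classification down is a small enum. The names below are hypothetical and only meant to illustrate the proposal, not an existing YUView type; the example mappings use well-known formats.

```python
from enum import Enum

class Layout(Enum):
    PLANAR = "planar"                          # separate Y, U, V (and A) planes
    SEMI_PLANAR_UV_INTERLEAVED = "uv_interleaved"  # Y plane + UVUV... plane
    SEMI_PLANAR_UV_FOLLOWED = "uv_followed"        # Y plane + UU...VV... plane
    PACKED = "packed"                          # all components in one plane

# Number of planes per layout (planar without alpha)
PLANE_COUNT = {
    Layout.PLANAR: 3,
    Layout.SEMI_PLANAR_UV_INTERLEAVED: 2,
    Layout.SEMI_PLANAR_UV_FOLLOWED: 2,
    Layout.PACKED: 1,
}

# A few well-known formats and where they would fall
EXAMPLES = {
    "I420": Layout.PLANAR,
    "NV12": Layout.SEMI_PLANAR_UV_INTERLEAVED,
    "YUY2": Layout.PACKED,
}
```

The format discussed in this issue would fall under `SEMI_PLANAR_UV_FOLLOWED` in this scheme.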

FaiScofield avatar Jul 11 '23 15:07 FaiScofield