YUView
"semiplanar" YUV images support
Is your feature request related to a problem? Please describe. Disclaimer: I barely know what I'm saying. I played a lot with the configuration of YUView, and it seems I'm not able to find a configuration that correctly renders "semiplanar" YUV images. I have a set of YUV frames with 4:2:0 chroma subsampling. In these frames, Y is a plane on its own, but U and V share a single plane.
The frames are 1984x1080 (that's 1080p with a stride), and the memory layout is:
- Y, which is a 1984x1080 bytes matrix, followed by
- U/V together in a single 1984x540 bytes matrix. On each row, the first 992 bytes are U and the rest is V.
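A minimal sketch of the layout described above, assuming 8-bit samples and an even number of rows. The function name `split_frame` and its parameters are made up for illustration; they are not part of YUView or Chromium.

```python
import numpy as np

def split_frame(buf: bytes, stride: int, rows: int):
    """Split one raw frame into Y, U and V arrays (stride padding included)."""
    y_size = stride * rows            # Y plane: `rows` lines of `stride` bytes
    uv_rows = rows // 2               # 4:2:0 -> half as many chroma lines
    uv_size = stride * uv_rows        # U and V share lines of `stride` bytes
    if len(buf) < y_size + uv_size:
        raise ValueError("buffer is shorter than one frame")
    data = np.frombuffer(buf, dtype=np.uint8, count=y_size + uv_size)
    y = data[:y_size].reshape(rows, stride)
    uv = data[y_size:].reshape(uv_rows, stride)
    half = stride // 2
    # First half of every chroma line is U, second half is V.
    return y, uv[:, :half], uv[:, half:]
```

For the 1984x1080 frames above, this gives a 1984x1080 Y array and two 992x540 chroma arrays, for 1984*1080 + 1984*540 = 3,214,080 bytes per frame.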
Is there a way to render these frames in the current implementation of YUView or is it something that can be added?
Thanks!
Describe the solution you'd like Render a frame described above.
Describe alternatives you've considered n.a.
Hi! That sounds like a YUV format that I have not encountered yet. But there is a wild number of specialized YUV formats out there. Do you have any specification for this YUV format? If not, can you share where it came from? It would be great if you could provide a file in this format for me to test an implementation with, or a way for me to create one.
Hi Christian, I didn't find any spec about this, I just found some reference here but I'm not even sure it's the same thing.
This format is used internally by Chromium. Sample.
The format that chromium assigns is this:
PIXEL_FORMAT_I420, 12bpp YUV planar 1x1 Y, 2x2 UV samples, a.k.a. YU12.
Not sure this is correct or makes sense, though. The following Python script displays the image correctly. Sorry if it's not polished; I bet there are tons of better ways to do this with numpy, but it's the first time I'm using it:
from PIL import Image
import numpy

def getyuv():
    # Read the Y plane (1080 rows) and the combined U/V plane (540 rows);
    # every row is 1984 bytes wide.
    y = []
    uv = []
    with open('single_frame.yuv', 'rb') as f:
        for i in range(0, 1080):
            row = list(f.read(1984))
            y.append(row)
        for i in range(0, 540):
            row = list(f.read(1984))
            uv.append(row)
    return y, uv

def convert_input(y, uv):
    output = numpy.full((1080, 1984, 3), (0, 0, 0), dtype=numpy.uint8)
    for row in range(0, len(uv)):
        # First half of each U/V row is U, second half (offset 992) is V.
        # Each chroma sample covers a 2x2 block of luma samples.
        for column in range(0, len(uv[row]) // 2):
            output[row*2][column*2] = (y[row*2][column*2], uv[row][column], uv[row][column+992])
            output[row*2][column*2+1] = (y[row*2][column*2+1], uv[row][column], uv[row][column+992])
            output[row*2+1][column*2] = (y[row*2+1][column*2], uv[row][column], uv[row][column+992])
            output[row*2+1][column*2+1] = (y[row*2+1][column*2+1], uv[row][column], uv[row][column+992])
    return output

y, uv = getyuv()
out = convert_input(y, uv)
# Trimming the last 64 px on each row (garbage)
out_2 = numpy.full((1080, 1920, 3), (0, 0, 0), dtype=numpy.uint8)
for row in range(0, len(out)):
    out_2[row] = out[row][:-64]
img = Image.fromarray(out_2, mode='YCbCr')
img.show()
Oddly enough, the memory dump of this image contains 541 rows in the UV matrix, which means it can't be opened directly in YUView in any case. The frame I shared in the previous comment does not contain these extra bytes. Maybe a feature to discard XX bytes from the beginning or the end of each frame (and maybe also from the beginning of a file) would be useful when dealing with this kind of raw data.
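The "discard XX bytes" idea could be sketched roughly as follows. This is not an existing YUView feature; `iter_frames` and its `skip_head`, `skip_tail` and `file_offset` parameters are hypothetical names used only to illustrate the suggestion.

```python
def iter_frames(path, frame_size, skip_head=0, skip_tail=0, file_offset=0):
    """Yield the payload of each frame, dropping padding bytes around it."""
    with open(path, "rb") as f:
        f.seek(file_offset)                     # bytes to ignore at file start
        while True:
            chunk = f.read(skip_head + frame_size + skip_tail)
            if len(chunk) < skip_head + frame_size:
                return                          # no complete frame left
            yield chunk[skip_head:skip_head + frame_size]
```

With the dump mentioned above, the extra 541st chroma row would correspond to `skip_tail=1984`, assuming that row really sits at the end of every frame.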
UPDATE: I tried many different pix_fmt values in ffmpeg and none of them is able to decode the image either. It seems this is not a common format; maybe only Chromium uses it internally and it isn't meant to be shared or stored on disk. After all, I was just trying to fix an issue in Chromium, so that could be.
In any case, I built my own tooling for this. Probably we can close the issue as I don't think implementing this will bring any value to the project.
Here's a better script in the unlikely case someone else should run into the same issue:
from PIL import Image
import numpy
import os

class Convert:
    def __init__(self, column, row, stride, filename) -> None:
        self.column = column      # visible width
        self.row = row            # visible height
        self.stride = stride      # real row width in bytes
        self.filename = filename
        self.fpos = 0
        # Start from a clean output file, since save() appends.
        if os.path.exists(self.get_destination()):
            os.unlink(self.get_destination())

    def get_single_frame(self):
        y = []
        uv = []
        with open(self.filename, 'rb') as f:
            if f.seek(self.fpos) == -1:
                return None, None
            y = list(f.read(self.stride*self.row))
            if len(y) == 0:
                return None, None
            y = numpy.array(y)
            y = y.reshape(self.row, self.stride)
            half_row = int(self.row/2)
            uv = list(f.read(self.stride*half_row))
            if len(uv) == 0:
                return None, None
            uv = numpy.array(uv)
            uv = uv.reshape(half_row, self.stride)
            # Skip the extra line after each frame (contains garbage)
            self.fpos = f.tell() + self.stride
        return y, uv

    def trim(self, frame):
        output = numpy.full((self.row, self.column, 3), (0, 0, 0), dtype=numpy.uint8)
        for row in range(0, self.row):
            # Keep only the visible columns (also works when stride == column)
            trimmed_row = frame[row][:self.column]
            output[row] = trimmed_row
        return output

    def show(self, frame):
        image = Image.fromarray(frame, mode='YCbCr')
        image.show()

    def merge_yuv(self, y, uv):
        output = numpy.full((self.row, self.stride, 3), (0, 0, 0), dtype=numpy.uint8)
        half_stride = int(self.stride/2)
        uv_rows = len(uv)
        for row in range(0, uv_rows):
            uv_columns = len(uv[row])
            half_uv_columns = int(uv_columns/2)
            # U is in the first half of the row, V in the second half.
            for column in range(0, half_uv_columns):
                output[row*2][column*2] = (y[row*2][column*2], uv[row][column], uv[row][column+half_stride])
                output[row*2][column*2+1] = (y[row*2][column*2+1], uv[row][column], uv[row][column+half_stride])
                output[row*2+1][column*2] = (y[row*2+1][column*2], uv[row][column], uv[row][column+half_stride])
                output[row*2+1][column*2+1] = (y[row*2+1][column*2+1], uv[row][column], uv[row][column+half_stride])
        return output

    def get_destination(self):
        return self.filename[:self.filename.rfind('.')] + ".converted.yuv"

    def save(self, y, uv):
        # Append the frame as planar Y, U, V (stride-wide, padding included).
        with open(self.get_destination(), 'ab') as f:
            f.write(bytes(y.reshape(y.shape[0]*y.shape[1]).tolist()))
            u = uv[:, 0:int(uv.shape[1]/2)]
            v = uv[:, int(uv.shape[1]/2):]
            f.write(bytes(u.reshape(u.shape[0]*u.shape[1]).tolist()))
            f.write(bytes(v.reshape(v.shape[0]*v.shape[1]).tolist()))

target1 = 'your_file.yuv'
# change me according to target1 file spec
a = Convert(848, 480, 960, target1)
i = 0
y, uv = a.get_single_frame()
while y is not None:
    a.save(y, uv)
    # Show an image every 20 frames for debug purposes
    if (i+1) % 20 == 0:
        frame = a.merge_yuv(y, uv)
        frame = a.trim(frame)
        a.show(frame)
    y, uv = a.get_single_frame()
    i = i+1
This will take your_file.yuv in the format described above and generate your_file.converted.yuv, which can be encoded into H.264 with a command like this:
ffmpeg -f rawvideo -pixel_format yuv420p -video_size 960x480 -framerate 24 -i ./your_file.converted.yuv ./your_file.mp4
The only things that need to be changed according to the video spec are the parameters of the Convert constructor:
- Cols, Rows -> The size of the visible portion of the frame
- Stride -> The real number of columns in the frame (can be greater than or equal to Cols)
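For comparison, here is a vectorized sketch of the same conversion that also crops away the stride padding, so the output is plain yuv420p at the visible size. It is an alternative to the script above, not a drop-in replacement: `to_i420` is a hypothetical helper, and it assumes 8-bit samples and even values for `cols` and `rows`.

```python
import numpy as np

def to_i420(frame: bytes, cols: int, rows: int, stride: int) -> bytes:
    """Convert one half-row U/V frame to packed I420, cropped to cols x rows."""
    y_size = stride * rows
    uv_rows = rows // 2
    half = stride // 2
    data = np.frombuffer(frame, dtype=np.uint8, count=y_size + stride * uv_rows)
    y = data[:y_size].reshape(rows, stride)[:, :cols]   # drop padding columns
    uv = data[y_size:].reshape(uv_rows, stride)
    u = uv[:, :cols // 2]                # U: start of each chroma line
    v = uv[:, half:half + cols // 2]     # V: starts at the half-stride mark
    # tobytes() copies each slice in row-major order, so planes come out packed.
    return y.tobytes() + u.tobytes() + v.tobytes()
```

Because the padding is removed here, the `-video_size` passed to ffmpeg would be the visible size (848x480 in the example) rather than the stride-wide 960x480.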
Cheers!
Ah wait a second! I think we do have support for these semi-planar files. At least for some of them. In the link you provided I saw the name NV12 and that rang a bell. We do support that. So you can open the YUV file and go to YUV Format ... custom. In the dialog you have to select the UV(A) interleaved checkbox. Can you try that? It may be what you are looking for.
Alternatively you can put nv12 into the name of the file and YUView should apply the format based on that.
I think at this point I may have tried every combination of custom and non-custom decoding options, but it never displays the image correctly. I think the option you're suggesting expects a sequence of alternating Cb and Cr values on each row of the UV matrix. The format I'm talking about has the 1st half of the row for Cb values and the 2nd half of the row for Cr values.
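To make the difference concrete, here is a small sketch of the two ways one chroma row can be split. Both function names are invented for illustration.

```python
def split_interleaved(row: bytes):
    """NV12-style row: U V U V ... -> (U samples, V samples)."""
    return row[0::2], row[1::2]

def split_half_rows(row: bytes):
    """Row layout from this issue: U U ... V V ... -> (U samples, V samples)."""
    half = len(row) // 2
    return row[:half], row[half:]

row = bytes([1, 2, 3, 4, 5, 6, 7, 8])
print(split_interleaved(row))  # U = 1,3,5,7 and V = 2,4,6,8
print(split_half_rows(row))    # U = 1,2,3,4 and V = 5,6,7,8
```

Decoding a half-row file with the interleaved rule mixes U and V samples into both outputs, which would explain why the picture never looks right with the NV12 setting.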
Ah sorry, then that is not exactly the format that you are looking for. That would have been too easy anyway. All of the info you provided is already super helpful. But can you please somehow share a file in that format with me? It only has to be a few frames. That will already do.
Yup, I already shared it here: https://github.com/IENT/YUView/issues/518#issuecomment-1447773144
Pasting the link again here: https://github.com/IENT/YUView/files/10848045/single_frame.zip
Ah sorry my bad I was blind. Got it!
No worries, YW!
Ok so I looked through all the data and files and it's still a bit strange:
- Are you sure that this format is used in chrome as PIXEL_FORMAT_I420? Because from the chromium code it looks like this is a "normal" planar format with separate planes for Y Cb Cr.
- The file looks like it's laid out as you described, with interleaved Cb/Cr lines. I have never seen a format like that. But there are some extra bytes that I cannot account for: if the resolution is 1984x1080, then there is one line of 1984 bytes left over.
I have still not found any documentation of a format like this anywhere. I mean, there are all sorts of strange YUV formats out there.
Are you sure that this format is used in chrome as PIXEL_FORMAT_I420? Because from the chromium code it looks like this is a "normal" planar format with separate planes for Y Cb Cr.
Yes, I'm sure, and yes, the Chromium code "apparently" supports normal planar YUV files. However, if you debug the media part, you'll find out that this is just illusory. A YUV frame in Chrome is represented as a contiguous memory area in which:
- The first plane is starting from offset 0
- The second plane starts from offset stride*height
- The third plane starts from offset stride*height + half stride
This means that in order to render the image correctly you still have to take into account that the UV plane is interleaved in this way (half row U, half row V).
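The offsets can be sketched as address arithmetic. This assumes the Y plane occupies stride × rows bytes, as in the 1984x1080 layout described at the top of the thread, and that all three planes are walked with the same row stride. `plane_offsets` and `sample_at` are illustrative names, not Chromium functions.

```python
def plane_offsets(stride: int, rows: int):
    """Byte offsets of the Y, U and V planes inside one contiguous frame."""
    y_off = 0
    u_off = stride * rows          # right after the Y plane
    v_off = u_off + stride // 2    # half a row into the first chroma line
    return y_off, u_off, v_off

def sample_at(buf: bytes, stride: int, rows: int, x: int, y: int):
    """Return (Y, U, V) for luma pixel (x, y), with 4:2:0 subsampling."""
    y_off, u_off, v_off = plane_offsets(stride, rows)
    cx, cy = x // 2, y // 2        # one chroma sample per 2x2 luma block
    return (buf[y_off + y * stride + x],
            buf[u_off + cy * stride + cx],   # chroma rows use the full stride
            buf[v_off + cy * stride + cx])
```

Seen this way, U and V are ordinary planes that are each half a row wide but are stepped through with the full row stride, which is why they end up side by side on every line.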
But there are some extra bytes that I can not account for.
Yup, this is what I was talking about in https://github.com/IENT/YUView/issues/518#issuecomment-1447782387
I thought I had removed this extra line, but perhaps I'm wrong. And yeah, I agree it's strange. Even ffmpeg doesn't support it, which is why I was saying it probably doesn't add much value to the project.
Can you refer to the code where this happens in the chromium media part please? I checked out the code and to me it looks like the I420 format has 3 separate planes. E.g. here is the code from video_framce.cc:
case PIXEL_FORMAT_I420: {
  int uv_width = (coded_size.width() + 1) / 2;
  int uv_height = (coded_size.height() + 1) / 2;
  int uv_stride = uv_width;
  int uv_size = uv_stride * uv_height;
  planes = std::vector<ColorPlaneLayout>{
      ColorPlaneLayout(coded_size.width(), 0, coded_size.GetArea()),
      ColorPlaneLayout(uv_stride, coded_size.GetArea(), uv_size),
      ColorPlaneLayout(uv_stride, coded_size.GetArea() + uv_size, uv_size),
  };
  break;
}
The offset for the V plane here is coded_size.GetArea() + uv_size, a full uv_size past the start of the U plane, which indicates that the 3 planes are completely separate.
The PIXEL_FORMAT_NV12 format seems to have 2 planes where U and V are packed together. But there, the UV values are interleaved per sample (UVUVUV) and not per line.
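The same arithmetic can be transcribed to Python to see where the planes land for a concrete size. This is only a restatement of the snippet quoted above, with `i420_plane_layout` as a made-up name.

```python
def i420_plane_layout(width: int, height: int):
    """Return (stride, offset, size) for the Y, U and V planes of packed I420."""
    uv_width = (width + 1) // 2
    uv_height = (height + 1) // 2
    uv_size = uv_width * uv_height     # uv_stride equals uv_width here
    area = width * height
    return [
        (width, 0, area),                   # Y
        (uv_width, area, uv_size),          # U, directly after Y
        (uv_width, area + uv_size, uv_size) # V, directly after the whole U plane
    ]

for plane in i420_plane_layout(1984, 1080):
    print(plane)
```

For 1984x1080 this puts the V plane at byte 2,678,400, whereas in the half-row layout reported in this thread the first V sample would sit at 1984*1080 + 992 = 2,143,712.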
I am riding on this so much because if we find out the name of this format, then we can also use it. I don't want to invent a new name for this as there must be one if it is used in chromium.
I think I found one reference to a format like this in the Microsoft docs: https://learn.microsoft.com/en-us/windows/win32/medfound/recommended-8-bit-yuv-formats-for-video-rendering#imc2 . They call it IMC2.
From the point of view of data arrangement, we can divide YUV formats into 3 basic types: planar (3 planes, or 4 with alpha), semi-planar (2 planes), and interleaved (also called packed, only 1 plane). Semi-planar can then be divided into 2 subclasses: uv_interleaved (UVUV...UVUV) and uv_followed (UU...UUVV...VV). I believe this would work for distinguishing between YUV formats.