esp-iot-solution

No YUYV support (AEGHB-323)

Open sxrap1 opened this issue 1 year ago • 8 comments

My camera outputs YUYV with 512-byte transfers. Yes, it will be slow, but that's fine. However, the new stream component and examples all force MJPEG, which is not an option for this camera. I'm happy to code in the YUYV support, but I really have no idea where to start, so I'm just looking for some pointers on bringing this functionality in.

Thanks

sxrap1, Jul 29 '23 05:07

@sxrap1 Yes, usb_stream supports only MJPEG because of the limited throughput of USB full-speed. Adding support for YUV is possible, but it will be quite slow.

YUV format possible frame rate:

  • For a YUV frame size of 320*240, one frame is 320*240*2 = 153,600 bytes.
  • For ESP32S3 isochronous transfer (MPS = 512), the throughput is 512*1000 = 512,000 bytes/s.
  • For ESP32S3 bulk transfer, the throughput is 64*19*1000 = 1,216,000 bytes/s.
  • The achievable frame rate is therefore between 3 and 8 frames/s (8 > frame rate >= 3): about 3.3 frames/s over isochronous and about 7.9 frames/s over bulk.
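The arithmetic in the list above can be checked with a short host-side sketch. It assumes 2 bytes per pixel (YUV 4:2:2) and 1000 bus frames per second, as the figures in the post imply; the function names are made up for illustration and are not part of usb_stream.

```python
# Assumption from the post: one USB full-speed frame every 1 ms.
BUS_FRAMES_PER_SECOND = 1000


def yuv422_frame_bytes(width: int, height: int) -> int:
    """Size of one YUV 4:2:2 frame (2 bytes per pixel)."""
    return width * height * 2


def max_frame_rate(bytes_per_bus_frame: int, frame_bytes: int) -> float:
    """Upper bound on frames/s, ignoring payload headers and retries."""
    throughput = bytes_per_bus_frame * BUS_FRAMES_PER_SECOND
    return throughput / frame_bytes


frame = yuv422_frame_bytes(320, 240)
iso_fps = max_frame_rate(512, frame)        # isochronous, MPS = 512
bulk_fps = max_frame_rate(64 * 19, frame)   # bulk, 19 packets of 64 bytes

print(f"frame size: {frame} bytes")
print(f"isochronous: {iso_fps:.2f} frames/s, bulk: {bulk_fps:.2f} frames/s")
```

These are upper bounds only; payload headers and dropped packets push the real rate lower.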

How to support YUV:

In _update_config_from_descriptor, add a case handler for VIDEO_CS_ITF_VS_FORMAT_UNCOMPRESSED to read the detailed information for the YUV format, in the same way as for the MJPEG format. For an uncompressed format in particular, you need to check guidFormat to determine which format the camera supports, e.g. YUY2 or NV12. For details, refer to USB_Video_Payload_Uncompressed_1.5.pdf.
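As a rough illustration of what such a handler has to read, here is a host-side Python sketch that unpacks a 27-byte Format Uncompressed descriptor and extracts guidFormat. It is not the usb_stream implementation; the function name and the returned dictionary are hypothetical, and the field layout is the one visible in the descriptor dump later in this thread (bLength 27, subtype 4).

```python
import struct
import uuid

CS_INTERFACE = 0x24            # class-specific interface descriptor type
VS_FORMAT_UNCOMPRESSED = 0x04  # subtype "Format Uncompressed (4)"

# bLength, bDescriptorType, bDescriptorSubtype, bFormatIndex,
# bNumFrameDescriptors, guidFormat[16], bBitsPerPixel, bDefaultFrameIndex,
# bAspectRatioX, bAspectRatioY, bmInterlaceFlags, bCopyProtect
_LAYOUT = struct.Struct("<BBBBB16sBBBBBB")  # 27 bytes in total


def parse_format_uncompressed(desc: bytes) -> dict:
    """Hypothetical helper: decode one uncompressed format descriptor."""
    if len(desc) < _LAYOUT.size:
        raise ValueError("descriptor too short")
    (_length, d_type, subtype, fmt_index, num_frames, guid,
     bpp, default_frame, _ax, _ay, _interlace, _copy) = _LAYOUT.unpack_from(desc)
    if d_type != CS_INTERFACE or subtype != VS_FORMAT_UNCOMPRESSED:
        raise ValueError("not a Format Uncompressed descriptor")
    return {
        "format_index": fmt_index,
        "num_frame_descriptors": num_frames,
        "guid": str(uuid.UUID(bytes_le=guid)),
        # The first four GUID bytes carry the FourCC, e.g. YUY2 or NV12.
        "fourcc": guid[:4].decode("ascii", errors="replace"),
        "bits_per_pixel": bpp,
        "default_frame_index": default_frame,
    }


# Descriptor rebuilt from the values the camera in this thread reports.
yuy2_guid = uuid.UUID("32595559-0000-0010-8000-00aa00389b71").bytes_le
example = bytes([27, 0x24, 0x04, 1, 2]) + yuy2_guid + bytes([16, 1, 0, 0, 0, 0])
info = parse_format_uncompressed(example)
print(info["fourcc"], info["bits_per_pixel"], info["guid"])
```

In the C component the same check would compare the 16 GUID bytes against the known YUY2 and NV12 values before accepting the format.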

For the rest of the payload handling, the processing should be the same for both YUV and MJPEG.

leeebo, Jul 30 '23 02:07

Thank you, I have got the code running now to the point where it starts streaming. However, no matter what settings I try from the format and frame descriptors, almost all packet headers have the error flag set, and only about 1 in 30 comes through with 500 bytes and the EOF flag set. It is slowly driving me crazy, but I will keep at it.

sxrap1, Aug 03 '23 21:08

@sxrap1 The error flag is set in the headers because the bandwidth is too low to transmit the frame rate you set. This can cause the image to break, so the camera sets the error bit.
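For anyone debugging the same symptom, the flags live in the second byte of each UVC payload header (bmHeaderInfo). The sketch below decodes that byte on a host machine. The 12-byte header with PTS and SCR fields is an assumption chosen so that a 512-byte packet carries 500 bytes of image data, matching the numbers reported above; the function name is made up for illustration.

```python
# Bit positions of bmHeaderInfo in the UVC payload header.
UVC_HEADER_BITS = {
    "FID": 0x01,  # frame ID, toggles on each new frame
    "EOF": 0x02,  # end of frame
    "PTS": 0x04,  # presentation time stamp present
    "SCR": 0x08,  # source clock reference present
    "RES": 0x10,  # reserved
    "STI": 0x20,  # still image
    "ERR": 0x40,  # error in this payload
    "EOH": 0x80,  # end of header
}


def decode_payload_header(payload: bytes):
    """Return (header length, set of flag names, image bytes in the packet)."""
    if len(payload) < 2:
        raise ValueError("payload shorter than the minimal 2-byte header")
    header_len = payload[0]
    if header_len < 2 or header_len > len(payload):
        raise ValueError("invalid bHeaderLength")
    info = payload[1]
    flags = {name for name, mask in UVC_HEADER_BITS.items() if info & mask}
    return header_len, flags, len(payload) - header_len


# Assumed example: 12-byte header (EOH | SCR | PTS | EOF) plus 500 data bytes.
good = bytes([12, 0x8E]) + bytes(10) + bytes(500)
# Same header layout, but with ERR set instead of EOF and no image data.
bad = bytes([12, 0xCC]) + bytes(10)

print(decode_payload_header(good))
print(decode_payload_header(bad))
```

Logging the ratio of ERR to non-ERR payloads while changing the negotiated frame interval is one way to see whether bandwidth is the cause.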

leeebo, Aug 04 '23 00:08

Yes, I'm starting to think the camera may be too fast for the little ESP32-S3.

I've tried pretty much everything, but from what I can see in the descriptor below, a frame interval of 400000 (25 FPS) is the slowest the camera can be set to send.
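To put numbers on this, the following sketch converts the dwFrameInterval reported by the camera (400000, in 100 ns units) into a frame rate and compares the resulting data rate for the 256 x 192 YUY2 frame with the 512,000 bytes/s isochronous figure quoted earlier in the thread. The helper names are hypothetical.

```python
# dwFrameInterval is expressed in units of 100 ns.
INTERVAL_UNITS_PER_SECOND = 10_000_000


def fps_from_interval(dw_frame_interval: int) -> float:
    return INTERVAL_UNITS_PER_SECOND / dw_frame_interval


def required_bytes_per_second(width: int, height: int,
                              bits_per_pixel: int, fps: float) -> float:
    frame_bytes = width * height * bits_per_pixel // 8
    return frame_bytes * fps


fps = fps_from_interval(400000)
needed = required_bytes_per_second(256, 192, 16, fps)
available = 512 * 1000  # isochronous figure from earlier in the thread

print(f"frame rate: {fps:.0f} frames/s")
print(f"needed: {needed:.0f} bytes/s ({needed * 8:.0f} bit/s)")
print(f"available: {available} bytes/s, shortfall factor {needed / available:.1f}")
```

The computed bit rate equals the dwMinBitRate the camera reports for this frame size (19660800), which is several times what the host link can carry.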

Frame 37: 386 bytes on wire (3088 bits), 386 bytes captured (3088 bits) on interface \.\USBPcap2, id 0
    Section number: 1
    Interface id: 0 (\.\USBPcap2)
        Interface name: \.\USBPcap2
        Interface description: USBPcap2
    Encapsulation type: USB packets with USBPcap header (152)
    Arrival Time: Aug 4, 2023 09:13:37.404325000 E. Australia Standard Time
    [Time shift for this packet: 0.000000000 seconds]
    Epoch Time: 1691104417.404325000 seconds
    [Time delta from previous captured frame: 0.106572000 seconds]
    [Time delta from previous displayed frame: 0.106572000 seconds]
    [Time since reference or first frame: 6.188138000 seconds]
    Frame Number: 37
    Frame Length: 386 bytes (3088 bits)
    Capture Length: 386 bytes (3088 bits)
    [Frame is marked: False]
    [Frame is ignored: False]
    [Protocols in frame: usb:usbvideo:usbvideo:usbvideo:usbvideo:usbvideo:usbvideo:usbvideo:usbvideo:usbvideo:usbvideo:usbvideo:usbvideo]
USB URB
    [Source: 2.19.0]
    [Destination: host]
    USBPcap pseudoheader length: 28
    IRP ID: 0xffffc48c52b4d010
    IRP USBD_STATUS: USBD_STATUS_SUCCESS (0x00000000)
    URB Function: URB_FUNCTION_CONTROL_TRANSFER (0x0008)
    IRP information: 0x01, Direction: PDO -> FDO
        0000 000. = Reserved: 0x00
        .... ...1 = Direction: PDO -> FDO (0x1)
    URB bus id: 2
    Device address: 19
    Endpoint: 0x80, Direction: IN
        1... .... = Direction: IN (1)
        .... 0000 = Endpoint number: 0
    URB transfer type: URB_CONTROL (0x02)
    Packet Data Length: 358
    [Request in: 36]
    [Time from request: 0.106572000 seconds]
    Control transfer stage: Complete (3)
CONFIGURATION DESCRIPTOR
    bLength: 9
    bDescriptorType: 0x02 (CONFIGURATION)
    wTotalLength: 358
    bNumInterfaces: 2
    bConfigurationValue: 1
    iConfiguration: 4
    Configuration bmAttributes: 0x80 NOT SELF-POWERED NO REMOTE-WAKEUP
        1... .... = Must be 1: Must be 1 for USB 1.1 and higher
        .0.. .... = Self-Powered: This device is powered from the USB bus
        ..0. .... = Remote Wakeup: This device does NOT support remote wakeup
    bMaxPower: 50 (100mA)
INTERFACE ASSOCIATION DESCRIPTOR
    bLength: 8
    bDescriptorType: 0x0b (INTERFACE ASSOCIATION)
    bFirstInterface: 0
    bInterfaceCount: 2
    bFunctionClass: Video (0x0e)
    bFunctionSubClass: 0x03
    bFunctionProtocol: 0x00
    iFunction: 5
INTERFACE DESCRIPTOR (0.0): class Video
    bLength: 9
    bDescriptorType: 0x04 (INTERFACE)
    bInterfaceNumber: 0
    bAlternateSetting: 0
    bNumEndpoints: 1
    bInterfaceClass: Video (0x0e)
    bInterfaceSubClass: 0x01
    bInterfaceProtocol: 0x00
    iInterface: 5
VIDEO CONTROL INTERFACE DESCRIPTOR [Header]
    bLength: 13
    bDescriptorType: 0x24 (video class interface)
    Subtype: Header (1)
    bcdUVC: 0x0100
    wTotalLength: 78
    dwClockFrequency: 15000000
    bInCollection: 1
    baInterfaceNr: 01
VIDEO CONTROL INTERFACE DESCRIPTOR [Input Terminal] (Entity 1)
    bLength: 18
    bDescriptorType: 0x24 (video class interface)
    Subtype: Input Terminal (2)
    bTerminalID: 1
    wTerminalType: Camera Input (0x0201)
    bAssocTerminal: 0
    iTerminal: 0
    wObjectiveFocalLengthMin: 0
    wObjectiveFocalLengthMax: 0
    wOcularFocalLength: 0
    bControlSize: 3
    bmControl: 0x0000000e, Auto Exposure Mode, Auto Exposure Priority, Exposure Time (Absolute)
        .... .... .... .... .... .0 = Scanning Mode: No
        .... .... .... .... .... 1. = Auto Exposure Mode: Yes
        .... .... .... .... ...1 .. = Auto Exposure Priority: Yes
        .... .... .... .... ..1. .. = Exposure Time (Absolute): Yes
        .... .... .... .... .0.. .. = Exposure Time (Relative): No
        .... .... .... .... 0... .. = Focus (Absolute): No
        .... .... .... ...0 .... .. = Focus (Relative): No
        .... .... .... ..0. .... .. = Iris (Absolute): No
        .... .... .... .0.. .... .. = Iris (Relative): No
        .... .... .... 0... .... .. = Zoom (Absolute): No
        .... .... ...0 .... .... .. = Zoom (Relative): No
        .... .... ..0. .... .... .. = PanTilt (Absolute): No
        .... .... .0.. .... .... .. = PanTilt (Relative): No
        .... .... 0... .... .... .. = Roll (Absolute): No
        .... ...0 .... .... .... .. = Roll (Relative): No
        .... ..0. .... .... .... .. = D15: No
        .... .0.. .... .... .... .. = D16: No
        .... 0... .... .... .... .. = Auto Focus: No
        ...0 .... .... .... .... .. = Privacy: No
        ..0. .... .... .... .... .. = Focus (Simple): No
        .0.. .... .... .... .... .. = Window: No
        0... .... .... .... .... .. = Region of Interest: No
VIDEO CONTROL INTERFACE DESCRIPTOR [Processing Unit] (Entity 2)
    bLength: 11
    bDescriptorType: 0x24 (video class interface)
    Subtype: Processing Unit (5)
    bUnitID: 2
    bSourceID: 1
    wMaxMultiplier: 0
    bControlSize: 2
    bmControl: 0x0000157f, Brightness, Contrast, Hue, Saturation, Sharpness, Gamma, White Balance Temperature, Backlight Compensation, Power Line Frequency, White Balance Temperature, Auto
        .... .... .... .... .... ...1 = Brightness: Yes
        .... .... .... .... .... ..1. = Contrast: Yes
        .... .... .... .... .... .1.. = Hue: Yes
        .... .... .... .... .... 1... = Saturation: Yes
        .... .... .... .... ...1 .... = Sharpness: Yes
        .... .... .... .... ..1. .... = Gamma: Yes
        .... .... .... .... .1.. .... = White Balance Temperature: Yes
        .... .... .... .... 0... .... = White Balance Component: No
        .... .... .... ...1 .... .... = Backlight Compensation: Yes
        .... .... .... ..0. .... .... = Gain: No
        .... .... .... .1.. .... .... = Power Line Frequency: Yes
        .... .... .... 0... .... .... = Hue, Auto: No
        .... .... ...1 .... .... .... = White Balance Temperature, Auto: Yes
        .... .... ..0. .... .... .... = White Balance Component, Auto: No
        .... .... .0.. .... .... .... = Digital Multiplier: No
        .... .... 0... .... .... .... = Digital Multiplier Limit: No
    iProcessing: 0
VIDEO CONTROL INTERFACE DESCRIPTOR [Output Terminal] (Entity 3)
    bLength: 9
    bDescriptorType: 0x24 (video class interface)
    Subtype: Output Terminal (3)
    bTerminalID: 3
    wTerminalType: Streaming (0x0101)
    bAssocTerminal: 0
    bSourceID: 4
    iTerminal: 0
VIDEO CONTROL INTERFACE DESCRIPTOR [Extension Unit] (Entity 4)
    bLength: 27
    bDescriptorType: 0x24 (video class interface)
    Subtype: Extension Unit (6)
    bUnitID: 4
    guid: 1229a78c-47b4-4094-b0ce-db07386fb938
    bNumControls: 2
    bNrInPins: 1
    baSourceID: 02
    bControlSize: 2
    bmControl: 0x00000600
    iExtension: 0
ENDPOINT DESCRIPTOR
    bLength: 7
    bDescriptorType: 0x05 (ENDPOINT)
    bEndpointAddress: 0x83 IN Endpoint:3
        1... .... = Direction: IN Endpoint
        .... 0011 = Endpoint Number: 0x3
    bmAttributes: 0x03
        .... ..11 = Transfertype: Interrupt-Transfer (0x3)
    wMaxPacketSize: 16
        ...0 0... .... .... = Transactions per microframe: 1 (0)
        .... ..00 0001 0000 = Maximum Packet Size: 16
    bInterval: 6
VIDEO CONTROL ENDPOINT DESCRIPTOR [Interrupt]
    bLength: 5
    bDescriptorType: 0x25 (video class endpoint)
    Subtype: Interrupt (3)
    wMaxTransferSize: 16
INTERFACE DESCRIPTOR (1.0): class Video
    bLength: 9
    bDescriptorType: 0x04 (INTERFACE)
    bInterfaceNumber: 1
    bAlternateSetting: 0
    bNumEndpoints: 0
    bInterfaceClass: Video (0x0e)
    bInterfaceSubClass: 0x02
    bInterfaceProtocol: 0x00
    iInterface: 0
VIDEO STREAMING INTERFACE DESCRIPTOR [Input Header]
    bLength: 14
    bDescriptorType: 0x24 (video class interface)
    Subtype: Input Header (1)
    bNumFormats: 1
    wTotalLength: 121
    bEndpointAddress: 0x81 IN Endpoint:1
        1... .... = Direction: IN Endpoint
        .... 0001 = Endpoint Number: 0x1
    bmInfo: 0x00
        .... ...0 = Dynamic Format Change: No
    bTerminalLink: 3
    bStillCaptureMethod: Suspended streaming (2)
    HW Triggering: Supported
    bTriggerUsage: Initiate still image capture (0)
    bControlSize: 1
    bmControl: 0x00000000
        .... .0 = wKeyFrameRate: No
        .... 0. = wPFrameRate: No
        ...0 .. = wCompQuality: No
        ..0. .. = wCompWindowSize: No
        .0.. .. = Generate Key Frame: No
        0... .. = Update Frame Segment: No
VIDEO STREAMING INTERFACE DESCRIPTOR [Format Uncompressed] (Format 1): YUY2
    bLength: 27
    bDescriptorType: 0x24 (video class interface)
    Subtype: Format Uncompressed (4)
    bFormatIndex: 1
    bNumFrameDescriptors: 2
    guidFormat: 32595559-0000-0010-8000-00aa00389b71
    bBitsPerPixel: 16
    bDefaultFrameIndex: 1
    bAspectRatioX: 0
    bAspectRatioY: 0
    bmInterlaceFlags: 0x00, Field pattern: Field 1 only
        .... ...0 = Interlaced stream: Non-interlaced
        .... ..0. = Fields per frame: 2 fields
        .... .0.. = Field 1 first: No
        ..00 .... = Field pattern: Field 1 only (0)
    bCopyProtect: No restrictions (0)
VIDEO STREAMING INTERFACE DESCRIPTOR [Frame Uncompressed] (Index 1): 256 x 192
    bLength: 30
    bDescriptorType: 0x24 (video class interface)
    Subtype: Frame Uncompressed (5)
    bFrameIndex: 1
    bmCapabilities: 0x00
        .... ...0 = Still image: Not supported
        .... ..0. = Fixed frame rate: No
    wWidth: 256
    wHeight: 192
    dwMinBitRate: 19660800
    dwMaxBitRate: 19660800
    dwMaxVideoFrameBufferSize: 98304
    dwDefaultFrameInterval: 400000
    bFrameIntervalType: Discrete (1 choice)
    dwFrameInterval: 400000
VIDEO STREAMING INTERFACE DESCRIPTOR [Frame Uncompressed] (Index 2): 256 x 384
    bLength: 30
    bDescriptorType: 0x24 (video class interface)
    Subtype: Frame Uncompressed (5)
    bFrameIndex: 2
    bmCapabilities: 0x00
        .... ...0 = Still image: Not supported
        .... ..0. = Fixed frame rate: No
    wWidth: 256
    wHeight: 384
    dwMinBitRate: 39321600
    dwMaxBitRate: 39321600
    dwMaxVideoFrameBufferSize: 196608
    dwDefaultFrameInterval: 400000
    bFrameIntervalType: Discrete (1 choice)
    dwFrameInterval: 400000
VIDEO STREAMING INTERFACE DESCRIPTOR [Still Image Frame]
    bLength: 14
    bDescriptorType: 0x24 (video class interface)
    Subtype: Still Image Frame (3)
    Descriptor data: 00020001c0000001800100
VIDEO STREAMING INTERFACE DESCRIPTOR [Colorformat]
    bLength: 6
    bDescriptorType: 0x24 (video class interface)
    Subtype: Colorformat (13)
    bColorPrimaries: BT.709, sRGB (1)
    bTransferCharacteristics: BT.709 (1)
    bMatrixCoefficients: SMPTE 170M (BT.601) (4)
INTERFACE DESCRIPTOR (1.1): class Video
    bLength: 9
    bDescriptorType: 0x04 (INTERFACE)
    bInterfaceNumber: 1
    bAlternateSetting: 1
    bNumEndpoints: 1
    bInterfaceClass: Video (0x0e)
    bInterfaceSubClass: 0x02
    bInterfaceProtocol: 0x00
    iInterface: 0
ENDPOINT DESCRIPTOR
    bLength: 7
    bDescriptorType: 0x05 (ENDPOINT)
    bEndpointAddress: 0x81 IN Endpoint:1
        1... .... = Direction: IN Endpoint
        .... 0001 = Endpoint Number: 0x1
    bmAttributes: 0x05
        .... ..01 = Transfertype: Isochronous-Transfer (0x1)
        .... 01.. = Synchronisationtype: Asynchronous (0x1)
        ..00 .... = Behaviourtype: Data-Endpoint (0x0)
    wMaxPacketSize: 128
        ...0 0... .... .... = Transactions per microframe: 1 (0)
        .... ..00 1000 0000 = Maximum Packet Size: 128
    bInterval: 1
INTERFACE DESCRIPTOR (1.2): class Video
    bLength: 9
    bDescriptorType: 0x04 (INTERFACE)
    bInterfaceNumber: 1
    bAlternateSetting: 2
    bNumEndpoints: 1
    bInterfaceClass: Video (0x0e)
    bInterfaceSubClass: 0x02
    bInterfaceProtocol: 0x00
    iInterface: 0
ENDPOINT DESCRIPTOR
    bLength: 7
    bDescriptorType: 0x05 (ENDPOINT)
    bEndpointAddress: 0x81 IN Endpoint:1
        1... .... = Direction: IN Endpoint
        .... 0001 = Endpoint Number: 0x1
    bmAttributes: 0x05
        .... ..01 = Transfertype: Isochronous-Transfer (0x1)
        .... 01.. = Synchronisationtype: Asynchronous (0x1)
        ..00 .... = Behaviourtype: Data-Endpoint (0x0)
    wMaxPacketSize: 512
        ...0 0... .... .... = Transactions per microframe: 1 (0)
        .... ..10 0000 0000 = Maximum Packet Size: 512
    bInterval: 1
INTERFACE DESCRIPTOR (1.3): class Video
    bLength: 9
    bDescriptorType: 0x04 (INTERFACE)
    bInterfaceNumber: 1
    bAlternateSetting: 3
    bNumEndpoints: 1
    bInterfaceClass: Video (0x0e)
    bInterfaceSubClass: 0x02
    bInterfaceProtocol: 0x00
    iInterface: 0
ENDPOINT DESCRIPTOR
    bLength: 7
    bDescriptorType: 0x05 (ENDPOINT)
    bEndpointAddress: 0x81 IN Endpoint:1
        1... .... = Direction: IN Endpoint
        .... 0001 = Endpoint Number: 0x1
    bmAttributes: 0x05
        .... ..01 = Transfertype: Isochronous-Transfer (0x1)
        .... 01.. = Synchronisationtype: Asynchronous (0x1)
        ..00 .... = Behaviourtype: Data-Endpoint (0x0)
    wMaxPacketSize: 1024
        ...0 0... .... .... = Transactions per microframe: 1 (0)
        .... ..00 0000 0000 = Maximum Packet Size: 0
    bInterval: 1
INTERFACE DESCRIPTOR (1.4): class Video
    bLength: 9
    bDescriptorType: 0x04 (INTERFACE)
    bInterfaceNumber: 1
    bAlternateSetting: 4
    bNumEndpoints: 1
    bInterfaceClass: Video (0x0e)
    bInterfaceSubClass: 0x02
    bInterfaceProtocol: 0x00
    iInterface: 0
ENDPOINT DESCRIPTOR
    bLength: 7
    bDescriptorType: 0x05 (ENDPOINT)
    bEndpointAddress: 0x81 IN Endpoint:1
        1... .... = Direction: IN Endpoint
        .... 0001 = Endpoint Number: 0x1
    bmAttributes: 0x05
        .... ..01 = Transfertype: Isochronous-Transfer (0x1)
        .... 01.. = Synchronisationtype: Asynchronous (0x1)
        ..00 .... = Behaviourtype: Data-Endpoint (0x0)
    wMaxPacketSize: 2816
        ...0 1... .... .... = Transactions per microframe: 2 (1)
        .... ..11 0000 0000 = Maximum Packet Size: 768
    bInterval: 1
INTERFACE DESCRIPTOR (1.5): class Video
    bLength: 9
    bDescriptorType: 0x04 (INTERFACE)
    bInterfaceNumber: 1
    bAlternateSetting: 5
    bNumEndpoints: 1
    bInterfaceClass: Video (0x0e)
    bInterfaceSubClass: 0x02
    bInterfaceProtocol: 0x00
    iInterface: 0
ENDPOINT DESCRIPTOR
    bLength: 7
    bDescriptorType: 0x05 (ENDPOINT)
    bEndpointAddress: 0x81 IN Endpoint:1
        1... .... = Direction: IN Endpoint
        .... 0001 = Endpoint Number: 0x1
    bmAttributes: 0x05
        .... ..01 = Transfertype: Isochronous-Transfer (0x1)
        .... 01.. = Synchronisationtype: Asynchronous (0x1)
        ..00 .... = Behaviourtype: Data-Endpoint (0x0)
    wMaxPacketSize: 3072
        ...0 1... .... .... = Transactions per microframe: 2 (1)
        .... ..00 0000 0000 = Maximum Packet Size: 0
    bInterval: 1
INTERFACE DESCRIPTOR (1.6): class Video
    bLength: 9
    bDescriptorType: 0x04 (INTERFACE)
    bInterfaceNumber: 1
    bAlternateSetting: 6
    bNumEndpoints: 1
    bInterfaceClass: Video (0x0e)
    bInterfaceSubClass: 0x02
    bInterfaceProtocol: 0x00
    iInterface: 0
ENDPOINT DESCRIPTOR
    bLength: 7
    bDescriptorType: 0x05 (ENDPOINT)
    bEndpointAddress: 0x81 IN Endpoint:1
        1... .... = Direction: IN Endpoint
        .... 0001 = Endpoint Number: 0x1
    bmAttributes: 0x05
        .... ..01 = Transfertype: Isochronous-Transfer (0x1)
        .... 01.. = Synchronisationtype: Asynchronous (0x1)
        ..00 .... = Behaviourtype: Data-Endpoint (0x0)
    wMaxPacketSize: 4992
        ...1 0... .... .... = Transactions per microframe: 3 (2)
        .... ..11 1000 0000 = Maximum Packet Size: 896
    bInterval: 1
INTERFACE DESCRIPTOR (1.7): class Video
    bLength: 9
    bDescriptorType: 0x04 (INTERFACE)
    bInterfaceNumber: 1
    bAlternateSetting: 7
    bNumEndpoints: 1
    bInterfaceClass: Video (0x0e)
    bInterfaceSubClass: 0x02
    bInterfaceProtocol: 0x00
    iInterface: 0
ENDPOINT DESCRIPTOR
    bLength: 7
    bDescriptorType: 0x05 (ENDPOINT)
    bEndpointAddress: 0x81 IN Endpoint:1
        1... .... = Direction: IN Endpoint
        .... 0001 = Endpoint Number: 0x1
    bmAttributes: 0x05
        .... ..01 = Transfertype: Isochronous-Transfer (0x1)
        .... 01.. = Synchronisationtype: Asynchronous (0x1)
        ..00 .... = Behaviourtype: Data-Endpoint (0x0)
    wMaxPacketSize: 5120
        ...1 0... .... .... = Transactions per microframe: 3 (2)
        .... ..00 0000 0000 = Maximum Packet Size: 0
    bInterval: 1
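The wMaxPacketSize values in the streaming endpoint descriptors above pack two fields: bits 10..0 hold the packet size and bits 12..11 hold the number of additional transactions per microframe. The sketch below decodes the seven alternate settings and lists those that fit a 512-byte, single-transaction isochronous endpoint, which is the limit assumed from the MPS = 512 figure quoted earlier in the thread. Note that the capture tool prints a size of 0 for the settings whose size field is 1024; decoding all eleven bits gives 1024 instead. The function name is made up for illustration.

```python
def decode_max_packet_size(w_max_packet_size: int):
    """Split wMaxPacketSize into (packet size, transactions per microframe)."""
    size = w_max_packet_size & 0x07FF                 # bits 10..0
    transactions = ((w_max_packet_size >> 11) & 0x3) + 1  # bits 12..11, plus one
    return size, transactions


# bAlternateSetting -> wMaxPacketSize, copied from the descriptor dump.
ALT_SETTINGS = {1: 128, 2: 512, 3: 1024, 4: 2816, 5: 3072, 6: 4992, 7: 5120}

HOST_MPS_LIMIT = 512  # assumed host limit, taken from the thread

usable = []
for alt, raw in ALT_SETTINGS.items():
    size, transactions = decode_max_packet_size(raw)
    fits = transactions == 1 and size <= HOST_MPS_LIMIT
    if fits:
        usable.append(alt)
    print(f"alt {alt}: {size} bytes x {transactions} -> {'ok' if fits else 'too large'}")

print("usable alternate settings:", usable)
```

Under these assumptions only alternate settings 1 and 2 are candidates, so the host can request at most 512 bytes per millisecond from this camera.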

sxrap1, Aug 04 '23 03:08

@sxrap1 Would it be possible for you to share how you changed the _update_config_from_descriptor function? I'm having the same issue and would like to check if I'm doing it correctly. Thanks!

codyprupp, Aug 30 '23 22:08

> @sxrap1 Would it be possible for you to share how you changed the _update_config_from_descriptor function? I'm having the same issue and would like to check if I'm doing it correctly. Thanks!

Have you solved this problem?

jeff-getlucky, Mar 08 '24 04:03

> Have you solved this problem?

No, I opted for a different video streaming method for my camera instead.

codyprupp, Mar 08 '24 04:03

I never really got this going; in the end I switched to a more capable processor. Cameras with bandwidth under about 3 or 4 Mbps will work, which pretty much rules out YUYV at anything more than 5 or 6 frames per second (assuming the camera can be set down to that). I'm now using a Banana Pi CM4 running Ubuntu and YOLOv5, so way beyond what the ESP could ever do.

sxrap1, Mar 08 '24 04:03