depthai-core icon indicating copy to clipboard operation
depthai-core copied to clipboard

invalid resolution does not throw exception -- Device starts and sends invalid data

Open diablodale opened this issue 2 years ago • 5 comments

Configuring a pipeline for a resolution not supported by the OAK device prints an error to the log and sends invalid data to host. It does not throw and does not stop Device startup.

I expect bad configuration to stop Device startup. And never send invalid data to host.

Setup

  • OAK-D
  • Microsoft Windows [Version 10.0.19044.1526]
  • depthai-core v2.15.0
  • VS2019 v16.11.10

Repro

  1. create an app that configures the Pipeline for depth and mono camera at THE_480_P
  2. build and run the app

Result

Device sends invalid 400p ImgFrame to host. And prints to log.

image

[14442C10C1A1C6D200] [27.033] [MonoCamera(4)] [error] OV9282 doesn't support THE_480_P resolution, defaulting to THE_400_P
[14442C10C1A1C6D200] [27.033] [MonoCamera(3)] [error] OV9282 doesn't support THE_480_P resolution, defaulting to THE_400_P

Expected

Exception to be thrown when dai::Device(pipeline) is constructed No invalid data sent to host.

Notes

Related to https://github.com/luxonis/depthai-core/issues/400

diablodale avatar Mar 08 '22 18:03 diablodale

Thanks, yes, we should think how to handle this better.

At the moment, not throwing on this error was intended, for example to have existing apps that configure THE_720_P resolution (for OAK-D) to not crash on OAK-D Lite.

The device-side pipeline will still operate properly, the frames sent to host are valid (with ImgFrame metadata/size properly set). VideoEncoder init was also changed to do lazy init, taking the size from the first frame received.

Only if the host uses (for some purpose) the resolution configured while creating the pipeline, that will no longer be valid. In that case, it's better for apps to override their known config based on the width/height of the frames received. Or possible to query the connected cameras or device type first, then eventually adjust the pipeline and send to device.

alex-luxonis avatar Mar 08 '22 20:03 alex-luxonis

Please take this as additional feedback... I'm not mad and intend no insults.

  • I agree with your first sentence ...to handle this better.
  • I hear your desire on the 2nd. You don't want poorly written apps to crash. I disagree this is the SDK's responsibility. If apps crash...they are poorly written apps. They can fix their bugs. If the app (or python script) has an error, they can fix it. I disagree strongly with the low-level SDK attempting to guess at resolutions and (already) getting it wrong. I believe throwing is the right approach. As a fallback, if you really really must, then have the firmware insert black bars or crop to keep the resolution exactly as requested.
  • On the 3rd, I disagree. The data sent today is not valid. What is my thinking? I follow strict logic/data parameters. If my app makes an api call for 640x480 8-bit mono data, then I (not considering stride) should get exactly 307200 bytes of data. I do not get 307200 bytes of data today. I get instead 256000 bytes of data...there is 51,200 bytes of data missing. It is crucial that a caller of the api gets exactly that. Memory allocations, buffers, pointer offsets, GPU sharing, there are countless reasons why the exact data is needed. Image the Linux kernel or Windows ddk display resolution api did this. Dear OS, give me a resolution...whatever you feel like giving me. That's the behavior now of these depthai-core apis. "OAK ignore the resolution parameter and give me whatever resolution you feel like giving me". If this is the intention, then remove the resolution parameter from those APIs because it is meaningless; OAK ignores it today.
  • I strongly disagree that a C/C++ application should reconfigure itself dynamically on a frame-by-frame basis. And it would be frame-by-frame because in today's depthai-core code it is not possible to know if the sensor might feel like changing the resolution to 123x87 or 1029x752 or whatever other resolution it feel like...at any time...for any frame. This does not belong in a low-level sdk. Maybe something like this is good for python scripts. Maybe the python bindings should remove all the resolution parameters and then python will get whatever the sensor wants to give it. But not c/c++. Nope.

diablodale avatar Mar 08 '22 22:03 diablodale

Thanks for the feedback.

Agree on the need to enforce configurations (and throw exceptions when not achievable). But we also need to see how to better handle cases like this simple example for a variety of devices, while not getting it too bloated: https://github.com/luxonis/depthai-core/blob/988442242061362ee0e40851723e88ac45858da1/examples/MonoCamera/mono_preview.cpp#L21

  1. Simply set a common working config (THE_400_P) - not great (if we want to show a higher res with OAK-D), or could get broken when adding support for more sensors that might not support this config.
  2. Add a new entry AUTO, and the device will decide on providing an appropriate resolution - still not good.
  3. Add an extra property like MonoCamera::setStrictResolution(false) - if the requested resolution is not available for the connected camera, to default to a closer working one (what the firmware is doing currently). And default to strict?

Not directly related to this issue, we also need to add a more fine-grained resolution control, implemented by cropping on sensor, for various usecases (e.g. reducing the height to bump the max FPS up): void MonoCamera::setResolution(int width, int height); It should be coupled with a sensor binning config.

Maybe we could see also about adding black bars to accommodate cases where the maximum sensor resolution is less than requested, but I think it's better to not have this as a default, and rather throw an exception.

CC @themarpe @szabi-luxonis

alex-luxonis avatar Mar 08 '22 23:03 alex-luxonis

I think to properly address this, ~~a combination of both https://github.com/luxonis/depthai-core/issues/285 and~~ (EDIT: Would help, but not strictly necessary to do this as of today actually) a means of setting "AUTO" / leaving out / not enforcing an argument, as Alex pointed out.

The auto option (or API not taking eg: resolution information at all) would allow to basically pick a "default / available" parameter without enforcing it, and the actual data and information about it would be queried from the incoming eg: ImgFrame. (or before building the pipeline by BoardConfig / querying sensors from Device / ...)

In that case we could make setting various parameters enforceable/strict with additional API not needing this information upfront if they are capable of dynamically figuring it out.

Not directly related to this issue, we also need to add a more fine-grained resolution control, implemented by cropping on sensor, for various usecases (e.g. reducing the height to bump the max FPS up): void MonoCamera::setResolution(int width, int height); It should be coupled with a sensor binning config.

:+1: Also maybe a "best effort" API - which would enable "easy" fixed resolutions for specific use cases, independent of the underlying sensor. Some sort of thumbnailing / cropping & scaling to produce the desired resolution.

IMO, I'd follow "If API sets something it enforces, if API leaves some information off, the device can "discovery/best effort" choose". In this case its up to "dynamic" use of this information, based on the received messages. Some cases can be enforced and it makes sense, while others are too dynamic in nature and require dealing with dynamic data as it arrives back to host.

(The above could already be applied to VideoEncoder to do a check with the specified resolution instead of ignoring it, to provide the "enforcability", and some sort of auto for resolution to provide "dynamic" use case)

themarpe avatar Mar 09 '22 00:03 themarpe

I think Alex (above) and I agree on....If a setting/param is invalid on Device construct/start, then it should throw.

Mono camera resolution is just one example of a class of issues about invalid parameter checking. This issue and the related issue linked in OP are two of ?probably? many invalids. I'm going to run tests across every setting combination possible in my application with the test automation I recently wrote that is being used to investigate a PoE bug. I just haven't got to it yet 😉

I suggest we all agree on that "invalid->throw" approach and move on to the resolution as one example within that approach. 🤔

In general, I disagree with prioritizing easy examples when designing low-level sdk. A low-level C/C++ sdk's like depthai-core is not easy. That is the nature of a low-level C++ sdk. A low-level sdk needs to be exact full-control, consistent, deterministic, all features, check its parameters, reliable, detailed, etc.

High-level wrappers like python scripts or a not-yet-existing C++ depthai-easybreezy that wraps around low-level SDKs...those are the high-level things to hide the details and make it easy. People using those high-level things don't care (or might not have the knowledge) about bytes, frame alignment, stride, malloc, nv12 vs. yuv...they just want pretty shiny things with auto everything. And that's ok. The high-level is for them.

Regarding auto idea for sensor resolution: Seems reasonable to me that when code cam= p.create<dai::node::xxxxxCamera>(); and never calls cam->setResolution()...then the sdk+firmware can choose the resolution itself by well-known published logic/rules. That would give you that easy example intention. 👍 But... if cam->setResolution() IS called, then it must output that exact resolution or throw.

FYI, not throwing an exception with today's sdk? This is the opposite of easy. And it doesn't yet work. I catch log errors and throw. That works. But in the non-error case, the device silently fails somewhere. Data is not being received by the DataQueues.

// one line
// device = std::make_unique<dai::Device>(pipeline, deviceinfo);

// changes into...
device = std::make_unique<dai::Device>(pipeline.getOpenVINOVersion(), deviceinfo);
{
    std::mutex m;
    std::condition_variable cv;
    std::string errorText;
    const auto logCallbackId = device->addLogCallback([&](dai::LogMessage logMessage) {
        if (logMessage.level == dai::LogLevel::ERR) {
            {
                std::lock_guard lock(m);
                errorText = std::move(logMessage.payload);
            }
            cv.notify_one();
        }
    });
    device->startPipeline(pipeline);
    std::unique_lock lock(m);
    cv.wait_for(lock, std::chrono::seconds(2), [&]{return !errorText.empty();}); // <-- race condition
    device->removeLogCallback(logCallbackId);
    if (!errorText.empty())
        throw std::runtime_error(errorText);
}
// ...and the DataQueues don't get data even when there was no error on startPipeline()

diablodale avatar Mar 09 '22 03:03 diablodale

Ping. What's happening with the firmware fix? I'll close this bug in 7 calendar days with no response from Luxonis.

diablodale avatar Apr 02 '24 00:04 diablodale

@diablodale this will be addressed in the next major release of depthai, where pipeline building will not be possible without the connection to the device beforehand, so we will be able to throw right away on the API call, if the resolution that is being set is not supported by the camera.

moratom avatar Apr 02 '24 11:04 moratom

Then do I understand that this bug will not be fixed for depthai-core v2.x.x? If so, time to close this bug won't fix.

diablodale avatar Apr 02 '24 14:04 diablodale

Closing stale bug, no response from Luxonis

diablodale avatar Apr 09 '24 14:04 diablodale