librealsense
Assertion when pausing a playing bag file.
| Required Info | |
|---|---|
| Camera Model | 400 series |
| Firmware Version | 5.15.0.2 |
| Operating System & Version | Linux Arch/KDE/Wayland |
| Kernel Version (Linux Only) | 6.6.4-arch1-1 |
| Platform | PC |
| SDK Version | 2.54.1 (Debug build) |
| Language | C/C++ |
| Segment | |
Issue Description
Pausing a playing bag file with playback.pause(); causes an assertion when the playback real-time setting is set to false, i.e. playback.set_real_time(false);. The assertion comes with the following message:
LOG_ERROR("Error - timeout waiting for pause, possible deadlock detected");
There is no error when real-time playback is enabled.
Hi @GrubbyHalo Is your librealsense SDK built from source code using the flag -DFORCE_RSUSB_BACKEND=TRUE please?
You are using a Linux distribution other than Ubuntu, a debug build, and a kernel version that is not supported at the time of writing this (6.2 is supported now, though). All of these factors could cause problems in a build where RSUSB is not true.
> Is your librealsense SDK built from source code using the flag -DFORCE_RSUSB_BACKEND=TRUE please?
Hi Marty.
I don't think -DFORCE_RSUSB_BACKEND=TRUE would make a difference here, as frames are coming from a bag file, not via USB. Also, the debug build just gives me the assert, while the release build will at a minimum cause a delay of about 5 seconds before the call to pause returns, which is not acceptable in my use case.
I also don't think this is specific to the kernel version either.
A RealSense user at https://github.com/IntelRealSense/librealsense/issues/10312#issuecomment-1068230505 suggests increasing the size of the frame queue to increase the stability of playback and pause.
It is possible to use pause without setting set_real_time to false though, as demonstrated by the SDK's rs-record-playback example program.
https://github.com/IntelRealSense/librealsense/blob/master/examples/record-playback/rs-record-playback.cpp#L161
set_real_time being false is more important when navigating to a particular frame number rather than just pausing playback.
> set_real_time being false is more important when navigating to a particular frame number rather than just pausing playback.
This is what I am doing in addition to other things.
The issue you referenced above has to do with polling or waiting on frames from the pipeline while the real-time setting is disabled. The issue I am experiencing is with pause being called on a playback pipeline without enabling real-time frames.
An alternative approach to pausing that a RealSense user with a set_real_time = false C++ playback script used in https://github.com/IntelRealSense/librealsense/issues/1579#issuecomment-430163908 is to activate a sleep period to suspend playback activity if a 'stop' bool condition is set to false.
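As a rough illustration of that flag-based approach, here is a minimal sketch that uses only the C++ standard library. `PauseGate` and its methods are hypothetical names, not part of librealsense, and the sketch swaps the plain sleep from the linked comment for a condition variable with a timeout, so that a resume is noticed promptly.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical helper (not a librealsense type): a flag that the frame loop
// checks before fetching the next frame, instead of calling playback.pause().
class PauseGate {
public:
    void pause() {
        std::lock_guard<std::mutex> lock(mutex_);
        paused_ = true;
    }

    void resume() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            paused_ = false;
        }
        cv_.notify_all();  // wake any loop that is sleeping in the gate
    }

    // Sleeps while paused, for at most `timeout`.
    // Returns true if the caller may go on to process the next frame.
    bool wait_until_running(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        return cv_.wait_for(lock, timeout, [this] { return !paused_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    bool paused_ = false;
};
```

A frame loop would call `wait_until_running` before each poll and skip the poll when it returns false. Whether suspending the consumer this way interacts well with non-real-time playback is not confirmed here.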
Does navigating to a particular frame index become unreliable in your program if set_real_time is false?
In my case I need to seek while paused. I guess that I will inevitably need to work around not calling the pause method while polling or waiting for frames, with an implementation specific to my project. However, the larger issue I wanted to bring to the attention of the maintainers of the librealsense library is that there is clearly at least one bug within the playback code that needs to be resolved, as highlighted by my issue and the first comment you referenced.
I have added an Enhancement label to this case in order to keep it open.
Would it be possible to share your complete C++ script in a comment so that it can be tested by the Intel RealSense team to confirm whether your problem can be replicated, please?
I can reproduce the bug with the following code. Although the code seems nonsensical, it seems that the bug only occurs when the thread polling for frames blocks waiting for input and then calls the pause method. If pause is called immediately after getting the first frame, without waiting for input, no error occurs.
#include <librealsense2/rs.hpp> // Include RealSense Cross Platform API
#include <iostream>
#include <exception>
#include <thread>
#include <mutex>
#include <atomic>
#include <chrono>  // std::chrono durations used by pretty_time
#include <cstdlib> // EXIT_SUCCESS / EXIT_FAILURE
#include <string>
#include <signal.h>
#include <iomanip>
#include <sstream>
#include <termios.h>
#include <unistd.h>
#include <queue>
#include <condition_variable>

std::atomic_bool g_exit;

void signal_treatment(int param)
{
    switch (param)
    {
    case SIGPIPE:
    case SIGHUP:
    case SIGINT:
    case SIGTERM:
    case SIGUSR1:
    case SIGUSR2:
        g_exit = true;
        break;
    }
}

// Formats a duration as h:mm:ss
std::string pretty_time(std::chrono::nanoseconds duration)
{
    using namespace std::chrono;
    auto hhh = duration_cast<hours>(duration);
    duration -= hhh;
    auto mm = duration_cast<minutes>(duration);
    duration -= mm;
    auto ss = duration_cast<seconds>(duration);
    duration -= ss;
    auto ms = duration_cast<milliseconds>(duration);
    std::ostringstream stream;
    stream << std::setfill('0') << std::setw(hhh.count() >= 10 ? 2 : 1) << hhh.count() << ':' <<
        std::setfill('0') << std::setw(2) << mm.count() << ':' <<
        std::setfill('0') << std::setw(2) << ss.count();
    return stream.str();
}

int main(int argc, char **argv)
try
{
    signal(SIGHUP, signal_treatment);
    signal(SIGPIPE, signal_treatment);
    signal(SIGINT, signal_treatment);
    signal(SIGUSR1, signal_treatment);
    signal(SIGUSR2, signal_treatment);
    signal(SIGTERM, signal_treatment);
    signal(SIGALRM, signal_treatment);

    rs2::config cfg;
    rs2::pipeline pipe;
    rs2::frame_queue queue; // default capacity
    cfg.enable_device_from_file(argv[1]);
    pipe.start(cfg);
    auto device = pipe.get_active_profile().get_device();
    auto playback = device.as<rs2::playback>();
    playback.set_real_time(false);

    // Polling thread: moves framesets from the pipeline into the frame queue
    std::thread frame_poll_thread([&]()
    {
        while(!g_exit){
            rs2::frameset data;
            if(pipe.poll_for_frames(&data)){
                queue.enqueue(data);
            }
        }
    });

    rs2::frameset fs;
    uint64_t frame_pos = 0;
    char key;
    bool first = true;
    while(!g_exit){
        if (queue.poll_for_frame(&fs)){
            auto frame_pos = playback.get_position();
            std::string time_elapsed = pretty_time(std::chrono::nanoseconds(frame_pos));
            std::cout << "Time Elapsed: " << time_elapsed << std::endl;
        }
        if(first && fs.get_depth_frame()){ // get at least one frame
            std::cout << "Press P to pause/unpause playback" << std::endl;
            std::cin >> key; // blocks here while the polling thread keeps running
            playback.pause();
            first=false;
        }
    }
    while (queue.poll_for_frame(&fs))
        ; // clear the queue so the video processing thread can exit
    frame_poll_thread.join();
    return EXIT_SUCCESS;
}
catch (const std::exception &e)
{
    std::cerr << e.what() << std::endl;
    return EXIT_FAILURE;
}
catch (...)
{
    std::cerr << "Unknown exception" << std::endl;
    return EXIT_FAILURE;
}
Looking at your script, I would suspect that use of threads is contributing to the problem as issues are more likely to occur with scripts that use threads than ones with more straightforward code. Using threads, poll_for_frames and set_real_time=false with bag reading in a C++ script is discussed at https://github.com/IntelRealSense/librealsense/issues/2711
Hi Marty
Unfortunately, for non-trivial solutions multi-threading is often a requirement, and especially so for my project. I suspect that the librealsense library code in the issue you referenced, which was discussed 5 years ago, perhaps wasn't suitable for use in a threaded application back then? However, as the library code stands now, that code will never exhaust system memory if the user never calls poll_for_frames. Looking at the implementation of frame_queue in librealsense/third-party/rsutils/include/rsutils/concurrency/concurrency.h, it would seem that frame queues have a default size of 1, and that any attempt to push a new frame onto a full queue simply discards the item at the front of the queue, in a thread-safe way. This ensures that an application that stalls in dequeuing frames will never have a frame queue that keeps on growing and consuming more memory.
// Enqueue an item onto the queue.
// If the queue grows beyond capacity, the front will be removed, losing whatever was there!
bool enqueue(T&& item)
{
    std::unique_lock<std::mutex> lock(_mutex);
    if( ! _accepting )
    {
        if( _on_drop_callback )
            _on_drop_callback( item );
        return false;
    }
    _queue.push_back(std::move(item));
    if( _queue.size() > _cap )
    {
        if( _on_drop_callback )
            _on_drop_callback( _queue.front() );
        _queue.pop_front();
    }
    lock.unlock();
    // We pushed something -- let others know there's something to dequeue
    _deq_cv.notify_one();
    return true;
}
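To make the drop-on-overflow behaviour concrete, here is a minimal stand-alone sketch of the same idea using only the standard library. `DropOldestQueue` and `drain_after_burst` are hypothetical names invented for this illustration. The sketch only mirrors the non-blocking enqueue shown above and is not a copy of the SDK's queue, which also has other enqueue paths.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for the SDK queue: keeps at most `cap` items and
// evicts the item at the front when a push would exceed that capacity.
template <typename T>
class DropOldestQueue {
public:
    explicit DropOldestQueue(std::size_t cap) : cap_(cap) { assert(cap > 0); }

    // Never blocks: pushes the item, then drops the front if over capacity.
    void enqueue(T item) {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push_back(std::move(item));
        if (queue_.size() > cap_)
            queue_.pop_front();
    }

    // Non-blocking dequeue; returns false when the queue is empty.
    bool try_dequeue(T& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (queue_.empty())
            return false;
        out = std::move(queue_.front());
        queue_.pop_front();
        return true;
    }

private:
    std::mutex mutex_;
    std::deque<T> queue_;
    std::size_t cap_;
};

// Pushes frame numbers 0..count-1 while nothing is dequeuing, then returns
// whatever is still in the queue afterwards.
inline std::vector<int> drain_after_burst(std::size_t cap, int count) {
    DropOldestQueue<int> q(cap);
    for (int i = 0; i < count; ++i)
        q.enqueue(i);
    std::vector<int> left;
    int value = 0;
    while (q.try_dequeue(value))
        left.push_back(value);
    return left;
}
```

With a capacity of 1, a burst of five frames leaves only the newest frame in the queue, so memory use stays bounded however long the consumer stalls.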
Also note that my code snippet I provided to illustrate the bug was based, in part, on the rs-measure.cpp example code.
// Video-processing thread will fetch frames from the camera,
// apply post-processing and send the result to the main thread for rendering
// It receives synchronized (but not spatially aligned) pairs
// and outputs synchronized and aligned pairs
std::thread video_processing_thread([&]() {
    while (alive)
    {
        // Fetch frames from the pipeline and send them for processing
        rs2::frameset data;
        if (pipe.poll_for_frames(&data))
        {
            .
            .
            // Send resulting frames for visualization in the main thread
            postprocessed_frames.enqueue(data);
        }
    }
});
rs2::frameset current_frameset;
while(app) // Application still alive?
{
    // Fetch the latest available post-processed frameset
    postprocessed_frames.poll_for_frame(&current_frameset);
    if (current_frameset)
    {
        .
        .
    }
}
Yes, the default frame queue size is 1, though you can set a larger custom value. At the link below, Intel provide an example of doing so in C++ by setting a CAPACITY value for the frame queue, under the librealsense2 heading.
https://dev.intelrealsense.com/docs/api-how-to#do-processing-on-a-background-thread
> Yes, the default frame queue size is 1, though you can set a larger custom value
Of course. Should we leave this issue open until there is a resolution to the pause bug?
I have already given this issue an Enhancement label to signify that it should be left open indefinitely.
Hi @GrubbyHalo Were you able to achieve a solution to your issue, or do you still wish the issue to be referred to my Intel RealSense colleagues, please? Thanks!
@MartyG-RealSense As this is clearly a bug in the library, I feel that this issue should be referred to your colleagues maintaining the library.
Hi @GrubbyHalo Thanks very much again for your patience. My colleagues have been able to reproduce the issue that you experienced and have created an official internal Intel bug report so that the issue can be investigated further.
Hi @GrubbyHalo, the error occurs because the frame_queue in the given code is attempting to enqueue using a blocking operation while the queue is full. At the same time, the playback sensor dispatcher's queue is also full, and both are waiting to acquire the same lock. Neither queue is being dequeued, which causes the flush operation to fail and triggers an assertion error. The solution is to increase the size of the frame_queue in the given code to at least 11.
To elaborate a bit: when the user asks to pause the playback, the SDK first allows the frame currently being processed to finish, and then cleans the queue and pauses. Since the user code uses a RealSense blocking queue with a default size of 1, it is blocked, and the SDK is blocked waiting for the user callback to return.
Please update the user code either to use a queue size bigger than the SDK pipeline size (> 10) or to refactor it so that enqueueing a new frame does not block, as a blocked enqueue stops the dequeue operation and will stay blocked forever.
This is a very extreme case that the SDK will not manage; it assumes that the user callback / frame will be released at some stage.
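The stall described above can be modelled with a small standard-library sketch. It is a simplified model under stated assumptions, not the SDK's implementation: `BoundedBlockingQueue` and `accepted_without_consumer` are hypothetical names, the enqueue waits with a timeout so that the demonstration terminates instead of hanging, and the capacity of 11 is simply the figure quoted in the reply.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>

// Hypothetical model of a bounded queue whose enqueue waits for free space.
template <typename T>
class BoundedBlockingQueue {
public:
    explicit BoundedBlockingQueue(std::size_t cap) : cap_(cap) { assert(cap > 0); }

    // Waits up to `timeout` for space, then gives up and returns false.
    // An enqueue without the timeout would wait here indefinitely.
    bool try_enqueue_for(T item, std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        if (!space_cv_.wait_for(lock, timeout,
                                [this] { return queue_.size() < cap_; }))
            return false;
        queue_.push_back(std::move(item));
        return true;
    }

    // Non-blocking dequeue; frees one slot and wakes a waiting producer.
    bool try_dequeue(T& out) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (queue_.empty())
                return false;
            out = std::move(queue_.front());
            queue_.pop_front();
        }
        space_cv_.notify_one();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable space_cv_;
    std::deque<T> queue_;
    std::size_t cap_;
};

// Counts how many of `frames` enqueues succeed while nothing is dequeuing,
// which models a consumer thread that is stuck waiting for user input.
inline int accepted_without_consumer(std::size_t cap, int frames) {
    BoundedBlockingQueue<int> q(cap);
    int accepted = 0;
    for (int i = 0; i < frames; ++i)
        if (q.try_enqueue_for(i, std::chrono::milliseconds(5)))
            ++accepted;
    return accepted;
}
```

In this model a queue of capacity 1 accepts a single frame and every later enqueue stalls until its timeout, whereas a capacity of 11 absorbs eleven frames without waiting. That matches the two remedies suggested in the reply: give the queue more room than the producer can fill, or make the enqueue path unable to block.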