iceoryx
iceoryx copied to clipboard
Deprecate posix timer
Brief feature description
Deprecate posix timer from iceoryx code base.
Detailed information
Posix timer has several issues (concurrency, truncation) and is pretty heavy weight and a bit slow. As iceorxy needs to be very responsive the posix timer should be removed and replaced with a lightweight stopwatch.
This will also close #190 as it'll be obsolete.
Tasks
- [x] create
DeadlineTimer
or similar and replace this part of thePosixTimer
where applicable - [x] create an object that creates a thread and peridically executes a task
- [ ] create a small wrapper around a
posix::semaphore
which can be used forconcurrent::PeriodicTask
- [x] replace the keep alive message from a
std::chrono
timestamp with aunits::Duration
from atimespec
initialized withclock_gettime
withCLOCK_REALTIME
-> maybe better to use a different issue for this
@marthtz I think a stopwatch is not exactly what we need. We have to rewrite the timer so that the timer does not work with callbacks anymore but instead with signals. If you look at the manpage https://man7.org/linux/man-pages/man2/timer_create.2.html we have to possible approaches:
- send a signal to the whole process
- send a signal to a specific thread specified when creating the timer
I think we should explore the second option. Maybe it has similar drawbacks than the callback approach but if not it should be our way to go.
@elBoberido what do you think?
@elfenpiff @marthtz please don't call it stopwatch. A stopwatch is measuring time, this functionality is running towards a deadline or countdown. See also the equivalent int boost and Qt.
To the other points. The posix timer was started as a deadline timer and then enhanced with the posix calls. Currently it has both functionalities.
- cut out the deadline functionality and use that class where appropriate
- deprecate the current posix timer and replace the remaining occurrences with
std::thread
andsleep_for
- decide if we really need something like the posix timer. If we do, implement the second option from @elfenpiff which is also described in more detail in #170. But do this in an own class. Don't repeat the mistake and stuff it into the deadline functionality.
@elfenpiff I am currently working on this story. Based on the comments from @dkroenke I am working on to replace the timer.start() function with a simple thread sleep. Currently the posixtimer.start() is being called within the posh_runtime, service_discovery_notifier.
@shankar-bosch a simple sleep_for
is not enough. The current timer invokes a thread on each timeout. You also have a start a thread where you use the sleep_for
.
@elBoberido okay I will do so
@shankar-bosch if you are working on an issue could you please assign it to you and add it into the iceoryx 1.0 board: https://github.com/eclipse/iceoryx/projects/2 so that everyone is aware of it.
@elfenpiff sure. I am not in the list of contributors. Could you pls add me? I can assign this issue in my name and bring this to the board.
@elBoberido I have clarifications. As you know the timers can be one shot or periodic based on the it_interval .
If it is a one shot timer there can be a thread::sleepfor(it_value)
.
if the timer is a periodic then there has to be a thread::sleepfor(it_value)
followed by a thread::sleepfor(it_interval)
until the application exit? I am not sure what to do here.
I also dont see the start()
function within the timer.cpp invoking a thread after a timer is disarmed. Is it done in a different place? could you pls also help me here?
@shankar-bosch if you are working on an issue could you please assign it to you and add it into the iceoryx 1.0 board: https://github.com/eclipse/iceoryx/projects/2 so that everyone is aware of it.
@elfenpiff This can only be done by an Eclipse committer. Eclipse committers are voted on in elections.
@mossmaurice Thank you
@shankar-bosch a simple
sleep_for
is not enough. The current timer invokes a thread on each timeout. You also have a start a thread where you use thesleep_for
.
@elBoberido How about just removing the timer from iceoryx ? I would need to double check but the current use-cases we have within iceoryx can be resolved using a simple sleep_for (e.g. sending heartbeat).
@ithier that's the idea of this issue. There is an additional functionality in the timer (checking for a deadline) and that should to be preserved
@shankar-bosch the thread is started implicitly. Have a look at https://linux.die.net/man/2/timer_create, we use SIGEV_THREAD
If you use just sleep_for
, the semantic of the code changes. In this case the thread which previously called start
is now blocked, while it just continues running with the current timer implementation. You have to create a thread and use sleep_for
in a loop. I would suggest to start with cutting out the functionality of the DeadlineTimer
to cxx/deadline_timer.hpp/cpp
. This should be way easier. You can do it by copying the current timer to cxx
and removing everything from it which is related to the m_osTimer
member
Here is where the timer is used in iceoryx:
- sending heartbeat signal -> can be replaced by creating a single extra thread that calls sleep_for
- waiting for roudi up to a deadline -> can be replaced by saving epoch time and just comparing to the current time each iteration, this does not need another thread
From above, it looks like we only need to do add the heartbeat thread above and write some kind of helper method to wrap the usage of timer_gettime
to satsify the second usecase.
Then we can just mark the old timer as deprecated and we are done with this issue, right ?
it just occurred to me that sleep_for might be really bad for the test time, since it's not interruptible. How about creating a small wrapper around a semaphore with timed wait? e.g.
class Timer {
public:
Timer() noexcept = default;
~Timer() noexcept; // calls stop
cxx::expected<TimerError> wait(cxx::Duration) noexcept; // blocking wait on the semaphore
void stop() noexcept; // sem_post to interrupt waiting
static cxx::expected<units::Duration, TimerError> Timer::now() noexcept; // this is from the old Timer and also used in the codebase
}
Later on the semaphore could be replaced by POSIX timer_create(..)
etc, if necessary.
What do you think?
Current PR #333 also makes use of the timer. Basically, it's checking for a timeout / interval expiry.
@elBoberido wrapper sounds good to me and could be used for this and @ithier first bullet point.
@elBoberido Thanks for the wrapper solution.
We should definitely fix this for 1.0, On ARM 64 I have failing tests because of #190 but I fear that's not the only problem
@budrus to fix #190 we need to refactor the units::Duration away from long double
to chrono
or timespec/timeval
@elBoberido That's at least half of the game. The failing DestructorIsBlocking
could maybe be another thing. This also fails with larger numbers...
@budrus it could still be a rounding issue. AFAIK, the timer doesn't sleep for e.g. 100ms but for currentTime + 100ms
. This might result in rounding errors with the long double
, especially since that's only an alias to double
on ARM and 80 bit wide on x86 ... except for MSVC where it is also an alias to double
@budrus to fix #190 we need to refactor the units::Duration away from
long double
tochrono
ortimespec/timeval
@elBoberido @budrus I suggest here to use neither of them. I would create a struct which looks like timespec
struct duration_t {
uint64_t seconds;
uint32_t nanoseconds;
};
The reason is that a duration cannot be negative, you cannot wait -5 seconds but timespec/timeval
can be negative. Additionally, timespec/timeval
do not have a fixed size they are using the types long
and time_t
which can differ on different architectures.
And when we refactor units::Duration
we should get rid of all the long double
operators which are absolutely unnecessary - and I know I was the one who introduced them in the first place :/ .
I will take the solution for the truncation issue as suggested by @elfenpiff. Thank you. @budrus Is there a seperate ticket for failing DestructorIsBlocking issue?
@shankar-bosch No, as @elBoberido said, it could also have the same root cause as #190. We would have to analyze this but now things are changing anyway
@budrus Okay. I believe the issue #190 fix should be available for 1.0. So increasing the scope of #337 to fix #190 and analysing the failing DestructorIsBlocking should be fine?
since we are approaching the 0.90 release, is there any update on this?
Yes the PR is getting ready. Should be raised shortly.
@shankar-bosch for Deadline or keep-alive?
Deadline is ready. Keep alive is under testing. The PR should cover both.
@shankar-bosch do you think you can get this in until monday? If not, I could do a fast fix for the keep alive.
Just as reminder to keep the PRs small. The PR you are preparing could be split into two. The dealine timer and the keep alive functionality.