PcapPlusPlus
PcapPlusPlus copied to clipboard
Added code to make pointer vector be thread safe
Added code to make pointer vector be thread safe. Used mutex to lock every operation done on the pointer vector.
Kinda curious, is there a particular issue that requires the vector being thread safe?
Also, if it isn't required all the time, the feature can be made opt-in by injecting the std::mutex type as a template parameter that can be provided a struct implementing no-op lock/unlock/try_unlock methods instead when thread safety is not a requirement, saving the mutex overhead.
We had a requirement when we need to read the packet simultaneously for some kind of edge processing. Yes mutex may be overhead in non-thread environment.
Codecov Report
Attention: Patch coverage is 70.17544% with 17 lines in your changes are missing coverage. Please review.
Project coverage is 82.35%. Comparing base (
c551f40) to head (03989cc).
| Files | Patch % | Lines |
|---|---|---|
| Common++/header/PointerVector.h | 70.17% | 15 Missing and 2 partials :warning: |
Additional details and impacted files
@@ Coverage Diff @@
## dev #1325 +/- ##
=======================================
Coverage 82.34% 82.35%
=======================================
Files 163 163
Lines 20937 20975 +38
Branches 7906 7910 +4
=======================================
+ Hits 17241 17273 +32
- Misses 3025 3030 +5
- Partials 671 672 +1
| Flag | Coverage Δ | |
|---|---|---|
| alpine317 | 72.41% <12.50%> (-0.01%) |
:arrow_down: |
| fedora37 | 72.42% <12.50%> (+0.04%) |
:arrow_up: |
| macos-12 | 61.42% <67.44%> (+0.01%) |
:arrow_up: |
| macos-13 | 60.46% <67.44%> (+0.02%) |
:arrow_up: |
| macos-14 | 60.46% <67.44%> (+0.02%) |
:arrow_up: |
| macos-ventura | 61.49% <72.50%> (+0.01%) |
:arrow_up: |
| mingw32 | 70.28% <12.50%> (-0.03%) |
:arrow_down: |
| mingw64 | 70.29% <12.50%> (-0.04%) |
:arrow_down: |
| npcap | 83.32% <71.73%> (-0.04%) |
:arrow_down: |
| rhel93 | 72.44% <12.50%> (-0.02%) |
:arrow_down: |
| ubuntu1804 | 74.74% <30.00%> (-0.01%) |
:arrow_down: |
| ubuntu2004 | 73.20% <27.27%> (-0.02%) |
:arrow_down: |
| ubuntu2204 | 72.24% <12.50%> (-0.01%) |
:arrow_down: |
| ubuntu2204-icpx | 59.04% <67.44%> (+0.06%) |
:arrow_up: |
| unittest | 82.35% <70.17%> (+<0.01%) |
:arrow_up: |
| windows-2019 | 83.35% <69.38%> (-0.02%) |
:arrow_down: |
| windows-2022 | 83.34% <71.73%> (-0.04%) |
:arrow_down: |
| winpcap | 83.33% <69.38%> (+0.01%) |
:arrow_up: |
| xdp | 59.08% <12.50%> (-0.01%) |
:arrow_down: |
| zstd | 73.82% <70.73%> (+0.03%) |
:arrow_up: |
Flags with carried forward coverage won't be shown. Click here to find out more.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.
Kinda curious, is there a particular issue that requires the vector being thread safe?
Also, if it isn't required all the time, the feature can be made opt-in by injecting the
std::mutextype as a template parameter that can be provided a struct implementing no-oplock/unlock/try_unlockmethods instead when thread safety is not a requirement, saving the mutex overhead.
I think @Dimi1010 has a good point. Maybe we can add an optional parameter in the constructor saying whether or not to use the mutex
Kinda curious, is there a particular issue that requires the vector being thread safe? Also, if it isn't required all the time, the feature can be made opt-in by injecting the
std::mutextype as a template parameter that can be provided a struct implementing no-oplock/unlock/try_unlockmethods instead when thread safety is not a requirement, saving the mutex overhead.I think @Dimi1010 has a good point. Maybe we can add an optional parameter in the constructor saying whether or not to use the mutex
I think if we want to control using lock or not, maybe consider using ifdef USE_LOCK_POINTER_VECTOR preprocessor directive, to prevent run-time branch overhead.
Kinda curious, is there a particular issue that requires the vector being thread safe? Also, if it isn't required all the time, the feature can be made opt-in by injecting the
std::mutextype as a template parameter that can be provided a struct implementing no-oplock/unlock/try_unlockmethods instead when thread safety is not a requirement, saving the mutex overhead.I think @Dimi1010 has a good point. Maybe we can add an optional parameter in the constructor saying whether or not to use the mutex
I think if we want to control using lock or not, maybe consider using
ifdef USE_LOCK_POINTER_VECTORpreprocessor directive, to prevent run-time branch overhead.
adding an additional if before locking the mutex shouldn't have a big performance penalty I think 🤔
@seladb @tigercosmos I was thinking more along the lines of how spdlog solved it with their logging sinks. Ends up requiring a template but you don't really have any runtime overhead for the non-mutex version, and it is not a global switch of all or nothing like a preprocessor directive would be.
Example from spdlog: https://github.com/gabime/spdlog/blob/v1.x/include%2Fspdlog%2Fsinks%2Fostream_sink.h#L39-L40 https://github.com/gabime/spdlog/blob/v1.x/include%2Fspdlog%2Fsinks%2Fbase_sink-inl.h#L27
@seladb @tigercosmos I was thinking more along the lines of how spdlog solved it with their logging sinks. Ends up requiring a template but you don't really have any runtime overhead for the non-mutex version, and it is not a global switch of all or nothing like a preprocessor directive would be.
Example from spdlog: https://github.com/gabime/spdlog/blob/v1.x/include%2Fspdlog%2Fsinks%2Fostream_sink.h#L39-L40 https://github.com/gabime/spdlog/blob/v1.x/include%2Fspdlog%2Fsinks%2Fbase_sink-inl.h#L27
I'm not sure implementing a null_mutex has less performance penalty than adding a simple if... 🤔
@seladb @tigercosmos I was thinking more along the lines of how spdlog solved it with their logging sinks. Ends up requiring a template but you don't really have any runtime overhead for the non-mutex version, and it is not a global switch of all or nothing like a preprocessor directive would be. Example from spdlog: https://github.com/gabime/spdlog/blob/v1.x/include%2Fspdlog%2Fsinks%2Fostream_sink.h#L39-L40 https://github.com/gabime/spdlog/blob/v1.x/include%2Fspdlog%2Fsinks%2Fbase_sink-inl.h#L27
I'm not sure implementing a
null_mutexhas less performance penalty than adding a simpleif... 🤔
Might be, might not. They have null_mutex implemented like this:
struct null_mutex {
void lock() const {}
void unlock() const {}
};
whose lock/unlock compile down to just a return instruction back to caller,
The benefit over the if approach is that when you need a non-thread safe version, you don't need to construct and store a mutex anyway in the struct and it removes the need for ifs before every lock/unlock operation. It also allows you to actually use the lock_guard as you can't really just use it cleanly if you have an if block for wanting to lock and then the guard has to keep existing after the if block exists.
I think the potential minor performance cost of a function call over an if statement is worth it, from the point of code readability and maintainability perspective.
@Dimi1010 's proposal makes sense to me, especially readability and maintainability.
are we fine with above implementation of using null mutex. If so i can go and make the necessary changes
are we fine with above implementation of using null mutex. If so i can go and make the necessary changes
Yes, I think we can go with this approach!
Thank you @Dimi1010 for jumping in and proposing it!
@muthusaravanan3186 would you like to finish this PR?
@muthusaravanan3186 would you like to finish this PR?
@tigercosmos Our requirement was single producer and single consumer. But based on review comments it seems we need to address multiple consumer too. I just want to know what is the expectation here. If we need to change the PointerVector to hold shared ptr references; then it will be big change.. Please let me know what is expected here.
@muthusaravanan3186 would you like to finish this PR?
@tigercosmos Our requirement was single producer and single consumer. But based on review comments it seems we need to address multiple consumer too. I just want to know what is the expectation here. If we need to change the PointerVector to hold shared ptr references; then it will be big change.. Please let me know what is expected here.
@muthusaravanan3186 we can do it in phases. Make some changes discussed above without using shared pointers, and then in the future open another PR to move to shared pointers. The improvements suggested in this PR will make PointerVector mostly thread safe (even if not 100%). We can mention the edge cases in the doxygen documentation
@seladb can you review now
@muthusaravanan3186 could you resolve the conflicts?
@seladb i resolved all the conflicts. But on one of the conflict i am getting the below message Only those with write access to this repository can merge pull requests.
@seladb i resolved all the conflicts. But on one of the conflict i am getting the below message Only those with write access to this repository can merge pull requests.
Not sure what did you do with the merge, but you can resolve the conflicts at your forked repo. Basically you can do the followings:
- set https://github.com/seladb/PcapPlusPlus/ as upstream:
git remote upstream https://github.com/seladb/PcapPlusPlus/ - fetch upstream:
git fetch upstream dev git merge upstream dev- there will be some conflicts showing by git, and you need to resolve them.
@muthusaravanan3186 you can look at this StackOverflow question: https://stackoverflow.com/questions/38949951/how-to-solve-merge-conflicts-across-forks
Please notice that you should resolve conflicts between origin/dev and muthusaravanan3186/dev (not between the master branches of origin and your fork)
@seladb Resolved all the conflicts.
@tigercosmos There is no WIP PR yet; as discussed plan is to merge this PR and start to work on use of shared_ptr.
@tigercosmos There is no WIP PR yet; as discussed plan is to merge this PR and start to work on use of shared_ptr.
I am okay If you want to add the support of shared_ptr, why not add it in this PR?
template<typename T, typename Mutex=pcpp::NullMutex, typename PointerType=std::add_pointer_t<T> >
class PointerVector {
void pushBack(PointerType element)
{
std::lock_guard<Mutex> lk(m_Mutex);
m_Vector.push_back(element);
}
PointerType front()
{
std::lock_guard<Mutex> lk(m_Mutex);
return m_Vector.front();
}
}
As discussed this was the plan
@muthusaravanan3186 we can do it in phases. Make some changes discussed above without using shared pointers, and then in the future open another PR to move to shared pointers. The improvements suggested in this PR will make PointerVector mostly thread safe (even if not 100%). We can mention the edge cases in the doxygen documentation
As discussed this was the plan
@muthusaravanan3186 we can do it in phases. Make some changes discussed above without using shared pointers, and then in the future open another PR to move to shared pointers. The improvements suggested in this PR will make PointerVector mostly thread safe (even if not 100%). We can mention the edge cases in the doxygen documentation
Then I oppose adding unnecessary locks for the future support of the shared pointer, because (1) those locks are useless for T* (2) using the shared pointer is not on the plan. Let's focus on handling the case that uses T*, and forget about the share pointers at this moment.
@tigercosmos Back to square one. we have null mutex for normal case and mutex for multi-threading case. This was discussed. Since it is vector of pointers and we are accessing the pointers which itself is not thread safe. Since changing Pointervector to handle shared Pointers is a huge change; we planned to implement in phases. If we need everything at single shot it may take time. Please let me know what is the expectation. If not will drop this PR and work on something else. We already modified PCAPPLUSPLUS according to our need.
I cannot make decision for this, let's wait for @seladb
@muthusaravanan3186 @tigercosmos @Dimi1010 I'm trying to catch up here... I think we need to decide on 2 issues, please correct me if I'm wrong:
- The iterator is not thread-safe because after the user receives it the vector can change and invalidate the iterator
- The vector holds pointers that are not thread-safe by nature, meaning one thread can hold the pointer and another thread can delete it from the vector which will free the pointer
Which issue(s) are you concerned about and is blocking this PR?
Regarding (1) - do you have any (simple) suggestion on how to solve it?
Regarding (2) - we need to use a std::shared_ptr which is indeed a bigger and a breaking change
@seladb @tigercosmos @muthusaravanan3186
Tbh, I don't really see an easy solution to (1) while keeping the current API of the class. The two ways I could see iteration being done both have problems:
- You either need custom iterator objects that hold a reference to the vector mutex or some other kind of black magic, and even then mutating though the iterator would require custom calls.
- You remove the iteration capability from the 'thread-safe' version and add a thread-safe method that makes a 'non-thread-safe' snapshot of the objects in the vector at time of call and return that to the calling thread to do as it sees fit.
Regarding point 2. Using std::unique_ptr and std::shared_ptr would remove the need for manual allocation tracking, so that would be useful, and std::shared_ptr would work in removing the problem of dangling pointers after removal from the vector.
With all that said, I am personally of the opinion that the burden of thread safety should not be placed on the vector object itself, but on the program that requires the vector data to be accessed by multiple threads simultaneously. My reasoning is that the vector and raw arrays in general provide way too many ways to access or mutate the data to be easily secured, while the program attempting to access the data from different threads would have a far better knowledge of how the data is being accessed to provide thread-safe data access and avoiding race conditions.
A simpler solution would be having the different threads make a local copy of the vector contents while under synchronization and then proceeding independently using their local copies. This type of solution minimizes the points of contention where sync lock needs to be acquired between the threads, as the alternative would require a sync lock on every element access or mutation. It also removes the need of the underlying objects that are being held by the array being thread-safe too.
After reading @Dimi1010 's response I started wondering what is the use case of a thread-safe pointer vector if both (1) and (2) are not solved...
In general, a thread-safe container is only needed when one or more threads are changing the vector, usually while other threads are reading data from it:
- If (1) is not solved then the threads can't use an iterator (in a
forloop for example), so the only way to access the vector data is by usingat()orfront(). While that's possible, it's not very convenient - If (2) is not solved then no thread can remove elements from the vector, otherwise other threads may hold invalid pointers. That's a pretty major limitation that narrows down many of the use-cases for using the vector in a multithread environment
@muthusaravanan3186 can you share your use-case for thread-safe pointer vector? It might help us decide if we actually want to implement it or not