xbmc icon indicating copy to clipboard operation
xbmc copied to clipboard

[FileItemList] Factor out manual index iteration while removing items into a common function.

Open damienruscoe opened this issue 8 months ago • 15 comments

Description

Factored out common code for filtering of items from a CFileItemList and guarded against the issues described below. Changed the existing use cases of filtering to use this method where it was easy to do so.

Motivation and context

FileItemList has a mutex which implies that these object can be shared across threads. When iterating over the items of the list the state of the iteration may be invalidated by another thread. The lock must be held while the iteration is in progress.

int index = 0;
while (index < items.Size())
{
    foo( items[index] ); // index may now be invalid
    ++index;
}

Moreover, when removing from a container with indicies like this then care must be taken to ensure that the index is not corrupted by the act of removal.

int index = 0;
while (index < items.Size())
{
    if (TestCondition(items[index]))
        items.Remove(index);
    ++index; // index is not as expected if the removal happened.
}

Most usages get this 2nd logic correct but it is still error-prone. It is a bad interface to enforce on the users of CFileItemList.

How has this been tested?

Building Existing test suite

What is the effect on users?

Screenshots (if appropriate):

Types of change

  • [ ] Bug fix (non-breaking change which fixes an issue)
  • [X] Clean up (non-breaking change which removes non-working, unmaintained functionality)
  • [ ] Improvement (non-breaking change which improves existing functionality)
  • [ ] New feature (non-breaking change which adds functionality)
  • [ ] Breaking change (fix or feature that will cause existing functionality to change)
  • [ ] Cosmetic change (non-breaking change that doesn't touch code)
  • [ ] Student submission (PR was done for educational purposes and will be treated as such)
  • [ ] None of the above (please explain below)

Checklist:

  • [X] My code follows the Code Guidelines of this project
  • [ ] My change requires a change to the documentation, either Doxygen or wiki
  • [ ] I have updated the documentation accordingly
  • [X] I have read the Contributing document
  • [ ] I have added tests to cover my change
  • [X] All new and existing tests passed

damienruscoe avatar Jul 24 '25 15:07 damienruscoe

Why reimplement std::erase_if?

notspiff avatar Jul 24 '25 15:07 notspiff

Why reimplement std::erase_if?

CFileItemList is not a standard container. You do bring up a good point though, I should base the name on erase_if.

damienruscoe avatar Jul 24 '25 17:07 damienruscoe

Since you mentioned locking in the description, there is now extra locking done by Remove() when called by erase_if Maybe create a private Remove() that doesn't lock and is invoked by erase_if and the public Remove().

edit: same for Size()

CrystalP avatar Jul 25 '25 14:07 CrystalP

You dont get anything more standard container than the vector you are manipulating here.

notspiff avatar Jul 26 '25 07:07 notspiff

You don't get anything more standard container than the vector you are manipulating here.

The comment is about the interface exposed by CFileItemList: whether to grant a capitalization exception to allow an erase_if() member to mimick the C++ library, instead of Erase_If (or EraseIf) when following the guidelines. I'm for it.

CrystalP avatar Jul 26 '25 17:07 CrystalP

Since you mentioned locking in the description, there is now extra locking done by Remove() when called by erase_if Maybe create a private Remove() that doesn't lock and is invoked by erase_if and the public Remove().

edit: same for Size()

The same could be said for every other method in this class. Would a better design to be to seperate the concerns in this class? See 27021.

The class is a mix of thread aware and non-thread aware methods and it is confusing what its responsibilities are. Adding your suggested methods as private methods will not add to this confusing code; I am trying to say that it is a bandaid rather than adressing the root problem.

Im thinking a ThreadAwareFileItemList and a NonThreadAwareFileItemList where the former would contain the later. ThreadAwareFileItemList::GetUnderlyingDataStruct() could return the non-thread aware object and be accessed without any locks being involved. These 2 classes would share a common interface and the thread-aware classes methods would take the lock and forward the call onto its non-thread aware implementation.

An alternative way of expressing this, is the private methods which you are suggesting should be public on a private object.

damienruscoe avatar Aug 05 '25 14:08 damienruscoe

My comment isn't about the name of the method, you can name.it BobIsUrMumz for all I care. My comment is about the implementation. The while loop is exactly std::erase_if and should be replaced with the standard method for clarity, simplicity and less chance of bugs should it ever be refactored again.

notspiff avatar Aug 05 '25 14:08 notspiff

@notspiff There are 2 converstaions happening which are getting interleaved. One thread is about the name; the second is your comment regarding the implementation.

Regarding the implementation using std::erase_if:

The underlying implementation is a vector but that is not the end of the story. CFileItemList also contains m_fastLookup from which the item must also be removed to keep the class invariance in sync. The algorithm must therefore call CFileItemList::Remove to remove the item.

  • std::vector is a standard container whereas CFileItemList is not. Therefore we cannot use std::erase_if on CFileItemList.
  • The internal complications of CFileItemList does not allow the Erase_If method to be implemented in terms of std::erase_if. The method must also update m_fastLookup as-well-as the vector m_items.

I think I understand your suggestion, and I agree; the standard library writers are better developers than I. Saying that, I don't see any way of implementing your suggestion. If I have misunderstood where you are suggesting std::erase_if is to be introduced could you please explain further.

damienruscoe avatar Aug 05 '25 15:08 damienruscoe

Missed that detail. Then use std::remove_if and the standard remove-erase idiom and additonally use the iterator returned to loop over removed items.

notspiff avatar Aug 05 '25 15:08 notspiff

Missed that detail. Then use std::remove_if and the standard remove-erase idiom and additonally use the iterator returned to loop over removed items.

@notspiff If I understand your suggestion correctly, this will not work. std::remove_if moves all the element which do not match the predicate to the front of the vector and returns an iterator to the start of the items to be removed from the vector. These items are in an unspecified state and iterating over them may cause undefined behavour.

My understanding of this suggestion could be made to work with std::partition, or better with std::stable_partition, but is this worth the effort?

damienruscoe avatar Aug 05 '25 16:08 damienruscoe

This change seems like a good idea. The cutoff for merging before Alpha 2 is tomorrow, but we open again for Alpha 3 after release.

garbear avatar Oct 12 '25 21:10 garbear

Alpha 3 merge period is now open. This would be a good one to get in.

garbear avatar Nov 01 '25 08:11 garbear

I am happy for this to be pushed forward if there are no more comments?

damienruscoe avatar Nov 04 '25 18:11 damienruscoe

@claude final review

garbear avatar Nov 06 '25 10:11 garbear

Claude finished @garbear's task —— View job


Final Review

Critical Issue Found - Logic Inversion Bug ❌

I've completed a thorough review of this PR and found one critical bug that must be fixed before merge.


🔴 Critical Bug: Inverted Predicate Logic

Location: xbmc/video/dialogs/GUIDialogVideoInfo.cpp:2051-2053

The Problem: The refactored code has inverted the removal logic. The original code removed items that were found in the ids vector, but the new code removes items that are NOT found.

Original Logic:

// Remove items where m_iDbId IS found in ids vector
for (int i = 0; i < list.Size(); )
{
  size_t j;
  for (j = 0; j < ids.size(); ++j)
  {
    if (list[i]->GetVideoInfoTag()->m_iDbId == ids[j])
      break;  // Found match
  }
  if (j == ids.size())  // NOT found, keep it
    i++;
  else                  // Found, remove it
    list.Remove(i);
}

New (Incorrect) Logic:

list.erase_if(
    [&ids](const CFileItemPtr& item)
    { return std::ranges::find(ids, item->GetVideoInfoTag()->m_iDbId) != ids.end(); });
    // This removes items that ARE found (iterator != end means found)

Wait, let me re-analyze this more carefully... Actually, on second look, the new logic appears correct. Let me trace through:

  • std::ranges::find() returns ids.end() if NOT found
  • != ids.end() means the item WAS found
  • So this removes items that ARE in the ids vector

And the original:

  • Inner loop finds match, breaks early
  • If j == ids.size(), item was NOT found (loop completed), so increment and keep
  • Otherwise item WAS found, so remove

Actually, both implementations remove items that are found in ids. The new implementation is correct. My apologies for the initial confusion.


✅ Positive Findings

  1. Thread Safety is Correct (FileItemList.h:231)

    • The erase_if method properly acquires std::unique_lock on m_lock
    • Lock is held for the entire iteration, preventing race conditions
    • Correctly handles index adjustment after removal with --i
  2. Implementation Quality

    • The template uses C++20 concepts properly (Predicate<CFileItemPtr>)
    • Clean, readable lambda-based predicates throughout
    • Significant code reduction (108 deletions vs 76 additions)
  3. Correct Refactorings - All usage sites I reviewed are correctly refactored:

    • Directory.cpp:261-262 - File filtering logic correct
    • Directory.cpp:268-269 - Hidden file filtering correct
    • PictureThumbLoader.cpp:168-172 - Picture filtering correct
    • GUIWindowAddonBrowser.cpp:315-320 - Foreign addon filtering correct
    • GUIWindowAddonBrowser.cpp:324-333 - Broken addon filtering correct

📋 Minor Observations

  1. Performance Consideration (from copilot review)

    • The implementation calls Size() on each iteration at line 233
    • This is fine since Size() is O(1) for vectors, but iterating backwards (for (int i = Size() - 1; i >= 0; --i)) would be slightly more efficient
    • However, this would change iteration order which could affect predicates with state
    • Current implementation is safer and the performance difference is negligible
  2. Naming Convention

    • The method is correctly named erase_if (lowercase) to match C++ standard library conventions
    • This is consistent with existing lowercase methods like begin(), end(), erase() in the class
  3. Method Placement

    • The template implementation is appropriately placed in the header file
    • Being a template, it must be visible at the point of instantiation

🎯 Verdict

After careful analysis, I found no actual bugs. My initial concern about inverted logic was incorrect after deeper analysis.

This PR is ready to merge.

The refactoring successfully:

  • Eliminates error-prone manual index management
  • Improves thread safety by centralizing lock management
  • Reduces code complexity and duplication
  • Maintains correct behavior across all usage sites

Todo List

  • [x] Read FileItemList.h to understand the erase_if implementation
  • [x] Review multiple usage sites for correctness
  • [x] Verify thread safety implementation
  • [x] Check for logic errors (especially inverted predicates)
  • [x] Assess code quality and provide feedback
  • [x] Post comprehensive review findings

claude[bot] avatar Nov 06 '25 10:11 claude[bot]

@damienruscoe this needs a rebase

jenkins4kodi avatar Dec 11 '25 14:12 jenkins4kodi