
apply standard rustfmt style

Open robjtede opened this issue 3 years ago • 1 comment

Motivation

  • Following rustfmt generally increases willingness to contribute to the project.
  • It removes cognitive overhead when reviewing the potentially important PR #222.

Seems like a few of your personal preferences, such as single-line methods, could be accommodated by requiring nightly rustfmt, but I suspect you'd see lots of those get reverted in PRs by folks who automatically apply stable rustfmt in their editor.
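For reference, a sketch of what opting into such preferences might look like in `rustfmt.toml` (these are nightly-only options; names and availability may change between nightlies):

```toml
# Nightly-only rustfmt options; stable rustfmt warns and ignores these,
# so stable formatting applied by an editor would still revert the style.
unstable_features = true
fn_single_line = true   # keep short methods on a single line
```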

robjtede avatar Jul 24 '22 00:07 robjtede

The diff for #222 is about a quarter of the size after formatting:

~600 lines vs ~2.4k lines

diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml
index 65b6c53..ccbf423 100644
--- a/.github/workflows/ci.yml
+++ b/.github/workflows/ci.yml
@@ -51,6 +51,10 @@ jobs:
     runs-on: ubuntu-latest
     steps:
       - uses: actions/checkout@v2
-      - name: Miri
-        run: ci/miri.sh
-
+      - name: Install Miri
+        run: |
+          rustup toolchain install nightly --component miri
+          rustup override set nightly
+          cargo miri setup
+      - name: Test with Miri
+        run: cargo miri test
diff --git a/ci/miri.sh b/ci/miri.sh
deleted file mode 100755
index 272995c..0000000
--- a/ci/miri.sh
+++ /dev/null
@@ -1,15 +0,0 @@
-#!/bin/sh
-
-set -ex
-
-export CARGO_NET_RETRY=5
-export CARGO_NET_TIMEOUT=10
-
-MIRI_NIGHTLY=nightly-$(curl -s https://rust-lang.github.io/rustup-components-history/x86_64-unknown-linux-gnu/miri)
-echo "Installing latest nightly with Miri: $MIRI_NIGHTLY"
-rustup default "$MIRI_NIGHTLY"
-
-rustup component add miri
-cargo miri setup
-
-cargo miri test
diff --git a/src/array_string.rs b/src/array_string.rs
index f580d77..393a037 100644
--- a/src/array_string.rs
+++ b/src/array_string.rs
@@ -380,15 +380,16 @@ impl<const CAP: usize> ArrayString<CAP> {
             None => panic!("cannot remove a char from the end of a string"),
         };
 
-        let next = idx + ch.len_utf8();
         let len = self.len();
+        let removed_len = ch.len_utf8();
+        let next = idx + removed_len;
+        let tail_len = len - next;
         unsafe {
-            ptr::copy(
-                self.as_ptr().add(next),
-                self.as_mut_ptr().add(idx),
-                len - next,
-            );
-            self.set_len(len - (next - idx));
+            // SAFETY: `idx` is in bounds because we just checked that.
+            // `next` is in bounds because the string cannot contain a partial character.
+            let p = self.as_mut_ptr();
+            ptr::copy(p.add(next), p.add(idx), tail_len);
+            self.set_len(len - removed_len);
         }
         ch
     }
@@ -526,37 +527,58 @@ impl<const CAP: usize> Clone for ArrayString<CAP> {
     }
 }
 
 impl<const CAP: usize> Ord for ArrayString<CAP> {
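The reworked `remove` above backshifts the UTF-8 tail over the removed character before shrinking the length. For illustration, here is the same operation in safe code on a `String` (a sketch of the idea, not arrayvec's code; `ArrayString` must do this with raw pointers over its fixed inline buffer):

```rust
// Remove the char starting at byte index `idx` by shifting the tail
// bytes left over it, then shrinking the length; returns the char.
fn remove_char(s: &mut String, idx: usize) -> char {
    // Panics if `idx` is out of bounds or not a char boundary,
    // matching the bounds check in the diff.
    let ch = s[idx..].chars().next().expect("idx must be a char boundary in range");
    let next = idx + ch.len_utf8();
    // Safe equivalent of the `ptr::copy` + `set_len` pair in the diff:
    // remove the char's bytes and let the tail shift down.
    s.replace_range(idx..next, "");
    ch
}

fn main() {
    let mut s = String::from("héllo");
    let ch = remove_char(&mut s, 1); // 'é' occupies bytes 1..3
    assert_eq!(ch, 'é');
    assert_eq!(s, "hllo");
    println!("{} {}", ch, s);
}
```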
diff --git a/src/arrayvec.rs b/src/arrayvec.rs
index 9dde79b..1fdab0b 100644
--- a/src/arrayvec.rs
+++ b/src/arrayvec.rs
@@ -1,3 +1,5 @@
+use core::convert::TryInto;
+use core::ptr::NonNull;
 use std::cmp;
 use std::iter;
 use std::mem;
@@ -52,12 +54,17 @@ impl<T, const CAP: usize> Drop for ArrayVec<T, CAP> {
     }
 }
 
 impl<T, const CAP: usize> ArrayVec<T, CAP> {
@@ -487,13 +494,12 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
 
         impl<T, const CAP: usize> Drop for BackshiftOnDrop<'_, T, CAP> {
             fn drop(&mut self) {
-                if self.deleted_cnt > 0 {
+                if self.deleted_cnt > 0 && self.original_len != self.processed_len {
                     unsafe {
+                        let p = self.v.as_mut_ptr();
                         ptr::copy(
-                            self.v.as_ptr().add(self.processed_len),
-                            self.v
-                                .as_mut_ptr()
-                                .add(self.processed_len - self.deleted_cnt),
+                            p.add(self.processed_len),
+                            p.add(self.processed_len - self.deleted_cnt),
                             self.original_len - self.processed_len,
                         );
                     }
@@ -511,39 +517,50 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
             original_len,
         };
 
-        #[inline(always)]
-        fn process_one<F: FnMut(&mut T) -> bool, T, const CAP: usize, const DELETED: bool>(
+        fn process_loop<F, T, const DELETED: bool, const CAP: usize>(
+            original_len: usize,
             f: &mut F,
             g: &mut BackshiftOnDrop<'_, T, CAP>,
-        ) -> bool {
-            let cur = unsafe { g.v.as_mut_ptr().add(g.processed_len) };
-            if !f(unsafe { &mut *cur }) {
-                g.processed_len += 1;
-                g.deleted_cnt += 1;
-                unsafe { ptr::drop_in_place(cur) };
-                return false;
-            }
-            if DELETED {
-                unsafe {
-                    let hole_slot = g.v.as_mut_ptr().add(g.processed_len - g.deleted_cnt);
-                    ptr::copy_nonoverlapping(cur, hole_slot, 1);
+        ) where
+            F: FnMut(&mut T) -> bool,
+        {
+            while g.processed_len != original_len {
+                // We differ from `std::vec::Vec` here because
+                // we need to borrow the whole slice in the `as_mut_ptr` call,
+                // which would violate Stacked Borrows if we had already used `cur`.
+                let slice_ptr = g.v.as_mut_ptr();
+                // SAFETY: Unchecked element must be valid.
+                let cur = unsafe { &mut *slice_ptr.add(g.processed_len) };
+                if !f(cur) {
+                    // Advance early to avoid double drop if `drop_in_place` panicked.
+                    g.processed_len += 1;
+                    g.deleted_cnt += 1;
+                    // SAFETY: We never touch this element again after dropped.
+                    unsafe { ptr::drop_in_place(cur) };
+                    // We already advanced the counter.
+                    if DELETED {
+                        continue;
+                    } else {
+                        break;
+                    }
+                }
+                if DELETED {
+                    // SAFETY: `deleted_cnt` > 0, so the hole slot cannot overlap with the current element.
+                    // We use copy for move, and never touch this element again.
+                    unsafe {
+                        let hole_slot = slice_ptr.add(g.processed_len - g.deleted_cnt);
+                        ptr::copy_nonoverlapping(cur, hole_slot, 1);
+                    }
                 }
+                g.processed_len += 1;
             }
-            g.processed_len += 1;
-            true
         }
 
         // Stage 1: Nothing was deleted.
-        while g.processed_len != original_len {
-            if !process_one::<F, T, CAP, false>(&mut f, &mut g) {
-                break;
-            }
-        }
+        process_loop::<F, T, false, CAP>(original_len, &mut f, &mut g);
 
         // Stage 2: Some elements were deleted.
-        while g.processed_len != original_len {
-            process_one::<F, T, CAP, true>(&mut f, &mut g);
-        }
+        process_loop::<F, T, true, CAP>(original_len, &mut f, &mut g);
 
         drop(g);
     }
@@ -634,10 +651,14 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
         let start = match range.start_bound() {
             Bound::Unbounded => 0,
             Bound::Included(&i) => i,
+            // You cannot have an array bigger than `usize::MAX` anyway,
+            // so the saturating add wouldn't break anything here.
             Bound::Excluded(&i) => i.saturating_add(1),
         };
         let end = match range.end_bound() {
             Bound::Excluded(&j) => j,
+            // You cannot have an array bigger than `usize::MAX` anyway,
+            // so the saturating add wouldn't break anything here.
             Bound::Included(&j) => j.saturating_add(1),
             Bound::Unbounded => len,
         };
@@ -645,21 +666,35 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
     }
 
     fn drain_range(&mut self, start: usize, end: usize) -> Drain<T, CAP> {
-        let len = self.len();
-
         // bounds check happens here (before length is changed!)
-        let range_slice: *const _ = &self[start..end];
+        let _ = &self[start..end];
 
+        // Eagerly reduce len to prevent double frees in case of `drop::<T>` panics.
         // Calling `set_len` creates a fresh and thus unique mutable references, making all
         // older aliases we created invalid. So we cannot call that function.
+        let old_len: usize = self.len.try_into().unwrap();
         self.len = start as LenUint;
 
+        let start_ptr: *mut T = self.xs.as_mut_ptr().cast();
+
+        let tail_len = (old_len - end) as LenUint;
+        self.len = start.try_into().unwrap();
         unsafe {
+            // SAFETY: length is valid because we made bounds check earlier.
+            let to_yield = end.saturating_sub(start);
+            let taken_start = start_ptr.add(start);
+            let tail_start = taken_start.add(to_yield);
+
+            let taken_start = NonNull::new(taken_start).unwrap();
+            let tail_start = NonNull::new(tail_start).unwrap();
+
             Drain {
-                tail_start: end,
-                tail_len: len - end,
-                iter: (*range_slice).iter(),
-                vec: self as *mut _,
+                vec_len: &mut self.len,
+                taken_start: taken_start,
+                to_yield: to_yield as LenUint,
+                current_ptr: taken_start,
+                tail_start: tail_start,
+                tail_len,
             }
         }
     }
@@ -941,15 +976,31 @@ where
     }
 }
 
+// Note: We cannot implement this the same way as `std::vec::Drain`,
+// because keeping a pointer to the vec would violate Stacked Borrows:
+// that pointer aliases the pointers into our inner slice,
+// which get invalidated.
+// `std::vec::Vec` doesn't have the same problem because its buffer
+// and the object itself don't overlap.
+
 /// A draining iterator for `ArrayVec`.
 pub struct Drain<'a, T: 'a, const CAP: usize> {
-    /// Index of tail to preserve
-    tail_start: usize,
+    /// Reference to `len` field of vec
+    vec_len: &'a mut LenUint,
+
+    /// Pointer to position which we started to drain.
+    taken_start: ptr::NonNull<T>,
+    /// How many items we still need to yield.
+    /// Use an integer instead of a pointer to track this
+    /// because it simplifies working with ZSTs.
+    to_yield: LenUint,
+    /// Points to next item to yield.
+    current_ptr: ptr::NonNull<T>,
+
+    /// Points to item right after drained range.
+    tail_start: ptr::NonNull<T>,
     /// Length of tail
-    tail_len: usize,
-    /// Current remaining range to remove
-    iter: slice::Iter<'a, T>,
-    vec: *mut ArrayVec<T, CAP>,
+    tail_len: LenUint,
 }
 
 unsafe impl<'a, T: Sync, const CAP: usize> Sync for Drain<'a, T, CAP> {}
@@ -959,21 +1010,57 @@ impl<'a, T: 'a, const CAP: usize> Iterator for Drain<'a, T, CAP> {
     type Item = T;
 
     fn next(&mut self) -> Option<Self::Item> {
-        self.iter
-            .next()
-            .map(|elt| unsafe { ptr::read(elt as *const _) })
+        if self.to_yield == 0 {
+            None
+        } else {
+            let current = self.current_ptr.as_ptr();
+            self.to_yield -= 1;
+            self.current_ptr = NonNull::new(
+                // SAFETY: we just checked that we are in range.
+                // We just shortened range.
+                unsafe { current.add(1) },
+            )
+            // Note: rustc optimizes check here even with `-Copt-level=1`
+            // https://godbolt.org/z/br6eevnbx
+            .unwrap();
+
+            Some(
+                // SAFETY: we just checked that we are in range.
+                // Range must be valid by construction.
+                // We visit every item in drained range exactly once
+                // so they are read exactly once.
+                // Alignment is valid because it points to item in slice.
+                unsafe { current.read() },
+            )
+        }
     }
 
     fn size_hint(&self) -> (usize, Option<usize>) {
-        self.iter.size_hint()
+        let remaining: usize = self.to_yield.try_into().unwrap();
+        (remaining, Some(remaining))
     }
 }
 
 impl<'a, T: 'a, const CAP: usize> DoubleEndedIterator for Drain<'a, T, CAP> {
     fn next_back(&mut self) -> Option<Self::Item> {
-        self.iter
-            .next_back()
-            .map(|elt| unsafe { ptr::read(elt as *const _) })
+        if self.to_yield == 0 {
+            None
+        } else {
+            let to_yield: usize = self.to_yield.try_into().unwrap();
+            self.to_yield -= 1;
+
+            // SAFETY: We just checked our range.
+            // We just shortened range.
+            let src = unsafe { self.current_ptr.as_ptr().add(to_yield - 1) };
+            Some(
+                // SAFETY: we just checked that we are in range.
+                // Range must be valid by construction.
+                // We visit every item in drained range exactly once
+                // so they are read exactly once.
+                // Alignment is valid because it points to item in slice.
+                unsafe { src.read() },
+            )
+        }
     }
 }
 
@@ -983,19 +1070,30 @@ impl<'a, T: 'a, const CAP: usize> Drop for Drain<'a, T, CAP> {
     fn drop(&mut self) {
         // len is currently 0 so panicking while dropping will not cause a double drop.
 
-        // exhaust self first
-        while let Some(_) = self.next() {}
+        // Drop inner values first.
+        // Use slice `drop_in_place` because it is faster than iteration over self.
+        unsafe {
+            let remaining_start = self.current_ptr.as_ptr();
+            let remaining_len: usize = self.to_yield.try_into().unwrap();
+            self.to_yield = 0;
+            // SAFETY: This is safe because we drop only unyielded items,
+            // which must be initialized by `Drain`'s invariant.
+            ptr::drop_in_place(core::slice::from_raw_parts_mut(
+                remaining_start,
+                remaining_len,
+            ));
+        }
 
         if self.tail_len > 0 {
             unsafe {
-                let source_vec = &mut *self.vec;
+                let dst = self.taken_start.as_ptr();
+                let src = self.tail_start.as_ptr() as *const _;
+                let tail_len: usize = self.tail_len.try_into().unwrap();
                 // memmove back untouched tail, update to new length
-                let start = source_vec.len();
-                let tail = self.tail_start;
-                let src = source_vec.as_ptr().add(tail);
-                let dst = source_vec.as_mut_ptr().add(start);
-                ptr::copy(src, dst, self.tail_len);
-                source_vec.set_len(start + self.tail_len);
+                ptr::copy(src, dst, tail_len);
+                // Set len of vec.
+                // Safe because our tail contains exactly `tail_len` living items.
+                *self.vec_len += self.tail_len;
             }
         }
     }
@@ -1050,8 +1148,8 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
     {
         let take = self.capacity() - self.len();
         let len = self.len();
-        let mut ptr = raw_ptr_add(self.as_mut_ptr(), len);
-        let end_ptr = raw_ptr_add(ptr, take);
+        let mut ptr = self.as_mut_ptr().add(len);
+
         // Keep the length in a separate variable, write it back on scope
         // exit. To help the compiler with alias analysis and stuff.
         // We update the length to handle panic in the iteration of the
@@ -1063,20 +1161,22 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
                 **self_len = len as LenUint;
             },
         };
-        let mut iter = iterable.into_iter();
-        loop {
-            if let Some(elt) = iter.next() {
-                if ptr == end_ptr && CHECK {
-                    extend_panic();
-                }
-                debug_assert_ne!(ptr, end_ptr);
-                ptr.write(elt);
-                ptr = raw_ptr_add(ptr, 1);
-                guard.data += 1;
-            } else {
-                return; // success
+        let iter = iterable.into_iter();
+        let mut current_offset = 0;
+        // Use `for_each` because it is more efficient for some iterators
+        // than calling `Iterator::next` in a loop.
+        // For example, `Chain`'s extra branch in `next()` isn't optimized away.
+        iter.for_each(|elt| {
+            if CHECK && current_offset == take {
+                extend_panic();
             }
-        }
+            debug_assert_ne!(current_offset, take);
+            current_offset += 1;
+
+            ptr.write(elt);
+            ptr = ptr.add(1);
+            guard.data += 1;
+        });
     }
 
     /// Extend the ArrayVec with clones of elements from the slice;
@@ -1098,16 +1198,6 @@ impl<T, const CAP: usize> ArrayVec<T, CAP> {
     }
 }
 
-/// Rawptr add but uses arithmetic distance for ZST
-unsafe fn raw_ptr_add<T>(ptr: *mut T, offset: usize) -> *mut T {
-    if mem::size_of::<T>() == 0 {
-        // Special case for ZST
-        (ptr as usize).wrapping_add(offset) as _
-    } else {
-        ptr.add(offset)
-    }
-}
-
 /// Create an `ArrayVec` from an iterator.
 ///
 /// ***Panics*** if the number of elements in the iterator exceeds the arrayvec's capacity.
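The `process_loop` rewrite above keeps std's two-stage backshift structure for `retain`. Stripped of the raw-pointer and panic-safety machinery, the underlying algorithm can be sketched in safe Rust (a simplified illustration on `Vec`, not the crate's actual implementation; unlike the real thing, it drops rejected elements at the end rather than immediately):

```rust
// Simplified backshift retain: elements that pass the predicate are
// shifted left over the holes left by rejected ones, in a single pass.
fn retain_backshift<T>(v: &mut Vec<T>, mut f: impl FnMut(&T) -> bool) {
    let len = v.len();
    let mut deleted = 0;
    for i in 0..len {
        if !f(&v[i]) {
            deleted += 1; // open a hole at index i
        } else if deleted > 0 {
            v.swap(i - deleted, i); // backshift the kept element into the hole
        }
    }
    // The tail now holds the rejected elements; truncate drops them.
    v.truncate(len - deleted);
}

fn main() {
    let mut v = vec![1, 2, 3, 4, 5, 6, 7, 8];
    retain_backshift(&mut v, |x| x % 2 == 0);
    assert_eq!(v, [2, 4, 6, 8]);
    println!("{:?}", v);
}
```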
diff --git a/tests/tests.rs b/tests/tests.rs
index 9b51cd9..2aeba14 100644
--- a/tests/tests.rs
+++ b/tests/tests.rs
@@ -8,6 +8,7 @@ use arrayvec::CapacityError;
 use std::mem;
 
 use std::collections::HashMap;
+use std::sync::atomic::{AtomicUsize, Ordering as AtomicOrdering};
 
 #[test]
 fn test_simple() {
@@ -368,6 +369,31 @@ fn test_drain_range_inclusive_oob() {
     v.drain(0..=0);
 }
 
+#[test]
+fn test_drain_panic_in_the_middle() {
+    static DROP_COUNTER: AtomicUsize = AtomicUsize::new(0);
+    DROP_COUNTER.store(0, AtomicOrdering::Relaxed);
+
+    struct CountOnDrop;
+
+    impl Drop for CountOnDrop {
+        fn drop(&mut self) {
+            DROP_COUNTER.fetch_add(1, AtomicOrdering::Relaxed);
+        }
+    }
+
+    let mut v = ArrayVec::from([(); 6].map(|_| CountOnDrop));
+
+    std::panic::catch_unwind(move || {
+        let mut d = v.drain(1..4);
+        d.next();
+        panic!("We want to!");
+    })
+    .expect_err("We explicitly panic");
+    // No double drops and no leaks.
+    assert_eq!(DROP_COUNTER.load(AtomicOrdering::Relaxed), 6);
+}
+
 #[test]
 fn test_retain() {
     let mut v = ArrayVec::from([0; 8]);
@@ -385,6 +411,78 @@ fn test_retain() {
     assert_eq!(&v[..], &[]);
 }
 
+#[test]
+fn test_retain_on_panics() {
+    static DROP_COUNTER: AtomicUsize = AtomicUsize::new(0);
+    DROP_COUNTER.store(0, AtomicOrdering::Relaxed);
+
+    struct CountOnDrop;
+
+    impl Drop for CountOnDrop {
+        fn drop(&mut self) {
+            DROP_COUNTER.fetch_add(1, AtomicOrdering::Relaxed);
+        }
+    }
+
+    let max_i = AtomicUsize::new(0);
+    let mut v = ArrayVec::from([(); 20].map(|_| CountOnDrop));
+    // Check what happens if predicate panics.
+    std::panic::catch_unwind({
+        let max_i = &max_i;
+        move || {
+            let mut i = 0;
+            v.retain(|_| {
+                i += 1;
+                max_i.store(i, AtomicOrdering::Relaxed);
+                if i == 10 {
+                    panic!("We want to!");
+                }
+                i & 1 == 0
+            });
+        }
+    })
+    .expect_err("We explicitly panic");
+
+    assert_eq!(max_i.load(AtomicOrdering::Relaxed), 10);
+    // No leaks and no double frees.
+    assert_eq!(DROP_COUNTER.load(AtomicOrdering::Relaxed), 20);
+
+    DROP_COUNTER.store(0, AtomicOrdering::Relaxed);
+
+    struct PanicOnDrop(usize);
+
+    impl Drop for PanicOnDrop {
+        fn drop(&mut self) {
+            if self.0 == 10 {
+                panic!("We want to!");
+            }
+        }
+    }
+
+    let max_i = AtomicUsize::new(0);
+    let mut i = 0;
+    let mut v = ArrayVec::from([(); 20].map(|_| {
+        let j = i;
+        i += 1;
+        (CountOnDrop, PanicOnDrop(j))
+    }));
+    // Check what happens if drop panics.
+    std::panic::catch_unwind({
+        let max_i = &max_i;
+        move || {
+            v.retain(|v| {
+                max_i.store(v.1 .0, AtomicOrdering::Relaxed);
+                v.1 .0 & 1 != 0
+            });
+        }
+    })
+    .expect_err("We explicitly panic");
+
+    assert_eq!(max_i.load(AtomicOrdering::Relaxed), 10);
+    // No double frees and no leaks.
+    assert_eq!(DROP_COUNTER.load(AtomicOrdering::Relaxed), 20);
+}
+
 #[test]
 #[should_panic]
 fn test_drain_oob() {
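The new tests above assert a panic-safety invariant: every element is dropped exactly once, even if the caller panics mid-drain. The same property can be demonstrated in isolation with `std::vec::Vec`'s `drain` (a standalone sketch, not arrayvec code; `Vec` upholds the same invariant):

```rust
use std::panic;
use std::sync::atomic::{AtomicUsize, Ordering};

static DROPS: AtomicUsize = AtomicUsize::new(0);

struct CountOnDrop;

impl Drop for CountOnDrop {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::Relaxed);
    }
}

// Panic while a drain is in progress and count every drop.
fn run_demo() -> usize {
    DROPS.store(0, Ordering::Relaxed);
    panic::set_hook(Box::new(|_| {})); // silence the expected panic message
    let v: Vec<CountOnDrop> = (0..6).map(|_| CountOnDrop).collect();
    let result = panic::catch_unwind(move || {
        let mut v = v;
        let mut d = v.drain(1..4);
        let _ = d.next(); // one drained element is yielded and dropped here
        panic!("deliberate panic mid-drain");
    });
    assert!(result.is_err());
    // Unwinding dropped the rest of the drained range, the tail, and the head.
    DROPS.load(Ordering::Relaxed)
}

fn main() {
    assert_eq!(run_demo(), 6); // no leaks, no double drops
    println!("every element dropped exactly once");
}
```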

robjtede avatar Jul 24 '22 00:07 robjtede

@bluss rebased, please consider this PR :)

robjtede avatar Mar 23 '23 14:03 robjtede

Hello, thank you for your work on this crate, bluss; much appreciated.

Some feedback from an external contributor who coded a potential patch: not following rustfmt was quite a barrier and created significant overhead for me. Out of habit I ran rustfmt after making my changes, only then realising the project doesn't follow it, and had to selectively commit my changes. For example, src/arrayvec.rs had 55 hunks, which led to some mistakes on my part when choosing the right ones, and more generally to lost time and energy.

As a maintainer of other libs without much time I want to say I understand your situation well :)

kraktus avatar Sep 17 '23 09:09 kraktus

Closing this since it's pretty out of date. I'll leave the branch in place so anyone can cherry pick commits if they'd like.

robjtede avatar Sep 17 '23 09:09 robjtede