Implement Copy for Vec when T implements Copy
Since user should be aware Vec are based on array, implement Copy would be convienent
Emm seems not that easy than I expected
I would very much appreciate this feature. This would enable heapless::String to be Copy, which would be super convenient.
This would be super convenient in many cases indeed.
If only we could impl<T: !Copy, const N: usize> Drop for Vec<T, N> or impl<T: Drop, const N: usize> Drop for Vec<T, N>.
I guess it would be maintenance burden to have a CopyVec that bounds on T: Copy?
Perhaps you guys could make some try to implement Copy for Vec?
Can't implement Copy if Vec should allow drop. https://stackoverflow.com/questions/51704063/why-does-rust-not-allow-the-copy-and-drop-traits-on-one-type You can't conditionally implement Drop either so the choice is to implement Copy conditionally on copy types or respect Drop. Respecting Drop is a reasonable choice, as behind the scenes the item is wrapped in a ManuallyDrop, less risk of unexpected stuff happening to an unsuspecting developer.
That leaves either Generics or macros, Generics are breaking while macros aren't. I threw together a split implementation and implemented Copy on String here https://github.com/MarcusGrass/heapless/tree/copy_vec
Imo both generics and macros are pretty horrible in general but it'd be a really nice feature to have.
Copy-ing a Vec<T> is more expensive than Clone-ing a Vec<T> except in the case where the Vec> is full.
The reason is that a Copy will copy all the uninitialized data (O(capacity)) in the stack buffer from the parent to the child. OTOH, a Clone will only copy the initialized data (O(len)) from the parent to the child. This can be seen in the following machine code.
#[no_mangle]
fn _start() {
unsafe {
(&(clone as usize) as *const usize).read_volatile();
(&(copy as usize) as *const usize).read_volatile();
}
}
#[no_mangle]
fn clone(vec: &Vec<u8, 1024>) {
let mut vec = vec.clone();
black_box(&mut vec);
}
#[no_mangle]
fn copy(vec: &Vec<u8, 1024>) {
let mut vec2 = *vec;
black_box(&mut vec2);
}
fn black_box<T>(val: &mut T) {
unsafe { asm!("// {0}", in(reg) val) }
}
00020100 <clone>:
20100: b5d0 push {r4, r6, r7, lr}
20102: af02 add r7, sp, #8
20104: f5ad 6d81 sub.w sp, sp, #1032 ; reserve 1028 bytes of stack space
20108: 2300 movs r3, #0
2010a: c802 ldmia r0!, {r1}
2010c: 9301 str r3, [sp, #4]
2010e: aa01 add r2, sp, #4
20110: /--/-X b141 cbz r1, 20124 <clone+0x24> ; for loop to copy r1 (self.len) bytes of data
20112: | | 4413 add r3, r2
20114: | | f810 4b01 ldrb.w r4, [r0], #1
20118: | | 3901 subs r1, #1
2011a: | | 711c strb r4, [r3, #4]
2011c: | | 9b01 ldr r3, [sp, #4]
2011e: | | 3301 adds r3, #1
20120: | | 9301 str r3, [sp, #4]
20122: | \-- e7f5 b.n 20110 <clone+0x10>
20124: \----> a801 add r0, sp, #4
20126: f50d 6d81 add.w sp, sp, #1032
2012a: bdd0 pop {r4, r6, r7, pc}
0002012c <copy>:
2012c: b5d0 push {r4, r6, r7, lr}
2012e: af02 add r7, sp, #8
20130: f5ad 6d81 sub.w sp, sp, #1032
20134: ac01 add r4, sp, #4
20136: 4601 mov r1, r0
20138: f240 4204 movw r2, #1028
2013c: 4620 mov r0, r4
2013e: f000 f803 bl 20148 <__aeabi_memcpy4> ; will copy r2 (1028) bytes of data
20142: f50d 6d81 add.w sp, sp, #1032
20146: bdd0 pop {r4, r6, r7, pc}
Aside from the implementation complication, the fact that Copy is less performant than Clone makes me inclined towards not adding such trait implementation.
semi on-topic: given that we cannot use 'trait specialization' on stable there are some reasons for adding a CopyVec<T: Copy> (even if it doesn't implement Copy): as destructors don't matter we can change the source code implementation to get better optimizations. for instance, the clone avoid would be optimized to a memcpy if one used a CopyVec<u8, N>. having both CopyVec and Vec is not a great user experience and would be more maintenance work (cannot use macros / generics as some functions need different source code implement to be properly optimized) so I would prefer to just wait for 'trait specialization' to be stabilized.
Copy-ing aVec<T>is more expensive thanClone-ing aVec<T>except in the case where theVec>is full.The reason is that a
Copywill copy all the uninitialized data (O(capacity)) in the stack buffer from the parent to the child. OTOH, aClonewill only copy the initialized data (O(len)) from the parent to the child. This can be seen in the following machine code.#[no_mangle] fn _start() { unsafe { (&(clone as usize) as *const usize).read_volatile(); (&(copy as usize) as *const usize).read_volatile(); } } #[no_mangle] fn clone(vec: &Vec<u8, 1024>) { let mut vec = vec.clone(); black_box(&mut vec); } #[no_mangle] fn copy(vec: &Vec<u8, 1024>) { let mut vec2 = *vec; black_box(&mut vec2); } fn black_box<T>(val: &mut T) { unsafe { asm!("// {0}", in(reg) val) } }00020100 <clone>: 20100: b5d0 push {r4, r6, r7, lr} 20102: af02 add r7, sp, #8 20104: f5ad 6d81 sub.w sp, sp, #1032 ; reserve 1028 bytes of stack space 20108: 2300 movs r3, #0 2010a: c802 ldmia r0!, {r1} 2010c: 9301 str r3, [sp, #4] 2010e: aa01 add r2, sp, #4 20110: /--/-X b141 cbz r1, 20124 <clone+0x24> ; for loop to copy r1 (self.len) bytes of data 20112: | | 4413 add r3, r2 20114: | | f810 4b01 ldrb.w r4, [r0], #1 20118: | | 3901 subs r1, #1 2011a: | | 711c strb r4, [r3, #4] 2011c: | | 9b01 ldr r3, [sp, #4] 2011e: | | 3301 adds r3, #1 20120: | | 9301 str r3, [sp, #4] 20122: | \-- e7f5 b.n 20110 <clone+0x10> 20124: \----> a801 add r0, sp, #4 20126: f50d 6d81 add.w sp, sp, #1032 2012a: bdd0 pop {r4, r6, r7, pc} 0002012c <copy>: 2012c: b5d0 push {r4, r6, r7, lr} 2012e: af02 add r7, sp, #8 20130: f5ad 6d81 sub.w sp, sp, #1032 20134: ac01 add r4, sp, #4 20136: 4601 mov r1, r0 20138: f240 4204 movw r2, #1028 2013c: 4620 mov r0, r4 2013e: f000 f803 bl 20148 <__aeabi_memcpy4> ; will copy r2 (1028) bytes of data 20142: f50d 6d81 add.w sp, sp, #1032 20146: bdd0 pop {r4, r6, r7, pc}Aside from the implementation complication, the fact that
Copyis less performant thanClonemakes me inclined towards not adding such trait implementation.semi on-topic: given that we cannot use 'trait specialization' on stable there are some reasons for adding a
CopyVec<T: Copy>(even if it doesn't implementCopy): as destructors don't matter we can change the source code implementation to get better optimizations. for instance, thecloneavoid would be optimized to amemcpyif one used aCopyVec<u8, N>. having bothCopyVecandVecis not a great user experience and would be more maintenance work (cannot use macros / generics as some functions need different source code implement to be properly optimized) so I would prefer to just wait for 'trait specialization' to be stabilized.
I understand and agree with you, but in my situation which Vec are defined with small capacity in a relatively large struct, copy cost doesn't matter, instead target binary size matters.
And thanks for your reply
I think we can wait till an optimal solution exists
thanks for the input I can see the perf/size trade-off and see how the perf hit may not matter much for small capacities and almost full Vecs.
I'd now be in favor of a Copy implementation provided that the trade-off and difference in perf vs Clone is documented. I would still like to wait for a way to implement this that does not require duplicating code or macros
Copy for String would be very convenient.