NanoRange

Use memmove optimisations when appropriate

Open · tcbrindle opened this issue 4 years ago · 6 comments

When using contiguous iterators with trivially copyable/movable value types, we should be able to optimise copy/move and their backwards versions to use std::memmove. This would have knock-on benefits for several other algorithms which end up copying or moving elements in their implementations.
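As a rough illustration (not NanoRange's actual code; the helper name `copy_impl` and the use of the standard C++20 concepts rather than NanoRange's own are assumptions), a dispatch along these lines could look like:

```cpp
#include <cstring>     // std::memmove
#include <iterator>    // std::contiguous_iterator, std::iter_value_t
#include <memory>      // std::to_address
#include <type_traits> // std::is_trivially_copyable_v

template <typename I, typename O>
O copy_impl(I first, I last, O out)
{
    using T = std::iter_value_t<I>;
    // Fast path: both ranges are contiguous and the value type is trivially
    // copyable, so a bytewise memmove is equivalent to element-wise copy.
    if constexpr (std::contiguous_iterator<I> && std::contiguous_iterator<O> &&
                  std::is_same_v<T, std::iter_value_t<O>> &&
                  std::is_trivially_copyable_v<T>) {
        const auto n = static_cast<std::size_t>(last - first);
        if (n > 0) {
            std::memmove(std::to_address(out), std::to_address(first),
                         n * sizeof(T));
        }
        return out + static_cast<std::iter_difference_t<O>>(n);
    } else {
        // Generic path: plain element-wise copy.
        for (; first != last; ++first, ++out) {
            *out = *first;
        }
        return out;
    }
}
```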

Unfortunately memmove is not constexpr, so we'd need C++20's std::is_constant_evaluated to detect whether we're being called at compile time (and should therefore just use the normal implementation) or at run time. GCC and Clang provide the feature-test macro __cpp_lib_is_constant_evaluated to check whether is_constant_evaluated is available, but I'm not sure about MSVC.
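For example, a minimal sketch of that guard (assuming raw pointers to a trivially copyable T; `copy_trivial` is a hypothetical helper, not part of NanoRange):

```cpp
#include <cstring>     // std::memmove
#include <type_traits> // std::is_constant_evaluated (C++20)

template <typename T>
constexpr T* copy_trivial(const T* first, const T* last, T* out)
{
#ifdef __cpp_lib_is_constant_evaluated
    if (!std::is_constant_evaluated()) {
        // Run time: memmove is usable (and typically much faster).
        const auto n = static_cast<std::size_t>(last - first);
        if (n > 0) {
            std::memmove(out, first, n * sizeof(T));
        }
        return out + n;
    }
#endif
    // Compile time (or no is_constant_evaluated support): fall back to the
    // ordinary element-wise loop, which is valid in constant expressions.
    for (; first != last; ++first, ++out) {
        *out = *first;
    }
    return out;
}
```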

tcbrindle commented Aug 25 '20 14:08

Hi @pi-tau, in general the models presented here follow the common 'modern' implementations and are not necessarily 1-to-1 reproductions of the original papers. Regarding your specific questions:

  • Activation before first ResNet block: that is true. If we applied an activation there, the first block would essentially apply BN+ReLU to the output of just another ReLU. This is not recommended since it zeroes out many more neurons without any apparent benefit (at least to me). Most current 'modern' Pre-ResNet implementations don't have this layer either.
  • Activation after last ResNet block: you are right that sometimes an additional activation is used before the pooling layer. However, when pooling is applied, an activation function is often not needed. I actually trained multiple versions of the ResNet with the extra final activation and commonly got worse results, so I decided to stick with the version without it here.
  • Downsample layers: in general, you only apply the self.downsample part when you need to change the feature dimension. For normal ResBlocks with the same input and output feature dimensions, no 1x1 conv is needed.

phlippe commented Jun 27 '23 16:06

Hi, it is quite interesting that the extra final activation worsens results. Thanks for sharing.

pi-tau commented Jun 28 '23 09:06