runtime icon indicating copy to clipboard operation
runtime copied to clipboard

[API Proposal]: Interlocked.CompareExchange for byte/short/sbyte/ushort

Open 333fred opened this issue 3 years ago • 23 comments

Background and motivation

We have a few flags fields in the compiler that can be set by multiple threads, used as part of our symbol models. As we avoid locks whenever possible, we generally use a loop on Interlocked.CompareExchange, under the assumption it will succeed eventually or the value will have been computed and set by some other thread. However, because CompareExchange only supports int, long, uint, and ulong integer values, these fields must be at least 1 int wide. This doesn't sound like much, but as the compiler needs to load many, many symbols, those few extra bits can add up. I'd like to be able to use bytes where possible for these values. I only really want byte/ushort overloads, but added the signed variants for completeness.

API Proposal

namespace System.Threading
{
    public static class Interlocked
    {
        public static byte CompareExchange(ref byte location1, byte value, byte comparand);
        public static sbyte CompareExchange(ref sbyte location1, sbyte value, sbyte comparand);
        public static short CompareExchange(ref short location1, short value, short comparand);
        public static ushort CompareExchange(ref ushort location1, ushort value, ushort comparand);
    }
}

API Usage

        public static bool Set(ref byte flags, byte toSet)
        {
            byte oldState, newState;
            do
            {
                oldState = flags;
                newState = oldState | toSet;
                if (newState == oldState)
                {
                    return false;
                }
            }
            while (Interlocked.CompareExchange(byte flags, newState, oldState) != oldState);
            return true;
        }

Alternative Designs

No response

Risks

No response

333fred avatar Feb 02 '22 02:02 333fred

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

Tagging subscribers to this area: @mangod9 See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

We have a few flags fields in the compiler that can be set by multiple threads, used as part of our symbol models. As we avoid locks whenever possible, we generally use a loop on Interlocked.CompareExchange, under the assumption it will succeed eventually or the value will have been computed and set by some other thread. However, because CompareExchange only supports int, long, uint, and ulong integer values, these fields must be at least 1 int wide. This doesn't sound like much, but as the compiler needs to load many, many symbols, those few extra bits can add up. I'd like to be able to use bytes where possible for these values. I only really want byte/ushort overloads, but added the signed variants for completeness.

API Proposal

namespace System.Threading
{
    public class Interlocked
    {
        public byte CompareExchange(ref byte location, byte value, byte comparand);
        public sbyte CompareExchange(ref sbyte location, sbyte value, sbyte comparand);
        public short CompareExchange(ref short location, short value, short comparand);
        public ushort CompareExchange(ref ushort location, ushort value, ushort comparand);
    }
}

API Usage

        public static bool Set(ref byte flags, byte toSet)
        {
            byte oldState, newState;
            do
            {
                oldState = flags;
                newState = oldState | toSet;
                if (newState == oldState)
                {
                    return false;
                }
            }
            while (Interlocked.CompareExchange(byte flags, newState, oldState) != oldState);
            return true;
        }

Alternative Designs

No response

Risks

No response

Author: 333fred
Assignees: -
Labels:

api-suggestion, area-System.Threading, untriaged

Milestone: -

msftbot[bot] avatar Feb 02 '22 02:02 msftbot[bot]

Also CC. @stephentoub

This is something that both x86/x64 (cmpxchg) and Arm64 (cas, casb, and cash) support so I don't see anything that would block this.

  • I'm still of the general opinion that we should go ahead and just expose the full set of missing Interlocked functionality here to help ensure we have a good and usable baseline for users to write their multi-threaded algorithms. x86/x64 and Arm64 support these for Add, And, BitTestAndComplement, BitTestAndReset, BitTestAndSet, CompareExchange, Decrement, Increment, Negate, Not, Or, Subtract, and XOr. There are also some architecture specific cases that are possible good for Hardware Intrinsic options. Other major languages/runtimes (including Java, not just C/C++, Rust, etc) have a much more complete set of baseline APIs for atomic and interlocked operations than we do and I believe we are "behind" comparatively speaking.

tannergooding avatar Feb 02 '22 02:02 tannergooding

I have no problem with us adding CompareExchange overloads for byte/short/etc. if every architecture we target and envision targeting supports them.

Same for overloads of other existing APIs we have, e.g. Increment, Decrement, etc.

I'm still of the general opinion that we should go ahead and just expose the full set of missing Interlocked functionality here to help ensure we have a good and usable baseline for users to write their multi-threaded algorithms. x86/x64 and Arm64 support these for Add, And, BitTestAndComplement, BitTestAndReset, BitTestAndSet, CompareExchange, Decrement, Increment, Negate, Not, Or, Subtract, and XOr. There are also some architecture specific cases that are possible good for Hardware Intrinsic options. Other major languages/runtimes (including Java, not just C/C++, Rust, etc) have a much more complete set of baseline APIs for atomic and interlocked operations than we do and I believe we are "behind" comparatively speaking.

We discussed this in: https://github.com/dotnet/runtime/issues/24694#issuecomment-586389981 I've not seen or heard any further discussion that would change my opinion (namely that we shouldn't be adding APIs for which there isn't a demonstrated need). If you'd like to make a case for it, please open a separate issue.

stephentoub avatar Feb 02 '22 02:02 stephentoub

A slight digression from the topic of the issue, but related to it. I have not found related issues about the below problem. Apart from the overloads described in this issue, it would be a good idea to add overloads for the Enum type. As far as I know, it is based on integers and it's possible.

Some of the current workarounds are described here or here

aleksandr-aleksashin avatar Feb 05 '22 08:02 aleksandr-aleksashin

@333fred Can you add a nuint overload to your proposal?

A nint overload is already covered by the existing IntPtr overload). But it might be useful to add CompareExchangeNInt (@tannergooding what do you think?)

deeprobin avatar Mar 15 '22 11:03 deeprobin

Can you add a nuint overload to your proposal?

That's https://github.com/dotnet/runtime/issues/45103.

But it might be useful to add CompareExchangeNInt

I don't think we should do that.

stephentoub avatar Mar 15 '22 11:03 stephentoub

@stephentoub is there anything blocking this suggestion? Can we tag it for review?

333fred avatar Mar 15 '22 15:03 333fred

That's up to @kouvel.

stephentoub avatar Mar 15 '22 15:03 stephentoub

I don't see any issues with this either, marked as ready-for-review

kouvel avatar Mar 16 '22 03:03 kouvel

  • Looks good, but we should probably also do Exchange
namespace System.Threading
{
    public static class Interlocked
    {
        public static byte CompareExchange(ref byte location1, byte value, byte comparand);
        public static sbyte CompareExchange(ref sbyte location1, sbyte value, sbyte comparand);
        public static short CompareExchange(ref short location1, short value, short comparand);
        public static ushort CompareExchange(ref ushort location1, ushort value, ushort comparand);

        public static byte Exchange(ref byte location1, byte value);
        public static sbyte Exchange(ref sbyte location1, sbyte value);
        public static short Exchange(ref short location1, short value);
        public static ushort Exchange(ref ushort location1, ushort value);
    }
}

terrajobst avatar Apr 14 '22 17:04 terrajobst

Does this API require JIT-changes?

deeprobin avatar Apr 14 '22 17:04 deeprobin

There are multiple possible implementation strategies. It can either be implemented in the runtimes in C/C++ and/or it can be must-expand intrinsic in the JIT.

jkotas avatar Apr 14 '22 18:04 jkotas

Even if an architecture doesn't support these operations, couldn't they be implemented as CompareExchange loops?

public static short CompareExchange(ref short location, short value, short comparand)
{
    ref int locationAsInt = ref Unsafe.As<short, int>(ref location);
    int newValueAsInt = 0;
    ref short newValueLocation = ref Unsafe.As<int, short>(ref newValueAsInt);
    int initialValueAsInt;
    short initialValue;
    do
    {
        initialValueAsInt = Volatile.Read(ref locationAsInt);
        initialValue = Unsafe.As<int, short>(ref initialValueAsInt);
        if (initialValue != comparand)
        {
            return initialValue;
        }
        newValueAsInt = initialValueAsInt;
        Volatile.Write(ref newValueLocation, value);
    } while (Interlocked.CompareExchange(ref locationAsInt, newValueAsInt, initialValueAsInt) != initialValueAsInt);
    return initialValue;
}

Of course I realize this could be quite slow if the short's location isn't on a 4-byte address. The int location could be shifted to the floor of the nearby 4-byte address (using pointers instead of Unsafe.As), unless the short itself crosses a 4-byte boundary.

[Edit] Although, I would find it strange for a new architecture to support working with small values and not also support working with them in an Interlocked fashion.

timcassell avatar May 01 '22 01:05 timcassell

@timcassell newValueAsInt = initialValue; should be perhaps newValueAsInt = initialValueAsInt;? And maybe if (initialValue == comparand) instead of if (initialValue != comparand).

vladd avatar May 01 '22 14:05 vladd

@timcassell newValueAsInt = initialValue; should be perhaps newValueAsInt = initialValueAsInt;?

Ah yes, good catch.

And maybe if (initialValue == comparand) instead of if (initialValue != comparand).

No, that would change the behavior to compare exchange if not equal instead of compare exchange if equal. Also, that check would be removed to implement Exchange.

timcassell avatar May 01 '22 19:05 timcassell

@timcassell

No, that would change the behavior to compare exchange if not equal instead of compare exchange if equal.

Indeed, you are right. Sorry for the false alarm!

vladd avatar May 02 '22 07:05 vladd

In your code, if the byte/short/etc. is just before memory that's not accessible, such that extending by a few more bytes steps over that boundary, this could AV / seg fault.

I realize this could be quite slow

Which is why we don't want to implement these without the right hardware support. Otherwise someone "optimizing" their code by updating it from, say, using an int field to instead using a byte field is actually deoptimizing it.

stephentoub avatar May 02 '22 11:05 stephentoub

In your code, if the byte/short/etc. is just before memory that's not accessible, such that extending by a few more bytes steps over that boundary, this could AV / seg fault.

Yes, I realized that, and shifting the pointer like I mentioned would solve that as well. But that's not the point. That was just a proof-of-concept to show that it could be implemented without hardware support, if necessary.

Which is why we don't want to implement these without the right hardware support. Otherwise someone "optimizing" their code by updating it from, say, using an int field to instead using a byte field is actually deoptimizing it.

It seems that all architectures that .Net currently supports have hardware support, so this is a non-issue. And I doubt future architecture wouldn't support it. But, hypothetically, if .Net does support a future architecture that doesn't have hardware support, would you prefer to throw a NotSupportedException, or have a slow implementation?

timcassell avatar May 02 '22 12:05 timcassell

It seems that all architectures that .Net currently supports have hardware support, so this is a non-issue.

My point was we're not going to add a temporary implementation based on such a loop.

stephentoub avatar May 02 '22 12:05 stephentoub

Could bool and char overloads be also considered? They'd be trivial to implement with this.

MichalPetryka avatar Aug 07 '22 16:08 MichalPetryka

Could byte and char overloads be also considered? They'd be trivial to implement with this.

Perhaps you meant bool? byte is already part of the proposal.

timcassell avatar Aug 07 '22 16:08 timcassell

Could byte and char overloads be also considered? They'd be trivial to implement with this.

Perhaps you meant bool? byte is already part of the proposal.

Right, my bad.

MichalPetryka avatar Aug 07 '22 16:08 MichalPetryka

Could bool and char overloads be also considered? They'd be trivial to implement with this. In my opinion, a good idea.

What do you think? @terrajobst @stephentoub

deeprobin avatar Sep 03 '22 19:09 deeprobin

What do you think?

It's trivial for someone to implement those themselves in terms of the proposed ones. That said, bool could be a reasonable addition, just because it's so common. The problem is it's a slippery slope with all the different methods and all the different possible types; line needs to be drawn somewhere.

Why are you asking @deeprobin?

stephentoub avatar Sep 03 '22 20:09 stephentoub

@stephentoub Oh I thought something like that always has to be strictly re-reviewed. I ask because I plan to implement this API if no one objects.

deeprobin avatar Sep 03 '22 20:09 deeprobin

Oh I thought something like that always has to be strictly re-reviewed.

It does if it were to be included.

I ask because I plan to implement this API if no one objects.

I'm concerned this may be too complex. What is your plan?

stephentoub avatar Sep 03 '22 20:09 stephentoub

I'm concerned this may be too complex. What is your plan?

You are right. I took a look at the whole thing. I would probably have to include instructions like CMPXCHG8B & CMPXCHG16B, which I would still consider feasible, but the ARM implementation would worry me because I haven't dealt with it yet.

Then the issue is better for someone else :D

deeprobin avatar Sep 03 '22 20:09 deeprobin

I could really use an Interlocked.CompareExchange bool variant. Can that get added too?

TonyValenti avatar Jan 05 '23 14:01 TonyValenti

I could really use an Interlocked.CompareExchange bool variant. Can that get added too?

The only issue with that is that a bool with internal representation of 2 would be considered different from one represented as 1.

MichalPetryka avatar Jan 05 '23 21:01 MichalPetryka