cppfront icon indicating copy to clipboard operation
cppfront copied to clipboard

[BUG] int variable casted with as to size_t does not fail on compile time

Open filipsajdak opened this issue 1 year ago • 12 comments
trafficstars

In the current implementation of cppfront (065a993), the following code compiles and runs:

i : int = 123;
std::cout << (i as size_t) << std::endl;

produces

123

and should fail to compile with the following error:

cppfront/include/cpp2util.h:3382:9: error: static assertion failed due to requirement 'program_violates_type_safety_guarantee<unsigned long, int>': 'as' does not allow unsafe possibly-lossy narrowing conversions - if you're sure you want this, use 'unsafe_narrow<T>' to explicitly force the conversion and possibly lose information
        static_assert(
        ^
narrowing.cpp2:10:31: note: in instantiation of function template specialization 'cpp2::impl::as_<unsigned long, int>' requested here
    std::cout << (cpp2::impl::as_<size_t>(cpp2::move(i))) << std::endl;
                              ^
1 error generated.

A potential error will be caught in runtime, e.g., when the i is negative, the following error will be produced:

cppfront/include/cpp2util.h(2486) decltype(auto) cpp2::impl::as(auto &&, std::source_location) [C = unsigned long, x:auto = int]: Type safety violation: dynamic lossy narrowing conversion attempt detected
libc++abi: terminating

It will save us from the trouble... but a user should be informed to use cpp2::unsafe_narrow<size_t>(i) instead.

The bug is related to the badly defined is_narrowing_v - I have the patch already and will deliver it in next PR.

filipsajdak avatar Aug 23 '24 20:08 filipsajdak

The current behavior was added intentionally: https://github.com/hsutter/cppfront/commit/ede2df2d25c3d21de851d5792d9e4ff37bc50b9f

Signed/unsigned conversions to a not-smaller type are handled as a precondition, and trying to cast from a source value that is in the half of the value space that isn't representable in the target type is flagged as a Type safety contract violation

gregmarr avatar Aug 27 '24 15:08 gregmarr

It's different from how we treat other integral types. We take a completely different approach for all unsigned values. I am OK with one or the other approach, but it should be consistent.

Having a static assertion at compile time makes things obvious immediately—this is compiler support that I like to have. Moving it to the runtime is not ideal, but if it is how it should be, we should do the same for other integral types to keep consistency.

filipsajdak avatar Aug 27 '24 16:08 filipsajdak

This is the relevant discussion, and it lays out Herb's thoughts on why it is the way it is now. https://github.com/hsutter/cppfront/pull/106#issuecomment-1372387031 If he wants to change things now, that's certainly possible, but this was definitely a well-considered choice to make them different.

gregmarr avatar Aug 27 '24 20:08 gregmarr

OK, Thanks for recalling that discussion from the past.

Since then, I have run many test cases for is() and as(). What worries me is that we can introduce subtle errors that will appear in the runtime for integral types.

The issue is easy when handling obvious cases like v.size() as int. Things become harder when dealing with generic functions.

Imagine the function (I will leave only signature):

fun: (i);

This function triggers static asserts when you pass std::uint32{123} but will accept std::int32{123} (happily, for the second case, it might as well work).

The function will also accept std::int16{-123} (it will compile fine)... unfortunately it will terminate at runtime with an error that the contract was broken.

The function might look the following:

fun: (i) = {
  std::cout << (i as std::uint16) << std::endl;
}

I am not expecting to choose one or another option. I am expecting consistency. @hsutter proposal looked fine at first (as() function might fail on runtime also when dealing with polymorphic types, std::any, std::optional, std::variant, etc.). But when I prepared a massive test case for testing out is()/as() with all combinations of primitive types (yes, it is a lot of combinations, about ~16k tests), I realized that it is hard to test as() function for some combination as it triggers std::terminate().

This made it different from casting polymorphic types, std::any, std::optional, std::variant, etc. When it fails, you can recover from that as it will throw (assuming you use exceptions). Casting int{32} to std::uint16_t will terminate - you cannot recover from that (casting empty std::optional<int> to int will throw, and you can catch that).

Today, I have thought that maybe we need an additional way of casting:

  • as - statically checked or failed in a way that you can recover from (I am safe here; no terminate will be called),
  • checked_narrow or cast_or_terminate - dynamically checked (terminate when the contract is not met);
  • unsafe_narrow - no checks at all, you know what you are doing (I am not safe, but I believe it will be OK),

filipsajdak avatar Aug 27 '24 21:08 filipsajdak

The function will also accept std::int16{-123} (it will compile fine)... unfortunately it will terminate at runtime with an error that the contract was broken.

I'm not sure that having the user add unsafe_narrow here is better, because then you lose the safety net of the runtime check.

I realized that it is hard to test as() function for some combination as it triggers std::terminate().

Unless you install a throwing violation handler.

Would you find this more acceptable if it threw an exception instead of calling the contract violation handler?

Casting int{32} to std::uint16_t will terminate

No, you are going from a larger size to a smaller size, this requires unsafe_narrow<>.

gregmarr avatar Aug 27 '24 21:08 gregmarr

I'm not sure that having the user add unsafe_narrow here is better, because then you lose the safety net of the runtime check.

No, it is not about using unsafe_narrow, but if you have static_assert for that case, you will know at compile time. In cases where you accept termination for breaking the contract, you should mark it with, e.g., the cast_or_terminate function; if you don't care, you should use unsafe_narrow. In that way, we made everything explicit.

No, you are going from a larger size to a smaller size, this requires unsafe_narrow<>.

Yes, you are right. I was thinking about a case when the type is accepted on compile-time, but the value cause terminates at runtime.

filipsajdak avatar Aug 27 '24 22:08 filipsajdak

No, it is not about using unsafe_narrow, but if you have static_assert for that case, you will know at compile time.

Is this just referring to x_that_is_ptrdiff_t as size_t failing at compile time?

In cases where you accept termination for breaking the contract, you should mark it with, e.g., the cast_or_terminate function;

Violating a contract doesn't always result in termination. It depends on how your contracts are configured. They might even be just "observe" which logs the error somewhere, or "ignore" which doesn't even evaluate them. Putting _or_terminate in there limits the flexibility of the language feature contracts.

gregmarr avatar Aug 28 '24 18:08 gregmarr

Violating a contract doesn't always result in termination. It depends on how your contracts are configured. They might even be just "observe" which logs the error somewhere, or "ignore" which doesn't even evaluate them. Putting _or_terminate in there limits the flexibility of the language feature contracts.

I am not limiting anything; I am just describing how it works now.

This is the check: https://github.com/hsutter/cppfront/blob/78867f807a2c5961f7587605be8402e9d95dfafd/include/cpp2util.h#L1954-L1967

Where type_safety is defined as: https://github.com/hsutter/cppfront/blob/78867f807a2c5961f7587605be8402e9d95dfafd/include/cpp2util.h#L951-L955

Which, as you can see, calls report_and_terminate, which (as the name suggests) calls terminate: https://github.com/hsutter/cppfront/blob/78867f807a2c5961f7587605be8402e9d95dfafd/include/cpp2util.h#L921-L934

filipsajdak avatar Aug 28 '24 19:08 filipsajdak

https://hsutter.github.io/cppfront/cpp2/contracts/

cpp2::contract_group, and customizable violation handling

The contract group object could also provide additional functionality. For example, Cpp2 comes with the cpp2::contract_group type which allows installing a customizable handler for each object. Each object can only have one handler at a time, but the handler can change during the course of the program. contract_group supports:

.set_handler(pfunc) accepts a pointer to a handler function with signature * (* const char).

.get_handler() returns the current handler function pointer, or null if none is installed.

.is_active() returns whether there is a current handler installed.

.enforce(condition, message) evaluates condition, and if it is false then calls .report_violation(message).

gregmarr avatar Aug 28 '24 19:08 gregmarr

OK, thanks for that. I missed that... I will take another look at that issue.

Still, that is not consistent... this is the only runtime checked as() that uses type_safety. The rest uses Throw: https://github.com/hsutter/cppfront/blob/78867f807a2c5961f7587605be8402e9d95dfafd/include/cpp2util.h#L2076 (currently only as for variant uses it explicitly, as for any or optional throws indirectly (soon it will be explicitly: https://github.com/hsutter/cppfront/blob/4960b7795324dd9e7d00874756974f8d440b0b5c/include/cpp2util.h#L2115-L2120 and https://github.com/hsutter/cppfront/blob/4960b7795324dd9e7d00874756974f8d440b0b5c/include/cpp2util.h#L2160-L2168 )

So maybe we should move all Throws to type_safety contracts and set it one way (exceptions or terminate).

filipsajdak avatar Aug 28 '24 20:08 filipsajdak

These throws are using existing exceptions that already exist for this same type of thing in current C++. It makes sense to me to keep them that way. Maybe this should throw something like std::out_of_range, std::domain_error, std::range_error?

gregmarr avatar Aug 30 '24 15:08 gregmarr

@gregmarr These exceptions are not the issue, but how they are thrown in comparison to other places.

I experimented and have prepared an implementation that uses type_safety in as() functions that currently uses Throw: https://github.com/hsutter/cppfront/pull/1267

Take a look. If we want to keep these runtime checks, they should follow same approch, something like that, to allow customization of how they fail.

filipsajdak avatar Aug 30 '24 23:08 filipsajdak

Thanks for the discussion! I'm going to stick with status quo on this for now, but please feel free to reopen if over time you find this keeps causing problems or user surprise.

hsutter avatar Mar 01 '25 18:03 hsutter