Zero consistency in C3 types

Open kleenleen opened this issue 3 weeks ago • 8 comments

I was interested in the project philosophy, a cleaned-up C with modernized standards, until I saw this:

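| name | bit size | signed |
|---|---|---|
| bool | 1 | no |
| ichar | 8 | yes |
| char | 8 | no |
| short | 16 | yes |
| ushort | 16 | no |
| int | 32 | yes |
| uint | 32 | no |
| long | 64 | yes |
| ulong | 64 | no |
| int128 | 128 | yes |
| uint128 | 128 | no |
| iptr | varies | yes |
| uptr | varies | no |
| isz | varies | yes |
| usz | varies | no |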

Why does char take an i prefix to mark the signed variant, while the integers take a u prefix to mark the unsigned one? Why go with the old, broken short/int/long names when even <stdint.h> uses int16/int32/int64/uint32..., a much better convention (if you drop the _t)? And why does int128, alone, follow that convention?

Here is how it would look with consistency done right: by default, the names state plainly what they are, signed is the default, and unsigned types get a purely consistent u prefix.

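| name | bit size | signed |
|---|---|---|
| bool | 1 | no |
| char | 8 | yes |
| uchar | 8 | no |
| int16 | 16 | yes |
| uint16 | 16 | no |
| int32 | 32 | yes |
| uint32 | 32 | no |
| int64 | 64 | yes |
| uint64 | 64 | no |
| int128 | 128 | yes |
| uint128 | 128 | no |
| ptr | varies | yes |
| uptr | varies | no |
| size | varies | yes |
| usize | varies | no |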

kleenleen, Dec 04 '25 23:12

I'd like to give some context, because there have been a lot of similar discussions.

The situation is similar to C# with byte and sbyte: the type is unsigned by default because that is what people are used to. As for numbers in the basic type names, Christoffer finds them less readable than plain words, and int128 is spelled this way purely because it has no common nickname (though it may yet be renamed).

EroMrinin134, Dec 05 '25 04:12

The only one you can't reassign (alias) is char, which is maybe annoying. The rest you can alias with your preferred naming. So maybe I agree that this specific case could be called "inconsistent".
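For comparison, C handles this kind of renaming with `typedef`. Below is a minimal sketch, assuming a C11 compiler, that spells some of the names from the proposed table on top of `<stdint.h>`. It only illustrates the idea of aliasing and is not C3 syntax; mapping `ptr` to `intptr_t` is an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Illustrative C typedefs for names from the proposed table.
   This is plain C, not C3 alias syntax. */
typedef unsigned char uchar;
typedef int16_t   int16;
typedef uint16_t  uint16;
typedef int32_t   int32;
typedef uint32_t  uint32;
typedef int64_t   int64;
typedef uint64_t  uint64;
typedef intptr_t  ptr;   /* assumption: signed pointer-sized integer */
typedef uintptr_t uptr;

/* Compile-time checks that each alias has the width its name claims. */
static_assert(sizeof(int16) * CHAR_BIT == 16, "int16 is 16 bits");
static_assert(sizeof(uint32) * CHAR_BIT == 32, "uint32 is 32 bits");
static_assert(sizeof(int64) * CHAR_BIT == 64, "int64 is 64 bits");
```

As in the comment above, `char` is the one name this approach cannot touch: in C it is a keyword, so it cannot be redefined with `typedef`.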

ManuLinares, Dec 05 '25 15:12

Thanks for the suggestion, but makeup doesn't give better genes.

I understand it's a bit late for changing fundamentals, but if you ever could, please consider consistency next time.

The foundations cannot be bent if the project is to build a straight tower.

kleenleen, Dec 05 '25 17:12

Arguing for other type names is common, see https://github.com/c3lang/c3c/issues/1566 https://github.com/c3lang/c3c/issues/1178 https://github.com/c3lang/c3c/discussions/1681 https://github.com/c3lang/c3c/discussions/434 https://github.com/c3lang/c3c/pull/2072 https://github.com/c3lang/c3c/issues/660

People from Rust want i32/u32/isize/usize; people from Nim or Swift want int32/uint32, etc.

There are strong emotions involved, and some try to make the point that this will make or break the language. Or that it's just bad because it's not modern, e.g.

> char short int long and oh my god, int128 ?? logically it should be long long where's the logic and consistency?

> Do you plan to switch to normal u8 u16 i32 i128 data types ?

And when the answer wasn't "yes":

> thx. Dead lang on arrival. Switching to Zig.

Usually there are no arguments, but this issue mentions consistency, and that is indeed a valid concern that has merits – unlike the "oh I like these names" (previously mentioned).

The way the names are constructed is:

  1. If the default name is associated with signed, use u for the unsigned name.
  2. If the default name is associated with unsigned, use i for the signed name.
  3. If the name has neither, then use i and u prefixes for signed and unsigned respectively.
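The three rules can be sketched as a small C function. The `Assoc` enum and the `derive_names` helper are hypothetical names invented for this illustration; which association a base name gets is a judgment call made by the language designer, not something the code decides.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* What a bare base name is conventionally associated with.
   Hypothetical type for this sketch only. */
typedef enum { ASSOC_SIGNED, ASSOC_UNSIGNED, ASSOC_NEITHER } Assoc;

/* Write the signed and unsigned spellings derived from a base name. */
void derive_names(const char *base, Assoc assoc,
                  char *signed_out, char *unsigned_out, size_t cap)
{
    switch (assoc) {
    case ASSOC_SIGNED:   /* rule 1: bare name is signed, u marks unsigned */
        snprintf(signed_out, cap, "%s", base);
        snprintf(unsigned_out, cap, "u%s", base);
        break;
    case ASSOC_UNSIGNED: /* rule 2: bare name is unsigned, i marks signed */
        snprintf(signed_out, cap, "i%s", base);
        snprintf(unsigned_out, cap, "%s", base);
        break;
    case ASSOC_NEITHER:  /* rule 3: both variants get a prefix */
        snprintf(signed_out, cap, "i%s", base);
        snprintf(unsigned_out, cap, "u%s", base);
        break;
    }
}

/* Reproduce three rows of the type table from the rules. */
void check_thread_examples(void)
{
    char s[16], u[16];

    derive_names("int", ASSOC_SIGNED, s, u, sizeof s);
    assert(strcmp(s, "int") == 0 && strcmp(u, "uint") == 0);

    derive_names("char", ASSOC_UNSIGNED, s, u, sizeof s);
    assert(strcmp(s, "ichar") == 0 && strcmp(u, "char") == 0);

    derive_names("sz", ASSOC_NEITHER, s, u, sizeof s);
    assert(strcmp(s, "isz") == 0 && strcmp(u, "usz") == 0);
}
```

Seen this way, the scheme is a single rule with one input per name (its conventional association), which is where the disagreement in this thread actually sits.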

char becomes a bit inconsistent because its signedness is not well defined in C. I associate it with unsigned, and thus ichar. For a while I considered char/byte, but I dropped that. As you know, Java uses byte for its signed 8 bit type – but Java only has signed types.
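C leaves the signedness of plain `char` to the implementation, and that can be checked directly. A minimal sketch, assuming a C11 compiler; `plain_char_is_signed` is a name made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* signed char and unsigned char have fixed signedness everywhere. */
static_assert(SCHAR_MIN < 0, "signed char is signed");
static_assert(UCHAR_MAX >= 255, "unsigned char holds at least 0..255");

/* Plain char is a distinct type whose signedness is implementation-defined:
   CHAR_MIN is 0 where it is unsigned and negative where it is signed. */
bool plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}
```

With GCC, plain `char` is typically signed on x86-64 Linux and unsigned on AArch64 Linux, and the `-fsigned-char` and `-funsigned-char` options override the default.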

If you wonder why not ptr/uptr, it's because it's often more useful to have unsigned pointers here; that's how they are represented. So the defaults, if anything, would be ptr with iptr, and sz with isz. But neither is that well established. sz is common in some C codebases, and people would generally assume size is unsigned, and so on.

Ideally int128 would have a proper name. D uses cent here, but it's a weird name. Even something like octa or quad would be a bad name, because it would be used so rarely that people wouldn't have time to build familiarity with it. For that reason it has stayed int128 as a placeholder name. I would say int128 would be assumed to be signed, so following rule (1), we create uint128.
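On the pointer and size names: C's own types show the same split between signed and unsigned variants. A sketch, assuming a C11 compiler; the `IS_UNSIGNED` macro and the function name are illustrative helpers, not part of any library.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* An integer type is unsigned exactly when -1 converted to it is positive. */
#define IS_UNSIGNED(T) ((T)-1 > (T)0)

/* The size type is unsigned; the pointer-difference type is signed. */
static_assert(IS_UNSIGNED(size_t), "size_t is unsigned");
static_assert(!IS_UNSIGNED(ptrdiff_t), "ptrdiff_t is signed");

#ifdef UINTPTR_MAX /* uintptr_t and intptr_t are optional in the standard */
static_assert(IS_UNSIGNED(uintptr_t), "uintptr_t is unsigned");
static_assert(!IS_UNSIGNED(intptr_t), "intptr_t is signed");
#endif

/* Runtime-visible form of the first check. */
bool size_type_is_unsigned(void)
{
    return IS_UNSIGNED(size_t);
}
```

In C terms, usz plays the role of `size_t` and uptr the role of `uintptr_t`, which fits the expectation that a bare size name reads as unsigned.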

It would be possible to change the prefix, of course. s would mostly work as well as i does, as in schar, ssz, sptr, but I haven't made that leap. The intXX form is clear but fairly long to type, and the "fixed bit C names" worked well for both C# and Java, which is why I find arguments like "no one is going to learn that long means 64 bits" rather unconvincing.

To be honest, I'd like a better name for ichar. It doesn't look good and it's kind of the odd man out, but I can't come up with a better name.

lerno, Dec 06 '25 23:12

Also note that for C the only way was to introduce int32_t and so on. It couldn't very well suddenly bit-fix existing names!

Also, people tend to forget that stdint.h has a whole family of names, not just the intXX_t, but also things like int_leastXX_t, int_fastXX_t. Which is a reminder that using exact bits is not the whole story.
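The difference between the families is easy to see in code. A sketch, assuming a C11 compiler; `BITS_OF` and the two functions are helper names made up for this example.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Storage width of a type in bits. */
#define BITS_OF(T) (sizeof(T) * CHAR_BIT)

/* The least and fast families only promise a minimum width. */
static_assert(BITS_OF(int_least16_t) >= 16, "at least 16 bits");
static_assert(BITS_OF(int_fast16_t) >= 16, "at least 16 bits");

#ifdef INT16_MAX /* exact-width types are optional in the standard */
static_assert(BITS_OF(int16_t) == 16, "exactly 16 bits");
#endif

/* Report what this implementation actually picked. */
size_t least16_bits(void) { return BITS_OF(int_least16_t); }
size_t fast16_bits(void)  { return BITS_OF(int_fast16_t); }
```

The results vary by platform: glibc on x86-64, for instance, makes `int_fast16_t` 64 bits wide, so a name containing "16" does not always mean 16 bits.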

lerno, Dec 06 '25 23:12

Many people can't deal with a lack of consistency. I have this issue. For some people, inconsistency is like a barely noticeable bit of gravel on the road. To me, inconsistency feels like chewing sand.

We all have our limitations and strengths.

You can close the ticket if the matter is set in stone :)

kleenleen, Dec 10 '25 20:12

I do take suggestions on how to improve the more ad hoc names: ichar, int128 and uint128. By and large, however, we're sticking with the C#/Java approach of C names with fixed bit sizes.

lerno, Dec 13 '25 22:12

I'm okay with ichar, but int128 would be nice to rename to huge or something.

EroMrinin134, Dec 14 '25 15:12

There is also a possibility for ichar -> tiny.

EroMrinin134, Dec 22 '25 05:12

Making the lack of a u or i prefix depend on the "default" convention of the type is indeed significantly inconsistent and confusing. I don't hate the status quo, but I do very much like the suggested table. DO NOT take ptr and size out of the identifier pool lol. usz and isz and uptr and iptr make each other make more sense in the context of the rest of the names AND add distinct flavor to a core, regularly used part of C3. I have never liked usize and isize anywhere, myself.

p7r0x7, Dec 22 '25 05:12

I totally agree that usz isz iptr uptr are good as is.

EroMrinin134, Dec 22 '25 06:12

@lerno, as I proposed in https://github.com/c3lang/c3c/discussions/434, please just add built-in type aliases. If you don't want these aliases to be used in std, let the compiler prohibit it. But allow users to use these aliases in their projects.

data-man, Dec 22 '25 07:12