Add octal literals
Related to https://github.com/dart-lang/language/issues/581, I'd also like to see octal literals using the common 0o123 syntax. The lack of these actually caused a security issue for Dart Sass recently (see https://github.com/google/dart_cli_pkg/pull/109) because we incorrectly converted Unix permissions bits from octal to decimal and ended up creating world-writable executables under certain circumstances.
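To make the failure mode concrete, here is a minimal sketch of what happens when the digits of a familiar mode such as 755 end up as a decimal value. This is not the code from the linked pull request; the helper name `othersCanWrite` is made up for illustration, and only `int.parse` and `toRadixString` from `dart:core` are used.

```dart
/// Hypothetical helper: true if the "others may write" bit (value 2) is set.
bool othersCanWrite(int mode) => (mode & 2) != 0;

void main() {
  // The mode people mean by "755" (rwxr-xr-x), parsed as base 8.
  final intended = int.parse('755', radix: 8);
  print(intended); // 493
  print(othersCanWrite(intended)); // false

  // The same digits taken as a decimal number set different bits.
  const mistaken = 755;
  print(mistaken.toRadixString(8)); // 1363
  print(othersCanWrite(mistaken)); // true
}
```

Read as permission bits, decimal 755 is octal 1363, which grants write access to "others" even though the author meant rwxr-xr-x.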
Unix permissions is the only use for octal numerals that I'm aware of. It seems like something that could be fixed by using an abstraction, rather than adding a language feature with no other practical uses. Will probably be more readable than 0o754 to readers too.
Take:
const x = 1;
const w = 2;
const r = 4;
const rw = r | w;
const rx = r | x;
const wx = w | x;
const rwx = r | w | x;
int perm({int u = 0, int g = 0, int o = 0, int user = 0, int group = 0, int others = 0}) =>
((u | user) << 6) | ((g | group) << 3) | (o | others);
so you can just write perm(u: rw, g: r, o: w), and not use (low-readability) number literals at all.
Or use perm(u: 7, g: 5, o: 5) if you prefer the numbers to the names.
Use the longer names if you like that better. Add extra parameters for special bits. Or parse a string.
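The "parse a string" option could look roughly like the sketch below. The function name `parseMode` is hypothetical, it only understands the nine-character rwxrwxrwx form, and special bits (setuid, setgid, sticky) are deliberately left out.

```dart
/// Hypothetical helper: parses a symbolic mode such as 'rwxr-xr-x'
/// into its nine permission bits.
int parseMode(String symbolic) {
  const flags = 'rwxrwxrwx';
  if (symbolic.length != flags.length) {
    throw FormatException('expected 9 characters', symbolic);
  }
  var mode = 0;
  for (var i = 0; i < flags.length; i++) {
    final c = symbolic[i];
    if (c == flags[i]) {
      // Position 0 is the highest of the nine bits (user read).
      mode |= 1 << (8 - i);
    } else if (c != '-') {
      throw FormatException('unexpected character', symbolic, i);
    }
  }
  return mode;
}

void main() {
  print(parseMode('rwxr-xr-x').toRadixString(8)); // 755
  print(parseMode('rw-r--r--').toRadixString(8)); // 644
}
```

The parse happens at run time, so unlike a literal or a const expression the result cannot be used in a constant context.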
Or just introduce a helper like oct(7, 5, 5) like:
int oct(int d1, [int? d2, int? d3, int? d4, int? d5]) {
  int ensureOctal(int d) => d >= 0 && d <= 7 ? d : (throw ArgumentError.value(d, null, "not octal digit"));
  var value = ensureOctal(d1);
  // Append each supplied digit in order, validating it like the first one;
  // stop at the first omitted digit.
  for (final d in [d2, d3, d4, d5]) {
    if (d == null) break;
    value = (value << 3) | ensureOctal(d);
  }
  return value;
}
(If I were to design a Posix files system API in the near future, I'd use an inline class on an int for permissions, rather than a plain int.)
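As a rough illustration of that idea: "inline class" was the working name for what shipped as extension types in Dart 3.3, and a wrapper along those lines might look like the sketch below. The type `FileMode` and its members are invented here and are not part of any existing Dart library.

```dart
/// Hypothetical wrapper type for permission bits.
extension type const FileMode(int bits) {
  /// Builds a mode from three values in the range 0..7 (not validated here).
  const FileMode.of({int user = 0, int group = 0, int others = 0})
      : bits = (user << 6) | (group << 3) | others;

  bool get othersCanWrite => (bits & 2) != 0;

  /// Conventional three-digit octal spelling, e.g. "755".
  String get octal => bits.toRadixString(8).padLeft(3, '0');
}

void main() {
  const mode = FileMode.of(user: 7, group: 5, others: 5);
  print(mode.octal); // 755
  print(mode.othersCanWrite); // false
}
```

Because `FileMode` does not implement `int`, a bare integer such as 755 cannot be passed where a `FileMode` is expected, which is the mistake the wrapper is meant to catch.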
Creating an abstraction for this is likely to feel like overkill in places (like the one in question) where you just need to specify a couple permissions flags. Parsing octal literals is a standard way to make this easy enough that authors can use it without jumping through hoops.
This is a widely-supported feature across many programming languages. JavaScript, Java, Go, Python, Ruby, Rust, C, and Swift all support it using either 0o123, 0123, or both—of the widely-used languages, C# is the only one that doesn't. So I'd argue that it's more surprising for a user to see it absent in Dart than it would be to see it present.
Should we take advantage of https://github.com/dart-lang/language/issues/581#issuecomment-2259691898 and add octal literals? It seems @srawlins's patch could be easily adapted to also solve this one.
IMO, no thanks. I'd rather not.
Octal numbers have precisely one semi-common use (Unix file permissions). That's not enough of a reason to put them into the Dart language. There really are not enough use cases for the feature to be worth the effort (and syntax real-estate), no matter how small that effort is.
(A library with 512 constants, const p000 = 0; through const p777 = 0x1FF;, will solve that one use-case.)
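Such a library would not need to be written by hand. A small generator along these lines could print it; the function name is hypothetical, and every value is emitted in hex for uniformity, so the first line comes out as `const p000 = 0x000;` rather than `const p000 = 0;`.

```dart
/// Returns the source lines for the constants p000 through p777.
List<String> permissionConstants() {
  final lines = <String>[];
  for (var value = 0; value < 512; value++) {
    final name = value.toRadixString(8).padLeft(3, '0');
    final hex = value.toRadixString(16).toUpperCase().padLeft(3, '0');
    lines.add('const p$name = 0x$hex;');
  }
  return lines;
}

void main() {
  final lines = permissionConstants();
  print(lines.length); // 512
  print(lines.last); // const p777 = 0x1FF;
}
```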
Dart doesn't have to support every base of literal, and the reason to have octal is so minusculely bigger than the reason to have quaternary that I don't think it should make a difference.
(Or if Dart should support all radixes, then have one syntax parameterized with the base, say 8r775, with r meaning "radix", since b for "base" is taken by binary.)
To my mind, the most compelling difference between octal and quaternary is that octal is widely supported by programming languages using a consistent, well-known syntax and quaternary is not. Realistically, it's not giving up "syntax real-estate" to support it because if you defined 0o755 to mean anything other than 493, you'd be direly undermining user expectations and even potentially posing a security risk by making it easy to write code that looked like it was permissions-correct but in fact was not.
Dart isn't an abstract spec sitting in the realm of platonic forms—it's a programming language in the real world, used by programmers who use many other languages and have expectations accordingly. This doesn't mean it always has to do everything the same, but when there's a clear expectation that's very inexpensive to meet and provides real value, why not meet it?
If no other language had octal literals, would we add them? Probably not. Just too obscure a use-case.
But other languages do have octal literals.
Some use 045 as syntax. That's a horrible design mistake that should not be repeated.
Others use 0o45, which makes sense by itself. It's not great for reading, and particularly bad if it also allows a capital O: 0O45 is just not readable. I wouldn't allow that; lower case only, if we ever did it.
We don't have to do things just because other languages do. If other languages do something, and lots of users use it and expect it, that's another thing. As you say, if there is a clear expectation, we probably should meet it.
But there isn't. Octal literals are not table stakes, they're an esoteric feature. Most users of the languages probably don't even know they exist.
Of the languages you mention above (JavaScript, Java, Go, Python, Ruby, Rust, C, and Swift), how many can you actually remember the syntax for octals in? Or that they had octal numbers at all, without looking it up?
I remember one, JavaScript. I'd guess that Java and C have the same syntax, but it is a guess. I wrote enterprise Java apps for years and never had to use an octal literal.
It's an obscure feature that I actually think adds more complexity to a language than it's worth. Even if it's not a lot of complexity.
> Of the languages you mention above (JavaScript, Java, Go, Python, Ruby, Rust, C, and Swift), how many can you actually remember the syntax for octals in? Or that they had octal numbers at all, without looking it up?
I was curious:
- JavaScript supported both old-style 0123 octal literals and 0o123 ones. The former are deprecated.
- Go supports both old-style 0123 and 0o123 ones. The latter was added later.
- Python supported old-style but moved to 0o123 in Python 2.6.
- Ruby supports both 0123 and 0o123. (It also supports 0d123 if you want to be explicit that you're writing in decimal, which is cute.)
- C of course invented the old cursed style.
- Swift uses 0o123 octal literals.
So, aside from C and Java, every listed language supports 0o123-style octal literals. A few support the old style ones too.
The last case I'll make before I leave this up to the language team: although it's true that octal literals have narrow use-cases relative to binary and hexadecimal literals, the use-cases they do have are specifically security-related. The cost of getting a manual conversion to octal wrong is substantially higher than the cost of an average bug. I think this, as well as the wide availability of the 0o style that @munificent demonstrated, tips the balance into making this a worthwhile addition.
Just came upon this: relatively new to Dart, though.
I am interfacing LMDB via libffi and immediately got caught by Dart's default handling, which converts a 0666 integer literal to the value 666. No warning whatsoever. This is IMHO totally the principle of maximum surprise and therefore bad, because every traditional language out there either supports this old-school behaviour or, if it doesn't, produces an appropriate error.
The alternative using 0o666 would be cute, but is not available. So I need to create my own octal abstraction or take a dependency on an external package, only because this fundamental thing is not available?
@lrhn: your argument that file permissions are the only use is odd, as Dart needs to support this as a portable language. And don't forget there are a lot of unixoid/posixoid systems out there: Linux, macOS, iOS, iPadOS, Windows Subsystem for Linux, etc. And every time one uses dart:ffi to interface with a C/Unix library, people will fall into this trap.
If we can help people not make mistakes, we should.
A "You look like you want to write an octal literal" message could help. It could fire for anything of the form 0123 (a decimal with a leading zero and no 8 or 9 digit), though that might be too harsh, or it could instead target known "permission" parameters, or anything annotated with a @userMightUseOctal marker, or whatever works.
I don't think actually adding octal numerals is the best solution to any problem. Especially if nobody will ever use more than the three digits.
So far I've suggested symbolic constants, or even a library with 512 constants from o000 to o777.
Now we have an even better option: binary numerals. Rather than 0o755 you can write 0b111_101_101.
I think it's easier to read too!
If C had had binary numerals, maybe that would have been how you wrote permissions all along.
There are multiple things:
- automatic conversion of given literals 0755 to the decimal value 755 for ints without any style warning/error (my main concern)
- having a concise representation for file permissions
- conventions:
  - 0755, 0664, etc. are the canonical form for file permissions on unixoid systems
  - one would also immediately understand 'rwxr-xr-x', but this is inefficient
  - 0b111_101_101 could be understood, but it's somehow unexpected and seems more like a workaround
My main concern is the first of these, because there will be unnecessary "frictions" for those coming from a different background.
Fun fact. Here's my dialog with Gemini AI. Quote:
Me: Example of octal number literal in dart programming language
Gemini:
void main() {
int octalNumber = 0o177; // 0o prefix indicates an octal literal
print("Octal number: $octalNumber"); // Output: Octal number: 127
}
Explanation:
- 0o prefix: This prefix explicitly signifies that the following digits represent an octal number.
- Digits: Octal numbers use digits from 0 to 7.
- Output: The code will print the decimal equivalent of the octal number 0o177, which is 127.
Key Points:
- Octal literals are a concise way to represent numbers in base-8 within your Dart code.
- The 0o prefix is crucial to distinguish octal numbers from decimal or other number systems.
I hope this example helps!
End Quote
:smile:
Generative AIs be hallucinating. Must be Tuesday.
Adding 0o123 as an octal literal won't solve the problem of someone writing 0755 and getting a decimal interpretation. Solving that requires disallowing, or at least warning about, leading zeros in decimal integer numerals. If we do that, then we can redirect you to any solution, it doesn't have to be octal literals.
True. I am not aware of any legitimate use case for decimal literals with the leading zero, so the warning seems well-justified.
The problem is that 0o177 literals seem to be widespread (that's the reason for Gemini's hallucinations). It's a philosophical issue: programming idioms keep changing, and 0o177 looks like a current idiom (replacing legacy 0177). It might be easier to support it than to explain to everyone why you don't want to support it. (These 0o literals won't interfere with existing or future Dart features.)
A quick search found a few occurrences of leading zeros:
- Time/date values that are traditionally two-digit, like DateTime(2024, 03, 28, 10, 00, 00)
- Tabular data, like this base64 table
- Bugs (I think)
- Some special-formatted composite numbers, like a date being 19840101 and then having minDecimalDate be 00000101.
The vast majority are DateTime arguments.
Only DateTime seems legit (though hardcoded dates must be uncommon other than in tests). Maybe that's the use case for 0d?
Also (quoting the docs):
"While the 0d prefix is rarely used (because decimal is the default in Ruby), it is supported and can be helpful for clarity when working with mixed numeric bases".
> Only DateTime seems legit (though hardcoded dates must be uncommon other than in tests). Maybe that's the use case for 0d? Also (quoting the docs): "While the 0d prefix is rarely used (because decimal is the default in Ruby), it is supported and can be helpful for clarity when working with mixed numeric bases".
We are talking about Dart here, not Ruby, right?
@lumpidu: please refer to this comment for the context: https://github.com/dart-lang/language/issues/2708#issuecomment-2284909963
> Only DateTime seems legit (though hardcoded dates must be uncommon other than in tests). Maybe that's the use case for 0d?
I would bet money that the number of people who expect to be able to write 03 for the third day in a month is larger than the set of people who expect to write 0666 to set file permissions for a POSIX machine.
Linter could print a warning for every constant greater than 9 written with a leading zero. Having encountered 0666, it could print: "aren't you by any chance trying to set POSIX file permissions? Use 0o666 instead", or something.
Such foresight on the part of the compiler will make an extremely favorable impression on the user. The advice to type 0b110_110_110 instead will leave the user slightly puzzled.
There are only 512 three-digit octal literals. Well within the range of just having a constant for each: https://gist.github.com/lrhn/9cf3249827503c1632cbb292b753ebdf
We could have the analyzer comment on any decimal literal with (a single?) leading zero, no 8 or 9 digits and more than one significant digit. Not a warning, just an "unnecessary leading digit" hint.
No warnings for 099 (contains 9), 01 (same value if octal), maybe 0011 (two leading zeros, not necessary for an old-style octal, but 0055 could be intended as a permission).
I'd still get dinged on DateTime(0100, 01, 01), written that way to align with other dates in the same table, but I'll only use literal pre-1970 dates in tests anyway, so the problem is constrained.
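The condition described above can be written down as a small predicate over the literal's source text. This is only a sketch of the stricter variant (exactly one leading zero), not analyzer code; the name `looksLikeOldStyleOctal` is made up.

```dart
/// One leading zero, then at least two more digits, none of them 8 or 9.
final _maybeOctal = RegExp(r'^0[1-7][0-7]+$');

/// Hypothetical check on the source text of an integer literal.
bool looksLikeOldStyleOctal(String lexeme) => _maybeOctal.hasMatch(lexeme);

void main() {
  for (final lexeme in ['0755', '0100', '099', '01', '0011', '755']) {
    print('$lexeme: ${looksLikeOldStyleOctal(lexeme)}');
  }
}
```

Under this variant 0755 and the year in DateTime(0100, 01, 01) are flagged, while 099, 01, 0011 and 755 are not. Covering cases like 0055 would mean relaxing the first character class to allow further zeros.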
Might get a comment on a table like
// Not a prime.
const _np = 0;
// First 128 numbers, if they are prime, zero if not.
const primes = [ // Nobreak.
_np, _np, 002, 003, _np, 005, _np, 007, _np, _np, _np, 011, _np, 013, _np, _np, _np, 017, _np, 019,
_np, _np, _np, 023, _np, _np, _np, _np, _np, 029, _np, 031, _np, _np, _np, _np, _np, 037, _np, _np,
_np, 041, _np, 043, _np, _np, _np, 047, _np, _np, _np, _np, _np, 053, _np, _np, _np, _np, _np, 059,
_np, 061, _np, _np, _np, _np, _np, 067, _np, _np, _np, 071, _np, 073, _np, _np, _np, _np, _np, 079,
_np, _np, _np, 083, _np, _np, _np, _np, _np, 089, _np, _np, _np, _np, _np, _np, _np, 097, _np, _np,
_np, 101, _np, 103, _np, _np, _np, 107, _np, 109, _np, _np, _np, 113, _np, _np, _np, _np, _np, _np,
_np, _np, _np, _np, _np, _np, _np, 127];
The analyzer could try to be a little clever with collections that contain multiple number literals, and only say something if they're all "possibly intended as octals".
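One way to read that rule, as a sketch: collect the source text of every number literal in the collection and stay silent unless all of them have an octal-looking shape. The function name and the exact shape test (a leading zero followed only by digits 0 to 7) are assumptions for illustration.

```dart
/// A leading zero followed only by the digits 0-7.
final _octalShaped = RegExp(r'^0[0-7]+$');

/// Hypothetical collection-level check: hint only if every number literal
/// in the collection could have been intended as octal.
bool shouldHintForCollection(List<String> literalLexemes) =>
    literalLexemes.isNotEmpty && literalLexemes.every(_octalShaped.hasMatch);

void main() {
  // First row of the primes table: 019 contains a 9, so no hint.
  print(shouldHintForCollection(
      ['002', '003', '005', '007', '011', '013', '017', '019'])); // false
  // A list of what look like permission modes.
  print(shouldHintForCollection(['0755', '0644'])); // true
}
```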
But such a (rare) program can use an // ignore: comment, no?
Absolutely. (And if I had // ignore for a region, not the entire file, it would be even better.)
My main worry is whether it introduces a bunch of warnings in existing code that authors have to go back and fix up. (But if lints can be language versioned, we could just say that it doesn't trigger until you upgrade to some language version.)