
digitalmars.D - Time to move std.experimental.checkedint to std.checkedint ?

reply Walter Bright <newshound2 digitalmars.com> writes:
It's been there long enough.
Mar 23
next sibling parent reply mw <mingwu gmail.com> writes:
On Tuesday, 23 March 2021 at 21:22:18 UTC, Walter Bright wrote:
 It's been there long enough.
Can we fix all the problems found in this ticket?

https://issues.dlang.org/show_bug.cgi?id=21169

  Issue 21169 - make checkedint as a drop-in replacement of native int/long
Mar 23
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/23/2021 2:26 PM, mw wrote:
 Can we fix all the problems found in this ticket:
 
 https://issues.dlang.org/show_bug.cgi?id=21169
 
   Issue 21169 - make checkedint as a drop-in replacement of native int/long
Those are all good enhancement ideas.
Mar 24
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On Tuesday, 23 March 2021 at 21:26:43 UTC, mw wrote:

 https://issues.dlang.org/show_bug.cgi?id=21169

  Issue 21169 - make checkedint as a drop-in replacement of 
 native int/long
I'm not sure if the first thing can be supported. It would require implicit conversions of custom types, which has always been refused in the past.

I don't think the last one, number 7, can work either. checkedint supports adding arbitrary hooks that are executed during various conditions. I don't see how those could be made atomic.

--
/Jacob Carlborg
Mar 24
prev sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Tuesday, 23 March 2021 at 21:26:43 UTC, mw wrote:
 https://issues.dlang.org/show_bug.cgi?id=21169

  Issue 21169 - make checkedint as a drop-in replacement of 
 native int/long
Years ago I submitted a checkedint module of my own for inclusion in Phobos (https://code.dlang.org/packages/checkedint), which was ultimately rejected by Andrei Alexandrescu because my design goals did not align with his well enough, prompting him to write what became std.experimental.checkedint himself.

Maximum convenience and similarity to D's native integer types were high priorities for me, so I spent a lot of time thinking about and experimenting with this problem. My conclusions:

///////////////////////////////////

1) Checked types are different from unchecked types. That's the whole point!

I found that trying too hard to make transitioning between checked and unchecked types seamless created holes in the automated protection against overflow that the checked types are supposed to provide. Implicit conversions from checked to unchecked integers are dangerous for the same reason that implicit conversions from system to safe delegates are dangerous.

I think the urge to make that transition seamless comes from the fact that trying to actually use checkedint (whether mine or Andrei's) for defensive programming is extremely tedious and annoying, because no one else is doing so. But, this is the wrong solution: the real answer is that checked operations should have been the default in D from the beginning, with unchecked intrinsics available for those rare cases where wrapping overflow and other strange behaviors of machine integers are actually desired, or where maximum performance is needed.

Unchecked integer operations are mostly just a micro-optimization that is pointless outside of very hot code, like inner loops. (It is very puzzling that people consider memory safety so important, and yet are totally disinterested in integer overflow, which can violate memory safety.)

2) While there are many things that can be done to make the behavior of two types more similar, it is impossible in D to make any custom type an actual drop-in replacement for a different type.

This is because D, by design, has only partial support for implicit conversions, and because template constraints and overload resolution are sensitive to the exact type of the arguments. Thus, whether to treat two different types as equivalent is ultimately a choice that each and every API that may interact with those types makes for itself, either intentionally or by accident. For example:

    V f(V)(V value)
        if(std.traits.isIntegral!V)
    {
        // Do something here ...
    }

The perfectly reasonable template constraint above rejects checkedint types. Should it? There is no way to answer this question without seeing and understanding the body of the function: while uncommon, it is valid and sometimes desirable to depend upon wrapped integer overflow. So, the API designers must explicitly permit checkedint inputs if they consider that desirable.

Automating good solutions to these ambiguities is possible in many cases, but would require deep, breaking, and controversial changes to the D language.

///////////////////////////////////

TL;DR: What you're really asking for is impossible in D2. It would require massive breaking changes to the language to implement without undermining the guarantees that a checked integer type exists to provide.
Mar 24
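The template-constraint point in the post above can be made concrete with a small sketch. It is only an illustration, not code from the thread: `isCheckedInt`, `bump`, and `bumpOptIn` are hypothetical names, and the import assumes that newer Phobos releases provide the module as std.checkedint while older ones only have std.experimental.checkedint.

```d
import std.traits : isIntegral;

// Assumption: newer Phobos releases ship std.checkedint; at the time of this
// thread the module was std.experimental.checkedint. Use whichever exists.
static if (__traits(compiles, { import std.checkedint; }))
    import std.checkedint : Checked, Throw, checked;
else
    import std.experimental.checkedint : Checked, Throw, checked;

/// Hypothetical helper trait: true for any instantiation of Checked.
enum isCheckedInt(T) = is(T == Checked!(U, H), U, H);

/// Constrained as in the post: built-in integral types only.
V bump(V)(V value) if (isIntegral!V)
{
    ++value; // wraps silently at V.max for built-in types
    return value;
}

/// Same body, but the API author explicitly opts in to Checked types.
V bumpOptIn(V)(V value) if (isIntegral!V || isCheckedInt!V)
{
    ++value; // with the Throw hook, an overflow throws instead of wrapping
    return value;
}

// run with: dmd -unittest -main -run <file>
unittest
{
    // The isIntegral constraint rejects the checked type outright.
    static assert(!isIntegral!(Checked!(int, Throw)));
    static assert(!__traits(compiles, bump(checked!Throw(1))));

    assert(bump(int.max) == int.min);
    assert(bumpOptIn(checked!Throw(41)) == 42);
}
```

Whether `bumpOptIn` is the right constraint depends on the function body, which is the post's point: a function that relies on wrapping should keep the stricter constraint.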
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/24/2021 1:28 PM, tsbockman wrote:
 Unchecked integer operations are mostly just a micro-optimization that is
 pointless outside of very hot code, like inner loops. (It is very puzzling that
 people consider memory safety so important, and yet are totally disinterested in
 integer overflow, which can violate memory safety.)
Integer overflow happening should not result in memory safety errors in a safe language. It can cause other problems, but not that.

The reasons people don't care that much about integer overflow are:

1. they are not the cause of enough problems to be that concerning

2. 2's complement arithmetic fundamentally relies on it

3. it's hard to have signed and unsigned integer types coexist without overflows, and not having unsigned types leads to ugly kludges to get them

4. fast integer arithmetic is fundamental to fast code, not a mere micro-optimization. Who wants an overflow check on every pointer increment?

5. size_t is unsigned, and ptrdiff_t is signed. Yet they have to work together.
Mar 26
next sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Saturday, 27 March 2021 at 03:25:04 UTC, Walter Bright wrote:
 The reasons people don't care that much about integer overflow 
 are:

 1. they are not the cause of enough problems to be that 
 concerning

 2. 2's complement arithmetic fundamentally relies on it
That's an implementation detail. There is no need at either the software or the hardware level to make it the programmer's problem by default. Main memory is addressed as one giant byte array, but we interact with it through better abstractions most of the time (the stack and the heap).
 3. it's hard to have signed and unsigned integer types coexist 
 without overflows, and not having unsigned types leads to ugly 
 kludges to get them
Correctly mixing signed and unsigned integers is hard for programmers to consistently get right, but easy for the computer. That's why the default should be for the computer to do it.
 4. fast integer arithmetic is fundamental to fast code,
I did benchmarking during the development of checkedint. With good inlining and optimization, even a library solution generally slows integer math code down by less than a factor of two. (I expect a language solution could do even better.) This is significant, but nowhere near big enough to move the bottleneck in most code away from I/O, memory, floating-point, or integer math for which wrapping is semantically correct (like hashing or encryption). In those cases where integer math code really is the bottleneck, there are often just a few hot spots where the automatic checks in some inner loop need to be replaced with manual checks outside the loop.
 not a mere micro-optimization.
By "micro-optimization" I mean that it does not affect the asymptotic performance of algorithms, does not matter much outside of hot spots, and is unlikely to change where the hot spots are in the average program.
 Who wants an overflow check on every pointer increment?
As with bounds checks, most of the time the compiler should be able to prove the checks can be skipped, or move them outside the inner loop. The required logic is very similar.
Mar 27
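As an illustration of the signed/unsigned point in the post above, here is a minimal sketch comparing the built-in comparison with the `ProperCompare` hook of checkedint. `lessBuiltin` and `lessProper` are hypothetical names, and the import assumes the module is available as either std.checkedint or std.experimental.checkedint.

```d
// Assumption: newer Phobos releases ship std.checkedint; at the time of this
// thread the module was std.experimental.checkedint. Use whichever exists.
static if (__traits(compiles, { import std.checkedint; }))
    import std.checkedint : ProperCompare, checked;
else
    import std.experimental.checkedint : ProperCompare, checked;

/// Built-in comparison: `a` is converted to uint before comparing.
bool lessBuiltin(int a, uint b)
{
    return a < b;
}

/// Comparison through the ProperCompare hook, which accounts for the sign.
bool lessProper(int a, uint b)
{
    return checked!ProperCompare(a) < b;
}

// run with: dmd -unittest -main -run <file>
unittest
{
    assert(!lessBuiltin(-1, 1u)); // -1 is reinterpreted as 4294967295u
    assert(lessProper(-1, 1u));   // mathematically, -1 < 1
    assert(lessProper(0, 1u) == lessBuiltin(0, 1u)); // non-negative values agree
}
```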
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/27/21 3:42 AM, tsbockman wrote:
 With good inlining and optimization, even a library solution generally 
 slows integer math code down by less than a factor of two. (I expect a 
 language solution could do even better.)
 
 This is significant, but nowhere near big enough to move the bottleneck 
 in most code away from I/O, memory, floating-point, or integer math for 
 which wrapping is semantically correct (like hashing or encryption). In 
 those cases where integer math code really is the bottleneck, there are 
 often just a few hot spots where the automatic checks in some inner loop 
 need to be replaced with manual checks outside the loop.
This claim seems speculative. A factor of two for a fundamental class of operations is very large, not just "significant". We're talking about e.g. 1 cycle for addition, and it was a big deal when it was introduced back in the early 2000s. Checked code is larger, meaning more pressure on the scarce I-cache in large programs - and that's not going to be visible in microbenchmarks.

And "I/O is slow anyway" is exactly what drove the development of C++'s catastrophically slow iostreams.
Mar 29
next sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Monday, 29 March 2021 at 16:41:12 UTC, Andrei Alexandrescu wrote:
 Checked code is larger, meaning more pressure on the scarce
 I-cache in large programs - and that's not going to be visible
 in microbenchmarks.
This is true. But, at the moment I don't have an easy way to quantify the size of that effect.
 And "I/O is slow anyway" is exactly what drove the development 
 of C++ catastrophically slow iostreams.
That's really not what I said, though. What I actually said is:

0) The performance of hot code is usually limited by something other than semantically non-wrapping integer arithmetic.

1) When non-wrapping integer arithmetic is the bottleneck, the compiler should usually be able to optimize away most of the cost of checking for overflow.

2) When the compiler cannot optimize away most of the cost, programmers can usually do so manually.

3) Programmers could still disable the checks entirely wherever they consider the performance gain worth the damage done to correctness/reliability.

4) Outside of hot code, the cost isn't significant.

You're picking on (0), but the validity of my claim that checked arithmetic by default wouldn't negatively impact performance much mainly depends upon the truth of (4) plus either the truth of (1), or the willingness and ability of programmers to take advantage of (2) and (3).
Mar 29
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/29/21 3:25 PM, tsbockman wrote:
 On Monday, 29 March 2021 at 16:41:12 UTC, Andrei Alexandrescu wrote:
 Checked code is larger, meaning more pressure on the scarce
 I-cache in large programs - and that's not going to be visible
 in microbenchmarks.
 This is true. But, at the moment I don't have an easy way to quantify the size of that effect.
You actually do. Apply the scientific method. This is not a new idea; it has most definitely been around for years, and people have tried a variety of things. So all you need to do is search around scholar.google.com for papers on the topic and plain google.com for other work on the topic. In a couple of minutes I found:

* https://dl.acm.org/doi/abs/10.1145/2743019 - relatively recent, quotes a lot of other work. A good starting point.

* -ftrapv and -fwrapv flags in gcc: https://gcc.gnu.org/onlinedocs/gcc-4.0.2/gcc/Code-Gen-Options.html. This is not quite what you're looking for (they just crash the program on overflow), but it's good to figure out how much demand there is and how people use those flags.

* How popular is automated/manual overflow check in systems languages? Rust is a stickler for safety and it has explicit operations that check: https://stackoverflow.com/questions/52646755/checking-for-integer-overflow-in-rust. I couldn't find any proposal for C or C++. What does this lack of evidence suggest?

etc.
Mar 29
parent tsbockman <thomas.bockman gmail.com> writes:
On Tuesday, 30 March 2021 at 01:09:12 UTC, Andrei Alexandrescu wrote:
 * https://dl.acm.org/doi/abs/10.1145/2743019 - relatively 
 recent, quotes a lot of other work. A good starting point.
I skimmed the paper, and from what I have seen so far it supports my understanding of the facts in every way. I intend to read it more carefully later this week and post a summary here of the most relevant bits, for the benefit of anyone who doesn't want to pay for it.

Of course, there is a subjective aspect to all of this as well; even with numbers in hand, reasonable people may disagree as to what should be done about them.
Mar 30
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/29/2021 9:41 AM, Andrei Alexandrescu wrote:
 On 3/27/21 3:42 AM, tsbockman wrote:
 With good inlining and optimization, even a library solution generally slows 
 integer math code down by less than a factor of two. (I expect a language 
 solution could do even better.)

 This is significant, but nowhere near big enough to move the bottleneck in 
 most code away from I/O, memory, floating-point, or integer math for which 
 wrapping is semantically correct (like hashing or encryption). In those cases 
 where integer math code really is the bottleneck, there are often just a few 
 hot spots where the automatic checks in some inner loop need to be replaced 
 with manual checks outside the loop.
 This claim seems speculative. A factor of two for a fundamental class of operations is very large, not just "significant". We're talking about e.g. 1 cycle for addition, and it was a big deal when it was introduced back in the early 2000s. Checked code is larger, meaning more pressure on the scarce I-cache in large programs - and that's not going to be visible in microbenchmarks. And "I/O is slow anyway" is exactly what drove the development of C++ catastrophically slow iostreams.
With the LEA instruction, which can do adds and some multiplies in one operation, this calculation often comes at zero cost, as it uses the address calculation logic that runs in parallel. LEA does not set any flags or include any overflow detection logic. Just removing that optimization will result in significant slowdowns.

Yes, bugs happen because of overflows. The worst consequence of this is memory corruption bugs in the form of undersized allocations and subsequent buffer overflows (from malloc(numElems * sizeElem)). But D's buffer overflow protection features mitigate this.

D's integral promotion rules (bytes and shorts are promoted to ints before doing arithmetic) get rid of the bulk of likely overflows. (It's ironic that the integral promotion rules are much maligned and considered a mistake; I don't share that opinion, and this is one of the reasons why.)

In my experience, there are very few places in real code where overflow is a possibility. They usually come in the form of unexpected input, such as overly large files, or specially crafted malicious input. I've inserted checks in DMD's implementation where overflow is a risk.

Placing the burden of checks everywhere is a poor tradeoff.

It isn't even clear what the behavior on overflows should be. Error? Wraparound? Saturation? std.experimental.checkedint enables the user to make this decision on a case-by-case basis. The language properly defaults to the simplest and fastest choice - wraparound.

BTW, Rust does have optional overflow protection; it's turned off for release builds. This is pretty good evidence the performance cost of such checks is not worth it. It also does not do integral promotion, so Rust code is far more vulnerable to overflows.
Mar 29
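The three overflow behaviours named in the post above (error, wraparound, saturation) can be selected per use site with checkedint hooks. The following is a minimal sketch under stated assumptions: the function names are hypothetical, and the import assumes the module is available as either std.checkedint or std.experimental.checkedint.

```d
// Assumption: newer Phobos releases ship std.checkedint; at the time of this
// thread the module was std.experimental.checkedint. Use whichever exists.
static if (__traits(compiles, { import std.checkedint; }))
    import std.checkedint : Saturate, Throw, checked;
else
    import std.experimental.checkedint : Saturate, Throw, checked;

/// Language default: two's complement wraparound.
int addWrapping(int a, int b)
{
    return a + b;
}

/// Saturate hook: the result is clamped to int.min or int.max.
int addSaturating(int a, int b)
{
    return (checked!Saturate(a) + b).get;
}

/// Throw hook: overflow raises an exception.
int addThrowing(int a, int b)
{
    return (checked!Throw(a) + b).get;
}

// run with: dmd -unittest -main -run <file>
unittest
{
    import std.exception : assertThrown;

    assert(addWrapping(int.max, 1) == int.min);
    assert(addSaturating(int.max, 1) == int.max);
    assert(addSaturating(int.min, -1) == int.min);
    assertThrown(addThrowing(int.max, 1));

    // Without overflow, all three agree.
    assert(addWrapping(2, 3) == 5);
    assert(addSaturating(2, 3) == 5);
    assert(addThrowing(2, 3) == 5);
}
```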
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 29 March 2021 at 20:00:03 UTC, Walter Bright wrote:
 D's integral promotion rules (bytes and shorts are promoted to 
 ints before doing arithmetic) get rid of the bulk of likely 
 overflows. (It's ironic that the integral promotion rules are 
 much maligned and considered a mistake, I don't share that 
 opinion, and this is one of the reasons why.)
Well...sometimes they do:

    auto result = int.max + int.max;
    writeln(typeof(result).stringof); // int
    writeln(result); // -2

The main issue with D's integer promotion rules is that they're inconsistent. Sometimes truncating the result of an expression requires an explicit cast, and sometimes it doesn't.
Mar 29
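The post above does not give an example of the inconsistency it mentions. One commonly cited illustration (an assumption about what was meant, not the poster's own code) is that storing a promoted sum back into a narrow type needs a cast, while the equivalent compound assignment does not. The helper names below are hypothetical.

```d
// byte + byte is computed as int, so the sum itself cannot overflow.
static assert(is(typeof(byte.max + byte.max) == int));

// Storing that int back into a byte requires an explicit cast...
static assert(!__traits(compiles, { byte b = 1; byte c = b + b; }));

/// Hypothetical helper: the truncation is spelled out with a cast.
byte addWithCast(byte a, byte b)
{
    return cast(byte)(a + b);
}

/// Hypothetical helper: the compound assignment truncates with no cast.
byte addWithOpAssign(byte a, byte b)
{
    a += b;
    return a;
}

// run with: dmd -unittest -main -run <file>
unittest
{
    assert(addWithCast(127, 1) == -128);
    assert(addWithOpAssign(127, 1) == -128);

    // At int width, promotion no longer helps, as shown in the post.
    assert(int.max + int.max == -2);
}
```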
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/29/2021 2:05 PM, Paul Backus wrote:
 On Monday, 29 March 2021 at 20:00:03 UTC, Walter Bright wrote:
 D's integral promotion rules (bytes and shorts are promoted to ints before 
 doing arithmetic) get rid of the bulk of likely overflows. (It's ironic that 
 the integral promotion rules are much maligned and considered a mistake, I 
 don't share that opinion, and this is one of the reasons why.)
 Well...sometimes they do:
 
     auto result = int.max + int.max;
     writeln(typeof(result).stringof); // int
     writeln(result); // -2
I wrote "the bulk of", not "all"
 The main issue with D's integer promotion rules is that they're inconsistent.
 Sometimes truncating the result of an expression requires an explicit cast, and
 sometimes it doesn't.
Without an example, I don't know what you mean.
Mar 29
prev sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Monday, 29 March 2021 at 20:00:03 UTC, Walter Bright wrote:
 It isn't even clear what the behavior on overflows should be. 
 Error? Wraparound? Saturation?
It only seems unclear because you have accepted the idea that computer code "integer" operations may differ from mathematical integer operations in arbitrary ways.

Otherwise, the algorithm is simple:

    if(floor(mathResult) <= codeResult && codeResult <= ceil(mathResult))
        return codeResult;
    else
        signalErrorSomehow();

Standard mathematical integer addition does not wrap around or saturate. When someone really wants an operation that wraps around or saturates (not just for speed's sake), then that is a different operation and should use a different name and/or type(s), to avoid sowing confusion and ambiguity throughout the codebase for readers and compilers.

All of the integer behavior that people complain about violates this in some way: wrapping overflow, incorrect signed-unsigned comparisons, confusing/inconsistent implicit conversion rules, undefined behavior of various more obscure operations for certain inputs, etc.

Mathematical integers are a more familiar, simpler, easier to reason about abstraction. When we use this abstraction, we can draw upon our understanding and intuition from our school days, use common mathematical laws and formulas with confidence, etc. Of course the behavior of the computer cannot fully match this infinite abstraction, but it could at least tell us when it is unable to do what was asked of it, instead of just silently doing something else.
Mar 29
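For 32-bit addition, the pseudocode in the post above can be sketched directly by computing the mathematical result in a wider type. This is only an illustration: `addOrThrow` is a hypothetical name, throwing an `Exception` is one arbitrary way to "signal the error somehow", and for integer addition the floor/ceil bounds reduce to plain equality.

```d
/// Returns a + b if the wrapped 32-bit result equals the exact sum;
/// otherwise throws.
int addOrThrow(int a, int b) pure @safe
{
    const long mathResult = cast(long) a + b; // exact: cannot overflow in 64 bits
    const int codeResult = a + b;             // what the machine produces (wraps)

    if (mathResult == codeResult)
        return codeResult;
    throw new Exception("int addition overflowed");
}

// run with: dmd -unittest -main -run <file>
unittest
{
    import std.exception : assertThrown;

    assert(addOrThrow(2, 3) == 5);
    assert(addOrThrow(int.max, -1) == int.max - 1);
    assertThrown(addOrThrow(int.max, 1));
    assertThrown(addOrThrow(int.min, -1));
}
```

The widening trick does not extend to the largest built-in type, which is why library and hardware approaches rely on overflow flags instead.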
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Mar 29, 2021 at 10:47:49PM +0000, tsbockman via Digitalmars-d wrote:
 On Monday, 29 March 2021 at 20:00:03 UTC, Walter Bright wrote:
 It isn't even clear what the behavior on overflows should be. Error?
 Wraparound? Saturation?
 It only seems unclear because you have accepted the idea that computer code "integer" operations may differ from mathematical integer operations in arbitrary ways.
The only thing at fault here is the name "integer". `int` in D is defined to be a 32-bit machine word. The very specification of "32-bit" already implies modulo 2^32. Meaning, this is arithmetic modulo 2^32, this is NOT a mathematical infinite-capacity integer. Ditto for the other built-in integral types. When you typed `int` you already signed up for all of the "unintuitive" behaviour that has been the standard behaviour of built-in machine words since the 70's and 80's. They *approximate* mathematical integers, but they are certainly NOT the same thing as mathematical integers, and this is *by definition*. If you want mathematical integers, you should be using std.bigint or something similar instead.
 Otherwise, the algorithm is simple:
 
     if(floor(mathResult) <= codeResult && codeResult <= ceil(mathResult))
         return codeResult;
     else
         signalErrorSomehow();
Implementing such a scheme would introduce so much overhead that it would render the `int` type essentially useless for systems programming. Or for any application where performance is important, for that matter.
 Standard mathematical integer addition does not wrap around or
 saturate.  When someone really wants an operation that wraps around or
 saturates (not just for speed's sake), then that is a different
 operation and should use a different name and/or type(s), to avoid
 sowing confusion and ambiguity throughout the codebase for readers and
 compilers.
The meaning of +, -, *, /, % for built-in machine words has been the one in modulo 2^n arithmetic since the early days when computers were first invented. This isn't going to change anytime soon in a systems language. It doesn't matter what you call them; if you don't like the use of the symbols +, -, *, / for anything other than "standard mathematical integers", make your own language and call them something else. But they are the foundational hardware-supported operations upon which more complex abstractions are built; without them, you wouldn't even be capable of arithmetic in the first place.

It's unrealistic to impose pure mathematical definitions on limited-precision hardware numbers. Sooner or later, any programmer must come to grips with what's actually implemented in hardware, not what he imagines some ideal utopian hardware would implement.

It's like people complaining that IEEE floats are "buggy" or otherwise behave in strange ways. That's because they're NOT mathematical real numbers. But they *are* a useful approximation of mathematical real numbers -- if used correctly. That requires learning to work with what's implemented in the hardware rather than imposing mathematical ideals on an abstraction that requires laborious (i.e., inefficient) translations to fit the ugly hardware reality.

If you don't like the "oddness" of hardware-implemented types, there's always the option of using std.bigint, or software like Mathematica or similar that frees you from needing to worry about the ugly realities of the hardware. Just don't expect the same kind of performance you will get by using the hardware types directly.
 All of the integer behavior that people complain about violates this
 in some way: wrapping overflow, incorrect signed-unsigned comparisons,
 confusing/inconsistent implicit conversion rules, undefined behavior
 of various more obscure operations for certain inputs, etc.
 
 Mathematical integers are a more familiar, simpler, easier to reason
 about abstraction. When we use this abstraction, we can draw upon our
 understanding and intuition from our school days, use common
 mathematical laws and formulas with confidence, etc. Of course the
 behavior of the computer cannot fully match this infinite abstraction,
 but it could at least tell us when it is unable to do what was asked
 of it, instead of just silently doing something else.
It's easy to invent idealized abstractions that are easy to reason about, but which require unnatural contortions to implement efficiently in hardware. A programming language like D that claims to be a systems programming language needs to be able to program the hardware directly, not to impose some ideal abstractions that do not translate nicely to hardware and that therefore require a lot of complexity on the part of the compiler to implement, and on top of that incur poor runtime performance.

To quote Knuth:

    People who are more than casually interested in computers should
    have at least some idea of what the underlying hardware is like.
    Otherwise the programs they write will be pretty weird.
    -- D. Knuth

Again, if you expect mathematical integers, use std.bigint. Or MathCAD or similar. The integral types defined in D are raw hardware types of fixed bit length -- which by definition operate according to modulo 2^n arithmetic. The "peculiarities" of the hardware types are inevitable, and I seriously doubt this is going to change anytime in the foreseeable future. By using `int` instead of `BigInt`, the programmer has already implicitly accepted the "weird" hardware behaviour, and must be prepared to deal with the consequences.

Just as when you use `float` or `double` you already signed up for IEEE semantics, like it or not. (I don't, but I also recognize that it's unrealistic to expect the hardware type to match up 100% with the mathematical ideal.) If you don't like that, use one of the real arithmetic libraries out there that let you work with "true" mathematical reals that aren't subject to the quirks of IEEE floating-point numbers. Just don't expect anything that will be competitive performance-wise.

Like I said, the only real flaw here is the choice of the name `int` for a hardware type that's clearly NOT an unbounded mathematical integer. It's too late to rename it now, but basically it should be thought of as `intMod32bit` rather than `integerInTheMathematicalSense`. Once you mentally translate `int` into "32-bit 2's-complement binary word in a hardware register", everything else naturally follows.

T

--
They pretend to pay us, and we pretend to work. -- Russian saying
Mar 29
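The contrast drawn in the post above between `int` as arithmetic modulo 2^32 and `std.bigint` as an unbounded integer can be seen in a few lines. This is a minimal sketch; `addModular` and `addExact` are hypothetical names.

```d
import std.bigint : BigInt;

/// Built-in int: the sum is taken modulo 2^32 (two's complement wraparound).
int addModular(int a, int b)
{
    return a + b;
}

/// BigInt: the sum is the mathematical result, at the cost of a slower,
/// arbitrary-precision representation.
BigInt addExact(int a, int b)
{
    return BigInt(a) + b;
}

// run with: dmd -unittest -main -run <file>
unittest
{
    assert(addModular(int.max, 1) == int.min);
    assert(addExact(int.max, 1) == BigInt("2147483648"));

    // Within range, both agree.
    assert(addModular(2, 3) == 5);
    assert(addExact(2, 3) == 5);
}
```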
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/29/2021 5:02 PM, H. S. Teoh wrote:
 Like I said, the only real flaw here is the choice of the name `int` for
 a hardware type that's clearly NOT an unbounded mathematical integer.
You're right. It's not an integer, it's an int :-) Besides, nobody is going to want to type `intMod32bit` every time they declare a variable. Heck, Rust chose `int32`, but that is of value only in the first 30 seconds of learning Rust, and will be cursed forever after.
Mar 29
prev sibling next sibling parent tsbockman <thomas.bockman gmail.com> writes:
On Tuesday, 30 March 2021 at 00:02:54 UTC, H. S. Teoh wrote:
 If you want mathematical integers, you should be using 
 std.bigint or something similar instead.


 Otherwise, the algorithm is simple:
 
     if(floor(mathResult) <= codeResult && codeResult <= ceil(mathResult))
         return codeResult;
     else
         signalErrorSomehow();
 Implementing such a scheme would introduce so much overhead that it would render the `int` type essentially useless for systems programming. Or for any application where performance is important, for that matter.
You have a wildly exaggerated sense of the runtime performance cost of doing things the way I advocate if you think it is anywhere close to bigint. My proposal (grossly oversimplified) is mostly just to check the built-in CPU overflow flags once in a while. I've actually tested this, and even with a library solution the overhead is low in most realistic scenarios, if the inliner and optimizer are effective. A language solution could do even better, I'm sure.
Mar 29
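The idea in the post above of checking an overflow flag "once in a while" rather than after every operation can be sketched with druntime's `core.checkedint`, whose overflow flag is sticky: it is only ever set, never cleared, so one test after a run of operations suffices. `checkedSum` is a hypothetical name, and this is not the poster's own implementation.

```d
import core.checkedint : adds;

/// Sums the values, testing the sticky overflow flag once after the loop.
int checkedSum(const(int)[] values) pure @safe
{
    bool overflow = false;
    int total = 0;

    foreach (v; values)
        total = adds(total, v, overflow); // sets `overflow` on overflow, never clears it

    if (overflow)
        throw new Exception("integer overflow while summing");
    return total;
}

// run with: dmd -unittest -main -run <file>
unittest
{
    import std.exception : assertThrown;

    assert(checkedSum([1, 2, 3]) == 6);
    assert(checkedSum([]) == 0);
    assertThrown(checkedSum([int.max, 1]));

    // The flag stays set even if later terms bring the total back into range.
    assertThrown(checkedSum([int.max, 1, -1]));
}
```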
prev sibling parent reply Max Samukha <maxsamukha gmail.com> writes:
On Tuesday, 30 March 2021 at 00:02:54 UTC, H. S. Teoh wrote:

 Just as when you use `float` or `double` you already signed up 
 for IEEE semantics, like it or not. (I don't, but I also 
 recognize that it's unrealistic to expect the hardware type to 
 match up 100% with the mathematical ideal.) If you don't like 
 that, use one of the real arithmetic libraries out there that 
 let you work with "true" mathematical reals that aren't subject 
 to the quirks of IEEE floating-point numbers. Just don't expect 
 anything that will be competitive performance-wise.
It seems you are arguing against the way D broke compile time floats and doubles. )
Mar 29
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/29/2021 10:53 PM, Max Samukha wrote:
 On Tuesday, 30 March 2021 at 00:02:54 UTC, H. S. Teoh wrote:
 
 Just as when you use `float` or `double` you already signed up for IEEE 
 semantics, like it or not. (I don't, but I also recognize that it's 
 unrealistic to expect the hardware type to match up 100% with the mathematical 
 ideal.) If you don't like that, use one of the real arithmetic libraries out 
 there that let you work with "true" mathematical reals that aren't subject to 
 the quirks of IEEE floating-point numbers. Just don't expect anything that 
 will be competitive performance-wise.
 It seems you are arguing against the way D broke compile time floats and doubles. )
Compile-time isn't a run-time performance issue.
Mar 29
next sibling parent reply Max Haughton <maxhaton gmail.com> writes:
On Tuesday, 30 March 2021 at 06:43:04 UTC, Walter Bright wrote:
 On 3/29/2021 10:53 PM, Max Samukha wrote:
 On Tuesday, 30 March 2021 at 00:02:54 UTC, H. S. Teoh wrote:
 
 Just as when you use `float` or `double` you already signed 
 up for IEEE semantics, like it or not. (I don't, but I also 
 recognize that it's unrealistic to expect the hardware type 
 to match up 100% with the mathematical ideal.) If you don't 
 like that, use one of the real arithmetic libraries out there 
 that let you work with "true" mathematical reals that aren't 
 subject to the quirks of IEEE floating-point numbers. Just 
 don't expect anything that will be competitive 
 performance-wise.
 It seems you are arguing against the way D broke compile time floats and doubles. )
 Compile-time isn't a run-time performance issue.
On the subject of run-time performance, checkedint can also do things like saturation arithmetic, which can be accelerated using increasingly common native instructions (e.g. AVX on Intel, AMD, and presumably Via also). I have done some tests and found that these are not currently used. ARM also has saturating instructions but I haven't done any tests.

Due to AVX being a SIMD instruction set there is a tradeoff to using them for scalar operations; however, for loops the proposition seems attractive. The calculus to do this seems non-trivial for the backend, however. (AVX instructions are also quite big so there is the usual I$ hit here too.)
Mar 30
parent reply Bruce Carneal <bcarneal gmail.com> writes:
On Tuesday, 30 March 2021 at 08:48:04 UTC, Max Haughton wrote:
 On Tuesday, 30 March 2021 at 06:43:04 UTC, Walter Bright wrote:
 On 3/29/2021 10:53 PM, Max Samukha wrote:
 On Tuesday, 30 March 2021 at 00:02:54 UTC, H. S. Teoh wrote:
 
 
[...]
 On the subject of run-time performance, checkedint can also do 
 things like Saturation arithmetic, which can be accelerated 
 using increasingly common native instructions (e.g. AVX on 
 Intel, AMD, and presumably Via also).
[...]
 (AVX instructions are also quite big so there is a the usual I$ 
 hit here too).
Some micro-architectures employ an L0/uOp cache, which can significantly alter the I$ performance calculus within loops. To confidently identify an I$ performance bottleneck I think you'd need to use perf analysis tools. IIRC Max recommended this at Beerconf. Side note: the checkedint code sure looks nice. It's a very readable example of the leverage D affords.
Mar 30
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2021 6:33 AM, Bruce Carneal wrote:
 Side note: the checkedint code sure looks nice.  It's a very readable example of
 the leverage D affords.
Yes, that's also why I want it to have more visibility by being in Phobos.
Mar 30
prev sibling parent reply Max Samukha <maxsamukha gmail.com> writes:
On Tuesday, 30 March 2021 at 06:43:04 UTC, Walter Bright wrote:

 Compile-time isn't a run-time performance issue.
Performance is irrelevant to the fact that D frivolously violates basic assumptions about float/double at compile-time.
Mar 31
next sibling parent Max Haughton <maxhaton gmail.com> writes:
On Wednesday, 31 March 2021 at 11:18:05 UTC, Max Samukha wrote:
 On Tuesday, 30 March 2021 at 06:43:04 UTC, Walter Bright wrote:

 Compile-time isn't a run-time performance issue.
Performance is irrelevant to the fact that D frivolously violates basic assumptions about float/double at compile-time.
Like?
Mar 31
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 31.03.21 13:18, Max Samukha wrote:
 On Tuesday, 30 March 2021 at 06:43:04 UTC, Walter Bright wrote:
 
 Compile-time isn't a run-time performance issue.
Performance is irrelevant to the fact that D frivolously violates basic assumptions about float/double at compile-time.
Not just at compile time, but it's less noticeable at runtime because compilers usually choose to do the right thing anyway.
Apr 01
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/29/2021 3:47 PM, tsbockman wrote:
 On Monday, 29 March 2021 at 20:00:03 UTC, Walter Bright wrote:
 It isn't even clear what the behavior on overflows should be. Error? 
 Wraparound? Saturation?
It only seems unclear because you have accepted the idea that computer code "integer" operations may differ from mathematical integer operations in arbitrary ways.
Programmers need to accept that computer math is different in arbitrary ways. Not accepting it means a lifetime of frustration, because it cannot be the same.
 Otherwise, the algorithm is simple:
 
     if(floor(mathResult) <= codeResult && codeResult <= ceil(mathResult))
         return codeResult;
     else
         signalErrorSomehow();
Some of the SIMD arithmetic instructions use saturation arithmetic. It is definitely a thing, and Intel found it profitable to add hardware support for it.
 Standard mathematical integer addition does not wrap around or saturate. When 
 someone really wants an operation that wraps around or saturates (not just for 
 speed's sake), then that is a different operation and should use a different 
 name and/or type(s), to avoid sowing confusion and ambiguity throughout the 
 codebase for readers and compilers.
That's what std.experimental.checkedint does.
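As a rough sketch of what "a different type for a different operation" looks like with the library: the Throw hook turns overflow into an exception, while plain int arithmetic wraps. The function names `additionOverflows` and `wrappingAdd` are hypothetical, and the import assumes the module is available as std.checkedint (std.experimental.checkedint when this thread was written).

```d
import std.checkedint : Throw, checked;

/// Hypothetical helper: reports whether a + b overflows int,
/// using the Throw hook to detect it.
bool additionOverflows(int a, int b)
{
    try
    {
        auto sum = checked!Throw(a) + b;  // throws if the result does not fit
        return false;
    }
    catch (Exception e)
    {
        return true;
    }
}

/// The same addition on plain int silently wraps around
/// (two's complement wrapping is defined behavior in D).
int wrappingAdd(int a, int b)
{
    return a + b;
}

void main()
{
    import std.stdio : writeln;

    writeln(additionOverflows(1, 2));
    writeln(additionOverflows(int.max, 1));
    writeln(wrappingAdd(int.max, 1));
}
```

The point of the sketch is only that the choice of behavior is visible in the type, so a reader can tell which operation was intended.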
 All of the integer behavior that people complain about violates this in some 
 way: wrapping overflow, incorrect signed-unsigned comparisons, 
 confusing/inconsistent implicit conversion rules,
The integral promotion rules have been standard practice for 40 years. It takes two sentences to describe them accurately. Having code that looks like C but behaves differently will be *worse*.
 undefined behavior of various 
 more obscure operations for certain inputs, etc.
Offhand, I can't think of any.
 Mathematical integers are a more familiar, simpler, easier to reason about 
 abstraction. When we use this abstraction, we can draw upon our understanding 
 and intuition from our school days, use common mathematical laws and formulas 
 with confidence, etc. Of course the behavior of the computer cannot fully match
 this infinite abstraction, but it could at least tell us when it is unable to do
 what was asked of it, instead of just silently doing something else.
These things all come at a cost. The cost is higher than the benefit. Having D generate overflow checks on all adds and multiplies will immediately make D uncompetitive with C, C++, Rust, Zig, Nim, etc.
Mar 29
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Tuesday, 30 March 2021 at 00:33:13 UTC, Walter Bright wrote:
 Having D generate overflow checks on all adds and multiplies 
 will immediately make D uncompetitive with C, C++, Rust, Zig, 
 Nim, etc.
As someone else shared earlier in this thread, Zig already handles this in pretty much exactly the way I argue for: https://ziglang.org/documentation/master/#Integer-Overflow
Mar 29
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/29/2021 6:29 PM, tsbockman wrote:
 On Tuesday, 30 March 2021 at 00:33:13 UTC, Walter Bright wrote:
 Having D generate overflow checks on all adds and multiplies will immediately 
 make D uncompetitive with C, C++, Rust, Zig, Nim, etc.
As someone else shared earlier in this thread, Zig already handles this in pretty much exactly the way I argue for:    https://ziglang.org/documentation/master/#Integer-Overflow
I amend my statement to "immediately make D as uncompetitive as Zig is".
Note that Zig has a very different idea of integers than D does. It has arbitrary bit width integers, up to 65535. This seems odd, as what are you going to do with a 6 bit integer? There aren't machine instructions to support it. It'd be better off with a ranged integer, say:
    i : int 0..64
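A ranged integer of this kind can be approximated in library code. The following is only a sketch: the `Ranged` struct is hypothetical (it is not part of Phobos), and it assumes that `0..64` means the half-open range [0, 64), matching D's slice notation.

```d
import std.exception : enforce;

/// Hypothetical ranged integer holding values in the half-open range [lo, hi).
struct Ranged(int lo, int hi)
{
    static assert(lo < hi, "empty range");

    private int value = lo;

    this(int v)
    {
        enforce(v >= lo && v < hi, "value out of range");
        value = v;
    }

    int get() const { return value; }

    /// The arithmetic is carried out in long so the intermediate result
    /// cannot overflow; the result is then range-checked.
    Ranged opBinary(string op)(int rhs) const
        if (op == "+" || op == "-")
    {
        long wide = mixin("cast(long) value " ~ op ~ " rhs");
        enforce(wide >= lo && wide < hi, "value out of range");
        return Ranged(cast(int) wide);
    }
}

void main()
{
    import std.stdio : writeln;

    alias SixBit = Ranged!(0, 64);
    auto i = SixBit(10) + 5;
    writeln(i.get);
}
```

A range check at run time is of course exactly the kind of cost being debated in this thread; the sketch shows the semantics, not a free implementation.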
Mar 29
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On Tuesday, 30 March 2021 at 03:31:05 UTC, Walter Bright wrote:

 Note that Zig has a very different idea of integers than D 
 does. It has arbitrary bit width integers, up to 65535. This 
 seems odd, as what are you going to do with a 6 bit integer? 
 There aren't machine instructions to support it. It'd be better 
 off with a ranged integer, say:

    i : int 0..64
The question is then, does that mean that Zig has over 131070 keywords (65535 for signed and unsigned each)? :D. Or does it reserve anything that starts with i/u followed by numbers? Kind of like how D reserves identifiers starting with two underscores. -- /Jacob Carlborg
Mar 30
parent Rumbu <rumbu rumbu.ro> writes:
On Tuesday, 30 March 2021 at 15:28:04 UTC, Jacob Carlborg wrote:
 On Tuesday, 30 March 2021 at 03:31:05 UTC, Walter Bright wrote:

 Note that Zig has a very different idea of integers than D 
 does. It has arbitrary bit width integers, up to 65535. This 
 seems odd, as what are you going to do with a 6 bit integer? 
 There aren't machine instructions to support it. It'd be 
 better off with a ranged integer, say:

    i : int 0..64
The question is then, does that mean that Zig has over 131070 keywords (65535 for signed and unsigned each)? :D. Or does it reserve anything that starts with i/u followed by numbers? Kind of like how D reserves identifiers starting with two underscores. -- /Jacob Carlborg
In Zig, integer type names are not considered keywords, e.g you can use i7 as a variable name or i666 as a function name. But you cannot define new types with this pattern, you get an error message stating that "Type 'i?' is shadowing primitive type 'i?'".
Mar 30
prev sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Tuesday, 30 March 2021 at 03:31:05 UTC, Walter Bright wrote:
 On 3/29/2021 6:29 PM, tsbockman wrote:
 On Tuesday, 30 March 2021 at 00:33:13 UTC, Walter Bright wrote:
 Having D generate overflow checks on all adds and multiplies 
 will immediately make D uncompetitive with C, C++, Rust, Zig, 
 Nim, etc.
As someone else shared earlier in this thread, Zig already handles this in pretty much exactly the way I argue for:    https://ziglang.org/documentation/master/#Integer-Overflow
I amend my statement to "immediately make D as uncompetitive as Zig is"
So you're now dismissing Zig as slow because its feature set surprised you? No real-world data is necessary? No need to understand any of Zig's relevant optimizations or options?
Mar 30
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2021 10:09 AM, tsbockman wrote:
 So you're now dismissing Zig as slow because its feature set surprised you?
Because it surprised me? No. Because if someone had figured out a way to do overflow checks for no runtime costs, it would be in every language. I know Rust tried pretty hard to do it.
 No real-world data is necessary? No need to understand any of Zig's relevant 
 optimizations or options?
I don't have to test a brick to assume it won't fly. But I could be wrong, definitely. If you can prove me wrong in my presumption, I'm listening.
P.S. Yes, I know anything will "fly" if you attach enough horsepower to it. But there's a reason airplanes don't look like bricks.
Mar 30
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Tuesday, 30 March 2021 at 17:53:37 UTC, Walter Bright wrote:
 On 3/30/2021 10:09 AM, tsbockman wrote:
 So you're now dismissing Zig as slow because its feature set 
 surprised you?
Because it surprised me? No. Because if someone had figured out a way to do overflow checks for no runtime costs, it would be in every language. I know Rust tried pretty hard to do it.
Zero runtime cost is not a reasonable standard unless the feature is completely worthless and it cannot be turned off.
 No real-world data is necessary? No need to understand any of 
 Zig's relevant optimizations or options?
I don't have to test a brick to assume it won't fly. But I could be wrong, definitely. If you can prove me wrong in my presumption, I'm listening.
Since I have already been criticized for the use of micro-benchmarks, I assume that only data from complete practical applications will satisfy.
Unfortunately, the idiomatic C, C++, D, and Rust source code all omit the information required to perform such tests. Simply flipping compiler switches (the -ftrapv and -fwrapv flags in gcc Andrei mentioned earlier) won't work, because most high performance code contains some deliberate and correct examples of wrapping overflow, signed-unsigned reinterpretation, etc.
Idiomatic Zig code (probably Ada, too) does contain this information. But, the selection of "real world" open source Zig code available for testing is limited right now, since Zig hasn't stabilized the language or the standard library yet.
The best test subject I have found, compiled, and run successfully is this:
    https://github.com/Vexu/arocc
It's an incomplete C compiler: "Right now preprocessing and parsing is mostly done but anything beyond that is missing." I believe compilation is a fairly integer-intensive workload, so the results should be meaningful.
To test, I took the C source code of gzip and duplicated its contents many times until I got the arocc wall time up to about 1 second. (The final input file is 37.5 MiB.) arocc outputs a long stream of error messages to stderr, whose contents aren't important for our purposes. In order to minimize the time consumed by I/O, I run each test several times in a row and ignore the early runs, to ensure that the input file is cached in RAM by the OS, and pipe the output of arocc (both stdout and stderr) to /dev/null.
Results with -O ReleaseSafe (optimizations on, with checked integer arithmetic, bounds checks, null checks, etc.):
    Binary size: 2.0 MiB
    Wall clock time: 1.31s
    System time: 0.71s
    User time: 0.60s
    CPU usage: 99% of a single core
Results with -O ReleaseFast (optimizations on, with safety checks off):
    Binary size: 2.3 MiB
    Wall clock time: 1.15s
    System time: 0.68s
    User time: 0.46s
    CPU usage: 99% of a single core
So, in this particular task ReleaseSafe (which checks for a lot of other things, not just integer overflow) takes 14% longer than ReleaseFast. If you only care about user time, that is 48% longer.
Last time I checked, these numbers are similar to the performance difference between optimized builds by DMD and LDC/GDC. They are also similar to the performance differences within related benchmarks like:
    https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/cpp.html
Note also that with Zig's approach, paying the modest performance penalty for the various safety checks is *completely optional* in release builds (just like D's bounds checking). Even for applications where that final binary order of magnitude of speed is considered essential in production, Zig's approach still leads to clearer, easier to debug code.
So, unless DMD (or C itself!) is "a brick" that "won't fly", your claim that this is something that a high performance systems programming language just cannot do is not grounded in reality.
Mar 30
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2021 4:01 PM, tsbockman wrote:
 So, in this particular task ReleaseSafe (which checks for a lot of other things,
 not just integer overflow) takes 14% longer than ReleaseFast. If you only care 
 about user time, that is 48% longer.
Thank you for running benchmarks. 14% is a big deal.
 Last time I checked, these numbers are similar to the performance difference 
 between optimized builds by DMD and LDC/GDC. They are also similar to the 
 [...] Ada/C 
 in language comparison benchmarks like:
 https://benchmarksgame-team.pages.debian.net/benchmarksgame/fastest/cpp.html
 
 Note also that with Zig's approach, paying the modest performance penalty for 
 the various safety checks is *completely optional* in release builds (just like
 D's bounds checking). Even for applications where that final binary order of 
 magnitude of speed is considered essential in production, Zig's approach still 
 leads to clearer, easier to debug code.
The problem with turning it off for production code is that the overflows tend to be rare and not encountered during testing. When you need it, it is disabled. Essentially, turning it off for release code is an admission that it is too expensive. Note that D's bounds checking is *not* turned off in release mode. It has a separate switch to turn that off, and I recommend only using it to see how much performance it'll cost for a particular application.
 So, unless DMD (or C itself!) is "a brick" that "won't fly", your claim that 
 this is something that a high performance systems programming language just 
 cannot do is not grounded in reality.
I didn't say cannot. I said it would make it uncompetitive.
Overflow checking would be nice to have. But it is not worth the cost for D. I also claim that D code is much less likely to suffer from overflows because of the implicit integer promotion rules. Adding two shorts is never going to overflow, for example, and D won't let you naively assign the resulting int back to a short.
One could legitimately claim that D *does* have a form of integer overflow protection in the form of Value Range Propagation (VRP). Best of all, VRP comes for free at zero runtime cost!
P.S. I know you know this, due to your good work on VRP :-) but I mention it for the other readers.
P.P.S. So why is this claim not made for C? Because:
    short s, t, u;
    s = t + u;
compiles without complaint in C, but will fail to compile in D. C doesn't have VRP.
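The promotion and VRP behavior described above can be checked with a small D program. This is a sketch for illustration; the names `addShorts`, `lowByte` and `narrowingCompiles` are made up for the example.

```d
/// short + short is computed as int after integral promotion,
/// so the addition itself cannot overflow.
int addShorts(short t, short u)
{
    return t + u;
}

/// VRP accepts an implicit narrowing when the value range provably fits:
/// (i & 0xFF) always lies in 0 .. 255, so it may initialize a ubyte.
ubyte lowByte(int i)
{
    ubyte low = i & 0xFF;
    return low;
}

/// Assigning the int sum of two shorts back to a short is rejected,
/// because the compiler cannot prove that the sum fits.
enum bool narrowingCompiles = __traits(compiles, {
    short a, b;
    short c = a + b;
});

void main()
{
    import std.stdio : writeln;

    writeln(addShorts(short.max, short.max));
    writeln(lowByte(0x1234));
    writeln(narrowingCompiles);

    // An explicit cast documents that narrowing is intended.
    short s = cast(short)(addShorts(1000, 2000));
    writeln(s);
}
```

Note that this protects only against narrowing mistakes; an int + int sum that exceeds int.max still wraps silently.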
Mar 30
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Wednesday, 31 March 2021 at 01:43:50 UTC, Walter Bright wrote:
 Thank you for running benchmarks.

 14% is a big deal.
Note that I deliberately chose an integer-intensive workload, and artificially sped up the I/O to highlight the performance cost. For most real-world applications, the cost is actually *much* lower. The paper Andrei linked earlier has a couple of examples:
    Checked Apache httpd is less than 0.1% slower than unchecked.
    Checked OpenSSH file copy is about 7% slower than unchecked.
https://dl.acm.org/doi/abs/10.1145/2743019
 The problem with turning it off for production code is that the 
 overflows tend to be rare and not encountered during testing. 
 When you need it, it is disabled.
Only if you choose to disable it. Just because you think it's not worth the cost doesn't mean everyone, or even most people, would turn it off.
 Essentially, turning it off for release code is an admission 
 that it is too expensive.
It's an admission that it's too expensive *for some applications*, not in general. D's garbage collector is too expensive for some applications, but that doesn't mean it should be removed from the language, nor even disabled by default.
 Note that D's bounds checking is *not* turned off in release 
 mode. It has a separate switch to turn that off, and I 
 recommend only using it to see how much performance it'll cost 
 for a particular application.
That's exactly how checked arithmetic, bounds checking, etc. works in Zig. What do you think the difference is, other than your arbitrary assertion that checked arithmetic costs more than it's worth?
 I said it would make it uncompetitive.
The mean performance difference between C and C++ in the (admittedly casual) comparative benchmarks I cited is 36%. Is C uncompetitive with C++? What definition of "uncompetitive" are you using?
 Overflow checking would be nice to have. But it is not worth 
 the cost for D. I also claim that D code is much less likely to 
 suffer from overflows...
Yes, D is better than C in this respect (among many others).
Mar 30
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/30/21 11:07 PM, tsbockman wrote:
 On Wednesday, 31 March 2021 at 01:43:50 UTC, Walter Bright wrote:
 Thank you for running benchmarks.

 14% is a big deal.
Note that I deliberately chose an integer-intensive workload, and artificially sped up the I/O to highlight the performance cost.
Idea: build dmd with -ftrapv (which is supported, I think, by gdc and ldc) and compare performance. That would be truly interesting.
Mar 30
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/30/21 7:01 PM, tsbockman wrote:
 Simply flipping compiler switches (the -ftrapv and -fwrapv flags in gcc 
 Andrei mentioned earlier) won't work, because most high performance code 
 contains some deliberate and correct examples of wrapping overflow, 
 signed-unsigned reinterpretation, etc.
 
 Idiomatic Zig code (probably Ada, too) does contain this information. 
 But, the selection of "real world" open source Zig code available for 
 testing is limited right now, since Zig hasn't stabilized the language 
 or the standard library yet.
That's awfully close to "No true Scotsman".
Mar 30
parent reply tsbockman <thomas.bockman gmail.com> writes:
On Wednesday, 31 March 2021 at 03:32:40 UTC, Andrei Alexandrescu 
wrote:
 On 3/30/21 7:01 PM, tsbockman wrote:
 Simply flipping compiler switches (the -ftrapv and -fwrapv 
 flags in gcc Andrei mentioned earlier) won't work, because 
 most high performance code contains some deliberate and 
 correct examples of wrapping overflow, signed-unsigned 
 reinterpretation, etc.
 
 Idiomatic Zig code (probably Ada, too) does contain this 
 information. But, the selection of "real world" open source 
 Zig code available for testing is limited right now, since Zig 
 hasn't stabilized the language or the standard library yet.
That's awfully close to "No true Scotsman".
Just tossing out names of fallacies isn't really very helpful if you don't explain why you think it may apply here.
Mar 30
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 12:32 AM, tsbockman wrote:
 On Wednesday, 31 March 2021 at 03:32:40 UTC, Andrei Alexandrescu wrote:
 On 3/30/21 7:01 PM, tsbockman wrote:
 Simply flipping compiler switches (the -ftrapv and -fwrapv flags in 
 gcc Andrei mentioned earlier) won't work, because most high 
 performance code contains some deliberate and correct examples of 
 wrapping overflow, signed-unsigned reinterpretation, etc.

 Idiomatic Zig code (probably Ada, too) does contain this information. 
 But, the selection of "real world" open source Zig code available for 
 testing is limited right now, since Zig hasn't stabilized the 
 language or the standard library yet.
That's awfully close to "No true Scotsman".
Just tossing out names of fallacies isn't really very helpful if you don't explain why you think it may apply here.
I thought it's fairly clear - the claim is non-falsifiable: if code is faster without checks, it is deemed so on account of tricks. Code without checks could benefit of other, better tricks, but their absence is explained by the small size of the available corpus.
Mar 30
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 12:47 AM, Andrei Alexandrescu wrote:
 On 3/31/21 12:32 AM, tsbockman wrote:
 On Wednesday, 31 March 2021 at 03:32:40 UTC, Andrei Alexandrescu wrote:
 On 3/30/21 7:01 PM, tsbockman wrote:
 Simply flipping compiler switches (the -ftrapv and -fwrapv flags in 
 gcc Andrei mentioned earlier) won't work, because most high 
 performance code contains some deliberate and correct examples of 
 wrapping overflow, signed-unsigned reinterpretation, etc.

 Idiomatic Zig code (probably Ada, too) does contain this 
 information. But, the selection of "real world" open source Zig code 
 available for testing is limited right now, since Zig hasn't 
 stabilized the language or the standard library yet.
That's awfully close to "No true Scotsman".
Just tossing out names of fallacies isn't really very helpful if you don't explain why you think it may apply here.
I thought it's fairly clear - the claim is non-falsifiable: if code is faster without checks, it is deemed so on account of tricks. Code without checks could benefit of other, better tricks, but their absence is explained by the small size of the available corpus.
s/Code without checks could benefit of other/Code with checks could benefit of other/
Mar 30
parent tsbockman <thomas.bockman gmail.com> writes:
On Wednesday, 31 March 2021 at 04:49:01 UTC, Andrei Alexandrescu 
wrote:
 On 3/31/21 12:47 AM, Andrei Alexandrescu wrote:
 On 3/31/21 12:32 AM, tsbockman wrote:
 On Wednesday, 31 March 2021 at 03:32:40 UTC, Andrei 
 Alexandrescu wrote:
 On 3/30/21 7:01 PM, tsbockman wrote:
 Simply flipping compiler switches (the -ftrapv and -fwrapv 
 flags in gcc Andrei mentioned earlier) won't work, because 
 most high performance code contains some deliberate and 
 correct examples of wrapping overflow, signed-unsigned 
 reinterpretation, etc.

 Idiomatic Zig code (probably Ada, too) does contain this 
 information. But, the selection of "real world" open source 
 Zig code available for testing is limited right now, since 
 Zig hasn't stabilized the language or the standard library 
 yet.
That's awfully close to "No true Scotsman".
Just tossing out names of fallacies isn't really very helpful if you don't explain why you think it may apply here.
I thought it's fairly clear
Thank you for explaining anyway.
 - the claim is non-falsifiable: if code is faster without
 checks, it is deemed so on account of tricks.
I've never disputed at any point that unchecked code is, by nature, almost always faster than checked code - albeit often not by much. I haven't attributed unchecked code's speed advantage to "tricks" anywhere.
 Code without checks could benefit of other, better
 tricks, but their absence is explained by the small size of the
 available corpus.
s/Code without checks could benefit of other/Code with checks could benefit of other/
While I think it is true that "better tricks" can narrow the performance gap between checked and unchecked code, that is not at all what I was talking about in the paragraphs you labeled "No true Scotsman".
Consider a C++ program similar to the following D program:
/////////////////////////////////////
module app;

import std.stdio : writeln, readln;
import std.conv : parse;

N randLCG(N)() @safe
    if(is(N == int) || is(N == uint))
{
    static N state = N(211210973);
    // "Numerical Recipes" linear congruential generator:
    return (state = N(1664525) * state + N(1013904223)); // can and should wrap
}

double testDivisor(N, N divisor)(const(ulong) trials) @safe
    if(is(N == int) || is(N == uint))
{
    N count = 0;
    foreach(n; 0 .. trials)
        count += (randLCG!N() % divisor) == N(0); // can, but should *not* wrap
    return count / real(trials);
}

void main()
{
    string input = readln();
    const trials = parse!ulong(input);
    writeln(testDivisor!( int, 3)(trials));
    writeln(testDivisor!(uint, 3)(trials));
}
/////////////////////////////////////
randLCG!( int) requires -fwrapv and NOT -ftrapv to work as intended.
randLCG!(uint) works correctly no matter what.
testDivisor!( int, 3) requires -ftrapv and NOT -fwrapv to detect unintended overflows.
testDivisor!(uint, 3) is always vulnerable to unintended overflow, with or without -ftrapv.
So, neither -ftrapv nor -fwrapv causes an idiomatic C++ program to detect unintended overflows without false positives in the general case. The compiler simply doesn't have enough information available to do so, regardless of how much performance we are willing to sacrifice. Instead, the source code of a C++ program must first be modified by a real human being to make it compatible with either -ftrapv or -fwrapv (which are mutually exclusive).
The paper you linked earlier mentions this problem: "Finally often integer overflows are known to be intentional or the programmer has investigated it and determined it to be acceptable. To address these use cases while still being useful in reporting undesired integer overflows, a whitelist functionality was introduced to enable users to specify certain files or functions that should not be checked. ... Second, our methodology for distinguishing intentional from unintentional uses of wraparound is manual and subjective. The manual effort required meant that we could only study a subset of the errors..." (See sections 5.3 and 6.1 of https://dl.acm.org/doi/abs/10.1145/2743019)
Idiomatic Zig code already contains the information which the researchers on that paper had to manually insert for all the C/C++ code they tested. That is why my tests were limited to Zig, because I don't have the time or motivation to go and determine whether each and every potential overflow in GCC or Firefox or whatever is intentional, just so that I can benchmark them with -ftrapv enabled.
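To make the missing information concrete, here is one way the intent could be written down explicitly in D today. It is a sketch, not a rewrite of the program above: `lcgNext` and `countDivisible` are hypothetical names, the generator is fixed to uint so that wrapping is well defined, and the counter goes through core.checkedint.adds so that an unintended overflow is reported rather than ignored. This is roughly the distinction Zig expresses with its separate wrapping operators such as +%.

```d
import core.checkedint : adds;

/// "Numerical Recipes" LCG step. Wrapping is intended here, so plain
/// uint arithmetic (which wraps by definition in D) is used.
uint lcgNext(ref uint state)
{
    state = 1664525u * state + 1013904223u;
    return state;
}

/// Counts how many of `trials` samples are divisible by `divisor`.
/// The counter is not supposed to wrap, so every increment is checked.
int countDivisible(uint divisor, ulong trials, uint seed = 211210973u)
{
    uint state = seed;
    int count = 0;
    bool overflow = false;
    foreach (n; 0 .. trials)
    {
        if (lcgNext(state) % divisor == 0)
            count = adds(count, 1, overflow);  // sets the flag on overflow
    }
    if (overflow)
        throw new Exception("count overflowed");
    return count;
}

void main()
{
    import std.stdio : writeln;

    writeln(countDivisible(3, 1_000_000));
}
```

With the intent spelled out like this, a checking mode never has to guess which overflows are deliberate.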
Mar 31
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/30/21 1:09 PM, tsbockman wrote:
 On Tuesday, 30 March 2021 at 03:31:05 UTC, Walter Bright wrote:
 On 3/29/2021 6:29 PM, tsbockman wrote:
 On Tuesday, 30 March 2021 at 00:33:13 UTC, Walter Bright wrote:
 Having D generate overflow checks on all adds and multiplies will 
 immediately make D uncompetitive with C, C++, Rust, Zig, Nim, etc.
As someone else shared earlier in this thread, Zig already handles this in pretty much exactly the way I argue for:     https://ziglang.org/documentation/master/#Integer-Overflow
I amend my statement to "immediately make D as uncompetitive as Zig is"
So you're now dismissing Zig as slow because its feature set surprised you? No real-world data is necessary? No need to understand any of Zig's relevant optimizations or options?
Instead of passing the burden of proof back and forth, some evidence would be welcome.
I know nothing about Zig so e.g. I couldn't tell how accurate its claims are:
https://news.ycombinator.com/item?id=21117669
FWIW I toyed with this but don't know what optimization flags zig takes:
https://godbolt.org/z/vKds1c8WY
Mar 30
next sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 03:30:00 UTC, Andrei Alexandrescu 
wrote:
 FWIW I toyed with this but don't know what optimization flags 
 zig takes: https://godbolt.org/z/vKds1c8WY
Typing --help in the flags box answers that question :) And the answer is "-O ReleaseFast": https://godbolt.org/z/1WK6W7TM9
Mar 30
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/30/21 11:40 PM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 03:30:00 UTC, Andrei Alexandrescu wrote:
 FWIW I toyed with this but don't know what optimization flags zig 
 takes: https://godbolt.org/z/vKds1c8WY
Typing --help in the flags box answers that question :) And the answer is "-O ReleaseFast": https://godbolt.org/z/1WK6W7TM9
Cool, thanks. I was looking for "the fastest code that still has the checks", how to get that?
Mar 30
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 12:01 AM, Andrei Alexandrescu wrote:
 On 3/30/21 11:40 PM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 03:30:00 UTC, Andrei Alexandrescu wrote:
 FWIW I toyed with this but don't know what optimization flags zig 
 takes: https://godbolt.org/z/vKds1c8WY
Typing --help in the flags box answers that question :) And the answer is "-O ReleaseFast": https://godbolt.org/z/1WK6W7TM9
Cool, thanks. I was looking for "the fastest code that still has the checks", how to get that?
I guess that'd be "-O ReleaseSafe": https://godbolt.org/z/cYcscf1W5
Mar 30
prev sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 04:01:48 UTC, Andrei Alexandrescu 
wrote:
 On 3/30/21 11:40 PM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 03:30:00 UTC, Andrei 
 Alexandrescu wrote:
 FWIW I toyed with this but don't know what optimization flags 
 zig takes: https://godbolt.org/z/vKds1c8WY
Typing --help in the flags box answers that question :) And the answer is "-O ReleaseFast": https://godbolt.org/z/1WK6W7TM9
Cool, thanks. I was looking for "the fastest code that still has the checks", how to get that?
Right, sorry. --help says:
    ReleaseFast             Optimizations on, safety off
    ReleaseSafe             Optimizations on, safety on
So, maybe that.
The ReleaseSafe code looks pretty good, it generates a "jo" instruction:
https://godbolt.org/z/cYcscf1W5
Who knows what it actually looks like in CPU microcode, though :)
Mar 30
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 12:04 AM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 04:01:48 UTC, Andrei Alexandrescu wrote:
 On 3/30/21 11:40 PM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 03:30:00 UTC, Andrei Alexandrescu wrote:
 FWIW I toyed with this but don't know what optimization flags zig 
 takes: https://godbolt.org/z/vKds1c8WY
Typing --help in the flags box answers that question :) And the answer is "-O ReleaseFast": https://godbolt.org/z/1WK6W7TM9
Cool, thanks. I was looking for "the fastest code that still has the checks", how to get that?
Right, sorry. --help says:     ReleaseFast             Optimizations on, safety off     ReleaseSafe             Optimizations on, safety on So, maybe that. The ReleaseSafe code looks pretty good, it generates a "jo" instruction: https://godbolt.org/z/cYcscf1W5 Who knows what it actually looks like in CPU microcode, though :)
Not much to write home about. The jumps scale linearly with the number of primitive operations:
https://godbolt.org/z/r3sj1T4hc
That's not going to be a speed demon.
Mar 30
next sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Wednesday, 31 March 2021 at 04:08:02 UTC, Andrei Alexandrescu 
wrote:
 Not much to write home about. The jumps scale linearly with the 
 number of primitive operations:

 https://godbolt.org/z/r3sj1T4hc

 That's not going to be a speed demon.
Ideally, in release builds the compiler could loosen up the precision of the traps a bit and combine the overflow checks for short sequences of side-effect free operations.
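In D, something close to this can already be written by hand with core.checkedint, whose overflow flag is documented as sticky: it is only ever set, never cleared, so a sequence of operations needs just one test at the end. The sketch below uses a hypothetical function name `dotChecked`; whether the generated code actually contains fewer branches depends on the compiler and optimizer.

```d
import core.checkedint : adds, muls;

/// Computes a * b + c * d and reports through `overflow` whether any of
/// the three operations overflowed. The flag is tested only by the caller,
/// once, after the whole side-effect-free sequence has run.
int dotChecked(int a, int b, int c, int d, out bool overflow)
{
    int ab = muls(a, b, overflow);
    int cd = muls(c, d, overflow);
    return adds(ab, cd, overflow);
}

void main()
{
    import std.stdio : writeln;

    bool overflow;
    int r = dotChecked(2, 3, 4, 5, overflow);
    writeln(r, " ", overflow);

    dotChecked(int.max, 2, 0, 0, overflow);
    writeln(overflow);
}
```

The tradeoff is the one mentioned above: the error is detected later than the operation that caused it, so the trap is less precise.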
Mar 30
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 12:37 AM, tsbockman wrote:
 On Wednesday, 31 March 2021 at 04:08:02 UTC, Andrei Alexandrescu wrote:
 Not much to write home about. The jumps scale linearly with the number 
 of primitive operations:

 https://godbolt.org/z/r3sj1T4hc

 That's not going to be a speed demon.
Ideally, in release builds the compiler could loosen up the precision of the traps a bit and combine the overflow checks for short sequences of side-effect free operations.
Yah, was hoping I'd find something like that. Was disappointed. That makes their umbrella claim "Zig is faster than C" quite specious.
Mar 30
parent reply Jacob Carlborg <doob me.com> writes:
On Wednesday, 31 March 2021 at 04:49:52 UTC, Andrei Alexandrescu 
wrote:

 That makes their umbrella claim "Zig is faster than C" quite 
 specious.
The reason, or one of the reasons, why Zig is/can be faster than C is that it uses different default optimization levels. For example, Zig will by default target your native CPU instead of some generic model. This allows it to enable vectorization, SSE/AVX and so on. -- /Jacob Carlborg
Mar 31
parent reply Max Haughton <maxhaton gmail.com> writes:
On Wednesday, 31 March 2021 at 09:47:46 UTC, Jacob Carlborg wrote:
 On Wednesday, 31 March 2021 at 04:49:52 UTC, Andrei 
 Alexandrescu wrote:

 That makes their umbrella claim "Zig is faster than C" quite 
 specious.
The reason, or one of the reasons, why Zig is/can be faster than C is that it uses different default optimization levels. For example, Zig will by default target your native CPU instead of some generic model. This allows it to enable vectorization, SSE/AVX and so on. -- /Jacob Carlborg
Specific Example? GCC and LLVM are both almost rabid when you turn the vectorizer on
Mar 31
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 7:46 AM, Max Haughton wrote:
 On Wednesday, 31 March 2021 at 09:47:46 UTC, Jacob Carlborg wrote:
 On Wednesday, 31 March 2021 at 04:49:52 UTC, Andrei Alexandrescu wrote:

 That makes their umbrella claim "Zig is faster than C" quite specious.
The reason, or one of the reasons, why Zig is/can be faster than C is that it uses different default optimization levels. For example, Zig will by default target your native CPU instead of some generic model. This allows it to enable vectorization, SSE/AVX and so on. -- /Jacob Carlborg
Specific Example? GCC and LLVM are both almost rabid when you turn the vectorizer on
Even if that's the case, "we choose to use by default different flags that make the code more specialized and therefore faster and less portable" can't be a serious basis of a language performance claim.
Mar 31
parent reply Max Haughton <maxhaton gmail.com> writes:
On Wednesday, 31 March 2021 at 12:36:42 UTC, Andrei Alexandrescu 
wrote:
 On 3/31/21 7:46 AM, Max Haughton wrote:
 On Wednesday, 31 March 2021 at 09:47:46 UTC, Jacob Carlborg 
 wrote:
 On Wednesday, 31 March 2021 at 04:49:52 UTC, Andrei 
 Alexandrescu wrote:

 That makes their umbrella claim "Zig is faster than C" quite 
 specious.
The reason, or one of the reasons, why Zig is/can be faster than C is that it uses different default optimization levels. For example, Zig will by default target your native CPU instead of some generic model. This allows it to enable vectorization, SSE/AVX and so on. -- /Jacob Carlborg
Specific Example? GCC and LLVM are both almost rabid when you turn the vectorizer on
Even if that's the case, "we choose to use by default different flags that make the code more specialized and therefore faster and less portable" can't be a serious basis of a language performance claim.
Intel C++ can be a little naughty with the fast math options, last time I checked, for example - gotta get those SPEC numbers!

I wonder if there is a way to leverage D's type system (or even extend it) to allow a library solution that holds information the optimizer can use to elide these checks in most cases. It's probably possible already by passing some kind of abstract-interpretation-like data structure as a template parameter, but this is not very ergonomic.

Standardizing some kind of `assume` semantics strikes me as a good long term hedge for D, even if doing static analysis and formal verification of D code is an unenviable task.
Mar 31
parent Walter Bright <newshound2 digitalmars.com> writes:
On 3/31/2021 7:36 AM, Max Haughton wrote:
 Intel C++ can be a little naughty with the fast math options, last time I 
 checked, for example - gotta get those SPEC numbers!
Benchmarks are always going to be unfair, but it's only reasonable to try and set the switches as close as practical so they are trying to accomplish the same thing.
 Standardizing some kind of `assume` semantics strikes me as a good long term 
 hedge for D, even if doing static analysis and formal verification of D code
is 
 an unenviable task.
Static analysis has limits. For example, I complained to Vladimir that using hardcoded loop limits enabled optimizations not available to the recommended programming practice of not using hardcoded limits.
Mar 31
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2021-03-31 13:46, Max Haughton wrote:

 Specific Example? GCC and LLVM are both almost rabid when you turn the 
 vectorizer on
No, that's why I said "can be". But what I meant is that just running the Zig compiler out of the box might produce better code than Clang because it uses different default optimizations. I mean that is a poor way of claiming Zig is faster than C because it's easy to add a couple of flags to Clang and it will probably be the same speed as Zig. They use the same backend anyway. -- /Jacob Carlborg
Apr 01
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2021 9:08 PM, Andrei Alexandrescu wrote:
 Not much to write home about. The jumps scale linearly with the number of 
 primitive operations:
 
 https://godbolt.org/z/r3sj1T4hc
 
 That's not going to be a speed demon.
The ldc:

        mov     eax, edi
        imul    eax, eax
        add     eax, edi    *
        add     eax, 1      *
        ret

* should be:

        lea     eax,1[eax + edi]

Let's try dmd -O:

__D3lea6squareFiZi:
        mov     EDX,EAX
        imul    EAX,EAX
        lea     EAX,1[EAX][EDX]
        ret

Woo-hoo!
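
For readers following along without the godbolt link: judging from the instruction sequence (and the mangled name, which appears to demangle to `int lea.square(int)`), the function computes x*x + x + 1. The sketch below restates the two code shapes in C. It is only an illustration under that assumption: it uses `uint32_t` so that wraparound is well defined, and `lea3` is a hypothetical helper that models the base + index + displacement sum a single `lea` performs.

```c
#include <assert.h>
#include <stdint.h>

/* Two separate additions, mirroring the ldc output. */
uint32_t square_two_adds(uint32_t x)
{
    uint32_t t = x * x;   /* imul eax, eax */
    t = t + x;            /* add  eax, edi */
    t = t + 1u;           /* add  eax, 1   */
    return t;
}

/* Hypothetical model of lea: base + index + displacement in one step. */
static uint32_t lea3(uint32_t base, uint32_t index, uint32_t disp)
{
    return base + index + disp;
}

/* One combined addition, mirroring the dmd -O output. */
uint32_t square_one_lea(uint32_t x)
{
    return lea3(x * x, x, 1u);   /* lea EAX,1[EAX][EDX] */
}
```

Both forms return the same value for every input; the disagreement in this thread is only about which one the hardware executes faster.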
Mar 30
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 12:59 AM, Walter Bright wrote:
 On 3/30/2021 9:08 PM, Andrei Alexandrescu wrote:
 Not much to write home about. The jumps scale linearly with the number 
 of primitive operations:

 https://godbolt.org/z/r3sj1T4hc

 That's not going to be a speed demon.
 The ldc:

         mov     eax, edi
         imul    eax, eax
         add     eax, edi    *
         add     eax, 1      *
         ret

 * should be:

         lea     eax,1[eax + edi]

 Let's try dmd -O:

 __D3lea6squareFiZi:
         mov     EDX,EAX
         imul    EAX,EAX
         lea     EAX,1[EAX][EDX]
         ret

 Woo-hoo!
Yah, actually gdc uses lea as well: https://godbolt.org/z/Gb6416EKe
Mar 30
prev sibling parent reply Elronnd <elronnd elronnd.net> writes:
On Wednesday, 31 March 2021 at 04:59:08 UTC, Walter Bright wrote:
 * should be:

 	lea    eax,1[eax + edi]
The lea is the exact same length as the sequence of moves, and may be harder to decode. I fail to see how that's a win.
Mar 30
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2021 10:06 PM, Elronnd wrote:
 On Wednesday, 31 March 2021 at 04:59:08 UTC, Walter Bright wrote:
 * should be:

     lea    eax,1[eax + edi]
The lea is the exact same length as the sequence of moves, and may be harder to decode.  I fail to see how that's a win.
It's a win because it uses the address decoder logic which is separate from the arithmetic logic unit. This enables it to be done in parallel with the ALU. Although not relevant for this particular example, it also doesn't need another register for the intermediate value.
Mar 30
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 05:25:48 UTC, Walter Bright wrote:
 It's a win because it uses the address decoder logic which is 
 separate from the arithmetic logic unit. This enables it to be 
 done in parallel with the ALU.
Is this still true for modern CPUs?
 Although not relevant for this particular example, it also 
 doesn't need another register for the intermediate value.
Haven't CPUs used register renaming for a long time now? It's also pretty rare to see x86_64 code that uses all registers.
Mar 30
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 1:30 AM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 05:25:48 UTC, Walter Bright wrote:
 It's a win because it uses the address decoder logic which is separate 
 from the arithmetic logic unit. This enables it to be done in parallel 
 with the ALU.
Is this still true for modern CPUs?
Affirmative if you consider the Nehalem modern: https://en.wikipedia.org/wiki/Address_generation_unit
Mar 30
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 05:41:28 UTC, Andrei Alexandrescu 
wrote:
 Affirmative if you consider the Nehalem modern:
Um, that was released 13 years ago.
 https://en.wikipedia.org/wiki/Address_generation_unit
In the picture it still goes through the instruction decoder first, which means LEA and ADD/SHR might as well get decoded to the same microcode. That's the thing about this whole ordeal, we don't know anything. The only thing we *can* do is benchmark. :)
Mar 30
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 1:46 AM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 05:41:28 UTC, Andrei Alexandrescu wrote:
 Affirmative if you consider the Nehalem modern:
Um, that was released 13 years ago.
It carried over afaik to all subsequent Intel CPUs: https://hexus.net/tech/reviews/cpu/147440-intel-core-i9-11900k/. Sunny Cove actually adds one extra AGU.
 https://en.wikipedia.org/wiki/Address_generation_unit
In the picture it still goes through the instruction decoder first, which means LEA and ADD/SHR might as well get decoded to the same microcode.
That's not the case. It's separate hardware.
 That's the thing about this whole ordeal, we don't know anything. The 
 only thing we *can* do is benchmark. :)
We can Read The Fine Manual.
Mar 30
parent Elronnd <elronnd elronnd.net> writes:
On Wednesday, 31 March 2021 at 05:54:43 UTC, Andrei Alexandrescu 
wrote:
 In the picture it still goes through the instruction decoder 
 first, which means LEA and ADD/SHR might as well get decoded 
 to the same microcode.
That's not the case. It's separate hardware.
Less with the talking, more with the benchmarking! If what you say is true, then a sequence of add interleaved with lea should be faster than the equivalent sequence, but with add replacing the lea. Benchmark code is here https://files.catbox.moe/2zzrwe.tar; on my system, the performance is identical.
Mar 31
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2021 10:30 PM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 05:25:48 UTC, Walter Bright wrote:
 It's a win because it uses the address decoder logic which is separate from 
 the arithmetic logic unit. This enables it to be done in parallel with the ALU.
Is this still true for modern CPUs?
See https://www.agner.org/optimize/optimizing_assembly.pdf page 135.
 Although not relevant for this particular example, it also doesn't need 
 another register for the intermediate value.
Haven't CPUs used register renaming for a long time now? It's also pretty rare to see x86_64 code that uses all registers.
If you use a register that needs to be saved on the stack, it's going to cost.
Mar 30
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 06:34:04 UTC, Walter Bright wrote:
 On 3/30/2021 10:30 PM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 05:25:48 UTC, Walter Bright 
 wrote:
 It's a win because it uses the address decoder logic which is 
 separate from the arithmetic logic unit. This enables it to 
 be done in parallel with the ALU.
Is this still true for modern CPUs?
See https://www.agner.org/optimize/optimizing_assembly.pdf page 135.
Thanks! It also says that LEA may be slower than ADD on some CPUs. I wrote a small benchmark using the assembler code from a few posts ago. It takes the same time on my AMD CPU, but the ADD is indeed slower than the LEA on the old Intel CPU on the server. :) Unfortunately I don't have access to a modern Intel CPU to test.
 Although not relevant for this particular example, it also 
 doesn't need another register for the intermediate value.
Haven't CPUs used register renaming for a long time now? It's also pretty rare to see x86_64 code that uses all registers.
If you use a register that needs to be saved on the stack, it's going to cost.
Sure, but why would you do that? If I'm reading the ABI spec correctly, almost all registers belong to the callee, and don't need to be saved/restored, and there's probably little reason to call a function in the middle of such a computation and therefore save the interim value on the stack.
Mar 30
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2021 11:54 PM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 06:34:04 UTC, Walter Bright wrote:
 On 3/30/2021 10:30 PM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 05:25:48 UTC, Walter Bright wrote:
 It's a win because it uses the address decoder logic which is separate from 
 the arithmetic logic unit. This enables it to be done in parallel with the ALU.
Is this still true for modern CPUs?
See https://www.agner.org/optimize/optimizing_assembly.pdf page 135.
Thanks! It also says that LEA may be slower than ADD on some CPUs.
Slower than ADD, but not slower than multiple ADDs. DMD does not replace a mere ADD with LEA. If you also look at how LEA is used in the various examples of optimized code in the pdf, well, he uses it a lot.
 some CPUs
Code gen is generally targeted at generating code that works well on most machines.
 If you use a register that needs to be saved on the stack, it's going to cost.
Sure, but why would you do that?
To map as many locals into registers as possible.
 If I'm reading the ABI spec correctly, almost 
 all registers belong to the callee, and don't need to be saved/restored, and 
 there's probably little reason to call a function in the middle of such a 
 computation and therefore save the interim value on the stack.
All I can say is code gen is never that simple. There are just too many rules that conflict. The combinatorial explosion means some heuristics are relied on that produce better results most of the time. I suppose a good AI research project would be to train an AI to produce better overall patterns.

But, in general,

1. LEA is faster for more than one operation
2. using fewer registers is better
3. getting locals into registers is better
4. generating fewer instructions is better
5. generating shorter instructions is better
6. jumpless code is better

None of these are *always* true. And Intel/AMD change the rules slightly with every new processor.

As for overflow checks, I am not going to post benchmarks because everyone picks at them. Every benchmark posted here by check proponents shows that overflow checks are slower. The Rust team apparently poured a lot of effort into overflow checks, and ultimately failed, as in the checks are turned off in release code. I don't see much hope in replicating their efforts.

And, once again, I reiterate that D *does* have some overflow checks that are done at compile time (i.e. are free) in the form of integral promotions and Value Range Propagation, neither of which are part of Zig or Rust.
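
As a rough illustration of the integral promotion point, here is a sketch in C rather than D, since C shares the promotion rule. The function names are made up for the example. Adding two 8-bit values is done in `int`, so the sum cannot overflow. The narrowing step back to 8 bits is where information can be lost; D rejects that implicit narrowing at compile time unless Value Range Propagation can prove the value fits, whereas C has no such rule, so the sketch checks the range at run time instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Both operands are promoted to int before the addition, so the sum
   (at most 255 + 255 = 510) cannot overflow. */
int add_bytes(uint8_t a, uint8_t b)
{
    return a + b;
}

/* Hypothetical helper: narrowing back to 8 bits needs an explicit range
   check here, because C will otherwise silently truncate. */
bool narrow_to_byte(int value, uint8_t *out)
{
    if (value < 0 || value > UINT8_MAX)
        return false;
    *out = (uint8_t)value;
    return true;
}
```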
Mar 31
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 07:13:07 UTC, Walter Bright wrote:
 All I can say is code gen is never that simple. There are just 
 too many rules that conflict. The combinatorial explosion means 
 some heuristics are relied on that produce better results most 
 of the time. I suppose a good AI research project would be to 
 train an AI to produce better overall patterns.

 But, in general,

 1. LEA is faster for more than one operation
 2. using fewer registers is better
 3. getting locals into registers is better
 4. generating fewer instructions is better
 5. generating shorter instructions is better
 6. jumpless code is better
Thanks for the insight!

My personal perspective is that:

- Silicon will keep getting faster and cheaper with time

- A 7% or a 14% or even a +100% slowdown is relatively insignificant considering the overall march of progress - Moore's law, but also other factors such as the average size and complexity of programs, which will also keep increasing as people expect software to do more things, which will drown out such "one-time" slowdowns as integer overflow checks

- In the long term, people will invariably prefer programming languages which produce correct results (with less code), over programming languages whose benefit is only that they're faster.

So, it seems to me that Rust made the choice to only enable overflow checks in debug mode in order to be competitive with the programming languages of its time. I think Zig's design is the more future-proof - there will continue to be circumstances in which speed is preferable over correctness, such as video games (where an occasional wrong result is tolerable), so having distinct ReleaseFast and ReleaseSafe modes makes sense.

BTW, another data point along Rust and Zig is of course Python 3, in which all integers are BigInts (but with small numbers inlined in the value, akin to small string optimizations).
Mar 31
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/31/2021 12:31 AM, Vladimir Panteleev wrote:
 - Silicon will keep getting faster and cheaper with time
 
 - A 7% or a 14% or even a +100% slowdown is relatively insignificant
considering 
 the overall march of progress - Moore's law, but also other factors such as
the 
 average size and complexity of programs, which will also keep increasing as 
 people expect software to do more things, which will drown out such "one-time" 
 slowdowns as integer overflow checks
If you're running a data center, 1% translates to millions of dollars.
 - In the long term, people will invariably prefer programming languages which 
 produce correct results (with less code), over programming languages whose 
 benefit is only that they're faster.
People will prefer what makes them money :-) D's focus is on memory safety, which is far more important than integer overflow.
 So, it seems to me that Rust made the choice to only enable overflow checks in 
 debug mode in order to be competitive with the programming languages of its 
 time. I think Zig's design is the more future-proof - there will continue to
be 
 circumstances in which speed is preferable over correctness, such as video
games 
 (where an occasional wrong result is tolerable), so having distinct
ReleaseFast 
 and ReleaseSafe modes makes sense.
Zig doesn't do much to prevent memory corruption. Memory safety will be the focus of D for the near future.
 BTW, another data point along Rust and Zig is of course Python 3, in which all 
 integers are BigInts (but with small numbers inlined in the value, akin to
small 
 string optimizations).
Python isn't competitive with systems programming languages.
Mar 31
next sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 07:52:31 UTC, Walter Bright wrote:
 On 3/31/2021 12:31 AM, Vladimir Panteleev wrote:
 - Silicon will keep getting faster and cheaper with time
 
 - A 7% or a 14% or even a +100% slowdown is relatively 
 insignificant considering the overall march of progress - 
 Moore's law, but also other factors such as the average size 
 and complexity of programs, which will also keep increasing as 
 people expect software to do more things, which will drown out 
 such "one-time" slowdowns as integer overflow checks
If you're running a data center, 1% translates to millions of dollars.
You would think someone would have told that to all the companies running their services written in Ruby, JavaScript, etc. Unfortunately, that hasn't been the case. What remains the most valuable is 1) time/money not lost due to wrong results / angry customers, and 2) developer time.
 - In the long term, people will invariably prefer programming 
 languages which produce correct results (with less code), over 
 programming languages whose benefit is only that they're 
 faster.
People will prefer what makes them money :-) D's focus is on memory safety, which is far more important than integer overflow.
It most definitely is. But I think sooner or later we will get to a point where memory safety is the norm, and writing code in memory-unsafe languages would be like writing raw assembler today. So, the standard for correctness will be higher.
Mar 31
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 3:58 AM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 07:52:31 UTC, Walter Bright wrote:
 On 3/31/2021 12:31 AM, Vladimir Panteleev wrote:
 - Silicon will keep getting faster and cheaper with time

 - A 7% or a 14% or even a +100% slowdown is relatively insignificant 
 considering the overall march of progress - Moore's law, but also 
 other factors such as the average size and complexity of programs, 
 which will also keep increasing as people expect software to do more 
 things, which will drown out such "one-time" slowdowns as integer 
 overflow checks
If you're running a data center, 1% translates to millions of dollars.
You would think someone would have told that to all the companies running their services written in Ruby, JavaScript, etc.
Funny how things work out isn't it :o).
 Unfortunately, that hasn't been the case.
It is. I know because I collaborated with the provisioning team at Facebook.
Mar 31
parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 12:38:51 UTC, Andrei Alexandrescu 
wrote:
 You would think someone would have told that to all the 
 companies running their services written in Ruby, JavaScript, 
 etc.
Funny how things work out isn't it :o).
 Unfortunately, that hasn't been the case.
It is. I know because I collaborated with the provisioning team at Facebook.
I don't understand what you mean by this. Do you and Facebook have a plan to forbid the entire world from running Ruby, JavaScript etc. en masse on datacenters?
Mar 31
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 8:40 AM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 12:38:51 UTC, Andrei Alexandrescu wrote:
 You would think someone would have told that to all the companies 
 running their services written in Ruby, JavaScript, etc.
Funny how things work out isn't it :o).
 Unfortunately, that hasn't been the case.
It is. I know because I collaborated with the provisioning team at Facebook.
I don't understand what you mean by this. Do you and Facebook have a plan to forbid the entire world from running Ruby, JavaScript etc. en masse on datacenters?
Using languages has to take important human factors into account, e.g. Facebook could not realistically switch from PHP/Hack to C++ in the front end (though the notion does come up time and again). It is factually true that to a large server farm performance percentages translate into millions.
Mar 31
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 3:52 AM, Walter Bright wrote:
 On 3/31/2021 12:31 AM, Vladimir Panteleev wrote:
 - Silicon will keep getting faster and cheaper with time

 - A 7% or a 14% or even a +100% slowdown is relatively insignificant 
 considering the overall march of progress - Moore's law, but also 
 other factors such as the average size and complexity of programs, 
 which will also keep increasing as people expect software to do more 
 things, which will drown out such "one-time" slowdowns as integer 
 overflow checks
If you're running a data center, 1% translates to millions of dollars.
Factually true. Millions of dollars a year that is. It's all about the clientele. There will always be companies that must get every bit of performance. Weka.IO must be fastest. If they were within 15% of the fastest, they'd be out of business.
Mar 31
prev sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 04:08:02 UTC, Andrei Alexandrescu 
wrote:
 Not much to write home about. The jumps scale linearly with the 
 number of primitive operations:

 https://godbolt.org/z/r3sj1T4hc
Right, but as we both know, speed doesn't necessarily scale with the number of instructions for many decades now.

Curiosity got the better of me and I played with this for a bit. Here is my program:

https://dump.cy.md/d7b7ae5c2d15c8c0127fd96dd74909a1/main.zig

Two interesting observations:

1. The compiler (whether it's the Zig frontend or the LLVM backend) is smart about adding the checks. If it can prove that the values will never overflow, then the overflow checks aren't emitted. I had to trick it into thinking that they may overflow, when in practice they never will.

1b. The compiler is actually that aware of the checks, that in one of my attempts to get it to always emit them, it actually generated a version of the function with and without the checks, and called the unchecked version in the case where it knew that it will never overflow! Amazing!

2. After finally getting it to always generate the checks, and benchmarking the results, the difference in run time I'm seeing between ReleaseFast and ReleaseSafe is a measly 2.7%. The disassembly looks all right too: https://godbolt.org/z/3nY7Ee4ff

Personally, 2.7% is a price I'm willing to pay any day, if it helps save me from embarrassments like https://github.com/CyberShadow/btdu/issues/1 :)
Mar 30
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2021 10:16 PM, Vladimir Panteleev wrote:
 1. The compiler (whether it's the Zig frontend or the LLVM backend) is smart 
 about adding the checks. If it can prove that the values will never overflow, 
 then the overflow checks aren't emitted. I had to trick it into thinking that 
 they may overflow, when in practice they never will.
The code uses hardcoded loop limits. Yes, the compiler can infer no overflow by knowing the limits of the value. In my experience, I rarely loop for a hardcoded number of times.
Mar 30
parent Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 31 March 2021 at 05:32:16 UTC, Walter Bright wrote:
 On 3/30/2021 10:16 PM, Vladimir Panteleev wrote:
 1. The compiler (whether it's the Zig frontend or the LLVM 
 backend) is smart about adding the checks. If it can prove 
 that the values will never overflow, then the overflow checks 
 aren't emitted. I had to trick it into thinking that they may 
 overflow, when in practice they never will.
The code uses hardcoded loop limits. Yes, the compiler can infer no overflow by knowing the limits of the value. In my experience, I rarely loop for a hardcoded number of times.
Well, this is fake artificial code, and looping a fixed number of times is just one aspect of its fakeness. If you do loop an unpredictable number of times in your real program, then you almost certainly do need the overflow check, so emitting it would be the right thing to do there :)
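
A small sketch of the "unpredictable number of times" case, in C and assuming the GCC/Clang builtin `__builtin_add_overflow`; `checked_sum` is a made-up name and this is not the benchmark program linked above. Because the element count and the values are only known at run time, the compiler has no range information to prove the total fits, so the check has to stay in the loop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: sums n values whose count and contents are only
   known at run time. Returns false as soon as the running total would
   overflow an int. */
bool checked_sum(const int *values, size_t n, int *total)
{
    int acc = 0;
    for (size_t i = 0; i < n; ++i) {
        /* Stores the wrapped sum in acc and returns true on overflow. */
        if (__builtin_add_overflow(acc, values[i], &acc))
            return false;
    }
    *total = acc;
    return true;
}
```

With a hardcoded trip count and small constant inputs, an optimizer may be able to prove the check never fires and drop it, which is the effect described in observation 1.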
Mar 30
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 1:16 AM, Vladimir Panteleev wrote:
 On Wednesday, 31 March 2021 at 04:08:02 UTC, Andrei Alexandrescu wrote:
 Not much to write home about. The jumps scale linearly with the number 
 of primitive operations:

 https://godbolt.org/z/r3sj1T4hc
Right, but as we both know, speed doesn't necessarily scale with the number of instructions for many decades now.
Of course, and I wasn't suggesting the contrary. If speed simply increased by decreasing instructions retired, inliners would be much more aggressive etc. etc. But such statements need to be carefully qualified, which is why I do my best to not make them in isolation. The qualification here would be... "except most of the cases when it does". Instructions retired is generally a telling proxy.
 Curiosity got the better of me and I played with this for a bit.
 
 Here is my program:
 
 https://dump.cy.md/d7b7ae5c2d15c8c0127fd96dd74909a1/main.zig
 
 Two interesting observations:
 
 1. The compiler (whether it's the Zig frontend or the LLVM backend) is 
 smart about adding the checks. If it can prove that the values will 
 never overflow, then the overflow checks aren't emitted. I had to trick 
 it into thinking that they may overflow, when in practice they never will.
 
 1b. The compiler is actually that aware of the checks, that in one of my 
 attempts to get it to always emit them, it actually generated a version 
 of the function with and without the checks, and called the unchecked 
 version in the case where it knew that it will never overflow! Amazing!
 
 2. After finally getting it to always generate the checks, and 
 benchmarking the results, the difference in run time I'm seeing between 
 ReleaseFast and ReleaseSafe is a measly 2.7%. The disassembly looks all 
 right too: https://godbolt.org/z/3nY7Ee4ff
 
 Personally, 2.7% is a price I'm willing to pay any day, if it helps save 
 me from embarrassments like https://github.com/CyberShadow/btdu/issues/1 :)
That's in line with expectations for a small benchmark. On larger applications the impact of bigger code on the instruction cache would be more detrimental. (Also the branch predictor is a limited resource so more jumps means decreased predictability of others; not sure how that compares in magnitude with the impact on instruction cache, which is a larger and more common problem.)
Mar 30
prev sibling parent reply tsbockman <thomas.bockman gmail.com> writes:
On Wednesday, 31 March 2021 at 03:30:00 UTC, Andrei Alexandrescu 
wrote:
 On 3/30/21 1:09 PM, tsbockman wrote:
 So you're now dismissing Zig as slow because its feature set 
 surprised you? No real-world data is necessary? No need to 
 understand any of Zig's relevant optimizations or options?
Instead of passing the burden of proof back and forth, some evidence would be welcome.
I already posted both some Zig benchmark results of my own, and some C/C++ results from the paper you linked earlier. You just missed them, I guess: https://forum.dlang.org/post/ghcnkevthguciupexeyu forum.dlang.org https://forum.dlang.org/post/rnotyrxmczbdvxtalarf forum.dlang.org Oversimplified: the extra time required in these tests ranged from less than 0.1% up to 14%, depending on the application. Also, the Zig checked binaries are actually slightly smaller than the unchecked binaries for some reason.
Mar 30
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 12:11 AM, tsbockman wrote:
 On Wednesday, 31 March 2021 at 03:30:00 UTC, Andrei Alexandrescu wrote:
 On 3/30/21 1:09 PM, tsbockman wrote:
 So you're now dismissing Zig as slow because its feature set 
 surprised you? No real-world data is necessary? No need to understand 
 any of Zig's relevant optimizations or options?
Instead of passing the burden of proof back and forth, some evidence would be welcome.
I already posted both some Zig benchmark results of my own, and some C/C++ results from the paper you linked earlier. You just missed them, I guess: https://forum.dlang.org/post/ghcnkevthguciupexeyu forum.dlang.org https://forum.dlang.org/post/rnotyrxmczbdvxtalarf forum.dlang.org Oversimplified: the extra time required in these tests ranged from less than 0.1% up to 14%, depending on the application.
Thanks. This is in line with expectations.
 Also, the Zig checked binaries are actually slightly smaller than the 
 unchecked binaries for some reason.
That's surprising, so some investigation would be in order. From what I tried on godbolt the generated code is strictly larger if it uses checks.

FWIW I just tested -fwrapv and -ftrapv. The former does nothing discernible:

https://godbolt.org/z/ErMoeKnxK

The latter generates one function call per primitive operation, which is sure to not win any contests:

https://godbolt.org/z/ahErY3zKn
Mar 30
next sibling parent tsbockman <thomas.bockman gmail.com> writes:
On Wednesday, 31 March 2021 at 04:26:28 UTC, Andrei Alexandrescu 
wrote:
 FWIW I just tested -fwrapv and -ftrapv. The former does nothing 
 discernible:
-fwrapv isn't supposed to do anything discernible; it just prevents the compiler from taking advantage of otherwise undefined behavior: "Instructs the compiler to assume that signed arithmetic overflow of addition, subtraction, and multiplication, wraps using two's-complement representation." https://www.keil.com/support/man/docs/armclang_ref/armclang_ref_sam1465487496421.htm
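
A classic way to see what "taking advantage of otherwise undefined behavior" means is the comparison below. This is a sketch with made-up function names, not code from the godbolt links. Without -fwrapv, signed overflow is undefined, so an optimizer may assume `x + 1 > x` always holds and fold the signed version to a constant. With -fwrapv, or with unsigned operands, the wraparound is defined and the comparison has to be evaluated.

```c
#include <assert.h>
#include <limits.h>

/* Signed: overflow is undefined without -fwrapv, so a compiler may fold
   this to "return 1". With -fwrapv, INT_MAX + 1 wraps to INT_MIN and the
   function returns 0 for x == INT_MAX. */
int next_is_greater_signed(int x)
{
    return x + 1 > x;
}

/* Unsigned: wraparound is always defined, so this must return 0 for
   x == UINT_MAX regardless of compiler flags. */
int next_is_greater_unsigned(unsigned x)
{
    return x + 1u > x;
}
```

So the flag mostly shows up as optimizations that do not happen, which is consistent with seeing no difference on a function where the compiler had nothing to exploit.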
Mar 30
prev sibling parent tsbockman <thomas.bockman gmail.com> writes:
On Wednesday, 31 March 2021 at 04:26:28 UTC, Andrei Alexandrescu 
wrote:
 On 3/31/21 12:11 AM, tsbockman wrote:
 Also, the Zig checked binaries are actually slightly smaller 
 than the unchecked binaries for some reason.
That's surprising so some investigation would be in order. From what I tried on godbolt the generated code is strictly larger if it uses checks.
Perhaps the additional runtime validation is causing reduced inlining in some cases? The test program I used has almost 300 KiB of source code, so it may be hard to reproduce the effect with toy programs on godbolt.
Mar 30
prev sibling next sibling parent sighoya <sighoya gmail.com> writes:
On Saturday, 27 March 2021 at 03:25:04 UTC, Walter Bright wrote:

 4. fast integer arithmetic is fundamental to fast code, not a 
 mere micro-optimization. Who wants an overflow check on every 
 pointer increment?
The point is that the overflow check is already done by most CPUs, regardless of whether overflow is handled by the language or not. Unfortunately, such CPUs don't raise an interrupt, so we have to check for overflow twice. The best option is of course for the language itself to handle safe arithmetic; however, this requires full semantic support in the type system. What about providing two operators for integer arithmetic instead, one safe and one unsafe?
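As a rough illustration of the "one safe, one unsafe" split, here is a minimal sketch in Rust, which exposes the distinction through standard-library methods rather than separate operators. The wrapper names `add_checked`, `add_wrapping`, and `add_with_flag` are hypothetical and exist only for this example; `checked_add`, `wrapping_add`, and `overflowing_add` are the actual standard methods on `i32`. This is not a proposal for D syntax.

```rust
/// "Safe" addition: yields `None` on overflow instead of a wrapped value.
pub fn add_checked(a: i32, b: i32) -> Option<i32> {
    a.checked_add(b)
}

/// "Unsafe" addition in the sense used above: wraps silently using
/// two's-complement arithmetic and never reports anything.
pub fn add_wrapping(a: i32, b: i32) -> i32 {
    a.wrapping_add(b)
}

/// Wrapped result plus a flag saying whether overflow happened. This is the
/// closest analogue to reading the CPU's overflow flag after the operation.
pub fn add_with_flag(a: i32, b: i32) -> (i32, bool) {
    a.overflowing_add(b)
}
```

With two spellings available, the caller picks the semantics at each call site, so the choice is visible in the source rather than implied by a compiler switch.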
Mar 29
prev sibling parent reply Elronnd <elronnd elronnd.net> writes:
On Saturday, 27 March 2021 at 03:25:04 UTC, Walter Bright wrote:
 4. fast integer arithmetic is fundamental to fast code, not a 
 mere micro-optimization. Who wants an overflow check on every 
 pointer increment?
Dan Luu measures overflow checks as having an overall 1% performance impact for numeric-heavy C code (https://danluu.com/integer-overflow/). The code size impact is also very small, ~3%. This isn't 'speculation', it's actual measurement. 'lea' is a micro-optimization; it doesn't 'significantly' improve performance. Yes, mul is slow, but lea can be trivially replaced by the equivalent sequence of shifts and adds with very little penalty. Why is this being seriously discussed as a performance pitfall?
Mar 30
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 3/31/21 12:28 AM, Elronnd wrote:
 On Saturday, 27 March 2021 at 03:25:04 UTC, Walter Bright wrote:
 4. fast integer arithmetic is fundamental to fast code, not a mere 
 micro-optimization. Who wants an overflow check on every pointer 
 increment?
Dan Luu measures overflow checks as having an overall 1% performance impact for numeric-heavy c code. (https://danluu.com/integer-overflow/).  The code size impact is also very small, ~3%. This isn't 'speculation', it's actual measurement.
Bit surprised about how you put it. Are you sure you represent what the article says? I skimmed the article just now and... 1% is literally not found in the text, and 3% is a "guesstimate" per the author. The actual measurements show much larger margins, closer to tsbockman's.
Mar 30
parent Elronnd <elronnd elronnd.net> writes:
On Wednesday, 31 March 2021 at 05:08:29 UTC, Andrei Alexandrescu 
wrote:
 I skimmed the article just now and... 1% is literally not found 
 in the text, and 3% is a "guesstimate" per the author.
Look at the 'fsan ud' row of the only table. 1% is the performance penalty for 'zip', and 0% penalty for 'unzip'. 3% codesize I got from looking at binaries on my own system. I actually forgot that that article talks about codesize at all.
Mar 31
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 3/30/2021 9:28 PM, Elronnd wrote:
 Why is this being seriously discussed as a performance pitfall?
1% is a serious improvement. If it wasn't, why would Rust (for example) who likely tried harder than anyone to make it work, still disable it for release code?
Mar 31
parent reply Elronnd <elronnd elronnd.net> writes:
On Thursday, 1 April 2021 at 00:50:21 UTC, Walter Bright wrote:
 On 3/30/2021 9:28 PM, Elronnd wrote:
 Why is this being seriously discussed as a performance pitfall?
1% is a serious improvement. If it wasn't, why would Rust (for example) who likely tried harder than anyone to make it work, still disable it for release code?
That's an appeal to authority. You haven't actually justified their choice. (Nor, for that matter, have you justified that 1% is a serious performance improvement.)
Apr 01
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 4/1/2021 10:59 PM, Elronnd wrote:
 On Thursday, 1 April 2021 at 00:50:21 UTC, Walter Bright wrote:
 On 3/30/2021 9:28 PM, Elronnd wrote:
 Why is this being seriously discussed as a performance pitfall?
1% is a serious improvement. If it wasn't, why would Rust (for example) who likely tried harder than anyone to make it work, still disable it for release code?
That's an appeal to authority.  You haven't actually justified their choice. (Nor, for that matter, have you justified that 1% is a serious performance improvement.)
That's backwards. You want other people to invest in this technology, you need to justify it. I've been in this business for 40 years. I know for a fact that if you're trying to sell a high performance language that is inherently slower than the current one they're using, you've got a huge problem. Having written high performance apps myself, I'll take 1%. I work on getting a lot smaller gains than that, because they add up. As for Appeal to Authority, there is more nuance to it than one might think: "Exception: Be very careful not to confuse "deferring to an authority on the issue" with the appeal to authority fallacy. Remember, a fallacy is an error in reasoning. Dismissing the council of legitimate experts and authorities turns good skepticism into denialism. The appeal to authority is a fallacy in argumentation, but deferring to an authority is a reliable heuristic that we all use virtually every day on issues of relatively little importance. There is always a chance that any authority can be wrong, that’s why the critical thinker accepts facts provisionally. It is not at all unreasonable (or an error in reasoning) to accept information as provisionally true by credible authorities. Of course, the reasonableness is moderated by the claim being made (i.e., how extraordinary, how important) and the authority (how credible, how relevant to the claim)." https://www.logicallyfallacious.com/logicalfallacies/Appeal-to-Authority --- I'm not preventing anyone from adding integer overflow detection to D. Feel free to make a prototype and we can all evaluate it.
Apr 02
parent reply Guillaume Piolat <first.name spam.org> writes:
On Friday, 2 April 2021 at 20:56:04 UTC, Walter Bright wrote:
 I'm not preventing anyone from adding integer overflow 
 detection to D. Feel free to make a prototype and we can all 
 evaluate it.
Seems to be a bit like bounds checks (with less obvious benefits): it could be made the default but disabled in -b release-nobounds. Even though they are opt-out, bounds checks are annoying in D because with DUB you typically profile a program built with dub -b release-debug, and that _includes_ bounds checks! So I routinely profile programs that aren't like the actual output. So integer overflow checks would, in practice, further hinder the ability to profile programs.
Apr 03
parent John Colvin <john.loughran.colvin gmail.com> writes:
On Saturday, 3 April 2021 at 09:09:33 UTC, Guillaume Piolat wrote:
 On Friday, 2 April 2021 at 20:56:04 UTC, Walter Bright wrote:
 I'm not preventing anyone from adding integer overflow 
 detection to D. Feel free to make a prototype and we can all 
 evaluate it.
Seems to be a bit like bounds checks (less obvious benefits), it could be made default but disabled in -b release-nobounds Even while being opt-out, bounds check are annoying in D because with DUB you typically profile a program built with dub -b release-debug and that _includes_ bounds checks! So I routinely profile programs that aren't like the actual output. So, integer overflow checks would - in practice - further hinder capacity to profile programs.
It’s not like bounds checks because there’s loads of code out there that correctly uses overflow. It’s a significant breaking change to turn that switch on, not just a “would you like to trade some speed for safety” like bounds-checks are. That’s not to say it shouldn’t be done. I’m just pointing out that it’s very different.
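A typical example of code that relies on wraparound is a multiplicative hash. The sketch below is an illustration chosen for this note, not something taken from the thread: a 32-bit FNV-1a hash written in Rust, where the intentional wrap has to be spelled out with `wrapping_mul`. The function name `fnv1a_32` is hypothetical; the two constants are the standard FNV-1a 32-bit offset basis and prime.

```rust
/// 32-bit FNV-1a hash of a byte slice.
/// The multiplication is meant to wrap modulo 2^32; that is part of the
/// algorithm, not a bug, so it is written as `wrapping_mul`.
pub fn fnv1a_32(data: &[u8]) -> u32 {
    const OFFSET_BASIS: u32 = 0x811c_9dc5;
    const PRIME: u32 = 0x0100_0193;

    let mut hash = OFFSET_BASIS;
    for &byte in data {
        hash ^= u32::from(byte);
        // Overflows for almost every input; wrapping is the intended result.
        hash = hash.wrapping_mul(PRIME);
    }
    hash
}
```

Code of this shape written with a plain `*` on an unsigned type is correct under wrapping semantics, and it is exactly what would start failing if overflow checks were switched on globally.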
Apr 03
prev sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 24 March 2021 at 20:28:39 UTC, tsbockman wrote:
 [snip]
 TLDR; What you're really asking for is impossible in D2. It 
 would require massive breaking changes to the language to 
 implement without undermining the guarantees that a checked 
 integer type exists to provide.
Are you familiar with how Zig handles overflow [1]? They error on overflow by default, but they have additional functions and operators to handle when you want to do wraparound. Nevertheless, I agree that the ship has sailed for D2 on this. [1] https://ziglang.org/documentation/master/#Integer-Overflow
Mar 27
parent tsbockman <thomas.bockman gmail.com> writes:
On Saturday, 27 March 2021 at 21:02:39 UTC, jmh530 wrote:
 Are you familiar with how Zig handles overflow [1]? They error 
 on overflow by default, but they have additional functions and 
 operators to handle when you want to do wraparound.
Thanks for the link; I hadn't seen Zig's take before. It agrees with my conclusions from developing checkedint: assume the user wants normal integer math by default, signal an error somehow when it fails, and wrap overflow only when this is explicitly requested. It's not just about reliability vs. performance, it is about making the intended semantics of the code clear:
0) Is overflow wrapped on purpose?
1) Did the programmer somehow prove that overflow cannot occur for all valid inputs?
2) Was the programmer desperate enough for speed to knowingly write incorrect code?
3) Was the programmer simply ignorant or forgetful of this problem?
4) Did the programmer willfully ignore overflow because it is "not the cause of enough problems to be that concerning"?
Most code written in C/D/etc. leaves the answer to this question a mystery for the reader to puzzle out. In contrast, code written using a system like Zig's is far less likely to confuse or mislead the reader.
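To make the "signal an error somehow when it fails" part concrete, here is a minimal sketch in Rust (not Zig, and not the checkedint API). It computes an allocation size, a calculation where a silently wrapped result could later turn into a memory-safety problem. The names `buffer_size` and `SizeOverflow` are hypothetical and used only for this example; `checked_mul` and `checked_add` are the standard methods on `usize`.

```rust
/// Error returned when the requested size does not fit in `usize`.
#[derive(Debug, PartialEq)]
pub struct SizeOverflow;

/// Computes `count * elem_size + header`, reporting overflow as an error
/// instead of returning a wrapped (too small) size.
pub fn buffer_size(
    count: usize,
    elem_size: usize,
    header: usize,
) -> Result<usize, SizeOverflow> {
    count
        .checked_mul(elem_size)
        .and_then(|body| body.checked_add(header))
        .ok_or(SizeOverflow)
}
```

A reader of this function does not have to guess which of the questions above applies: overflow is neither wrapped on purpose nor ignored, it is reported to the caller.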
 Nevertheless, I agree that the ship has sailed for D2 on this.
Yes.
 [1] https://ziglang.org/documentation/master/#Integer-Overflow
Mar 27
prev sibling parent reply Berni44 <someone somemail.com> writes:
On Tuesday, 23 March 2021 at 21:22:18 UTC, Walter Bright wrote:
 It's been there long enough.
Isn't that true by now for everything in std.experimental? I ask because I've got the feeling that std.experimental doesn't work as expected. To me it looks more or less like an attic, where stuff is put and then forgotten. Maybe the approach we used for sumtype is the better one...
Mar 24
parent reply Q. Schroll <qs.il.paperinik gmail.com> writes:
On Wednesday, 24 March 2021 at 11:20:52 UTC, Berni44 wrote:
 On Tuesday, 23 March 2021 at 21:22:18 UTC, Walter Bright wrote:
 It's been there long enough.
Isn't that true meanwhile for everything in std.experimental? I ask, because I've got the feeling, that std.experimental doesn't work as expected. For me it looks more or less like an attic, where stuff is put and then forgotten. Maybe the way we used for sumtype is the better approach...
I have no idea why std.experimental is a thing to begin with. It sounds like a bad idea and it turned out to be one. Moving stuff around in a standard library isn't without some disadvantages: The public import stays as a historic artifact or deprecation is needed, both things that should be avoided. There are cases where it's fine, like splitting a module into a package. A standard library is something expected to be particularly well done and stable. Having experimental stuff in it is an oxymoron. DUB packages that are "featured" is a way better approach. If deemed worth it (like sumtype), they can be incorporated into Phobos. We may even introduce a "Phobos candidate" tag. Additionally, that establishes DUB as a core part of the D ecosystem. Can std.experimental packages be removed without deprecation? The worst offender is std.experimental.typecons; while I don't really understand the purpose of (un-)wrap, I know enough of Final to be sure it's the kind of thing that must be a language feature or it cannot possibly live up to users' expectations. Final cannot work properly as a library solution. (I can elaborate if needed.) I tried fixing it until I realized it's impossible because its design goal is unsound. I honestly cannot imagine anyone who uses it. It is cumbersome and has zero advantages.
Mar 24
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 3/24/21 10:57 AM, Q. Schroll wrote:
 On Wednesday, 24 March 2021 at 11:20:52 UTC, Berni44 wrote:
 On Tuesday, 23 March 2021 at 21:22:18 UTC, Walter Bright wrote:
 It's been there long enough.
Isn't that true meanwhile for everything in std.experimental? I ask, because I've got the feeling, that std.experimental doesn't work as expected. For me it looks more or less like an attic, where stuff is put and then forgotten. Maybe the way we used for sumtype is the better approach...
I have no idea why std.experimental is a thing to begin with. It sounds like a bad idea and it turned out to be one. Moving stuff around in a standard library isn't without some disadvantages: The public import stays as an historic artifact or deprecation is needed, both things that should be avoided. There are cases where it's fine like splitting a module into a package.
It's there because we wanted a place for new parts of phobos to develop without becoming set in stone. The reason it's called "std.experimental" is to disclose explicitly that it is meant to be experimental, subject to breaking changes. Otherwise you get things like javax. In practice, it turned out not as helpful as originally planned, which is why we haven't put anything new in it for a long long time. Take for instance std.experimental.allocator. At one point, a fundamental design change happened (which is perfectly allowed). But of course, code had depended on it, and now was broken. So stdx.allocator was born (see https://code.dlang.org/packages/stdx-allocator) to allow depending on specific versions of std.experimental.allocator without having to freeze yourself at a specific Phobos version. It's important to note that std.experimental predates code.dlang.org, which I think is the better way to develop libraries that might become included into phobos (see for instance std.sumtype).
 
 Can std.experimental packages be removed without deprecation?
In a word, yes. It's experimental, anything is possible. I would recommend we deprecate-remove everything in it into dub packages, or promote them to full-fledged Phobos packages. -Steve
Mar 29
parent Guillaume Piolat <first.name spam.org> writes:
On Monday, 29 March 2021 at 14:47:15 UTC, Steven Schveighoffer 
wrote:
 It's important to note that std.experimental predates 
 code.dlang.org, which I think is the better way to develop 
 libraries that might become included into phobos (see for 
 instance std.sumtype).
I was intrigued and dug into a bit of forum history:
- the idea for std.experimental was out there in 2011 (!)
- debates about its merit vs. popularity on code.dlang.org happened in 2014
- the first module accepted was std.experimental.logger, in 2015, after an unusually long review time (and after being a DUB package for a while)
- followed by std.experimental.allocator in 2015
- std.experimental.checkedint was added in Apr 2017, at the same time std.experimental.ndslice was removed from Phobos. Development continues on DUB.
- the stdx-allocator DUB package was created in Nov 2017. Today it has a 4.2 score on DUB.
Mar 29