digitalmars.D.learn - Assert and undefined behavior

John Burton (30/30) Oct 11 2017 The spec says this :-

rikki cattermole (9/43) Oct 11 2017 You misinterpreted it.
user1234 (6/10) Oct 11 2017 Yes, that's the way of doing. assert() are just used to test the

Eduard Staniloiu (8/19) Oct 11 2017 A small addition to the answers already provided.

Jonathan M Davis (45/75) Oct 11 2017 If your assertions are failing, you're screwed anyway. If an assertion
Ali Çehreli (34/41) Oct 11 2017 The important part is that the "program" has undefined behavior
Timon Gehr (8/27) Oct 12 2017 Yes, that's what it is saying. (The other answers, that say or try to

John Burton (32/41) Oct 12 2017 This is an example of what I mean :-

kdevel (42/45) Oct 12 2017 In the context of ISO-C++ it is meaningless to reason about the

Jonathan M Davis (17/62) Oct 12 2017 assert(false) is a bit special in that it's never removed (it becomes a ...

kdevel (18/32) Oct 12 2017 Confirmed. I should have written something like this instead:

Jonathan M Davis (42/78) Oct 12 2017 If assertions are compiled in (which they are if you're not compiling wi...

kdevel (48/70) Oct 13 2017 Thanks for the clarification! This is a difference to C where

Jonathan M Davis (35/76) Oct 13 2017 Essentially, though talking about conforming usually has to do with spec...

Jesse Phillips (13/22) Oct 13 2017 Yeah the C/C++ community/haters love to talk about all the code

Jonathan M Davis (4/8) Oct 14 2017 +1
Timon Gehr (13/43) Oct 14 2017 The compiler can easily prove that the value of data.length does not

kdevel (15/18) Oct 14 2017 This confuses different levels of reasoning. In C/C++ "undefined

Timon Gehr (11/34) Oct 14 2017 It's a correct statement about the semantics of programs produced from

Jesse Phillips (32/41) Oct 17 2017 You are right, in this example proving that there is no change

John Burton <john.burton jbmail.com> writes:

The spec says this :-

"As a contract, an assert represents a guarantee that the code 
must uphold. Any failure of this expression represents a logic 
error in the code that must be fixed in the source code. A 
program for which the assert contract is false is, by definition, 
invalid, and therefore has undefined behaviour."

Now I worry about the words "undefined behavior" because in C++ 
compiler writers seem to have decided that these words mean that 
it's ok for the compiler to generate code to do whatever it feels 
like even in unconnected code and even before the undefined 
behavior is invoked because some subsequent code has undefined 
behavior.

From my C++ experience this paragraph tells me that if I use 
"assert" to check my assumptions, and the assertion is false, 
then this could lead to my program failing in unpredictable ways 
unconnected with the actual assertion.

I therefore feel like I ought to not use assert and should 
instead validate my assumptions with an if statement and a throw 
or exit or something.

I feel like a failing assertion should not cause "undefined 
behavior" in the sense it is commonly used in C++ programming 
these days but should have exactly defined behavior that it will 
do nothing if the assert passes and throw the specified exception 
if it fails. Can I safely assume this despite the wording?

I know this might seem like a small or pedantic point, but C++ 
compilers can and do use invoking undefined behavior as an excuse 
to do all kinds of unexpected things in generated code these days 
and I want to write safe code :) I feel that if D is specified in 
the same way then assert is not safe for me to use in a real 
program.

Oct 11 2017

rikki cattermole <rikki cattermole.co.nz> writes:

On 11/10/2017 10:27 AM, John Burton wrote:
 The spec says this :-
 
 "As a contract, an assert represents a guarantee that the code must 
 uphold. Any failure of this expression represents a logic error in the 
 code that must be fixed in the source code. A program for which the 
 assert contract is false is, by definition, invalid, and therefore has 
 undefined behaviour."
 
 Now I worry about the words "undefined behavior" because in C++ compiler 
 writers seem to have decided that these words mean that it's ok for the 
 compiler to generate code to do whatever it feels like even in 
 unconnected code and even before the undefined behavior is invoked 
 because some subsequent code has undefined behavior.
 
  From my C++ experience this paragraph tells me that if I use "assert" 
 to check my assumptions, and the assertion is false, then this could 
 lead to my program failing in unpredictable ways unconnected with the 
 actual assertion.
 
 I therefore feel like I ought to not use assert and should instead 
 validate my assumptions with an if statement and a throw or exit or 
 something.
 
 I feel like a failing assertion should not cause "undefined behavior" in 
 the sense it is commonly used in C++ programming these days but should 
 have exactly defined behavior that it will do nothing if the assert 
 passes and throw the specified exception if it fails. Can I safely 
 assume this despite the wording?
 
 I know this might seem like a small or pedantic point, but C++ compilers 
 can and do use invoking undefined behavior as an excuse to do all kinds 
 of unexpected things in generated code these days and I want to write 
 safe code :) I feel that if D is specified in the same way then assert 
 is not safe for me to use in a real program.

You misinterpreted it.

The program /could/ be in an invalid state, because an internal state 
check (assert) says that it isn't valid.

What the compiler generates for the assert depends on the platform 
and on whether it is building with optimizations. But all of them will 
end in the process crashing unless you go out of your way to handle it.

By default it throws an Error in debug mode, which you shouldn't be 
catching since it is an Error and not an Exception anyway.

Oct 11 2017

user1234 <user1234 12.sp> writes:

On Wednesday, 11 October 2017 at 09:27:49 UTC, John Burton wrote:
 [...]
 I therefore feel like I ought to not use assert and should 
 instead validate my assumptions with an if statement and a 
 throw or exit or something.

Yes, that's the way to do it. assert() is only used to test the 
program. The -release option in DMD disables all assert()s 
(except assert(0), which is a bit special), so in a release 
version only Throwable objects can be used once a failure has been 
detected.

Oct 11 2017

Eduard Staniloiu <edi33416 gmail.com> writes:

On Wednesday, 11 October 2017 at 09:39:04 UTC, user1234 wrote:
 On Wednesday, 11 October 2017 at 09:27:49 UTC, John Burton 
 wrote:
 [...]
 I therefore feel like I ought to not use assert and should 
 instead validate my assumptions with an if statement and a 
 throw or exit or something.

 Yes, that's the way to do it. assert() is only used to test 
 the program. The -release option in DMD disables all 
 assert()s (except assert(0), which is a bit special), so in a 
 release version only Throwable objects can be used once a 
 failure has been detected.

A small addition to the answers already provided.

As user1234 has already said, asserts are removed in -release 
builds, so if you have to validate some assumption (e.g. that a 
file was opened successfully), you should use enforce[0].

Cheers,
Eduard

[0] - https://dlang.org/library/std/exception/enforce.html
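
As a rough illustration of that split (a sketch only: the function names parsePercentage and remaining, and the 0 to 100 range, are made up for this example and do not come from the thread), enforce validates data that arrives from outside the program and stays active with -release, while assert documents what the program's own logic is supposed to guarantee:

```d
import std.conv : to;
import std.exception : enforce;

// Hypothetical helper: parses a percentage typed by a user.
int parsePercentage(string text)
{
    // to!int throws a ConvException on malformed input.
    immutable value = text.to!int;

    // External data: validated at run time, also in -release builds.
    enforce(value >= 0 && value <= 100, "percentage must be between 0 and 100");

    return value;
}

// Hypothetical helper: callers inside the program promise a validated value.
int remaining(int percentage)
{
    // Internal assumption: a failure here is a bug in the program, not bad input.
    assert(percentage >= 0 && percentage <= 100);
    return 100 - percentage;
}

void main()
{
    import std.stdio : writeln;

    writeln(remaining(parsePercentage("42"))); // prints 58
}
```

Passing "101" to parsePercentage throws an Exception from enforce regardless of compiler flags, whereas the assert in remaining is only checked when assertions are compiled in.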

Oct 11 2017

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Wednesday, October 11, 2017 09:27:49 John Burton via Digitalmars-d-learn 
wrote:
 The spec says this :-

 "As a contract, an assert represents a guarantee that the code
 must uphold. Any failure of this expression represents a logic
 error in the code that must be fixed in the source code. A
 program for which the assert contract is false is, by definition,
 invalid, and therefore has undefined behaviour."

 Now I worry about the words "undefined behavior" because in C++
 compiler writers seem to have decided that these words mean that
 it's ok for the compiler to generate code to do whatever it feels
 like even in unconnected code and even before the undefined
 behavior is invoked because some subsequent code has undefined
 behavior.

  From my C++ experience this paragraph tells me that if I use
 "assert" to check my assumptions, and the assertion is false,
 then this could lead to my program failing in unpredictable ways
 unconnected with the actual assertion.

 I therefore feel like I ought to not use assert and should
 instead validate my assumptions with an if statement and a throw
 or exit or something.

 I feel like a failing assertion should not cause "undefined
 behavior" in the sense it is commonly used in C++ programming
 these days but should have exactly defined behavior that it will
 do nothing if the assert passes and throw the specified exception
 if it fails. Can I safely assume this despite the wording?

 I know this might seem like a small or pedantic point, but C++
 compilers can and do use invoking undefined behavior as an excuse
 to do all kinds of unexpected things in generated code these days
 and I want to write safe code :) I feel that if D is specified in
 the same way then assert is not safe for me to use in a real
 program.

If your assertions are failing, you're screwed anyway. If an assertion
fails, then by definition, your program is in an invalid state and who knows
what is going to happen. The whole point of assertions is to catch problems
during development so that you ensure that your code is correct. Not using
them is just making things worse for yourself.

The compiler _may_ use an assertion to inform its code generation by
assuming that the assertion is true (which certainly wouldn't cause you any
problems when not compiling with -release, since a failed assertion would
throw an AssertError and kill your program), and yes, in theory, if you
compile with -release, and an assertion would have failed, and the compiler
did something that assumed that the assertion passed, then maybe things
would be worse, but you're already screwed anyway, because your program is
in an invalid state, because the assertion wasn't true. In reality, I expect
that the compiler does very little at this point to optimize based on
assertions, but any time that it actually does will generally benefit you.

If you're concerned about compiling with -release and having an assertion
that would have failed not result in your program dying like it normally
would, then you can use assert(0), which will be translated to a HLT
instruction with -release. e.g.

if(!cond)
 assert(0);

instead of

assert(cond);

Regardless, you obviously shouldn't be using assertions for anything that
depends on user input. They're for detecting bugs in your program during
development. If you're truly using them for that, then you'll be better off
using them, and not using them is just begging for your program to have more
bugs that you didn't catch.

And the reality of the matter is that if you're looking to avoid undefined
behavior in general, you're screwed. Yes, D is much more likely to define
what should be happening than C/C++ is and thus has less undefined behavior
in general, but not ever allowing for undefined behavior would really harm
performance overall, because it really can help the compiler to optimize
code when it's not forced to do something less efficient just to make it so
that everything is fully defined, and pretty much any language you use is
going to have at least some undefined behavior.

Assertions are a great way to help ensure that you catch bugs in your
program, and if they can help the compiler optimize your code upon occasion,
then all the better (though honestly, I expect that they rarely have any
effect on the optimizer at this point given how little of the work on dmd is
geared towards better optimizing code; some work is done in that area to be
sure, but it's rarely the focus because of how much else needs to be done).

- Jonathan M Davis

Oct 11 2017

Ali Çehreli <acehreli yahoo.com> writes:

On 10/11/2017 02:27 AM, John Burton wrote:
 The spec says this :-

 "As a contract, an assert represents a guarantee that the code must
 uphold. Any failure of this expression represents a logic error in the
 code that must be fixed in the source code. A program for which the
 assert contract is false is, by definition, invalid, and therefore has
 undefined behaviour."

The important part is that the "program" has undefined behavior 
according to *your* definition because you're the one who asserted that 
something should never have happened:

struct Square {
     int side;
     int area;

     invariant() {
         assert(area == side * side); // Could be inside foo()
     }

     void foo() {
     }
}

void main() {
     auto s = Square(1, 10);
     s.foo();
}

So, you think you wrote your program to never break that assertion. So, 
regardless of the reason for the failure (design error, memory 
corruption, hardware error, etc.), the program is outside of its 
well-defined state (by you).

 I know this might seem like a small or pedantic point

Not only this specific point, but almost everything about assertion 
failures is very important and very interesting. For example, according 
to the text you quoted, the code injected by the compiler, the one that 
dumps a backtrace for an Error, should not be executed either. You have 
no guarantee that that code will really dump a backtrace, whether the 
output will be correct, etc. :/ (There have been many, many long 
discussions about these topics on the D newsgroups.)

What gives me comfort is the fact that life is not perfect anyway.[1] 
Things somehow seem to work fine. :)

Ali

[1] Another example is mouse clicks (and screen taps). We have no 
guarantee that we are clicking what we wanted to. Sometimes a new window 
pops up and you click some random button but it works in general.

Oct 11 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 11.10.2017 11:27, John Burton wrote:
 The spec says this :-
 
 "As a contract, an assert represents a guarantee that the code must 
 uphold. Any failure of this expression represents a logic error in the 
 code that must be fixed in the source code. A program for which the 
 assert contract is false is, by definition, invalid, and therefore has 
 undefined behaviour."
 
 Now I worry about the words "undefined behavior" because in C++ compiler 
 writers seem to have decided that these words mean that it's ok for the 
 compiler to generate code to do whatever it feels like even in 
 unconnected code and even before the undefined behavior is invoked 
 because some subsequent code has undefined behavior.
 
 From my C++ experience this paragraph tells me that if I use "assert" 
 to check my assumptions, and the assertion is false, then this could 
 lead to my program failing in unpredictable ways unconnected with the 
 actual assertion.
 

Yes, that's what it is saying. (The other answers, that say or try to 
imply that this is not true or true but not a bad thing, are wrong.)

To make this more obvious, see:

http://forum.dlang.org/post/lrbpvj$mih$1@digitalmars.com

Refer to point 2. The fix is to not use both assert and -release.

However, in practice, I think none of the current compiler 
implementations actually uses assert expressions for optimizations.

Oct 12 2017

John Burton <john.burton jbmail.com> writes:

On Thursday, 12 October 2017 at 14:22:43 UTC, Timon Gehr wrote:
 On 11.10.2017 11:27, John Burton wrote:

 Yes, that's what it is saying. (The other answers, that say or 
 try to imply that this is not true or true but not a bad thing, 
 are wrong.)

 ...
 

 However, in practice, I think none of the current compiler 
 implementations actually uses assert expressions for 
 optimizations.

This is an example of what I mean :-

import std.stdio;

extern void control_nuclear_reactor(int[] data);

void test(int[] data)
{
     if (data.length == 0) {
         writeln("Not enough data!");
     } else {
         control_nuclear_reactor(data);
     }

     assert(data.length > 0);
}

So according to the spec, if data has length zero then the assert 
fails and therefore the code has **undefined behaviour**. What 
this means in practice is that the compiler decides that it 
doesn't matter what code is generated for that case, as it is 
undefined what it is meant to do anyway. So the compiler can 
"optimize" out the if condition, as it only affects the case where 
the language doesn't define what it's supposed to do, and 
compile the code as if it was :-

void test(int[] data)
{
     control_nuclear_reactor(data);
}

Which obviously could have very bad results if the test mattered.

Yes, my program is invalid because I violated its assumptions, but 
I find it very hard to argue that including the assert should 
"break" the code before it.

C++ compilers can and do perform such optimizations so I was 
wondering if assert in D could cause such behavior according to 
the spec.

Oct 12 2017

kdevel <kdevel vogtner.de> writes:

On Thursday, 12 October 2017 at 15:37:23 UTC, John Burton wrote:
 C++ compilers can and do perform such optimizations so I was 
 wondering if assert in D could cause such behavior according to 
 the spec.

In the context of ISO-C++ it is meaningless to reason about the 
"actual behavior" of a non-conforming program ("start WW III" 
etc.). You may find details here: 
<http://en.cppreference.com/w/cpp/language/ub>

As standard oriented C++ (or C or FORTRAN) programmers we avoid 
undefined behavior not because we would want to prevent WW III, 
but because we want to write and reason about conforming code 
only.

IIRC C++'s assert is defined in the ISO-C standard. There we can 
read:

"The assert macro puts diagnostic tests into programs; it expands 
to a void expression. When it is executed, if expression (which 
shall have a scalar type) is false (that is, compares equal to 
0), the assert macro writes information about the particular call 
that failed [...] on the standard error stream in an 
implementation-defined format. It then calls the abort function."

So in C/C++

---
#include <assert.h>
int main ()
{
    assert (0);
    return 0;
}
---

is a perfectly valid (conforming) program.

D is not standardized (yet), hence there is no such thing as a 
"standard conforming D implementation" or a "standard conforming 
D program". The D documentation is simply the manual of a set of 
programs (compiler, tools) which may or may not be correctly 
described therein. According to 
<https://dlang.org/spec/contracts.html> the program

---
void main ()
{
    assert (false);
}
---

qualifies as "invalid, and therefore has undefined behaviour." That is a 
statement which makes no sense to me. Either it is a "debugging 
aid", which implies defined behavior, or it is undefined behavior, 
in which case assert (false) cannot aid debugging.

Oct 12 2017

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Thursday, October 12, 2017 20:15:41 kdevel via Digitalmars-d-learn wrote:
 On Thursday, 12 October 2017 at 15:37:23 UTC, John Burton wrote:
 C++ compilers can and do perform such optimizations so I was
 wondering if assert in D could cause such behavior according to
 the spec.

 In the context of ISO-C++ it is meaningless to reason about the
 "actual behavior" of a non-conforming program ("start WW III"
 etc.). You may find details here:
 <http://en.cppreference.com/w/cpp/language/ub>

 As standard oriented C++ (or C or FORTRAN) programmers we avoid
 undefined behavior not because we would want to prevent WW III,
 but because we want to write and reason about conforming code
 only.

 IIRC C++'s assert is defined in the ISO-C standard. There we can
 read:

 "The assert macro puts diagnostic tests into programs; it expands
 to a void expression. When it is executed, if expression (which
 shall have a scalar type) is false (that is, compares equal to
 0), the assert macro writes information about the particular call
 that failed [...] on the standard error stream in an
 implementation-defined format. It then calls the abort function."

 So in C/C++

 ---
 #include <assert.h>
 int main ()
 {
     assert (0);
     return 0;
 }
 ---

 is a perfectly valid (conforming) program.

 D is not standardized (yet), hence there is no such thing as a
 "standard conforming D implementation" or a "standard conforming
 D program". The D documentation is simply the manual of a set of
 programs (compiler, tools) which may or may not be correctly
 described therein. According to
 <https://dlang.org/spec/contracts.html> the program

 ---
 void main ()
 {
     assert (false);
 }
 ---

 qualifies as "invalid, and therefore has undefined behaviour." That is a
 statement which makes no sense to me. Either it is a "debugging
 aid", which implies defined behavior, or it is undefined behavior,
 in which case assert (false) cannot aid debugging.

assert(false) is a bit special in that it's never removed (it becomes a HLT
instruction with -release), and the compiler recognizes that you're saying
that that code is supposed to be unreachable (e.g. it then doesn't require a
return statement after assert(0) if the function is supposed to return a value).
Obviously, asserting false in main is pointless, and you're dead wrong about
the code being unreachable, but the fact that you're wrong is caught when
the line is reached, and the program is killed. And that behavior is
completely defined, because it's known at compile time that the condition
being tested by the assertion is false.

assert(false) does aid in debugging in that you're indicating that a piece
of code is supposed to be unreachable, and if you reach it, then you have a
bug, and you catch it, but really, it's a special case indicating that a
piece of code is supposed to be unreachable rather than an assertion to test
that a particular condition is true at a particular point in the program,
which is what assertions normally do.

- Jonathan M Davis
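
The "unreachable code" use of assert(0) described above could look roughly like this. It is only a sketch: the function sign is a made-up example, not something from the thread. The point it tries to show is that the compiler accepts the missing return statement after assert(0), and that the check stays in place even when other assertions are compiled out with -release.

```d
import std.stdio : writeln;

// Hypothetical example: returns -1, 0 or 1 depending on the sign of x.
int sign(int x)
{
    if (x > 0)
        return 1;
    if (x < 0)
        return -1;
    if (x == 0)
        return 0;

    // Every case is handled above, so this line should never be reached.
    // No return statement is needed after assert(0); reaching it kills the program.
    assert(0, "unreachable");
}

void main()
{
    writeln(sign(-7), " ", sign(0), " ", sign(12)); // prints -1 0 1
}
```

Without the final assert(0), the compiler would reject the function because it cannot prove that one of the three branches is always taken.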

Oct 12 2017

kdevel <kdevel vogtner.de> writes:

On Thursday, 12 October 2017 at 20:27:03 UTC, Jonathan M Davis 
wrote:
 On Thursday, October 12, 2017 20:15:41 kdevel via
 ---
 void main ()
 {
     assert (false);
 }
 ---

 qualifies as "invalid, and therefore has undefined behaviour." 
 That is a statement which makes no sense to me. Either it is a 
 "debugging aid", which implies defined behavior, or it is 
 undefined behavior, in which case assert (false) cannot aid debugging.

 assert(false) is a bit special in that it's never removed (it 
 becomes a HLT instruction with -release),

Confirmed. I should have written something like this instead:

---
import std.stdio;
import std.string;
import std.conv;
void main ()
{
    int i;
    i = readln.chomp.to!int;
    assert (i != 3);
    writeln ("i = <", i, ">");
}
---

Is it defined that this program throws an AssertError in debug 
mode if 3 is fed to stdin? If not, assert (...) could not aid 
debugging.

Oct 12 2017

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Thursday, October 12, 2017 21:22:29 kdevel via Digitalmars-d-learn wrote:
 On Thursday, 12 October 2017 at 20:27:03 UTC, Jonathan M Davis
 wrote:
 On Thursday, October 12, 2017 20:15:41 kdevel via
 ---
 void main ()
 {
     assert (false);
 }
 ---

 qualifies as "invalid, and therefore has undefined behaviour."
 That is a statement which makes no sense to me. Either it is a
 "debugging aid", which implies defined behavior, or it is
 undefined behavior, in which case assert (false) cannot aid debugging.

 assert(false) is a bit special in that it's never removed (it
 becomes a HLT instruction with -release),

 Confirmed. I should have written something like this instead:

 ---
 import std.stdio;
 import std.string;
 import std.conv;
 void main ()
 {
     int i;
     i = readln.chomp.to!int;
     assert (i != 3);
     writeln ("i = <", i, ">");
 }
 ---

 Is it defined that this program throws an AssertError in debug
 mode if 3 is fed to stdin? If not, assert (...) could not aid
 debugging.

If assertions are compiled in (which they are if you're not compiling with
-release), and i is ever 3, then an AssertError will be thrown. This is
guaranteed. As such, the compiler is free to assume that i is never 3 when
code execution arrives at the line after the assertion, and if it can do an
optimization based on that fact, it is free to do so. You've told it that i
should never be 3 at that point and that it's a bug if it is, and as such,
it is free to assume that i is never 3 after the assertion even if the
assertion is compiled out with -release - that is the only place that
undefined behavior may enter into it. If the compiler does an optimization
based on the fact that i isn't 3, and it is, and -release is used, then you
could get some weird behavior when the code reaches the lines after the
assertion - but by definition, you already have a bug if i is 3, and your
program in general is assuming that i isn't 3 at that point, so you're going
to get bad behavior either way. The fact that your assertion failed means
that you have a logic error in your program, and it is therefore in an
invalid state and will likely not behave correctly.

However, your example is an excellent example of when _not_ to use
assertions. Assertions should never be used on user input or anything
outside of the program's control. When you use an assertion, you are saying
that it is a bug in the program if that assertion fails, and bad user input
isn't a bug, though the fact that you're not validating user input arguably
is (certainly it is if the assertion is there, since at that point, you're
saying that it's a bug if i is ever 3). Assertions allow you to catch bugs
in your logic during development and then don't slow your program down when
compiling with -release for production. They are not for validating anything
other than that the logic of your program is correct.

And if for any reason, you're paranoid enough that you want those logic
checks to still be there in production, then either don't use -release (even
in production), or do something like

enforce!Error(cond, "msg");

instead of

assert(cond, "msg");

and then you'll get an Error thrown when the condition fails - even with
-release.
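
A minimal sketch of that enforce!Error variant, assuming a made-up function midpoint whose precondition (lo <= hi) is supposed to be guaranteed by the program's own logic rather than by user input. The names are hypothetical; only enforce and assertThrown come from std.exception.

```d
import std.exception : assertThrown, enforce;

// Hypothetical function: callers are expected to pass lo <= hi, so a
// violation is a logic error in the program rather than bad input.
int midpoint(int lo, int hi)
{
    // Unlike assert(lo <= hi, ...), this check is not removed by -release.
    enforce!Error(lo <= hi, "midpoint called with lo > hi");
    return lo + (hi - lo) / 2;
}

void main()
{
    assert(midpoint(2, 8) == 5);

    // A violated precondition throws an Error (not an Exception).
    assertThrown!Error(midpoint(8, 2));
}
```

The trade-off is the one described above: the check costs a comparison in production builds, in exchange for the program stopping at the point where its logic first goes wrong.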


On a side note, I would point out that talking about "debug mode" with D
gets annoyingly ambiguous, because that kind of implies the -debug flag,
which has nothing to do with assertions and which actually can be used in
conjunction with -release (all -debug does is enable debug{} blocks), which
is why I try to avoid the term debug mode - though I assume that you meant
when -release isn't used, since that's often what folks mean.

- Jonathan M Davis

Oct 12 2017

kdevel <kdevel vogtner.de> writes:

On Friday, 13 October 2017 at 02:22:24 UTC, Jonathan M Davis 
wrote:
 You've told it that i should never be 3 at that point and that 
 it's a bug if it is, and as such, it is free to assume that i 
 is never 3 after the assertion even if the assertion is 
 compiled out with -release - that is the only place that 
 undefined behavior may enter into it.

Thanks for the clarification! This differs from C, where 
assert has only a diagnostic purpose. Disabling assertions in C 
(by defining NDEBUG) AFAICS neither introduces undefined 
behavior nor entitles the compiler to optimize code away based 
on the assertion. This C program

--- test.c
#include <stdio.h>
#define NDEBUG 1
#include <assert.h>
int main ()
{
    int i = 3;
    assert (i != 3);
    if (i == 3)
       printf ("%d\n", i);
    return 0;
}
---

is IMHO conforming and it is defined to print 3 in a conforming 
environment. The 'corresponding' D program

--- assert4.d
import std.stdio;
int main ()
{
    int i = 3;
    assert (i != 3);
    if (i == 3)
       writef ("%d\n", i);
    return 0;
}
---

is 'conforming' (but buggy) under non-release-D and 
'non-conforming' (because of the undefined behavior) otherwise. 
Is this judgement correct?

 If the compiler does an optimization based on the fact that i 
 isn't 3, and it is, and -release is used, then you could get 
 some weird behavior when the code reaches the lines after the 
 assertion - but by definition, you already have a bug if i is 
 3, and your program in general is assuming that i isn't 3 at 
 that point, so you're going to get bad behavior either way.

I would like to make a clear distinction between "bug" or "bad 
behavior" on the one hand and "undefined behavior" on the other. 
"Bug" and "bad behavior" address the outcome of a computation 
while "undefined behavior" is an (abstract, formal) property of a 
piece of code with respect to a certain language specification.

 The fact that your assertion failed means that you have a logic 
 error in your program, and it is therefore in an invalid state 
 and will likely not behave correctly.

Under non-release-D the program is perfectly valid and behaves 
exactly as expected. In release-D it makes no sense to discuss whether 
the state of the program is valid or whether the program behaves 
correctly, since it is non-conforming because of the undefined 
behavior.

(...)

 On a side note, I would point out that talking about "debug 
 mode" with D gets annoyingly ambiguous, because that kind of 
 implies the -debug flag, which has nothing to do with 
 assertions and which actually can be used in conjunction with 
 -release (all -debug does is enable debug{} blocks), which is 
 why I try to avoid the term debug mode - though I assume that 
 you meant when -release isn't used, since that's often what 
 folks mean.

Agreed.

Oct 13 2017

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Friday, October 13, 2017 11:26:54 kdevel via Digitalmars-d-learn wrote:
 On Friday, 13 October 2017 at 02:22:24 UTC, Jonathan M Davis
 wrote:
 You've told it that i should never be 3 at that point and that
 it's a bug if it is, and as such, it is free to assume that i
 is never 3 after the assertion even if the assertion is
 compiled out with -release - that is the only place that
 undefined behavior may enter into it.

 Thanks for the clarification! This differs from C, where
 assert has only a diagnostic purpose. Disabling assertions in C
 (by defining NDEBUG) AFAICS neither introduces undefined
 behavior nor entitles the compiler to optimize code away based
 on the assertion. This C program

 --- test.c
 #include <stdio.h>
 #define NDEBUG 1
 #include <assert.h>
 int main ()
 {
     int i = 3;
     assert (i != 3);
     if (i == 3)
        printf ("%d\n", i);
     return 0;
 }
 ---

 is IMHO conforming and it is defined to print 3 in a conforming
 environment. The 'corresponding' D program

 --- assert4.d
 import std.stdio;
 int main ()
 {
     int i = 3;
     assert (i != 3);
     if (i == 3)
        writef ("%d\n", i);
     return 0;
 }
 ---

 is 'conforming' (but buggy) under non-release-D and
 'non-conforming' (because of the undefined behavior) otherwise.
 Is this judgement correct?

Essentially, though talking about conformance usually has to do with a spec.

In both C/C++ and D, if you use an assertion, you're saying that if the
assertion fails, then the logic in your code is faulty, and there is a bug
in your program. With C/C++, it may not be codified that the compiler
understands that, but the meaning is the same. If the assertion is compiled
out but would have failed, then your program is in an invalid state and will
do who-knows-what. By definition, you're screwed. Your program is doing
something that you have said should never happen. How screwed you actually
are can vary considerably, and if all it does is print out the value and
never use it again, then you're not very screwed, but that also means that
it was a rather odd assertion (though you're obviously doing that here as an
example and not something that someone would normally do).

Because D's compiler does understand what assert means, it is allowed to
optimize based on that fact. So, it _can_ generate code based on the
assumption that the assertion succeeded, which can increase how screwed you
are if the assertion is compiled out but would have failed, but your program
is in an invalid state either way, because you've asserted that something is
true when it isn't and thus indicated that if it isn't true, there is a bug
in your program, and its logic is wrong. And as soon as the logic in your
program is wrong, then it's not going to behave correctly. It's just a
question of how badly behaved it will be.

So, we can talk about the behavior being undefined if the assertion would
have failed on the basis that the compiler could generate optimized code
that assumes that the assertion succeeded and thus do weirder things than it
would have done if the code hadn't been optimized that way, but as far as
the language is concerned, it's undefined behavior due to the simple fact
that you've asserted that something is true which isn't. You yourself have
stated that something must be true for your program to be valid, and it
isn't true.

As long as the assertions are compiled in, then the fact that the logic in
your program was invalid is caught, whereas if they're not compiled in, then
it's not caught. But if the asserted condition is false, then your program
is wrong either way.
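
A minimal sketch of that last point, not part of the original post. `checkedHalf` is a hypothetical function invented for illustration, and `version (assert)` is the predefined D version identifier that is set when assert checks are emitted. Catching `AssertError` is done here only to show the failure being reported; it is not a recommended recovery strategy.

```d
import core.exception : AssertError;
import std.stdio;

// Hypothetical function: the assert states a requirement on the caller.
int checkedHalf(int n)
{
    assert(n % 2 == 0, "n must be even");
    return n / 2;
}

void main()
{
    version (assert)
    {
        // Assertions compiled in: the broken call is detected and reported.
        try
        {
            checkedHalf(3);
        }
        catch (AssertError e) // caught for demonstration only
        {
            writeln("caught: ", e.msg);
        }
    }
    else
    {
        // Assertions compiled out (e.g. with -release): calling
        // checkedHalf(3) here would be a bug that nothing reports.
        writeln("assertions are compiled out; the bad call would go unnoticed");
    }

    // A call that satisfies the assertion behaves the same in both builds.
    writeln(checkedHalf(4));
}
```

Either way the call `checkedHalf(3)` is a bug in the caller; the build mode only changes whether it is detected.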

- Jonathan M Davis

Oct 13 2017

Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:

On Thursday, 12 October 2017 at 15:37:23 UTC, John Burton wrote:
 This is an example of what I mean :-

 undefined what it is meant to do anyway, so the compiler can 
 "optimize" out the if condition as it only affects the case 
 where the language doesn't define what it's supposed to do 
 anyway, and compiles the code as if it was :-

 void test(int[] data)
 {
     control_nuclear_reactor();
 }

Yeah the C/C++ community/haters love to talk about all the code 
the compiler can inject because of undefined behavior. But that 
is not what it means.

The compiler does not know the value of data.length so it could 
not make such a transformation of the code. Now had the assert 
been written before the if, you're telling the compiler some 
properties of data.length before you check it and it could make 
such optimizations.

The point is assert tells the compiler something it can use to 
reason about its job, not that it can insert additional runtime 
checks to see if your code is invalid and then add new jumps to 
execute whatever the hell it wants.
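
A minimal sketch of the "assert before the if" case, not part of the original post. The function names follow the thread; the body of `control_nuclear_reactor` and the `reactorCalls` counter are hypothetical stand-ins so that the example compiles and runs.

```d
import std.stdio;

// Hypothetical counter so the effect of a call can be observed.
int reactorCalls;

// Hypothetical stand-in for the function named in the thread.
void control_nuclear_reactor(int[] data)
{
    ++reactorCalls;
    writeln("controlling reactor with ", data.length, " values");
}

void test(int[] data)
{
    // Stated up front: the caller guarantees that data is not empty.
    assert(data.length != 0);

    // Given the assertion above, a compiler that optimizes on asserts
    // may treat this branch as unreachable when asserts are compiled out.
    if (data.length == 0)
        writeln("Not enough data!");
    else
        control_nuclear_reactor(data);
}

void main()
{
    test([1, 2, 3]);
}
```

Whether a given compiler actually removes the branch is an implementation detail; the sketch only shows the shape of code in which the assertion gives it the information to do so.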

Oct 13 2017

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Saturday, October 14, 2017 05:20:47 Jesse Phillips via Digitalmars-d-
learn wrote:
 The point is assert tells the compiler something it can use to
 reason about its job, not that it can insert additional runtime
 checks to see if your code is invalid and then add new jumps to
 execute whatever the hell it wants.

+1

- Jonathan M Davis

Oct 14 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 14.10.2017 07:20, Jesse Phillips wrote:
 On Thursday, 12 October 2017 at 15:37:23 UTC, John Burton wrote:
 This is an example of what I mean :-

 undefined what it is meant to do anyway, so the compiler can 
 "optimize" out the if condition as it only affects the case where the 
 language doesn't define what it's supposed to do anyway, and compiles 
 the code as if it was :-

 void test(int[] data)
 {
     control_nuclear_reactor();
 }

 
 Yeah the C/C++ community/haters love to talk about all the code the 
 compiler can inject because of undefined behavior. But that is not what 
 it means.
 ...

It can mean that, but that is not even what happened in the given example.

 The compiler does not know the value of data.length so it could not make 
 such a transformation of the code.

The compiler can easily prove that the value of data.length does not 
change between the two points in the program. According to the 
specification, the behavior of the program is undefined in case the 
assertion fails, not just the behavior of the program after the 
assertion would have failed if it had not been removed.

 Now had the assert been written 
 before the if, you're telling the compiler some properties of 
 data.length before you check it and it could make such optimizations.
 
 The point is assert tells the compiler something it can use to reason 
 about its job, not that it can insert additional runtime checks to see 
 if your code is invalid and then add new jumps to execute whatever the 
 hell it wants.
 

In the above example, a branch was removed, not added.

However, optimizers can add branches. (For example, an optimizer can check 
whether there is aliasing and use optimized code if there is none.)
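
A hand-written sketch of what such an added branch can look like, not part of the original post. `addInto` is a hypothetical function; a real optimizer would perform this kind of versioning on its intermediate representation, not on D source, so this only illustrates the idea of a runtime aliasing check guarding a fast path.

```d
import std.stdio;

// Hypothetical routine: adds each element of src to the matching element of dst.
void addInto(int[] dst, const(int)[] src)
{
    assert(dst.length == src.length);

    // Runtime aliasing check: do the two slices overlap in memory?
    const(int)* dBegin = dst.ptr, dEnd = dst.ptr + dst.length;
    const(int)* sBegin = src.ptr, sEnd = src.ptr + src.length;
    immutable bool overlap = dBegin < sEnd && sBegin < dEnd;

    if (!overlap)
    {
        // Fast path: no aliasing, so a whole-array operation is safe.
        dst[] += src[];
    }
    else
    {
        // Slow path: element by element, preserving the in-order semantics.
        foreach (i; 0 .. dst.length)
            dst[i] += src[i];
    }
}

void main()
{
    int[] a = [1, 2, 3];
    addInto(a, [10, 20, 30]);
    writeln(a);
}
```

The branch on `overlap` is the part that a simple element-by-element loop in the source would not contain.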

Also, UB can and does sometimes mean that the program can execute 
arbitrary code. It's called "arbitrary code execution": 
https://en.wikipedia.org/wiki/Arbitrary_code_execution

Oct 14 2017

kdevel <kdevel vogtner.de> writes:

On Saturday, 14 October 2017 at 09:32:32 UTC, Timon Gehr wrote:
 Also, UB can and does sometimes mean that the program can 
 execute arbitrary code. It's called "arbitrary code execution": 
 https://en.wikipedia.org/wiki/Arbitrary_code_execution

This confuses different levels of reasoning. In C/C++ "undefined 
behavior" is a property of the SOURCE code with respect to the 
specification. It states: The spec does not apply, it does 
not define the semantics.

This issue is totally different from the question of what a given 
program containing undefined behavior actually does after it 
compiles and after the linker produces an executable. This is 
reasoning about generated MACHINE code.

A result of this confusion has been that some clever people tried 
to "detect" certain kinds of undefined behavior "after" they 
"happened". E.g. 
<https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475> This is the 
danger of undefined behavior: The MACHINE code may also work as 
the programmer expected. At least for some time.

Oct 14 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 14.10.2017 23:36, kdevel wrote:
 On Saturday, 14 October 2017 at 09:32:32 UTC, Timon Gehr wrote:
 Also, UB can and does sometimes mean that the program can execute 
 arbitrary code. It's called "arbitrary code execution": 
 https://en.wikipedia.org/wiki/Arbitrary_code_execution

 
 This confuses different levels of reasoning.

It's a correct statement about the semantics of programs produced from 
sources with UB by standard-compliant compilers.

 In C/C++ "undefined 
 behavior" is a property of the SOURCE code with respect to the 
 specification. It states: The spec does not apply, it does not 
 define the semantics.
 ...

I.e., the semantics of a program produced by a conforming compiler can 
be arbitrary.

 This issue is totally different from the question of what a given program 
 containing undefined behavior actually does after it compiles and 
 after the linker produces an executable. This is reasoning about 
 generated MACHINE code.
 ...

Sure. This is very much intentional. The current subthread is about what 
kind of programs the compiler might produce (in practice) if the 
provided source code contains UB. The claim I was refuting was that the 
produced program cannot have branching and other behaviors not specified 
in the source.

 A result of this confusion has been that some clever people tried to 
 "detect" certain kinds of undefined behavior "after" they "happened". 
 E.g. <https://gcc.gnu.org/bugzilla/show_bug.cgi?id=30475> This is the 
 danger of undefined behavior: The MACHINE code may also work as the 
 programmer expected. At least for some time.
 
 

I'm not confused about this at all.

Oct 14 2017

Jesse Phillips <Jesse.K.Phillips+D gmail.com> writes:

On Saturday, 14 October 2017 at 09:32:32 UTC, Timon Gehr wrote:
 The compiler can easily prove that the value of data.length 
 does not change between the two points in the program. 
 According to the specification, the behavior of the program is 
 undefined in case the assertion fails, not just the behavior of 
 the program after the assertion would have failed if it had not 
 been removed.

You are right: in this example, proving that there is no change 
between the condition and the assert is easy. In fact, I think 
there was an example of this in C with an uninitialized function 
pointer, where the optimizer identified that there was only one 
valid function which could have been assigned and lowered the 
indirect call to a direct one.

My statement was more about the case where the compiler/optimizer 
can't determine the value:

     void test(int[] data, bool goboom)
     {
         if (data.length == 0) {
             writeln("Not enough data!");
         } else {
             control_nuclear_reactor(data);
         }

         assert(goboom);
     }

The optimizer can generate code to match:


     void test(int[] data, bool goboom)
     {
         if(!goboom) {
             launch_nuclear_missile();
             return;
         }

         if (data.length == 0) {
             writeln("Not enough data!");
         } else {
             control_nuclear_reactor(data);
         }
     }


 Also, UB can and does sometimes mean that the program can 
 execute arbitrary code. It's called "arbitrary code execution": 
 https://en.wikipedia.org/wiki/Arbitrary_code_execution

That article is about attacks, not optimizers.

Oct 17 2017
