www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Tail pad optimization, cache friendlyness and C++ interrop

reply "deadalnix" <deadalnix gmail.com> writes:
I was messing around with clang codegen and noticed that sometime 
it optimize structs using the tail pad, and sometime it doesn't. 
It ended up in the following stack overflow question :

http://stackoverflow.com/questions/24110347/when-extending-a-padded-struct-why-cant-extra-fields-be-placed-in-the-tail-pad

It seems that C++ will use the padding left in a struct to pack 
more stuff in there. I was wonering what position we want to have 
with D.

Packing elements efficiently is a major importance for 
performance. For now, D uses C style packing (ie: nothing in the 
tail pad) as far as I know. This is missed opportunities to pack 
more (I have many places in SDC where it could be beneficial from 
using tail pad).

If we are gonna interoperable with C++, then we need to 
understand the tail pad optimization anyway. So here is what I 
propose:

extern(D) do tail pad optimize.
extern(C) struct do not tail pad optimize.
extern(C++) do tail pad with C++ rules:
do not tail pad if (otherwise tail pad):
1. has no non-static data members that aren't standard-layout
2. has no virtual functions and no virtual base classes
3. has the same access control for all non-static data members
4. has no base classes that aren't standard-layout
5. either has no base class with non-static data members or has 
no non-static data members in the most derived class and only one 
base with them
6. has no base classes of the same type as the first non-static 
data member

thought ?
Jun 10 2014
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
deadalnix:

 thought ?

I think for D there is a lower handing fruit: the D specs allow to reorder class instance fields, but I think the D front-end is not yet doing that. Bye, bearophile
Jun 10 2014
prev sibling next sibling parent "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Tuesday, 10 June 2014 at 08:57:26 UTC, bearophile wrote:
 deadalnix:

 thought ?

I think for D there is a lower handing fruit: the D specs allow to reorder class instance fields, but I think the D front-end is not yet doing that.

But only for classes, not for structs.
Jun 10 2014
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
10-Jun-2014 11:30, deadalnix пишет:
[snip]
 extern(D) do tail pad optimize.
 extern(C) struct do not tail pad optimize.
 extern(C++) do tail pad with C++ rules:
 do not tail pad if (otherwise tail pad):
 1. has no non-static data members that aren't standard-layout
 2. has no virtual functions and no virtual base classes
 3. has the same access control for all non-static data members
 4. has no base classes that aren't standard-layout
 5. either has no base class with non-static data members or has no
 non-static data members in the most derived class and only one base with
 them
 6. has no base classes of the same type as the first non-static data member

 thought ?

Would be useful, but as proposed it might break some of current C bindings silently (unless it's extern(C): at the top of file). -- Dmitry Olshansky
Jun 10 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/10/2014 12:30 AM, deadalnix wrote:
 thought ?

This does not apply to D structs because D structs do not inherit. C doesn't have classes, so no issues there. extern(C++) class should match the C++ ABI for this. extern(D) class we are free to innovate with the layout. I suggest turning this into a bugzilla enhancement request.
Jun 10 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/10/2014 5:27 PM, deadalnix wrote:
 I'm talking about structs, not classes.

Ok, but since D structs do not inherit, how does tail pad optimization apply?
Jun 10 2014
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/10/2014 7:44 PM, deadalnix wrote:
 On Wednesday, 11 June 2014 at 02:10:18 UTC, Walter Bright wrote:
 On 6/10/2014 5:27 PM, deadalnix wrote:
 I'm talking about structs, not classes.

Ok, but since D structs do not inherit, how does tail pad optimization apply?

struct S1 { int a; byte b; } struct S2 { S1 s1; char c; } Sé could be 8 byte long (some variation of this are in C++ for instance if I make a public and b private).

Do any C++ compilers do this for this case, or just for the inheritance one?
Jun 10 2014
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/10/2014 7:44 PM, deadalnix wrote:
 On Wednesday, 11 June 2014 at 02:10:18 UTC, Walter Bright wrote:
 On 6/10/2014 5:27 PM, deadalnix wrote:
 I'm talking about structs, not classes.

Ok, but since D structs do not inherit, how does tail pad optimization apply?

struct S1 { int a; byte b; } struct S2 { S1 s1; char c; }

Hmm, this could have serious problems with the following: S1 s1; S2 s2; s2.c = 3; memcpy(&s2.s1, &s1, sizeof(S1)); // Oops! stomped on s2.c
Jun 10 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/10/2014 11:11 PM, deadalnix wrote:
 On Wednesday, 11 June 2014 at 04:11:53 UTC, Walter Bright wrote:
 Hmm, this could have serious problems with the following:

 S1 s1;
 S2 s2;
 s2.c = 3;
 memcpy(&s2.s1, &s1, sizeof(S1)); // Oops! stomped on s2.c

Yes, that is why they do it only in specific condition (ie when the thing use C++ specific feature and not C one).

I don't understand - didn't you say this was expressible as C structs? Aren't those supposed to be compatible with C?
 This kind of thing is valid in C, but isn't
 supposed to be in C++,

I'm not so sure about that.
 and certainly not in  safe D.

I'm not so sure about that, either. There are many ways of bit copying structs, and some of them are perfectly memory safe. The fundamental problem with this optimization is now a struct has two sizes - the unpadded and the padded size. I foresee nothing but problems, corner cases, and bugs inherent in trying to select which size to use for which purpose. I do not understand how C++ avoids these issues.
Jun 10 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/11/2014 12:22 AM, deadalnix wrote:
 On Wednesday, 11 June 2014 at 06:30:26 UTC, Walter Bright wrote:
 On 6/10/2014 11:11 PM, deadalnix wrote:
 On Wednesday, 11 June 2014 at 04:11:53 UTC, Walter Bright wrote:
 Hmm, this could have serious problems with the following:

 S1 s1;
 S2 s2;
 s2.c = 3;
 memcpy(&s2.s1, &s1, sizeof(S1)); // Oops! stomped on s2.c

Yes, that is why they do it only in specific condition (ie when the thing use C++ specific feature and not C one).

I don't understand - didn't you say this was expressible as C structs? Aren't those supposed to be compatible with C?

struct S1 { int a; private: char b; }; struct S2 : S1 { char c; }; S2 is 8 bytes. If you remove private in S1, then S2 becomes 12 bytes. S1 is not a "C" struct anymore and do not need to follow the standard layout.

Ok, I understand that, but I'd find it surprising behavior that a protection attribute affected layout.
 and certainly not in  safe D.

I'm not so sure about that, either. There are many ways of bit copying structs, and some of them are perfectly memory safe.

It is not provable by the compiler, therefore it is not safe.

What's not provable? Why would bit copying a struct not be memory safe?
 The fundamental problem with this optimization is now a struct has two sizes -
 the unpadded and the padded size. I foresee nothing but problems, corner
 cases, and bugs inherent in trying to select which size to use for which
 purpose. I do not understand how C++ avoids these issues.

By not spilling out for struct that do not have standard layout. In above example, the compiler won't issue code that manipulate the padding after S1 as it may contains something.

Ok, I can see that, but forcing memberwise copy has its own efficiency problems, especially when doing things like enregistering it.
Jun 11 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/11/2014 11:35 AM, Walter Bright wrote:
 I'm not so sure about that, either. There are many ways of bit copying
 structs, and some of them are perfectly memory safe.

It is not provable by the compiler, therefore it is not safe.

Not memory safe implies (is supposed to imply) not safe but not safe does not imply not memory safe.
Jun 11 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/11/2014 4:34 AM, Timon Gehr wrote:
 Not memory safe implies (is supposed to imply) not  safe but not  safe does not
 imply not memory safe.

safe in D == memory safe.
Jun 11 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/15/2014 12:44 AM, David Nadlinger wrote:
 On Wednesday, 11 June 2014 at 16:50:30 UTC, Walter Bright wrote:
 On 6/11/2014 4:34 AM, Timon Gehr wrote:
 Not memory safe implies (is supposed to imply) not  safe but not  safe does not
 imply not memory safe.

safe in D == memory safe.

What Timon is saying is that not all memory safe code is verifiably safe.

In D, they are defined to be the same thing, so the statement makes no sense.
  safe => memory safe, but not memory safe =>  safe.

 David

Jun 15 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/15/2014 10:33 AM, Walter Bright wrote:
 What Timon is saying is that not all memory safe code is verifiably
  safe.

In D, they are defined to be the same thing,

Since when? http://dlang.org/function "Function Safety Safe functions are functions that are _statically checked_ to exhibit _no possibility of undefined behavior_. Undefined behavior is often used as a vector for malicious attacks. Safe Functions Safe functions are marked with the safe attribute. The following operations are not allowed in safe functions:" I.e. the documentation has two (conflicting) definitions and none of them is the one you claim there is.
 so the statement makes no sense.

Then please indicate how to fix the documentation. If you are going to claim the Humpty Dumpty privilege, I'll back off. Thanks. On 06/11/2014 11:35 AM, Walter Bright wrote:
 What's not provable? Why would bit copying a struct not be memory safe?

Since you claim memory safe is the same as verifiably safe, you are asking:
 What's not provable? Why would bit copying a struct not be verifiably  safe?

struct S{ int x; } void main() safe{ S s,t; memcpy(&s,&t,S.sizeof); // error } So, what is it that you are trying to bring across?
Jun 15 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/15/2014 2:48 AM, Timon Gehr wrote:
 On 06/15/2014 10:33 AM, Walter Bright wrote:
 What Timon is saying is that not all memory safe code is verifiably
  safe.

In D, they are defined to be the same thing,

Since when? http://dlang.org/function "Function Safety Safe functions are functions that are _statically checked_ to exhibit _no possibility of undefined behavior_. Undefined behavior is often used as a vector for malicious attacks. Safe Functions Safe functions are marked with the safe attribute. The following operations are not allowed in safe functions:" I.e. the documentation has two (conflicting) definitions and none of them is the one you claim there is.
 so the statement makes no sense.

Then please indicate how to fix the documentation. If you are going to claim the Humpty Dumpty privilege, I'll back off. Thanks.

I don't know why the documentation says that. D's safe is about memory safety, not undefined behavior. Note that the list of eschewed behaviors are possibly memory corrupting. Signed integer overflow, for example, is not listed.
Jun 15 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/15/2014 08:44 PM, Walter Bright wrote:
 On 6/15/2014 2:48 AM, Timon Gehr wrote:
 On 06/15/2014 10:33 AM, Walter Bright wrote:
 What Timon is saying is that not all memory safe code is verifiably
  safe.

In D, they are defined to be the same thing,

Since when? http://dlang.org/function ...
 so the statement makes no sense.

Then please indicate how to fix the documentation. If you are going to claim the Humpty Dumpty privilege, I'll back off. Thanks.

I don't know why the documentation says that. D's safe is about memory safety, not undefined behavior. ...

Note that this is frustratingly unhelpful for deciphering your point about "memory safe" <=> "verifiably safe" by definition. Are you defining "memory safe" or "verifiably safe"?
 Note that the list of eschewed behaviors are possibly memory corrupting.

It is an incomplete list. I'd rather see an incomplete list of _allowed_ behaviours instead of an incomplete list of eschewed behaviours. In any case, I don't see how any list of (syntactic) verifiable properties of the code is supposed to be equivalent to the (non-trivial semantic) memory safety property. Are you assuming trusted as an oracle for memory safety and saying trusted code is 'verifiably safe' code? (That's not the intended reading.)
 Signed integer overflow, for example, is not listed.

Are you trying to say that signed integer overflow is undefined behaviour in D? (This would again contradict the documentation.)
Jun 15 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/15/2014 3:45 PM, Timon Gehr wrote:
 I don't know why the documentation says that. D's  safe is about memory
 safety, not undefined behavior.
 ...

Note that this is frustratingly unhelpful for deciphering your point about "memory safe" <=> "verifiably safe" by definition. Are you defining "memory safe" or "verifiably safe"?

I don't understand your question. I don't know what is unhelpful about saying that safe refers to memory safety.
 Note that the list of eschewed behaviors are possibly memory corrupting.

It is an incomplete list.

I ask that you enumerate the missing items, put the list in bugzilla, and tag them with the 'safe' keyword.
 In any case, I
 don't see how any list of (syntactic) verifiable properties of the code is
 supposed to be equivalent to the (non-trivial semantic) memory safety property.

The list is not restricted to syntactic issues.
 Are you assuming  trusted as an oracle for memory safety and saying  trusted
 code is 'verifiably  safe' code? (That's not the intended reading.)

Not at all. Where does the spec suggest that?
 Signed integer overflow, for example, is not listed.

(This would again contradict the documentation.)

I know the spec says it follows 2's complement arithmetic. I'm not 100% confident we can rely on that for all 2's complement CPUs, and furthermore we have a bit of a problem in relying on optimizers built for C/C++ which rely on it being undefined behavior. In any case, it is still not an issue for safe.
Jun 15 2014
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/16/2014 01:06 AM, Walter Bright wrote:
 On 6/15/2014 3:45 PM, Timon Gehr wrote:
 I don't know why the documentation says that. D's  safe is about memory
 safety, not undefined behavior.
 ...

Note that this is frustratingly unhelpful for deciphering your point about "memory safe" <=> "verifiably safe" by definition. Are you defining "memory safe" or "verifiably safe"?

I don't understand your question. I don't know what is unhelpful about saying that safe refers to memory safety. ...

You stated the two to be equivalent earlier, which is impossible.
 Note that the list of eschewed behaviors are possibly memory corrupting.

It is an incomplete list.

I ask that you enumerate the missing items, put the list in bugzilla, and tag them with the 'safe' keyword.
 In any case, I
 don't see how any list of (syntactic) verifiable properties of the
 code is
 supposed to be equivalent to the (non-trivial semantic) memory safety
 property.

The list is not restricted to syntactic issues. ...

(Yes it is, but that is not important because here the problem here is clearly that these terms have wildly different meanings in different communities.) The important distinction is between verifiable and non-verifiable. safe cannot be equivalent to memory safe because safe is verifiable and memory safe is not.
 Are you assuming  trusted as an oracle for memory safety and saying
  trusted
 code is 'verifiably  safe' code? (That's not the intended reading.)

Not at all. Where does the spec suggest that? ...

I'm just trying to find the definition/theorem we do not agree on.
 Signed integer overflow, for example, is not listed.

behaviour in D? (This would again contradict the documentation.)

I know the spec says it follows 2's complement arithmetic. I'm not 100% confident we can rely on that for all 2's complement CPUs, and furthermore we have a bit of a problem in relying on optimizers built for C/C++ which rely on it being undefined behavior. In any case, it is still not an issue for safe.

Maybe not in practice (though I'll not bet on it), but a conforming implementation can do _anything at all_ if undefined behaviour occurs, including behaving as if memory had been corrupted.
Jun 15 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/15/2014 4:26 PM, Timon Gehr wrote:
 On 06/16/2014 01:06 AM, Walter Bright wrote:
 I don't understand your question. I don't know what is unhelpful about
 saying that  safe refers to memory safety.
 ...

You stated the two to be equivalent earlier, which is impossible.

It is for Java, why should D be different?
 The list is not restricted to syntactic issues.


No, it is not. For example, assigning an int to a pointer is a semantic issue, not a semantic one.
 but that is not important because here the problem here is clearly
 that these terms have wildly different meanings in different communities.)

I don't think the distinction is hard to grasp.
 I'm just trying to find the definition/theorem we do not agree on.

I strongly suggest that you can help by identifying specific issues in bugzilla and marking them with the 'safe' keyword, as I suggested earlier. I do not believe that memory safety is a difficult concept to agree on.
Jun 15 2014
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/15/2014 9:49 PM, Walter Bright wrote:
 No, it is not. For example, assigning an int to a pointer is a semantic issue,
 not a semantic one.

Gack, I meant "not a SYNTACTIC one".
Jun 15 2014
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/16/2014 06:49 AM, Walter Bright wrote:
 On 6/15/2014 4:26 PM, Timon Gehr wrote:
 On 06/16/2014 01:06 AM, Walter Bright wrote:
 I don't understand your question. I don't know what is unhelpful about
 saying that  safe refers to memory safety.
 ...

You stated the two to be equivalent earlier, which is impossible.

It is for Java,

No. sun.misc.Unsafe can be used to write memory safe code.
 why should D be different?


 The list is not restricted to syntactic issues.


No, it is not. For example, assigning an int to a pointer is a semantic issue, not a [syntactic] one. ...

By extension, memory safety is a semantic issue. Since it is non-trivial, it is in general not verifiable. (This is Rice's Theorem.) safe-ty OTOH is a verifiable property of the program.
 ...

 I'm just trying to find the definition/theorem we do not agree on.

I strongly suggest that you can help by identifying specific issues in bugzilla

My point was that no implementation of safe whatsoever can make it _equivalent_ to memory safety (i.e. safe will never single out precisely those programs that do not corrupt memory). It will always only approximate memory safety. This is not a bug, it's just a fact.
 and marking them with the 'safe' keyword, as I suggested
 earlier.

In my book, the main problem is that safe is not specified in a way that is easy to be verified to guarantee memory safety. I think plugging holes one after another as they are discovered in real code until _hopefully_ none remain is not a very productive use of time. We should strive to keep safe sound during its whole implementation.
 I do not believe that memory safety is a difficult concept to
 agree on.

Why? Many superficially simple informal concepts are difficult to agree on. In this case this is witnessed by the fact that you do not seem to want to ban undefined behaviour from safe code which I can't agree with.
Jun 16 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
You have a lot of good thoughts on this issue.

You have specific problems in mind but do not identify them other than in vague 
terms. I guarantee that I am really, really bad at mindreading.

Instead, I would strongly prefer that you:

1. write bugzilla issues for the problems that you identify

2. connect those bugzilla issues regarding  safe with the 'safe' keyword

3. construct specific proposals for how to fix these issues. The proposals
don't 
actually have to be pull requests, just descriptions on the bugzilla pages for 
how to fix.

4. write pull requests for the documentation


Even doing just one of 1..4 would be most appreciated. Merely asserting that 
 safe is broken/buggy/etc. will not result in progress.
Jun 17 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/17/2014 1:30 PM, deadalnix wrote:
 I don't think he will. He's been explaining you for the past
 several posts that this process is fundamentally broken, and
 won't achieve the goals of  safe.

 Please have a step back and look at the broader issue here.

What plan of action do you propose?
Jun 17 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/17/2014 1:59 PM, deadalnix wrote:
 On Tuesday, 17 June 2014 at 20:50:01 UTC, Walter Bright wrote:
 What plan of action do you propose?


The default is system.
 unless they are proven  safe instead of
 trying to plug the holes in the condom one by one and hope you
 don't end up with 20 years of child support.

Instead of asserting there are unspecified holes, please identify them with bugzilla issues and connect them with the 'safe' keyword. Doing otherwise is useless and nothing will come of it. Arguing that the spec is wrong without identifying specifically what is wrong and how it should be corrected is useless and nothing will come of it. There are many people (particularly Kenji) who spend many hours combing bugzilla looking for things to fix. Issues not in bugzilla do not get fixed. Nonspecific complaints on the n.g. are ineffective in garnering improvement.
Jun 17 2014
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/17/2014 2:50 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 Out of curiosity, what is "memory safety" defined to be?

http://en.wikipedia.org/wiki/Memory_safety
 Does it include running out of stack?

Depends. If the stack has protection against stack overflow, then it is memory safe. If not, it is not memory safe. A guard page is an example of the former, but D's fibers suffer from the latter.
 Then just define the undefined behaviour as yielding the integer result of a
 unspecified black box function of the input. Integer overflow should yield the
 same value for the same operands on all ALUs. I don't understand how this
 relates to memory safety?

Exactly - offer an improved wording rather than just throwing up hands.
Jun 17 2014
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/17/2014 11:50 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 ...
 Btw, Rice's theorem is based on the halting problem for TMs… so it
 suffers from the same issues as everything else in theoretical CS when
 it comes to practical situations.

There is no such 'issue', or any meaningful way to define a 'practical' situation, especially such that the definition would apply to the current context. Theoretical computer scientists are saying what they are saying and they know what they are saying.
 Whether generated IR contains unsafe
 instructions is trivially decidable. Since you can define an IR in a way
 that discriminate between unsafe/safe instructions you can also decide
 that the safe subset is verifiable memory safe.

As you know this will not single out _exactly_ the subset of programs which is memory safe which is the claim I was arguing against, and invoking Rice's theorem to this end is perfectly fine and does not suffer from any 'issues'. The kind of thing you discuss in this paragraph, which you appear to consider 'practical' is also studied in theoretical CS, so what's your point?
Jun 17 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/17/2014 11:50 PM, deadalnix wrote:
 and the fact that  safe is defined backward (ie by listing what is not allowed
and
 adding to the list when new holes are discovered

https://issues.dlang.org/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&keywords=safe%2C%20&keywords_type=allwords&list_id=41168&query_format=advanced Currently, there are zero bugzilla issues tagged with 'safe'. Please file bugzilla issues for which ones are discovered and tag them!
Jun 18 2014
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/18/2014 12:05 AM, deadalnix wrote:
 On Wednesday, 18 June 2014 at 07:02:43 UTC, Walter Bright wrote:
 On 6/17/2014 11:50 PM, deadalnix wrote:
 and the fact that  safe is defined backward (ie by listing what is not
 allowed and
 adding to the list when new holes are discovered

https://issues.dlang.org/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&keywords=safe%2C%20&keywords_type=allwords&list_id=41168&query_format=advanced Currently, there are zero bugzilla issues tagged with 'safe'. Please file bugzilla issues for which ones are discovered and tag them!

I don't even know what to answer to that. We are clearly talking past each other here, and I have no idea how to convey the message in a better way.

1. A long list of problems with safe has been asserted, but I have not been able to elicit any enumeration of them, so the extent of this problem is completely unknown. 2. The definition of safe in the spec is asserted to be utterly wrong, but no corrected definition has been proposed. 3. A new approach to designing safe has been proposed in vague terms, but nothing specific and no offers of help to flesh it out. From my perspective, it is like bug reports I'd often get for the compiler that consisted solely of: "Your compiler doesn't work." It's just not helpful. There's nothing I can do with that. Also, D is a collaborative effort. If there's an issue that engages your interest, step up and help out. I simply cannot do everything. This n.g. is full of "you should do this, you should do that" largely directed at me. You guys want things to happen, make them happen!
Jun 18 2014
next sibling parent reply Rikki Cattermole <alphaglosined gmail.com> writes:
On 18/06/2014 8:52 p.m., Joakim wrote:
 On Wednesday, 18 June 2014 at 08:13:59 UTC, Walter Bright wrote:
 From my perspective, it is like bug reports I'd often get for the
 compiler that consisted solely of:

     "Your compiler doesn't work."

 It's just not helpful. There's nothing I can do with that.

Lol, I'd love to see your response to them.
 Also, D is a collaborative effort. If there's an issue that engages
 your interest, step up and help out. I simply cannot do everything.
 This n.g. is full of "you should do this, you should do that" largely
 directed at me. You guys want things to happen, make them happen!

I largely agree with this assessment, but would also like to point out that while almost all programmers use compilers, very few know how to improve one. You can help by fleshing out the dmd source guide so that it's better, then pointing people at it when they demand stuff from you: ;) http://wiki.dlang.org/DMD_Source_Guide I know, more work for you and the few dmd contributors, but you guys can help others get involved with better documentation for dmd.

From my experience, one of the reasons I haven't had much to do with the development of dmd is simply to compile dmd, druntime is not as straight forward as it could be. All I can say is with ddmd make it compilable with dub same goes with druntime. With a nice tutorial on how to set it all up on Windows/Linux ext. Preferably with a nice automated means of, this is how we make a release build for platform x. It'll go along way. It really would. And anyway it'll really show off the power of dub ;) But in saying all this, I may want to learn to write my own compiler ext. I just don't have the knowledge and even getting into x86 instruction encoding is head hurting alone as it stands.
Jun 18 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/18/2014 2:16 AM, Rikki Cattermole wrote:
  From my experience, one of the reasons I haven't had much to do with the
 development of dmd is simply to compile dmd, druntime is not as straight
forward
 as it could be.

You don't need to actually hack on dmd or druntime to make valuable contributions. Getting problem reports and reproducible test cases into bugzilla is much needed work, for example.
Jun 18 2014
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/19/2014 4:12 AM, Artur Skawina via Digitalmars-d wrote:
 On 06/18/14 18:42, Dicebot via Digitalmars-d wrote:
 On Wednesday, 18 June 2014 at 11:09:14 UTC, Artur Skawina via Digitalmars-d
wrote:
 The number of potential contributors is
 already low enough, and the fuzzy licensing situation drives away the
 most valuable ones (since those will often be the ones which treat the
 legal aspects seriously and the risks involved when dealing with this
 kind of mixed free and proprietary code within one repo are too high).

Wait what? Do you know a single person who decided to not work on DMD FE because of kind of formally (but not practically) non-free backend?

Well, do you think I would have said what I did if this issue didn't affect /me/? [1]

The front end is now Boost licensed.
Jun 19 2014
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 6/19/2014 12:59 PM, Joakim wrote:
 Admittedly his concerns are unclear, but his problem is with the backend, not
 the frontend, which he already said he likes better now that it's
 boost-licensed.  He claims that the proprietary backend scares potential
 contributors away and that it should be "split up" from the frontend, by which
 he might mean putting it in a separate git repo?

 I don't know enough about these copyright tainting concerns to say if it's a
 good idea, just pointing out that he was talking about the backend, not the
 frontend.

Ok, but I'll also point out that LDC and GDC are fully free & open source.
Jun 19 2014
parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/20/2014 5:13 AM, Artur Skawina via Digitalmars-d wrote:
 On 06/20/14 13:51, David Nadlinger via Digitalmars-d wrote:
 Which wouldn't really help Artur (whether his concerns are justified or
 not), as we usually tell people to contribute their frontend patches
 directly to the upstream DMD repository.

Yes. Also, like I've already said, working on top of a downstream tree would eventually either result in a fork, or fail, with the latter being the much more likely result. I'll just add that I now think I've overstated the gains that splitting out the free frontend would bring. That's because I've since realized how hard it would still be to deal with development on top of git head, when one can not immediately test the result /on real code/ (ie using a non-dmd backend). Without a truly shared frontend, there's no good solution, I guess. :(

Just install dmd on your machine and contribute to the front end that way. Just because the back end is not what you want, it isn't going to corrupt your free open source contributions in the slightest, and those changes will make their way into LDC and GDC. Or you can contribute to the repositories for GDC and LDC directly, and then issue PR's for them into the main front end github.
Jun 20 2014
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/22/2014 9:08 AM, Dicebot wrote:
 All DMD source files mention "Copyright (c) *** by Digital Mars". As far as I
 understand that implies that any pull request that does not explicitly mentions
 copyright does assignment automatically. Which totally makes sense because
 source base with distributed copyright ownership is extremely inflexible to
 maintain, to the point where it is better to simply reject such pull request
 than to deal with it.

Copyright assignment is required in other larger projects, like Linux and gcc.
Jun 22 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/22/2014 2:02 PM, Joakim wrote:
 Since Artur is being so evasive, I believe he's talking about the same reasons
 why Walter purposely won't even look at llvm code, which is basically
 BSD-licensed.  Any time you can say you haven't even looked at any outside
code,
 let alone contributed to it, you save yourself legal hassles.  I think he's
 saying that many potential contributors will see the legal uncertainty from
 dmd's licenses and just pass on contributing to dmd.

 I don't know how real a concern this is, as I've thankfully never had to deal
 with these copyright-tainting issues.

My main issue is simply protecting DMD from someone, years later, asserting some sort of claim over it. Also, I need to be able to deal with issues as they come up, and some contributors may be unreachable (and this has happened multiple times). With copyright assignment, these issues go away. I don't know why any contributors need to be concerned about the Boost license, or the copyright assignment, affecting them. As far as credit goes, the github repository provides ample credit to who did what, and in fact I encourage people to use their real names on it so that they get the credit they're due.
Jun 22 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/18/2014 10:18 AM, H. S. Teoh via Digitalmars-d wrote:
 Everyone talks about it, but nobody does anything about it, so here goes
 nothing:

 	https://issues.dlang.org/show_bug.cgi?id=12941

 Obviously, I have no idea what should go in that list, so everyone who
 cares about this issue, please chime in. Thanks!

 P.S. I've also tagged a whole bunch of issues with 'safe'. It's just
 based on a quick skim over all issues that turned up when I searched for
 'safe', so I may have mistagged some, and missed others, but this is
 just to get the ball rolling. Please fix wrong/missing tags if you find
 them. :)

Thanks!
Jun 18 2014
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 06/19/2014 06:06 AM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 ...
 In the real world you work with typical programs that run on finite
 resources guided by heuristics. There is no proof that you cannot have
  safe.

I assume you mean safe <===> memory safety.
 So leave that line of arguing. It is fundamentally flawed.

No, your line of reasoning is flawed. The amount of resources is not a constant. You must prove that memory safety holds for _each possible_ amount of resources at which point you haven't won anything by talking about resource usage, or else you need to set an explicit resource bound _at the language level_ and enforce it.
Jun 19 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 06/19/2014 10:39 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Thursday, 19 June 2014 at 20:26:27 UTC, Timon Gehr wrote:
 ...
 No, your line of reasoning is flawed. The amount of resources is not a
 constant. You must prove that memory safety holds for

I have not set out to prove anything,

This is a discussion about proving memory safety. (Though this is not at all apparent from the original topic. This sort of went out of hand. :o))
 I dislike how people abuse CS in order to "win" an argument.

Sorry for having "won" the argument if that is what you are implying. That was not my intention and this is not actually a contest. My intention was to defend the use of properties of Turing machines in the context of a Turing complete programming language. I wanted to convey the message that it has no "issues" and it is not "fundamentally flawed". I also don't think I am guilty of "abuse".
 If you guys want to leverage CS theory do it
 properly or leave it out. Just because you have read Garey and Johnson

I haven't, if that helps.
 does not mean that you should leverage it without proper treatment.

I failed to decipher this part of the sentence. Is it just some more boring name calling or is there an insight behind it? NB: I have started to write down a complete-ish list of safe language features and I will post them to the issue opened by H.S. Teoh soon.
Jun 19 2014
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 06/19/2014 09:06 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 On Thursday, 19 June 2014 at 18:25:00 UTC, Tobias Pankrath wrote:
 It's not. Since the resources to verify a property are limited, too.

 If I need to enumerate all possible program states, there will always
 exists a program that can run on my box but will not be verifiable on
 it (since I lack the resources).

Nobody have said you should enumerate all possible states. There's a significant difference between a proof and an algorithm.

I don't think it is that significant. They are basically the same except that the proof is typically more strongly typed and you cannot actually execute a proof by excluded middle in the general case.
Jun 19 2014
prev sibling next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Jun 19, 2014 at 06:24:59PM +0000, Tobias Pankrath via Digitalmars-d
wrote:
 On Thursday, 19 June 2014 at 04:06:51 UTC, Ola Fosheim Grstad wrote:
On Wednesday, 18 June 2014 at 21:57:40 UTC, deadalnix wrote:
Granted infinite resources. Good, now that we ruled that thing out,
can we talk about the subject, or do we need to continue talking
about imaginary things ?

No, finite resources, and that is only a worst case upper bound. It does not preclude the possibility of doing better. It also does not say how much work it is in the typical case e.g. the kind of programs you wrote when you try to make code safe. What it does prove is that the problem is computable, that is a major difference. Please note that the halting problem does not address difficulty but analytical computability. In the real world you work with typical programs that run on finite resources guided by heuristics. There is no proof that you cannot have safe. So leave that line of arguing. It is fundamentally flawed.

It's not. Since the resources to verify a property are limited, too. If I need to enumerate all possible program states, there will always exists a program that can run on my box but will not be verifiable on it (since I lack the resources). It's also not enough for safe to be verifiable, it should not slow down the compilation time too much as well.

Exactly. Just because something is *finite*, doesn't necessarily mean it's practical. Anything that requires enumeration of all possible states is impractical, because it has an exponential worst case bound, which is unacceptable for compilation time (or for anything, really, except in the most trivial cases). To get an idea of how intractible it can get, suppose your entire program state is fully described by, say, 1 MB worth of variables (that's pretty conservative, considering the size of programs these days and the size of the data they deal with). The upper bound to enumerating all program states is 2^(1 MB) = 2^2^20 = ... (a number with 324,936 digits). Note that that's not 324,936 *states*, but that's the number of *digits* in the number of states! It's a finite number, sure, but good luck enumerating all program states within the lifetime of the universe (don't even talk about within your lifetime!). Even if you run it on a hypothetical distributed supercomputer with 100 billion nodes, that barely even begins to make a dent in the number of states you need to evaluate (the number of states per node is a number with "only" 324,925 digits, haha). And mind you, 1 MB of state is pathetically small. Real programs these days require GB's of memory, and that will cause the number of states to explode far, far beyond any imaginable computing capabilities of the human race, past, present, or future -- unless we manage to invent a device that exploits time contraction near a black hole's horizon to compute the halting problem in finite time (i.e., it's pure fiction). T -- Question authority. Don't ask why, just do it.
Jun 19 2014
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 06/19/2014 11:28 PM, "Ola Fosheim Grøstad" 
<ola.fosheim.grostad+dlang gmail.com>" wrote:
 But we aren't talking machine language, we are talking D.

This part is spot on.
Jun 19 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 19 June 2014 at 18:50:27 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 Exactly.  Just because something is *finite*, doesn't 
 necessarily mean
 it's practical.  Anything that requires enumeration of all 
 possible
 states is impractical, because it has an exponential worst case 
 bound,
 which is unacceptable for compilation time (or for anything, 
 really,
 except in the most trivial cases).

You are being silly. You either discuss computability or you discuss complexity. Please don't mix the two topics. You either discuss a useful verification of safe features or you don't. It's not at all impossible to verify memory safety for real time applications. Real time applications have an upper bound on how long they should run. This even holds for TMs. The problem "does the TM terminate within N steps" is decidable.
Jun 19 2014
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 19 June 2014 at 19:12:53 UTC, Ola Fosheim Grøstad
wrote:
 You are being silly. You either discuss computability or you 
 discuss complexity. Please don't mix the two topics.

Do you have something to contribute to THIS discussion, or ill you continue to bring irrelevant topic in ?
Jun 19 2014
prev sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Wed, Jun 18, 2014 at 04:39:27PM +0000, Dicebot via Digitalmars-d wrote:
 On Wednesday, 18 June 2014 at 07:05:13 UTC, deadalnix wrote:
On Wednesday, 18 June 2014 at 07:02:43 UTC, Walter Bright wrote:
On 6/17/2014 11:50 PM, deadalnix wrote:
and the fact that  safe is defined backward (ie by listing what is not
allowed and
adding to the list when new holes are discovered

https://issues.dlang.org/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&keywords=safe%2C%20&keywords_type=allwords&list_id=41168&query_format=advanced Currently, there are zero bugzilla issues tagged with 'safe'. Please file bugzilla issues for which ones are discovered and tag them!

I don't even know what to answer to that. We are clearly talking past each other here, and I have no idea how to convey the message in a better way.

Create a bugzilla issue to make everything unsafe and list there constructs you think can be defined as safe :)

Everyone talks about it, but nobody does anything about it, so here goes nothing: https://issues.dlang.org/show_bug.cgi?id=12941 Obviously, I have no idea what should go in that list, so everyone who cares about this issue, please chime in. Thanks! P.S. I've also tagged a whole bunch of issues with 'safe'. It's just based on a quick skim over all issues that turned up when I searched for 'safe', so I may have mistagged some, and missed others, but this is just to get the ball rolling. Please fix wrong/missing tags if you find them. :) T -- I'm still trying to find a pun for "punishment"...
Jun 18 2014
prev sibling next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/15/2014 2:30 AM, Maxim Fomin wrote:
 On Wednesday, 11 June 2014 at 16:50:30 UTC, Walter Bright wrote:
 On 6/11/2014 4:34 AM, Timon Gehr wrote:
 Not memory safe implies (is supposed to imply) not  safe but not  safe does not
 imply not memory safe.

safe in D == memory safe.

Why? I found dozens of cases when safe is broken, let alone other issues in bugzilla.

Those are bugs. We aim to fix them.
 By the way, memory safety is also compromised by compiler bugs which make him
 generate buggy code. Note, when I counted memory safety problems, codegen
issues
 were not counted.

I don't think we should mix intention with bugs. Bugs need to get fixed.
 I have reached this conclusion some years ago and nothing has change in D since
 then which make me reconsidering my opinion (including that  safe holes were
not
 fixed).

I do welcome help in fixing bugs.
Jun 15 2014
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 6/15/2014 2:30 AM, Maxim Fomin wrote:
 I found dozens of cases when  safe is broken, let alone other issues in
 bugzilla.

I have added the keyword 'safe' to bugzilla. I'd appreciate it if you would go through the bugzilla issues you've identified with safe, and mark them with the 'safe' keyword. Thanks!
Jun 15 2014
prev sibling parent Sean Cavanaugh <WorksOnMyMachine gmail.com> writes:
On 6/11/2014 8:56 AM, Remo wrote:
 This is pretty strange behavior.
 At least on Windows I can not confirm this.
 Visual Studio 2013, Intel Compiler and Clang for windows have the same
 consistent behavior here.

 private  do NOT affect struct size.

 But there is a parameter in Visual Studio that will affect it called
 "Struct Member Alignment"
 http://msdn.microsoft.com/en-us/library/xh3e3fd0.aspx

 1 Byte (/Zp1)  sizeof(S1)=5,  sizeof(S2)=6
 2 Byte (/Zp2)  sizeof(S1)=6,  sizeof(S2)=8
 4 Byte (/Zp4)  sizeof(S1)=8,  sizeof(S2)=12
 8 Byte (/Zp8)  sizeof(S1)=8,  sizeof(S2)=12   this is the default.

 Of course the same can be done with #pragma pack(?) .

 There is also __declspec(align(?)) and in C++11  alignas(?) but its
 behavior is different from #pragma pack(?) .

For inheritance I've seen gcc and clang allow the inheriting to go into the tail padding of the previous class in the hierarchy, technically legal but always gives me scary thoughts before going to sleep at night. MSVC pads up the classes to the alignment size and inheritance starts from there, a bit wasteful but far safer. struct foo { int x; char y; }; struct bar : public foo { char z; } sizeof(foo == 8); sizeof(bar == 8); // clang and gcc on linux sizeof(bar == 12); // msvc The windows case is generally safer as memset(&class, 0, sizeof(class)) won't clobber inherited members, but its also not supposed to be legal to do things like memsetting anything that fails is_pod etc in c++, however since nearly all code I've worked with has constructors is_pod is useless and people just memset struct-like objects without a care int he world, which can be fun when porting :)
Jun 14 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 10 June 2014 at 20:50:46 UTC, Walter Bright wrote:
 On 6/10/2014 12:30 AM, deadalnix wrote:
 thought ?

This does not apply to D structs because D structs do not inherit. C doesn't have classes, so no issues there. extern(C++) class should match the C++ ABI for this. extern(D) class we are free to innovate with the layout. I suggest turning this into a bugzilla enhancement request.

I'm talking about structs, not classes.
Jun 10 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 11 June 2014 at 02:10:18 UTC, Walter Bright wrote:
 On 6/10/2014 5:27 PM, deadalnix wrote:
 I'm talking about structs, not classes.

Ok, but since D structs do not inherit, how does tail pad optimization apply?

struct S1 { int a; byte b; } struct S2 { S1 s1; char c; } Sé could be 8 byte long (some variation of this are in C++ for instance if I make a public and b private).
Jun 10 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 11 June 2014 at 03:19:55 UTC, Walter Bright wrote:
 On 6/10/2014 7:44 PM, deadalnix wrote:
 On Wednesday, 11 June 2014 at 02:10:18 UTC, Walter Bright 
 wrote:
 On 6/10/2014 5:27 PM, deadalnix wrote:
 I'm talking about structs, not classes.

Ok, but since D structs do not inherit, how does tail pad optimization apply?

struct S1 { int a; byte b; } struct S2 { S1 s1; char c; } Sé could be 8 byte long (some variation of this are in C++ for instance if I make a public and b private).

Do any C++ compilers do this for this case, or just for the inheritance one?

They do it in the condition mentioned for extern(C++) in my first post (at least clang and gcc, C++ ABI isn't well defined, so other compiler may do thing differently).
Jun 10 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 11 June 2014 at 04:11:53 UTC, Walter Bright wrote:
 Hmm, this could have serious problems with the following:

 S1 s1;
 S2 s2;
 s2.c = 3;
 memcpy(&s2.s1, &s1, sizeof(S1)); // Oops! stomped on s2.c

Yes, that is why they do it only in specific condition (ie when the thing use C++ specific feature and not C one). This kind of thing is valid in C, but isn't supposed to be in C++, and certainly not in safe D.
Jun 10 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 11 June 2014 at 06:30:26 UTC, Walter Bright wrote:
 On 6/10/2014 11:11 PM, deadalnix wrote:
 On Wednesday, 11 June 2014 at 04:11:53 UTC, Walter Bright 
 wrote:
 Hmm, this could have serious problems with the following:

 S1 s1;
 S2 s2;
 s2.c = 3;
 memcpy(&s2.s1, &s1, sizeof(S1)); // Oops! stomped on s2.c

Yes, that is why they do it only in specific condition (ie when the thing use C++ specific feature and not C one).

I don't understand - didn't you say this was expressible as C structs? Aren't those supposed to be compatible with C?

struct S1 { int a; private: char b; }; struct S2 : S1 { char c; }; S2 is 8 bytes. If you remove private in S1, then S2 becomes 12 bytes. S1 is not a "C" struct anymore and do not need to follow the standard layout.
 and certainly not in  safe D.

I'm not so sure about that, either. There are many ways of bit copying structs, and some of them are perfectly memory safe.

It is not provable by the compiler, therefore it is not safe.
 The fundamental problem with this optimization is now a struct 
 has two sizes - the unpadded and the padded size. I foresee 
 nothing but problems, corner cases, and bugs inherent in trying 
 to select which size to use for which purpose. I do not 
 understand how C++ avoids these issues.

By not spilling out for struct that do not have standard layout. In above example, the compiler won't issue code that manipulate the padding after S1 as it may contains something.
Jun 11 2014
prev sibling next sibling parent "Remo" <remo4d gmail.com> writes:
On Wednesday, 11 June 2014 at 07:22:51 UTC, deadalnix wrote:
 On Wednesday, 11 June 2014 at 06:30:26 UTC, Walter Bright wrote:
 On 6/10/2014 11:11 PM, deadalnix wrote:
 On Wednesday, 11 June 2014 at 04:11:53 UTC, Walter Bright 
 wrote:
 Hmm, this could have serious problems with the following:

 S1 s1;
 S2 s2;
 s2.c = 3;
 memcpy(&s2.s1, &s1, sizeof(S1)); // Oops! stomped on s2.c

Yes, that is why they do it only in specific condition (ie when the thing use C++ specific feature and not C one).

I don't understand - didn't you say this was expressible as C structs? Aren't those supposed to be compatible with C?

struct S1 { int a; private: char b; }; struct S2 : S1 { char c; }; S2 is 8 bytes. If you remove private in S1, then S2 becomes 12 bytes. S1 is not a "C" struct anymore and do not need to follow the standard layout.

This is pretty strange behavior. At least on Windows I can not confirm this. Visual Studio 2013, Intel Compiler and Clang for windows have the same consistent behavior here. private do NOT affect struct size. But there is a parameter in Visual Studio that will affect it called "Struct Member Alignment" http://msdn.microsoft.com/en-us/library/xh3e3fd0.aspx 1 Byte (/Zp1) sizeof(S1)=5, sizeof(S2)=6 2 Byte (/Zp2) sizeof(S1)=6, sizeof(S2)=8 4 Byte (/Zp4) sizeof(S1)=8, sizeof(S2)=12 8 Byte (/Zp8) sizeof(S1)=8, sizeof(S2)=12 this is the default. Of course the same can be done with #pragma pack(?) . There is also __declspec(align(?)) and in C++11 alignas(?) but its behavior is different from #pragma pack(?) .
Jun 11 2014
prev sibling next sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Wednesday, 11 June 2014 at 16:50:30 UTC, Walter Bright wrote:
 On 6/11/2014 4:34 AM, Timon Gehr wrote:
 Not memory safe implies (is supposed to imply) not  safe but 
 not  safe does not
 imply not memory safe.

safe in D == memory safe.

What Timon is saying is that not all memory safe code is verifiably safe. safe => memory safe, but not memory safe => safe. David
Jun 15 2014
prev sibling next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
David Nadlinger:

  safe => memory safe, but not memory safe =>  safe.

Apparently that's not true, according to the post on the LLVM blog linked here: http://forum.dlang.org/thread/okratgxrikskmylmwrrx forum.dlang.org Bye, bearophile
Jun 15 2014
prev sibling next sibling parent "Maxim Fomin" <maxim maxim-fomin.ru> writes:
On Wednesday, 11 June 2014 at 16:50:30 UTC, Walter Bright wrote:
 On 6/11/2014 4:34 AM, Timon Gehr wrote:
 Not memory safe implies (is supposed to imply) not  safe but 
 not  safe does not
 imply not memory safe.

safe in D == memory safe.

Why? I found dozens of cases when safe is broken, let alone other issues in bugzilla. After thinking about the safety in D my conclusion is that it is impossible to evaluate memory and safety correctness of a code in which static type says nothing about where an object is allocated. Contrary to popular wisdom, one can have fixed array on heaps, classes on stack, etc. If any code which is potentially not safe is rejected, this would lead to lots of false positives making using the language very inconvenient. Even in that case, safety cannot be guaranteed by the language due to other issues, like separate compilation, etc. By the way, memory safety is also compromised by compiler bugs which make him generate buggy code. Note, when I counted memory safety problems, codegen issues were not counted. safe should not be considered as feature which ensures that the code is memory safe, but as a feature rejecting code with high probability of memory bugs. The difference is very important. I have reached this conclusion some years ago and nothing has change in D since then which make me reconsidering my opinion (including that safe holes were not fixed). Note, that I do not critique the language, it is fine. It is very good system level language with powerful modelling features. It is not good at verifying memory correctness, because in such languages (unlike managed ones) burden of memory correctness lies mostly on programmer. If he thinks that he can outsource memory correctness to safe, there is high probability that he would be suddenly surprised.
Jun 15 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 17 June 2014 at 20:25:52 UTC, Walter Bright wrote:
 You have a lot of good thoughts on this issue.

 You have specific problems in mind but do not identify them 
 other than in vague terms. I guarantee that I am really, really 
 bad at mindreading.

 Instead, I would strongly prefer that you:

 1. write bugzilla issues for the problems that you identify

 2. connect those bugzilla issues regarding  safe with the 
 'safe' keyword

 3. construct specific proposals for how to fix these issues. 
 The proposals don't actually have to be pull requests, just 
 descriptions on the bugzilla pages for how to fix.

 4. write pull requests for the documentation


 Even doing just one of 1..4 would be most appreciated. Merely 
 asserting that  safe is broken/buggy/etc. will not result in 
 progress.

I don't think he will. He's been explaining you for the past several posts that this process is fundamentally broken, and won't achieve the goals of safe. Please have a step back and look at the broader issue here.
Jun 17 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 17 June 2014 at 20:50:01 UTC, Walter Bright wrote:
 On 6/17/2014 1:30 PM, deadalnix wrote:
 I don't think he will. He's been explaining you for the past
 several posts that this process is fundamentally broken, and
 won't achieve the goals of  safe.

 Please have a step back and look at the broader issue here.

What plan of action do you propose?

Make everything system unless they are proven safe instead of trying to plug the holes in the condom one by one and hope you don't end up with 20 years of child support.
Jun 17 2014
prev sibling next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Monday, 16 June 2014 at 13:12:33 UTC, Timon Gehr wrote:
 My point was that no implementation of  safe whatsoever can 
 make it _equivalent_ to memory safety (i.e.  safe will never 
 single out precisely those programs that do not corrupt 
 memory). It will always only approximate memory safety. This is 
 not a bug, it's just a fact.

Out of curiosity, what is "memory safety" defined to be? Does it include running out of stack? It should not be a problem if safe rejects some memory safe programs. It still verifies that they are memory safe if they pass (as opposed to "if and only if"). You can verify subsets even if you cannot define two crisp set that discriminate between all possible programs. Btw, Rice's theorem is based on the halting problem for TMs… so it suffers from the same issues as everything else in theoretical CS when it comes to practical situations. Whether generated IR contains unsafe instructions is trivially decidable. Since you can define an IR in a way that discriminate between unsafe/safe instructions you can also decide that the safe subset is verifiable memory safe.
 to agree on. In this case this is witnessed by the fact that 
 you do not seem to want to ban undefined behaviour from  safe 
 code which I can't agree with.

Then just define the undefined behaviour as yielding the integer result of a unspecified black box function of the input. Integer overflow should yield the same value for the same operands on all ALUs. I don't understand how this relates to memory safety?
Jun 17 2014
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Jun 19, 2014 at 07:12:52PM +0000, via Digitalmars-d wrote:
 On Thursday, 19 June 2014 at 18:50:27 UTC, H. S. Teoh via Digitalmars-d
 wrote:
Exactly.  Just because something is *finite*, doesn't necessarily
mean it's practical.  Anything that requires enumeration of all
possible states is impractical, because it has an exponential worst
case bound, which is unacceptable for compilation time (or for
anything, really, except in the most trivial cases).

You are being silly. You either discuss computability or you discuss complexity. Please don't mix the two topics. You either discuss a useful verification of safe features or you don't. It's not at all impossible to verify memory safety for real time applications. Real time applications have an upper bound on how long they should run. This even holds for TMs. The problem "does the TM terminate within N steps" is decidable.

Decidable doesn't mean feasible. You have to run a simulator of the program for N steps in order to determine whether it terminates. If N is large, this means compilation will be impractically slow. The halting problem is not just about needing infinite time, it's also about the non-existence of algorithms that can analyze every possible program. Even if a given TM never terminates, as long as it fits into an analysable pattern, we can prove its non-termination in finite time. The analysing algorithm is able to "compress" the proof into finite length. But the non-solvability of the halting problem means that there is no algorithm that can compress every possible program. This means there is no such algorithm that can determine termination within N steps for arbitrary N, except the one that enumerates all possible states up to N steps. If I have code like this: while (!finished) { if (arbitarilyComplexFalseCondition) unsafeOperation(); else safeOperation(); finished = arbitrarilyComplexStoppingCondition; } you cannot, in general, prove safety without enumerating all possible states (since you can't predict which branch the if statement will take). Since enumerating all possible states is impractical, this means your N-step termination problem is practically unsolvable, even if it's decidable in theory. T -- The early bird gets the worm. Moral: ewww...
Jun 19 2014
prev sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 19 June 2014 at 19:37:32 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 But the non-solvability of the halting problem means that there 
 is no
 algorithm that can compress every possible program.

Even if true, programmers don't write every possible program when they try to write safe code.
Jun 19 2014
prev sibling next sibling parent reply "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Tuesday, 17 June 2014 at 22:44:00 UTC, Timon Gehr wrote:
 As you know this will not single out _exactly_ the subset of 
 programs which is memory safe which is the claim I was arguing 
 against,

It will single out exactly the programs that are most certainly memory safe, and also single out those programs that have generated unsafe features (instructions). Whether those unsafe features sit in a basic block that will never called is a different matter since safe does not address that. So yes, safe is not equivalent to memory safety, it is equivalent to the absence of FEATURES that can lead to situations that aren't memory safe. But that is splitting hairs. Absence of features such as unsafe instructions is a trivial property of the compiled program. A non trivial property is whether the program terminates, returns 0, rejects the input "hello", etc.
 and invoking Rice's theorem to this end is perfectly fine and 
 does not suffer from any 'issues'.

Not really, you can prove termination for all programs with 3 instructions and 3 bytes of RAM. You can do it for all programs with 4 instructions and 4 bytes of RAM. You can do it for all programs with N instructions and N bytes of RAM. You cannot do it for all TMs since they have unbounded storage. Thus you can prove non-trivial properties such as whether a functions returns with 1, whether it accepts or rejects the input '0', etc. It might take a lot of time, but you can do it in a fixed amount of time. You cannot do it for TMs. But this issue is simpler than that. Think about it this way: There are problems that have a solution, but it is provable impossible to solve analytically. You still can solve them numerically down to a specified precision. I posit you can do the same with memory safety, as in: you can reject all unsafe programs and reduce the set of false positives to a marginal set of typical programs (which are written with memory safety in mind).
Jun 17 2014
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Thu, Jun 19, 2014 at 08:31:53PM +0000, via Digitalmars-d wrote:
 On Thursday, 19 June 2014 at 19:37:32 UTC, H. S. Teoh via Digitalmars-d
 wrote:
But the non-solvability of the halting problem means that there is no
algorithm that can compress every possible program.

Even if true, programmers don't write every possible program when they try to write safe code.

So how do you, as a compiler writer, predict exactly which programs your users will write? -- because without knowing that, you cannot know which analysis algorithms to implement that will solve the N-step halting problem for all the programs that users will write (even though that's only a very small finite set out of the space of all possible programs!). So the only remaining approach is to consider all possible programs. Which means you have to implement exhaustive state space search. Which is impractical for the reasons I stated. T -- This is a tpyo.
Jun 19 2014
prev sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 19 June 2014 at 20:47:16 UTC, H. S. Teoh via 
Digitalmars-d wrote:
 programs!). So the only remaining approach is to consider all 
 possible
 programs. Which means you have to implement exhaustive state 
 space
 search. Which is impractical for the reasons I stated.

No. Is a program compiled to asm.js memory safe? It has to be, because Javascript is a memory safe environment. You might still mess up badly, but you can obviously transform just about any program that is presumed to be memory unsafe into one that is memory safe (if you consider a function to be a mapping of input to output). Are you telling me that you cannot write an equivalent mapping from input to output using memory safe primitives while staying within P? You can safely insist on only having access to valid memory safe primitives within safe and still be able to solve the exact same problems, in pretty much the same way (on an abstract algorithmic level). It's not like safe guards against exceptions IIRC. You can inject bound checks everywhere and it is provable memory safe. safe should guarantee that your program terminates in an orderly fashion without corrupting memory. safe does not guarantee correctness. That would be a folly. A hypothetical and pointless "not safe" does not guarantee memory corruption either. If you compile to asm.js you obviously always stay memory safe. Or do you define "out of array bounds" exceptions as memory corruption? Now, if we were talking untransformable machine language, you might have a point, not sure. But we aren't talking machine language, we are talking D.
Jun 19 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 18 June 2014 at 06:35:01 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 17 June 2014 at 22:44:00 UTC, Timon Gehr wrote:
 As you know this will not single out _exactly_ the subset of 
 programs which is memory safe which is the claim I was arguing 
 against,

It will single out exactly the programs that are most certainly memory safe, and also single out those programs that have generated unsafe features (instructions). Whether those unsafe features sit in a basic block that will never called is a different matter since safe does not address that. So yes, safe is not equivalent to memory safety, it is equivalent to the absence of FEATURES that can lead to situations that aren't memory safe. But that is splitting hairs.

That isn't splitting hair. You are loosing track of the big picture here. To come back to the original problem : memset(&foo, 0, typeof(foo).sizeof); Walter mention that this is not corrupting memory and therefore safe. In the situation of tail pad optimization, it is obviously not valid anymore and will cause memory corruption. The discussion then derailed to what is safe, as the code is certainly safe, but it is difficult to prove it is automatically) and the fact that safe is defined backward (ie by listing what is not allowed and adding to the list when new holes are discovered rather than listing what is allowed and adding to the list when some construct are proven to be safe).
Jun 17 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 18 June 2014 at 07:02:43 UTC, Walter Bright wrote:
 On 6/17/2014 11:50 PM, deadalnix wrote:
 and the fact that  safe is defined backward (ie by listing 
 what is not allowed and
 adding to the list when new holes are discovered

https://issues.dlang.org/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&keywords=safe%2C%20&keywords_type=allwords&list_id=41168&query_format=advanced Currently, there are zero bugzilla issues tagged with 'safe'. Please file bugzilla issues for which ones are discovered and tag them!

I don't even know what to answer to that. We are clearly talking past each other here, and I have no idea how to convey the message in a better way.
Jun 18 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Wednesday, 18 June 2014 at 06:50:14 UTC, deadalnix wrote:
 That isn't splitting hair. You are loosing track of the big 
 picture here. To come back to the original problem :

 memset(&foo, 0, typeof(foo).sizeof);

 Walter mention that this is not corrupting memory and therefore 
  safe.

memset is by definition an unsafe feature, but a hypothetical cleartype(foo,…) is a safe feature. Imagine having a hypothetical IR that makes the unsafe/safe distinction, then you can have both memset and cleartype in that IR and transform them into memset at the lower level backend without causing problems for memory safety verification.
 In the situation of tail pad optimization, it is obviously not 
 valid anymore and will cause memory corruption.

But if you have cleartype() when you do tail pad optimization and turn it into memset a codegen then you are ok?
 The discussion then derailed to what is  safe, as the code is 
 certainly  safe, but it is difficult to prove it is 
 automatically) and the fact that  safe is defined backward (ie 
 by listing what is not allowed and adding to the list when new 
 holes are discovered rather than listing what is allowed and 
 adding to the list when some construct are proven to be  safe).

I don't disagree, I think the simplest thing to do is to have a hypothetical implied IR between the AST and the backend and accept safe features on that level. I think people who care about safe also will accept that some safe programs are rejected if it is easy to modify the code into something the compiler accepts. The compiler could even provide suggestions: ("consider using cleartype() over memset()" etc).
Jun 18 2014
prev sibling next sibling parent "Joakim" <dlang joakim.airpost.net> writes:
On Wednesday, 18 June 2014 at 08:13:59 UTC, Walter Bright wrote:
 From my perspective, it is like bug reports I'd often get for 
 the compiler that consisted solely of:

     "Your compiler doesn't work."

 It's just not helpful. There's nothing I can do with that.

Lol, I'd love to see your response to them.
 Also, D is a collaborative effort. If there's an issue that 
 engages your interest, step up and help out. I simply cannot do 
 everything. This n.g. is full of "you should do this, you 
 should do that" largely directed at me. You guys want things to 
 happen, make them happen!

I largely agree with this assessment, but would also like to point out that while almost all programmers use compilers, very few know how to improve one. You can help by fleshing out the dmd source guide so that it's better, then pointing people at it when they demand stuff from you: ;) http://wiki.dlang.org/DMD_Source_Guide I know, more work for you and the few dmd contributors, but you guys can help others get involved with better documentation for dmd.
Jun 18 2014
prev sibling next sibling parent "Joakim" <dlang joakim.airpost.net> writes:
On Wednesday, 18 June 2014 at 09:16:32 UTC, Rikki Cattermole 
wrote:
 From my experience, one of the reasons I haven't had much to do 
 with the development of dmd is simply to compile dmd, druntime 
 is not as straight forward as it could be.
 All I can say is with ddmd make it compilable with dub same 
 goes with druntime. With a nice tutorial on how to set it all 
 up on Windows/Linux ext.
 Preferably with a nice automated means of, this is how we make 
 a release build for platform x.

I used this guide from the wiki some time back to start building dmd/druntime/phobos and run their tests, pretty easy: http://wiki.dlang.org/Building_DMD Non-Windows builds are very easy. Windows builds require a bit too much manual configuration, I'll agree with that.
 It'll go along way. It really would. And anyway it'll really 
 show off the power of dub ;)

Daniel should stick his D frontend up there as a library, even if it isn't architected to be used as a library right now. People will find ways to use it and his translated code will get some testing.
Jun 18 2014
prev sibling next sibling parent Artur Skawina via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 06/18/14 10:14, Walter Bright via Digitalmars-d wrote:
 
 Also, D is a collaborative effort. If there's an issue that engages your
interest, step up and help out. I simply cannot do everything. This n.g. is
full of "you should do this, you should do that" largely directed at me. You
guys want things to happen, make them happen!

And most of those requests completely ignore language context, are indiscriminately trying to copy or transplant other concepts, or give a suboptimal answer to the wrong question. It's frustrating to even just read them; I have several times considered unsubscribing, as watching things go in a wrong and very unproductive direction isn't fun. The one thing that /you/ could do, which would make the most difference, is splitting up the compiler into a free frontend [1] and the non-free "rest". The current situation results in people avoiding looking at or even going near DMD at all. The number of potential contributors is already low enough, and the fuzzy licensing situation drives away the most valuable ones (since those will often be the ones which treat the legal aspects seriously and the risks involved when dealing with this kind of mixed free and proprietary code within one repo are too high). (JIC someone suggests working on the frontend via LDC/GDC trees - that would inevitably result in a language fork eventually.) artur [1] the many problems with the artistic license are now gone, after the switch to boost; at least the license itself is no longer an issue.
Jun 18 2014
prev sibling next sibling parent "Matthias Bentrup" <matthias.bentrup googlemail.com> writes:
On Wednesday, 18 June 2014 at 08:13:59 UTC, Walter Bright wrote:
 On 6/18/2014 12:05 AM, deadalnix wrote:
 On Wednesday, 18 June 2014 at 07:02:43 UTC, Walter Bright 
 wrote:
 On 6/17/2014 11:50 PM, deadalnix wrote:
 and the fact that  safe is defined backward (ie by listing 
 what is not
 allowed and
 adding to the list when new holes are discovered

https://issues.dlang.org/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&keywords=safe%2C%20&keywords_type=allwords&list_id=41168&query_format=advanced Currently, there are zero bugzilla issues tagged with 'safe'. Please file bugzilla issues for which ones are discovered and tag them!

I don't even know what to answer to that. We are clearly talking past each other here, and I have no idea how to convey the message in a better way.

1. A long list of problems with safe has been asserted, but I have not been able to elicit any enumeration of them, so the extent of this problem is completely unknown. 2. The definition of safe in the spec is asserted to be utterly wrong, but no corrected definition has been proposed. 3. A new approach to designing safe has been proposed in vague terms, but nothing specific and no offers of help to flesh it out. From my perspective, it is like bug reports I'd often get for the compiler that consisted solely of: "Your compiler doesn't work." It's just not helpful. There's nothing I can do with that. Also, D is a collaborative effort. If there's an issue that engages your interest, step up and help out. I simply cannot do everything. This n.g. is full of "you should do this, you should do that" largely directed at me. You guys want things to happen, make them happen!

You two speak different languages here, I'll try to translate: Basically deadalnix argues that memory safety cannot be checked by a machine (there's a proof for that) and as safe-ness can be checked it obviously is not equivalent to memory safety. Further *any* definition of safe-ness that can be expressed by a finite set of mathematical rules will not be equivalent to memory safety. Then the big misunderstanding is the "equivalence". Equivalence means that 1) safe implies memory-safe 2) ! safe implies !memory-safe Clearly Walter is only talking about 1), and there is no reason why you couldn't have a safe-ness check that implies memory-safety. It just won't be *equivalent*.
Jun 18 2014
prev sibling next sibling parent =?UTF-8?B?Ikx1w61z?= Marques" <luis luismarques.eu> writes:
On Wednesday, 18 June 2014 at 06:35:01 UTC, Ola Fosheim Grøstad
wrote:
 Not really, you can prove termination for all programs with 3 
 instructions and 3 bytes of RAM. You can do it for all programs 
 with 4 instructions and 4 bytes of RAM. You can do it for all 
 programs with N instructions and N bytes of RAM. You cannot do 
 it for all TMs since they have unbounded storage.

 Thus you can prove non-trivial properties such as whether a 
 functions returns with 1, whether it accepts or rejects the 
 input '0', etc. It might take a lot of time, but you can do it 
 in a fixed amount of time. You cannot do it for TMs.

Maybe I missed something in this discussion, but unless you are not including jumps/loops in those N instructions, just because the number of instructions and memory is bounded does not mean you can prove (for arbitrary programs) that the program terminates. I hope I don't shoot myself in the foot by trying to come up with a concrete example, but let me try. Consider the program to calculate whether some point is part of the Mandelbrot set. The program only requires a small and fixed amount of memory but for some points it may iterate forever (practical fractal programs set an iteration limit, but let's assume this one doesn't, for the sake of the argument).
Jun 18 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 18 June 2014 at 07:05:13 UTC, deadalnix wrote:
 On Wednesday, 18 June 2014 at 07:02:43 UTC, Walter Bright wrote:
 On 6/17/2014 11:50 PM, deadalnix wrote:
 and the fact that  safe is defined backward (ie by listing 
 what is not allowed and
 adding to the list when new holes are discovered

https://issues.dlang.org/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&keywords=safe%2C%20&keywords_type=allwords&list_id=41168&query_format=advanced Currently, there are zero bugzilla issues tagged with 'safe'. Please file bugzilla issues for which ones are discovered and tag them!

I don't even know what to answer to that. We are clearly talking past each other here, and I have no idea how to convey the message in a better way.

Create a bugzilla issue to make everything unsafe and list there constructs you think can be defined as safe :)
Jun 18 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Wednesday, 18 June 2014 at 11:09:14 UTC, Artur Skawina via 
Digitalmars-d wrote:
 The number of potential contributors is
 already low enough, and the fuzzy licensing situation drives 
 away the
 most valuable ones (since those will often be the ones which 
 treat the
 legal aspects seriously and the risks involved when dealing 
 with this
 kind of mixed free and proprietary code within one repo are too 
 high).

Wait what? Do you know a single person who decided to not work on DMD FE because of kind of formally (but not practically) non-free backend?
Jun 18 2014
prev sibling next sibling parent "Marc =?UTF-8?B?U2Now7x0eiI=?= <schuetzm gmx.net> writes:
On Wednesday, 18 June 2014 at 15:25:11 UTC, Luís Marques wrote:
 On Wednesday, 18 June 2014 at 06:35:01 UTC, Ola Fosheim Grøstad
 wrote:
 Not really, you can prove termination for all programs with 3 
 instructions and 3 bytes of RAM. You can do it for all 
 programs with 4 instructions and 4 bytes of RAM. You can do it 
 for all programs with N instructions and N bytes of RAM. You 
 cannot do it for all TMs since they have unbounded storage.

 Thus you can prove non-trivial properties such as whether a 
 functions returns with 1, whether it accepts or rejects the 
 input '0', etc. It might take a lot of time, but you can do it 
 in a fixed amount of time. You cannot do it for TMs.

Maybe I missed something in this discussion, but unless you are not including jumps/loops in those N instructions, just because the number of instructions and memory is bounded does not mean you can prove (for arbitrary programs) that the program terminates.

Jumps and loops don't matter. The point is that such a program only has a finite amount of possible states: The 4 bytes of RAM, plus the current instruction pointer (to be 2 bits for 4 instructions). In this case, there are only 2^(4*8) * 2^2 = 2^34 = 17179869184 different states. If during execution you encounter a state that you've already seen before, you know that the program will go into an infinite loop. This is easy to implement using a bitmap. Of course, said bitmap will soon become too large to handle in practice if you want to analyze realistic "interesting" programs.
Jun 18 2014
prev sibling next sibling parent =?UTF-8?B?Ikx1w61z?= Marques" <luis luismarques.eu> writes:
On Wednesday, 18 June 2014 at 19:17:50 UTC, Marc Schütz wrote:
 Jumps and loops don't matter. The point is that such a program 
 only has a finite amount of possible states: The 4 bytes of 
 RAM, plus the current instruction pointer (to be 2 bits for 4 
 instructions). In this case, there are only

     2^(4*8) * 2^2 = 2^34 = 17179869184

 different states. If during execution you encounter a state 
 that you've already seen before, you know that the program will 
 go into an infinite loop. This is easy to implement using a 
 bitmap.

 Of course, said bitmap will soon become too large to handle in 
 practice if you want to analyze realistic "interesting" 
 programs.

Ah, right, my bad :-)
Jun 18 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Wednesday, 18 June 2014 at 15:25:11 UTC, Luís Marques wrote:
 Maybe I missed something in this discussion, but unless you are
 not including jumps/loops in those N instructions, just because
 the number of instructions and memory is bounded does not mean
 you can prove (for arbitrary programs) that the program
 terminates.

Yes, you can. You just run it until it enters a previous state. At that point you have an infinite loop and it won't terminate. Since you only have a finite number of possible states and you always move from one state to another the proof of this becomes trivial. The worst case time complexity is the same as the number of possible states.
Jun 18 2014
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 18 June 2014 at 20:58:46 UTC, Ola Fosheim Grøstad
wrote:
 On Wednesday, 18 June 2014 at 15:25:11 UTC, Luís Marques wrote:
 Maybe I missed something in this discussion, but unless you are
 not including jumps/loops in those N instructions, just because
 the number of instructions and memory is bounded does not mean
 you can prove (for arbitrary programs) that the program
 terminates.

Yes, you can. You just run it until it enters a previous state. At that point you have an infinite loop and it won't terminate. Since you only have a finite number of possible states and you always move from one state to another the proof of this becomes trivial. The worst case time complexity is the same as the number of possible states.

Granted infinite resources. Good, now that we ruled that thing out, can we talk about the subject, or do we need to continue talking about imaginary things ?
Jun 18 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Wednesday, 18 June 2014 at 21:57:40 UTC, deadalnix wrote:
 Granted infinite resources. Good, now that we ruled that thing
 out, can we talk about the subject, or do we need to continue
 talking about imaginary things ?

No, finite resources, and that is only a worst case upper bound. It does not preclude the possibility of doing better. It also does not say how much work it is in the typical case e.g. the kind of programs you wrote when you try to make code safe. What it does prove is that the problem is computable, that is a major difference. Please note that the halting problem does not address difficulty but analytical computability. In the real world you work with typical programs that run on finite resources guided by heuristics. There is no proof that you cannot have safe. So leave that line of arguing. It is fundamentally flawed.
Jun 18 2014
prev sibling next sibling parent Artur Skawina via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 06/18/14 18:42, Dicebot via Digitalmars-d wrote:
 On Wednesday, 18 June 2014 at 11:09:14 UTC, Artur Skawina via Digitalmars-d
wrote:
 The number of potential contributors is
 already low enough, and the fuzzy licensing situation drives away the
 most valuable ones (since those will often be the ones which treat the
 legal aspects seriously and the risks involved when dealing with this
 kind of mixed free and proprietary code within one repo are too high).

Wait what? Do you know a single person who decided to not work on DMD FE because of kind of formally (but not practically) non-free backend?

Well, do you think I would have said what I did if this issue didn't affect /me/? [1] The problem is a very practical one. Walter's complaint about the lack of constructive contributions is justified; most of the language related discussions happening here are nothing more than wish lists. There are many different causes, I'm pointing out one. What's unique about this one is that there's only one person in the world that can do something about it -- the same task done by someone other than Walter Bright would not be equally valuable. The shared frontend+backend setup might not be just convenience, but a deliberate attempt to discourage certain developments -- even then, I think it does much more harm than good. It practically inhibits "casual" contributions, while not really restricting usages that bring few benefits to the original project. And, yes, some people really always check licenses, even before fully determining what a software project actually is/does. Because if the license is problematic then everything else is irrelevant -- the project simply is unusable, and any time spent looking at it would be wasted. That is fortunately not a problem for dmdfe, as boost/gpl should be ok for (almost) everyone. But the cost of having to deal with another license, for a bundled part, that you're never going to use and are not even interested in, is there. The cost of scratching-an-itch also becomes higher. Depending on person/context, these costs can be prohibitive. artur [1] If I knew I would be replying like this, I would have worded that quoted statement a bit differently; I'm not quite that arrogant. :)
Jun 19 2014
prev sibling next sibling parent "Tobias Pankrath" <tobias pankrath.net> writes:
On Thursday, 19 June 2014 at 04:06:51 UTC, Ola Fosheim Grøstad 
wrote:
 On Wednesday, 18 June 2014 at 21:57:40 UTC, deadalnix wrote:
 Granted infinite resources. Good, now that we ruled that thing
 out, can we talk about the subject, or do we need to continue
 talking about imaginary things ?

No, finite resources, and that is only a worst case upper bound. It does not preclude the possibility of doing better. It also does not say how much work it is in the typical case e.g. the kind of programs you wrote when you try to make code safe. What it does prove is that the problem is computable, that is a major difference. Please note that the halting problem does not address difficulty but analytical computability. In the real world you work with typical programs that run on finite resources guided by heuristics. There is no proof that you cannot have safe. So leave that line of arguing. It is fundamentally flawed.

It's not. Since the resources to verify a property are limited, too. If I need to enumerate all possible program states, there will always exists a program that can run on my box but will not be verifiable on it (since I lack the resources). It's also not enough for safe to be verifiable, it should not slow down the compilation time too much as well.
Jun 19 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 19 June 2014 at 18:25:00 UTC, Tobias Pankrath wrote:
 It's not. Since the resources to verify a property are limited, 
 too.

 If I need to enumerate all possible program states, there will 
 always exists a program that can run on my box but will not be 
 verifiable on it (since I lack the resources).

Nobody have said you should enumerate all possible states. There's a significant difference between a proof and an algorithm.
Jun 19 2014
prev sibling next sibling parent "Joakim" <dlang joakim.airpost.net> writes:
On Thursday, 19 June 2014 at 18:10:57 UTC, Walter Bright wrote:
 On 6/19/2014 4:12 AM, Artur Skawina via Digitalmars-d wrote:
 On 06/18/14 18:42, Dicebot via Digitalmars-d wrote:
 On Wednesday, 18 June 2014 at 11:09:14 UTC, Artur Skawina via 
 Digitalmars-d wrote:
 The number of potential contributors is
 already low enough, and the fuzzy licensing situation drives 
 away the
 most valuable ones (since those will often be the ones which 
 treat the
 legal aspects seriously and the risks involved when dealing 
 with this
 kind of mixed free and proprietary code within one repo are 
 too high).

Wait what? Do you know a single person who decided to not work on DMD FE because of kind of formally (but not practically) non-free backend?

Well, do you think I would have said what I did if this issue didn't affect /me/? [1]

The front end is now Boost licensed.

Admittedly his concerns are unclear, but his problem is with the backend, not the frontend, which he already said he likes better now that it's boost-licensed. He claims that the proprietary backend scares potential contributors away and that it should be "split up" from the frontend, by which he might mean putting it in a separate git repo? I don't know enough about these copyright tainting concerns to say if it's a good idea, just pointing out that he was talking about the backend, not the frontend.
Jun 19 2014
prev sibling next sibling parent "Ola Fosheim =?UTF-8?B?R3LDuHN0YWQi?= writes:
On Thursday, 19 June 2014 at 20:26:27 UTC, Timon Gehr wrote:
 I assume you mean  safe <===> memory safety.

I mean that the best route in general is: safe => memory safe features on the local level (trivial) => "strength reduction" into memory safe use of memory unsafe features (trivial).
 No, your line of reasoning is flawed. The amount of resources 
 is not a constant. You must prove that memory safety holds for

I have not set out to prove anything, I dislike how people abuse CS in order to "win" an argument. If you guys want to leverage CS theory do it properly or leave it out. Just because you have read Garey and Johnson does not mean that you should leverage it without proper treatment.
Jun 19 2014
prev sibling next sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Friday, 20 June 2014 at 00:15:36 UTC, Walter Bright wrote:
 On 6/19/2014 12:59 PM, Joakim wrote:
 I don't know enough about these copyright tainting concerns to 
 say if it's a
 good idea, just pointing out that he was talking about the 
 backend, not the
 frontend.

Ok, but I'll also point out that LDC and GDC are fully free & open source.

Which wouldn't really help Artur (whether his concerns are justified or not), as we usually tell people to contribute their frontend patches directly to the upstream DMD repository. David
Jun 20 2014
prev sibling next sibling parent Artur Skawina via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 06/20/14 13:51, David Nadlinger via Digitalmars-d wrote:
 On Friday, 20 June 2014 at 00:15:36 UTC, Walter Bright wrote:
 On 6/19/2014 12:59 PM, Joakim wrote:
 I don't know enough about these copyright tainting concerns to say if it's a
 good idea, just pointing out that he was talking about the backend, not the
 frontend.

Ok, but I'll also point out that LDC and GDC are fully free & open source.

Which wouldn't really help Artur (whether his concerns are justified or not), as we usually tell people to contribute their frontend patches directly to the upstream DMD repository.

Yes. Also, like I've already said, working on top of a downstream tree would eventually either result in a fork, or fail, with the latter being the much more likely result. I'll just add that I now think I've overstated the gains that splitting out the free frontend would bring. That's because I've since realized how hard it would still be to deal with development on top of git head, when one can not immediately test the result /on real code/ (ie using a non-dmd backend). Without a truly shared frontend, there's no good solution, I guess. :( artur
Jun 20 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Thursday, 19 June 2014 at 11:12:46 UTC, Artur Skawina via 
Digitalmars-d wrote:
 Wait what? Do you know a single person who decided to not work 
 on DMD FE because of kind of formally (but not practically) 
 non-free backend?

Well, do you think I would have said what I did if this issue didn't affect /me/? [1] ... And, yes, some people really always check licenses, even before fully determining what a software project actually is/does. Because if the license is problematic then everything else is irrelevant -- the project simply is unusable, and any time spent looking at it would be wasted. That is fortunately not a problem for dmdfe, as boost/gpl should be ok for (almost) everyone. But the cost of having to deal with another license, for a bundled part, that you're never going to use and are not even interested in, is there. The cost of scratching-an-itch also becomes higher. Depending on person/context, these costs can be prohibitive. artur

I still don't understand. What impact backend license has on you? In other words, what is potential danger you need to be concerned about that makes potential contributions too risky? One problem I am aware of is redistribution issue which is common blocker with getting into linux distributions. But personal contributions? Can you explain it in a bit more details?
Jun 20 2014
prev sibling next sibling parent Artur Skawina via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 06/20/14 23:28, Dicebot via Digitalmars-d wrote:
 On Thursday, 19 June 2014 at 11:12:46 UTC, Artur Skawina via Digitalmars-d
wrote:
 That is fortunately not a problem for dmdfe, as boost/gpl should be
 ok for (almost) everyone. But the cost of having to deal with another
 license, for a bundled part, that you're never going to use and are not
 even interested in, is there. The cost of scratching-an-itch also
 becomes higher. Depending on person/context, these costs can be
 prohibitive.


 I still don't understand. What impact backend license has on you? 

Let's not make this about me - while this issue has been (one of) the reason(s) why I have never even looked at DMD in all the years, I have never mentioned it. At least not until somebody suggested that the problem is not a real one. :) It's just /one/ of the issues leading to the very low amount of contributions; eliminating this one wouldn't drastically change the situation, it would be just a small step in the right direction.
 In other words, what is potential danger you need to be concerned about that
makes potential contributions too risky? One problem I am aware of is
redistribution issue which is common blocker with getting into linux
distributions. But personal contributions? Can you explain it in a bit more
details?

It's not about being able to contribute to DMD, it is about being able to work on /other/ projects. If contributing to DMD carries the risk of affecting the latter then it's simply best to avoid it; it's not a risk worth taking, just for a few small improvements. Significant work often starts with simple and trivial fixes; if scratching-an-itch is too costly then major contributions suffer too. Note that whether the risk is significant, or even real, doesn't really matter much -- it's the cost of making the decision that matters. Just-submit-a-small-patch-to-a-boost- -licensed-project turns into investigate-the-licensing-and-evaluate-all- -the-potential-legal-implications. It's enough to discourage submissions *even in the cases where there is no problem*. artur
Jun 21 2014
prev sibling next sibling parent "SomeDude" <lolilol mytrashmail.com> writes:
On Saturday, 21 June 2014 at 10:49:57 UTC, Artur Skawina via
Digitalmars-d wrote:
 It's not about being able to contribute to DMD, it is about 
 being able
 to work on /other/ projects. If contributing to DMD carries the 
 risk of
 affecting the latter then it's simply best to avoid it; it's 
 not a risk
 worth taking, just for a few small improvements. Significant 
 work often
 starts with simple and trivial fixes; if scratching-an-itch is 
 too costly
 then major contributions suffer too. Note that whether the risk 
 is
 significant, or even real, doesn't really matter much -- it's 
 the cost of
 making the decision that matters. 
 Just-submit-a-small-patch-to-a-boost-
 -licensed-project turns into 
 investigate-the-licensing-and-evaluate-all-
 -the-potential-legal-implications. It's enough to discourage 
 submissions
 *even in the cases where there is no problem*.

 artur

I really don't see what the issue is. If the projects are unrelated, there is no reason there could be a "contamination". And even with that, nothing prevents you from working on the front end with LDC or GDC.
Jun 22 2014
prev sibling next sibling parent Iain Buclaw via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 22 June 2014 08:21, SomeDude via Digitalmars-d
<digitalmars-d puremagic.com> wrote:
 On Saturday, 21 June 2014 at 10:49:57 UTC, Artur Skawina via

 Digitalmars-d wrote:
 It's not about being able to contribute to DMD, it is about being able
 to work on /other/ projects. If contributing to DMD carries the risk of
 affecting the latter then it's simply best to avoid it; it's not a risk
 worth taking, just for a few small improvements. Significant work often
 starts with simple and trivial fixes; if scratching-an-itch is too costly
 then major contributions suffer too. Note that whether the risk is
 significant, or even real, doesn't really matter much -- it's the cost of
 making the decision that matters. Just-submit-a-small-patch-to-a-boost-
 -licensed-project turns into investigate-the-licensing-and-evaluate-all-
 -the-potential-legal-implications. It's enough to discourage submissions
 *even in the cases where there is no problem*.

 artur

I really don't see what the issue is. If the projects are unrelated, there is no reason there could be a "contamination". And even with that, nothing prevents you from working on the front end with LDC or GDC.

And if your reasoning for not working on the front-end for GDC or LDC is that they are behind current development. Then why don't you make your first contribution as updating the chosen compiler to be aligned with DMD development! This is on the projects page for GDC at least, and soemwhere hidden on my long TODO list, but there are too many other important things to look after for the time being. :) Regards Iain.
Jun 22 2014
prev sibling next sibling parent Artur Skawina via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 06/22/14 09:48, Iain Buclaw via Digitalmars-d wrote:
 And if your reasoning for not working on the front-end for GDC or LDC
 is that they are behind current development.  Then why don't you make
 your first contribution as updating the chosen compiler to be aligned
 with DMD development!

1) It's not really about me. As long as the issue affected me, I have never said anything - I'm only mentioning it now, when it's no longer relevant (to me), because it's a problem for D. It was implicitly suggested that the issue isn't real or relevant; it was better to explain it using my example than to argue hypothetically. 2) I'm not even sure what you're suggesting. Dealing with DMD code is exactly the problem here. I guess I won't be able to explain the issue any better...
 This is on the projects page for GDC at least, and soemwhere hidden on
 my long TODO list, but there are too many other important things to
 look after for the time being.  :)

Any chance of some kind of progress on the inlining front in the near future? artur
Jun 22 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Saturday, 21 June 2014 at 10:49:57 UTC, Artur Skawina via 
Digitalmars-d wrote:
 In other words, what is potential danger you need to be 
 concerned about that makes potential contributions too risky? 
 One problem I am aware of is redistribution issue which is 
 common blocker with getting into linux distributions. But 
 personal contributions? Can you explain it in a bit more 
 details?

It's not about being able to contribute to DMD, it is about being able to work on /other/ projects.

This is exactly what I don't understand. How does DMD license affect work with any other projects? You are passing copyright to DigitalMars for all contributions anyway, Boost or proprietary. What is potential danger to be aware of? I think I am simply not that educated in legal matters on this topic.
Jun 22 2014
prev sibling next sibling parent Artur Skawina via Digitalmars-d <digitalmars-d puremagic.com> writes:
On 06/22/14 15:13, Dicebot via Digitalmars-d wrote:
 You are passing copyright to DigitalMars for all contributions anyway, Boost
or proprietary. 

Not true in general. Or are you saying that Walter requires copyright assignment before merging code? If that would be the case then this would be an even bigger problem. (Hmm, the GCC integration makes the situation much more complicated)
 What is potential danger to be aware of? I think I am simply not that educated
in legal matters on this topic.

Let's just say I'm paranoid. I don't want to discourage anybody from contributing, so I don't want to list "potential dangers", That would only lead to discussions about how real or serious they are; it's a very subjective matter. artur
Jun 22 2014
prev sibling next sibling parent "Dicebot" <public dicebot.lv> writes:
On Sunday, 22 June 2014 at 15:01:57 UTC, Artur Skawina via 
Digitalmars-d wrote:
 On 06/22/14 15:13, Dicebot via Digitalmars-d wrote:
 You are passing copyright to DigitalMars for all contributions 
 anyway, Boost or proprietary.

Not true in general. Or are you saying that Walter requires copyright assignment before merging code? If that would be the case then this would be an even bigger problem. (Hmm, the GCC integration makes the situation much more complicated)

All DMD source files mention "Copyright (c) *** by Digital Mars". As far as I understand that implies that any pull request that does not explicitly mentions copyright does assignment automatically. Which totally makes sense because source base with distributed copyright ownership is extremely inflexible to maintain, to the point where it is better to simply reject such pull request than to deal with it.
Jun 22 2014
prev sibling next sibling parent "Joakim" <dlang joakim.airpost.net> writes:
On Sunday, 22 June 2014 at 16:08:58 UTC, Dicebot wrote:
 All DMD source files mention "Copyright (c) *** by Digital 
 Mars". As far as I understand that implies that any pull 
 request that does not explicitly mentions copyright does 
 assignment automatically. Which totally makes sense because 
 source base with distributed copyright ownership is extremely 
 inflexible to maintain, to the point where it is better to 
 simply reject such pull request than to deal with it.

Simply sticking that notice at the top of a source file does not imply copyright assignment. At best, submitting a pull request might imply that you acquiesce to the open-source license of the project, but some would say even that isn't certain. This is why companies like google explicitly make you fill out a Contributor License Agreement before they will accept patches from you, though theirs doesn't require copyright assignment: https://developers.google.com/open-source/cla/individual Since Artur is being so evasive, I believe he's talking about the same reasons why Walter purposely won't even look at llvm code, which is basically BSD-licensed. Any time you can say you haven't even looked at any outside code, let alone contributed to it, you save yourself legal hassles. I think he's saying that many potential contributors will see the legal uncertainty from dmd's licenses and just pass on contributing to dmd. I don't know how real a concern this is, as I've thankfully never had to deal with these copyright-tainting issues.
Jun 22 2014
prev sibling parent "David Nadlinger" <code klickverbot.at> writes:
On Sunday, 22 June 2014 at 18:47:06 UTC, Walter Bright wrote:
 Copyright assignment is required in other larger projects, like 
 Linux and gcc.

Not Linux, no. David
Jun 22 2014