digitalmars.D - all OS functions should be "nothrow @trusted @nogc"

Shachar Shemesh (7/7) Jul 25 2017 The title really does says it all. I keep copying OS function

ag0aep6g (6/10) Jul 25 2017 Not all OS functions can be `@trusted`.

Shachar Shemesh (8/19) Jul 25 2017 And, indeed, the code calling "read" shouldn't be able to do that as

ag0aep6g (8/11) Jul 25 2017 That's not how `@trusted` works. The point of `@trusted` is to allow

Andrei Alexandrescu (21/34) Jul 25 2017 About http://man7.org/linux/man-pages/man2/read.2.html, there's just a

Steven Schveighoffer (12/25) Jul 25 2017 Great idea! Should it be a package on its own, or should we put the

Andrei Alexandrescu (2/33) Jul 25 2017 Same here. I'd preserve the function name though. -- Andrei

Walter Bright (15/16) Jul 25 2017 The idea of fixing the operating system interface(s) has come up now and...

Kagamin (7/12) Jul 26 2017 Given that C and OS api have no notion of memory safety, they

Walter Bright (2/3) Jul 26 2017 Marking ones that are safe with @safe is fine. OS APIs pretty much never...

Kagamin (5/9) Jul 28 2017 New technologies and new features get introduced over time: 64

Grander (3/13) Jul 28 2017 most of them don't lead to "real" API changes, they often only

Vladimir Panteleev (8/12) Jul 31 2017 Sometimes operating systems add new flags to their API which

Shachar Shemesh (7/21) Jul 31 2017 One of the things that really bother me with the D community is the

Timon Gehr (4/11) Jul 31 2017 Personally, I'm more bothered by this kind of lazy argument that sounds

Shachar Shemesh (14/28) Jul 31 2017 That's fine, but since, according to the logic presented here, no OS

Timon Gehr (6/15) Jul 31 2017 This is actually not true. Vladimir was just pointing out a complication...

Vladimir Panteleev (4/10) Jul 31 2017 Indeed. @safe is not a sandbox, there is no need to actually go

Kagamin (4/6) Aug 01 2017 In the worst case when a function becomes unsafe, only @safe

w0rp (5/5) Aug 01 2017 Direct OS function calls should probably all be treated as

H. S. Teoh via Digitalmars-d (5/10) Aug 01 2017 +1.

Marco Leise (14/22) Aug 01 2017 Am Tue, 1 Aug 2017 10:50:59 -0700

H. S. Teoh via Digitalmars-d (10/34) Aug 01 2017 [...]
Moritz Maxeiner (9/31) Aug 01 2017 I know this is in jest, but since `strlen`'s interface is

Steven Schveighoffer (4/38) Aug 01 2017 I think it goes without saying that some functions just shouldn't be

Moritz Maxeiner (23/63) Aug 01 2017 Of course, though I think this (sub) context was more about

Steven Schveighoffer (6/72) Aug 01 2017 Most definitely. It would be nice to have a fully @safe interface that

H. S. Teoh via Digitalmars-d (9/18) Aug 01 2017 [...]

Moritz Maxeiner (4/21) Aug 01 2017 I was lazy, okay (I nearly forgot putting the auto decoding

Andrei Alexandrescu (2/12) Aug 02 2017 return str.representation.countUntil('\0');

Moritz Maxeiner (4/15) Aug 02 2017 Thanks, wasn't aware of this; it makes auto decoding slightly

Andrei Alexandrescu (4/28) Jul 27 2017 The standard library would not be in the position to provide such, but

Shachar Shemesh (3/4) Jul 26 2017 Can you expand on this point?

Steven Schveighoffer (9/15) Jul 26 2017 Because anything casts to void[] implicitly.

Moritz Maxeiner (19/40) Jul 25 2017 No, it is not, because it does not fulfill the definition of

Stefan Koch (3/11) Jul 25 2017 these functions are supposed to have trused wrappers if used in

Shachar Shemesh (11/23) Jul 25 2017 I'd love to hear the difference between:

Steven Schveighoffer (6/30) Jul 25 2017 I think signalfd can be marked @trusted, as @safe code supports pointing...

Kagamin (2/2) Jul 25 2017 While we're at it, check this:

Steven Schveighoffer (3/5) Jul 25 2017 Looks fine to me. That's not an array of FILE, it's a single pointer.

Moritz Maxeiner (8/13) Jul 25 2017 fgetc cannot be @trusted the same way fclose cannot be @trusted.

Steven Schveighoffer (14/28) Jul 25 2017 The behavior is defined. It will crash with a segfault. This is par for

Moritz Maxeiner (9/30) Jul 25 2017 In C land that behaviour is a platform (hardware/OS/libc)

Steven Schveighoffer (12/18) Jul 25 2017 In cases where C does not crash when dereferencing null, then D would

Timon Gehr (10/23) Jul 25 2017 What Moritz is saying is that the following implementation of fclose is

Timon Gehr (2/13) Jul 25 2017 (Forgot the returns.)
Andrei Alexandrescu (4/29) Jul 25 2017 I'd think that would be the case, but failed to find a fgetc

Walter Bright (7/9) Jul 25 2017 The documentation for DMC++ fgetc() is:

Timon Gehr (4/19) Jul 26 2017 The C mindset is that this check is a waste of precious processing

Walter Bright (3/8) Jul 26 2017 I wrote that code 30+ years ago, and no longer remember why I put the nu...

Timon Gehr (4/19) Jul 26 2017 It's implicit. In C, whenever you pass something that is outside the

Steven Schveighoffer (14/39) Jul 25 2017 I think we can correctly assume no fclose implementations exist that do

Walter Bright (24/27) Jul 25 2017 I spent 10 years programming on DOS with zero memory protection, and peo...

Patrick Schluter (7/40) Jul 26 2017 And alone for that list of decision do I love you. I can not hear

Timon Gehr (2/24) Jul 26 2017 I'm not going to assume that.

Steven Schveighoffer (4/29) Jul 26 2017 Tell you what, when you find a D platform that this doesn't happen, we

Timon Gehr (3/15) Jul 26 2017 The burden of proof is on you, not me. You are advocating the C approach...

Steven Schveighoffer (14/30) Jul 26 2017 They leave NULL dereferencing undefined because in some quirky old

Patrick Schluter (14/57) Jul 26 2017 What a luck that Solaris/SPARC is not supported as on that

Steven Schveighoffer (12/50) Jul 26 2017 I'm guessing though that it's an implementation detail (like Walter's

Andrei Alexandrescu (5/9) Jul 26 2017 No need to worry about that at all. If worse comes to worst - i.e. we do...

Timon Gehr (12/24) Jul 26 2017 My argument was not that we need to fear implementations that take

Steven Schveighoffer (5/18) Jul 26 2017 I can't see how compilers can take advantage of this one. However, we

Jacob Carlborg (40/43) Jul 26 2017 Unfortunately it's not that easy with optimizing compilers for C and C++...

Steven Schveighoffer (4/11) Jul 27 2017 So the result is that it will segfault. I don't see a problem with this....

Patrick Schluter (14/24) Jul 27 2017 Except that that code was used in the Linux kernel where page 0
Jacob Carlborg (5/7) Jul 27 2017 The problem is that behavior might change depending on which compiler is...

Steven Schveighoffer (11/21) Jul 26 2017 Hm.. so you mean:

Timon Gehr (2/28) Jul 27 2017 That works but it changes the signature. (extern(D) vs. extern(C)).

Steven Schveighoffer (6/34) Jul 27 2017 Hm... you could use pragma(mangle) to get the signature the same. I was
Andrei Alexandrescu (6/35) Jul 27 2017 There are a number of techniques allowing you to daisy chain C functions...

Moritz Maxeiner (30/73) Jul 27 2017 --- null.d ---

ag0aep6g (5/37) Jul 27 2017 The gist of this is that Linux can be configured so that null can be a

Moritz Maxeiner (21/58) Jul 27 2017 In summation, yes. To be technical about it:

Steven Schveighoffer (7/60) Jul 27 2017 Again, all these hacks are just messing with the assumptions D is

Moritz Maxeiner (11/76) Jul 27 2017 Which aren't in the official D spec (or at the very least I can't

Steven Schveighoffer (14/90) Jul 27 2017 You are right. I have asked Walter to add such an update. I should pull

Moritz Maxeiner (7/25) Jul 27 2017 Which essentially means that any library written in @safe D

Moritz Maxeiner (8/18) Jul 26 2017 OK, my (wrong) assumption was that a D compiler would on those

Kagamin (2/8) Jul 26 2017 There's a less questionable problem with it.

Kagamin (2/3) Jul 29 2017 Hint: FILE struct is transparent, look inside it, lots of

Andrei Alexandrescu (2/4) Jul 25 2017 That might be a mistake. Is fclose(f); getc(f); defined? -- Andrei

Steven Schveighoffer (5/10) Jul 25 2017 fclose is not @safe.

Andrei Alexandrescu (2/11) Jul 25 2017 Ah, nice. Thanks! -- Andrei

Kagamin (4/6) Jul 25 2017 What about functions that take zero terminated strings? Are they

Steven Schveighoffer (6/12) Jul 25 2017 No, a null terminated string is as arbitrary as passing in a length.

Moritz Maxeiner (16/21) Jul 25 2017 Since you explicitly state *all* OS functions:

Shachar Shemesh (14/32) Jul 25 2017 Technically, any system call that is a pthreads cancellation point may

Moritz Maxeiner (26/63) Jul 25 2017 Good to know, then since D is supposed to be able to catch C++

Shachar Shemesh (23/72) Jul 25 2017 And right there and then you've introduced a serious problem. The

Moritz Maxeiner (22/95) Jul 26 2017 The issue lies with the definition of `nothrow` considering only

Shachar Shemesh <shachar weka.io> writes:

The title really does say it all. I keep copying OS function 
declarations into my code, just so I can add those attributes to them. 
Otherwise I simply cannot call "signalfd" and "sigemptyset" (to name a 
couple from my most recent history) from @safe code.

I can try and set up a PR when I have the time. If anyone else wants to 
take an easy one before then, you're welcome to :-)

Shachar

Jul 25 2017

ag0aep6g <anonymous example.com> writes:

On 07/25/2017 03:50 PM, Shachar Shemesh wrote:
 The title really does say it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes to them. 
 Otherwise I simply cannot call "signalfd" and "sigemptyset" (to name a 
 couple from my most recent history) from @safe code.

Not all OS functions can be `@trusted`.

I don't know about `signalfd` and `sigemptyset`, but `read` [1] can't be 
`@trusted`, for example. It takes pointer and length separately, and the 
pointer is a `void*`. That's not safe at all.


[1] http://man7.org/linux/man-pages/man2/read.2.html

Jul 25 2017

Shachar Shemesh <shachar weka.io> writes:

On 25/07/17 17:11, ag0aep6g wrote:
 On 07/25/2017 03:50 PM, Shachar Shemesh wrote:
 The title really does say it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes to them. 
 Otherwise I simply cannot call "signalfd" and "sigemptyset" (to name a 
 couple from my most recent history) from @safe code.

 
 Not all OS functions can be `@trusted`.
 
 I don't know about `signalfd` and `sigemptyset`, but `read` [1] can't be 
 `@trusted`, for example. It takes pointer and length separately, and the 
 pointer is a `void*`. That's not safe at all.

And, indeed, the code calling "read" shouldn't be able to do that as 
@safe. Read itself, however, is trusted (because, let's face it, if you 
cannot trust the kernel, you're screwed anyways).

Having said that, I have no objection to excluding the "pointer+length" 
system calls from the above rule. They are, by far, the minority of 
system calls.

Shachar

Jul 25 2017

ag0aep6g <anonymous example.com> writes:

On 07/25/2017 04:32 PM, Shachar Shemesh wrote:
 And, indeed, the code calling "read" shouldn't be able to do that as 
 @safe. Read itself, however, is trusted (because, let's face it, if you 
 cannot trust the kernel, you're screwed anyways).

That's not how `@trusted` works. The point of `@trusted` is to allow 
unsafe features in the implementation. The interface must be just as 
safe as with `@safe`.

`read` doesn't have a safe interface. `read` is safe as long as 
you pass good arguments. When you pass bad arguments, `read` will break 
your stuff. A `@trusted` function must always be safe, no matter the 
arguments.

Jul 25 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/25/2017 10:43 AM, ag0aep6g wrote:
 On 07/25/2017 04:32 PM, Shachar Shemesh wrote:
 And, indeed, the code calling "read" shouldn't be able to do that as 
 @safe. Read itself, however, is trusted (because, let's face it, if 
 you cannot trust the kernel, you're screwed anyways).

 
 That's not how `@trusted` works. The point of `@trusted` is to allow 
 unsafe features in the implementation. The interface must be just as 
 safe as with `@safe`.
 
 `read` doesn't have a safe interface. `read` is safe as long as 
 you pass good arguments. When you pass bad arguments, `read` will break 
 your stuff. A `@trusted` function must always be safe, no matter the 
 arguments.

About http://man7.org/linux/man-pages/man2/read.2.html, there's just a 
bit of wrapping necessary:

nothrow @trusted @nogc
ssize_t read(int fd, ubyte[] buf)
{
     return read(fd, buf.ptr, buf.length);
}

(btw void[] doesn't work)

The point being that a safe D program needs to guarantee memory will not 
be corrupted, and there's no way for the type system to ensure that in 
the Posix read() call the buffer and the length are coordinated.

Certain systems (such as the static checker used at Microsoft - it's 
fairly well known, there are a couple of papers on it, forgot the name) 
require annotations in the function signature to indicate the 
coordination, e.g.:

ssize_t read(int fd, void *buf, @islengthof(buf) size_t count);

Then the type checker can verify upon each call that indeed count is the 
right size of buf.

A suite of safe wrappers on OS primitives might be useful.
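
A minimal, self-contained sketch of such a wrapper, for illustration only. The names `safeRead` and `demo` are hypothetical (the snippet above keeps the name `read`); the C function is imported under the alias `posixRead` so the wrapper does not call itself. It assumes a Posix target with druntime's `core.sys.posix.unistd` bindings:

```d
import core.sys.posix.unistd : close, pipe, write, posixRead = read;
import core.sys.posix.sys.types : ssize_t;

/// Hypothetical slice-based wrapper: the length always comes from the
/// slice itself, so callers cannot pass a mismatched pointer/length pair.
ssize_t safeRead(int fd, ubyte[] buf) nothrow @trusted @nogc
{
    return posixRead(fd, buf.ptr, buf.length);
}

/// Round-trips three bytes through a pipe. The raw pipe/write/close
/// calls are @system, hence the @trusted marker on this helper.
ssize_t demo(ref ubyte[8] buf) nothrow @trusted @nogc
{
    int[2] fds;
    if (pipe(fds) != 0)
        return -1;
    scope (exit) { close(fds[0]); close(fds[1]); }

    immutable(ubyte)[3] msg = [1, 2, 3];
    if (write(fds[1], msg.ptr, msg.length) != 3)
        return -1;
    return safeRead(fds[0], buf[]);
}

void main() @safe
{
    ubyte[8] buf;
    immutable n = demo(buf);
    assert(n == 3);
    assert(buf[0] == 1 && buf[1] == 2 && buf[2] == 3);
}
```

Whatever slice @safe code passes in, the kernel is told to write at most `buf.length` bytes, which is the coordination the raw pointer-plus-count signature cannot express.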


Andrei

Jul 25 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/25/17 11:26 AM, Andrei Alexandrescu wrote:
 
 About http://man7.org/linux/man-pages/man2/read.2.html, there's just a 
 bit of wrapping necessary:
 
 nothrow @trusted @nogc
 ssize_t read(int fd, ubyte[] buf)
 {
      return read(fd, buf.ptr, buf.length);
 }
 
 (btw void[] doesn't work)
 

[snip]
 A suite of safe wrappers on OS primitives might be useful.

Great idea! Should it be a package on its own, or should we put the 
wrappers inside the original files?

That is, do we make

core.sys.safe.posix.unistd: read

or do we make

core.sys.posix.unistd: safe_read

?

My preference is for the former, since it's very nice to have a pristine 
copy of the header file.

-Steve

Jul 25 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/25/2017 11:50 AM, Steven Schveighoffer wrote:
 On 7/25/17 11:26 AM, Andrei Alexandrescu wrote:
 About http://man7.org/linux/man-pages/man2/read.2.html, there's just a 
 bit of wrapping necessary:

 nothrow @trusted @nogc
 ssize_t read(int fd, ubyte[] buf)
 {
      return read(fd, buf.ptr, buf.length);
 }

 (btw void[] doesn't work)

 [snip]
 A suite of safe wrappers on OS primitives might be useful.

 
 Great idea! Should it be a package on its own, or should we put the 
 wrappers inside the original files?
 
 That is, do we make
 
 core.sys.safe.posix.unistd: read
 
 or do we make
 
 core.sys.posix.unistd: safe_read
 
 ?
 
 My preference is for the former, since it's very nice to have a pristine 
 copy of the header file.

Same here. I'd preserve the function name though. -- Andrei

Jul 25 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 7/25/2017 8:26 AM, Andrei Alexandrescu wrote:
 A suite of safe wrappers on OS primitives might be useful.

The idea of fixing the operating system interface(s) has come up now and then. 
I've tried to discourage that on the following grounds:


* We are not in the operating system business.

* Operating system APIs grow like weeds. We'd set ourselves an impossible task.

* It's a huge job simply to provide accurate declarations for the APIs.

* We'd have to write our own documentation for the operating system APIs. It's 
hard enough writing such for Phobos.

* A lot are fundamentally unfixable, like free() and strlen().

* The API import files should be focused solely on direct access to the APIs, 
not adding a translation layer. The user of them will expect this.

* We already have safe wrappers for the commonly used APIs. For read(), there 
is std.stdio.


It is worthwhile, however, to augment the APIs with the appropriate attributes 
like @nogc, scope, nothrow, @safe (for the ones that are), etc.

Jul 25 2017

Kagamin <spam here.lot> writes:

On Wednesday, 26 July 2017 at 02:54:34 UTC, Walter Bright wrote:
 * Operating system APIs grow like weeds. We'd set ourselves an 
 impossible task.

 It is worthwhile, however, to augment the APIs with the 
 appropriate attributes like @nogc, scope, nothrow, @safe (for 
 the ones that are), etc.

Given that C and OS api have no notion of memory safety, they 
don't support it and don't maintain it, so if it once was safe, 
it can be refactored later and become unsafe relying on proper 
usage of the api. Then if it was marked safe, the qualifier must 
be removed, which will be a breaking change for D code, but not 
for C code. Should we still try to mark them safe at all?

Jul 26 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them safe at all?

Marking ones that are safe with @safe is fine. OS APIs pretty much never change.

Jul 26 2017

Kagamin <spam here.lot> writes:

On Wednesday, 26 July 2017 at 17:48:21 UTC, Walter Bright wrote:
 On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them safe at all?

 Marking ones that are safe with @safe is fine. OS APIs pretty 
 much never change.

New technologies and new features get introduced over time: 64 
bit, ipv6, bitmap_v5, generally bigger data everywhere, and api 
changes accordingly and incorporates new features, and takes 
increasingly bigger arguments over time.

Jul 28 2017

Grander <grander grander.grander> writes:

On Friday, 28 July 2017 at 12:40:06 UTC, Kagamin wrote:
 On Wednesday, 26 July 2017 at 17:48:21 UTC, Walter Bright wrote:
 On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them safe at all?

 Marking ones that are safe with @safe is fine. OS APIs pretty 
 much never change.

 New technologies and new features get introduced over time: 64 
 bit, ipv6, bitmap_v5, generally bigger data everywhere, and api 
 changes accordingly and incorporates new features, and takes 
 increasingly bigger arguments over time.

most of them don't lead to "real" API changes, they often only 
add new functions/overloads/whatever

Jul 28 2017

Vladimir Panteleev <thecybershadow.lists gmail.com> writes:

On Wednesday, 26 July 2017 at 17:48:21 UTC, Walter Bright wrote:
 On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them safe at all?

 Marking ones that are safe with @safe is fine. OS APIs pretty 
 much never change.

Sometimes operating systems add new flags to their API which 
change how some values are interpreted. Some API functions may, 
for example, normally take a pointer to a such-and-such struct, 
but if a certain flag is specified, the parameter is instead 
interpreted as a pointer to a different data type. That would be 
one case where an API call becomes un-@safe due to the addition 
of a flag.
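
For illustration, POSIX `fcntl` is an existing call of this kind: the command argument decides how the variadic argument is interpreted (an int for `F_SETFD`, a pointer to a `struct flock` for `F_SETLK`). A rough sketch of a wrapper that pins the command and so exposes only one interpretation; the name `getFdFlags` is hypothetical, not an existing druntime function:

```d
import core.sys.posix.fcntl : fcntl, F_GETFD;

/// Hypothetical wrapper: fixes the command to F_GETFD, which takes no
/// third argument, so no pointer is ever handed to the kernel.
/// Returns the descriptor flags, or -1 on error (e.g. a bad descriptor).
int getFdFlags(int fd) nothrow @trusted @nogc
{
    return fcntl(fd, F_GETFD);
}

void main() @safe
{
    // An invalid descriptor yields an error return, not memory corruption.
    assert(getFdFlags(-1) == -1);
}
```

Because the wrapper never forwards a caller-chosen command, a command added to the OS later cannot change what this particular function does with its arguments.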

Jul 31 2017

Shachar Shemesh <shachar weka.io> writes:

On 31/07/17 16:33, Vladimir Panteleev wrote:
 On Wednesday, 26 July 2017 at 17:48:21 UTC, Walter Bright wrote:
 On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them safe at all?

 Marking ones that are safe with @safe is fine. OS APIs pretty much 
 never change.

 
 Sometimes operating systems add new flags to their API which change how 
 some values are interpreted. Some API functions may, for example, 
 normally take a pointer to a such-and-such struct, but if a certain flag 
 is specified, the parameter is instead interpreted as a pointer to a 
 different data type. That would be one case where an API call becomes 
 un-@safe due to the addition of a flag.
 

One of the things that really bother me with the D community is the 
"100% or nothing" approach.

System programming is, by definition, an exercise in juggling 
conflicting aims. The more absolute the language, the less useful it is 
for performing real life tasks.

Shachar

Jul 31 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 31.07.2017 15:56, Shachar Shemesh wrote:
 
 One of the things that really bother me with the D community is the 
 "100% or nothing" approach.
 ...

Personally, I'm more bothered by this kind of lazy argument that sounds 
good but has no substance.

 System programming is, by definition, an exercise in juggling 
 conflicting aims. The more absolute the language, the less useful it is 
 for performing real life tasks.

Why do you think @trusted exists?

Jul 31 2017

Shachar Shemesh <shachar weka.io> writes:

On 31/07/17 17:08, Timon Gehr wrote:
 On 31.07.2017 15:56, Shachar Shemesh wrote:
 One of the things that really bother me with the D community is the 
 "100% or nothing" approach.
 ...

 
 Personally, I'm more bothered by this kind of lazy argument that sounds 
 good but has no substance.
 
 System programming is, by definition, an exercise in juggling 
 conflicting aims. The more absolute the language, the less useful it 
 is for performing real life tasks.

 
 Why do you think @trusted exists?

That's fine, but since, according to the logic presented here, no OS 
function can ever be @safe, then all code calling such a function can't 
be @safe either. At this point, half your code, give or take, is 
@trusted. That's the point you give up, and just write everything as 
@system.

And what we have here is that you started out trying to be 100% pure 
(and, in this case, there is no problem with current code, only 
*hypothetical* future changes), and end up not getting any protection 
from @safe at all.

There is a proverb in Hebrew that says:
תפסת מרובה, לא תפסת.
Try to grab too much, and you end up holding nothing.

Shachar

Jul 31 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 31.07.2017 16:15, Shachar Shemesh wrote:
 Why do you think @trusted exists?

 
 That's fine, but since, according to the logic presented here, no OS 
 function can ever be @safe,

This is actually not true. Vladimir was just pointing out a complication 
of which to be aware. Are you arguing against applying due diligence 
when specifying library interfaces?

 
 There is a proverb in Hebrew that says:
 תפסת מרובה, לא תפסת.
 Try to grab too much, and you end up holding nothing. 

I.e. if you mark too many functions as @trusted, you will have no memory 
safety.

Jul 31 2017

Vladimir Panteleev <thecybershadow.lists gmail.com> writes:

On Monday, 31 July 2017 at 14:51:22 UTC, Timon Gehr wrote:
 On 31.07.2017 16:15, Shachar Shemesh wrote:
 That's fine, but since, according to the logic presented here, 
 no OS function can ever be @safe,

 This is actually not true. Vladimir was just pointing out a 
 complication of which to be aware. Are you arguing against 
 applying due diligence when specifying library interfaces?

Indeed. @safe is not a sandbox, there is no need to actually go 
to extreme measures to safeguard against potential changes beyond 
our control; just something to keep in mind.

Jul 31 2017

Kagamin <spam here.lot> writes:

On Monday, 31 July 2017 at 13:56:48 UTC, Shachar Shemesh wrote:
 One of the things that really bother me with the D community is 
 the "100% or nothing" approach.

In the worst case when a function becomes unsafe, only the @safe 
attribute will be removed from it, which will be a breaking 
change, but hopefully it will happen rarely enough.

Aug 01 2017

w0rp <devw0rp gmail.com> writes:

Direct OS function calls should probably all be treated as 
unsafe, except for rare cases where the behaviour is very well 
defined in standards and in actual implementations to be safe. 
The way to get safe functions for OS functionality is to write 
wrapper functions in D which prohibit unsafe calls.

Aug 01 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions in D
 which prohibit unsafe calls.

+1.


T

-- 
People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANG

Aug 01 2017

Marco Leise <Marco.Leise gmx.de> writes:

Am Tue, 1 Aug 2017 10:50:59 -0700
schrieb "H. S. Teoh via Digitalmars-d"
<digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions in D
 which prohibit unsafe calls.  

 
 +1.

I think I got it now!

	size_t strlen_safe(in char[] str) @trusted
	{
		foreach (c; str)
			if (!c)
				return strlen(str.ptr);
		return str.length;
	}

  :o)

-- 
Marco

Aug 01 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Tue, Aug 01, 2017 at 10:39:35PM +0200, Marco Leise via Digitalmars-d wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:
 
 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions
 in D which prohibit unsafe calls.  

 
 +1.

 
 I think I got it now!
 
 	size_t strlen_safe(in char[] str) @trusted
 	{
 		foreach (c; str)
 			if (!c)
 				return strlen(str.ptr);
 		return str.length;
 	}
 
   :o)

[...]

LOL, that's laughably inefficient.  Instead of calling strlen, you might
as well have just looped with an index and returned the index. :-P

	foreach (i, c; str)
		if (!c) return i;

Oh wait, so we didn't need the wrapper after all. :-P
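
Spelled out as a complete function, the index loop might look like the sketch below. The name `nulLength` is hypothetical. Since no C call is involved, the compiler can check the whole thing as @safe:

```d
/// Hypothetical helper: index of the first NUL code unit,
/// or str.length if the slice contains none.
size_t nulLength(scope const(char)[] str) pure nothrow @safe @nogc
{
    // foreach over a char slice visits code units; no decoding happens
    // unless the loop variable is declared as dchar.
    foreach (i, c; str)
        if (c == '\0')
            return i;
    return str.length;
}

void main() @safe
{
    assert(nulLength("abc\0def") == 3);
    assert(nulLength("abc") == 3);
    assert(nulLength("") == 0);
}
```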


T

-- 
It's amazing how careful choice of punctuation can leave you hanging:

Aug 01 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 1 August 2017 at 20:39:35 UTC, Marco Leise wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via 
 Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as 
 unsafe, except for rare cases where the behaviour is very 
 well defined in standards and in actual implementations to 
 be safe. The way to get safe functions for OS functionality 
 is to write wrapper functions in D which prohibit unsafe 
 calls.

 
 +1.

 I think I got it now!

 	size_t strlen_safe(in char[] str) @trusted
 	{
 		foreach (c; str)
 			if (!c)
 				return strlen(str.ptr);
 		return str.length;
 	}

   :o)

I know this is in jest, but since `strlen`'s interface is 
inherently unsafe, yes, the only way to make calling it @safe 
happens to also solve what `strlen` is supposed to solve.
To me the consequence of this would be to not use `strlen` (or 
any other C function where checking the arguments for @safety 
solves a superset of what the C function solves) from D.
I don't think this applies to most OS functions, though, just to 
(OS independent) libc functions.

Aug 01 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 8/1/17 5:54 PM, Moritz Maxeiner wrote:
 On Tuesday, 1 August 2017 at 20:39:35 UTC, Marco Leise wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions in D
 which prohibit unsafe calls.

 +1.

 I think I got it now!

     size_t strlen_safe(in char[] str) @trusted
     {
         foreach (c; str)
             if (!c)
                 return strlen(str.ptr);
         return str.length;
     }

   :o)

 
 I know this is in jest, but since `strlen`'s interface is inherently 
 unsafe, yes, the only way to make calling it @safe happens to also solve 
 what `strlen` is supposed to solve.
 To me the consequence of this would be to not use `strlen` (or any other 
 C function where checking the arguments for @safety solves a superset of 
 what the C function solves) from D.
 I don't think this applies to most OS functions, though, just to (OS 
 independent) libc functions.

I think it goes without saying that some functions just shouldn't be 
marked @safe or @trusted. strlen is one of those.

-Steve

Aug 01 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 1 August 2017 at 21:59:46 UTC, Steven Schveighoffer 
wrote:
 On 8/1/17 5:54 PM, Moritz Maxeiner wrote:
 On Tuesday, 1 August 2017 at 20:39:35 UTC, Marco Leise wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via 
 Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as 
 unsafe, except for rare cases where the behaviour is very 
 well defined in standards and in actual implementations to 
 be safe. The way to get safe functions for OS functionality 
 is to write wrapper functions in D which prohibit unsafe 
 calls.

 +1.

 I think I got it now!

     size_t strlen_safe(in char[] str) @trusted
     {
         foreach (c; str)
             if (!c)
                 return strlen(str.ptr);
         return str.length;
     }

   :o)

 
 I know this is in jest, but since `strlen`'s interface is 
 inherently unsafe, yes, the only way to make calling it @safe 
 happens to also solve what `strlen` is supposed to solve.
 To me the consequence of this would be to not use `strlen` (or 
 any other C function where checking the arguments for @safety 
 solves a superset of what the C function solves) from D.
 I don't think this applies to most OS functions, though, just 
 to (OS independent) libc functions.

 I think it goes without saying that some functions just 
 shouldn't be marked @safe or @trusted. strlen is one of those.

Of course, though I think this (sub) context was more about 
writing @safe D wrappers for @system C functions than about which 
C functions to mark as @trusted/@safe. `strnlen` shouldn't be 
marked @safe/@trusted, either, but writing a @safe D wrapper for 
it doesn't involve doing in D what `strnlen` is supposed to do:

---
size_t strnlen_safe(in char[] str)
{
     return strnlen(&str[0], str.length);
}
---

Not that there's much of a reason to do so, anyway, when the D 
idiomatic way is just a Phobos away:

---
import std.algorithm;
// I probably wouldn't even define this but use the body as is
auto strnlen_safe(in char[] str)
{
     return countUntil(cast(ubyte[]) str, '\0');
}
---
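
For reference, a compilable variant of the first wrapper, as a sketch only. It assumes a libc that provides POSIX.1-2008 `strnlen`; the prototype is declared by hand here rather than imported, and the name `strnlenSafe` is hypothetical. The wrapper needs `@trusted` so that @safe code can call it, and it guards the empty slice instead of relying on a bounds-check failure from `&str[0]`:

```d
// Hand-written prototype for the C function (assumed to exist in the libc).
extern (C) size_t strnlen(scope const(char)* s, size_t maxlen) nothrow @nogc;

/// Hypothetical wrapper: the bound passed to strnlen is always the
/// slice's own length, so the C code never reads past the slice.
size_t strnlenSafe(scope const(char)[] str) nothrow @trusted @nogc
{
    if (str.length == 0)
        return 0; // str.ptr may be null for an empty slice
    return strnlen(str.ptr, str.length);
}

void main() @safe
{
    assert(strnlenSafe("abc\0def") == 3);
    assert(strnlenSafe("abc") == 3);
    assert(strnlenSafe(null) == 0);
}
```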

Aug 01 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 8/1/17 6:17 PM, Moritz Maxeiner wrote:
 On Tuesday, 1 August 2017 at 21:59:46 UTC, Steven Schveighoffer wrote:
 On 8/1/17 5:54 PM, Moritz Maxeiner wrote:
 On Tuesday, 1 August 2017 at 20:39:35 UTC, Marco Leise wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d 
 wrote:
 Direct OS function calls should probably all be treated as unsafe, 
 except for rare cases where the behaviour is very well defined in 
 standards and in actual implementations to be safe. The way to get 
 safe functions for OS functionality is to write wrapper functions in D 
 which prohibit unsafe calls.

 +1.

 I think I got it now!

     size_t strlen_safe(in char[] str) @trusted
     {
         foreach (c; str)
             if (!c)
                 return strlen(str.ptr);
         return str.length;
     }

   :o)

 I know this is in jest, but since `strlen`'s interface is inherently 
 unsafe, yes, the only way to make calling it  safe happens to also 
 solve what `strlen` is supposed to solve.
 To me the consequence of this would be to not use `strlen` (or any 
 other C function where checking the arguments for  safety solves a 
 superset of what the C function solves) from D.
 I don't think this applies to most OS functions, though, just to (OS 
 independent) libc functions.

 I think it goes without saying that some functions just shouldn't be 
 marked  safe or  trusted. strlen is one of those.

 
 Of course, though I think this (sub) context was more about writing 
  safe D wrappers for  system C functions than about which C functions to 
 mark as  trusted/ safe. `strnlen` shouldn't be marked  safe/ trusted, 
 either, but writing a  safe D wrapper for it doesn't involve doing in D 
 what `strnlen` is supposed to do:
 
 ---
 size_t strnlen_safe(in char[] str)
 {
      return strnlen(&str[0], str.length);
 }
 ---

Most definitely. It would be nice to have a fully @safe interface that 
is as low-level as you can possibly get. Then any library implemented on 
top of it could be marked @safe as well.

 Not that there's much of a reason to do so, anyway, when the D idiomatic 
 way is just a Phobos away:
 
 ---
 import std.algorithm;
 // I probably wouldn't even define this but use the body as is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }

Oh that cast.... it irks me so.

-Steve

Aug 01 2017

"H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:

On Tue, Aug 01, 2017 at 06:46:17PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 8/1/17 6:17 PM, Moritz Maxeiner wrote:

[...]
 import std.algorithm;
 // I probably wouldn't even define this but use the body as is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }

 
 Oh that cast.... it irks me so.

[...]

Welcome to the wonderful world of autodecoding. :-D

OTOH, we could just use byCodeUnit and we wouldn't need the cast, I
think. 
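
A quick sketch of the `byCodeUnit` variant, under the assumption that the result should behave like `strnlen` (that is, return the slice length when there is no NUL). `lengthToNul` is a hypothetical name. Note that `countUntil` on its own returns -1 when nothing is found, which is not what `strnlen` does:

```d
import std.algorithm.searching : countUntil;
import std.utf : byCodeUnit;

// Length up to the first NUL, or the whole slice if there is none
// (mirroring what strnlen returns when it reaches maxlen).
size_t lengthToNul(const(char)[] str) @safe
{
    // byCodeUnit iterates raw chars, so no autodecoding and no cast.
    immutable idx = str.byCodeUnit.countUntil('\0');
    // countUntil returns -1 when the needle is absent.
    return idx < 0 ? str.length : cast(size_t) idx;
}
```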


T

-- 
Don't get stuck in a closet---wear yourself out.

Aug 01 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 1 August 2017 at 22:52:26 UTC, H. S. Teoh wrote:
 On Tue, Aug 01, 2017 at 06:46:17PM -0400, Steven Schveighoffer 
 via Digitalmars-d wrote:
 On 8/1/17 6:17 PM, Moritz Maxeiner wrote:

 [...]
 import std.algorithm;
 // I probably wouldn't even define this but use the body as 
 is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }

 
 Oh that cast.... it irks me so.

 [...]

 Welcome to the wonderful world of autodecoding. :-D

 OTOH, we could just use byCodeUnit and we wouldn't need the 
 cast, I think.

I was lazy, okay (I nearly forgot putting the auto decoding 
prevention in there, because I always forget that D has auto 
decoding; it irks me as well) :p

Aug 01 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

 import std.algorithm;
 // I probably wouldn't even define this but use the body as is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }

 
 Oh that cast.... it irks me so.
 
 -Steve

return str.representation.countUntil('\0');

Andrei

Aug 02 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 2 August 2017 at 16:32:44 UTC, Andrei Alexandrescu 
wrote:
 import std.algorithm;
 // I probably wouldn't even define this but use the body as is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }

 
 Oh that cast.... it irks me so.
 
 -Steve

 return str.representation.countUntil('\0');

Thanks, wasn't aware of this; it makes auto decoding slightly 
more bearable.

Aug 02 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/25/2017 10:54 PM, Walter Bright wrote:
 On 7/25/2017 8:26 AM, Andrei Alexandrescu wrote:
 A suite of safe wrappers on OS primitives might be useful.

 
 The idea of fixing the operating system interface(s) has come up now and 
 then. I've tried to discourage that on the following grounds:
 
 
 * We are not in the operating system business.
 
 * Operating system APIs grow like weeds. We'd set ourselves an 
 impossible task.
 
 * It's a huge job simply to provide accurate declarations for the APIs.
 
 * We'd have to write our own documentation for the operating system 
 APIs. It's hard enough writing such for Phobos.
 
 * A lot are fundamentally unfixable, like free() and strlen().
 
 * The API import files should be focused solely on direct access to the 
 APIs, not adding a translation layer. The user of them will expect this.
 
 * We already have safe wrappers for the commonly used APIs. For read(), 
 there is std.stdio.

The standard library would not be in the position to provide such, but 
the project seems a good choice for a crowdsource and crowdmaintained 
library. -- Andrei

Jul 27 2017

Shachar Shemesh <shachar weka.io> writes:

On 25/07/17 18:26, Andrei Alexandrescu wrote:
 (btw void[] doesn't work)

Can you expand on this point?

Shachar

Jul 26 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/26/17 3:05 AM, Shachar Shemesh wrote:
 On 25/07/17 18:26, Andrei Alexandrescu wrote:
 (btw void[] doesn't work)

 
 Can you expand on this point?
 
 Shachar

Because anything casts to void[] implicitly.

e.g.:

void main() @safe
{
    int *[] arr = new int*[5];
    read(0, arr); // reading raw pointer data, shouldn't be allowed
}
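
One way around the implicit conversion, sketched under assumptions: have the wrapper accept `ubyte[]` rather than `void[]`. Arrays of pointers do not convert to `ubyte[]` implicitly, so @safe callers cannot overwrite pointer data this way. `readBytes` is a hypothetical name; the binding used is druntime's `core.sys.posix.unistd.read` (POSIX only). The wrapper is left without `nothrow`/`@nogc` so that it does not depend on how the binding is attributed:

```d
import core.sys.posix.sys.types : ssize_t;
import core.sys.posix.unistd : read;

ssize_t readBytes(int fd, scope ubyte[] buf) @trusted
{
    // Pointer and length come from the same slice, so they cannot disagree.
    return read(fd, buf.ptr, buf.length);
}
```

The price is that callers with a `char[]` or `int[]` buffer have to convert explicitly, which seems like the right place to put that decision.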

-Steve

Jul 26 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 25 July 2017 at 14:32:18 UTC, Shachar Shemesh wrote:
 On 25/07/17 17:11, ag0aep6g wrote:
 On 07/25/2017 03:50 PM, Shachar Shemesh wrote:
 The title really does says it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes 
 to them. Otherwise I simply cannot call "signalfd" and 
 "sigemptyset" (to name a couple from my most recent history) 
 from  safe code.

 
 Not all OS functions can be ` trusted`.
 
 I don't know about `signalfd` and `sigemptyset`, but `read` [1] 
 can't be ` trusted`, for example. It takes pointer and length 
 separately, and the pointer is a `void*`. That's not safe at 
 all.

 And, indeed, the code calling "read" shouldn't be able to do 
 that as  safe. Read itself, however, is trusted

No, it is not, because it does not fulfill the definition of 
@trusted (callable from *any* @safe context without allowing 
memory corruption).

 (because, let's face it, if you cannot trust the kernel, you're 
 screwed anyways).

This has nothing to do with trusting the kernel:
---
char[1] buf;
int dontCorruptMePlease;
read(fd, &buf[0], 10);
---
The read implementation can't verify the buffer size, it must 
assume it to be correct. If it's too large for the actual buffer 
-> memory corruption.
No function that takes a pointer plus the size of the pointed-to 
memory (and accesses it) can be @trusted.

 Having said that, I have no objection to excluding the 
 "pointer+length" system calls from the above rule. They are, by 
 far, the minority of system calls.

And also happen to be the most used ones.
But I digress, the point is *every single function must be verified 
for every single attribute* (other than nothrow).
PRs are welcome :)

Jul 25 2017

Stefan Koch <uplink.coder googlemail.com> writes:

On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes 
 to them. Otherwise I simply cannot call "signalfd" and 
 "sigemptyset" (to name a couple from my most recent history) 
 from  safe code.

 I can try and set up a PR when I have the time. If anyone else 
 wants to take an easy one before then, you're welcome to :-)

 Shachar

These functions are supposed to have trusted wrappers if used in 
safe code.

Jul 25 2017

Shachar Shemesh <shachar weka.io> writes:

On 25/07/17 17:12, Stefan Koch wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes to them. 
 Otherwise I simply cannot call "signalfd" and "sigemptyset" (to name a 
 couple from my most recent history) from  safe code.

 I can try and set up a PR when I have the time. If anyone else wants 
 to take an easy one before then, you're welcome to :-)

 Shachar

 
 these functions are supposed to have trused wrappers if used in safe code.

I'd love to hear the difference between:
extern(C) int signalfd(int __fd, const(sigset_t)* __mask, int __flags) 
nothrow @nogc;

and
int signalfdWrapper(int __fd, const(sigset_t)* __mask, int __flags) 
nothrow @trusted @nogc {
	return signalfd(__fd, __mask, __flags);
}

Or are you suggesting the wrapper do something else?

Shachar

Jul 25 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/25/17 10:27 AM, Shachar Shemesh wrote:
 On 25/07/17 17:12, Stefan Koch wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes to 
 them. Otherwise I simply cannot call "signalfd" and "sigemptyset" (to 
 name a couple from my most recent history) from  safe code.

 I can try and set up a PR when I have the time. If anyone else wants 
 to take an easy one before then, you're welcome to :-)

 Shachar

 these functions are supposed to have trused wrappers if used in safe 
 code.

 
 I'd love to hear the difference between:
 extern(C) int signalfd (int __fd, const(sigset_t)* __mask, int __flags) 
 nothrow  nogc;
 
 and
 int signalfdWrapper(int __fd, const(sigset_t)* __mask, int __flags) 
 nothrow  trusted  nogc {
      return signalfd(__fd, __mask, __flags);
 }

I think signalfd can be marked @trusted, as @safe code supports pointing 
at a single element.

Other system calls that accept a pointer/length combo cannot be marked 
@trusted.
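
To make the single-element case concrete, here is a rough sketch using `sigemptyset` from the original post. Taking the set by `ref` means a @safe caller cannot pass null or a pointer into the middle of something else. `clearSignalSet` is a made-up name, and the wrapper is deliberately left without `nothrow`/`@nogc`, since whether the druntime binding carries those attributes is exactly what this thread is about:

```d
import core.sys.posix.signal : sigemptyset, sigset_t;

int clearSignalSet(ref sigset_t set) @trusted
{
    // `ref` guarantees exactly one valid sigset_t behind the pointer.
    return sigemptyset(&set);
}
```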

-Steve

Jul 25 2017

Kagamin <spam here.lot> writes:

While we're at it, check this: 
https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047

Jul 25 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/25/17 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047

Looks fine to me. That's not an array of FILE, it's a single pointer.

-Steve

Jul 25 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 25 July 2017 at 18:07:06 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047

 Looks fine to me. That's not an array of FILE, it's a single 
 pointer.

fgetc cannot be @trusted the same way fclose cannot be @trusted.
If you pass either of them `null` - which constitutes a legal 
@safe context - the behaviour is undefined, which contradicts the 
@trusted definition:
<Trusted functions are guaranteed by the programmer to not 
exhibit any undefined behavior if called by a safe function.>

Jul 25 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/25/17 2:36 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 18:07:06 UTC, Steven Schveighoffer wrote:
 On 7/25/17 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047 

 Looks fine to me. That's not an array of FILE, it's a single pointer.

 
 fgetc cannot be  trusted the same way fclose cannot be  trusted.
 If you pass either of them `null` - which constitutes a legal  safe 
 context - the behaviour is undefined, which contradicts  trusted 
 definition:
 <Trusted functions are guaranteed by the programmer to not exhibit any 
 undefined behavior if called by a safe function.>

The behavior is defined. It will crash with a segfault. This is par for 
the course in @safe land -- dereferencing null pointers is OK.

What is not defined is to fclose a file, and then use that FILE * in any 
way afterwards without reassigning.

Note that @safe functions don't make any guarantees once you pass in an 
invalid (except for null) or dangling pointer. However, if you are using 
only @safe code, you shouldn't be able to make one of these either. 
Hence fclose is not @safe or @trusted.

The one case where this fails is for a null pointer to a very very large 
struct that has a way to reference data outside the protected page. I 
have proposed in the past a way to protect against this, but it didn't 
gain any traction.

-Steve

Jul 25 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 2:36 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 18:07:06 UTC, Steven Schveighoffer 
 wrote:
 On 7/25/17 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047

 Looks fine to me. That's not an array of FILE, it's a single 
 pointer.

 
 fgetc cannot be  trusted the same way fclose cannot be 
  trusted.
 If you pass either of them `null` - which constitutes a legal 
  safe context - the behaviour is undefined, which contradicts 
  trusted definition:
 <Trusted functions are guaranteed by the programmer to not 
 exhibit any undefined behavior if called by a safe function.>

 The behavior is defined. It will crash with a segfault.

In C land that behaviour is a platform (hardware/OS/libc) 
specific implementation detail (it's what you generally expect to 
happen, but AFAIK it isn't defined in official ISO/IEC C).

 This is par for the course in  safe land -- dereferencing null 
 pointers  is OK.

In D land we require null dereferences to crash.

That means - from a strict, pedantic standpoint - that while it's 
OK to attribute D functions with null dereferences as @trusted, 
the same can't be said for C functions with null dereferences.
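
For anyone who wants to take the pedantic reading seriously, the null case can be closed off in a wrapper instead of relying on the C library. A small sketch: the name `fgetcChecked` is hypothetical, and returning `EOF` for null is just one possible policy. It does nothing about dangling pointers, which @safe code is not supposed to be able to create in the first place:

```d
import core.stdc.stdio : EOF, FILE, fgetc;

int fgetcChecked(FILE* stream) @trusted nothrow @nogc
{
    // Never hand null to C: what fgetc(null) does is up to the libc.
    if (stream is null)
        return EOF;
    return fgetc(stream);
}
```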

Jul 25 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.

 
 In C land that behaviour is a platform (hardware/OS/libc) specific 
 implementation detail (it's what you generally expect to happen, but 
 AFAIK it isn't defined in official ISO/IEC C).

In cases where C does not crash when dereferencing null, then D would 
not crash when dereferencing null. D depends on the hardware doing this 
(Walter has said so many times), so if C doesn't do it, then D won't. So 
those systems would have to be treated specially, and you'd have to work 
out your own home-grown mechanism for memory safety.

Optionally, one can redefine @safe *on those platforms* to say all 
dereferences will be checked against null, and then it could work on 
such platforms (and of course, you'd have to remove the @trusted marks 
from low-level C calls).

Either way, we can mark these as @trusted for all current D platforms.

-Steve

Jul 25 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.

 In C land that behaviour is a platform (hardware/OS/libc) specific 
 implementation detail (it's what you generally expect to happen, but 
 AFAIK it isn't defined in official ISO/IEC C).

 
 In cases where C does not crash when dereferencing null, then D would 
 not crash when dereferencing null. D depends on the hardware doing this 
 (Walter has said so many times), so if C doesn't do it, then D won't. So 
 those systems would have to be treated specially, and you'd have to work 
 out your own home-grown mechanism for memory safety.

What Moritz is saying is that the following implementation of fclose is 
correct according to the C standard:

int fclose(FILE *stream){
     if(stream == NULL){
         go_wild_and_corrupt_all_the_memory();
     }else{
         actually_close_the_file(stream);
     }
}

Jul 25 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 26.07.2017 02:45, Timon Gehr would have liked to have written:
 ...
 What Moritz is saying is that the following implementation of fclose is 
 correct according to the C standard:
 
 int fclose(FILE *stream){
      if(stream == NULL){
          return go_wild_and_corrupt_all_the_memory();
      }else{
          return actually_close_the_file(stream);
      }
 }

(Forgot the returns.)

Jul 25 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/25/2017 08:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.

 In C land that behaviour is a platform (hardware/OS/libc) specific 
 implementation detail (it's what you generally expect to happen, but 
 AFAIK it isn't defined in official ISO/IEC C).

 In cases where C does not crash when dereferencing null, then D would 
 not crash when dereferencing null. D depends on the hardware doing 
 this (Walter has said so many times), so if C doesn't do it, then D 
 won't. So those systems would have to be treated specially, and you'd 
 have to work out your own home-grown mechanism for memory safety.

 
 What Moritz is saying is that the following implementation of fclose is 
 correct according to the C standard:
 
 int fclose(FILE *stream){
      if(stream == NULL){
          go_wild_and_corrupt_all_the_memory();
      }else{
          actually_close_the_file(stream);
      }
 }

I'd think that would be the case, but failed to find a fgetc 
implementation that mentions it's undefined for a null FILE*. Is there a 
link? Thx. -- Andrei

Jul 25 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 7/25/2017 5:56 PM, Andrei Alexandrescu wrote:
 I'd think that would be the case, but failed to find a fgetc implementation that 
 mentions it's undefined for a null FILE*. Is there a link? Thx. -- Andrei

The documentation for DMC++ fgetc() is:

   https://digitalmars.com/rtl/stdio.html#fgetc

and says:

   "Returns the character just read on success, or EOF if end-of-file or a read 
error is encountered."

The implementation checks for fp being NULL and returns EOF if it is.

Jul 25 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 26.07.2017 05:02, Walter Bright wrote:
 On 7/25/2017 5:56 PM, Andrei Alexandrescu wrote:
 I'd think that would be the case, but failed to find a fgetc 
 implementation that mentions it's undefined for a null FILE*. Is there 
 a link? Thx. -- Andrei

 
 The documentation for DMC++ fgetc() is:
 
    https://digitalmars.com/rtl/stdio.html#fgetc
 
 and says:
 
    "Returns the character just read on success, or EOF if end-of-file or 
 a read error is encountered."
 
 The implementation checks for fp being NULL and returns EOF if it is.

The C mindset is that this check is a waste of precious processing 
resources and morally wrong, as only a fool would pass NULL anyway, and 
fools deserve to get UB.

Jul 26 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 7/26/2017 3:14 AM, Timon Gehr wrote:
 On 26.07.2017 05:02, Walter Bright wrote:
 The implementation checks for fp being NULL and returns EOF if it is.

 
 The C mindset is that this check is a waste of precious processing resources and 
 morally wrong, as only a fool would pass NULL anyway, and fools deserve to get UB.

I wrote that code 30+ years ago, and no longer remember why I put the null check 
in. It might have been because other C compiler libraries did it.

Jul 26 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 26.07.2017 02:56, Andrei Alexandrescu wrote:
 What Moritz is saying is that the following implementation of fclose 
 is correct according to the C standard:

 int fclose(FILE *stream){
      if(stream == NULL){
          return go_wild_and_corrupt_all_the_memory();
      }else{
          return actually_close_the_file(stream);
      }
 }

 
 I'd think that would be the case, but failed to find a fgetc 
 implementation that mentions it's undefined for a null FILE*. Is there a 
 link? Thx. -- Andrei

It's implicit. In C, whenever you pass something that is outside the 
interface specification, you get UB. Also, in C, there is no way to get 
a segmentation fault except for UB, and fgetc(NULL) segfaults with glibc.

Jul 26 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/25/17 8:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.

 In C land that behaviour is a platform (hardware/OS/libc) specific 
 implementation detail (it's what you generally expect to happen, but 
 AFAIK it isn't defined in official ISO/IEC C).

 In cases where C does not crash when dereferencing null, then D would 
 not crash when dereferencing null. D depends on the hardware doing 
 this (Walter has said so many times), so if C doesn't do it, then D 
 won't. So those systems would have to be treated specially, and you'd 
 have to work out your own home-grown mechanism for memory safety.

 
 What Moritz is saying is that the following implementation of fclose is 
 correct according to the C standard:
 
 int fclose(FILE *stream){
      if(stream == NULL){
          go_wild_and_corrupt_all_the_memory();
      }else{
          actually_close_the_file(stream);
      }
 }

I think we can correctly assume no fclose implementations exist that do 
anything but access data pointed at by stream. Which means a segfault on 
every platform we support.

On platforms that may not segfault, you'd be on your own.

In other words, I think we can assume for any C functions that are 
passed pointers that dereference those pointers, passing null is safely 
going to segfault.

Likewise, because D depends on hardware flagging of dereferencing null 
as a segfault, any platforms that *don't* have that for C also won't 
have it for D. And then @safe doesn't even work in D code either.

As we have good support for different prototypes for different 
platforms, we could potentially unmark those as @trusted in those cases.

-Steve

Jul 25 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 7/25/2017 6:09 PM, Steven Schveighoffer wrote:
 Likewise, because D depends on hardware flagging of dereferencing null as a 
 segfault, any platforms that *don't* have that for C also won't have it for D. 
 And then  safe doesn't even work in D code either.

I spent 10 years programming on DOS with zero memory protection, and people have 
forgotten how awful that was. You couldn't simply instrument the code with null 
pointer checks, either, because then the program would be too big to fit.

The solution finally appeared with the 286 DOS Extenders, which ran in protected 
mode. I switched to doing all my development under them, and would port to DOS 
only after passing all the test suite.

D is definitely predicated on having hardware memory protection.

The C/C++ Standards are still hanging on to EBCDIC, 10 bit bytes, non-IEEE 
floating point, etc. It's time to let that crap go :-)

One C++ programmer told me that C++ could handle any character set. I asked him 
how RADIX50 was supported. Segfault! (I learned to program on RADIX50 systems.)

D made some fundamental decisions:

* Unicode
* 2's complement
* 8 bit bytes
* IEEE arithmetic
* memory protection
* fixed sizes for integral types
* single pointer type
* >= 32 bit processors

that relegated a lot of junk to the dustbin of history. (It's awful pretending 
to support that stuff. C and C++ pretend to, but just about zero programs will 
actually work on such systems, because there aren't any to try the code out on.)
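
Several of these decisions can be checked at compile time. The following is only an illustrative sketch of what the list means in practice, not something from the spec or the test suite:

```d
// Fixed sizes for integral types, with 8-bit bytes.
static assert(byte.sizeof == 1 && ubyte.max == 255);
static assert(short.sizeof == 2 && int.sizeof == 4 && long.sizeof == 8);

// Two's complement: the most negative value has no positive counterpart.
static assert(int.min == -int.max - 1);

// IEEE 754 single and double precision.
static assert(float.mant_dig == 24 && double.mant_dig == 53);

// Unicode: char, wchar and dchar are UTF-8, UTF-16 and UTF-32 code units.
static assert(char.sizeof == 1 && wchar.sizeof == 2 && dchar.sizeof == 4);

// Single pointer type, at least 32 bits wide.
static assert((void*).sizeof == size_t.sizeof && size_t.sizeof >= 4);
```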

Jul 25 2017

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Wednesday, 26 July 2017 at 03:16:44 UTC, Walter Bright wrote:
 On 7/25/2017 6:09 PM, Steven Schveighoffer wrote:
 Likewise, because D depends on hardware flagging of 
 dereferencing null as a segfault, any platforms that *don't* 
 have that for C also won't have it for D. And then  safe 
 doesn't even work in D code either.

 I spent 10 years programming on DOS with zero memory 
 protection, and people have forgotten how awful that was. You 
 couldn't simply instrument the code with null pointer checks, 
 either, because then the program would be too big to fit.

 The solution finally appeared with the 286 DOS Extenders, which 
 ran in protected mode. I switched to doing all my development 
 under them, and would port to DOS only after passing all the 
 test suite.

 D is definitely predicated on having hardware memory protection.

 The C/C++ Standards are still hanging on to EBCDIC, 10 bit 
 bytes, non-IEEE floating point, etc. It's time to let that crap 
 go :-)

 One C++ programmer told me that C++ could handle any character 
 set. I asked him how RADIX50 was supported. Segfault! (I 
 learned to program on RADIX50 systems.)

 D made some fundamental decisions:

 * Unicode
 * 2's complement
 * 8 bit bytes
 * IEEE arithmetic
 * memory protection
 * fixed sizes for integral types
 * single pointer type
 * >= 32 bit processors

 that relegated a lot of junk to the dustbin of history. (It's 
 awful pretending to support that stuff. C and C++ pretend do, 
 but just about zero programs will actually work on such 
 systems, because there aren't any to try the code out on.)

For that list of decisions alone I love you. I can no longer stand 
all the crap about "undefined behaviour", "nasal demons" 
and optimizers that think they are entitled to sabotage 
programs because an overzealous language lawyer in the C 
world is practicing POOP (premature optimisation oriented 
programming).

Jul 26 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 26.07.2017 03:09, Steven Schveighoffer wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 ...
 What Moritz is saying is that the following implementation of fclose 
 is correct according to the C standard:

 int fclose(FILE *stream){
      if(stream == NULL){
          return go_wild_and_corrupt_all_the_memory();
      }else{
          return actually_close_the_file(stream);
      }
 }

 
 I think we can correctly assume no fclose implementations exist that do 
 anything but access data pointed at by stream. Which means a segfault on 
 every platform we support.
 
 On platforms that may not segfault, you'd be on your own.
 
 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is safely 
 going to segfault.

I'm not going to assume that.

Jul 26 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/26/17 6:01 AM, Timon Gehr wrote:
 On 26.07.2017 03:09, Steven Schveighoffer wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 ...
 What Moritz is saying is that the following implementation of fclose 
 is correct according to the C standard:

 int fclose(FILE *stream){
      if(stream == NULL){
          return go_wild_and_corrupt_all_the_memory();
      }else{
          return actually_close_the_file(stream);
      }
 }

 I think we can correctly assume no fclose implementations exist that 
 do anything but access data pointed at by stream. Which means a 
 segfault on every platform we support.

 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is 
 safely going to segfault.

 
 I'm not going to assume that.

Tell you what, when you find a D platform that this doesn't happen, we 
can fix it with a version statement ;)
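
Roughly what such a version statement could look like, as a sketch only. `NullPageUnprotected` is a made-up version identifier (something one would pass via `-version=`), not one the compiler predefines, and `fgetcPortable` is a hypothetical name:

```d
import core.stdc.stdio;

version (NullPageUnprotected)
{
    // No hardware trap for null on this platform: check explicitly.
    int fgetcPortable(FILE* stream) @trusted nothrow @nogc
    {
        return stream is null ? EOF : fgetc(stream);
    }
}
else
{
    // Null dereference is assumed to trap, so forward directly.
    int fgetcPortable(FILE* stream) @trusted nothrow @nogc
    {
        return fgetc(stream);
    }
}
```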

-Steve

Jul 26 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 26.07.2017 13:22, Steven Schveighoffer wrote:
 On 7/26/17 6:01 AM, Timon Gehr wrote:
 On 26.07.2017 03:09, Steven Schveighoffer wrote:
 ...
 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is 
 safely going to segfault.

 I'm not going to assume that.

 
 Tell you what, when you find a D platform that this doesn't happen, we can 
 fix it with a version statement ;)
 
 -Steve

The burden of proof is on you, not me. You are advocating the C approach 
to memory safety.

Jul 26 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/26/17 7:55 AM, Timon Gehr wrote:
 On 26.07.2017 13:22, Steven Schveighoffer wrote:
 On 7/26/17 6:01 AM, Timon Gehr wrote:
 On 26.07.2017 03:09, Steven Schveighoffer wrote:
 ...
 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is 
 safely going to segfault.

 I'm not going to assume that.

 Tell you what, when you find a D platform that this doesn't happen, 
 we can fix it with a version statement ;)

 
 The burden of proof is on you, not me. You are advocating the C approach 
 to memory safety.

They leave NULL dereferencing undefined because in some quirky old 
useless no-longer-existing hardware, it doesn't segfault.

Note that this is more implementation defined than undefined (in fact, I 
couldn't find it listed in the UB section at all in the C11 spec).

Look at Walter's response. I think D can simply only work with C 
implementations on platforms where null dereferencing segfaults and 
ignore the rest.

Walter, can we update the @safe spec to say that reading/writing data 
from the null page in C is required to generate a program crash for 
@safe to be valid? This can be an exception to the UB rule.

I just don't see the point of adding extra checks for null when the 
hardware already does it.

-Steve

Jul 26 2017

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven 
 Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.

 In C land that behaviour is a platform (hardware/OS/libc) 
 specific implementation detail (it's what you generally 
 expect to happen, but AFAIK it isn't defined in official 
 ISO/IEC C).

 In cases where C does not crash when dereferencing null, then 
 D would not crash when dereferencing null. D depends on the 
 hardware doing this (Walter has said so many times), so if C 
 doesn't do it, then D won't. So those systems would have to 
 be treated specially, and you'd have to work out your own 
 home-grown mechanism for memory safety.

 
 What Moritz is saying is that the following implementation of 
 fclose is correct according to the C standard:
 
 int fclose(FILE *stream){
      if(stream == NULL){
          go_wild_and_corrupt_all_the_memory();
      }else{
          actually_close_the_file(stream);
      }
 }

 I think we can correctly assume no fclose implementations exist 
 that do anything but access data pointed at by stream. Which 
 means a segfault on every platform we support.

How lucky that Solaris/SPARC is not supported, as on that 
platform fclose(NULL) and even close(-1) do not segfault. I had to 
learn that the hard way when we ported our project from 
Solaris/SPARC to Linux/x86_64. It was surprising how often that 
(wrong) behavior happened in our code base (100K lines of C).

 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions that 
 are passed pointers that dereference those pointers, passing 
 null is safely going to segfault.

Dereferencing a NULL pointer on Solaris/SPARC segfaults, but 
fclose() apparently does not blindly dereference the pointer it is 
passed. I suspect that SUN intentionally reduced the 
opportunities to segfault in a lot of system calls and libs. The 
port to Linux revealed several violations (stale pointer usage, 
double frees, buffer overflows) that never triggered on Solaris, 
and the project is more than 20 years old.

 Likewise, because D depends on hardware flagging of 
 dereferencing null as a segfault, any platforms that *don't* 
 have that for C also won't have it for D. And then  safe 
 doesn't even work in D code either.

 As we have good support for different prototypes for different 
 platforms, we could potentially unmark those as  trusted in 
 those cases.

Jul 26 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/26/17 12:08 PM, Patrick Schluter wrote:
 On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.

 In C land that behaviour is a platform (hardware/OS/libc) specific 
 implementation detail (it's what you generally expect to happen, 
 but AFAIK it isn't defined in official ISO/IEC C).

 In cases where C does not crash when dereferencing null, then D 
 would not crash when dereferencing null. D depends on the hardware 
 doing this (Walter has said so many times), so if C doesn't do it, 
 then D won't. So those systems would have to be treated specially, 
 and you'd have to work out your own home-grown mechanism for memory 
 safety.

 What Moritz is saying is that the following implementation of fclose 
 is correct according to the C standard:

 int fclose(FILE *stream){
      if(stream == NULL){
          go_wild_and_corrupt_all_the_memory();
      }else{
          actually_close_the_file(stream);
      }
 }

 I think we can correctly assume no fclose implementations exist that 
 do anything but access data pointed at by stream. Which means a 
 segfault on every platform we support.

 
 What a luck that Solaris/SPARC is not supported as on that platform 
 fclose(NULL) and even close(-1) do not segfault. Had to learn it the 
 hard way when we ported our project from Solaris/SPARC to Linux/x86_64. 
 It was surprizing how often that (wrong) behavior happenned in our code 
 base (100K line of C).

I'm guessing though that it's an implementation detail (like Walter's 
DMC example). A segfault is fine, and returning an error is fine. Both 
will properly be handled, and do not cause UB.

So I guess I should restate that we can assume no implementations exist 
that intentionally cause UB when stream is NULL (as in Timon's example). 
Either they check for null, and handle gracefully, or don't check and 
segfault.

What I was talking about is platforms that don't segfault on 
reading/writing from the zero page. Those we couldn't support with @safe 
D anyway.

-Steve

Jul 26 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations exist 
 that intentionally cause UB when stream is NULL (as in Timon's example). 
 Either they check for null, and handle gracefully, or don't check and 
 segfault.

No need to worry about that at all. If worse comes to worst - i.e. we do 
port to such an implementation - we can always provide a thin wrapper 
that checks for NULL then calls the native function. No need to change 
the signatures. -- Andrei
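
For illustration only, here is a minimal C-side sketch of such a thin wrapper. The name `checked_fgetc` is hypothetical and not part of any libc or of druntime, and how the wrapper would be wired in under the original symbol name is left open. It just shows the "check for NULL, then call the native function" idea:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper name, chosen for this sketch only. */
int checked_fgetc(FILE *stream)
{
    if (stream == NULL) {
        /* Terminate deterministically instead of relying on the
           platform to fault when the null pointer is dereferenced. */
        abort();
    }
    /* Non-null: forward to the native implementation unchanged. */
    return fgetc(stream);
}
```

The check only covers null. A dangling or otherwise invalid non-null pointer is still the caller's problem.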

Jul 26 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 27.07.2017 01:56, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example).


My argument was not that we need to fear implementations that take 
explicit measures to screw you, but UB is UB. Compilers can in principle 
turn segfaults into any other behaviour they want, and this behaviour 
can change between releases. I'd just rather not codify guarantees that 
do not exist into the type system, as it is not really feasible to check 
them, even if in practice you will in the overwhelming majority of cases 
get the expected behaviour.

 Either they check for null, and handle gracefully, or don't 
 check and segfault.

 
 No need to worry about that at all. If worse comes to worst - i.e. we do 
 port to such an implementation

How do you notice?

 - we can always provide a thin wrapper 
 that checks for NULL then calls the native function. No need to change 
 the signatures. -- Andrei

I don't see how that works, as you'd end up with two different 
implementations of the same C function. (I.e. you get a name clash in 
the object file.)

Jul 26 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/26/17 8:09 PM, Timon Gehr wrote:
 On 27.07.2017 01:56, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example).


 
 My argument was not that we need to fear implementations that take 
 explicit measures to screw you, but UB is UB. Compilers can in principle 
 turn segfaults into any other behaviour they want, and this behaviour 
 can change between releases. I'd just rather not codify guarantees that 
 do not exist into the type system, as it is not really feasible to check 
 them, even if in practice you will in the overwhelming majority get the 
 expected behaviour.

I can't see how compilers can take advantage of this one. However, we 
can take advantage that this UB is almost universally implemented as a 
hardware segfault that ends the process.

-Steve

Jul 26 2017

Jacob Carlborg <doob me.com> writes:

On 2017-07-27 03:14, Steven Schveighoffer wrote:

 I can't see how compilers can take advantage of this one. However, we 
 can take advantage that this UB is almost universally implemented as a 
 hardware segfault that ends the process.

Unfortunately it's not that easy with optimizing compilers for C and C++:

void contains_null_check(int* p)
{
     int dead = *p;

     if (p == 0)
         return;

     *p = 4;
}

If the compiler runs the "Dead Code Elimination" optimization before 
"Redundant Null Check Elimination" then the above code will turn into:


void contains_null_check(int* p)
{
     if (p == 0) // Null check not redundant, and is kept.
         return;

     *p = 4;
}

But if the compiler runs the optimizations in the opposite order we end 
up with this code:


void contains_null_check(int* p)
{
     int dead = *p;

     if (false) // "p" was dereferenced by this point, so it can't be null
         return;

     *p = 4;
}

And then the compiler runs the "Dead Code Elimination" pass and we're 
left with:

void contains_null_check(int* p)
{
     *p = 4;
}

This can change between releases of compilers and between different 
vendors. Introducing an inlining pass will make this even more 
complicated, because the above example might be spread across multiple 
functions that have now been inlined.
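
As a contrast, a small sketch of the ordering that leaves the optimizer nothing to exploit: the null check comes before any dereference, so no pass can infer that the pointer is non-null. The function name `store_if_nonnull` and its return convention are made up for this example:

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if the store happened, 0 if p was null. */
int store_if_nonnull(int *p)
{
    /* The check precedes every dereference of p, so it cannot be
       treated as redundant. */
    if (p == NULL)
        return 0;

    *p = 4;
    return 1;
}
```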

For reference: 
http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

-- 
/Jacob Carlborg

Jul 26 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/27/17 2:48 AM, Jacob Carlborg wrote:
 And then the compiler runs the "Dead Code Elimination" pass and we're 
 left with:
 
 void contains_null_check(int* p)
 {
      *p = 4;
 }

So the result is that it will segfault. I don't see a problem with this. 
It's what I would have expected.

-Steve

Jul 27 2017

Patrick Schluter <Patrick.Schluter bbox.fr> writes:

On Thursday, 27 July 2017 at 11:46:24 UTC, Steven Schveighoffer 
wrote:
 On 7/27/17 2:48 AM, Jacob Carlborg wrote:
 And then the compiler runs the "Dead Code Elimination" pass 
 and we're left with:
 
 void contains_null_check(int* p)
 {
      *p = 4;
 }

 So the result is that it will segfault. I don't see a problem 
 with this. It's what I would have expected.

Except that that code was used in the Linux kernel where page 0 
was mapped and thus de-referencing the pointer did not segfault.

The issue that is missed here is the purpose for which the 
compiler is used. Will the code always be run in a hosted 
environment, or is it used in a freestanding implementation 
(kernel and embedded stuff)? The C standard distinguishes between 
the two, but the compiler gurus apparently do not care.
As for D, Walter's list of constraints for a D compiler makes it 
imho impossible to use the language on smaller embedded platforms 
or in ring 0 mode on x86.
That's why I find calling D a systems language somewhat 
disingenuous. Calling it an application language would be truer.

Jul 27 2017

Jacob Carlborg <doob me.com> writes:

On 2017-07-27 13:46, Steven Schveighoffer wrote:

 So the result is that it will segfault. I don't see a problem with this. 
 It's what I would have expected.

The problem is that behavior might change depending on which compiler is 
used because the code is not valid according to the specification.

-- 
/Jacob Carlborg

Jul 27 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/26/17 7:56 PM, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example). Either they check for null, and handle gracefully, or don't 
 check and segfault.

 
 No need to worry about that at all. If worse comes to worst - i.e. we do 
 port to such an implementation - we can always provide a thin wrapper 
 that checks for NULL then calls the native function. No need to change 
 the signatures. -- Andrei

Hm.. so you mean:

pragma(mangle, "fgetc")
private extern(C) int real_fgetc(FILE * stream);

extern(D) int fgetc(FILE *stream) @trusted
{
   if(stream == null) assert(0);
   return real_fgetc(stream);
}

Yeah, that should work well actually. Nice!

-Steve

Jul 26 2017

Timon Gehr <timon.gehr gmx.ch> writes:

On 27.07.2017 02:11, Steven Schveighoffer wrote:
 On 7/26/17 7:56 PM, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example). Either they check for null, and handle gracefully, or don't 
 check and segfault.

 No need to worry about that at all. If worse comes to worst - i.e. we 
 do port to such an implementation - we can always provide a thin 
 wrapper that checks for NULL then calls the native function. No need 
 to change the signatures. -- Andrei

 
 Hm.. so you mean:
 
 pragma(mangle, "fgetc")
 private extern(C) int real_fgetc(FILE * stream)
 
 extern(D) int fgetc(FILE *stream)  trusted
 {
    if(stream == null) assert(0);
    return real_fgetc(stream);
 }
 
 Yeah, that should work well actually. Nice!
 
 -Steve

That works but it changes the signature. (extern(D) vs. extern(C)).

Jul 27 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/27/17 7:27 AM, Timon Gehr wrote:
 On 27.07.2017 02:11, Steven Schveighoffer wrote:
 On 7/26/17 7:56 PM, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example). Either they check for null, and handle gracefully, or 
 don't check and segfault.

 No need to worry about that at all. If worse comes to worst - i.e. we 
 do port to such an implementation - we can always provide a thin 
 wrapper that checks for NULL then calls the native function. No need 
 to change the signatures. -- Andrei

 Hm.. so you mean:

 pragma(mangle, "fgetc")
 private extern(C) int real_fgetc(FILE * stream)

 extern(D) int fgetc(FILE *stream)  trusted
 {
    if(stream == null) assert(0);
    return real_fgetc(stream);
 }

 Yeah, that should work well actually. Nice!

 
 That works but it changes the signature. (extern(D) vs. extern(C)).

Hm... you could use pragma(mangle) to get the signature the same. I was 
just thinking since it's going to be a D wrapper, it could be extern(D).

But you are right, &fgetc would result in a different type, so we should 
use pragma(mangle) instead.

-Steve

Jul 27 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/27/2017 07:27 AM, Timon Gehr wrote:
 On 27.07.2017 02:11, Steven Schveighoffer wrote:
 On 7/26/17 7:56 PM, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example). Either they check for null, and handle gracefully, or 
 don't check and segfault.

 No need to worry about that at all. If worse comes to worst - i.e. we 
 do port to such an implementation - we can always provide a thin 
 wrapper that checks for NULL then calls the native function. No need 
 to change the signatures. -- Andrei

 Hm.. so you mean:

 pragma(mangle, "fgetc")
 private extern(C) int real_fgetc(FILE * stream)

 extern(D) int fgetc(FILE *stream)  trusted
 {
    if(stream == null) assert(0);
    return real_fgetc(stream);
 }

 Yeah, that should work well actually. Nice!

 -Steve

 
 That works but it changes the signature. (extern(D) vs. extern(C)).

There are a number of techniques allowing you to daisy chain C functions 
in libraries without changing names by using e.g. linking order or 
dynamic symbol loading. Sounds exactly like the kind of problem to 
tackle when you see it. We have much more pressing things to be on. -- 
Andrei

Jul 27 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven 
 Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.

 In C land that behaviour is a platform (hardware/OS/libc) 
 specific implementation detail (it's what you generally 
 expect to happen, but AFAIK it isn't defined in official 
 ISO/IEC C).

 In cases where C does not crash when dereferencing null, then 
 D would not crash when dereferencing null. D depends on the 
 hardware doing this (Walter has said so many times), so if C 
 doesn't do it, then D won't. So those systems would have to 
 be treated specially, and you'd have to work out your own 
 home-grown mechanism for memory safety.

 
 What Moritz is saying is that the following implementation of 
 fclose is correct according to the C standard:
 
 int fclose(FILE *stream){
      if(stream == NULL){
          go_wild_and_corrupt_all_the_memory();
      }else{
          actually_close_the_file(stream);
      }
 }

 I think we can correctly assume no fclose implementations exist 
 that do anything but access data pointed at by stream. Which 
 means a segfault on every platform we support.

 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions that 
 are passed pointers that dereference those pointers, passing 
 null is safely going to segfault.

 Likewise, because D depends on hardware flagging of 
 dereferencing null as a segfault, any platforms that *don't* 
 have that for C also won't have it for D. And then  safe 
 doesn't even work in D code either.

 As we have good support for different prototypes for different 
 platforms, we could potentially unmark those as  trusted in 
 those cases.

--- null.d ---
version (linux):

import core.stdc.stdio : FILE;
import core.sys.linux.sys.mman;

extern (C) @safe int fgetc(FILE* stream);

void mmapNull()
{
	void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE,
		MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
	assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root");
	*(cast (char*) null) = 'D';
}

void nullDeref() @safe
{
	fgetc(null);
}

void main(string[] args)
{
	mmapNull();
	nullDeref();
}
---

For some fun on Linux, try out

$ rdmd null.d

Consider `mmapNull` being run in some third party shared lib you 
don't control.

Jul 27 2017

ag0aep6g <anonymous example.com> writes:

On 07/27/2017 03:24 PM, Moritz Maxeiner wrote:
 --- null.d ---
 version (linux):
 
 import core.stdc.stdio : FILE;
 import core.sys.linux.sys.mman;
 
 extern (C)  safe int fgetc(FILE* stream);
 
 void mmapNull()
 {
      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, 
 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
      assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` 
 as root");
      *(cast (char*) null) = 'D';
 }
 
 void nullDeref()  safe
 {
      fgetc(null);
 }
 
 void main(string[] args)
 {
      mmapNull();
      nullDeref();
 }
 ---
 
 For some fun on Linux, try out

 $ rdmd null.d

The gist of this is that Linux can be configured so that null can be a 
valid pointer. Right?

That seems pretty bad for @safe at large, not only when C functions are 
involved.

Jul 27 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Thursday, 27 July 2017 at 13:45:21 UTC, ag0aep6g wrote:
 On 07/27/2017 03:24 PM, Moritz Maxeiner wrote:
 --- null.d ---
 version (linux):
 
 import core.stdc.stdio : FILE;
 import core.sys.linux.sys.mman;
 
 extern (C)  safe int fgetc(FILE* stream);
 
 void mmapNull()
 {
      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, 
 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
      assert (mmapNull == null, "Do `echo 0 > 
 /proc/sys/vm/mmap_min_addr` as root");
      *(cast (char*) null) = 'D';
 }
 
 void nullDeref()  safe
 {
      fgetc(null);
 }
 
 void main(string[] args)
 {
      mmapNull();
      nullDeref();
 }
 ---
 
 For some fun on Linux, try out

 $ rdmd null.d

 The gist of this is that Linux can be configured so that null 
 can be a valid pointer. Right?

In summary, yes. To be technical about it:
- Linux can be configured so that the bottom page of a process' 
virtual address space is not protected from being mapped to valid 
memory (by default, `mmap_min_addr` is 4096, i.e. the bottom page 
can't be mapped)
- C's `NULL` is in pretty much all implementations (not the C 
spec) defined as the value `0`, which corresponds to the virtual 
address `0` in a process, i.e. lies in the bottom page of the 
process' virtual address space
- The null dereference segmentation fault on Linux stems from the 
fact that the bottom page (which `NULL` maps to) isn't mapped to 
valid memory
- If you map the bottom page of a process' virtual address space 
to valid memory, then accessing it doesn't create a segmentation 
fault

 That seems pretty bad for  safe at large, not only when C 
 functions are involved.

Yes:
- In C land, since dereferencing `NULL` is UB by definition, this 
is perfectly valid behaviour
- In D land, because we require `null` dereferences to crash, we 
break @safe with it

Jul 27 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/27/17 9:24 AM, Moritz Maxeiner wrote:
 On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer wrote:
 I think we can correctly assume no fclose implementations exist that 
 do anything but access data pointed at by stream. Which means a 
 segfault on every platform we support.

 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is 
 safely going to segfault.

 Likewise, because D depends on hardware flagging of dereferencing null 
 as a segfault, any platforms that *don't* have that for C also won't 
 have it for D. And then  safe doesn't even work in D code either.

 As we have good support for different prototypes for different 
 platforms, we could potentially unmark those as  trusted in those cases.

 
 --- null.d ---
 version (linux):
 
 import core.stdc.stdio : FILE;
 import core.sys.linux.sys.mman;
 
 extern (C)  safe int fgetc(FILE* stream);
 
 void mmapNull()
 {
      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, 
 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
      assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` 
 as root");
      *(cast (char*) null) = 'D';
 }
 
 void nullDeref()  safe
 {
      fgetc(null);
 }
 
 void main(string[] args)
 {
      mmapNull();
      nullDeref();
 }
 ---
 
 For some fun on Linux, try out

 $ rdmd null.d
 
 Consider `mmapNull` being run in some third party shared lib you don't 
 control.

Again, all these hacks are just messing with the assumptions D is 
making. You don't need C functions to trigger such problems. I'm fine 
with saying libraries or platforms that do not segfault when accessing 
the zero page are incompatible with @safe code. And it's on you not to do 
this; the compiler will assume the segfault will occur.

-Steve

Jul 27 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Thursday, 27 July 2017 at 13:56:00 UTC, Steven Schveighoffer 
wrote:
 On 7/27/17 9:24 AM, Moritz Maxeiner wrote:
 On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven 
 Schveighoffer wrote:
 I think we can correctly assume no fclose implementations 
 exist that do anything but access data pointed at by stream. 
 Which means a segfault on every platform we support.

 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions 
 that are passed pointers that dereference those pointers, 
 passing null is safely going to segfault.

 Likewise, because D depends on hardware flagging of 
 dereferencing null as a segfault, any platforms that *don't* 
 have that for C also won't have it for D. And then  safe 
 doesn't even work in D code either.

 As we have good support for different prototypes for 
 different platforms, we could potentially unmark those as 
  trusted in those cases.

 
 --- null.d ---
 version (linux):
 
 import core.stdc.stdio : FILE;
 import core.sys.linux.sys.mman;
 
 extern (C)  safe int fgetc(FILE* stream);
 
 void mmapNull()
 {
      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, 
 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
      assert (mmapNull == null, "Do `echo 0 > 
 /proc/sys/vm/mmap_min_addr` as root");
      *(cast (char*) null) = 'D';
 }
 
 void nullDeref()  safe
 {
      fgetc(null);
 }
 
 void main(string[] args)
 {
      mmapNull();
      nullDeref();
 }
 ---
 
 For some fun on Linux, try out

 $ rdmd null.d
 
 Consider `mmapNull` being run in some third party shared lib 
 you don't control.

 Again, all these hacks are just messing with the assumptions D 
 is making.

Which aren't in the official D spec (or at the very least I can't 
seem to find them there).

 You don't need C functions to trigger such problems.

Sure, but it was relevant to the previous discussion.

 I'm fine with saying libraries or platforms that do not 
 segfault when accessing zero page are incompatible with  safe 
 code.

So we can't have @safe in shared libraries on Linux? Because 
there's no way for the shared lib author to know what programs 
using it are going to do.

 And it's on you not to do this, the compiler will assume the 
 segfault will occur.

It's not a promise the author of the D code can (always) make.
In any case, the @trusted and @safe spec need to be explicit 
about the assumptions made.

Jul 27 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/27/17 10:20 AM, Moritz Maxeiner wrote:
 On Thursday, 27 July 2017 at 13:56:00 UTC, Steven Schveighoffer wrote:
 On 7/27/17 9:24 AM, Moritz Maxeiner wrote:
 On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer wrote:
 I think we can correctly assume no fclose implementations exist that 
 do anything but access data pointed at by stream. Which means a 
 segfault on every platform we support.

 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is 
 safely going to segfault.

 Likewise, because D depends on hardware flagging of dereferencing 
 null as a segfault, any platforms that *don't* have that for C also 
 won't have it for D. And then  safe doesn't even work in D code either.

 As we have good support for different prototypes for different 
 platforms, we could potentially unmark those as  trusted in those 
 cases.

 --- null.d ---
 version (linux):

 import core.stdc.stdio : FILE;
 import core.sys.linux.sys.mman;

 extern (C)  safe int fgetc(FILE* stream);

 void mmapNull()
 {
      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, 
 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
      assert (mmapNull == null, "Do `echo 0 > 
 /proc/sys/vm/mmap_min_addr` as root");
      *(cast (char*) null) = 'D';
 }

 void nullDeref()  safe
 {
      fgetc(null);
 }

 void main(string[] args)
 {
      mmapNull();
      nullDeref();
 }
 ---

 For some fun on Linux, try out

 $ rdmd null.d

 Consider `mmapNull` being run in some third party shared lib you 
 don't control.

 Again, all these hacks are just messing with the assumptions D is making.

 
 Which aren't in the official D spec (or at the very least I can't seem 
 to find them there).

You are right. I have asked Walter to add such an update. I should pull 
that out to its own thread, will do.

 You don't need C functions to trigger such problems.

 
 Sure, but it was relevant to the previous discussion.

Right, but what I'm saying is that it's a different argument. We could 
say "you can't mark fgetc @safe", and still have this situation occur.

 I'm fine with saying libraries or platforms that do not segfault when 
 accessing zero page are incompatible with  safe code.

 
 So we can't have  safe in shared libraries on Linux? Because there's no 
 way for the shared lib author to know what programs using it are going 
 to do.

You can't guarantee @safe on such processes or systems. The compiler has 
to assume that code like the example you provided doesn't run.

It's not that we can't have @safe because of what someone might do, it's 
that @safe guarantees can only work if you don't do such things.

It is nice to be aware of these possibilities, since they could be an 
effective attack on D @safe code.

 And it's on you not to do this, the compiler will assume the segfault 
 will occur.

 
 It's not a promise the author of the D code can (always) make.
 In any case, the  trusted and  safe spec need to be explicit about the 
 assumptions made.

I agree. The promise only works as well as the environment. @safe is not 
actually safe if it's based on incorrect assumptions.

-Steve

Jul 27 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Thursday, 27 July 2017 at 14:45:03 UTC, Steven Schveighoffer 
wrote:
 On 7/27/17 10:20 AM, Moritz Maxeiner wrote:
 On Thursday, 27 July 2017 at 13:56:00 UTC, Steven 
 Schveighoffer wrote:

 I'm fine with saying libraries or platforms that do not 
 segfault when accessing zero page are incompatible with  safe 
 code.

 
 So we can't have  safe in shared libraries on Linux? Because 
 there's no way for the shared lib author to know what programs 
 using it are going to do.

 You can't guarantee  safe on such processes or systems. It has 
 to be assumed by the compiler that your provided code doesn't 
 happen.

 It's not that we can't have  safe because of what someone might 
 do, it's that  safe guarantees can only work if you don't do 
 such things.

Which essentially means that any library written in @safe D 
exposing a C API needs to write in big fat red letters "Don't do 
this or you break our safety guarantees".


 It is nice to be aware of these possibilities, since they could 
 be an effective attack on D  safe code.

Well, yeah, that's the consequence of @safe correctness depending 
on UB always resulting in a crash.

Jul 27 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 26 July 2017 at 00:35:13 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer 
 wrote:
 The behavior is defined. It will crash with a segfault.

 
 In C land that behaviour is a platform (hardware/OS/libc) 
 specific implementation detail (it's what you generally expect 
 to happen, but AFAIK it isn't defined in official ISO/IEC C).

 In cases where C does not crash when dereferencing null, then D 
 would not crash when dereferencing null. [...]

OK, my (wrong) assumption was that a D compiler would on those 
platforms be required to inject null checks plus a crash, in order 
to satisfy the property D programs rely on, namely that null 
dereferences crash.
Since that seems not to be the case: is this documented in the D 
spec somewhere (I couldn't find it)? If not, imho it should be.

Jul 26 2017

Kagamin <spam here.lot> writes:

On Tuesday, 25 July 2017 at 18:36:35 UTC, Moritz Maxeiner wrote:
 fgetc cannot be  trusted the same way fclose cannot be  trusted.
 If you pass either of them `null` - which constitutes a legal 
  safe context - the behaviour is undefined, which contradicts 
  trusted definition:
 <Trusted functions are guaranteed by the programmer to not 
 exhibit any undefined behavior if called by a safe function.>

There's a less questionable problem with it.

Jul 26 2017

Kagamin <spam here.lot> writes:

 There's a less questionable problem with it.

Hint: FILE struct is transparent, look inside it, lots of 
interesting stuff there.

Jul 29 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/25/2017 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047

That might be a mistake. Is fclose(f); getc(f); defined? -- Andrei

Jul 25 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/25/17 2:14 PM, Andrei Alexandrescu wrote:
 On 07/25/2017 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047

 
 That might be a mistake. Is fclose(f); getc(f); defined? -- Andrei

fclose is not @safe.

The charter of @safe (or @trusted in this case) is to assume valid 
pointers for parameters.

-Steve

Jul 25 2017

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 07/25/2017 02:20 PM, Steven Schveighoffer wrote:
 On 7/25/17 2:14 PM, Andrei Alexandrescu wrote:
 On 07/25/2017 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047 

 That might be a mistake. Is fclose(f); getc(f); defined? -- Andrei

 
 fclose is not  safe.

Ah, nice. Thanks! -- Andrei

Jul 25 2017

Kagamin <spam here.lot> writes:

On Tuesday, 25 July 2017 at 15:12:30 UTC, Steven Schveighoffer 
wrote:
 I think signalfd can be marked  trusted, as  safe code supports 
 pointing at a single element.

What about functions that take zero terminated strings? Are they 
ok to read past the end of allocated object?

Jul 25 2017

Steven Schveighoffer <schveiguy yahoo.com> writes:

On 7/25/17 12:46 PM, Kagamin wrote:
 On Tuesday, 25 July 2017 at 15:12:30 UTC, Steven Schveighoffer wrote:
 I think signalfd can be marked  trusted, as  safe code supports 
 pointing at a single element.

 
 What about functions that take zero terminated strings? Are they ok to 
 read past the end of allocated object?

No, a null terminated string is as arbitrary as passing in a length.

Unfortunately, it's perfectly safe to call with a string literal. But 
there is no way to detect that during compile time. So it has to be unsafe.

The wrapper would be to use toStringz to make the call.
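
Seen from the C side, a wrapper of that kind boils down to producing a buffer that is known to end in a terminator before the C function ever sees it. A rough sketch, assuming a hypothetical helper `copy_with_terminator` (this is only loosely analogous to toStringz and is not how Phobos implements it):

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies len bytes and appends '\0', so the result
   can be handed to a function that scans for a terminator.
   Returns NULL on failure; the caller frees the result. */
char *copy_with_terminator(const char *data, size_t len)
{
    if (data == NULL && len != 0)
        return NULL;            /* no bytes to copy from */
    if (len == SIZE_MAX)
        return NULL;            /* len + 1 would overflow */

    char *out = malloc(len + 1);
    if (out == NULL)
        return NULL;

    if (len != 0)
        memcpy(out, data, len);
    out[len] = '\0';            /* the terminator the callee relies on */
    return out;
}
```

The length travels with the data up to the boundary, and only the wrapper converts it into the terminator convention.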

-Steve

Jul 25 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all.

Since you explicitly state *all* OS functions:
nothrow: Should be OK (only callbacks could violate this and they 
should be nothrow, anyway).
@trusted: This can only be done for those functions that don't 
take arguments open to memory corruption. Take a look at POSIX 
read, it can never be trusted (same as any C function taking a 
pointer plus the length of the pointed-to memory).
@nogc: This can only be done for those functions that are 
statically known to never call a D callback that's not also 
@nogc. Take a look at pthread_create or pthread_join, they can 
never be @nogc, because that would mean threads may never 
allocate with the GC.
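
To illustrate the pointer+length point, here is a sketch of a slice-based wrapper around POSIX `read`. The name `readInto` is hypothetical; the idea is that the slice carries its own length, so pointer and length cannot disagree:

```d
import core.sys.posix.sys.types : ssize_t;
import core.sys.posix.unistd : read;

/// Hypothetical wrapper: the raw declaration of read stays @system,
/// while this overload can be called from @safe code.
ssize_t readInto(int fd, ubyte[] buf) @trusted
{
    // buf.ptr and buf.length always describe the same memory block.
    return read(fd, buf.ptr, buf.length);
}
```

An invalid file descriptor only makes `read` return -1, so it does not affect memory safety.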

 I keep copying OS function declarations into my code, just so I 
 can add those attributes to them. Otherwise I simply cannot 
 call "signalfd" and "sigemptyset" (to name a couple from my 
 most recent history) from @safe code.

---
auto result = () @trusted { return systemFunction(...); }();
---
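
A concrete instance of the same pattern, sketched with `sigemptyset` (one of the functions named in the original post). The function name `emptyMaskOk` is hypothetical:

```d
import core.sys.posix.signal : sigemptyset, sigset_t;

/// Hypothetical @safe function that calls a @system OS function
/// through an immediately invoked @trusted lambda.
bool emptyMaskOk() @safe
{
    sigset_t set;
    // Taking the address of a local is not allowed in @safe code,
    // so the call is confined to the @trusted lambda.
    immutable rc = () @trusted { return sigemptyset(&set); }();
    return rc == 0;
}
```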

Jul 25 2017

Shachar Shemesh <shachar weka.io> writes:

On 25/07/17 17:24, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all.

 
 Since you explicitly state *all* OS functions:
 nothrow: Should be OK (only callbacks could violate this and they should 
 be nothrow, anyway).

Technically, any system call that is a pthreads cancellation point may 
throw a C++ exception.

If we go down that route, however, calling system calls from nothrow 
becomes completely impossible, which is another way of saying that 
decorating just about anything with nothrow becomes impossible.

 @trusted: This can only be done for those functions that don't take 
 arguments open to memory corruption. Take a look at POSIX read, it can 
 never be trusted (same as any C function taking pointer+length of 
 pointed to).
 @nogc: This can only be done for those functions that are statically 
 known to never call a D callback that's not also @nogc. Take a look at 
 pthread_create or pthread_join, they can never be @nogc, because that 
 would mean threads may never allocate with the GC.

The decoration's situation with callbacks is pretty horrible throughout 
D. I'm not sure this is the most compelling argument, however. The 
function passed to pthread_create does not, logically, run in the 
pthread_create function. As such, I don't think this logic holds.

As for pthread_join, I have no idea what you meant by it. Please 
elaborate why you think it is a problem.

 
 ---
 auto result = () @trusted { return systemFunction(...); }();
 ---

Care to explain how to adapt that neat trick for "nothrow" and "@nogc"?

Shachar

Jul 25 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Tuesday, 25 July 2017 at 14:39:15 UTC, Shachar Shemesh wrote:
 On 25/07/17 17:24, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh 
 wrote:
 The title really does says it all.

 
 Since you explicitly state *all* OS functions:
 nothrow: Should be OK (only callbacks could violate this and 
 they should be nothrow, anyway).

 Technically, any system call that is a pthreads cancellation 
 point may throw a C++ exception.

Good to know. Since D is supposed to be able to catch C++ 
exceptions (and can on 64-bit Linux [1]), none of those may be 
attributed as `nothrow`, because C++ exceptions don't derive from 
`Error`.

 If we go down that route, however, calling system calls from 
 nothrow becomes completely impossible, which is another way of 
 saying that decorating just about anything with nothrow becomes 
 impossible.

No. `nothrow` functions can call throwing ones, as long as they 
catch any exceptions not derived from Error thrown by them.
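
A small sketch of that rule, using `std.conv.to` as the throwing callee. The function name `parseOrDefault` is hypothetical:

```d
import std.conv : to;

/// Hypothetical example: a nothrow function that calls a throwing
/// one and catches everything derived from Exception.
int parseOrDefault(string text, int fallback) nothrow
{
    try
    {
        return text.to!int; // may throw ConvException
    }
    catch (Exception e)
    {
        // Only Exception needs to be caught; Error may still propagate.
        return fallback;
    }
}
```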

 @trusted: This can only be done for those functions that don't 
 take arguments open to memory corruption. Take a look at POSIX 
 read, it can never be trusted (same as any C function taking 
 pointer+length of pointed to).
 @nogc: This can only be done for those functions that are 
 statically known to never call a D callback that's not also 
 @nogc. Take a look at pthread_create or pthread_join, they 
 can never be @nogc, because that would mean threads may never 
 allocate with the GC.

 The decoration's situation with callbacks is pretty horrible 
 throughout D.

Do you mean throughout druntime and phobos?

 I'm not sure this is the most compelling argument, however. The 
 function passed to pthread_create does not, logically, run in 
 the pthread_create function. As such, I don't think this logic 
 holds.

Then the @nogc definition would need to be updated from: "or 
indirectly through functions it may call" to reflect this, 
because that can be interpreted both ways.

 As for pthread_join, I have no idea what you meant by it. 
 Please elaborate why you think it is a problem.

Possible scenario on single core (no hyperthreading) system:
- thread 1 spawns thread 2
- thread 1 enters @nogc function `foo` and calls `pthread_join` 
on thread 2 before its own timeslice is over (and thus enters 
blocked state)
- thread 2 does work allocating via the GC, then terminates
- thread 1 wakes up and leaves @nogc function `foo`

Because @nogc (in contrast to nothrow) is explicitly designed as 
transitive, logically speaking, `foo` violated its @nogc 
constraint (it *caused* the GC allocations in thread 2).

 
 ---
 auto result = () @trusted { return systemFunction(...); }();
 ---

 Care to explain how to adapt that neat trick for "nothrow" and 
 "@nogc"?

nothrow: assumeWontThrow [2]
@nogc: assumeNoGC [3]

[1] http://forum.dlang.org/thread/n7jorc$1ied$1@digitalmars.com
[2] https://dlang.org/library/std/exception/assume_wont_throw.html
[3] https://p0nce.github.io/d-idioms/#Bypassing-@nogc
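
A sketch of how the two helpers might be used. `assumeWontThrow` is the Phobos function from `std.exception`; `assumeNoGC` is not in Phobos and is written out here following the d-idioms cast pattern. The functions `mayThrow`, `doubled`, `usesGC` and `lengthNoGC` are hypothetical:

```d
import std.exception : assumeWontThrow;
import std.traits : FunctionAttribute, SetFunctionAttributes,
    functionAttributes, functionLinkage, isDelegate, isFunctionPointer;

/// Casts a function pointer or delegate to the same type plus @nogc.
/// This silences the compiler check; it does not stop allocations.
auto assumeNoGC(T)(T t)
if (isFunctionPointer!T || isDelegate!T)
{
    enum attrs = functionAttributes!T | FunctionAttribute.nogc;
    return cast(SetFunctionAttributes!(T, functionLinkage!T, attrs)) t;
}

int mayThrow(int x)
{
    if (x < 0)
        throw new Exception("negative input");
    return x * 2;
}

/// nothrow caller: an Exception escaping mayThrow would be turned
/// into an AssertError by assumeWontThrow.
int doubled(int x) nothrow
{
    return assumeWontThrow(mayThrow(x));
}

int usesGC()
{
    auto a = new int[](3); // GC allocation
    return cast(int) a.length;
}

/// @nogc caller that bypasses the check through the cast above.
int lengthNoGC() @nogc
{
    return assumeNoGC(&usesGC)();
}
```

Both helpers are promises made by the programmer, much like `@trusted`; the compiler no longer verifies the attribute at that call site.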

Jul 25 2017

Shachar Shemesh <shachar weka.io> writes:

On 25/07/17 18:29, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 14:39:15 UTC, Shachar Shemesh wrote:
 On 25/07/17 17:24, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all.

 Since you explicitly state *all* OS functions:
 nothrow: Should be OK (only callbacks could violate this and they 
 should be nothrow, anyway).

 Technically, any system call that is a pthreads cancellation point may 
 throw a C++ exception.

 
 Good to know, then since D is supposed to be able to catch C++ 
 exceptions (and can on 64bit Linux [1]) none of those may be attributed 
 as `nothrow`, because C++ exceptions don't derive from `Error`.

 If we go down that route, however, calling system calls from nothrow 
 becomes completely impossible, which is another way of saying that 
 decorating just about anything with nothrow becomes impossible.

 
 No. `nothrow` functions can call throwing ones, as long as they catch 
 any exceptions not derived from Error thrown by them.
 

And right there and then you've introduced a serious problem. The 
purpose of the C++ exception thrown on cancellation point is to 
terminate the thread. It is designed to be uncatchable.

Had that been D, it might derive from Error, or even directly from 
Throwable. This is C++, however. It is some weirdly named class.

I think labeling these "nothrow" is the correct course of action.

 The decoration's situation with callbacks is pretty horrible 
 throughout D.

 
 Do you mean throughout druntime and phobos?

I'm rechecking what I mean. I may have misspoken.

 
 I'm not sure this is the most compelling argument, however. The 
 function passed to pthread_create does not, logically, run in the 
 pthread_create function. As such, I don't think this logic holds.

 
 Then the @nogc definition would need to be updated from: "or indirectly 
 through functions it may call" to reflect this, because that can be 
 interpreted both ways.
 
 As for pthread_join, I have no idea what you meant by it. Please 
 elaborate why you think it is a problem.

 
 Possible scenario on single core (no hyperthreading) system:
 - thread 1 spawns thread 2
 - thread 1 enters @nogc function `foo` and calls `pthread_join` on 
 thread 2 before its own timeslice is over (and thus enters blocked state)
 - thread 2 does work allocating via the GC, then terminates
 - thread 1 wakes up and leaves @nogc function `foo`
 
 Because @nogc (in contrast to nothrow) is explicitly designed as 
 transitive, logically speaking, `foo` violated its @nogc constraint (it 
 *caused* the GC allocations in thread 2).

Following that logic, ANY function that might affect another thread 
cannot be @nogc. I think this way madness lies. I don't think other 
threads' actions, even if linked in some weird semantic way to ours, make 
us accountable for them.

If you pass a callback that is not @nogc to pthread_create, then your 
other thread might allocate. This doesn't change the fact that 
pthread_create doesn't allocate.

At Weka, we use this understanding of the semantics all the time. Our 
main thread is as @nogc as we can possibly make it. Whenever we need 
anything that violates our usual restrictions, we send it either to 
other threads or other processes for execution, and use the results when 
they return. Defining the various attributes too strictly will simply 
mean we cannot use them anywhere (which is pretty much what happens 
today, but the very thing I'm trying to change here).

Shachar

Jul 25 2017

Moritz Maxeiner <moritz ucworks.org> writes:

On Wednesday, 26 July 2017 at 06:44:51 UTC, Shachar Shemesh wrote:
 On 25/07/17 18:29, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 14:39:15 UTC, Shachar Shemesh 
 wrote:
 On 25/07/17 17:24, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh 
 wrote:
 The title really does says it all.

 Since you explicitly state *all* OS functions:
 nothrow: Should be OK (only callbacks could violate this and 
 they should be nothrow, anyway).

 Technically, any system call that is a pthreads cancellation 
 point may throw a C++ exception.

 
 Good to know, then since D is supposed to be able to catch C++ 
 exceptions (and can on 64bit Linux [1]) none of those may be 
 attributed as `nothrow`, because C++ exceptions don't derive 
 from `Error`.

 If we go down that route, however, calling system calls from 
 nothrow becomes completely impossible, which is another way 
 of saying that decorating just about anything with nothrow 
 becomes impossible.

 
 No. `nothrow` functions can call throwing ones, as long as 
 they catch any exceptions not derived from Error thrown by 
 them.
 

 And right there and then you've introduced a serious problem.
 The purpose of the C++ exception thrown on cancellation point 
 is to terminate the thread. It is designed to be uncatchable.

The issue lies with the definition of `nothrow` considering only 
D Throwables;
it would have to be updated to apply to C++ exceptions that 
equate to D exceptions derived from Error.

 I think labeling these "nothrow" is the correct course of 
 action.

Not with the `nothrow` spec as it is right now. Once the spec 
has been updated to cover C++ exceptions that may not be 
caught, sure.

 I'm not sure this is the most compelling argument, however. 
 The function passed to pthread_create does not, logically, 
 run in the pthread_create function. As such, I don't think 
 this logic holds.

 
 Then the @nogc definition would need to be updated from: "or 
 indirectly through functions it may call" to reflect this, 
 because that can be interpreted both ways.
 
 As for pthread_join, I have no idea what you meant by it. 
 Please elaborate why you think it is a problem.

 
 Possible scenario on single core (no hyperthreading) system:
 - thread 1 spawns thread 2
 - thread 1 enters @nogc function `foo` and calls 
 `pthread_join` on thread 2 before its own timeslice is over 
 (and thus enters blocked state)
 - thread 2 does work allocating via the GC, then terminates
 - thread 1 wakes up and leaves @nogc function `foo`
 
 Because @nogc (in contrast to nothrow) is explicitly designed 
 as transitive, logically speaking, `foo` violated its @nogc 
 constraint (it *caused* the GC allocations in thread 2).

 Following that logic, ANY function that might affect another 
 thread cannot be @nogc.

Not any function; as I interpret the spec, only those that manually 
interleave another thread allocating via the GC such that it 
looks to a caller as if they had allocated using the GC.

 I think this way madness lies. I don't think other threads 
 action, even if linked in some weird semantic way to ours, make 
 us accountable to their actions.

I would say it depends on the exact semantics of each use case 
whether we are accountable.

 If you pass a callback that is not @nogc to pthread_create, 
 then your other thread might allocate. This doesn't change the 
 fact that pthread_create doesn't allocate.

The "indirectly through functions it may call" of the @nogc spec 
is ambiguous here because it doesn't actually require a direct 
call chain to the allocation. It would need to be updated.

 At Weka, we use this understanding of the semantics all the 
 time. Our main thread is as @nogc as we can possibly make it. 
 Whenever we need anything that violates our usual restrictions, 
 we send it either to other threads or other processes for 
 execution, and use the results when they return. Defining the 
 various attributes too strictly will simply mean we cannot use 
 them anywhere (which is pretty much what happens today, but the 
 very thing I'm trying to change here).

There is a difference between what's sensible and what the 
current wording of the spec allows for. Before it's OK to 
attribute functions where the ambiguity applies, the spec wording 
(for both @nogc and nothrow) has to be made unambiguous.

P.S.: In case it's not clear: I'm playing devil's advocate in 
this subthread.

Jul 26 2017
