digitalmars.D - all OS functions should be "nothrow @trusted @nogc"

reply Shachar Shemesh <shachar weka.io> writes:
The title really does say it all. I keep copying OS function 
declarations into my code, just so I can add those attributes to them. 
Otherwise I simply cannot call "signalfd" and "sigemptyset" (to name a 
couple from my most recent history) from @safe code.

I can try and set up a PR when I have the time. If anyone else wants to 
take an easy one before then, you're welcome to :-)

Shachar
Jul 25
next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 07/25/2017 03:50 PM, Shachar Shemesh wrote:
 The title really does say it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes to them. 
 Otherwise I simply cannot call "signalfd" and "sigemptyset" (to name a 
 couple from my most recent history) from @safe code.
Not all OS functions can be `@trusted`. I don't know about `signalfd` and `sigemptyset`, but `read` [1] can't be `@trusted`, for example. It takes pointer and length separately, and the pointer is a `void*`. That's not safe at all.

[1] http://man7.org/linux/man-pages/man2/read.2.html
Jul 25
parent reply Shachar Shemesh <shachar weka.io> writes:
On 25/07/17 17:11, ag0aep6g wrote:
 On 07/25/2017 03:50 PM, Shachar Shemesh wrote:
 The title really does say it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes to them. 
 Otherwise I simply cannot call "signalfd" and "sigemptyset" (to name a 
 couple from my most recent history) from @safe code.
Not all OS functions can be `@trusted`. I don't know about `signalfd` and `sigemptyset`, but `read` [1] can't be `@trusted`, for example. It takes pointer and length separately, and the pointer is a `void*`. That's not safe at all.
And, indeed, the code calling "read" shouldn't be able to do that as @safe. Read itself, however, is trusted (because, let's face it, if you cannot trust the kernel, you're screwed anyways).

Having said that, I have no objection to excluding the "pointer+length" system calls from the above rule. They are, by far, the minority of system calls.

Shachar
Jul 25
next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 07/25/2017 04:32 PM, Shachar Shemesh wrote:
 And, indeed, the code calling "read" shouldn't be able to do that as 
 @safe. Read itself, however, is trusted (because, let's face it, if you 
 cannot trust the kernel, you're screwed anyways).
That's not how `@trusted` works. The point of `@trusted` is to allow unsafe features in the implementation. The interface must be just as safe as with `@safe`. `read` doesn't have a safe interface. `read` is safe as long as you pass good arguments. When you pass bad arguments, `read` will break your stuff. A `@trusted` function must always be safe, no matter the arguments.
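To make the distinction concrete, here is a minimal sketch. The `zeroFill` helper is hypothetical (it is not something in druntime) and is only meant to show the shape: the unsafe pointer call lives in the implementation, while the interface takes a slice, so no argument a `@safe` caller can construct makes it write out of bounds.

```d
import core.stdc.string : memset;

// Hypothetical helper, for illustration only.
// The body uses an unsafe C call, but pointer and length travel together
// in the slice, so the interface itself cannot be misused from @safe code.
void zeroFill(ubyte[] buf) @trusted nothrow @nogc
{
    if (buf.length)
        memset(buf.ptr, 0, buf.length);
}

void caller() @safe nothrow
{
    auto storage = new ubyte[](4);
    storage[] = 0xFF;
    zeroFill(storage);          // allowed from @safe code
    foreach (b; storage)
        assert(b == 0);
}

void main()
{
    caller();
}
```

A declaration of `memset` itself could not carry the same attribute, since a caller could pass any pointer together with any length.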
Jul 25
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/25/2017 10:43 AM, ag0aep6g wrote:
 On 07/25/2017 04:32 PM, Shachar Shemesh wrote:
 And, indeed, the code calling "read" shouldn't be able to do that as 
 @safe. Read itself, however, is trusted (because, let's face it, if 
 you cannot trust the kernel, you're screwed anyways).
That's not how `@trusted` works. The point of `@trusted` is to allow unsafe features in the implementation. The interface must be just as safe as with `@safe`. `read` doesn't have a safe interface. `read` is safe as long as you pass good arguments. When you pass bad arguments, `read` will break your stuff. A `@trusted` function must always be safe, no matter the arguments.
About http://man7.org/linux/man-pages/man2/read.2.html, there's just a bit of wrapping necessary:

nothrow @trusted @nogc
ssize_t read(int fd, ubyte[] buf)
{
    return read(fd, buf.ptr, buf.length);
}

(btw void[] doesn't work)

The point being that a safe D program needs to guarantee memory will not be corrupted, and there's no way for the type system to ensure that in the Posix read() call the buffer and the length are coordinated. Certain systems (such as the static checker used at Microsoft - it's fairly well known, there are a couple of papers on it, forgot the name) require annotations in the function signature to indicate the coordination, e.g.:

ssize_t read(int fd, void *buf, islengthof(buf) size_t count);

Then the type checker can verify upon each call that indeed count is the right size of buf.

A suite of safe wrappers on OS primitives might be useful.

Andrei
Jul 25
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/25/17 11:26 AM, Andrei Alexandrescu wrote:
 
 About http://man7.org/linux/man-pages/man2/read.2.html, there's just a 
 bit of wrapping necessary:
 
 nothrow @trusted @nogc
 ssize_t read(int fd, ubyte[] buf)
 {
      return read(fd, buf.ptr, buf.length);
 }
 
 (btw void[] doesn't work)
 
[snip]
 A suite of safe wrappers on OS primitives might be useful.
Great idea! Should it be a package on its own, or should we put the wrappers inside the original files? That is, do we make core.sys.safe.posix.unistd: read or do we make core.sys.posix.unistd: safe_read?

My preference is for the former, since it's very nice to have a pristine copy of the header file.

-Steve
Jul 25
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/25/2017 11:50 AM, Steven Schveighoffer wrote:
 On 7/25/17 11:26 AM, Andrei Alexandrescu wrote:
 About http://man7.org/linux/man-pages/man2/read.2.html, there's just a 
 bit of wrapping necessary:

 nothrow @trusted @nogc
 ssize_t read(int fd, ubyte[] buf)
 {
      return read(fd, buf.ptr, buf.length);
 }

 (btw void[] doesn't work)
[snip]
 A suite of safe wrappers on OS primitives might be useful.
Great idea! Should it be a package on its own, or should we put the wrappers inside the original files? That is, do we make core.sys.safe.posix.unistd: read or do we make core.sys.posix.unistd: safe_read ? My preference is for the former, since it's very nice to have a pristine copy of the header file.
Same here. I'd preserve the function name though. -- Andrei
Jul 25
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2017 8:26 AM, Andrei Alexandrescu wrote:
 A suite of safe wrappers on OS primitives might be useful.
The idea of fixing the operating system interface(s) has come up now and then. I've tried to discourage that on the following grounds:

* We are not in the operating system business.
* Operating system APIs grow like weeds. We'd set ourselves an impossible task.
* It's a huge job simply to provide accurate declarations for the APIs.
* We'd have to write our own documentation for the operating system APIs. It's hard enough writing such for Phobos.
* A lot are fundamentally unfixable, like free() and strlen().
* The API import files should be focused solely on direct access to the APIs, not adding a translation layer. The user of them will expect this.
* We already have safe wrappers for the commonly used APIs. For read(), there is std.stdio.

It is worthwhile, however, to augment the APIs with the appropriate attributes like @nogc, scope, nothrow, @safe (for the ones that are), etc.
Jul 25
next sibling parent reply Kagamin <spam here.lot> writes:
On Wednesday, 26 July 2017 at 02:54:34 UTC, Walter Bright wrote:
 * Operating system APIs grow like weeds. We'd set ourselves an 
 impossible task.

 It is worthwhile, however, to augment the APIs with the 
 appropriate attributes like @nogc, scope, nothrow, @safe (for 
 the ones that are), etc.
Given that C and OS APIs have no notion of memory safety, they neither support nor maintain it, so a function that was once safe can later be refactored to rely on proper usage of the API and become unsafe. If it was marked @safe, the qualifier must then be removed, which is a breaking change for D code but not for C code. Should we still try to mark them @safe at all?
Jul 26
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them @safe at all?
Marking ones that are safe with @safe is fine. OS APIs pretty much never change.
Jul 26
next sibling parent reply Kagamin <spam here.lot> writes:
On Wednesday, 26 July 2017 at 17:48:21 UTC, Walter Bright wrote:
 On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them @safe at all?
Marking ones that are safe with @safe is fine. OS APIs pretty much never change.
New technologies and new features get introduced over time: 64 bit, ipv6, bitmap_v5, generally bigger data everywhere. The API changes accordingly, incorporates new features, and takes increasingly bigger arguments over time.
Jul 28
parent Grander <grander grander.grander> writes:
On Friday, 28 July 2017 at 12:40:06 UTC, Kagamin wrote:
 On Wednesday, 26 July 2017 at 17:48:21 UTC, Walter Bright wrote:
 On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them safe at all?
Marking ones that are safe with safe is fine. OS APIs pretty much never change.
New technologies and new features get introduced over time: 64 bit, ipv6, bitmap_v5, generally bigger data everywhere, and api changes accordingly and incorporates new features, and takes increasingly bigger arguments over time.
most of them don't lead to "real" API changes, they often only add new functions/overloads/whatever
Jul 28
prev sibling parent reply Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Wednesday, 26 July 2017 at 17:48:21 UTC, Walter Bright wrote:
 On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them @safe at all?
Marking ones that are safe with @safe is fine. OS APIs pretty much never change.
Sometimes operating systems add new flags to their API which change how some values are interpreted. Some API functions may, for example, normally take a pointer to a such-and-such struct, but if a certain flag is specified, the parameter is instead interpreted as a pointer to a different data type. That would be one case where an API call becomes un-@safe due to the addition of a flag.
Jul 31
parent reply Shachar Shemesh <shachar weka.io> writes:
On 31/07/17 16:33, Vladimir Panteleev wrote:
 On Wednesday, 26 July 2017 at 17:48:21 UTC, Walter Bright wrote:
 On 7/26/2017 6:29 AM, Kagamin wrote:
 Should we still try to mark them safe at all?
Marking ones that are safe with safe is fine. OS APIs pretty much never change.
Sometimes operating systems add new flags to their API which change how some values are interpreted. Some API functions may, for example, normally take a pointer to a such-and-such struct, but if a certain flag is specified, the parameter is instead interpreted as a pointer to a different data type. That would be one case where an API call becomes un- safe due to the addition of a flag.
One of the things that really bothers me with the D community is the "100% or nothing" approach.

System programming is, by definition, an exercise in juggling conflicting aims. The more absolute the language, the less useful it is for performing real life tasks.

Shachar
Jul 31
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 31.07.2017 15:56, Shachar Shemesh wrote:
 
 One of the things that really bother me with the D community is the 
 "100% or nothing" approach.
 ...
Personally, I'm more bothered by this kind of lazy argument that sounds good but has no substance.
 System programming is, by definition, an exercise in juggling 
 conflicting aims. The more absolute the language, the less useful it is 
 for performing real life tasks.
Why do you think @trusted exists?
Jul 31
parent reply Shachar Shemesh <shachar weka.io> writes:
On 31/07/17 17:08, Timon Gehr wrote:
 On 31.07.2017 15:56, Shachar Shemesh wrote:
 One of the things that really bother me with the D community is the 
 "100% or nothing" approach.
 ...
Personally, I'm more bothered by this kind of lazy argument that sounds good but has no substance.
 System programming is, by definition, an exercise in juggling 
 conflicting aims. The more absolute the language, the less useful it 
 is for performing real life tasks.
Why do you think trusted exists?
That's fine, but since, according to the logic presented here, no OS function can ever be @safe, then all code calling such a function can't be @safe either. At this point, half your code, give or take, is @trusted. That's the point you give up, and just write everything as @system.

And what we have here is that you started out trying to be 100% pure (and, in this case, there is no problem with current code, only *hypothetical* future changes), and end up not getting any protection from @safe at all.

There is a proverb in Hebrew that says:
תפסת מרובה, לא תפסת.
Try to grab too much, and you end up holding nothing.

Shachar
Jul 31
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 31.07.2017 16:15, Shachar Shemesh wrote:
 Why do you think @trusted exists?
That's fine, but since, according to the logic presented here, no OS function can ever be @safe,
This is actually not true. Vladimir was just pointing out a complication of which to be aware. Are you arguing against applying due diligence when specifying library interfaces?
 
 There is a proverb in Hebrew that says:
 תפסת מרובה, לא תפסת.
 Try to grab too much, and you end up holding nothing. 
I.e. if you mark too many functions as @trusted, you will have no memory safety.
Jul 31
parent Vladimir Panteleev <thecybershadow.lists gmail.com> writes:
On Monday, 31 July 2017 at 14:51:22 UTC, Timon Gehr wrote:
 On 31.07.2017 16:15, Shachar Shemesh wrote:
 That's fine, but since, according to the logic presented here, 
 no OS function can ever be @safe,
This is actually not true. Vladimir was just pointing out a complication of which to be aware. Are you arguing against applying due diligence when specifying library interfaces?
Indeed. @safe is not a sandbox, there is no need to actually go to extreme measures to safeguard against potential changes beyond our control; just something to keep in mind.
Jul 31
prev sibling parent reply Kagamin <spam here.lot> writes:
On Monday, 31 July 2017 at 13:56:48 UTC, Shachar Shemesh wrote:
 One of the things that really bother me with the D community is 
 the "100% or nothing" approach.
In the worst case, when a function becomes unsafe, only the @safe attribute will be removed from it, which will be a breaking change, but hopefully it will happen rarely enough.
Aug 01
parent reply w0rp <devw0rp gmail.com> writes:
Direct OS function calls should probably all be treated as 
unsafe, except for rare cases where the behaviour is very well 
defined in standards and in actual implementations to be safe. 
The way to get safe functions for OS functionality is to write 
wrapper functions in D which prohibit unsafe calls.
Aug 01
parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions in D
 which prohibit unsafe calls.
+1. T -- People say I'm indecisive, but I'm not sure about that. -- YHL, CONLANG
Aug 01
parent reply Marco Leise <Marco.Leise gmx.de> writes:
Am Tue, 1 Aug 2017 10:50:59 -0700
schrieb "H. S. Teoh via Digitalmars-d"
<digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions in D
 which prohibit unsafe calls.  
+1.
I think I got it now!

size_t strlen_safe(in char[] str) @trusted
{
    foreach (c; str)
        if (!c)
            return strlen(str.ptr);
    return str.length;
}

:o)

-- 
Marco
Aug 01
next sibling parent "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Aug 01, 2017 at 10:39:35PM +0200, Marco Leise via Digitalmars-d wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:
 
 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions
 in D which prohibit unsafe calls.  
+1.
I think I got it now! size_t strlen_safe(in char[] str) trusted { foreach (c; str) if (!c) return strlen(str.ptr); return str.length; } :o)
[...]

LOL, that's laughably inefficient. Instead of calling strlen, you might as well have just looped with an index and returned the index. :-P

    foreach (i, c; str)
        if (!c)
            return i;

Oh wait, so we didn't need the wrapper after all. :-P

T

-- 
It's amazing how careful choice of punctuation can leave you hanging:
Aug 01
prev sibling parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 1 August 2017 at 20:39:35 UTC, Marco Leise wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via 
 Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as 
 unsafe, except for rare cases where the behaviour is very 
 well defined in standards and in actual implementations to 
 be safe. The way to get safe functions for OS functionality 
 is to write wrapper functions in D which prohibit unsafe 
 calls.
+1.
I think I got it now! size_t strlen_safe(in char[] str) trusted { foreach (c; str) if (!c) return strlen(str.ptr); return str.length; } :o)
I know this is in jest, but since `strlen`'s interface is inherently unsafe, yes, the only way to make calling it safe happens to also solve what `strlen` is supposed to solve. To me the consequence of this would be to not use `strlen` (or any other C function where checking the arguments for safety solves a superset of what the C function solves) from D. I don't think this applies to most OS functions, though, just to (OS independent) libc functions.
Aug 01
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 8/1/17 5:54 PM, Moritz Maxeiner wrote:
 On Tuesday, 1 August 2017 at 20:39:35 UTC, Marco Leise wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions in D
 which prohibit unsafe calls.
+1.
I think I got it now! size_t strlen_safe(in char[] str) trusted { foreach (c; str) if (!c) return strlen(str.ptr); return str.length; } :o)
I know this is in jest, but since `strlen`'s interface is inherently unsafe, yes, the only way to make calling it safe happens to also solve what `strlen` is supposed to solve. To me the consequence of this would be to not use `strlen` (or any other C function where checking the arguments for safety solves a superset of what the C function solves) from D. I don't think this applies to most OS functions, though, just to (OS independent) libc functions.
I think it goes without saying that some functions just shouldn't be marked @safe or @trusted. strlen is one of those.

-Steve
Aug 01
parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 1 August 2017 at 21:59:46 UTC, Steven Schveighoffer 
wrote:
 On 8/1/17 5:54 PM, Moritz Maxeiner wrote:
 On Tuesday, 1 August 2017 at 20:39:35 UTC, Marco Leise wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via 
 Digitalmars-d wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions in D
 which prohibit unsafe calls.
+1.
I think I got it now! size_t strlen_safe(in char[] str) trusted { foreach (c; str) if (!c) return strlen(str.ptr); return str.length; } :o)
I know this is in jest, but since `strlen`'s interface is inherently unsafe, yes, the only way to make calling it safe happens to also solve what `strlen` is supposed to solve. To me the consequence of this would be to not use `strlen` (or any other C function where checking the arguments for safety solves a superset of what the C function solves) from D. I don't think this applies to most OS functions, though, just to (OS independent) libc functions.
I think it goes without saying that some functions just shouldn't be marked safe or trusted. strlen is one of those.
Of course, though I think this (sub) context was more about writing safe D wrappers for system C functions than about which C functions to mark as @trusted/@safe. `strnlen` shouldn't be marked @safe/@trusted, either, but writing a safe D wrapper for it doesn't involve doing in D what `strnlen` is supposed to do:

---
size_t strnlen_safe(in char[] str)
{
    return strnlen(&str[0], str.length);
}
---

Not that there's much of a reason to do so, anyway, when the D idiomatic way is just a Phobos away:

---
import std.algorithm;
// I probably wouldn't even define this but use the body as is
auto strnlen_safe(in char[] str)
{
    return countUntil(cast(ubyte[]) str, '\0');
}
---
Aug 01
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 8/1/17 6:17 PM, Moritz Maxeiner wrote:
 On Tuesday, 1 August 2017 at 21:59:46 UTC, Steven Schveighoffer wrote:
 On 8/1/17 5:54 PM, Moritz Maxeiner wrote:
 On Tuesday, 1 August 2017 at 20:39:35 UTC, Marco Leise wrote:
 Am Tue, 1 Aug 2017 10:50:59 -0700
 schrieb "H. S. Teoh via Digitalmars-d"
 <digitalmars-d puremagic.com>:

 On Tue, Aug 01, 2017 at 05:12:38PM +0000, w0rp via Digitalmars-d 
 wrote:
 Direct OS function calls should probably all be treated as unsafe,
 except for rare cases where the behaviour is very well defined in
 standards and in actual implementations to be safe. The way to get
 safe functions for OS functionality is to write wrapper functions in D
 which prohibit unsafe calls.
+1.
I think I got it now! size_t strlen_safe(in char[] str) trusted { foreach (c; str) if (!c) return strlen(str.ptr); return str.length; } :o)
I know this is in jest, but since `strlen`'s interface is inherently unsafe, yes, the only way to make calling it safe happens to also solve what `strlen` is supposed to solve. To me the consequence of this would be to not use `strlen` (or any other C function where checking the arguments for safety solves a superset of what the C function solves) from D. I don't think this applies to most OS functions, though, just to (OS independent) libc functions.
I think it goes without saying that some functions just shouldn't be marked safe or trusted. strlen is one of those.
Of course, though I think this (sub) context was more about writing safe D wrappers for system C functions than about which C functions to mark as trusted/ safe. `strnlen` shouldn't be marked safe/ trusted, either, but writing a safe D wrapper for it doesn't involve doing in D what `strnlen` is supposed to do: --- size_t strnlen_safe(in char[] str) { return strnlen(&str[0], str.length); } ---
Most definitely. It would be nice to have a fully @safe interface that is as low-level as you can possibly get. Then any library implemented on top of it could be marked @safe as well.
 Not that there's much of a reason to do so, anyway, when the D idiomatic 
 way is just a Phobos away:
 
 ---
 import std.algorithm;
 // I probably wouldn't even define this but use the body as is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }
Oh that cast.... it irks me so. -Steve
Aug 01
next sibling parent reply "H. S. Teoh via Digitalmars-d" <digitalmars-d puremagic.com> writes:
On Tue, Aug 01, 2017 at 06:46:17PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 8/1/17 6:17 PM, Moritz Maxeiner wrote:
[...]
 import std.algorithm;
 // I probably wouldn't even define this but use the body as is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }
Oh that cast.... it irks me so.
[...] Welcome to the wonderful world of autodecoding. :-D OTOH, we could just use byCodeUnit and we wouldn't need the cast, I think. T -- Don't get stuck in a closet---wear yourself out.
Aug 01
parent Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 1 August 2017 at 22:52:26 UTC, H. S. Teoh wrote:
 On Tue, Aug 01, 2017 at 06:46:17PM -0400, Steven Schveighoffer 
 via Digitalmars-d wrote:
 On 8/1/17 6:17 PM, Moritz Maxeiner wrote:
[...]
 import std.algorithm;
 // I probably wouldn't even define this but use the body as 
 is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }
Oh that cast.... it irks me so.
[...] Welcome to the wonderful world of autodecoding. :-D OTOH, we could just use byCodeUnit and we wouldn't need the cast, I think.
I was lazy, okay (I nearly forgot putting the auto decoding prevention in there, because I always forget that D has auto decoding; it irks me as well) :p
Aug 01
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
 import std.algorithm;
 // I probably wouldn't even define this but use the body as is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }
Oh that cast.... it irks me so. -Steve
return str.representation.countUntil('\0');

Andrei
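Spelled out as a complete function, a sketch might look like the following. The name `nulIndex` is an assumption for illustration only. Note that, unlike `strnlen`, `countUntil` yields -1 rather than the length when no NUL is found.

```d
import std.algorithm.searching : countUntil;
import std.string : representation;

// Hypothetical name. Returns the index of the first NUL code unit,
// or -1 if the slice contains none. `representation` views the chars
// as ubytes, which sidesteps autodecoding without a cast.
ptrdiff_t nulIndex(in char[] str) @safe
{
    return str.representation.countUntil('\0');
}

void main()
{
    assert(nulIndex("abc\0def") == 3);
    assert(nulIndex("abc") == -1);
}
```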
Aug 02
parent Moritz Maxeiner <moritz ucworks.org> writes:
On Wednesday, 2 August 2017 at 16:32:44 UTC, Andrei Alexandrescu 
wrote:
 import std.algorithm;
 // I probably wouldn't even define this but use the body as is
 auto strnlen_safe(in char[] str)
 {
      return countUntil(cast(ubyte[]) str, '\0');
 }
Oh that cast.... it irks me so. -Steve
return str.representation.countUntil('\0');
Thanks, wasn't aware of this; it makes auto decoding slightly more bearable.
Aug 02
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/25/2017 10:54 PM, Walter Bright wrote:
 On 7/25/2017 8:26 AM, Andrei Alexandrescu wrote:
 A suite of safe wrappers on OS primitives might be useful.
The idea of fixing the operating system interface(s) has come up now and then. I've tried to discourage that on the following grounds: * We are not in the operating system business. * Operating system APIs grow like weeds. We'd set ourselves an impossible task. * It's a huge job simply to provide accurate declarations for the APIs. * We'd have to write our own documentation for the operating system APIs. It's hard enough writing such for Phobos. * A lot are fundamentally unfixable, like free() and strlen(). * The API import files should be focused solely on direct access to the APIs, not adding a translation layer. The user of them will expect this. * We already have safe wrappers for the commonly used APIs. For read(), there is std.stdio.
The standard library would not be in the position to provide such, but the project seems a good choice for a crowdsource and crowdmaintained library. -- Andrei
Jul 27
prev sibling parent reply Shachar Shemesh <shachar weka.io> writes:
On 25/07/17 18:26, Andrei Alexandrescu wrote:
 (btw void[] doesn't work)
Can you expand on this point? Shachar
Jul 26
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/26/17 3:05 AM, Shachar Shemesh wrote:
 On 25/07/17 18:26, Andrei Alexandrescu wrote:
 (btw void[] doesn't work)
Can you expand on this point? Shachar
Because anything casts to void[] implicitly.

e.g.:

void main() @safe
{
    int *[] arr = new int*[5];
    read(0, arr); // reading raw pointer data, shouldn't be allowed
}

-Steve
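A compile-time sketch of that implicit conversion, for illustration only (the enum names are made up here and appear nowhere in druntime):

```d
// An array of pointers converts to void[] implicitly, so a wrapper
// taking void[] would accept it and let raw bytes overwrite pointers.
enum voidAcceptsPointers = is(int*[] : void[]);

// No such implicit conversion exists to ubyte[], so a wrapper taking
// ubyte[] rejects the same argument at compile time.
enum bytesAcceptsPointers = is(int*[] : ubyte[]);

static assert( voidAcceptsPointers);
static assert(!bytesAcceptsPointers);

void main() {}
```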
Jul 26
prev sibling parent Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 25 July 2017 at 14:32:18 UTC, Shachar Shemesh wrote:
 On 25/07/17 17:11, ag0aep6g wrote:
 On 07/25/2017 03:50 PM, Shachar Shemesh wrote:
 The title really does say it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes 
 to them. Otherwise I simply cannot call "signalfd" and 
 "sigemptyset" (to name a couple from my most recent history) 
 from @safe code.
Not all OS functions can be `@trusted`. I don't know about `signalfd` and `sigemptyset`, but `read` [1] can't be `@trusted`, for example. It takes pointer and length separately, and the pointer is a `void*`. That's not safe at all.
And, indeed, the code calling "read" shouldn't be able to do that as @safe. Read itself, however, is trusted
No, it is not, because it does not fulfill the definition of @trusted (callable from *any* @safe context without allowing memory corruption).
 (because, let's face it, if you cannot trust the kernel, you're 
 screwed anyways).
This has nothing to do with trusting the kernel:

---
char[1] buf;
int dontCorruptMePlease;
read(fd, &buf[0], 10);
---

The read implementation can't verify the buffer size, it must assume it to be correct. If it's too large for the actual buffer -> memory corruption.
No function taking pointer+size of pointed to (that accesses them) can be @trusted.
 Having said that, I have no objection to excluding the 
 "pointer+length" system calls from the above rule. They are, by 
 far, the minority of system calls.
And also happen to be the most used ones. But I digress, the point is *every single function must be verified for every single attribute* (other than nothrow).
PRs are welcome :)
Jul 25
prev sibling next sibling parent reply Stefan Koch <uplink.coder googlemail.com> writes:
On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does say it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes 
 to them. Otherwise I simply cannot call "signalfd" and 
 "sigemptyset" (to name a couple from my most recent history) 
 from @safe code.

 I can try and set up a PR when I have the time. If anyone else 
 wants to take an easy one before then, you're welcome to :-)

 Shachar
these functions are supposed to have @trusted wrappers if used in @safe code.
Jul 25
parent reply Shachar Shemesh <shachar weka.io> writes:
On 25/07/17 17:12, Stefan Koch wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does say it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes to them. 
 Otherwise I simply cannot call "signalfd" and "sigemptyset" (to name a 
 couple from my most recent history) from @safe code.

 I can try and set up a PR when I have the time. If anyone else wants 
 to take an easy one before then, you're welcome to :-)

 Shachar
these functions are supposed to have @trusted wrappers if used in @safe code.
I'd love to hear the difference between:

extern(C) int signalfd (int __fd, const(sigset_t)* __mask, int __flags) nothrow @nogc;

and

int signalfdWrapper(int __fd, const(sigset_t)* __mask, int __flags) nothrow @trusted @nogc
{
    return signalfd(__fd, __mask, __flags);
}

Or are you suggesting the wrapper do something else?

Shachar
Jul 25
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/25/17 10:27 AM, Shachar Shemesh wrote:
 On 25/07/17 17:12, Stefan Koch wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all. I keep copying OS function 
 declarations into my code, just so I can add those attributes to 
 them. Otherwise I simply cannot call "signalfd" and "sigemptyset" (to 
 name a couple from my most recent history) from @safe code.

 I can try and set up a PR when I have the time. If anyone else wants 
 to take an easy one before then, you're welcome to :-)

 Shachar
 These functions are supposed to have @trusted wrappers if used in @safe code.
 I'd love to hear the difference between:

 extern(C) int signalfd (int __fd, const(sigset_t)* __mask, int __flags) nothrow @nogc;

 and

 int signalfdWrapper(int __fd, const(sigset_t)* __mask, int __flags) nothrow @trusted @nogc {
     return signalfd(__fd, __mask, __flags);
 }
I think signalfd can be marked @trusted, as @safe code supports pointing at a single element. Other system calls that accept a pointer/length combo cannot be marked @trusted.

-Steve
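To illustrate the pointer/length concern in C terms, here is a minimal sketch of a wrapper that keeps the pointer and its length together, so the length handed to read(2) cannot drift away from the buffer it describes. The names `slice_t`, `read_slice` and `SLICE_OF` are hypothetical and are not part of any libc or of druntime; only `read` itself is a real call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical "slice" type: the pointer and its length travel together,
   roughly what a D void[] carries. */
typedef struct {
    void  *ptr;
    size_t len;
} slice_t;

/* Hypothetical wrapper: the only length ever handed to read(2) is the one
   stored alongside the pointer, so a caller cannot pass a mismatched pair. */
ssize_t read_slice(int fd, slice_t buf)
{
    if (buf.ptr == NULL && buf.len != 0)
        return -1;              /* refuse an obviously invalid slice */
    return read(fd, buf.ptr, buf.len);
}

/* Helper for building a slice from a real array; the size comes from the
   array itself rather than from the caller. */
#define SLICE_OF(arr) ((slice_t){ (arr), sizeof (arr) })
```

A raw `read(fd, p, n)` leaves it to the caller to keep `p` and `n` consistent, which is the part a compiler cannot check. The sketch only narrows that gap; it is not a proof of memory safety.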
Jul 25
next sibling parent reply Kagamin <spam here.lot> writes:
While we're at it, check this: 
https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047
Jul 25
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/25/17 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047
Looks fine to me. That's not an array of FILE, it's a single pointer. -Steve
Jul 25
parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 25 July 2017 at 18:07:06 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047
Looks fine to me. That's not an array of FILE, it's a single pointer.
fgetc cannot be @trusted, for the same reason fclose cannot be @trusted. If you pass either of them `null` - which is legal in an @safe context - the behaviour is undefined, which contradicts the definition of @trusted: "Trusted functions are guaranteed by the programmer to not exhibit any undefined behavior if called by a safe function."
Jul 25
next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/25/17 2:36 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 18:07:06 UTC, Steven Schveighoffer wrote:
 On 7/25/17 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047 
Looks fine to me. That's not an array of FILE, it's a single pointer.
 fgetc cannot be @trusted, for the same reason fclose cannot be @trusted. If you pass either of them `null` - which is legal in an @safe context - the behaviour is undefined, which contradicts the definition of @trusted: "Trusted functions are guaranteed by the programmer to not exhibit any undefined behavior if called by a safe function."
The behavior is defined. It will crash with a segfault. This is par for the course in @safe land -- dereferencing null pointers is OK.

What is not defined is to fclose a file, and then use that FILE * in any way afterwards without reassigning. Note that @safe functions don't make any guarantees once you pass in an invalid (except for null) or dangling pointer. However, if you are using only @safe code, you shouldn't be able to make one of these either. Hence fclose is not @safe or @trusted.

The one case where this fails is for a null pointer to a very very large struct that has a way to reference data outside the protected page. I have proposed in the past a way to protect against this, but it didn't gain any traction.

-Steve
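The large-struct case can be made concrete with a small C sketch. It assumes a 4096-byte unmapped region at address 0, which is a common arrangement but not something the thread or any standard guarantees; `struct big`, `ASSUMED_GUARD_BYTES` and `tail_escapes_guard` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed guard size: many systems leave at least one 4096-byte page at
   address 0 unmapped. This number is an illustration, not a guarantee. */
#define ASSUMED_GUARD_BYTES 4096u

/* Hypothetical struct that is much larger than the guard region. */
struct big {
    char padding[1u << 20];  /* 1 MiB of leading data */
    int  tail;               /* lives far past the first page */
};

/* Returns 1 if accessing `tail` through a null `struct big *` would touch an
   address beyond the assumed guard region, so the hardware trap is not
   guaranteed to fire. Only offsets are computed; nothing is dereferenced. */
int tail_escapes_guard(void)
{
    return offsetof(struct big, tail) >= ASSUMED_GUARD_BYTES;
}
```

With a field at that offset, a null base pointer yields an address around 1 MiB, which may well be mapped, so the access might not crash.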
Jul 25
parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 2:36 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 18:07:06 UTC, Steven Schveighoffer 
 wrote:
 On 7/25/17 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047
Looks fine to me. That's not an array of FILE, it's a single pointer.
 fgetc cannot be @trusted, for the same reason fclose cannot be @trusted. If you pass either of them `null` - which is legal in an @safe context - the behaviour is undefined, which contradicts the definition of @trusted: "Trusted functions are guaranteed by the programmer to not exhibit any undefined behavior if called by a safe function."
The behavior is defined. It will crash with a segfault.
In C land that behaviour is a platform (hardware/OS/libc) specific implementation detail (it's what you generally expect to happen, but AFAIK it isn't defined in official ISO/IEC C).
 This is par for the course in @safe land -- dereferencing null
 pointers is OK.
In D land we require null dereferences to crash. That means - from a strict, pedantic standpoint - that while it's OK to attribute D functions with null dereferences as @trusted, the same can't be said for C functions with null dereferences.
Jul 25
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.
In C land that behaviour is a platform (hardware/OS/libc) specific implementation detail (it's what you generally expect to happen, but AFAIK it isn't defined in official ISO/IEC C).
In cases where C does not crash when dereferencing null, then D would not crash when dereferencing null. D depends on the hardware doing this (Walter has said so many times), so if C doesn't do it, then D won't. So those systems would have to be treated specially, and you'd have to work out your own home-grown mechanism for memory safety.

Optionally, one can redefine @safe *on those platforms* to say all dereferences will be checked against null, and then it could work on such platforms (and of course, you'd have to remove the @trusted marks from low-level C calls).

Either way, we can mark these as @trusted for all current D platforms.

-Steve
Jul 25
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.
In C land that behaviour is a platform (hardware/OS/libc) specific implementation detail (it's what you generally expect to happen, but AFAIK it isn't defined in official ISO/IEC C).
In cases where C does not crash when dereferencing null, then D would not crash when dereferencing null. D depends on the hardware doing this (Walter has said so many times), so if C doesn't do it, then D won't. So those systems would have to be treated specially, and you'd have to work out your own home-grown mechanism for memory safety.
What Moritz is saying is that the following implementation of fclose is correct according to the C standard:

int fclose(FILE *stream){
    if(stream == NULL){
        go_wild_and_corrupt_all_the_memory();
    }else{
        actually_close_the_file(stream);
    }
}
Jul 25
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 26.07.2017 02:45, Timon Gehr would have liked to have written:
 ...
 What Moritz is saying is that the following implementation of fclose is 
 correct according to the C standard:
 
 int fclose(FILE *stream){
      if(stream == NULL){
          return go_wild_and_corrupt_all_the_memory();
      }else{
          return actually_close_the_file(stream);
      }
 }
(Forgot the returns.)
Jul 25
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/25/2017 08:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.
In C land that behaviour is a platform (hardware/OS/libc) specific implementation detail (it's what you generally expect to happen, but AFAIK it isn't defined in official ISO/IEC C).
In cases where C does not crash when dereferencing null, then D would not crash when dereferencing null. D depends on the hardware doing this (Walter has said so many times), so if C doesn't do it, then D won't. So those systems would have to be treated specially, and you'd have to work out your own home-grown mechanism for memory safety.
 What Moritz is saying is that the following implementation of fclose is correct according to the C standard:

 int fclose(FILE *stream){
     if(stream == NULL){
         go_wild_and_corrupt_all_the_memory();
     }else{
         actually_close_the_file(stream);
     }
 }
I'd think that would be the case, but failed to find a fgetc implementation that mentions it's undefined for a null FILE*. Is there a link? Thx. -- Andrei
Jul 25
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2017 5:56 PM, Andrei Alexandrescu wrote:
 I'd think that would be the case, but failed to find a fgetc implementation
 that mentions it's undefined for a null FILE*. Is there a link? Thx. -- Andrei
The documentation for DMC++ fgetc() is: https://digitalmars.com/rtl/stdio.html#fgetc and says: "Returns the character just read on success, or EOF if end-of-file or a read error is encountered." The implementation checks for fp being NULL and returns EOF if it is.
Jul 25
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 26.07.2017 05:02, Walter Bright wrote:
 On 7/25/2017 5:56 PM, Andrei Alexandrescu wrote:
 I'd think that would be the case, but failed to find a fgetc 
 implementation that mentions it's undefined for a null FILE*. Is there 
 a link? Thx. -- Andrei
The documentation for DMC++ fgetc() is: https://digitalmars.com/rtl/stdio.html#fgetc and says: "Returns the character just read on success, or EOF if end-of-file or a read error is encountered." The implementation checks for fp being NULL and returns EOF if it is.
The C mindset is that this check is a waste of precious processing resources and morally wrong, as only a fool would pass NULL anyway, and fools deserve to get UB.
Jul 26
parent Walter Bright <newshound2 digitalmars.com> writes:
On 7/26/2017 3:14 AM, Timon Gehr wrote:
 On 26.07.2017 05:02, Walter Bright wrote:
 The implementation checks for fp being NULL and returns EOF if it is.
The C mindset is that this check is a waste of precious processing resources and morally wrong, as only a fool would pass NULL anyway, and fools deserve to get UB.
I wrote that code 30+ years ago, and no longer remember why I put the null check in. It might have been because other C compiler libraries did it.
Jul 26
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 26.07.2017 02:56, Andrei Alexandrescu wrote:
 What Moritz is saying is that the following implementation of fclose 
 is correct according to the C standard:

 int fclose(FILE *stream){
      if(stream == NULL){
          return go_wild_and_corrupt_all_the_memory();
      }else{
          return actually_close_the_file(stream);
      }
 }
I'd think that would be the case, but failed to find a fgetc implementation that mentions it's undefined for a null FILE*. Is there a link? Thx. -- Andrei
It's implicit. In C, whenever you pass something that is outside the interface specification, you get UB. Also, in C, there is no way to get a segmentation fault except for UB, and fgetc(NULL) segfaults with glibc.
Jul 26
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/25/17 8:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.
In C land that behaviour is a platform (hardware/OS/libc) specific implementation detail (it's what you generally expect to happen, but AFAIK it isn't defined in official ISO/IEC C).
In cases where C does not crash when dereferencing null, then D would not crash when dereferencing null. D depends on the hardware doing this (Walter has said so many times), so if C doesn't do it, then D won't. So those systems would have to be treated specially, and you'd have to work out your own home-grown mechanism for memory safety.
 What Moritz is saying is that the following implementation of fclose is correct according to the C standard:

 int fclose(FILE *stream){
     if(stream == NULL){
         go_wild_and_corrupt_all_the_memory();
     }else{
         actually_close_the_file(stream);
     }
 }
I think we can correctly assume no fclose implementations exist that do anything but access data pointed at by stream. Which means a segfault on every platform we support. On platforms that may not segfault, you'd be on your own.

In other words, I think we can assume for any C functions that are passed pointers that dereference those pointers, passing null is safely going to segfault.

Likewise, because D depends on hardware flagging of dereferencing null as a segfault, any platforms that *don't* have that for C also won't have it for D. And then @safe doesn't even work in D code either.

As we have good support for different prototypes for different platforms, we could potentially unmark those as @trusted in those cases.

-Steve
Jul 25
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 7/25/2017 6:09 PM, Steven Schveighoffer wrote:
 Likewise, because D depends on hardware flagging of dereferencing null as a 
 segfault, any platforms that *don't* have that for C also won't have it for D. 
 And then @safe doesn't even work in D code either.
I spent 10 years programming on DOS with zero memory protection, and people have forgotten how awful that was. You couldn't simply instrument the code with null pointer checks, either, because then the program would be too big to fit. The solution finally appeared with the 286 DOS Extenders, which ran in protected mode. I switched to doing all my development under them, and would port to DOS only after passing all the test suite.

D is definitely predicated on having hardware memory protection.

The C/C++ Standards are still hanging on to EBCDIC, 10 bit bytes, non-IEEE floating point, etc. It's time to let that crap go :-)

One C++ programmer told me that C++ could handle any character set. I asked him how RADIX50 was supported. Segfault! (I learned to program on RADIX50 systems.)

D made some fundamental decisions:

* Unicode
* 2's complement
* 8 bit bytes
* IEEE arithmetic
* memory protection
* fixed sizes for integral types
* single pointer type
* >= 32 bit processors

that relegated a lot of junk to the dustbin of history. (It's awful pretending to support that stuff. C and C++ pretend to, but just about zero programs will actually work on such systems, because there aren't any to try the code out on.)
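Several of the decisions in that list can be checked mechanically in C. The sketch below uses C11 `_Static_assert` for the ones that are visible to the compiler (byte width, two's complement, fixed-size integers, pointer width); Unicode, IEEE arithmetic and memory protection are left out because they are not expressible this way. The function name `platform_assumptions_hold` is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Compile-time checks (C11 _Static_assert) for a few of the listed
   assumptions. A failing check stops the build. */
_Static_assert(CHAR_BIT == 8, "8-bit bytes expected");
_Static_assert((-1 & 3) == 3, "two's complement expected");
_Static_assert(sizeof(int32_t) == 4, "fixed-size 32-bit integer expected");
_Static_assert(sizeof(void *) >= 4, "at least 32-bit pointers expected");
_Static_assert(sizeof(void *) == sizeof(char *), "single pointer size expected");

/* Run-time view of the same conditions, for callers that only want a
   yes/no answer. */
int platform_assumptions_hold(void)
{
    return CHAR_BIT == 8 && (-1 & 3) == 3 && sizeof(void *) >= 4;
}
```

The `(-1 & 3) == 3` test works because -1 is all ones only in two's complement; ones' complement would give 2 and sign-magnitude would give 1.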
Jul 25
parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 26 July 2017 at 03:16:44 UTC, Walter Bright wrote:
 On 7/25/2017 6:09 PM, Steven Schveighoffer wrote:
 Likewise, because D depends on hardware flagging of 
 dereferencing null as a segfault, any platforms that *don't* 
 have that for C also won't have it for D. And then @safe
 doesn't even work in D code either.
 I spent 10 years programming on DOS with zero memory protection, and people have forgotten how awful that was. You couldn't simply instrument the code with null pointer checks, either, because then the program would be too big to fit. The solution finally appeared with the 286 DOS Extenders, which ran in protected mode. I switched to doing all my development under them, and would port to DOS only after passing all the test suite.

 D is definitely predicated on having hardware memory protection.

 The C/C++ Standards are still hanging on to EBCDIC, 10 bit bytes, non-IEEE floating point, etc. It's time to let that crap go :-)

 One C++ programmer told me that C++ could handle any character set. I asked him how RADIX50 was supported. Segfault! (I learned to program on RADIX50 systems.)

 D made some fundamental decisions:

 * Unicode
 * 2's complement
 * 8 bit bytes
 * IEEE arithmetic
 * memory protection
 * fixed sizes for integral types
 * single pointer type
 * >= 32 bit processors

 that relegated a lot of junk to the dustbin of history. (It's awful pretending to support that stuff. C and C++ pretend to, but just about zero programs will actually work on such systems, because there aren't any to try the code out on.)
And for that list of decisions alone, I love you. I cannot stand hearing any more of the crap about "undefined behaviour", "nasal demons", and optimizers that think they are entitled to sabotage programs because their authors are overzealous language lawyers in the C world practicing POOP (premature optimisation oriented programming).
Jul 26
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 26.07.2017 03:09, Steven Schveighoffer wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 ...
 What Moritz is saying is that the following implementation of fclose 
 is correct according to the C standard:

 int fclose(FILE *stream){
      if(stream == NULL){
          return go_wild_and_corrupt_all_the_memory();
      }else{
          return actually_close_the_file(stream);
      }
 }
I think we can correctly assume no fclose implementations exist that do anything but access data pointed at by stream. Which means a segfault on every platform we support. On platforms that may not segfault, you'd be on your own. In other words, I think we can assume for any C functions that are passed pointers that dereference those pointers, passing null is safely going to segfault.
I'm not going to assume that.
Jul 26
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/26/17 6:01 AM, Timon Gehr wrote:
 On 26.07.2017 03:09, Steven Schveighoffer wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 ...
 What Moritz is saying is that the following implementation of fclose 
 is correct according to the C standard:

 int fclose(FILE *stream){
      if(stream == NULL){
          return go_wild_and_corrupt_all_the_memory();
      }else{
          return actually_close_the_file(stream);
      }
 }
I think we can correctly assume no fclose implementations exist that do anything but access data pointed at by stream. Which means a segfault on every platform we support. On platforms that may not segfault, you'd be on your own. In other words, I think we can assume for any C functions that are passed pointers that dereference those pointers, passing null is safely going to segfault.
I'm not going to assume that.
Tell you what, when you find a D platform that this doesn't happen, we can fix it with a version statement ;) -Steve
Jul 26
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 26.07.2017 13:22, Steven Schveighoffer wrote:
 On 7/26/17 6:01 AM, Timon Gehr wrote:
 On 26.07.2017 03:09, Steven Schveighoffer wrote:
 ...
 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is 
 safely going to segfault.
I'm not going to assume that.
 Tell you what, when you find a D platform where this doesn't happen, we can fix it with a version statement ;) -Steve
The burden of proof is on you, not me. You are advocating the C approach to memory safety.
Jul 26
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/26/17 7:55 AM, Timon Gehr wrote:
 On 26.07.2017 13:22, Steven Schveighoffer wrote:
 On 7/26/17 6:01 AM, Timon Gehr wrote:
 On 26.07.2017 03:09, Steven Schveighoffer wrote:
 ...
 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is 
 safely going to segfault.
I'm not going to assume that.
 Tell you what, when you find a D platform where this doesn't happen, we can fix it with a version statement ;)
 The burden of proof is on you, not me. You are advocating the C approach to memory safety.
They leave NULL dereferencing undefined because in some quirky old useless no-longer-existing hardware, it doesn't segfault. Note that this is more implementation defined than undefined (in fact, I couldn't find it listed in the UB section at all in the C11 spec).

Look at Walter's response. I think D can simply only work with C implementations on platforms where null dereferencing segfaults and ignore the rest.

Walter, can we update the @safe spec to say that reading/writing data from the null page in C is required to generate a program crash for @safe to be valid? This can be an exception to the UB rule. I just don't see the point of adding extra checks for null when the hardware already does it.

-Steve
Jul 26
prev sibling next sibling parent reply Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven 
 Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.
In C land that behaviour is a platform (hardware/OS/libc) specific implementation detail (it's what you generally expect to happen, but AFAIK it isn't defined in official ISO/IEC C).
In cases where C does not crash when dereferencing null, then D would not crash when dereferencing null. D depends on the hardware doing this (Walter has said so many times), so if C doesn't do it, then D won't. So those systems would have to be treated specially, and you'd have to work out your own home-grown mechanism for memory safety.
 What Moritz is saying is that the following implementation of fclose is correct according to the C standard:

 int fclose(FILE *stream){
     if(stream == NULL){
         go_wild_and_corrupt_all_the_memory();
     }else{
         actually_close_the_file(stream);
     }
 }
 I think we can correctly assume no fclose implementations exist that do anything but access data pointed at by stream. Which means a segfault on every platform we support.
How lucky that Solaris/SPARC is not supported, as on that platform fclose(NULL) and even close(-1) do not segfault. We had to learn it the hard way when we ported our project from Solaris/SPARC to Linux/x86_64. It was surprising how often that (wrong) behavior happened in our code base (100K lines of C).
 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions that 
 are passed pointers that dereference those pointers, passing 
 null is safely going to segfault.
Dereferencing a NULL pointer on Solaris/SPARC segfaults, but fclose() apparently does not blindly dereference the passed pointer. I suspect that Sun intentionally reduced the opportunities to segfault in a lot of system calls and libs. The port to Linux revealed several violations (stale pointer usage, double frees, buffer overflows) that never triggered on Solaris, and the project is more than 20 years old.
 Likewise, because D depends on hardware flagging of 
 dereferencing null as a segfault, any platforms that *don't* 
 have that for C also won't have it for D. And then @safe
 doesn't even work in D code either.

 As we have good support for different prototypes for different 
 platforms, we could potentially unmark those as @trusted in
 those cases.
Jul 26
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/26/17 12:08 PM, Patrick Schluter wrote:
 On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.
In C land that behaviour is a platform (hardware/OS/libc) specific implementation detail (it's what you generally expect to happen, but AFAIK it isn't defined in official ISO/IEC C).
In cases where C does not crash when dereferencing null, then D would not crash when dereferencing null. D depends on the hardware doing this (Walter has said so many times), so if C doesn't do it, then D won't. So those systems would have to be treated specially, and you'd have to work out your own home-grown mechanism for memory safety.
 What Moritz is saying is that the following implementation of fclose is correct according to the C standard:

 int fclose(FILE *stream){
     if(stream == NULL){
         go_wild_and_corrupt_all_the_memory();
     }else{
         actually_close_the_file(stream);
     }
 }
 I think we can correctly assume no fclose implementations exist that do anything but access data pointed at by stream. Which means a segfault on every platform we support.
 How lucky that Solaris/SPARC is not supported, as on that platform fclose(NULL) and even close(-1) do not segfault. We had to learn it the hard way when we ported our project from Solaris/SPARC to Linux/x86_64. It was surprising how often that (wrong) behavior happened in our code base (100K lines of C).
I'm guessing though that it's an implementation detail (like Walter's DMC example). A segfault is fine, and returning an error is fine. Both will properly be handled, and do not cause UB.

So I guess I should restate that we can assume no implementations exist that intentionally cause UB when stream is NULL (as in Timon's example). Either they check for null, and handle gracefully, or don't check and segfault.

What I was talking about is platforms that don't segfault on reading/writing from the zero page. Those we couldn't support with @safe D anyway.

-Steve
Jul 26
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations exist 
 that intentionally cause UB when stream is NULL (as in Timon's example). 
 Either they check for null, and handle gracefully, or don't check and 
 segfault.
No need to worry about that at all. If worse comes to worst - i.e. we do port to such an implementation - we can always provide a thin wrapper that checks for NULL then calls the native function. No need to change the signatures. -- Andrei
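The thin-wrapper idea can be sketched on the C side as follows. This is only an illustration under the assumption that terminating the process is the desired response to a null stream; the name `checked_fgetc` is hypothetical and does not exist in any libc or in druntime.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical wrapper name. Aborts deterministically on a null stream
   instead of relying on whatever the native fgetc does with one. */
int checked_fgetc(FILE *stream)
{
    if (stream == NULL)
        abort();            /* defined outcome: the process terminates */
    return fgetc(stream);   /* stream is non-null from here on */
}
```

The check costs one comparison per call. Whether that is worth paying on platforms where the native function already traps on null is the open question in this thread.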
Jul 26
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 27.07.2017 01:56, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example).
My argument was not that we need to fear implementations that take explicit measures to screw you, but UB is UB. Compilers can in principle turn segfaults into any other behaviour they want, and this behaviour can change between releases. I'd just rather not codify guarantees that do not exist into the type system, as it is not really feasible to check them, even if in practice you will in the overwhelming majority get the expected behaviour.
 Either they check for null, and handle gracefully, or don't 
 check and segfault.
No need to worry about that at all. If worse comes to worst - i.e. we do port to such an implementation
How do you notice?
 - we can always provide a thin wrapper 
 that checks for NULL then calls the native function. No need to change 
 the signatures. -- Andrei
I don't see how that works, as you'd end up with two different implementations of the same C function. (I.e. you get a name clash in the object file.)
Jul 26
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/26/17 8:09 PM, Timon Gehr wrote:
 On 27.07.2017 01:56, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example).
My argument was not that we need to fear implementations that take explicit measures to screw you, but UB is UB. Compilers can in principle turn segfaults into any other behaviour they want, and this behaviour can change between releases. I'd just rather not codify guarantees that do not exist into the type system, as it is not really feasible to check them, even if in practice you will in the overwhelming majority get the expected behaviour.
I can't see how compilers can take advantage of this one. However, we can take advantage that this UB is almost universally implemented as a hardware segfault that ends the process. -Steve
Jul 26
parent reply Jacob Carlborg <doob me.com> writes:
On 2017-07-27 03:14, Steven Schveighoffer wrote:

 I can't see how compilers can take advantage of this one. However, we 
 can take advantage that this UB is almost universally implemented as a 
 hardware segfault that ends the process.
Unfortunately it's not that easy with optimizing compilers for C and C++:

void contains_null_check(int* p)
{
    int dead = *p;
    if (p == 0)
        return;
    *p = 4;
}

If the compiler runs the "Dead Code Elimination" optimization before "Redundant Null Check Elimination" then the above code will turn into:

void contains_null_check(int* p)
{
    if (p == 0) // Null check not redundant, and is kept.
        return;
    *p = 4;
}

But if the compiler runs the optimizations in the opposite order we end up with this code:

void contains_null_check(int* p)
{
    int dead = *p;
    if (false) // "p" was dereferenced by this point, so it can't be null
        return;
    *p = 4;
}

And then the compiler runs the "Dead Code Elimination" pass and we're left with:

void contains_null_check(int* p)
{
    *p = 4;
}

This can change between releases of compilers and between different vendors. Introducing an inlining pass will make this even more complicated, because the above example might be spread across multiple functions that have now been inlined.

For reference: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know_14.html

-- 
/Jacob Carlborg
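For contrast, here is a minimal sketch of the same function with the test placed before any dereference. With that ordering the pointer has not yet been used when the check runs, so the "it was already dereferenced, hence non-null" reasoning does not apply. The function name and the return value are hypothetical additions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same shape as contains_null_check, but the test comes before any
   dereference, so the compiler has no grounds to treat it as redundant. */
int store_four_if_non_null(int *p)
{
    if (p == NULL)
        return 0;   /* nothing was written */
    *p = 4;         /* p is known to be non-null here */
    return 1;
}
```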
Jul 26
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/27/17 2:48 AM, Jacob Carlborg wrote:
 And then the compiler runs the "Dead Code Elimination" pass and we're 
 left with:
 
 void contains_null_check(int* p)
 {
      *p = 4;
 }
So the result is that it will segfault. I don't see a problem with this. It's what I would have expected. -Steve
Jul 27
next sibling parent Patrick Schluter <Patrick.Schluter bbox.fr> writes:
On Thursday, 27 July 2017 at 11:46:24 UTC, Steven Schveighoffer 
wrote:
 On 7/27/17 2:48 AM, Jacob Carlborg wrote:
 And then the compiler runs the "Dead Code Elimination" pass 
 and we're left with:
 
 void contains_null_check(int* p)
 {
      *p = 4;
 }
So the result is that it will segfault. I don't see a problem with this. It's what I would have expected.
Except that that code was used in the Linux kernel, where page 0 was mapped and thus dereferencing the pointer did not segfault. The issue that is missed here is what purpose the compiler is used for. Will the code always be run in a hosted environment, or is it used in a freestanding implementation (kernel and embedded stuff)? The C standard makes a distinction between the two, but the compiler gurus apparently do not care.

As for D, Walter's list of constraints for a D compiler makes it, imho, impossible to use the language on smaller embedded platforms or in ring 0 mode on x86. That's why I find calling D a systems language somewhat disingenuous. Calling it an application language would be truer.
Jul 27
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2017-07-27 13:46, Steven Schveighoffer wrote:

 So the result is that it will segfault. I don't see a problem with this. 
 It's what I would have expected.
The problem is that behavior might change depending on which compiler is used because the code is not valid according to the specification. -- /Jacob Carlborg
Jul 27
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/26/17 7:56 PM, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example). Either they check for null, and handle gracefully, or don't 
 check and segfault.
No need to worry about that at all. If worse comes to worst - i.e. we do port to such an implementation - we can always provide a thin wrapper that checks for NULL then calls the native function. No need to change the signatures. -- Andrei
Hm.. so you mean:

pragma(mangle, "fgetc")
private extern(C) int real_fgetc(FILE* stream);

extern(D) int fgetc(FILE* stream) @trusted
{
    if (stream is null) assert(0);
    return real_fgetc(stream);
}

Yeah, that should work well actually. Nice! -Steve
Jul 26
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 27.07.2017 02:11, Steven Schveighoffer wrote:
 On 7/26/17 7:56 PM, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example). Either they check for null, and handle gracefully, or don't 
 check and segfault.
No need to worry about that at all. If worse comes to worst - i.e. we do port to such an implementation - we can always provide a thin wrapper that checks for NULL then calls the native function. No need to change the signatures. -- Andrei
Hm.. so you mean: pragma(mangle, "fgetc") private extern(C) int real_fgetc(FILE * stream) extern(D) int fgetc(FILE *stream) trusted { if(stream == null) assert(0); return real_fgetc(stream); } Yeah, that should work well actually. Nice! -Steve
That works but it changes the signature. (extern(D) vs. extern(C)).
Jul 27
next sibling parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/27/17 7:27 AM, Timon Gehr wrote:
 On 27.07.2017 02:11, Steven Schveighoffer wrote:
 On 7/26/17 7:56 PM, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example). Either they check for null, and handle gracefully, or 
 don't check and segfault.
No need to worry about that at all. If worse comes to worst - i.e. we do port to such an implementation - we can always provide a thin wrapper that checks for NULL then calls the native function. No need to change the signatures. -- Andrei
Hm.. so you mean: pragma(mangle, "fgetc") private extern(C) int real_fgetc(FILE * stream) extern(D) int fgetc(FILE *stream) trusted { if(stream == null) assert(0); return real_fgetc(stream); } Yeah, that should work well actually. Nice!
That works but it changes the signature. (extern(D) vs. extern(C)).
Hm... you could use pragma(mangle) to get the signature the same. I was just thinking since it's going to be a D wrapper, it could be extern(D). But you are right, &fgetc would result in a different type, so we should use pragma(mangle) instead. -Steve
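One way to read the suggestion above is sketched below: the wrapper itself keeps C linkage, so taking its address still yields an `extern(C)` function pointer. This is only an illustration under assumptions. The exported symbol name `checked_fgetc` is made up here (the thread does not say which name the wrapper would carry), and the `nothrow @nogc` annotations are added for the example.

```d
import core.stdc.stdio : FILE;
import std.traits : functionLinkage;

// Bind the libc symbol under a private D name.
pragma(mangle, "fgetc")
private extern (C) int real_fgetc(FILE* stream) nothrow @nogc;

// The wrapper keeps C linkage. "checked_fgetc" is a hypothetical symbol name,
// chosen so it does not collide with libc's own fgetc at link time.
pragma(mangle, "checked_fgetc")
extern (C) int fgetc(FILE* stream) @trusted nothrow @nogc
{
    if (stream is null) assert(0); // deterministic abort instead of relying on a segfault
    return real_fgetc(stream);
}

// The wrapper still has C linkage, so &fgetc is an extern(C) function pointer.
static assert(functionLinkage!fgetc == "C");
```

Whether the wrapper should instead replace the libc symbol outright (via link order or dynamic loading, as Andrei mentions) is a separate design question that this sketch does not settle.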
Jul 27
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/27/2017 07:27 AM, Timon Gehr wrote:
 On 27.07.2017 02:11, Steven Schveighoffer wrote:
 On 7/26/17 7:56 PM, Andrei Alexandrescu wrote:
 On 07/26/2017 06:16 PM, Steven Schveighoffer wrote:
 So I guess I should restate that we can assume no implementations 
 exist that intentionally cause UB when stream is NULL (as in Timon's 
 example). Either they check for null, and handle gracefully, or 
 don't check and segfault.
No need to worry about that at all. If worse comes to worst - i.e. we do port to such an implementation - we can always provide a thin wrapper that checks for NULL then calls the native function. No need to change the signatures. -- Andrei
Hm.. so you mean: pragma(mangle, "fgetc") private extern(C) int real_fgetc(FILE * stream) extern(D) int fgetc(FILE *stream) trusted { if(stream == null) assert(0); return real_fgetc(stream); } Yeah, that should work well actually. Nice! -Steve
That works but it changes the signature. (extern(D) vs. extern(C)).
There are a number of techniques allowing you to daisy chain C functions in libraries without changing names by using e.g. linking order or dynamic symbol loading. Sounds exactly like the kind of problem to tackle when you see it. We have much more pressing things to be on. -- Andrei
Jul 27
prev sibling parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 8:45 PM, Timon Gehr wrote:
 On 26.07.2017 02:35, Steven Schveighoffer wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven 
 Schveighoffer wrote:
 The behavior is defined. It will crash with a segfault.
In C land that behaviour is a platform (hardware/OS/libc) specific implementation detail (it's what you generally expect to happen, but AFAIK it isn't defined in official ISO/IEC C).
In cases where C does not crash when dereferencing null, then D would not crash when dereferencing null. D depends on the hardware doing this (Walter has said so many times), so if C doesn't do it, then D won't. So those systems would have to be treated specially, and you'd have to work out your own home-grown mechanism for memory safety.
What Moritz is saying is that the following implementation of fclose is correct according to the C standard:

int fclose(FILE *stream){
    if(stream == NULL){
        go_wild_and_corrupt_all_the_memory();
    }else{
        actually_close_the_file(stream);
    }
}
I think we can correctly assume no fclose implementations exist that do anything but access data pointed at by stream. Which means a segfault on every platform we support. On platforms that may not segfault, you'd be on your own. In other words, I think we can assume for any C functions that are passed pointers that dereference those pointers, passing null is safely going to segfault. Likewise, because D depends on hardware flagging of dereferencing null as a segfault, any platforms that *don't* have that for C also won't have it for D. And then safe doesn't even work in D code either. As we have good support for different prototypes for different platforms, we could potentially unmark those as trusted in those cases.
--- null.d ---
version (linux):

import core.stdc.stdio : FILE;
import core.sys.linux.sys.mman;

extern (C) @safe int fgetc(FILE* stream);

void mmapNull()
{
    void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE,
        MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
    assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root");
    *(cast (char*) null) = 'D';
}

void nullDeref() @safe
{
    fgetc(null);
}

void main(string[] args)
{
    mmapNull();
    nullDeref();
}
---

For some fun on Linux, try out

# echo 0 > /proc/sys/vm/mmap_min_addr
$ rdmd null.d

Consider `mmapNull` being run in some third party shared lib you don't control.
Jul 27
next sibling parent reply ag0aep6g <anonymous example.com> writes:
On 07/27/2017 03:24 PM, Moritz Maxeiner wrote:
 --- null.d ---
 version (linux):
 
 import core.stdc.stdio : FILE;
 import core.sys.linux.sys.mman;
 
 extern (C)  safe int fgetc(FILE* stream);
 
 void mmapNull()
 {
      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, 
 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
      assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` 
 as root");
      *(cast (char*) null) = 'D';
 }
 
 void nullDeref()  safe
 {
      fgetc(null);
 }
 
 void main(string[] args)
 {
      mmapNull();
      nullDeref();
 }
 ---
 
 For some fun on Linux, try out
 # echo 0 > /proc/sys/vm/mmap_min_addr
 $ rdmd null.d
The gist of this is that Linux can be configured so that null can be a valid pointer. Right? That seems pretty bad for safe at large, not only when C functions are involved.
Jul 27
parent Moritz Maxeiner <moritz ucworks.org> writes:
On Thursday, 27 July 2017 at 13:45:21 UTC, ag0aep6g wrote:
 On 07/27/2017 03:24 PM, Moritz Maxeiner wrote:
 --- null.d ---
 version (linux):
 
 import core.stdc.stdio : FILE;
 import core.sys.linux.sys.mman;
 
 extern (C)  safe int fgetc(FILE* stream);
 
 void mmapNull()
 {
      void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, 
 MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0);
      assert (mmapNull == null, "Do `echo 0 > 
 /proc/sys/vm/mmap_min_addr` as root");
      *(cast (char*) null) = 'D';
 }
 
 void nullDeref()  safe
 {
      fgetc(null);
 }
 
 void main(string[] args)
 {
      mmapNull();
      nullDeref();
 }
 ---
 
 For some fun on Linux, try out
 # echo 0 > /proc/sys/vm/mmap_min_addr
 $ rdmd null.d
The gist of this is that Linux can be configured so that null can be a valid pointer. Right?
In summation, yes. To be technical about it:
- Linux can be configured so that the bottom page of a process' virtual address space is not protected from being mapped to valid memory (by default, `mmap_min_addr` is 4096, i.e. the bottom page can't be mapped)
- C's `NULL` is in pretty much all implementations (not the C spec) defined as the value `0`, which corresponds to the virtual address `0` in a process, i.e. lies in the bottom page of the process' virtual address space
- The null dereference segmentation fault on Linux stems from the fact that the bottom page (which `NULL` maps to) isn't mapped to valid memory
- If you map the bottom page of a process' virtual address space to valid memory, then accessing it doesn't create a segmentation fault
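As a rough illustration of the first point, a program could inspect the setting before relying on null dereferences crashing. The helper names below are made up for this sketch, and the check is only a heuristic: a sufficiently privileged process may still be allowed to map low addresses regardless of the value.

```d
import std.conv : to;
import std.string : strip;

/// Parses the textual content of /proc/sys/vm/mmap_min_addr (hypothetical helper).
ulong parseMinAddr(string content)
{
    return content.strip.to!ulong;
}

/// With a minimum address of 0, an ordinary process may map the page holding address 0.
bool nullPageMappable(ulong minAddr) @safe pure nothrow @nogc
{
    return minAddr == 0;
}

version (linux)
bool currentSystemAllowsNullMapping()
{
    import std.file : readText;

    // Reads the live setting; throws if the file is missing or unreadable.
    return nullPageMappable(parseMinAddr(readText("/proc/sys/vm/mmap_min_addr")));
}
```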
 That seems pretty bad for  safe at large, not only when C 
 functions are involved.
Yes:
- In C land, since dereferencing `NULL` is UB by definition, this is perfectly valid behaviour
- In D land, because we require `null` dereferences to crash, we break @safe with it
Jul 27
prev sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/27/17 9:24 AM, Moritz Maxeiner wrote:
 On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer wrote:
 I think we can correctly assume no fclose implementations exist that 
 do anything but access data pointed at by stream. Which means a 
 segfault on every platform we support.

 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is 
 safely going to segfault.

 Likewise, because D depends on hardware flagging of dereferencing null 
 as a segfault, any platforms that *don't* have that for C also won't 
 have it for D. And then  safe doesn't even work in D code either.

 As we have good support for different prototypes for different 
 platforms, we could potentially unmark those as  trusted in those cases.
--- null.d --- version (linux): import core.stdc.stdio : FILE; import core.sys.linux.sys.mman; extern (C) safe int fgetc(FILE* stream); void mmapNull() { void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0); assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root"); *(cast (char*) null) = 'D'; } void nullDeref() safe { fgetc(null); } void main(string[] args) { mmapNull(); nullDeref(); } --- For some fun on Linux, try out # echo 0 > /proc/sys/vm/mmap_min_addr $ rdmd null.d Consider `mmapNull` being run in some third party shared lib you don't control.
Again, all these hacks are just messing with the assumptions D is making. You don't need C functions to trigger such problems. I'm fine with saying libraries or platforms that do not segfault when accessing zero page are incompatible with safe code. And it's on you not to do this, the compiler will assume the segfault will occur. -Steve
Jul 27
parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Thursday, 27 July 2017 at 13:56:00 UTC, Steven Schveighoffer 
wrote:
 On 7/27/17 9:24 AM, Moritz Maxeiner wrote:
 On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven 
 Schveighoffer wrote:
 I think we can correctly assume no fclose implementations 
 exist that do anything but access data pointed at by stream. 
 Which means a segfault on every platform we support.

 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions 
 that are passed pointers that dereference those pointers, 
 passing null is safely going to segfault.

 Likewise, because D depends on hardware flagging of 
 dereferencing null as a segfault, any platforms that *don't* 
 have that for C also won't have it for D. And then  safe 
 doesn't even work in D code either.

 As we have good support for different prototypes for 
 different platforms, we could potentially unmark those as 
  trusted in those cases.
--- null.d --- version (linux): import core.stdc.stdio : FILE; import core.sys.linux.sys.mman; extern (C) safe int fgetc(FILE* stream); void mmapNull() { void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0); assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root"); *(cast (char*) null) = 'D'; } void nullDeref() safe { fgetc(null); } void main(string[] args) { mmapNull(); nullDeref(); } --- For some fun on Linux, try out # echo 0 > /proc/sys/vm/mmap_min_addr $ rdmd null.d Consider `mmapNull` being run in some third party shared lib you don't control.
Again, all these hacks are just messing with the assumptions D is making.
Which aren't in the official D spec (or at the very least I can't seem to find them there).
 You don't need C functions to trigger such problems.
Sure, but it was relevant to the previous discussion.
 I'm fine with saying libraries or platforms that do not 
 segfault when accessing zero page are incompatible with  safe 
 code.
So we can't have safe in shared libraries on Linux? Because there's no way for the shared lib author to know what programs using it are going to do.
 And it's on you not to do this, the compiler will assume the 
 segfault will occur.
It's not a promise the author of the D code can (always) make. In any case, the trusted and safe spec need to be explicit about the assumptions made.
Jul 27
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/27/17 10:20 AM, Moritz Maxeiner wrote:
 On Thursday, 27 July 2017 at 13:56:00 UTC, Steven Schveighoffer wrote:
 On 7/27/17 9:24 AM, Moritz Maxeiner wrote:
 On Wednesday, 26 July 2017 at 01:09:50 UTC, Steven Schveighoffer wrote:
 I think we can correctly assume no fclose implementations exist that 
 do anything but access data pointed at by stream. Which means a 
 segfault on every platform we support.

 On platforms that may not segfault, you'd be on your own.

 In other words, I think we can assume for any C functions that are 
 passed pointers that dereference those pointers, passing null is 
 safely going to segfault.

 Likewise, because D depends on hardware flagging of dereferencing 
 null as a segfault, any platforms that *don't* have that for C also 
 won't have it for D. And then  safe doesn't even work in D code either.

 As we have good support for different prototypes for different 
 platforms, we could potentially unmark those as  trusted in those 
 cases.
--- null.d --- version (linux): import core.stdc.stdio : FILE; import core.sys.linux.sys.mman; extern (C) safe int fgetc(FILE* stream); void mmapNull() { void* mmapNull = mmap(null, 4096, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED | MAP_POPULATE, -1, 0); assert (mmapNull == null, "Do `echo 0 > /proc/sys/vm/mmap_min_addr` as root"); *(cast (char*) null) = 'D'; } void nullDeref() safe { fgetc(null); } void main(string[] args) { mmapNull(); nullDeref(); } --- For some fun on Linux, try out # echo 0 > /proc/sys/vm/mmap_min_addr $ rdmd null.d Consider `mmapNull` being run in some third party shared lib you don't control.
Again, all these hacks are just messing with the assumptions D is making.
Which aren't in the official D spec (or at the very least I can't seem to find them there).
You are right. I have asked Walter to add such an update. I should pull that out to its own thread, will do.
 You don't need C functions to trigger such problems.
Sure, but it was relevant to the previous discussion.
Right, but what I'm saying is that it's a different argument. We could say "you can't mark fgetc safe", and still have this situation occur.
 I'm fine with saying libraries or platforms that do not segfault when 
 accessing zero page are incompatible with  safe code.
So we can't have safe in shared libraries on Linux? Because there's no way for the shared lib author to know what programs using it are going to do.
You can't guarantee safe on such processes or systems. It has to be assumed by the compiler that your provided code doesn't happen. It's not that we can't have safe because of what someone might do, it's that safe guarantees can only work if you don't do such things. It is nice to be aware of these possibilities, since they could be an effective attack on D safe code.
 And it's on you not to do this, the compiler will assume the segfault 
 will occur.
It's not a promise the author of the D code can (always) make. In any case, the trusted and safe spec need to be explicit about the assumptions made.
I agree. The promise only works as well as the environment. safe is not actually safe if it's based on incorrect assumptions. -Steve
Jul 27
parent Moritz Maxeiner <moritz ucworks.org> writes:
On Thursday, 27 July 2017 at 14:45:03 UTC, Steven Schveighoffer 
wrote:
 On 7/27/17 10:20 AM, Moritz Maxeiner wrote:
 On Thursday, 27 July 2017 at 13:56:00 UTC, Steven 
 Schveighoffer wrote:
 I'm fine with saying libraries or platforms that do not 
 segfault when accessing zero page are incompatible with  safe 
 code.
So we can't have safe in shared libraries on Linux? Because there's no way for the shared lib author to know what programs using it are going to do.
You can't guarantee safe on such processes or systems. It has to be assumed by the compiler that your provided code doesn't happen. It's not that we can't have safe because of what someone might do, it's that safe guarantees can only work if you don't do such things.
Which essentially means that any library written in safe D exposing a C API needs to write in big fat red letters "Don't do this or you break our safety guarantees".
 It is nice to be aware of these possibilities, since they could 
 be an effective attack on D  safe code.
Well, yeah, that's the consequence of safe correctness depending on UB always resulting in a crash.
Jul 27
prev sibling parent Moritz Maxeiner <moritz ucworks.org> writes:
On Wednesday, 26 July 2017 at 00:35:13 UTC, Steven Schveighoffer 
wrote:
 On 7/25/17 5:23 PM, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 20:16:41 UTC, Steven Schveighoffer 
 wrote:
 The behavior is defined. It will crash with a segfault.
In C land that behaviour is a platform (hardware/OS/libc) specific implementation detail (it's what you generally expect to happen, but AFAIK it isn't defined in official ISO/IEC C).
In cases where C does not crash when dereferencing null, then D would not crash when dereferencing null. [...]
OK, my (wrong) assumption was that a D compiler would, on those platforms, be required to inject null checks + crash in order to satisfy the property D programs rely on, namely that null dereferences crash. Since that seems not to be the case: is this documented in the D spec somewhere (I couldn't find it)? If not, imho it should be.
Jul 26
prev sibling parent reply Kagamin <spam here.lot> writes:
On Tuesday, 25 July 2017 at 18:36:35 UTC, Moritz Maxeiner wrote:
 fgetc cannot be  trusted the same way fclose cannot be  trusted.
 If you pass either of them `null` - which constitutes a legal 
  safe context - the behaviour is undefined, which contradicts 
  trusted definition:
 <Trusted functions are guaranteed by the programmer to not 
 exhibit any undefined behavior if called by a safe function.>
There's a less questionable problem with it.
Jul 26
parent Kagamin <spam here.lot> writes:
 There's a less questionable problem with it.
Hint: FILE struct is transparent, look inside it, lots of interesting stuff there.
Jul 29
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/25/2017 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047
That might be a mistake. Is fclose(f); getc(f); defined? -- Andrei
Jul 25
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/25/17 2:14 PM, Andrei Alexandrescu wrote:
 On 07/25/2017 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047
That might be a mistake. Is fclose(f); getc(f); defined? -- Andrei
fclose is not @safe. The charter of @safe (or @trusted in this case) is to assume valid pointers for parameters. -Steve
Jul 25
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 07/25/2017 02:20 PM, Steven Schveighoffer wrote:
 On 7/25/17 2:14 PM, Andrei Alexandrescu wrote:
 On 07/25/2017 12:14 PM, Kagamin wrote:
 While we're at it, check this: 
 https://github.com/dlang/druntime/blob/master/src/core/stdc/stdio.d#L1047 
That might be a mistake. Is fclose(f); getc(f); defined? -- Andrei
fclose is not safe.
Ah, nice. Thanks! -- Andrei
Jul 25
prev sibling parent reply Kagamin <spam here.lot> writes:
On Tuesday, 25 July 2017 at 15:12:30 UTC, Steven Schveighoffer 
wrote:
 I think signalfd can be marked  trusted, as  safe code supports 
 pointing at a single element.
What about functions that take zero terminated strings? Are they ok to read past the end of allocated object?
Jul 25
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 7/25/17 12:46 PM, Kagamin wrote:
 On Tuesday, 25 July 2017 at 15:12:30 UTC, Steven Schveighoffer wrote:
 I think signalfd can be marked  trusted, as  safe code supports 
 pointing at a single element.
What about functions that take zero terminated strings? Are they ok to read past the end of allocated object?
No, a null terminated string is as arbitrary as passing in a length. Unfortunately, it's perfectly safe to call with a string literal. But there is no way to detect that during compile time. So it has to be unsafe. The wrapper would be to use toStringz to make the call. -Steve
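A minimal sketch of such a wrapper, using `strlen` as a stand-in for any C function that takes a zero-terminated string. The name `cStringLength` is made up for illustration. Note that `toStringz` typically returns a GC-allocated, zero-terminated copy, so a wrapper like this cannot be `@nogc`.

```d
import core.stdc.string : strlen;
import std.string : toStringz;

/// Hypothetical wrapper: callable from @safe code with any D character slice.
size_t cStringLength(scope const(char)[] s) @trusted
{
    // toStringz hands the C function zero-terminated data, so strlen cannot
    // read past the end of the memory it was given.
    return strlen(s.toStringz);
}

void useFromSafe() @safe
{
    // A slice with no terminator of its own is still fine to pass.
    char[3] notTerminated = ['a', 'b', 'c'];
    assert(cStringLength(notTerminated[]) == 3);
}
```

The C side still stops at the first embedded zero byte, so the result can be shorter than the D slice's `length`.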
Jul 25
prev sibling parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all.
Since you explicitly state *all* OS functions:

nothrow: Should be OK (only callbacks could violate this and they should be nothrow, anyway).

@trusted: This can only be done for those functions that don't take arguments open to memory corruption. Take a look at POSIX read, it can never be trusted (same as any C function taking a pointer plus the length of the pointed-to data).

@nogc: This can only be done for those functions that are statically known to never call a D callback that's not also @nogc. Take a look at pthread_create or pthread_join, they can never be @nogc, because that would mean threads may never allocate with the GC.
 I keep copying OS function declarations into my code, just so I 
 can add those attributes to them. Otherwise I simply cannot 
 call "signalfd" and "sigemptyset" (to name a couple from my 
 most recent history) from  safe code.
---
auto result = () @trusted { return systemFunction(...); }();
---
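Applied to one of the functions named in the original post, the idiom might look like the sketch below. The function name `sigintOnly` is made up for illustration, and the outer function is deliberately marked only `@safe`, since the other attributes available on the druntime declarations may vary between releases.

```d
version (Posix)
{
    import core.stdc.signal : SIGINT;
    import core.sys.posix.signal : sigset_t, sigemptyset, sigaddset, sigismember;

    /// Builds a signal set containing only SIGINT.
    sigset_t sigintOnly() @safe
    {
        sigset_t set;
        // Only the two libc calls are trusted; the pointer refers to a live
        // local of the correct type, which is all these functions require.
        () @trusted {
            sigemptyset(&set);
            sigaddset(&set, SIGINT);
        }();
        return set;
    }
}
```

Keeping the trusted lambda this small means a reviewer has to audit only the two calls, not the whole function.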
Jul 25
parent reply Shachar Shemesh <shachar weka.io> writes:
On 25/07/17 17:24, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all.
Since you explicitly state *all* OS functions: nothrow: Should be OK (only callbacks could violate this and they should be nothrow, anyway).
Technically, any system call that is a pthreads cancellation point may throw a C++ exception. If we go down that route, however, calling system calls from nothrow becomes completely impossible, which is another way of saying that decorating just about anything with nothrow becomes impossible.
  trusted: This can only be done for those functions that don't take 
 arguments open to memory corruption. Take a look at POSIX read, it can 
 never be trusted (same as any C function taking pointer+length of 
 pointed to).
  nogc: This can only be done for those functions that are statically 
 known to never call a D callback that's not also  nogc. Take a look at 
 pthread_create vor pthread_join, they can never be  nogc, because that 
 would mean threads may never allocate with the GC.
The decoration's situation with callbacks is pretty horrible throughout D. I'm not sure this is the most compelling argument, however. The function passed to pthread_create does not, logically, run in the pthread_create function. As such, I don't think this logic holds. As for pthread_join, I have no idea what you meant by it. Please elaborate why you think it is a problem.
 
 ---
 auto result = ()  trusted { return systemFunction(...) }();
 ---
Care to explain how to adapt that neat trick for "nothrow" and " nogc"? Shachar
Jul 25
parent reply Moritz Maxeiner <moritz ucworks.org> writes:
On Tuesday, 25 July 2017 at 14:39:15 UTC, Shachar Shemesh wrote:
 On 25/07/17 17:24, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh 
 wrote:
 The title really does says it all.
Since you explicitly state *all* OS functions: nothrow: Should be OK (only callbacks could violate this and they should be nothrow, anyway).
Technically, any system call that is a pthreads cancellation point may throw a C++ exception.
Good to know. Then, since D is supposed to be able to catch C++ exceptions (and can on 64-bit Linux [1]), none of those may be attributed as `nothrow`, because C++ exceptions don't derive from `Error`.
 If we go down that route, however, calling system calls from 
 nothrow becomes completely impossible, which is another way of 
 saying that decorating just about anything with nothrow becomes 
 impossible.
No. `nothrow` functions can call throwing ones, as long as they catch any exceptions not derived from Error thrown by them.
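For instance, a `nothrow` function may call a throwing one as long as every `Exception` is handled before it can escape. The example below uses `std.conv.to` as the throwing callee; `parseOrDefault` is a made-up name.

```d
import std.conv : to;

/// Converts text to int, returning `fallback` instead of throwing.
int parseOrDefault(string text, int fallback) nothrow
{
    try
    {
        return text.to!int; // throws ConvException on malformed input
    }
    catch (Exception e)
    {
        // Everything derived from Exception is handled here, so the
        // compiler accepts the nothrow attribute on this function.
        return fallback;
    }
}
```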
  trusted: This can only be done for those functions that don't 
 take arguments open to memory corruption. Take a look at POSIX 
 read, it can never be trusted (same as any C function taking 
 pointer+length of pointed to).
  nogc: This can only be done for those functions that are 
 statically known to never call a D callback that's not also 
  nogc. Take a look at pthread_create vor pthread_join, they 
 can never be  nogc, because that would mean threads may never 
 allocate with the GC.
The decoration's situation with callbacks is pretty horrible throughout D.
Do you mean throughout druntime and phobos?
 I'm not sure this is the most compelling argument, however. The 
 function passed to pthread_create does not, logically, run in 
 the pthread_create function. As such, I don't think this logic 
 holds.
Then the nogc definition would need to be updated from: "or indirectly through functions it may call" to reflect this, because that can be interpreted both ways.
 As for pthread_join, I have no idea what you meant by it. 
 Please elaborate why you think it is a problem.
Possible scenario on single core (no hyperthreading) system:
- thread 1 spawns thread 2
- thread 1 enters @nogc function `foo` and calls `pthread_join` on thread 2 before its own timeslice is over (and thus enters blocked state)
- thread 2 does work allocating via the GC, then terminates
- thread 1 wakes up and leaves @nogc function `foo`

Because @nogc (in contrast to nothrow) is explicitly designed as transitive, logically speaking, `foo` violated its @nogc constraint (it *caused* the GC allocations in thread 2).
 
 ---
 auto result = ()  trusted { return systemFunction(...) }();
 ---
Care to explain how to adapt that neat trick for "nothrow" and " nogc"?
nothrow: assumeWontThrow [2]
@nogc: assumeNoGC [3]

[1] http://forum.dlang.org/thread/n7jorc$1ied$1@digitalmars.com
[2] https://dlang.org/library/std/exception/assume_wont_throw.html
[3] https://p0nce.github.io/d-idioms/#Bypassing-@nogc
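A short sketch of both escape hatches follows. `assumeWontThrow` is the Phobos function from [2]; `assumeNoGC` is not part of Phobos and is written out here along the lines of the idiom in [3]. The function `unannotated` is a hypothetical stand-in for an OS binding that lacks attributes.

```d
import std.exception : assumeWontThrow;
import std.traits : FunctionAttribute, SetFunctionAttributes,
    functionAttributes, functionLinkage, isDelegate, isFunctionPointer;

/// Re-types a function pointer or delegate as @nogc. The caller vouches
/// that the callee really never allocates with the GC.
auto assumeNoGC(T)(T fn) if (isFunctionPointer!T || isDelegate!T)
{
    enum attrs = functionAttributes!T | FunctionAttribute.nogc;
    return cast(SetFunctionAttributes!(T, functionLinkage!T, attrs)) fn;
}

// Hypothetical stand-in for a binding that is neither nothrow nor @nogc.
int unannotated(int x)
{
    return x + 1;
}

int callNothrow(int x) nothrow
{
    // An Exception escaping the expression is turned into an AssertError.
    return assumeWontThrow(unannotated(x));
}

int callNoGC(int x) @nogc
{
    return assumeNoGC(&unannotated)(x);
}
```

The two are kept in separate callers because `assumeWontThrow` allocates its error object on failure and therefore cannot itself be used inside `@nogc` code.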
Jul 25
parent reply Shachar Shemesh <shachar weka.io> writes:
On 25/07/17 18:29, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 14:39:15 UTC, Shachar Shemesh wrote:
 On 25/07/17 17:24, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh wrote:
 The title really does says it all.
Since you explicitly state *all* OS functions: nothrow: Should be OK (only callbacks could violate this and they should be nothrow, anyway).
Technically, any system call that is a pthreads cancellation point may throw a C++ exception.
Good to know, then since D is supposed to be able to catch C++ exceptions (and can on 64bit Linux [1]) none of those may be attributed as `nothrow`, because C++ exceptions don't derive from `Error`
 If we go down that route, however, calling system calls from nothrow 
 becomes completely impossible, which is another way of saying that 
 decorating just about anything with nothrow becomes impossible.
No. `nothrow` functions can call throwing ones, as long as they catch any exceptions not derived from Error thrown by them.
And right there and then you've introduced a serious problem. The purpose of the C++ exception thrown on cancellation point is to terminate the thread. It is designed to be uncatchable. Had that been D, it might derive from Error, or even directly from Throwable. This is C++, however. It is some weirdly named class. I think labeling these "nothrow" is the correct course of action.
 The decoration's situation with callbacks is pretty horrible 
 throughout D.
Do you mean throughout druntime and phobos?
I'm rechecking what I mean. I may have misspoken.
 
 I'm not sure this is the most compelling argument, however. The 
 function passed to pthread_create does not, logically, run in the 
 pthread_create function. As such, I don't think this logic holds.
Then the nogc definition would need to be updated from: "or indirectly through functions it may call" to reflect this, because that can be interpreted both ways.
 As for pthread_join, I have no idea what you meant by it. Please 
 elaborate why you think it is a problem.
Possible scenario on single core (no hyperthreading) system: - thread 1 spawns thread 2 - thread 1 enters nogc function `foo` and calls `pthread_join` on thread 2 before its own timeslice is over (and thus enters blocked state) - thread 2 does work allocating via the GC, then terminates - thread 1 wakes up and leaves nogc function `foo` Because nogc (in contrast to nothrow) is explicitly designed as transitive, logically speaking, `foo` violated its nogc constraint (it *caused* the GC allocations in thread 2).
Following that logic, ANY function that might affect another thread cannot be @nogc. I think this way madness lies. I don't think another thread's actions, even if linked in some weird semantic way to ours, make us accountable for them. If you pass a callback that is not @nogc to pthread_create, then your other thread might allocate. This doesn't change the fact that pthread_create doesn't allocate. At Weka, we use this understanding of the semantics all the time. Our main thread is as @nogc as we can possibly make it. Whenever we need anything that violates our usual restrictions, we send it either to other threads or other processes for execution, and use the results when they return. Defining the various attributes too strictly will simply mean we cannot use them anywhere (which is pretty much what happens today, and the very thing I'm trying to change here). Shachar
Jul 25
parent Moritz Maxeiner <moritz ucworks.org> writes:
On Wednesday, 26 July 2017 at 06:44:51 UTC, Shachar Shemesh wrote:
 On 25/07/17 18:29, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 14:39:15 UTC, Shachar Shemesh 
 wrote:
 On 25/07/17 17:24, Moritz Maxeiner wrote:
 On Tuesday, 25 July 2017 at 13:50:16 UTC, Shachar Shemesh 
 wrote:
 The title really does says it all.
Since you explicitly state *all* OS functions: nothrow: Should be OK (only callbacks could violate this and they should be nothrow, anyway).
Technically, any system call that is a pthreads cancellation point may throw a C++ exception.
Good to know, then since D is supposed to be able to catch C++ exceptions (and can on 64bit Linux [1]) none of those may be attributed as `nothrow`, because C++ exceptions don't derive from `Error`
 If we go down that route, however, calling system calls from 
 nothrow becomes completely impossible, which is another way 
 of saying that decorating just about anything with nothrow 
 becomes impossible.
No. `nothrow` functions can call throwing ones, as long as they catch any exceptions not derived from Error thrown by them.
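A minimal sketch of the rule stated above, not code from this thread: a `nothrow` function may call a throwing one as long as it catches everything derived from `Exception`. The name `parseOrZero` and the choice of `std.conv.to` as the throwing callee are assumptions made purely for illustration.

```d
import std.conv : to;

// Hypothetical helper: `to!int` is not nothrow (it throws on bad input),
// yet this function may be marked nothrow because every Exception is caught here.
// Errors are not caught; nothrow does not require that.
int parseOrZero(string s) nothrow
{
    try
    {
        return to!int(s);
    }
    catch (Exception e)
    {
        return 0; // fall back to a default instead of propagating
    }
}

void main()
{
    assert(parseOrZero("42") == 42);
    assert(parseOrZero("not a number") == 0);
}
```

Removing the try/catch makes the compiler reject the `nothrow` attribute, which is the check being discussed here. It says nothing about exceptions that are designed to be uncatchable.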
And right there and then you've introduced a serious problem. The purpose of the C++ exception thrown at a cancellation point is to terminate the thread. It is designed to be uncatchable.
The issue lies with the definition of `nothrow` considering only D Throwables; it would have to be updated to apply to C++ exceptions that equate to D exceptions derived from Error.
 I think labeling these "nothrow" is the correct course of 
 action.
Not with the `nothrow` spec as it is right now. Once the spec has been updated to cover C++ exceptions that may not be caught, sure.
 I'm not sure this is the most compelling argument, however. 
 The function passed to pthread_create does not, logically, 
 run in the pthread_create function. As such, I don't think 
 this logic holds.
Then the wording of the nogc definition, "or indirectly through functions it may call", would need to be updated to reflect this, because it can be interpreted both ways.
 As for pthread_join, I have no idea what you meant by it. 
 Please elaborate why you think it is a problem.
Possible scenario on single core (no hyperthreading) system:
- thread 1 spawns thread 2
- thread 1 enters nogc function `foo` and calls `pthread_join` on thread 2 before its own timeslice is over (and thus enters blocked state)
- thread 2 does work allocating via the GC, then terminates
- thread 1 wakes up and leaves nogc function `foo`
Because nogc (in contrast to nothrow) is explicitly designed as transitive, logically speaking, `foo` violated its nogc constraint (it *caused* the GC allocations in thread 2).
Following that logic, ANY function that might affect another thread cannot be nogc.
Not any function; as I interpret the spec, only those that manually interleave another thread allocating via the GC such that, to a caller, it looks as if they had allocated using the GC themselves.
 I think this way madness lies. I don't think other threads' 
 actions, even if linked in some weird semantic way to ours, make 
 us accountable for them.
I would say it depends on the exact semantics of each use case whether we are accountable.
 If you pass a callback that is not @nogc to pthread_create, 
 then your other thread might allocate. This doesn't change the 
 fact that pthread_create doesn't allocate.
The "indirectly through functions it may call" wording of the nogc spec is ambiguous here because it doesn't actually require a direct call chain to the allocation. It would need to be updated.
 At Weka, we use this understanding of the semantics all the 
 time. Our main thread is as @nogc as we can possibly make it. 
 Whenever we need anything that violates our usual restrictions, 
 we send it either to other threads or other processes for 
 execution, and use the results when they return. Defining the 
 various attributes too strictly will simply mean we cannot use 
 them anywhere (which is pretty much what happens today, but the 
 very thing I'm trying to change here).
There is a difference between what's sensible and what the current wording of the spec allows for. Before it's OK to attribute functions where the ambiguity applies, the spec wording (for both nogc and nothrow) has to be made unambiguous.
P.S.: In case it's not clear: I'm playing devil's advocate in this subthread.
Jul 26