
digitalmars.D - safety model in D

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
SafeD is, unfortunately, not finished at the moment. I want to leave in 
place a stub that won't lock our options. Here's what we currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides by whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default it 
is "system", and can be overridden with "-safe".

Sketch of the safe rules:

\begin{itemize*}
\item No cast from a pointer type to an integral type and vice versa
\item No cast between unrelated pointer types
\item Bounds checks on all array accesses
\item No unions that include a reference type (array, class,
   pointer, or struct including such a type)
\item No pointer arithmetic
\item No escape of a pointer or reference to a local variable outside
   its scope
\item Cross-module function calls must only go to other safe modules
\end{itemize*}
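
To make the rules concrete, here is a sketch of code the checks would reject (the module(safe) syntax is the stub above, and the names are illustrative):

```d
module(safe) susie;  // hypothetical syntax from this proposal

union U { int[] a; int b; }  // rejected: union including a reference type

int* escape()
{
    int local = 0;
    return &local;  // rejected: pointer to a local escapes its scope
}

void examples(int* p, int[] arr, size_t i)
{
    auto n = cast(size_t) p;   // rejected: pointer-to-integral cast
    auto d = cast(double*) p;  // rejected: cast between unrelated pointer types
    ++p;                       // rejected: pointer arithmetic
    auto x = arr[i];           // allowed, but bounds-checked at run time
}
```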

So these are my thoughts so far. There is one problem though related to 
the last \item - there's no way for a module to specify "trusted", 
meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me 
no problem". Many modules in std fit that mold.

How can we address that? Again, I'm looking for a simple, robust, 
extensible design that doesn't lock our options.


Thanks,

Andrei
Nov 03 2009
next sibling parent reply "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:hcqb44$1nc9$1 digitalmars.com...
 SafeD is, unfortunately, not finished at the moment. I want to leave in 
 place a stub that won't lock our options. Here's what we currently have:

 module(system) calvin;

 This means calvin can do unsafe things.

 module(safe) susie;

 This means susie commits to extra checks and therefore only a subset of D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.

 The default safety setting is up to the compiler. In dmd by default it is 
 "system", and can be overridden with "-safe".

 Sketch of the safe rules:

 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa
 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
   pointer, or  struct  including such a type)
 \item No pointer arithmetic
 \item No escape of a pointer  or reference to a local variable outside
   its scope
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}

 So these are my thoughts so far. There is one problem though related to 
 the last \item - there's no way for a module to specify "trusted", 
 meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me no 
 problem". Many modules in std fit that mold.

 How can we address that? Again, I'm looking for a simple, robust, 
 extensible design that doesn't lock our options.

module(system, trusted) calvin; ?
Nov 03 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Nick Sabalausky wrote:
 module(system, trusted) calvin;
 ?

Yah, I was thinking of something along those lines. What I don't like is that trust is taken, not granted. But then a model with granted trust would be more difficult to define.

Andrei
Nov 03 2009
parent reply "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:hcqc3m$1pgl$2 digitalmars.com...
 Nick Sabalausky wrote:
 module(system, trusted) calvin;
 ?

Yah, I was thinking of something along those lines. What I don't like is that trust is taken, not granted. But then a model with granted trust would be more difficult to define. Andrei

import(trust) someSystemModule; ? I get the feeling there's more to this issue than what I'm seeing...
Nov 03 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Nick Sabalausky wrote:
 "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
 news:hcqc3m$1pgl$2 digitalmars.com...
 Nick Sabalausky wrote:
 module(system, trusted) calvin;
 ?

that trust is taken, not granted. But then a model with granted trust would be more difficult to define. Andrei

import(trust) someSystemModule; ? I get the feeling there's more to this issue than what I'm seeing...

There's a lot more, but there are a few useful subspaces. One is, if an entire application only uses module(safe), that means there is no memory error in that application, ever.

Andrei
Nov 03 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jesse Phillips wrote:
 On Tue, 03 Nov 2009 17:55:15 -0600, Andrei Alexandrescu wrote:
 
 There's a lot more, but there are a few useful subspaces. One is, if an
 entire application only uses module(safe) that means there is no memory
 error in that application, ever.

 Andrei

Does that mean that a module that uses a "trusted" module must also be marked as "trusted?" I would see this as pointless since system modules are likely to be used in safe code a lot.

Same here.
 I think the only real option is to have the importer decide if it is 
 trusted.

That can't work. I can't say that stdc.stdlib is trusted no matter how hard I try. I mean free is there!
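
A sketch of the problem (illustrative code; assumes a client that somehow obtained a pointer):

```d
import core.stdc.stdlib : free, malloc;

void oops()
{
    int* p = cast(int*) malloc(int.sizeof); // the cast alone is unsafe, but even granting it:
    free(p);
    *p = 1;  // use after free: memory corruption no importer-side "trust" can prevent
}
```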
 I don't see a reasonable way to have third party certification. 
 It is between the library writer and application developer. Since the 
 library writer's goal should be to have a system module that is safe, he 
 would likely want to mark it as trusted. This would leave "system" unused 
 because everyone wants to be safe.

Certain modules definitely can't aspire to be trusted. But for example std.stdio can claim to be trusted because, in spite of using untrusted stuff like FILE* and fclose, they are encapsulated in a way that makes it impossible for a safe client to engender memory errors.
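
A sketch of that kind of encapsulation, using the hypothetical module-level trusted marker from this thread (the names are illustrative):

```d
module(trusted) stdio_sketch;  // unsafe inside, but callable from safe modules

import core.stdc.stdio : FILE, fclose, fopen, fputc;
import std.string : toStringz;

struct OutFile
{
    private FILE* fp;  // the unsafe innards are never exposed to the client

    static OutFile open(string name)
    {
        OutFile f;
        f.fp = fopen(name.toStringz, "w");
        return f;
    }

    void put(char c) { if (fp) fputc(c, fp); }
    void close() { if (fp) { fclose(fp); fp = null; } }
}
```

Because the client never sees the FILE*, a safe caller cannot engender a memory error through this interface.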
 In conclusion, here is a chunk of possible import options. I vote for the 
 top two.
 
 import(system) std.stdio;
 system import std.stdio;
 trusted import std.stdio;
 import(trusted) std.stdio;
 import("This is a system module and I know that it is potentially unsafe, 
 but I still want to use it in my safe code") std.stdio;

Specifying a clause with import crossed my mind too; it's definitely something to keep in mind.

Andrei
Nov 03 2009
next sibling parent reply "Aelxx" <aelxx yandex.ru> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:hcr2hb$dvm$1 digitalmars.com...
 Jesse Phillips wrote:
 On Tue, 03 Nov 2009 17:55:15 -0600, Andrei Alexandrescu wrote:

 There's a lot more, but there are a few useful subspaces. One is, if an
 entire application only uses module(safe) that means there is no memory
 error in that application, ever.

 Andrei

Does that mean that a module that uses a "trusted" module must also be marked as "trusted?" I would see this as pointless since system modules are likely to be used in safe code a lot.

Same here.
 I think the only real option is to have the importer decide if it is 
 trusted.

That can't work. I can't say that stdc.stdlib is trusted no matter how hard I try. I mean free is there!
 I don't see a reasonable way to have third party certification. It is 
 between the library writer and application developer. Since the library 
 writer's goal should be to have a system module that is safe, he would 
 likely want to mark it as trusted. This would leave "system" unused 
 because everyone wants to be safe.

Certain modules definitely can't aspire to be trusted. But for example std.stdio can claim to be trusted because, in spite of using untrusted stuff like FILE* and fclose, they are encapsulated in a way that makes it impossible for a safe client to engender memory errors.
 In conclusion, here is a chunk of possible import options. I vote for the 
 top two.

 import(system) std.stdio;
 system import std.stdio;
 trusted import std.stdio;
 import(trusted) std.stdio;
 import("This is a system module and I know that it is potentially unsafe, 
 but I still want to use it in my safe code") std.stdio;

Specifying a clause with import crossed my mind too, it's definitely something to keep in mind. Andrei

system module foo ;
... (code)
trusted module foo2 ;
... (code)
safe module bar ;
... (code)

import foo, foo2, bar ;
// status defined automatically from module declaration.
// error: used system module 'foo' in safe application.
Nov 04 2009
parent Jason House <jason.james.house gmail.com> writes:
Aelxx Wrote:

 
 "Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
 news:hcr2hb$dvm$1 digitalmars.com...
 Jesse Phillips wrote:
 On Tue, 03 Nov 2009 17:55:15 -0600, Andrei Alexandrescu wrote:

 There's a lot more, but there are a few useful subspaces. One is, if an
 entire application only uses module(safe) that means there is no memory
 error in that application, ever.

 Andrei

Does that mean that a module that uses a "trusted" module must also be marked as "trusted?" I would see this as pointless since system modules are likely to be used in safe code a lot.

Same here.
 I think the only real option is to have the importer decide if it is 
 trusted.

That can't work. I can't say that stdc.stdlib is trusted no matter how hard I try. I mean free is there!
 I don't see a reasonable way to have third party certification. It is 
 between the library writer and application developer. Since the library 
 writer's goal should be to have a system module that is safe, he would 
 likely want to mark it as trusted. This would leave "system" unused 
 because everyone wants to be safe.

Certain modules definitely can't aspire to be trusted. But for example std.stdio can claim to be trusted because, in spite of using untrusted stuff like FILE* and fclose, they are encapsulated in a way that makes it impossible for a safe client to engender memory errors.
 In conclusion, here is a chunk of possible import options. I vote for the 
 top two.

 import(system) std.stdio;
 system import std.stdio;
 trusted import std.stdio;
 import(trusted) std.stdio;
 import("This is a system module and I know that it is potentially unsafe, 
 but I still want to use it in my safe code") std.stdio;

Specifying a clause with import crossed my mind too, it's definitely something to keep in mind. Andrei

system module foo ;
... (code)
trusted module foo2 ;
... (code)
safe module bar ;
... (code)

import foo, foo2, bar ;
// status defined automatically from module declaration.
// error: used system module 'foo' in safe application.

What stops an irritated programmer from marking every one of his modules as trusted? An even worse scenario would be if they created a safe facade module that imports all their pseudo-trusted code. As described so far, trust isn't transitive/viral.
Nov 04 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jesse Phillips wrote:
 On Tue, 03 Nov 2009 23:13:14 -0600, Andrei Alexandrescu wrote:
 
 I think the only real option is to have the importer decide if it is
 trusted.

hard I try. I mean free is there!

I would like to disagree here.

void free(void *ptr);

free() takes a pointer. There is no way for the coder to get a pointer in SafeD; the compiler won't let them, so the function is unusable by a "safe" module even if the function is imported.

Pointers should be available to SafeD, just not certain operations with them.

Andrei
Nov 04 2009
parent reply Jesse Phillips <jessekphillips+D gamil.com> writes:
Andrei Alexandrescu Wrote:

 Jesse Phillips wrote:
 On Tue, 03 Nov 2009 23:13:14 -0600, Andrei Alexandrescu wrote:
 
 I think the only real option is to have the importer decide if it is
 trusted.

hard I try. I mean free is there!

I would like to disagree here. void free(void *ptr); free() takes a pointer. There is no way for the coder to get a pointer in SafeD, compiler won't let them, so the function is unusable by a "safe" module even if the function is imported.

Pointers should be available to SafeD, just not certain operations with them. Andrei

I must have been confused by the statement: "As long as these pointers are not exposed to the client, such an implementation might be certified to be SafeD compatible." (found in the article on SafeD). I realize things may change; it just sounded like pointers were never an option.
Nov 04 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jesse Phillips wrote:
 Andrei Alexandrescu Wrote:
 
 Jesse Phillips wrote:
 On Tue, 03 Nov 2009 23:13:14 -0600, Andrei Alexandrescu wrote:

 I think the only real option is to have the importer decide if it is
 trusted.

hard I try. I mean free is there!

void free(void *ptr); free() takes a pointer. There is no way for the coder to get a pointer in SafeD, compiler won't let them, so the function is unusable by a "safe" module even if the function is imported.

them. Andrei

I must have been confused by the statement: "As long as these pointers are not exposed to the client, such an implementation might be certified to be SafeD compatible1 ." Found on the article for SafeD. I realize things may change, just sounded like pointers were not ever an option.

Yes, sorry for not mentioning that. It was Walter's idea to allow restricted use of pointers in SafeD. Initially we were thinking of banning pointers altogether.

Andrei
Nov 04 2009
prev sibling next sibling parent Jesse Phillips <jessekphillips gmail.com> writes:
On Tue, 03 Nov 2009 17:55:15 -0600, Andrei Alexandrescu wrote:

 There's a lot more, but there are a few useful subspaces. One is, if an
 entire application only uses module(safe) that means there is no memory
 error in that application, ever.
 
 Andrei

Does that mean that a module that uses a "trusted" module must also be marked as "trusted"? I would see this as pointless since system modules are likely to be used in safe code a lot.

I think the only real option is to have the importer decide if it is trusted. I don't see a reasonable way to have third party certification. It is between the library writer and application developer. Since the library writer's goal should be to have a system module that is safe, he would likely want to mark it as trusted. This would leave "system" unused because everyone wants to be safe.

In conclusion, here is a chunk of possible import options. I vote for the top two.

import(system) std.stdio;
system import std.stdio;
trusted import std.stdio;
import(trusted) std.stdio;
import("This is a system module and I know that it is potentially unsafe, but I still want to use it in my safe code") std.stdio;
Nov 03 2009
prev sibling parent Jesse Phillips <jessekphillips gmail.com> writes:
On Tue, 03 Nov 2009 23:13:14 -0600, Andrei Alexandrescu wrote:

 I think the only real option is to have the importer decide if it is
 trusted.

That can't work. I can't say that stdc.stdlib is trusted no matter how hard I try. I mean free is there!

I would like to disagree here.

void free(void *ptr);

free() takes a pointer. There is no way for the coder to get a pointer in SafeD; the compiler won't let them, so the function is unusable by a "safe" module even if the function is imported.
Nov 04 2009
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 SafeD is, unfortunately, not finished at the moment. I want to leave in
 place a stub that won't lock our options. Here's what we currently have:
 module(system) calvin;
 This means calvin can do unsafe things.
 module(safe) susie;
 This means susie commits to extra checks and therefore only a subset of D.
 module hobbes;
 This means hobbes abides to whatever the default safety setting is.
 The default safety setting is up to the compiler. In dmd by default it
 is "system", and can be overridden with "-safe".
 Sketch of the safe rules:
 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa
 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
    pointer, or  struct  including such a type)
 \item No pointer arithmetic
 \item No escape of a pointer  or reference to a local variable outside
    its scope
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}
 So these are my thoughts so far. There is one problem though related to
 the last \item - there's no way for a module to specify "trusted",
 meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me
 no problem". Many modules in std fit that mold.
 How can we address that? Again, I'm looking for a simple, robust,
 extensible design that doesn't lock our options.
 Thanks,
 Andrei

One comment that came up in discussing GC stuff in Bugzilla: How do you prevent the following in SafeD?

auto arrayOfRefs = new SomeClass[100];
GC.setAttr(arrayOfRefs.ptr, GC.BlkAttr.NO_SCAN);
foreach(ref elem; arrayOfRefs) {
    elem = new SomeClass();
}
Nov 03 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 SafeD is, unfortunately, not finished at the moment. I want to leave in
 place a stub that won't lock our options. Here's what we currently have:
 module(system) calvin;
 This means calvin can do unsafe things.
 module(safe) susie;
 This means susie commits to extra checks and therefore only a subset of D.
 module hobbes;
 This means hobbes abides to whatever the default safety setting is.
 The default safety setting is up to the compiler. In dmd by default it
 is "system", and can be overridden with "-safe".
 Sketch of the safe rules:
 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa
 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
    pointer, or  struct  including such a type)
 \item No pointer arithmetic
 \item No escape of a pointer  or reference to a local variable outside
    its scope
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}
 So these are my thoughts so far. There is one problem though related to
 the last \item - there's no way for a module to specify "trusted",
 meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me
 no problem". Many modules in std fit that mold.
 How can we address that? Again, I'm looking for a simple, robust,
 extensible design that doesn't lock our options.
 Thanks,
 Andrei

One comment that came up in discussing GC stuff in Bugzilla: How do you prevent the following in SafeD?

auto arrayOfRefs = new SomeClass[100];
GC.setAttr(arrayOfRefs.ptr, GC.BlkAttr.NO_SCAN);
foreach(ref elem; arrayOfRefs) {
    elem = new SomeClass();
}

Is GC.setAttr a safe function?

Andrei
Nov 03 2009
prev sibling next sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, on November 3 at 16:33, wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave
 in place a stub that won't lock our options. Here's what we
 currently have:
 
 module(system) calvin;
 
 This means calvin can do unsafe things.
 
 module(safe) susie;
 
 This means susie commits to extra checks and therefore only a subset of D.
 
 module hobbes;
 
 This means hobbes abides to whatever the default safety setting is.
 
 The default safety setting is up to the compiler. In dmd by default
 it is "system", and can be overridden with "-safe".

What's the rationale for letting the compiler decide? I can't see anything but trouble with this. A module will typically be written to be safe or system; I think the default should be defined (I'm not sure what the default should be, though).

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Did you know the originally a Danish guy invented the burglar-alarm
unfortunately, it got stolen
Nov 03 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 Andrei Alexandrescu, on November 3 at 16:33, wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave
 in place a stub that won't lock our options. Here's what we
 currently have:

 module(system) calvin;

 This means calvin can do unsafe things.

 module(safe) susie;

 This means susie commits to extra checks and therefore only a subset of D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.

 The default safety setting is up to the compiler. In dmd by default
 it is "system", and can be overridden with "-safe".

What's the rationale for letting the compiler decide? I can't see nothing but trouble about this. A module will tipically be writen to be safe or system, I think the default should be defined (I'm not sure what the default should be though).

The parenthesis pretty much destroys your point :o). I don't think letting the implementation decide is a faulty model. If you know what you want, you say it. Otherwise it means you don't care.

Andrei
Nov 03 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 Andrei Alexandrescu, on November 3 at 17:54, wrote:
 Leandro Lucarella wrote:
  Andrei Alexandrescu, on November 3 at 16:33, wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave
 in place a stub that won't lock our options. Here's what we
 currently have:

 module(system) calvin;

 This means calvin can do unsafe things.

 module(safe) susie;

 This means susie commits to extra checks and therefore only a subset of D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.

 The default safety setting is up to the compiler. In dmd by default
 it is "system", and can be overridden with "-safe".

but trouble about this. A module will tipically be writen to be safe or system, I think the default should be defined (I'm not sure what the default should be though).


I guess this is a joke, but I have to ask: why? I'm not sure about plenty of stuff; that doesn't mean it's pointless.

Oh, I see what you mean. The problem is that many are as unsure as you are about what the default should be. If too many are unsure, maybe the decision should be left as a choice.
 I don't think letting the implementation decide is a faulty model.
 If you know what you want, you say it. Otherwise it means you don't
 care.

I can't understand how you can't care. Maybe I'm misunderstanding the proposal, since nobody else seems to see a problem here.

It's not a proposal as much as a discussion opener, but my suggestion is that if you just say module without any qualification, you leave it to the person compiling to choose the safety level.

Andrei
Nov 04 2009
prev sibling next sibling parent reply Bill Baxter <wbaxter gmail.com> writes:
On Tue, Nov 3, 2009 at 2:33 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave in
 place a stub that won't lock our options. Here's what we currently have:

 module(system) calvin;

 This means calvin can do unsafe things.

 module(safe) susie;

 This means susie commits to extra checks and therefore only a subset of D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.

 The default safety setting is up to the compiler. In dmd by default it is
 "system", and can be overridden with "-safe".

 Sketch of the safe rules:

 \begin{itemize*}
 \item No cast from a pointer type to an integral type and vice versa
 \item No cast between unrelated pointer types
 \item Bounds checks on all array accesses
 \item No unions that include a reference type (array, class,
   pointer, or struct including such a type)
 \item No pointer arithmetic
 \item No escape of a pointer or reference to a local variable outside
   its scope
 \item Cross-module function calls must only go to other safe modules
 \end{itemize*}

 So these are my thoughts so far. There is one problem though related to the
 last \item - there's no way for a module to specify "trusted", meaning:
 "Yeah, I do unsafe stuff inside, but safe modules can call me no problem".
 Many modules in std fit that mold.

 How can we address that? Again, I'm looking for a simple, robust, extensible
 design that doesn't lock our options.

I have to say that I would be seriously annoyed to see repeated references to a feature that turns out to be vaporware. (I'm guessing there will be repeated references to SafeD based on the Chapter 4 sample, and I'm guessing it will be vaporware based on the question you're asking above.)

I'd say leave SafeD for the 2nd edition, and just comment that work is underway in a "Future of D" chapter near the end of the book. And of course add a "Look to <the publishers website || digitalmars.com> for the latest!"

Even if not vaporware, it looks like whatever you write is going to be about something completely untested in the wild, and so has a high chance of turning out to need re-designing in the face of actual use.

--bb
Nov 03 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
 On Tue, Nov 3, 2009 at 2:33 PM, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave in
 place a stub that won't lock our options. Here's what we currently have:

 module(system) calvin;

 This means calvin can do unsafe things.

 module(safe) susie;

 This means susie commits to extra checks and therefore only a subset of D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.

 The default safety setting is up to the compiler. In dmd by default it is
 "system", and can be overridden with "-safe".

 Sketch of the safe rules:

 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa
 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
  pointer, or  struct  including such a type)
 \item No pointer arithmetic
 \item No escape of a pointer  or reference to a local variable outside
  its scope
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}

 So these are my thoughts so far. There is one problem though related to the
 last \item - there's no way for a module to specify "trusted", meaning:
 "Yeah, I do unsafe stuff inside, but safe modules can call me no problem".
 Many modules in std fit that mold.

 How can we address that? Again, I'm looking for a simple, robust, extensible
 design that doesn't lock our options.

I have to say that I would be seriously annoyed to see repeated references to a feature that turns out to be vaporware. (I'm guessing there will be repeated references to SafeD based on the Chapter 4 sample, and I'm guessing it will be vaporware based on the question you're asking above). I'd say leave SafeD for the 2nd edition, and just comment that work is underway in a "Future of D" chapter near the end of the book. And of course add a "Look to <the publishers website || digitalmars.com> for the latest!" Even if not vaporware, it looks like whatever you write is going to be about something completely untested in the wild, and so has a high chance of turning out to need re-designing in the face of actual use. --bb

Ok, I won't use the term SafeD as if it were a product. But -safe is there, some checks are there, and Walter is apparently willing to complete them. It's not difficult to go with an initially conservative approach - e.g., "no taking the address of a local" as he wrote in a recent post - although a more refined approach would still allow taking addresses of locals, as long as they don't escape.

Andrei
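
A sketch of the distinction (illustrative functions): the conservative rule rejects both, while an escape analysis would accept the first:

```d
int readThroughLocal()
{
    int x = 5;
    int* p = &x;  // address of a local that never leaves this scope
    return *p;    // refined rule: fine; conservative rule: rejected
}

int* leakLocal()
{
    int x = 5;
    return &x;    // the address escapes: rejected under either rule
}
```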
Nov 03 2009
prev sibling next sibling parent Bill Baxter <wbaxter gmail.com> writes:
On Tue, Nov 3, 2009 at 3:54 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Leandro Lucarella wrote:
 Andrei Alexandrescu, on November 3 at 16:33, wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave
 in place a stub that won't lock our options. Here's what we
 currently have:

 module(system) calvin;

 This means calvin can do unsafe things.

 module(safe) susie;

 This means susie commits to extra checks and therefore only a subset of
 D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.

 The default safety setting is up to the compiler. In dmd by default
 it is "system", and can be overridden with "-safe".

 What's the rationale for letting the compiler decide? I can't see anything
 but trouble with this. A module will typically be written to be safe or
 system; I think the default should be defined (I'm not sure what the
 default should be, though).

The parenthesis pretty much destroys your point :o). I don't think letting the implementation decide is a faulty model. If you know what you want, you say it. Otherwise it means you don't care.

How can you not care? Either your module uses unsafe features or it doesn't. So it seems if you don't specify, then your module must pass the strictest checks, because otherwise it's not a "don't care" situation -- it's a "system"-only situation.

--bb
Nov 03 2009
prev sibling next sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
 Sketch of the safe rules:
 
 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa

replace integral type with non-pointer type.
 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
   pointer, or  struct  including such a type)

pointers are not a reference type. Replace "reference type" with "pointers or reference types".
 \item No pointer arithmetic

 \item No escape of a pointer  or reference to a local variable outside
   its scope

revise: cannot take the address of a local or a reference.
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}

add:
. no inline assembler
. no casting away of const, immutable, or shared
Nov 03 2009
next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Walter Bright Wrote:

 Andrei Alexandrescu wrote:
 Sketch of the safe rules:
 
 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa

replace integral type with non-pointer type.
 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
   pointer, or  struct  including such a type)

pointers are not a reference type. Replace "reference type" with "pointers or reference types".
 \item No pointer arithmetic

 \item No escape of a pointer  or reference to a local variable outside
   its scope

revise: cannot take the address of a local or a reference.
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}

add:
. no inline assembler
. no casting away of const, immutable, or shared

How does casting away const, immutable, or shared cause memory corruption? If I understand SafeD correctly, that's its only goal. If it does more, I'd also argue that casting to shared or immutable is, in general, unsafe.

I'm also unsure whether SafeD has really fleshed out what would make use of (lock-free) shared variables safe. For example, array concatenation in one thread while reading in another thread could allow reading of garbage memory (e.g. if the length was incremented before writing the cell contents).
Nov 03 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 Walter Bright Wrote:
 
 Andrei Alexandrescu wrote:
 Sketch of the safe rules:
 
 \begin{itemize*} \item No  cast  from a pointer type to an
 integral type and vice versa

 \item No  cast  between unrelated pointer types \item Bounds
 checks on all array accesses \item  No  unions  that  include  a
 reference  type  (array,   class , pointer, or  struct  including
 such a type)

"pointers or reference types".
 \item No pointer arithmetic \item No escape of a pointer  or
 reference to a local variable outside its scope

 \item Cross-module function calls must only go to other  safe 
 modules \end{itemize*}

or shared

How does casting away const, immutable, or shared cause memory corruption?

If you have an immutable string, the compiler may cache or enregister the length and do anything (such as hoisting checks out of loops) in confidence the length will never change. If you do change it -> memory error.
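A minimal sketch (my own, not from the post) of the hazard described here: once the compiler may assume the data never changes, mutating it through a cast leaves cached values and memory silently disagreeing.

```d
// Hypothetical sketch: the compiler is entitled to cache or enregister
// anything reachable through an immutable view, e.g. hoist length checks
// out of loops. Mutating through a cast breaks that assumption.
void hazard()
{
    immutable(int)[] a = [1, 2, 3];
    auto m = cast(int[]) a;  // casting away immutable: banned in safe code
    m[0] = 42;               // undefined behavior: a[0] was promised
                             // to never change
}
```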
 If I understand SafeD correctly, that's its only goal. If it does
 more, I'd also argue casting to shared or immutable is, in general,
 unsafe. I'm also unsure if safeD has really fleshed out what would
 make use of (lockfree) shared variables safe. For example, array
 concatenation in one thread while reading in another thread could
 allow reading of garbage memory (e.g. if the length was incremented
 before writing the cell contents)

Shared arrays can't be modified.

Andrei
Nov 03 2009
parent Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

 Jason House wrote:
 Walter Bright Wrote:
 
 Andrei Alexandrescu wrote:
 Sketch of the safe rules:
 
 \begin{itemize*} \item No  cast  from a pointer type to an
 integral type and vice versa

 \item No  cast  between unrelated pointer types \item Bounds
 checks on all array accesses \item  No  unions  that  include  a
 reference  type  (array,   class , pointer, or  struct  including
 such a type)

"pointers or reference types".
 \item No pointer arithmetic \item No escape of a pointer  or
 reference to a local variable outside its scope

 \item Cross-module function calls must only go to other  safe 
 modules \end{itemize*}

or shared

How does casting away const, immutable, or shared cause memory corruption?

If you have an immutable string, the compiler may cache or enregister the length and do anything (such as hoisting checks out of loops) in confidence the length will never change. If you do change it -> memory error.

These arguments are pretty reversible to show casting to XXX is as unsafe as casting away XXX. Consider code that creates thread-local mutable data, leaks it (e.g. assigns it to a global), and then casts it to immutable or shared before making another call. To me, this is indistinguishable from the unsafe case.
Nov 04 2009
prev sibling parent Walter Bright <newshound1 digitalmars.com> writes:
Jason House wrote:
 How does casting away const, immutable, or shared cause memory
 corruption? If I understand SafeD correctly, that's its only goal. If
 it does more, I'd also argue casting to shared or immutable is, in
 general, unsafe.

They can cause memory corruption because inadvertent "tearing" can occur when the two parts of a memory reference are updated, half from one alias and half from another.
 I'm also unsure if safeD has really fleshed out what
 would make use of (lockfree) shared variables safe. For example,
 array concatenation in one thread while reading in another thread
 could allow reading of garbage memory (e.g. if the length was
 incremented before writing the cell contents)

That kind of out-of-order reading is just what shared is meant to prevent.
Nov 03 2009
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
 Andrei Alexandrescu wrote:
 Sketch of the safe rules:

 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa

Replace integral type with non-pointer type.
 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
   pointer, or  struct  including such a type)

pointers are not a reference type. Replace "reference type" with "pointers or reference types".
 \item No pointer arithmetic

 \item No escape of a pointer  or reference to a local variable outside
   its scope

revise: cannot take the address of a local or a reference.
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}

add: . no inline assembler . no casting away of const, immutable, or shared

Ok, here's what I have now:

\begin{itemize*}
\item No cast from a pointer type to a non-pointer type (e.g.~ int ) and vice versa
\item No cast between unrelated pointer types
\item Bounds checks on all array accesses
\item No unions that include a pointer type, a reference type (array,  class ), or a struct including such a type
\item No pointer arithmetic
\item Taking the address of a local is forbidden (in fact the needed restriction is to not allow such an address to escape, but that is more difficult to track)
\item Cross-module function calls must only go to other safe modules
\item No inline assembler
\item No casting away of  const ,  immutable , or  shared 
\end{itemize*}

Andrei
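As a sketch of what the consolidated rules actually forbid (my own examples, not from the post), none of the following would compile in a module under the proposed safe regime:

```d
// Hypothetical examples, one per rule; each marked line would be
// rejected in a module(safe) module under the rules listed above.
void rejected()
{
    int x;
    int* p = &x;                 // taking the address of a local: rejected
    p = p + 1;                   // pointer arithmetic: rejected
    size_t n = cast(size_t) p;   // cast from pointer to non-pointer: rejected
    float* f = cast(float*) p;   // cast between unrelated pointer types: rejected

    int[3] arr;
    // arr.ptr[5] = 0;           // raw-pointer indexing would bypass the
                                 // bounds checks safe code always gets

    asm { nop; }                 // inline assembler: rejected
}
```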
Nov 04 2009
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 Walter Bright, el  3 de noviembre a las 16:21 me escribiste:
 Andrei Alexandrescu wrote:
 Sketch of the safe rules:

 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa

 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
  pointer, or  struct  including such a type)

"pointers or reference types".

Strictly speaking, arrays are not reference types either, right?

Ok, in order to not create confusion, I changed that. Here's the new list with one added item:

\begin{itemize*}
\item No cast from a pointer type to a non-pointer type (e.g.~ int ) and vice versa
\item No cast between unrelated pointer types
\item Bounds checks on all array accesses
\item No unions that include a pointer type, a class type, an array type, or a struct embedding such a type
\item No pointer arithmetic
\item Taking the address of a local is forbidden (in fact the needed restriction is to not allow such an address to escape, but that is more difficult to track)
\item Cross-module function calls must only go to other safe modules
\item No inline assembler
\item No casting away of  const ,  immutable , or  shared 
\item No calls to unsafe functions
\end{itemize*}

Andrei
Nov 04 2009
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 03 Nov 2009 17:33:39 -0500, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 SafeD is, unfortunately, not finished at the moment. I want to leave in  
 place a stub that won't lock our options. Here's what we currently have:

 module(system) calvin;

 This means calvin can do unsafe things.

 module(safe) susie;

 This means susie commits to extra checks and therefore only a subset of  
 D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.
 ...
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}

 So these are my thoughts so far. There is one problem though related to  
 the last \item - there's no way for a module to specify "trusted",  
 meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me  
 no problem". Many modules in std fit that mold.

My interpretation of the module decorations was:

module(system) calvin;

This means calvin uses unsafe things, but is considered safe for other modules (it overrides the setting of the compiler, so can be compiled even in safe mode).

module(safe) susie;

This means susie commits to extra checks, and will be compiled in safe mode even if the compiler is in unsafe mode. Susie can only import module(safe) or module(system) modules, or if the compiler is in safe mode, any module.

module hobbes;

This means hobbes doesn't care whether he's safe or not. (note the important difference from your description)

My rationale for interpreting module(system) is: why declare a module as system unless you *wanted* it to be compilable in safe mode? I would expect that very few modules are marked as module(system).

And as for the default setting, I think that unsafe is a reasonable default. You can always create a shortcut/script/symlink to the compiler that adds the -safe flag if you wanted a safe-by-default version.

-Steve
Nov 04 2009
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave in 
 place a stub that won't lock our options. Here's what we currently have:
 
 module(system) calvin;
 
 This means calvin can do unsafe things.
 
 module(safe) susie;
 
 This means susie commits to extra checks and therefore only a subset of D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.

Where did susie come from? Only module(system) has been discussed before.

Why the need for THREE types of modules? Distinguishing hobbes and susie seems pointless -- either hobbes is safe, or else it will not compile with the -safe switch (and it won't compile at all, on a compiler which makes safe the default!!). It seems that module(safe) is simply a comment, "yes, I've tested it with the -safe switch, and it does compile". Doesn't add any value that I can see.

As I understood it, the primary purpose of 'SafeD' was to confine the usage of dangerous constructs to a small number of modules. IMHO, the overwhelming majority of modules should not require any marking.
 \item Cross-module function calls must only go to other  safe  modules

 So these are my thoughts so far. There is one problem though related to 
 the last \item - there's no way for a module to specify "trusted", 
 meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me 
 no problem". Many modules in std fit that mold.
 
 How can we address that? Again, I'm looking for a simple, robust, 
 extensible design that doesn't lock our options.

This actually seems pretty similar to public/private. I see three types of modules:

module : the default, should compile in -safe mode.
module(system) : Modules which need to do nasty stuff inside, but for which all the public functions are safe.
module(sysinternal/restricted/...) : Modules which exist only to support system modules. This will include most APIs to C libraries.

Modules in the outer ring need to be prevented from calling ones in the inner ring.
Nov 04 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Andrei Alexandrescu wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave 
 in place a stub that won't lock our options. Here's what we currently 
 have:

 module(system) calvin;

 This means calvin can do unsafe things.

 module(safe) susie;

 This means susie commits to extra checks and therefore only a subset 
 of D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.

Where did susie come from? Only module(system) has been discussed before.

Well actually it's always been at least in the early discussions, I was actually surprised that dmd doesn't yet accept it lexically.
 Why the need for THREE types of modules? Distinguishing hobbes 
 and susie seems pointless -- either hobbes is safe, or else it will not 
 compile with the -safe switch (and it won't compile at all, on a 
 compiler which makes safe the default!!). It seems that module(safe) is 
 simply a comment, "yes, I've tested it with the -safe switch, and it 
 does compile". Doesn't add any value that I can see.

Agreed, module(safe) would be unnecessary if everything was safe unless overridden with "system". This would be a hard sell to Walter, however. It would be a hard sell to you, too - don't forget that safe + no bounds checking = having the cake and eating it.

module(safe) is not a comment. We need three types of modules because of the interaction between what the module declares and what the command line wants.

Let's assume the default, no-flag build allows unsafe code, like right now. Then, module(safe) means that the module volunteers itself for tighter checking, and module(system) is the same as module unadorned. But then if the user compiles with -safe, module(safe) is the same as module unadorned, and module(system) allows for unchecked operations in that particular module.

I was uncomfortable with this, but Walter convinced me that D's charter is not to allow sandbox compilation and execution of malicious code. If you have the sources, you may as well take a look at their module declarations if you have some worry.

Regardless of the result of the debate regarding the default compilation mode, if the change of that default mode is allowed in the command line, then we need both module(safe) and module(system).

On a loosely-related vein, I am starting to think it would be a good idea to refine the module declaration some more. It's a great way to have fine-grained compilation options without heavy command-line options.

module(safe, contracts, debug) mymodule;

This means the module forces safety, contract checks, and debug mode within itself, regardless of the command line.
 As I understood it, the primary purpose of 'SafeD' was to confine the 
 usage of dangerous constructs to a small number of modules. IMHO, the 
 overwhelming majority of modules should not require any marking.

Indeed. I hope so :o).
 \item Cross-module function calls must only go to other  safe  modules

 So these are my thoughts so far. There is one problem though related 
 to the last \item - there's no way for a module to specify "trusted", 
 meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me 
 no problem". Many modules in std fit that mold.

 How can we address that? Again, I'm looking for a simple, robust, 
 extensible design that doesn't lock our options.

 This actually seems pretty similar to public/private. I see three types of modules:

 module : the default, should compile in -safe mode.
 module(system) : Modules which need to do nasty stuff inside, but for which all the public functions are safe.
 module(sysinternal/restricted/...) : Modules which exist only to support system modules. This will include most APIs to C libraries.

 Modules in the outer ring need to be prevented from calling ones in the inner ring.

Well I wouldn't want to go any dirtier than "system", so my "system" would be your "sysinternal". I'd like to milden "system" a bit like in e.g. "trusted", which would be your "system".

Andrei
Nov 04 2009
parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:


 module(safe) is not a comment. We need three types of modules because of 
 the interaction between what the module declares and what the command 
 line wants.
 
 Let's assume the default, no-flag build allows unsafe code, like right 
 now. Then, module(safe) means that the module volunteers itself for 
 tighter checking, and module(system) is same as module unadorned.
 
 But then if the user compiles with -safe, module(safe) is the same as 
 module unadorned, and module(system) allows for unchecked operations in 
 that particular module. I was uncomfortable with this, but Walter 
 convinced me that D's charter is not to allow sandbox compilation and 
 execution of malicious code. If you have the sources, you may as well 
 take a look at their module declarations if you have some worry.
 
 Regardless on the result of the debate regarding the default compilation 
 mode, if the change of that default mode is allowed in the command line, 
 then we need both module(safe) and module(system).

When would it be MANDATORY for a module to be compiled in safe mode?

If module(safe) implies bound-checking *cannot* be turned off for that module, would any standard library modules be module(safe)?
 This actually seems pretty similar to public/private.
 I see three types of modules:

 module  : the default, should compile in -safe mode.
 module(system) : Modules which need to do nasty stuff inside, but for 
 which all the public functions are safe.
 module(sysinternal/restricted/...): Modules which exist only to 
 support system modules. This will include most APIs to C libraries.

 Modules in the outer ring need to be prevented from calling ones in 
 the inner ring.

Well I wouldn't want to go any dirtier than "system", so my "system" would be your "sysinternal". I'd like to milden "system" a bit like in e.g. "trusted", which would be your "system".

Yeah, the names don't matter. The thing is, modules in the inner ring are extremely rare. I'd hope there'd be just a few in druntime, and no public ones at all in Phobos.
Nov 04 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:


 module(safe) is not a comment. We need three types of modules because 
 of the interaction between what the module declares and what the 
 command line wants.

 Let's assume the default, no-flag build allows unsafe code, like right 
 now. Then, module(safe) means that the module volunteers itself for 
 tighter checking, and module(system) is same as module unadorned.

 But then if the user compiles with -safe, module(safe) is the same as 
 module unadorned, and module(system) allows for unchecked operations 
 in that particular module. I was uncomfortable with this, but Walter 
 convinced me that D's charter is not to allow sandbox compilation and 
 execution of malicious code. If you have the sources, you may as well 
 take a look at their module declarations if you have some worry.

 Regardless on the result of the debate regarding the default 
 compilation mode, if the change of that default mode is allowed in the 
 command line, then we need both module(safe) and module(system).

When would it be MANDATORY for a module to be compiled in safe mode?

module(safe) entails safe mode, come hell or high water.
 If module(safe) implies bound-checking *cannot* be turned off for that 
 module, would any standard library modules be module(safe)?

I think most or all of the standard library is trusted. But don't forget that std is a bad example of a typical library or program because std interfaces programs with the OS.
 This actually seems pretty similar to public/private.
 I see three types of modules:

 module  : the default, should compile in -safe mode.
 module(system) : Modules which need to do nasty stuff inside, but for 
 which all the public functions are safe.
 module(sysinternal/restricted/...): Modules which exist only to 
 support system modules. This will include most APIs to C libraries.

 Modules in the outer ring need to be prevented from calling ones in 
 the inner ring.

Well I wouldn't want to go any dirtier than "system", so my "system" would be your "sysinternal". I'd like to milden "system" a bit like in e.g. "trusted", which would be your "system".

Yeah, the names don't matter. The thing is, modules in the inner ring are extremely rare. I'd hope there'd be just a few in druntime, and no public ones at all in Phobos.

That sounds plausible.

Andrei
Nov 04 2009
parent reply Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:


 module(safe) is not a comment. We need three types of modules because 
 of the interaction between what the module declares and what the 
 command line wants.

 Let's assume the default, no-flag build allows unsafe code, like 
 right now. Then, module(safe) means that the module volunteers itself 
 for tighter checking, and module(system) is same as module unadorned.

 But then if the user compiles with -safe, module(safe) is the same as 
 module unadorned, and module(system) allows for unchecked operations 
 in that particular module. I was uncomfortable with this, but Walter 
 convinced me that D's charter is not to allow sandbox compilation and 
 execution of malicious code. If you have the sources, you may as well 
 take a look at their module declarations if you have some worry.

 Regardless on the result of the debate regarding the default 
 compilation mode, if the change of that default mode is allowed in 
 the command line, then we need both module(safe) and module(system).

When would it be MANDATORY for a module to be compiled in safe mode?

module(safe) entails safe mode, come hell or high water.
 If module(safe) implies bound-checking *cannot* be turned off for that 
 module, would any standard library modules be module(safe)?

I think most or all of the standard library is trusted. But don't forget that std is a bad example of a typical library or program because std interfaces programs with the OS.

I think it's not so atypical. Database, graphics, anything which calls a C library will be the same.

For an app, I'd imagine you'd have a policy of either always compiling with -safe, or ignoring it. If you've got a general-purpose library, you have to assume some of your users will be compiling with -safe. So you have to make all your library modules safe, regardless of how they are marked. (Similarly, -w is NOT optional for library developers.)

That doesn't leave very much. I'm not seeing the use case for module(safe).
Nov 04 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:
 Don wrote:
 Andrei Alexandrescu wrote:


 module(safe) is not a comment. We need three types of modules 
 because of the interaction between what the module declares and what 
 the command line wants.

 Let's assume the default, no-flag build allows unsafe code, like 
 right now. Then, module(safe) means that the module volunteers 
 itself for tighter checking, and module(system) is same as module 
 unadorned.

 But then if the user compiles with -safe, module(safe) is the same 
 as module unadorned, and module(system) allows for unchecked 
 operations in that particular module. I was uncomfortable with this, 
 but Walter convinced me that D's charter is not to allow sandbox 
 compilation and execution of malicious code. If you have the 
 sources, you may as well take a look at their module declarations if 
 you have some worry.

 Regardless on the result of the debate regarding the default 
 compilation mode, if the change of that default mode is allowed in 
 the command line, then we need both module(safe) and module(system).

When would it be MANDATORY for a module to be compiled in safe mode?

module(safe) entails safe mode, come hell or high water.
 If module(safe) implies bound-checking *cannot* be turned off for 
 that module, would any standard library modules be module(safe)?

I think most or all of the standard library is trusted. But don't forget that std is a bad example of a typical library or program because std interfaces programs with the OS.

I think it's not so atypical. Database, graphics, anything which calls a C library will be the same.

I still think the standard library is different because it's part of the computing base offered by the language. A clearer example is Java, which has things in its standard library that cannot be done in Java. But I agree there will be other libraries that need to interface with C.
 For an app, I'd imagine you'd have a policy of either always compiling 
 with -safe, or ignoring it.

I'd say for an app you'd have a policy of marking most modules as safe. That makes it irrelevant what compiler switch is used and puts the onus in the right place: the module.
 If you've got a general-purpose library, you have to assume some of your 
 users will be compiling with -safe. So you have to make all your library 
 modules safe, regardless of how they are marked. (Similarly, -w is NOT 
 optional for library developers).

If you've got a general-purpose library, you try what any D codebase should try: make most of your modules safe and as few as possible system.
 That doesn't leave very much.
 I'm not seeing the use case for module(safe).

I think you ascribe to -safe what module(safe) should do. My point is that -safe is inferior, just some low-level means of choosing a default absent other declaration. The "good" way to go about it is to think your design in terms of safe vs. system modules.

Andrei
Nov 04 2009
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-11-03 17:33:39 -0500, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 So these are my thoughts so far. There is one problem though related to 
 the last \item - there's no way for a module to specify "trusted", 
 meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me 
 no problem". Many modules in std fit that mold.
 
 How can we address that? Again, I'm looking for a simple, robust, 
 extensible design that doesn't lock our options.

What you want is to define the safety of the implementation separately from the safety of the interface. A safe module interface means that you can use the module in safe code, while a system interface forbids using the module in safe code.

You could do this with two values in the parenthesis:

module (<interface-safety>, <implementation-safety>) <name>;

module (system, system) name;  // interface: unsafe   impl.: unsafe
module (safe, safe) name;      // interface: safe     impl.: safe
module (safe, system) name;    // interface: safe     impl.: unsafe
module (system, safe) name;    // interface: unsafe   impl.: safe

(The last one is silly, I know.)

Then define a shortcut so you don't have to repeat yourself when the safety of the two is the same:

module (<interface-and-implementation-safety>) <name>;

module (system) name;          // interface: unsafe   impl.: unsafe
module (safe) name;            // interface: safe     impl.: safe

Of note, this also leaves the door open to a more fine-grained security policy in the future. We could add an 'extra-safe' or 'half-safe' mode if we wanted.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Nov 04 2009
next sibling parent reply Michal Minich <michal minich.sk> writes:
Hello Michel,

 module (system) name;         // interface: unsafe   impl.: unsafe
 module (safe) name;           // interface: safe     impl.: safe

I thought that first (unsafe-unsafe) case is currently available just by:

module name;           // interface: unsafe   impl.: unsafe

Separating modules into unsafe-unsafe and safe-safe has no usefulness, as those modules could not interact. Specifically, you need modules that are implemented by unsafe means but provide only a safe interface, so I see it as:

module name;           // interface: unsafe   impl.: unsafe
module (system) name;  // interface: safe     impl.: unsafe
module (safe) name;    // interface: safe     impl.: safe

so you can call system modules (io, network...) from safe code.
Nov 04 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michal Minich wrote:
 Hello Michel,
 
 module (system) name;         // interface: unsafe   impl.: unsafe
 module (safe) name;           // interface: safe     impl.: safe

 I thought that first (unsafe-unsafe) case is currently available just by:

 module name;           // interface: unsafe   impl.: unsafe

 separating modules to unsafe-unsafe and safe-safe has no usefulness - as those modules could not interact, specifically you need modules that are implemented by unsafe means, but provides only safe interface, so I see it as:

 module name;           // interface: unsafe   impl.: unsafe
 module (system) name;  // interface: safe     impl.: unsafe
 module (safe) name;    // interface: safe     impl.: safe

 so you can call system modules (io, network...) from safe code.

That's a pretty clean design. How would it interact with a -safe command-line flag?

Andrei
Nov 04 2009
next sibling parent reply Michal Minich <michal minich.sk> writes:
Hello Andrei,

 Michal Minich wrote:
 
 Hello Michel,
 
 module (system) name;         // interface: unsafe   impl.: unsafe
 module (safe) name;           // interface: safe     impl.: safe
 

 by:

 module name;           // interface: unsafe   impl.: unsafe

 separating modules to unsafe-unsafe and safe-safe has no usefulness - as those modules could not interact, specifically you need modules that are implemented by unsafe means, but provides only safe interface, so I see it as:

 module name;           // interface: unsafe   impl.: unsafe
 module (system) name;  // interface: safe     impl.: unsafe
 module (safe) name;    // interface: safe     impl.: safe

 so you can call system modules (io, network...) from safe code.

 command-line flag?

 Andrei

When compiling with the -safe flag, you are doing it because you need your entire application to be safe*. The safe flag would just affect modules with no safety flag specified, making them (safe):

module name;  -->  module (safe) name;

and then compile. It would not affect system modules, because you already *believe* that the modules are *safe to use* (by using or not using the -safe compiler flag).

*note: you can also partially compile only some modules/packages.
Nov 04 2009
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-11-04 09:29:21 -0500, Michal Minich <michal minich.sk> said:

 Hello Andrei,
 
 Michal Minich wrote:
 
 Hello Michel,
 
 module (system) name;         // interface: unsafe   impl.: unsafe
 module (safe) name;           // interface: safe     impl.: safe
 

 by:

 module name;           // interface: unsafe   impl.: unsafe

 separating modules to unsafe-unsafe and safe-safe has no usefulness - as those modules could not interact, specifically you need modules that are implemented by unsafe means, but provides only safe interface, so I see it as:

 module name;           // interface: unsafe   impl.: unsafe
 module (system) name;  // interface: safe     impl.: unsafe
 module (safe) name;    // interface: safe     impl.: safe

 so you can call system modules (io, network...) from safe code.

 command-line flag?

 Andrei

 When compiling with -safe flag, you are doing it because you need your entire application to be safe*. Safe flag would just affect modules with no safety flag specified - making them (safe):

 module name;  -->  module (safe) name;

 and then compile.

I'm not sure this works so well. Look at this:

module memory;   // unsafe interface - unsafe impl.
extern (C) void* malloc(int);
extern (C) void free(void*);

module (system) my.system;   // safe interface - unsafe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // ok: unsafe impl. allowed

module (safe) my.safe;   // safe interface - safe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // error: malloc, free are unsafe

How is this supposed to work correctly with and without the "-safe" compiler flag? The way you define things, "-safe" would make module memory safe for use while it is not.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
Nov 04 2009
next sibling parent reply Michal Minich <michal minich.sk> writes:
Hello Michel,

 I'm not sure this works so well. Look at this:
 
 module memory;   // unsafe interface - unsafe impl.
 extern (C) void* malloc(int);
 extern (C) void free(void*);
 module (system) my.system;   // safe interface - unsafe impl.
 import memory;
 void test() { auto i = malloc(10); free(i); }   // ok: unsafe impl.
 allowed
 module (safe) my.safe;   // safe interface - safe impl.
 import memory;
 void test() { auto i = malloc(10); free(i); }   // error: malloc,
 free
 are unsafe
 How is this supposed to work correctly with and without the "-safe"
 compiler flag? The way you define things "-safe" would make module
 memory safe for use while it is not.

I'm saying the module memory would not compile when the compiler is called with the -safe switch. The compiler would try to compile each module without a safety specification as if it were *marked* (safe), which will not succeed for module memory in this case.

In this setting, the reasons to have a -safe compiler switch are not so important; it is more of a convenience, more like -forcesafe. You would want to use this flag only when you *need* to make sure your application is safe, usually when you are using other libraries. With this switch you can prevent compilation of an unsafe application in case some other library silently changes a safe module to unsafe in a newer version.
Nov 04 2009
parent reply Don <nospam nospam.com> writes:
Michal Minich wrote:
 Hello Michel,
 
 I'm not sure this works so well. Look at this:

 module memory;   // unsafe interface - unsafe impl.
 extern (C) void* malloc(int);
 extern (C) void free(void*);
 module (system) my.system;   // safe interface - unsafe impl.
 import memory;
 void test() { auto i = malloc(10); free(i); }   // ok: unsafe impl.
 allowed
 module (safe) my.safe;   // safe interface - safe impl.
 import memory;
 void test() { auto i = malloc(10); free(i); }   // error: malloc,
 free
 are unsafe
 How is this supposed to work correctly with and without the "-safe"
 compiler flag? The way you define things "-safe" would make module
 memory safe for use while it is not.

I'm saying the module memory would not compile when compiler is called with -safe switch. the compiler would try to compile each module without safety specification, as if they were *marked* (safe) - which will not succeed for module memory in this case. In this setting, the reasons to have -safe compiler switch are not so important, they are more like convenience, meaning more like -forcesafe. You would want to use this flag only when you *need* to make sure your application is safe, usually when you are using other libraries. By this switch you can prevent compilation of unsafe application in case some other library silently changes safe module to unsafe in newer version.

from safe modules -- eg extern(C) functions. They MUST have unsafe interfaces.
Nov 04 2009
parent Michal Minich <michal minich.sk> writes:
Hello Don,

 Michal Minich wrote:
 
 Hello Michel,
 
 I'm not sure this works so well. Look at this:
 
 module memory;   // unsafe interface - unsafe impl.
 extern (C) void* malloc(int);
 extern (C) void free(void*);
 module (system) my.system;   // safe interface - unsafe impl.
 import memory;
 void test() { auto i = malloc(10); free(i); }   // ok: unsafe impl.
 allowed
 module (safe) my.safe;   // safe interface - safe impl.
 import memory;
 void test() { auto i = malloc(10); free(i); }   // error: malloc,
 free
 are unsafe
 How is this supposed to work correctly with and without the "-safe"
 compiler flag? The way you define things "-safe" would make module
 memory safe for use while it is not.

called with -safe switch. the compiler would try to compile each module without safety specification, as if they were *marked* (safe) - which will not succeed for module memory in this case. In this setting, the reasons to have -safe compiler switch are not so important, they are more like convenience, meaning more like -forcesafe. You would want to use this flag only when you *need* to make sure your application is safe, usually when you are using other libraries. By this switch you can prevent compilation of unsafe application in case some other library silently changes safe module to unsafe in newer version.

from safe modules -- eg extern(C) functions. They MUST have unsafe interfaces.


then they are not (system) modules; they are just modules with no specification.

When not using the -safe switch, you cannot call from (safe) into a module with no safety specification (you can only call (safe) and (system)).

When using the -safe switch, there exists no module without a safety specification: all plain modules will be marked (safe), and (system) modules are unchanged. You will not be able to call extern(C) functions from a (safe) module, because the module in which they are defined will be marked (safe) and will not compile itself.

There is the problem I think you are referring to: (system) modules should not be affected by the -safe flag. The user of a module believes (system) is safe, so the (system) module can call anything anytime. So I would suggest such an update:

when the -safe switch is not used:

module name;            // interface: unsafe   impl.: unsafe
module (system) name;   // interface: safe     impl.: unsafe
module (safe) name;     // interface: safe     impl.: safe

when the -safe switch is used:

module name;            // interface: unsafe   impl.: unsafe  -- when imported from a system module
module name;            // interface: safe     impl.: safe    -- when imported from safe modules
module (system) name;   // interface: safe     impl.: unsafe
module (safe) name;     // interface: safe     impl.: safe

This means that when the -safe switch is used, modules with no specification are marked (safe) only when imported by modules marked (safe). When they are imported from (system) modules, they will not be marked (safe). There is no need for another check when both (safe) and (system) modules import a given module, because the import from (safe) modules is the stronger check, which always fulfils the import from a (system) module. In other words, a (system) module does not need to perform any more checking when the -safe flag is used; it is the same as if it were not used.
Nov 04 2009
prev sibling parent Jesse Phillips <jessekphillips+D gamil.com> writes:
Michel Fortin Wrote:

 How is this supposed to work correctly with and without the "-safe" 
 compiler flag? The way you define things "-safe" would make module 
 memory safe for use while it is not.

"-safe" would cause the compiler to check if the code was safe and error out if it wasn't. Not sure how it would work out for the precompiled libraries.
Nov 04 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michal Minich wrote:
 On Wed, 04 Nov 2009 14:03:42 -0300, Leandro Lucarella wrote:
 
 I think safe should be the default, as it should be the most used flavor
 in user code, right? What about:

 module s;             // interface: safe     impl.: safe 
 module (trusted) t;   // interface: safe     impl.: unsafe
 module (unsafe) u;    // interface: unsafe   impl.: unsafe

 * s can import other safe or trusted modules (no unsafe for s). * t can
 import any kind of module, but he guarantee not to corrupt your
   memory if you use it (that's why s can import it).
 * u can import any kind of modules and makes no guarantees (C bindings
   use this).

 That's a pretty clean design. How would it interact with a -safe
 command-line flag?

should be correctly marked as safe (default), trusted or unsafe and let it compile anyway; add a compiler flag -no-safe (or whatever). But people should never use it, unless you are using some broken library or you are too lazy to mark your modules correctly. Is this too crazy?

I have no problem with safe as default; most of my code is safe. I also like module (trusted) - it really pictures its meaning, better than "system". But I think there is no reason to use a -no-safe compiler flag ... for what reason would one want to force a safer program to compile as less safe :)

Efficiency (e.g. remove array bounds checks).
 As I'm thinking more about it, I don't see any reason to have any 
 compiler flag for safety at all.

That would be a great turn of events!!! Andrei
Nov 04 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michal Minich wrote:
 On Wed, 04 Nov 2009 13:12:54 -0600, Andrei Alexandrescu wrote:
 
 But I think there is no reason no use -no-safe compiler flag ... for
 what reason one would want to force safer program to compile as less
 safer :)

 As I'm thinking more about it, I don't see any reason to have any
 compiler flag for safety at all.

Andrei

Memory safety is a pretty specific thing. If you want it, you want it all, not just some part of it - otherwise you cannot call it memory safety.

I agree and always did.
 The 
 idea of safe module, which under some compiler switch is not safe does 
 not appeal to me.

Absolutely. Notice that if you thought I proposed that, there was a misunderstanding.
 But efficiency is also important, and if you want it, 
 why not move the code subjected to bounds checks to trusted/system module 
 - I hope they are not checked for bounds in release mode. Moving parts of 
 the code to trusted modules is more semantically describing, compared to 
 crude tool of ad-hoc compiler switch.

Well it's not as simple as that. Trusted code is not unchecked code - it's code that may drop redundant checks here and there, leaving the code correct even though the compiler cannot prove it. So no, there's no complete removal of bounds checking. But a trusted module is allowed to replace this:

foreach (i; 0 .. a.length) ++a[i];

with

foreach (i; 0 .. a.length) ++a.ptr[i];

The latter effectively escapes checks because it uses unchecked pointer arithmetic. The code is still correct, but this time it's the human vouching for it, not the compiler.
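As a compilable sketch of those two loops (the array contents and function names here are mine, for illustration):

```d
import std.stdio : writeln;

void incrementChecked(int[] a)
{
    // Every a[i] goes through the array bounds check.
    foreach (i; 0 .. a.length)
        ++a[i];
}

void incrementTrusted(int[] a)
{
    // a.ptr[i] is raw pointer indexing - no bounds check. Correct only
    // because the loop guarantees i < a.length; the human vouches for it.
    foreach (i; 0 .. a.length)
        ++a.ptr[i];
}

void main()
{
    auto a = [1, 2, 3];
    incrementChecked(a);
    incrementTrusted(a);
    writeln(a); // [3, 4, 5]
}
```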
 One thing I'm concerned with, whether there is compiler switch or not, is 
 that module numbers will increase, as you will probably want to split 
 some modules in two, because some part may be safe, and some not. I'm 
 wondering why the safety is not discussed on function level, similarly as 
 pure and nothrow currently exists. I'm not sure this would be good, just 
 wondering. Was this topic already discussed?

This is a relatively new topic, and you pointed out some legitimate kinks. One possibility I discussed with Walter is to have version(safe) vs. version(system) or so. That would allow a module to expose different interfaces depending on the command-line switches. Andrei
Nov 04 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michal Minich wrote:
[snip]
 Therefore I propose to use F safety. 

I think you've made an excellent case. Andrei
Nov 04 2009
prev sibling next sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
 module name;                  // interface: unsafe   impl.: unsafe
 module (system) name;         // interface: safe     impl.: unsafe
 module (safe) name;           // interface: safe     impl.: safe

 so you can call system modules (io, network...) from safe code.

That's a pretty clean design. How would it interact with a -safe command-line flag?

'-safe' turns on runtime safety checks, which can be and should be mostly orthogonal to the module safety level. -- Rainer Deyke - rainerd eldwood.com
Nov 04 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Rainer Deyke wrote:
 Andrei Alexandrescu wrote:
 module name;                  // interface: unsafe   impl.: unsafe
 module (system) name;         // interface: safe     impl.: unsafe
 module (safe) name;           // interface: safe     impl.: safe

 so you can call system modules (io, network...) from safe code.

command-line flag?

'-safe' turns on runtime safety checks, which can be and should be mostly orthogonal to the module safety level.

Runtime vs. compile-time is immaterial. There's one goal - no undefined behavior - that can be achieved through a mix of compile- and run-time checks. My understanding of a good model suggested by this discussion:

module name;           // does whatever, just like now
module(safe) name;     // submits to extra checks
module(system) name;   // encapsulates unsafe stuff in a safe interface

No dedicated compile-time switches. Andrei
Nov 04 2009
next sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
 Rainer Deyke wrote:
 '-safe' turns on runtime safety checks, which can be and should be
 mostly orthogonal to the module safety level.

Runtime vs. compile-time is immaterial.

The price of compile-time checks is that you are restricted to a subset of the language, which may or may not allow you to do what you need to do. The price of runtime checks is runtime performance.

Safety is always good. To me, the question is never if I want safety, but if I can afford it. If I can't afford to pay the price of runtime checks, I may still want the compile-time checks. If I can't afford to pay the price of compile-time checks, I may still want the runtime checks. Thus, to me, the concepts of runtime and compile-time checks are orthogonal.

A module either passes the compile-time checks or it does not. It makes no sense to make the compile-time checks optional for some modules. If the module is written to pass the compile-time checks (i.e. uses the safe subset of the language), then the compile-time checks should always be performed for that module. -- Rainer Deyke - rainerd eldwood.com
Nov 04 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Rainer Deyke wrote:
 Andrei Alexandrescu wrote:
 Rainer Deyke wrote:
 '-safe' turns on runtime safety checks, which can be and should be
 mostly orthogonal to the module safety level.


The price of compile-time checks is that you are restricted to a subset of the language, which may or may not allow you to do what you need to do. The price of runtime checks is runtime performance. Safety is always good. To me, the question is never if I want safety, but if I can afford it. If I can't afford to pay the price of runtime checks, I may still want the compile-time checks. If I can't afford to pay the price of compile-time checks, I may still want the runtime checks. Thus, to me, the concepts of runtime and compile-time checks are orthogonal.

I hear what you're saying, but I am not enthusiastic at all about defining and advertising a half-pregnant state. Such a language is the worst of all worlds - it's frustrating to code in yet gives no guarantee to anyone. I don't see this going anywhere interesting. "Yeah, we have safety, and we also have, you know, half safety - it's like only a lap belt of sorts: inconvenient like crap and doesn't really help in an accident." I wouldn't want to code in such a language.
 A module either passes the compile-time checks or it does not.  It makes
 no sense make the compile-time checks optional for some modules.  If the
 module is written to pass the compile-time checks (i.e. uses the safe
 subset of the language), then the compile-time checks should always be
 performed for that module.

I think that's the current intention indeed. Andrei
Nov 04 2009
parent reply Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
 I hear what you're saying, but I am not enthusiastic at all about
 defining and advertising a half-pregnant state. Such a language is the
 worst of all worlds - it's frustrating to code in yet gives no guarantee
 to anyone. I don't see this going anywhere interesting. "Yeah, we have
 safety, and we also have, you know, half safety - it's like only a lap
 belt of sorts: inconvenient like crap and doesn't really help in an
 accident." I wouldn't want to code in such a language.

Basically you're saying that safety is an all or nothing deal. Not only is this in direct contradiction to the attempts to allow both safe and unsafe modules to coexist in the same program, it is in contradiction with all existing programming languages, every single one of which offers some safety features but not absolute 100% safety.

If you have a formal definition of safety, please post it. Without such a definition, I will use my own informal definition of safety for the rest of this post: "a safety feature is a language feature that reduces programming errors."

First, to demonstrate that all programming languages in existence offer some safety features. With some esoteric exceptions (whitespace, hq9+), all programming languages have a syntax with some level of redundancy. This allows the language implementation to reject some inputs as syntactically incorrect. A redundant syntax is a safety feature. Another example relevant to D: D requires an explicit cast when converting an integer to a pointer. This is another safety feature.

Now to demonstrate that no language offers 100% safety. In the abstract, no language can guarantee that a program matches the programmer's intention. However, let's look at a more specific form of safety: safety from dereferencing dangling pointers. To guarantee this, you would need to guarantee that the compiler never generates faulty code that causes a dangling pointer to be dereferenced. If the program makes any system calls at all, you would also need to guarantee that no bugs in the OS cause a dangling pointer to be dereferenced. Both of these are clearly impossible. No language can offer 100% safety.

Moreover, that safety necessarily reduces convenience is clearly false. This /only/ applies to compile-time checks. Runtime checks are purely an implementation issue. Even C and assembly can be implemented such that all instances of undefined behavior are trapped at runtime. Conversely, the performance penalty of safety applies mostly to runtime checks. If extensive testing with these checks turned on fails to reveal any bugs, it is entirely reasonable to remove these checks for the final release. -- Rainer Deyke - rainerd eldwood.com
Nov 04 2009
parent reply Don <nospam nospam.com> writes:
Rainer Deyke wrote:
 Andrei Alexandrescu wrote:
 I hear what you're saying, but I am not enthusiastic at all about
 defining and advertising a half-pregnant state. Such a language is the
 worst of all worlds - it's frustrating to code in yet gives no guarantee
 to anyone. I don't see this going anywhere interesting. "Yeah, we have
 safety, and we also have, you know, half safety - it's like only a lap
 belt of sorts: inconvenient like crap and doesn't really help in an
 accident." I wouldn't want to code in such a language.

Basically you're saying that safety is an all or nothing deal. Not only is this in direct contradiction to the attempts to allow both safe and unsafe modules to coexist in the same program, it is in contradiction with all existing programming languages, every single one of which offers some safety features but not absolute 100% safety. If you have a formal definition of safety, please post it. Without such a definition, I will use my own informal definition of safety for the rest of this post: "a safety feature is a language feature that reduces programming errors." First, to demonstrate that all programming languages in existence offer some safety features. With some esoteric exceptions (whitespace, hq9+), all programming languages have a syntax with some level of redundancy. This allows the language implementation to reject some inputs as syntactically incorrect. A redundant syntax is a safety feature. Another example relevant to D: D requires an explicit cast when converting an integer to a pointer. This is another safety feature. Now to demonstrate that no language offers 100% safety. In the abstract, no language can guarantee that a program matches the programmer's intention. However, let's look at a more specific form of safety: safety from dereferencing dangling pointers. To guarantee this, you would need to guarantee that the compiler never generates faulty code that causes the a dangling pointer to be dereferenced. If the program makes any system calls at all, you would also need to guarantee that no bugs in the OS cause a dangling pointer to be dereferenced. Both of these are clearly impossible. No language can offer 100% safety. Moreover, that safety necessarily reduces convenience is clearly false. This /only/ applies to compile-time checks. Runtime checks are purely an implementation issue. Even C and assembly can be implemented such that all instances of undefined behavior are trapped at runtime. 
Conversely, the performance penalty of safety applies mostly to runtime checks. If extensive testing with these checks turned on fails to reveal any bugs, it is entirely reasonable to remove these checks for the final release.

I'm in complete agreement with you, Rainer.

What I got from Bartosz' original post was that a large class of bugs could be eliminated fairly painlessly via some compile-time checks. It seemed to be based on pragmatic concerns. I applauded it. (I may have misread it, of course.) Now, things seem to have left pragmatism and got into ideology.

Trying to eradicate _all_ possible memory corruption bugs is extremely difficult in a language like D. I'm not at all convinced that it is realistic (it ends up too painful to use). It'd be far more reasonable if we had non-nullable pointers, for example.

The ideology really scares me, because 'memory safety' covers just one class of bug. What everyone wants is to drive the _total_ bug count down, and we can improve that dramatically with basic compile-time checks. But demanding 100% memory safety has a horrible cost-benefit tradeoff. It seems like a major undertaking. And I doubt it would convince anyone, anyway. To really guarantee memory safety, you need a bug-free compiler...
Nov 05 2009
next sibling parent Michal Minich <michal minich.sk> writes:
Hello Don,

 Rainer Deyke wrote:
 
 Andrei Alexandrescu wrote:
 
 I hear what you're saying, but I am not enthusiastic at all about
 defining and advertising a half-pregnant state. Such a language is
 the worst of all worlds - it's frustrating to code in yet gives no
 guarantee to anyone. I don't see this going anywhere interesting.
 "Yeah, we have safety, and we also have, you know, half safety -
 it's like only a lap belt of sorts: inconvenient like crap and
 doesn't really help in an accident." I wouldn't want to code in such
 a language.
 

only is this in direct contradiction to the attempts to allow both safe and unsafe modules to coexist in the same program, it is in contradiction with all existing programming languages, every single one of which offers some safety features but not absolute 100% safety. If you have a formal definition of safety, please post it. Without such a definition, I will use my own informal definition of safety for the rest of this post: "a safety feature is a language feature that reduces programming errors." First, to demonstrate that all programming languages in existence offer some safety features. With some esoteric exceptions (whitespace, hq9+), all programming languages have a syntax with some level of redundancy. This allows the language implementation to reject some inputs as syntactically incorrect. A redundant syntax is a safety feature. Another example relevant to D: D requires an explicit cast when converting an integer to a pointer. This is another safety feature. Now to demonstrate that no language offers 100% safety. In the abstract, no language can guarantee that a program matches the programmer's intention. However, let's look at a more specific form of safety: safety from dereferencing dangling pointers. To guarantee this, you would need to guarantee that the compiler never generates faulty code that causes the a dangling pointer to be dereferenced. If the program makes any system calls at all, you would also need to guarantee that no bugs in the OS cause a dangling pointer to be dereferenced. Both of these are clearly impossible. No language can offer 100% safety. Moreover, that safety necessarily reduces convenience is clearly false. This /only/ applies to compile-time checks. Runtime checks are purely an implementation issue. Even C and assembly can be implemented such that all instances of undefined behavior are trapped at runtime. Conversely, the performance penalty of safety applies mostly to runtime checks. 
If extensive testing with these checks turned on fails to reveal any bugs, it is entirely reasonable to remove these checks for the final release.

What I got from Bartosz' original post was that a large class of bugs could be eliminated fairly painlessly via some compile-time checks. It seemed to be based on pragmatic concerns. I applauded it. (I may have misread it, of course). Now, things seem to have left pragmatism and got into ideology. Trying to eradicate _all_ possible memory corruption bugs is extremely difficult in a language like D. I'm not at all convinced that it is realistic (ends up too painful to use). It'd be far more reasonable if we had non-nullable pointers, for example. The ideology really scares me, because 'memory safety' covers just one class of bug. What everyone wants is to drive the _total_ bug count down, and we can improve that dramatically with basic compile-time checks. But demanding 100% memory safety has a horrible cost-benefit tradeoff. It seems like a major undertaking. And I doubt it would convince anyone, anyway. To really guarantee memory safety, you need a bug-free compiler...

I don't know how this could have anything to do with ideology. Are Java and C# ideological languages? Certainly - if you see memory safety as ideology - you cannot escape from it in these languages.

Currently in D there exist pure functions, but you are not obliged to use them. I think memory safety should be handled the same way: mark a function safe if you want/need to restrict yourself to this style of coding, and just don't use it if you don't need to, or can't - same as pure and nothrow.

Notice that if you code your function safe, it would have only one negative impact on the caller - runtime bounds checking. I admit it is not good. There are good reasons to require speed. As the standard libraries would use safe code - I'm not sure whether it would be required to distribute two versions of a .lib, one with bounds-checked safe code and one without bounds checking on safe code?

I think what concerns you is also how safety would affect use of D statements and expressions, that it would be too difficult/awkward to use; I don't know exactly, but I imagine it to be simpler - just like Java/C#(?)

If there should be memory safety in D, I see no other possibility than to specify it per function and provide a compiler switch to turn off bounds checking for safe code if needed. I see it as most flexible for "code writers" and least interfering with "code users"; there is no need for a trade-off. A compiler switch that would magically force safety on some code is no way to go - the code would just not compile (and specifying safety per module is too coarse-grained - both for code users and writers).

Btw. I think non-nullable pointers are equally important, but I see no prospect of them being implemented :(
Nov 05 2009
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
 Rainer Deyke wrote:
 Andrei Alexandrescu wrote:
 I hear what you're saying, but I am not enthusiastic at all about
 defining and advertising a half-pregnant state. Such a language is the
 worst of all worlds - it's frustrating to code in yet gives no guarantee
 to anyone. I don't see this going anywhere interesting. "Yeah, we have
 safety, and we also have, you know, half safety - it's like only a lap
 belt of sorts: inconvenient like crap and doesn't really help in an
 accident." I wouldn't want to code in such a language.

Basically you're saying that safety is an all or nothing deal. Not only is this in direct contradiction to the attempts to allow both safe and unsafe modules to coexist in the same program, it is in contradiction with all existing programming languages, every single one of which offers some safety features but not absolute 100% safety. If you have a formal definition of safety, please post it. Without such a definition, I will use my own informal definition of safety for the rest of this post: "a safety feature is a language feature that reduces programming errors." First, to demonstrate that all programming languages in existence offer some safety features. With some esoteric exceptions (whitespace, hq9+), all programming languages have a syntax with some level of redundancy. This allows the language implementation to reject some inputs as syntactically incorrect. A redundant syntax is a safety feature. Another example relevant to D: D requires an explicit cast when converting an integer to a pointer. This is another safety feature. Now to demonstrate that no language offers 100% safety. In the abstract, no language can guarantee that a program matches the programmer's intention. However, let's look at a more specific form of safety: safety from dereferencing dangling pointers. To guarantee this, you would need to guarantee that the compiler never generates faulty code that causes the a dangling pointer to be dereferenced. If the program makes any system calls at all, you would also need to guarantee that no bugs in the OS cause a dangling pointer to be dereferenced. Both of these are clearly impossible. No language can offer 100% safety. Moreover, that safety necessarily reduces convenience is clearly false. This /only/ applies to compile-time checks. Runtime checks are purely an implementation issue. Even C and assembly can be implemented such that all instances of undefined behavior are trapped at runtime. 
Conversely, the performance penalty of safety applies mostly to runtime checks. If extensive testing with these checks turned on fails to reveal any bugs, it is entirely reasonable to remove these checks for the final release.

I'm in complete agreement with you, Reiner. What I got from Bartosz' original post was that a large class of bugs could be eliminated fairly painlessly via some compile-time checks. It seemed to be based on pragmatic concerns. I applauded it. (I may have misread it, of course). Now, things seem to have left pragmatism and got into ideology. Trying to eradicate _all_ possible memory corruption bugs is extremely difficult in a language like D. I'm not at all convinced that it is realistic (ends up too painful to use). It'd be far more reasonable if we had non-nullable pointers, for example. The ideology really scares me, because 'memory safety' covers just one class of bug. What everyone wants is to drive the _total_ bug count down, and we can improve that dramatically with basic compile-time checks. But demanding 100% memory safety has a horrible cost-benefit tradeoff. It seems like a major undertaking. And I doubt it would convince anyone, anyway. To really guarantee memory safety, you need a bug-free compiler...

I protest against using "ideology" when characterizing safety. It instantly lowers the level of the discussion. There is no ideology being pushed here, just a clear notion with equally clear benefits. I think it is a good time we all get informed a bit more. First off: _all_ languages except C, C++, and assembler are or at least claim to be safe. All. I mean ALL. Did I mention all? If that was some ideology that is not realistic, is extremely difficult to achieve, and ends up too painful to use, then such theories would be difficult to corroborate with "ALL". Walter and I are in agreement that safety is not difficult to achieve in D and that it would allow a great many good programs to be written. Second, there are not many definitions of what safe means and no ifs and buts about it. This whole wishy-washy notion of wanting just a little bit of pregnancy is just not worth pursuing. The definition is given in Pierce's book "Types and Programming Languages" but I was happy yesterday to find a free online book section by Luca Cardelli: http://www.eecs.umich.edu/~bchandra/courses/papers/Cardelli_Types.pdf The text is very approachable and informative, and I suggest anyone interested to read through page 5 at least. I think it's a must for anyone participating in this to read the whole thing. Cardelli distinguishes between programs with "trapped errors" versus programs with "untrapped errors". Yesterday Walter and I have had a long discussion, followed by an email communication between Cardelli and myself, which confirmed that these three notions are equivalent: a) "memory safety" (notion we used so far) b) "no undefined behavior" (C++ definition, suggested by Walter) c) "no untrapped errors" (suggested by Cardelli) I suspect "memory safety" is the weakest marketing terms of the three. For example, there's this complaint above: "'memory safety' covers just one class of bug." But when you think of programs with undefined behavior vs. 
programs with entirely defined behavior, you realize what an important class of bugs that is. Non-nullable pointers are mightily useful, but "no undefined behavior" is quite a bit better to have.

The argument that memory safety requires a bug-free compiler is correct. It was actually aired quite a bit in Java's first years. It can be confidently said that Java won that argument. Why? Because Java had a principled approach that slowly but surely sealed all the gaps. The fact that dmd has bugs now should be absolutely no excuse for us to give up on defining a safe subset of the language.

Andrei
Nov 05 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 Andrei Alexandrescu, el  5 de noviembre a las 08:48 me escribiste:
 First off: _all_ languages except C, C++, and assembler are or at
 least claim to be safe. All. I mean ALL. Did I mention all? If that
 was some ideology that is not realistic, is extremely difficult to
 achieve, and ends up too painful to use, then such theories would be
 difficult to corroborate with "ALL". Walter and I are in agreement
 that safety is not difficult to achieve in D and that it would allow
 a great many good programs to be written.

I think the problem is the cost. The cost for the programmer (the subset of language features it can use is reduced) and the cost for the compiler (to increase the subset of language features that can be used, the compiler has to be much smarter). Most languages have a lot of developers, and can afford making the compiler smarter to allow safety with a low cost for the programmer (at least when writing code, that cost might be higher performance-wise).

D is already a rich superset of Java. So the cost of making the language safe and useful was already absorbed.
 A clear example of this is not being able to take the address of a local.
 This is too restrictive to be useful, as you pointed out in your post about
 having to write static methods because of this. If you can't find
 a workaround for this, I guess safety in D can look a little unrealistic.

Most other languages do not allow taking addresses of locals. Why are they realistic but SafeD wouldn't be? Just because we know we could do it in unsafe D?
 I like the idea of having a safe subset in D, but I think being
 a programming language, *runtime* safety should be *always* a choice for
 the user compiling the code.

Well in that case we need to think again about the command-line options.
 As other said, you can never be 100% sure your program won't blow for
 unknown reasons (it could do that because a bug in the
 compiler/interpreter, or even because a hardware problem), you can just
 try to make it as difficult as possible, but 100% safety doesn't exist.

I understand that stance, but I don't find it useful. Andrei
Nov 05 2009
next sibling parent Jesse Phillips <jessekphillips+D gmail.com> writes:
Andrei Alexandrescu Wrote:

 Leandro Lucarella wrote:
 A clear example of this is not being able to take the address of a local.
 This is too restrictive to be useful, as you pointed out in your post about
 having to write static methods because of this. If you can't find
 a workaround for this, I guess safety in D can look a little unrealistic.

Most other languages do not allow taking addresses of locals. Why are they realistic but SafeD wouldn't be? Just because we know we could do it in unsafe D?

I think part of the problem is that current users of D have picked it up because they do get this power. But it makes sense that there are potential users who would like the compiler to prevent them from using unsafe constructs. And I can't imagine it being more restrictive than Java or C#, which are very popular languages. I do like the different approaches taken by C# and D, though: C# took a safe model and punched holes in it; D is taking an unsafe model and restricting it.
Nov 05 2009
prev sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Leandro Lucarella (llucax gmail.com)'s article
 Andrei Alexandrescu, el  5 de noviembre a las 09:57 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el  5 de noviembre a las 08:48 me escribiste:
First off: _all_ languages except C, C++, and assembler are or at
least claim to be safe. All. I mean ALL. Did I mention all? If that
was some ideology that is not realistic, is extremely difficult to
achieve, and ends up too painful to use, then such theories would be
difficult to corroborate with "ALL". Walter and I are in agreement
that safety is not difficult to achieve in D and that it would allow
a great many good programs to be written.

I think the problem is the cost. The cost for the programmer (the subset of language features it can use is reduced) and the cost for the compiler (to increase the subset of language features that can be used, the compiler has to be much smarter). Most languages have a lot of developers, and can afford making the compiler smarter to allow safety with a low cost for the programmer (at least when writing code, that cost might be higher performance-wise).

D is already a rich superset of Java. So the cost of making the language safe and useful was already absorbed.

Java has a very efficient GC (partially because of safety), so using D as if it were Java yields very inefficient programs (using classes and new all over the place).

What does safety have to do with Java's GC quality? IMHO it's more a language-maturity and money thing. The only major constraint on the D GC is unions, and even in that case all we need is one bit that says that stuff in unions needs to be pinned. I think we already agree that storing the only pointer to GC-allocated memory in non-pointer types, xor linked lists involving GC-allocated memory, etc. are undefined behavior. Other than that and lack of manpower, what prevents a really, really good GC from being implemented in D?
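To make the union constraint concrete, here is a minimal sketch (the type and field names are mine, not from the thread) of why a union forces conservative treatment:

```d
// A union overlays a possible GC reference with plain bits. At any given
// moment the collector cannot tell which interpretation is live, so a
// precise GC must conservatively pin (keep, and never move) whatever the
// current bit pattern happens to point at.
union Ambiguous
{
    int*   maybePointer; // may reference GC-allocated memory
    size_t justBits;     // or may be an ordinary number
}

void demo()
{
    Ambiguous u;
    u.maybePointer = new int; // looks like a live reference...
    u.justBits = 0x1234;      // ...same storage, now just bits
}
```

This is exactly the "one pin bit" case mentioned above: a single flag per allocation would let an otherwise precise collector handle such ambiguous words.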
Nov 05 2009
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 A clear example of this is not being able to take the address of a local.
 This is too restrictive to be useful, as you pointed out in your post about
 having to write static methods because of this. If you can't find
 a workaround for this, I guess safety in D can look a little unrealistic.

Sorry, I forgot to mention one thing. My example of List in the thread "An interesting consequence of safety requirements" used struct, but it should be mentioned there's a completely safe alternative: just define List as a class and there is no safety problem at all. Java, C#, and others define lists as classes and it didn't seem to kill them. I agree that using a struct in D would be marginally more efficient, but that doesn't mean that if I want safety I'm dead in the water. In particular it's great that pointers are still usable in SafeD. I'm actually surprised that nobody sees how nicely safety fits D, particularly its handling of "ref". Andrei
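As a hedged illustration of the class-based alternative (the names are mine; Andrei's actual List from the other thread is not reproduced here), a safe singly linked list never needs the address of a local:

```d
// All nodes live on the GC heap as class instances, so no pointer or
// reference to a stack variable ever escapes -- the same pattern Java
// and C# use for their list types.
class List(T)
{
    private static class Node
    {
        T value;
        Node next;
        this(T v, Node n) { value = v; next = n; }
    }

    private Node head;

    void prepend(T value) { head = new Node(value, head); }

    bool empty() { return head is null; }
}
```

The struct version would avoid one indirection per list, which is the "marginally more efficient" trade-off mentioned above.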
Nov 05 2009
parent reply Max Samukha <spambox d-coding.com> writes:
On Thu, 5 Nov 2009 21:29:43 -0300, Leandro Lucarella
<llucax gmail.com> wrote:

See my other response about efficiency of D when using new/classes a lot.
You just can't do it efficiently in D, ask bearophile for some benchmarks
;)

This is in part because D doesn't have a compacting GC. A compacting GC implies allocation speeds comparable to the speed of allocation on the stack. I guess many of bearophile's benchmarks do not account for GC collection cycles, which should be slower in C#/Java because of the need to move objects. I think fair benchmarks should always include garbage collection times.
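The "stack-speed" claim can be sketched like this (a hypothetical allocator with the collection logic omitted): because compaction keeps the free area contiguous, an allocation is a single bounds test plus a pointer bump.

```d
// Minimal bump allocator: this is why allocation under a compacting GC
// is comparable to pushing on the stack -- one comparison and one add.
struct BumpRegion
{
    ubyte[] memory; // contiguous free region, maintained by compaction
    size_t  top;    // next free offset

    void* allocate(size_t size)
    {
        if (size == 0 || top + size > memory.length)
            return null; // a real GC would collect and compact here
        void* p = &memory[top];
        top += size;
        return p;
    }
}
```

A non-moving collector must instead search free lists of varying-size holes, which is why its allocation fast path is slower.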
Nov 06 2009
parent bearophile <bearophileHUGS lycos.com> writes:
Max Samukha:

I guess many bearophile's benchmarks do not account for GC
 collection cycles,

I have not explored this well yet. From what I've seen, D is sometimes dead slow at the end of the program, when many final deallocations happen. In the Java versions of the tests this doesn't happen. A long time ago someone even wrote a patch for the D1 GC to reduce that problem. Bye, bearophile
Nov 06 2009
prev sibling parent reply Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
 First off: _all_ languages except C, C++, and assembler are or at least
 claim to be safe. All. I mean ALL. Did I mention all? If that was some
 ideology that is not realistic, is extremely difficult to achieve, and
 ends up too painful to use, then such theories would be difficult to
 corroborate with "ALL". Walter and I are in agreement that safety is not
 difficult to achieve in D and that it would allow a great many good
 programs to be written.

You're forgetting about all other system programming languages. Also, many of these claims to safety are demonstrably false.
 The text is very approachable and informative, and I suggest anyone
 interested to read through page 5 at least. I think it's a must for
 anyone participating in this to read the whole thing. Cardelli
 distinguishes between programs with "trapped errors" versus programs
 with "untrapped errors". Yesterday Walter and I have had a long
 discussion, followed by an email communication between Cardelli and
 myself, which confirmed that these three notions are equivalent:
 
 a) "memory safety" (notion we used so far)
 b) "no undefined behavior" (C++ definition, suggested by Walter)
 c) "no untrapped errors" (suggested by Cardelli)

They are clearly not equivalent. ++x + ++x has nothing to do with memory safety. Conversely, machine language has no concept of undefined behavior but is clearly not memory safe. Also, you haven't formally defined any of these concepts, so you're basically just hand-waving. -- Rainer Deyke - rainerd eldwood.com
Nov 05 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Rainer Deyke wrote:
 Andrei Alexandrescu wrote:
 First off: _all_ languages except C, C++, and assembler are or at least
 claim to be safe. All. I mean ALL. Did I mention all? If that was some
 ideology that is not realistic, is extremely difficult to achieve, and
 ends up too painful to use, then such theories would be difficult to
 corroborate with "ALL". Walter and I are in agreement that safety is not
 difficult to achieve in D and that it would allow a great many good
 programs to be written.

You're forgetting about all other system programming languages.

[citation needed]
  Also,
 many of these claims to safety are demonstrably false.

Which?
 The text is very approachable and informative, and I suggest anyone
 interested to read through page 5 at least. I think it's a must for
 anyone participating in this to read the whole thing. Cardelli
 distinguishes between programs with "trapped errors" versus programs
 with "untrapped errors". Yesterday Walter and I have had a long
 discussion, followed by an email communication between Cardelli and
 myself, which confirmed that these three notions are equivalent:

 a) "memory safety" (notion we used so far)
 b) "no undefined behavior" (C++ definition, suggested by Walter)
 c) "no untrapped errors" (suggested by Cardelli)

They are clearly not equivalent. ++x + ++x has nothing to do with memory safety. Conversely, machine language has no concept of undefined behavior but is clearly not memory safe. Also, you haven't formally defined any of these concepts, so you're basically just hand-waving.

Memory safety is defined formally in Pierce's book. Undefined behavior is defined by the C++ standard. Cardelli defines trapped and untrapped errors. Andrei
Nov 05 2009
parent Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
 Rainer Deyke wrote:
 You're forgetting about all other system programming languages.


Delphi.
  Also,
 many of these claims to safety are demonstrably false.

Which?

I can get Python to segfault.
 Memory safety is defined formally in Pierce's book.

Do you mean "Types and Programming Languages" by Benjamin C. Pierce? According to Google Books, it does not contain the phrase "memory safety". It does contain a section on "language safety", which says that "a safe language is one that protects its own abstractions". By that definition, machine language is safe, because it has no abstractions to protect.

Another quote: "Language safety can be achieved by static checking, but also by run-time checks that trap nonsensical operations just at the moment when they are attempted and stop the program or raise an exception". In other words, Pierce sees runtime checks and compile-time checks as orthogonal methods for providing the same safety.
 Undefined behavior
 is defined by the C++ standard.

Undefined behavior is a simple concept: the language specification does not define what will happen when the program invokes undefined behavior. Undefined behavior can be trivially eliminated from the language by replacing it with defined behavior. If a language construct is defined to trash the process memory space, then it is not undefined behavior.
 Cardelli defines trapped and untrapped
 errors.

Untrapped error: An execution error that does not immediately result in a fault. I can't find his definition of "execution error", which makes this definition useless to me. -- Rainer Deyke - rainerd eldwood.com
Nov 05 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
dsimcha, el  6 de noviembre a las 02:13 me escribiste:
 == Quote from Leandro Lucarella (llucax gmail.com)'s article
 Andrei Alexandrescu, el  5 de noviembre a las 09:57 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el  5 de noviembre a las 08:48 me escribiste:
First off: _all_ languages except C, C++, and assembler are or at
least claim to be safe. All. I mean ALL. Did I mention all? If that
was some ideology that is not realistic, is extremely difficult to
achieve, and ends up too painful to use, then such theories would be
difficult to corroborate with "ALL". Walter and I are in agreement
that safety is not difficult to achieve in D and that it would allow
a great many good programs to be written.

I think the problem is the cost. The cost for the programmer (the subset of language features it can use is reduced) and the cost for the compiler (to increase the subset of language features that can be used, the compiler has to be much smarter). Most languages have a lot of developers, and can afford making the compiler smarter to allow safety with a low cost for the programmer (at least when writing code, that cost might be higher performance-wise).

D is already a rich superset of Java. So the cost of making the language safe and useful was already absorbed.

Java has a very efficient GC (partially because of safety), so using D as if it were Java yields very inefficient programs (using classes and new all over the place).

Why does safety have to do w/ Java's GC quality?

Because you don't have unions and other things that prevent the GC from being fully precise.
 IMHO it's more a language maturity and money thing.

That's another reason, but the Boehm GC is probably one of the most advanced, state-of-the-art GCs, and I don't think it's close to what the Java GC can do (I haven't seen recent benchmarks though, so I might be completely wrong :)
 The only major constraint on D GC is unions and even in that case, all
 we need is one bit that says that stuff in unions needs to be pinned.
 I think we already agree that storing the only pointer to GC allocated
 memory in non-pointer types, xor linked lists involving GC allocated
 memory, etc.  are undefined behavior.  Other than that and lack of
 manpower, what prevents a really, really good GC from being implemented
 in D?

Having a precise stack and registers. Java has a VM that provides all that information. Maybe the distance between a good Java GC and a good D GC can be narrowed a lot, but I don't think D could ever match Java (or other languages with fully precise scanning).

-- 
Leandro Lucarella (AKA luca)
http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
I always get the feeling that when lesbians look at me, they're thinking, '*That's* why I'm not a heterosexual.' -- George Constanza
Nov 05 2009
prev sibling parent Leandro Lucarella <llucax gmail.com> writes:
Max Samukha, el  6 de noviembre a las 11:10 me escribiste:
 On Thu, 5 Nov 2009 21:29:43 -0300, Leandro Lucarella
 <llucax gmail.com> wrote:
 
See my other response about efficiency of D when using new/classes a lot.
You just can't do it efficiently in D, ask bearophile for some benchmarks
;)

This is in part because D doesn't have a compacting GC. A compacting GC implies allocation speeds comparable with the speed of allocation on stack. I guess many bearophile's benchmarks do not account for GC collection cycles, which should be slower in C#/Java because of the need to move objects. I think, fair benchmarks should always include garbage collection times.

I don't think it's slower, because GCs usually treat small and large objects differently (the D GC already does that). So very small objects (the ones most likely to get allocated and freed in huge amounts) are copied, and large objects usually are not. Moving a small object is not much more work than doing a sweep, and you get the extra bonus of not having to scan the whole heap, just the live data. This is a huge gain, which makes moving collectors very fast (at the expense of extra memory, since you have to reserve twice the program's working set).

-- 
Leandro Lucarella (AKA luca)
http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Dale tu mano al mono, pero no el codo, dado que un mono confianzudo es irreversible. -- Ricardo Vaporeso. La Reja, Agosto de 1912.
Nov 06 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el  5 de noviembre a las 09:57 me escribiste:
 Leandro Lucarella wrote:
Andrei Alexandrescu, el  5 de noviembre a las 08:48 me escribiste:
First off: _all_ languages except C, C++, and assembler are or at
least claim to be safe. All. I mean ALL. Did I mention all? If that
was some ideology that is not realistic, is extremely difficult to
achieve, and ends up too painful to use, then such theories would be
difficult to corroborate with "ALL". Walter and I are in agreement
that safety is not difficult to achieve in D and that it would allow
a great many good programs to be written.

I think the problem is the cost. The cost for the programmer (the subset of language features it can use is reduced) and the cost for the compiler (to increase the subset of language features that can be used, the compiler has to be much smarter). Most languages have a lot of developers, and can afford making the compiler smarter to allow safety with a low cost for the programmer (at least when writing code, that cost might be higher performance-wise).

D is already a rich superset of Java. So the cost of making the language safe and useful was already absorbed.

That's an unfair comparison. Java has a very efficient GC (partially because of safety), so using D as if it were Java yields very inefficient programs (using classes and new all over the place). D can't be completely safe, and because of that it's doomed to have a considerably worse GC, so writing code a la Java in D defeats the purpose of using D in the first place.
A clear example of this is not being able to take the address of a local.
This is too restrictive to be useful, as you pointed out in your post about
having to write static methods because of this. If you can't find
a workaround for this, I guess safety in D can look a little unrealistic.

Most other languages do not allow taking addresses of locals. Why are they realistic but SafeD wouldn't be? Just because we know we could do it in unsafe D?

Because in other languages there are no locals! All objects are references and allocated on the heap (except, maybe, some value types). Again, you can do that in D too, but because D is a system language you can't assume a lot of things, and it has a lot fewer optimization opportunities, yielding bad performance when not used wisely.
I like the idea of having a safe subset in D, but I think being
a programming language, *runtime* safety should be *always* a choice for
the user compiling the code.

Well in that case we need to think again about the command-line options.

Not necessarily, -release is already there =) But then, I don't have any issues with the GCC way of hundreds of compiler flags for fine-grained control, so I'm all for adding new flags for that.
As other said, you can never be 100% sure your program won't blow for
unknown reasons (it could do that because a bug in the
compiler/interpreter, or even because a hardware problem), you can just
try to make it as difficult as possible, but 100% safety doesn't exist.

I understand that stance, but I don't find it useful.

The usefulness is that D can't be 100% safe, so spending time trying to make it that way (especially at the expense of flexibility, i.e., not providing a way to disable bounds checking in safe modules) makes no sense. You'll just end up with a less efficient Java.

-- 
Leandro Lucarella (AKA luca)
http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
The average person laughs 13 times a day
Nov 05 2009
prev sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el  5 de noviembre a las 10:06 me escribiste:
 Leandro Lucarella wrote:
A clear example of this is not being able to take the address of a local.
This is too restrictive to be useful, as you pointed out in your post about
having to write static methods because of this. If you can't find
a workaround for this, I guess safety in D can look a little unrealistic.

Sorry, I forgot to mention one thing. My example of List in the thread "An interesting consequence of safety requirements" used struct, but it should be mentioned there's a completely safe alternative: just define List as a class and there is no safety problem at all. Java, C#, and others define lists as classes and it didn't seem to kill them. I agree that using a struct in D would be marginally more efficient, but that doesn't mean that if I want safety I'm dead in the water. In particular it's great that pointers are still usable in SafeD. I'm actually surprised that nobody sees how nicely safety fits D, particularly its handling of "ref".

See my other response about efficiency of D when using new/classes a lot. You just can't do it efficiently in D, ask bearophile for some benchmarks ;) -- Leandro Lucarella (AKA luca) http://llucax.com.ar/ ---------------------------------------------------------------------- GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05) ---------------------------------------------------------------------- Wake from your sleep, the drying of your tears, Today we escape, we escape.
Nov 05 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el  4 de noviembre a las 08:16 me escribiste:
 Michal Minich wrote:
Hello Michel,

module (system) name;         // interface: unsafe   impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

I thought that the first (unsafe-unsafe) case is currently available just by:

module name;          // interface: unsafe   impl.: unsafe

Separating modules into unsafe-unsafe and safe-safe is not useful, as those modules could not interact; specifically, you need modules that are implemented by unsafe means but provide only a safe interface, so I see it as:

module name;          // interface: unsafe   impl.: unsafe
module (system) name; // interface: safe     impl.: unsafe
module (safe) name;   // interface: safe     impl.: safe

so you can call system modules (io, network...) from safe code.


I think safe should be the default, as it should be the most used flavor in user code, right? What about:

module s;             // interface: safe     impl.: safe
module (trusted) t;   // interface: safe     impl.: unsafe
module (unsafe) u;    // interface: unsafe   impl.: unsafe

* s can import other safe or trusted modules (no unsafe for s).
* t can import any kind of module, but it guarantees not to corrupt your
  memory if you use it (that's why s can import it).
* u can import any kind of module and makes no guarantees (C bindings
  use this).
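Written out as file stubs, the three-level scheme would look like this (the (trusted)/(unsafe) markers are this thread's proposal, not accepted dmd syntax):

```d
// s.d -- fully checked by the compiler; may import only safe and
// trusted modules
module s;

// t.d -- uses unsafe techniques internally but vouches for a safe
// interface, so safe modules may import it
module (trusted) t;

// u.d -- e.g. C bindings; no guarantees, importable only from
// trusted or unsafe code
module (unsafe) u;
```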
 That's a pretty clean design. How would it interact with a -safe
 command-line flag?

I'll use safe by default. If you want to use broken stuff (everything should be correctly marked as safe (default), trusted or unsafe) and let it compile anyway, add a compiler flag -no-safe (or whatever). But people should never use it, unless you are using some broken library or you are too lazy to mark your modules correctly. Is this too crazy?

-- 
Leandro Lucarella (AKA luca)
http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
El discman vuelve locos a los controles, te lleva a cualquier lugar.
Ajústense pronto los cinturones, nos vamos a estrellar.
Evidentemente, no escuchaste el speech, que dio la azafata, antes de despegar.
Nov 04 2009
prev sibling next sibling parent Michal Minich <michal.minich gmail.com> writes:
On Wed, 04 Nov 2009 14:03:42 -0300, Leandro Lucarella wrote:

 I think safe should be the default, as it should be the most used flavor
 in user code, right? What about:
 
 module s;             // interface: safe     impl.: safe 
 module (trusted) t;   // interface: safe     impl.: unsafe
 module (unsafe) u;    // interface: unsafe   impl.: unsafe
 
 * s can import other safe or trusted modules (no unsafe for s).
 * t can import any kind of module, but it guarantees not to corrupt your
   memory if you use it (that's why s can import it).
 * u can import any kind of module and makes no guarantees (C bindings
   use this).
 
 That's a pretty clean design. How would it interact with a -safe
 command-line flag?

I'll use safe by default. If you want to use broken stuff (everything should be correctly marked as safe (default), trusted or unsafe) and let it compile anyway, add a compiler flag -no-safe (or whatever). But people should never use it, unless you are using some broken library or you are too lazy to mark your modules correctly. Is this too crazy?

I have no problem with safe as default; most of my code is safe. I also like the module (trusted) idea - it really pictures its meaning, better than "system". But I think there is no reason to use a -no-safe compiler flag ... for what reason would one want to force a safer program to compile as less safe :) As I'm thinking more about it, I don't see any reason to have any compiler flag for safety at all.
Nov 04 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Michal Minich, el  4 de noviembre a las 18:58 me escribiste:
 As I'm thinking more about it, I don't see any reason to have any 
 compiler flag for safety at all.

That was exactly my point.

-- 
Leandro Lucarella (AKA luca)
http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Be nice to nerds
Chances are you'll end up working for one
Nov 04 2009
prev sibling next sibling parent Michal Minich <michal.minich gmail.com> writes:
On Wed, 04 Nov 2009 13:12:54 -0600, Andrei Alexandrescu wrote:

 But I think there is no reason to use a -no-safe compiler flag ... for
 what reason would one want to force a safer program to compile as less
 safe :)

Efficiency (e.g. remove array bounds checks).
 As I'm thinking more about it, I don't see any reason to have any
 compiler flag for safety at all.

That would be a great turn of events!!! Andrei

Memory safety is a pretty specific thing: if you want it, you want it all, not just some part of it - otherwise you cannot call it memory safety. The idea of a safe module which, under some compiler switch, is not safe does not appeal to me.

But efficiency is also important, and if you want it, why not move the code subject to bounds checks to a trusted/system module - I hope they are not checked for bounds in release mode. Moving parts of the code to trusted modules is more semantically descriptive, compared to the crude tool of an ad-hoc compiler switch.

One thing I'm concerned with, whether there is a compiler switch or not, is that the number of modules will increase, as you will probably want to split some modules in two, because some part may be safe and some not. I'm wondering why safety is not discussed at the function level, similarly to how pure and nothrow currently exist. I'm not sure this would be good, just wondering. Was this topic already discussed?
Nov 04 2009
prev sibling next sibling parent Michal Minich <michal.minich gmail.com> writes:
On Wed, 04 Nov 2009 14:24:47 -0600, Andrei Alexandrescu wrote:

 But efficiency is also important, and if you want it, why not move the
 code subject to bounds checks to a trusted/system module - I hope they
 are not checked for bounds in release mode. Moving parts of the code to
 trusted modules is more semantically descriptive, compared to the crude
 tool of an ad-hoc compiler switch.

Well it's not as simple as that. Trusted code is not unchecked code - it's code that may drop redundant checks here and there, leaving the code correct even though the compiler cannot prove it. So no, there's no complete removal of bounds checking. But a trusted module is allowed to replace this:

foreach (i; 0 .. a.length) ++a[i];

with

foreach (i; 0 .. a.length) ++a.ptr[i];

The latter effectively escapes checks because it uses unchecked pointer arithmetic. The code is still correct, but this time it's the human vouching for it, not the compiler.
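Expanded into a complete function (the function name is mine), the trusted rewrite is:

```d
// The loop bound guarantees i < a.length, so every access is provably in
// range, and a trusted module may index through .ptr to skip the
// per-access bounds check. The human, not the compiler, vouches for it.
void incrementAll(int[] a)
{
    foreach (i; 0 .. a.length)
        ++a.ptr[i]; // unchecked, but correct by the loop condition
}
```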
 One thing I'm concerned with, whether there is a compiler switch or not,
 is that the number of modules will increase, as you will probably want to
 split some modules in two, because some part may be safe and some not.
 I'm wondering why safety is not discussed at the function level,
 similarly to how pure and nothrow currently exist. I'm not sure this
 would be good, just wondering. Was this topic already discussed?

This is a relatively new topics, and you pointed out some legit kinks. One possibility I discussed with Walter is to have version(safe) vs. version(system) or so. That would allow a module to expose different interfaces depending on the command line switches. Andrei

Sorry for the long post, but it should explain how a safety specification should work (and how not). Consider these three ways of specifying memory safety:

safety specification at module level (M)
safety specification at function level (F)
safety specification using version switching (V)

I see a very big difference between these: while M and F are "interface" specifications, V is an implementation detail. This difference applies only to library/module users; it makes no difference for the library/module writer - he must always decide whether he writes safe, unsafe or trusted code.

Scenario with M safety for a library user: the user wants to make a memory-safe application. He marks his main module as safe and can be sure (and/or trust) that his application is safe from that point on; because safety is explicit in the "interface", he cannot import and use unsafe code.

Scenario with V safety: the user wants to make a memory-safe application. He can import any module. He can use the -safe switch so the compiler will use the safe version of the code - if available! The user can never be sure whether his application is safe or not. Safety is an implementation detail! For this reason, I think V safety is a very unsuitable option. Absolutely useless.

But there are also problems with M safety. Imagine a module for string manipulation with 10 independent functions. The module is marked safe. The library writer then decides to add another function, which is unsafe. He can now do one of the following:

Option 1: He can mark the module trusted and implement the function in an unsafe way. Compatibility with safe clients using this module will remain. Bad thing: there are 10 provably safe functions which are no longer checked by the compiler. Also, the trust level of the module is lower in the eyes of the user. The library may end up with all modules trusted (none safe).

Option 2: He will implement this in a separate unsafe module. This has a negative impact on library structure.
Option 3: He will implement this in separate trusted module and publicly import this trusted module in original safe module. The thirds options is transparent for module user, and probably the best solution, but I have a feeling that many existing modules will end having their unsafe twin. I see this pattern to emerge: module(safe) std.string module(trusted) std.string_trusted // do not import, already exposed by std.string Therefore I propose to use F safety. It is in fact the same beast as pure and nothrow - they also guarantee some kind of safety, and they are also part of function interface (signature). Compiler also needs to perform stricter check as normally. Just imagine marking entire module pure or nothrow. If certainly possible, is it practical? You would find yourself splitting your functions into separate modules with specific check, or not using pure and nothrow entirely. This way, if you mark your main function safe, you can be sure(and/or trust) your application is safe. More usually - you can use safe only for some functions and this requirement will propagate to all called functions, the same way as for pure or nothrow. One think to figure out remains how to turn of runtime bounds checking for trusted code (and probably safe too). This is legitimate requirement, because probably all the standard library will be safe or trusted, and users which are not concerned with safety and want speed, need to have this compiler switch.
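The F proposal above can be sketched in code. The attribute spellings below (@safe, @trusted) are hypothetical placeholders for whatever syntax would be chosen, not settled D:

```d
// Hypothetical function-level safety attributes; @safe/@trusted
// are illustrative spellings, not settled syntax.
@safe int sum(int[] a)
{
    int s = 0;
    foreach (x; a)  // array access stays bounds-checked in safe code
        s += x;
    return s;
}

// Does unsafe things inside but promises a safe interface, so the
// rest of the module need not be downgraded to trusted.
@trusted int firstElement(int[] a)
{
    return a.length ? *a.ptr : 0;  // raw pointer use: unsafe internally
}

@safe int caller(int[] a)
{
    return sum(a) + firstElement(a);  // safe code may call trusted code
}
```

This avoids Option 1's problem: the provably safe functions stay compiler-checked, and only the one trusted function is exempt.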
Nov 04 2009
prev sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu wrote on November 5 at 08:48:
 First off: _all_ languages except C, C++, and assembler are or at
 least claim to be safe. All. I mean ALL. Did I mention all? If that
 was some ideology that is not realistic, is extremely difficult to
 achieve, and ends up too painful to use, then such theories would be
 difficult to corroborate with "ALL". Walter and I are in agreement
 that safety is not difficult to achieve in D and that it would allow
 a great many good programs to be written.

I think the problem is the cost: the cost for the programmer (the subset of language features he can use is reduced) and the cost for the compiler (to enlarge the subset of language features that can be used, the compiler has to be much smarter). Most languages have a lot of developers and can afford to make the compiler smarter, allowing safety at a low cost for the programmer (at least when writing code; that cost might be higher performance-wise).

A clear example of this is not being able to take the address of a local. This is too restrictive to be useful, as you pointed out in your post about having to write static methods because of this. If you can't find a workaround for this, I guess safety in D can look a little unrealistic.

I like the idea of having a safe subset in D, but since this is a programming language, *runtime* safety should *always* be a choice for the user compiling the code. As others said, you can never be 100% sure your program won't blow up for unknown reasons (it could do so because of a bug in the compiler/interpreter, or even because of a hardware problem); you can just try to make that as unlikely as possible, but 100% safety doesn't exist.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
It has been said so often that appearances deceive;
of course they will deceive anyone vulgar enough to believe it
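A minimal sketch of the restriction being discussed, assuming the conservative rule that safe code may not let the address of a local escape (and, without escape analysis, must reject taking it at all):

```d
int* escape()
{
    int x = 42;
    return &x;   // genuinely unsafe: x is dead once the function returns
}

void harmless()
{
    int x = 42;
    int* p = &x; // also rejected by the conservative rule,
    *p = 43;     // even though p never outlives x
}
```

Distinguishing the two cases requires escape analysis in the compiler, which is exactly the cost trade-off described above.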
Nov 05 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu wrote on November 3 at 17:54:
 Leandro Lucarella wrote:
Andrei Alexandrescu wrote on November 3 at 16:33:
SafeD is, unfortunately, not finished at the moment. I want to leave
in place a stub that won't lock our options. Here's what we
currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides to whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default
it is "system", and can be overridden with "-safe".

What's the rationale for letting the compiler decide? I can't see anything but trouble coming from this. A module will typically be written to be safe or system; I think the default should be defined (I'm not sure what the default should be, though).

The parenthesis pretty much destroys your point :o).

I guess this is a joke, but I have to ask: why? I'm not sure about plenty of stuff; that doesn't mean it's pointless.
 I don't think letting the implementation decide is a faulty model.
 If you know what you want, you say it. Otherwise it means you don't
 care.

I can't understand how you can't care. Maybe I'm misunderstanding the proposal, since nobody else seems to see a problem here.

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
PEACE CAMPAIGN: THEY CRUSHED WAR TOYS -- Crónica TV
Nov 04 2009
prev sibling next sibling parent Leandro Lucarella <llucax gmail.com> writes:
Walter Bright wrote on November 3 at 16:21:
 Andrei Alexandrescu wrote:
Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa

replace integral type with non-pointer type.
\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
  pointer, or  struct  including such a type)

pointers are not a reference type. Replace "reference type" with "pointers or reference types".

Strictly speaking, arrays are not reference types either, right?

-- 
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
We retire until next week, reflecting on our lives:
"What a sh*tty life... What a sh*tty life!" -- Sidharta Kiwi
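Whatever one calls them, a D array carries a pointer to its elements, which is why the union rule has to cover arrays too. A sketch of the kind of forgery the rule prevents (assuming the usual two-word length/pointer slice layout):

```d
union Evil
{
    int[] arr;       // slice: a (length, pointer) pair
    size_t[2] words; // the same two words viewed as plain integers
}

void forge()
{
    Evil e;
    e.words[0] = 100;        // fake length
    e.words[1] = 0xDEAD0000; // fake element pointer
    int v = e.arr[5];        // bounds check passes, reads wild memory
}
```

Without the union ban, safe-looking array code can be made to dereference an arbitrary address, so bounds checking alone is not enough.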
Nov 04 2009
prev sibling next sibling parent reply jpf <spam example.com> writes:
Andrei Alexandrescu wrote:
 How can we address that? Again, I'm looking for a simple, robust,
 extensible design that doesn't lock our options.
 
 
 Thanks,
 
 Andrei

You may want to have a look at the CoreCLR security model (that's used by silverlight / moonlight). It's quite similar to what you've proposed.
http://www.mono-project.com/Moonlight2CoreCLR#Security_levels

Btw, is there a reason why safety should be specified at the module level? As we have attributes now, that would be a perfect use case for them. Example:

@Safety(Safe)
void doSomething()...

or:

@Safety.Critical
void doSomething()...

where that attribute could be applied to functions, classes, modules, ...

Another related question: will there be a way to provide different implementations for different safety levels?

version(Safety.Critical)
{
   //Some unsafe yet highly optimized asm stuff here
}
else
{
   //Same thing in safe
}
Nov 04 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
jpf wrote:
 Andrei Alexandrescu wrote:
 How can we address that? Again, I'm looking for a simple, robust,
 extensible design that doesn't lock our options.


 Thanks,

 Andrei

You may want to have a look at the CoreCLR security model (that's used by silverlight / moonlight). It's quite similar to what you've proposed.
http://www.mono-project.com/Moonlight2CoreCLR#Security_levels

I don't have much time right now, but here's what a cursory look reveals:

====================
Security levels

The CoreCLR security model divide all code into three distinct levels: transparent, safe-critical and critical. This model is much simpler to understand (and implement) than CAS (e.g. no stack-walk). Only a few rules can describe much of it.
====================

The keywords "security" and "stack-walk" give it away that this is a matter of software security, not language safety. These are quite different.
 Btw, is there a reason why safety should be specified at the module
 level? As we have attributes now that would be a perfect usecase for
 them: example:
 
 @Safety(Safe)
 void doSomething()...
 
 or:
 @Safety.Critical
 void doSomething()...
 
 where that attribute could be applied to functions, classes, modules, ...
 
 Another related question: Will there be a way to provide different
 implementations for different safety levels?
 
 version(Safety.Critical)
 {
    //Some unsafe yet highly optimized asm stuff here
 }
 else
 {
    //Same thing in safe
 }

I think it muddies things too much to allow people to make safety decisions at any point (e.g., I'm not a fan of C#'s unsafe).

Andrei
Nov 04 2009
parent jpf <spam example.com> writes:
Andrei Alexandrescu wrote:
 jpf wrote:
 You may want to have a look at the CoreCLR security model (that's used
 by silverlight / moonlight). It's quite similar to what you've proposed.
 http://www.mono-project.com/Moonlight2CoreCLR#Security_levels

I don't have much time right now, but here's what a cursory look reveals:

====================
Security levels

The CoreCLR security model divide all code into three distinct levels: transparent, safe-critical and critical. This model is much simpler to understand (and implement) than CAS (e.g. no stack-walk). Only a few rules can describe much of it.
====================

The keywords "security" and "stack-walk" give it away that this is a matter of software security, not language safety. These are quite different.

What I wanted to refer to are the levels "Transparent", "Critical" and "Safe Critical", which work exactly like "safe", "system" and "Yeah, I do unsafe stuff inside, but safe modules can call me no problem". The implementation and use case might be different, but the meaning is the same. There's nothing unique in the .net implementation; I just thought you might want to have a look at how others solved a similar problem.
Nov 04 2009
prev sibling next sibling parent reply Tim Matthews <tim.matthews7 gmail.com> writes:
Andrei Alexandrescu wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave in 
 place a stub that won't lock our options. Here's what we currently have:
 
 module(system) calvin;
 
 This means calvin can do unsafe things.
 
 module(safe) susie;
 
 This means susie commits to extra checks and therefore only a subset of D.
 
 module hobbes;
 
 This means hobbes abides to whatever the default safety setting is.
 
 The default safety setting is up to the compiler. In dmd by default it 
 is "system", and can be overridden with "-safe".
 
 Sketch of the safe rules:
 
 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa
 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
   pointer, or  struct  including such a type)
 \item No pointer arithmetic
 \item No escape of a pointer  or reference to a local variable outside
   its scope
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}
 
 So these are my thoughts so far. There is one problem though related to 
 the last \item - there's no way for a module to specify "trusted", 
 meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me 
 no problem". Many modules in std fit that mold.
 
 How can we address that? Again, I'm looking for a simple, robust, 
 extensible design that doesn't lock our options.
 
 
 Thanks,
 
 Andrei

Not sure if this is the right topic to say this, but maybe D needs monads to allow more functions to be marked as pure. Then functional could be added to the list of paradigms D supports, and D would also be safer.
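For context, pure in D admits only functions with no side effects, while a monadic style would thread an effect through pure code by making it part of the returned value. A rough Writer-style sketch (the Logged wrapper is hypothetical, not an existing library type):

```d
// Hypothetical Writer-style wrapper: the "side effect" (a log)
// travels inside the return value, so the functions stay pure.
struct Logged(T)
{
    T value;
    string log;
}

pure Logged!int addLogged(int a, int b)
{
    return Logged!int(a + b, "add\n");
}

pure Logged!int doubled(int a)
{
    auto r = addLogged(a, a);
    return Logged!int(r.value, r.log ~ "doubled\n");
}
```

The composition step (concatenating logs while passing values along) is what a monadic bind would abstract over.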
Nov 04 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Tim Matthews wrote:
 Andrei Alexandrescu wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave 
 in place a stub that won't lock our options. Here's what we currently 
 have:

 module(system) calvin;

 This means calvin can do unsafe things.

 module(safe) susie;

 This means susie commits to extra checks and therefore only a subset 
 of D.

 module hobbes;

 This means hobbes abides to whatever the default safety setting is.

 The default safety setting is up to the compiler. In dmd by default it 
 is "system", and can be overridden with "-safe".

 Sketch of the safe rules:

 \begin{itemize*}
 \item No  cast  from a pointer type to an integral type and vice versa
 \item No  cast  between unrelated pointer types
 \item Bounds checks on all array accesses
 \item  No  unions  that  include  a reference  type  (array,   class ,
   pointer, or  struct  including such a type)
 \item No pointer arithmetic
 \item No escape of a pointer  or reference to a local variable outside
   its scope
 \item Cross-module function calls must only go to other  safe  modules
 \end{itemize*}

 So these are my thoughts so far. There is one problem though related 
 to the last \item - there's no way for a module to specify "trusted", 
 meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me 
 no problem". Many modules in std fit that mold.

 How can we address that? Again, I'm looking for a simple, robust, 
 extensible design that doesn't lock our options.


 Thanks,

 Andrei

Not sure if this is the right topic to say this but maybe D needs monads to allow more functions to be marked as pure. Then functional could be added to the list of paradigms D supports and will also be safer.

Would be great if you found the time to write and discuss a DIP.

Andrei
Nov 05 2009
prev sibling parent "AJ" <aj nospam.net> writes:
Andrei Alexandrescu wrote:
 SafeD is, unfortunately, not finished at the moment. I want to leave
 in place a stub that won't lock our options. Here's what we currently
 have:

Is the whole SafeD thing trying to do something similar to Microsoft's "managed/unmanaged" code thing? I don't know much about it, but I had relegated the managed/unmanaged thing to being C++-like (unmanaged) or Java-like (managed). "Sandboxing", in short.
Nov 06 2009