## digitalmars.D - safety model in D

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
SafeD is, unfortunately, not finished at the moment. I want to leave in
place a stub that won't lock our options. Here's what we currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides by whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default it
is "system", and can be overridden with "-safe".

Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa
\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)
\item No pointer arithmetic
\item No escape of a pointer  or reference to a local variable outside
its scope
\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}
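To make the sketch concrete, here is a hedged illustration using the proposed module(safe) syntax; the rejected lines are examples of what the rules above would forbid, not output of any existing compiler:

```d
module(safe) susie;

void example(int*[] ptrs) {
    int local;
    int* p = &local;              // allowed only while it cannot escape this scope
    // auto n = cast(size_t) p;   // rejected: cast from pointer to integral type
    // auto q = cast(double*) p;  // rejected: cast between unrelated pointer types
    // p = p + 1;                 // rejected: pointer arithmetic
    int x = ptrs.length ? *ptrs[0] : 0; // allowed: array access is bounds-checked
}
```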

So these are my thoughts so far. There is one problem though related to
the last \item - there's no way for a module to specify "trusted",
meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me
no problem". Many modules in std fit that mold.

How can we address that? Again, I'm looking for a simple, robust,
extensible design that doesn't lock our options.

Thanks,

Andrei

Nov 03 2009
"Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message
news:hcqb44$1nc9$1 digitalmars.com...
SafeD is, unfortunately, not finished at the moment. I want to leave in
place a stub that won't lock our options. Here's what we currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides by whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default it is
"system", and can be overridden with "-safe".

Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa
\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)
\item No pointer arithmetic
\item No escape of a pointer  or reference to a local variable outside
its scope
\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}

So these are my thoughts so far. There is one problem though related to
the last \item - there's no way for a module to specify "trusted",
meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me no
problem". Many modules in std fit that mold.

How can we address that? Again, I'm looking for a simple, robust,
extensible design that doesn't lock our options.

module(system, trusted) calvin;
?

Nov 03 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Nick Sabalausky wrote:
module(system, trusted) calvin;
?

Yah, I was thinking of something along those lines. What I don't like is
that trust is taken, not granted. But then a model with granted trust
would be more difficult to define.

Andrei

Nov 03 2009
"Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message
news:hcqc3m$1pgl$2 digitalmars.com...
Nick Sabalausky wrote:
module(system, trusted) calvin;
?

Yah, I was thinking of something along those lines. What I don't like is
that trust is taken, not granted. But then a model with granted trust
would be more difficult to define.

Andrei

import(trust) someSystemModule;
?

I get the feeling there's more to this issue than what I'm seeing...

Nov 03 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Nick Sabalausky wrote:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message
news:hcqc3m$1pgl$2 digitalmars.com...
Nick Sabalausky wrote:
module(system, trusted) calvin;
?

that trust is taken, not granted. But then a model with granted trust
would be more difficult to define.

Andrei

import(trust) someSystemModule;
?

I get the feeling there's more to this issue than what I'm seeing...

There's a lot more, but there are a few useful subspaces. One is: if an
entire application uses only module(safe), that means there is no memory
error in that application, ever.

Andrei

Nov 03 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jesse Phillips wrote:
On Tue, 03 Nov 2009 17:55:15 -0600, Andrei Alexandrescu wrote:

There's a lot more, but there are a few useful subspaces. One is: if an
entire application uses only module(safe), that means there is no memory
error in that application, ever.

Andrei

Does that mean that a module that uses a "trusted" module must also be
marked as "trusted?" I would see this as pointless since system modules
are likely to be used in safe code a lot.

Same here.

I think the only real option is to have the importer decide if it is
trusted.

That can't work. I can't say that stdc.stdlib is trusted no matter how
hard I try. I mean free is there!

I don't see a reasonable way to have third party certification.
It is between the library writer and application developer. Since the
library writer's goal should be to have a system module that is safe, he
would likely want to mark it as trusted. This would leave "system" unused
because everyone wants to be safe.

Certain modules definitely can't aspire to be trusted. But for example
std.stdio can claim to be trusted because, in spite of using untrusted
stuff like FILE* and fclose, they are encapsulated in a way that makes
it impossible for a safe client to engender memory errors.
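A hedged sketch of the kind of encapsulation meant here; the wrapper below is illustrative only, not the actual std.stdio implementation, and the module(trusted) syntax is the proposal under discussion:

```d
module(trusted) myio; // hypothetical: unsafe inside, callable from safe code

import core.stdc.stdio;

struct File {
    private FILE* handle; // the raw pointer is never exposed to callers

    void close() {
        // unsafe C calls, but confined behind a safe interface
        if (handle !is null) {
            fclose(handle);
            handle = null; // no dangling handle can leak out
        }
    }
}
```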

In conclusion, here is a chunk of possible import options. I vote for the
top two.

import(system) std.stdio;
system import std.stdio;
trusted import std.stdio;
import(trusted) std.stdio;
import("This is a system module and I know that it is potentially unsafe,
but I still want to use it in my safe code") std.stdio;

Specifying a clause with import crossed my mind too, it's definitely
something to keep in mind.

Andrei

Nov 03 2009
"Aelxx" <aelxx yandex.ru> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message
news:hcr2hb$dvm$1 digitalmars.com...
Jesse Phillips wrote:
On Tue, 03 Nov 2009 17:55:15 -0600, Andrei Alexandrescu wrote:

There's a lot more, but there are a few useful subspaces. One is: if an
entire application uses only module(safe), that means there is no memory
error in that application, ever.

Andrei

Does that mean that a module that uses a "trusted" module must also be
marked as "trusted?" I would see this as pointless since system modules
are likely to be used in safe code a lot.

Same here.

I think the only real option is to have the importer decide if it is
trusted.

That can't work. I can't say that stdc.stdlib is trusted no matter how
hard I try. I mean free is there!

I don't see a reasonable way to have third party certification. It is
between the library writer and application developer. Since the library
writer's goal should be to have a system module that is safe, he would
likely want to mark it as trusted. This would leave "system" unused
because everyone wants to be safe.

Certain modules definitely can't aspire to be trusted. But for example
std.stdio can claim to be trusted because, in spite of using untrusted
stuff like FILE* and fclose, they are encapsulated in a way that makes it
impossible for a safe client to engender memory errors.

In conclusion, here is a chunk of possible import options. I vote for the
top two.

import(system) std.stdio;
system import std.stdio;
trusted import std.stdio;
import(trusted) std.stdio;
import("This is a system module and I know that it is potentially unsafe,
but I still want to use it in my safe code") std.stdio;

Specifying a clause with import crossed my mind too, it's definitely
something to keep in mind.

Andrei

system module foo ;
... (code)
trusted module foo2 ;
... (code)
safe module bar ;
... (code)

import foo, foo2, bar ; // status defined automatically from module declaration
//  error: used system module 'foo' in safe application.

Nov 04 2009
Jason House <jason.james.house gmail.com> writes:
Aelxx Wrote:

"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message
news:hcr2hb$dvm$1 digitalmars.com...
Jesse Phillips wrote:
On Tue, 03 Nov 2009 17:55:15 -0600, Andrei Alexandrescu wrote:

There's a lot more, but there are a few useful subspaces. One is: if an
entire application uses only module(safe), that means there is no memory
error in that application, ever.

Andrei

Does that mean that a module that uses a "trusted" module must also be
marked as "trusted?" I would see this as pointless since system modules
are likely to be used in safe code a lot.

Same here.

I think the only real option is to have the importer decide if it is
trusted.

That can't work. I can't say that stdc.stdlib is trusted no matter how
hard I try. I mean free is there!

I don't see a reasonable way to have third party certification. It is
between the library writer and application developer. Since the library
writer's goal should be to have a system module that is safe, he would
likely want to mark it as trusted. This would leave "system" unused
because everyone wants to be safe.

Certain modules definitely can't aspire to be trusted. But for example
std.stdio can claim to be trusted because, in spite of using untrusted
stuff like FILE* and fclose, they are encapsulated in a way that makes it
impossible for a safe client to engender memory errors.

In conclusion, here is a chunk of possible import options. I vote for the
top two.

import(system) std.stdio;
system import std.stdio;
trusted import std.stdio;
import(trusted) std.stdio;
import("This is a system module and I know that it is potentially unsafe,
but I still want to use it in my safe code") std.stdio;

Specifying a clause with import crossed my mind too, it's definitely
something to keep in mind.

Andrei

system module foo ;
... (code)
trusted module foo2 ;
... (code)
safe module bar ;
... (code)

import foo, foo2, bar ; // status defined automatically from module declaration
//  error: used system module 'foo' in safe application.

What stops an irritated programmer from marking every one of his modules as
trusted?

An even worse scenario would be if they created a safe facade module
importing all their pseudo-trusted code. As described so far, trust isn't
transitive/viral.

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jesse Phillips wrote:
On Tue, 03 Nov 2009 23:13:14 -0600, Andrei Alexandrescu wrote:

I think the only real option is to have the importer decide if it is
trusted.

hard I try. I mean free is there!

I would like to disagree here.

void free(void *ptr);

free() takes a pointer. There is no way for the coder to get a pointer in
SafeD; the compiler won't let them, so the function is unusable by a "safe"
module even if the function is imported.

Pointers should be available to SafeD, just not certain operations with
them.
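A hedged sketch of that distinction, i.e. which pointer operations a safe module might keep and which it would lose (illustrative code using the proposed syntax, not an implemented rule set):

```d
module(safe) demo;

void f() {
    int x = 42;
    int* p = &x;   // forming and holding a pointer: allowed
    int y = *p;    // dereferencing: allowed while p provably does not escape
    // p = p + 1;  // pointer arithmetic: disallowed
    // free(p);    // free stays unusable: it consumes a raw pointer unsafely
}
```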

Andrei

Nov 04 2009
Jesse Phillips <jessekphillips+D gamil.com> writes:
Andrei Alexandrescu Wrote:

Jesse Phillips wrote:
On Tue, 03 Nov 2009 23:13:14 -0600, Andrei Alexandrescu wrote:

I think the only real option is to have the importer decide if it is
trusted.

hard I try. I mean free is there!

I would like to disagree here.

void free(void *ptr);

free() takes a pointer. There is no way for the coder to get a pointer in
SafeD; the compiler won't let them, so the function is unusable by a "safe"
module even if the function is imported.

Pointers should be available to SafeD, just not certain operations with
them.

Andrei

I must have been confused by the statement:

"As long as these pointers are not exposed to the client, such an
implementation might be certified to be SafeD compatible."

Found in the article on SafeD. I realize things may change; it just sounded like
pointers were never an option.

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jesse Phillips wrote:
Andrei Alexandrescu Wrote:

Jesse Phillips wrote:
On Tue, 03 Nov 2009 23:13:14 -0600, Andrei Alexandrescu wrote:

I think the only real option is to have the importer decide if it is
trusted.

hard I try. I mean free is there!

void free(void *ptr);

free() takes a pointer. There is no way for the coder to get a pointer in
SafeD; the compiler won't let them, so the function is unusable by a "safe"
module even if the function is imported.

them.

Andrei

I must have been confused by the statement:

"As long as these pointers are not exposed to the client, such an
implementation might be certified to be SafeD compatible."

Found in the article on SafeD. I realize things may change; it just sounded like
pointers were never an option.

Yes, sorry for not mentioning that. It was Walter's idea to allow
restricted use of pointers in SafeD. Initially we were thinking of
banning pointers altogether.

Andrei

Nov 04 2009
Jesse Phillips <jessekphillips gmail.com> writes:
On Tue, 03 Nov 2009 17:55:15 -0600, Andrei Alexandrescu wrote:

There's a lot more, but there are a few useful subspaces. One is: if an
entire application uses only module(safe), that means there is no memory
error in that application, ever.

Andrei

Does that mean that a module that uses a "trusted" module must also be
marked as "trusted?" I would see this as pointless since system modules
are likely to be used in safe code a lot.

I think the only real option is to have the importer decide if it is
trusted. I don't see a reasonable way to have third party certification.
It is between the library writer and application developer. Since the
library writer's goal should be to have a system module that is safe, he
would likely want to mark it as trusted. This would leave "system" unused
because everyone wants to be safe.

In conclusion, here is a chunk of possible import options. I vote for the
top two.

import(system) std.stdio;
system import std.stdio;
trusted import std.stdio;
import(trusted) std.stdio;
import("This is a system module and I know that it is potentially unsafe,
but I still want to use it in my safe code") std.stdio;

Nov 03 2009
Jesse Phillips <jessekphillips gmail.com> writes:
On Tue, 03 Nov 2009 23:13:14 -0600, Andrei Alexandrescu wrote:

I think the only real option is to have the importer decide if it is
trusted.

That can't work. I can't say that stdc.stdlib is trusted no matter how
hard I try. I mean free is there!

I would like to disagree here.

void free(void *ptr);

free() takes a pointer. There is no way for the coder to get a pointer in
SafeD; the compiler won't let them, so the function is unusable by a "safe"
module even if the function is imported.

Nov 04 2009
dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
SafeD is, unfortunately, not finished at the moment. I want to leave in
place a stub that won't lock our options. Here's what we currently have:
module(system) calvin;
This means calvin can do unsafe things.
module(safe) susie;
This means susie commits to extra checks and therefore only a subset of D.
module hobbes;
This means hobbes abides by whatever the default safety setting is.
The default safety setting is up to the compiler. In dmd by default it
is "system", and can be overridden with "-safe".
Sketch of the safe rules:
\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa
\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)
\item No pointer arithmetic
\item No escape of a pointer  or reference to a local variable outside
its scope
\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}
So these are my thoughts so far. There is one problem though related to
the last \item - there's no way for a module to specify "trusted",
meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me
no problem". Many modules in std fit that mold.
How can we address that? Again, I'm looking for a simple, robust,
extensible design that doesn't lock our options.
Thanks,
Andrei

One comment that came up in discussing GC stuff in Bugzilla:  How do you prevent
the following in SafeD?

auto arrayOfRefs = new SomeClass[100];
GC.setAttr(arrayOfRefs.ptr, GC.BlkAttr.NO_SCAN);

foreach (ref elem; arrayOfRefs) {
    elem = new SomeClass();
}

Nov 03 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
SafeD is, unfortunately, not finished at the moment. I want to leave in
place a stub that won't lock our options. Here's what we currently have:
module(system) calvin;
This means calvin can do unsafe things.
module(safe) susie;
This means susie commits to extra checks and therefore only a subset of D.
module hobbes;
This means hobbes abides by whatever the default safety setting is.
The default safety setting is up to the compiler. In dmd by default it
is "system", and can be overridden with "-safe".
Sketch of the safe rules:
\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa
\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)
\item No pointer arithmetic
\item No escape of a pointer  or reference to a local variable outside
its scope
\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}
So these are my thoughts so far. There is one problem though related to
the last \item - there's no way for a module to specify "trusted",
meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me
no problem". Many modules in std fit that mold.
How can we address that? Again, I'm looking for a simple, robust,
extensible design that doesn't lock our options.
Thanks,
Andrei

One comment that came up in discussing GC stuff in Bugzilla: How do you prevent
the following in SafeD?

auto arrayOfRefs = new SomeClass[100];
GC.setAttr(arrayOfRefs.ptr, GC.BlkAttr.NO_SCAN);

foreach (ref elem; arrayOfRefs) {
    elem = new SomeClass();
}

Is GC.setAttr a safe function?

Andrei

Nov 03 2009
Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, on November 3 at 16:33 you wrote:
SafeD is, unfortunately, not finished at the moment. I want to leave
in place a stub that won't lock our options. Here's what we
currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides by whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default
it is "system", and can be overridden with "-safe".

What's the rationale for letting the compiler decide? I can't see anything
but trouble with this. A module will typically be written to be safe or
system, so I think the default should be defined (I'm not sure what the
default should be, though).

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Did you know that originally a Danish guy invented the burglar alarm?
Unfortunately, it got stolen

Nov 03 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
Andrei Alexandrescu, on November 3 at 16:33 you wrote:
SafeD is, unfortunately, not finished at the moment. I want to leave
in place a stub that won't lock our options. Here's what we
currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides by whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default
it is "system", and can be overridden with "-safe".

What's the rationale for letting the compiler decide? I can't see anything
but trouble with this. A module will typically be written to be safe or
system, so I think the default should be defined (I'm not sure what the
default should be, though).

The parenthesis pretty much destroys your point :o).

I don't think letting the implementation decide is a faulty model. If
you know what you want, you say it. Otherwise it means you don't care.

Andrei

Nov 03 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
Andrei Alexandrescu, on November 3 at 17:54 you wrote:
Leandro Lucarella wrote:
Andrei Alexandrescu, on November 3 at 16:33 you wrote:
SafeD is, unfortunately, not finished at the moment. I want to leave
in place a stub that won't lock our options. Here's what we
currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides by whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default
it is "system", and can be overridden with "-safe".

but trouble with this. A module will typically be written to be safe or
system, so I think the default should be defined (I'm not sure what the
default should be, though).

I guess this is a joke, but I have to ask: why? I'm not sure about plenty
of stuff; that doesn't mean it's pointless.

Oh, I see what you mean. The problem is that many are as unsure as you
are about what the default should be. If too many are unsure, maybe the
decision should be left as a choice.

I don't think letting the implementation decide is a faulty model.
If you know what you want, you say it. Otherwise it means you don't
care.

I can't understand how you can't care. Maybe I'm misunderstanding the
proposal, since nobody else seems to see a problem here.

It's not a proposal as much as a discussion opener, but my suggestion is
that if you just say module without any qualification, you leave it to
the person compiling to choose the safety level.

Andrei

Nov 04 2009
Bill Baxter <wbaxter gmail.com> writes:
On Tue, Nov 3, 2009 at 2:33 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
SafeD is, unfortunately, not finished at the moment. I want to leave in
place a stub that won't lock our options. Here's what we currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides by whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default it is
"system", and can be overridden with "-safe".

Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa
\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item No unions that include a reference type (array, class,
pointer, or struct including such a type)
\item No pointer arithmetic
\item No escape of a pointer or reference to a local variable outside
its scope
\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}

So these are my thoughts so far. There is one problem though related to the
last \item - there's no way for a module to specify "trusted", meaning:
"Yeah, I do unsafe stuff inside, but safe modules can call me no problem".
Many modules in std fit that mold.

How can we address that? Again, I'm looking for a simple, robust, extensible
design that doesn't lock our options.

I have to say that I would be seriously annoyed to see repeated
references to a feature that turns out to be vaporware.  (I'm guessing
there will be repeated references to SafeD based on the Chapter 4
sample, and I'm guessing it will be vaporware based on the question
you're asking above).  I'd say leave SafeD for the 2nd edition, and
just comment that work is underway in a "Future of D" chapter near the
end of the book.  And of course add a "Look to <the publisher's website
|| digitalmars.com> for the latest!"

Even if not vaporware, it looks like whatever you write is going to be
about something completely untested in the wild, and so has a high
chance of turning out to need re-designing in the face of actual use.

--bb

Nov 03 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Bill Baxter wrote:
On Tue, Nov 3, 2009 at 2:33 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
SafeD is, unfortunately, not finished at the moment. I want to leave in
place a stub that won't lock our options. Here's what we currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides by whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default it is
"system", and can be overridden with "-safe".

Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa
\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)
\item No pointer arithmetic
\item No escape of a pointer  or reference to a local variable outside
its scope
\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}

So these are my thoughts so far. There is one problem though related to the
last \item - there's no way for a module to specify "trusted", meaning:
"Yeah, I do unsafe stuff inside, but safe modules can call me no problem".
Many modules in std fit that mold.

How can we address that? Again, I'm looking for a simple, robust, extensible
design that doesn't lock our options.

I have to say that I would be seriously annoyed to see repeated
references to a feature that turns out to be vaporware.  (I'm guessing
there will be repeated references to SafeD based on the Chapter 4
sample, and I'm guessing it will be vaporware based on the question
you're asking above).  I'd say leave SafeD for the 2nd edition, and
just comment that work is underway in a "Future of D" chapter near the
end of the book.  And of course add a "Look to <the publisher's website
|| digitalmars.com> for the latest!"

Even if not vaporware, it looks like whatever you write is going to be
about something completely untested in the wild, and so has a high
chance of turning out to need re-designing in the face of actual use.

--bb

Ok, I won't use the term SafeD as if it were a product. But -safe is
there, some checks are there, and Walter is apparently willing to
complete them. It's not difficult to go with an initially conservative
approach - e.g., "no taking the address of a local" as he wrote in a
recent post - although a more refined approach would still allow taking
addresses of locals, as long as they don't escape.
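A hedged illustration of the conservative rule versus the refined one (hypothetical code using the proposed syntax; neither check exists yet):

```d
module(safe) demo;

int* leak() {
    int local;
    return &local; // escapes its scope: rejected under either approach
}

int fine() {
    int local = 41;
    int* p = &local; // address of a local, but it never escapes:
    return *p + 1;   // conservative rule rejects this; refined rule could allow it
}
```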

Andrei

Nov 03 2009
Bill Baxter <wbaxter gmail.com> writes:
On Tue, Nov 3, 2009 at 3:54 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
Leandro Lucarella wrote:
Andrei Alexandrescu, on November 3 at 16:33 you wrote:
SafeD is, unfortunately, not finished at the moment. I want to leave
in place a stub that won't lock our options. Here's what we
currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of
D.

module hobbes;

This means hobbes abides by whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default
it is "system", and can be overridden with "-safe".

What's the rationale for letting the compiler decide? I can't see anything
but trouble with this. A module will typically be written to be safe or
system, so I think the default should be defined (I'm not sure what the
default should be, though).

The parenthesis pretty much destroys your point :o).

I don't think letting the implementation decide is a faulty model. If you
know what you want, you say it. Otherwise it means you don't care.

How can you not care?  Either your module uses unsafe features or it
doesn't.  So it seems if you don't specify, then your module must pass
the strictest checks, because otherwise it's not a "don't care"
situation -- it's a "system"-only situation.

--bb

Nov 03 2009
Walter Bright <newshound1 digitalmars.com> writes:
Andrei Alexandrescu wrote:
Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa

replace integral type with non-pointer type.

\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)

pointers are not a reference type. Replace "reference type" with
"pointers or reference types".

\item No pointer arithmetic

\item No escape of a pointer  or reference to a local variable outside
its scope

revise: cannot take the address of a local or a reference.

\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}

add:
. no inline assembler
. no casting away of const, immutable, or shared
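A hedged sketch of code the two added rules would reject (hypothetical module syntax; the offending lines are shown commented out):

```d
module(safe) demo;

void f(const(int)[] a, immutable(char)[] s) {
    // auto m = cast(int[]) a;  // rejected: casts away const
    // auto t = cast(char[]) s; // rejected: casts away immutable
    // asm { nop; }             // rejected: inline assembler
}
```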

Nov 03 2009
Jason House <jason.james.house gmail.com> writes:
Walter Bright Wrote:

Andrei Alexandrescu wrote:
Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa

replace integral type with non-pointer type.

\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)

pointers are not a reference type. Replace "reference type" with
"pointers or reference types".

\item No pointer arithmetic

\item No escape of a pointer  or reference to a local variable outside
its scope

revise: cannot take the address of a local or a reference.

\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}

add:
. no inline assembler
. no casting away of const, immutable, or shared

How does casting away const, immutable, or shared cause memory corruption? If I
understand SafeD correctly, that's its only goal. If it does more, I'd also
argue casting to shared or immutable is, in general, unsafe. I'm also unsure
whether SafeD has really fleshed out what would make use of (lockfree) shared
variables safe. For example, array concatenation in one thread while reading in
another thread could allow reading of garbage memory (e.g. if the length was
incremented before writing the cell contents).

Nov 03 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
Walter Bright Wrote:

Andrei Alexandrescu wrote:
Sketch of the safe rules:

\begin{itemize*} \item No  cast  from a pointer type to an
integral type and vice versa

\item No  cast  between unrelated pointer types \item Bounds
checks on all array accesses \item  No  unions  that  include  a
reference  type  (array,   class , pointer, or  struct  including
such a type)

"pointers or reference types".

\item No pointer arithmetic \item No escape of a pointer  or
reference to a local variable outside its scope

\item Cross-module function calls must only go to other  safe
modules \end{itemize*}

or shared

How does casting away const, immutable, or shared cause memory
corruption?

If you have an immutable string, the compiler may cache or enregister
the length and do anything (such as hoisting checks out of loops) in
confidence the length will never change. If you do change it -> memory
error.
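A minimal D sketch of that aliasing hazard (hypothetical code, not from the thread; it compiles only as system code):

```d
// The compiler may assume immutable data never changes: it can cache
// a.length in a register and hoist bounds checks out of the loop.
int sum(immutable(int)[] a)
{
    int total = 0;
    foreach (i; 0 .. a.length)
        total += a[i];
    return total;
}

// Casting away immutable creates a mutable alias to data that every
// other part of the program assumed frozen, invalidating those
// assumptions -> memory error.
void breakIt(immutable(int)[] a)
{
    int[] m = cast(int[]) a;  // disallowed under the safe rules
    m[0] = 42;                // mutates "immutable" data
}
```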

If I understand SafeD correctly, that's its only goal. If it does
more, I'd also argue casting to shared or immutable is, in general,
unsafe. I'm also unsure if safeD has really fleshed out what would
make use of (lockfree) shared variables safe. For example, array
concatenation in one thread while reading in another thread could
allow reading of garbage memory (e.g. if the length was incremented
before writing the cell contents)

Shared arrays can't be modified.

Andrei

Nov 03 2009
Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

Jason House wrote:
Walter Bright Wrote:

Andrei Alexandrescu wrote:
Sketch of the safe rules:

\begin{itemize*} \item No  cast  from a pointer type to an
integral type and vice versa

\item No  cast  between unrelated pointer types \item Bounds
checks on all array accesses \item  No  unions  that  include  a
reference  type  (array,   class , pointer, or  struct  including
such a type)

"pointers or reference types".

\item No pointer arithmetic \item No escape of a pointer  or
reference to a local variable outside its scope

\item Cross-module function calls must only go to other  safe
modules \end{itemize*}

or shared

How does casting away const, immutable, or shared cause memory
corruption?

If you have an immutable string, the compiler may cache or enregister
the length and do anything (such as hoisting checks out of loops) in
confidence the length will never change. If you do change it -> memory
error.

These arguments are pretty reversible to show casting to XXX is as unsafe as
casting away XXX. Consider code that creates thread-local mutable data, leaks
it (e.g. assigns it to a global), and then casts it to immutable or shared and
makes another call. To me, this is indistinguishable from the unsafe case.

Nov 04 2009
Walter Bright <newshound1 digitalmars.com> writes:
Jason House wrote:
How does casting away const, immutable, or shared cause memory
corruption? If I understand SafeD correctly, that's its only goal. If
it does more, I'd also argue casting to shared or immutable is, in
general, unsafe.

They can cause memory corruption because inadvertent "tearing" can occur
when two parts to a memory reference are updated, half from one and half
from another alias.

I'm also unsure if safeD has really fleshed out what
would make use of (lockfree) shared variables safe. For example,
array concatenation in one thread while reading in another thread
could allow reading of garbage memory (e.g. if the length was
incremented before writing the cell contents)

That kind of out-of-order reading is just what shared is meant to prevent.

Nov 03 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Walter Bright wrote:
Andrei Alexandrescu wrote:
Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa

replace integral type with non-pointer type.

\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)

pointers are not a reference type. Replace "reference type" with
"pointers or reference types".

\item No pointer arithmetic

\item No escape of a pointer  or reference to a local variable outside
its scope

revise: cannot take the address of a local or a reference.

\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}

add:
. no inline assembler
. no casting away of const, immutable, or shared

Ok, here's what I have now:

\begin{itemize*}
\item No  cast  from a pointer type to a non-pointer type (e.g.~ int )
and vice versa
\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item No  unions that include a pointer type, a reference  type (array,
 class ), or a  struct  including such a type
\item No pointer arithmetic
\item Taking the  address of a local is forbidden  (in fact the needed
restriction is to  not allow such an address to  escape, but that is
more difficult to track)
\item Cross-module function calls must only go to other  safe  modules
\item No inline assembler
\item No casting away of  const ,  immutable , or  shared
\end{itemize*}

Andrei

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
Walter Bright, el  3 de noviembre a las 16:21 me escribiste:
Andrei Alexandrescu wrote:
Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa

\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)

"pointers or reference types".

Strictly speaking, arrays are not reference types either, right?

Ok, in order to not create confusion, I changed that. Here's the new
list with one added item:

\begin{itemize*}
\item No  cast  from a pointer type to a non-pointer type (e.g.~ int )
and vice versa
\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item No  unions that include a pointer  type, a  class   type, an array
type, or a  struct  embedding such a type
\item No pointer arithmetic
\item Taking the  address of a local is forbidden  (in fact the needed
restriction is to  not allow such an address to  escape, but that is
more difficult to track)
\item Cross-module function calls must only go to other  safe  modules
\item No inline assembler
\item No casting away of  const ,  immutable , or  shared
\item No calls to unsafe functions
\end{itemize*}
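For illustration, here is a hypothetical snippet that trips most of the rules above; every commented line would be rejected in a safe module, though all of it compiles as system code today:

```d
void unsafeGrabBag()
{
    int x;
    int* p = cast(int*) 0xDEAD;      // cast between pointer and non-pointer
    float* f = cast(float*) p;       // cast between unrelated pointer types
    int* q = &x;                     // taking the address of a local
    q = q + 1;                       // pointer arithmetic
    union U { int* ptr; int bits; }  // union including a pointer type
    immutable int y = 1;
    auto r = cast(int*) &y;          // casting away immutable
    asm { nop; }                     // inline assembler
}
```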

Andrei

Nov 04 2009
"Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 03 Nov 2009 17:33:39 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

SafeD is, unfortunately, not finished at the moment. I want to leave in
place a stub that won't lock our options. Here's what we currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of
D.

module hobbes;

This means hobbes abides to whatever the default safety setting is.
...
\item Cross-module function calls must only go to other  safe  modules
\end{itemize*}

So these are my thoughts so far. There is one problem though related to
the last \item - there's no way for a module to specify "trusted",
meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me
no problem". Many modules in std fit that mold.

My interpretation of the module decorations was:

module(system) calvin;

This means calvin uses unsafe things, but is considered safe for other
modules (it overrides the setting of the compiler, so can be compiled even
in safe mode).

module(safe) susie;

This means susie commits to extra checks, and will be compiled in safe
mode even if the compiler is in unsafe mode.  Susie can only import
module(safe) or module(system) modules, or if the compiler is in safe
mode, any module.

module hobbes;

This means hobbes doesn't care whether he's safe or not.  (note the
important difference from your description)

My rationale for interpreting module(system) is: why declare a module as
system unless you *wanted* it to be compilable in safe mode?

I would expect that very few modules are marked as module(system).

And as for the default setting, I think that unsafe is a reasonable
default.  You can always create a shortcut/script/symlink to the compiler
that adds the -safe flag if you wanted a safe-by-default version.

-Steve

Nov 04 2009
Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
SafeD is, unfortunately, not finished at the moment. I want to leave in
place a stub that won't lock our options. Here's what we currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides to whatever the default safety setting is.

Where did susie come from? Only module(system) has been discussed
before. Why the need for THREE types of modules? Distinguishing hobbes
and susie seems pointless -- either hobbes is safe, or else it will not
compile with the -safe switch (and it won't compile at all, on a
compiler which makes safe the default!!). It seems that module(safe) is
simply a comment, "yes, I've tested it with the -safe switch, and it
does compile". Doesn't add any value that I can see.

As I understood it, the primary purpose of 'SafeD' was to confine the
usage of dangerous constructs to a small number of modules. IMHO, the
overwhelming majority of modules should not require any marking.

\item Cross-module function calls must only go to other  safe  modules

So these are my thoughts so far. There is one problem though related to
the last \item - there's no way for a module to specify "trusted",
meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me
no problem". Many modules in std fit that mold.

How can we address that? Again, I'm looking for a simple, robust,
extensible design that doesn't lock our options.

This actually seems pretty similar to public/private.
I see three types of modules:

module  : the default, should compile in -safe mode.
module(system) : Modules which need to do nasty stuff inside, but for
which all the public functions are safe.
module(sysinternal/restricted/...): Modules which exist only to support
system modules. This will include most APIs to C libraries.

Modules in the outer ring need to be prevented from calling ones in the
inner ring.

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
Andrei Alexandrescu wrote:
SafeD is, unfortunately, not finished at the moment. I want to leave
in place a stub that won't lock our options. Here's what we currently
have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset
of D.

> module hobbes;
>
> This means hobbes abides to whatever the default safety setting is.

Where did susie come from?  Only module(system) has been discussed
before.

Well actually it's always been at least in the early discussions, I was
actually surprised that dmd doesn't yet accept it lexically.

Why the need for THREE types of modules? Distinguishing hobbes
and susie seems pointless -- either hobbes is safe, or else it will not
compile with the -safe switch (and it won't compile at all, on a
compiler which makes safe the default!!). It seems that module(safe) is
simply a comment, "yes, I've tested it with the -safe switch, and it
does compile". Doesn't add any value that I can see.

Agreed, module(safe) would be unnecessary if everything was safe unless
overridden with "system". This would be a hard sell to Walter, however. It
would be a hard sell to you, too - don't forget that safe + no bounds
checking = having the cake and eating it.

module(safe) is not a comment. We need three types of modules because of
the interaction between what the module declares and what the command
line wants.

Let's assume the default, no-flag build allows unsafe code, like right
now. Then, module(safe) means that the module volunteers itself for
tighter checking, and module(system) is same as module unadorned.

But then if the user compiles with -safe, module(safe) is the same as
module unadorned, and module(system) allows for unchecked operations in
that particular module. I was uncomfortable with this, but Walter
convinced me that D's charter is not to allow sandbox compilation and
execution of malicious code. If you have the sources, you may as well
take a look at their module declarations if you have some worry.

Regardless of the result of the debate regarding the default compilation
mode, if changing that default mode is allowed on the command line, then we
need both module(safe) and module(system).

On a loosely-related vein, I am starting to think it would be a good
idea to refine the module declaration some more. It's a great way to
have fine-grained compilation options without heavy command-line options.

module(safe, contracts, debug) mymodule;

This means the module forces safety, contract checks, and debug mode
within itself, regardless of the command line.

As I understood it, the primary purpose of 'SafeD' was to confine the
usage of dangerous constructs to a small number of modules. IMHO, the
overwhelming majority of modules should not require any marking.

Indeed. I hope so :o).

\item Cross-module function calls must only go to other  safe  modules

So these are my thoughts so far. There is one problem though related
to the last \item - there's no way for a module to specify "trusted",
meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me
no problem". Many modules in std fit that mold.

How can we address that? Again, I'm looking for a simple, robust,
extensible design that doesn't lock our options.

This actually seems pretty similar to public/private.
I see three types of modules:

module  : the default, should compile in -safe mode.
module(system) : Modules which need to do nasty stuff inside, but for
which all the public functions are safe.
module(sysinternal/restricted/...): Modules which exist only to support
system modules. This will include most APIs to C libraries.

Modules in the outer ring need to be prevented from calling ones in the
inner ring.

Well I wouldn't want to go any dirtier than "system", so my "system"
would be your "sysinternal". I'd like to soften "system" a bit into
e.g. "trusted", which would be your "system".

Andrei

Nov 04 2009
Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
Don wrote:
Andrei Alexandrescu wrote:

module(safe) is not a comment. We need three types of modules because of
the interaction between what the module declares and what the command
line wants.

Let's assume the default, no-flag build allows unsafe code, like right
now. Then, module(safe) means that the module volunteers itself for
tighter checking, and module(system) is same as module unadorned.

But then if the user compiles with -safe, module(safe) is the same as
module unadorned, and module(system) allows for unchecked operations in
that particular module. I was uncomfortable with this, but Walter
convinced me that D's charter is not to allow sandbox compilation and
execution of malicious code. If you have the sources, you may as well
take a look at their module declarations if you have some worry.

Regardless on the result of the debate regarding the default compilation
mode, if the change of that default mode is allowed in the command line,
then we need both module(safe) and module(system).

When would it be MANDATORY for a module to be compiled in safe mode?
If module(safe) implies bound-checking *cannot* be turned off for that
module, would any standard library modules be module(safe)?

This actually seems pretty similar to public/private.
I see three types of modules:

module  : the default, should compile in -safe mode.
module(system) : Modules which need to do nasty stuff inside, but for
which all the public functions are safe.
module(sysinternal/restricted/...): Modules which exist only to
support system modules. This will include most APIs to C libraries.

Modules in the outer ring need to be prevented from calling ones in
the inner ring.

Well I wouldn't want to go any dirtier than "system", so my "system"
would be your "sysinternal". I'd like to milden "system" a bit like in
e.g. "trusted", which would be your "system".

Yeah, the names don't matter. The thing is, modules in the inner ring
are extremely rare. I'd hope there'd be just a few in druntime, and no
public ones at all in Phobos.

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
Andrei Alexandrescu wrote:
Don wrote:
Andrei Alexandrescu wrote:

module(safe) is not a comment. We need three types of modules because
of the interaction between what the module declares and what the
command line wants.

Let's assume the default, no-flag build allows unsafe code, like right
now. Then, module(safe) means that the module volunteers itself for
tighter checking, and module(system) is same as module unadorned.

But then if the user compiles with -safe, module(safe) is the same as
module unadorned, and module(system) allows for unchecked operations
in that particular module. I was uncomfortable with this, but Walter
convinced me that D's charter is not to allow sandbox compilation and
execution of malicious code. If you have the sources, you may as well
take a look at their module declarations if you have some worry.

Regardless on the result of the debate regarding the default
compilation mode, if the change of that default mode is allowed in the
command line, then we need both module(safe) and module(system).

When would it be MANDATORY for a module to be compiled in safe mode?

module(safe) entails safe mode, come hell or high water.

If module(safe) implies bound-checking *cannot* be turned off for that
module, would any standard library modules be module(safe)?

I think most or all of the standard library is trusted. But don't forget
that std is a bad example of a typical library or program because std
interfaces programs with the OS.

This actually seems pretty similar to public/private.
I see three types of modules:

module  : the default, should compile in -safe mode.
module(system) : Modules which need to do nasty stuff inside, but for
which all the public functions are safe.
module(sysinternal/restricted/...): Modules which exist only to
support system modules. This will include most APIs to C libraries.

Modules in the outer ring need to be prevented from calling ones in
the inner ring.

Well I wouldn't want to go any dirtier than "system", so my "system"
would be your "sysinternal". I'd like to milden "system" a bit like in
e.g. "trusted", which would be your "system".

Yeah, the names don't matter. The thing is, modules in the inner ring
are extremely rare. I'd hope there'd be just a few in druntime, and no
public ones at all in Phobos.

That sounds plausible.

Andrei

Nov 04 2009
Don <nospam nospam.com> writes:
Andrei Alexandrescu wrote:
Don wrote:
Andrei Alexandrescu wrote:
Don wrote:
Andrei Alexandrescu wrote:

module(safe) is not a comment. We need three types of modules because
of the interaction between what the module declares and what the
command line wants.

Let's assume the default, no-flag build allows unsafe code, like
right now. Then, module(safe) means that the module volunteers itself
for tighter checking, and module(system) is same as module unadorned.

But then if the user compiles with -safe, module(safe) is the same as
module unadorned, and module(system) allows for unchecked operations
in that particular module. I was uncomfortable with this, but Walter
convinced me that D's charter is not to allow sandbox compilation and
execution of malicious code. If you have the sources, you may as well
take a look at their module declarations if you have some worry.

Regardless on the result of the debate regarding the default
compilation mode, if the change of that default mode is allowed in
the command line, then we need both module(safe) and module(system).

When would it be MANDATORY for a module to be compiled in safe mode?

module(safe) entails safe mode, come hell or high water.

If module(safe) implies bound-checking *cannot* be turned off for that
module, would any standard library modules be module(safe)?

I think most or all of the standard library is trusted. But don't forget
that std is a bad example of a typical library or program because std
interfaces programs with the OS.

I think it's not so atypical. Database, graphics, anything which calls a
C library will be the same.
For an app, I'd imagine you'd have a policy of either always compiling
with -safe, or ignoring it.
If you've got a general-purpose library, you have to assume some of your
users will be compiling with -safe. So you have to make all your library
modules safe, regardless of how they are marked. (Similarly, -w is NOT
optional for library developers).

That doesn't leave very much.
I'm not seeing the use case for module(safe).

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
Andrei Alexandrescu wrote:
Don wrote:
Andrei Alexandrescu wrote:
Don wrote:
Andrei Alexandrescu wrote:

module(safe) is not a comment. We need three types of modules
because of the interaction between what the module declares and what
the command line wants.

Let's assume the default, no-flag build allows unsafe code, like
right now. Then, module(safe) means that the module volunteers
itself for tighter checking, and module(system) is same as module
unadorned.

But then if the user compiles with -safe, module(safe) is the same
as module unadorned, and module(system) allows for unchecked
operations in that particular module. I was uncomfortable with this,
but Walter convinced me that D's charter is not to allow sandbox
compilation and execution of malicious code. If you have the
sources, you may as well take a look at their module declarations if
you have some worry.

Regardless on the result of the debate regarding the default
compilation mode, if the change of that default mode is allowed in
the command line, then we need both module(safe) and module(system).

When would it be MANDATORY for a module to be compiled in safe mode?

module(safe) entails safe mode, come hell or high water.

If module(safe) implies bound-checking *cannot* be turned off for
that module, would any standard library modules be module(safe)?

I think most or all of the standard library is trusted. But don't
forget that std is a bad example of a typical library or program
because std interfaces programs with the OS.

I think it's not so atypical. Database, graphics, anything which calls a
C library will be the same.

I still think the standard library is different because it's part of the
computing base offered by the language. A clearer example is Java, which
has things in its standard library that cannot be done in Java. But I
agree there will be other libraries that need to interface with C.

For an app, I'd imagine you'd have a policy of either always compiling
with -safe, or ignoring it.

I'd say for an app you'd have a policy of marking most modules as safe.
That makes it irrelevant what compiler switch is used and puts the onus
in the right place: the module.

If you've got a general-purpose library, you have to assume some of your
users will be compiling with -safe. So you have to make all your library
modules safe, regardless of how they are marked. (Similarly, -w is NOT
optional for library developers).

If you've got a general-purpose library, you try what any D codebase
should try: make most of your modules safe and as few as possible system.

That doesn't leave very much.
I'm not seeing the use case for module(safe).

I think you ascribe to -safe what module(safe) should do. My point is
that -safe is inferior, just some low-level means of choosing a default
absent other declaration. The "good" way to go about it is to think your
design in terms of safe vs. system modules.

Andrei

Nov 04 2009
Michel Fortin <michel.fortin michelf.com> writes:
On 2009-11-03 17:33:39 -0500, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> said:

So these are my thoughts so far. There is one problem though related to
the last \item - there's no way for a module to specify "trusted",
meaning: "Yeah, I do unsafe stuff inside, but safe modules can call me
no problem". Many modules in std fit that mold.

How can we address that? Again, I'm looking for a simple, robust,
extensible design that doesn't lock our options.

What you want is to define the safety of the implementation separately
from the safety of the interface. A safe module interface means that
you can use the module in safe code, while a system interface forbids
using the module in safe code. You could do this with two values in the
parentheses:

module (<interface-safety>, <implementation-safety>) <name>;

module (system, system) name; // interface: unsafe   impl.: unsafe
module (safe, safe) name;     // interface: safe     impl.: safe
module (safe, system) name;   // interface: safe     impl.: unsafe
module (system, safe) name;   // interface: unsafe   impl.: safe

(The last one is silly, I know.)

Then define a shortcut so you don't have to repeat yourself when the
safety of the two is the same:

module (<interface-and-implementation-safety>) <name>;

module (system) name;         // interface: unsafe   impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

Of note, this also leaves the door open to a more fine-grained security
policy in the future. We could add an 'extra-safe' or 'half-safe' mode
if we wanted.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Nov 04 2009
Michal Minich <michal minich.sk> writes:
Hello Michel,

module (system) name;         // interface: unsafe   impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

I thought that first (unsafe-unsafe) case is currently available just by:

module name; // interface: unsafe   impl.: unsafe

separating modules into unsafe-unsafe and safe-safe has no usefulness, as
those modules could not interact. Specifically, you need modules that are
implemented by unsafe means but provide only a safe interface, so I see it
as:

module name;                  // interface: unsafe   impl.: unsafe
module (system) name;         // interface: safe     impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

so you can call system modules (io, network...) from safe code.
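A sketch of Michal's middle kind of module (hypothetical names and module syntax; module (system) here means "unsafe implementation, safe interface"):

```d
module (system) mem;  // unsafe inside, but callable from safe code

extern (C) void* malloc(size_t);
extern (C) void free(void*);

// The implementation uses raw malloc and pointer slicing (unsafe),
// but callers only ever see a bounds-checked D slice (safe interface).
int[] makeBuffer(size_t n)
{
    auto p = cast(int*) malloc(n * int.sizeof);
    return p[0 .. n];
}
```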

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michal Minich wrote:
Hello Michel,

module (system) name;         // interface: unsafe   impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

I thought that first (unsafe-unsafe) case is currently available just by:

module name; // interface: unsafe   impl.: unsafe

separating modules to unsafe-unsafe and safe-safe  has no usefulness -
as those modules could not interact, specifically you need modules that
are implemented by unsafe means, but provides only safe interface, so I
see it as:

module name;                  // interface: unsafe   impl.: unsafe
module (system) name;         // interface: safe     impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

so you can call system modules (io, network...) from safe code.

That's a pretty clean design. How would it interact with a -safe
command-line flag?

Andrei

Nov 04 2009
Michal Minich <michal minich.sk> writes:
Hello Andrei,

Michal Minich wrote:

Hello Michel,

module (system) name;         // interface: unsafe   impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

by:

module name; // interface: unsafe   impl.: unsafe

separating modules to unsafe-unsafe and safe-safe  has no usefulness
- as those modules could not interact, specifically you need modules
that are implemented by unsafe means, but provides only safe
interface, so I see it as:

module name;                  // interface: unsafe   impl.: unsafe
module (system) name;         // interface: safe     impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

so you can call system modules (io, network...) from safe code.

command-line flag?

Andrei

When compiling with -safe flag, you are doing it because you need your entire
application to be safe*.

The -safe flag would just affect modules with no safety specification, making
them (safe):

module name; --> module (safe) name;

and then compile.

It would not affect system modules, because you already *believe* that those
modules are *safe to use* (whether or not you use the -safe compiler flag).

*note: you can also partially compile only some modules/package.

Nov 04 2009
Michel Fortin <michel.fortin michelf.com> writes:
On 2009-11-04 09:29:21 -0500, Michal Minich <michal minich.sk> said:

Hello Andrei,

Michal Minich wrote:

Hello Michel,

module (system) name;         // interface: unsafe   impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

by:

module name; // interface: unsafe   impl.: unsafe

separating modules to unsafe-unsafe and safe-safe  has no usefulness
- as those modules could not interact, specifically you need modules
that are implemented by unsafe means, but provides only safe
interface, so I see it as:

module name;                  // interface: unsafe   impl.: unsafe
module (system) name;         // interface: safe     impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

so you can call system modules (io, network...) from safe code.

command-line flag?

Andrei

When compiling with -safe flag, you are doing it because you need your
entire application to be safe*.

Safe flag would just affect modules with no safety flag specified -
making them (safe):

module name; --> module (safe) name;

and then compile.

I'm not sure this works so well. Look at this:

module memory;   // unsafe interface - unsafe impl.
extern (C) void* malloc(int);
extern (C) void free(void*);

module (system) my.system;   // safe interface - unsafe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // ok: unsafe impl. allowed

module (safe) my.safe;   // safe interface - safe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // error: malloc, free are unsafe

How is this supposed to work correctly with and without the "-safe"
compiler flag? The way you define things, "-safe" would make module
memory safe for use while it is not.

--
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Nov 04 2009
Michal Minich <michal minich.sk> writes:
Hello Michel,

I'm not sure this works so well. Look at this:

module memory;   // unsafe interface - unsafe impl.
extern (C) void* malloc(int);
extern (C) void free(void*);
module (system) my.system;   // safe interface - unsafe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // ok: unsafe impl.
allowed
module (safe) my.safe;   // safe interface - safe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // error: malloc, free are unsafe
How is this supposed to work correctly with and without the "-safe"
compiler flag? The way you define things "-safe" would make module
memory safe for use while it is not.

I'm saying the module memory would not compile when the compiler is called
with the -safe switch.

the compiler would try to compile each module without a safety specification
as if it were *marked* (safe) - which will not succeed for module memory
in this case.

In this setting, the reasons to have a -safe compiler switch are not so
important; it is more of a convenience, something like -forcesafe.

You would want to use this flag only when you *need* to make sure your
application is safe, usually when you are using other libraries. With this
switch you can prevent compilation of an unsafe application in case some
other library silently changes a safe module to unsafe in a newer version.

Nov 04 2009
Don <nospam nospam.com> writes:
Michal Minich wrote:
Hello Michel,

I'm not sure this works so well. Look at this:

module memory;   // unsafe interface - unsafe impl.
extern (C) void* malloc(int);
extern (C) void free(void*);
module (system) my.system;   // safe interface - unsafe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // ok: unsafe impl. allowed
module (safe) my.safe;   // safe interface - safe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // error: malloc, free are unsafe
How is this supposed to work correctly with and without the "-safe"
compiler flag? The way you define things "-safe" would make module
memory safe for use while it is not.

I'm saying the module memory would not compile when the compiler is
called with the -safe switch.

The compiler would try to compile each module that has no safety
specification as if it were *marked* (safe) - which will not succeed
for module memory in this case.

In this setting, the reasons to have a -safe compiler switch are not so
important; it is more of a convenience, meaning something more like
-forcesafe. You would want to use this flag only when you *need* to make
sure your application is safe, usually when you are using other libraries.
With this switch you can prevent compilation of an unsafe application in
case some other library silently changes a safe module to unsafe in a
newer version.

from safe modules -- eg extern(C) functions. They MUST have unsafe
interfaces.

Nov 04 2009
Michal Minich <michal minich.sk> writes:
Hello Don,

Michal Minich wrote:

Hello Michel,

I'm not sure this works so well. Look at this:

module memory;   // unsafe interface - unsafe impl.
extern (C) void* malloc(int);
extern (C) void free(void*);
module (system) my.system;   // safe interface - unsafe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // ok: unsafe impl. allowed
module (safe) my.safe;   // safe interface - safe impl.
import memory;
void test() { auto i = malloc(10); free(i); }   // error: malloc, free are unsafe
How is this supposed to work correctly with and without the "-safe"
compiler flag? The way you define things "-safe" would make module
memory safe for use while it is not.

called with -safe switch.

the compiler would try to compile each module without safety
specification, as if they were *marked* (safe) - which will not
succeed for module memory in this case.

In this setting, the reasons to have -safe compiler switch are not so
important, they are more like convenience, meaning more like
-forcesafe. You would want to use this flag only when you *need* to
make sure your application is safe, usually when you are using other
libraries. By this switch you can prevent compilation of unsafe
application in case some other library silently changes safe module
to unsafe in newer version.

from safe modules -- eg extern(C) functions. They MUST have unsafe
interfaces.

Then they are not (system) modules; they are just modules with no
specification.

When not using the -safe switch, you cannot call from a (safe) module into
a module with no safety specification (you can only call (safe) and
(system)).

When using the -safe switch, no module with an unspecified safety level
exists: all plain modules will be marked (safe), and (system) modules are
unchanged. You would not be able to call extern(C) functions from a (safe)
module, because the module in which they are declared would be marked
(safe) and would not compile itself. There is the problem I think you are
referring to: (system) modules should not be affected by the -safe flag.
The user of a module believes (system) is safe, so the (system) module can
call anything, anytime. So I would suggest this update:

when -safe switch is not used:
module name;            // interface: unsafe impl: unsafe
module (system) name;   // interface: safe   impl: unsafe
module (safe) name;     // interface: safe   impl: safe

when -safe switch is used:
module name;            // interface: unsafe impl: unsafe   -- when imported from system modules
module name;            // interface: safe   impl: safe     -- when imported from safe modules
module (system) name;   // interface: safe   impl: unsafe
module (safe) name;     // interface: safe   impl: safe

This means that when the -safe switch is used, modules with no
specification would be marked (safe) only when imported by modules marked
(safe). When they are imported from (system) modules, they will not be
marked (safe). There is no need for another check if both (safe) and
(system) modules import a given module, because an import from a (safe)
module is the stronger check, which always subsumes an import from a
(system) module.

In other words, a (system) module does not need to perform any extra
checking when the -safe flag is used; it is the same as if the flag were
not used.
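Concretely, the pattern this subthread keeps circling around - a module that
does unsafe things inside but exposes a safe interface - might look like the
following. This is a hypothetical sketch only: the type, names, and manual
checks are illustrative, not proposed language syntax.

```d
// Hypothetical "trusted"-style module: the unsafe primitives (malloc/free)
// stay private, and importers only see a bounds-respecting interface.
import core.stdc.stdlib : free, malloc;

struct Buffer
{
    private ubyte* p;
    private size_t len;

    // Unsafe inside: raw malloc. Safe outside: callers never see the pointer.
    static Buffer alloc(size_t n)
    {
        auto q = cast(ubyte*) malloc(n);
        return Buffer(q, q is null ? 0 : n);
    }

    // Manual checks replace the compiler-enforced ones the raw pointer lost.
    ubyte get(size_t i) const
    {
        assert(i < len, "index out of bounds");
        return p[i];
    }

    void set(size_t i, ubyte v)
    {
        assert(i < len, "index out of bounds");
        p[i] = v;
    }

    void release() { free(p); p = null; len = 0; }
}

void main()
{
    auto b = Buffer.alloc(4);
    b.set(0, 42);
    assert(b.get(0) == 42);
    b.release();
}
```

Whatever annotation such a module ends up carrying, the point is that its
*interface* can be vouched safe even though its *implementation* is not.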

Nov 04 2009
Jesse Phillips <jessekphillips+D gamil.com> writes:
Michel Fortin Wrote:

How is this supposed to work correctly with and without the "-safe"
compiler flag? The way you define things "-safe" would make module
memory safe for use while it is not.

"-safe" would cause the compiler to check whether the code is safe and error
out if it isn't. Not sure how that would work out for precompiled libraries.

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michal Minich wrote:
On Wed, 04 Nov 2009 14:03:42 -0300, Leandro Lucarella wrote:

I think safe should be the default, as it should be the most used flavor
in user code, right? What about:

module s;             // interface: safe     impl.: safe
module (trusted) t;   // interface: safe     impl.: unsafe
module (unsafe) u;    // interface: unsafe   impl.: unsafe

* s can import other safe or trusted modules (no unsafe for s). * t can
import any kind of module, but he guarantee not to corrupt your
memory if you use it (that's why s can import it).
* u can import any kind of modules and makes no guarantees (C bindings
use this).

That's a pretty clean design. How would it interact with a -safe
command-line flag?

should be correctly marked as safe (default), trusted or unsafe, and let
it compile anyway, add a compiler flag -no-safe (or whatever).

But people should never use it, unless you are using some broken library
or you are too lazy to mark your modules correctly.

Is this too crazy?

I have no problem with safe as default; most of my code is safe. I also
like module (trusted) - it really pictures its meaning, better than
"system".

But I think there is no reason to use a -no-safe compiler flag ... for what
reason would one want to force a safer program to compile as less safe :)

Efficiency (e.g. remove array bounds checks).

As I'm thinking more about it, I don't see any reason to have any
compiler flag for safety at all.

That would be a great turn of events!!!

Andrei

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michal Minich wrote:
On Wed, 04 Nov 2009 13:12:54 -0600, Andrei Alexandrescu wrote:

But I think there is no reason to use a -no-safe compiler flag ... for
what reason would one want to force a safer program to compile as less
safe :)

As I'm thinking more about it, I don't see any reason to have any
compiler flag for safety at all.

Andrei

Memory safety is a pretty specific thing: if you want it, you want it all,
not just some part of it - otherwise you cannot call it memory safety.

I agree and always did.

The idea of a safe module which, under some compiler switch, is not safe
does not appeal to me.

Absolutely. Notice that if you thought I proposed that, there was a
misunderstanding.

But efficiency is also important, and if you want it, why not move the
code subject to bounds checks into a trusted/system module - I hope they
are not bounds-checked in release mode. Moving parts of the code into
trusted modules is more semantically descriptive than the crude tool of
an ad-hoc compiler switch.

Well it's not as simple as that. Trusted code is not unchecked code -
it's code that may drop redundant checks here and there, leaving code
correct, even though the compiler cannot prove it. So no, there's no
complete removal of bounds checking. But a trusted module is allowed to
replace this:

foreach (i; 0 .. a.length) ++a[i];

with

foreach (i; 0 .. a.length) ++a.ptr[i];

The latter effectively escapes checks because it uses unchecked pointer
arithmetic. The code is still correct, but this time it's the human
vouching for it, not the compiler.
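Spelled out as a complete example (a sketch; the function names are
illustrative, and the "trusted" status is conveyed only by comment since the
annotation syntax is exactly what is under discussion):

```d
// Checked: each a[i] goes through the compiler-inserted bounds check
// (in non-release builds).
void bumpChecked(int[] a)
{
    foreach (i; 0 .. a.length) ++a[i];
}

// Trusted rewrite: indexing through a.ptr is raw pointer arithmetic, so
// the bounds check is skipped; the human, not the compiler, vouches that
// i stays within [0, a.length).
void bumpTrusted(int[] a)
{
    foreach (i; 0 .. a.length) ++a.ptr[i];
}

void main()
{
    int[] a = [1, 2, 3];
    bumpChecked(a);   // a is now [2, 3, 4]
    bumpTrusted(a);   // a is now [3, 4, 5]
    assert(a == [3, 4, 5]);
}
```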

One thing I'm concerned with, whether there is a compiler switch or not, is
that module counts will increase, as you will probably want to split some
modules in two because some parts may be safe and some not. I'm wondering
why safety is not discussed at the function level, similarly to how pure
and nothrow currently exist. I'm not sure this would be good, just
wondering. Was this topic already discussed?

This is a relatively new topic, and you pointed out some legitimate kinks.
One possibility I discussed with Walter is to have version(safe) vs.
version(system) or so. That would allow a module to expose different
interfaces depending on the command-line switches.

Andrei

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michal Minich wrote:
[snip]
Therefore I propose to use F safety.

I think you've made an excellent case.

Andrei

Nov 04 2009
Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
module name;                  // interface: unsafe   impl.: unsafe
module (system) name;         // interface: safe     impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

so you can call system modules (io, network...) from safe code.

That's a pretty clean design. How would it interact with a -safe
command-line flag?

'-safe' turns on runtime safety checks, which can be and should be
mostly orthogonal to the module safety level.

--
Rainer Deyke - rainerd eldwood.com

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Rainer Deyke wrote:
Andrei Alexandrescu wrote:
module name;                  // interface: unsafe   impl.: unsafe
module (system) name;         // interface: safe     impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

so you can call system modules (io, network...) from safe code.

command-line flag?

'-safe' turns on runtime safety checks, which can be and should be
mostly orthogonal to the module safety level.

Runtime vs. compile-time is immaterial. There's one goal - no undefined
behavior - that can be achieved through a mix of compile- and run-time
checks.

My understanding of a good model suggested by this discussion:

module name;         // does whatever, just like now
module(safe) name;   // submits to extra checks
module(system) name; // encapsulates unsafe stuff in a safe interface

No dedicated compile-time switches.

Andrei

Nov 04 2009
Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
Rainer Deyke wrote:
'-safe' turns on runtime safety checks, which can be and should be
mostly orthogonal to the module safety level.

Runtime vs. compile-time is immaterial.

The price of compile-time checks is that you are restricted to a subset
of the language, which may or may not allow you to do what you need to do.

The price of runtime checks is runtime performance.

Safety is always good.  To me, the question is never if I want safety,
but if I can afford it.  If I can't afford to pay the price of runtime
checks, I may still want the compile-time checks.  If I can't afford to
pay the price of compile-time checks, I may still want the runtime
checks.  Thus, to me, the concepts of runtime and compile-time checks
are orthogonal.

A module either passes the compile-time checks or it does not.  It makes
no sense to make the compile-time checks optional for some modules.  If the
module is written to pass the compile-time checks (i.e. uses the safe
subset of the language), then the compile-time checks should always be
performed for that module.

--
Rainer Deyke - rainerd eldwood.com

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Rainer Deyke wrote:
Andrei Alexandrescu wrote:
Rainer Deyke wrote:
'-safe' turns on runtime safety checks, which can be and should be
mostly orthogonal to the module safety level.

The price of compile-time checks is that you are restricted to a subset
of the language, which may or may not allow you to do what you need to do.

The price of runtime checks is runtime performance.

Safety is always good.  To me, the question is never if I want safety,
but if I can afford it.  If I can't afford to pay the price of runtime
checks, I may still want the compile-time checks.  If I can't afford to
pay the price of compile-time checks, I may still want the runtime
checks.  Thus, to me, the concepts of runtime and compile-time checks
are orthogonal.

I hear what you're saying, but I am not enthusiastic at all about
defining and advertising a half-pregnant state. Such a language is the
worst of all worlds - it's frustrating to code in yet gives no guarantee
to anyone. I don't see this going anywhere interesting. "Yeah, we have
safety, and we also have, you know, half safety - it's like only a lap
belt of sorts: inconvenient like crap and doesn't really help in an
accident." I wouldn't want to code in such a language.

A module either passes the compile-time checks or it does not.  It makes
no sense to make the compile-time checks optional for some modules.  If the
module is written to pass the compile-time checks (i.e. uses the safe
subset of the language), then the compile-time checks should always be
performed for that module.

I think that's the current intention indeed.

Andrei

Nov 04 2009
Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
I hear what you're saying, but I am not enthusiastic at all about
defining and advertising a half-pregnant state. Such a language is the
worst of all worlds - it's frustrating to code in yet gives no guarantee
to anyone. I don't see this going anywhere interesting. "Yeah, we have
safety, and we also have, you know, half safety - it's like only a lap
belt of sorts: inconvenient like crap and doesn't really help in an
accident." I wouldn't want to code in such a language.

Basically you're saying that safety is an all or nothing deal.  Not only
is this in direct contradiction to the attempts to allow both safe and
unsafe modules to coexist in the same program, it is in contradiction
with all existing programming languages, every single one of which
offers some safety features but not absolute 100% safety.

If you have a formal definition of safety, please post it.  Without such
a definition, I will use my own informal definition of safety for the
rest of this post: "a safety feature is a language feature that reduces
programming errors."

First, to demonstrate that all programming languages in existence offer
some safety features.  With some esoteric exceptions (whitespace, hq9+),
all programming languages have a syntax with some level of redundancy.
This allows the language implementation to reject some inputs as
syntactically incorrect.  A redundant syntax is a safety feature.

Another example relevant to D: D requires an explicit cast when
converting an integer to a pointer.  This is another safety feature.

Now to demonstrate that no language offers 100% safety.  In the
abstract, no language can guarantee that a program matches the
programmer's intention.  However, let's look at a more specific form of
safety: safety from dereferencing dangling pointers.  To guarantee this,
you would need to guarantee that the compiler never generates faulty
code that causes a dangling pointer to be dereferenced.  If the
program makes any system calls at all, you would also need to guarantee
that no bugs in the OS cause a dangling pointer to be dereferenced.
Both of these are clearly impossible.  No language can offer 100% safety.

Moreover, that safety necessarily reduces convenience is clearly false.
This /only/ applies to compile-time checks.  Runtime checks are purely
an implementation issue.  Even C and assembly can be implemented such
that all instances of undefined behavior are trapped at runtime.

Conversely, the performance penalty of safety applies mostly to runtime
checks.  If extensive testing with these checks turned on fails to
reveal any bugs, it is entirely reasonable to remove these checks for
the final release.

--
Rainer Deyke - rainerd eldwood.com

Nov 04 2009
Don <nospam nospam.com> writes:
Rainer Deyke wrote:
Andrei Alexandrescu wrote:
I hear what you're saying, but I am not enthusiastic at all about
defining and advertising a half-pregnant state. Such a language is the
worst of all worlds - it's frustrating to code in yet gives no guarantee
to anyone. I don't see this going anywhere interesting. "Yeah, we have
safety, and we also have, you know, half safety - it's like only a lap
belt of sorts: inconvenient like crap and doesn't really help in an
accident." I wouldn't want to code in such a language.

Basically you're saying that safety is an all or nothing deal.  Not only
is this in direct contradiction to the attempts to allow both safe and
unsafe modules to coexist in the same program, it is in contradiction
with all existing programming languages, every single one of which
offers some safety features but not absolute 100% safety.

If you have a formal definition of safety, please post it.  Without such
a definition, I will use my own informal definition of safety for the
rest of this post: "a safety feature is a language feature that reduces
programming errors."

First, to demonstrate that all programming languages in existence offer
some safety features.  With some esoteric exceptions (whitespace, hq9+),
all programming languages have a syntax with some level of redundancy.
This allows the language implementation to reject some inputs as
syntactically incorrect.  A redundant syntax is a safety feature.

Another example relevant to D: D requires an explicit cast when
converting an integer to a pointer.  This is another safety feature.

Now to demonstrate that no language offers 100% safety.  In the
abstract, no language can guarantee that a program matches the
programmer's intention.  However, let's look at a more specific form of
safety: safety from dereferencing dangling pointers.  To guarantee this,
you would need to guarantee that the compiler never generates faulty
code that causes a dangling pointer to be dereferenced.  If the
program makes any system calls at all, you would also need to guarantee
that no bugs in the OS cause a dangling pointer to be dereferenced.
Both of these are clearly impossible.  No language can offer 100% safety.

Moreover, that safety necessarily reduces convenience is clearly false.
This /only/ applies to compile-time checks.  Runtime checks are purely
an implementation issue.  Even C and assembly can be implemented such
that all instances of undefined behavior are trapped at runtime.

Conversely, the performance penalty of safety applies mostly to runtime
checks.  If extensive testing with these checks turned on fails to
reveal any bugs, it is entirely reasonable to remove these checks for
the final release.

I'm in complete agreement with you, Rainer.
What I got from Bartosz' original post was that a large class of bugs
could be eliminated fairly painlessly via some compile-time checks. It
seemed to be based on pragmatic concerns. I applauded it. (I may have
misread it, of course).
Now, things seem to have left pragmatism and got into ideology. Trying
to eradicate _all_ possible memory corruption bugs is extremely
difficult in a language like D. I'm not at all convinced that it is
realistic (ends up too painful to use). It'd be far more reasonable if
we had non-nullable pointers, for example.

The ideology really scares me, because 'memory safety' covers just one
class of bug. What everyone wants is to drive the _total_ bug count
down, and we can improve that dramatically with basic compile-time
checks. But demanding 100% memory safety has a horrible cost-benefit
tradeoff. It seems like a major undertaking.

And I doubt it would convince anyone, anyway. To really guarantee memory
safety, you need a bug-free compiler...

Nov 05 2009
Michal Minich <michal minich.sk> writes:
Hello Don,

Rainer Deyke wrote:

Andrei Alexandrescu wrote:

I hear what you're saying, but I am not enthusiastic at all about
defining and advertising a half-pregnant state. Such a language is
the worst of all worlds - it's frustrating to code in yet gives no
guarantee to anyone. I don't see this going anywhere interesting.
"Yeah, we have safety, and we also have, you know, half safety -
it's like only a lap belt of sorts: inconvenient like crap and
doesn't really help in an accident." I wouldn't want to code in such
a language.

only is this in direct contradiction to the attempts to allow both
safe and unsafe modules to coexist in the same program, it is in
contradiction with all existing programming languages, every single
one of which offers some safety features but not absolute 100%
safety.

If you have a formal definition of safety, please post it.  Without
such a definition, I will use my own informal definition of safety
for the rest of this post: "a safety feature is a language feature
that reduces programming errors."

First, to demonstrate that all programming languages in existence
offer some safety features.  With some esoteric exceptions
(whitespace, hq9+), all programming languages have a syntax with some
level of redundancy. This allows the language implementation to
reject some inputs as syntactically incorrect.  A redundant syntax is
a safety feature.

Another example relevant to D: D requires an explicit cast when
converting an integer to a pointer.  This is another safety feature.

Now to demonstrate that no language offers 100% safety.  In the
abstract, no language can guarantee that a program matches the
programmer's intention.  However, let's look at a more specific form of
safety: safety from dereferencing dangling pointers.  To guarantee this,
you would need to guarantee that the compiler never generates faulty
code that causes a dangling pointer to be dereferenced.  If the program
makes any system calls at all, you would also need to guarantee that no
bugs in the OS cause a dangling pointer to be dereferenced.  Both of
these are clearly impossible.  No language can offer 100% safety.

Moreover, that safety necessarily reduces convenience is clearly false.
This /only/ applies to compile-time checks.  Runtime checks are purely
an implementation issue.  Even C and assembly can be implemented such
that all instances of undefined behavior are trapped at runtime.

Conversely, the performance penalty of safety applies mostly to runtime
checks.  If extensive testing with these checks turned on fails to
reveal any bugs, it is entirely reasonable to remove these checks for
the final release.
What I got from Bartosz' original post was that a large class of bugs
could be eliminated fairly painlessly via some compile-time checks. It
seemed to be based on pragmatic concerns. I applauded it. (I may have
misread it, of course).
Now, things seem to have left pragmatism and got into ideology. Trying
to eradicate _all_ possible memory corruption bugs is extremely
difficult in a language like D. I'm not at all convinced that it is
realistic (ends up too painful to use). It'd be far more reasonable if
we had non-nullable pointers, for example.
The ideology really scares me, because 'memory safety' covers just one
class of bug. What everyone wants is to drive the _total_ bug count
down, and we can improve that dramatically with basic compile-time
checks. But demanding 100% memory safety has a horrible cost-benefit
tradeoff. It seems like a major undertaking.

And I doubt it would convince anyone, anyway. To really guarantee
memory safety, you need a bug-free compiler...

I don't know how this could have anything to do with ideology. Are Java and
C# ideological languages? Certainly - if you see memory safety as ideology
- you cannot escape from it in these languages.

Pure functions currently exist in D, but you are not obliged to use them.
I think memory safety should be handled the same way: mark a function safe
if you want/need to restrict yourself to this style of coding, and just
don't use it if you don't need to, or can't - same as pure and nothrow.

Notice that if you code your function safe, it would have only one negative
impact on the caller - runtime bounds checking. I admit it is not good.
There are good reasons to require speed. As the standard libraries would
use safe code, I'm not sure whether it would be necessary to distribute two
versions of the .lib - one with bounds-checked safe code and one without
bounds checking on safe code?

I think what also concerns you is how safety would affect the use of D
statements and expressions - that it would be too difficult/awkward to use;
I don't know exactly, but I imagine it to be simpler - just like Java/C#(?)

If there should be memory safety in D, I see no other possibility than to
specify it per function and provide a compiler switch to turn off bounds
checking for safe code if needed. I see it as most flexible for "code
writers" and least interfering with "code users"; there is no need for a
trade-off.

A compiler switch that would magically force safety on some code - code
that would then just not compile - is no way to go (and specifying safety
per module is too coarse-grained - both for code users and writers).

Btw. I think non-nullable pointers are equally important, but I see no prospect
of them being implemented :(

Nov 05 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Don wrote:
Rainer Deyke wrote:
Andrei Alexandrescu wrote:
I hear what you're saying, but I am not enthusiastic at all about
defining and advertising a half-pregnant state. Such a language is the
worst of all worlds - it's frustrating to code in yet gives no guarantee
to anyone. I don't see this going anywhere interesting. "Yeah, we have
safety, and we also have, you know, half safety - it's like only a lap
belt of sorts: inconvenient like crap and doesn't really help in an
accident." I wouldn't want to code in such a language.

Basically you're saying that safety is an all or nothing deal.  Not only
is this in direct contradiction to the attempts to allow both safe and
unsafe modules to coexist in the same program, it is in contradiction
with all existing programming languages, every single one of which
offers some safety features but not absolute 100% safety.

If you have a formal definition of safety, please post it.  Without such
a definition, I will use my own informal definition of safety for the
rest of this post: "a safety feature is a language feature that reduces
programming errors."

First, to demonstrate that all programming languages in existence offer
some safety features.  With some esoteric exceptions (whitespace, hq9+),
all programming languages have a syntax with some level of redundancy.
This allows the language implementation to reject some inputs as
syntactically incorrect.  A redundant syntax is a safety feature.

Another example relevant to D: D requires an explicit cast when
converting an integer to a pointer.  This is another safety feature.

Now to demonstrate that no language offers 100% safety.  In the
abstract, no language can guarantee that a program matches the
programmer's intention.  However, let's look at a more specific form of
safety: safety from dereferencing dangling pointers.  To guarantee this,
you would need to guarantee that the compiler never generates faulty
code that causes a dangling pointer to be dereferenced.  If the
program makes any system calls at all, you would also need to guarantee
that no bugs in the OS cause a dangling pointer to be dereferenced.
Both of these are clearly impossible.  No language can offer 100% safety.

Moreover, that safety necessarily reduces convenience is clearly false.
This /only/ applies to compile-time checks.  Runtime checks are purely
an implementation issue.  Even C and assembly can be implemented such
that all instances of undefined behavior are trapped at runtime.

Conversely, the performance penalty of safety applies mostly to runtime
checks.  If extensive testing with these checks turned on fails to
reveal any bugs, it is entirely reasonable to remove these checks for
the final release.

I'm in complete agreement with you, Rainer.
What I got from Bartosz' original post was that a large class of bugs
could be eliminated fairly painlessly via some compile-time checks. It
seemed to be based on pragmatic concerns. I applauded it. (I may have
misread it, of course).
Now, things seem to have left pragmatism and got into ideology. Trying
to eradicate _all_ possible memory corruption bugs is extremely
difficult in a language like D. I'm not at all convinced that it is
realistic (ends up too painful to use). It'd be far more reasonable if
we had non-nullable pointers, for example.

The ideology really scares me, because 'memory safety' covers just one
class of bug. What everyone wants is to drive the _total_ bug count
down, and we can improve that dramatically with basic compile-time
checks. But demanding 100% memory safety has a horrible cost-benefit
tradeoff. It seems like a major undertaking.

And I doubt it would convince anyone, anyway. To really guarantee memory
safety, you need a bug-free compiler...

I protest against using "ideology" when characterizing safety. It
instantly lowers the level of the discussion. There is no ideology being
pushed here, just a clear notion with equally clear benefits. I think it
is a good time for all of us to get a bit better informed.

First off: _all_ languages except C, C++, and assembler are or at least
claim to be safe. All. I mean ALL. Did I mention all? If that was some
ideology that is not realistic, is extremely difficult to achieve, and
ends up too painful to use, then such theories would be difficult to
corroborate with "ALL". Walter and I are in agreement that safety is not
difficult to achieve in D and that it would allow a great many good
programs to be written.

Second, there are not many definitions of what safe means and no ifs and
buts about it. This whole wishy-washy notion of wanting just a little
bit of pregnancy is just not worth pursuing. The definition is given in
Pierce's book "Types and Programming Languages" but I was happy
yesterday to find a free online book section by Luca Cardelli:

http://www.eecs.umich.edu/~bchandra/courses/papers/Cardelli_Types.pdf

The text is very approachable and informative, and I suggest anyone
interested to read through page 5 at least. I think it's a must for
anyone participating in this to read the whole thing. Cardelli
distinguishes between programs with "trapped errors" versus programs
with "untrapped errors". Yesterday Walter and I have had a long
discussion, followed by an email communication between Cardelli and
myself, which confirmed that these three notions are equivalent:

a) "memory safety" (notion we used so far)
b) "no undefined behavior" (C++ definition, suggested by Walter)
c) "no untrapped errors" (suggested by Cardelli)

I suspect "memory safety" is the weakest marketing term of the three.
For example, there's this complaint above: "'memory safety' covers just
one class of bug." But when you think of programs with undefined
behavior vs. programs with entirely defined behavior, you realize what
an important class of bugs that is. Non-nullable pointers are mightily
useful, but "no undefined behavior" is quite a bit better to have.

The argument about memory safety requiring a bug-free compiler is
correct. It was actually aired quite a bit in Java's first years. It can
be confidently said that Java won that argument. Why? Because Java had a
principled approach that slowly but surely sealed all the gaps. The fact
that dmd has bugs now should be absolutely no excuse for us to give up
on defining a safe subset of the language.

Andrei

Nov 05 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
Andrei Alexandrescu, on November 5 at 08:48 you wrote:
First off: _all_ languages except C, C++, and assembler are or at
least claim to be safe. All. I mean ALL. Did I mention all? If that
was some ideology that is not realistic, is extremely difficult to
achieve, and ends up too painful to use, then such theories would be
difficult to corroborate with "ALL". Walter and I are in agreement
that safety is not difficult to achieve in D and that it would allow
a great many good programs to be written.

I think the problem is the cost. The cost for the programmer (the subset
of language features it can use is reduced) and the cost for the compiler
(to increase the subset of language features that can be used, the
compiler has to be much smarter).

Most languages have a lot of developers, and can afford making the
compiler smarter to allow safety with a low cost for the programmer (at
least when writing code, that cost might be higher performance-wise).

D is already a rich superset of Java. So the cost of making the language
safe and useful was already absorbed.

A clear example of this is not being able to take the address of a local.
This is too restrictive to be useful, as you pointed out in your post about
having to write static methods because of this. If you can't find
a workaround for this, I guess safety in D can look a little unrealistic.

Most other languages do not allow taking addresses of locals. Why are
they realistic and SafeD wouldn't? Just because we know we could do it
in unsafe D?
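
To make the hazard concrete, here is a minimal sketch (hypothetical code,
not from the thread) of why escaping the address of a local must be
rejected in safe code:

```d
// Sketch: the dangling-pointer hazard behind the "no escape of a
// pointer or reference to a local" rule. In system code this
// compiles; SafeD must reject the assignment in escape().
int* p;

void escape()
{
    int local = 42;
    p = &local;   // the address outlives the stack frame of escape()
}

void main()
{
    escape();
    // p now points into a dead stack frame; dereferencing it is
    // undefined behavior (it may read 42, garbage, or crash).
}
```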

I like the idea of having a safe subset in D, but I think being
a programming language, *runtime* safety should be *always* a choice for
the user compiling the code.

Well in that case we need to think again about the command-line options.

As other said, you can never be 100% sure your program won't blow for
unknown reasons (it could do that because a bug in the
compiler/interpreter, or even because a hardware problem), you can just
try to make it as difficult as possible, but 100% safety doesn't exist.

I understand that stance, but I don't find it useful.

Andrei

Nov 05 2009
Jesse Phillips <jessekphillips+D gmail.com> writes:
Andrei Alexandrescu Wrote:

Leandro Lucarella wrote:
A clear example of this is not being able to take the address of a local.
This is too restrictive to be useful, as you pointed out in your post about
having to write static methods because of this. If you can't find
a workaround for this, I guess safety in D can look a little unrealistic.

Most other languages do not allow taking addresses of locals. Why are
they realistic and SafeD wouldn't? Just because we know we could do it
in unsafe D?

I think part of the problem is that current users of D have picked it up
because they do get this power. But it makes sense that there are potential
users that would like the compiler to prevent them from unsafe constructs. And
I can't imagine it being more restrictive than Java or C# which are very
popular languages.

I do like the different approaches though taken by C# and D. C# took a safe
model and punched holes in it. D is taking an unsafe model and restricting it.

Nov 05 2009
dsimcha <dsimcha yahoo.com> writes:
== Quote from Leandro Lucarella (llucax gmail.com)'s article
Andrei Alexandrescu, on November 5 at 09:57 you wrote:
Leandro Lucarella wrote:
Andrei Alexandrescu, on November 5 at 08:48 you wrote:
First off: _all_ languages except C, C++, and assembler are or at
least claim to be safe. All. I mean ALL. Did I mention all? If that
was some ideology that is not realistic, is extremely difficult to
achieve, and ends up too painful to use, then such theories would be
difficult to corroborate with "ALL". Walter and I are in agreement
that safety is not difficult to achieve in D and that it would allow
a great many good programs to be written.

I think the problem is the cost. The cost for the programmer (the subset
of language features it can use is reduced) and the cost for the compiler
(to increase the subset of language features that can be used, the
compiler has to be much smarter).

Most languages have a lot of developers, and can afford making the
compiler smarter to allow safety with a low cost for the programmer (at
least when writing code, that cost might be higher performance-wise).

D is already a rich superset of Java. So the cost of making the
language safe and useful was already absorbed.

That's an unfair comparison. Java has a very efficient GC (partially
because of safety), so using D as if it were Java yields very inefficient
programs (using classes and new all over the place).

What does safety have to do with Java's GC quality? IMHO it's more a
language maturity and money thing. The only major constraint on the D GC
is unions, and even in that case all we need is one bit that says that
stuff in unions needs to be pinned. I think we already agree that storing
the only pointer to GC-allocated memory in non-pointer types, xor linked
lists involving GC-allocated memory, etc. are undefined behavior. Other
than that and lack of manpower, what prevents a really, really good GC
from being implemented in D?
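
As a hedged illustration of the union point (hypothetical types, not code
from the thread): a collector cannot tell which member of a union is live,
so it has to scan the word conservatively and pin whatever it might
reference:

```d
// Sketch: why unions block precise collection.
union Ambiguous
{
    int*   ptr;   // reference interpretation
    size_t bits;  // integral interpretation
}

void demo()
{
    Ambiguous u;
    u.ptr = new int;  // a GC pointer is stored...
    // ...but the GC cannot know whether ptr or bits is the live field,
    // so it must treat the word as a possible pointer and pin the block.

    // Hiding the only reference in a non-pointer type is the undefined
    // behavior mentioned above: the GC is free to collect the int.
    size_t hidden = cast(size_t) new int;
}
```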

Nov 05 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
A clear example of this is not being able to take the address of a local.
This is too restrictive to be useful, as you pointed out in your post about
having to write static methods because of this. If you can't find
a workaround for this, I guess safety in D can look a little unrealistic.

Sorry, I forgot to mention one thing. My example of List in the thread
"An interesting consequence of safety requirements" used struct, but it
should be mentioned there's a completely safe alternative: just define
List as a class and there is no safety problem at all. Java, C#, and
others define lists as classes and it didn't seem to kill them. I agree
that using a struct in D would be marginally more efficient, but that
doesn't mean that if I want safety I'm dead in the water. In particular
it's great that pointers are still usable in SafeD. I'm actually
surprised that nobody sees how nicely safety fits D, particularly its
handling of "ref".
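
A minimal sketch (hypothetical, assuming a List along the lines of that
thread) of the class-based alternative: a class reference always denotes a
heap object, so linking nodes never takes the address of a local:

```d
// Sketch: a class-based list is safe by construction; nodes are
// connected through heap references only, never through &local.
class List(T)
{
    T value;
    List!T next;

    this(T value, List!T next = null)
    {
        this.value = value;
        this.next  = next;
    }
}

void main()
{
    auto lst = new List!int(1, new List!int(2));
    assert(lst.next.value == 2 && lst.next.next is null);
}
```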

Andrei

Nov 05 2009
Max Samukha <spambox d-coding.com> writes:
On Thu, 5 Nov 2009 21:29:43 -0300, Leandro Lucarella
<llucax gmail.com> wrote:

See my other response about efficiency of D when using new/classes a lot.
You just can't do it efficiently in D, ask bearophile for some benchmarks
;)

This is in part because D doesn't have a compacting GC. A compacting
GC implies allocation speeds comparable to the speed of allocation on
the stack. I guess many of bearophile's benchmarks do not account for GC
collection cycles, which should be slower in C#/Java because of the
need to move objects. I think fair benchmarks should always include
garbage collection times.

Nov 06 2009
bearophile <bearophileHUGS lycos.com> writes:
Max Samukha:

I guess many of bearophile's benchmarks do not account for GC
collection cycles,

I have not explored this well yet. From what I've seen, D is sometimes dead
slow at the end of the program, when many final deallocations happen. In
the Java versions of the tests this doesn't happen. A while ago someone
even wrote a patch for the D1 GC to reduce that problem.

Bye,
bearophile

Nov 06 2009
Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
First off: _all_ languages except C, C++, and assembler are or at least
claim to be safe. All. I mean ALL. Did I mention all? If that was some
ideology that is not realistic, is extremely difficult to achieve, and
ends up too painful to use, then such theories would be difficult to
corroborate with "ALL". Walter and I are in agreement that safety is not
difficult to achieve in D and that it would allow a great many good
programs to be written.

You're forgetting about all other system programming languages.  Also,
many of these claims to safety are demonstrably false.

The text is very approachable and informative, and I suggest anyone
interested to read through page 5 at least. I think it's a must for
anyone participating in this to read the whole thing. Cardelli
distinguishes between programs with "trapped errors" versus programs
with "untrapped errors". Yesterday Walter and I have had a long
discussion, followed by an email communication between Cardelli and
myself, which confirmed that these three notions are equivalent:

a) "memory safety" (notion we used so far)
b) "no undefined behavior" (C++ definition, suggested by Walter)
c) "no untrapped errors" (suggested by Cardelli)

They are clearly not equivalent.  ++x + ++x has nothing to do with
memory safety.  Conversely, machine language has no concept of undefined
behavior but is clearly not memory safe.  Also, you haven't formally
defined any of these concepts, so you're basically just hand-waving.

--
Rainer Deyke - rainerd eldwood.com

Nov 05 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Rainer Deyke wrote:
Andrei Alexandrescu wrote:
First off: _all_ languages except C, C++, and assembler are or at least
claim to be safe. All. I mean ALL. Did I mention all? If that was some
ideology that is not realistic, is extremely difficult to achieve, and
ends up too painful to use, then such theories would be difficult to
corroborate with "ALL". Walter and I are in agreement that safety is not
difficult to achieve in D and that it would allow a great many good
programs to be written.

You're forgetting about all other system programming languages.

[citation needed]

Also,
many of these claims to safety are demonstrably false.

Which?

The text is very approachable and informative, and I suggest anyone
interested to read through page 5 at least. I think it's a must for
anyone participating in this to read the whole thing. Cardelli
distinguishes between programs with "trapped errors" versus programs
with "untrapped errors". Yesterday Walter and I have had a long
discussion, followed by an email communication between Cardelli and
myself, which confirmed that these three notions are equivalent:

a) "memory safety" (notion we used so far)
b) "no undefined behavior" (C++ definition, suggested by Walter)
c) "no untrapped errors" (suggested by Cardelli)

They are clearly not equivalent.  ++x + ++x has nothing to do with
memory safety.  Conversely, machine language has no concept of undefined
behavior but is clearly not memory safe.  Also, you haven't formally
defined any of these concepts, so you're basically just hand-waving.

Memory safety is defined formally in Pierce's book. Undefined behavior
is defined by the C++ standard. Cardelli defines trapped and untrapped
errors.

Andrei

Nov 05 2009
Rainer Deyke <rainerd eldwood.com> writes:
Andrei Alexandrescu wrote:
Rainer Deyke wrote:
You're forgetting about all other system programming languages.

Delphi.

Also,
many of these claims to safety are demonstrably false.

Which?

I can get Python to segfault.

Memory safety is defined formally in Pierce's book.

Do you mean "Types and programming languages" by Benjamin C. Pierce?
According to Google books, it does not contain the phrase "memory
safety". It does contain a section on "language safety", which says
that "a safe language is one that protects its own abstractions".  By
that definition, machine language is safe, because it has no
abstractions to protect.

Another quote: "Language safety can be achieved by static checking, but
also by run-time checks that trap nonsensical operations just at the
moment when they are attempted and stop the program or raise an
exception".  In other words, Pierce sees runtime checks and compile-time
checks as orthogonal methods for providing the same safety.

Undefined behavior
is defined by the C++ standard.

Undefined behavior is a simple concept: the language specification does
not define what will happen when the program invokes undefined behavior.
Undefined behavior can be trivially eliminated from the language by
replacing it with defined behavior.  If a language construct is defined
to trash the process memory space, then it is not undefined behavior.

Cardelli defines trapped and untrapped
errors.

Untrapped error: An execution error that does not immediately result in
a fault.

I can't find his definition of "execution error", which makes this
definition useless to me.

--
Rainer Deyke - rainerd eldwood.com

Nov 05 2009
Leandro Lucarella <llucax gmail.com> writes:
dsimcha, on November 6 at 02:13 you wrote:
== Quote from Leandro Lucarella (llucax gmail.com)'s article
Andrei Alexandrescu, on November 5 at 09:57 you wrote:
Leandro Lucarella wrote:
Andrei Alexandrescu, on November 5 at 08:48 you wrote:
First off: _all_ languages except C, C++, and assembler are or at
least claim to be safe. All. I mean ALL. Did I mention all? If that
was some ideology that is not realistic, is extremely difficult to
achieve, and ends up too painful to use, then such theories would be
difficult to corroborate with "ALL". Walter and I are in agreement
that safety is not difficult to achieve in D and that it would allow
a great many good programs to be written.

I think the problem is the cost. The cost for the programmer (the subset
of language features it can use is reduced) and the cost for the compiler
(to increase the subset of language features that can be used, the
compiler has to be much smarter).

Most languages have a lot of developers, and can afford making the
compiler smarter to allow safety with a low cost for the programmer (at
least when writing code, that cost might be higher performance-wise).

D is already a rich superset of Java. So the cost of making the
language safe and useful was already absorbed.

That's an unfair comparison. Java has a very efficient GC (partially
because of safety), so using D as if it were Java yields very inefficient
programs (using classes and new all over the place).

What does safety have to do with Java's GC quality?

Because you don't have unions and other things that prevent the GC from
being fully precise.

IMHO it's more a language maturity and money thing.

That's another reason, but the Boehm GC is probably one of the more
advanced, state-of-the-art GCs, and I don't think it's close to what the
Java GC can do (I didn't see recent benchmarks though, so I might be
completely wrong :)

The only major constraint on the D GC is unions, and even in that case
all we need is one bit that says that stuff in unions needs to be pinned.
I think we already agree that storing the only pointer to GC-allocated
memory in non-pointer types, xor linked lists involving GC-allocated
memory, etc. are undefined behavior. Other than that and lack of
manpower, what prevents a really, really good GC from being implemented
in D?

Having a precise stack and registers. Java has a VM that provides all that
information. Maybe the distance between a good Java GC and a good D GC can
be narrowed a lot, but I don't think D could ever match Java (or other
languages with full precise scanning).

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
I always get the feeling that when lesbians look at me, they're thinking,
'*That's* why I'm not a heterosexual.'
-- George Costanza

Nov 05 2009
Leandro Lucarella <llucax gmail.com> writes:
Max Samukha, on November 6 at 11:10 you wrote:
On Thu, 5 Nov 2009 21:29:43 -0300, Leandro Lucarella
<llucax gmail.com> wrote:

See my other response about efficiency of D when using new/classes a lot.
You just can't do it efficiently in D, ask bearophile for some benchmarks
;)

This is in part because D doesn't have a compacting GC. A compacting
GC implies allocation speeds comparable to the speed of allocation on
the stack. I guess many of bearophile's benchmarks do not account for GC
collection cycles, which should be slower in C#/Java because of the
need to move objects. I think fair benchmarks should always include
garbage collection times.

I don't think it's slower, because GCs usually treat small and large
objects differently (the D GC already does that). So very small objects
(the ones most likely to get allocated and freed in huge amounts) are
copied, and large objects usually are not. Moving a small object is not
much more work than doing a sweep, and you get the extra bonus of not
having to scan the whole heap, just the live data. This is a huge gain,
which makes moving collectors very fast (at the expense of extra memory,
since you have to reserve twice the program's working set).

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Give your hand to the monkey, but not your elbow, since an overly
familiar monkey is irreversible.
-- Ricardo Vaporeso. La Reja, August 1912.

Nov 06 2009
Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, on November 5 at 09:57 you wrote:
Leandro Lucarella wrote:
Andrei Alexandrescu, on November 5 at 08:48 you wrote:
First off: _all_ languages except C, C++, and assembler are or at
least claim to be safe. All. I mean ALL. Did I mention all? If that
was some ideology that is not realistic, is extremely difficult to
achieve, and ends up too painful to use, then such theories would be
difficult to corroborate with "ALL". Walter and I are in agreement
that safety is not difficult to achieve in D and that it would allow
a great many good programs to be written.

I think the problem is the cost. The cost for the programmer (the subset
of language features it can use is reduced) and the cost for the compiler
(to increase the subset of language features that can be used, the
compiler has to be much smarter).

Most languages have a lot of developers, and can afford making the
compiler smarter to allow safety with a low cost for the programmer (at
least when writing code, that cost might be higher performance-wise).

D is already a rich superset of Java. So the cost of making the
language safe and useful was already absorbed.

That's an unfair comparison. Java has a very efficient GC (partially
because of safety), so using D as if it were Java yields very inefficient
programs (using classes and new all over the place). D can't be
completely safe, and because of that it's doomed to have a rather worse
GC, so writing code a la Java in D defeats the purpose of using D in the
first place.

A clear example of this is not being able to take the address of a local.
This is too restrictive to be useful, as you pointed out in your post about
having to write static methods because of this. If you can't find
a workaround for this, I guess safety in D can look a little unrealistic.

Most other languages do not allow taking addresses of locals. Why
are they realistic and SafeD wouldn't? Just because we know we could
do it in unsafe D?

Because in other languages there are no locals! All objects are references
and allocated on the heap (except, maybe, some value types). Again, you
can do that in D too, but because D is a system language, you can't assume
a lot of things, and it has a lot fewer optimization opportunities,
yielding bad performance when not used wisely.

I like the idea of having a safe subset in D, but I think being
a programming language, *runtime* safety should be *always* a choice for
the user compiling the code.

Well in that case we need to think again about the command-line options.

Not necessarily, -release is already there =)

But then, I don't have any issues with the GCC-way of hundreds of compiler
flags to have fine grained control, so I'm all for adding new flags for
that.

As other said, you can never be 100% sure your program won't blow for
unknown reasons (it could do that because a bug in the
compiler/interpreter, or even because a hardware problem), you can just
try to make it as difficult as possible, but 100% safety doesn't exist.

I understand that stance, but I don't find it useful.

The point is that D can't be 100% safe, so spending time trying to
make it that way (especially at the expense of flexibility, i.e., not
providing a way to disable bounds checking in safe modules) makes no
sense. You'll just end up with a less efficient Java.

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
The average person laughs 13 times a day

Nov 05 2009
Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, on November 5 at 10:06 you wrote:
Leandro Lucarella wrote:
A clear example of this is not being able to take the address of a local.
This is too restrictive to be useful, as you pointed out in your post about
having to write static methods because of this. If you can't find
a workaround for this, I guess safety in D can look a little unrealistic.

Sorry, I forgot to mention one thing. My example of List in the
thread "An interesting consequence of safety requirements" used
struct, but it should be mentioned there's a completely safe
alternative: just define List as a class and there is no safety
problem at all. Java, C#, and others define lists as classes and it
didn't seem to kill them. I agree that using a struct in D would be
marginally more efficient, but that doesn't mean that if I want
safety I'm dead in the water. In particular it's great that pointers
are still usable in SafeD. I'm actually surprised that nobody sees
how nicely safety fits D, particularly its handling of "ref".

See my other response about efficiency of D when using new/classes a lot.
You just can't do it efficiently in D, ask bearophile for some benchmarks
;)

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Wake from your sleep,
the drying of your tears,
Today we escape, we escape.

Nov 05 2009
Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, on November 4 at 08:16 you wrote:
Michal Minich wrote:
Hello Michel,

module (system) name;         // interface: unsafe   impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

I thought the first (unsafe-unsafe) case is currently available just by:

module name; // interface: unsafe   impl.: unsafe

separating modules into unsafe-unsafe and safe-safe has no
usefulness, as those modules could not interact; specifically, you
need modules that are implemented by unsafe means but provide
only a safe interface, so I see it as:

module name;                  // interface: unsafe   impl.: unsafe
module (system) name;         // interface: safe     impl.: unsafe
module (safe) name;           // interface: safe     impl.: safe

so you can call system modules (io, network...) from safe code.

I think safe should be the default, as it should be the most used flavor
in user code, right? What about:

module s;             // interface: safe     impl.: safe
module (trusted) t;   // interface: safe     impl.: unsafe
module (unsafe) u;    // interface: unsafe   impl.: unsafe

* s can import other safe or trusted modules (no unsafe for s).
* t can import any kind of module, but it guarantees not to corrupt your
memory if you use it (that's why s can import it).
* u can import any kind of module and makes no guarantees (C bindings
use this).
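
Under this proposal the import rules might play out as below (hypothetical
module names; the (trusted)/(unsafe) annotations are the proposed syntax,
not implemented D, and the io module would be declared "module (trusted)
io;" in its own file):

```d
// file app.d -- safe by default under this proposal (sketch only)
module app;

import io;            // OK: io is a trusted module -- safe interface,
                      // unsafe implementation the author vouches for
// import cbindings;  // would be an error under the proposal: cbindings
//                    // is an (unsafe) module, so safe code cannot
//                    // import it
```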

That's a pretty clean design. How would it interact with a -safe
command-line flag?

I'll use safe by default. If you want to use broken stuff (everything
should be correctly marked as safe (default), trusted or unsafe) and let
it compile anyway, add a compiler flag -no-safe (or whatever).

But people should never use it, unless you are using some broken library
or you are too lazy to mark your modules correctly.

Is this too crazy?

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
The discman drives the controls crazy, it takes you anywhere.
Fasten your seatbelts now, we are going to crash.
Evidently, you didn't listen to the speech
the flight attendant gave before takeoff.

Nov 04 2009
Michal Minich <michal.minich gmail.com> writes:
On Wed, 04 Nov 2009 14:03:42 -0300, Leandro Lucarella wrote:

I think safe should be the default, as it should be the most used flavor
in user code, right? What about:

module s;             // interface: safe     impl.: safe
module (trusted) t;   // interface: safe     impl.: unsafe
module (unsafe) u;    // interface: unsafe   impl.: unsafe

* s can import other safe or trusted modules (no unsafe for s).
* t can import any kind of module, but it guarantees not to corrupt your
memory if you use it (that's why s can import it).
* u can import any kind of module and makes no guarantees (C bindings
use this).

That's a pretty clean design. How would it interact with a -safe
command-line flag?

I'll use safe by default. If you want to use broken stuff (everything
should be correctly marked as safe (default), trusted or unsafe) and let
it compile anyway, add a compiler flag -no-safe (or whatever).

But people should never use it, unless you are using some broken library
or you are too lazy to mark your modules correctly.

Is this too crazy?

I have no problem with safe as default; most of my code is safe. I also
like module (trusted) - it really pictures its meaning, better than
"system".

But I think there is no reason to use a -no-safe compiler flag ... for
what reason would one want to force a safer program to compile as less
safe :)

As I'm thinking more about it, I don't see any reason to have any
compiler flag for safety at all.

Nov 04 2009
Leandro Lucarella <llucax gmail.com> writes:
Michal Minich, on November 4 at 18:58 you wrote:
As I'm thinking more about it, I don't see any reason to have any
compiler flag for safety at all.

That was exactly my point.

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
Be nice to nerds
Chances are you'll end up working for one

Nov 04 2009
Michal Minich <michal.minich gmail.com> writes:
On Wed, 04 Nov 2009 13:12:54 -0600, Andrei Alexandrescu wrote:

But I think there is no reason to use a -no-safe compiler flag ... for
what reason would one want to force a safer program to compile as less
safe :)

Efficiency (e.g. remove array bounds checks).

As I'm thinking more about it, I don't see any reason to have any
compiler flag for safety at all.

That would be a great turn of events!!!

Andrei

Memory safety is a pretty specific thing. If you want it, you want it all,
not just some part of it - otherwise you cannot call it memory safety. The
idea of a safe module which under some compiler switch is not safe does
not appeal to me. But efficiency is also important, and if you want it,
why not move the code subject to bounds checks to a trusted/system module
- I hope they are not checked for bounds in release mode. Moving parts of
the code to trusted modules is more semantically descriptive than the
crude tool of an ad-hoc compiler switch.

One thing I'm concerned with, whether there is a compiler switch or not,
is that the number of modules will increase, as you will probably want to
split some modules in two because some part may be safe and some not. I'm
wondering why safety is not discussed at the function level, similarly to
how pure and nothrow currently exist. I'm not sure this would be good,
just wondering. Was this topic already discussed?

Nov 04 2009
Michal Minich <michal.minich gmail.com> writes:
On Wed, 04 Nov 2009 14:24:47 -0600, Andrei Alexandrescu wrote:

But efficiency is also important, and if you want it, why not move the
code subject to bounds checks to a trusted/system module - I hope they
are not checked for bounds in release mode. Moving parts of the code to
trusted modules is more semantically descriptive than the crude tool
of an ad-hoc compiler switch.

Well it's not as simple as that. Trusted code is not unchecked code -
it's code that may drop redundant checks here and there, leaving code
correct, even though the compiler cannot prove it. So no, there's no
complete removal of bounds checking. But a trusted module is allowed to
replace this:

foreach (i; 0 .. a.length) ++a[i];

with

foreach (i; 0 .. a.length) ++a.ptr[i];

The latter effectively escapes checks because it uses unchecked pointer
arithmetic. The code is still correct, but this time it's the human
vouching for it, not the compiler.
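
Spelled out as a self-contained sketch (hypothetical function names; the
elision technique is exactly the .ptr trick above):

```d
// Checked version: in safe code every a[i] is bounds-checked.
void incrementChecked(int[] a)
{
    foreach (i; 0 .. a.length)
        ++a[i];
}

// Trusted version: .ptr indexing skips the bounds check; the loop
// bound i < a.length is what keeps it correct -- the human vouches
// for it, not the compiler.
void incrementTrusted(int[] a)
{
    foreach (i; 0 .. a.length)
        ++a.ptr[i];
}

void main()
{
    auto a = [1, 2, 3];
    incrementTrusted(a);
    assert(a == [2, 3, 4]);
}
```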

One thing I'm concerned with, whether there is a compiler switch or not,
is that the number of modules will increase, as you will probably want to
split some modules in two because some part may be safe and some not.
I'm wondering why safety is not discussed at the function level,
similarly to how pure and nothrow currently exist. I'm not sure this
would be good, just wondering. Was this topic already discussed?

This is a relatively new topic, and you pointed out some legit kinks.
One possibility I discussed with Walter is to have version(safe) vs.
version(system) or so. That would allow a module to expose different
interfaces depending on the command line switches.

Andrei

Sorry for the long post, but it should explain how safety specification
should work (and how not).

Consider these 3 ways of specifying memory safety:

safety specification at module level (M)
safety specification at function level (F)
safety specification using version switching (V)

I see a very big difference between these things:
while M and F are "interface" specifications, V is an implementation
detail.

This difference applies only to library/module users; it makes no
difference for the library/module writer - he must always decide whether
he writes safe, unsafe, or trusted code.

Imagine a scenario with M safety for the library user:
the user wants to make a memory-safe application. He marks his main
module as safe and can be sure (and/or trust) that his application is
safe from this point on; because safety is explicit in the "interface",
he cannot import and use unsafe code.

The scenario with V safety: the user wants to make a memory-safe
application. He can import any module. He can use the -safe switch so
the compiler will use the safe version of the code - if available! The
user can never be sure whether his application is safe or not. Safety is
an implementation detail!

For this reason, I think V safety is a very unsuitable option.
Absolutely useless.

But there are also problems with M safety. Imagine a module for string
manipulation with 10 independent functions. The module is marked safe.
The library writer then decides to add another function, which is
unsafe. He can now do one of the following:

Option 1: He can mark the module trusted and implement the function in
an unsafe way. Compatibility with safe clients using this module
remains. Bad thing: there are 10 provably safe functions which are no
longer checked by the compiler. Also, the trust level of the module is
lower in the eyes of the user. The library may end up with all modules
trusted (none safe).

Option 2: He can implement it in a separate unsafe module. This has a
negative impact on the library structure.

Option 3: He can implement it in a separate trusted module and publicly
import that trusted module in the original safe module.

The third option is transparent to the module user and probably the best
solution, but I have a feeling that many existing modules will end up
having an unsafe twin. I can see this pattern emerging:

module(safe) std.string
module(trusted) std.string_trusted // do not import, already exposed by
std.string
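In the proposed module-level syntax, option 3 might look like the following sketch. Note that the file names and the function are made up, and the module(...) syntax is only the thread's proposal, so this does not compile with any released dmd:

```d
// --- std/string_trusted.d : the unsafe twin, human-verified ---
module(trusted) std.string_trusted;

// hypothetical helper doing unchecked pointer work internally
size_t rawLength(const(char)* p)
{
    size_t n = 0;
    while (*p++) ++n;   // pointer arithmetic, disallowed in safe code
    return n;
}

// --- std/string.d : the module users actually import ---
module(safe) std.string;

public import std.string_trusted; // transparently exposes the trusted twin
```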

Therefore I propose to use F safety.

It is in fact the same beast as pure and nothrow - they also guarantee
some kind of safety, and they are also part of the function interface
(signature). The compiler also needs to perform stricter checks than
normal.

Just imagine marking an entire module pure or nothrow. While certainly
possible, is it practical? You would find yourself splitting your
functions into separate modules by the specific check, or not using pure
and nothrow at all.

This way, if you mark your main function safe, you can be sure (and/or
trust) that your application is safe. More usually, you can mark only
some functions safe, and the requirement will propagate to all called
functions, the same way as for pure or nothrow.
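For comparison, function-level propagation of this kind can be sketched as follows. The attribute names @safe and @trusted are used purely for illustration here; they were not part of the proposal at the time this was written:

```d
// Function-level (F) safety sketch: the safety requirement propagates
// through calls, just like pure or nothrow.

@trusted int sum(int[] a)
{
    int s = 0;
    // Unchecked pointer arithmetic inside; the author vouches for it.
    for (auto p = a.ptr; p < a.ptr + a.length; ++p)
        s += *p;
    return s;
}

@safe int doubleSum(int[] a)
{
    // A safe function may call trusted functions but not system ones;
    // the compiler enforces this transitively through the call graph.
    return 2 * sum(a);
}

unittest
{
    assert(doubleSum([1, 2, 3]) == 12);
}
```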

One thing that remains to be figured out is how to turn off runtime
bounds checking for trusted code (and probably safe too). This is a
legitimate requirement, because probably the entire standard library
will be safe or trusted, and users who are not concerned with safety and
want speed need such a compiler switch.

Nov 04 2009
Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el  5 de noviembre a las 08:48 me escribiste:
First off: _all_ languages except C, C++, and assembler are, or at
least claim to be, safe. All. I mean ALL. Did I mention all? If safety
were some unrealistic ideology, extremely difficult to achieve and too
painful to use, such theories would be difficult to corroborate with
"ALL". Walter and I are in agreement that safety is not difficult to
achieve in D and that it would allow a great many good programs to be
written.

I think the problem is the cost. The cost for the programmer (the subset
of language features it can use is reduced) and the cost for the compiler
(to increase the subset of language features that can be used, the
compiler has to be much smarter).

Most languages have a lot of developers, and can afford making the
compiler smarter to allow safety with a low cost for the programmer (at
least when writing code, that cost might be higher performance-wise).

A clear example of this is not being able to take the address of a
local. This is too restrictive to be useful, as you pointed out in your
post about having to write static methods because of this. If you can't
find a workaround, I guess safety in D can look a little unrealistic.

I like the idea of having a safe subset in D, but D being the kind of
programming language it is, I think *runtime* safety should *always* be
a choice for the user compiling the code.

As others said, you can never be 100% sure your program won't blow up
for unknown reasons (it could do so because of a bug in the
compiler/interpreter, or even because of a hardware problem); you can
just try to make failure as unlikely as possible, but 100% safety
doesn't exist.

--
Leandro Lucarella (AKA luca)                     http://llucax.com.ar/
----------------------------------------------------------------------
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145  104C 949E BFB6 5F5A 8D05)
----------------------------------------------------------------------
It has been said so often that appearances deceive
Of course they will deceive anyone vulgar enough to believe it

Nov 05 2009
Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el  3 de noviembre a las 17:54 me escribiste:
Leandro Lucarella wrote:
Andrei Alexandrescu, el  3 de noviembre a las 16:33 me escribiste:
SafeD is, unfortunately, not finished at the moment. I want to leave
in place a stub that won't lock our options. Here's what we
currently have:

module(system) calvin;

This means calvin can do unsafe things.

module(safe) susie;

This means susie commits to extra checks and therefore only a subset of D.

module hobbes;

This means hobbes abides to whatever the default safety setting is.

The default safety setting is up to the compiler. In dmd by default
it is "system", and can be overridden with "-safe".

What's the rationale for letting the compiler decide? I can see nothing
but trouble coming from this. A module will typically be written to be
safe or system; I think the default should be defined (I'm not sure what
the default should be though).

The parenthesis pretty much destroys your point :o).

I guess this is a joke, but I have to ask: why? I'm not sure about
plenty of things; that doesn't mean they are pointless.

I don't think letting the implementation decide is a faulty model.
If you know what you want, you say it. Otherwise it means you don't
care.

I can't understand how you can't care. Maybe I'm misunderstanding the
proposal, since nobody else seems to see a problem here.


Nov 04 2009
Leandro Lucarella <llucax gmail.com> writes:
Walter Bright, el  3 de noviembre a las 16:21 me escribiste:
Andrei Alexandrescu wrote:
Sketch of the safe rules:

\begin{itemize*}
\item No  cast  from a pointer type to an integral type and vice versa

replace integral type with non-pointer type.

\item No  cast  between unrelated pointer types
\item Bounds checks on all array accesses
\item  No  unions  that  include  a reference  type  (array,   class ,
pointer, or  struct  including such a type)

pointers are not a reference type. Replace "reference type" with
"pointers or reference types".

Strictly speaking, arrays are not reference types either, right?

-- Sidharta Kiwi

Nov 04 2009
jpf <spam example.com> writes:
Andrei Alexandrescu wrote:
How can we address that? Again, I'm looking for a simple, robust,
extensible design that doesn't lock our options.

Thanks,

Andrei

You may want to have a look at the CoreCLR security model (that's used
by silverlight / moonlight). It's quite similar to what you've proposed.
http://www.mono-project.com/Moonlight2CoreCLR#Security_levels

Btw, is there a reason why safety should be specified at the module
level? Now that we have attributes, this would be a perfect use case for
them. Example:

Safety(Safe)
void doSomething()...

or:
Safety.Critical
void doSomething()...

where that attribute could be applied to functions, classes, modules, ...

Another related question: Will there be a way to provide different
implementations for different safety levels?

version(Safety.Critical)
{
//Some unsafe yet highly optimized asm stuff here
}
else
{
//Same thing in safe
}

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
jpf wrote:
You may want to have a look at the CoreCLR security model (that's used
by silverlight / moonlight). It's quite similar to what you've proposed.
http://www.mono-project.com/Moonlight2CoreCLR#Security_levels

I don't have much time right now, but here's what a cursory look reveals:

====================
Security levels

The CoreCLR security model divides all code into three distinct levels:
transparent, safe-critical and critical. This model is much simpler to
understand (and implement) than CAS (e.g. no stack-walk). Only a few
rules can describe much of it.
====================

The keywords "security" and "stack-walk" give it away that this is a
matter of software security, not language safety. These are quite different.

Btw, is there a reason why safety should be specified at the module
level? As we have attributes now that would be a perfect usecase for
them: example:

Safety(Safe)
void doSomething()...

or:
Safety.Critical
void doSomething()...

where that attribute could be applied to functions, classes, modules, ...

Another related question: Will there be a way to provide different
implementations for different safety levels?

version(Safety.Critical)
{
//Some unsafe yet highly optimized asm stuff here
}
else
{
//Same thing in safe
}

I think it muddies things too much to allow people to make safety
decisions at any point (e.g., I'm not a fan of C#'s unsafe).

Andrei

Nov 04 2009
jpf <spam example.com> writes:
Andrei Alexandrescu wrote:
jpf wrote:
You may want to have a look at the CoreCLR security model (that's used
by silverlight / moonlight). It's quite similar to what you've proposed.
http://www.mono-project.com/Moonlight2CoreCLR#Security_levels

I don't have much time right now, but here's what a cursory look reveals:

====================
Security levels

The CoreCLR security model divide all code into three distinct levels:
transparent, safe-critical and critical. This model is much simpler to
understand (and implement) than CAS (e.g. no stack-walk). Only a few
rules can describe much of it.
====================

The keywords "security" and "stack-walk" give it away that this is a
matter of software security, not language safety. These are quite
different.

What I wanted to refer to are the levels "Transparent", "Critical" and
"Safe Critical", which work exactly like "safe", "system" and "Yeah, I
do unsafe stuff inside, but safe modules can call me no problem". The
implementation and use case might be different, but the meaning is the
same. There's nothing unique in the .NET implementation; I just thought
you might want to have a look at how others solved a similar problem.

Nov 04 2009
Tim Matthews <tim.matthews7 gmail.com> writes:
Andrei Alexandrescu wrote:
[...]

Not sure if this is the right topic to say this, but maybe D needs
monads to allow more functions to be marked as pure. Then functional
programming could be added to the list of paradigms D supports, and D
would also be safer.

Nov 04 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Tim Matthews wrote:

Not sure if this is the right topic to say this, but maybe D needs
monads to allow more functions to be marked as pure. Then functional
programming could be added to the list of paradigms D supports, and D
would also be safer.

Would be great if you found the time to write and discuss a DIP.

Andrei

Nov 05 2009
"AJ" <aj nospam.net> writes:
Andrei Alexandrescu wrote:
SafeD is, unfortunately, not finished at the moment. I want to leave
in place a stub that won't lock our options. Here's what we currently
have:

Is the whole SafeD thing trying to do something similar to Microsoft's
"managed/unmanaged" code thing? I don't know much about it, but I had
relegated the managed/unmanaged thing to being C++-like (unmanaged) or
Java-like (managed). "Sandboxing", in short.

Nov 06 2009