
digitalmars.D - Void-safety (and related things)

reply bearophile <bearophileHUGS lycos.com> writes:
Found on the Lambda the Ultimate blog: void-safety in the Eiffel language,
another attempt at solving this problem:
http://docs.eiffel.com/sites/default/files/void-safe-eiffel.pdf


I think that to solve this problem a language like D can use three different
strategies at the same time. Three kinds of object references can be defined:
1) The default one (its syntax is the shortest; such references are defined
like the current ones) is the "non-nullable object reference". Many objects in
a program are like this. The type system assures the code is correct, so you
don't need to test such references for null. As in C#, the compiler keeps its
eyes open to avoid the use of uninitialized references of this kind. (This is
a situation where "good" is better than "perfect": C# seems to work well
enough in its ability to spot uninitialized variables.)
2) The second kind is the current one, the "unsafe nullable object reference".
It's faster, its syntax is a bit longer, and it's to be used only where
maximum performance is necessary.
3) The third kind is the "safe nullable object reference". You can define it
using syntax like "Foo? f;". It's a "fat" reference: besides the pointer, this
reference contains an integer number that represents the class. If your
program has 500 classes, you need 500 different values for it. On the other
hand, a specific reference in a program usually can't be of 500 different
classes, so the maximum number can be decreased, and you can keep at runtime
some conversion tables that convert subsets of such numbers into a full
pointer to the class info. Such tables are a bit slow to use (but they don't
need much memory); however, the program uses them only when a reference of the
third kind is null, so it's not a big problem. On 64-bit systems such a
numeric tag can be put into the most significant bits of the pointer itself
(so when such a pointer isn't null you just need a test and a mask; the shift
is required only in the uncommon case of null). This also means that the
maximum number of possible class instances decreases, but not by much (you can
have some conversion tables to reduce such a numeric tag to 2-5 bits in most
programs). When the code uses a method of a null reference of this kind, the
program may call the corresponding method of a "default" instance of that
class (or even a user-specified instance).
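A minimal C sketch of the 64-bit tagging idea (the 48-bit address assumption,
the field widths, and all names are illustrative, not an actual D ABI):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical encoding of a "safe nullable reference" on a 64-bit
   system: the low 48 bits hold the object address, the high 16 bits a
   small class tag used to find a default instance when the address
   part is null. */

#define ADDR_MASK ((1ULL << 48) - 1)
#define TAG_SHIFT 48

typedef uint64_t SafeRef;

static SafeRef make_ref(void *obj, uint16_t class_tag) {
    return ((uint64_t)(uintptr_t)obj & ADDR_MASK)
         | ((uint64_t)class_tag << TAG_SHIFT);
}

static void *deref(SafeRef r, void *default_instances[]) {
    uint64_t addr = r & ADDR_MASK;      /* common case: one mask */
    if (addr != 0)
        return (void *)(uintptr_t)addr;
    /* uncommon null case: shift out the tag, consult the table */
    return default_instances[r >> TAG_SHIFT];
}
```

A null reference of class tag 1 would thus transparently resolve to
`default_instances[1]` instead of crashing.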

Do you like it? :-)

Bye,
bearophile
Aug 11 2009
parent reply Jason House <jason.james.house gmail.com> writes:
I've recently convinced myself that nullability should be the exception instead
of the norm. So much of the code I write in C#/D uses reference objects
assuming they're non-null. Only in certain special cases do I handle null
explicitly. The issue is that if any special case is missed/mishandled, it can
spread to other code.

I'm also too lazy to write non-null contracts in D. They also have far less
value since violations are not caught at compile time (or better yet, in my IDE
as I write code).

It may be as simple as having the following 3 types:
T // non-nullable
T? // nullable, safe
T* //  nullable, unsafe

I'd also like to remove all default initialization in favor of
uninitialized-variable errors. Default initialization in D is cute, but it is
not a solution for programmer oversight. Single-threaded code will
reproducibly do the wrong thing, which may make the bug harder to notice in
the first place. The very fact that the signaling-NaN change has made it into
D shows that people want this type of behavior!
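A small C analogue of the point (the function is made up for illustration):
zero-initialization keeps a buggy program compiling and reproducible, but
silent.

```c
#include <assert.h>

/* With D-style default initialization, forgetting to set `max`
   properly still compiles and runs reproducibly -- it just
   reproducibly returns the wrong answer for all-negative input.
   An uninitialized-variable error would have flagged the oversight
   at compile time instead. */
static int max_of(const int *a, int n) {
    int max = 0;            /* oversight: should start from a[0] */
    for (int i = 0; i < n; i++)
        if (a[i] > max)
            max = a[i];
    return max;             /* all-negative arrays yield 0, silently */
}
```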


Aug 11 2009
next sibling parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
Jason House wrote:
 I'd also like to remove all default initialization in favor of use of
uninitialized variable errors. Default initialization in D is cute, but it is
not a solution for programmer oversight. Single-threaded code will reproducibly
do the wrong thing, but may be harder to notice in the first place. The very
fact that the signalling nan change has made it into D shows that people want
this type of behavior!

Yes. Default initialization is really weak against uninitialized-variable
errors. You notice the errors of the first one at runtime, and the errors of
the second one at compile time. But I don't see that changing anytime soon...
(I think it's because "it gets hard".)
Aug 11 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Ary Borenszweig:
(I think it's because "it gets hard").<

You can't ask a single person to be able to do everything. Are you able to
implement that thing? Probably I am not able. If someone here is able and
willing to do it, then I suggest that person ask Walter for permission to
implement it.

Bye,
bearophile
Aug 11 2009
parent reply Michiel Helvensteijn <m.helvensteijn.remove gmail.com> writes:
bearophile wrote:

 You can't ask a single person to be able to do everything. Are you able to
 implement that thing? Probably I am not able. If someone here is able and
 willing to do it then I suggest such person to ask Walter permission to
 implement it.

I doubt it's the direction D wants to go, because proving correctness at
compile time requires the holy grail, and testing correctness at runtime
requires extra space for each variable and extra time for each access.

-- 
Michiel Helvensteijn
Aug 11 2009
parent reply Ary Borenszweig <ary esperanto.org.ar> writes:
Michiel Helvensteijn wrote:
 bearophile wrote:
 
 You can't ask a single person to be able to do everything. Are you able to
 implement that thing? Probably I am not able. If someone here is able and
 willing to do it then I suggest such person to ask Walter permission to
 implement it.

I doubt it's the direction D wants to go. Because proving correctness at compile-time requires the holy grail, and testing correctness at runtime requires extra space for each variable and extra time for each access.

What do you mean by "holy grail"?
Aug 11 2009
parent reply Michiel Helvensteijn <m.helvensteijn.remove gmail.com> writes:
Ary Borenszweig wrote:

 I doubt it's the direction D wants to go. Because proving correctness at
 compile-time requires the holy grail, and testing correctness at runtime
 requires extra space for each variable and extra time for each access.

What do you mean by "holy grail"?

You missed that discussion, did you? Basically, if you want to know at
compile time whether a variable is initialized, there are several
possibilities:

* Be overly conservative: make sure every possible computational path has an
assignment to the variable, otherwise give an error. This would throw out the
baby with the bathwater; many valid programs would cause an error.

* Actually analyze the control flow: make sure that exactly all reachable
states have the variable initialized, otherwise give an error. Dubbed the
"holy grail", because this sort of analysis is still some time off, and would
allow some very cool correctness verification.

-- 
Michiel Helvensteijn
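A small C illustration of the gap between the two approaches (hypothetical
example): `x` is assigned on every path that later reads it, yet a
conservative analysis that treats the two ifs independently would report it as
possibly uninitialized and reject a valid program.

```c
#include <assert.h>

/* `x` is set on every path that reads it, because both branches test
   the same condition -- but a conservative definite-assignment
   analysis that ignores the correlation between the two ifs would
   complain that `x` may be used uninitialized. */
static int example(int cond) {
    int x;
    if (cond)
        x = 1;
    if (cond)
        return x;   /* only reachable when x was assigned above */
    return 0;
}
```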
Aug 11 2009
parent reply "Joel C. Salomon" <joelcsalomon gmail.com> writes:
Michiel Helvensteijn wrote:
 I doubt it's the direction D wants to go. Because proving correctness at
 compile-time requires the holy grail, and testing correctness at runtime
 requires extra space for each variable and extra time for each access.


Basically, if you want to know at compile-time whether a variable is
initialized, there are several possibilities:

* Be overly conservative: Make sure every possible computational path has an
assignment to the variable, otherwise give an error. This would throw out the
baby with the bathwater. Many valid programs would cause an error.

* Actually analyze the control flow: Make sure that exactly all reachable
states have the variable initialized, otherwise give an error. Dubbed "holy
grail", because this sort of analysis is still some time off, and would allow
some very cool correctness verification.

Third (stop-gap) option:

• Be conservative, but trust the programmer: Allow some sort of pragma to tell
the compiler that the programmer has done the flow analysis and the variable
really is set (or non-null, or…). It will be an unchecked error to lie to the
compiler -- until the holy grail is implemented, when it will become a checked
error.

This is a feature of the Plan 9 C compilers (cf. “The compile-time
environment” in <http://plan9.bell-labs.com/sys/doc/comp.html>).

“If you lie to the compiler, it will get its revenge.” —Henry Spencer

—Joel Salomon
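Plan 9 exposes this convention to the programmer through the SET and USED
macros; here is a self-contained approximation (the exact Plan 9 definitions
may differ, so treat these as a sketch of the idea rather than the real
library code):

```c
#include <assert.h>

/* Approximation of the Plan 9 convention: SET(x) silences a "used and
   not set" diagnostic with a dummy assignment the programmer vouches
   is never observed; USED(x) silences "set and not used". Lying to
   the compiler here is an unchecked error. */
#define SET(x)  ((x) = 0)
#define USED(x) if(x){}else{}

static int pick(int cond) {
    int x;
    SET(x);          /* programmer asserts: x is set before any use */
    if (cond)
        x = 10;
    else
        x = 20;
    USED(cond);      /* likewise: "yes, I meant not to use this further" */
    return x;
}
```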
Aug 21 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Joel C. Salomon:

http://plan9.bell-labs.com/sys/doc/comp.html<

Thank you for that link. I can see some interesting things in that very C-like language:
The #if directive was omitted because it greatly complicates the preprocessor,
is never necessary, and is usually abused. Conditional compilation in general
makes code hard to understand; the Plan 9 source uses it sparingly. Also,
because the compilers remove dead code, regular if statements with constant
conditions are more readable equivalents to many #ifs.<

Can the "static if" be removed from D then?

------------------

Variables inside functions can have any order; are D compilers doing this
too?
Unlike its counterpart on other systems, the Plan 9 loader rearranges data to
optimize access. This means the order of variables in the loaded program is
unrelated to its order in the source. Most programs dont care, but some assume
that, for example, the variables declared by

int a;
int b;

will appear at adjacent addresses in memory. On Plan 9, they won't.<

------------------

Plan 9 uses this strategy to solve endianness-induced troubles in integer
I/O:
Plan 9 is a heterogeneous environment, so programs must expect that external
files will be written by programs on machines of different architectures. The
compilers, for instance, must handle without confusion object files written by
other machines. The traditional approach to this problem is to pepper the
source with #ifdefs to turn byte-swapping on and off. Plan 9 takes a different
approach: of the handful of machine-dependent #ifdefs in all the source, almost
all are deep in the libraries. Instead programs read and write files in a
defined format, either (for low volume applications) as formatted text, or (for
high volume applications) as binary in a known byte order. If the external data
were written with the most significant byte first, the following code reads a
4-byte integer correctly regardless of the architecture of the executing
machine (assuming an unsigned long holds 4 bytes):

ulong
getlong(void)
{
	ulong l;

	l = (getchar()&0xFF)<<24;
	l |= (getchar()&0xFF)<<16;
	l |= (getchar()&0xFF)<<8;
	l |= (getchar()&0xFF)<<0;
	return l;
}

Note that this code does not swap the bytes; instead it just reads them in the
correct order. Variations of this code will handle any binary format and also
avoid problems involving how structures are padded, how words are aligned, and
other impediments to portability. Be aware, though, that extra care is needed
to handle floating point data.<

------------------

I don't fully understand this:
the declaration

extern register reg;

(this appearance of the register keyword is not ignored) allocates a global
register to hold the variable reg. External registers must be used carefully:
they need to be declared in all source files and libraries in the program to
guarantee the register is not allocated temporarily for other purposes.
Especially on machines with few registers, such as the i386, it is easy to
link accidentally with code that has already usurped the global registers, and
there is no diagnostic when this happens. Used wisely, though, external
registers are powerful. The Plan 9 operating system uses them to access
per-process and per-machine data structures on a multiprocessor. The storage
class they provide is hard to create in other ways.<

Bye,
bearophile
Aug 21 2009
parent "Joel C. Salomon" <joelcsalomon gmail.com> writes:
bearophile wrote, re. <http://plan9.bell-labs.com/sys/doc/comp.html>:
 I can see some interesting things in that very C-like language:
 
 The #if directive was omitted because it greatly complicates the preprocessor,
is never necessary, and is usually abused. Conditional compilation in general
makes code hard to understand; the Plan 9 source uses it sparingly. Also,
because the compilers remove dead code, regular if statements with constant
conditions are more readable equivalents to many #ifs.

Can the "static if" be removed from D then?

D uses "static if" for things other than versioning. But this attitude is relevant when considering “enhancements” to D’s version(foo).
 I don't fully understand this:
 
 the declaration
     extern register reg;
 (this appearance of the register keyword is not ignored) allocates a global
register to hold the variable reg. External registers must be used carefully:
they need to be declared in all source files and libraries in the program to
guarantee the register is not allocated temporarily for other purposes.
Especially on machines with few registers, such as the i386, it is easy to link
accidentally with code that has already usurped the global registers and there
is no diagnostic when this happens. Used wisely, though, external registers are
powerful. The Plan 9 operating system uses them to access per-process and
per-machine data structures on a multiprocessor. The storage class they provide
is hard to create in other ways.


Generally, the Plan 9 C compilers ignore the "register" keyword, preferring to
handle this sort of optimization themselves. The "extern register" declaration
is not for optimization, but to allocate a register as a global variable. This
register will never be used by the compiler as a temporary, or to pass
arguments, or whatever compilers use registers for; it has been completely
given over for the programmer’s use. Apparently, this was helpful in writing
the Plan 9 kernel.

—Joel Salomon
Aug 21 2009
prev sibling parent Jarrett Billingsley <jarrett.billingsley gmail.com> writes:
On Fri, Aug 21, 2009 at 12:58 PM, bearophile<bearophileHUGS lycos.com> wrote:
 Joel C. Salomon:

http://plan9.bell-labs.com/sys/doc/comp.html<

Thank you for that link. I can see some interesting things in that very C-like language:
 The #if directive was omitted because it greatly complicates the
preprocessor, is never necessary, and is usually abused. Conditional
compilation in general makes code hard to understand; the Plan 9 source uses
it sparingly. Also, because the compilers remove dead code, regular if
statements with constant conditions are more readable equivalents to many
#ifs.<
 Can the "static if" be removed from D then?

No. Not only can 'static if' appear where 'if' can't (like at module scope and inside templates), it also does not create a scope unlike a normal 'if'. They're similar, but different enough to warrant being different constructs.
 Variables inside functions can have any order, are D compilers too doing
this?

None currently do, but I think it's allowed by the D spec. Please don't go beg the LDC developers for this as soon as you read this. They really do have better things to do.
Aug 21 2009