
digitalmars.D.announce - Reddit: SafeD - The Safe Subset of D

Walter Bright <newshound1 digitalmars.com> writes:
http://reddit.com/r/programming/info/6d210/comments/
Mar 22 2008
Bill Baxter <dnewsgroup billbaxter.com> writes:
Walter Bright wrote:
 
 http://reddit.com/r/programming/info/6d210/comments/

Is this something that actually exists? Or just an idea being thrown around? Or just a marketing slogan for using D without pointers? -bb
Mar 22 2008
Walter Bright <newshound1 digitalmars.com> writes:
Bill Baxter wrote:
 Is this something that actually exists?  Or just an idea being thrown 
 around? Or just a marketing slogan for using D without pointers?

It's an idea for defining a language subset that is enforceable by the compiler.
Mar 22 2008
Bjoern <nanali nospam-wanadoo.fr> writes:
Walter Bright schrieb:
 Bill Baxter wrote:
 Is this something that actually exists?  Or just an idea being thrown 
 around? Or just a marketing slogan for using D without pointers?

It's an idea for defining a language subset that is enforceable by the compiler.

... in other words, an upcoming compiler switch ? I like that idea very much.
Mar 23 2008
Bill Baxter <dnewsgroup billbaxter.com> writes:
Walter Bright wrote:
 Bill Baxter wrote:
 Is this something that actually exists?  Or just an idea being thrown 
 around? Or just a marketing slogan for using D without pointers?

It's an idea for defining a language subset that is enforceable by the compiler.

Hmm. This sounds like a big can of worms to me. I'd rather have bug fixes or further progress on the WalterAndrei.pdf agenda than a compiler switch that forces me to use a restricted subset of D. I guess I'm just not "enterprise" enough. :-) --bb
Mar 25 2008
Sean Kelly <sean invisibleduck.org> writes:
== Quote from Bill Baxter (dnewsgroup billbaxter.com)'s article
 Walter Bright wrote:
 Bill Baxter wrote:
 Is this something that actually exists?  Or just an idea being thrown
 around? Or just a marketing slogan for using D without pointers?

It's an idea for defining a language subset that is enforceable by the compiler.

Hmm. This sounds like a big can of worms to me. I'd rather have bug fixes or further progress on the WalterAndrei.pdf agenda than a compiler switch that forces me to use a restricted subset of D. I guess I'm just not "enterprise" enough. :-)

While you're at it, you may as well ask for all the features in the 1.0 spec to actually be implemented. Why aim low? Sean
Mar 25 2008
Lars Ivar Igesund <larsivar igesund.net> writes:
Walter Bright wrote:

 
 http://reddit.com/r/programming/info/6d210/comments/

In light of this I find it rather incredible that printf still is exposed through Object. -- Lars Ivar Igesund blog at http://larsivi.net DSource, #d.tango & #D: larsivi Dancing the Tango
Mar 23 2008
Walter Bright <newshound1 digitalmars.com> writes:
Lars Ivar Igesund wrote:
 In light of this I find it rather incredible that printf still is exposed
 through Object.

printf would have to be removed from the safe D subset.
Mar 23 2008
Julio César Carrascal Urquijo <jcarrascal gmail.com> writes:
Hello Walter,

 Lars Ivar Igesund wrote:
 
 In light of this I find it rather incredible that printf still is
 exposed through Object.
 


Yay!!! That's great news
Mar 23 2008
Lars Ivar Igesund <larsivar igesund.net> writes:
Walter Bright wrote:

 Lars Ivar Igesund wrote:
 In light of this I find it rather incredible that printf still is exposed
 through Object.

printf would have to be removed from the safe D subset.

Point is that there is a useless _and_ unsafe global symbol/presence (only in Phobos though). -- Lars Ivar Igesund blog at http://larsivi.net DSource, #d.tango & #D: larsivi Dancing the Tango
Mar 23 2008
Charles D Hixson <charleshixsn earthlink.net> writes:
Walter Bright wrote:
 
 http://reddit.com/r/programming/info/6d210/comments/

http://www.digitalmars.com/d/2.0/safed.html

A bit light on the syntax, but I completely agree with avoiding pointers whenever possible. Unchecked casts and unions are also clearly dangerous, if less so. Unfortunately, unchecked casts seem to be necessary for some purposes. It's not just efficiency; I literally don't see any way around them. (I'm considering a union to be a kind of unchecked cast.)

P.S.: What is a checked cast? If I want to consider a long as an array of bytes: byte[long.sizeof] val; it ought to be safe to later consider that array of bytes as a long. But what's a safe (checked) way to do it? (I'm not talking about conversion. I don't want to end up with an array of longs.)

PPS: When talking about casts or type conversions, please make it explicit whether the same bit pattern is maintained. I often read those descriptions, and realize that I can't figure out exactly what is happening. With C I was always certain that I was just telling the compiler to think about the same piece of memory differently, and that nothing actually changed. With more modern languages, a lot more magic happens under the hood, and I'm no longer as certain what's going on. I often wonder after reading the documentation whether the same bit pattern is maintained, or whether an equivalent value is produced. E.g., I've never tried casting a float to a long. What would it produce? I can't predict.

I'd often prefer to deal with ulongs or ucents rather than byte arrays, but then at other times I need to address particular bytes out of that value. Because I don't really understand a cast, I just use byte arrays (well, ubyte). But it's "sloppier". Generally I'm dealing with a unitary entity, and needing to think of it as an array all the time is uncomfortable. (I'd even like a notation for dealing with particular bits, though I haven't needed that recently.)

Note that this isn't a request for a change in how things act, but rather in how they are documented. I *suspect* that cast is presumed to be defined by C, and that it means "Think about the type differently, but don't change its bit pattern", but I'm never quite certain.
Mar 23 2008
Bill Baxter <dnewsgroup billbaxter.com> writes:
Charles D Hixson wrote:
 With C I was always certain that I was just telling the 
 compiler to think about the same piece of memory differently, and that 
 nothing actually changed.  

Not so:

int x = (int)2.5;

Not the same bit pattern after as before (not even if it was 2.0 on the right). But maybe you're only talking about casts with pointers in them?

--bb
Mar 23 2008
Walter Bright <newshound1 digitalmars.com> writes:
Charles D Hixson wrote:
 PPS:  When talking about casts or type conversions, please make it 
 explicit whether the same bit pattern is maintained. I often read those 
 descriptions, and realize that I can't figure out exactly what is 
 happening.  With C I was always certain that I was just telling the 
 compiler to think about the same piece of memory differently, and that 
 nothing actually changed.

int i = 3;
double d = (double)i;

changes the bit pattern in C (as well as in D).
Mar 23 2008
Charles D Hixson <charleshixsn earthlink.net> writes:
Walter Bright wrote:
 Charles D Hixson wrote:
 PPS:  When talking about casts or type conversions, please make it 
 explicit whether the same bit pattern is maintained. I often read 
 those descriptions, and realize that I can't figure out exactly what 
 is happening.  With C I was always certain that I was just telling the 
 compiler to think about the same piece of memory differently, and that 
 nothing actually changed.

int i = 3; double d = (double)i; changes the bit pattern in C (as well as in D)

So I didn't understand them in C, either. I'd been understanding them as a temporary substitute for a union for doing unsafe conversions. (And avoiding them as much as possible. Apparently just as well.)
Mar 23 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Charles D Hixson wrote:
 
 PPS:  When talking about casts or type conversions, please make it 
 explicit whether the same bit pattern is maintained. I often read those 
 descriptions, and realize that I can't figure out exactly what is 
 happening.  With C I was always certain that I was just telling the 
 compiler to think about the same piece of memory differently, and that 
 nothing actually changed.  With more modern languages, a lot more magic 
 happens under the hood, and I'm no longer as certain what's going on.  I 
 often wonder after reading the documentation whether the same bit 
 pattern is maintained, or whether an equivalent value is produced.  
 E.g., I've never tried casting a float to a long.  What would it 
 produce?  I can't predict.  I'd often prefer to deal with ulongs or 
 ucents rather than byte arrays, but then at other times I need to 
 address particular bytes out of that value.  Because I don't really 
 understand a cast, I just use byte arrays (well, ubyte).  But it's 
 "sloppier".  Generally I'm dealing with a unitary entity, and needing to 
 think of it as an array all the time is uncomfortable.  (I'd even like a 
 notation for dealing with particular bits, though I haven't needed that 
 recently.)
 Note that this isn't a request for a change in how things act, but 
 rather in how they are documented.
 I *suspect* that cast is presumed to be defined by C, and that it means 
 "Think about the type differently, but don't change its bit pattern", 
 but I'm never quite certain.

Yup, this is one of the C legacy behaviors/mentalities that I find ever more irritating. I would prefer that the language syntax better distinguish between opaque casts (no bit changes) and conversion casts (bit changes, since a conversion is made). -- Bruno Medeiros - MSc in CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Mar 24 2008
Don Clugston <dac nospam.com.au> writes:
Bruno Medeiros wrote:
 Charles D Hixson wrote:
 PPS:  When talking about casts or type conversions, please make it 
 explicit whether the same bit pattern is maintained. I often read 
 those descriptions, and realize that I can't figure out exactly what 
 is happening.  With C I was always certain that I was just telling the 
 compiler to think about the same piece of memory differently, and that 
 nothing actually changed.  With more modern languages, a lot more 
 magic happens under the hood, and I'm no longer as certain what's 
 going on.  I often wonder after reading the documentation whether the 
 same bit pattern is maintained, or whether an equivalent value is 
 produced.  E.g., I've never tried casting a float to a long.  What 
 would it produce?  I can't predict.  I'd often prefer to deal with 
 ulongs or ucents rather than byte arrays, but then at other times I 
 need to address particular bytes out of that value.  Because I don't 
 really understand a cast, I just use byte arrays (well, ubyte).  But 
 it's "sloppier".  Generally I'm dealing with a unitary entity, and 
 needing to think of it as an array all the time is uncomfortable.  
 (I'd even like a notation for dealing with particular bits, though I 
 haven't needed that recently.)
 Note that this isn't a request for a change in how things act, but 
 rather in how they are documented.
 I *suspect* that cast is presumed to be defined by C, and that it 
 means "Think about the type differently, but don't change its bit 
 pattern", but I'm never quite certain.

Yup, this is one of the C legacy behaviors/mentalities that I find ever more irritating. I would prefer that the language syntax better distinguish between opaque casts (no bit changes) and conversion casts (bit changes, since a conversion is made).

I'm with you. The worst example: Despite the name, C++'s reinterpret_cast<> sometimes does _conversion casts_, involving bit changes.
Mar 25 2008
Clay Smith <clayasaurus gmail.com> writes:
Walter Bright wrote:
 
 http://reddit.com/r/programming/info/6d210/comments/

Is SafeD just a label for the programmer selectively using D features?
Mar 23 2008
Walter Bright <newshound1 digitalmars.com> writes:
Clay Smith wrote:
 Is SafeD just a label for the programmer selectively using D features?

Yes, but it would also be enforced by a compiler switch.
Mar 23 2008
Clay Smith <clayasaurus gmail.com> writes:
Walter Bright wrote:
 Clay Smith wrote:
 Is SafeD just a label for the programmer selectively using D features?

Yes, but it would also be enforced by a compiler switch.

I think it's a great idea, then.
Mar 23 2008
Julio César Carrascal Urquijo <jcarrascal gmail.com> writes:
Walter wrote:
 Is SafeD just a label for the programmer selectively using D
 features?
 


How will one assert that a library function is certified for usage in SafeD even if it uses unsafe constructs? New keywords? Thanks
Mar 23 2008
"Jarrett Billingsley" <kb3ctd2 yahoo.com> writes:
"Julio César Carrascal Urquijo" <jcarrascal gmail.com> wrote in message 
news:792eeb25153d8ca5b07163649b2 news.digitalmars.com...
 Walter wrote:
 Is SafeD just a label for the programmer selectively using D
 features?


How will one assert that a library function is certified for usage in SafeD even if it uses unsafe constructs? New keywords? Thanks

Maybe the compiler would just check the public interface?
Mar 23 2008
Walter Bright <newshound1 digitalmars.com> writes:
Julio César Carrascal Urquijo wrote:
 How will one assert that a library function is certified for usage in 
 SafeD even if it uses unsafe constructs? New keywords?

There'll have to be some syntax for that.
Mar 23 2008
Georg Wrede <georg nospam.org> writes:
Walter Bright wrote:
 Julio César Carrascal Urquijo wrote:
 
 How will one assert that a library function is certified for usage in 
 SafeD even if it uses unsafe constructs? New keywords?

There'll have to be some syntax for that.

I hope you mean that once such a library function is Certified, it gets some kind of [at least compiler readable] property stating that it is SafeD compliant? As to the matter of certifying the function, in trivial cases the compiler could do it. But with some important special cases, I can see no other way than to manually scrutinize the source code. Think of a complicated function (say, some hairy tensor math operation, maybe an FFT function, or whatever that's nontrivial) that internally needs to do "unsafe" operations or even in-line asm, but that has been deemed safe by Authoritative Professionals.
Mar 24 2008
Walter Bright <newshound1 digitalmars.com> writes:
Georg Wrede wrote:
 Walter Bright wrote:
 Julio César Carrascal Urquijo wrote:

 How will one assert that a library function is certified for usage in 
 SafeD even if it uses unsafe constructs? New keywords?

There'll have to be some syntax for that.

I hope you mean that once such a library function is Certified, it gets some kind of [at least compiler readable] property stating that it is SafeD compliant?

Yes.
 As to the matter of certifying the function, in trivial cases the 
 compiler could do it.

There's no reason to syntactically mark a function as safe if the compiler can verify it.
 But with some important special cases, I can see no other way than to 
 manually scrutinize the source code. Think of a complicated function 
 (say, some hairy tensor math operation, maybe an FFT function, or 
 whatever that's nontrivial) that internally needs to do "unsafe" 
 operations or even in-line asm, but that has been deemed safe by 
 Authoritative Professionals.

Yes, but the idea is to reduce the scope as much as possible of where you have to manually look for unsafe code.
Mar 24 2008
Georg Wrede <georg nospam.org> writes:
Walter Bright wrote:
 Georg Wrede wrote:
 Walter Bright wrote:
 Julio César Carrascal Urquijo wrote:

 How will one assert that a library function is certified for usage 
 in SafeD even if it uses unsafe constructs? New keywords?

There'll have to be some syntax for that.

I hope you mean that once such a library function is Certified, it gets some kind of [at least compiler readable] property stating that it is SafeD compliant?

Yes.
 As to the matter of certifying the function, in trivial cases the 
 compiler could do it.

There's no reason to syntactically mark a function as safe if the compiler can verify it.
 But with some important special cases, I can see no other way than to 
 manually scrutinize the source code. Think of a complicated function 
 (say, some hairy tensor math operation, maybe an FFT function, or 
 whatever that's nontrivial) that internally needs to do "unsafe" 
 operations or even in-line asm, but that has been deemed safe by 
 Authoritative Professionals.

Yes, but the idea is to reduce the scope as much as possible of where you have to manually look for unsafe code.

I'm simply thrilled!
Mar 24 2008
Christopher Wright <dhasenan gmail.com> writes:
Walter Bright wrote:
 There's no reason to syntactically mark a function as safe if the 
 compiler can verify it.

Well, if you have a mix of safe and unsafe code, there is -- you want to tell the compiler to verify some stuff and ignore other stuff.
Mar 25 2008
Bruno Medeiros <brunodomedeiros+spam com.gmail> writes:
Walter Bright wrote:
 Julio César Carrascal Urquijo wrote:
 How will one assert that a library function is certified for usage in 
 SafeD even if it uses unsafe constructs? New keywords?

There'll have to be some syntax for that.

What does it matter to safe code if a library uses unsafe constructs internally? -- Bruno Medeiros - MSc in CS/E student http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Mar 25 2008
Knud Soerensen <4tuu4k002 sneakemail.com> writes:
Walter Bright wrote:
 Clay Smith wrote:
 Is SafeD just a label for the programmer selectively using D features?

Yes, but it would also be enforced by a compiler switch.

Hi Walter, would it be better to implement a general framework for defining code constraints? See Scott Meyers's talk on Generalizing Const: http://video.google.com/videoplay?docid=-4728145737208991310
Mar 24 2008
renoX <renosky free.fr> writes:
Knud Soerensen a écrit :
 Walter Bright wrote:
 Clay Smith wrote:
 Is SafeD just a label for the programmer selectively using D features?


Hi Walter Would it be better to implement a general framework for defining code constraints ?? See Scott Meyers talk on Generalizing Const http://video.google.com/videoplay?docid=-4728145737208991310

I must admit that the presentation went way over my head, but I wanted to add that those 'code constraints' remind me of 'capabilities', which are an interesting way to provide granular security. renoX
Mar 24 2008
renoX <renosky free.fr> writes:
renoX a écrit :
 Knud Soerensen a écrit :
 Walter Bright wrote:
 Clay Smith wrote:
 Is SafeD just a label for the programmer selectively using D features?


Hi Walter Would it be better to implement a general framework for defining code constraints ?? See Scott Meyers talk on Generalizing Const http://video.google.com/videoplay?docid=-4728145737208991310

I must admit that the presentation went way over my head, but I wanted to add that those 'code constraints' remind me of 'capabilities', which are an interesting way to provide granular security. renoX

Just to explain what I mean by this, here's a video talk about Joe-E, a subset of Java intended to enable capability-style programming: http://uk.youtube.com/watch?v=EGX2I31OhBE http://code.google.com/p/joe-e/ The goal is different but it's still interesting: if I understood correctly, SafeD's goal would be to offer Java-like safety, but even Java-like safety isn't enough to provide fine-grained security, so researchers made Joe-E, a Java subset, for this. So maybe Joe-E's design would be interesting as an inspiration for SafeD (and if it's too limiting, there could be several levels of 'safety'). Regards, renoX
Mar 24 2008
janderson <askme me.com> writes:
Walter Bright wrote:
 
 http://reddit.com/r/programming/info/6d210/comments/

Would it be possible to use unsafe libraries with a safe D subset? I'm thinking of something like being able to specify which unsafe libraries you can link to, perhaps through a DLL or wrapper. -Joel
Mar 24 2008
janderson <askme me.com> writes:
janderson wrote:
 Walter Bright wrote:
 http://reddit.com/r/programming/info/6d210/comments/

Would it be possible to use unsafe libraries with a safe D subset? I'm thinking of something like being able to specify which unsafe libraries you can link to, perhaps through a DLL or wrapper. -Joel

What I'm getting at is that some libs may be safe for one project but not be considered safe in another. For instance, someone may want to prevent writing to a file; in another project this may be perfectly acceptable, but they may want security in other areas. -Joel
Mar 24 2008
Dejan Lekic <dejan.lekic gmail.com> writes:
I wish D used the concept of UNSAFE modules, similar to (or the same as) that in 
the excellent Modula-3 language (which influenced all modern OO 
languages, IMHO).
Modula-3 has a keyword "UNSAFE" which is used in a module or interface 
declaration to indicate that it is _unsafe_. In other words it informs 
us that the module/interface uses unsafe features of the language. If 
module or interface is not labeled "unsafe" (default behavior), it is 
assumed to be safe.
This simple concept is amazing, as is the fact that Modula-3 (as a 
language) had this (plus numerous other modern features) two decades ago.

Kind regards

PS. I am not advocating Modula-3 here. I do C++ (16 years), D, Java and 
PHP programming, mostly.
Mar 25 2008
Walter Bright <newshound1 digitalmars.com> writes:
Dejan Lekic wrote:
 This simple concept is amazing, as is the fact that Modula-3 (as a 
 language) had this (plus numerous other modern features) two decades ago.

Features from Modula-3 make me nervous, as Modula-3 was an abject failure. I don't know why M3 failed, so I am suspicious of adopting features without thoroughly understanding why M3 failed.
Mar 25 2008
Charles D Hixson <charleshixsn earthlink.net> writes:
Walter Bright wrote:
 Dejan Lekic wrote:
 This simple concept is amazing, as is the fact that Modula-3 (as a 
 language) had this (plus numerous other modern features) two decades ago.

Features from Modula-3 make me nervous, as Modula-3 was an abject failure. I don't know why M3 failed, so I am suspicious of adopting features without thoroughly understanding why M3 failed.

That particular feature, however, seems reasonable. But I question the granularity. I think that unsafe should be an attribute that could be set statically for a class or a function. And read whenever you felt like reading it, of course. But the question is, should it be set by the compiler or by the programmer? And what form should a check for safe vs. unsafe take?

One obvious form would be an assert check, but this presumes that unsafe code is allowed by default, and is only rejected if a test fails.

Another plausible approach would be to forbid the calling of unsafe code by default, and to have a pragma that allows it. This is a bit messy, as pragmas don't have nice boundary conditions. (Rather like extern: you either enable it for a statement/block, or for all succeeding statements/blocks.) Still, a variation of pragma, the pragma ( Identifier , ExpressionList ) form, could be nice. I'm thinking of it having the form: pragma (AllowUnsafe, class1, class2, etc, classk::unsafeFunction, etc.); to allow the use of the specified unsafe code. The reason for this form of the pragma is so that you don't accidentally allow all unsafe functions throughout an entire module by mistake. Naturally the normal rules would still apply: if you declared a pragma within a block, it would only apply within that block.

If you're adopting the pragma form, then the code could be detected as unsafe by the compiler, and only its use forbidden (unless specifically allowed).

I haven't been able to think of any reason for dynamically allowing/forbidding the use of unsafe code, though I suppose such is possible. In such a case it would probably be appropriate to use whatever form of allowance is decided upon, and then to forbid it with a standard if statement (or try/catch blocks).
Mar 25 2008
Walter Bright <newshound1 digitalmars.com> writes:
I suspect that having a granular level of specifying safe/unsafe is the 
wrong approach. Doing it at the module level is easy to understand, and 
has the side effect of encouraging better modularization of safe/unsafe 
code.
Mar 25 2008
Georg Wrede <georg nospam.org> writes:
Walter Bright wrote:
 I suspect that having a granular level of specifying safe/unsafe is the 
 wrong approach. Doing it at the module level is easy to understand, and 
 has the side effect of encouraging better modularization of safe/unsafe 
 code.

That would mean that modules in Phobos would be either Safe or Unsafe. Or[/and] that some modules would have to have two versions, one Safe and the other Unsafe. More practical would be (especially if the compiler has access to info/hints about the safety of individual functions) to have it per function. Then the compiler could discriminate, depending on whether the user had used the -Safe switch or not. Personally, I'd advocate having Safety on App Level. Either an app is SafeD compliant, or not. I have a hard time seeing anything in between.
Mar 25 2008
Jan Claeys <digitalmars janc.be> writes:
Op Tue, 25 Mar 2008 11:18:22 -0700, schreef Walter Bright:

 Features from Modula-3 make me nervous, as Modula-3 was an abject
 failure. I don't know why M3 failed, so I am suspicious of adopting
 features without thoroughly understanding why M3 failed.

<http://en.wikipedia.org/wiki/Modula-3#Historical_development> According to this Wikipedia article, it might have something to do with it being designed by DEC and then DEC practically dying... -- JanC
Apr 26 2008
Chris Miller <chris dprogramming.com> writes:
On Sat, 22 Mar 2008 21:47:59 -0700
Walter Bright <newshound1 digitalmars.com> wrote:

 
 http://reddit.com/r/programming/info/6d210/comments/

I think this calls for a compiler switch that forces bounds checking on, whether in debug or release mode. You don't want to be shipping debug code. Also, a pragma or similar would be helpful; if it could enable bounds-checking from that point until the end of the scope, you could completely rely on bounds checks in your code, like you can do in other modern languages.

Finally, would SafeD have to disallow destructors? If you're accessing garbage collected memory in a destructor, you're asking for trouble. It's not always as simple as directly disallowing access to these fields. Calling functions can indirectly cause the memory to be accessed. However, if you're not accessing GC memory in a destructor, you're probably using some lower-level functions, which are generally untrustworthy.

-- Chris Miller <chris dprogramming.com>
Mar 25 2008
Chris Miller <lordSaurontheGreat gmail.com> writes:
Chris Miller Wrote:

 On Sat, 22 Mar 2008 21:47:59 -0700
 Walter Bright <newshound1 digitalmars.com> wrote:
 
 
 http://reddit.com/r/programming/info/6d210/comments/

I think this calls for a compiler switch that forces bounds checking on, whether in debug or release mode. You don't want to be shipping debug code. Also, a pragma or similar would be helpful; if it could enable bounds-checking from that point until the end of the scope, you could completely rely on bounds checks in your code, like you can do in other modern languages. Finally, would SafeD have to disallow destructors? If you're accessing garbage collected memory in a destructor, you're asking for trouble. It's not always as simple as directly disallowing access to these fields. Calling functions can indirectly cause the memory to be accessed. However, if you're not accessing GC memory in a destructor, you're probably using some lower-level functions, which are generally untrustworthy.

I thought the garbage collector only freed memory after the destructor had been run. DMD 1.00 spec document, page 104 says "The garbage collector calls the destructor when the object is deleted." Did this change? I haven't checked for an update to my copy of the spec document in some time. -- the "other" Chris Miller
Mar 25 2008
Chris Miller <lordSaurontheGreat gmail.com> writes:
Chris Miller Wrote:
 
 http://www.digitalmars.com/d/2.0/class.html#Destructor
 
 "When the garbage collector calls a destructor for an object of a class that
has members that are references to garbage collected objects, those references
are no longer valid. This means that destructors cannot reference sub objects.
This is because that the garbage collector does not collect objects in any
guaranteed order, so there is no guarantee that any pointers or references to
any other garbage collected objects exist when the garbage collector runs the
destructor for an object."

Oh, well then you should just move the code you wanted to run in the parent object's destructor to the sub-object's destructor, though it does leave me wondering why the garbage collector would have killed the sub-object first, since if there is code in the parent object's destructor that uses the sub-object, that should count as a reference, so the sub-object should still be in scope - I think. Or does code in a destructor not count towards keeping heap references in scope? -- the "other" Chris Miller
Mar 25 2008
Chris Miller <lordSaurontheGreat gmail.com> writes:
Brad Roberts Wrote:

 Chris Miller wrote:
 Chris Miller Wrote:
 http://www.digitalmars.com/d/2.0/class.html#Destructor

 "When the garbage collector calls a destructor for an object of a class that
has members that are references to garbage collected objects, those references
are no longer valid. This means that destructors cannot reference sub objects.
This is because that the garbage collector does not collect objects in any
guaranteed order, so there is no guarantee that any pointers or references to
any other garbage collected objects exist when the garbage collector runs the
destructor for an object."

Oh, well then you should just move the code you wanted to run in the parent object's destructor to the sub-object's destructor, though it does leave me wondering why the garbage collector would have killed the sub-object first, since if there is code in the parent object's destructor that uses the sub-object, that should count as a reference, so the sub-object should still be in scope - I think. Or does code in a destructor not count towards keeping heap references in scope? -- the "other" Chris Miller

'code' never holds references, strictly data. The question you have, I believe, is when obj1 holds a reference to obj2 (be it a pointer or a reference), what happens when the last reference to obj1 disappears. Both objects become 'dead' and the order of cleanup is undefined. In that case, you can question the choice, but consider any cycle of references where together they are all referenced. When the last external reference to the cycle disappears the entire set is collectable with no 'right' order. Based on the fact that in some cases there's no 'right' order, the collector takes the approach of 'no order can be assumed for any object being collected'. The result being that no references to other collectable memory can be referenced from a destructor. Because of the collector, you don't need to either. Destructors only need to manage non-collectable entities.

Ah, so you're saying that reference counting is run according to scope relative to the sequential process line as it propagates down from main(). That makes sense, since those objects are no longer referenced by anything from the running scope, they're all garbage collectible. So yes, then the only solution would be to introduce a new rule to the garbage collector stipulating that objects with the least number of references from other objects in the collectable scope should be deleted first. This would inherently slow down the garbage collector, thus presenting you with a tradeoff. What about something like scope(exit) object.doThis(); ? Sort of like Tango's FileConduit? Would that work to be able to manually force an ordered deletion? It wouldn't be automatic by the garbage collector. In my experience a scope statement is a lot easier than the C++ way of carefully discovering when something is finally out of scope! Just a thought. I don't know, but it's a very interesting problem to think about!
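
To make the scope(exit) idea concrete, here is a minimal sketch, assuming the cleanup can be tied to a lexical scope instead of being left to the collector. scope(exit) statements run in reverse order of registration when the scope is left, so the "parent" cleanup can run while the "child" is still valid. The function name useResources and the log strings are made up for illustration; they do not come from Tango or Phobos.

```d
import std.stdio : writeln;

// Hypothetical helper: records the order in which things happen.
void useResources(ref string[] log)
{
    log ~= "open child";
    scope(exit) log ~= "close child";   // registered first, runs last

    log ~= "open parent";
    scope(exit) log ~= "close parent";  // registered last, runs first

    log ~= "work";
}

void main()
{
    string[] log;
    useResources(log);
    writeln(log);
}
```

The log ends with "close parent" followed by "close child", which is the ordering the garbage collector cannot promise. The price is that the lifetime is now fixed by the enclosing scope rather than by reachability.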
Mar 26 2008
parent reply Kevin Bealer <kevinbealer gmail.com> writes:
== Quote from Chris Miller (lordSaurontheGreat gmail.com)'s article
...
 assumed for any object being collected'.  The result being that no
 references to other collectable memory can be referenced from a
 destructor.  Because of the collector, you don't need to either.
 Destructors only need to manage non-collectable entities.

Ah, so you're saying that reference counting is run according to scope relative to the sequential process line as it propagates down from main(). That makes sense, since those objects are no longer referenced by anything from the running scope, they're all garbage collectible. So yes, then the only solution would be to introduce a new rule to the garbage collector stipulating that objects with the least number of references from other objects in the collectable scope should be deleted first. This would inherently slow down the garbage collector, thus presenting you with a tradeoff. What about something like scope(exit) object.doThis(); ? Sort of like Tango's FileConduit? Would that work to be able to manually force an ordered deletion? It wouldn't be automatic by the garbage collector. In my experience a scope statement is a lot easier than the C++ way of carefully discovering when something is finally out of scope! Just a thought. I don't know, but it's a very interesting problem to think about!

Garbage collection can be done (to a degree) with reference counting, but systems which use GC more often use an algorithm called "mark / sweep" (or a variation of it). This has various advantages over reference counting. First, if you have circular pointers, such as blocks A and B pointing to each other, they would never be collected in a reference counting system. Second, every time a reference counting system needs to copy or overwrite a pointer, a count has to be adjusted somewhere, and this means that a lot of algorithms run more slowly -- for example, copying an array of pointers requires all the pointed-to blocks to be "touched" in order to increase their refcounts.

The way mark/sweep collection works is that the program stack is scanned, along with all global variables, the stacks of any threads and thread local storage, plus the actual machine registers. This scan looks for pointers or (in a "conservative" design) things which *might* be pointers. Any block of (dynamically allocated) memory that is pointed to by one of these (stack, global, or register) pointers is then considered to be "live". This block is then scanned as well for pointers, and so on. Basically, if there is a series of pointers from a live area to a specific block, that block is also live, recursively. Any other area is considered "garbage", because if you don't have a pointer to it anywhere in the live set, you can't still be using it.

Mark/sweep doesn't have the problem of circular reference counts, but on the other hand, there is no way to figure out which blocks were parents of which others, so destruction order is essentially random. There is no way to fix this reliably, especially since the circular links can mean that the objects are both parents of each other -- so what order do they get destroyed in?

In languages like D and C++, the garbage collection is conservative, and this means that any pointer-sized block will be considered a pointer if it contains a value that is an address of any area that might still be live. This means that a few "garbage" blocks can be kept around because they are in a memory area which is pointed to by some random integer. This also means that in practice, even sets of memory blocks that are not circular and might have an obvious destruction order could not be guaranteed to be destroyed in the right order, because a random integer in any of them might make them seem circular, so relying on any policy that tried to detect circularity would be unreliable at best.

There are some ways to work around this though -- if object A needs to call B.close(), then a reference to object B can be stored in a global (or static) variable as well as in object A. After object A's destructor calls B.close(), it should remove B from the global table, thus making B garbage (B will not actually get freed until the next GC cycle). (Make sure the table doesn't need to synchronize for the "remove" step since that could cause deadlock, so an associative array is probably a bad idea.)

Kevin
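
The mark and sweep steps described above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: blocks are plain integer ids, "pointers" are entries in an associative array, and the names mark, sweep and collect are invented here. It is not how the D runtime implements its collector.

```d
import std.stdio : writeln;

// Mark phase: starting from the roots, follow pointers and record
// every block that can be reached.
bool[int] mark(int[][int] edges, int[] roots)
{
    bool[int] live;
    int[] work = roots.dup;
    while (work.length > 0)
    {
        int id = work[$ - 1];
        work = work[0 .. $ - 1];
        if (id in live)
            continue;
        live[id] = true;
        if (auto targets = id in edges)
            work ~= *targets;
    }
    return live;
}

// Sweep phase: every block that was not marked is garbage.
int[] sweep(int[] allBlocks, bool[int] live)
{
    int[] garbage;
    foreach (id; allBlocks)
        if (id !in live)
            garbage ~= id;
    return garbage;
}

int[] collect(int[] allBlocks, int[][int] edges, int[] roots)
{
    return sweep(allBlocks, mark(edges, roots));
}

void main()
{
    // Block 1 is referenced from a root and points to 2.
    // Blocks 3 and 4 point to each other; nothing live points to them.
    int[][int] edges = [1: [2], 3: [4], 4: [3]];
    writeln(collect([1, 2, 3, 4, 5], edges, [1]));
}
```

Blocks 3 and 4 end up in the garbage list together even though each still has an incoming pointer, which a reference counting scheme would never free. The sweep reports them in heap order, not in any parent-before-child order, and since each is the "parent" of the other there is no such order to find.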
Mar 30 2008
next sibling parent reply "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Mon, 31 Mar 2008 05:13:50 +0200, Kevin Bealer <kevinbealer gmail.com>  
wrote:

 In languages like D and C++, the garbage collection is conservative, and
 this means that any pointer-sized block will be considered a pointer if
 it contains a value that is an address of any area that might still be
 live.
This means that a few "garbage" blocks can be kept around because they
 are in a memory area which is pointed to by some random integer. This
 also means that in practice, even for sets of memory blocks that are
 not circular and might have an obvious destruction order could not be
 guaranteed to be destroyed in the right order, because a random integer
 in any of them might make them seem circular, so relying on any policy
 that tried to detect circularity would be unreliable at best.

In D, an allocated block can be marked as containing no pointers, and thus will not be scanned for things that look like pointers. I don't know how good the GC/compiler is at understanding these things on its own, but at least a programmer can make it more informed. --Simen
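
As a rough illustration of marking a block as pointer-free, here is a sketch that assumes a current D2 toolchain whose runtime provides core.memory. The API names are from today's runtime and are not necessarily what was available when this was posted; the helper isNoScan is made up for the example.

```d
import core.memory : GC;

// True if the GC block starting at p is flagged as containing no pointers.
bool isNoScan(void* p)
{
    return (GC.getAttr(p) & GC.BlkAttr.NO_SCAN) != 0;
}

void main()
{
    void* scanned = GC.malloc(1024);                  // scanned for pointers
    void* raw = GC.malloc(1024, GC.BlkAttr.NO_SCAN);  // skipped by the scan
    assert(!isNoScan(scanned));
    assert(isNoScan(raw));
}
```

A block flagged this way can hold arbitrary integers or pixel data without any of it being mistaken for a pointer that keeps some other block alive.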
Mar 31 2008
parent reply Kevin Bealer <kevinbealer gmail.com> writes:
Simen Kjaeraas Wrote:

 On Mon, 31 Mar 2008 05:13:50 +0200, Kevin Bealer <kevinbealer gmail.com>  
 wrote:
 
 In languages like D and C++, the garbage collection is conservative, and
 this means that any pointer-sized block will be considered a pointer if
 it contains a value that is an address of any area that might still be
 live.
This means that a few "garbage" blocks can be kept around because they
 are in a memory area which is pointed to by some random integer. This
 also means that in practice, even for sets of memory blocks that are
 not circular and might have an obvious destruction order could not be
 guaranteed to be destroyed in the right order, because a random integer
 in any of them might make them seem circular, so relying on any policy
 that tried to detect circularity would be unreliable at best.

In D, an allocated block can be marked as containing no pointers, and thus will not be scanned for things that look like pointers. I don't know how good the GC/compiler is at understanding these things on its own, but at least a programmer can make it more informed. --Simen

Yes, and I forgot to mention that at one point the D compiler got an upgrade that made it more "precise" (the opposite of "conservative" in GC jargon). It now knows that a dynamically allocated array of a primitive type (such as an int[] array or one of the string types) is not a pointer. So if you are dealing with lots of large matrices or lots of string data, you should suffer much less from the effects of the accidental retention of blocks. I don't think this applies to structures, classes, or stack data, though it would probably be straightforward to do this for structures that had no pointers. Kevin
Mar 31 2008
parent Kevin Bealer <kevinbealer gmail.com> writes:
Kevin Bealer Wrote:

 Simen Kjaeraas Wrote:
 
 On Mon, 31 Mar 2008 05:13:50 +0200, Kevin Bealer <kevinbealer gmail.com>  
 wrote:
 
 In languages like D and C++, the garbage collection is conservative, and
 this means that any pointer-sized block will be considered a pointer if
 it contains a value that is an address of any area that might still be
 live.
This means that a few "garbage" blocks can be kept around because they
 are in a memory area which is pointed to by some random integer. This
 also means that in practice, even for sets of memory blocks that are
 not circular and might have an obvious destruction order could not be
 guaranteed to be destroyed in the right order, because a random integer
 in any of them might make them seem circular, so relying on any policy
 that tried to detect circularity would be unreliable at best.

In D, an allocated block can be marked as containing no pointers, and thus will not be scanned for things that look like pointers. I don't know how good the GC/compiler is at understanding these things on its own, but at least a programmer can make it more informed. --Simen

Yes, and I forgot to mention that at one point the D compiler got an upgrade that made it more "precise" (the opposite of "conservative" in GC jargon). It now knows that a dynamically allocated array of a primitive type (such as an int[] array or one of the string types) is not a pointer.

Oops -- I meant the data in the array -- the array actually is a pointer/length.
 So if you are dealing with lots of large matrices or lots of string data, you
 should suffer much less from the effects of the accidental retention of blocks.
 I don't think this applies to structures, classes, or stack data, though it would
 probably be straightforward to do this for structures that had no pointers.
 
 Kevin
 

Mar 31 2008
prev sibling next sibling parent e-t172 <e-t172 akegroup.org> writes:
Kevin Bealer a écrit :
 There are some ways to work around this though -- If object A needs
 to call B.close(), then a reference to object B can be stored in a
 global (or static) variable as well as in object A.  After object A's
 destructor calls B.close(), then it should remove B from the global
 table, thus making B garbage (B will not actually get freed until the
 next GC cycle.)  (Make sure the table doesn't need to synchronize for
 the "remove" step since that could cause deadlock, so an associative
 array is probably a bad idea.)

What about this solution:

    class Parent
    {
        private Child mychild;

        this()
        {
            mychild = new Child();
            addRoot(mychild);     // keep the child alive as a GC root
        }

        ~this()
        {
            mychild.close();
            removeRoot(mychild);  // the child becomes collectable again
        }
    }

That would not destroy mychild until the Parent object's destructor is called. One could even add "delete mychild" at the end of ~this to make it even more efficient, if we suppose that mychild is useless anyway without its Parent (which is often the case; consider an I/O object, e.g. a socket, attached to a "parent" that controls it). The downside of this solution is that you're basically falling back to manual memory management (sort of), but I don't think it is avoidable.

By the way, I never understood why the I/O objects in Phobos and Tango (e.g. FileConduit) do not automatically close() themselves in their destructors... that would solve the problem in the majority of cases.
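
For anyone trying this on a current compiler, the same idea might look like the sketch below. It assumes today's core.memory module, where the root functions are GC.addRoot and GC.removeRoot and take a void*. The Child class is a hypothetical stand-in for a socket or file object. I have not verified that calling GC.removeRoot from a finalizer during a collection is safe on every runtime, so the demo runs the destructor explicitly with destroy() instead of waiting for the collector.

```d
import core.memory : GC;

// Hypothetical stand-in for an I/O object such as a socket.
class Child
{
    bool closed;
    void close() { closed = true; }
}

class Parent
{
    private Child mychild;

    this()
    {
        mychild = new Child();
        // Pin the child so the collector treats it as live.
        GC.addRoot(cast(void*) mychild);
    }

    ~this()
    {
        mychild.close();
        GC.removeRoot(cast(void*) mychild);
    }

    Child child() { return mychild; }
}

void main()
{
    auto p = new Parent();
    auto c = p.child;
    destroy(p);   // run the destructor deterministically for the demo
    assert(c.closed);
}
```

The root is what makes it legitimate for ~this to touch mychild: as long as the child is registered, the collector cannot have finalized it before the parent.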
Mar 31 2008
prev sibling parent reply Sean Kelly <sean invisibleduck.org> writes:
== Quote from Kevin Bealer (kevinbealer gmail.com)'s article
 Mark/sweep doesn't have the problem of circular reference counts, but on
 the other hand, there is no way to figure out which blocks were parents
 of which others, so that destruction order is essentially random.  There
 is no way to fix this reliably, especially since the circular links can
 mean that the objects are both parents of each other -- so what order do
 they get destroyed in?

This should probably be qualified by saying that there is no efficient way to find the parent object, and in the case of circular chains, there may not even be a parent object. However, in the general case the GC could theoretically maintain enough bookkeeping information to destroy objects hierarchically when possible. But I suspect that doing so would greatly increase both the memory needed for a scan and the time involved. Thus, guaranteeing in-order destruction simply isn't practical, even when it's possible. Sean
Mar 31 2008
parent Kevin Bealer <kevinbealer gmail.com> writes:
Sean Kelly Wrote:

 == Quote from Kevin Bealer (kevinbealer gmail.com)'s article
 Mark/sweep doesn't have the problem of circular reference counts, but on
 the other hand, there is no way to figure out which blocks were parents
 of which others, so that destruction order is essentially random.  There
 is no way to fix this reliably, especially since the circular links can
 mean that the objects are both parents of each other -- so what order do
 they get destroyed in?

This should probably be qualified by saying that there is no efficient way to find the parent object, and in the case of circular chains, there may not even be a parent object. However, in the general case the GC could theoretically maintain enough bookkeeping information to destroy objects hierarchically when possible. But I suspect that doing so would greatly increase both the memory needed for a scan and the time involved. Thus, guaranteeing in-order destruction simply isn't practical, even when it's possible. Sean

Yes -- especially for a precise collector in a language like Java, but I'd be reluctant to make assumptions about the data in this way in a conservative collector; I think the bookkeeping you mention would require the GC to have nearly perfect type information, which would make it impossible to do many things. If object A has noisy data that happens to point into B's area, then you have a relationship from A to B. If B also has a relationship to A, then you get a cycle, and the GC has to fall back on unordered destruction. If B expects the GC to make sure that A still exists this could be a problem. Unless the user specifically marks these relationships, it seems difficult to do this reliably.

This could be done of course, but as long as we need to change B, why not make B handle cleanup in its own destructor? For extra control over B's behavior, the B object could take a function pointer (not a delegate) that told it how to clean itself up. I think we can assume that function pointers won't point to anything collectable. Maybe something like this:

    class B
    {
        // I don't remember the function pointer syntax...
        void setDestructHandler(void (*foo)(B*));
    };

Kevin
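
In D the function pointer type for such a handler would be spelled void function(B). A minimal sketch of the idea follows, with invented names (onDestruct, countCleanup, cleanups) and with the destructor triggered by destroy() so the result is deterministic:

```d
int cleanups;  // module-level counter (thread-local), not GC memory

class B
{
    private void function(B) onDestruct;

    // Only plain function pointers are accepted here;
    // passing a delegate would not compile.
    void setDestructHandler(void function(B) handler)
    {
        onDestruct = handler;
    }

    ~this()
    {
        if (onDestruct !is null)
            onDestruct(this);
    }
}

// Example handler: touches only non-collectable state.
void countCleanup(B b)
{
    ++cleanups;
}

void main()
{
    auto b = new B();
    b.setDestructHandler(&countCleanup);
    destroy(b);
    assert(cleanups == 1);
}
```

Because a function pointer carries no context pointer, the handler itself cannot drag a collectable object into the destructor. It is still up to the handler's author to avoid reaching other garbage collected objects through b.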
Mar 31 2008
prev sibling parent Brad Roberts <braddr puremagic.com> writes:
Chris Miller wrote:
 Chris Miller Wrote:
 http://www.digitalmars.com/d/2.0/class.html#Destructor

 "When the garbage collector calls a destructor for an object of a class that
has members that are references to garbage collected objects, those references
are no longer valid. This means that destructors cannot reference sub objects.
This is because that the garbage collector does not collect objects in any
guaranteed order, so there is no guarantee that any pointers or references to
any other garbage collected objects exist when the garbage collector runs the
destructor for an object."

Oh, well then you should just move the code you wanted to run in the parent object's destructor to the sub-object's destructor, though it does leave me wondering why the garbage collector would have killed the sub-object first, since if there is code in the parent object's destructor that uses the sub-object, that should count as a reference, so the sub-object should still be in scope - I think. Or does code in a destructor not count towards keeping heap references in scope? -- the "other" Chris Miller

'code' never holds references, strictly data. The question you have, I believe, is when obj1 holds a reference to obj2 (be it a pointer or a reference), what happens when the last reference to obj1 disappears. Both objects become 'dead' and the order of cleanup is undefined. In that case, you can question the choice, but consider any cycle of references where together they are all referenced. When the last external reference to the cycle disappears the entire set is collectable with no 'right' order. Based on the fact that in some cases there's no 'right' order, the collector takes the approach of 'no order can be assumed for any object being collected'. The result being that no references to other collectable memory can be referenced from a destructor. Because of the collector, you don't need to either. Destructors only need to manage non-collectable entities. Later, Brad
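
To illustrate "destructors only need to manage non-collectable entities", here is a small sketch of a class whose destructor releases memory from the C heap and touches nothing else. The class name CBuffer and the liveCount counter are made up for the example, and destroy() is used so the destructor runs at a known point.

```d
import core.stdc.stdlib : free, malloc;

// Hypothetical wrapper around memory the collector does not manage.
class CBuffer
{
    static int liveCount;   // how many C allocations are outstanding
    private void* data;

    this(size_t size)
    {
        data = malloc(size);
        ++liveCount;
    }

    ~this()
    {
        // Touches only the C heap pointer stored in this object,
        // never another garbage collected object.
        free(data);
        data = null;
        --liveCount;
    }
}

void main()
{
    auto buf = new CBuffer(64);
    assert(CBuffer.liveCount == 1);
    destroy(buf);
    assert(CBuffer.liveCount == 0);
}
```

The pointer in data is a plain value inside the object being finalized, so it stays valid no matter what order the collector picks for other objects.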
Mar 25 2008
prev sibling next sibling parent reply Lutger <lutger.blijdestijn gmail.com> writes:
Sounds great. Would it be possible to compile SafeD code to Java 
bytecode? iirc it isn't possible to do that with D.
Mar 25 2008
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Lutger wrote:
 Sounds great. Would it be possible to compile SafeD code to Java 
 bytecode? iirc it isn't possible to do that with D.

It possibly could be.
Mar 25 2008
parent Roberto Mariottini <rmariottini mail.com> writes:
Walter Bright wrote:
 Lutger wrote:
 Sounds great. Would it be possible to compile SafeD code to Java 
 bytecode? iirc it isn't possible to do that with D.

It possibly could be.

Hurray! Will it also have mandatory braces? :-) Ciao -- Roberto Mariottini, http://www.mariottini.net/roberto/ SuperbCalc, a free tape calculator: http://www.mariottini.net/roberto/superbcalc/
Mar 26 2008
prev sibling next sibling parent Chris Miller <chris dprogramming.com> writes:
 Finally, would SafeD have to disallow destructors? If you're accessing garbage
collected memory in a destructor, you're asking for trouble. It's not always as
simple as directly disallowing access these fields. Calling functions can
indirectly cause the memory to be accessed. However, if you're not accessing GC
memory in a destructor, you're probably using some lower-level functions, which
are generally untrustworthy.

I thought the garbage collector only freed memory after the destructor had been run. DMD 1.00 spec document, page 104 says "The garbage collector calls the destructor when the object is deleted." Did this change? I haven't checked for an update to my copy of the spec document in some time. -- the "other" Chris Miller

http://www.digitalmars.com/d/2.0/class.html#Destructor "When the garbage collector calls a destructor for an object of a class that has members that are references to garbage collected objects, those references are no longer valid. This means that destructors cannot reference sub objects. This is because that the garbage collector does not collect objects in any guaranteed order, so there is no guarantee that any pointers or references to any other garbage collected objects exist when the garbage collector runs the destructor for an object." -- Chris Miller <chris dprogramming.com>
Mar 25 2008
prev sibling next sibling parent Alexander Panek <alexander.panek brainsware.org> writes:
I think polishing D 1.0 is definitely more important.
Mar 26 2008
prev sibling parent reply Don Clugston <dac nospam.com.au> writes:
Walter Bright wrote:
 
 http://reddit.com/r/programming/info/6d210/comments/

I think the real paradigm shift will come from 'pure', not from 'safe D' (or even const). I'd love to be able to mark most of my functions as pure right now (even before it's enforced by the compiler).
Mar 31 2008
parent Walter Bright <newshound1 digitalmars.com> writes:
Don Clugston wrote:
 I think the real paradigm shift will come from 'pure', not from 'safe D' 
 (or even const).
 I'd love to be able to mark most of my functions as pure right now (even 
 before it's enforced by the compiler).

Pure is coming!
Apr 02 2008