D - A possible solution to the GC conundrum.

reply Andy Friesen <andy ikagames.com> writes:
Mayhaps pointers to auto classes should be allowed; they'd have to be 
new'd and deleted just like C++.  Such pointers could exist as object 
members as well.

This has the advantage of not interfering with the default object 
behaviour (which is going to be less of a headache in most cases), while 
still allowing things to be done "the hard way" wherever desired, on a 
per-class basis.

I'm not sure how auto classes holding references to normal objects would 
complicate the garbage collector, though.

  -- andy
Jan 15 2003
parent reply "Walter" <walter digitalmars.com> writes:
See the new stuff in www.digitalmars.com/d/memory.html

-Walter
Feb 10 2003
parent reply "Mike Wynn" <mike.wynn l8night.co.uk> writes:
Shouldn't things such as stack-allocated objects be handled by the compiler
rather than the programmer?
And free lists would (imho) be better as part of the inner workings of the gc
allocator (caching mem blocks of given sizes, or perhaps per-class mem cache
chains). Open up the way the GC works so we can write our own GCs, yes, but
this ... no.
Mark/release, for instance, can be done without new `new` syntax. OK, you
can't say new Foo(), but you can say Foo.create();
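
As a rough illustration of the Foo.create() idea, the sketch below routes allocation through a static factory function instead of special `new` syntax. It assumes present-day D syntax, which this thread predates, and it uses a simple free list rather than mark/release. The names create, release, freeList and next are hypothetical and are not taken from memory.html.

```d
// Sketch only: a class that recycles its instances through a free list
// hidden behind a factory function. All member names are hypothetical.
class Foo
{
    private static Foo freeList;  // head of the list of recycled instances
    private Foo next;             // link used only while on the free list
    int value;

    // Factory: reuse a recycled instance if one exists, else ask the GC.
    static Foo create(int v)
    {
        Foo f;
        if (freeList !is null)
        {
            f = freeList;
            freeList = f.next;
            f.next = null;
        }
        else
        {
            f = new Foo;
        }
        f.value = v;
        return f;
    }

    // Hand an instance back for reuse. The caller must not use it afterwards.
    static void release(Foo f)
    {
        f.next = freeList;
        freeList = f;
    }
}
```

Calling Foo.create(1), passing the result to Foo.release(), and then calling Foo.create(2) hands back the same instance with its value reset. The caller takes on the obligation never to touch a released object, which is the kind of manual bookkeeping the GC otherwise removes.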

Copy on write: why is this a manual op? It should be a built-in behaviour
(erring on the side of caution if the compiler cannot fully prove that no write
occurs). Making it manual forces programmers to write the same code again
and again.

Again, I'm confused by the changes you've made to D. I quote from the D
introduction:

"who is D for" item 5 ...

Programmers who enjoy the expressive power of C++ but are frustrated by the
need to expend much effort explicitly managing memory and finding pointer
bugs

and item 8

Programmers who think the language should provide enough features to obviate
the continual necessity to manipulate pointers directly

you seem to have now added
Library writers who want to shoot the above programmers where it hurts, by
using a language with robust features, but turning them all off in their
libraries.

Mike.
Feb 11 2003
next sibling parent Ilya Minkov <midiclub 8ung.at> writes:
Mike Wynn wrote:
 shouldn't things such as stack allocated obj's be handled by the
 compiler, rather than the programmer.

Code with garbage-collected objects first, and introduce stack objects as an optimisation only once you're sure everything else is done. Stack allocation is a dangerous feature by itself, so no provision should be made to simplify its usage. Handing out references to stack-allocated data is a common source of bugs. And why? Because allocating data on the stack is made too easy, so it's often done without further thought!
 and free lists would (imho) be better if part of the inner working of
 the gc allocator (cache mem blocks of given sizes, or perhaps per
 class mem cache chains). open the way the GC works so we can write
 our own gc's yes, but this ... no.

GCs differ in how they work. I can see that the GC interface is being kept minimal; maybe that's on purpose? Allowing further control of the GC means constraining the type of GC. Besides, different types of GC may wish to generate different in-line code, like pointer write wrappers, optimal auto-scanners and such. As long as there's no way to write code-generating code, a plug-in GC doesn't make sense. BTW, free lists only make sense for C heaps anyway, since the D GC already allocates memory in a similar performance-tuned manner.
 the mark/release for instance can be done without new `new` syntax
 o.k. you can't say new Foo(), but you can say Foo.create();
 
 copy on write, why is this a manual op, it should be a built in
 behaviour, (err on the side of caution if the compiler can not fully
 detect no write occurs). this forces programmers to continually write
 the same code again and again.

Urrr... Wasn't it automated?
 again I'm confused by the changes you've made to D, I quote from the
 D introduction
 
 "who is D for" item 5 ...
 
 Programmers who enjoy the expressive power of C++ but are frustrated
 by the need to expend much effort explicitly managing memory and
 finding pointer bugs
 
 and item 8
 
 Programmers who think the language should provide enough features to
 obviate the continual necessity to manipulate pointers directly
 
 you seem to have now added Library writers who want to shoot the
 above programmers where it hurts, by using a language with robust
 features, but turning them all off in their libraries.

Libraries *have* to use GC and exceptions and be safe where possible. Besides, library writers had all these possibilities before, because D supports all C types and pointer handling. It has just become easier for end-users to devise their own efficient memory management. It's just that this has to be avoided whenever possible, for all sorts of reasons.
 Mike.

Feb 11 2003
prev sibling parent reply "Walter" <walter digitalmars.com> writes:
"Mike Wynn" <mike.wynn l8night.co.uk> wrote in message
news:b2ahc1$kmt$1 digitaldaemon.com...
 shouldn't things such as stack allocated obj's be handled by the compiler,
 rather than the programmer.
 and free lists would (imho) be better if part of the inner working of the gc
 allocator (cache mem blocks of given sizes, or perhaps per class mem cache
 chains). open the way the GC works so we can write our own gc's yes, but
 this ... no.
 the mark/release for instance can be done without new `new` syntax o.k. you
 can't say new Foo(), but you can say Foo.create();

Stack objects are automatically handled by the compiler for structs, static arrays, etc., just not for class objects. What I provided in the new release is a mechanism for advanced programmers to explore some other ways of doing things. These techniques should not be commonly used; nearly all usage will work fine with the gc.
 copy on write, why is this a manual op, it should be a built in behaviour,
 (err on the side of caution if the compiler can not fully detect no write
 occurs). this forces programmers to continually write the same code again
 and again.

It needs to be manual because languages that do it automatically suffer from truly terrible performance when doing things like uppercasing string contents one character at a time.
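
To make the cost argument concrete, here is a minimal sketch that contrasts a naive model of automatic copy-on-write with the manual convention. It assumes present-day D syntax and Phobos' std.ascii.toUpper. The function names upperAutoCow and upperManualCow and the copies counter are hypothetical and exist only to count allocations. The "automatic" version is a deliberately simple model, not a description of any particular language's implementation.

```d
import std.ascii : toUpper;

// Model of naive automatic COW: every element write that changes the data
// first duplicates the whole array. Hypothetical, for counting copies only.
const(char)[] upperAutoCow(const(char)[] s, out size_t copies)
{
    const(char)[] cur = s;
    foreach (i; 0 .. s.length)
    {
        immutable char u = cast(char) toUpper(cur[i]);
        if (u != cur[i])
        {
            char[] next = cur.dup;  // copy forced by this single write
            ++copies;
            next[i] = u;
            cur = next;
        }
    }
    return cur;
}

// Manual COW: copy at most once, on the first write, then modify in place.
const(char)[] upperManualCow(const(char)[] s, out size_t copies)
{
    char[] result = null;
    foreach (i; 0 .. s.length)
    {
        immutable char u = cast(char) toUpper(s[i]);
        if (u != s[i])
        {
            if (result is null)
            {
                result = s.dup;     // the only copy ever made
                ++copies;
            }
            result[i] = u;
        }
    }
    // If nothing changed, hand back the caller's array untouched.
    return result is null ? s : result;
}
```

For the input "hello", the first function makes five copies and the second makes one; for input that is already upper case, the second makes none and returns the original slice. With n changed characters the naive model does work proportional to n times the string length, which is the behaviour being objected to here.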
 again I'm confused by the changes you've made to D, I quote from the D
 introduction
 "who is D for" item 5 ...
 Programmers who enjoy the expressive power of C++ but are frustrated by the
 need to expend much effort explicitly managing memory and finding pointer
 bugs
 and item 8
 Programmers who think the language should provide enough features to obviate
 the continual necessity to manipulate pointers directly

None of these new techniques are necessary to write D programs with. They are there for some very specialized uses where taking control of allocation/deallocation can get some performance gains.
Feb 11 2003
next sibling parent "Mike Wynn" <mike.wynn l8night.co.uk> writes:
Have you seen http://www.cs.purdue.edu/s3/projects/bloat/ ?
Although it's Java, the optimisations still apply to non-stack-based CPUs.

 copy on write, why is this a manual op, it should be a built in behaviour,
 (err on the side of caution if the compiler can not fully detect no write
 occurs). this forces programmers to continually write the same code again
 and again.

 It needs to be manual because languages that do it automatically suffer from
 truly terrible performance when doing things like uppercasing string
 contents one character at a time.

Having iterators that are designed for such ops would be good:

    /* the iterator func */
    template iterators( T )
    {
        bit toUpperIt( in T orig, out T repl )
        {
            repl = toUpper( orig );
            return repl != orig;
        }

        alias bit (*ifunc)( in T, out T );

        T[] modifyContents( T[] ar, ifunc iter )
        {
            for ( int i = 0; i < ar.length; i++ )
            {
                T tmp;
                if ( iter( ar[i], tmp ) )
                {
                    /* first change found: copy once, finish in the copy */
                    T[] nar = new T[ar.length];
                    if ( i > 0 ) { nar[0..i] = ar[0..i]; }
                    nar[i] = tmp;
                    while ( ++i < ar.length )
                    {
                        iter( ar[i], tmp );
                        nar[i] = tmp;
                    }
                    return nar;
                }
            }
            return ar;  /* nothing changed, so no copy was made */
        }
    }

If built into the language, ops such as

    char[] ups = foo.modifyContents( myStr, &foo.toUpperIt );

could be optimised into a fast version of char[] or wchar[] toUpper, maybe written as `char[] ups = myStr.iter( &toUpper );`

As it is, D currently has 3 types of array; let's call them blocks, vectors and slices.

blocks are int[7] or similar, so any op such as ~= will cause a new array to be created
vectors are int[], so sometimes ops such as ~= will NOT cause the array to be realloced
slices are ar[n..m], which are like blocks in that they cannot be extended, and are aliases to another array

    import c.stdio;

    int[] func( int[] a, int[] b )
    {
        if ( a[0] < 4 )
            a ~= b[0];
        return a[1..3];
    }

    int main( char[][] args )
    {
        int[] foo;
        int[1] b;
        b[0] = 9;
        for ( int i = 0; i < 2; i++ )
        {
            foo ~= i;
        }
        int[] c = func( foo, b );
        foo ~= 1;
        printf( "c[1] : %d\n", c[1] );
        return 0;
    }

What is printed out? 9 or 1? (No cheating and compiling it first.)
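
The ambiguity in the puzzle comes from the returned slice possibly sharing storage with an array that is appended to afterwards. The sketch below shows one way to remove that dependence by copying the slice explicitly with .dup. It assumes present-day D syntax rather than the 2003 dialect used above. The wrapper function puzzle is hypothetical, and this is a variant of the program, not the program itself.

```d
// Variant of func from the post: the returned slice is an explicit copy,
// so later appends to the caller's array cannot change it.
int[] func(int[] a, int[] b)
{
    if (a[0] < 4)
        a ~= b[0];         // may extend in place or reallocate
    return a[1 .. 3].dup;  // .dup gives the result its own storage
}

// Hypothetical wrapper around the body of main from the post.
int[] puzzle()
{
    int[] foo;
    int[1] b = [9];
    foreach (i; 0 .. 2)
        foo ~= i;          // foo is now [0, 1]
    int[] c = func(foo, b[]);
    foo ~= 1;              // cannot affect c, which owns its data
    return c;
}
```

With the copy in place, puzzle() returns [1, 9] whether or not either append reallocated. The price is one extra allocation per call, which is the trade-off between always copying and sharing that this thread is arguing about.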
Feb 11 2003
prev sibling parent reply Russ Lewis <spamhole-2001-07-16 deming-os.org> writes:
Walter wrote:

 copy on write, why is this a manual op, it should be a built in behaviour,
 (err on the side of caution if the compiler can not fully detect no write
 occurs). this forces programmers to continually write the same code again
 and again.

It needs to be manual because languages that do it automatically suffer from truly terrible performance when doing things like uppercasing string contents one character at a time.

GC, in order to be fast and useful, will eventually need optimizing compilers that are GC aware. In the COW example you gave (and I'm assuming for the moment that COW *was* the rule), the compiler would initially render the program as making the many copies you are talking about. However, the optimizer would then realize that all the intermediate copies are going to be garbage immediately, and collapse it down into a single copy-modify.

IMHO, forcing this complexity on the programmer instead of the compiler is Bad. It's better to have the early compilers not very optimized than to add this heavy burden on all D programmers for all time.

OTOH, I recognize and agree with your point that it should be (relatively) easy to make a D compiler; otherwise *no* D compiler anywhere will be standards compliant.

For that reason, I argue (again) that somehow we need an optimization/translation library for D. I don't yet know how this would work, but it would be some sort of open source, standard set of mappings that all compilers could integrate into their parsers and their optimizers. It would provide a consistent set of syntax sugar and optimizations for all compilers to leverage. Compilers would differentiate themselves by what they added to that standard set. I don't know how this would work, exactly, but I think that long term it's the only solution for a good marketplace of optimizing, standards-compliant compilers.

--
The Villagers are Online! http://villagersonline.com

.[ (the fox.(quick,brown)) jumped.over(the dog.lazy) ]
.[ (a version.of(English).(precise.more)) is(possible) ]
?[ you want.to(help(develop(it))) ]
Feb 13 2003
next sibling parent reply Ilya Minkov <midiclub 8ung.at> writes:
This can be accomplished by keeping user count information during 
compilation. If the user count is unknown, it is assumed to be >1, else 
it's exactly 1. If it's exactly 1, data doesn't need to be copied.

This way, function input will be assumed to have >1 count and copied on 
the first write, then no more. A more advanced compiler could promote 
this information beyond function boundaries where statically possible.

BTW, is there any way to store this information in the array header to 
make runtime decisions? It's only 1 bit large...

There also has to be a way to simplify implementing a copy-on-write 
convention manually?

-i.
Feb 13 2003
parent Ilya Minkov <midiclub 8ung.at> writes:
(correcting myself)

Ilya Minkov wrote:
 This can be accomplished by keeping user count information during 
 compilation. If the user count is unknown, it is assumed to be >1, else 
 it's exactly 1. If it's exactly 1, data doesn't need to be copied.

Sorry, it's nonsense. It doesn't work in that manner. :)
 BTW, is there any way to store this information in the array header to 
 make runtime decisions? It's only 1 bit large...

And thus slow ourselves down to the level of interpreted languages. No, thanks. :) It might be faster to copy than to make decisions all the time...
 There also has to be a way to simplify implementing a copy-on-write 
 convention manually?

Still needs thinking about... A possible solution would be to "always copy" in alpha code, then change it into copy-on-write someday...
 -i.

Feb 14 2003
prev sibling parent "Walter" <walter digitalmars.com> writes:
"Russ Lewis" <spamhole-2001-07-16 deming-os.org> wrote in message
news:3E4BA572.242876D2 deming-os.org...
 GC, in order to be fast and useful, will eventually need optimizing compilers
 that are GC aware.  In the COW example you gave (and I'm assuming for the moment
 that COW *was* the rule), the compiler would initially render the program as
 making the many copies you are talking about.  However, the optimizer would then
 realize that all the intermediate copies are going to be garbage immediately,
 and collapse it down into a single copy-modify.

 IMHO, forcing this complexity on the programmer instead of the compiler is Bad.
 It's better to have the early compilers not very optimized than to add this
 heavy burden on all D programmers for all time.

 OTOH, I recognize and agree with your point that it should be (relatively) easy
 to make a D compiler; otherwise *no* D compiler anywhere will be standards
 compliant.

If this optimization was reasonably straightforward to do, I would fully agree with you that D should do COW automatically. But I don't see how to do it reliably, especially for non-trivial examples. I also don't know of any COW language compiler that is able to do such optimizations, and from that I infer it is a very hard problem.
 For that reason, I argue (again) is that somehow we need a
 optimization/translation library for D.  I don't yet know how this would work,
 but it would be some sort of open source, standard set of mappings that all
 compilers could integrate into their parser and their optimizers.  It would
 provide a consistent set of syntax sugar and optimizations for all compilers to
 leverage.  Compilers would differentiate themselves by what they added to that
 standard set.  I don't know how this would work, exactly, but I think that long
 term it's the only solution for a good marketplace of optimizing,
 standards-compliant compilers.

At least the lexer/parser/semantic analysis code is open source, for just the reason you state - to make it easy for others to do compatible implementations of D.
Feb 14 2003