
digitalmars.D - -nogc

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I've discussed something with Walter today and thought I'd share it here.

The possibility of using D without a garbage collector was always
looming and has been used to placate naysayers ("you can call malloc if
you want" etc.) but that opportunity has not been realized in a seamless
manner. As soon as you concatenate arrays, add to a hash, or create an
object, you will call into the GC.

So I'm thinking there should be a flag -nogc that enables a different
model of memory allocation. Here's the steps we need to take:

1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
superdan suggested that when he wasn't busy cursing :o).

2. Do the similar thing for associative arrays.

3. Have two object.d at hand: one is "normal" and uses garbage
collection, the other (call it object_nogc.d) has an entirely different
definition for arrays, hashes, and Object.

4. The definition of Object in object_nogc.d includes a reference count
member for intrusive refcounting.

5. Define a Ref!(T) struct in object_nogc.d that does intrusive
reference counting against T using ctors and dtor.

6. At this point we already have a usable, credible no-gc offering: just
use object_nogc.d instead of object.d and instead of "new Widget(args)"
use "Ref!(Widget)(args)".

7. Add a -nogc option to the compiler. In that mode, the compiler
replaces automatically "T" -> "Ref!(T)" and "new T(args)" ->
"Ref!(T)(args)" for all classes T except inside
object_nogc.d. The exception, as Walter pointed out, is to avoid
infinite regression (how do you implement Ref if the reference you hold
inside will also be wrapped in Ref???)

8. Well with this all a very solid offering of D without garbage
collection would be available at a low cost!
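Steps 4 to 6 can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not the design from the post: the names Counted, Widget, make and alive are hypothetical, the object is allocated with C malloc and constructed in place with std.conv.emplace, and a static make function stands in for the "Ref!(Widget)(args)" spelling because a struct constructor cannot take zero arguments.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Hypothetical stand-in for the Object of object_nogc.d: it carries the count (step 4).
class Counted
{
    size_t refs;
}

class Widget : Counted
{
    int id;
    static int alive;                 // live-instance counter, for demonstration only
    this(int id) { this.id = id; ++alive; }
    ~this() { --alive; }
}

// Hypothetical Ref!(T) (step 5): copies bump the count, the destructor drops it.
struct Ref(T) if (is(T : Counted))
{
    private T obj;

    // Allocates outside the GC heap and constructs the object in place.
    static Ref make(Args...)(Args args)
    {
        enum size = __traits(classInstanceSize, T);
        void* mem = malloc(size);
        assert(mem !is null, "out of memory");
        Ref r;
        r.obj = emplace!T(mem[0 .. size], args);
        r.obj.refs = 1;
        return r;
    }

    this(this) { if (obj !is null) ++obj.refs; }   // copy: one more owner

    ~this()
    {
        if (obj !is null && --obj.refs == 0)
        {
            destroy(obj);             // run the class destructor
            free(cast(void*) obj);    // then give the memory back
        }
        obj = null;
    }

    inout(T) get() inout { return obj; }
    alias get this;                   // lets r.id reach Widget.id
}
```

With this sketch, "auto w = Ref!Widget.make(7);" plays the role of "Ref!(Widget)(args)". Note that alias this also lets the raw class reference escape, which a real implementation would have to consider.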

One cool thing is that you can compile the same application with and
without GC and test the differences easily. That's bound to show a
number of interesting things!

A disadvantage is that -nogc must be global - you can't link a program
that's partially built with gc and partially without. This was a major
counter-argument to adding optional gc to C++.


Andrei
Apr 23 2009
bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

One simple possible solution: -nogc is for writing C-like programs, with no automatic reference counting. It doesn't include the GC in the final executable (making it much smaller), and in such programs AAs, array concatenation, and closures are forbidden (compilation error if you try to use them). "new" allocates on the C heap, and you have to call "delete" manually for each allocation. This is simple, while adding a second, ref-counted memory management system looks like an increase of complexity for both the compiler and the programmers.

 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)"

That has to be done with care and in a transparent way, not adding the Array name to the namespace, so you can still create an Array yourself, etc.

Bye,
bearophile
Apr 23 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
bearophile wrote:
 Andrei Alexandrescu:
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

One simple possible solution: -nogc is to write C-like programs, with no automatic reference counting. It doesn't include the GC in the final executable (making it much smaller) and in such programs AAs and array concatenation and closures are forbidden (compilation error if you try to use them). "New" allocates using the C heap, and you have to use "delete" manually for each of them. This is simple. While adding a second memory management system, ref-counted, looks like an increase of complexity for both the compiler and the programmers.

I was thinking of starting from the opposite end - add the required tools first so we gain experience with them, and then integrate with the compiler.
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)"

 That has to be done with care and in a transparent way, not adding the Array name in the namespace, so you can create an Array yourself, etc.

There shouldn't be any harm in using Array or AssocArray directly.

Andrei
Apr 23 2009
Robert Fraser <fraserofthenight gmail.com> writes:
Andrei Alexandrescu wrote:
 There shouldn't be any harm in using Array or AssocArray directly.

How about __Array or something? I already have a struct Array!()
Apr 24 2009
Leandro Lucarella <llucax gmail.com> writes:
Robert Fraser, el 24 de abril a las 07:55 me escribiste:
 Andrei Alexandrescu wrote:
There shouldn't be any harm in using Array or AssocArray directly.

How about __Array or something? I already have a struct Array!()

How about std.array.Array or something?

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
GPG Key: 5F5A8D05 (F8CD F9A7 BF00 5431 4145 104C 949E BFB6 5F5A 8D05)
Are you such a dreamer?
To put the world to rights?
I'll stay home forever
Where two & two always makes up five
Apr 25 2009
"Denis Koroskin" <2korden gmail.com> writes:
On Thu, 23 Apr 2009 15:08:43 +0400, bearophile <bearophileHUGS lycos.com> wrote:

 Andrei Alexandrescu:
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

One simple possible solution: -nogc is to write C-like programs, with no automatic reference counting. It doesn't include the GC in the final executable (making it much smaller) and in such programs AAs and array concatenation and closures are forbidden (compilation error if you try to use them). "New" allocates using the C heap, and you have to use "delete" manually for each of them. This is simple. While adding a second memory management system, ref-counted, looks like an increase of complexity for both the compiler and the programmers.

Same here. My version of -nogc would work as follows:

1) You mark a module as -nogc/realtime/whatever (similar to module(system) or module(safe)):

module(nogc) some.module.that.doesnt.use.gc;

2) Array concatenations are not allowed, array.length is read-only, you can't insert into an AA, and you can't new objects (only malloc? this needs additional thought).

3) ...

I believe this is an interesting direction to explore. When one writes real-time/kernel-level/etc. code that shouldn't leak, one should be very cautious, and the compiler should help by restricting access to methods that potentially or certainly leak. But how would one ensure that?

A simple idea would be to allow a module(nogc) to access only other modules that are marked as nogc, too. This will effectively disallow most (or all, at least in the beginning) parts of Phobos (for a reason!).

But marking a whole module as nogc is not very good. For example, I'd like to be able to read from and write to an existing array, but I should be unable to resize it.

Thus, all the methods of Array except void Array.resize(size_t newSize) must be marked as safe/noleak/nogc/whatever (nogc is a good name here).

Similarly, safe modules should be able to access other safe methods, too, and T* Array!(T).ptr() should not be among them.

Thoughts?
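The per-function variant of this idea (element reads and writes allowed, anything that may allocate rejected) can be illustrated with the @nogc function attribute that D gained in later compiler releases. This is a sketch for illustration only, not the module(nogc) syntax proposed in the post; the function names are made up.

```d
// Reading and writing existing elements never allocates, so @nogc accepts it.
@nogc int sumAndDouble(int[] data)
{
    int total = 0;
    foreach (ref x; data)
    {
        total += x;
        x *= 2;
    }
    return total;
}

void compileTimeChecks()
{
    // Each literal below fails to compile because it may allocate from the GC heap.
    static assert(!__traits(compiles, () @nogc { int[] a; a ~= 1; }));        // append
    static assert(!__traits(compiles, () @nogc { int[] a; a.length = 4; }));  // resize
    static assert(!__traits(compiles, () @nogc { int[int] m; m[1] = 2; }));   // AA insert
    static assert(!__traits(compiles, () @nogc { auto o = new Object; }));    // new

    // Plain element access on existing storage is accepted.
    static assert(__traits(compiles, () @nogc { int[2] a; a[0] = 1; }));
}
```

A @nogc function may only call other @nogc functions, which matches the rule suggested above that restricted code can only reach other restricted code.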
Apr 23 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Thu, 23 Apr 2009 15:08:43 +0400, bearophile
 <bearophileHUGS lycos.com> wrote:
 
 Andrei Alexandrescu:
 The possibility of using D without a garbage collector was always
  looming and has been used to placate naysayers ("you can call
 malloc if you want" etc.) but that opportunity has not been
 realized in a seamless manner. As soon as you concatenate arrays,
 add to a hash, or create an object, you will call into the GC.

 One simple possible solution: -nogc is to write C-like programs, with no automatic reference counting. It doesn't include the GC in the final executable (making it much smaller) and in such programs AAs and array concatenation and closures are forbidden (compilation error if you try to use them). "New" allocates using the C heap, and you have to use "delete" manually for each of them. This is simple. While adding a second memory management system, ref-counted, looks like an increase of complexity for both the compiler and the programmers.

Same here. My version of -nogc would work as follows: 1) You mark a module as -nogc/realtime/whatever (similar to module(system) or module(safe)). module(nogc) some.module.that.doesnt.use.gc; 2) Array concatenations are not allowed, array.length is readonly, can't insert into AA, can't new objects (only malloc? this needs additional thoughts)

I don't understand the appeal of this. I mean that means you'd have to write your own Array class to do anything about arrays anyway. So why not actually integrate it with the language?
 3) ...
 
 I believe this is an interesting way to explore. When one writes
 real-time/kernel-level/etc code that shouldn't leak, he should be
 very cautious and compiler should help him by restricting access to
 methods that potentially/certainly leak. But how would one ensure
 that?
 
 A simple idea would be to allow module(nogc) only access other
 modules that are marked as nogc, too. This will effectively disallow
 most (or all - at least in the beginning) parts of Phobos (for a
 reason!).
 
 But marking whole module as nogc is not very good. For example, I'd
 like to be able to read from and write to an existing array, but I
 should be unable to resize them.

I think marking the entire application as gc or not is the most sensible option. It's also relatively simple to implement.
 Thus, all the method of Array but a void Array.resize(size_t newSize)
 must be marked as safe/noleak/nogc/whatever (nogc is a good name
 here).
 
 Similarly, safe modules should be able to access other safe methods,
 to, and T* Array!(T).ptr() should not be among them.
 
 Thoughts?

My ambition about mixing together gc and no-gc modules is smaller. In essence, I think C++ has shown that that's not an easy option. I want to have the opportunity to choose at application level whether I will be using gc or not.

In my opinion, anything that gets into defining new qualifiers or storage classes is bound to cause too much aggravation. We already have enough of those.

Andrei
Apr 23 2009
bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 I don't understand the appeal of this. I mean that means you'd have to 
 write your own Array class to do anything about arrays anyway. So why 
 not actually integrate it with the language?

I think the main appeal is that the language then behaves almost like C, and a lot of people know C already, so they don't have to learn more things, like using a reference counting GC (and its traps) too.

Bye,
bearophile
Apr 23 2009
Leandro Lucarella <llucax gmail.com> writes:
bearophile, el 23 de abril a las 07:08 me escribiste:
 Andrei Alexandrescu:
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

One simple possible solution: -nogc is to write C-like programs, with no automatic reference counting. It doesn't include the GC in the final executable (making it much smaller) and in such programs AAs and array concatenation and closures are forbidden (compilation error if you try to use them). "New" allocates using the C heap, and you have to use "delete" manually for each of them. This is simple. While adding a second memory management system, ref-counted, looks like an increase of complexity for both the compiler and the programmers.

I definitely agree that -nogc should not imply reference-counting garbage collection. In Tango/Druntime you can already use a dummy GC that does nothing but call C malloc/free for gc_malloc/gc_free, exactly for this purpose, so what -nogc should do in that case is just link against the "stub" GC instead of the "basic" one.

 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)"

 That has to be done with care and in a transparent way, not adding the Array name in the namespace, so you can create an Array yourself, etc.

std.array.Array can be used, leaving T[] as syntax sugar only (but you could also write std.array.Array!(T) instead).
Apr 23 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Brad Roberts wrote:
 Leandro Lucarella wrote:
 
 Now in Tango/Druntime you already can use a dummy GC that all it does is
 calling C malloc/free for gc_malloc/gc_free, exactly for this purpose, so
 what -nogc should do in that case is just link against the "stub" GC
 instead to the "basic".

This claim really needs to stop. You can't just swap from the normal gc to the stub gc an expect your app to use malloc/free and behave exactly as it did before. No, it'll leak. Free will never be called (outside the rare case of an explicit delete). Normal apps expect implicit cleanup to be invoked by the gc which will never happen in the stub. That's fine in tiny apps, or apps that carefully manage their own memory, but then you weren't using the gc in the first place for those apps.

Totally agreed. I've always wondered what the purpose of the stub GC was in druntime. "We can implement an appallingly crappy allocation model" is the only message I'm getting. Andrei
Apr 23 2009
Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 23 de abril a las 12:48 me escribiste:
 Brad Roberts wrote:
Leandro Lucarella wrote:
Now in Tango/Druntime you already can use a dummy GC that all it does is
calling C malloc/free for gc_malloc/gc_free, exactly for this purpose, so
what -nogc should do in that case is just link against the "stub" GC
instead to the "basic".

 This claim really needs to stop. You can't just swap from the normal gc to the stub gc and expect your app to use malloc/free and behave exactly as it did before. No, it'll leak. Free will never be called (outside the rare case of an explicit delete). Normal apps expect implicit cleanup to be invoked by the gc which will never happen in the stub. That's fine in tiny apps, or apps that carefully manage their own memory, but then you weren't using the gc in the first place for those apps.

Totally agreed. I've always wondered what the purpose of the stub GC was in druntime. "We can implement an appallingly crappy allocation model" is the only message I'm getting. Andrei

From the stub GC documentation:

 This module contains a minimal garbage collector implementation according to
 published requirements. This library is mostly intended to serve as an
 example, but it is usable in applications which do not rely on a garbage
 collector to clean up memory (ie. when dynamic array resizing is not used,
 and all memory allocated with 'new' is freed deterministically with
 'delete').

I think being an example is a good enough reason (it was useful for me at least). You may like the other use or not, but it is there.

And may I ask what would happen if I do this with your "-nogc" proposal?

class A
{
	B b;
}

class B
{
	A a;
}

A a = new A;
a.b = new B;
a.b.a = a;

Won't this leak? Are you planning to add a backup tracing collector to fix cycles, maybe? Because I don't think naive reference counting will avoid leaks as easily as you put it...

RC is not *that* simple.
Apr 23 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 And may I ask what it would happen if I do this with your "-nogc"
 proposal?
 
 class A
 {
 	B b;
 }
 
 class B
 {
 	A a;
 }
 
 A a = new A;
 a.b = new B;
 a.b.a = a;
 
 ? Wont this leak? Are you planning to make a backup tracing collector to
 fix cycles maybe? Because I don't think using a naive reference counting
 will avoid leaks as easy as you put it...
 
 RC is not *that* simple.

Oh I absolutely agree. In short, what happens depends on how Ref is implemented. In essence what I suggest is not (a simplified method of) reference counting, it's a hook that allows various allocation/collection strategies to be implemented by knowledgeable people (hint, hint) :o).

I think WeakRef!T would also have to be part of the offering inside object_whatever.d. Then the example above can be fixed for the refcounting case by making one of the references weak.

Andrei
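The cycle problem and the weak-link fix can be sketched with a small hand-rolled reference count. This is an illustration under assumptions, not the WeakRef!T discussed here: Node, Strong, next, back and liveNodes are hypothetical names, and the "weak" link is just a non-owning raw pointer, so it cannot detect that its target has been freed the way a real weak reference would.

```d
import core.stdc.stdlib : calloc, free;

int liveNodes;            // demonstration counter of nodes not yet freed

struct Node
{
    size_t refs;
    Strong next;          // owning link, counted
    Node*  back;          // non-owning ("weak") link, not counted
}

// Owning handle: copies increment the count, destruction decrements it.
struct Strong
{
    Node* p;

    static Strong make()
    {
        auto n = cast(Node*) calloc(1, Node.sizeof);   // zeroed memory is a valid empty Node
        assert(n !is null, "out of memory");
        n.refs = 1;
        ++liveNodes;
        return Strong(n);
    }

    this(this) { if (p !is null) ++p.refs; }
    ~this() { release(); }

    void release()
    {
        if (p !is null && --p.refs == 0)
        {
            p.next.release();   // drop whatever this node owned
            free(p);
            --liveNodes;
        }
        p = null;
    }
}

void cycleWithStrongLinks()
{
    auto a = Strong.make();
    auto b = Strong.make();
    a.p.next = b;         // a owns b
    b.p.next = a;         // b owns a: a cycle
}                         // counts drop from 2 to 1, never to 0: both nodes leak

void cycleWithWeakBackLink()
{
    auto a = Strong.make();
    auto b = Strong.make();
    a.p.next = b;         // a owns b
    b.p.back = a.p;       // the back link does not keep a alive
}                         // a reaches 0, releases b, and both nodes are freed
```

Calling cycleWithStrongLinks leaves two nodes allocated after it returns, while cycleWithWeakBackLink frees both.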
Apr 23 2009
Jason House <jason.james.house gmail.com> writes:
 
 I think WeakRef!T would also have to be part of the offering inside 
 object_whatever.d. Then the example above can be fixed for the 
 refcounting case by making one of the references weak.

Now I wonder if the weak ref thread has (indirectly) caused the no gc thread. Have you and/or Walter and/or Sean decided that weakref is sufficiently tied to the gc that it should be part of the D standard libraries?

PS: I discovered recently that Wikipedia says D has no support for weak references. It's always good to remove negative points about a language ;)
Apr 23 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 
 I think WeakRef!T would also have to be part of the offering inside
  object_whatever.d. Then the example above can be fixed for the 
 refcounting case by making one of the references weak.

Now I wonder if the weak ref thread has (indirectly) caused the no gc thread. Have you and/or Walter and/or Sean decided that weakref is sufficiently tied to the gc that is should be part of the D standard libraries?

I haven't followed that closely, but then you never know how the mind works.
 PS: I discovered recently that wikipedia says D has no support for
 weak references. It's always good to remove negative points about a
 language ;)

Yes, weak references should be in there. I learned today that Java has also soft references, which the gc can collect and nullify when on low memory conditions. Andrei
Apr 23 2009
Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 23 de abril a las 13:27 me escribiste:
 Leandro Lucarella wrote:
And may I ask what it would happen if I do this with your "-nogc"
proposal?
class A
{
	B b;
}
class B
{
	A a;
}
A a = new A;
a.b = new B;
a.b.a = a;
? Wont this leak? Are you planning to make a backup tracing collector to
fix cycles maybe? Because I don't think using a naive reference counting
will avoid leaks as easy as you put it...
RC is not *that* simple.

Oh I absolutely agree. In short, what happens depends on how Ref is implemented. In essence what I suggest is not (a simplified method of) reference counting, it's a hook that allows various allocation/collection strategies to be implemented by knowledgeable people (hint, hint) :o).

I think it would be awesome to have some kind of hooks to instrument reference reads/writes, so one can implement RC-based, partition-based (like generational), or incremental collectors. There was a proposal to do that some time ago by Frank Benoit:
http://www.digitalmars.com/webnews/newsgroups.php?art_group=digitalmars.D&article_id=35426

He proposed to add these hooks, with the default implementation being:

___ref_assign( void * trg, void * src ){ trg = src; }
void* ___ref_read( void * src ){ return src; }

I think your proposal of wrapping reads/writes in a library-defined type is an improvement over Frank's proposal. (The other proposals in that very same mail are as interesting as the read/write instrumentation.)
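A library-defined wrapper that intercepts reference reads and writes might look roughly like the sketch below. It is only an assumption about the shape of such a type: Hooked, the reads and writes counters, and Widget are hypothetical, and the counters stand in for whatever bookkeeping a real collector would do in its barriers.

```d
// Hypothetical counters standing in for a collector's bookkeeping.
size_t writes, reads;

struct Hooked(T) if (is(T == class))
{
    private T target;

    // Write barrier: runs on every assignment of a class reference.
    void opAssign(T src)
    {
        ++writes;
        target = src;
    }

    // Read barrier: runs whenever the wrapped reference is used.
    T get()
    {
        ++reads;
        return target;
    }

    alias get this;       // member access and conversions go through get()
}

class Widget
{
    int id = 42;
}
```

With these definitions, "Hooked!Widget h; h = new Widget;" passes through opAssign, and "h.id" passes through get() before reaching the object.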
 I think WeakRef!T would also have to be part of the offering inside
 object_whatever.d. Then the example above can be fixed for the
 refcounting case by making one of the references weak.

I think WeakRef!T should be part of the GC, not object.d. The same goes for Ref!T.
Apr 25 2009
Sean Kelly <sean invisibleduck.org> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Brad Roberts wrote:
 Leandro Lucarella wrote:

 Now in Tango/Druntime you already can use a dummy GC that all it does is
 calling C malloc/free for gc_malloc/gc_free, exactly for this purpose, so
 what -nogc should do in that case is just link against the "stub" GC
 instead to the "basic".

This claim really needs to stop. You can't just swap from the normal gc to the stub gc an expect your app to use malloc/free and behave exactly as it did before. No, it'll leak. Free will never be called (outside the rare case of an explicit delete). Normal apps expect implicit cleanup to be invoked by the gc which will never happen in the stub. That's fine in tiny apps, or apps that carefully manage their own memory, but then you weren't using the gc in the first place for those apps.

 Totally agreed. I've always wondered what the purpose of the stub GC was in druntime. "We can implement an appallingly crappy allocation model" is the only message I'm getting.

It was originally intended as a demo for how to plug a new GC into the lib when I wrote it for Tango. I left it in place mostly because the DLL demo for Win32 actually links this "stub" implementation, and it seemed preferable to have something that would actually allocate memory than segfault if the DLL startup code did things in a bad order. A few people have wanted to try and write a small footprint C-like app in D as well, so it's a handy if not terribly user-friendly option for that sort of thing. The few times I've brought this up in the past, I've always been careful to say that array and AA operations will leak if linked against this lib.
Apr 23 2009
Kagamin <spam here.lot> writes:
Sean Kelly Wrote:

 I've always been careful to say that array and
 AA operations will leak if linked against this lib.

You can remove the leaking operations from druntime, and then a leaking application won't link.
Apr 24 2009
Leandro Lucarella <llucax gmail.com> writes:
Brad Roberts, el 23 de abril a las 10:39 me escribiste:
 Leandro Lucarella wrote:
 
 Now in Tango/Druntime you already can use a dummy GC that all it does is
 calling C malloc/free for gc_malloc/gc_free, exactly for this purpose, so
 what -nogc should do in that case is just link against the "stub" GC
 instead to the "basic".

This claim really needs to stop. You can't just swap from the normal gc to the stub gc an expect your app to use malloc/free and behave exactly as it did before. No, it'll leak. Free will never be called (outside the rare case of an explicit delete). Normal apps expect implicit cleanup to be invoked by the gc which will never happen in the stub. That's fine in tiny apps, or apps that carefully manage their own memory, but then you weren't using the gc in the first place for those apps.

Maybe it wasn't clear enough (and it's even less clear in your mail, where you removed all the context of my response), but what I said was in response to bearophile's mail, which said (and I quote it again):

 One simple possible solution: -nogc is to write C-like programs, with no automatic reference counting. It doesn't include the GC in the final executable (making it much smaller) and in such programs AAs and array concatenation and closures are forbidden (compilation error if you try to use them). "New" allocates using the C heap, and you have to use "delete" manually for each of them.

So, no, I'm not implying that one can use the "stub" GC implementation and just go and be happy and play with dynamic arrays and hashes and expect that memory gets freed magically.

Please read the message in its context before putting words in my mouth.

Thanks.
Apr 23 2009
Brad Roberts <braddr puremagic.com> writes:
Leandro Lucarella wrote:

 Now in Tango/Druntime you already can use a dummy GC that all it does is
 calling C malloc/free for gc_malloc/gc_free, exactly for this purpose, so
 what -nogc should do in that case is just link against the "stub" GC
 instead to the "basic".

This claim really needs to stop. You can't just swap from the normal gc to the stub gc and expect your app to use malloc/free and behave exactly as it did before. No, it'll leak. Free will never be called (outside the rare case of an explicit delete). Normal apps expect implicit cleanup to be invoked by the gc, which will never happen in the stub. That's fine in tiny apps, or apps that carefully manage their own memory, but then you weren't using the gc in the first place for those apps.

Later,
Brad
Apr 23 2009
Christopher Wright <dhasenan gmail.com> writes:
Andrei Alexandrescu wrote:
 I've discussed something with Walter today and thought I'd share it here.
 
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.
 
 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:

This means replacing a mark/sweep GC with a reference counting GC. I'd think that it would be better to have -nogc not use Ref(T) by default, and add another flag -refcount that implies -nogc and uses Ref(T) by default.
Apr 23 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Christopher Wright wrote:
 Andrei Alexandrescu wrote:
 I've discussed something with Walter today and thought I'd share it here.

 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:

This means replacing a mark/sweep GC with a reference counting GC.

It just means that certain types and constructs are rewritten. The exact strategy depends on how Ref, Array, and AssocArray are defined.

Probably a good approach is to simply rewrite anything anyway and have Ref vanish in gc mode by means of e.g. alias this. So this means that we only need a flag -object=/path/to/object.d after all.

Andrei
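One way a Ref could "vanish" in gc mode is a wrapper that holds nothing but the class reference and forwards everything through alias this. The following is a sketch under assumptions, not the definition from any actual object.d: payload, Widget and rename are illustrative names, and the constructor simply forwards to a GC "new".

```d
struct Ref(T) if (is(T == class))
{
    T payload;
    alias payload this;   // every use of the wrapper forwards to the class reference

    // Ref!(Widget)(args) just forwards to new Widget(args); the GC owns the object.
    this(Args...)(auto ref Args args) if (Args.length > 0)
    {
        payload = new T(args);
    }
}

class Widget
{
    string name;
    this(string name) { this.name = name; }
}

// Code written against the plain class type still accepts a Ref!Widget.
void rename(Widget w, string newName)
{
    w.name = newName;
}

// The wrapper adds no storage beyond the reference itself.
static assert(Ref!Widget.sizeof == (void*).sizeof);
```

Swapping this definition for a counting one would then be a matter of which object.d gets compiled in, which is the point of the -object flag suggested above.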
Apr 23 2009
Christopher Wright <dhasenan gmail.com> writes:
Andrei Alexandrescu wrote:
 Christopher Wright wrote:
 This means replacing a mark/sweep GC with a reference counting GC.

It just means that certain types and constructs are rewritten. The exact strategy depends on how Ref, Array, and AssocArray are defined. Probably a good approach is to simply rewrite anything anyway and have Ref vanish in gc mode by means of e.g. alias this. So this means that we only need a flag -object=/path/to/object.d after all.

Well, sure, if you're going for the *entirely* general solution :) I like it, by the way. It would also make it easier to deal with the builtin aggregate types via reflection, if Array and AssocArray were in object.d. Anyway, if D supported implicit refcounting as well as mark/sweep, that's a lot closer to supporting any random allocation scheme I might dream up. Also, it might be good to separate object.d into several modules in that case. That might not be so fun from the compiler's view or an efficiency standpoint, though.
Apr 23 2009
Daniel Keep <daniel.keep.lists gmail.com> writes:
Christopher Wright wrote:
 Andrei Alexandrescu wrote:
 I've discussed something with Walter today and thought I'd share it here.

 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:

This means replacing a mark/sweep GC with a reference counting GC. I'd think that it would be better to have -nogc not use Ref(T) by default, and add another flag -refcount that implies -nogc and uses Ref(T) by default.

A thought occurs. If you wanted to implement, say, a moving collector, it would be very useful to have some level of indirection instead of direct references (either using an ID for the actual address, or by using an opPostMove to notify the GC, if we ever get one.) So perhaps you could make "-nogc" alias to "-nogc=object_nogc", thus allowing people to replace Array(T), AA(K,V) and Ref(T) with their own. -- Daniel
Apr 23 2009
Frank Benoit <keinfarbton googlemail.com> writes:
I am using D for a real-time test system. There I have to make sure that
real-time code never uses direct or indirect allocations.
I can use the GC in the non-real-time thread and at startup. I can
preallocate as much as I want. It is just not allowed in IRQ handlers,
certainly.

What I did is patch the GC to have a callback in the allocation
function. My application can register for that callback and check the
current thread. If it is a real-time thread, an error is generated.

The disadvantage is that it is a runtime check. The advantage is that I can mix
code that can use the GC and code that can't.
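The runtime check described here can be approximated without touching the real GC. In the sketch below everything is hypothetical: hookedAlloc stands in for the patched allocation function, onAllocate for the callback slot, and a thread-local flag marks real-time threads. The actual patch to the collector is not shown in the post.

```d
// Module-level variables are thread-local in D, so each thread has its own flag.
bool realTimeThread;

// Hypothetical callback slot that the patched allocator would invoke.
__gshared void function(size_t) onAllocate;

class RealTimeAllocation : Exception
{
    this(string msg) { super(msg); }
}

// Application-side callback: refuse allocations made from a real-time thread.
void rejectInRealTime(size_t size)
{
    if (realTimeThread)
        throw new RealTimeAllocation("allocation attempted in a real-time thread");
}

// Stand-in for the patched allocation function.
void[] hookedAlloc(size_t size)
{
    if (onAllocate !is null)
        onAllocate(size);         // give the application a chance to object
    return new ubyte[size];
}
```

A thread would set realTimeThread to true on entry to its real-time section; threads that leave the flag false keep allocating as usual, which is the mixing of GC and non-GC code mentioned above.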
Apr 23 2009
Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Frank Benoit wrote:
 I am using D for a real-time test system. There I have to make sure that
 real-time code never performs direct or indirect allocations.
 I can use the GC in the non-real-time thread and at startup, and I can
 preallocate as much as I want. It is just not allowed in an IRQ handler,
 of course.
 
 What I did was patch the GC to have a callback in the allocation
 function. My application can register for that callback and check the
 current thread. If it is a real-time thread, an error is generated.
 
 The disadvantage is that it is a runtime check. The advantage is that I can
 mix code that can use the GC and code that can't.

Great. We need to mind this scenario (the right place, I think, is Ref).

Andrei
Apr 23 2009
prev sibling parent ponec <aliloko gmail.com> writes:
I am also using D for real time applications (audio library, not embedded
systems). 

Since I ensure that allocations don't occur in the real-time loop, I'm quite
happy with the way the GC works. We have custom allocators, don't we?
I mean, anything that is too slow could use pools anyway.


Frank Benoit Wrote:

 I am using D for a real-time test system. There I have to make sure that
 real-time code never performs direct or indirect allocations.
 I can use the GC in the non-real-time thread and at startup, and I can
 preallocate as much as I want. It is just not allowed in an IRQ handler,
 of course.
 
 What I did was patch the GC to have a callback in the allocation
 function. My application can register for that callback and check the
 current thread. If it is a real-time thread, an error is generated.
 
 The disadvantage is that it is a runtime check. The advantage is that I can
 mix code that can use the GC and code that can't.

Apr 23 2009
prev sibling next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-04-23 06:58:38 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 I've discussed something with Walter today and thought I'd share it here.
 
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

Very true. It's pretty easy to call the GC without noticing in D.
 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:
 
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

That makes sense.
 2. Do the similar thing for associative arrays.
 
 3. Have two object.d at hand: one is "normal" and uses garbage
 collection, the other (call it object_nogc.d) has an entirely different
 definition for arrays, hashes, and Object.

Couldn't that just be a version switch, such as `version (D_NO_GC)` and `version (D_GC)`? Then you can implement things differently in other modules too, depending on whether or not there is a GC.
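
A minimal sketch of what such a switch could look like in an ordinary module. `D_NO_GC` is the hypothetical identifier from this post; no compiler predefines it, so it would have to be passed explicitly (for example with dmd's `-version=D_NO_GC`). `makeBuffer` and `usesGC` are made-up names for illustration.

```d
version (D_NO_GC)
{
    enum usesGC = false;

    /// Allocate a buffer on the C heap; the caller is responsible for free().
    ubyte[] makeBuffer(size_t n)
    {
        import core.stdc.stdlib : malloc;
        auto p = cast(ubyte*) malloc(n);
        if (p is null && n != 0) assert(0, "out of memory");
        return p[0 .. n];
    }
}
else
{
    enum usesGC = true;

    /// Allocate a buffer from the garbage-collected heap.
    ubyte[] makeBuffer(size_t n)
    {
        return new ubyte[](n);
    }
}

unittest
{
    auto buf = makeBuffer(32);
    assert(buf.length == 32);   // holds in either configuration
}
```

The calling code is the same in both builds; only the body selected by the version block differs, which is the property being asked for here.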
 4. The definition of Object in object_nogc.d includes a reference count
 member for intrusive refcounting.
 
 5. Define a Ref!(T) struct in object_nogc.d that does intrusive
 reference counting against T using ctors and dtor.
 
 6. At this point we already have a usable, credible no-gc offering: just
 use object_nogc.d instead of object.d and instead of "new Widget(args)"
 use "Ref!(Widget)(args)".

How's that going to work with scope classes?
 scope Widget w = new Widget;      // today
 scope Widget w = Ref!(Widget)();  // with object_nogc.d?

 7. Add a -nogc option to the compiler. In that mode, the compiler
 replaces automatically "T" -> "Ref!(T)" and "new T(args)" ->
 "Ref!(T)(args)" for all classes T except inside
 object_nogc.d. The exception, as Walter pointed out, is to avoid
 infinite regression (how do you implement Ref if the reference you hold
 inside will also be wrapped in Ref???)

I'm just wondering, why wouldn't the compiler always use Ref!(T)? In the GC mode it'd simply resolve to a T, but if you wanted to experiment with another kind of GC -- say one which would require calling a notification function when writing a new value, such as the one in Objective-C 2.0 -- then you could.

Hum, also, how would it work for pointers to things in memory blocks that are normally managed by the GC? Would those increment the reference count for the memory block?

 8. Well with this all a very solid offering of D without garbage
 collection would be available at a low cost!
 
 One cool thing is that you can compile the same application with and
 without GC and test the differences easily. That's bound to show a
 number of interesting things!

Indeed.
 A disadvantage is that -nogc must be global - you can't link a program
 that's partially built with gc and partially without. This was a major
 counter-argument to adding optional gc to C++.

Another disadvantage is that you change the reference semantics and capabilities. With a GC, you can create circular pointer references and they won't leak memory once you stop referencing them. Do that with reference counting and you'll have memory leaks all around. So with no GC, you have to have weak references if you're going to build tree structures where branches know about their parents (which is a pretty common thing).

I'd suggest that weak references be put in the language so the compiler can replace them with WeakRef!(T) in no-GC mode and do something else in GC mode. Being in the language would just be some syntactic sugar that would make them more bearable in normal code.

As for compatibility, it may be worth looking at how it has been done in Objective-C 2.0. Objective-C has always used reference counting. Version 2.0 brought a GC. When you build a library, you have to specify whether the resulting binary expects a GC, reference counting, or can work in both modes. Something that works in both modes incurs a slight overhead, but sometimes the binary compatibility is just worth it.

-- Michel Fortin
michel.fortin michelf.com
http://michelf.com/
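
The cycle problem can be shown with a hand-rolled intrusive count. This is a sketch only: `Node`, `create`, `retain`, `release`, `link` and `liveNodes` are hypothetical names, not the proposed Ref!(T) or WeakRef!(T), and the "weak" reference is simply a back pointer that is not counted.

```d
import core.stdc.stdlib : malloc, free;

int liveNodes = 0;   // nodes allocated and not yet freed

struct Node
{
    int refs;            // intrusive reference count
    Node* child;         // strong (counted) reference
    Node* parent;        // back reference, counted only if strongParent
    bool strongParent;
}

Node* create()
{
    auto n = cast(Node*) malloc(Node.sizeof);
    assert(n !is null);
    *n = Node(1, null, null, false);
    ++liveNodes;
    return n;
}

void retain(Node* n) { if (n) ++n.refs; }

void release(Node* n)
{
    if (n is null || --n.refs > 0) return;
    release(n.child);
    if (n.strongParent) release(n.parent);
    free(n);
    --liveNodes;
}

/// Link child under parent; strongBack selects a counted back reference.
void link(Node* parent, Node* child, bool strongBack)
{
    parent.child = child;
    retain(child);
    child.parent = parent;
    child.strongParent = strongBack;
    if (strongBack) retain(parent);
}

unittest
{
    // Uncounted (weak) back reference: both nodes are freed.
    auto p = create();
    auto c = create();
    link(p, c, false);
    release(c);
    release(p);
    assert(liveNodes == 0);

    // Counted back reference: the cycle keeps both counts above zero.
    p = create();
    c = create();
    link(p, c, true);
    release(c);
    release(p);
    assert(liveNodes == 2);   // leaked
}
```

A tracing collector would reclaim the second pair as soon as nothing outside the cycle pointed to it, which is the semantic difference described above.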
Apr 23 2009
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 I've discussed something with Walter today and thought I'd share it here.
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.
 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

One of my concerns with this is what effect it would have on CTFE, templates, etc. One of the nice things about truly builtin arrays and AAs is that they fully work at compile time. I'd be for this idea only if we could guarantee that nothing with respect to arrays and AAs that works at compile time now would break if they were moved to object.
Apr 23 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 I've discussed something with Walter today and thought I'd share it here.
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.
 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

One of my concerns with this is what effect it would have on CTFE, templates, etc. One of the nice things about truly builtin arrays and AAs is that they fully work at compile time. I'd be for this idea only if we could guarantee that nothing with respect to arrays and AAs that works at compile time now would break if they were moved to object.

All compile-time stuff will remain unchanged. Only when it comes to generating code will the idea come into play.

Andrei
Apr 23 2009
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

 I've discussed something with Walter today and thought I'd share it here.
 
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.
 
 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:
 
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

I like that translation since it can allow customization. How will the type system handle Array!(const(T)) and const(Array!T)? They're no longer implicitly convertible.
 2. Do the similar thing for associative arrays.
 
 3. Have two object.d at hand: one is "normal" and uses garbage
 collection, the other (call it object_nogc.d) has an entirely different
 definition for arrays, hashes, and Object.

A version statement seems more powerful. Standard libraries may need changes too.
 4. The definition of Object in object_nogc.d includes a reference count
 member for intrusive refcounting.

That's still a method of garbage collecting! -nogc is kind of misleading...
 5. Define a Ref!(T) struct in object_nogc.d that does intrusive
 reference counting against T using ctors and dtor.
 
 6. At this point we already have a usable, credible no-gc offering: just
 use object_nogc.d instead of object.d and instead of "new Widget(args)"
 use "Ref!(Widget)(args)".

Ick... Please make this hijack the default new behavior.
 7. Add a -nogc option to the compiler. In that mode, the compiler
 replaces automatically "T" -> "Ref!(T)" and "new T(args)" ->
 "Ref!(T)(args)" for all classes T except inside
 object_nogc.d. The exception, as Walter pointed out, is to avoid
 infinite regression (how do you implement Ref if the reference you hold
 inside will also be wrapped in Ref???)
 
 8. Well with this all a very solid offering of D without garbage
 collection would be available at a low cost!
 
 One cool thing is that you can compile the same application with and
 without GC and test the differences easily. That's bound to show a
 number of interesting things!
 
 A disadvantage is that -nogc must be global - you can't link a program
 that's partially built with gc and partially without. This was a major
 counter-argument to adding optional gc to C++.
 
 
 Andrei

Apr 23 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 Andrei Alexandrescu Wrote:
 
 I've discussed something with Walter today and thought I'd share it here.

 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:

 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

I like that translation since it can allow customization. How will the type system handle Array!(const(T)) and const(Array!T)? They're no longer implicitly convertible.

We will need to accommodate multiple implicit conversions somehow anyway (e.g. multiple alias this entries). This is a great question because it illustrates how segregating arrays out of the language challenges the magic that made them "special" and democratizes good features such that other types can benefit from them too.

 2. Do the similar thing for associative arrays.

 3. Have two object.d at hand: one is "normal" and uses garbage
 collection, the other (call it object_nogc.d) has an entirely different
 definition for arrays, hashes, and Object.

A version statement seems more powerful. Standard libraries may need changes too.
 4. The definition of Object in object_nogc.d includes a reference count
 member for intrusive refcounting.

That's still a method of garbage collecting! -nogc is kind of misleading...

Yah, I agree.
 5. Define a Ref!(T) struct in object_nogc.d that does intrusive
 reference counting against T using ctors and dtor.

 6. At this point we already have a usable, credible no-gc offering: just
 use object_nogc.d instead of object.d and instead of "new Widget(args)"
 use "Ref!(Widget)(args)".

Ick... Please make this hijack the default new behavior.

Agreed. Andrei
Apr 23 2009
next sibling parent reply "Joel C. Salomon" <joelcsalomon gmail.com> writes:
Andrei Alexandrescu wrote:
 Jason House wrote:
 Andrei Alexandrescu Wrote:
 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:

 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).
 
 2. Do the similar thing for associative arrays. 



 
 We will need to accommodate multiple implicit conversions somehow anyway
 (e.g. multiple alias this entries). This is a great question because it
 illustrates how segregating arrays out of the language challenges the
 magic that made them "special" and democratizes good features such that
 other types can benefit from them too.

 5. Define a Ref!(T) struct in object_nogc.d that does intrusive
 reference counting against T using ctors and dtor.

 6. At this point we already have a usable, credible no-gc offering: just
 use object_nogc.d instead of object.d and instead of "new Widget(args)"
 use "Ref!(Widget)(args)".

Ick... Please make this hijack the default new behavior.

Agreed.

Just as (1) & (2) point to a way to remove the “magic” of built-in arrays & hash-tables, so too might (5) & (6) point to a way of replacing the “new T(args)” syntax with something cleaner? Not that “new!(T)(args)” looks nicer than the current syntax, but is it perhaps a better fit with the rest of the language?

I’ve been lurking on this list for a while and have noticed a pattern:

• Walter includes a useful feature (e.g., complex numbers) in the language itself for efficiency reasons.
• Walter makes the language more powerful in some subtle but far-reaching way.
• Someone re-implements the special feature as a library function, based on the language improvement.

I’m just wondering what addition will (accidentally, of course) make the creation of new infix operators possible. ;-)

-- Joel Salomon
Apr 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Joel C. Salomon wrote:
 Just as (1) & (2) point to a way to remove the “magic” of built-in
 arrays & hash-tables, so too might (5) & (6) point to a way of replacing
 the “new T(args)” syntax with something cleaner?  Not that
 “new!(T)(args)” looks nicer than the current syntax, but is it perhaps a
 better fit with the rest of the language?

I agree. new sucks. Andrei
Apr 23 2009
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Andrei Alexandrescu wrote:
 Joel C. Salomon wrote:
 Just as (1) & (2) point to a way to remove the “magic” of built-in
 arrays & hash-tables, so too might (5) & (6) point to a way of replacing
 the “new T(args)” syntax with something cleaner?  Not that
 “new!(T)(args)” looks nicer than the current syntax, but is it perhaps a
 better fit with the rest of the language?

I agree. new sucks. Andrei

Oh I don't know, I rather like being able to allocate stuff on the heap. I mean, if I didn't, the poor heap would probably be very lonely. Poor, poor oft-maligned heap; all because he's a bit slower than stack allocation and needs to be cleaned up after. He's trying to help, you know!

Joking aside, what do you have in mind? Every solution I come up with ends up being more or less the same (except with the 'new' keyword in a different place) or worse.

-- Daniel
Apr 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Daniel Keep wrote:
 
 Andrei Alexandrescu wrote:
 Joel C. Salomon wrote:
 Just as (1) & (2) point to a way to remove the “magic” of built-in
 arrays & hash-tables, so too might (5) & (6) point to a way of replacing
 the “new T(args)” syntax with something cleaner?  Not that
 “new!(T)(args)” looks nicer than the current syntax, but is it perhaps a
 better fit with the rest of the language?

Andrei

Oh I don't know, I rather like being able to allocate stuff on the heap. I mean, if I didn't, the poor heap would probably be very lonely. Poor, poor oft-maligned heap; all because he's a bit slower than stack allocation and needs to be cleaned up after. He's trying to help, you know! Joking aside, what do you have in mind? Every solution I come up with ends up being more or less the same (except with the 'new' keyword in a different place) or worse.

I don't know. Here's what we need:

1. Create one heap object of type T with creation arguments args. Currently: new T(args), except when T is a variable-sized array in which case syntax is new T[arg] (only one arg allowed). It is next to impossible to create one heap object of fixed-size array type.

2. Create one array T[] of size s and initialize its elements from a range or generator function. Currently possible via a function that internally uses GC calls.

3. Create one array T[] of a size s and do not initialize content. Currently possible by calling GC functions. Probably not really needed because it's not that safe anyway.

Andrei
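
For reference, the three needs listed above can be written with library calls that exist in present-day Phobos (`std.array.array`, `std.array.uninitializedArray`); those did not necessarily exist in this form when the thread was written, and all of them still allocate from the GC heap. `Widget` is a made-up class for illustration.

```d
import std.algorithm : map;
import std.array : array, uninitializedArray;
import std.range : iota;

class Widget
{
    int id;
    this(int id) { this.id = id; }
}

unittest
{
    // 1. One heap object of type T with constructor arguments.
    auto w = new Widget(42);
    assert(w.id == 42);

    // 1, array case: a dynamic array of a given size, default-initialized.
    auto zeros = new int[](4);
    assert(zeros == [0, 0, 0, 0]);

    // 2. An array whose elements come from a range.
    auto squares = iota(4).map!(i => i * i).array;
    assert(squares == [0, 1, 4, 9]);

    // 3. An array of a given size whose contents are left unspecified.
    auto raw = uninitializedArray!(int[])(4);
    assert(raw.length == 4);
}
```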
Apr 23 2009
next sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Andrei Alexandrescu wrote:
 Daniel Keep wrote:
 Andrei Alexandrescu wrote:
 Joel C. Salomon wrote:
 Just as (1) & (2) point to a way to remove the “magic” of built-in
 arrays & hash-tables, so too might (5) & (6) point to a way of
 replacing
 the “new T(args)” syntax with something cleaner?  Not that
 “new!(T)(args)” looks nicer than the current syntax, but is it
 perhaps a
 better fit with the rest of the language?

Andrei

Oh I don't know, I rather like being able to allocate stuff on the heap. I mean, if I didn't, the poor heap would probably be very lonely. Poor, poor oft-maligned heap; all because he's a bit slower than stack allocation and needs to be cleaned up after. He's trying to help, you know! Joking aside, what do you have in mind? Every solution I come up with ends up being more or less the same (except with the 'new' keyword in a different place) or worse.

I don't know. Here's what we need: 1. Create one heap object of type T with creation arguments args. Currently: new T(args), except when T is a variable-sized array in which case syntax is new T[arg] (only one arg allowed). It is next to impossible to create one heap object of fixed-size array type.

Actually,
 new T[](size);

This works just fine. I do think that (new T[n]) should be of type T[n], not T[].
 2. Create one array T[] of size s and initialize its elements from a
 range or generator function. Currently possible via a function that
 internally uses GC calls.

 3. Create one array T[] of a size s and do not initialize content.
 Currently possible by calling GC functions. Probably not really needed
 because it's not that safe anyway.
 
 Andrei

Perhaps have these overloads for object.Array!(T)...
 new T[](size_t, T)
 new T[](size_t, S) if isInputRange!(S)

As for the uninitialised case, I don't think it's that important, either.

<dreaming>

That said...

Assuming I'm allowed to make language changes, I'd be tempted to make void a valid type for variables and arguments, and also make it double as an expression. That is, void is a literal of type void with value void, the ONLY valid value for voids. Then, you could have:
 auto a = new T[](n, void);

I've always been annoyed that you can't have void variables or arguments; it always seems to complicate my beautiful generic code :'(
 ReturnTypeOf!(Fn) wrap(alias Fn)(ParameterTypeTuple!(Fn) args)
 {
     logf("> ",Fn.stringof,args);
     auto result = Fn(args);
     logf("< ",Fn.stringof,args);
     return result;
 }

That works only so long as Fn doesn't return a void. And before anyone suggests it, this trick:
 return Fn(args);

only works in trivial cases; it doesn't help if you need to be able to DO something after the function call.

</dreaming>

-- Daniel
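
Within the language as it stands, one common workaround for the void case is to branch at compile time on the return type. A sketch, assuming today's `std.traits` names (`ReturnType`, `Parameters`); `calls`, `twice`, `bump` and `counter` are invented for the example, and a counter stands in for the logging calls.

```d
import std.traits : Parameters, ReturnType;

int calls = 0;     // stands in for the logf calls in the post
int counter = 0;

ReturnType!Fn wrap(alias Fn)(Parameters!Fn args)
{
    ++calls;                          // "before" action
    static if (is(ReturnType!Fn == void))
    {
        Fn(args);
        ++calls;                      // "after" action
    }
    else
    {
        auto result = Fn(args);
        ++calls;                      // "after" action
        return result;
    }
}

int twice(int x) { return 2 * x; }
void bump(int by) { counter += by; }

unittest
{
    assert(wrap!twice(21) == 42);
    wrap!bump(3);
    assert(counter == 3);
    assert(calls == 4);   // two wrapped calls, two actions each
}
```

The duplicated "after" action in both branches is exactly the clutter complained about above; a void value type would let a single branch serve both cases.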
Apr 23 2009
prev sibling parent Kagamin <spam here.lot> writes:
Andrei Alexandrescu Wrote:

 1. Create one heap object of type T with creation arguments args. 
 Currently: new T(args), except when T is a variable-sized array in which 
 case syntax is new T[arg] (only one arg allowed). It is next to 
 impossible to create one heap object of fixed-size array type.

You can create it. new T[arg] does just that. You can't *use* it.
 T[5]* arr=something;
 arr[3]; //wrong
 (*arr)[3]; //ok

And as it was said, current new syntax allows arguments for array new.
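
To make the indexing point concrete, here is a small sketch with `int` in place of T. The last part shows one way to get a heap-allocated `int[5]` by allocating a one-element array of `int[5]` and taking its pointer; that idiom is an addition for illustration, not something claimed in the post.

```d
unittest
{
    int[5] storage = [10, 20, 30, 40, 50];
    int[5]* arr = &storage;

    // Dereference first, then index: element 3 of the pointed-to array.
    assert((*arr)[3] == 40);

    // arr[3] is pointer indexing instead: it names a whole int[5] block
    // three blocks past the address, which lies outside `storage`.
    static assert(is(typeof(arr[3]) == int[5]));
    static assert(is(typeof((*arr)[3]) == int));

    // A fixed-size array on the GC heap, via a one-element dynamic array.
    int[5]* heap = (new int[5][](1)).ptr;
    (*heap)[3] = 7;
    assert((*heap)[3] == 7);
}
```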
Apr 24 2009
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 We will need to accommodate multiple implicit conversions somehow anyway 
 (e.g. multiple alias this entries). This is a great question because it 
 illustrates how segregating arrays out of the language challenges the 
 magic that made them "special" and democratizes good features such that 
 other types can benefit from them too.

Making things more orthogonal can be quite useful and allows more flexibility, but having common things implemented as magic is sometimes good: you need less knowledge to use the language, you need less time to understand code written by other people (because there are fewer general ways to do something), this allows for more sharing of code, and you may cover most of the common usages anyway. So the risk of making things more orthogonal is over-generalization. The Scheme language (which is very orthogonal) shows this failure very well; this is a related quotation:
In practice Scheme follows exactly the opposite route: there are dozens of
different and redundant object systems, module systems, even record systems,
built just by piling up feature over feature. So the minimalism of the core
language is just a lie or at best a red herring (the core language can be
minimalistic, but the core language is basically useless for any real life
job).

The Axioms of C++0x (which in theory the compiler can use to perform optimizations, according to Wikipedia http://en.wikipedia.org/wiki/C++0x#Axioms ) show one example of over-generalization. I think the D2 language could offer just a few common Axioms (commutativity, etc.), allowing most of the usages of Axioms while avoiding a lot of the compiler complexity needed to support them in general.

Bye,
bearophile
Apr 23 2009
parent "Joel C. Salomon" <joelcsalomon gmail.com> writes:
bearophile wrote:
 Making things more orthogonal can be quite useful and it allows more
flexibility, but having common things implemented as magic is sometimes good
because you need less knowledge to use the language, you need less time to
understand code written by other people (because there are less general ways to
do something), and this allows for more sharing of code, and you may cover most
of the common usages anyway. So the risk of making things more orthogonal is to
over-generalize. Scheme language (that is very orthogonal) shows this failure
very well, this is a related quotation:
 
 In practice Scheme follows exactly the opposite route: there are dozens of
different and redundant object systems, module systems, even record systems,
built just by piling up feature over feature. So the minimalism of the core
language is just a lie or at best a red herring (the core language can be
minimalistic, but the core language is basically useless for any real life
job).


The difference here is that a core of these semi-magic features is standard. You’ll never write “std.Array!(T)” in real code; you’d use “T[]”. You might still replace the core library with custom versions of Object &c., but you ought not to be tempted to do so. —Joel Salomon
Apr 23 2009
prev sibling parent Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

 Jason House wrote:
 Andrei Alexandrescu Wrote:
 
 I've discussed something with Walter today and thought I'd share it here.

 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:

 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

I like that translation since it can allow customization. How will the type system handle Array!(const(T)) and const(Array!T)? They're no longer implicitly convertible.

We will need to accommodate multiple implicit conversions somehow anyway (e.g. multiple alias this entries). This is a great question because it illustrates how segregating arrays out of the language challenges the magic that made them "special" and democratizes good features such that other types can benefit from them too.

This is something I've given thought to in the past. The basic requirements for x!(const(T),U) to be implicitly convertible to const(x!(T,U)) are:

1. Member variables match in physical location and quantity.
2. Member variables in both x!(T,U) and x!(const(T),U) are implicitly convertible to their normal const versions.
3. Const member functions must have identical behavior.

 2. Do the similar thing for associative arrays.

 3. Have two object.d at hand: one is "normal" and uses garbage
 collection, the other (call it object_nogc.d) has an entirely different
 definition for arrays, hashes, and Object.

A version statement seems more powerful. Standard libraries may need changes too.
 4. The definition of Object in object_nogc.d includes a reference count
 member for intrusive refcounting.

That's still a method of garbage collecting! -nogc is kind of misleading...

Yah, I agree.
 5. Define a Ref!(T) struct in object_nogc.d that does intrusive
 reference counting against T using ctors and dtor.

 6. At this point we already have a usable, credible no-gc offering: just
 use object_nogc.d instead of object.d and instead of "new Widget(args)"
 use "Ref!(Widget)(args)".

Ick... Please make this hijack the default new behavior.

Agreed. Andrei

Apr 23 2009
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Fri, 24 Apr 2009 08:47:49 +0400, Daniel Keep <daniel.keep.lists gmail.com>
wrote:

 Andrei Alexandrescu wrote:
 Daniel Keep wrote:
 Andrei Alexandrescu wrote:
 Joel C. Salomon wrote:
 Just as (1) & (2) point to a way to remove the “magic” of built-in
 arrays & hash-tables, so too might (5) & (6) point to a way of
 replacing
 the “new T(args)” syntax with something cleaner?  Not that
 “new!(T)(args)” looks nicer than the current syntax, but is it
 perhaps a
 better fit with the rest of the language?

Andrei

Oh I don't know, I rather like being able to allocate stuff on the heap. I mean, if I didn't, the poor heap would probably be very lonely. Poor, poor oft-maligned heap; all because he's a bit slower than stack allocation and needs to be cleaned up after. He's trying to help, you know! Joking aside, what do you have in mind? Every solution I come up with ends up being more or less the same (except with the 'new' keyword in a different place) or worse.

I don't know. Here's what we need: 1. Create one heap object of type T with creation arguments args. Currently: new T(args), except when T is a variable-sized array in which case syntax is new T[arg] (only one arg allowed). It is next to impossible to create one heap object of fixed-size array type.

Actually,
 new T[](size);

This works just fine. I do think that (new T[n]) should be of type T[n], not T[].
 2. Create one array T[] of size s and initialize its elements from a
 range or generator function. Currently possible via a function that
 internally uses GC calls.

 3. Create one array T[] of a size s and do not initialize content.
 Currently possible by calling GC functions. Probably not really needed
 because it's not that safe anyway.

 Andrei

Perhaps have these overloads for object.Array!(T)...
 new T[](size_t, T)
 new T[](size_t, S) if isInputRange!(S)


I agree these overloads belong to Array.
 As for the uninitialised case, I don't think it's that important, either.

 <dreaming>

 That said...

 Assuming I'm allowed to make language changes, I'd be tempted to make
 void a valid type for variables and arguments, and also make it double
 as an expression.  That is, void is a literal of type void with value
 void, the ONLY valid value for voids.  Then, you could have:

 auto a = new T[](n, void);


This is a nice idea!
 I've always been annoyed that you can't have void variables or
 arguments; it always seems to complicate my beautiful generic code :'(

 ReturnTypeOf!(Fn) wrap(alias Fn)(ParameterTypeTuple!(Fn) args)
 {
     logf("> ",Fn.stringof,args);
     auto result = Fn(args);
     logf("< ",Fn.stringof,args);


 }

That works only so long as Fn doesn't return a void. And before anyone suggests it, this trick:
 return Fn(args);

only works in trivial cases; it doesn't help if you need to be able to DO something after the function call. </dreaming> -- Daniel

Yes, I've hit the same issue a lot of times, although scope(exit)/scope(success) usually helps.
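
A sketch of the scope(exit) approach, again assuming today's `std.traits` names; `traced`, `events`, `work` and `seven` are invented for the example. Because D accepts `return Fn(args);` even when Fn returns void, a single body covers both cases, and the scope(exit) statement runs after the wrapped call has finished.

```d
import std.traits : Parameters, ReturnType;

string[] events;   // stands in for a log

ReturnType!Fn traced(alias Fn)(Parameters!Fn args)
{
    events ~= "enter";
    scope(exit) events ~= "leave";   // runs when this function exits
    return Fn(args);                 // also valid when Fn returns void
}

void work() { events ~= "work"; }
int seven() { return 7; }

unittest
{
    traced!work();
    assert(events == ["enter", "work", "leave"]);

    events = null;
    assert(traced!seven() == 7);
    assert(events == ["enter", "leave"]);
}
```

What this does not give you is access to the returned value inside the "after" action, which is presumably why it only usually helps.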
Apr 24 2009
prev sibling next sibling parent reply Georg Wrede <georg.wrede iki.fi> writes:
Andrei Alexandrescu wrote:
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.

No printable comment on that one...
 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:

Today really seems to be the lucky day of D! So many pieces are clicking together!! Oh, my!
 7. Add a -nogc option to the compiler. In that mode, the compiler
 replaces automatically "T" -> "Ref!(T)" and "new T(args)" ->
 "Ref!(T)(args)" for all classes T except inside
 object_nogc.d. 

Of course.
 The exception, as Walter pointed out, is to avoid
 infinite regression (how do you implement Ref if the reference you hold
 inside will also be wrapped in Ref???)

I wish you'd elaborate.
 A disadvantage is that -nogc must be global - you can't link a program
 that's partially built with gc and partially without. This was a major
 counter-argument to adding optional gc to C++.

Well, IMHO, that's life. Can't win them all, all of the time. So let's live with it. One might of course try to figure out a way to have the linker recognize this, or even *simpler*, the runtime might barf on it!!
Apr 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Georg Wrede wrote:
 Today really seems to be the lucky day of D!
 So many pieces are clicking together!!
 Oh, my!

Also, I just gave my talk on ranges vs. iterators at ACCU. The audience is still "shell shocked" as Walter put it. I guess it went well :o). Andrei
Apr 23 2009
parent "Nick Sabalausky" <a a.a> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:gsq8jg$eda$2 digitalmars.com...
 Georg Wrede wrote:
 Today really seems to be the lucky day of D!
 So many pieces are clicking together!!
 Oh, my!

Also, I just gave my talk on ranges vs. iterators at ACCU. The audience is still "shell shocked" as Walter put it. I guess it went well :o).

I don't suppose there's going to be a video on that?
Apr 23 2009
prev sibling next sibling parent reply Leandro Lucarella <llucax gmail.com> writes:
Andrei Alexandrescu, el 23 de abril a las 05:58 me escribiste:
 I've discussed something with Walter today and thought I'd share it here.
 
 The possibility of using D without a garbage collector was always
 looming and has been used to placate naysayers ("you can call malloc if
 you want" etc.) but that opportunity has not been realized in a seamless
 manner. As soon as you concatenate arrays, add to a hash, or create an
 object, you will call into the GC.
 
 So I'm thinking there should be a flag -nogc that enables a different
 model of memory allocation. Here's the steps we need to take:
 
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).
 
 2. Do the similar thing for associative arrays.
 
 3. Have two object.d at hand: one is "normal" and uses garbage
 collection, the other (call it object_nogc.d) has an entirely different
 definition for arrays, hashes, and Object.
 
 4. The definition of Object in object_nogc.d includes a reference count
 member for intrusive refcounting.
 
 5. Define a Ref!(T) struct in object_nogc.d that does intrusive
 reference counting against T using ctors and dtor.

Oh! So you mean -nomarksweep then, right? Reference counting *is* a garbage collection algorithm.
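
For readers trying to picture steps 4 and 5, here is a rough sketch of intrusive reference counting in D. It is an assumption-laden illustration, not the proposed object_nogc.d: Counted, Ref, make and Widget are hypothetical names, allocation uses malloc plus std.conv.emplace, and cycles between objects would never be freed.

```d
import core.stdc.stdlib : malloc, free;
import std.conv : emplace;

// Hypothetical stand-in for an Object that carries its own count (step 4).
class Counted
{
    size_t refs;
}

// Hypothetical Ref!(T) (step 5): copying bumps the count, destruction drops it.
struct Ref(T) if (is(T : Counted))
{
    private T payload;

    // Allocate outside the GC heap and construct the object in place.
    // Note: objects holding pointers into the GC heap would additionally
    // need to be registered with the GC; this sketch ignores that.
    static Ref make(Args...)(Args args)
    {
        enum size = __traits(classInstanceSize, T);
        void* mem = malloc(size);
        assert(mem !is null, "out of memory");
        Ref r;
        r.payload = emplace!T(mem[0 .. size], args);
        r.payload.refs = 1;
        return r;
    }

    // Postblit: a copy of the handle shares the object.
    this(this)
    {
        if (payload !is null) ++payload.refs;
    }

    ~this()
    {
        if (payload is null) return;
        if (--payload.refs == 0)
        {
            destroy(payload);            // run the class destructor
            free(cast(void*) payload);   // release the malloc'd block
        }
        payload = null;
    }

    inout(T) get() inout { return payload; }
    alias get this;
}

// Example class used only to observe construction and destruction.
class Widget : Counted
{
    static int alive;
    int id;
    this(int id) { this.id = id; ++alive; }
    ~this() { --alive; }
}
```

Usage would look like auto w = Ref!Widget.make(7); the object is destroyed when the last copy of w goes out of scope.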
 6. At this point we already have a usable, credible no-gc offering: just
 use object_nogc.d instead of object.d and instead of "new Widget(args)"
 use "Ref!(Widget)(args)".
 
 7. Add a -nogc option to the compiler. In that mode, the compiler
 replaces automatically "T" -> "Ref!(T)" and "new T(args)" ->
 "Ref!(T)(args)" for all classes T except inside
 object_nogc.d. The exception, as Walter pointed out, is to avoid
 infinite regression (how do you implement Ref if the reference you hold
 inside will also be wrapped in Ref???)
 
 8. Well with this all a very solid offering of D without garbage
 collection would be available at a low cost!

Apart from this not being "non gc D", it seems like a good idea. If I had known there was any chance of RC being accepted into D, I would have included it in my thesis.
 One cool thing is that you can compile the same application with and
 without GC and test the differences easily. That's bound to show a
 number of interesting things!

Please stop calling it "without GC"; it's really confusing =) -- Leandro Lucarella (luca)
Apr 23 2009
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Leandro Lucarella wrote:
 Oh! So you mean -nomarksweep then, right? Reference counting *is* a garbage
 collection algorithm.

Probably the best way is to express it as a positive:

-object=object_marksweep.d
-object=object_refcount.d
-object=object_generational.d

The point is that changing the relatively small object.d enables various fundamental approaches to allocation.
 8. Well with this all a very solid offering of D without garbage
 collection would be available at a low cost!

Apart from this not being "non gc D", it seems like a good idea. If I had known there was any chance of RC being accepted into D, I would have included it in my thesis.

Walter?
 One cool thing is that you can compile the same application with and
 without GC and test the differences easily. That's bound to show a
 number of interesting things!

Please stop calling it "without GC"; it's really confusing =)

I agree. Andrei
Apr 23 2009
prev sibling next sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
Andrei Alexandrescu schrieb:
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).
 
 2. Do the similar thing for associative arrays.
 
 3. Have two object.d at hand: one is "normal" and uses garbage
 collection, the other (call it object_nogc.d) has an entirely different
 definition for arrays, hashes, and Object.

Question about debug code speed: as far as I understand it, the speed of (associative) arrays is currently independent of the debug code generator because they are built in. I ask because I hate the speed of std::vector/std::map at debug time, especially with large datasets. My major slowdown in prototype development comes from the awfully slow STL containers. Does D suffer from this, or will it once you add this extension?
Apr 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dennis luehring wrote:
 Andrei Alexandrescu schrieb:
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

 2. Do the similar thing for associative arrays.

 3. Have two object.d at hand: one is "normal" and uses garbage
 collection, the other (call it object_nogc.d) has an entirely different
 definition for arrays, hashes, and Object.

Question about debug code speed: as far as I understand it, the speed of (associative) arrays is currently independent of the debug code generator because they are built in. I ask because I hate the speed of std::vector/std::map at debug time, especially with large datasets. My major slowdown in prototype development comes from the awfully slow STL containers. Does D suffer from this, or will it once you add this extension?

I think we'll be in better shape than debug stl because ranges are inherently cheaper to check. But only testing will tell. Andrei
Apr 23 2009
parent reply dennis luehring <dl.soluz gmx.net> writes:
Andrei Alexandrescu schrieb:
 dennis luehring wrote:
 Andrei Alexandrescu schrieb:
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

 2. Do the similar thing for associative arrays.

 3. Have two object.d at hand: one is "normal" and uses garbage
 collection, the other (call it object_nogc.d) has an entirely different
 definition for arrays, hashes, and Object.

Question about debug code speed: as far as I understand it, the speed of (associative) arrays is currently independent of the debug code generator because they are built in. I ask because I hate the speed of std::vector/std::map at debug time, especially with large datasets. My major slowdown in prototype development comes from the awfully slow STL containers. Does D suffer from this, or will it once you add this extension?

I think we'll be in better shape than debug stl because ranges are inherently cheaper to check. But only testing will tell.

Not only is iterating awfully slow, but so is pushing data. (I can clearly say that my simple STL-container-based volume simulation is nearly undebuggable because of the amount of data needed to get into a debuggable state.) So I'm working only with release versions, and I've started converting my C++ modules into releasable C-interfaced DLLs, just to be able to debug the main code. This situation is an absolute no-go for prototyping. Please: always test that your extensions to containers do not harm "debugging" speed. The STL is the best example of killing this part of the prototype-test-program cycle.
Apr 23 2009
parent reply dennis luehring <dl.soluz gmx.net> writes:
dennis luehring schrieb:
 Andrei Alexandrescu schrieb:
 dennis luehring wrote:
 Andrei Alexandrescu schrieb:
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

 2. Do the similar thing for associative arrays.

 3. Have two object.d at hand: one is "normal" and uses garbage
 collection, the other (call it object_nogc.d) has an entirely different
 definition for arrays, hashes, and Object.

Question about debug code speed: as far as I understand it, the speed of (associative) arrays is currently independent of the debug code generator because they are built in. I ask because I hate the speed of std::vector/std::map at debug time, especially with large datasets. My major slowdown in prototype development comes from the awfully slow STL containers. Does D suffer from this, or will it once you add this extension?

I think we'll be in better shape than debug stl because ranges are inherently cheaper to check. But only testing will tell.

Not only is iterating awfully slow, but so is pushing data. (I can clearly say that my simple STL-container-based volume simulation is nearly undebuggable because of the amount of data needed to get into a debuggable state.) So I'm working only with release versions, and I've started converting my C++ modules into releasable C-interfaced DLLs, just to be able to debug the main code. This situation is an absolute no-go for prototyping. Please: always test that your extensions to containers do not harm "debugging" speed. The STL is the best example of killing this part of the prototype-test-program cycle.

Does that also mean that strings are becoming template based? How much debug code will this produce? Like with the STL (or Boost), zillions of megabytes for 9 lines of code? Then I can throw away my debugger (if D ever gets a good one). I'm always shocked when looking at the STL/Boost-based assembler code in debug builds. Does D generate less code?
Apr 23 2009
parent reply dennis luehring <dl.soluz gmx.net> writes:
On 24.04.2009 08:24, dennis luehring wrote:
 dennis luehring schrieb:
 Andrei Alexandrescu schrieb:
 dennis luehring wrote:
 Andrei Alexandrescu schrieb:



How much debug code will this produce? Like with the STL (or Boost), zillions of megabytes for 9 lines of code? Then I can throw away my debugger (if D ever gets a good one). I'm always shocked when looking at the STL/Boost-based assembler code in debug builds. Does D generate less code?

But I still like the idea of getting more generic at the base :-) The result will be fewer special cases for the compiler and no only-the-compiler-can-do-that scenarios. The downsides: are the error messages then still as informative, and what is the code bloat in the end? But maybe Walter has found a way to code-specialize the Array!/Assoc! stuff for the standard datatypes (I hope to see better initializers then, maybe static ones :-]), and then everything is fine ... still waiting for D 3, the concurrency wonderland ...
Apr 24 2009
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
dennis luehring wrote:
 On 24.04.2009 08:24, dennis luehring wrote:
 dennis luehring schrieb:
 Andrei Alexandrescu schrieb:
 dennis luehring wrote:
 Andrei Alexandrescu schrieb:



How much debug code will this produce? Like with the STL (or Boost), zillions of megabytes for 9 lines of code? Then I can throw away my debugger (if D ever gets a good one). I'm always shocked when looking at the STL/Boost-based assembler code in debug builds. Does D generate less code?

But I still like the idea of getting more generic at the base :-) The result will be fewer special cases for the compiler and no only-the-compiler-can-do-that scenarios. The downsides: are the error messages then still as informative, and what is the code bloat in the end? But maybe Walter has found a way to code-specialize the Array!/Assoc! stuff for the standard datatypes (I hope to see better initializers then, maybe static ones :-]), and then everything is fine ... still waiting for D 3, the concurrency wonderland ...

Let's take an example. The code for Array!(char) and Array!(ubyte) is probably going to be exactly identical. Of course, it's very hard for the compiler to know that. It sees two distinct types. Simply looking at the template arguments doesn't help, thanks to static if. The real question is: is the generated machine code identical? If it is, there's probably no benefit in having multiple versions of it. I know it's been said a bazillion times already, but maybe we should look at replacing OPTLINK, and adding a "collapse identical code sections" option while we're at it. :P -- Daniel
Apr 24 2009
parent reply Walter Bright <newshound digitalmars.com> writes:
== Quote from Daniel Keep (daniel.keep.lists gmail.com)'s article
 I know it's been said a bazillion times already, but maybe we should
 look at replacing OPTLINK, and adding a "collapse identical code
 sections" option while we're at it.  :P

Fixing optlink is a worthy goal, but that is only one platform. D has to work with linkers on many platforms, and relying on such a solution existing across linkers we have no control over is not practical.
Apr 25 2009
parent "Unknown W. Brackets" <unknown simplemachines.org> writes:
Well, if they're in the same section, the compiler _could_ detect this 
itself, couldn't it?  I mean, it already does an optimization pass 
before the codegen, it could theoretically do another collapse pass 
after that.

I don't know how easy that would be, or if it would be worth it, but it 
would cover all platforms at that point.

-[Unknown]


Walter Bright wrote:
 == Quote from Daniel Keep (daniel.keep.lists gmail.com)'s article
 I know it's been said a bazillion times already, but maybe we should
 look at replacing OPTLINK, and adding a "collapse identical code
 sections" option while we're at it.  :P

Fixing optlink is a worthy goal, but that is only one platform. D has to work with linkers on many platforms, and relying on such a solution existing across linkers we have no control over is not practical.

Apr 25 2009
prev sibling next sibling parent reply Christian Kamm <kamm-removethis andthis.incasoftware.andthis.de> writes:
Andrei Alexandrescu Wrote:
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

While I'd support such a rewriting of builtin arrays and associative arrays, it comes at a cost: one template instantiation per contained type, which will lead to more code and TypeInfo / ClassInfo initializers than the current implementation requires. Can you come up with a solution that can seamlessly switch between RTTI based containers and templated containers?
Apr 24 2009
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Christian Kamm wrote:
 Andrei Alexandrescu Wrote:
 1. Put array definitions in object.d. Have the compiler rewrite "T[]" ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I think
 superdan suggested that when he wasn't busy cursing :o).

While I'd support such a rewriting of builtin arrays and associative arrays, it comes at a cost: one template instantiation per contained type, which will lead to more code and TypeInfo / ClassInfo initializers than the current implementation requires. Can you come up with a solution that can seamlessly switch between RTTI based containers and templated containers?

Good question. I don't know how the typeinfo could go away, but right now we're in that boat already - each T[] has its own typeinfo. Implementation-wise, there are techniques to reduce code bloating. Andrei
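
One such technique, sketched here as an assumption rather than anything proposed in this thread: keep the real work in a single non-templated routine that operates on raw bytes, and make the template a thin forwarding wrapper. The names reverseRaw and reverseInPlace are hypothetical, and the byte-wise swap is only valid for plain-data element types (no postblits or self-references).

```d
// Non-templated core: compiled once, no matter how many element types
// use it. It only needs the raw bytes and the size of one element.
void reverseRaw(void[] data, size_t elemSize)
{
    if (elemSize == 0) return;
    auto bytes = cast(ubyte[]) data;
    immutable count = bytes.length / elemSize;
    foreach (i; 0 .. count / 2)
    {
        // Slices covering element i and its mirror element.
        auto lo = bytes[i * elemSize .. (i + 1) * elemSize];
        auto hi = bytes[(count - 1 - i) * elemSize .. (count - i) * elemSize];
        foreach (k; 0 .. elemSize)
        {
            immutable tmp = lo[k];
            lo[k] = hi[k];
            hi[k] = tmp;
        }
    }
}

// Thin typed wrapper: each instantiation is a single forwarding call,
// so the per-type code stays small.
void reverseInPlace(T)(T[] items)
{
    reverseRaw(items, T.sizeof);
}
```

The trade-off is the usual one: the shared routine cannot be specialized per type, so it gives up some speed in exchange for a smaller binary.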
Apr 24 2009
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Fri, 24 Apr 2009 11:57:45 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Christian Kamm wrote:
 Andrei Alexandrescu Wrote:
 1. Put array definitions in object.d. Have the compiler rewrite "T[]"  
 ->
 ".Array!(T)" and "[ a, b, c ]" -> ".Array!(typeof(a))(a, b, c)". I  
 think
 superdan suggested that when he wasn't busy cursing :o).

While I'd support such a rewriting of builtin arrays and associative arrays, it comes at a cost: one template instantiation per contained type, which will lead to more code and TypeInfo / ClassInfo initializers than the current implementation requires. Can you come up with a solution that can seamlessly switch between RTTI based containers and templated containers?

Good question. I don't know how the typeinfo could go away, but right now we're in that boat already - each T[] has its own typeinfo. Implementation-wise, there are techniques to reduce code bloating.

Yes, but each T[] does not have its own sort routine. With a template, that would not be the case (I think). Not that I think the current way is the best, just pointing out a difference. -Steve
Apr 24 2009
prev sibling parent "Vladimir Panteleev" <thecybershadow gmail.com> writes:
On Thu, 23 Apr 2009 13:58:38 +0300, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 [snip]

Yes please. D's GC sometimes just gets in the way, and you end up managing most memory manually. Just don't forget to have a version(nogc) ! -- Best regards, Vladimir mailto:thecybershadow gmail.com
Apr 24 2009