digitalmars.D - The purpose of D (GC rant, long)

reply Serg Kovrov <kovrov no.spam> writes:
Hello, fellows!

First of all, a small disclaimer: I really do like D, I appreciate
Walter's great work, and I actually do use it (or at least am trying to) in
my personal projects. So these are not just complaints for the sake of
complaining.

In D I like (or at least can live with) pretty much everything but one
thing: memory management. I have complained about this before, and yet
again I am losing my nerve. Call me a memory freak if you will. That's the
reason I do not like Java and .NET. That's the reason I stuck with C++
instead of jumping to managed platforms a long time ago.

The problem for me is not that the GC is hard-coupled with the *standard*
library. Actually, I do like the idea of having a GC in the *standard* library.
The real pain for me is that (as I see it) D is designed to be used with
a GC. That is, operations on arrays and strings rely heavily on the GC.
When I say "D", I mean "D and Phobos". For me they are one thing.

The problem is, I can't do without the standard library (that is, Phobos).
That's because I'm simple folk. I really am. I can't write my own or
even use a third-party one. Sorry Sean, with all due respect, I can't invest my
time in Ares. My experience has taught me just that: to stick
with the standard library.

Bottom line, here are a few statements about me:
- I am *not* a systems developer. I am an application developer.
- I'm very paranoid about resources (memory in particular).
- I *do* like GC, but will use it only if I can fully control it (as a
user, of course).
- I can live with manual memory management, but I *will not* write basic
routines to replace the standard library.


And now the point of my rant: for whom is D really designed?

If you are an application developer who does not care much about memory
usage - "go managed" (that is, Java, .NET).

Systems developer? I have no idea. Maybe. But I'm sure they have to roll
their own pretty much everything (to get rid of the GC). And how much
systems development is there compared to application development?

Embedded devices? Again, with a GC, no way. And I'm leaving compiler
support out of it.


What I would really like to know is: will the standard library ever have
full functionality without using the GC? Will the GC evolve further? Which way?

No offense, those nifty language features that you people are constantly
discussing here are great indeed, but what most simple fellows like me
need is pragmatic stuff like a controllable GC and/or the ability to
painlessly ignore the GC.

-- 
serg.
Oct 26 2006
next sibling parent Anders F Björklund <afb algonet.se> writes:
Serg Kovrov wrote:

 The real pain for me is that (as I see it) D is designed to be used with 
 GC. That is, operations with arrays and strings heavy relaying on GC. 
 When I say "D", I mean "D and Phobos". For me they are one thing.

I think that is one of the features of D. I miss my GC when doing C/C++.
 As bottom line, there are a few statements about me:
 - I am *not* a system developer. I am an application developer.
 - I'm very paranoid regarding resources (memory in particular).
 - I *do* like GC, but will use it only if I can fully control it (as 
 user, of course).

Sounds like C++ would be a better match, GC will be optional in C++0X?

I think that garbage collection is a nice language FEATURE, and the only part I don't like is when it doesn't work and the other remaining bugs. But it *is* something of a must, once it's in the language and library - I don't think you can use D without using GC in (parts) of your program. See http://www.digitalmars.com/d/garbage.html for the rationale why.

I've used GC in both C (Apache "pooling" functions) and C++ (as a lib), and it's useful there too? I don't think you *have* to go to Java/C#.

Most of the time I don't miss managing memory resources manually more than I miss writing assembler code, and the computer is better at it too.

At least for me, D fills a niche between C and Java? A simpler C++...

--anders
Oct 26 2006
prev sibling next sibling parent reply Kyle Furlong <kylefurlong gmail.com> writes:
If you are writing your own memory management anyways, what's to stop you from implementing your own GC, which will be as paranoid about memory as you are? The hooks are all there in the Phobos source. Ares makes it even simpler, I believe, but I think Sean can help you with that better than I can.

That said, isn't one able to disable the GC through std.gc.disable() or some such call? What exactly does it do, now that I think about it?

Perhaps, for cases such as this, it would be helpful for there to be a page on digitalmars.com/d/ which lists the language features that are GC-supported, so that any prospective developers of the OP's bent can easily avoid the allocations and memory waste they don't want to invoke.
Oct 26 2006
next sibling parent reply Serg Kovrov <kovrov no.spam> writes:
Hi Kyle Furlong, you wrote:
 If you are writing your own memory management anyways, whats to stop you 
 from implementing your own GC, which will be as paranoid about memory as 
 you are? The hooks are all there in the phobos source. Ares makes it 
 even simpler, I believe, but I think Sean can help you with that better 
 than I can.

Sorry if I wasn't clear, perhaps it's because of my English. I do not intend to write low level stuff like GC. I write applications.

--
serg.
Oct 26 2006
parent Kyle Furlong <kylefurlong gmail.com> writes:
Serg Kovrov wrote:
 Hi Kyle Furlong, you wrote:
 If you are writing your own memory management anyways, whats to stop 
 you from implementing your own GC, which will be as paranoid about 
 memory as you are? The hooks are all there in the phobos source. Ares 
 makes it even simpler, I believe, but I think Sean can help you with 
 that better than I can.

Sorry if I wasn't clear, perhaps it's because of my English. I do not intend to write low level stuff like GC. I write applications.

... and in those applications you said you were willing to write manual memory management code. I don't think the step from there to hooking your malloc/free code with the gc hooks is that difficult. I mean, in any sane manually managed memory environment, you are gonna want to have some sort of bookkeeping anyways, so why not implement it in a way that would benefit you? But perhaps I really did misunderstand you.
Oct 26 2006
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
Kyle Furlong wrote:
 
 That said, isnt one able to disable the GC through std.gc.disable() or 
 some such call? What exactly does it do, now that I think about it? 

When memory is allocated from the GC it first looks to see if it has any available internally. If it can't find any it runs a collection and looks again. If it still can't find any then it obtains more memory from the OS and uses a portion of that for the allocation. Setting gc.disable() prevents the collection run from occurring so the GC will simply obtain more memory from the OS when it runs out. The reason behind this is that collections can take a long time, and performance-critical sections of code don't necessarily want to wait for a collection to occur just because they called 'new'. So they'll either disable and re-enable the GC around critical sections of code or they'll simply keep the GC disabled and manually collect during idle periods using gc.fullCollect().
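
The disable/re-enable pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: it uses the present-day druntime names `GC.disable()`, `GC.enable()` and `GC.collect()` from `core.memory` rather than the `std.gc.disable()` and `gc.fullCollect()` calls mentioned in this thread, and `allocateWithoutCollecting` is a hypothetical routine invented for the example. The druntime documentation also notes that a collection may still run while disabled if the implementation needs one, for instance when memory is exhausted.

```d
import core.memory : GC;

// Hypothetical latency-sensitive routine: allocates `count` small buffers
// while automatic collections are paused.
size_t allocateWithoutCollecting(size_t count)
{
    GC.disable();               // pause automatic collections
    scope (exit) GC.enable();   // resume them on every exit path

    size_t total = 0;
    foreach (i; 0 .. count)
    {
        // With collections paused, the GC grows its heap when it runs out
        // of free memory instead of collecting first.
        auto buf = new ubyte[](64);
        total += buf.length;
    }
    return total;
}

void main()
{
    assert(allocateWithoutCollecting(1_000) == 64_000);
    GC.collect();   // run a full collection explicitly at an idle point
}
```

The `scope (exit)` guard is a design choice for the sketch: it keeps the disable/enable calls paired even if the critical section throws.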
 Perhaps, for cases such as this, it would be helpful for there to be a 
 page on digitalmars.com/d/ which lists the language features that are gc 
 supported such that any prospective developers of the OP's bent can 
 easily avoid the allocations and memory waste they dont want to invoke.

Now that memory is not freed when string.length is set to zero, it's quite possible to avoid most reallocations simply by preallocating buffers before using them (i.e. set length to some large number and then back to zero). However, some operations will still cause an allocation, such as appending to a slice.

Sean
Oct 26 2006
parent reply Dave <Dave_member pathlink.com> writes:
Sean Kelly wrote:
 Kyle Furlong wrote:
 That said, isnt one able to disable the GC through std.gc.disable() or 
 some such call? What exactly does it do, now that I think about it? 

When memory is allocated from the GC it first looks to see if it has any available internally. If it can't find any it runs a collection and looks again. If it still can't find any then it obtains more memory from the OS and uses a portion of that for the allocation. Setting gc.disable() prevents the collection run from occurring so the GC will simply obtain more memory from the OS when it runs out. The reason behind this is that collections can take a long time, and performance-critical sections of code don't necessarily want to wait for a collection to occur just because they called 'new'. So they'll either disable and re-enable the GC around critical sections of code or they'll simply keep the GC disabled and manually collect during idle periods using gc.fullCollect().
 Perhaps, for cases such as this, it would be helpful for there to be a 
 page on digitalmars.com/d/ which lists the language features that are 
 gc supported such that any prospective developers of the OP's bent can 
 easily avoid the allocations and memory waste they dont want to invoke.

 Now that memory is not freed when string.length is set to zero it's quite
 possible to avoid most reallocations simply by preallocating in buffers
 before using them (ie. set length to some large number and then
 back to zero).  However, some operations will still cause an allocation,
 such as appending to a slice.

I did not know that had been changed. Is that now part of the language 'spec' somewhere as well?

I'm betting this has been discussed or at least proposed, but here goes again; let's get an array.reserve at least for native arrays (that could be implemented as {arr.length = nnn; arr.length = 0;}). That way it would make for less of a hack than re/setting the length, and also codify it as part of the language.

Oct 26 2006
parent reply Sean Kelly <sean f4.ca> writes:
Dave wrote:
 Sean Kelly wrote:
 Now that memory is not freed when string.length is set to zero it's 
 quite possible to avoid most reallocations simply by preallocating in 
 buffers before using them (ie. set length to some large number and then 

I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?

No. It was implemented in 170-172 by request from Derek. I don't know the issue number offhand.
 I'm betting this has been discussed or at least proposed, but here goes 
 again; let's get an array.reserve at least for native arrays (that could 
 be implemented as {arr.length = nnn; arr.length = 0;}). That way it 
 would make for less of a hack than re/setting the length, and also 
 codify it as part of the language.

I agree that this would be useful. Though it would probably be more like:

    size_t tmp = arr.length;
    arr.length = nnn;
    arr.length = tmp;

Sean
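
For readers on a current compiler, the reserve idea discussed here can be sketched with the built-in `reserve` property that present-day D provides for dynamic arrays. That property did not exist in the DMD 0.17x releases this thread is about, and the helper names `buildSquares` and `appendsStayInPlace` are hypothetical, made up for the illustration.

```d
// Appends n squares after reserving room for them up front.
int[] buildSquares(size_t n)
{
    int[] arr;
    arr.reserve(n);             // request capacity for n elements; length stays 0
    foreach (i; 0 .. n)
        arr ~= cast(int)(i * i);
    return arr;
}

// Returns true if appending n elements after reserve(n) never moved the array.
bool appendsStayInPlace(size_t n)
{
    int[] arr;
    arr.reserve(n);
    if (n == 0)
        return true;
    arr ~= 0;
    auto start = arr.ptr;       // address of the reserved block
    foreach (i; 1 .. n)
        arr ~= cast(int) i;
    return arr.ptr is start;    // unchanged pointer means no reallocation
}

void main()
{
    assert(buildSquares(4) == [0, 1, 4, 9]);
    assert(appendsStayInPlace(1_000));
}
```

Unlike the set-length-and-restore trick, `reserve` leaves the array's length and contents untouched, which is the behaviour the save-and-restore variant above is trying to achieve by hand.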
Oct 27 2006
parent reply Dave <Dave_member pathlink.com> writes:
Sean Kelly wrote:
 Dave wrote:
 Sean Kelly wrote:
 Now that memory is not freed when string.length is set to zero it's 
 quite possible to avoid most reallocations simply by preallocating in 
 buffers before using them (ie. set length to some large number and then 

I did not know that had been changed.. Is that now part of the language 'spec' somewhere as well?

No. It was implemented in 170-172 by request from Derek. I don't know the issue number offhand.
 I'm betting this has been discussed or at least proposed, but here 
 goes again; let's get an array.reserve at least for native arrays 
 (that could be implemented as {arr.length = nnn; arr.length = 0;}). 
 That way it would make for less of a hack than re/setting the length, 
 and also codify it as part of the language.

I agree that this would be useful. Though it would probably be more like: size_t tmp = arr.length; arr.length = nnn; arr.length = tmp;

Oops, you're right.

I was curious as to how much initialization cost. I took the code (bottom) and ran it three times:

1) as-is,
2) with the initialization code in gc._d_newarrayi() commented out, and
3) with the initialization code in gcx._malloc() also commented out (along with (2)).

1 (DMD v0.172 -O -inline -release, libc v2.4):
new(20): 2.316      malloc(20): 0.83
new(40): 3.029      malloc(40): 0.831
new(60): 2.627      malloc(60): 0.872
new(80): 4.138      malloc(80): 1.754
new(100): 3.968     malloc(100): 1.756

2:
new(20): 1.451      malloc(20): 0.835
new(40): 2.137      malloc(40): 0.838
new(60): 1.765      malloc(60): 0.874
new(80): 3.33       malloc(80): 2.114
new(100): 3.108     malloc(100): 2.164

3:
new(20): 1.133      malloc(20): 0.838
new(40): 1.658      malloc(40): 0.834
new(60): 1.657      malloc(60): 0.981
new(80): 1.871      malloc(80): 1.888
new(100): 1.871     malloc(100): 1.899

The cost of initialization is actually *higher* than the cost of allocation/GC, and for larger arrays the performance is comparable to malloc/free.

One thing I noticed is that for most/all of the _d_new* functions, initialization will be done twice, once in the gcx.malloc and again in the _d_new* function. I believe the extra initialization could be removed in most cases (perhaps with an optional parameter to gcx.malloc()?). Maybe also some syntax to support 'void' initializers for heap allocated arrays?

Those may go a long way toward getting rid of the OP's concerns with the performance of the GC.

//----------------
import std.date, std.c.stdlib, std.stdio, std.gc;

void main()
{
    const iters = 10_000_000;

    for(int j = 1; j <= 5; j++)
    {
        {
            // time GC allocation of a char[] of j*20 elements
            d_time s = getUTCtime();
            for(int i = 0; i < iters; i++)
            {
                char[] str = new char[j * 20];
            }
            d_time e = getUTCtime();
            writefln("new(",j*20,"): ",(e-s)/cast(float)TicksPerSecond);
        }
        {
            // time a C malloc/free pair of comparable size
            d_time s = getUTCtime();
            for(int i = 0; i < iters; i++)
            {
                char* str = cast(char*)malloc(j * 20 + 1);
                free(str);
            }
            d_time e = getUTCtime();
            writefln("malloc(",j*20,"): ",(e-s)/cast(float)TicksPerSecond);
        }
        // collect between sizes so garbage from one round does not skew the next
        fullCollect;
    }
}
Oct 27 2006
parent reply Sean Kelly <sean f4.ca> writes:
Dave wrote:
 One thing I noticed is that for most/all of the _d_new* functions, 
 initialization will be done twice, once in the gcx.malloc and again in 
 the _d_new* function. I believe the extra initialization could be 
 removed in most cases (perhaps with an optional parameter to 
 gcx.malloc()?). Maybe also some syntax to support 'void' initializers 
 for heap allocated arrays?

What initialization in gcx.malloc? The only call to memset I see has a debug flag. And I believe void initializers already work for arrays.

Sean
Oct 28 2006
parent reply Dave <Dave_member pathlink.com> writes:
Sean Kelly wrote:
 One thing I noticed is that for most/all of the _d_new* functions, 
 initialization will be done twice, once in the gcx.malloc and again in 
 the _d_new* function. I believe the extra initialization could be 
 removed in most cases (perhaps with an optional parameter to 
 gcx.malloc()?). Maybe also some syntax to support 'void' initializers 
 for heap allocated arrays?

What initialization in gcx.malloc? The only call to memset I see has a debug flag. And I believe void initializers already work for arrays.

Line 296:

    foreach(inout byte b; cast(byte[])(p + size)[0..binsize[bin] - size]) { b = 0; }

Right above the debug (MEMSTOMP).

Not really 'initialization' as it just clears the unused portion of the 'bin', but it's still overhead that std.c.stdlib.malloc() doesn't have.

The important point is that currently the initialization/clearing in itself takes longer than stdlib.malloc, so no matter how fast the allocator is, initialization will be a bottleneck.

Any ideas on how to optimize that?

Oct 28 2006
parent reply Sean Kelly <sean f4.ca> writes:
Dave wrote:
 One thing I noticed is that for most/all of the _d_new* functions, 
 initialization will be done twice, once in the gcx.malloc and again 
 in the _d_new* function. I believe the extra initialization could be 
 removed in most cases (perhaps with an optional parameter to 
 gcx.malloc()?). Maybe also some syntax to support 'void' initializers 
 for heap allocated arrays?

What initialization in gcx.malloc? The only call to memset I see has a debug flag. And I believe void initializers already work for arrays.

Line 296: foreach(inout byte b; cast(byte[])(p + size)[0..binsize[bin] - size]) { b = 0; } Right above the debug (MEMSTOMP)

Oops! Dunno how I missed that.
 Not really 'initialization' as it just clears the unused portion of the 
 'bin', but it's still overhead that std.c.stdlib.malloc() doesn't have.
 
 The important point is that currently the initialization/clearing in 
 itself takes longer than stdlib.malloc, so no matter how fast the 
 allocator is, initialization will be a bottleneck.
 
 Any ideas on how to optimize that?

I'm not sure of the ideal approach for Phobos, but in Ares I have separate malloc and calloc methods exposed. So I'll likely just change calloc to initialize the entire block instead of just the allocated portion, and remove the spare space initializer from mallocNoSync.

Sean
Oct 28 2006
parent Sean Kelly <sean f4.ca> writes:
Sean Kelly wrote:
 Dave wrote:
 Not really 'initialization' as it just clears the unused portion of 
 the 'bin', but it's still overhead that std.c.stdlib.malloc() doesn't 
 have.

 The important point is that currently the initialization/clearing in 
 itself takes longer than stdlib.malloc, so no matter how fast the 
 allocator is, initialization will be a bottleneck.

 Any ideas on how to optimize that?

I'm not sure of the ideal approach for Phobos, but in Ares I have separate malloc and calloc methods exposed. So I'll likely just change calloc to initialize the entire block instead of just the allocated portion, and remove the spare space initializer from mallocNoSync.

You know, one thing to be said for the current approach is that it will result in fewer memory 'leaks' because unused memory is initialized to a value that is guaranteed not to look like a reference to actual memory. I may leave things as-is.

Sean
Oct 28 2006
prev sibling next sibling parent reply Sean Kelly <sean f4.ca> writes:
Serg Kovrov wrote:
 
 The problem is, I can't do without standard library (that is Phobos).
 That's because I'm simple folk. I really am. I can't write my own or
 even use third-party. Sorry Sean, with all my respect, I cant invest my
 time to Ares. My experience taught me just that - to stick
 with standard library.

No problem. I'd say this is true of most programmers.
 And now point of my rant: For whom D is really designed?

I think D is designed for a larger audience than C++, since it's somewhat forgiving in design (similar to Java), but can be used for systems programming (similar to C++). By comparison, C++ was truly designed for advanced programmers despite the range of people that actually use it.
 If you are an application developer who do not care much about memory 
 usage - "go managed" (that is Java, .Net).

...and D.
 System developer? Have no idea. Maybe. But I'm sure they have to roll 
 their own pretty much everything (to get rid of GC). And how much there 
 are system development comparing to application development?

There are very few systems developers compared to application developers. But D can be used here as well. The standard library might be largely ignored and dynamic arrays might be used carefully, but D can really do everything C++ can here, unless you absolutely insist on having user-defined data types that are indistinguishable from built-in types.
 Embedded devices? Again, with GC, no way. And I leave out compiler 
 support for that matter.

And embedded developers can turn the GC off or write a custom GC to operate within their constraints. However, I disagree that GC is completely incompatible with embedded devices. Java runs on just about everything these days and unlike D it doesn't even offer the option of stack allocation or explicit deletion.
 What I really would like to know is, will standard library ever have 
 full functionality without using GC? Will GC evolving further? Which way?

I'd like to think that the standard library will eventually minimize its use of GC allocations over time, but they will never be completely eliminated. Some algorithms just aren't represented well without some form of internal dynamic allocation.
 No offense, those nifty language features that you people constantly 
 discussing here, are great indeed, but what most simple fellows like me 
 needed is pragmatic stuff like controllable GC and/or ability to 
 painless ignore GC.

A controllable GC isn't terribly difficult. In fact, std.gc in Phobos offers a lot of functionality right now.

Sean
Oct 26 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Sean Kelly wrote:
  but D can
 really do everything C++ can here, unless you absolutely insist on 
 having user-defined data types that are indistinguishable from built-in 
 types.

C++ is not capable of having user-defined types indistinguishable from built-in ones.
Oct 26 2006
parent reply Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 Sean Kelly wrote:
  but D can
 really do everything C++ can here, unless you absolutely insist on 
 having user-defined data types that are indistinguishable from 
 built-in types.

C++ is not capable of having user-defined types indistinguishable from built-in ones.

How so? It can certainly get pretty close, but I'm unaware of the limitations.

Sean
Oct 27 2006
parent Walter Bright <newshound digitalmars.com> writes:
Sean Kelly wrote:
 Walter Bright wrote:
 Sean Kelly wrote:
  but D can
 really do everything C++ can here, unless you absolutely insist on 
 having user-defined data types that are indistinguishable from 
 built-in types.

C++ is not capable of having user-defined types indistinguishable from built-in ones.

How so? It can certainly get pretty close, but I'm unaware of the limitations.

Consider std::string. It cannot deal with "string1"+"string2". std::vector<> cannot be statically initialized. There's no way to create user-defined literals. Etc.
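
For contrast, here is a small sketch of how the first two cases look with D's built-in strings and arrays. It is illustrative only: it shows built-in types, so it says nothing about user-defined types or user-defined literals, and the names `greeting` and `primes` are made up for the example.

```d
// Two string literals joined with D's `~` concatenation operator.
// Declared as an enum, the result is a compile-time constant.
enum greeting = "string1" ~ "string2";
static assert(greeting == "string1string2");

// A module-level array initialized statically from an array literal.
immutable int[] primes = [2, 3, 5, 7];

void main()
{
    assert(greeting.length == 14);
    assert(primes[3] == 7);
}
```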
Oct 30 2006
prev sibling next sibling parent reply Walter Bright <newshound digitalmars.com> writes:
I can understand your desire to control memory explicitly. I grew up 
programming C and C++, and considered myself a professional, and 
professionals have complete control over the operation of their 
programs. gc was for lazy, wussy, less competent programmers. I bought 
the conventional wisdom that gc was slow and inefficient.

Then, for a project my employer put me on, I had to use a gc, in fact, I 
had to work on a gc. I slowly came to realize I was wrong about gc on 
all counts.

Then I began to think about why C++ was so complicated. I eventually 
began to realize it's because of explicit memory management. Have a gc, 
and suddenly you can make a language with even greater power that is 
much, much simpler.

For one example, you cannot do array slices in C++ without considerable 
agony. In D, they are easy as pie.
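
A minimal sketch of what slicing looks like in D, assuming nothing beyond the built-in array syntax (`interior` is a hypothetical helper written for this example). The slice is a view into the original array rather than a copy, and the GC keeps the underlying memory alive for as long as any slice refers to it, which is why no ownership bookkeeping appears in the code.

```d
// Returns a view of everything except the first and last element.
// No elements are copied; the slice refers to the same memory as `a`.
int[] interior(int[] a)
{
    return a.length < 2 ? a[0 .. 0] : a[1 .. $ - 1];
}

void main()
{
    int[] data = [10, 20, 30, 40, 50];
    int[] mid = interior(data);
    assert(mid == [20, 30, 40]);

    mid[0] = 99;                 // writes through to the original array
    assert(data[1] == 99);
}
```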

P.S. It *is* true (before D) that gc based languages are slower than 
C/C++. The conventional wisdom says that this is caused by the gc. This 
simply is not true, the slowness is usually caused by lack of 
expressiveness in the language (Java) or dynamic typing (Python, 
Javascript, etc.).
Oct 26 2006
next sibling parent reply Sean Kelly <sean f4.ca> writes:
Walter Bright wrote:
 
 Then I began to think about why C++ was so complicated. I eventually 
 began to realize it's because of explicit memory management. Have a gc, 
 and suddenly you can make a language with even greater power that is 
 much, much simpler.

After using D for a bit, I came to the same conclusion. That isn't to say you can completely forget about the cost of reallocations or data ownership, but having a GC simplifies the things that should be simple, without creating additional complexity elsewhere.
 For one example, you cannot do array slices in C++ without considerable 
 agony. In D, they are easy as pie.

True enough. I have a slice class I use in C++ for specialized purposes (a compiler, for example), but dealing with the memory ownership issues is just too much of a headache for apps where data doesn't have such a predictable lifetime.
 P.S. It *is* true (before D) that gc based languages are slower than 
 C/C++. The conventional wisdom says that this is caused by the gc. This 
 simply is not true, the slowness is usually caused by lack of 
 expressiveness in the language (Java) or dynamic typing (Python, 
 Javascript, etc.).

I think another issue is that garbage collection *can* cause a noticeable stutter in user applications which pay no attention to memory management, and it's easy for someone to point at that hitch and proclaim that garbage collection itself is slow. People seem to like the straw man argument for some reason.

Sean
Oct 26 2006
next sibling parent Walter Bright <newshound digitalmars.com> writes:
Sean Kelly wrote:
 I think another issue is that garbage collection *can* cause a 
 noticeable stutter in user applications which pay no attention to memory 
 management, and it's easy for someone to point at that hitch and 
 proclaim that garbage collection itself is slow.  People seem to like 
 the straw man argument for some reason.

You know it's a straw man when they also say that malloc/free or C++ new/delete have predictable latency. They don't. People who write code that *requires* predictable latency preallocate all their data.
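
One way to picture the preallocation approach is a fixed-capacity pool whose storage is set aside before the latency-sensitive code runs. The sketch below is an assumption-laden illustration, not anything proposed in this thread: `Pool`, `acquire` and `reset` are hypothetical names, and a real system would also need a policy for what to do when the pool is exhausted.

```d
// Fixed-capacity pool: all storage lives inside the struct, so acquire()
// never calls an allocator (GC, malloc or otherwise).
struct Pool(T, size_t N)
{
    private T[N] slots;
    private size_t used;

    // Hands out the next unused slot, or null once all N are taken.
    T* acquire()
    {
        if (used == N)
            return null;
        return &slots[used++];
    }

    // Makes every slot available again without freeing anything.
    void reset()
    {
        used = 0;
    }
}

void main()
{
    Pool!(int, 2) pool;
    assert(pool.acquire() !is null);
    assert(pool.acquire() !is null);
    assert(pool.acquire() is null);   // exhausted
    pool.reset();
    assert(pool.acquire() !is null);
}
```

Each `acquire` is a bounds check and an increment, so its cost does not depend on the state of any heap.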
Oct 26 2006
prev sibling parent reply Dave <Dave_member pathlink.com> writes:
Sean Kelly wrote:
 Walter Bright wrote:
 Then I began to think about why C++ was so complicated. I eventually 
 began to realize it's because of explicit memory management. Have a 
 gc, and suddenly you can make a language with even greater power that 
 is much, much simpler.

After using D for a bit, I came to the same conclusion. That isn't to say you can completely forget about the cost of reallocations or data ownership, but having a GC simplifies the things that should be simple, without creating additional complexity elsewhere.
 For one example, you cannot do array slices in C++ without 
 considerable agony. In D, they are easy as pie.

True enough. I have a slice class I use in C++ for specialized purposes (a compiler, for example), but dealing with the memory ownership issues is just too much of a headache for apps where data doesn't have such a predictable lifetime.
 P.S. It *is* true (before D) that gc based languages are slower than 
 C/C++. The conventional wisdom says that this is caused by the gc. 
 This simply is not true, the slowness is usually caused by lack of 
 expressiveness in the language (Java) or dynamic typing (Python, 
 Javascript, etc.).

 I think another issue is that garbage collection *can* cause a 
 noticeable stutter in user applications which pay no attention to memory 
 management, and it's easy for someone to point at that hitch and 
 proclaim that garbage collection itself is slow.  People seem to like 
 the straw man argument for some reason.
 
 Sean

It's called "The path of least resistance" <g> When even a good programmer who knows better is in a hurry (and when are they not? ;)), they'll abuse the heck out of the heap because it's more expedient, and the end result still makes it past QC. Pretty soon you have this largely sub-optimal program made up of many small, somewhat sub-optimal pieces. D though at least gives us 'delete' as well as some control over the GC.

As an aside, academia moving away from explicit memory managed languages in their curricula is probably producing quite a few programmers who know very little about memory management and how it can affect performance.
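A rough sketch of that kind of control, assuming a present-day D runtime where it is exposed through `core.memory.GC` (in 2006 the equivalent lived elsewhere, and `GC.free` stands in here for 'delete'). The function `sumScratch` is made up for illustration:

```d
import core.memory : GC;

// Hypothetical helper: fills and sums a scratch buffer while automatic
// collections are suppressed, then releases the buffer explicitly.
ulong sumScratch(size_t n)
{
    GC.disable();                 // ask the GC not to run automatic collections
    scope (exit) GC.enable();     // re-enable on every exit path

    auto buf = new uint[](n);     // still a GC allocation
    foreach (i, ref v; buf)
        v = cast(uint) i;

    ulong total = 0;
    foreach (v; buf)
        total += v;

    GC.free(buf.ptr);             // explicit release instead of waiting for a sweep
    return total;
}

void main()
{
    assert(sumScratch(5) == 10);  // 0 + 1 + 2 + 3 + 4
    GC.collect();                 // run a collection at a moment of our choosing
}
```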

Oct 26 2006
next sibling parent reply Charles Fox <charles SPAMMENOT.robots.ox.ac.uk> writes:
 As an aside, academia moving away from explicit memory managed languages in
 their curricula is probably producing quite a few programmers who know very
 little about memory management and how it can affect performance.

Interesting -- I just read this on the same day as noticing that Cambridge has ditched Java from its course and put C++ back there -- make of that what you will!
Oct 27 2006
parent Sean Kelly <sean f4.ca> writes:
Charles Fox wrote:
 As an aside, academia moving away from explicit memory managed languages in
 their curricula is probably producing quite a few programmers who know very
 little about memory management and how it can affect performance.
 
 Interesting -- I just read this on the same day as noticing that Cambridge
 has ditched Java from its course and put C++ back there -- make of that
 what you will!

According to Bjarne, this is happening at a few schools. He said it's the result of pressure from the industry, which has a high demand for students who know C++.

Sean
Oct 27 2006
prev sibling parent Sean Kelly <sean f4.ca> writes:
Dave wrote:
 
 As an aside, academia moving away from explicit memory managed languages 
 in their curricula is probably producing quite a few programmers who 
 know very little about memory management and how it can affect performance.

Yup. This is one of my major complaints about using Java in place of C++ as a teaching language. Sure, it allows the focus to be on the algorithm instead of all the peripheral aspects, but the result is students who aren't aware of the peripheral aspects. The cost of virtual methods is another issue. I'd rather students suffer through a bit of additional complexity and end up not needing the additional experience they've gained than be ignorant of the issues.

Sean
Oct 27 2006
prev sibling parent reply BCS <BCS pathilink.com> writes:
How much does the runtime, generated code and Phobos depend on the GC?
Does it try to optimize out allocations or delete things it can?
For example, does this perform one allocation or several?

char[] foo = "abc", bar = "baz";
char[] ret = foo ~ bar ~ foo;

// ret.length = 9;
// ret[0..3] = foo;
// ret[3..6] = bar;
// ret[6..9] = foo;

and would these temporaries get deleted (or maybe not even created)?

if((foo ~ bar)[4]=='g') go();

//auto __tmp = foo ~ bar
//if(__tmp[4] == 'g') go();
//delete __tmp;

or

//if( (foo.length>4 ? foo[4] : bar[4-foo.length]) == 'g') go();
Oct 26 2006
next sibling parent Walter Bright <newshound digitalmars.com> writes:
BCS wrote:
 How much does the runtime, generated code and Phobos depend on the GC?
 Does it try to optimize out allocations or delete things it can?
 For example, does this perform one allocation or several?

DMD isn't very good at eliminating unnecessary temporaries. This isn't a defect in D, but in DMD. DMC++ is pretty good at eliminating unnecessary temporaries, not because C++ is better, but because I put a lot of effort into that aspect (for example, mine was the first to do the "named return value" optimization).
Oct 26 2006
prev sibling parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
BCS wrote:
 How much does the runtime, generated code and Phobos depend on the GC?
 Does it try to optimize out allocations or delete things it can?
 For example, does this perform one allocation or several?
 
 char[] foo = "abc", bar = "baz";
 char[] ret = foo ~ bar ~ foo;

Only one IIRC. There's a function called _d_arraycatn in the runtime that accepts an element size and a variable number of arrays, allocates size * (total length) and starts copying. That should be the one invoked for a concatenation like the above.
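Roughly the idea, as a sketch in present-day D. This is *not* the runtime's actual code, and `catN` is a made-up name; it only shows the "sum the lengths, allocate once, copy" scheme for char arrays:

```d
// Hypothetical stand-in for the runtime helper: concatenates any number
// of char arrays using exactly one allocation.
char[] catN(const(char)[][] parts...)
{
    size_t total = 0;
    foreach (p; parts)
        total += p.length;                     // first pass: total length

    auto result = new char[](total);           // the single allocation
    size_t pos = 0;
    foreach (p; parts)
    {
        result[pos .. pos + p.length] = p[];   // copy each piece into place
        pos += p.length;
    }
    return result;
}

void main()
{
    string foo = "abc", bar = "baz";
    char[] ret = catN(foo, bar, foo);

    assert(ret.length == 9);
    assert(ret == "abcbazabc");
}
```

The point is that `foo ~ bar ~ foo` need not build an intermediate `foo ~ bar` array first.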
 // ret.length = 9;
 // ret[0..3] = foo;
 // ret[3..6] = bar;
 // ret[6..9] = foo;
 
 and would these temporaries get deleted (or maybe not even created)?
 
 if((foo ~ bar)[4]=='g') go();
 
 //auto __tmp = foo ~ bar
 //if(__tmp[4] == 'g') go();
 //delete __tmp;
 
 or
 
 //if( (foo.length>4 ? foo[4] : bar[4-foo.length]) == 'g') go();

I don't think these get deleted[1], nor do I expect creating these is avoided. Could be wrong about this one though, you'd have to check the compiler output to be sure[2]. Well, that or wait until Walter or someone who has checked for you answers ;).

[1]: before the next GC cycle, that is.
[2]: Don't forget to turn on optimizations, might make a difference here.
Oct 26 2006
prev sibling parent reply "Andrey Khropov" <andkhropov_nosp m_mtu-net.ru> writes:
Serg Kovrov wrote:

Maybe you should take a look at this:

http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&lang=dlang&lang2=gpp

http://shootout.alioth.debian.org/gp4sandbox/benchmark.php?test=all&lang=dlang&lang2=java

and also try to test some programs for yourself.

I can also say that there was recently a discussion on RSDN (Russian software
developers forum) about string splitting performance, and D's std.string split
routine was the fastest, beating C++/boost.split by 20x:

http://rsdn.ru/forum/Message.aspx?mid=2126270&only=1

(it's in Russian but you can easily recognize the numbers)

(some src code also here: http://rsdn.ru/forum/Message.aspx?mid=2113456&only=1)


--------------------------------------------------------------------------------

D keeps most of C's low-level facilities, so you can still stick to structs
and malloc/free, and you can overload new/delete as well.

And you can even allocate on the stack (the syntax is horrible, though, perhaps to
discourage this).
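A minimal sketch of the malloc/free route, assuming present-day D where the C library is reachable through `core.stdc.stdlib`. The function name `sumViaMalloc` is made up for illustration, and the fixed-size array is just one way to get stack storage, not necessarily the syntax meant above:

```d
import core.stdc.stdlib : malloc, free;

struct Point { int x, y; }

// Hypothetical helper: builds a Point on the C heap, uses it, frees it.
int sumViaMalloc(int x, int y)
{
    auto p = cast(Point*) malloc(Point.sizeof);  // memory the GC never sees
    if (p is null)
        return -1;                               // allocation failed
    scope (exit) free(p);                        // manual release on every exit path

    *p = Point(x, y);

    int[2] parts = [p.x, p.y];                   // fixed-size array: stack storage
    return parts[0] + parts[1];
}

void main()
{
    assert(sumViaMalloc(3, 4) == 7);
}
```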

-- 
AKhropov
Oct 26 2006
parent reply Walter Bright <newshound digitalmars.com> writes:
Andrey Khropov wrote:
 http://rsdn.ru/forum/Message.aspx?mid=2126270&only=1
 
 (it's in Russian but you can easily recognize the numbers)

I tried google's translator on it, but unfortunately Russian is not supported!
Oct 30 2006
parent Roberto Mariottini <rmariottini mail.com> writes:
Walter Bright wrote:
 Andrey Khropov wrote:
 http://rsdn.ru/forum/Message.aspx?mid=2126270&only=1

 (it's in Russian but you can easily recognize the numbers)

I tried google's translator on it, but unfortunately Russian is not supported!

But it's supported by Altavista: http://babelfish.altavista.com/

Ciao

P.S.: The Java example should use StringBuffer, not String!
Nov 02 2006