
digitalmars.D - manual memory management

reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
Hello, folks!

I'm working on a project that requires manual memory management using custom
allocators, but I can't seem to get dynamic arrays and associative arrays
to work with them.

std.conv.emplace only gives me the dynamic array's pointer-and-length pair
(or the associative array's pointer) constructed in place, not the underlying
storage, which is only half of the issue.

I can work around the dynamic-array case by manually allocating the elements
of the array and returning a slice over them (although I'd be really glad
if I could use dynamic arrays with my custom allocators directly).

The biggest problem is the associative array, the storage of which is
completely hidden and implementation-specific, so I can't work around it
the way I can with dynamic arrays.

How can I have an associative array that uses a custom allocator?

-- 
Bye,
Gor Gyolchanyan.
Jan 07 2013
next sibling parent reply "David Nadlinger" <see klickverbot.at> writes:
On Monday, 7 January 2013 at 15:01:27 UTC, Gor Gyolchanyan wrote:
 How can I have an associative array, which uses a custom 
 allocator?
I'm afraid the only viable solution right now is to implement your own AA type as a struct with overloaded operators (which is in fact what the built-in AAs are lowered to as well).

There are two downsides to this, though - besides, of course, the fact that you need a custom implementation:

 - You cannot pass your type to library functions expecting a built-in associative array.
 - You lose the convenient literal syntax. This could be fixed in the language, though, by providing a rewrite of array/AA literals to a variadic constructor of user types, thus eliminating the need for GC allocations (gah, another thing I just need to find the time to write up a DIP for…).

David
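For illustration, here is a minimal sketch of what such a wrapper struct could look like, backed by plain realloc/free instead of the GC. Linear search stands in for real hashing, and all names are made up for the example; this is not how the built-in AA is implemented.

    import core.stdc.stdlib : realloc, free;

    struct MallocMap(K, V)
    {
        private struct Pair { K key; V value; }
        private Pair* pairs;
        private size_t len, cap;

        // m[key] = value
        ref V opIndexAssign(V value, K key)
        {
            foreach (i; 0 .. len)
                if (pairs[i].key == key)
                {
                    pairs[i].value = value;
                    return pairs[i].value;
                }
            if (len == cap)
            {
                cap = cap ? cap * 2 : 4;
                pairs = cast(Pair*) realloc(pairs, cap * Pair.sizeof);
                assert(pairs !is null, "out of memory");
            }
            pairs[len] = Pair(key, value);
            return pairs[len++].value;
        }

        // m[key]
        ref V opIndex(K key)
        {
            auto p = key in this;
            assert(p !is null, "key not found");
            return *p;
        }

        // key in m
        V* opBinaryRight(string op : "in")(K key)
        {
            foreach (i; 0 .. len)
                if (pairs[i].key == key)
                    return &pairs[i].value;
            return null;
        }

        // explicit cleanup - no GC involved
        void dispose()
        {
            free(pairs);
            pairs = null;
            len = cap = 0;
        }
    }

    unittest
    {
        MallocMap!(string, int) m;
        m["answer"] = 42;
        assert("answer" in m && m["answer"] == 42);
        m.dispose();
    }

Swapping a custom allocator in for realloc/free is then just a matter of replacing those two calls.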
Jan 07 2013
parent reply Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Mon, Jan 7, 2013 at 7:25 PM, David Nadlinger <see klickverbot.at> wrote:

 On Monday, 7 January 2013 at 15:01:27 UTC, Gor Gyolchanyan wrote:

 How can I have an associative array, which uses a custom allocator?
 I'm afraid the only viable solution right now is to implement your own AA
 type as a struct with overloaded operators (which is in fact what the
 built-in AAs are lowered to as well).

 There are two downsides to this, though - besides, of course, the fact that
 you need a custom implementation:
  - You cannot pass your type to library functions expecting a built-in
 associative array.
  - You lose the convenient literal syntax. This could be fixed in the
 language, though, by providing a rewrite of array/AA literals to a variadic
 constructor of user types, thus eliminating the need for GC allocations
 (gah, another thing I just need to find the time to write up a DIP for…).

 David
This means that dlang.org is lying. D doesn't provide both a garbage collector and manual memory management; it provides a garbage collector and a lousy excuse for manual memory management.

As much as I love D for its metaprogramming and generative programming, it's not even remotely fit for system-level programming the way it claims to be. I don't mean to be trolling, but it's not the first time I've been grossly disappointed in D.

--
Bye,
Gor Gyolchanyan.
Jan 07 2013
next sibling parent reply "mist" <none none.none> writes:
How is D's manual memory management any worse than plain C's?
Plenty of language features depend on the GC, but what is left
can hardly be called "a lousy excuse". It lacks some convenience
and guidelines based on practical experience, but it is already as
capable as some widespread solutions for systems programming (C).
In fact, I'd be much more afraid of runtime issues than GC ones
when doing systems work.

On Monday, 7 January 2013 at 15:49:50 UTC, Gor Gyolchanyan wrote:
 On Mon, Jan 7, 2013 at 7:25 PM, David Nadlinger 
 <see klickverbot.at> wrote:

 On Monday, 7 January 2013 at 15:01:27 UTC, Gor Gyolchanyan 
 wrote:

 How can I have an associative array, which uses a custom 
 allocator?
I'm afraid the only viable solution right now is to implement your own AA type as a struct with overloaded operators (which is in fact what the built-in AAs are lowered to as well). There are two downside to this, though - besides, of course, the fact that you need a custom implementation: - You cannot pass your type to library functions expecting a built-in associative array. - You lose the convenient literal syntax. This could be fixed in the language, though, by providing a rewrite to a variadic constructor of user types for array/AA literals, thus eliminating the need for GC allocations (gah, another thing I just need to find the time to write up a DIP for…). David
This means, that dlang.org is lying. D doesn't provide both a garbage collector and manual memory management. It provides a garbage collector and a lousy excuse for manual memory management. As much as I love D for it's metaprogramming and generative programming, it's not even remotely fit for system-level programming the way it claims it is. I don't mean to be trolling, but it's not the first time I got grossly disappointed in D.
Jan 07 2013
parent reply "Rob T" <rob ucora.com> writes:
On Monday, 7 January 2013 at 16:12:22 UTC, mist wrote:
 How is D manual memory management any worse than plain C one?
 Plenty of language features depend on GC but stuff that is left 
 can hardly be named "a lousy excuse". It lacks some convenience 
 and guidelines based on practical experience but it is already 
 as capable as some of wide-spread solutions for systems 
 programming (C). In fact I'd be much more afraid of runtime 
 issues when doing system stuff than GC ones.
I think the point being made was that built-in language features should not depend on the GC, because that means you cannot fully use the language without a GC present and active. We can perhaps excuse the standard library, but certainly not the language itself, given that the claim is made that D's GC is fully optional.

--rt
Jan 07 2013
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Monday, January 07, 2013 17:55:35 Rob T wrote:
 On Monday, 7 January 2013 at 16:12:22 UTC, mist wrote:
 How is D manual memory management any worse than plain C one?
 Plenty of language features depend on GC but stuff that is left
 can hardly be named "a lousy excuse". It lacks some convenience
 and guidelines based on practical experience but it is already
 as capable as some of wide-spread solutions for systems
 programming (C). In fact I'd be much more afraid of runtime
 issues when doing system stuff than GC ones.
I think the point being made was that built in language features should not be dependent on the need for a GC because it means that you cannot fully use the language without a GC present and active. We can perhaps excuse the std library, but certainly not the language itself, because the claim is made that D's GC is fully optional.
I don't think that any of the documentation or D's developers have ever claimed that you could use the full language without the GC. Quite the opposite, in fact. There are a number of language features that require the GC - including AAs, array concatenation, and closures. You _can_ program in D without the GC, but you lose features, and there's no way around that.

It may be the case that some features currently require the GC when they shouldn't, but there are definitely features that _must_ have the GC and _cannot_ be implemented otherwise (e.g. array concatenation and closures). So, if you want to ditch the GC completely, it comes at a cost, and AFAIK no one around here is saying otherwise. You _can_ do it, though, if you really want to.

In general, however, if you want to minimize GC involvement, the best approach is to use manual memory management for the most part and minimize your use of features that require the GC, rather than trying to get rid of it entirely, because going the extra mile to remove its use completely generally just isn't worth it. Kith-Sa posted some good advice on this just the other day, and he's written a game engine in D:

http://forum.dlang.org/post/vbsajlgotanuhmmpnspf forum.dlang.org

- Jonathan M Davis
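As a concrete illustration of the manual route (a minimal sketch only: the struct and the helper names are made up, and it assumes std.conv.emplace plus the C heap):

    import core.stdc.stdlib : malloc, free;
    import std.conv : emplace;

    struct Point { int x, y; }

    // Allocate and construct a Point outside the GC heap.
    Point* makePoint(int x, int y)
    {
        void* mem = malloc(Point.sizeof);
        assert(mem !is null, "out of memory");
        return emplace!Point(mem[0 .. Point.sizeof], x, y);
    }

    // Release it explicitly; Point has no destructor to run first.
    void disposePoint(Point* p)
    {
        free(p);
    }

    unittest
    {
        auto p = makePoint(1, 2);
        assert(p.x == 1 && p.y == 2);
        disposePoint(p);
    }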
Jan 07 2013
next sibling parent reply "Rob T" <rob ucora.com> writes:
On Monday, 7 January 2013 at 17:19:25 UTC, Jonathan M Davis wrote:
 I don't think that any of the documentation or D's developers 
 have ever
 claimed that you could use the full language without the GC. 
 Quite the
 opposite in fact. There are a number of language features that 
 require the GC
 - including AAs, array concatenation, and closures.
True, there is some documentation describing that certain features require the use of the GC, although I would say the documentation needs to be made a lot clearer on this point. For example, in the AA section there's no mention that the GC is required.

What you are saying is that while the GC is considered optional, it is not really optional for the language as a whole; only a (I assume large) subset of the language will work without the GC. In other words, the GC is partly optional.

I think we can do a lot better at making it clear that the GC is not 100% optional, and also at indicating clearly which features will not work without one.
 You _can_ program in D without the GC, but you lose features, and there's
 no way around that. It may be the case that some features currently
 require the GC when they shouldn't, but there are definitely features
 that _must_ have the GC and _cannot_ be implemented otherwise (e.g. array
 concatenation and closures).
Is this a hard fact, or can there be a way to make it work? For example, what about the custom allocator idea?

From a marketing POV, if the language can be made 100% free of the GC, it would at least not be a deterrent to those who cannot accept having to use one. From a technical POV, there are definitely many situations where not using a GC is desirable.

--rt
Jan 07 2013
next sibling parent "bearophile" <bearophileHUGS lycos.com> writes:
Rob T:

 What you are saying is that while the GC is considered 
 optional, it is not really optional given the language as a 
 whole, only a (I assume large) subset of the language will work 
 without the GC. In other words, the GC is partly optional.
Technical users get angry when they uncover marketing lies in technical documentation. It's much better to tell them the truth from the beginning.

Bye,
bearophile
Jan 07 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 07, 2013 at 11:26:02PM +0100, Rob T wrote:
 On Monday, 7 January 2013 at 17:19:25 UTC, Jonathan M Davis wrote:
[...]
You _can_ program in D without the GC, but you lose features, and
there's no way around that. It may be the case that some features
currently require the GC when they shouldn't, but there are
definitely features that _must_ have the GC and _cannot_ be
implemented otherwise (e.g. array concatenation and closures).
Is this a hard fact, or can there be a way to make it work? For example what about the custom allocator idea?
Some features of D were *designed* with a GC in mind. As Jonathan has already said, array slicing, concatenation, etc., pretty much *require* a GC. I don't see how else you could implement code like this:

    int[] f(int[] arr) {
        assert(arr.length >= 4);
        return arr[2..4];
    }

    int[] g(int[] arr) {
        assert(arr.length >= 2);
        return arr[0..2];
    }

    int[] h(int[] arr) {
        assert(arr.length >= 3);
        if (arr[0] > 5)
            return arr[1..3];
        else
            return arr[2..3] ~ 6;
    }

    void main() {
        int[] arr = [1,2,3,4,5,6,7,8];
        auto a1 = f(arr[1..5]);
        auto a2 = g(arr[3..$]);
        auto a3 = h(arr[0..6]);
        a2 ~= 123;

        // Exercise for the reader: write manual deallocation
        // for this code.
    }

Yes, this code *can* be rewritten to use manual allocation, but it will be a major pain in the neck (not to mention likely inefficient, due to the overhead of tracking where each array slice went, whether a reallocation was needed, and what must be freed at the end). Worse, h() makes it impossible for the compiler to statically keep track of what's going on (it will reallocate the array or not depending on runtime data, for example). So you're pretty much screwed if you don't have a GC.

To make it possible to do without the GC at the language level, you'd have to cripple most of the main selling points of D arrays, so that they become nothing more than C arrays with fancy syntax, along with all the nasty caveats that made C arrays (esp. strings) so painful to work with. In particular, h() would require manual re-implementation and a major API change (it needs to somehow return a flag to indicate whether or not the input array was reallocated), along with all the code that calls it (check the flag, then decide based on where a whole bunch of other pointers are pointing whether the input array needs to be deallocated, etc. -- all the usual daily routine of a C programmer's painful life). This cannot feasibly be automated, which means it can't be done by the compiler, which means using D doesn't really give you any advantage here, and therefore you might as well just write it in straight C to begin with.
 From a marketing POV, if the language can be made 100% free of the GC
 it would at least not be a deterrent to those who cannot accept having
 to use one. From a technical POV, there are definitely many situations
 where not using a GC is desirable.
[...] I think much of the aversion to GCs is misplaced. I used to be very averse to GCs as well, so I totally understand where you're coming from. I used to believe that GCs are for lazy programmers who can't be bothered to think through their code and how to manage memory properly, and that therefore GCs encourage sloppy coding. But then, after having used D extensively for my personal projects, I discovered to my surprise that having a GC actually *improved* the quality of my code -- it's much more readable because I don't have to keep fiddling with pointers and ownership (or worse, reference counts), and I can actually focus on making the algorithms better. Not to mention the countless frustrating hours spent chasing pointer bugs and memory leaks are gone -- 'cos I don't have to use pointers directly anymore.

As for performance, I have not noticed any significant performance problems with using a GC in my D code. I know there are cases where the intermittent pause of the GC's mark-and-sweep cycle may not be acceptable, but I suspect that 90% of applications don't even need to care about this. Most applications won't even have any noticeable pauses.

The most prominent case where this *does* matter is in game engines, which must squeeze every last drop of performance from the hardware, no matter what. But then, when you're coding a game engine, you aren't writing general application code per se; you're engineering a highly polished and meticulously tuned codebase where all data structures are already carefully controlled and mapped out -- IOW, you wouldn't be using GC-dependent features of D in this code anyway, so it shouldn't even be a problem. The problem case comes when you have to interface this highly optimized core with application-level code, like in-game scripting or what-not. I see a lot of advantages in separating the scripting engine out into its own process, apart from the high-performance video/whatever-handling code, so you can have the GC merrily doing its thing in the scripting engine (targeted at script writers and level designers, who aren't into doing pointer arithmetic to get the highest polygon rates from the video hardware) without affecting the GC-independent core at all. So you get the best of both worlds.

Crippling the language to cater to the 10% crowd who want to squeeze every last drop of performance from the hardware is the wrong approach IMO.

T

--
"Life is all a great joke, but only the brave ever get the point." -- Kenneth Rexroth
Jan 07 2013
next sibling parent reply "Rob T" <rob ucora.com> writes:
Yes I can see in your example why removing the GC fully will be 
difficult to deal with.

I am not actually against the use of the GC, I was only wondering 
if it could be fully removed. I too did not at first agree with 
the GC concept, thinking the same things you mention. I still 
have to consider performance issues caused by the GC, but the 
advantage is that I can do things that before I would not even 
bother attempting because the cost was too high. The way I 
program has changed for the better, there's no doubt about it.

So if the GC cannot be removed fully, then there's no point 
trying to fully remove it, and performance issues have to be 
solved through improving the GC implementation, and also with 
better selective manual control methods.

As for the claims made that D's GC is "optional", that message is 
coming from various sources one encounters when reading about D 
for the first time.

For example:
http://www.drdobbs.com/tools/new-native-languages/232901643
"D has grown to embrace a wide range of features — optional 
memory management (garbage collection), ..."

Sure you can "optionally" disable the GC, but it means certain 
fundamental parts of the language will no longer be usable, 
leading to misconceptions that the GC is fully optional and 
everything can be made to work as before.

I know D's documentation is *not* claiming that the GC is 
optional, you get that impression from reading external sources 
instead, however it may be a good idea to counter the possible 
misconception in the FAQ.

Improved documentation will also help those who want to do 
selective manual memory management. As it is, I cannot say for 
certain which parts of the language require the use of the GC, 
because the specification either leaves this information out or 
does not state it clearly enough.

--rt
Jan 07 2013
next sibling parent reply Brad Roberts <braddr slice-2.puremagic.com> writes:
On Tue, 8 Jan 2013, Rob T wrote:

 I am not actually against the use of the GC, I was only wondering if it could
 be fully removed. I too did not at first agree with the GC concept, thinking
 the same things you mention. I still have to consider performance issues
 caused by the GC, but the advantage is that I can do things that before I
 would not even bother attempting because the cost was too high. The way I
 program has changed for the better, there's no doubt about it.
There are some issues that can rightfully be termed "caused by the GC", but most of the performance issues are probably better labeled "egregious use of short-lived allocations", which costs performance regardless of how memory is managed. The key difference is that with manual management the impact is spread out, while with periodic garbage collection it's batched up.

My primary point being: blaming the GC, when it's the application style that generates enough garbage to make you want to blame the GC for the performance cost, is misplaced blame.

My 2 cents,
Brad
Jan 07 2013
next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 8 January 2013 at 02:06:02 UTC, Brad Roberts wrote:
 On Tue, 8 Jan 2013, Rob T wrote:

 I am not actually against the use of the GC, I was only 
 wondering if it could
 be fully removed. I too did not at first agree with the GC 
 concept, thinking
 the same things you mention. I still have to consider 
 performance issues
 caused by the GC, but the advantage is that I can do things 
 that before I
 would not even bother attempting because the cost was too 
 high. The way I
 program has changed for the better, there's no doubt about it.
There's some issues that can rightfully be termed "caused by the GC", but most of the performance issues are probably better labled "agregious use of short lived allocations", which cost performance regardless of how memory is managed. The key difference being that in manual management the impact is spread out and in periodic garbage collection it's batched up. My primary point being, blaming the GC when it's the application style that generates enough garbage to result in wanting to blame the GC for the performance cost is misplaced blame. My 2 cents, Brad
You'll also find out that D's GC is kind of slow, but this is an implementation issue more than a conceptual problem with the GC.
Jan 07 2013
prev sibling next sibling parent "Rob T" <rob ucora.com> writes:
On Tuesday, 8 January 2013 at 02:06:02 UTC, Brad Roberts wrote:
 There's some issues that can rightfully be termed "caused by 
 the GC", but
 most of the performance issues are probably better labled 
 "agregious use
 of short lived allocations", which cost performance regardless 
 of how
 memory is managed.  The key difference being that in manual 
 management the
 impact is spread out and in periodic garbage collection it's 
 batched up.

 My primary point being, blaming the GC when it's the 
 application style
 that generates enough garbage to result in wanting to blame the 
 GC for the
 performance cost is misplaced blame.

 My 2 cents,
 Brad
There's more to it than just the jerkiness caused by batching. The GC will do collection runs at inappropriate times, and that can cause slowdowns well in excess of an otherwise identical application with manual memory management. For example, I've seen a 3x performance penalty caused by the GC doing collection runs at the wrong times. The fix required manually disabling the GC at certain points and re-enabling it afterwards.

The 2 or 3 lines of extra code I inserted to fix the 3x performance penalty were a lot easier than full manual management, but it means that you cannot sit back and expect the GC to always do the right thing.

--rt
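For reference, the pattern being described amounts to roughly this (using core.memory.GC; the function here is just a placeholder):

    import core.memory : GC;

    void timingCriticalSection()
    {
        GC.disable();              // no collection cycles will run in here
        scope (exit) GC.enable();  // re-enable even if an exception is thrown

        // ... allocation-light, latency-sensitive work ...
    }

    // A GC.collect() can then be issued manually at a convenient time.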
Jan 07 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/7/2013 6:23 PM, Brad Roberts wrote:
 My primary point being, blaming the GC when it's the application style
 that generates enough garbage to result in wanting to blame the GC for the
 performance cost is misplaced blame.
True dat. There is no such thing as a memory allocation technology that will enable users to code without thought of it and yet get optimal performance.
Jan 08 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 08, 2013 at 02:57:31AM +0100, Rob T wrote:
[...]
 So if the GC cannot be removed fully, then there's no point trying
 to fully remove it, and performance issues have to be solved through
 improving the GC implementation, and also with better selective
 manual control methods.
I know people *have* tried to use D without GC-dependent features; it would be great if this information can be collected in one place and put into the official docs. That way, people who are writing game engines or real-time code know what to do, and the other 90% of coders can just continue using D as before.
 As for the claims made that D's GC is "optional", that message is
 coming from various sources one encounters when reading about D for
 the first time.
 
 For example:
 http://www.drdobbs.com/tools/new-native-languages/232901643
 "D has grown to embrace a wide range of features — optional memory
 management (garbage collection), ..."
 
 Sure you can "optionally" disable the GC, but it means certain
 fundamental parts of the language will no longer be usable, leading
 to misconceptions that the GC is fully optional and everything can
 be made to work as before.
Does Dr. Dobb's allow revisions to previously published articles? If not, the best we can do is to update our own docs to address this issue.
 I know D's documentation is *not* claiming that the GC is optional,
 you get that impression from reading external sources instead, however
 it may be a good idea to counter the possible misconception in the
 FAQ.
Yeah that's a good idea.
 Improved documentation will also help those who want to do selective
 manual memory management. As it is, I cannot say for certain what
 parts of the language require the use of the GC because the
 specification either leaves this information out, or is not specified
 clearly enough.
[...] I don't know if I know them all, but certainly the following are GC-dependent:

- Slicing/appending arrays (which includes a number of string operations), .dup, .idup;
- Delegates & anything requiring access to local variables after the containing scope has exited;
- Built-in AAs;
- Classes (though I believe it's possible to manually manage memory for classes via Phobos' emplace), including exceptions (IIRC);
- std.container (IIRC Andrei was supposed to work on an allocator model for it so that it's usable without a GC).

AFAIK, the range-related code in Phobos has been under scrutiny to contain no hidden allocations (hence the use of structs instead of classes for various range constructs). So unless there are bugs, std.range and std.algorithm should be safe to use without involving the GC.

Static arrays are GC-free, and so are array literals (I *think*), as long as you don't do any memory-related operation on them like appending or .dup'ing. So strings should still be somewhat usable, though quite limited. I don't know whether std.format (including writefln & friends) invokes the GC -- I think it does, under the hood. So writefln may not be usable, or maybe it's just certain format strings that can't be used, and if you're careful you may be able to pull it off without touching the GC.

AA literals are NOT safe, though -- anything to do with built-in AAs will involve the GC. (I have an idea that may make AA literals usable without runtime allocation -- but CTFE is still somewhat limited right now, so my implementation doesn't quite work yet.)

But yeah, it would be nice if the official docs could indicate which features are GC-dependent.

T

--
Latin's a dead language, as dead as can be; it killed off all the Romans, and now it's killing me! -- Schoolboy
Jan 07 2013
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/08/2013 06:30 AM, H. S. Teoh wrote:
 ...

 I don't know if I know them all, but certainly the following are
 GC-dependent:

 - Slicing/appending arrays (which includes a number of string
    operations), .dup, .idup;
Slicing is not GC-dependent.
 - Delegates & anything requiring access to local variables after the
    containing scope has exited;
Yes, but this does not make delegate literals useless without a GC. scope delegate literals do not allocate but instead point directly to the stack.
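A small illustration of that point (a sketch; with the parameter marked scope, the compiler does not heap-allocate a closure for the literal, at least in current DMD):

    // The callee promises not to let the delegate escape.
    void each(scope void delegate(int) dg)
    {
        foreach (i; 0 .. 3)
            dg(i);
    }

    void caller()
    {
        int sum;                      // stays on the caller's stack
        each((int i) { sum += i; });  // no GC allocation for the context
        assert(sum == 3);
    }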
 - Built-in AA's;
 - Classes (though I believe it's possible to manually manage memory for
    classes via Phobos' emplace), including exceptions (IIRC);
Classes are not GC-dependent at all. 'new'-expressions are GC-dependent. (though I think DMD still allows overloading them.)
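A minimal sketch of doing exactly that, assuming std.conv.emplace and a druntime where destroy is available (it is called clear in older releases); the names are illustrative:

    import core.stdc.stdlib : malloc, free;
    import std.conv : emplace;

    class Foo
    {
        int x;
        this(int x) { this.x = x; }
    }

    // Construct a Foo in malloc'ed memory instead of via 'new'.
    Foo makeFoo(int x)
    {
        enum size = __traits(classInstanceSize, Foo);
        void[] mem = malloc(size)[0 .. size];
        return emplace!Foo(mem, x);
    }

    // Tear it down again by hand.
    void disposeFoo(Foo f)
    {
        destroy(f);            // runs the destructor, if any
        free(cast(void*) f);
    }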
 - std.container (IIRC Andrei was supposed to work on an allocator model
    for it so that it's usable without a GC)

 AFAIK, the range-related code in Phobos has been under scrutiny to
 contain no hidden allocations (hence the use of structs instead of
 classes for various range constructs). So unless there are bugs,
 std.range and std.algorithm should be safe to use without involving the
 GC.
The choice of structs vs classes is kind of necessary. I do not always want to write 'save' in order not to consume other copies of the range. (InputRangeObject notably gets this wrong, which is why it is not used.)
 Static arrays are GC-free, and so are array literals (I *think*) as long
 as you don't do any memory-related operation on them like appending or
 .dup'ing.
Currently, with DMD, array literals always allocate if they are not static. (Even if they are directly assigned to a static array variable!)
 So strings should be still somewhat usable, though quite
 limited. I don't know if std.format (including writefln & friends)
 invoke the GC -- I think they do,  under the hood. So writefln may not be
 usable, or maybe it's just certain format strings that can't be used,
 and if you're careful you may be able to pull it off without touching
 the GC.
I think they use output ranges under the hood. The main issue is toString(), which inherently requires GC allocation.
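The usual GC-free workaround here is the sink-based toString overload (which std.format recognizes, at least in later releases); a minimal sketch with made-up names:

    struct Id
    {
        uint value;

        // Writes into a caller-supplied sink instead of returning a new string.
        void toString(scope void delegate(const(char)[]) sink) const
        {
            import core.stdc.stdio : snprintf;
            char[16] buf;
            auto n = snprintf(buf.ptr, buf.length, "%u", value);
            sink("Id(");
            sink(buf[0 .. n]);
            sink(")");
        }
    }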
 AA literals are NOT safe, though -- anything to do with built-in AA's
 will involve the GC. (I have an idea that may make AA literals usable
 without runtime allocation -- but CTFE is still somewhat limited right
 now so my implementation doesn't quite work yet.)
I think the fact that it is not possible to save away arbitrary data structures at compile time for runtime use is just a limitation of the current implementation. Anyway, mutable literals require allocation because of aliasing concerns.
 But yeah, it would be nice if the official docs can indicate which
 features are GC-dependent.


 T
Jan 10 2013
prev sibling next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Monday, 7 January 2013 at 23:13:13 UTC, H. S. Teoh wrote:
 ...

 Crippling the language to cater to the 10% crowd who want to 
 squeeze
 every last drop of performance from the hardware is the wrong 
 approach
 IMO.


 T
Agreed.

Having used GC languages for the last decade, I think the cases where manual memory management is really required are very few.

Even if one is forced to do manual memory management instead of using the GC, it is still better to have the GC around than to do everything manually.

But this is based on my experience doing business applications, desktop and server side, or services/daemons. Others' experience may vary.

--
Paulo
Jan 08 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 08, 2013 at 10:29:26AM +0100, Paulo Pinto wrote:
 On Monday, 7 January 2013 at 23:13:13 UTC, H. S. Teoh wrote:
...

Crippling the language to cater to the 10% crowd who want to squeeze
every last drop of performance from the hardware is the wrong
approach IMO.
[...]
 Agreed.
 
 Having used GC languages for the last decade, I think the cases
 where manual memory management is really required are very few.
 
 Even if one is forced to do manual memory management over GC, it is
 still better to have the GC around than do everything manually.
Yes, hence my idea of splitting the performance-critical core of a game engine from the higher-level application stuff (like scripting, etc.) that isn't as performance-critical. The latter would be greatly helped by a GC -- it makes things easier for the scripting people, whereas writing GC-less code demands a certain level of rigor and certainly requires more effort and care than is necessary for the most part.
 But this is based on my experience doing business applications,
 desktop and server side or services/daemons.
[...] Well, business applications and server-side stuff (I assume it's web-based stuff) are exactly the kind of applications that benefit the most from a GC. In my mind, they are just modern incarnations of batch-processing applications, where instant response isn't critical, and so the occasional GC pause is acceptable and, indeed, mostly unnoticeable.

Game engines, OTOH, are a step away from hard real-time applications, where pause-the-world GCs are unacceptable. While it isn't fatal for a game engine to pause every now and then, it is very noticeable and detrimental to the players' experience, so game devs generally shy away from anything that needs to pause the world. For real-time apps, though, it's not only noticeable, it can mean the difference between life and death (e.g., in controllers for medical equipment -- pausing for half a second while the GC runs can mean that the laser burns off stuff that shouldn't be burned off the patient's body).

But then again, considering the bulk of all software being written today, how much code is actually mission-critical real-time apps or game engine cores? I suspect real-time apps are <5% of all software, and while games are a rapidly growing market, I daresay less than 30-40% of game code actually needs to be pauseless (mainly just video-rendering code -- code that handles monster AI, for example, wouldn't fail horribly if it had to take a few extra frames to decide what to do next; in fact, it might even be more realistic that way). Which, in my estimation, probably doesn't account for more than 10% of all software out there. The bulk of software being written today doesn't really need to be GC-less.

T

--
The richest man is not he who has the most, but he who needs the least.
Jan 08 2013
next sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 08.01.2013 16:25, schrieb H. S. Teoh:
 On Tue, Jan 08, 2013 at 10:29:26AM +0100, Paulo Pinto wrote:
 On Monday, 7 January 2013 at 23:13:13 UTC, H. S. Teoh wrote:
 ...

 Crippling the language to cater to the 10% crowd who want to squeeze
 every last drop of performance from the hardware is the wrong
 approach IMO.
[...]
 Agreed.

 Having used GC languages for the last decade, I think the cases
 where manual memory management is really required are very few.

 Even if one is forced to do manual memory management over GC, it is
 still better to have the GC around than do everything manually.
Yes, hence my idea of splitting up the performance-critical core of a game engine vs. the higher-level application stuff (like scripting, etc.) that aren't as performance-critical. The latter would be greatly helped by a GC -- it makes it easier for scripting people to use, whereas writing GC-less code demands a certain level of rigor and certainly requires more effort and care than is necessary for the most part.
 But this is based on my experience doing business applications,
 desktop and server side or services/daemons.
[...] Well, business applications and server-side stuff (I assume it's web-based stuff) are exactly the kind of applications that benefit the most from a GC. In my mind, they are just modern incarnations of batch processing applications, where instant response isn't critical, and so the occasional GC pause is acceptable and, indeed, mostly unnoticeable. Game engines, OTOH, are a step away from hard real-time applications, where pause-the-world GCs are unacceptable. While it isn't fatal for a game engine to pause every now and then, it is very noticeable, and detrimental to the players' experience, so game devs generally shy away from anything that needs to pause the world. For real-time apps, though, it's not only noticeable, it can mean the difference between life and death (e.g., in controllers for medical equipment -- pausing for 1/2 seconds while the GC runs can mean that the laser burns off stuff that shouldn't be burned off the patient's body). But then again, considering the bulk of all software being written today, how much code is actually mission-critical real-time apps or game engine cores? I suspect real-time apps are <5% of all software, and while games are a rapidly growing market, I daresay less than 30-40% of game code actually needs to be pauseless (mainly just video-rendering code -- code that handles monster AI, for example, wouldn't fail horribly if it had to take a few extra frames to decide what to do next -- in fact, it may even be more realistic that way). Which, in my estimation, probably doesn't account for more than 10% of all software out there. The bulk of software being written today don't really need to be GC-less. T
So how much experience do you have with game engine programming to make such statements?

Kind Regards
Benjamin Thaut
Jan 08 2013
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 08, 2013 at 04:31:45PM +0100, Benjamin Thaut wrote:
 Am 08.01.2013 16:25, schrieb H. S. Teoh:
[...]
Game engines, OTOH, are a step away from hard real-time applications,
where pause-the-world GCs are unacceptable. While it isn't fatal for
a game engine to pause every now and then, it is very noticeable, and
detrimental to the players' experience, so game devs generally shy
away from anything that needs to pause the world. For real-time apps,
though, it's not only noticeable, it can mean the difference between
life and death (e.g., in controllers for medical equipment -- pausing
for 1/2 seconds while the GC runs can mean that the laser burns off
stuff that shouldn't be burned off the patient's body).

But then again, considering the bulk of all software being written
today, how much code is actually mission-critical real-time apps or
game engine cores? I suspect real-time apps are <5% of all software,
and while games are a rapidly growing market, I daresay less than
30-40% of game code actually needs to be pauseless (mainly just
video-rendering code -- code that handles monster AI, for example,
wouldn't fail horribly if it had to take a few extra frames to decide
what to do next -- in fact, it may even be more realistic that way).
Which, in my estimation, probably doesn't account for more than 10%
of all software out there. The bulk of software being written today
don't really need to be GC-less.


T
So how much experience do you have with game engine programming to make such statements?
[...] Not much, I'll admit. So maybe I'm just totally off here. But the last two sentences weren't specific to game code, I was just making a statement about software in general. (It would be a gross misrepresentation to claim that only 10% of a game is performance critical!) T -- Many open minds should be closed for repairs. -- K5 user
Jan 08 2013
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 08.01.2013 16:46, schrieb H. S. Teoh:
 So how much experience do you have with game engine programming to
 make such statements?
[...] Not much, I'll admit. So maybe I'm just totally off here. But the last two sentences weren't specific to game code, I was just making a statement about software in general. (It would be a gross misrepresentation to claim that only 10% of a game is performance critical!) T
So, to give a little background about me: I'm currently doing my master's degree in informatics, which is focused on media-related programming (e.g. games, applications with other visual output, mobile apps, etc).

Besides my studies I'm working at Havok, the biggest middleware company in the gaming industry. I've been working there for about a year. I also have some contacts to people working at Crytek.

My impression so far: no one who is writing a triple-A gaming title or engine is even remotely interested in using a GC. Game engine programmers do almost anything to get better performance on a certain platform. There are really elaborate tasks being done just to get 1% more performance. And because of that, a GC is the very first thing every serious game engine programmer will kick. You have to keep in mind that most games run at 30 FPS. That means you only have 33 ms to do everything: rendering, simulating physics, doing the game logic, handling network input, playing sounds, streaming data, and so on. Some games even try to get 60 FPS, which makes it even harder, as you only have 16 ms to compute everything. Everything is performance-critical if you try to achieve that.

I also know that Crytek used Lua for game scripting in Crysis 1. It was one of the reasons they never managed to get it onto the consoles (PS3, Xbox 360). In Crysis 2 they removed all the Lua game logic and wrote everything in C++ to get better performance.

Doing pooling with a GC enabled still wastes a lot of time, because when pooling is used almost all data will survive a collection anyway (because most of it is in pools). So when the GC runs, most of the work it does is wasted, because it's running over instances that are going to survive anyway. Pooling is just another way of manual memory management, and I don't find this a valid argument for using a GC.

Also, my own little test case (a game I wrote for university) has shown that I get a 300% improvement by not using a GC. At the beginning, when I wrote the game, I was convinced that one could make a game work when using a GC with only a little performance impact (5%-10%). I already heavily optimized the game with some background knowledge about how the GC works. I even did some manual memory management for memory blocks that were guaranteed not to contain any pointers to GC data. Despite all this, I got a 300% performance improvement after switching to pure manual memory management and removing the GC from druntime.

If D wants to get into the gaming space, there has to be a GC-free option. Otherwise D will not even be considered when programming languages are evaluated.

Kind Regards
Benjamin Thaut
Jan 08 2013
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2013-01-08 17:12, Benjamin Thaut wrote:

 So to give a little background about me. I'm currently doing my masters
 degree in informatics which is focused on media related programming.
 (E.g. games, applications with other visual output, mobile apps, etc).

 Besides my studies I'm working at Havok, the biggest middle ware company
 in the gaming industry. I'm working there since about a year. I also
 have some contacts to people working at Crytek.
Impressive.
 When D wants to get into the gaming space, there has to be a GC free
 option. Otherwise D will not even be considered when programming
 languages are evaluated.
It seems you already have done great progress in this area, with your fork of druntime and Phobos. -- /Jacob Carlborg
Jan 08 2013
prev sibling next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 08.01.2013 17:12, schrieb Benjamin Thaut:
 Am 08.01.2013 16:46, schrieb H. S. Teoh:
 So how much experience do you have with game engine programming to
 make such statements?
[...] Not much, I'll admit. So maybe I'm just totally off here. But the last two sentences weren't specific to game code, I was just making a statement about software in general. (It would be a gross misrepresentation to claim that only 10% of a game is performance critical!) T
So to give a little background about me. I'm currently doing my masters degree in informatics which is focused on media related programming. (E.g. games, applications with other visual output, mobile apps, etc). Besides my studies I'm working at Havok, the biggest middle ware company in the gaming industry. I'm working there since about a year. I also have some contacts to people working at Crytek. My impression so far: No one who is writing a tripple A gaming title or engine is only remotly interested in using a GC. Game engine programmers almost do anything to get better performance on a certain plattform. There are really elaborate taks beeing done just to get 1% more performance. And because of that, a GC is the very first thing every serious game engine programmer will kick. You have to keep in mind that most games run at 30 FPS. That means you only have 33 ms to do everything. Rendering, simulating physics, doing the game logic, handling network input, playing sounds, streaming data, and so on. Some games even try to get 60 FPS which makes it even harder as you only have 16 ms to compute everything. Everything is performance critical if you try to achive that. I also know that Crytek used Lua for game scripting in Crysis 1. It was one of the reasons they never managed to get it onto the Consoles (ps3, xbox 360). In Crysis 2 they removed all the lua game logic and wrote everything in C++ to get better performance. Doing pooling with a GC enabled, still wastes a lot of time. Because when pooling is used almost all data will survive a collection anyway (because most of it is in pools). So when the GC runs, most of the work it does is wasted, because its running over instances that are going to survive anyway. Pooling is just another way of manual memory management and I don't find this a valid argument for using a GC. Also my own little test case (a game I wrote for university) has shown that I get a 300% improvement by not using a GC. At the beginning when I wrote the game I was convinced that one could make a game work when using a GC with only a little performance impact (10%-5%). I already heavily optimized the game with some background knowdelge about how the GC works. I even did some manual memory mangement for memory blocks that were garantueed to not contain any pointers to GC data. Despite all this I got a 300% performance improvement after swichting to pure manual memory management and removing the GC from druntime. When D wants to get into the gaming space, there has to be a GC free option. Otherwise D will not even be considered when programming languages are evaluated. Kind Regards Benjamin Thaut
Without dismissing your experience in game development, I think it was spoiled by the quality of D's GC.

After all, there are Java VMs driving missiles and ship battle systems, which have even stricter timing requirements.

--
Paulo
Jan 08 2013
parent reply "eles" <eles eles.com> writes:
On Tuesday, 8 January 2013 at 23:10:01 UTC, Paulo Pinto wrote:
 Am 08.01.2013 17:12, schrieb Benjamin Thaut:
 Am 08.01.2013 16:46, schrieb H. S. Teoh:
Without dismissing your experience in game development, I think that your experience was spoiled by D's GC quality. After all, there are Java VMs driving missiles and ship battle systems, which have even higher timer requirements.
Yes, but they are relying on specific, very special constructs of Java, such as:

http://www.rtsj.org/specjavadoc/javax/realtime/NoHeapRealtimeThread.html
http://www.rtsj.org/specjavadoc/javax/realtime/RealtimeThread.html

which have very little, if anything, to do with regular Java.

BTW, since when should a Java programmer be concerned about... heap? What's that? malloc()?

The fact that it uses the same syntax as Java simply does not make it Java, at least in the (regular) JVM sense.
Jan 10 2013
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 10 January 2013 at 09:05:53 UTC, eles wrote:
 On Tuesday, 8 January 2013 at 23:10:01 UTC, Paulo Pinto wrote:
 Am 08.01.2013 17:12, schrieb Benjamin Thaut:
 Am 08.01.2013 16:46, schrieb H. S. Teoh:
Without dismissing your experience in game development, I think that your experience was spoiled by D's GC quality. After all, there are Java VMs driving missiles and ship battle systems, which have even higher timer requirements.
Yes, but they are relying on specific, very special constructs of Java, such as: http://www.rtsj.org/specjavadoc/javax/realtime/NoHeapRealtimeThread.html http://www.rtsj.org/specjavadoc/javax/realtime/RealtimeThread.html which have very little, if any, to do do with regular Java. BTW, since when a Java programmer should be concerned about... heap? What's that? malloc()?
Any developer worth their salt should worry about how their application makes use of the available resources.
 The fact that it uses the same syntax as Java, simply does not 
 mae it Java, at least in the (regular) JVM sense.
The JVM is just one possible implementation of Java, the language. It was quite an unfortunate decision of the Sun marketing team to give the language and the VM the same name; it always leads to a lot of confusion, with people considering the JVM to be the only way to execute Java.

Usually only developers with a compiler-design background end up making the distinction, as languages and implementations are two quite different things.

--
Paulo
Jan 10 2013
prev sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 8 January 2013 at 16:12:41 UTC, Benjamin Thaut wrote:
 My impression so far: No one who is writing a tripple A gaming 
 title or engine is only remotly interested in using a GC. Game 
 engine programmers almost do anything to get better performance 
 on a certain plattform. There are really elaborate taks beeing 
 done just to get 1% more performance. And because of that, a GC 
 is the very first thing every serious game engine programmer 
 will kick. You have to keep in mind that most games run at 30 
 FPS. That means you only have 33 ms to do everything. 
 Rendering, simulating physics, doing the game logic, handling 
 network input, playing sounds, streaming data, and so on.
 Some games even try to get 60 FPS which makes it even harder as 
 you only have 16 ms to compute everything. Everything is 
 performance critical if you try to achive that.
That is a real misrepresentation of reality. Such people do avoid the GC, but simply because they avoid all kinds of allocation altogether, preferring to allocate up front.
Jan 08 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 09, 2013 at 12:21:05AM +0100, deadalnix wrote:
 On Tuesday, 8 January 2013 at 16:12:41 UTC, Benjamin Thaut wrote:
My impression so far: No one who is writing a tripple A gaming
title or engine is only remotly interested in using a GC. Game
engine programmers almost do anything to get better performance on
a certain plattform. There are really elaborate taks beeing done
just to get 1% more performance. And because of that, a GC is the
very first thing every serious game engine programmer will kick.
You have to keep in mind that most games run at 30 FPS. That means
you only have 33 ms to do everything. Rendering, simulating
physics, doing the game logic, handling network input, playing
sounds, streaming data, and so on.
Some games even try to get 60 FPS which makes it even harder as
you only have 16 ms to compute everything. Everything is
performance critical if you try to achive that.
That is a real misrepresentation of the reality. Such people avoid the GC, but simply because they avoid all kind of allocation altogether, preferring allocating up-front.
Interesting, that sounds like the heapless programming of the old, old days, where you map everything out beforehand and limit yourself to using only what is there. Y'know, with fixed-size arrays, stacks, etc., a fixed maximum number of objects, and so on. Not a bad approach if you want to tightly control everything.

Well, except that you do have preallocation, presumably at startup (and between game levels, perhaps?), so it's not as rigid, but during gameplay itself this is pretty much what it amounts to, right?

T

--
Let's not fight disease by killing the patient. -- Sean 'Shaleh' Perry
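For the curious, the preallocation style described above boils down to something like this fixed-size pool (an illustrative sketch only, with made-up names):

    // A fixed pool of T, carved out once, handed out through a free list.
    struct Pool(T, size_t N)
    {
        private T[N] items;          // all storage reserved up front
        private size_t[N] freeList;
        private size_t freeCount;

        void initialize()
        {
            foreach (i; 0 .. N)
                freeList[i] = i;
            freeCount = N;
        }

        T* acquire()
        {
            if (freeCount == 0)
                return null;         // pool exhausted; the caller decides what to do
            return &items[freeList[--freeCount]];
        }

        void release(T* obj)
        {
            freeList[freeCount++] = cast(size_t)(obj - items.ptr);
        }
    }

A static Pool!(Particle, 4096) set up at startup or level load can then serve acquire/release during gameplay with no heap traffic at all.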
Jan 08 2013
parent Benjamin Thaut <code benjamin-thaut.de> writes:
Am 09.01.2013 02:59, schrieb H. S. Teoh:
 On Wed, Jan 09, 2013 at 12:21:05AM +0100, deadalnix wrote:
 On Tuesday, 8 January 2013 at 16:12:41 UTC, Benjamin Thaut wrote:
 My impression so far: No one who is writing a tripple A gaming
 title or engine is only remotly interested in using a GC. Game
 engine programmers almost do anything to get better performance on
 a certain plattform. There are really elaborate taks beeing done
 just to get 1% more performance. And because of that, a GC is the
 very first thing every serious game engine programmer will kick.
 You have to keep in mind that most games run at 30 FPS. That means
 you only have 33 ms to do everything. Rendering, simulating
 physics, doing the game logic, handling network input, playing
 sounds, streaming data, and so on.
 Some games even try to get 60 FPS which makes it even harder as
 you only have 16 ms to compute everything. Everything is
 performance critical if you try to achive that.
That is a real misrepresentation of the reality. Such people avoid the GC, but simply because they avoid all kind of allocation altogether, preferring allocating up-front.
Interesting, that sounds like the heapless programming of the old, old days where you map out everything beforehand, and limit yourself to use only what is there. Y'know, with fixed-sized arrays, stacks, etc., fixed maximum number of objects, etc.. Not a bad approach if you want to tightly control everything. Well, except that you do have preallocation, presumably during runtime at startup (and between game levels, perhaps?), so it's not as rigid, but during gameplay itself this is pretty much what it amounts to, right? T
No, it's not heapless programming. You just try to avoid allocations during gameplay, but they are done if needed (or if there is no time to do it the "clean" way).

Kind Regards
Benjamin Thaut
Jan 09 2013
prev sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 09.01.2013 00:21, schrieb deadalnix:
 That is a real misrepresentation of the reality. Such people avoid the
 GC, but simply because they avoid all kind of allocation altogether,
 preferring allocating up-front.
But in the end they still don't want a GC, correct?

Kind Regards
Benjamin Thaut
Jan 09 2013
parent reply "SomeDude" <lovelydear mailmetrash.com> writes:
On Wednesday, 9 January 2013 at 08:52:37 UTC, Benjamin Thaut 
wrote:
 Am 09.01.2013 00:21, schrieb deadalnix:
 That is a real misrepresentation of the reality. Such people 
 avoid the
 GC, but simply because they avoid all kind of allocation 
 altogether,
 preferring allocating up-front.
But in the end they still don't want a GC, correct? Kind Regards Benjamin Thaut
If everything is preallocated and reused, does it really matter whether there is a GC or not ?
Jan 09 2013
parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, January 09, 2013 22:14:15 SomeDude wrote:
 If everything is preallocated and reused, does it really matter
 whether there is a GC or not ?
It would if the GC were running in the background (which isn't currently the case for D's default GC), but other than that, it would just affect program shutdown, because that's when the GC would actually run. If the GC isn't run, it can't affect anything.

- Jonathan M Davis
Jan 09 2013
prev sibling next sibling parent reply "Peter Alexander" <peter.alexander.au gmail.com> writes:
On Tuesday, 8 January 2013 at 15:27:21 UTC, H. S. Teoh wrote:
 But then again, considering the bulk of all software being 
 written
 today, how much code is actually mission-critical real-time 
 apps or game
 engine cores?
You also need to consider the market for D. Performance is one of D's key selling points. If it had the performance of Python, D would be a much less interesting language, and I honestly doubt anyone would even look at it.

Whether or not the bulk of software written is critically real-time is irrelevant. The question is whether the bulk of software written *in D* is critically real-time. I don't know what the percentage is, but I'd assume it is much higher than for the average piece of software.
Jan 08 2013
parent reply "Rob T" <rob ucora.com> writes:
On Tuesday, 8 January 2013 at 18:35:19 UTC, Peter Alexander wrote:
 You also need to consider the market for D. Performance is one 
 of D's key selling points. If it had the performance of Python 
 then D would be a much less interesting language, and I 
 honestly doubt anyone would even look at it.

 Whether or not the bulk of software written is critically 
 real-time is irrelevant. The question is whether the bulk of 
 software written *in D* is critically real-time. I don't know 
 what the % is, but I'd assume it is much larger than the 
 average piece of software.
Well, I for one looked at D *only* because the specifications claimed you'd get performance comparable with C/C++, which is about as good as it gets, and also that I could compile standalone executables not dependent on a separate runtime environment. The fact that I can integrate systems-level code along with high-level code in a seamless and safe way sealed the deal.

The only major thing that concerns me is the lack of proper shared library support. I hope this omission is resolved soon.

--rt
Jan 08 2013
parent reply "David Nadlinger" <see klickverbot.at> writes:
On Tuesday, 8 January 2013 at 23:12:43 UTC, Rob T wrote:
 The only major thing that concerns me is the lack of proper 
 shared library support. I hope this omission is resolved soon.
What do you need it for? Runtime loading of D shared objects? Or just linking to them (i.e. binding by ld/dyld at load time)?

I'm trying to collect data on real-world use cases and expectations right now.

David
Jan 08 2013
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/8/13 3:30 PM, David Nadlinger wrote:
 On Tuesday, 8 January 2013 at 23:12:43 UTC, Rob T wrote:
 The only major thing that concerns me is the lack of proper shared
 library support. I hope this omission is resolved soon.
What do you need it for? Runtime loading of D shared objects? Or just linking to them (i.e. binding by ld/dyld at load time)? I'm trying to collect data on real-world use cases resp. expectations right now. David
I really need real runtime loading of D code (i.e. dlopen and friends) for a project I can't share much about for the time being.

Andrei
Jan 08 2013
prev sibling next sibling parent reply "Rob T" <rob ucora.com> writes:
On Tuesday, 8 January 2013 at 23:30:34 UTC, David Nadlinger wrote:
 On Tuesday, 8 January 2013 at 23:12:43 UTC, Rob T wrote:
 The only major thing that concerns me is the lack of proper 
 shared library support. I hope this omission is resolved soon.
What do you need it for? Runtime loading of D shared objects? Or just linking to them (i.e. binding by ld/dyld at load time)? I'm trying to collect data on real-world use cases resp. expectations right now. David
I *really* need runtime loading of plug-in code for a server application. This allows the server code to remain untouched while allowing extensions to be added on by a 3rd party. Runtime linked shared libs are also nice to have for the simple reason that shared libraries can be updated (to a point) without having to rebuild and relink all applications that make use of the libraries. There are pros and cons to static vs dynamic linking, but definitely both are useful to have. I'm very surprised that not too many people have been screaming for dynamic linking and runtime loading. It's very hard for me to imagine not having the feature because it's so darn useful and an essential feature if your strategy is to allow 3rd parties to extend an application without hacking at the source code. If there's another better way, I'd sure like to know about it! --rt
Jan 08 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 09, 2013 at 03:49:33AM +0100, Rob T wrote:
 On Tuesday, 8 January 2013 at 23:30:34 UTC, David Nadlinger wrote:
On Tuesday, 8 January 2013 at 23:12:43 UTC, Rob T wrote:
The only major thing that concerns me is the lack of proper
shared library support. I hope this omission is resolved soon.
What do you need it for? Runtime loading of D shared objects? Or just linking to them (i.e. binding by ld/dyld at load time)? I'm trying to collect data on real-world use cases resp. expectations right now. David
I *really* need runtime loading of plug-in code for a server application. This allows the server code to remain untouched while allowing extensions to be added on by a 3rd party. Runtime linked shared libs are also nice to have for the simple reason that shared libraries can be updated (to a point) without having to rebuild and relink all applications that make use of the libraries. There are pros and cons to static vs dynamic linking, but definitely both are useful to have. I'm very surprised that not too many people have been screaming for dynamic linking and runtime loading. It's very hard for me to imagine not having the feature because it's so darn useful and an essential feature if your strategy is to allow 3rd parties to extend an application without hacking at the source code.
I haven't been screaming yet because (so far) I haven't gotten to writing applications that need dynamic loading in D. But I did tell myself that I will be screaming when I do get to it, because it's a pain to have to recompile the entire application just to add a single addon.
 If there's another better way, I'd sure like to know about it!
[...] Another way, yes. Better, I don't know. You *could* load plugins as separate processes and communicate via some kind of IPC mechanism, like Unix pipes. But that's a royal pain (requires serialization / deserialization, with the associated overhead, and a network stack or equivalent, just to interface with each other). T -- Latin's a dead language, as dead as can be; it killed off all the Romans, and now it's killing me! -- Schoolboy
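For illustration, a minimal sketch of what such a pipe-based plugin host could look like in D, using std.process.pipeProcess; the "./plugin" executable and its one-line-per-request protocol are made up for the example:

import std.process : pipeProcess, wait, Redirect;
import std.stdio : writeln;
import std.string : strip;

void main()
{
    // Launch the plugin as a child process with stdin/stdout redirected.
    auto pipes = pipeProcess(["./plugin"], Redirect.stdin | Redirect.stdout);
    scope(exit) wait(pipes.pid);

    // The "serialization" here is just a line-oriented text protocol.
    pipes.stdin.writeln("compute 42");
    pipes.stdin.flush();
    pipes.stdin.close();                 // no more requests

    auto answer = pipes.stdout.readln().strip();
    writeln("plugin said: ", answer);
}

Every request crosses a process boundary, which is exactly the serialization and interfacing overhead described above.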
Jan 08 2013
parent reply "Rob T" <rob ucora.com> writes:
On Wednesday, 9 January 2013 at 03:05:34 UTC, H. S. Teoh wrote:
 I haven't been screaming yet because (so far) I haven't gotten 
 to
 writing applications that need dynamic loading in D. But I did 
 tell
 myself that I will be screaming when I do get to it, because 
 it's a pain
 to have to recompile the entire application just to add a 
 single addon.
Unfortunately for me, I'm getting to that point.
 If there's another better way, I'd sure like to know about it!
[...] Another way, yes. Better, I don't know. You *could* load plugins as separate processes and communicate via some kind of IPC mechanism, like Unix pipes. But that's a royal pain (requires serialization / deserialization, with the associated overhead, and a network stack or equivalent, just to interface with each other). T
The messaging concept does have some advantages. For example, if your external "plugin" fails for any reason (for example due to a segfault), the rest of your main application can continue operating just fine. There are other advantages, such as distributed processing, but whether you can benefit from those advantages depends entirely on what you are attempting to achieve. The simplest use case that gives you a good subset of the main advantages is through runtime-loaded plugins. In my case, I will be making use of both methods, but I really prefer having the plug-in capability to start with. --rt
Jan 09 2013
parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 09, 2013 at 09:03:17PM +0100, Rob T wrote:
 On Wednesday, 9 January 2013 at 03:05:34 UTC, H. S. Teoh wrote:
[...]
You *could* load plugins as separate processes and communicate via
some kind of IPC mechanism, like Unix pipes. But that's a royal pain
(requires serialization / deserialization, with the associated
overhead, and a network stack or equivalent, just to interface with
each other).
[...] Another disadvantage is the need for every plugin to have a runtime, since they'll need to start on their own (as a process, that is, even if it's the parent app that actually exec()s the process).
 The messaging concept does have some advantages. For example, if
 your external "plugin" fails for any reason (for example due to a
 segfault), the rest of your main application can continue operating
 just fine.
Hmm. I was leaning against using separate processes, but this may just convince me to favor it. I have been the victim of way too many browser plugin malfunctions, which inevitably bring down the entire browser. Recent browsers started using plugin wrappers that can handle plugin crashes, etc., which is an indication of the need for this kind of insulation. The fact that plugin wrappers don't always work very well is another factor that's pushing me in favor of using completely separate processes.
 There are other advantages, such as distributed processing, but if you
 can benefit from those advantages depends entirely on what you are
 attempting to achieve. The simplest use case that gives you a good
 subset of the main advantages, is through runtime loaded plugins.
[...] True. Using separate processes also incurs performance overhead -- for example, I have a C++ program that dynamically loads plugins that compute the value of some opaque mathematical function, from which a rather large sample set is repeatedly taken. The reason I elected to use plugins is because the precompiled computation code runs faster -- every sample incurs just the cost of a single function call. If I were forced to make these plugins external processes, the performance hit would be so bad that I might as well just resort to using a boring old expression interpreter instead. T -- MACINTOSH: Most Applications Crash, If Not, The Operating System Hangs
Jan 09 2013
prev sibling next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 9 January 2013 at 02:49:34 UTC, Rob T wrote:
 On Tuesday, 8 January 2013 at 23:30:34 UTC, David Nadlinger 
 wrote:
 On Tuesday, 8 January 2013 at 23:12:43 UTC, Rob T wrote:
 The only major thing that concerns me is the lack of proper 
 shared library support. I hope this omission is resolved soon.
What do you need it for? Runtime loading of D shared objects? Or just linking to them (i.e. binding by ld/dyld at load time)? I'm trying to collect data on real-world use cases resp. expectations right now. David
I *really* need runtime loading of plug-in code for a server application. This allows the server code to remain untouched while allowing extensions to be added on by a 3rd party. Runtime linked shared libs are also nice to have for the simple reason that shared libraries can be updated (to a point) without having to rebuild and relink all applications that make use of the libraries. There are pros and cons to static vs dynamic linking, but definitely both are useful to have. I'm very surprised that not too many people have been screaming for dynamic linking and runtime loading. It's very hard for me to imagine not having the feature because it's so darn useful and an essential feature if your strategy is to allow 3rd parties to extend an application without hacking at the source code. If there's another better way, I'd sure like to know about it! --rt
You can write plugins without dynamic loading; they are just a bit more cumbersome to write. That's how I used to do it back in the late 90's, by making use of UNIX's IPC.

The IPC to use (shared memory, pipes, mailboxes, sockets) depends on what is required from the plugin, with shared memory being the closest to what dynamic loading achieves.

Of course this raises another set of issues, like:

- Take care what happens when a plugin dies
- Too many loaded plugins can stress the scheduler
- You might have synchronization issues when using shared memory
- It is a bit more painful to code for

This is the school of thought of the Plan9/Go guys and of most operating system micro-kernel architectures.

-- Paulo
Jan 09 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 08:46:20 UTC, Paulo Pinto wrote:
 You can write plugins without dynamic loading, they just are a 
 bit more cumbersome to write.

 Like I used to do back in the late 90's, by making use of 
 UNIX's IPC.

 The IPC to use (shared memory, pipes, mailbox, sockets) depends 
 on what is required from the plugin. With shared memory being 
 the closest to what dynamic loading achieves.

 Of course this raises another set of issues like:

 - Take care what happens when a plugin dies
 - Too many loaded plugins can stress the scheduler
 - You might have synchronization issues when using shared memory
 - It is a bit more painful to code for

 This is the school of thought of the Plan9/Go guys and most 
 operating system micro-kernel architectures.
Such a solution causes the same kind of issues for the runtime. For instance, how do you handle the garbage collection of the shared memory? What happens if, in the core app, I get the typeid of an object of a type that only exists in the plugin? It doesn't really solve the problems we have.
Jan 09 2013
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 9 January 2013 at 09:35:28 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 08:46:20 UTC, Paulo Pinto wrote:
 You can write plugins without dynamic loading, they just are a 
 bit more cumbersome to write.

 Like I used to do back in the late 90's, by making use of 
 UNIX's IPC.

 The IPC to use (shared memory, pipes, mailbox, sockets) 
 depends on what is required from the plugin. With shared 
 memory being the closest to what dynamic loading achieves.

 Of course this raises another set of issues like:

 - Take care what happens when a plugin dies
 - Too many loaded plugins can stress the scheduler
 - You might have synchronization issues when using shared 
 memory
 - It is a bit more painful to code for

 This is the school of thought of the Plan9/Go guys and most 
 operating system micro-kernel architectures.
Such a solution cause the same kind of issue for the runtime. For instance, how to handle the garbage collection of the shared memory ? What do happen if I get the typeid of an object of a type that only exists in the plugin in the core app ? It quite don't solve problems we have.
Usually you serialize the data into the shared memory when using strongly typed languages, and only use public types defined in the common interface between the main application and the plugins. -- Paulo
Jan 09 2013
prev sibling parent reply "nazriel" <spam dzfl.pl> writes:
On Wednesday, 9 January 2013 at 02:49:34 UTC, Rob T wrote:
 I'm very surprised that not too many people have been screaming 
 for dynamic linking and runtime loading. It's very hard for me 
 to imagine not having the feature because it's so darn useful 
 and an essential feature if your strategy is to allow 3rd 
 parties to extend an application without hacking at the source 
 code.

 If there's another better way, I'd sure like to know about it!

 --rt
There were many people screaming about it; there just isn't anybody who could make it work. Walter claimed that the compiler is shared-lib ready and that it is just druntime that is lacking, and he doesn't have the knowledge to make it work on his own. Sean Kelly is out - he was Walter's bet to make it work. My hope was Martin Nowak; he was working on it, but it seems that he also got busy with other stuff.
Jan 09 2013
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 09.01.2013 12:39, schrieb nazriel:
 On Wednesday, 9 January 2013 at 02:49:34 UTC, Rob T wrote:
 I'm very surprised that not too many people have been screaming for
 dynamic linking and runtime loading. It's very hard for me to imagine
 not having the feature because it's so darn useful and an essential
 feature if your strategy is to allow 3rd parties to extend an
 application without hacking at the source code.

 If there's another better way, I'd sure like to know about it!

 --rt
There were many people screaming about it. Just there is nobody who could make it work. Walter claimed that compiler is shared-lib ready, it is just druntime that is lacking. And he hasn't got knowledge to make it work on his own. Sean Kelly is out - he was Walter's bet to make it work. My hope was Martin Nowak, he was working on it but seems that he also got busy with other stuff
The compiler is not shared-lib ready. At least not on Windows. It does not support exporting data symbols, e.g. export uint g_myGlobal; This is mostly a problem for "hidden" data symbols, like vtables, module info objects, type info objects and other stuff D relies on. Druntime on Windows already handles everything else perfectly (e.g. threads and TLS). Kind Regards Benjamin Thaut
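To illustrate the problem and the kind of workaround it forces (all names below are made up), exported accessor functions can stand in for the data symbol that cannot be exported directly:

module dll_side;

// What one would like to write, but which is the unsupported case on Windows:
// export __gshared uint g_myGlobal;

private __gshared uint g_myGlobal;

// Function exports work, so accessors can stand in for the data symbol.
export extern (C) uint getMyGlobal()
{
    return g_myGlobal;
}

export extern (C) void setMyGlobal(uint value)
{
    g_myGlobal = value;
}

This only sketches a workaround for user code; it obviously doesn't help with the compiler-generated vtables, module info and type info symbols mentioned above.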
Jan 09 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/9/13 4:25 AM, Benjamin Thaut wrote:
 The compiler is not shared-lib ready. At least not on windows. It does
 not support exporting data symbols. E.g.

 export uint g_myGlobal;

 This is mostly a problem for "hidden" data symbols, like vtables, module
 info objects, type info objects and other stuff D relies on.
Are there bugzilla entries for this? Andrei
Jan 09 2013
parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
Am 09.01.2013 16:49, schrieb Andrei Alexandrescu:
 On 1/9/13 4:25 AM, Benjamin Thaut wrote:
 The compiler is not shared-lib ready. At least not on windows. It does
 not support exporting data symbols. E.g.

 export uint g_myGlobal;

 This is mostly a problem for "hidden" data symbols, like vtables, module
 info objects, type info objects and other stuff D relies on.
Are there bugzilla entries for this? Andrei
Yes, it's pretty old too. If you read through the discussion in the ticket and through the code Rainer Schuetze provided, you will have a list of all the issues that need to be fixed for shared dlls to work: http://d.puremagic.com/issues/show_bug.cgi?id=4071 In the following patch: http://d.puremagic.com/issues/attachment.cgi?id=601&action=edit Rainer Schuetze does manual patching for data symbols, but this is hardcoded to only work for his phobos shared dll. The function it is done in is called dll_patchImportRelocations. If I understand DLLs correctly, this should usually be done by the import library that is created by the compiler for a shared dll. Maybe Rainer can shed some more light on this. Kind Regards Benjamin Thaut
Jan 09 2013
parent Rainer Schuetze <r.sagitario gmx.de> writes:
On 09.01.2013 19:57, Benjamin Thaut wrote:
 Am 09.01.2013 16:49, schrieb Andrei Alexandrescu:
 On 1/9/13 4:25 AM, Benjamin Thaut wrote:
 The compiler is not shared-lib ready. At least not on windows. It does
 not support exporting data symbols. E.g.

 export uint g_myGlobal;

 This is mostly a problem for "hidden" data symbols, like vtables, module
 info objects, type info objects and other stuff D relies on.
Are there bugzilla entries for this? Andrei
Yes its pretty old too. If you read through the discussion in the ticket and through the code Rainer Schuetze provided you will have a list of all the issues that need to be fixed for shared dlls to work: http://d.puremagic.com/issues/show_bug.cgi?id=4071
I doubt it is easily mergeable now, but the major points are listed in the bug report. Some of the patches are meant as tests of whether the approach is feasible and should be discussed (like the -exportall switch, which might be exporting a bit too much, but could not be implemented by a def file due to optlink limitations).
 In the following patch:
 http://d.puremagic.com/issues/attachment.cgi?id=601&action=edit
 Rainer Schuetze does manual patching for data symbols. But this is
 hardcoded to only work for his phobos shared dll. The function it is
 done in is called dll_patchImportRelocations. If I understand DLLs
 correctly this should usually be done by the import library that is
 created by the compiler for a shared dll. Maybe Rainer can shed some mor
 light on this.
The import library can only help with function calls, by providing the call target and creating an indirect jump through the import table to the actual function in the other DLL. Data accesses need another indirection through the import table in the code if they want to access the actual data in another DLL. This indirection is not generated by the compiler. That's why a pass is made to patch all relocations into the import table to their respective targets (which also eliminates the call indirections). It also has the benefit of being able to use the same object files for static or dynamic linking. The hardcoding of the DLL name was meant for testing purposes. What's needed is a method to figure out whether the target DLL is written in D and whether the data relocations are actually wrong. That would support sharing data between multiple DLLs as well (it currently only allows sharing objects created in the D runtime). Just to make it clear: I distinguish 2 kinds of DLLs written in D: 1. A DLL that contains the statically linked D runtime and interfaces to other DLLs and the application without sharing ownership (as if all other DLLs are written in C). This works pretty well on Windows (Visual D is such a DLL). 2. A DLL that shares ownership of memory, objects, threads, etc. with the executable and other DLLs if they are also written in D. This is realized by placing the D runtime into its own DLL that is implicitly loaded with the other binary (in contrast to some rumors I remember that on POSIX systems the runtime would be linked into the application image). This is what the patches in the bugzilla entry implement.
Jan 09 2013
prev sibling parent "Rob T" <rob ucora.com> writes:
On Wednesday, 9 January 2013 at 12:25:08 UTC, Benjamin Thaut 
wrote:
 Walter claimed that compiler is shared-lib ready, it is just 
 druntime
 that is lacking. And he hasn't got knowledge to make it work 
 on his own.

 Sean Kelly is out - he was Walter's bet to make it work.

 My hope was Martin Nowak, he was working on it but seems that 
 he also
 got busy with other stuff
The compiler is not shared-lib ready. At least not on windows. It does not support exporting data symbols. E.g. export uint g_myGlobal; This is mostly a problem for "hidden" data symbols, like vtables, module info objects, type info objects and other stuff D relies on. Druntime on windows does already handle everything else pefectly (e.g. threads and TLS) Kind Regards Benjamin Thaut
http://dlang.org/phobos/core_runtime.html I see library load and unload functions, although the required "dlsym" feature is absent. What's the status? Does any of it actually work? --rt
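As a rough sketch of how the pieces could be combined on a POSIX system - Runtime handles load/unload, while symbol lookup falls back to dlsym directly; "libplugin.so" and "plugin_entry" are hypothetical names:

import core.runtime : Runtime;
import core.sys.posix.dlfcn : dlsym;
import std.stdio : writeln;

void main()
{
    void* handle = Runtime.loadLibrary("libplugin.so");
    if (handle is null)
    {
        writeln("could not load plugin");
        return;
    }
    scope(exit) Runtime.unloadLibrary(handle);

    // Linkage details of the entry point are glossed over in this sketch.
    auto entry = cast(int function()) dlsym(handle, "plugin_entry");
    if (entry is null)
    {
        writeln("plugin_entry not found");
        return;
    }
    writeln("plugin returned ", entry());
}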
Jan 09 2013
prev sibling next sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 8 January 2013 at 23:30:34 UTC, David Nadlinger wrote:
 On Tuesday, 8 January 2013 at 23:12:43 UTC, Rob T wrote:
 The only major thing that concerns me is the lack of proper 
 shared library support. I hope this omission is resolved soon.
What do you need it for? Runtime loading of D shared objects? Or just linking to them (i.e. binding by ld/dyld at load time)? I'm trying to collect data on real-world use cases resp. expectations right now. David
1/ Load 3rd party code that the core app has no idea about (plugins). 2/ Reduce the feedback loop for projects that take ages to link during dev.
Jan 08 2013
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, January 09, 2013 00:30:32 David Nadlinger wrote:
 On Tuesday, 8 January 2013 at 23:12:43 UTC, Rob T wrote:
 The only major thing that concerns me is the lack of proper
 shared library support. I hope this omission is resolved soon.
What do you need it for? Runtime loading of D shared objects? Or just linking to them (i.e. binding by ld/dyld at load time)? I'm trying to collect data on real-world use cases resp. expectations right now.
You pretty much need shared libraries for plugins (so, runtime loading of shared libraries), whereas all dynamic linking really does is save disk space. So, I'd consider the loading of shared libraries at runtime to be a necessity for some types of projects whereas for pretty much anything other than embedded systems (which will probably want to use C anyway), the saved disk space part is fairly useless. With the stuff I work on at work, I couldn't use D precisely because we require plugins (think of trying to do something like gstreamer but without plugins - it doesn't work very well). Now, much as I love D, I wouldn't try and get the stuff that I'm doing at work moved over to D, because it's a large, existing, C++ code base which works quite well, and I don't think that switching to D would be worth it, but anyone looking to do similar projects in D would be out of luck right now. - Jonathan M Davis
Jan 09 2013
parent "Rob T" <rob ucora.com> writes:
On Wednesday, 9 January 2013 at 08:59:01 UTC, Jonathan M Davis 
wrote:
 [...] whereas all dynamic linking really does is save disk 
 space.
Saving on disk space is a minor advantage. The main advantage is allowing shared libs to be distributed without having to re-link them in manually. For example, if a bug is fixed in a shared lib, all applications automatically get the bug fix, but with statically linked libs, you have to re-link all the apps that use the lib to gain access to the bug fix. With static linking you also have no easy way to ensure that your apps are all using the most up-to-date version of a lib. Effectively, without dynamic linking, collections of applications, such as operating systems, would be very difficult to deploy and maintain, to the point of being impractical. D is simply a whole lot less useful without full dynamic runtime linking. --rt
Jan 09 2013
prev sibling next sibling parent reply Paulo Pinto <pjmlp progtools.org> writes:
Am 08.01.2013 16:25, schrieb H. S. Teoh:
 On Tue, Jan 08, 2013 at 10:29:26AM +0100, Paulo Pinto wrote:
 On Monday, 7 January 2013 at 23:13:13 UTC, H. S. Teoh wrote:
 ...

 Crippling the language to cater to the 10% crowd who want to squeeze
 every last drop of performance from the hardware is the wrong
 approach IMO.
[...]
 Agreed.

 Having used GC languages for the last decade, I think the cases
 where manual memory management is really required are very few.

 Even if one is forced to do manual memory management over GC, it is
 still better to have the GC around than do everything manually.
Yes, hence my idea of splitting up the performance-critical core of a game engine vs. the higher-level application stuff (like scripting, etc.) that aren't as performance-critical. The latter would be greatly helped by a GC -- it makes it easier for scripting people to use, whereas writing GC-less code demands a certain level of rigor and certainly requires more effort and care than is necessary for the most part.
 But this is based on my experience doing business applications,
 desktop and server side or services/daemons.
[...] Well, business applications and server-side stuff (I assume it's web-based stuff) are exactly the kind of applications that benefit the most from a GC. In my mind, they are just modern incarnations of batch processing applications, where instant response isn't critical, and so the occasional GC pause is acceptable and, indeed, mostly unnoticeable.
Besides Web applications, I also took part in projects that ported high performance C++ daemons to Java. These were servers doing millions of data processing manipulations per second on telecommunication data used in mobile networks. In a famous Finnish/German telecommunications company, lots of server code has been migrated from C++ to Java in recent years. -- Paulo
Jan 08 2013
parent reply "Dan" <dbdavidson yahoo.com> writes:
On Tuesday, 8 January 2013 at 23:04:48 UTC, Paulo Pinto wrote:
 Besides Web applications, I also took part in projects that 
 ported high
 performance C++ daemons to Java.
Curious as to why? What was to be gained/overcome?
 These were servers doing millions of data processing 
 manipulations per
 second of telecommunication data used in mobile networks.
Did it prove a worthwhile move? Did the move relieve any issues with C++? Was GC an issue in the end? Thanks, Dan
Jan 09 2013
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 9 January 2013 at 20:00:28 UTC, Dan wrote:
 On Tuesday, 8 January 2013 at 23:04:48 UTC, Paulo Pinto wrote:
 Besides Web applications, I also took part in projects that 
 ported high
 performance C++ daemons to Java.
Curious as to why? What was to be gained/overcome?
The architects decided it was cool to replace server code done in a mixture of C++/CORBA/SNMP/Perl with simple Java-based servers.
 These were servers doing millions of data processing 
 manipulations per
 second of telecommunication data used in mobile networks.
Did it prove a worthwhile move?
It was a lengthy process with quite some victims along the way, but it did improve the overall quality in the end:

- Way shorter build times;
- Better tooling infrastructure;
- Thanks to the JVM, the application monitoring improved;
- Unit tests, code coverage and static analysis tools are much better;
- Better support for the operating systems in use, without #ifdefs everywhere.

To be honest, it would have been better to refactor the C++ code to improve the overall quality, but there was another issue related to the language change, as you can read in the next answer.
 Did the move relieve any issues with C++?
Yes, because in projects with 300+ person teams not everyone is a C++ guru, and you end up chasing language issues and pointer problems all the time. Not fun when you get a mobile operator complaining that their customers cannot make proper calls. So moving to Java allowed us to improve overall code quality at the time, although I must say it also allowed them to easily outsource parts of the applications to replaceable developers a few years later.
 Was GC an issue in the end?
In the beginning, yes. Most developers did not understand that in GC-aware languages you need to change the way you code, so that was a learning process for many of them. But the JVM has lots of nice monitoring tools to help you track down the issues, which allowed us to refactor the places that were giving problems. Just some small parts were later also written in C, for deployment into network elements that were not capable of running a JVM with the desired hardware requirements. So depending on where in the world you're located, your phone call data might be processed by those applications.
 Thanks,
 Dan
Jan 09 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Tuesday, 8 January 2013 at 15:27:21 UTC, H. S. Teoh wrote:
 But then again, considering the bulk of all software being 
 written
 today, how much code is actually mission-critical real-time 
 apps or game
 engine cores? I suspect real-time apps are <5% of all software, 
 and
 while games are a rapidly growing market, I daresay less than 
 30-40% of
 game code actually needs to be pauseless (mainly just 
 video-rendering
 code -- code that handles monster AI, for example, wouldn't fail
 horribly if it had to take a few extra frames to decide what to 
 do next
 -- in fact, it may even be more realistic that way). Which, in 
 my
 estimation, probably doesn't account for more than 10% of all 
 software
 out there. The bulk of software being written today don't 
 really need to
 be GC-less.
This is a horrible idea from a multiplayer and replay standpoint.
Jan 08 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/7/2013 3:11 PM, H. S. Teoh wrote:
 I think much of the aversion to GCs is misplaced.  I used to be very
 aversive of GCs as well, so I totally understand where you're coming
 from. I used to believe that GCs are for lazy programmers who can't be
 bothered to think through their code and how to manage memory properly,
 and that therefore GCs encourage sloppy coding. But then, after having
 used D extensively for my personal projects, I discovered to my surprise
 that having a GC actually *improved* the quality of my code -- it's much
 more readable because I don't have to keep fiddling with pointers and
 ownership (or worse, reference counts), and I can actually focus on how
 to make the algorithms better. Not to mention the countless frustrating
 hours spent chasing pointer bugs and memory leaks are all gone -- 'cos I
 don't have to use pointers directly anymore.
I had the same experience. For the first half of my programming career, I regarded GC as a crutch for loser programmers. After working on Symantec's Java compiler, and later implementing Javascript, I discovered that I was quite wrong about that, pretty much the same as you. One thing I'd add is that a GC is *required* if you want to have a language that guarantees memory safety, which D aims to do. There's an inescapable reason why manual memory management in D is only possible in system code. Interestingly, carefully written code using a GC can be *faster* than manual memory management, for a number of rather subtle reasons.
Jan 08 2013
next sibling parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright wrote:
 One thing I'd add is that a GC is *required* if you want to 
 have a language that guarantees memory safety
Pardon? shared_ptr anyone? You can totally have a language that only provides new/delete facilities and which only allows access to memory through managed pointers like shared_ptr... without a GC. I don't see where a GC is "required" as you say.
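For what it's worth, the closest D sketch of the same idea - reference counting instead of the tracing GC - would be something like std.typecons.RefCounted, which mallocs and frees its payload deterministically:

import std.stdio : writeln;
import std.typecons : RefCounted;

struct Payload
{
    int value;
}

void main()
{
    auto a = RefCounted!Payload(42);   // count == 1
    {
        auto b = a;                    // count == 2, same payload
        writeln(b.value);
    }                                  // b destroyed, count back to 1
    writeln(a.value);
}   // last copy gone, payload freed deterministically, no collection cycle

RefCounted here plays the role shared_ptr plays in C++: ownership is deterministic and no tracing collector is involved.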
Jan 08 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 06:56:00 UTC, Mehrdad wrote:
 On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright wrote:
 One thing I'd add is that a GC is *required* if you want to 
 have a language that guarantees memory safety
Pardon? shared_ptr anyone? You can totally have a language that only provides new/delete facilities and which only access to memory through managed pointers like shared_ptr... without a GC. I don't see where a GC is "required" as you say.
Such a program is guaranteed to have memory leaks, unless you add a GC on top of the managed pointers.
Jan 08 2013
next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 06:57:34 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 06:56:00 UTC, Mehrdad wrote:
 On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright 
 wrote:
 One thing I'd add is that a GC is *required* if you want to 
 have a language that guarantees memory safety
Pardon? shared_ptr anyone? You can totally have a language that only provides new/delete facilities and which only access to memory through managed pointers like shared_ptr... without a GC. I don't see where a GC is "required" as you say.
Such a program is guaranteed to have memory leak, unless you add a GC on top of the managed pointers.
He said _safety_... memory leaks are perfectly safe. Then again, doesn't Java also have memory leaks when, e.g., you attach an event handler to a static event and never remove it? What exactly does a GC bring to the table here?
Jan 08 2013
prev sibling parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 06:57:34 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 06:56:00 UTC, Mehrdad wrote:
 On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright 
 wrote:
 One thing I'd add is that a GC is *required* if you want to 
 have a language that guarantees memory safety
Pardon? shared_ptr anyone? You can totally have a language that only provides new/delete facilities and which only access to memory through managed pointers like shared_ptr... without a GC. I don't see where a GC is "required" as you say.
Such a program is guaranteed to have memory leak, unless you add a GC on top of the managed pointers.
Oh and you should also take a look at Newlisp
Jan 08 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 07:06:03 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 06:57:34 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 06:56:00 UTC, Mehrdad wrote:
 On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright 
 wrote:
 One thing I'd add is that a GC is *required* if you want to 
 have a language that guarantees memory safety
Pardon? shared_ptr anyone? You can totally have a language that only provides new/delete facilities and which only access to memory through managed pointers like shared_ptr... without a GC. I don't see where a GC is "required" as you say.
Such a program is guaranteed to have memory leak, unless you add a GC on top of the managed pointers.
Oh and you should also take a look at Newlisp
I certainly won't if you don't even bother to explain why I should.
Jan 08 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 07:14:19 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 07:06:03 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 06:57:34 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 06:56:00 UTC, Mehrdad wrote:
 On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright 
 wrote:
 One thing I'd add is that a GC is *required* if you want to 
 have a language that guarantees memory safety
Pardon? shared_ptr anyone? You can totally have a language that only provides new/delete facilities and which only access to memory through managed pointers like shared_ptr... without a GC. I don't see where a GC is "required" as you say.
Such a program is guaranteed to have memory leak, unless you add a GC on top of the managed pointers.
Oh and you should also take a look at Newlisp
I certainly wont if you don't even bother explain why I should.
'cause it's a memory-safe LISP without a GC?
Jan 08 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 07:16:15 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 07:14:19 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 07:06:03 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 06:57:34 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 06:56:00 UTC, Mehrdad wrote:
 On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright 
 wrote:
 One thing I'd add is that a GC is *required* if you want 
 to have a language that guarantees memory safety
Pardon? shared_ptr anyone? You can totally have a language that only provides new/delete facilities and which only access to memory through managed pointers like shared_ptr... without a GC. I don't see where a GC is "required" as you say.
Such a program is guaranteed to have memory leak, unless you add a GC on top of the managed pointers.
Oh and you should also take a look at Newlisp
I certainly wont if you don't even bother explain why I should.
'cause it's memory-safe LISP and without a GC?
"Sharing of sub-objects among objects, cyclic structures, or multiple variables pointing to the same object are not supported in newLISP." Well, you CAN indeed, create a dumbed down language that is memory safe and don't require a GC.
Jan 08 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 07:22:51 UTC, deadalnix wrote:
 Well, you CAN indeed, create a dumbed down language that is 
 memory safe and don't require a GC.
Yeah, that's 1 of my 2 points. The other one you still ignored: the GC doesn't bring much to the table here.
Jan 08 2013
parent reply "Rob T" <rob ucora.com> writes:
On Wednesday, 9 January 2013 at 07:23:57 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 07:22:51 UTC, deadalnix wrote:
 Well, you CAN indeed, create a dumbed down language that is 
 memory safe and don't require a GC.
Yeah, that's 1 of my 2 points. The other one you still ignored: the GC doesn't bring much to
There is a point being made here that is perfectly valid. There is a form of memory leak that a GC can never catch, such as when memory is allocated and simply never deallocated by mistake, due to a persistent "in use" pointer that should have been nulled but wasn't. In addition, the GC implementation itself may fail due to bugs, deallocating live memory or failing to deallocate garbage memory; I've seen bugs described to that effect. There simply is no panacea to the memory leak problem. What a GC does do is free the programmer from a ton of tedium, and even allow for constructs that would normally not be practical to implement, but it can never guarantee anything more than that. --rt
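A tiny example of the first kind of leak - everything below stays reachable through a forgotten global, so no collector can ever free it:

class Session
{
    ubyte[] buffer;
}

Session[] g_cache;   // a root the GC always considers "in use"

void handleRequest()
{
    auto s = new Session;
    s.buffer = new ubyte[](1024 * 1024);
    g_cache ~= s;    // stored by mistake and never removed
}

void main()
{
    foreach (i; 0 .. 100)
        handleRequest();   // memory only grows; collection never helps
}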
Jan 08 2013
next sibling parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 07:33:24 UTC, Rob T wrote:
 On Wednesday, 9 January 2013 at 07:23:57 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 07:22:51 UTC, deadalnix wrote:
 Well, you CAN indeed, create a dumbed down language that is 
 memory safe and don't require a GC.
Yeah, that's 1 of my 2 points. The other one you still ignored: the GC doesn't bring much to
There is a point being made here that is perfectly valid. There is a form of memory leak that a GC can never catch, such as when when memory is allocated and simply never deallocated by mistake due to a persistent "in use" pointer that should have been nulled but wasn't. In addition, the GC itself may fail to deallocated freed memory or even free live memory by mistake. I've seen bugs described to that effect. There simply is no panacea to the memory leak problem. What a GC does do, is free the programmer from a ton of tedium, and even allow for constructs that would normally not be practical to implement, but it can never guarantee anything more than that. --rt
Yup. I might also mention that we implemented a compiler for a subdialect of Python (including full closures) for our compilers course, which the compiler subsequently translated to C++. GC wasn't required (we were allowed to never deallocate), but since I didn't feel like doing that, I added reference counting using a lot of shared_ptr's and intrusive_ptr's. I also added a GC just for the sake of catching cyclic references, but it was just that -- everything was reference counted, so if you never had cyclic references, the GC _never_ kicked in, period. (True, it wouldn't give you the power of a systems language, but that's quite obviously not my point -- the point is that the language we made is a _perfectly possible_ memory-safe language, so I don't understand Walter's comment about a GC being "required" for a memory-safe language.)
Jan 08 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/8/2013 11:42 PM, Mehrdad wrote:
 (True, it wouldn't give you the power of a systems language, but that's quite
 obviouly not my point -- the point is that it's a _perfectly possible_
 memory-safe language which we made, so I don't understand Walter's comment
about
 a GC being "required" for a memory-safe language.)
The misunderstanding is you are not considering reference counting as a form of GC. It is.
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 08:14:35 UTC, Walter Bright wrote:
 On 1/8/2013 11:42 PM, Mehrdad wrote:
 (True, it wouldn't give you the power of a systems language, 
 but that's quite
 obviouly not my point -- the point is that it's a _perfectly 
 possible_
 memory-safe language which we made, so I don't understand 
 Walter's comment about
 a GC being "required" for a memory-safe language.)
The misunderstanding is you are not considering reference counting as a form of GC. It is.
So you would say that C++ code (which uses reference counting) uses garbage collection?
Jan 09 2013
next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 9 January 2013 at 08:28:44 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 08:14:35 UTC, Walter Bright 
 wrote:
 On 1/8/2013 11:42 PM, Mehrdad wrote:
 (True, it wouldn't give you the power of a systems language, 
 but that's quite
 obviouly not my point -- the point is that it's a _perfectly 
 possible_
 memory-safe language which we made, so I don't understand 
 Walter's comment about
 a GC being "required" for a memory-safe language.)
The misunderstanding is you are not considering reference counting as a form of GC. It is.
So you would say that C++ code (which uses reference counting) uses garbage collection?
Yes. Reference counting is a poor man's form of garbage collection and almost every book about garbage collection starts by introducing reference counting, before moving on to mark-and-sweep and all the remaining algorithms. Both fall under the umbrella of automatic memory management. Oh, a bit off topic but are you aware that C++11 has a GC API? -- Paulo
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 08:51:47 UTC, Paulo Pinto wrote:
 On Wednesday, 9 January 2013 at 08:28:44 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 08:14:35 UTC, Walter Bright 
 wrote:
 On 1/8/2013 11:42 PM, Mehrdad wrote:
 (True, it wouldn't give you the power of a systems language, 
 but that's quite
 obviouly not my point -- the point is that it's a _perfectly 
 possible_
 memory-safe language which we made, so I don't understand 
 Walter's comment about
 a GC being "required" for a memory-safe language.)
The misunderstanding is you are not considering reference counting as a form of GC. It is.
So you would say that C++ code (which uses reference counting) uses garbage collection?
Yes.
You (or Walter I guess) are the first person I've seen who calls C++ garbage collected.
 Oh, a bit off topic but are you aware that C++11 has a GC API?
Yes, but I'm not aware of anyone who claims to have written a GC for it.
 --
 Paulo
Jan 09 2013
next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 08:54:11 UTC, Mehrdad wrote:
 Yes, but I'm not aware of any code which claims to have written 
 a GC for it.
A *precise* GC, that is, not a conservative one.
Jan 09 2013
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, January 09, 2013 09:54:10 Mehrdad wrote:
 You (or Walter I guess) are the first person I've seen who calls
 C++ garbage collected.
I sure wouldn't call that garbage collection - not when there's no garbage collector. But Walter has certainly called it that from time to time. I think that the other term that Paulo just used - automatic memory management - is far more accurate. - Jonathan M Davis
Jan 09 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 09:01:46 UTC, Jonathan M Davis 
wrote:
 On Wednesday, January 09, 2013 09:54:10 Mehrdad wrote:
 You (or Walter I guess) are the first person I've seen who 
 calls
 C++ garbage collected.
I sure wouldn't call that garbage collection - not when there's no garbage collector. But Walter has certainly called it that from time to time. I think that the other term that Paulo just used - automatic memory management - is far more accurate. - Jonathan M Davis
I used to think that. But I was convinced otherwise after carefully considering the costs and benefits involved. I see no point in making the distinction anymore. The technical solution you choose for GC implies some tradeoffs, and it is amazing how little it matters whether you refcount or trace. If you want a tracing collector to behave more like reference counting, you'll need to add barriers, so you end up paying the same cost to get the same benefit. The same goes for variations around reference counting, or any mix of both.
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 09:43:01 UTC, deadalnix wrote:
 The technical solution you choose for GC imply some tradeoff, 
 and it amazing to see how the fact that you refcount or trace 
 don't matter that much. If you want a tracing collector to 
 behave more like reference counting, you'll need to add 
 barriers, so you ends up with the same cost to get the same 
 benefit.
A single 100-ms pause is not equivalent to 10,000 0.1-ms pauses for all apps. Just because they have the "same cost" doesn't necessarily imply they're equal.
Jan 09 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 09:50:25 UTC, Mehrdad wrote:
 A single 100-ms pause is not equivalent to 10,000  0.1-ms 
 pauses for all apps.

 Just because they have the "same cost" doesn't necessarily 
 imply they're equal.
You can have a pauseless tracing GC, at the price of barriers + more floating garbage. Reference counting tends to create big pauses when deallocating, as objects tend to die in groups. You can solve that issue by delaying cascading deallocation, which causes more floating garbage.
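A small sketch of the "objects die in groups" point, with the cascade written out by hand; with reference counting the same walk happens implicitly the moment the last reference disappears:

import core.stdc.stdlib : malloc, free;

struct Node
{
    Node* next;
}

Node* buildChain(size_t n)
{
    Node* head = null;
    foreach (i; 0 .. n)
    {
        auto node = cast(Node*) malloc(Node.sizeof);
        node.next = head;
        head = node;
    }
    return head;
}

void releaseChain(Node* head)
{
    // Delaying this walk trades the pause for floating garbage,
    // which is the tradeoff described above.
    while (head !is null)
    {
        auto next = head.next;
        free(head);
        head = next;
    }
}

void main()
{
    auto head = buildChain(1_000_000);
    releaseChain(head);   // one big deterministic pause
}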
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 10:07:42 UTC, deadalnix wrote:
 Reference counting tend to create big pauses when deallocating 
 as objects tends to dies in group.
I don't get it... it's slower to deallocate a bunch of objects together with refcounting than to deallocate all of them individually over a longer period of time?
Jan 09 2013
next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 10:09:42 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 10:07:42 UTC, deadalnix wrote:
 Reference counting tend to create big pauses when deallocating 
 as objects tends to dies in group.
I don't get it... it's slower to deallocate a bunch of objects together with refcounting than to deallocate all of them individually over a longer period of time?
*YOU* mentioned pause time. In regard to pause time, yes, it is very different.
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 10:13:03 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 10:09:42 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 10:07:42 UTC, deadalnix wrote:
 Reference counting tend to create big pauses when 
 deallocating as objects tends to dies in group.
I don't get it... it's slower to deallocate a bunch of objects together with refcounting than to deallocate all of them individually over a longer period of time?
*YOU* mentionned pause time. In regard of pause time, yes, it is very different.
Yes I did, but I didn't realize you were talking about a background GC, sorry. Yeah, if you can have a background GC that can keep up with your needs, then the world is great. Trouble is, I don't see how that can be true for intensive applications like games.
Jan 09 2013
parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 10:19:24 UTC, Mehrdad wrote:
 Yes I did, but I didn't realize you were talking about a 
 background GC, sorry.

 Yeah, if you can have a background GC that can keep up with 
 your needs, then the world is great. Trouble is, I don't see 
 how that can be true for a intensive applications like games.
You are changing the subject discussed every 2 posts. I'm wasting my time here; that will be my last answer to you. For a game, latency is more important than raw throughput. You'd much rather guarantee 60fps than get 100fps with the occasional frame that takes 200ms to compute. For that, a stop-the-world GC and reference counting are both very bad, as both cause large pauses. A concurrent GC is a viable option, granted you don't generate an insane amount of garbage (which is problematic whatever memory management is used anyway). It is even a good way to increase parallelism in the program and to better exploit the resources of a multicore machine. In fact, most performance-critical code in video games avoids memory allocation altogether.
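A minimal sketch of that last point - allocate up front, reuse every frame, so neither the GC nor malloc runs inside the loop (the pool and its size are of course made up):

struct ParticlePool
{
    float[4096] xs;     // fixed storage, no heap allocation at all
    size_t count;

    void clear() { count = 0; }

    bool spawn(float x)
    {
        if (count == xs.length) return false;   // pool exhausted, never reallocates
        xs[count++] = x;
        return true;
    }
}

void main()
{
    ParticlePool pool;                // allocated once, here on the stack
    foreach (frame; 0 .. 60)          // stand-in for the real game loop
    {
        pool.clear();                 // reuse, no allocation
        pool.spawn(frame * 0.016f);
    }
}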
Jan 09 2013
parent "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 10:31:51 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 10:19:24 UTC, Mehrdad wrote:
 Yes I did, but I didn't realize you were talking about a 
 background GC, sorry.

 Yeah, if you can have a background GC that can keep up with 
 your needs, then the world is great. Trouble is, I don't see 
 how that can be true for a intensive applications like games.
You are changing the subject discussed every 2 posts. I'm loosing my time here, that will be my last answer to you.
That wasn't my intention, but the subject suddenly changed to background GCs which I didn't expect to be talking about either... but if you don't have the time to continue then I'll avoid responding too.
Jan 09 2013
prev sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
Am 09.01.2013 11:09, schrieb Mehrdad:
 On Wednesday, 9 January 2013 at 10:07:42 UTC, deadalnix wrote:
 Reference counting tend to create big pauses when deallocating
 as objects tends to dies in group.
I don't get it... it's slower to deallocate a bunch of objects together with refcounting than to deallocate all of them individually over a longer period of time?
could be - think of a large hierarchy of objects which tends to take some time to deconstruct ... a background gc could be better for your application speed in this situation - at the cost of smaller pauses and more resource usage. And why would there be garbage collectors at all if ref-counting were always better (except for cyclic stuff)?
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 10:14:07 UTC, dennis luehring 
wrote:
 Am 09.01.2013 11:09, schrieb Mehrdad:
 On Wednesday, 9 January 2013 at 10:07:42 UTC, deadalnix wrote:
 Reference counting tend to create big pauses when deallocating
 as objects tends to dies in group.
I don't get it... it's slower to deallocate a bunch of objects together with refcounting than to deallocate all of them individually over a longer period of time?
could be - think of an large hierarchy of objects which tends to take some time to deconstruct ... a background gc could be better for your application speed in this situation - by the cost of smaller pause and more resource usage
Come to think of it, C++ allocators are meant for exactly this: throwing away an entire batch of objects in 1 go. Beats GCs any day.

 garbage collectors if ref-counting will be always better 
 (except cyclic stuff)?
Cyclic stuff, and that's it. If you have any other reasons, please show me a benchmark that shows a GC being faster than the equivalent refcounted code (I've seen lots of talk in theory about how it _COULD_ be different, but I've never seen any examples in practice; would love to see one).
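For the batch-deallocation point above, a hedged sketch of a region/arena in D: objects are carved out of one malloc'd block and the whole batch disappears with a single free (alignment handling omitted for brevity):

import core.stdc.stdlib : malloc, free;

struct Region
{
    ubyte* base;
    size_t used, capacity;

    static Region create(size_t bytes)
    {
        return Region(cast(ubyte*) malloc(bytes), 0, bytes);
    }

    void* alloc(size_t bytes)
    {
        if (used + bytes > capacity) return null;   // no growth in this sketch
        auto p = base + used;
        used += bytes;
        return p;
    }

    void releaseAll()   // the "1 go" deallocation
    {
        free(base);
        base = null;
        used = capacity = 0;
    }
}

void main()
{
    auto r = Region.create(1024);
    auto a = cast(int*) r.alloc(int.sizeof);
    auto b = cast(long*) r.alloc(long.sizeof);
    *a = 42;
    *b = 43;
    r.releaseAll();     // everything freed at once
}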
Jan 09 2013
next sibling parent "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 10:21:29 UTC, Mehrdad wrote:
 Come to think of it, C++ allocators are meant for exactly this
s/are meant for/are used for/
Jan 09 2013
prev sibling next sibling parent reply dennis luehring <dl.soluz gmx.net> writes:
Am 09.01.2013 11:21, schrieb Mehrdad:
 Come to think of it, C++ allocators are meant for exactly this:
 throwing away an entire batch of objects in 1 go. Beats GCs any
 day.
but a gc is much more generic than a specialized allocator. Redefine your scenario please: are we talking about many, any, or special program situations? To my understanding there is no one-for-all perfect solution, but many perfect-solution-for-exactly-this-case solutions. That is the reason for having both ref-counting & GCs around.
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 11:10:40 UTC, dennis luehring 
wrote:
 Am 09.01.2013 11:21, schrieb Mehrdad:
 Come to think of it, C++ allocators are meant for exactly this:
 throwing away an entire batch of objects in 1 go. Beats GCs any
 day.
but a gc is much more generic then a specialized allocator redefine you scenario please: are we talking about many,any or special program situations?
We're talking about a language that should be able to handle any realistic situation.
 for my understanding there is no one-for-all-perfect-solution 
 but many perfect-solution-for-excatly-this-case

 that is the reason for having ref-counting & GCs around
Yeah, we agree on that, no discussion there. Speaking of which, I have a feeling what I said didn't send the message I meant: I didn't mean we should reference-count EVERYTHING. Allocators, etc. have their places too -- and they all fall under manual (or automatic, whatever you wish to call it) memory management. My entire point during this discussion has been that you _don't_ _require_ a GC for anything, unlike what Walter said. Manual (/automatic/whatever you want to call it) memory management can take its place just fine.
Jan 09 2013
parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, January 09, 2013 12:16:46 Mehrdad wrote:
 My entire point during this discussion has been that you _don't_
 _require_ a GC for anything, unlike what Walter said. Manual
 (/automatic/whatever you want to call it) memory management can
 take its place just fine.
Walter wasn't arguing that there wasn't a place for manual memory management. He was arguing that you can't guarantee memory safety if you're using manual memory management - hence why malloc and free are @system. @system code is not memory safe like @safe code is, but it still very much has its place. - Jonathan M Davis
Jan 09 2013
parent reply "Rob T" <rob ucora.com> writes:
On Wednesday, 9 January 2013 at 11:24:32 UTC, Jonathan M Davis 
wrote:
 Walter wasn't arguing that there wasn't a place for manual 
 memory management.
 He was arguing that you can't guarantee memory safety if you're 
 using manual
 memory management - hence why malloc and free are  system. 
  system code is not
 memory safe like  safe code is, but it still very much has it's 
 place.

 - Jonathan M Davis
You cannot guarantee memory safety with a GC either, depending on the definition of "memory safety". For example, you can still access deallocated memory by mistake, run out of memory due to accumulating persistent pointers left around by mistake, or free memory that was not supposed to be freed by mistake. The GC implementation may fail due to bugs, deallocating live memory or failing to deallocate inactive memory. The only thing a GC can do for you is free the programmer from the tedium of managing memory. It also allows constructs that otherwise would be very difficult or impractical to implement. The effect can be very positive, but there are no guarantees of memory safety. I also doubt that a "one size fits all" approach to garbage collection will ever satisfy everyone. What's needed is the ability to manage memory both manually and automatically (D allows this, which is good), but also to allow the automated methods to be easily replaced (plug-ins come to mind), and if possible to allow sections of code to be managed by completely different garbage collector implementations designed to serve different purposes (this may or may not be practical to do). --rt
Jan 09 2013
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 09, 2013 at 06:26:55PM +0100, Rob T wrote:
 On Wednesday, 9 January 2013 at 11:24:32 UTC, Jonathan M Davis
 wrote:
Walter wasn't arguing that there wasn't a place for manual memory
management.  He was arguing that you can't guarantee memory safety if
you're using manual memory management - hence why malloc and free are
 system.  system code is not memory safe like  safe code is, but it
still very much has it's place.

- Jonathan M Davis
You cannot guarantee memory safety with a GC either, depending on the definition of "memory safety". For example, you can still access deallocated memory by mistake,
Are you sure about this? Short of doing pointer arithmetic (which is unsafe by definition), I don't see how you can do this. Where would you get the pointer to that memory from? It has to be stored somewhere, meaning the GC will not deallocate the memory pointed to.
 run out of memory due to accumulating persistent pointers left around
 by mistake,
That is not unsafe. That just means you run out of memory and get an OutOfMemory exception or the OS kills your process. No safety issue there.
 or free memory that was not supposed to be freed by mistake.
How? The GC, by definition, doesn't free the memory unless nothing is pointing to it.
 The GC implementation may fail due to bugs, deallocating live memory
 or failing to deallocate inactive memory.
You're conflating the theory of GCs with their implementation, which is prone to bugs. By that argument, you might as well say that there is no such thing as memory safety, because implementations are always prone to bugs. Your RAM could catch fire, for example. There's nothing you can do in software to prevent that. That's not a helpful argument, though.

The question at hand is, given a GC, which presumably has been thoroughly debugged, does it guarantee memory safety? I think it does, because as long as there's a pointer to the memory left, the GC will not collect it, and so you can't use the pointer to access invalid memory. If there are no more pointers to that memory, then by definition you have nothing to access that memory with.

Like I said, you have to get the pointer from *somewhere*. As long as the pointer is somewhere, the GC will find it, and mark the memory as used, so it won't be collected. If there are no pointers left, you also have no way to access that memory anymore. So you can never access invalid memory. It's very straightforward.

This memory safety can be broken only if you allow pointer arithmetic and casts to/from pointers. But everyone knows that pointer arithmetic is unsafe by definition; I don't think we're talking about that here.

[...]
 I also doubt that a "one size fits all" approach to garbage
 collection will ever satisfy everyone.
Of course not. That's why the game devs on this list have been doing GC-less D programming. :-) But that also means they have to either preallocate everything, or they have to manually manage the memory (with malloc/free or equivalent), which is unsafe, because it's very easy to free some memory while there are still pointers to it left.
 What's needed is the ability to both manage memory manually and
 automatically (D allows this which is good), but also allow for the
 automated methods to be easily replaced (plug-ins come to mind), and
 if possible allow sections of code to be managed by completely
 different garbage collector implementations that are designed to serve
 different purposes (this may or may not be practical to do).
[...] I don't know about the practicality of using multiple GCs in a single app, but certainly the ability to choose from a number of alternative GC implementations at compile-time would be very welcome. You could have a GC that has high throughput but introduces intermittent long pauses, and another GC that has lower throughput but has no pauses longer than, say, 0.1 seconds. Then at compile-time you choose one, depending on what you need. An interactive app will choose the second, but a batch processing app will benefit from the first. T -- Why waste time learning, when ignorance is instantaneous? -- Hobbes, from Calvin & Hobbes
Jan 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2013 10:32 AM, H. S. Teoh wrote:
 does it guarantee memory safety?
I think it does,
A GC is a necessary, but not sufficient, condition for memory safety.
Jan 09 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2013 9:26 AM, Rob T wrote:
 You cannot guarantee memory safety with a GC either, depending on the
definition
 of "memory safety".
GC is a necessary requirement for memory safety, but not sufficient.
 For example, you can still access deallocated memory by mistake,
If you're not playing with pointers, then this is a buggy GC.
 run out of
 memory due to accumulating persistent pointers left around by mistake,
Memory safety does not imply never running out of memory.
 or free memory that was not supposed to be freed by mistake.
Then it's a buggy GC.
 The GC implementation may
 fail due to bugs, deallocating live memory or failing to deallocate inactive
 memory.
Of course memory safety presumes a correctly implemented GC.
 The only thing a GC can do for you, is free up the programmer from the tedium
of
 managing memory. It also allows constructs that otherwise would be very
 difficult or impractical to implement. The effect can be very positive, but
 there are no guarantees of memory safety.
This is incorrect - see above. A bug free GC, and not "cheating" in using it, guarantees memory safety. This is a big deal.
Jan 09 2013
next sibling parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 18:46:53 UTC, Walter Bright wrote:
 GC is a necessary requirement for memory safety, but not 
 sufficient.
Walter, would you mind explaining WHY it's "necessary"? I just spent so many comments explaining why NO form of automatic memory management is required for guaranteeing memory safety () and then you reply and say "GC is a necessary requirement" and leave it at that. See my comment here regarding handles, etc.: http://forum.dlang.org/thread/mailman.232.1357570887.22503.digitalmars-d puremagic.com?page=7#post-jimseaovuxmribkqbict:40forum.dlang.org
Jan 09 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/9/13 11:18 AM, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 18:46:53 UTC, Walter Bright wrote:
 GC is a necessary requirement for memory safety, but not sufficient.
Walter, would you mind explaining WHY it's "necessary"? I just spent so many comments explaining why NO form of automatic memory management is required for guaranteeing memory safety () and then you reply and say "GC is a necessary requirement" and leave it at that. See my comment here regarding handles, etc.: http://forum.dlang.org/thread/mailman.232.1357570887.22503.digitalmars-d puremagic.com?page=7#post-jimseaovuxmribkqbict:40forum.dlang.org
This is true but uninteresting. Entire classes of languages can be made memory-safe without garbage collection, such as many Turing incomplete languages, languages without referential structures, languages that don't expose pointers (such as your example) and more. At the end of the day if references are part of the language and programs can build arbitrary reference topologies, safety entails GC. Andrei
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 19:34:19 UTC, Andrei Alexandrescu 
wrote:
 At the end of the day if references are part of the language 
 and programs can build arbitrary reference topologies, safety 
 entails GC.
It looks like a non sequitur to me... wouldn't this work?

A language X has a built-in data type called Reference, and no classes. The only thing you can do with it is use these functions:

Reference CreateObject(Reference typename);
Reference DeleteValue(Reference object, Reference field);
Reference GetValue(Reference object, Reference field);
Reference SetValue(Reference object, Reference field, Reference value);

Given _just_ these functions you can build _any_ arbitrary reference topology whatsoever. There's no need for a GC to be running, and it's completely manual memory management. It's memory-safe too. What am I missing here?
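A rough D rendering of that hypothetical language X (Store, Handle and the method names below are all made up for illustration): handles are opaque indices and slots are never reused, so a stale handle can fail an assertion but can never alias some other object's memory.

struct Handle { size_t id; }

struct Store
{
    private int[] values;   // one int "field" per object, to keep it short
    private bool[] alive;

    Handle createObject()
    {
        values ~= 0;
        alive  ~= true;
        return Handle(values.length - 1);
    }

    // Manual reclamation: the slot is retired and never handed out again.
    void deleteObject(Handle h) { alive[h.id] = false; }

    int getValue(Handle h)
    {
        assert(alive[h.id], "stale handle");
        return values[h.id];
    }

    void setValue(Handle h, int v)
    {
        assert(alive[h.id], "stale handle");
        values[h.id] = v;
    }
}

void main()
{
    Store s;
    auto obj = s.createObject();
    s.setValue(obj, 42);
    assert(s.getValue(obj) == 42);
    s.deleteObject(obj);   // no GC involved; a later getValue(obj) would assert
}

(The backing arrays here are GC-allocated purely for brevity; a fully manual version would back them with malloc.)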
Jan 09 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/9/13 12:09 PM, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 19:34:19 UTC, Andrei Alexandrescu wrote:
 At the end of the day if references are part of the language and
 programs can build arbitrary reference topologies, safety entails GC.
It looks like a non sequitur to me... wouldn't this work?

A language X has a built-in data type called Reference, and no classes. The only thing you can do with it is use these functions:

Reference CreateObject(Reference typename);
Reference DeleteValue(Reference object, Reference field);
Reference GetValue(Reference object, Reference field);
Reference SetValue(Reference object, Reference field, Reference value);

Given _just_ these functions you can build _any_ arbitrary reference topology whatsoever. There's no need for a GC to be running, and it's completely manual memory management. It's memory-safe too. What am I missing here?
What you're missing is that you define a store that doesn't model object references with object addresses. That's what I meant by "references are part of the language". If store is modeled by actual memory (i.e. accessing an object handle takes you to the object), you must have GC for the language to be safe. If store is actually indirected and gives up on the notion of address, then sure you can implement safety checks. The thing is everybody wants for references to model actual object addresses; indirect handles as the core abstraction are uninteresting. Andrei
Jan 09 2013
next sibling parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 20:16:04 UTC, Andrei Alexandrescu 
wrote:
 What you're missing is that you define a store that doesn't 
 model object references with object addresses. That's what I 
 meant by "references are part of the language". If store is 
 modeled by actual memory (i.e. accessing an object handle takes 
 you to the object), you must have GC for the language to be 
 safe. If store is actually indirected and gives up on the 
 notion of address, then sure you can implement safety checks. 
 The thing is everybody wants for references to model actual 
 object addresses; indirect handles as the core abstraction are 
 uninteresting.

 Andrei
But why can't Reference hold the actual address too? Seems like a perfectly reasonable implementation to me, and there's no extra indirection involved that way, right?
Jan 09 2013
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/9/13 1:09 PM, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 20:16:04 UTC, Andrei Alexandrescu wrote:
 What you're missing is that you define a store that doesn't model
 object references with object addresses. That's what I meant by
 "references are part of the language". If store is modeled by actual
 memory (i.e. accessing an object handle takes you to the object), you
 must have GC for the language to be safe. If store is actually
 indirected and gives up on the notion of address, then sure you can
 implement safety checks. The thing is everybody wants for references
 to model actual object addresses; indirect handles as the core
 abstraction are uninteresting.

 Andrei
But why can't Reference hold the actual address too?
If it holds the actual address you can't implement memory reclamation and keep it safe. Andrei
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 21:14:56 UTC, Andrei Alexandrescu 
wrote:
 If it holds the actual address you can't implement memory 
 reclamation and keep it safe.

 Andrei
You mean because of circular references, or something else? And are you considering reference counting to be garbage collection like Walter does, or are you claiming refcounting won't solve this problem but GC will?
Jan 09 2013
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2013 1:23 PM, Mehrdad wrote:
 And are you considering reference counting to be garbage collection like Walter
 does,
It's not something I came up with. It is generally accepted that ref counting is a form of garbage collection.
Jan 09 2013
parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, January 09, 2013 14:14:15 Walter Bright wrote:
 On 1/9/2013 1:23 PM, Mehrdad wrote:
 And are you considering reference counting to be garbage collection like
 Walter does,
It's not something I came up with. It is generally accepted that ref counting is a form of garbage collection.
I definitely would never have referred to reference counting as garbage collection and find it quite odd that it would be considered to be such, but Wikipedia does indeed list it as being a form of garbage collection: http://en.wikipedia.org/wiki/Garbage_collection_(computer_science) So, it's backing you up in your usage of the term. - Jonathan M Davis
Jan 09 2013
parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 9 January 2013 at 22:29:36 UTC, Jonathan M Davis 
wrote:
 On Wednesday, January 09, 2013 14:14:15 Walter Bright wrote:
 On 1/9/2013 1:23 PM, Mehrdad wrote:
 And are you considering reference counting to be garbage 
 collection like
 Walter does,
It's not something I came up with. It is generally accepted that ref counting is a form of garbage collection.
I definitely would never have referred to reference counting as garbage collection and find it quite odd that it would be considered to be such, but Wikipedia does indeed list it as being a form of garbage collection: http://en.wikipedia.org/wiki/Garbage_collection_(computer_science) So, it's backing you up in your usage of the term. - Jonathan M Davis
This is what I was saying all along: in CS GC books, reference counting is usually introduced as a poor man's GC solution - the simplest way to implement some kind of automatic memory management, especially in memory-constrained devices, at the expense of execution speed.

--
Paulo
Jan 10 2013
parent reply "Rob T" <alanb ucora.com> writes:
On Thursday, 10 January 2013 at 08:38:18 UTC, Paulo Pinto wrote:
 This is what I was saying all along, in CS GC books reference 
 counting is usually introduced as poor man's GC solution. As 
 the most simple way to implement some kind of automatic memory 
 management, specially in memory constrained devices at the 
 expense of execution speed.

 --
 Paulo
This is likely a long shot (and may have already been proposed), but what the heck: If reference counting is considered to be garbage collection, and D is a garbage collected language, then can the current form of GC be replaced with a reference counted version that is fully automated, or does something like that always have to be manually hand crafted because it is not generic enough to fit in?

BTW: I did read through the responses concerning my past posts about the GC and memory safety, and I agree with those responses. I now have a much better understanding of what is meant by "memory safety", so thanks to everyone who took the time to respond to my ramblings, I've learned something new.

--rt
Jan 10 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/10/2013 4:05 PM, Rob T wrote:
 This is likely a long shot (and may have already been proposed), but what the
 heck: If reference counting is considered to be garbage collection, and D is a
 garbage collected language, then can the current form of GC be replaced with a
 reference counted version that is fully automated,
Yes, but you'd have to rework the semantics of array slices, function closures, etc., to add in the reference counts.
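A tiny illustration of the slice problem (my own sketch, not Walter's): the count would have to be attached to the underlying allocation rather than to any one variable, because several slices share it.

void main()
{
    int[] a = [1, 2, 3, 4];
    int[] b = a[1 .. 3];   // second reference into the same allocation
    a = null;              // dropping `a` must not release the block...
    assert(b[0] == 2);     // ...because `b` still reads from it
}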
Jan 10 2013
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/9/13 1:23 PM, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 21:14:56 UTC, Andrei Alexandrescu wrote:
 If it holds the actual address you can't implement memory reclamation
 and keep it safe.

 Andrei
You mean because of circular references, or something else? And are you considering reference counting to be garbage collection like Walter does, or are you claiming refcounting won't solve this problem but GC will?
This is a bit of a crash course in GC and formal semantics in the form of a Q&A, which is rather inefficient. The topic of GC and memory safety is well studied but unfortunately little information about it is available in book format. I suggest you start e.g. with http://llvm.org/pubs/2003-05-05-LCTES03-CodeSafety.pdf and the papers it refers to get a grip on the challenges and tradeoffs involved. If anyone has better suggestions of reading materials, please chime in - I'd be very interested as well. Thanks, Andrei
Jan 09 2013
next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 9 January 2013 at 23:14:37 UTC, Andrei Alexandrescu 
wrote:
 On 1/9/13 1:23 PM, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 21:14:56 UTC, Andrei 
 Alexandrescu wrote:
 If it holds the actual address you can't implement memory 
 reclamation
 and keep it safe.

 Andrei
You mean because of circular references, or something else? And are you considering reference counting to be garbage collection like Walter does, or are you claiming refcounting won't solve this problem but GC will?
This is a bit of a crash course in GC and formal semantics in the form of a Q&A, which is rather inefficient. The topic of GC and memory safety is well studied but unfortunately little information about it is available in book format. I suggest you start e.g. with http://llvm.org/pubs/2003-05-05-LCTES03-CodeSafety.pdf and the papers it refers to get a grip on the challenges and tradeoffs involved. If anyone has better suggestions of reading materials, please chime in - I'd be very interested as well. Thanks, Andrei
My favorite GC book: http://www.amazon.com/Garbage-Collection-Algorithms-Automatic-Management/dp/0471941484/ref=sr_1_2?s=books-intl-de&ie=UTF8&qid=1357774599&sr=1-2
Jan 09 2013
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/9/13 3:38 PM, Paulo Pinto wrote:
 My favorite GC book:

 http://www.amazon.com/Garbage-Collection-Algorithms-Automatic-Management/dp/0471941484/ref=sr_1_2?s=books-intl-de&ie=UTF8&qid=1357774599&sr=1-2
I own the more recent edition and made a pass through it but I don't think it discusses safety that much. Any chapter I should be looking at? Andrei
Jan 09 2013
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 9 January 2013 at 23:53:40 UTC, Andrei Alexandrescu 
wrote:
 On 1/9/13 3:38 PM, Paulo Pinto wrote:
 My favorite GC book:

 http://www.amazon.com/Garbage-Collection-Algorithms-Automatic-Management/dp/0471941484/ref=sr_1_2?s=books-intl-de&ie=UTF8&qid=1357774599&sr=1-2
I own the more recent edition and made a pass through it but I don't think it discusses safety that much. Any chapter I should be looking at? Andrei
Not really. It is my favorite GC book because it is the only one I know where so many GC algorithms get described, but I haven't read it in depth to be able to assess how safety is described.

And here is another one about real-time GC that I own, although it focuses on JVMs: http://www.amazon.com/Realtime-Collection-Oriented-Programming-Languages/dp/3831138931/ref=sr_1_1?ie=UTF8&qid=1357806697&sr=8-1&keywords=Fridtjof+Siebert
Jan 10 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 23:38:10 UTC, Paulo Pinto wrote:
 On Wednesday, 9 January 2013 at 23:14:37 UTC, Andrei 
 Alexandrescu wrote:
 On 1/9/13 1:23 PM, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 21:14:56 UTC, Andrei 
 Alexandrescu wrote:
 If it holds the actual address you can't implement memory 
 reclamation
 and keep it safe.

 Andrei
You mean because of circular references, or something else? And are you considering reference counting to be garbage collection like Walter does, or are you claiming refcounting won't solve this problem but GC will?
This is a bit of a crash course in GC and formal semantics in the form of a Q&A, which is rather inefficient. The topic of GC and memory safety is well studied but unfortunately little information about it is available in book format. I suggest you start e.g. with http://llvm.org/pubs/2003-05-05-LCTES03-CodeSafety.pdf and the papers it refers to get a grip on the challenges and tradeoffs involved. If anyone has better suggestions of reading materials, please chime in - I'd be very interested as well. Thanks, Andrei
My favorite GC book: http://www.amazon.com/Garbage-Collection-Algorithms-Automatic-Management/dp/0471941484/ref=sr_1_2?s=books-intl-de&ie=UTF8&qid=1357774599&sr=1-2
I have that one and it is very informative : http://www.amazon.com/The-Garbage-Collection-Handbook-Management/dp/1420082795/
Jan 09 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2013 3:14 PM, Andrei Alexandrescu wrote:
 If anyone has better suggestions of reading materials, please chime in - I'd be
 very interested as well.
"Garbage Collection" is the K+R of garbage collection: http://digitalmars.com/bibliography.html#Compilers
Jan 09 2013
prev sibling parent reply "Tove" <tove fransson.se> writes:
On Wednesday, 9 January 2013 at 20:16:04 UTC, Andrei Alexandrescu 
wrote:
 On 1/9/13 12:09 PM, Mehrdad wrote:
 It's memory-safe too. What am I missing here?
What you're missing is that you define a store that doesn't model object references with object addresses. That's what I meant by "references are part of the language". If store is modeled by actual memory (i.e. accessing an object handle takes you to the object), you must have GC for the language to be safe. If store is actually indirected and gives up on the notion of address, then sure you can implement safety checks. The thing is everybody wants for references to model actual object addresses; indirect handles as the core abstraction are uninteresting. Andrei
Quote from OpenBSD's malloc implementation: "On a call to free, memory is released and unmapped from the process address space using munmap."

I don't see why this approach is less safe than a GC... in fact, I claim it's safer, because it's far simpler to implement and thus less likely to contain bugs; in addition, it's easy to make performance vs safety trade-offs simply by linking with another memory allocator.
Jan 09 2013
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 1/9/13 1:11 PM, Tove wrote:
 On Wednesday, 9 January 2013 at 20:16:04 UTC, Andrei Alexandrescu wrote:
 On 1/9/13 12:09 PM, Mehrdad wrote:
 It's memory-safe too. What am I missing here?
What you're missing is that you define a store that doesn't model object references with object addresses. That's what I meant by "references are part of the language". If store is modeled by actual memory (i.e. accessing an object handle takes you to the object), you must have GC for the language to be safe. If store is actually indirected and gives up on the notion of address, then sure you can implement safety checks. The thing is everybody wants for references to model actual object addresses; indirect handles as the core abstraction are uninteresting. Andrei
Quote from OpenBSD's malloc implementation: "On a call to free, memory is released and unmapped from the process address space using munmap."

I don't see why this approach is less safe than a GC... in fact, I claim it's safer, because it's far simpler to implement and thus less likely to contain bugs; in addition, it's easy to make performance vs safety trade-offs simply by linking with another memory allocator.
No. When you allocate again and remap memory, you may get the same address range for an object of different type. Andrei
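A hedged illustration of that point (made-up struct names, plain malloc/free): once the freed block is handed out again for a different type, the old pointer silently reads and writes someone else's data.

import core.stdc.stdlib : malloc, free;

struct Session { int userId; }
struct Secret  { int key;    }

void main()
{
    auto s = cast(Session*) malloc(Session.sizeof);
    s.userId = 1;
    free(s);                                  // `s` now dangles

    auto k = cast(Secret*) malloc(Secret.sizeof);
    k.key = 1234;                             // may land in the freed block

    // If the allocator reused the block, this dangling write corrupts k.key:
    // undefined behaviour, exactly what memory safety has to rule out.
    s.userId = 42;
}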
Jan 09 2013
prev sibling parent reply "Rob T" <rob ucora.com> writes:
On Wednesday, 9 January 2013 at 18:46:53 UTC, Walter Bright wrote:
 On 1/9/2013 9:26 AM, Rob T wrote:
 For example, you can still access deallocated memory by 
 mistake,
If you're not playing with pointers, then this is a buggy GC.
Yes you are correct. I was thinking of nullable references when I made that comment. When you dereference on a nulled reference, I suppose that's not really referencing deallocated memory. I'm not sure what exactly happens behind the scenes when you dereference a null pointer, but obviously bad things happen and it's not safe, also it is a memory related problem.
 run out of
 memory due to accumulating persistent pointers left around by 
 mistake,
Memory safety does not imply never running out of memory.
I did qualify what I said by mentioning that it depends on the definition of memory safety. According to my definition of memory safety, a memory leak is still a memory leak no matter how it happens. I can however see an alternate definition which is likely what you are suggesting, where so long as you are not accessing memory that is not allocated, you are memory safe. There must be more to it than that, so if you can supply a more correct definition, that would be welcome.
 or free memory that was not supposed to be freed by mistake.
Then it's a buggy GC.
Yes. The point however is that a GC can be buggy, so having one kicking around guarantees nothing unless you can prove that the implementation is 100% correct. I am reminded of the backup system that is less reliable than the medium it is supposed to be protecting, or the backup power supply that is less reliable than the main power grid.

My guess is that no one knows how good or bad a GC is relative to what a manual system can do. The claims are likely the result of anecdotal evidence alone. Has anyone actually done a scientifically valid study that shows that a GC implementation is statistically more reliable than a manual implementation? Of course I understand that the lack of a scientific study proves nothing either way; I'm simply pointing out that we're likely making assumptions without any proof to back them up.
 The GC implementation may
 fail due to bugs, deallocating live memory or failing to 
 deallocate inactive
 memory.
Of course memory safety presumes a correctly implemented GC.
Yes, of course, but this is a bit of a wild card since we probably have no proof that any given GC implementation will be correct.
 The only thing a GC can do for you, is free up the programmer 
 from the tedium of
 managing memory. It also allows constructs that otherwise 
 would be very
 difficult or impractical to implement. The effect can be very 
 positive, but
 there are no guarantees of memory safety.
This is incorrect - see above. A bug free GC, and not "cheating" in using it, guarantees memory safety. This is a big deal.
Yes and no. Yes if the definition excludes the ability to "leak memory" due to programmer error, meaning allocating but failing to deallocate - a GC cannot prevent this, only a programmer can. I figure your definition excludes this kind of programmer error, and that's OK with me. I do however wonder about the ability to dereference null pointers, specifically pointers that are considered to be references. In D references are nullable, and I believe this is considered safe. -rt
Jan 09 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2013 11:40 AM, Rob T wrote:
 Yes you are correct. I was thinking of nullable references when I made that
 comment. When you dereference on a nulled reference, I suppose that's not
really
 referencing deallocated memory. I'm not sure what exactly happens behind the
 scenes when you dereference a null pointer, but obviously bad things happen and
 it's not safe, also it is a memory related problem.
Null pointer faults are not a memory safety issue. Memory safety is not defined as "no bugs", it is defined as "no bugs that corrupt memory".
 I did qualify what I said by mentioning that it depends on the definition of
 memory safety. According to my definition of memory safety, a memory leak is
 still a memory leak no matter how it happens.
If we each define our own meanings for words, we cannot understand each other. Memory safety is a standard piece of jargon, with a standard definition.
Jan 09 2013
prev sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 09, 2013 at 08:40:28PM +0100, Rob T wrote:
[...]
 According to my definition of memory safety, a memory leak is still a
 memory leak no matter how it happens. I can however see an alternate
 definition which is likely what you are suggesting, where so long as
 you are not accessing memory that is not allocated, you are memory
 safe. There must be more to it than that, so if you can supply a more
 correct definition, that would be welcome.
And here we finally get to the root of the problem. Walter's definition of memory-safe (or what I understand it to be) is that you can't:

(1) Access memory that's been freed
(2) Access memory that was never allocated
(3) As a result of the above, read garbage values or other data that you aren't supposed to be able to access from memory.

The context of this definition, from what I understand, is security breaches that exploit buffer overruns, stack overruns, and pointer arithmetic to read stuff that one isn't supposed to be able to read or write stuff into places where one shouldn't be able to write to. A good number of security holes are caused by being able to do such things, due to the lack of memory safety in C/C++.

Running out of memory is moot, because the OS will just kill your app (or an exception will be thrown and the runtime will terminate), so that presents no exploit path. Dereferencing null is also moot, because you'll just get an exception or a segfault, which is no help for a potential exploit. Memory leak isn't something directly exploitable (though it *can* be used in a DOS attack), so it doesn't fall under the definition of "memory safety" either.

If you want to address memory leaks or dereferencing nulls, that's a different kettle o' fish.

T

-- 
Those who've learned LaTeX swear by it. Those who are learning LaTeX swear at it. -- Pete Bleackley
Jan 09 2013
next sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2013 1:11 PM, H. S. Teoh wrote:
 Walter's definition of memory-safe
It is not *my* definition. It is a common programming term with a generally accepted definition.
Jan 09 2013
prev sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 9 January 2013 at 21:13:35 UTC, H. S. Teoh wrote:
 Dereferencing null is also moot, because you'll just get an 
 exception or a segfault, which is no help for a potential 
 exploit.
BTW, not necessarily... this is a fairly unlikely situation, granted, but imagine:

struct Thing {
    ubyte[1024*1024] buffer;
    int a;
}

Thing* t = null;
t.a = 10;

That'd turn into something like

mov eax, 0                          ; the pointer value itself
mov dword ptr [eax + 1024*1024], 10 ; add the offset of the field before doing the read/write

... which quite possibly does overwrite something exploitable.
Jan 09 2013
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2013 2:30 PM, Adam D. Ruppe wrote:
 On Wednesday, 9 January 2013 at 21:13:35 UTC, H. S. Teoh wrote:
 Dereferencing null is also moot, because you'll just get an exception or a
 segfault, which is no help for a potential exploit.
BTW, not necessarily... this is a fairly unlikely situation, granted, but imagine:
And that is not dereferencing null, it is dereferencing 0x100000.
Jan 09 2013
parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 10 January 2013 at 00:18:26 UTC, Walter Bright wrote:
 And that is not dereferencing null, it is dereferencing 
 0x100000.
Yes, but it is worth noting that dmd will happily compile that code, even if marked @safe - just because the pointer on the language level is null doesn't mean it is memory safe at the assembly level. The generated code with @safe is still just what we'd expect too:

3: 31 c0                   xor eax,eax
5: c7 80 00 00 10 00 0a    mov DWORD PTR [eax+0x100000],0xa
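For reference, a guess at the complete snippet being discussed (see bug 5176 linked below); the point is that dmd of that era accepted it even under @safe.

struct Thing
{
    ubyte[1024*1024] buffer;
    int a;
}

@safe void poke()
{
    Thing* t = null;
    t.a = 10;   // null base plus a 1 MB field offset: no guaranteed trap
}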
Jan 09 2013
next sibling parent "David Nadlinger" <see klickverbot.at> writes:
On Thursday, 10 January 2013 at 00:50:30 UTC, Adam D. Ruppe wrote:
 Yes, but it is worth noting that dmd will happily compile that 
 code, even if marked @safe - just because the pointer on the 
 language level is null doesn't mean it is memory safe at the 
 assembly level.
See: http://d.puremagic.com/issues/show_bug.cgi?id=5176 David
Jan 09 2013
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Jan 10, 2013 at 01:50:28AM +0100, Adam D. Ruppe wrote:
 On Thursday, 10 January 2013 at 00:18:26 UTC, Walter Bright wrote:
And that is not dereferencing null, it is dereferencing 0x100000.
Yes, but it is worth noting that dmd will happily compile that code, even if marked @safe - just because the pointer on the language level is null doesn't mean it is memory safe at the assembly level. The generated code with @safe is still just what we'd expect too:

3: 31 c0                   xor eax,eax
5: c7 80 00 00 10 00 0a    mov DWORD PTR [eax+0x100000],0xa
Yeah that's exactly what I was thinking too. To DMD, it's a null pointer dereference. But actually, it's dereferencing something else, because x.fieldName is, in general, *not* null when x is null. Hmm. This looks like another hole in SafeD? Unless null pointer checks are inserted. (The checks have to be made on x, not x.fieldName, of course.) T -- Notwithstanding the eloquent discontent that you have just respectfully expressed at length against my verbal capabilities, I am afraid that I must unfortunately bring it to your attention that I am, in fact, NOT verbose.
Jan 09 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Thursday, 10 January 2013 at 01:10:06 UTC, H. S. Teoh wrote:
 On Thu, Jan 10, 2013 at 01:50:28AM +0100, Adam D. Ruppe wrote:
 On Thursday, 10 January 2013 at 00:18:26 UTC, Walter Bright 
 wrote:
And that is not dereferencing null, it is dereferencing 
0x100000.
Yes, but it is worth noting that dmd will happily compile that code, even if marked safe - just because the pointer on the language level is null doesn't mean it is memory safe at the assembly level. the generated code with safe is still just what we'd expect too: 3: 31 c0 xor eax,eax 5: c7 80 00 00 10 00 0a mov DWORD PTR [eax+0x100000],0xa
Yeah that's exactly what I was thinking too. To DMD, it's a null pointer dereference. But actually, it's dereferencing something else, because x.fieldName is, in general, *not* null when x is null. Hmm. This looks like another hole in SafeD? Unless null pointer checks are inserted. (The checks have to be made on x, not x.fieldName, of course.)
That is exactly why my NPE proposal does trigger on addresses other than 0. Still, it requires adding checks for big objects (or high indices in arrays).
Jan 09 2013
prev sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Thursday, 10 January 2013 at 00:50:30 UTC, Adam D. Ruppe wrote:
 On Thursday, 10 January 2013 at 00:18:26 UTC, Walter Bright 
 wrote:
 And that is not dereferencing null, it is dereferencing 
 0x1000000.
Yes, but it is worth noting that dmd will happily compile that code, even if marked safe - just because the pointer on the language level is null doesn't mean it is memory safe at the assembly level. the generated code with safe is still just what we'd expect too: 3: 31 c0 xor eax,eax 5: c7 80 00 00 10 00 0a mov DWORD PTR [eax+0x100000],0xa
That brings us to another point. It does not matter how safe a language might be, if someone can change the assembly then everything is possible. Or am I going too off-topic?
Jan 10 2013
prev sibling parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 9 January 2013 at 10:21:29 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 10:14:07 UTC, dennis luehring 
 wrote:
 Am 09.01.2013 11:09, schrieb Mehrdad:
 On Wednesday, 9 January 2013 at 10:07:42 UTC, deadalnix wrote:
 Reference counting tends to create big pauses when 
 deallocating,
 as objects tend to die in groups.
I don't get it... it's slower to deallocate a bunch of objects together with refcounting than to deallocate all of them individually over a longer period of time?
could be - think of a large hierarchy of objects which tends to take some time to deconstruct ... a background GC could be better for your application speed in this situation - smaller pauses at the cost of more resource usage
Come to think of it, C++ allocators are meant for exactly this: throwing away an entire batch of objects in 1 go. Beats GCs any day.

 garbage collectors if ref-counting will be always better 
 (except cyclic stuff)?
stuff, and that's it. If you have any other reasons please show me a benchmark that shows a GC being faster than the equivalent refcounted code (I've seen lots of talks in theory about how it _COULD_ be different but never seen any examples in practice; would love to see one).
Reference counting always implies extra bookkeeping code per memory access; I fail to see how it can be made faster than any parallel GC.

The Aonix VM, for example, is used in military scenarios like missile radar systems and battleship gun control systems - that beats any game timing requirements, I would say.

--
Paulo
Jan 09 2013
prev sibling next sibling parent reply Brad Roberts <braddr puremagic.com> writes:
On 1/9/2013 1:00 AM, Jonathan M Davis wrote:
 On Wednesday, January 09, 2013 09:54:10 Mehrdad wrote:
 You (or Walter I guess) are the first person I've seen who calls
 C++ garbage collected.
I sure wouldn't call that garbage collection - not when there's no garbage collector. But Walter has certainly called it that from time to time.
There's a collector, it's in the refcount decrement (a little simplified):

if (refcount == 0)
    free(obj);

Granted, it's terribly simple, but it's there.
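Not Brad's actual code - just a sketch of where that one-line collector sits in a hand-rolled ref-counted pointer (RC, Payload and make are made-up names, and the payload assignment is only valid for plain data like int).

import core.stdc.stdlib : malloc, free;

struct RC(T)
{
    private struct Payload { T value; size_t count; }
    private Payload* p;

    static RC make(T value)
    {
        auto r = RC(cast(Payload*) malloc(Payload.sizeof));
        r.p.value = value;
        r.p.count = 1;
        return r;
    }

    this(this) { if (p) ++p.count; }     // copy: increment

    ~this()
    {
        if (p && --p.count == 0)         // decrement...
            free(p);                     // ...and this is the "collector"
    }

    ref T get() { return p.value; }
}

void main()
{
    auto a = RC!int.make(7);
    auto b = a;          // count is now 2
    assert(b.get == 7);
}                        // both copies go out of scope; the block is freed once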
Jan 09 2013
parent reply "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 09:08:40 UTC, Brad Roberts wrote:
 On 1/9/2013 1:00 AM, Jonathan M Davis wrote:
 On Wednesday, January 09, 2013 09:54:10 Mehrdad wrote:
 You (or Walter I guess) are the first person I've seen who 
 calls
 C++ garbage collected.
I sure wouldn't call that garbage collection - not when there's no garbage collector. But Walter has certainly called it that from time to time.
There's a collector, it's in the refcount decrement (a little simplified): if (refcount == 0) free(obj); Granted, it's terribly simple, but it's there.
Sure, it's there. The problem I have with it is that this line of reasoning makes no sense in light of what Walter said, which was: "A GC is *required* if you want to have a language that guarantees memory safety."

No matter how he defines the word GC, I _STILL_ don't see how this is true. I can perfectly well imagine a language which allows you to use integers as _handles_ to objects (perfectly _manual_ management of _everything_), and which gives you access to their fields via external functions. The language need not give you any direct access to memory, making everything perfectly safe.

I really don't think Walter's statement made any sense whatsoever.
Jan 09 2013
parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2013 1:24 AM, Mehrdad wrote:
 The language need not give you any direct access to memory, making everything
 perfectly safe.
You can add a runtime check for every "pointer" access (much like what valgrind does), and abort on an invalid access. But that is not what memory safety is.
Jan 09 2013
prev sibling next sibling parent reply Dmitry Olshansky <dmitry.olsh gmail.com> writes:
09-Jan-2013 12:54, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 08:51:47 UTC, Paulo Pinto wrote:
 On Wednesday, 9 January 2013 at 08:28:44 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 08:14:35 UTC, Walter Bright wrote:
 On 1/8/2013 11:42 PM, Mehrdad wrote:
 (True, it wouldn't give you the power of a systems language, but
 that's quite
 obviouly not my point -- the point is that it's a _perfectly possible_
 memory-safe language which we made, so I don't understand Walter's
 comment about
 a GC being "required" for a memory-safe language.)
The misunderstanding is you are not considering reference counting as a form of GC. It is.
So you would say that C++ code (which uses reference counting) uses garbage collection?
Yes.
You (or Walter I guess) are the first person I've seen who calls C++ garbage collected.
That's a stretch - IMHO I'd call the language garbage collected iff it is the default way language-wise (e.g. if C++ new was ref-counted I'd call C++ garbage collected). This way D is a garbage-collected language, because the language default is GC.

-- 
Dmitry Olshansky
Jan 09 2013
parent "Paulo Pinto" <pjmlp progtools.org> writes:
On Wednesday, 9 January 2013 at 09:17:35 UTC, Dmitry Olshansky 
wrote:
 09-Jan-2013 12:54, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 08:51:47 UTC, Paulo Pinto 
 wrote:
 On Wednesday, 9 January 2013 at 08:28:44 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 08:14:35 UTC, Walter Bright 
 wrote:
 On 1/8/2013 11:42 PM, Mehrdad wrote:
 (True, it wouldn't give you the power of a systems 
 language, but
 that's quite
 obviouly not my point -- the point is that it's a 
 _perfectly possible_
 memory-safe language which we made, so I don't understand 
 Walter's
 comment about
 a GC being "required" for a memory-safe language.)
The misunderstanding is you are not considering reference counting as a form of GC. It is.
So you would say that C++ code (which uses reference counting) uses garbage collection?
Yes.
You (or Walter I guess) are the first person I've seen who calls C++ garbage collected.
That's a stretch - IMHO I'd call the language garbage collected iff it is the default way language-wise (e.g. if C++ new was ref-counted I'd call C++ garbage collected). This way D is a garbage-collected language, because the language default is GC.
I was being provocative on purpose.

Having said that, a C++ application where *_ptr<> + STL types are used everywhere can make C++ almost a safe language. Unfortunately, most C++ libraries make use of naked pointers.

--
Paulo
Jan 09 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/09/2013 09:54 AM, Mehrdad wrote:
 You (or Walter I guess) are the first person I've seen who calls C++ garbage
 collected.
Maybe not GC in the truest sense, but one of the first realizations I had when I started teaching myself C++ (having originally learned C) was, "Oh, gosh, I don't have to manually handle memory allocation/deallocation any more." Of course, that's an overstatement in reality, but I still find that for almost all of the C++ code I've ever had to write, I've been able to avoid manual memory management.
Jan 09 2013
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 1/9/2013 12:28 AM, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 08:14:35 UTC, Walter Bright wrote:
 On 1/8/2013 11:42 PM, Mehrdad wrote:
 (True, it wouldn't give you the power of a systems language, but that's quite
 obviouly not my point -- the point is that it's a _perfectly possible_
 memory-safe language which we made, so I don't understand Walter's comment
about
 a GC being "required" for a memory-safe language.)
The misunderstanding is you are not considering reference counting as a form of GC. It is.
So you would say that C++ code (which uses reference counting) uses garbage collection?
Yes, but it is not memory safe as C++ allows escapes from it.
Jan 09 2013
prev sibling parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 07:42:39 UTC, Mehrdad wrote:
 Also might mention, we implemented a compiler for a subdialect 
 of Python (including full closures) for our compilers course, 
 which the compiler subsequently translated to C++.

 GC wasn't required (we were allowed to never deallocate), but 
 since I didn't feel like doing that I added reference counting 
 using a lot of shared_ptr's and intrusive_ptr's.

 I also added a GC just for the sake of catching cyclic 
 references, but it was just that -- everything was reference 
 counted, so if you never had cyclic references, the GC _never_ 
 kicked in, period.
This is a very valid way to manage things in D as well; remember that you have GC.free available.
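For anyone who hasn't used it, a tiny sketch of what that looks like - core.memory.GC.free deallocates eagerly instead of waiting for a collection, and leaves any remaining references dangling, so the responsibility is yours:

import core.memory : GC;

void main()
{
    auto p = cast(int*) GC.malloc(int.sizeof);
    *p = 5;
    GC.free(p);   // reclaimed now, not at some later collection
}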
Jan 09 2013
prev sibling next sibling parent reply "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 07:33:24 UTC, Rob T wrote:
 On Wednesday, 9 January 2013 at 07:23:57 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 07:22:51 UTC, deadalnix wrote:
 Well, you CAN indeed, create a dumbed down language that is 
 memory safe and don't require a GC.
Yeah, that's 1 of my 2 points. The other one you still ignored: the GC doesn't bring much to
There is a point being made here that is perfectly valid. There is a form of memory leak that a GC can never catch, such as when memory is allocated and simply never deallocated by mistake due to a persistent "in use" pointer that should have been nulled but wasn't.
As long as you have the pointer around, the memory leak is not GC's.
 In addition, the GC itself may fail to deallocate freed memory 
 or even free live memory by mistake. I've seen bugs described 
 to that effect. There simply is no panacea to the memory leak 
 problem. What a GC does do, is free the programmer from a ton 
 of tedium, and even allow for constructs that would normally 
 not be practical to implement, but it can never guarantee 
 anything more than that.
False pointers are mostly solved by using 64-bit pointers. See: http://www.deadalnix.me/2012/03/05/impact-of-64bits-vs-32bits-when-using-non-precise-gc/
Jan 09 2013
parent "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 09:30:41 UTC, deadalnix wrote:
 On Wednesday, 9 January 2013 at 07:33:24 UTC, Rob T wrote:
 On Wednesday, 9 January 2013 at 07:23:57 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 07:22:51 UTC, deadalnix wrote:
 Well, you CAN indeed, create a dumbed down language that is 
 memory safe and don't require a GC.
Yeah, that's 1 of my 2 points. The other one you still ignored: the GC doesn't bring much to
There is a point being made here that is perfectly valid. There is a form of memory leak that a GC can never catch, such as when memory is allocated and simply never deallocated by mistake due to a persistent "in use" pointer that should have been nulled but wasn't.
As long as you have the pointer around, the memory leak is not GC's.
"The _leak_ is not GC'd"?
 In addition, the GC itself may fail to deallocate freed 
 memory or even free live memory by mistake. I've seen bugs 
 described to that effect. There simply is no panacea to the 
 memory leak problem. What a GC does do, is free the programmer 
 from a ton of tedium, and even allow for constructs that would 
 normally not be practical to implement, but it can never 
 guarantee anything more than that.
False pointers are mostly solved by using 64-bit pointers. See: http://www.deadalnix.me/2012/03/05/impact-of-64bits-vs-32bits-when-using-non-precise-gc/
Re-read what he wrote, you completely missed what we're saying.
Jan 09 2013
prev sibling next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Jan 09, 2013 at 08:33:23AM +0100, Rob T wrote:
[...]
 There is a point being made here that is perfectly valid. There is a
 form of memory leak that a GC can never catch, such as when
 memory is allocated and simply never deallocated by mistake due to a
 persistent "in use" pointer that should have been nulled but wasn't.
No form of memory management will be immune to programmer error. That's plain impossible. You can write broken code in any language, under any kind of memory management system. Short of going heapless, nothing is going to help you here. (Actually, even going heapless won't help you -- think of what happens if you have a fixed-sized stack and forget to pop used-up elements. Yes such a bug is way more obvious than a pointer that didn't get nulled, but both are still just bugs.)
 In addition, the GC itself may fail to deallocate freed memory or
 even free live memory by mistake. I've seen bugs described to that
 effect.
I thought the idea was to make it so that SafeD is free from this kind of problem (short of a bug in the GC itself) -- AFAIK, the GC freeing live memory is caused by having XOR'ed pointers or other such tricks that cause the GC to fail to detect the pointer to the memory. In theory, SafeD should prevent this by not allowing unsafe operations on pointers like XOR'ing. SafeD still has a long ways to go, though.
 There simply is no panacea to the memory leak problem. What
 a GC does do, is free the programmer from a ton of tedium, and even
 allow for constructs that would normally not be practical to
 implement, but it can never guarantee anything more than that.
[...] Yes. I don't think there are many things in the programming world that can be guaranteed. The compiler would have to be clairvoyant (not merely pass the Turing test) to catch these sorts of errors. T -- To provoke is to call someone stupid; to argue is to call each other stupid.
Jan 09 2013
prev sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 01/09/2013 08:33 AM, Rob T wrote:
 On Wednesday, 9 January 2013 at 07:23:57 UTC, Mehrdad wrote:
 On Wednesday, 9 January 2013 at 07:22:51 UTC, deadalnix wrote:
 Well, you CAN indeed, create a dumbed down language that is memory
 safe and don't require a GC.
Yeah, that's 1 of my 2 points. The other one you still ignored: the GC doesn't bring much to the
There is a point being made here that is perfectly valid. There is a form of memory leak that a GC can never catch, such as when memory is allocated and simply never deallocated by mistake due to a persistent "in use" pointer that should have been nulled but wasn't. ...
This is not entirely accurate. A GC does not necessarily have to assume that every reachable pointer will be accessed again. Every memory leak can be caught by some GC, but no GC catches all.
Jan 12 2013
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/8/2013 10:55 PM, Mehrdad wrote:
 On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright wrote:
 One thing I'd add is that a GC is *required* if you want to have a language
 that guarantees memory safety
Pardon? shared_ptr anyone? You can totally have a language that only provides new/delete facilities and which only access to memory through managed pointers like shared_ptr... without a GC. I don't see where a GC is "required" as you say.
Reference counting is a valid form of GC. C++'s shared_ptr, however, is both optional and allows access to the underlying raw pointers. Hence, memory safety cannot be guaranteed.
Jan 08 2013
parent "Mehrdad" <wfunction hotmail.com> writes:
On Wednesday, 9 January 2013 at 07:23:16 UTC, Walter Bright wrote:
 On 1/8/2013 10:55 PM, Mehrdad wrote:
 On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright 
 wrote:
 One thing I'd add is that a GC is *required* if you want to 
 have a language
 that guarantees memory safety
Pardon? shared_ptr anyone? You can totally have a language that only provides new/delete facilities and which only access to memory through managed pointers like shared_ptr... without a GC. I don't see where a GC is "required" as you say.
Reference counting is a valid form of GC. C++'s shared_ptr, however, is both optional and allows access to the underlying raw pointers. Hence, memory safety cannot be guaranteed.
Right, I never claimed C++ is memory safe. I just said that a language that does something similar without giving you raw pointer access is perfectly possible and (short of memory leaks due to cycles) also perfectly safe.
Jan 08 2013
prev sibling next sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright wrote:
 Interestingly, carefully written code using a GC can be 
 *faster* than manual memory management, for a number of rather 
 subtle reasons.
One being that calling the OS to allocate memory is an expensive operation (freeing as well?). I would think a smart GC, once it identifies a free memory block, may not return it to the OS but hold onto it, then hand it out again when more memory is asked for, thereby skipping the OS step.
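That is essentially a free list. A hedged sketch of the idea (BlockPool and its members are made-up names; the bookkeeping array is GC-backed only for brevity):

import core.stdc.stdlib : malloc;

struct BlockPool
{
    enum blockSize = 256;
    private void*[] recycled;        // blocks handed back to us

    void* acquire()
    {
        if (recycled.length)
        {
            auto p = recycled[$ - 1];
            recycled.length -= 1;    // reuse without touching the OS
            return p;
        }
        return malloc(blockSize);    // slow path: actually allocate
    }

    void release(void* p)
    {
        recycled ~= p;               // keep it around instead of calling free()
    }
}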
Jan 09 2013
prev sibling parent "eles" <eles eles.com> writes:
On Tuesday, 8 January 2013 at 22:19:56 UTC, Walter Bright wrote:
 On 1/7/2013 3:11 PM, H. S. Teoh wrote:
 One thing I'd add is that a GC is *required* if you want to 
 have a language that guarantees memory safety, which D aims to 
 do. There's an inescapable reason why manual memory management 
 in D is only possible in @system code.
I think that, at least for @system code (or invent another ), D programs should be able to require only GC-free code (forbidding, for example, features requiring GC).

I do some Linux kernel code. For most of it, I find it quite difficult to rely on the GC if I were to use D instead of C. One specific question: when handling an interrupt in the Linux kernel, you are not allowed to sleep. But I think the GC (and GC-triggering features) will sleep, so their use should be forbidden.

The main difference with respect to C programming for the kernel: in C, to avoid sleeping, you need to stay away from some kernel/library functions; in D, you need to stay away from some language features. It is one thing to avoid using specific functions. It is another thing to avoid using specific language constructs.
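None of this makes D usable inside an interrupt handler, but for the user-space analogue of "this region must not pause", one existing knob is core.memory.GC.disable - a minimal sketch (timingCritical is a made-up name), combined with avoiding GC-allocating features inside the region:

import core.memory : GC;

void timingCritical()
{
    GC.disable();              // suppress automatic collection runs in here
    scope(exit) GC.enable();

    // ...and nothing inside allocates from the GC anyway:
    int[64] scratch;           // stack memory only
    foreach (i, ref s; scratch)
        s = cast(int) i;
}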
Jan 10 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Monday, January 07, 2013 23:26:02 Rob T wrote:
 Is this a hard fact, or can there be a way to make it work? For
 example what about the custom allocator idea?
 
 From a marketing POV, if the language can be made 100% free of
 the GC it would at least not be a deterrent to those who cannot
 accept having to use one. From a technical POV, there are
 definitely many situations where not using a GC is desirable.
It's a hard fact. Some features (e.g. appending to an array) require the GC and will always require the GC. There may be features which currently require the GC but shouldn't necessarily require it (e.g. AAs may fall in that camp), but some features absolutely require it, and there's no way around that. You can limit your use of the GC or outright not use it at all, but it comes with the cost of not being able to use certain features (mostly with regards to arrays). - Jonathan M Davis
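To make that concrete, a small sketch of my own (not from the post): the first function leans on GC-backed features named in this thread - appending, built-in AAs, an escaping closure - while the second stays on the manual side.

import core.stdc.stdlib : malloc, free;

int delegate(int) needsGC()
{
    int[] a;
    a ~= 1;                       // appending allocates from the GC
    int[string] table;            // built-in AA: GC-backed
    table["answer"] = 42;
    return (int x) => x + a[0];   // escaping closure: GC-allocated frame
}

void avoidsGC()
{
    auto buf = cast(int*) malloc(4 * int.sizeof);
    scope(exit) free(buf);
    int[] view = buf[0 .. 4];     // slicing existing memory allocates nothing
    view[0] = 1;
}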
Jan 08 2013
prev sibling parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/08/2013 08:09 PM, Jonathan M Davis wrote:
 It's a hard fact. Some features (e.g. appending to an array) require the GC
 and will always require the GC. There may be features which currently require
 the GC but shouldn't necessarily require it (e.g. AAs may fall in that camp),
 but some features absolutely require it, and there's no way around that.
... but there is also std.container.Array which, if I understand right, does its own memory management and does not require the GC, no? Which leads to the question: to what extent is it possible to use built-in arrays and std.container.Array interchangeably? What are the things you can't do with a std.container.Array that you can with a built-in one?
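If it helps, a quick hedged comparison (assuming I remember the std.container API correctly): Array manages its own storage, but its slice is a range type rather than a built-in int[], which is where interchangeability breaks down.

import std.container : Array;

void main()
{
    Array!int a;
    a.insertBack(1);
    a.insertBack(2);
    assert(a.length == 2 && a[0] == 1);

    // The edges show the difference: a[] is Array!int.Range, not int[],
    // so functions written against built-in slices won't accept it as-is.
    int sum = 0;
    foreach (x; a[])
        sum += x;
    assert(sum == 3);
}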
Jan 08 2013
prev sibling parent reply "ixid" <nuaccount gmail.com> writes:
On Monday, 7 January 2013 at 17:19:25 UTC, Jonathan M Davis wrote:
 On Monday, January 07, 2013 17:55:35 Rob T wrote:
 On Monday, 7 January 2013 at 16:12:22 UTC, mist wrote:
 How is D manual memory management any worse than plain C one?
 Plenty of language features depend on GC but stuff that is 
 left
 can hardly be named "a lousy excuse". It lacks some 
 convenience
 and guidelines based on practical experience but it is 
 already
 as capable as some of wide-spread solutions for systems
 programming (C). In fact I'd be much more afraid of runtime
 issues when doing system stuff than GC ones.
I think the point being made was that built in language features should not be dependent on the need for a GC because it means that you cannot fully use the language without a GC present and active. We can perhaps excuse the std library, but certainly not the language itself, because the claim is made that D's GC is fully optional.
I don't think that any of the documentation or D's developers have ever claimed that you could use the full language without the GC. Quite the opposite in fact. There are a number of language features that require the GC - including AAs, array concatenation, and closures. You _can_ program in D without the GC, but you lose features, and there's no way around that.

It may be the case that some features currently require the GC when they shouldn't, but there are definitely features that _must_ have the GC and _cannot_ be implemented otherwise (e.g. array concatenation and closures). So, if you want to ditch the GC completely, it comes at a cost, and AFAIK no one around here is saying otherwise. You _can_ do it though if you really want to.

In general however, the best approach if you want to minimize GC involvement is to generally use manual memory management and minimize your usage of features that require the GC rather than try and get rid of it entirely, because going the extra mile to remove its use completely generally just isn't worth it. Kith-Sa posted some good advice on this just the other day, and he's written a game engine in D:

http://forum.dlang.org/post/vbsajlgotanuhmmpnspf forum.dlang.org

- Jonathan M Davis
Just speaking as a bystander but I believe it is becoming apparent that a good guide to using D without the GC is required. We have a growing number of users who could be useful converts doing things like using it as a game engine; giving some general help with approaches and warnings about what does and doesn't require the GC would greatly smooth the process. Sadly, I lack the talent to write such a guide.
Jan 08 2013
parent Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/08/2013 03:43 PM, ixid wrote:
 Just speaking as a bystander but I believe it is becoming apparent that a good
 guide to using D without the GC is required.
I'd second that. I've tried on a couple of occasions to use D with a minimal-to-no GC approach (e.g. using std.container.Array in place of builtin arrays, etc.) and ran into difficulties. It would be very useful to have a carefully written guide or tutorial on GC-less D.
Jan 08 2013
prev sibling parent Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> writes:
On Mon, Jan 7, 2013 at 8:55 PM, Rob T <rob ucora.com> wrote:

 On Monday, 7 January 2013 at 16:12:22 UTC, mist wrote:

 How is D manual memory management any worse than plain C one?
 Plenty of language features depend on GC but stuff that is left can
 hardly be named "a lousy excuse". It lacks some convenience and guidelines
 based on practical experience but it is already as capable as some of
 wide-spread solutions for systems programming (C). In fact I'd be much more
 afraid of runtime issues when doing system stuff than GC ones.
I think the point being made was that built in language features should not be dependent on the need for a GC because it means that you cannot fully use the language without a GC present and active. We can perhaps excuse the std library, but certainly not the language itself, because the claim is made that D's GC is fully optional. --rt
You're absolutely right. D would be far better if there was a way to specify custom allocators for built-in data structures. Perhaps another magical property:

int[int] a;
a.allocator = new MyCustomAllocator;
a[5] = 5;

That's the least code-breaking way I can think of.

--
Bye,
Gor Gyolchanyan.
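Until something like that exists, the struct-with-overloaded-operators workaround mentioned earlier in the thread can at least be sketched with storage that bypasses the GC. This is only an illustration: MallocMap and its layout are invented here, and linear search stands in for real hashing to keep it short.

import core.stdc.stdlib : realloc, free;

struct MallocMap(K, V)
{
    private struct Pair { K key; V value; }
    private Pair* pairs;
    private size_t count, capacity;

    // m[key] = value
    void opIndexAssign(V value, K key)
    {
        foreach (i; 0 .. count)
            if (pairs[i].key == key) { pairs[i].value = value; return; }
        if (count == capacity)
        {
            capacity = capacity ? capacity * 2 : 4;
            pairs = cast(Pair*) realloc(pairs, capacity * Pair.sizeof);
        }
        pairs[count++] = Pair(key, value);
    }

    // m[key]
    V opIndex(K key)
    {
        foreach (i; 0 .. count)
            if (pairs[i].key == key) return pairs[i].value;
        assert(0, "key not found");
    }

    // Explicit teardown; copy/postblit handling is deliberately left out.
    void release() { free(pairs); pairs = null; count = capacity = 0; }
}

unittest
{
    MallocMap!(int, int) m;
    m[5] = 5;
    assert(m[5] == 5);
    m.release();
}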
Jan 07 2013
prev sibling parent reply Benjamin Thaut <code benjamin-thaut.de> writes:
On 07.01.2013 16:49, Gor Gyolchanyan wrote:
 On Mon, Jan 7, 2013 at 7:25 PM, David Nadlinger <see klickverbot.at
 <mailto:see klickverbot.at>> wrote:

     On Monday, 7 January 2013 at 15:01:27 UTC, Gor Gyolchanyan wrote:

         How can I have an associative array, which uses a custom allocator?


     I'm afraid the only viable solution right now is to implement your
     own AA type as a struct with overloaded operators (which is in fact
     what the built-in AAs are lowered to as well).

     There are two downside to this, though - besides, of course, the
     fact that you need a custom implementation:
       - You cannot pass your type to library functions expecting a
     built-in associative array.
       - You lose the convenient literal syntax. This could be fixed in
     the language, though, by providing a rewrite to a variadic
     constructor of user types for array/AA literals, thus eliminating
     the need for GC allocations (gah, another thing I just need to find
     the time to write up a DIP for…).

     David


 This means, that dlang.org <http://dlang.org> is lying. D doesn't
 provide both a garbage collector and manual memory management. It
 provides a garbage collector and a lousy excuse for manual memory
 management. As much as I love D for it's metaprogramming and generative
 programming, it's not even remotely fit for system-level programming the
 way it claims it is.

 I don't mean to be trolling, but it's not the first time I got grossly
 disappointed in D.

 --
 Bye,
 Gor Gyolchanyan.
You can use my GC free hashmap if you want: https://github.com/Ingrater/druntime/blob/master/src/core/hashmap.d Kind Regards Benjamin Thaut
Jan 07 2013
parent reply "nazriel" <spam dzfl.pl> writes:
On Monday, 7 January 2013 at 17:57:49 UTC, Benjamin Thaut wrote:
 On 07.01.2013 16:49, Gor Gyolchanyan wrote:
 On Mon, Jan 7, 2013 at 7:25 PM, David Nadlinger <see klickverbot.at
 <mailto:see klickverbot.at>> wrote:

    On Monday, 7 January 2013 at 15:01:27 UTC, Gor Gyolchanyan wrote:

        How can I have an associative array, which uses a custom allocator?


    I'm afraid the only viable solution right now is to implement your
    own AA type as a struct with overloaded operators (which is in fact
    what the built-in AAs are lowered to as well).

    There are two downside to this, though - besides, of course, the
    fact that you need a custom implementation:
      - You cannot pass your type to library functions expecting a
    built-in associative array.
      - You lose the convenient literal syntax. This could be fixed in
    the language, though, by providing a rewrite to a variadic
    constructor of user types for array/AA literals, thus eliminating
    the need for GC allocations (gah, another thing I just need to find
    the time to write up a DIP for…).

    David


  This means, that dlang.org <http://dlang.org> is lying. D doesn't
  provide both a garbage collector and manual memory management. It
  provides a garbage collector and a lousy excuse for manual memory
  management. As much as I love D for it's metaprogramming and generative
  programming, it's not even remotely fit for system-level programming the
  way it claims it is.

  I don't mean to be trolling, but it's not the first time I got grossly
  disappointed in D.

  --
  Bye,
  Gor Gyolchanyan.
You can use my GC free hashmap if you want: https://github.com/Ingrater/druntime/blob/master/src/core/hashmap.d Kind Regards Benjamin Thaut
Benjamin, maybe you could draft a DIP for allocators in your spare time? This is a really important thing that we needed yesterday, and putting all the work on Andrei won't help with it. Your solution seems to work very well in practice; maybe it would be possible to adapt it for Druntime/Phobos needs? Or maybe it's already fully designed and only needs polishing and a pull request? Thanks!
Jan 07 2013
parent Benjamin Thaut <code benjamin-thaut.de> writes:
On 07.01.2013 20:10, nazriel wrote:
 Benjamin, maybe you could in your spare time draw a DIP for Allocators?
 This is really very important thing we need for yesterday and putting
 all the work on Andrei won't help with it.

 Your solution seems to work very well in practice, maybe it would be
 possible to adapt it for Druntime/Phobos needs? Or maybe its already
 fully designed and only needs polishing and pull request?

 Thanks!
I'm very busy with university atm, but if I have some free time again I could think about doing this. However, I would want someone who can make a decision on this to actually look at my fork and confirm that it is wanted. I'm not going to do all the work for a pull request without some kind of confirmation. Kind Regards Benjamin Thaut
Jan 07 2013
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, January 08, 2013 20:32:29 Joseph Rushton Wakeling wrote:
 On 01/08/2013 08:09 PM, Jonathan M Davis wrote:
 It's a hard fact. Some features (e.g. appending to an array) require the GC
 and will always require the GC. There may be features which currently
 require the GC but shouldn't necessarily require it (e.g. AAs may fall in
 that camp), but some features absolutely require it, and there's no way
 around that.
... but there is also std.container.Array which if I understand right, does its own memory management and does not require the GC, no? Which leads to the question, to what extent is it possible to use built-in arrays and std.container.Arrays interchangeably? What are the things you can't do with a std.container.Array that you can with a built-in one?
std.container.Array and built-in arrays are _very_ different. Array is a container, not a range. You can slice it to get a range and operate on that, but it's not a range itself. On the other hand, built-in arrays aren't true containers. They don't own or manage their own memory in any way, shape, or form, and they're ranges. The fact that an array _is_ a slice has a _huge_ impact on arrays, and that's not the case with Array. And of course, the APIs for the two are quite different. They're not really interchangeable at all. True, you can use Array in a lot of places that you can use built-in arrays, but they're fundamentally different, and one is definitely _not_ a drop-in replacement for the other. - Jonathan M Davis
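A short sketch of that difference in practice, using only std.container's documented Array interface and built-in slices:

import std.container : Array;

void contrast()
{
    int[] slice = [1, 2, 3];       // a slice: a view whose storage the runtime/GC manages
    slice ~= 4;                    // appending may reallocate via the GC

    auto arr = Array!int(1, 2, 3); // a container: owns and manages its own storage
    arr.insertBack(4);             // explicit container API instead of ~=

    auto r = arr[];                // slicing the container yields a range over it
    assert(r.front == 1);
    assert(arr.length == 4 && slice.length == 4);
}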
Jan 08 2013
prev sibling next sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On 01/08/2013 10:43 PM, Jonathan M Davis wrote:
 std.container.Array and built-in arrays are _very_ different. Array is a
 container, not a range. You can slice it to get a range and operate on that,
 but it's not a range itself.
Is there a particular reason why Array can't have a range interface itself?
 On the other hand, built-in arrays aren't true containers. They don't own or
 manage their own memory in any way, shape, or form, and they're ranges.
Forgive the naive question, but what _is_ the definition of a 'true container'? Is managing its own memory a necessary component? Or just for D's concept of a container?
Jan 08 2013
parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
09-Jan-2013 03:37, Joseph Rushton Wakeling wrote:
 On 01/08/2013 10:43 PM, Jonathan M Davis wrote:
 std.container.Array and built-in arrays are _very_ different. Array is a
 container, not a range. You can slice it to get a range and operate on
 that, but it's not a range itself.
Is there a particular reason why Array can't have a range interface itself?
Of course, there is - it would then have to keep around the head pointer, or truly remove, say, the front element by shifting the rest of the array. Aside from awful implementation contortions it probably could be done, but I'd strongly argue against it.
 On the other hand, built-in arrays aren't true containers. They don't own or
 manage their own memory in any way, shape, or form, and they're ranges.
Forgive the naive question, but what _is_ the definition of a 'true container'? Is managing its own memory a necessary component?
Managing its own memory is required for a container. Ranges don't manage memory; they don't have insert/delete and the like. Built-in arrays are strange beasts, but for good reason.
 Or just for D's concept of a container?
-- Dmitry Olshansky
Jan 08 2013
prev sibling next sibling parent reply "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Wednesday, January 09, 2013 00:37:10 Joseph Rushton Wakeling wrote:
 On 01/08/2013 10:43 PM, Jonathan M Davis wrote:
 std.container.Array and built-in arrays are _very_ different. Array is a
 container, not a range. You can slice it to get a range and operate on
 that, but it's not a range itself.
Is there a particular reason why Array can't have a range interface itself?
It's a container. Turning a container into a range is just begging for trouble. For instance, what happens when you iterate over it? You remove all of its elements, because you keep calling popFront on it. Making a container into a range is an incredibly bad idea. Things are already weird enough with the built-in arrays.
 On the other hand, built-in arrays aren't true containers. They don't own
 or manage their own memory in any way, shape, or form, and they're
 ranges.
Forgive the naive question, but what _is_ the definition of a 'true container'? Is managing its own memory a necessary component? Or just for D's concept of a container?
A container owns and manages its elements. It may not manage their memory (e.g. it could all be garbage collected, or if its elements are references or pointers, it's really the references or pointers that it owns and manages, not what they point to, so it's not managing the memory that they point to). This is in direct contrast to a slice which is simply a view into a container. You can mess with a slice as much as you want without altering the container itself. It's just that altering the elements in the slice alters those in the container, because they're the same. You can also have multiple slices which view the same container. But you only have one copy of a given container, and so it's the owner of its elements. Copying the container would mean copying the elements themselves, whereas copying a slice would only copy the state of the view into the container and wouldn't copy any elements at all. A container contains elements. A slice is a view into a container and doesn't actually, technically contain them. It has no control over their memory or existence whatsoever. It only provides a way to view them. A lot of what makes D arrays (and therefore ranges) so confusing is the fact that arrays don't own their elements. They're slices. It's the runtime that owns and manages the elements. But people think of arrays as being containers, so they end up getting confused. - Jonathan M Davis
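A tiny sketch of the slices-as-views point, using built-in arrays only:

void views()
{
    int[] a = [1, 2, 3, 4];
    int[] b = a[1 .. 3];   // a second view into the very same elements
    b[0] = 42;

    assert(a == [1, 42, 3, 4]); // the write is visible through the first slice
    assert(b.length == 2);      // slicing copied the view, not the elements
}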
Jan 08 2013
parent "deadalnix" <deadalnix gmail.com> writes:
On Wednesday, 9 January 2013 at 00:50:29 UTC, Jonathan M Davis wrote:
 On Wednesday, January 09, 2013 00:37:10 Joseph Rushton Wakeling wrote:
 On 01/08/2013 10:43 PM, Jonathan M Davis wrote:
 std.container.Array and built-in arrays are _very_ different. Array is a
 container, not a range. You can slice it to get a range and operate on
 that, but it's not a range itself.
Is there a particular reason why Array can't have a range interface itself?
It's a container. Turning a container into a range is just begging for trouble. For instance, what happens when you iterate over it? You remove all of its elements, because you keep calling popFront on it. Making a container into a range is an incredibly bad idea. Things are already weird enough with the built-in arrays.
It seems to me like a container should be able to provide a range to iterate its content (but yeah, the container shouldn't be the range itself).
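Which is what std.container.Array already offers through slicing, if I read its interface right; a minimal sketch:

import std.container : Array;

void iterate()
{
    auto a = Array!int(1, 2, 3);

    int sum;
    foreach (x; a[])  // a[] hands out a range over the container's elements
        sum += x;

    assert(sum == 6);
    assert(a.length == 3); // consuming the range did not consume the container
}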
Jan 08 2013
prev sibling parent "Era Scarecrow" <rtcvb32 yahoo.com> writes:
On Monday, 7 January 2013 at 15:01:27 UTC, Gor Gyolchanyan wrote:
 Hello, folks!

 I'm on to a project, which requires manual memory management 
 using custom allocators, but I can't seem to get dynamic arrays 
 and associative arrays to work.

 The std.conv.emplace only allocates the pointer and the size of 
 the dynamic array and pointer to the associative array, which 
 is half the issue.

 I can work around the dynamic array by manually allocating the 
 elements of the array and returning a slice to the result 
 (although I'd be really glad if I could directly use arrays 
 with my custom allocators).
I got to Page 3 but I really wanted to comment on some of this; perhaps a step in the non-GC portion. A thought coming to mind is to modify the existing D language to include a custom allocator/deallocator. The idea of sort of borrowing from Java would do the job. So 'new' can be a function whose only purpose is to allocate memory, renew can handle resizing (appending?), and release can handle deallocation.

struct S
{
    // similar to Java's 'new' (C++ as well?) function.
    // 'this' referencing the struct's current storage/pointer location.
    new(this, int size = S.sizeof)
    {
        assert(size >= S.sizeof);
        this = cast(S) GC.malloc(size); // ctor's called after new ends
    }

    // array (multidimensional?) allocation handling.
    // new[] or newArray? Probably no ctor afterwards
    new(this, int size, int[] multiDimensional ...);

    // handles deallocation (if applicable); with GC it's probably empty.
    // Coincides with the already used releasing of memory.
    void release();

    // handles resizing (in place?), probably arrays,
    // could be realloc() and refer to the current object already...?
    // void realloc(int newSize);
    renew(this, int newSize)
    {
        S old = this.ptr;
        this.ptr = GC.realloc(this.ptr, newSize);
        if (old !is this.ptr) {} // reallocated, not appended
    }

    // potentially the new above could be used for reallocation as well, so
    // rewritten instead as:
    new(this, int size = S.sizeof)
    {
        assert(size >= S.sizeof);
        this = cast(S) GC.realloc(this, size); // ctor's called after new ends
    }
}

Just a thrown-together gist. It's kinda how I remember zlib's setup, where you could specify the allocator/deallocator, and if you didn't, it used the default. I know it's got problems as it is, but that doesn't mean it can't be used for brainstorming about how to handle alternate memory management.
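For comparison, the zlib-style arrangement can already be approximated in library code today without any language change: a type that accepts user-supplied allocate/deallocate callbacks and falls back to the C heap when none are given. A hedged sketch; RawBuffer and both callback fields are invented for illustration:

import core.stdc.stdlib : malloc, free;

struct RawBuffer
{
    void* function(size_t) allocFn = null; // user-supplied allocator, zalloc-style
    void function(void*)   freeFn  = null; // user-supplied deallocator, zfree-style
    void*  data;
    size_t size;

    void create(size_t bytes)
    {
        data = allocFn !is null ? allocFn(bytes) : malloc(bytes);
        size = bytes;
    }

    void release()
    {
        if (freeFn !is null) freeFn(data);
        else free(data);
        data = null;
        size = 0;
    }
}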
Jan 09 2013