digitalmars.D - Idea: "Explicit" Data Types
- Craig Black (44/44) Apr 01 2008 Before I get into my proposal, I want to vote for stack maps to be added...
- Craig Black (1/2) Apr 01 2008 Should read: data referenced by Foo is not allocated on the GC heap.
- janderson (3/57) Apr 01 2008 I like this idea.
- Craig Black (6/63) Apr 02 2008 I'm waiting for at least three votes before I delve more into the detail...
- Bill Baxter (5/74) Apr 02 2008 I'm not voting because it sounds like it solves a problem that I don't
- Christopher Wright (9/14) Apr 02 2008 A stack map is just a data structure (a bitvector, possibly) that
- Craig Black (7/20) Apr 02 2008 I admit I may know less about stack maps than you, but in the few cases ...
- Christopher Wright (3/10) Apr 02 2008 True. There are use cases where stack maps would hurt performance,
- Craig Black (6/9) Apr 02 2008 If you never use explicit memory management, and always use GC, then it
- Bruno Medeiros (8/50) Apr 10 2008 That seems an idea with limited to no usefulness.
- Craig Black (12/23) Apr 11 2008 I strongly disagree that this is useless. I am thinking of porting C++ ...
Before I get into my proposal, I want to vote for stack maps to be added to D. IMO, stack maps are the next logical step to making the GC faster. They don't require a fundamental shift in the library like a moving GC would. Once stack maps are added, then perhaps the following proposal should be considered to glean additional GC performance.

I'm not stuck on terminology here, so if you don't like the term "explicit" because it's too overloaded, that's fine with me. Pick another term. The concept is what's important.

This proposal is about getting GC and explicit memory management to play well together. The idea is to give the compiler information that allows the GC to scan less data, and hence perform better. Let's start with a class that uses explicit memory management.

    import std.c.stdlib;

    class Foo
    {
    public:
        // class allocator: instances come from the C heap
        new(size_t sz) { return std.c.stdlib.malloc(sz); }
        // class deallocator: instances are returned to the C heap
        delete(void* p) { std.c.stdlib.free(p); }
    }

This works fine, but doesn't tell the compiler whether data referenced by Foo is allocated on the GC heap or not. If we preceded the class with some kind of qualifier, like "explicit", this would indicate to the compiler that data referenced by Foo is not allocated on the heap. Note: this constraint can't be enforced by the compiler, but could be enforced via run-time debug assertions.

    // "explicit" is the proposed qualifier, not existing D syntax
    explicit class Foo
    {
    public:
        new(size_t sz) { return std.c.stdlib.malloc(sz); }
        delete(void* p) { std.c.stdlib.free(p); }
    }

A problem here arises because even though Foo is allocated on the malloc heap, it could contain references, pointers, or arrays that touch the GC heap. Thus, making Foo "explicit" also denotes that any reference, pointer or array contained by Foo is also explicit, and therefore does not refer to data on the GC heap. Interestingly, this means that "explicit" would have to be transitive, like D's const.

Thus, for the explicit qualifier to be useful, it must be able to be applied to a struct, class, pointer, reference, or array type. However, it doesn't make sense to apply it to primitive or POD types.

If you follow my logic you understand what explicit types can do. They inform the compiler that no GC heap data will be referenced, so that the compiler can exclude explicit types from GC scanning. Further, the use of explicit can be enforced via run-time debug assertions. Note that there are a few implementation details that I'm ignoring now for simplicity's sake.

-Craig
Apr 01 2008
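A minimal sketch of the scanning saving the proposal above is after, written as a toy model rather than a real collector. `Block`, `isExplicit`, and `wordsToScan` are hypothetical names that do not appear in the thread; the flag merely stands in for whatever an "explicit" qualifier would let the compiler tell the GC.

```d
import std.stdio : writeln;

/// Hypothetical model of one memory block as a collector might see it.
struct Block
{
    size_t[] words;   // raw contents; each word is a potential pointer
    bool isExplicit;  // stand-in for the proposed "explicit" qualifier
}

/// Counts the words a scan would have to inspect when blocks flagged
/// as explicit are skipped entirely.
size_t wordsToScan(const Block[] blocks)
{
    size_t total = 0;
    foreach (ref b; blocks)
    {
        if (b.isExplicit)
            continue; // promised to hold no references into the GC heap
        total += b.words.length;
    }
    return total;
}

void main()
{
    auto blocks = [
        Block(new size_t[](8), false),
        Block(new size_t[](1024), true), // large manually managed buffer
        Block(new size_t[](16), false),
    ];
    writeln(wordsToScan(blocks)); // prints 24
}
```

In this model the 1024-word block is never visited, which is the whole point of the qualifier: the saving grows with the amount of manually managed data.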
> data referenced by Foo is not allocated on the heap.

Should read: data referenced by Foo is not allocated on the GC heap.
Apr 01 2008
Craig Black wrote:
> Before I get into my proposal, I want to vote for stack maps to be added to D. [...]

I like this idea. ++vote
Apr 01 2008
"janderson" <askme me.com> wrote in message news:fsundp$17pd$1 digitalmars.com...Craig Black wrote:I'm waiting for at least three votes before I delve more into the details of the implementation. Seems like everybody's preoccupied with const right now though. -CraigBefore I get into my proposal, I want to vote for stack maps to be added to D. IMO, stack maps are the next logical step to making the GC faster. They don't require a fundamental shift in the library like a moving GC would. Once stack maps are added, then perhaps the following proposal should be considered to glean additional GC performance. I'm not stuck on terminology here, so if you don't like the term "explicit" because it's too overloaded, that's fine with me. Pick another term. The concept is what's important. This proposal is about getting GC and explicit memory management to play well together. The idea is to give the compiler information that allows the GC to scan less data, and hence perform better. Let's start with a class that uses explcit memory management. class Foo { public: new(size_t sz) { return std.c.stdlib.malloc(sz); } delete(void* p) { std.c.stdlib.free(p); } } This works fine, but doesn't tell the compiler whether data referenced by Foo is allocated on the GC heap or not. If we preceded the class with some kind of qualifier, like "explicit", this would indicate to the compiler that data referenced by Foo is not allocated on the heap. Note: this constraint can't be enforced by the compiler, but could be enforced via run-time debug assertions. explicit class Foo { public: new(size_t sz) { return std.c.stdlib.malloc(sz); } delete(void* p) { std.c.stdlib.free(p); } } A problem here arises because even though Foo is allocated on the malloc heap, it could contain references, pointers, or arrays that touch the GC heap. Thus, making Foo "explicit" also denotes that any reference, pointer or array contained by Foo is also explicit, and therefore does not refer to data on the GC heap. 
Interestingly, this means that "explicit" would have to be transitive, like D's const. Thus, for the explicit qualifier to be useful, it must be able to be applied to a struct, class, pointer, reference, or array type. However, it doesn't make sense to apply it to primitive or POD types. If you follow my logic you understand what explicit types can do. They inform the compiler that no GC heap data will be referenced, so that the compiler can exclude explicit types from GC scanning. Further, the use of explicit can be enforced via run-time debug assertions. Note that there are a few implementation details that I'm ignoring now for simplicity sake. -CraigI like this idea. ++vote
Apr 02 2008
Craig Black wrote:
> "janderson" <askme me.com> wrote in message news:fsundp$17pd$1 digitalmars.com...
>> I like this idea. ++vote
>
> I'm waiting for at least three votes before I delve more into the details of the implementation. Seems like everybody's preoccupied with const right now though.

I'm not voting because it sounds like it solves a problem that I don't have. Or else I just haven't understood. I don't know what stack maps are, so you kinda lost me on the first sentence.

--bb
Apr 02 2008
Bill Baxter wrote:
> I'm not voting because it sounds like it solves a problem that I don't have. Or else I just haven't understood. I don't know what stack maps are, so you kinda lost me on the first sentence.

A stack map is just a data structure (a bitvector, possibly) that records what on the stack is a pointer (and possibly what type of pointer it is). Instead of considering every word-size chunk as a pointer, you can be a lot more precise in garbage collection. It may also be a bit slower, but on the other hand, you might not have to go through as much memory on some collections. So you'll take a small, continual hit for occasional gains in speed and probably frequent gains in memory usage.
Apr 02 2008
"Christopher Wright" <dhasenan gmail.com> wrote in message news:ft1etp$eaj$1 digitalmars.com...Bill Baxter wrote:I admit I may know less about stack maps than you, but in the few cases I've read about them, they always speak of them as having a positive impact on performance. For example, if the GC runs in the middle of a recursive function that doesn't use pointers, there would be a big benefit to this. -CraigI'm not voting because it sounds like it solves a problem that I don't have. Or else I just haven't understood. I don't know what stack maps are, so you kinda lost me on the first sentence. --bbA stack map is just a data structure (a bitvector, possibly) that records what on the stack is a pointer (and possibly what type of pointer it is). Instead of considering every word-size chunk as a pointer, you can be a lot more precise in garbage collection. And possibly a bit slower, but on the other hand, you might not have to go through as much memory on some collections. So you'll take a small, continual hit for occasional gains in speed and probably frequent gains in memory usage.
Apr 02 2008
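To make the exchange above concrete, here is a toy sketch of a stack map as a per-frame bitmask, compared with a conservative scan that treats every word as a possible pointer. It is an illustration only, not how any D runtime lays out its stack; `Frame`, `pointerMask`, and both counting functions are hypothetical names, and the mask assumes at most 64 slots per frame.

```d
import std.stdio : writeln;

/// Hypothetical stack frame: raw words plus a bitmask (the "stack map")
/// recording which slots hold pointers. Assumes at most 64 slots.
struct Frame
{
    size_t[] slots;
    ulong pointerMask; // bit i set means slots[i] holds a pointer
}

/// Conservative scan: every word on the stack is a candidate pointer.
size_t conservativeCandidates(const Frame[] stack)
{
    size_t n = 0;
    foreach (ref f; stack)
        n += f.slots.length;
    return n;
}

/// Precise scan: only slots flagged in the stack map are candidates.
size_t preciseCandidates(const Frame[] stack)
{
    size_t n = 0;
    foreach (ref f; stack)
        foreach (i; 0 .. f.slots.length)
            if ((f.pointerMask >> i) & 1)
                ++n;
    return n;
}

void main()
{
    Frame[] stack;
    // 100 frames of a recursive function with 4 integer locals, no pointers
    foreach (_; 0 .. 100)
        stack ~= Frame(new size_t[](4), 0);
    // one caller frame with 3 slots, of which slots 0 and 2 are pointers
    stack ~= Frame(new size_t[](3), 0b101);

    writeln(conservativeCandidates(stack)); // prints 403
    writeln(preciseCandidates(stack));      // prints 2
}
```

The toy precise scan still walks every slot to test its bit, which loosely mirrors the "small, continual hit" mentioned above; the gain is in how few words then need to be looked up in the heap.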
Craig Black wrote:
> I admit I may know less about stack maps than you, but in the few cases I've read about them, they always speak of them as having a positive impact on performance. For example, if the GC runs in the middle of a recursive function that doesn't use pointers, there would be a big benefit to this.

True. There are use cases where stack maps would hurt performance, though these would be relatively rare and minor.
Apr 02 2008
> I'm not voting because it sounds like it solves a problem that I don't have. Or else I just haven't understood. I don't know what stack maps are, so you kinda lost me on the first sentence.

If you never use explicit memory management, and always use GC, then it probably doesn't affect you. If you use explicit memory management, then it should improve GC performance, because the GC would have less data to scan. This is about making the GC even more precise. Stack maps also make the GC more precise, so I thought I would put my vote in for them as well. Most modern GCs use stack maps.

-Craig
Apr 02 2008
Craig Black wrote:
> A problem here arises because even though Foo is allocated on the malloc heap, it could contain references, pointers, or arrays that touch the GC heap. Thus, making Foo "explicit" also denotes that any reference, pointer or array contained by Foo is also explicit, and therefore does not refer to data on the GC heap. Interestingly, this means that "explicit" would have to be transitive, like D's const.

That seems an idea with limited to no usefulness. What if you want to have a class which contains references to both GC-managed data and manually-managed data (which would certainly be a most common case)?

--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Apr 10 2008
>> A problem here arises because even though Foo is allocated on the malloc heap, it could contain references, pointers, or arrays that touch the GC heap. Thus, making Foo "explicit" also denotes that any reference, pointer or array contained by Foo is also explicit, and therefore does not refer to data on the GC heap. Interestingly, this means that "explicit" would have to be transitive, like D's const.
>
> That seems an idea with limited to no usefulness. What if you want to have a class which contains references to both GC-managed data and manually-managed data (which would certainly be a most common case)?

I strongly disagree that this is useless. I am thinking of porting C++ code to D and this would be very useful for that, since my C++ code has absolutely no GC at all. Further, GC objects could contain both explicit and non-explicit references.

BTW, I'm not stuck on this particular idea. Another strategy would be to make "explicit" non-transitive. This would allow for more control, but would require the programmer to label more things "explicit".

Either way, the basic concept is what is important. When you have GC and explicit memory management in the same application, it is beneficial for performance to tell the compiler which pointers and references are definitely not on the GC heap. Otherwise the GC is doing unnecessary work.

-Craig
Apr 11 2008
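For the mixed case raised in the last two posts, the nearest thing available in present-day D (which postdates this thread) is to do the labelling by hand at run time: memory from `malloc` is not scanned by the collector unless it is registered with `GC.addRange` from `core.memory`. The sketch below registers only the one field that can refer to the GC heap. `Node`, `makeNode`, and `freeNode` are hypothetical names, and this is an analogy to the non-transitive variant, not an implementation of the proposed qualifier.

```d
import core.memory : GC;
import core.stdc.stdlib : malloc, free;

/// Hypothetical mixed object: one GC-managed field, one manually managed field.
struct Node
{
    int[] gcData; // allocated with `new`, lives on the GC heap
    Node* next;   // allocated with malloc, never on the GC heap
}

Node* makeNode(size_t n)
{
    auto node = cast(Node*) malloc(Node.sizeof);
    assert(node !is null, "malloc failed");
    *node = Node.init;
    // Register only the gcData field, so the collector scans that field
    // and ignores the rest of the malloc'd block.
    GC.addRange(&node.gcData, typeof(node.gcData).sizeof);
    node.gcData = new int[](n);
    return node;
}

void freeNode(Node* node)
{
    GC.removeRange(&node.gcData); // unregister before the memory goes away
    free(node);
}

void main()
{
    auto node = makeNode(4);
    node.gcData[0] = 42;
    GC.collect(); // the array is reachable through the registered range
    assert(node.gcData[0] == 42);
    freeNode(node);
}
```

The cost is the one noted in the reply above for the non-transitive approach: the programmer has to mark each GC-facing field individually, and nothing checks that an unregistered field really stays off the GC heap.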