
digitalmars.D - Java String vs wchar[] Was: Re: inner classes

reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 Are you going to have string constants castable to String, BTW?
 Or any other class? That would be nice...
 Walter asks:
 What advantage does java.lang.String have? Why does string need to be a
 class?
string does not need to be a class. It is nice to be able to declare methods for it, though, at least for the sake of a Java-to-D tool or so.

The java.lang.String class has a) methods, and b) String owns its buffer - it controls the buffer.

In D this is possible:

    int[char[]] map;
    char[] s = "something";
    map[s] = 1;
    s[0] = '?'; // I have no idea what the result will be. Surely not good.

And you can bump into such a problem quite easily in D. I personally have, many times, and sometimes it is too hard to find the source.

In Java such a collision is not possible in principle: String is final and immutable. Java strings are more (I would say too) greedy but more robust.

In D I tried to create something like String, but declarations like

    str = new String("real string");

or, with structs,

    str = String("mmmm");

are just boring and aesthetically disastrous.

For Java guys such D strings will be just a source of permanent errors. To prevent collisions the Mango library (nice one!) uses two versions of classes, e.g. Dictionary/MutableDictionary - a bit overkill, imho, but it works.

Ideally in D it should be possible to reproduce at least std::string. (I am silent for now about a copy-on-write version.) I tried four times and have not yet found a reliable solution. I am pretty sure it is impossible to implement the same abstraction in D with the same overhead. A struct was a good candidate for such a wrapper, but there is no copy constructor. A class needs allocation.

Only dup so far. But the solution with dup is even worse than in Java. See:

    class Url
    {
        char[] _hostname;
        ...
        char[] hostname() { return _hostname.dup; } // Doh!
    }

    if( url.hostname == "terrainformatica.com" ) // 32 bytes less in memory, just to compare it!
        ....

Ideal from many points of view would be a solution with const:

    class Url
    {
        char[] _hostname;
        const char[] hostname() { return _hostname; } // Yep! This is exactly what we need.
    }

I think it would be enough to be able to declare as const variables of simple types - arrays and pointers.
Generally speaking const does not imply better assembler code. But const helps to build optimal and fast systems where GC spends 1% of time and not 20%. Andrew.
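The trade-off described in this post can be sketched in a few lines of D. This is only an illustration under stated assumptions: the constructor and the `hostnameCopy` method are hypothetical additions that do not appear in the `Url` class above, and the literal is duplicated up front so that the buffer is certainly writable.

```d
// Illustrative sketch; the constructor and hostnameCopy() are hypothetical.
class Url
{
    private char[] _hostname;

    this(char[] host) { _hostname = host.dup; }   // keep a private copy

    // Hands out the internal buffer: no allocation, but callers can write to it.
    char[] hostname() { return _hostname; }

    // Hands out a fresh copy: safe, but allocates on every call.
    char[] hostnameCopy() { return _hostname.dup; }
}

void main()
{
    Url url = new Url("terrainformatica.com".dup);

    char[] copy = url.hostnameCopy();
    copy[0] = '?';                                  // changes only the copy
    assert(url.hostname() == "terrainformatica.com");

    char[] view = url.hostname();
    view[0] = '?';                                  // changes the Url itself
    assert(url.hostname() == "?errainformatica.com");
}
```

Neither accessor gives what the post asks for: a result that costs nothing to return and that the compiler refuses to let the caller write to.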
May 30 2005
next sibling parent reply Derek Parnell <derek psych.ward> writes:
On Mon, 30 May 2005 22:50:29 -0700, Andrew Fedoniouk wrote:


[snip]
 
 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.
 
 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.
I'm sure you already know this, but for the benefit of others, you can avoid this trap by coding ...

    int[char[]] map;
    char[] s = "something";
    map[s.dup] = 1; // NB: .dup call.
    s[0] = '?'; // Does not mess up the index to map.

-- 
Derek
Melbourne, Australia
31/05/2005 4:05:26 PM
May 30 2005
parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message 
news:1p5feg14mh412.1x6qgemuugouf$.dlg 40tude.net...
 On Mon, 30 May 2005 22:50:29 -0700, Andrew Fedoniouk wrote:


 [snip]

 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.

 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.
 I'm sure you already know this, but for the benefit of others, you can
 avoid this trap by coding ...

 int[char[]] map;
 char[] s = "something";
 map[s.dup] = 1; // NB: .dup call.
 s[0] = '?'; // Does not mess up the index to map.
Thanks, Derek. But shall I put your recommendation into the comments for each function returning a string? Don't store, don't modify, etc.? If you put my string into your map, always dup it, etc. This is not what I would consider a technically correct solution.

Andrew.
May 31 2005
prev sibling next sibling parent reply "Walter" <newshound digitalmars.com> writes:
"Andrew Fedoniouk" <news terrainformatica.com> wrote in message
news:d7gtvf$qs0$1 digitaldaemon.com...
 java.lang.String class has a) methods b) String owns buffer - it controls
 buffer.

 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.

 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.

 In Java such collision is not possible in principle: String is final and
 immutable.
A number of languages use the immutable string idiom, and its corollary "always implicitly copy the string when writing to it". They all share another common characteristic - they're slow, and they're slow in a manner that is *not fixable*. And they're not just slower by a factor, many algorithms run *exponentially* slower because of the copying.

D must be fast, and the only way to be fast with strings (and arrays) is to not have the language implicitly copy them, but to allow the programmer the flexibility to copy or not copy.

To know when to copy, use the Copy On Write principle (COW). That is, if you're not *sure* you've got the only copy of a string, .dup it before modifying it.

So why isn't that just as bad as the languages that implicitly copy on write? The answer is that often, you know that you are the sole owner, such as:

    char[] s = new char[10];
    for (i = 0; i < 10; i++)
        s[i] = 'c';

Those other languages are doomed to make 10 copies of s. The D programmer needs to make 0 copies.

As to your example above, when you pass a reference to a string to an associative array, then you aren't the sole owner of that string anymore. Don't change it. .dup it.
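The two situations described above, sole ownership versus unknown ownership, can be put side by side. This is a minimal sketch; `makeFilled` and `withFirstChar` are hypothetical names chosen for the illustration.

```d
// Sole owner: the function allocated the buffer itself, so it may write
// in place without making any copies.
char[] makeFilled(size_t n, char c)
{
    char[] s = new char[n];
    for (size_t i = 0; i < n; i++)
        s[i] = c;
    return s;
}

// Unknown ownership: the string came from elsewhere, so duplicate it
// before writing (copy on write).
char[] withFirstChar(char[] s, char c)
{
    if (s.length == 0)
        return s;            // nothing to write, so nothing to copy
    char[] r = s.dup;
    r[0] = c;
    return r;
}

void main()
{
    char[] a = makeFilled(10, 'c');
    assert(a == "cccccccccc");

    char[] b = withFirstChar(a, '?');
    assert(a == "cccccccccc");     // the original is untouched
    assert(b == "?ccccccccc");
}
```

The copy in `withFirstChar` happens once per call, not once per written character, which is the cost difference the post is pointing at.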
May 31 2005
next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Walter" <newshound digitalmars.com> wrote in message 
news:d7h4rf$1345$1 digitaldaemon.com...
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7gtvf$qs0$1 digitaldaemon.com...
 java.lang.String class has a) methods b) String owns buffer - it controls
 buffer.

 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.

 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.

 In Java such collision is not possible in principle: String is final and
 immutable.
A number of languages use the immutable string idiom, and its corollary "always implicitly copy the string when writing to it". They all share another common characteristic - they're slow, and they're slow in a manner that is *not fixable*. And they're not just slower by a factor, many algorithms run *exponentially* slower because of the copying. D must be fast, and the only way to be fast with strings (and arrays) is to not have the language implicitly copy them, but to allow the programmer the flexibility to copy or not copy. To know when to copy, use the Copy On Write principle (COW). That is, if you're not *sure* you've got the only copy of a string, .dup it before modifying it. So why isn't that just as bad as the languages that implicitly copy on write? The answer is that often, you know that you are the sole owner, such as: char[] s = new char[10]; for (i = 0; i < 10; i++) s[i] = 'c'; Those other languages are doomed to make 10 copies of s. The D programmer needs to make 0 copies. As to your example above, when you pass a reference to a string to an associative array, then you aren't the sole owner of that string anymore. Don't change it. .dup it.
Gotcha. And what will be your advice then for:

    class Url {
      char[] _hostname;
      char[] hostname() { return _hostname; }
    }

_hostname should not be changeable, neither intentionally nor accidentally. The hostname access pattern is primarily read, but it could possibly be passed to some third-party functions.

I am serious. I really want to know how to design it better.

I've made an ugly

    struct string {
      wchar[] chars;
      bool mutable;
    }

But this does not work in 15% of cases.

I am remembering the good old days of C programming with these char[]s. Damned fast but not maintainable. In C++ I have my own nice tool::string with reliable copy-on-write..... sigh.

Andrew.
May 31 2005
next sibling parent reply =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
Andrew Fedoniouk wrote:

 And what will be your advice then for:
 
 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }
 
 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.
 
 I am serious. I really want to know how to design it better.
Magic Eight Ball says:

        ___
       /   \
      /     \
     /  ASK  \
    /  AGAIN  \
   /   LATER   \
   \___________/

My own prediction is that we argue about it for a few months more, and then Walter caves in and adds a "readonly" keyword to D... :-)

For the time being, I think returning the string and asking others to be nice is better than using a class or a struct?
 I am remembering old good days of C programming with these char[]s.
 Damned fast but not maintainable.
That's where we are at now, I suppose. I've already run into some things regarding string literals. And that was even before any potential class library user... Copy on Write is currently just a Gentlemen's Agreement. And it needs the client using Url.hostname to play along. --anders
May 31 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Anders F Björklund" <afb algonet.se> wrote in message 
news:d7ha36$18df$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:

 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.
Magic Eight Ball says: ___ / \ / \ / ASK \ / AGAIN \ / LATER \ \___________/ My own prediction is that we argue about it for a few months more, and then Walter caves in and adds a "readonly" keyword to D... :-) For the time being, I think returning the string and asking others to be nice is better than using a Class or a struct ?
 I am remembering old good days of C programming with these char[]s.
 Damned fast but not maintainable.
That's where we are at now, I suppose. I've already run into some things regarding string literals. And that was even before any potential class library user... Copy on Write is currently just a Gentlemen's Agreement. And it needs the client using Url.hostname to play along.
Yep, Anders, seems like we are on the same track.

The thing is: I really don't know of a good style of library/component design in D. If I am writing all the code in one EXE by myself - fine - I am a gentleman with myself. Well... even with myself... 'today me' and 'yesterday me' are frequently different persons.

But! If I am designing a library for common use....

I can imagine the documentation starting with:

"For gentlemen only...."
May 31 2005
parent reply "Kris" <fu bar.com> writes:
"Andrew Fedoniouk" <news terrainformatica.com> wrote ...
 "Anders F Björklund" <afb algonet.se> wrote in message
 news:d7ha36$18df$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:

 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.
Magic Eight Ball says: ___ / \ / \ / ASK \ / AGAIN \ / LATER \ \___________/ My own prediction is that we argue about it for a few months more, and then Walter caves in and adds a "readonly" keyword to D... :-) For the time being, I think returning the string and asking others to be nice is better than using a Class or a struct ?
 I am remembering old good days of C programming with these char[]s.
 Damned fast but not maintainable.
That's where we are at now, I suppose. I've already run into some things regarding string literals. And that was even before any potential class library user... Copy on Write is currently just a Gentlemen's Agreement. And it needs the client using Url.hostname to play along.
Yep, Anders, seems like we are on the same track with you.
Aye. There's a fair number of people who share this concern, and we all seem to be asking for the same thing. I think/hope it's a matter of time rather than legitimacy.
 The thing is: I really don't know of a good style of
 library/component design in D.
FWIW: I've been forced down the path of the Gentleman's agreement, with the expectation that 'readonly' will materialize in some form. It is possible to minimise the occurrence of such things, but it doesn't always provide the most lightweight implementation (as you noted elsewhere).
 If I am writing all code in one EXE by myself -
 fine - I am gentleman with myself. Well... even with
 myself... 'Today me' and 'yesterday me' frequently
 different persons.

 But! If I am designing library of common use....

 I can imagine: documentation starts from:

 "For gentlemen only...."
Good one! Perhaps we should come up with a GLA: "Gentleman's License Agreement" :-)
May 31 2005
next sibling parent reply Sean Kelly <sean f4.ca> writes:
In article <d7ic55$2df4$1 digitaldaemon.com>, Kris says...
FWIW: I've been forced down the path of the Gentleman's agreement, with the
expectation that 'readonly' will materialize in some form. It is possible to
minimise the occurrence of such things, but it doesn't always provide the
most lightweight implementation (as you noted elsewhere).
FWIW, logical const behavior can *almost* be modeled decently using a sort of honor system as well:

The obvious problems with the above are that (1) it requires a lot of cooperation and (2) the read function has no easy way of storing whether C was mutable *before* the function was called, so it may 'restore' the wrong state (though this issue could be solved with thread local storage, at the expense of added complexity).

Sean
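One way such an honor system might look is sketched below. It is a guess built only on the `mutable` flag mentioned in this thread, not the code from the post: the class layout, the `value` accessors and the `readOnly` function are all hypothetical. The sketch switches the flag inside the function body, where a local variable can remember the caller's state; doing the same in separate contract clauses has no such shared local, which is the difficulty named in point (2).

```d
// Hypothetical honor-system class: writes are allowed only while the
// 'mutable' flag is set.
class C
{
    bool mutable = true;
    private int _value;

    int value() { return _value; }

    void value(int v)
    {
        assert(mutable);     // honor-system check; compiled out in release builds
        _value = v;
    }
}

// A function that promises to only read. It locks the object for the
// duration of the call and then restores whatever state the caller had.
int readOnly(C c)
{
    bool saved = c.mutable;  // remember the caller's state
    c.mutable = false;
    int result = c.value() * 2;
    c.mutable = saved;       // restore, rather than blindly setting true
    return result;
}

void main()
{
    C c = new C;
    c.value(21);
    assert(readOnly(c) == 42);
    assert(c.mutable);       // still writable afterwards

    c.mutable = false;       // caller locked it beforehand
    assert(readOnly(c) == 42);
    assert(!c.mutable);      // and it stays locked
}
```

Every accessor and every read-only function has to cooperate for this to hold, which is point (1), and the check runs at run time rather than at compile time.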
May 31 2005
parent reply Brad Beveridge <brad somewhere.net> writes:
Sean Kelly wrote:
 In article <d7ic55$2df4$1 digitaldaemon.com>, Kris says...
 
FWIW: I've been forced down the path of the Gentleman's agreement, with the
expectation that 'readonly' will materialize in some form. It is possible to
minimise the occurrence of such things, but it doesn't always provide the
most lightweight implementation (as you noted elsewhere).
FWIW, logical const behavior can *almost* be modeled decently using a sort of honor system as well: The obvious problems with the above are that (1) it requires a lot of cooperation and (2) the read function has no easy way of storing whether C was mutable *before* the function was called, so it may 'restore' the wrong state (though this issue could be solved with thread local storage, as the expense of added complexity). Sean
Do you have any ideas on how this could work with basic data types and arrays? With classes, is it possible for the compiler to generate accessors for all members automatically? That would ease implementation details. Perhaps a template could be created so that class code could look like class C { bit mutable; mixin constcapable!(int) someInt; } Brad
May 31 2005
next sibling parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Brad Beveridge" <brad somewhere.net> wrote in message 
news:d7ifkp$2goj$1 digitaldaemon.com...
 Sean Kelly wrote:
 In article <d7ic55$2df4$1 digitaldaemon.com>, Kris says...

FWIW: I've been forced down the path of the Gentleman's agreement, with 
the
expectation that 'readonly' will materialize in some form. It is possible 
to
minimise the occurrence of such things, but it doesn't always provide the
most lightweight implementation (as you noted elsewhere).
FWIW, logical const behavior can *almost* be modeled decently using a sort of honor system as well: The obvious problems with the above are that (1) it requires a lot of cooperation and (2) the read function has no easy way of storing whether C was mutable *before* the function was called, so it may 'restore' the wrong state (though this issue could be solved with thread local storage, as the expense of added complexity). Sean
Do you have any ideas on how this could work with basic data types and arrays? With classes, is it possible for the compiler to generate accessors for all members automatically? That would ease implementation details. Perhaps a template could be created so that class code could look like class C { bit mutable; mixin constcapable!(int) someInt; } Brad
The main problem is that for basic types there is no solution in principle. At least I did not find a proper idiom. I ended up with a "solution" of patching std.internals to add a readonly flag to arrays. But this does not work for pointers. The only feasible solution is const - checks at compile time and not at runtime.

Andrew.
May 31 2005
prev sibling parent Sean Kelly <sean f4.ca> writes:
In article <d7ifkp$2goj$1 digitaldaemon.com>, Brad Beveridge says...
Do you have any ideas on how this could work with basic data types and 
arrays?
I don't know that this is worth doing for POD types because they're copied when passed as 'in' parameters anyway. But I suppose strings could be an issue. One possibility would be to fingerprint the memory using a checksum and verify that fingerprint in the out clause. Perhaps someone has a better suggestion?
With classes, is it possible for the compiler to generate 
accessors for all members automatically?
Certainly. If DBC became a popular means for verifying const correctness I'd suggest that the compiler offer a means to do this. I don't think it would be too terribly complicated, but I haven't given it much thought.
That would ease implementation 
details.  Perhaps a template could be created so that class code could 
look like
class C
{
	bit mutable;
	mixin constcapable!(int) someInt;
}
Definitely a possibility. My only concern with the DBC method is that it requires the library writer to build stuff in to support it to keep the code clean. The client could do the switching instead:

    c.mutable = false;
    func( c, d );
    c.mutable = true;

but this is obviously pretty clunky. It might be possible to do this with auto classes:

but this is very clunky and doesn't even offer the potential to streamline it because of auto lifetime rules, though it's worth noting that the above example prints this:

    pre
    ctor
    post
    dtor

and wrapping the function call in its own scope doesn't help:

    {
        func( cast(C)(new SetConst!(C)( c )) );
    }

It's little things like this that have me cursing the restriction that structs can't have ctors. Not only does the above code require a completely pointless memory allocation, but the lifetime of the class isn't even what it should be (though I'd consider this latter issue to be a bug).

Sean
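The fingerprint idea raised earlier in this post for strings can be sketched without contract syntax. Everything here is an assumption made for the illustration: the checksum is a plain polynomial hash, and `fingerprint`, `leavesUnchanged`, `reader` and `writer` are hypothetical names.

```d
// Simple polynomial checksum; any hash would do for the illustration.
uint fingerprint(char[] s)
{
    uint h = 0;
    foreach (char c; s)
        h = h * 31 + c;
    return h;
}

// Runs f on s and reports whether the contents look unchanged afterwards.
bool leavesUnchanged(char[] s, void function(char[]) f)
{
    uint before = fingerprint(s);
    f(s);
    return fingerprint(s) == before;
}

void reader(char[] s) { }                               // only looks
void writer(char[] s) { if (s.length) s[0] = '?'; }     // breaks the agreement

void main()
{
    char[] host = "terrainformatica.com".dup;
    assert(leavesUnchanged(host, &reader));
    assert(!leavesUnchanged(host, &writer));
}
```

The check walks the whole string twice per call and can in principle miss a change that happens to produce the same checksum, so it is a debugging aid rather than a guarantee.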
May 31 2005
prev sibling parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Kris" <fu bar.com> wrote in message news:d7ic55$2df4$1 digitaldaemon.com...
 "Andrew Fedoniouk" <news terrainformatica.com> wrote ...
 "Anders F Björklund" <afb algonet.se> wrote in message
 news:d7ha36$18df$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:

 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.
Magic Eight Ball says: ___ / \ / \ / ASK \ / AGAIN \ / LATER \ \___________/ My own prediction is that we argue about it for a few months more, and then Walter caves in and adds a "readonly" keyword to D... :-) For the time being, I think returning the string and asking others to be nice is better than using a Class or a struct ?
 I am remembering old good days of C programming with these char[]s.
 Damned fast but not maintainable.
That's where we are at now, I suppose. I've already run into some things regarding string literals. And that was even before any potential class library user... Copy on Write is currently just a Gentlemen's Agreement. And it needs the client using Url.hostname to play along.
Yep, Anders, seems like we are on the same track with you.
Aye. There's a fair number of people who share this concern, and we all seem to be asking for the same thing. I think/hope it's a matter of time rather than legitimacy.
 The thing is: I really don't know of a good style of
 library/component design in D.
FWIW: I've been forced down the path of the Gentleman's agreement, with the expectation that 'readonly' will materialize in some form. It is possible to minimise the occurrence of such things, but it doesn't always provide the most lightweight implementation (as you noted elsewhere).
 If I am writing all code in one EXE by myself -
 fine - I am gentleman with myself. Well... even with
 myself... 'Today me' and 'yesterday me' frequently
 different persons.

 But! If I am designing library of common use....

 I can imagine: documentation starts from:

 "For gentlemen only...."
Good one! Perhaps we should come up with a GLA: "Gentleman's License Agreement" :-)
Cool! :) My variation: "Dgentleman's License Agreement" Andrew.
May 31 2005
prev sibling parent reply "Walter" <newshound digitalmars.com> writes:
"Andrew Fedoniouk" <news terrainformatica.com> wrote in message
news:d7h66r$14s5$1 digitaldaemon.com...
 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.
Third party functions should follow the COW principle too. They should not modify strings that they don't know they are the owner of. Look at std.string.tolower for an example of this.
 I am remembering old good days of C programming with these char[]s.
 Damned fast but not maintainable.
 In C++ I have my own nice tool::string with reliable
 copy-on-write..... sigh.
The C++ std::string is slower than D strings. www.digitalmars.com/d/cppstrings.html C++ strings have another serious problem (that D doesn't have): you have to keep track of who the owner is, so it can be deleted (else you get a memory leak). In my experience, that is a LOT harder to get right than adhering to COW. Trying to absolutely determine ownership, like C++ does, is much harder than just being able to assume you don't own it.
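The copy-on-write convention that the post points to in std.string.tolower can be illustrated with a small ASCII-only function. This is a sketch written for this example, not the Phobos source; `lowerAscii` is a hypothetical name.

```d
// Hypothetical helper in the copy-on-write style; not the Phobos source.
char[] lowerAscii(char[] s)
{
    char[] r = s;            // start by sharing the caller's data
    bool copied = false;

    for (size_t i = 0; i < s.length; i++)
    {
        char c = s[i];
        if (c >= 'A' && c <= 'Z')
        {
            if (!copied)
            {
                r = s.dup;   // first change needed: copy now, once
                copied = true;
            }
            r[i] = cast(char)(c + ('a' - 'A'));
        }
    }
    return r;
}

void main()
{
    char[] plain = "hostname".dup;
    assert(lowerAscii(plain).ptr is plain.ptr);   // nothing to change: no copy

    char[] mixed = "HostName".dup;
    char[] lowered = lowerAscii(mixed);
    assert(lowered == "hostname");
    assert(mixed == "HostName");                  // input left untouched
}
```

The function never writes to its argument and allocates only when it has something to change. The caller, in turn, cannot tell from the return type whether it received its own string back, so it has to treat the result as shared as well.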
May 31 2005
parent reply Derek Parnell <derek psych.ward> writes:
On Tue, 31 May 2005 19:37:03 -0700, Walter wrote:

 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7h66r$14s5$1 digitaldaemon.com...
 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.
Third party functions should follow the COW principle too. They should not modify strings that they don't know they are the owner of.
Yes, and cyclists shouldn't run red lights either. We have to code in a world in which many people using our libraries don't care about what they 'should' do; they use anything that seems like an expedient idea at the time. Yes, I know that not following the CoW rules is dangerous, but it's not as dangerous as cyclists running red lights, and they continue to do that.

-- 
Derek
Melbourne, Australia
1/06/2005 12:56:20 PM
May 31 2005
parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message 
news:7aqw8u524dge$.1hmvgp4jz3dvc.dlg 40tude.net...
 On Tue, 31 May 2005 19:37:03 -0700, Walter wrote:

 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7h66r$14s5$1 digitaldaemon.com...
 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.
Third party functions should follow the COW principle too. They should not modify strings that they don't know they are the owner of.
Yes, and cyclists shouldn't run red lights either. We have to code in a world in which many people using our libraries don't care about what they 'should' do; they use anything that seems like an expedient idea at the time. Yes I know that not following the CoW rules is dangerous, but its not as dangerous as cyclists running red lights and they continue to do that.
hmm. around here it isn't the cyclists that run red lights - it's the things with 4 wheels and that unused pedal called the "brake". :-P But more to topic I'm with Walter that when you look at the big picture COW is a reasonable balance of trade-offs. The only suggestion I have is to put COW more front-and-center in the array help so that people see it from the start and it becomes second nature. Compiler protection against malicious code isn't that important to me since people will go out of their way to write malicious code no matter what the compiler does. I'm more worried about the accidental D-newbie who doesn't know about arrays or COW. For those cases talking about COW right away in the doc will decrease the likelihood of newbie errors.
May 31 2005
parent reply kris <fu bar.org> writes:
Ben Hinkle wrote:
And what will be your advice then for:

class Url {
  char[] _hostname;
  char[] hostname() { return _hostname; }
}

_hostname should not be changeable nor intentionally
nor accidentally.
hostname access pattern is primarily read. But it could possibly be
passed in some third party functions.

I am serious. I really want to know how to design it better.
Third party functions should follow the COW principle too. They should not modify strings that they don't know they are the owner of.
Yes, and cyclists shouldn't run red lights either. We have to code in a world in which many people using our libraries don't care about what they 'should' do; they use anything that seems like an expedient idea at the time. Yes I know that not following the CoW rules is dangerous, but its not as dangerous as cyclists running red lights and they continue to do that.
hmm. around here it isn't the cyclists that run red lights - it's the things with 4 wheels and that unused pedal called the "brake". :-P But more to topic I'm with Walter that when you look at the big picture COW is a reasonable balance of trade-offs. The only suggestion I have is to put COW more front-and-center in the array help so that people see it from the start and it becomes second nature. Compiler protection against malicious code isn't that important to me since people will go out of their way to write malicious code no matter what the compiler does. I'm more worried about the accidental D-newbie who doesn't know about arrays or COW. For those cases talking about COW right away in the doc will decrease the likelihood of newbie errors.
Ben; Walter; I think perhaps you're missing a significant point being made? CoW is not the issue at stake ~ instead, what's being asked for is a mechanism to /enforce/ CoW.

For example: the little example above should not be dup'ing the content before return, if it's only being used for reference (read-only) purposes by both parties (caller and callee). I think we can all agree on that? Yes?

What's being asked for is a means whereby the compiler will 'prohibit' some other caller from using the returned array as a writable lvalue; at compile time. That is, the CoW should be performed by the caller (not the callee), /if and when the caller needs to perform a write upon it/. And only at that time.

Again, CoW is not being questioned. It's the total lack of enforcement that would be good to do something about. The compiler goes out of its way to catch out-of-bounds errors WRT arrays ~ we're asking for something similar here to avoid a source of silly, easily preventable, and hard to track down bugs. It would add some noticeable weight to any story regarding robustness.

Turn things around for a minute, and assume such a facility was available. It's not hard to see how this would be viewed in a most favourable light. And there's no downside for the code, or for the developer. Best of all worlds?
May 31 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"kris" <fu bar.org> wrote in message news:d7jemq$k2n$1 digitaldaemon.com...
 Ben Hinkle wrote:
And what will be your advice then for:

class Url {
  char[] _hostname;
  char[] hostname() { return _hostname; }
}

_hostname should not be changeable nor intentionally
nor accidentally.
hostname access pattern is primarily read. But it could possibly be
passed in some third party functions.

I am serious. I really want to know how to design it better.
Third party functions should follow the COW principle too. They should not modify strings that they don't know they are the owner of.
Yes, and cyclists shouldn't run red lights either. We have to code in a world in which many people using our libraries don't care about what they 'should' do; they use anything that seems like an expedient idea at the time. Yes I know that not following the CoW rules is dangerous, but its not as dangerous as cyclists running red lights and they continue to do that.
hmm. around here it isn't the cyclists that run red lights - it's the things with 4 wheels and that unused pedal called the "brake". :-P But more to topic I'm with Walter that when you look at the big picture COW is a reasonable balance of trade-offs. The only suggestion I have is to put COW more front-and-center in the array help so that people see it from the start and it becomes second nature. Compiler protection against malicious code isn't that important to me since people will go out of their way to write malicious code no matter what the compiler does. I'm more worried about the accidental D-newbie who doesn't know about arrays or COW. For those cases talking about COW right away in the doc will decrease the likelihood of newbie errors.
Ben; Walter; I think perhaps you're missing a significant point being made? CoW is not the issue at stake ~ instead, what's being asked for is a mechanism to /enforce/ CoW. For example: the little example above should not be dup'ing the content before return, if it's only being used for reference (read-only) purposes by both parties (caller and callee). I think we can all agree on that? Yes? What's being asked for is a means whereby the compiler will 'prohibit' some other caller from using the returned array as a writable lValue; at compile time. That is, the CoW should be performed by the caller (not the callee), /if and when the caller needs to perform a write upon it/. And only at that time.
*exactly*.
 Again, CoW is not being questioned. It's the total lack of enforcement 
 that would be good to do something about. The compiler goes out of its way 
 to catch out-of-bounds errors WRT arrays ~ we're asking for something 
 similar here to avoid a source of silly, easily preventable, and hard to 
 track down bugs. It would add some noticeable weight to any story regarding 
 robustness.

 Turn things around for a minute, and assume such a facility was available. 
 It's not hard to see how this would be viewed in a most favourable light. 
 And there's no downside for the code, or for the developer. Best of all 
 worlds?
The proposed const costs nothing at run time. Even better, it helps to reduce unnecessary allocations. Let's take a look at some code fragments from Phobos:

module std.openrj -----------------------------------

class Record
{
    Field[] fields() { return m_fields.dup; } // just in case?
    // probably the following is better?
    // const Field[] fields() { return m_fields; }
}

class Database
{
    Record[] records() { return m_records.dup; } // the same
    Field[]  fields()  { return m_fields.dup; }
}

std.file ----------------------------------------------

class FileException : Exception
{
    this(char[] name, uint errno)
    {
        char* s = strerror(errno);              // I have no idea
        this(name, std.string.toString(s).dup); // what is going on here.
        this.errno = errno;
    }
}

void listdir(char[] pathname, bool delegate(char[] filename) callback)
{
    ....
    int len = std.string.strlen(fdata.d_name);
    if (!callback(fdata.d_name[0 .. len].dup)) // is dup really needed here???
                                               // allocation of a new string on each entry! Doh!
    ....
}

std.loader ----------------------------------------------

public class ExeModuleException : Exception
{
    this(uint errcode)
    {
        super(std.string.toString(strerror(errcode)).dup); // why?
    }
}

std.socket ----------------------------------------------

void populate(protoent* proto)
{
    type = cast(ProtocolType)proto.p_proto;
    name = std.string.toString(proto.p_name).dup; // why?
    ....
    aliases = new char[][i];
    for (i = 0; i != aliases.length; i++)
    {
        aliases[i] = std.string.toString(proto.p_aliases[i]).dup; // what for?
    }
    ....
}
----------------------------------------
etc.
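As a rough illustration of the Record.fields case, here is a sketch in present-day D, where const(T)[] is available. That syntax postdates this thread, and the names used here (the int[] stand-in for Field[], fieldsCopy) are invented for the example. It contrasts the defensive .dup with a read-only view that allocates nothing.

```d
class Record
{
    private int[] m_fields;            // hypothetical stand-in for Field[]

    this(int[] fields) { m_fields = fields; }

    // Defensive copy: allocates a new array on every call.
    int[] fieldsCopy() { return m_fields.dup; }

    // Read-only view: shares the storage; writes through it do not compile.
    const(int)[] fields() const { return m_fields; }
}
```

A caller that really needs to modify the data can still call .dup on the view, so the copy happens only where a write happens.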
Jun 05 2005
parent reply Ben Hinkle <Ben_member pathlink.com> writes:
Are the comments in the code above editorial by you or are they actually in the code? I'd say someone needs to look at Phobos to clean up the dups. If you've already done the sweep then sending Walter the fixes would be helpful. Phobos can contain non-D programming styles occasionally - each module has a strong indication of the author's attitudes IMHO.

ps - for kicks this weekend I've been adding a parameter to the MinTL containers to indicate read-only vs read-write. For example

    struct List(Value, bit ReadOnly = false) {
      static if (!ReadOnly) {
        void addTail(Value v){...}
        .. other functions that modify the list ...
      }
      .. functions that don't modify the list ...
    }

You get a read-only view of a container by using the "readonly" property. I'll be finishing this stuff up soon and post to D.dtl later in the week.
Jun 05 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Ben Hinkle" <Ben_member pathlink.com> wrote in message 
news:d80fq2$2gfv$1 digitaldaemon.com...
Are the comments in the code above editorial by you or are they actually in the code? I'd say someone needs to look at phobos to clean up the dups. If you've already done the sweep then sending Walter the fixes would be helpful. Phobos can contain non-D programming styles occasionally - each module has a strong indication of the author's attitudes IMHO.
Yes, the comments are mine. I took a look again:

1) std.socket is fine: std.string.toString(proto.p_aliases[i]).dup is really needed, as proto.p_aliases[i] is a temporary string coming from the system (sockets).

2) std.string.toString(strerror(errcode)).dup probably makes sense, as the string there comes from extern (C) char* strerror(int);

3) For std.openrj I don't really know Matthew's intention. It seems that he either needs such an implementation for some reason, or is following the .dup advice for safe library modules. No idea, in short. const would help in such cases.

4) std.file -> void listdir(char[] pathname, bool delegate(char[] filename) callback): the way it is implemented, and without const char[] filename, the .dup seems to be the only reliable way to accomplish this, as fdata.d_name comes from a static system buffer. The other option would be to create a temp buffer on the stack, copy the string value there and pass a reference to that buffer. But this is still a copy on each iteration. With const such a string could be passed 'as is'.
 ps - for kicks this weekend I've been adding a parameter to the MinTL 
 containers
 to indicate read-only vs read-write. For example
 struct List(Value, bit ReadOnly = false) {
 static if (!ReadOnly) {
 void addTail(Value v){...}
 .. other functions that modify the list ...
 }
 .. functions that don't modify the list ...
 }
 You get a read-only view of a container by using the "readonly" property. 
 I'll
 be finishing this stuff up soon and post to D.dtl later in the week.
(no offence, just wondering) I really don't know how such a static flag would work. Suppose I declared:

    List(int, true /*ReadOnly*/) readOnlyList;

How would I fill this list then? You will need some mechanism that allows a mutable->immutable conversion, right?

My perception is that something like const_iterator just needs to be designed. Or a ConstList without modification methods, whose sole ctor is ConstList(List data).

Andrew.
Jun 05 2005
parent reply "Ben Hinkle" <ben.hinkle gmail.com> writes:
(no offence, just wondering) I really don't know how such static flag would work. Suppose I declared: List(int, true /*ReadOnly*/) readOnlyList; How to fill this list then? You will need some mechanism allowing to do mutable->immutable conversion, right?
That's what the "readonly" property does. For example

    void foo(List!(int,ReadOnly) y) { ... }
    List!(int) x;
    x.add(10,20,30);
    foo(x.readonly);

The 'ReadOnly' in the code above is a constant defined to be true. I think it improves readability of the code. If one wants to "cast away the const" I also added a property x.readwrite.
 My perception is that just something like const_iterator
 needs to be designed. Or ConstList without modification
 methods which will have sole ctor ConstList(List data).
Those are equivalent to iterator!(true) where iterator(bit Const = false) and List!(true) where List(bit Const = false). The "constness" becomes part of the type just like the types List/ConstList or iterator/const_iterator.
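A minimal sketch of how such a compile-time flag might look, written for a current D compiler (so bool instead of bit, which is no longer accepted). This is not the MinTL code: the array-backed storage and the way the readonly property is built are assumptions made only for illustration.

```d
struct List(Value, bool ReadOnly = false)
{
    private Value[] items;   // assumed storage; a real list would differ

    static if (!ReadOnly)
    {
        // Mutating functions exist only in the read-write instantiation.
        void addTail(Value v) { items ~= v; }

        // Read-only view over the same storage; nothing is copied.
        List!(Value, true) readonly() { return List!(Value, true)(items); }
    }

    // Functions that do not modify the list exist in both instantiations.
    size_t length() const { return items.length; }
    const(Value) opIndex(size_t i) const { return items[i]; }
}
```

Because the flag is part of the type, calling addTail on the read-only view is rejected at compile time rather than detected at run time.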
Jun 06 2005
prev sibling parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Ben Hinkle" <ben.hinkle gmail.com> wrote in message 
news:d81fb2$7el$1 digitaldaemon.com...
(no offence, just wondering) I really don't know how such static flag would work. Suppose I declared: List(int, true /*ReadOnly*/) readOnlyList; How to fill this list then? You will need some mechanism allowing to do mutable->immutable conversion, right?
That's what the "readonly" property does. For example void foo(List!(int,ReadOnly) y) { ... } List!(int) x; x.add(10,20,30); foo(x.readonly); The 'ReadOnly' in the code above is a constant defined to be true. I think it improves readability of the code. If one wants to "cast away the const" I also added a property x.readwrite.
 My perception is that just something like const_iterator
 needs to be designed. Or ConstList without modification
 methods which will have sole ctor ConstList(List data).
Those are equivalent to iterator!(true) where iterator(bit Const = false) and List!(true) where List(bit Const = false). The "constness" becomes part of the type just like the types List/ConstList or iterator/const_iterator.
Yep. It will work. It will double the size of the generated code, as you need a second instantiation of the template (the readonly version). But such a specialized wrapper is definitely a solution for containers in places where robustness and modularity are the main requirements.

I wish it were possible to say something like char[].readonly for primary containers and pointers, as in most cases they are quite sufficient and more efficient.

I took a look at the compiler code. It seems I can implement constness for arrays and pointers pretty easily, as everything is already there.

Another option would be to implement an extended/selective typedef notation:

    typedef char[] cstring: opSlice, opIndexAssign; // list of operations derived from the base type

A bit ugly, but it would work.

Andrew.
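Short of language support, the "array without write operations" idea can be approximated by a wrapper that simply leaves the write operations out. The sketch below targets present-day D, and ReadOnlyArray, slice and the rest are hypothetical names, not an existing library type. It is only meant for element types without indirections, such as char.

```d
struct ReadOnlyArray(T)
{
    private const(T)[] data;

    this(const(T)[] source) { data = source; }   // shares storage, no copy

    size_t length() const { return data.length; }

    // Returns the element by value, so there is nothing to assign to.
    T opIndex(size_t i) const { return data[i]; }

    // Sub-view over the same storage.
    ReadOnlyArray slice(size_t lo, size_t hi) const
    {
        return ReadOnlyArray(data[lo .. hi]);
    }

    // The only way to get something writable is an explicit copy.
    T[] dup() const { return data.dup; }
}
```

The cost is the one mentioned above: every function that wants to accept such a view has to name the wrapper type instead of plain char[].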
Jun 06 2005
prev sibling next sibling parent reply U.Baumanis <U.Baumanis_member pathlink.com> writes:
How about an immutable final String for general stuff and a StringBuffer (or
whatever) for performance needs?

ubau

In article <d7h4rf$1345$1 digitaldaemon.com>, Walter says...
"Andrew Fedoniouk" <news terrainformatica.com> wrote in message
news:d7gtvf$qs0$1 digitaldaemon.com...
 java.lang.String class has a) methods b) String owns buffer - it controls
 buffer.

 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.

 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.

 In Java such collision is not possible in principle: String is final and
 immutable.
A number of languages use the immutable string idiom, and its corollary "always implicitly copy the string when writing to it". They all share another common characteristic - they're slow, and they're slow in a manner that is *not fixable*. And they're not just slower by a factor, many algorithms run *exponentially* slower because of the copying.

D must be fast, and the only way to be fast with strings (and arrays) is to not have the language implicitly copy them, but to allow the programmer the flexibility to copy or not copy.

To know when to copy, use the Copy On Write principle (COW). That is, if you're not *sure* you've got the only copy of a string, .dup it before modifying it. So why isn't that just as bad as the languages that implicitly copy on write? The answer is that often, you know that you are the sole owner, such as:

    char[] s = new char[10];
    for (i = 0; i < 10; i++)
        s[i] = 'c';

Those other languages are doomed to make 10 copies of s. The D programmer needs to make 0 copies.

As to your example above, when you pass a reference to a string to an associative array, then you aren't the sole owner of that string anymore. Don't change it. .dup it.
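To make the COW convention concrete, here is a small sketch of a function that follows it: it returns its argument untouched when there is nothing to change, and takes a private copy just before the first write. The function name is invented for the example, and nothing in the types enforces the convention, which is the point being argued in this thread.

```d
// Copy-on-write by convention: copy only if a write is actually needed.
char[] capitalizeFirst(char[] s)
{
    if (s.length == 0 || s[0] < 'a' || s[0] > 'z')
        return s;                          // nothing to change: no allocation

    char[] r = s.dup;                      // not the sole owner: copy first
    r[0] = cast(char)(r[0] - 'a' + 'A');   // write only to the private copy
    return r;
}
```

The caller cannot tell from the signature whether the result aliases the argument, so it has to apply the same rule again before writing to it.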
May 31 2005
parent reply =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
U.Baumanis wrote:

 How about immutable final String for general stuff end StringBuffer (or
 whatever) for performance needs.
If you want some Java-like string classes, I hacked some stuff together:

http://www.algonet.se/~afb/d/dcaf/html/class_string.html
http://www.algonet.se/~afb/d/dcaf/html/class_string_buffer.html

That doesn't change the "readonly" (was: const) needs of the built-in string types of D (code unit arrays) ? Just something of a workaround. Kris has a much nicer wrapper (with ICU features) under the Mango Tree.

--anders
May 31 2005
parent reply U.Baumanis <U.Baumanis_member pathlink.com> writes:
In article <d7hagf$194p$1 digitaldaemon.com>,
=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= says...
U.Baumanis wrote:

 How about immutable final String for general stuff end StringBuffer (or
 whatever) for performance needs.
If you want some Java-like string classes, I hacked some stuff together: http://www.algonet.se/~afb/d/dcaf/html/class_string.html http://www.algonet.se/~afb/d/dcaf/html/class_string_buffer.html That doesn't change the "readonly" (was: const) needs of the built-in string types of D (code unit arrays) ? Just something of a workaround. Kris has a much nicer wrapper (with ICU features) under the Mango Tree. --anders
Thanks! It would be nice to have it in std.string. Well, better somewhere than nowhere. :-) -- ubau
May 31 2005
parent reply =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
U.Baumanis wrote:

If you want some Java-like string classes, I hacked some stuff together:

That doesn't change the "readonly" (was: const) needs of the built-in
string types of D (code unit arrays) ? Just something of a workaround.
Kris has a much nicer wrapper (with ICU features) under the Mango Tree.
Thanks! It would be nice to have it in std.string. Well, better somewhere than nowhere. :-)
Don't misunderstand me, I do not think that D needs a String class... (it was just a small example on how one could implement such a beast)

The default D string type is still "char[]". Just wished there was a simple way to preserve the readonly-ness of it, without resorting to using a fullblown wrapper class - like Java does ? Something like:

    readonly char[] s = "hello";

--anders

PS. Name "const" has been renamed for political reasons. Kinda like typedef, which changed name into "alias".
May 31 2005
parent U.Baumanis <U.Baumanis_member pathlink.com> writes:
In article <d7hl28$1ikj$1 digitaldaemon.com>,
=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= says...
U.Baumanis wrote:

If you want some Java-like string classes, I hacked some stuff together:

That doesn't change the "readonly" (was: const) needs of the built-in
string types of D (code unit arrays) ? Just something of a workaround.
Kris has a much nicer wrapper (with ICU features) under the Mango Tree.
Thanks! It would be nice to have it in std.string. Well, better somewhere than nowhere. :-)
Don't misunderstand me, I do not think that D needs a String class... (it was just a small example on how one could implement such a beast) The default D string type is still "char[]". Just wished there was a simple way to preserve the readonly-ness of it, without resorting to using a fullblown wrapper class - like Java does ? Something like: readonly char[] s = "hello"; --anders PS. Name "const" has been renamed for political reasons. Kinda like typedef, which changed name into "alias".
You are right! I forgot why I wanted a String class... Maybe because I have to use Java at work. ;-) -- ubau
May 31 2005
prev sibling next sibling parent reply Eugene Pelekhay <pelekhay gmail.com> writes:
Walter wrote:
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7gtvf$qs0$1 digitaldaemon.com...
 
java.lang.String class has a) methods b) String owns buffer - it controls
buffer.

In D is possible:
int[char[]] map;
char[] s = "something";
map[s] = 1;
s[0] = '?'; // I have no idea what result will be. sure not good.

And you can bump into such problem quite easily in D. I personally
did many times. And too hard to find source sometimes.

In Java such collision is not possible in principle: String is final and
immutable.
A number of languages use the immutable string idiom, and its corollary "always implicitly copy the string when writing to it". They all share another common characteristic - they're slow, and they're slow in a manner that is *not fixable*. And they're not just slower by a factor, many algorithms run *exponentially* slower because of the copying. D must be fast, and the only way to be fast with strings (and arrays) is to not have the language implicitly copy them, but to allow the programmer the flexibility to copy or not copy. To know when to copy, use the Copy On Write principle (COW). That is, if you're not *sure* you've got the only copy of a string, .dup it before modifying it. So why isn't that just as bad as the languages that implicitly copy on write? The answer is that often, you know that you are the sole owner, such as: char[] s = new char[10]; for (i = 0; i < 10; i++) s[i] = 'c';
Maybe I'm a dummy, but I don't see from this example why these other languages must copy it 10 times. With the reference-counted string implementation in my C++ project, the copy would also be performed 0 times. And if more than one reference to the instance exists, only one copy operation will be performed. I see only one advantage in the current implementation of strings: there is no need to check or increment/decrement a reference counter. But in exchange, string duplication is required.
 
 Those other languages are doomed to make 10 copies of s. The D programmer
 needs to make 0 copies.
 
 As to your example above, when you pass a reference to a string to an
 associative array, then you aren't the sole owner of that string anymore.
 Don't change it. .dup it.
 
 
May 31 2005
parent reply "Walter" <newshound digitalmars.com> writes:
"Eugene Pelekhay" <pelekhay gmail.com> wrote in message
news:d7hfuh$1ejl$1 digitaldaemon.com...
 May be I'm dummy, but I don't see in this example why this other
 languages must copy it 10 times. For my implementation of reference
 counted string in my C++ project, copy will be performed also 0 times.
 And if there is more then 1 reference to instance exsits it's only one
 copy operation will be performed. I see only one advantage in current
 implementation of string - not need to check or increment/decrement
 reference counter, but instead of this string duplication is required
You're right that you can avoid excessive copying by doing ref counting. Reference counting carries with it other penalties - storage must be allocated for the ref count, every copy increments the count, and every reference that goes out of scope must decrement the count. Add in exception handling, and the price is high (although C++'s mechanisms hide that price from you). Ref counting would make it impractical to do D's array slices. Furthermore, in the presence of garbage collection, layering on top a reference counting mechanism probably means you'll want to ditch the gc and go with a full ref counting architecture for every object. In my experience, such is slower than using mark/sweep gc.
May 31 2005
next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Walter" <newshound digitalmars.com> wrote in message 
news:d7j6na$d2f$1 digitaldaemon.com...
 "Eugene Pelekhay" <pelekhay gmail.com> wrote in message
 news:d7hfuh$1ejl$1 digitaldaemon.com...
 May be I'm dummy, but I don't see in this example why this other
 languages must copy it 10 times. For my implementation of reference
 counted string in my C++ project, copy will be performed also 0 times.
 And if there is more then 1 reference to instance exsits it's only one
 copy operation will be performed. I see only one advantage in current
 implementation of string - not need to check or increment/decrement
 reference counter, but instead of this string duplication is required
You're right that you can avoid excessive copying by doing ref counting. Reference counting carries with it other penalties - storage must be allocated for the ref count, every copy increments the count, and every reference that goes out of scope must decrement the count. Add in exception handling, and the price is high (although C++'s mechanisms hide that price from you). Ref counting would make it impractical to do D's array slices. Furthermore, in the presence of garbage collection, layering on top a reference counting mechanism probably means you'll want to ditch the gc and go with a full ref counting architecture for every object. In my experience, such is slower than using mark/sweep gc.
Yes, GC does a good job. In some places. In other places ref-counting is better. An ideal language should allow both. Period.

Everything has its own price: the more objects are allocated (your .dup advice), the slower the GC's scanning of them will be. And I am not sure which is in fact faster in the big picture, ref-counting for strings or GC. Rather, GC loses, at least in the real-life projects I can test by hand. In abstract tests, though, everything is just perfect.

I know only one thing: the fewer GC cycles, the better, as a collection locks everything, and at an unpredictable moment. Ref-counting has its price, but this price is acceptable as it is predictable, accountable and evenly spread.

The best solution is, as always, in the middle: in the balance between GC and non-GC. If I have a vector of passive elements (chars), I would go with ref-counting to create an envelope that is safe to pass back and forth. If I have a container of active elements (objects) with a complex and sometimes unknown system of relationships, I'll go with GC to avoid headaches with cyclic references, broken pointers and so on.

Strings are strange types; they are both wave and particle: scalar and aggregate at the same time. A String as a wrapper-owner of a character buffer allows one somehow (not ideally!) to work with the string in both of its forms, balancing between str1 = str2, str1 == str2 and str1.ptr == str2.ptr.

Back to const. Having built-in arrays and slicing creates the *prerequisites* for optimal or suboptimal string handling. But slicing, for example, is worth nothing without const (for strings especially). Consider: I've found some string fragment and passed it to some function. This function does something and passes it further. All these functions were written with good intentions and by good programmers. But these programmers live 12 time zones away. The only feasible way for them to communicate is self-documenting code. Someone decided that this particular string is safe to zero-terminate. Everything is ruined. Finding the source of it is not trivial.

I bet that the second time it happens, D will be dead for the project. When it happened to me the first time, I decided to write a string wrapper emulating constness. JUST NO WAY IN D. Neither technically nor theoretically. Neither '=' overload (to implement ownership and refcounting) nor const. Nothing. Dead end.

char[] is not a string - it is an array of chars. The usage pattern of a string is quite different from that of an array. As a rule, an array is the heart of some container and is quite frequently already wrapped. But strings are flying everywhere.

D must have const for arrays and pointers to be considered a language for teams and serious projects. IMHO.
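The slicing hazard described above can be shown in a few lines. This is only a sketch with an invented function name: the slice returned here shares memory with the caller's buffer, so a write made further down the call chain silently changes the original.

```d
// Returns a slice into `line`; no characters are copied.
char[] firstWord(char[] line)
{
    foreach (i, c; line)
        if (c == ' ')
            return line[0 .. i];
    return line;
}
```

If some later function writes into the returned slice, for instance to patch or terminate it, the original line changes too, and nothing in the signature warns either side.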
Jun 01 2005
parent reply Kramer <Kramer_member pathlink.com> writes:
Well put.  Again, I think a point has been made for having a facility in the
language to say "this thing shouldn't change value".  I understand that some
devious programmer can find a way to change something that the compiler verified
shouldn't be changed.  But I think that programmer is in the minority and the
majority of programmers could use some self-documenting help from the
language/compiler (and the devious programmer is specifically and intentionally
going outside of the program specification, which I'd reckon could somehow be
done in any language).

Again, my $0.02.  But I think many have put in their $0.02 and will continue to
do so because many believe it's an important concept.  When will all the $0.02
contributions add up to be enough?

-Kramer

Jun 01 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 ....for having a facility in the
 language to say "this thing shouldn't change value".
Exactly. It is enough to have const T[] and const T* as types distinct from plain T[] and T*. The const T[] type has no opIndexAssign, no length(int) setter, and cannot be an lvalue at all. Simple as 1-2-3. I really don't understand the motivation for not having them. String literals are const char[] by definition. Andrew. "Kramer" <Kramer_member pathlink.com> wrote in message news:d7ljra$2qvj$1 digitaldaemon.com...
 Well put.  Again, I think a point has been made for having a facility in 
 the
 language to say "this thing shouldn't change value".  I understand that 
 some
 devious programmer can find a way to change something that the compiler 
 verified
 shouldn't be changed.  But I think that programmer is in the minority and 
 the
 majority of programmers could use some self-documenting help from the
 language/compiler (and the devious programmer is specifically and 
 intentionally
 going outside of the program specification, which I'd reckon could somehow 
 be
 done in any language).

 Again, my $0.02.  But I think many have put in their $0.02 and will 
 continue to
 do so because many believe it's an important concept.  When will all the 
 $0.02
 contributions add up to be enough?

 -Kramer

 In article <d7li3v$2psu$1 digitaldaemon.com>, Andrew Fedoniouk says...
"Walter" <newshound digitalmars.com> wrote in message
news:d7j6na$d2f$1 digitaldaemon.com...
 "Eugene Pelekhay" <pelekhay gmail.com> wrote in message
 news:d7hfuh$1ejl$1 digitaldaemon.com...
 May be I'm dummy, but I don't see in this example why this other
 languages must copy it 10 times. For my implementation of reference
 counted string in my C++ project, copy will be performed also 0 times.
 And if there is more then 1 reference to instance exsits it's only one
 copy operation will be performed. I see only one advantage in current
 implementation of string - not need to check or increment/decrement
 reference counter, but instead of this string duplication is required
You're right that you can avoid excessive copying by doing ref counting. Reference counting carries with it other penalties - storage must be allocated for the ref count, every copy increments the count, and every reference that goes out of scope must decrement the count. Add in exception handling, and the price is high (although C++'s mechanisms hide that price from you). Ref counting would make it impractical to do D's array slices. Furthermore, in the presence of garbage collection, layering on top a reference counting mechanism probably means you'll want to ditch the gc and go with a full ref counting architecture for every object. In my experience, such is slower than using mark/sweep gc.
Yes, GC does a good job. In some places. In other places ref-counting is better. An ideal language should allow both. Period. Everything has its price: the more objects you allocate (your .dup advice), the slower the GC scans them. And I am not sure which is in fact faster in the big picture - ref-counting for strings or GC. rather defeat. At least in real-life projects I can test by hand. But in abstract tests everything is just perfect. I know only one thing: the fewer GC cycles the better, as a cycle locks everything, and at an unpredictable moment. Ref-counting has a price, but that price is acceptable because it is predictable, accountable and evenly spread. The best solution is, as always, in the middle - in the balance between GC and non-GC. If I have a vector of passive elements (chars) I would go with ref-counting to create an envelope that is safe to pass back and forth. If I have a container of active elements (objects) with a complex and sometimes unknown system of relationships I'll go with GC to avoid headaches with cyclic references, broken pointers and so on. Strings are strange types; they are both wave and particle - scalar and aggregate at the same time. A String as a wrapper-owner of a character buffer allows you somehow (not ideally!) to work with the string in both of its forms, balancing between str1 = str2, str1 == str2 and str1.ptr == str2.ptr. Back to const. Having built-in arrays and slicing creates the *prerequisites* for optimal or suboptimal string handling. But slicing, for example, is worth nothing without const (for strings especially). See: I've found some string fragment and passed it to some function. This function does something and passes it further. All these functions were written with good intentions by good programmers. But these programmers live 12 time zones apart. The only feasible way for them is self-documenting code. Someone thought that this particular string was safe to zero-terminate. Everything is ruined. Finding the source of it is not trivial. 
I bet that the second time it happens D will be dead for the project. When it happened to me the first time I decided to write a string wrapper emulating constness. JUST NO WAY IN D. Neither technically nor theoretically. Neither '=' overload (to implement ownership and refcounting) nor const. Nothing. Dead end. char[] is not a string - it is an array of chars. The usage pattern of a string is quite different from that of an array. As a rule an array is the heart of some container and pretty frequently already wrapped. But strings are flying everywhere. D must have const for arrays and pointers to be considered a language for teams and serious projects. IMHO.
Jun 01 2005
parent reply Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Fedoniouk schrieb am Wed, 1 Jun 2005 17:55:04 -0700:
 ....for having a facility in the
 language to say "this thing shouldn't change value".
Exactly. It is enough to have const T[] and const T* as types distinct from plain T[] and T*. The const T[] type has no opIndexAssign, no length(int) setter, and cannot be an lvalue at all. Simple as 1-2-3. I really don't understand the motivation for not having them. String literals are const char[] by definition.
Sadly it's not that simple. The content of string literals is known at compile time and can thus be placed in an OS-protected memory area. If the array content is mutable by default, the const attribute for arrays is a mere suggestion. Do a bit of pointer math or store arrays as elements in other arrays and the const attribute loses its effect. If the array content is by default immutable - that is, once an element is set it can't be changed - the "mutable" attribute could be used to allow the editing of array elements. In contrast to the current system the compiler could only allow those cases where it can _prove_ that the array is mutable. What happens if you store pointers/object references in the array?! In addition this has some negative impact on mixed closed-open source projects, as the compiler would have to treat all arrays coming from the closed source part as immutable. The third way is to allow only assigning to, and not changing, any var. It's neat but pointer math gets in the way again. Thomas -----BEGIN PGP SIGNATURE----- iD8DBQFCnsBx3w+/yD4P9tIRAj+BAKCZTrUBYd4ARfWuxMcmN9126dyVmQCcDI5/ St2gvmZjNrzBbITklIQYb+g= =WdJv -----END PGP SIGNATURE-----
Jun 02 2005
next sibling parent reply kris <fu bar.org> writes:
Thomas Kuehne wrote:
 If the array content is by default immutable - that is, once an element
 is set it can't be changed - the "mutable" attribute could be used to
 allow the editing of array element. 
Now there's a different idea. Talk about CoW enforcement :-)
 In difference to the current system
 the compiler could only allow those cases where it can _prove_ that the
 array is mutable. What happens if you store pointers/object
 references in the array?!
Doesn't have to prove anything if the mutable aspect is part of the type; right, Thomas? Aliases upon the array still have to go through the same type-matching procedure; Yes? If you cast an immutable type to a mutable type, then all bets are off; just as they are with *cast(void *)0 = 0;
 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.
Is it still an issue if third-party code can be declared/proto-typed appropriately?
Jun 02 2005
next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"kris" <fu bar.org> wrote in message news:d7mbac$c2d$1 digitaldaemon.com...
 Thomas Kuehne wrote:
 If the array content is by default immutable - that is, once an element
 is set it can't be changed - the "mutable" attribute could be used to
 allow the editing of array element.
Now there's a different idea. Talk about CoW enforcement :-)
 In difference to the current system
 the compiler could only allow those cases where it can _prove_ that the
 array is mutable. What happens if you store pointers/object
 references in the array?!
Doesn't have to prove anything if the mutable aspect is part of the type; right, Thomas? Aliases upon the array still have to go through the same type-matching procedure; Yes?
Thanks, Kris. This is exactly what I have in my mind.
 If you cast an immutable type to a mutable type, then all bets are off; 
 just as they are with *cast(void *)0 = 0;

 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.
Is it still an issue if third-party code can be declared/proto-typed appropriately?
This "all arrays coming from the closed source part as immutable" sounds like Wagner's Ride of the Valkyries. I can feel two layers of meaning there but only managed to get one :). Thomas, for D's sake, what was it all about? Andrew.
Jun 02 2005
parent reply Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Fedoniouk schrieb am Thu, 2 Jun 2005 00:51:30 -0700:
 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.
Is it still an issue if third-party code can be declared/proto-typed appropriately?
This "all arrays coming from the closed source part as immutable" sounds like Wagner's Ride of the Valkyries. I can feel two layers of meaning there but only managed to get one :). Thomas, for D's sake, what was it all about?
If mutability is a "suggestion" then there is no problem with open source projects that use closed source libs. However if mutability is to be strictly enforced, we would either need runtime checks or would have to treat the arrays from the closed source part as immutable. Thomas -----BEGIN PGP SIGNATURE----- iD8DBQFCnuJO3w+/yD4P9tIRApxdAJ9f/18fE6/lblc/VPVItHbhkR9JXQCdGW/E IUXVTD7HHXPJy0qhpHxFfLE= =b0Hv -----END PGP SIGNATURE-----
Jun 02 2005
next sibling parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Thomas Kuehne" <thomas-dloop kuehne.this-is.spam.cn> wrote in message 
news:eb43n2-na4.ln1 lnews.kuehne.cn...
 -----BEGIN PGP SIGNED MESSAGE-----
 Hash: SHA1

 Andrew Fedoniouk schrieb am Thu, 2 Jun 2005 00:51:30 -0700:
 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.
Is it still an issue if third-party code can be declared/proto-typed appropriately?
This "all arrays coming from the closed source part as immutable" sounds like Wagner's Ride of the Valkyries. I can feel two layers of meaning there but only managed to get one :). Thomas, for D's sake, what was it all about?
If mutability is a "suggestion" then there is no problem with open source projects that use closed source libs. However if mutability is to be strictly enforced, we would either need runtime checks or would have to treat the arrays from the closed source part as immutable. Thomas
I think I see now what you mean ... Thanks, Thomas.
Jun 02 2005
prev sibling parent reply Brad Beveridge <brad somewhere.net> writes:
Here are my thoughts - if people would like to hear them :)

1) Everyone that has posted to this thread agrees that there needs to be 
a language feature that prevents code from changing the value of a 
variable/array.  Call it readonly, const, whatever.

2) There have been no suggestions on how to actually enforce 
immutability in a manner that is 100% correct - there are always 
loopholes that a devious programmer can go through.  From my point of 
view, the only truly correct way of enforcing constness is to place the 
memory in question under the protection of the hardware MMU - which IMHO 
will be very wasteful on memory & probably non-trivial to implement.

3) C++ errors at compile time for const violations and that is generally 
what people would like from D also.

I think that we need to face reality - there is no easy solution that is 
actually 100% correct.  However, since D is a practical language we 
don't need to be 100% correct, we only need enough protection to prevent 
accidental bugs.  Given a language that allows pointers, a programmer 
will be able to break constness.  If you need to write a completely 
robust library that malicious programmers will _try_ to break, then 
relying on const will never cut it.  You must add your own protection 
mechanisms - like always duping memory before returning it.  But I would 
say that this kind of environment is extremely rare.

So I think that a simple const mechanism that aims only to help prevent 
bugs, and offers no actual assurances about true constness, ought to be 
enough to satisfy most people.

I am sure that I don't grasp the full implications of how to implement 
this, but honestly it doesn't sound that hard.  I've attached a simple 
example that very nearly does simple protection by only using the 
built-in type system.  If you could implicitly promote char -> 
const_char then this would pretty much work.

Does this sound workable - or am I being totally naive?

Brad

import std.stdio;

typedef char const_char;

int main (char[][] arg)
{
     char[] plain = "This is a plain char string";
     const_char[] strange = cast(const_char[])"This is a const_char string";

     plain = foo(plain);
     strange = foo(strange);

     // no implicit casting, so the below doesn't work
     //char[] test = strange;
     //const_char[] test2 = plain;
     return 0;
}

char[] foo(char[] f)
{
     // normally, writefln would only take const_char[]'s
     // but char[] would be allowed to be implicitly promoted to
     // const_char[]
     writefln ("char[] is ", f);
     return f;
}

const_char[] foo(const_char[] f)
{
     // normally, writefln would only take const_char[]'s - so
     // wouldn't need to cast
     writefln ("const_char[] is ", cast(char[])f);
     return f;
}
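
The implicit promotion that the example above wishes for can be sketched as follows. This is only an illustration, not part of the original post: it assumes a present-day D2 compiler, which postdates this thread, where the element type can be qualified as `const(char)`. The names `show`, `plain` and `view` are made up for the sketch.

```d
import std.stdio;

// Accepts both mutable and read-only character slices (assumes D2).
void show(const(char)[] s)
{
    writeln("text: ", s);
}

void main()
{
    char[] plain = "plain text".dup;   // mutable copy of a literal
    const(char)[] view = plain;        // implicit char[] -> const(char)[]

    show(plain);                       // promoted implicitly at the call
    show(view);

    plain[0] = 'P';                    // allowed: plain is mutable

    // Writing through the read-only view is rejected at compile time.
    static assert(!__traits(compiles, view[0] = 'X'));

    // Promotion works in one direction only.
    static assert( is(char[] : const(char)[]));
    static assert(!is(const(char)[] : char[]));
}
```

Note that `view` only restricts access through that name: the write through `plain` is still visible via `view`, which matches the "prevent accidental bugs, no hard guarantee" goal stated in the post.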
Jun 02 2005
next sibling parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Brad Beveridge" <brad somewhere.net> wrote in message 
news:d7nbhs$1p2o$1 digitaldaemon.com...
 Here are my thoughts - if people would like to hear them :)

 1) Everyone that has posted to this thread agrees that there needs to be a 
 language feature that prevents code from changing the value of a 
 variable/array.  Call it readonly, const, whatever.

 2) There have been no suggestions on how to actually enforce immutability 
 in a manner that is 100% correct - there are always loopholes that a 
 devious programmer can go through.  From my point of view, the only truly 
 correct way of enforcing constness is to place the memory in question 
 under the protection of the hardware MMU - which IMHO will be very 
 wasteful on memory & probably non-trivial to implement.

 3) C++ errors at compile time for const violations and that is generally 
 what people would like from D also.

 I think that we need to face reality - there is no easy solution that is 
 actually 100% correct.  However, since D is a practical language we don't 
 need to be 100% correct, we only need enough protection to prevent 
 accidental bugs.  Given a language that allows pointers, a programmer will 
 be able to break constness.  If you need to write a completely robust 
 library that malicious programmers will _try_ to break, then relying on 
 const will never cut it.  You must add your own protection mechanisms - 
 like always duping memory before returning it.  But I would say that this 
 kind of environment is extremely rare.

 So I think that a simple const mechanism that aims only to help prevent 
 bugs, and offers no actual assurances about true constness, ought to be 
 enough to satisfy most people.
Exactly. And again, we don't need it at the scale it is done in C++. Classes can be protected "by hand", and in fact protection for class instances is already there: setters/getters, private/public, etc. Again, we just need const for T[] and T*. See: char[] is an atomic type in the sense that it is built into D. But arrays are implemented differently from other scalars - they are references to memory locations. And this referencing adds one more 'dimension' of requirements - read-onlyness. It is exactly like the function parameter attributes in and out. Walter said A but is shy to say B. Or I don't understand something. Andrew.
 I am sure that I don't grasp the full implications of how to implement 
 this, but honestly it doesn't sound that hard.  I've attached a simple 
 example that very nearly does simple protection by only using the built-in 
 type system.  If you could implicitly promote char -> const_char then this 
 would pretty much work.

 Does this sound workable - or am I being totally naive?

 Brad

 import std.stdio;

 typedef char const_char;

 int main (char[][] arg)
 {
     char[] plain = "This is a plain char string";
     const_char[] strange = cast(const_char[])"This is a const_char string";

     plain = foo(plain);
     strange = foo(strange);

     // no implicit casting, so the below doesn't work
     //char[] test = strange;
     //const_char[] test2 = plain;
     return 0;
 }

 char[] foo(char[] f)
 {
     // normally, writefln would only take const_char[]'s
     // but char[] would be allowed to be implicitly promoted to
     // const_char[]
     writefln ("char[] is ", f);
     return f;
 }

 const_char[] foo(const_char[] f)
 {
     // normally, writefln would only take const_char[]'s - so
     // wouldn't need to cast
     writefln ("const_char[] is ", cast(char[])f);
     return f;
 } 
Jun 02 2005
prev sibling parent reply Sean Kelly <sean f4.ca> writes:
In article <d7nbhs$1p2o$1 digitaldaemon.com>, Brad Beveridge says...
I think that we need to face reality - there is no easy solution that is 
actually 100% correct.  However, since D is a practical language we 
don't need to be 100% correct, we only need enough protection to prevent 
accidental bugs.
..
So I think that a simple const mechanism that aims only to help prevent 
bugs, and offers no actual assurances about true constness, ought to be 
enough to satisfy most people.
My impression of Walter is that he doesn't like solutions that are just okay, particularly if they add significant compiler complexity. And while I think this is probably feasible for arrays, it's a slippery slope from there to logical const-ness for user defined objects. Personally, I'll take anything I can get, but I think the likelihood we'll see this for 1.0 is quite slim. Sean
Jun 02 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:d7nkra$22fh$1 digitaldaemon.com...
 In article <d7nbhs$1p2o$1 digitaldaemon.com>, Brad Beveridge says...
I think that we need to face reality - there is no easy solution that is
actually 100% correct.  However, since D is a practical language we
don't need to be 100% correct, we only need enough protection to prevent
accidental bugs.
..
So I think that a simple const mechanism that aims only to help prevent
bugs, and offers no actual assurances about true constness, ought to be
enough to satisfy most people.
My impression of Walter is that he doesn't like solutions that are just okay, particularly if they add significant compiler complexity.
Ummm... Is it in fact so complex? I think everything needed is already there.
 And while I think
 this is probably feasible for arrays, it's a slippery slope from there to
 logical const-ness for user defined objects.
We don't need this for UDOs. Such objects already have everything needed for encapsulation/protection. Basic types, on the contrary, are naked now. You cannot define methods/envelopes for them. const is not a 100% solution in terms of real protection. But there are no 100% solutions at all, even in D. It is a compromise and a good one - it costs nothing at runtime and almost nothing at compile time.
Personally, I'll take anything I
 can get, but I think the likelihood we'll see this for 1.0 is quite slim.
Too bad if we don't have this in 1.0. Without const for arrays and pointers the D feature set is incomplete. You can write word counters spanning two or three pages, but for serious projects with many developers involved const (especially for strings) is a must-have feature. I have been managing GUI projects for the last 10 years and I am certain. Fighting three bugs in Harmonia took me one week. Two of them were string/array corruptions. And I designed it myself! I don't want to share this experience with other team members, where the probability of such things is higher. Again, D is a good language. I would say D is near perfect, especially for GUI. But this particular area (constness of built-in types) is not finished, as there are no workarounds at all to protect references. The advice to spread .dups everywhere I will leave to the .NET/Java community - there it will be accepted. Andrew.
Jun 02 2005
parent reply Sean Kelly <sean f4.ca> writes:
In article <d7nmto$24n4$1 digitaldaemon.com>, Andrew Fedoniouk says...
"Sean Kelly" <sean f4.ca> wrote in message 
news:d7nkra$22fh$1 digitaldaemon.com...
 In article <d7nbhs$1p2o$1 digitaldaemon.com>, Brad Beveridge says...
I think that we need to face reality - there is no easy solution that is
actually 100% correct.  However, since D is a practical language we
don't need to be 100% correct, we only need enough protection to prevent
accidental bugs.
..
So I think that a simple const mechanism that aims only to help prevent
bugs, and offers no actual assurances about true constness, ought to be
enough to satisfy most people.
My impression of Walter is that he doesn't like solutions that are just okay, particularly if they add significant compiler complexity.
Ummm... Is it in fact so complex? I think everything needed is already there.
I don't think it is complex. I was thinking of object const-ness when I wrote this. Sorry for the confusion.
 And while I think
 this is probably feasible for arrays, it's a slippery slope from there to
 logical const-ness for user defined objects.
We don't need this for UDOs. Such objects already have everything needed for encapsulation/protection. Basic types, on the contrary, are naked now. You cannot define methods/envelopes for them. const is not a 100% solution in terms of real protection. But there are no 100% solutions at all, even in D. It is a compromise and a good one - it costs nothing at runtime and almost nothing at compile time.
I disagree that UDO have what's needed in terms of protection. Though logical const-ness is far from simple to implement. Sean
Jun 02 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Sean Kelly" <sean f4.ca> wrote in message 
news:d7nod5$263o$1 digitaldaemon.com...
 In article <d7nmto$24n4$1 digitaldaemon.com>, Andrew Fedoniouk says...
"Sean Kelly" <sean f4.ca> wrote in message
news:d7nkra$22fh$1 digitaldaemon.com...
 In article <d7nbhs$1p2o$1 digitaldaemon.com>, Brad Beveridge says...
I think that we need to face reality - there is no easy solution that is
actually 100% correct.  However, since D is a practical language we
don't need to be 100% correct, we only need enough protection to prevent
accidental bugs.
..
So I think that a simple const mechanism that aims only to help prevent
bugs, and offers no actual assurances about true constness, ought to be
enough to satisfy most people.
My impression of Walter is that he doesn't like solutions that are just okay, particularly if they add significant compiler complexity.
Ummm... Is it in fact so complex? I think everything needed is already there.
I don't think it is complex. I was thinking of object const-ness when I wrote this. Sorry for the confusion.
Why sorry? Anyway...
 And while I think
 this is probably feasible for arrays, it's a slippery slope from there 
 to
 logical const-ness for user defined objects.
We don't need this for UDOs. Such objects already have everything needed for encapsulation/protection. Basic types, on the contrary, are naked now. You cannot define methods/envelopes for them. const is not a 100% solution in terms of real protection. But there are no 100% solutions at all, even in D. It is a compromise and a good one - it costs nothing at runtime and almost nothing at compile time.
I disagree that UDO have what's needed in terms of protection. Though logical const-ness is far from simple to implement.
In user-defined objects at least you can say 'protect'. Protect some operations and functions from the outside. This is compile-time protection and not runtime! At runtime you can relatively easily get a pointer to a private variable and change it from outside. Following the logic that we cannot reach 100% protection, we should remove 'private', 'package' etc. What were they introduced for? They do not protect 100%! You can do dynamic protection for UDOs: you can naturally implement "in this particular state of the object this particular property is immutable", etc. Right? What we all want is simple: "by using const for arrays and pointers we want to make some methods of const types 'private' - not accessible". That's it.
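
The idea of making the mutating operations of an array "not accessible" can be imitated by hand with a wrapper, which gives a feel for what a built-in const T[] would provide for free. This is a rough sketch only: it assumes a present-day D2 compiler (which postdates this thread), and `ReadOnlyView` is a hypothetical name, not an existing library type.

```d
// Hypothetical read-only envelope around a slice (assumes D2).
struct ReadOnlyView(T)
{
    private T[] data;   // private is per module in D: hidden from other modules

    this(T[] source) { data = source; }

    // Read access only: no opIndexAssign and no length setter are defined.
    T opIndex(size_t i) const { return data[i]; }
    @property size_t length() const { return data.length; }
}

void main()
{
    char[] buf = "hello".dup;
    auto view = ReadOnlyView!char(buf);

    assert(view[0] == 'h');
    assert(view.length == 5);

    // The "private" operations are simply not there, so these fail to compile.
    static assert(!__traits(compiles, view[0] = 'x'));
    static assert(!__traits(compiles, view.length = 2));
}
```

As with private members, this is compile-time protection only: whoever still holds `buf` can change the characters underneath the view.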
 Sean

 
Jun 02 2005
next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
Ok, side question, assume we're going to use 'const' (or any keyword) for  
array contents (something I've asked for in the past), how do we write it?

As it stands:
   const char[] bob;

'currently' makes the "array reference" constant. Would we then need:
   char[] const bob;

or perhaps
   char[const] bob;

or maybe as it's const it has to have an initialiser:
   char[] bob = "test";

(assuming of course that "test" is automatically constant)
or would that give an error and require the const keyword somewhere, as in  
one of the first two ideas.

Regan

On Thu, 2 Jun 2005 13:48:06 -0700, Andrew Fedoniouk  
<news terrainformatica.com> wrote:
 "Sean Kelly" <sean f4.ca> wrote in message
 news:d7nod5$263o$1 digitaldaemon.com...
 In article <d7nmto$24n4$1 digitaldaemon.com>, Andrew Fedoniouk says...
 "Sean Kelly" <sean f4.ca> wrote in message
 news:d7nkra$22fh$1 digitaldaemon.com...
 In article <d7nbhs$1p2o$1 digitaldaemon.com>, Brad Beveridge says...
 I think that we need to face reality - there is no easy solution  
 that is
 actually 100% correct.  However, since D is a practical language we
 don't need to be 100% correct, we only need enough protection to  
 prevent
 accidental bugs.
..
 So I think that a simple const mechanism that aims only to help  
 prevent
 bugs, and offers no actual assurances about true constness, ought to  
 be
 enough to satisfy most people.
My impression of Walter is that he doesn't like solutions that are just okay, particularly if they add significant compiler complexity.
Ummm... Is it in fact so complex? I think everything needed is already there.
I don't think it is complex. I was thinking of object const-ness when I wrote this. Sorry for the confusion.
Why sorry? Anyway...
 And while I think
 this is probably feasible for arrays, it's a slippery slope from there
 to
 logical const-ness for user defined objects.
We don't need this for UDOs. Such objects already have everything needed for encapsulation/protection. Basic types, on the contrary, are naked now. You cannot define methods/envelopes for them. const is not a 100% solution in terms of real protection. But there are no 100% solutions at all, even in D. It is a compromise and a good one - it costs nothing at runtime and almost nothing at compile time.
I disagree that UDO have what's needed in terms of protection. Though logical const-ness is far from simple to implement.
In user-defined objects at least you can say 'protect'. Protect some operations and functions from the outside. This is compile-time protection and not runtime! At runtime you can relatively easily get a pointer to a private variable and change it from outside. Following the logic that we cannot reach 100% protection, we should remove 'private', 'package' etc. What were they introduced for? They do not protect 100%! You can do dynamic protection for UDOs: you can naturally implement "in this particular state of the object this particular property is immutable", etc. Right? What we all want is simple: "by using const for arrays and pointers we want to make some methods of const types 'private' - not accessible". That's it.
 Sean
Jun 02 2005
parent reply Brad Beveridge <brad somewhere.net> writes:
Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword) 
 for  array contents (something I've asked for in the past), how do we 
 write it?
 
I like const char[] bob; since we aren't C++, I don't think that we should split hairs quite so much about where the const (or readonly) keyword goes. In D "const char[] bob", means that the contents of bob cannot be altered, the reference can't be changed, and you cannot slice out a chunk of bob unless you are slicing into another const char[]. Well it could mean that :) It could also mean anything else! Brad
Jun 02 2005
next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Brad Beveridge" <brad somewhere.net> wrote in message 
news:d7ntg7$2b5e$1 digitaldaemon.com...
 Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword) for 
 array contents (something I've asked for in the past), how do we write 
 it?
I like const char[] bob; since we aren't C++, I don't think that we should split hairs quite so much about where the const (or readonly) keyword goes. In D "const char[] bob", means that the contents of bob cannot be altered, the reference can't be changed, and you cannot slice out a chunk of bob unless you are slicing into another const char[]. Well it could mean that :) It could also mean anything else! Brad
Yep, only one thing: "the reference can't be changed". I think this is too strict.

const char[] Dolli = "McArtur"; // fine
/// marriage happens
Dolli = "O'Connor"; // should also be fine.
/// but an attempt to break Dolli's "private parts" ((C) Booch) - to change
/// the value itself - will be an error:
Dolli[0] = '\0'; /// ERROR
Jun 02 2005
next sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 2 Jun 2005 14:40:33 -0700, Andrew Fedoniouk  
<news terrainformatica.com> wrote:
 "Brad Beveridge" <brad somewhere.net> wrote in message
 news:d7ntg7$2b5e$1 digitaldaemon.com...
 Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword)  
 for
 array contents (something I've asked for in the past), how do we write
 it?
I like const char[] bob; since we aren't C++, I don't think that we should split hairs quite so much about where the const (or readonly) keyword goes. In D "const char[] bob", means that the contents of bob cannot be altered, the reference can't be changed, and you cannot slice out a chunk of bob unless you are slicing into another const char[]. Well it could mean that :) It could also mean anything else! Brad
Yep, only one thing: "the reference can't be changed". I think this is too strict.

const char[] Dolli = "McArtur"; // fine
/// marriage happens
Dolli = "O'Connor"; // should also be fine.
/// but an attempt to break Dolli's "private parts" ((C) Booch) - to change
/// the value itself - will be an error:
Dolli[0] = '\0'; /// ERROR
I don't understand what you're saying. Which of these, if any, do you think we need to be able to do:

1 - have a constant reference, to non-constant data
2 - have a non-constant reference, to constant data
3 - have a constant reference, to constant data

Bear in mind, when I say:

"constant reference" I mean a char[] which cannot be assigned to, or have its length changed. eg.

    char[] foo = "bar";
    foo = foo[1..3]; //illegal
    foo.length = foo.length + 10; //illegal
    foo[0] = 'a'; //ok

"constant data" I mean a char[] whose referenced data cannot be modified, eg.

    char[] foo = "bar";
    foo = foo[1..3]; //ok
    foo.length = foo.length + 10; //ok
    foo[0] = 'a'; //illegal

Regan
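A minimal sketch of cases 2 and 3, assuming the type-constructor spelling `const(char)[]` that later D compilers adopted. That spelling is an assumption relative to this thread and is not one of the syntaxes proposed here. In that later design const is transitive, so case 1 (constant reference, non-constant data) has no direct spelling and is left out.

```d
// Sketch only: uses the const(T)[] syntax of later D compilers,
// not the syntax under discussion in this thread.
void demo()
{
    // Case 2: non-constant reference to constant data.
    const(char)[] foo = "bar";
    foo = foo[1 .. 3];                                   // ok: reslice
    foo = "baz";                                         // ok: rebind
    static assert(!__traits(compiles, foo[0] = 'a'));    // data is read-only

    // Case 3: constant reference to constant data.
    const(char[]) bar = "bar";
    static assert(!__traits(compiles, bar = "baz"));     // cannot rebind
    static assert(!__traits(compiles, bar.length = 10)); // cannot resize
    static assert(!__traits(compiles, bar[0] = 'a'));    // cannot write

    assert(foo == "baz" && bar == "bar");
}

void main() { demo(); }
```

The `static assert` lines only check that the compiler rejects the expression; nothing is modified at run time.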
Jun 02 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Regan Heath" <regan netwin.co.nz> wrote in message 
news:opsrrjkrfx23k2f5 nrage.netwin.co.nz...
 On Thu, 2 Jun 2005 14:40:33 -0700, Andrew Fedoniouk 
 <news terrainformatica.com> wrote:
 "Brad Beveridge" <brad somewhere.net> wrote in message
 news:d7ntg7$2b5e$1 digitaldaemon.com...
 Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword) 
 for
 array contents (something I've asked for in the past), how do we write
 it?
I like const char[] bob; since we aren't C++, I don't think that we should split hairs quite so much about where the const (or readonly) keyword goes. In D "const char[] bob", means that the contents of bob cannot be altered, the reference can't be changed, and you cannot slice out a chunk of bob unless you are slicing into another const char[]. Well it could mean that :) It could also mean anything else! Brad
Yep, only one: "the reference can't be changed" I think this is too strict. const char[] Dolli = "McArtur"; // fine /// mariage happens Dolli = "O'Connor"; // shoud be also fine. /// but attempt to break Dolli's "private parts" ((C) Booch) - to change /// value iself will be an erro:r Dolli[0] = '\0'; /// ERROR
I don't understand what you're saying. Which of these, if any, do you think we need to be able to do: 1 - have a constant reference, to non-constant data 2 - have a non-constant reference, to constant data 3 - have a constant reference, to constant data
You can always wrap references in something, or you can choose to use in, inout, etc. when passing references. But the values they refer to are not protected. Makes sense?
 Bear in mind, when I say:

 "constant reference" I mean a char[] which cannot be assigned to, or it's 
 length changed. eg.

 char[] foo = "bar";
 foo = foo[1..3]; //illegal
 foo.length = foo.length + 10; //illegal
 foo[0] = 'a' //ok

 "constant data" I mean a char[] whose referenced data cannot be modified, 
 eg.

 char[] foo = "bar";
 foo = foo[1..3]; //ok
 foo.length = foo.length + 10; //ok
 foo[0] = 'a' //illegal

 Regan 
Jun 02 2005
parent "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 2 Jun 2005 15:16:00 -0700, Andrew Fedoniouk  
<news terrainformatica.com> wrote:
 "Regan Heath" <regan netwin.co.nz> wrote in message
 news:opsrrjkrfx23k2f5 nrage.netwin.co.nz...
 I don't understand what you're saying. Which of these, if any, do you
 think we need to be able to do:

 1 - have a constant reference, to non-constant data
 2 - have a non-constant reference, to constant data
 3 - have a constant reference, to constant data
you can always wrap references in something. Or you can choose to use in, inout, etc to passing references.
Yet 'in' does not prevent you from changing the parameter (a copy of the real reference) so you often get bugs where the programmer has desired change, implemented it, seen no errors from the compiler, and yet it fails. It would be nice to prevent this. It seems a simple change to have the compiler treat 'in' as 'const in' erroring on changes to the parameter itself.
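The pitfall described here can be sketched as follows. This assumes later D syntax, where the by-reference parameter mode is spelled `ref` rather than the `inout` of this thread; the function names are hypothetical and exist only for illustration.

```d
// Hypothetical example: a by-value slice parameter is a copy of the
// caller's reference, so rebinding it is silently lost.
void rename(char[] s)
{
    s = "O'Connor".dup; // changes only the local copy of the reference
}

// Passing the reference itself by reference makes the change visible.
void renameByRef(ref char[] s)
{
    s = "O'Connor".dup;
}

void main()
{
    char[] name = "McArtur".dup;

    rename(name);
    assert(name == "McArtur");   // compiler accepted it, caller sees no change

    renameByRef(name);
    assert(name == "O'Connor");
}
```

A compiler that rejected the assignment inside `rename` would turn the first case into a compile-time error, which is the change being asked for.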
 But values they are referring to are not protected.


 Makes sense?
Perfect. I'm just not convinced we don't want all 3, yet. Regan
Jun 02 2005
prev sibling parent reply Brad Beveridge <brad somewhere.net> writes:
Andrew Fedoniouk wrote:

 
 Yep, only one:
 "the reference can't be changed"
 I think this is too strict.
 
 const char[] Dolli = "McArtur"; // fine
 /// mariage happens
                    Dolli = "O'Connor"; // shoud be also fine.
 /// but attempt  to break Dolli's "private parts" ((C) Booch) - to change
 /// value iself will be an erro:r
                      Dolli[0] = '\0'; /// ERROR
Though I think I get your point - this just doesn't feel right to me. Assume for a moment that "const" in D means the simplest thing - the reference cannot be changed and the data cannot be changed. Your example can be rewritten as:

    char[] Dolli = "McArtur";
    const char[] safeDolli = Dolli;
    // do things that aren't allowed to change safeDolli
    // Dolli gets married
    Dolli = "O'Connor";
    // safeDolli also is now "O'Connor"

I am all for things being simple, and to me the simplest use of const is to make both the reference and the data immutable.

Brad
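What a read-only view observes after the mutable name is rebound depends on the semantics chosen for const. As a point of comparison only, here is a sketch under the slice semantics of later D compilers (an assumption; the post above describes a hypothetical const): in-place writes through the mutable name are visible through the view, while rebinding the mutable name leaves the view on the old buffer.

```d
// Sketch under later-D slice semantics, not the hypothetical const above.
void demo()
{
    char[] dolli = "McArtur".dup;
    const(char)[] safeDolli = dolli;   // read-only view of the same memory

    dolli[0] = 'm';                    // in-place write via the mutable alias
    assert(safeDolli == "mcArtur");    // visible through the const view

    dolli = "O'Connor".dup;            // rebind: dolli now has a new buffer
    assert(safeDolli == "mcArtur");    // the view still sees the old one
    assert(dolli == "O'Connor");
}

void main() { demo(); }
```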
Jun 02 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Brad Beveridge" <brad somewhere.net> wrote in message 
news:d7o3d9$2fqb$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:

 Yep, only one:
 "the reference can't be changed"
 I think this is too strict.

 const char[] Dolli = "McArtur"; // fine
 /// mariage happens
                    Dolli = "O'Connor"; // shoud be also fine.
 /// but attempt  to break Dolli's "private parts" ((C) Booch) - to change
 /// value iself will be an erro:r
                      Dolli[0] = '\0'; /// ERROR
Though I think I get your point - this just doesn't feel right to me. Assume for a moment that "const" in D means the simplest thing - the reference cannot be changed and the data cannot be changed. Your example can be rewritten as: char[] Dolli = "McArtur"; const char[] safeDolli = Dolli; // do things that aren't allowed to change safeDolli // Dolli gets married Dolli = "O'Connor"; // safeDolli also is now "O'Connor" I am all for things being simple, and to me the simplest use of const is to make both the reference and the data immutable.
Take a look at this:

    for( const int* p = ...; p < end; ++p )
    {
    }

You can enumerate but you cannot change.

Again, there are mechanisms for practical implementations of const references in D now, e.g. in, inout, out for parameters, but there are no convenient and effective ways to protect the values being referenced.

Also, if you have a const ref on const data then you will not be able to do:

    foo( out const char[] )

which is a rare but desirable use case.

Andrew.
 Brad 
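The "enumerate but not change" loop can be sketched with a pointer to constant data. This assumes the `const(int)*` spelling of later D compilers, and `sum` is a hypothetical helper used only to give the loop something to do.

```d
// Hypothetical helper: walks the data with a movable pointer to const int.
int sum(const(int)[] values)
{
    int total = 0;
    const(int)* end = values.ptr + values.length;
    for (const(int)* p = values.ptr; p < end; ++p) // the pointer may move
    {
        total += *p;                               // reading is fine
        static assert(!__traits(compiles, *p = 0)); // writing is rejected
    }
    return total;
}

void main()
{
    assert(sum([1, 2, 3]) == 6);
}
```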
Jun 02 2005
parent Brad Beveridge <brad somewhere.net> writes:
Andrew Fedoniouk wrote:

 You can enumerate but you cannot change.
 
 Again, there are mechanisms for practical implementations of const 
 references in D now
 e.g.: in, inout, out for parameters
 
 but there are no convenient and effective ways to protect reference values.
 
 Also if you have const ref on const data then you will not be able to do:
 
 foo( out const char[]  ) which is rare but desireable use case.
 
You have convinced me :) Const reference + const data is too simplistic. Brad
Jun 02 2005
prev sibling parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 02 Jun 2005 14:25:28 -0700, Brad Beveridge <brad somewhere.net>  
wrote:
 Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword)  
 for  array contents (something I've asked for in the past), how do we  
 write it?
I like const char[] bob; since we aren't C++, I don't think that we should split hairs quite so much about where the const (or readonly) keyword goes.
I wasn't concerned so much with 'where' to type the keyword, but rather whether we need to be able to:

1 - have a constant reference, to non-constant data
2 - have a non-constant reference, to constant data
3 - have a constant reference, to constant data

If we need all of them, then:

"const char[] bob" - to me, means const ref (as it currently does in D)
"char[const] bob" - (my fav suggestion so far) to me, means const data
"const char[const] bob" - const ref, const data
 In D "const char[] bob", means that the contents of bob cannot be  
 altered, the reference can't be changed
 , and you cannot slice out a chunk of bob unless you are slicing into  
 another const char[].
That would be illegal, surely? "another const char[]" is a constant reference, so you cannot change it - unless you were thinking we might lift this restriction at initialisation time, eg.

    const char[] abc = "abc";
    const char[] bob = abc[0..2];
 Well it could mean that :)  It could also mean anything else!
Yeah, but we do have to make some temporary decisions in order to come to a temporary solution/idea in order to test it, before we can make any decisions. Regan
Jun 02 2005
parent reply Brad Beveridge <brad somewhere.net> writes:
Regan Heath wrote:

 1 - have a constant reference, to non-constant data
 2 - have a non-constant reference, to constant data
 3 - have a constant reference, to constant data
 
 If we need all of them, then:
 
 "const char[] bob" - to me, means const ref (as it currently does in D)
 "char[const] bob"  - (my fav suggestion so far) to me, means const data
 "const char[const] bob" - const ref, const data
 
 In D "const char[] bob", means that the contents of bob cannot be  
 altered, the reference can't be changed
Sorry, I don't like "char[const] bob", simply because I feel it is too close to the AA syntax, eg, what is "char[const int] bob" going to do?
 
 , and you cannot slice out a chunk of bob unless you are slicing into  
 another const char[].
That would be illegal, surely? "another const char[]" is a constant reference, so you cannot change it - unless you were thinking we might lift this restriction at initialisation time, eg.
I was thinking more along the lines of

    const char[] str = "This is a string";
    void foo (const char[] f){...}
    void foo (char[] f){...}
    foo(str[0..4]); // slices and calls the foo(const char[]) func

Brad
Jun 02 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Thu, 02 Jun 2005 15:07:42 -0700, Brad Beveridge <brad somewhere.net>  
wrote:
 Regan Heath wrote:

 1 - have a constant reference, to non-constant data
 2 - have a non-constant reference, to constant data
 3 - have a constant reference, to constant data
  If we need all of them, then:
  "const char[] bob" - to me, means const ref (as it currently does in D)
 "char[const] bob"  - (my fav suggestion so far) to me, means const data
 "const char[const] bob" - const ref, const data

 In D "const char[] bob", means that the contents of bob cannot be   
 altered, the reference can't be changed
would err on the side of caution and, presuming no reason against, make it possible.
 Sorry, I don't like "char[const] bob", simply because I feel it is too  
 close to the AA syntax, eg, what is "char[const int] bob" going to do?
Good question. Options are AFAICS: - illegal (raising the question of how to define constant data for an AA) - keys to AA are const (not sure what that might mean yet) I don't like either option myself.. So, exploring a syntax for enabling all 3 options, it looks like we have: Presuming we need all 3, any problems with this syntax? If not, the next question this raises is how/when is 'y' initialised. I recall Matthew (I think - don't quote me) asking for a 'const' that would allow initialisation in a constructor (class, or static module).
 , and you cannot slice out a chunk of bob unless you are slicing into   
 another const char[].
That would be illegal, surely? "another const char[]" is a constant reference, so you cannot change it - unless you were thinking we might lift this restriction at initialisation time, eg.
I was thinking more along the lines of const char[] str = "This is a string"; void foo (const char[] f){...} void foo (char[] f){...} foo(str[0..4]); // slices and calls the foo(const char[]) func
Ahh, that is in fact the same thing :) The slice is creating a new array, which is then initialised. So the rule might be "const arrays can be assigned only when they are initialised." The passing of the new const char[] into the function would follow normal rules from then on, as in, it's a const array so it's fine.

Aside: a non-const char[] could be passed to this function, as non-const can implicitly be promoted to const (but not vice-versa).

Regan
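The overload selection and the one-way promotion to const can be sketched like this, assuming the `const(char)[]` spelling and overload rules of later D compilers (an assumption relative to this thread). The return values exist only so the chosen overload can be observed.

```d
// Two overloads that differ only in the constness of the data.
string foo(const(char)[] f) { return "const"; }
string foo(char[] f)        { return "mutable"; }

void main()
{
    const(char)[] str = "This is a string";
    assert(foo(str[0 .. 4]) == "const");   // slice of const data stays const

    char[] buf = "This is a string".dup;
    assert(foo(buf[0 .. 4]) == "mutable"); // exact match wins

    // Promotion goes one way only.
    static assert(is(char[] : const(char)[]));
    static assert(!is(const(char)[] : char[]));
}
```

If only the const overload existed, the mutable `buf` would still be accepted through the implicit promotion.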
Jun 02 2005
next sibling parent reply Derek Parnell <derek psych.ward> writes:
On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:

[snip]

 So, exploring a syntax for enabling all 3 options, it looks like we have:
 



I'm getting confused now; sorry. Are these three things mean ... -- Derek Melbourne, Australia 3/06/2005 8:53:25 AM
Jun 02 2005
parent reply "Regan Heath" <regan netwin.co.nz> writes:
On Fri, 3 Jun 2005 08:55:55 +1000, Derek Parnell <derek psych.ward> wrote:
 On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:

 [snip]

 So, exploring a syntax for enabling all 3 options, it looks like we  
 have:




I'm getting confused now; sorry. Are these three things mean ...
Yep. Assuming we need all 3 of them. Regan
Jun 02 2005
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Regan Heath" <regan netwin.co.nz> wrote in message 
news:opsrrmmoii23k2f5 nrage.netwin.co.nz...
 On Fri, 3 Jun 2005 08:55:55 +1000, Derek Parnell <derek psych.ward> wrote:
 On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:

 [snip]

 So, exploring a syntax for enabling all 3 options, it looks like we 
 have:




I'm getting confused now; sorry. Are these three things mean ...
Yep. Assuming we need all 3 of them. Regan
Let's just have

It is quite enough.

always can be expressed by other methods.

E.g. in D if you want to have a non-changeable reference field in the class you can always do

    class Foo
    {
        private char[] _bar;

        const char[] bar() { return _bar; }
    }

and this is it.
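A sketch of that accessor, assuming the later D spelling `const(char)[]` for "read-only data" (in those compilers a leading `const` on a member function qualifies `this` rather than the return type, so the spelling differs from the one above). The constructor is a hypothetical addition so the example can run.

```d
class Foo
{
    private char[] _bar;

    // Hypothetical constructor, added only to make the sketch runnable.
    this(const(char)[] initial) { _bar = initial.dup; }

    // Hands out a read-only view of the internal buffer; no .dup needed.
    const(char)[] bar() { return _bar; }
}

void demo()
{
    auto foo = new Foo("terrainformatica.com");
    assert(foo.bar() == "terrainformatica.com");   // compare without copying

    static assert(!__traits(compiles, foo.bar()[0] = '?')); // no writes
    static assert(!is(typeof(foo.bar()) : char[]));         // no silent un-const
}

void main() { demo(); }
```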
Jun 02 2005
parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
news:d7o4tf$2h13$1 digitaldaemon.com...
 "Regan Heath" <regan netwin.co.nz> wrote in message 
 news:opsrrmmoii23k2f5 nrage.netwin.co.nz...
 On Fri, 3 Jun 2005 08:55:55 +1000, Derek Parnell <derek psych.ward> 
 wrote:
 On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:

 [snip]

 So, exploring a syntax for enabling all 3 options, it looks like we 
 have:




I'm getting confused now; sorry. Are these three things mean ...
Yep. Assuming we need all 3 of them. Regan
Let's just have
Sorry, above shall be read as:
 It is quite enough.


 always can be expressed by other methods .

 E.g. in D if you want to have non-changeable reference field in the class
 you can always do

 class Foo
 {
    private char[] _bar;

    const char[] bar() { return _bar; }
 }

 and this is it.
Jun 02 2005
prev sibling parent Tom S <h3r3tic remove.mat.uni.torun.pl> writes:
Regan Heath wrote:
 So, exploring a syntax for enabling all 3 options, it looks like we have:
 
 <snip />
 
 1 - have a constant reference, to non-constant data
 2 - have a non-constant reference, to constant data
 3 - have a constant reference, to constant data 
What about: It's consistent with the way D declarations are parsed by myBrain(tm) -- Tomasz Stachowiak /+ a.k.a. h3r3tic +/
Jun 02 2005
prev sibling next sibling parent Sean Kelly <sean f4.ca> writes:
In article <d7nra6$291a$1 digitaldaemon.com>, Andrew Fedoniouk says...
In user defined objects at least you can say 'protect'. Protect
some operations and functions from outside
This is compile time protection and not runtime! . In runtime you can
relatively easy get a pointer to private variable and change it outside.

Following logic that we cannot reach 100% protection we shall remove
this 'private', 'package' etc. What for they were introduced? They
are not 100% protecting!.
I was thinking more of something like this: 'val' might be protected, but it can still be altered by calling setVal. Since D has no concept of logical const-ness, there is no way to verify at compile-time that doNotChangeC actually did not change C. Though as I demonstrated in another thread, it is possible to verify this somewhat at compile-time using DBC.
You can do dynamic protection for UDO : you can naturally
implement:"in this particular state of object
this particular property is immutable, etc." Right?
Along those lines, I suppose this is an option:
What we all want to have is simple:
"by using const for arrays and pointers we want to make
some methods of const types 'private' - not accessible". This is it.
True enough. And this shouldn't be very hard to do. Sean
Jun 02 2005
prev sibling parent Sean Kelly <sean f4.ca> writes:
In article <d7nra6$291a$1 digitaldaemon.com>, Andrew Fedoniouk says...
What we all want to have is simple:
"by using const for arrays and pointers we want to make
some methods of const types 'private' - not accessible". This is it.
Actually, this would work for classes as well. Sean
Jun 02 2005
prev sibling parent reply Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:

kris schrieb am Thu, 02 Jun 2005 00:15:46 -0700:
 Thomas Kuehne wrote:
 If the array content is by default immutable - that is, once an element
 is set it can't be changed - the "mutable" attribute could be used to
 allow the editing of array elements. 
Now there's a different idea. Talk about CoW enforcement :-)
 In difference to the current system
 the compiler could only allow those cases where it can _prove_ that the
 array is mutable. What happens if you store pointers/object
 references in the array?!
Doesn't have to prove anything if the mutable aspect is part of the type; right, Thomas?
This isn't sufficient.

    class C{
        <immutable> int i;
    }

    void bugger(){
        C c = new C;
        c.i = 2; // **ok**
        <mutable> int* i = &c.i;
        *i = 1; // ***bug**
    }
 Aliases upon the array still have to go through the 
 same type-matching procedure; Yes?
In the sense that "<mutable> t" and "<immutable> t" are two distinct types.
 If you cast an immutable type to a mutable type, then all bets are off; 
 just as they are with *cast(void *)0 = 0;
How about enforcing that immutable types can only be cast to immutable types?

    <immutable> type[] a;
    <immutable> void* b = cast(<immutable> void*) a; // legal, the target is immutable
    <mutable> void* c = cast(<mutable> void*) a; // illegal, the target is mutable.

This isn't sufficient. Let's rewrite the sample.

    class C{
        <immutable> int i;
    }

    void bugger2(){
        C c = new C;
        c.i = 2; // **ok**
        <immutable> size_t ptrI = cast(<immutable> size_t)(cast(<immutable> void*) &c.i);
        <mutable> size_t ptrM = ptrI;
        <mutable> void* v = cast(<mutable> void*) ptrM;
        <mutable> int* i = cast(<mutable> int*) v;
        *i = 1; // **bug, but legal if mutability is a plain attribute**
    }
 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.
Is it still an issue if third-party code can be declared/proto-typed appropriately?
This would reduce the protection to a suggestion.

As can be seen: a simple <mutable> attribute isn't a sufficient protection. The compiler would have to do a quite extensive flow analysis to provide even very limited mutable access.

If the "default <immutable>" is limited to arrays, how deep would the array be protected? What about pointers as array elements?

    class D{
        <mutable> int i;
    }

    <immutable> D[] o;
    o.length=1;
    o[0]= new D; // **ok**
    o[0]= new D; // **bug**
    o[0].i = 1; // legal or illegal?

Thomas
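The escape routes described above can be sketched with the const qualifier of later D compilers (an assumption; the `<mutable>`/`<immutable>` attributes in this post are hypothetical notation). The sketch shows only that the conversions compile. It deliberately does not write through the resulting pointers, since those later compilers treat such a write as undefined behaviour.

```d
void demo()
{
    int value = 2;
    const(int)* ro = &value;

    static assert(is(int* : const(int)*));   // mutable -> const: implicit
    static assert(!is(const(int)* : int*));  // const -> mutable: rejected

    // An explicit cast compiles...
    int* direct = cast(int*) ro;

    // ...and so does the round trip through an integer, as in bugger2().
    size_t raw = cast(size_t) ro;
    int* viaInteger = cast(int*) raw;

    // Both routes end up pointing at the original variable.
    assert(direct is &value && viaInteger is &value);
}

void main() { demo(); }
```

So the type check stops accidental loss of constness, while a determined cast still gets through.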
Jun 02 2005
parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Thomas Kuehne" <thomas-dloop kuehne.this-is.spam.cn> wrote in message 
news:a333n2-t84.ln1 lnews.kuehne.cn...

 kris schrieb am Thu, 02 Jun 2005 00:15:46 -0700:
 Thomas Kuehne wrote:
 If the array content is by default immutable - that is, once an element
 is set it can't be changed - the "mutable" attribute could be used to
 allow the editing of array elements.
Now there's a different idea. Talk about CoW enforcement :-)
 In difference to the current system
 the compiler could only allow those cases where it can _prove_ that the
 array is mutable. What happens if you store pointers/object
 references in the array?!
Doesn't have to prove anything if the mutable aspect is part of the type; right, Thomas?
This isn't sufficent. class C{ <immutable> int i; } void bugger(){ C c = new C; c.i = 2; // **ok** <mutable> int* i = &c.i; *i = 1; // ***bug** }
 Aliases upon the array still have to go through the
 same type-matching procedure; Yes?
In the sense that "<mutable> t" and "<immutable> t" are two distinct types.
 If you cast an immutable type to a mutable type, then all bets are off;
 just as they are with *cast(void *)0 = 0;
How about enforcing that immutable types can only be casted to immutable types? <immutable> type[] a; <immutable> void* b = cast(<immutable> void*) a; // legal, the target is immutable <mutable> void* c = cast(<mutable> void*) a; // illegal, the target is mutable. This isn't sufficent. Let's rewrite the sample. class C{ <immutable> int i; } void bugger2(){ C c = new C; c.i = 2; // **ok**
 <immutable> size_t ptrI = cast(<immutabe> size_t)(cast(<immutable> void*)) 
 &c.i);
 <mutable> size_t ptrM = ptrI;
 <mutable> void* v = cast(<mutable> void*) ptrM;
 <mutable> int* i = cast(<mutable> int*) v;
 *i = 1; // **bug, but legal if mutability is a plain attribute**
 }
    class C
    {
        const int i = 20; // works now and perfectly. It is a compile-time
                          // constant, and such a variable may not even have
                          // a location at runtime.
    }

The only one:

    <immutable> int* ptrI = &someintvar;
    *ptrI = 20; // here the compiler must generate an error - not an l-value.

And this is it. Enough for most cases. Casting, slicing and dicing to remove constness should also be possible for masochistic use cases. We should help good people instead of fighting bad ones. Only in this case will we have a chance to see the transformation of bad guys into good guys. (Pure Canadian statement :-)

Andrew.
 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.
Is it still an issue if third-party code can be declared/proto-typed appropriately?
This would reduce the protection to a suggestion. As can be seen: a simple <mutable> attribute isn't a sufficient protection. The compiler would have to do a quite extensive flow analysis to provide even very limited mutable access. If the "default <immutable>" is limited to arrays, how deep would the array be protected? What about pointers as array elements?

    class D{
        <mutable> int i;
    }

    <immutable> D[] o;
    o.length=1;
    o[0]= new D; // **ok**
    o[0]= new D; // **bug**
    o[0].i = 1; // legal or illegal?

Thomas
Jun 02 2005
prev sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Thomas Kuehne" <thomas-dloop kuehne.this-is.spam.cn> wrote in message 
news:hsr2n2-ic3.ln1 lnews.kuehne.cn...

 Andrew Fedoniouk schrieb am Wed, 1 Jun 2005 17:55:04 -0700:
 ....for having a facility in the
 language to say "this thing shouldn't change value".
exactly. It is enough to have const T[] and const T* as distinct types from just T[] and T* . const T[] type has no opIndexAssign, length(int) and cannot be lvalue at all. Simple as 1-2-3. I really don't understand what is the motivation to do not have them. String literals are const char[] by definition.
Sadly it's not that simple. The contents of string literals are known at compile time and can thus be placed in an OS-protected memory area.

If the array content is mutable by default, the const attribute for arrays is a mere suggestion. Do a bit of pointer math or store arrays as elements in other arrays and the const attribute loses its effect.

If the array content is by default immutable - that is, once an element is set it can't be changed - the "mutable" attribute could be used to allow the editing of array elements. In contrast to the current system, the compiler could only allow those cases where it can _prove_ that the array is mutable. What happens if you store pointers/object references in the array?! In addition this has some negative impact for mixed closed-open source projects, as the compiler would have to treat all arrays coming from the closed source part as immutable.

The third way is to allow only assigning and not changing any var. It's neat but pointer math gets in the way again.

Thomas
Thomas, I am not proposing here any flags changeable at runtime or existing there. I was thinking aloud about them before, but not in this message. I did tests already with flags - indeed they do not work in some cases.

Let's forget about them and focus on just a const type modifier.

const T[] and T[] have exactly the same binary layout at runtime. The only difference is at the compiler level (as in C++). Let's imagine that we have

    const char[] t = some_other_array;

    uint find( const char[] where, const char[] what)
    {
    .....
    }

    uint replace( char[] where, const char[] from, const char[] to)
    {
    .....
    }

    find(t, "hello"); // fine
    replace( t, "c++", "d" ); // bang! compile time error : cannot change const value.

The const type modifier is not too complicated; currently it is supported for scalars. We need to extend it to arrays and pointers with a slightly modified meaning. Close to what C++ has. But not exactly. I think we don't need to go to const methods or so; they are too shaky even in C++. Just a type modifier for POD types (as currently) and for arrays and pointers, which generally speaking are also POD types in D.

This is it. Not rocket science.

Andrew.
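A runnable sketch of that contract, assuming the `const(char)[]` spelling of later D compilers. `find` here returns the match index (or the haystack length when nothing matches), and `replaceChar` is a hypothetical, simplified stand-in for `replace` that overwrites single characters in place; neither body comes from this thread.

```d
// Read-only parameters: callers may pass const or mutable data.
size_t find(const(char)[] where, const(char)[] what)
{
    if (what.length > where.length)
        return where.length;
    foreach (i; 0 .. where.length - what.length + 1)
        if (where[i .. i + what.length] == what)
            return i;
    return where.length; // not found
}

// Hypothetical stand-in for replace(): needs writable data, so it asks for char[].
size_t replaceChar(char[] where, char from, char to)
{
    size_t count = 0;
    foreach (ref c; where)
        if (c == from) { c = to; ++count; }
    return count;
}

void demo()
{
    const(char)[] t = "c++ and d";
    assert(find(t, "and") == 4);                                  // fine
    static assert(!__traits(compiles, replaceChar(t, '+', '-'))); // rejected

    char[] m = t.dup;                      // a private, writable copy
    assert(replaceChar(m, '+', '-') == 2);
    assert(m == "c-- and d" && t == "c++ and d");
}

void main() { demo(); }
```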
Jun 02 2005
parent reply Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:

Andrew Fedoniouk schrieb am Thu, 2 Jun 2005 00:38:54 -0700:
<snip>
 Thomas, I am not proposing here any flags changeable in runtime or existing 
 there.
 I was thinking aloud before about them but not in this message.
 I did tests already with flags - indeed they do not work in some cases.
The only way I am aware of to enforce const at runtime could be compared to Java's String pool...
 Let's forget about them and focus on just a const type modifier

 const T[] and T[] has exactly the same binary layout in runtime.

 The only difference is on compiler level (as in C++).
 Lets imagine that we have

 const char[] t = some_other_array;

 uint find( const char[] where, const char[] what)
 {
 .....
 }

 uint replace( char[] where, const char[] from, const char[] to)
 {
 .....
 }

 find(t, "hello"); // fine
 replace( t, "c++", "d" ); // bang! compile time error : cannot change const 
 value.

 const type modifier is not to much complicated
 currently it is supported for scalars. We need to extend it
 to arrays and pointers with slightly modified meaning.
 Close to what C++ has. But not exactly.
Now that is the tricky part. ;) What would be the const rules for arrays and pointers/references? How/when would those rules be checked and/or enforced?

Thomas
Jun 02 2005
parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Thomas Kuehne" <thomas-dloop kuehne.this-is.spam.cn> wrote in message 
news:h943n2-na4.ln1 lnews.kuehne.cn...

 Andrew Fedoniouk schrieb am Thu, 2 Jun 2005 00:38:54 -0700:
 <snip>
 Thomas, I am not proposing here any flags changeable in runtime or 
 existing
 there.
 I was thinking aloud before about them but not in this message.
 I did tests already with flags - indeed they do not work in some cases.
The only way I am aware of to enforce const at runtime could be compared to Java's String pool...
 Let's forget about them and focus on just a const type modifier

 const T[] and T[] has exactly the same binary layout in runtime.

 The only difference is on compiler level (as in C++).
 Lets imagine that we have

 const char[] t = some_other_array;

 uint find( const char[] where, const char[] what)
 {
 .....
 }

 uint replace( char[] where, const char[] from, const char[] to)
 {
 .....
 }

 find(t, "hello"); // fine
 replace( t, "c++", "d" ); // bang! compile time error : cannot change 
 const
 value.

 const type modifier is not to much complicated
 currently it is supported for scalars. We need to extend it
 to arrays and pointers with slightly modified meaning.
 Close to what C++ has. But not exactly.
Now that is the tricky part. ;) What would be the const rules for arrays and pointers/references? How/When would those rules be checked and/or enfored?
When: at compile time.
How: by generating a compile-time error.

How would it be checked? Exactly as right now. Imagine that const char[] is just a typedefed char[]. Can the compiler check casting problems of typedefed types now? Yes. This is it. We don't need anything more here. The only other thing: string literals are const char arrays by their nature and definition.

See, I am clearly expressing my intentions - defining contracts on parameters:

--------------------------------
foo(char[] ms)
foo(const char[] ims)
--------------------------------

    const char[] s1 = url.hostname;
    char[] s2 = url.hostname; // compile time error - cast to non-const

but this should be possible too (my guess):

    char[] s2 = cast(char[]) url.hostname; // const is a recommendation, not more!

It would be better to have constcast() for such cases, but we can live without that.

Andrew.
Jun 02 2005
prev sibling parent Eugene Pelekhay <pelekhay gmail.com> writes:
Walter wrote:
 "Eugene Pelekhay" <pelekhay gmail.com> wrote in message
 news:d7hfuh$1ejl$1 digitaldaemon.com...
 
May be I'm dummy, but I don't see in this example why this other
languages must copy it 10 times. For my implementation of reference
counted string in my C++ project, copy will be performed also 0 times.
And if there is more then 1 reference to instance exsits it's only one
copy operation will be performed. I see only one advantage in current
implementation of string - not need to check or increment/decrement
reference counter, but instead of this string duplication is required
You're right that you can avoid excessive copying by doing ref counting. Reference counting carries with it other penalties - storage must be allocated for the ref count, every copy increments the count, and every reference that goes out of scope must decrement the count. Add in exception handling, and the price is high (although C++'s mechanisms hide that price from you).
I know about the price I pay for ref counting. But in my field, as in many others, it is a reasonable price to be sure that my application will not freeze for some time to perform garbage collection. Another thing which is significant for me is destructor calls from the garbage collector. I don't care when memory will be released, but I want to be sure that the destructor is called as soon as all references to an object are gone. If the latter is not guaranteed then my code will look like Java, with an enormous number of *try finally* blocks with forced calls to cleanup methods. IMHO to be successful the D language must reduce the development cycle, and this is the only reason big bosses will understand.
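The kind of deterministic cleanup asked for here can be sketched with a struct destructor, assuming the behaviour of later D compilers, where a struct value is destroyed when its scope ends rather than when the collector runs. `Handle` and the counter are hypothetical names used only for illustration.

```d
// Hypothetical resource wrapper: the destructor records that cleanup ran.
struct Handle
{
    int* closed; // points at a counter owned by the caller

    ~this()
    {
        if (closed)
            ++*closed;
    }
}

void demo()
{
    int closedCount = 0;
    {
        auto h = Handle(&closedCount);
        assert(closedCount == 0); // still alive inside the block
    }
    // The destructor ran at the closing brace: no try/finally, no GC pass.
    assert(closedCount == 1);
}

void main() { demo(); }
```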
 
 Ref counting would make it impractical to do D's array slices.
Yes, this is a nice feature, but someone (like me) can say that bitfields in C are also a very nice feature, which I use quite often (in the binary exchange protocol between my devices). This is all about constness and CoW, and without it I only see hard-to-find errors in code (if the code is not just a simple one-page test.d). Don't think I'm against this nice feature.
 
 Furthermore, in the presence of garbage collection, layering on top a
 reference counting mechanism probably means you'll want to ditch the gc and
 go with a full ref counting architecture for every object. In my experience,
 such is slower than using mark/sweep gc.
  
Just remember all the fields where performance is important. I can list some of them:
1) real-time systems programming (gc is not acceptable if it is not deterministic)
2) games that make extensive use of the latest hardware (freezing for some time at unpredictable moments; who would play such a game?)
3) embedded systems (resources are limited in most cases, so you need to utilize all the resources you have)
4) scientific calculations

As you see, high performance requirements often go hand in hand with deterministic execution time, which is not guaranteed with GC, or at least with the existing implementation.

PS: To anyone who sees this post and thinks I want to rip the D language apart: I don't. I have been praying for the success of D for the last ~5 years, and I'm here to bring all my experience to make it successful. I'm not starting a flame war here, because I really like the language.
Jun 02 2005
Derek Parnell <derek psych.ward> writes:
On Tue, 31 May 2005 00:46:35 -0700, Walter wrote:


[snip]
 
 A number of languages use the immutable string idiom, and its corollary
 "always implicitly copy the string when writing to it". They all share
 another common characteristic - they're slow, and they're slow in a manner
 that is *not fixable*. And they're not just slower by a factor, many
 algorithms run *exponentially* slower because of the copying.
 
 D must be fast, and the only way to be fast with strings (and arrays) is to
 not have the language implicitly copy them, but to allow the programmer the
 flexibility to copy or not copy. To know when to copy, use the Copy On Write
 principle (COW). That is, if you're not *sure* you've got the only copy of a
 string, .dup it before modifying it.
I think there are two distinct aspects that are sometimes being confused or mingled. One is the idea that the compiler must *prevent* read-only variables from being modified, and the other is that the compiler must *report* when it detects (during compilation) code that attempts (i.e. would attempt at run time) to write to a read-only item.

The first idea is the subject of the CoW proposition above; that D takes the position that the compiler is not responsible for this but that the coder is. But I still think that we haven't heard Walter's final position on the second idea.

So Walter, what if we could indicate which items we would like to be read-only and if the compiler detects code which is writing to them, it issues an error message? I know this would not cause these items to be read-only, but it may help prevent silly coding errors such as the other silly coding errors you've already added protection for in D.

-- 
Derek Parnell
Melbourne, Australia
31/05/2005 11:02:10 PM
May 31 2005
"Andrew Fedoniouk" <news terrainformatica.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message 
news:djl0rio88ubd$.1v38dy1thwnze.dlg 40tude.net...
 On Tue, 31 May 2005 00:46:35 -0700, Walter wrote:


 [snip]

 A number of languages use the immutable string idiom, and its corollary
 "always implicitly copy the string when writing to it". They all share
 another common characteristic - they're slow, and they're slow in a 
 manner
 that is *not fixable*. And they're not just slower by a factor, many
 algorithms run *exponentially* slower because of the copying.

 D must be fast, and the only way to be fast with strings (and arrays) is 
 to
 not have the language implicitly copy them, but to allow the programmer 
 the
 flexibility to copy or not copy. To know when to copy, use the Copy On 
 Write
 principle (COW). That is, if you're not *sure* you've got the only copy 
 of a
 string, .dup it before modifying it.
I think there are two distinct aspects that are sometimes being confused or mingled. One is the idea that the compiler must *prevent* read-only variables from being modified, and the other is that the compiler must *report* when it detects (during compilation) code that attempts (i.e would attempt at run time) to write to a read-only item. The first idea is the subject of the CoW proposition above; that D takes the position that the compiler is not responsible for this but that the coder is. But I still think that we haven't heard Walter's final position on the second idea. So Walter, what if we could indicate which items we would like to be read-only and if the compiler detects code which is writing to them, it issues an error message? I know this would not cause these items to be read-only, but it may help prevent silly coding errors such as the other silly coding errors you've already added protection for in D.
Derek, this is exactly what const does in C++. It does one more thing, in fact: char[] and const char[] are distinct types, so you can do:

    void foo( char[] s) { ... }
    void foo( const char[] s) { ... }

This also allows you to clearly show your intentions and so on.
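The overloading idea above can be sketched in present-day D. This is only an illustration under stated assumptions: the `const(char)[]` spelling belongs to later versions of the language (D2) and was not available when this thread was written, and `describe` is a hypothetical name, not anything from the thread.

```d
// A minimal sketch, assuming present-day D (D2) type qualifiers.
// describe() is a hypothetical function used only for illustration.

// Chosen when the caller passes a mutable array.
string describe(char[] s)
{
    return "mutable";
}

// Chosen when the caller passes a read-only view (const or immutable data).
string describe(const(char)[] s)
{
    return "read-only";
}
```

A mutable `char[]` matches the first overload exactly, while a string literal (immutable data) can only be passed to the second, so the signature itself documents whether the function may write to its argument.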
 -- 
 Derek Parnell
 Melbourne, Australia
 31/05/2005 11:02:10 PM 
May 31 2005
Jan-Eric Duden <jeduden whisset.com> writes:
Walter wrote:
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7gtvf$qs0$1 digitaldaemon.com...
 
java.lang.String class has a) methods b) String owns buffer - it controls
buffer.

In D is possible:
int[char[]] map;
char[] s = "something";
map[s] = 1;
s[0] = '?'; // I have no idea what result will be. sure not good.

And you can bump into such problem quite easily in D. I personally
did many times. And too hard to find source sometimes.

In Java such collision is not possible in principle: String is final and
immutable.
A number of languages use the immutable string idiom, and its corollary "always implicitly copy the string when writing to it". They all share another common characteristic - they're slow, and they're slow in a manner that is *not fixable*. And they're not just slower by a factor, many algorithms run *exponentially* slower because of the copying.

D must be fast, and the only way to be fast with strings (and arrays) is to not have the language implicitly copy them, but to allow the programmer the flexibility to copy or not copy. To know when to copy, use the Copy On Write principle (COW). That is, if you're not *sure* you've got the only copy of a string, .dup it before modifying it.

So why isn't that just as bad as the languages that implicitly copy on write? The answer is that often, you know that you are the sole owner, such as:

    char[] s = new char[10];
    for (int i = 0; i < 10; i++)
        s[i] = 'c';

Those other languages are doomed to make 10 copies of s. The D programmer needs to make 0 copies.

As to your example above, when you pass a reference to a string to an associative array, then you aren't the sole owner of that string anymore. Don't change it. .dup it.
If D had a standard string class there wouldn't be any problem! The string class would implement immutable strings without COW. That's what you need in 99% of all applications. Applications that are an exception to the rule should use char arrays with dup if needed.

That's a win-win situation, with no performance hit. The only thing that people need to get used to is that "normal" strings are immutable, but that's not really hard to accept.

Cheers,
Jan
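The copy-on-write rule quoted above, and the "char arrays with dup if needed" escape hatch, can be sketched roughly as follows. This assumes present-day D syntax, and the helper names `filled` and `withFirstChar` are hypothetical, chosen only for this illustration.

```d
// A minimal sketch of the COW convention, assuming present-day D.
// filled() and withFirstChar() are hypothetical helper names.

// Sole owner: the array was just allocated here, so it can be
// written in place and no copies are made.
char[] filled(size_t n, char c)
{
    char[] s = new char[n];
    foreach (ref ch; s)
        ch = c;
    return s;
}

// Not the sole owner: someone else may still hold a reference to
// text, so .dup it once and modify only the private copy.
char[] withFirstChar(const(char)[] text, char c)
{
    char[] copy = text.dup;
    if (copy.length > 0)
        copy[0] = c;
    return copy;
}
```

The second function makes exactly one copy per call regardless of how many characters it later changes, which is the distinction being drawn from languages that copy implicitly on every write.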
Jun 03 2005
Derek Parnell <derek psych.ward> writes:
On Mon, 30 May 2005 22:50:29 -0700, Andrew Fedoniouk wrote:


[snip]

 class Url
 {
    char[] _hostname;
    ...
    char[] hostname() { return _hostname.dup; } // Doh!
 }
 
 if( url.hostname == "terrainformatica.com" )
 // 32 bytes less in memory, just to compare it!
   ....
 
 Ideal from many points of view would be a solution with const
 
 class Url {
   char[] _hostname;
 
   const char[] hostname() { return _hostname; } // Yep! this exactly what we 
 need.
 
 }
 
Given the current semantics of D, could a workaround be that we give the caller the choice, thus making them take explicit responsibility for their usage?

    class Url
    {
       private char[] _hostname;
       ...
       char[] hostname_unsafe() { return _hostname; }
       char[] hostname()        { return _hostname.dup; }
    }

    char[] a;
    char[] b;
    a = url.hostname;        // Gets a string with safety
    b = url.hostname_unsafe; // Gets a string without safety

<offtopic>
Of course, if we had return type function matching this would be a whole lot easier and more legible.

    typedef char[] safe_string;

    class Url
    {
       private char[] _hostname;
       ...
       char[] hostname()      { return _hostname; }
       safe_string hostname() { return cast(safe_string)_hostname.dup; }
    }

    safe_string a;
    char[] b;
    a = url.hostname; // Gets a string with safety
    b = url.hostname; // Gets a string without safety

But I should wake up from this dream now ... -)
</offtopic>

-- 
Derek
Melbourne, Australia
2/06/2005 10:12:16 AM
Jun 01 2005
"Andrew Fedoniouk" <news terrainformatica.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message 
news:lktv44kgbu1j.1tp9nxs3uqvmr.dlg 40tude.net...
 On Mon, 30 May 2005 22:50:29 -0700, Andrew Fedoniouk wrote:


 [snip]

 class Url
 {
    char[] _hostname;
    ...
    char[] hostname() { return _hostname.dup; } // Doh!
 }

 if( url.hostname == "terrainformatica.com" )
 // 32 bytes less in memory, just to compare it!
   ....

 Ideal from many points of view would be a solution with const

 class Url {
   char[] _hostname;

   const char[] hostname() { return _hostname; } // Yep! this exactly what 
 we
 need.

 }
 Given the current semantics of D, could a workaround be that we give the
 caller the choice, thus making them take explicit responsibility for their
 usage.

 class Url
 {
    private char[] _hostname;
    ...
    char[] hostname_unsafe() { return _hostname; }
    char[] hostname() { return _hostname.dup; }
 }

 char[] a;
 char[] b;
 a = url.hostname; // Gets a string with safety
 b = url.hostname_unsafe; // Gets a string without safety
Derek, it is not a choice. Nobody in good mental health will do such a double implementation. It is easier to switch to C++.

    const char[] hostname() { return _hostname; }

    const char[] hostname = url.hostname;
    const char[] level1 = hostname[$-3..$];
    char[] level2 = hostname[0..$-3]; // bang! a const slice must not convert to a mutable char[]

We do have const for int, double, etc. Why not for arrays and pointers? They are also primitive types built into the language, right? Yes, this is a bit different in implementation, but for the sake of consistency?

Jeez, I am losing a big project for D and Harmonia.... The team is voting for C++. Only const! :( I simply do not have the moral right to insist further. That was a really good chance to feed Harmonia....
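For comparison, here is a rough sketch of how the const accessor asked for above can be written in present-day D. It assumes the later `const(char)[]` syntax (D2), which did not exist at the time of this post; the constructor and the `topLevel` helper are hypothetical additions for illustration only.

```d
// A minimal sketch, assuming present-day D (D2) type qualifiers.
class Url
{
    private char[] _hostname;

    // Hypothetical constructor: takes a private copy of the host name.
    this(const(char)[] host)
    {
        _hostname = host.dup;
    }

    // Hands out a read-only view of the internal buffer, so no .dup
    // is needed just to let callers look at the host name.
    const(char)[] hostname() const
    {
        return _hostname;
    }
}

// Hypothetical helper: slicing a read-only view yields another
// read-only view of the same buffer.
const(char)[] topLevel(const Url url)
{
    auto h = url.hostname;
    return h.length >= 3 ? h[$ - 3 .. $] : h;
}

// A mutable array converts to a read-only view, but not the other way
// round, so assigning such a slice to a char[] is a compile-time error.
static assert(is(char[] : const(char)[]));
static assert(!is(const(char)[] : char[]));
```

With this shape, a comparison such as `url.hostname == "terrainformatica.com"` reads the buffer in place, and the type system rejects the `char[] level2 = ...` line instead of leaving it to convention.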
 <offtopic>
 Of course, if we had return type function matching this would be a whole
 lot easier and legible.

 typedef char[] safe_string;

 class Url
 {
    private char[] _hostname;
    ...
    char[] hostname()      { return _hostname; }
    safe_string hostname() { return cast(safe_string)_hostname.dup; }
 }

 safe_string a;
 char[] b;
 a = url.hostname; // Gets a string with safety
 b = url.hostname; // Gets a string without safety

 But I should wake up from this dream now ... -)
 </offtopic>

 -- 
 Derek
 Melbourne, Australia
 2/06/2005 10:12:16 AM 
Jun 01 2005