
digitalmars.D - constness for arrays

reply xs0 <xs0 xs0.com> writes:
As has been discussed, the lack of something like C++ const hurts most 
when using arrays, as you can't code around it like with classes, 
structs or primitives (with the latter two you can just pass by value, 
for classes you can make readonly versions). The fact that inbuilt 
strings are also arrays makes the problem occur often.

I was wondering whether the following would resolve that issue:

- the top bit of arrays' .length becomes an indicator of the 
readonlyness of the array reference

- type of .length is changed from uint to int (just to indicate the 
proper maximum value; it still can't be negative (from the user's POV))

- arrays get a .isReadonly property which tests the top bit

- arrays get a .lock() method that sets it to 1

- .dup clears it (obviously :)

- reading .length masks the bit out

- setting .length sets the bit to zero if reallocation occurs, and 
leaves it intact otherwise

- arrays get a .readonly property which returns a copy of the array 
reference with the bit set

- optionally, arrays get a .needToWrite() method which does the 
following: { if (arr.isReadonly) arr=arr.dup; }  (yes, the name sucks)
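The bit-twiddling the list above describes can be sketched outside D; the following C++ fragment (ArrayRef and RO_BIT are names I made up, and this is not the actual DMD array layout) shows the flag, the mask, and the copy-on-write check:

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical sketch of the proposal: an array reference whose high bit
// of the length field marks the reference read-only.
struct ArrayRef {
    char*    ptr;
    uint32_t len_;  // top bit = read-only flag, lower 31 bits = length

    static const uint32_t RO_BIT = 0x80000000u;

    uint32_t length()     const { return len_ & ~RO_BIT; }     // mask the flag out
    bool     isReadonly() const { return (len_ & RO_BIT) != 0; }
    void     lock()             { len_ |= RO_BIT; }            // set read-only

    // .dup: copy the data; the copy starts out writable (flag cleared)
    ArrayRef dup() const {
        ArrayRef c;
        c.len_ = length();
        c.ptr  = (char*)std::malloc(c.len_);
        std::memcpy(c.ptr, ptr, c.len_);
        return c;
    }

    // .needToWrite: copy-on-write only when the reference is read-only
    void needToWrite() { if (isReadonly()) *this = dup(); }
};
```

Note that locking and duplicating touch only the reference, never the data, which is why moving references around costs the same as before.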


Now this has the following (imho) neat properties:

- initial implementation should be quite trivial, I bet Walter could do 
it in a few hours; eventually, debug builds could prevent you from 
writing to a readonly array, but even that's not all that important

- losing that one bit has no real effect on anything

- it can be tested for at runtime

- it has practically negligible impact on efficiency:
   - reading .length needs one instruction more (AND with 0x7fffffff)
   - setting .length needs about three instructions more
   - reading and writing to the array has no additional cost
   - moving the reference around also costs the same
   - new operations are quite trivial as well

- fits with COW perfectly

- there's no const-pollution and no need to write two versions of functions


A quick example of the possibilities:

char[] toUpper(char[] txt)
{
     for (int i=0; i<txt.length; i++) {
         char c = txt[i];
         if ('a'<=c && c<='z') {
             txt.needToWrite();
             txt[i] = c-(cast(char)'a'-'A');
         }
     }
     return txt;
}

char[] FOO = toUpper("foo"); // constants are readonly, so COW is made

char[] bibi = getBibi(); // who owns it? I can finally know if it's me
char[] BIBI = toUpper(bibi); // write into bibi, if owned
char[] BIBI = toUpper(bibi.readonly); // leave bibi alone, as I need it


What y'all think?


xs0

PS:
Credits: the idea is not all mine, I got it from the discussion with 
Reiner Pope on D.learn

I'm also sorry if this was already suggested, but I don't remember 
anything of the sort being discussed.
Jul 18 2006
next sibling parent Reiner Pope <reiner.pope gmail.com> writes:
I like it. Same capabilities as I was grasping at, but much more elegant.

   - reading and writing to the array has no additional cost

That's for the version with checking disabled, for max speed; for extra safety, you could set write-checking for debug builds, where the slowdown should be more acceptable.
Jul 18 2006
prev sibling next sibling parent reply David Medlock <noone nowhere.com> writes:
xs0 wrote:
 As has been discussed, the lack of something like C++ const hurts most 
 when using arrays, as you can't code around it like with classes, 
 structs or primitives (with the latter two you can just pass by value, 
 for classes you can make readonly versions). The fact that inbuilt 
 strings are also arrays makes the problem occur often.
 
 I was wondering whether the following would resolve that issue:
 
 - the top bit of arrays' .length becomes an indicator of the 
 readonlyness of the array reference
 

I like it except it drops the max array size by half, doesn't it?

Since we are talking about dynamic arrays here, why not just:

1. add a flags byte or short to the internal array structure to hold it.

2. make the pointer's lower bit hold it - this of course assumes the 
pointer is at least word-aligned. I am not sure if this would conflict 
with structs which are align(1).

-DavidM
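Alternative #2 above (the low pointer bit) can be illustrated concretely; a C++ sketch with made-up names, where the word-alignment assumption is exactly the caveat about align(1) structs:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical tagged-pointer reference: the read-only flag lives in
// bit 0 of the data pointer. Valid only if the pointee is guaranteed
// to be at least 2-byte aligned.
struct TaggedRef {
    uintptr_t bits;  // pointer with the flag folded into bit 0

    static TaggedRef make(void* p, bool readonly) {
        TaggedRef r;
        r.bits = reinterpret_cast<uintptr_t>(p) | (readonly ? 1u : 0u);
        return r;
    }
    void* ptr()        const { return reinterpret_cast<void*>(bits & ~uintptr_t(1)); }
    bool  isReadonly() const { return (bits & 1) != 0; }
};
```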
Jul 18 2006
parent xs0 <xs0 xs0.com> writes:
David Medlock wrote:
 xs0 wrote:
 As has been discussed, the lack of something like C++ const hurts most 
 when using arrays, as you can't code around it like with classes, 
 structs or primitives (with the latter two you can just pass by value, 
 for classes you can make readonly versions). The fact that inbuilt 
 strings are also arrays makes the problem occur often.

 I was wondering whether the following would resolve that issue:

 - the top bit of arrays' .length becomes an indicator of the 
 readonlyness of the array reference

I like it except it drops the max array size by half, doesn't it?

Theoretically, but in practice:

- if you have a 64-bit machine, you don't care

- if you have a 32-bit machine, you can't get the full 4GB anyway (on 
Windows, a process can only allocate 2GB; I bet it's similar in other OSes)

- with anything larger than a byte you don't even theoretically need that bit
 Since we are talking about dynamic arrays here, why not just:
 
 1. add a flags byte or short to the internal array structure to hold it.

because that would increase the size of the array structure, making it consume more memory (I'd guess at least 4 bytes per reference) and making it slower (more data to copy)
 2. make the pointers lower bit hold it- this of course assumes the 
 pointer is at least word-aligned.  I am not sure if this would conflict 
 with structs which are align(1).

that wouldn't work - the pointer is unrestricted and can point anywhere.

xs0
Jul 18 2006
prev sibling next sibling parent reply Don Clugston <dac nospam.com.au> writes:
xs0 wrote:
 As has been discussed, the lack of something like C++ const hurts most 
 when using arrays, as you can't code around it like with classes, 
 structs or primitives (with the latter two you can just pass by value, 
 for classes you can make readonly versions). The fact that inbuilt 
 strings are also arrays makes the problem occur often.
 
 I was wondering whether the following would resolve that issue:
 
 - the top bit of arrays' .length becomes an indicator of the 
 readonlyness of the array reference

This is a really interesting idea. You're essentially chasing a 
performance benefit, rather than program correctness.

Some benchmarks ought to be able to tell you if the performance benefit 
is real: instead of char[], use

struct CharArray {
    char[] arr;
    bool readOnly;
}

for both the existing and proposed behaviour (for the existing one, 
readOnly is ignored, but include it to make the parameter passing fair).

For code that makes heavy use of COW, I suspect that the benefit could 
be considerable. You probably don't need to eliminate many .dups to pay 
for the slightly slower .length.

The situation where a function only occasionally returns a read-only 
string is probably quite common:

char[] func(int n) {
    if (n==0) return "annoying";
    else return toString(n);
}

.. and you have to .dup it for the rare case where it's a literal.
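The measurement setup suggested above could look roughly like this; a C++ stand-in (CharArray and toUpperFirst are assumed names, and std::string plays the role of char[]) that counts how many defensive copies each behaviour makes:

```cpp
#include <cassert>
#include <string>

// Stand-in for the suggested benchmark type: the same struct is used for
// both behaviours, so parameter passing costs the same and only the
// number of ".dup"s differs.
struct CharArray {
    std::string arr;   // plays the role of char[]
    bool readOnly;
};

static int dupCount = 0;  // how many defensive copies were made

// Proposed behaviour: copy only when the reference is marked read-only,
// and only once a write is actually needed.
CharArray toUpperFirst(CharArray s) {
    if (s.arr.empty() || !(s.arr[0] >= 'a' && s.arr[0] <= 'z'))
        return s;                    // nothing to write, no copy needed
    if (s.readOnly) {
        s.arr = std::string(s.arr);  // the ".dup"
        s.readOnly = false;
        ++dupCount;
    }
    s.arr[0] = char(s.arr[0] - ('a' - 'A'));
    return s;
}
```

The existing behaviour corresponds to always taking the copy branch; the difference in dupCount over a realistic workload is the quantity the benchmark would measure.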
Jul 18 2006
next sibling parent David Medlock <noone nowhere.com> writes:
Don Clugston wrote:
 xs0 wrote:
 
 As has been discussed, the lack of something like C++ const hurts most 
 when using arrays, as you can't code around it like with classes, 
 structs or primitives (with the latter two you can just pass by value, 
 for classes you can make readonly versions). The fact that inbuilt 
 strings are also arrays makes the problem occur often.

 I was wondering whether the following would resolve that issue:

 - the top bit of arrays' .length becomes an indicator of the 
 readonlyness of the array reference

 This is a really interesting idea. You're essentially chasing a 
 performance benefit, rather than program correctness.

 Some benchmarks ought to be able to tell you if the performance benefit 
 is real: instead of char[], use

 struct CharArray {
     char[] arr;
     bool readOnly;
 }

 for both the existing and proposed behaviour (for the existing one, 
 readOnly is ignored, but include it to make the parameter passing fair).

 For code that makes heavy use of COW, I suspect that the benefit could 
 be considerable. You probably don't need to eliminate many .dups to pay 
 for the slightly slower .length.

 The situation where a function only occasionally returns a read-only 
 string is probably quite common:

 char[] func(int n) {
     if (n==0) return "annoying";
     else return toString(n);
 }

 .. and you have to .dup it for the rare case where it's a literal.

Agreed, Don. It's important to note this is two issues:

1. A readonly property of arrays.
2. Implementation of #1.

If Walter agrees on #1, I am sure he is the best person to ask for 
advice on #2 (at least in the case of DMD).

-DavidM
Jul 18 2006
prev sibling next sibling parent xs0 <xs0 xs0.com> writes:
Don Clugston wrote:
 xs0 wrote:
 As has been discussed, the lack of something like C++ const hurts most 
 when using arrays, as you can't code around it like with classes, 
 structs or primitives (with the latter two you can just pass by value, 
 for classes you can make readonly versions). The fact that inbuilt 
 strings are also arrays makes the problem occur often.

 I was wondering whether the following would resolve that issue:

 - the top bit of arrays' .length becomes an indicator of the 
 readonlyness of the array reference

This is a really interesting idea. You're essentially chasing a performance benefit, rather than program correctness. Some benchmarks ought to be able to tell you if the performance benefit is real:

Well, actually the primary motivation is correctness*, but the whole thing does indeed seem to benefit performance as well (well, pending some actual tests, but it seems quite obvious).
 instead of char[], use
 
 struct CharArray {
  char [] arr;
  bool readOnly;
 }
 
 for both the existing and proposed behaviour (for the existing one, 
 readonly is ignored, but include it to make the parameter passing fair).

will probably do, as soon as I have some time (but if anyone else feels like it, go ahead ;)
 The situation where a function only occasionally returns a read-only 
 string is probably quite common:
 
 char [] func(int n) {
   if (n==0) return "annoying";
   else return toString(n);
 }
 .. and you have to .dup it for the rare case where it's a literal.

Yup. And it's also common to .dup just because there is no indication at 
all whether it is required or not, and it's the safe thing to do. For 
example, if you call something like ucFirst(toLower("BIBI")), there's no 
need for ucFirst to .dup (already done in toLower), but it still does, 
as it has no idea that is the case.

xs0

*) correctness in the sense that it becomes easier to write correct 
programs, and not in the sense that the compiler would force you to do 
so, at least until debug-time checking is done
Jul 18 2006
prev sibling parent xs0 <xs0 xs0.com> writes:
Don Clugston wrote:
 xs0 wrote:
 - the top bit of arrays' .length becomes an indicator of the 
 readonlyness of the array reference

 This is a really interesting idea. You're essentially chasing a 
 performance benefit, rather than program correctness.

 Some benchmarks ought to be able to tell you if the performance benefit 
 is real: instead of char[], use

 struct CharArray {
     char[] arr;
     bool readOnly;
 }

 for both the existing and proposed behaviour (for the existing one, 
 readOnly is ignored, but include it to make the parameter passing fair).

 For code that makes heavy use of COW, I suspect that the benefit could 
 be considerable. You probably don't need to eliminate many .dups to pay 
 for the slightly slower .length.

Well, I did a (admittedly biased :) test, and there do seem to be 
potentially large benefits. I wrote an app that counts different words 
in the 5.6MB ASCII text from http://www.gutenberg.org/etext/1581

The text was read and duplicated 10 times (so I could do 10 runs). Then, 
words were extracted (word := a sequence of alnum chars), lowercased and 
placed into an AA. I ran each version about 20 times, and here are the 
fastest results for each:

bib_current   : 3641ms
bib_str       : 3031ms
bib_str (old) : 3625ms
bib_str (ugly): 3109ms

Commenting out the AA stuff, the results become

bib_current   : 1281
bib_str       : 812
bib_str (old) : 1234
bib_str (ugly): 812

About 11% of the calls to toLower would currently result in .duping, and 
none do with the new system, as it's not necessary in this particular 
case. Had I used toUpper, ... :)

bib_current is exactly the same code, except it uses Phobos' tolower and 
char[] instead of string. For some reason, bib_current is (slightly) 
slower even than the string version that does exactly the same thing; my 
guess would be that some more inlining/optimization was done in my code.

For some other reason, toLowerUgly is slower than toLowerNew, even 
though it potentially does fewer checks. Probably the benefit of that 
was lost completely, as words tend to only have the first character 
uppercase, and more code just slowed the thing down.

Well, anyway, the conclusion would be that using that bit for readonly 
indication does not cause slowdowns even for code that doesn't use it. 
If used for COW-only-when-necessary, speed gains can be considerable.

xs0

The code was this (I snipped boring code in the interest of brevity, I 
can post the full code if someone wants it):

struct string {
    char* ptr;
    uint _length;

    public static string opCall(char[] bu, int readonly) { ... }
    public int length() { return _length & 0x7fffffff; }
    public void length(int newlen) { ... }
    public string dup() { ... }
    public char opIndex(int i) { return ptr[i]; }
    public char opIndexAssign(char c, int i) { return ptr[i] = c; }
    public char[] toString() { return ptr[0..length()]; }
    public void wantToWrite() { if (_length & 0x80000000) { ... } }
    public string slice(int start, int end) { ... }
}

string toLowerOld(string txt) {
    int l = txt.length;
    for (int a=0; a<l; a++) {
        char c = txt[a];
        if (c>='A' && c<='Z') {
            txt = txt.dup;
            txt[a] = c+32;
            for (int b=a+1; b<l; b++) {
                c = txt[b];
                if (c>='A' && c<='Z')
                    txt[b] = c+32;
            }
            return txt;
        }
    }
    return txt;
}

string toLowerNew(string txt) {
    int l = txt.length;
    for (int a=0; a<l; a++) {
        char c = txt[a];
        if (c>='A' && c<='Z') {
            txt.wantToWrite();
            txt[a] = c+32;
        }
    }
    return txt;
}

string toLowerUgly(string txt) {
    int l = txt.length;
    for (int a=0; a<l; a++) {
        char c = txt[a];
        if (c>='A' && c<='Z') {
            txt.wantToWrite();
            txt[a] = c+32;
            for (int b=a+1; b<l; b++) {
                c = txt[b];
                if (c>='A' && c<='Z')
                    txt[b] = c+32;
            }
            return txt;
        }
    }
    return txt;
}

void main() {
    string[] bible;
    bible.length = 10;
    for (int a=0; a<bible.length; a++) {
        if (a==0) {
            bible[a] = string(cast(char[])read("bible.txt"), 0);
        } else {
            bible[a] = bible[a-1].dup;
        }
    }

    long start = getUTCtime();
    uint result;
    for (int q=0; q<bible.length; q++) {
        string txt = bible[q];
        int[char[]] count;
        int pos = 0;
        while (pos<txt.length) {
            if (!isalnum(txt[pos])) { pos++; continue; }
            int len = 1;
            while (pos+len < txt.length && isalnum(txt[pos+len]))
                len++;

            //string word = toLowerOld(txt.slice(pos, pos+len));
            string word = toLowerNew(txt.slice(pos, pos+len));
            //string word = toLowerUgly(txt.slice(pos, pos+len));
            pos += len;

            if (auto c = word.toString() in count) {
                (*c)++;
            } else {
                count[word.toString()] = 1;
            }
        }
        result = count.length;
    }
    long end = getUTCtime();

    writefln("Different words found: ", result);
    writefln("Time taken: ", (end-start));
}
Jul 19 2006
prev sibling next sibling parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
Dynamic constness versus static (compile time) constness is not new.

For example in Ruby you can dynamicly declare object/array readonly and
its runtime will control all modifications and note - in full as Ruby's 
sandbox
(as any other VM based runtime) has all facilities to fully control
immutability of such objects.

In case of runtimes like D (natively compileable) such control is not an
option.

I beleive that proposed runtime flag a) is not a constness in any sense
b) does not solve compile verification of readonlyness and
c) can be implemented now by defining:
struct vector
{
    bool readonly;
    T*  data;
    uint length;
}

Declarative contness prevents data misuse at compile time
when runtime constness moves problem into execution time
when is a) too late to do anything and b) expensive.

I would mention old idea again - real solution would be in creating of
mechanism of disabling exiting or creating new opertaions
for intrinsic types.

For example string definition might look like as:

typedef  string char[]
{
    disable opAssign;
    ....
    char[] tolower() { ..... }
}

In any case such mechanism a) is more universal than const in C++
b) allows to create flexible type systems and finally
c) this will also legalize situation with
"external methods" D has now for array types.

The later one alone is a good enough motivation to do so
as current situation with "external methods" looks like as
just a bug of design or compiler to be honest.

I am yet silent that it will make D's type system unique
in this respect among other languages.

Andrew Fedoniouk.
http://terrainformatica.com
Jul 18 2006
next sibling parent reply Chad J <gamerChad _spamIsBad_gmail.com> writes:
Andrew Fedoniouk wrote:
 Dynamic constness versus static (compile time) constness is not new.
 
 For example in Ruby you can dynamicly declare object/array readonly and
 its runtime will control all modifications and note - in full as Ruby's 
 sandbox
 (as any other VM based runtime) has all facilities to fully control
 immutability of such objects.
 
 In case of runtimes like D (natively compileable) such control is not an
 option.
 
 I beleive that proposed runtime flag a) is not a constness in any sense
 b) does not solve compile verification of readonlyness and
 c) can be implemented now by defining:
 struct vector
 {
     bool readonly;
     T*  data;
     uint length;
 }
 
 Declarative contness prevents data misuse at compile time
 when runtime constness moves problem into execution time
 when is a) too late to do anything and b) expensive.
 
 I would mention old idea again - real solution would be in creating of
 mechanism of disabling exiting or creating new opertaions
 for intrinsic types.
 
 For example string definition might look like as:
 
 typedef  string char[]
 {
     disable opAssign;
     ....
     char[] tolower() { ..... }
 }
 
 In any case such mechanism a) is more universal than const in C++
 b) allows to create flexible type systems and finally
 c) this will also legalize situation with
 "external methods" D has now for array types.
 
 The later one alone is a good enough motivation to do so
 as current situation with "external methods" looks like as
 just a bug of design or compiler to be honest.
 
 I am yet silent that it will make D's type system unique
 in this respect among other languages.
 
 Andrew Fedoniouk.
 http://terrainformatica.com
 

I like that typedef. Should be templatable though...

typedef(T) array T[]
{
    ...
}

Or some such. In an earlier post ("Module level operator overloading" at 
digitalmars.D/39504) I was hoping for external functions as operator 
overloads and IFTI to help with things like array operations. I just 
didn't know about external functions at the time. But if this is 
supposed to replace external functions, how would I do the array op 
overloads that external functions would help me with? Would be 
unfortunate to write something like this...

typedef(T) T[] T[] // mmm what would this do
{
    void opAdd(T[] array1, T[] array2)
    {
        etc...
    }
    ...
}
Jul 18 2006
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 typedef  string char[]
 {
     disable opAssign;
     ....
     char[] tolower() { ..... }
 }


 I like that typedef.  Should be templatable though...

 typedef(T) array T[]
 {
     ...
 }

I think so too. It would be nice to have this, but I think that it is 
enough to be able to define such types in each particular case.

I think that such extended typedef makes sense for other basic types:

typedef color uint
{
    uint red() { .... }
    uint blue() { .... }
    uint green() { .... }
}

Also such typedef makes sense for classes too. To avoid vtbl pollution. 
Especially relevant for templated classes.
 Or some such.  In an earlier post ("Module level operator overloading" at 
 digitalmars.D/39504) I was 
 hoping for external functions as operator overloads and IFTI to help with 
 things like array operations.  I just didn't know about external functions 
 at the time.  But if this is supposed to replace external functions, how 
 would I do the array op overloads that external functions would help me 
 with?  Would be unfortunate to write something like this...

 typedef(T) T[] T[] // mmm what would this do
 {
   void opAdd(T[] array1, T[] array2)
   {
     etc...
   }
 }

What is the problem with the following:

typedef(T) array T[]
{
    void opAdd(array a1, array a2)
    {
    }
}

?

Andrew Fedoniouk.
http://terrainformatica.com
Jul 18 2006
next sibling parent reply Chad J <gamerChad _spamIsBad_gmail.com> writes:
Andrew Fedoniouk wrote:
 
 What is the problem with the following:
 
 typedef(T) array T[]
 {
    void opAdd(array a1, array a2)
    {
 
    }
 }
 
 ?
 

In the usage. How do you use it? Something like this?

array!(short) foo = [1,2,3];
array!(short) bar = [4,5,6];
array!(short) result = foo + bar; // result is now [5,7,9]

not too bad... how about multidimensional stuff...

array!(array!(short)) foo = [[1,2],[3,4]];
array!(array!(short)) bar = [[5,6],[7,8]];
// um, maybe a matrix multiply or something.  I don't feel like it.

Well doable, but it would be better to have ordinary array syntax.

Also, what if some external library passes in an ordinary array that is 
not set up as one of these types, and you want to use the new fancy 
features on it:

void gimmeAnArray( short[] foo )
{
    array!(short) bar = array!(short).convert( foo );
    // ugh
    array!(short) bar = array.convert( foo );
    // ahh IFTI is better, but this whole line should be unnecessary IMO.
    ...
}

I suppose the problem I have with your syntax suggestion is that it's 
impossible to add properties to existing types. It forces you to define 
new types to add properties.
Jul 18 2006
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 void gimmeAnArray( short[] foo )
 {
   array!(short) bar = array!(short).convert( foo );
   // ugh
   array!(short) bar = array.convert( foo );
   // ahh IFTI is better, but this whole line should be unnecessary IMO.
   ...
 }

 I suppose the problem I have with your syntax suggestion is that it's 
 impossible to add properties to existing types.  It forces you to define 
 new types to add properties.

If you will define it as

alias array short[]
{
    ...
    void someNewOp(self) { }
}

then 1) you can use this someNewOp with it and 2)

void gimmeAnArray( short[] foo )
{
    array bar = foo; // ok
    ...
}

Extended alias allows you to extend base types. Extended typedef allows 
you to extend and to reduce operations of base types.

Andrew Fedoniouk.
http://terrainformatica.com
Jul 18 2006
parent Chad J <gamerChad _spamIsBad_gmail.com> writes:
Andrew Fedoniouk wrote:
 
 If you will define it as
 
 alias array short[]
 {
     ...
     void someNewOp(self) {    }
 }
 
 then 1) you can use this someNewOp with it and 2)
 
 void gimmeAnArray( short[] foo )
 {
     array bar = foo; // ok
     ...
 }
 
 Extended alias allows to extend base types.
 Extended typedef allows to extend and to reduce operations of base types.
 
 Andrew Fedoniouk.
 http://terrainformatica.com
 

I suppose that means I could do something like

alias array short[]
{
    ...
    short[] opAdd( short[] other ) { ... }
}

short[] gimmeAnArray( short[] foo )
{
    short[] newArray;
    for ( int i = 0; i < foo.length; i++ )
        newArray ~= i;

    return foo + newArray; // usage of extension on short[]
}

That would be cool. Though it would be nice if it didn't also stick an 
"array" type out there (does it?). Is there anywhere I can find a 
complete look at what you're proposing?
Jul 18 2006
prev sibling parent reply xs0 <xs0 xs0.com> writes:
Andrew Fedoniouk wrote:
 typedef  string char[]
 {
     disable opAssign;
     ....
     char[] tolower() { ..... }
 }



Is there any particular difference from

struct string {
    char[] data;
    char[] tolower() { .... }
}

?
 I think that such extended typedef makes sense for other basic types:
 
 typedef color uint
 {
     uint red() {  .... }
     uint blue() {  .... }
     uint green() {  .... }
 }

Is there any particular difference from

struct color {
    uint value;
    uint red() { ... }
    ...
}

?
 Also such typedef makes sense for classes too.

I don't get that.. Since you seem to want a new type, what's wrong with deriving?
 To avoid vtbl  pollution. Especially actual for templated classes.

Make the methods or class final, then they don't go into vtbl (or at least shouldn't). xs0
Jul 19 2006
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"xs0" <xs0 xs0.com> wrote in message news:e9lu7n$2jv0$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:
 typedef  string char[]
 {
     disable opAssign;
     ....
     char[] tolower() { ..... }
 }



Is there any particular difference from struct string { char[] data; char[] tolower() { .... } }

The difference is in the disabled opAssign**, so if you define, let's 
say, the following:

typedef string char[]
{
    disable opSliceAssign;
    ....
    char[] tolower() { ..... }
}

then you will not be able to compile the following:

string s = "something read-only";
s[0..s.length] = '\0'; // compile time error.
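For comparison, the compile-time failure (if not the typedef syntax) can already be had in other languages; a C++ sketch (ReadonlyString is a made-up name) where only a reading subscript exists, so the equivalent of the slice-assign is rejected by the compiler:

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Rough analogue of the proposed "disable opSliceAssign" (an assumed
// design, since the typedef extension is only a proposal): the wrapper
// exposes reads but provides no mutating subscript, so writes through it
// fail at compile time rather than run time.
class ReadonlyString {
    std::string data_;
public:
    explicit ReadonlyString(std::string s) : data_(std::move(s)) {}
    char operator[](std::size_t i) const { return data_[i]; }  // reads ok
    // no non-const operator[] exists, so `s[0] = 'x'` cannot compile
    std::size_t size() const { return data_.size(); }
};
```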
 ?

 I think that such extended typedef makes sense for other basic types:

 typedef color uint
 {
     uint red() {  .... }
     uint blue() {  .... }
     uint green() {  .... }
 }

Is there any particular difference from struct color { uint value; uint red() { ... } ... } ?

The difference is that such a color is inherently uint, so you can do 
the following:

color c = 0xFF00FF;
c <<= 8;
uint r = c.red();
 Also such typedef makes sense for classes too.

I don't get that.. Since you seem to want a new type, what's wrong with deriving?

See:

alias NewClass OldClass
{
    void foo() { .... }
}

will not create a new VTBL for NewClass. It is just syntactic sugar: 
instead of defining and using

void foo_x( OldClass c ) { ..... }

you can use

NewClass nc = ....;
nc.foo();
 To avoid vtbl  pollution. Especially actual for templated classes.

Make the methods or class final, then they don't go into vtbl (or at least shouldn't).

All classes in D have a VTBL by definition, as far as I remember.

Andrew Fedoniouk.
http://terrainformatica.com
Jul 19 2006
parent reply xs0 <xs0 xs0.com> writes:
Andrew Fedoniouk wrote:
 "xs0" <xs0 xs0.com> wrote in message news:e9lu7n$2jv0$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:
 typedef  string char[]
 {
     disable opAssign;
     ....
     char[] tolower() { ..... }
 }



 struct string {
     char[] data;
     char[] tolower() { .... }
 }

 The difference is in the disabled opAssign**, so if you define, let's 
 say, the following:

 typedef string char[]
 {
     disable opSliceAssign;
     ....
     char[] tolower() { ..... }
 }

 then you will not be able to compile the following:

 string s = "something read-only";
 s[0..s.length] = '\0'; // compile time error.

But you can also not define opSliceAssign in struct string, and get the compile time error?
 I think that such extended typedef makes sense for other basic types:

 typedef color uint
 {
     uint red() {  .... }
     uint blue() {  .... }
     uint green() {  .... }
 }

 struct color {
     uint value;
     uint red() { ... }
     ...
 }

 ?

 The difference is that such a color is inherently uint, so you can do 
 the following:

 color c = 0xFF00FF;
 c <<= 8;
 uint r = c.red();

Besides the first line, you can do the same with a struct. And I'd say 
it's good that color and uint are not fully interchangeable, considering 
how little they have in common: one is a 32-bit integer, the other is 
more a byte[3] or byte[4], and even then you can't really say that a 
'generic 8-bit integer' and a 'level of red intensity' have much in 
common.
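The struct alternative is easy to make concrete; a C++ sketch (the 0x00RRGGBB layout is an assumption for illustration) of color as a wrapper rather than a typedef:

```cpp
#include <cassert>
#include <cstdint>

// A struct wrapping the uint gives the channel accessors without making
// color and uint silently interchangeable. Layout assumed: 0x00RRGGBB.
struct color {
    uint32_t value;

    uint8_t red()   const { return uint8_t((value >> 16) & 0xFF); }
    uint8_t green() const { return uint8_t((value >> 8)  & 0xFF); }
    uint8_t blue()  const { return uint8_t( value        & 0xFF); }
};
```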
 Also such typedef makes sense for classes too.

deriving?

 See:

 alias NewClass OldClass
 {
     void foo() { .... }
 }

 will not create a new VTBL for NewClass. It is just syntactic sugar: 
 instead of defining and using

 void foo_x( OldClass c ) { ..... }

 you can use

 NewClass nc = ....;
 nc.foo();

Well, if you override a function, it should be virtual. If it didn't 
exist before and is final, the compiler should be able to determine it 
can call it directly. If it is not final (meaning you plan to override 
it in a further derived class), it should again be virtual. So I don't 
see the problem.

Also, why would you want a non-member function to look like it is a 
member function? It just causes confusion.

Finally, bar.foo() isn't really a shorthand for foo(bar), being one 
character longer. Seems more like syntactic saccharin :)
 To avoid vtbl  pollution. Especially actual for templated classes.

shouldn't).

All classes in D has VTBL by definition as far as I remember.

Yup. xs0
Jul 19 2006
next sibling parent reply Johan Granberg <lijat.meREM OVEgmail.com> writes:
xs0 wrote:
 But you can also not define opSliceAssign in struct string, and get the 
 compile time error?

I think you are missing the point of this proposal (which I like a lot, 
by the way). (Andrew Fedoniouk, if I have misinterpreted your proposal, 
please excuse me.)

The purpose is not to extend a type as with class inheritance, or to 
create a new type as with a new class or struct, but to alter the 
behavior of an existing type: to allow for things like read-only and 
color types that can be passed as-is to common graphics APIs that expect 
uints, etc. This would add missing power to the language that is not 
available at the moment.

An alternative, weaker (but in my opinion worse) syntax to do this would 
be this:

struct string : char[]
{
    //possibly
    override opSliceAssign(){throw new Exception("");}

    //or
    disable opSliceAssign();
}

Here we introduce struct inheritance and the use of built-in types as 
base classes, but due to the extending nature of inheritance (the child 
being a superset of the parent) the disable syntax is bad, and the use 
of inheritance forces the use of the virtual table (which structs and 
built-in types don't have).

The proposed syntax, on the other hand, allows for this:

typedef char[] string
{
    disable opSliceAssign;
    string toLower(){..}
}

If we used inheritance this would not be possible, because we remove 
opSliceAssign from the interface of the type. Here we use the new syntax 
to describe a "starting from" relationship, while inheritance creates an 
is-a relationship.

We could create an entirely new type using structs and so on, but then 
we would have to specify all the methods and fields of the new type 
rather than just our changes. This is very much against the principles 
of code reuse.

/Johan Granberg

ps. Walter, this could be a nice feature to have, as it would allow the 
creation of subsets of types (and intersecting types) as well as 
supersets.
Jul 19 2006
parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Johan Granberg" <lijat.meREM OVEgmail.com> wrote in message 
news:e9m79k$2sf$1 digitaldaemon.com...
 xs0 wrote:
 But you can also not define opSliceAssign in struct string, and get the 
 compile time error?

I think you are missing the point of this proposal (which I like a lot, 
by the way). (Andrew Fedoniouk, if I have misinterpreted your proposal, 
please excuse me.)

Johan, you've got it right.
 The purpose is not to extend a type as with a class inheritance or to 
 create a new type as with a new class or struct, but to alter the behavior 
 of an existing type to allow for things like read only and color types 
 that can bee passed as is to common graphics api's that expects uints etc.

Exactly. The main purpose is not extending classes, but giving the 
opportunity to extend intrinsic and value types. It makes real sense for 
arrays, integers, enums, etc.

Again, external methods for arrays are here anyway - this is a good 
chance to legalize them.
 This would add missing power to the language that is not available at 
 the moment. An alternative, weaker (but in my opinion worse) syntax to do 
 this would be:

 struct string : char[]
 {
 //possibly
 override opSliceAssign(){throw new Exception("");}

 //or
 disable opSliceAssign();
 }

 Here we introduce struct inheritance and the use of built-in types as base 
 classes, but due to the extending nature of inheritance (the child being 
 a superset of the parent) the disable syntax is bad, and the use of 
 inheritance forces the use of the virtual table (which structs and built-in 
 types don't have).

 The proposed syntax, on the other hand, allows for this:

 typedef char[] string
 {
 disable opSliceAssign;
 string toLower(){..}
 }

 If we used inheritance this would not be possible, because we remove 
 opSliceAssign from the interface of the type.

 Here we use the new syntax to describe a "starting from" relationship, 
 while inheritance creates an is-a relationship.

 We could create an entirely new type using structs and so on, but then we 
 would have to specify all the methods and fields of the new type rather 
 than just our changes. This is very much against the principles of code reuse.

Exactly! Consider this:

typedef uint color
{
    ubyte red() { .... }
}

I want color to keep all the attributes and operations of uint, but to give it a couple of specific methods. The thing is that the declaration of such a type forces all its methods to be declared in one place. Intellisense engines will like such declarations....
 /Johan Granberg

 ps. Walter, this could be a nice feature to have, as it would allow the 
 creation of subsets of types (and intersecting types) as well as 
 supersets.

Yep, you've got the idea. Sub- and supersets are the right words.

Andrew Fedoniouk.
Jul 19 2006
prev sibling parent "Andrew Fedoniouk" <news terrainformatica.com> writes:
I think I need to explain the idea using different words.

In terms of C++

"char[]" and
"const char[]"
are two distinct types.

"const char[]" is a reduced version of "char[]"

Reduced means that "const char[]" as a type has no
mutating methods like length(uint newLength),
opIndexAssign, etc.

extended typedef allows you to define
explicitly such const types by reducing
set of operations (what C++ does implicitly)
and also allows you to extend such types by new
methods.

Main value of the approach is for array and
pointer types I guess.

Andrew Fedoniouk.
http://terrainformatica.com
Jul 19 2006
prev sibling next sibling parent reply Dave <Dave_member pathlink.com> writes:
Andrew Fedoniouk wrote:
 Dynamic constness versus static (compile time) constness is not new.
 
 For example, in Ruby you can dynamically declare an object/array readonly,
 and its runtime will control all modifications - and note, in full, as
 Ruby's sandbox (like any other VM-based runtime) has all the facilities
 to fully control the immutability of such objects.
 
 In case of runtimes like D (natively compileable) such control is not an
 option.
 
 I believe that the proposed runtime flag a) is not constness in any sense,
 b) does not solve compile-time verification of readonlyness, and
 c) can be implemented now by defining:
 struct vector
 {
     bool readonly;
     T*  data;
     uint length;
 }
 
 Declarative constness prevents data misuse at compile time,
 while runtime constness moves the problem into execution time,
 when it is a) too late to do anything and b) expensive.
 
 I would mention an old idea again - the real solution would be a
 mechanism for disabling existing operations or creating new operations
 for intrinsic types.
 
 For example string definition might look like as:
 
 typedef  string char[]
 {
     disable opAssign;
     ....
     char[] tolower() { ..... }
 }
 
 In any case such a mechanism a) is more universal than const in C++,
 b) allows creating flexible type systems, and finally
 c) will also legalize the situation with the
 "external methods" D has now for array types.
 
 The latter alone is a good enough motivation to do so,
 as the current situation with "external methods" looks like
 just a design or compiler bug, to be honest.
 
 I am not even mentioning that it would make D's type system unique
 in this respect among other languages.
 
 Andrew Fedoniouk.
 http://terrainformatica.com
 
 

What do you mean by external methods? This?

import std.stdio;

void main()
{
    char[] str = "abc".dup;
    writefln(str.ucase()); // "ABC"
}

char[] ucase(char[] str)
{
    foreach(inout char c; str)
        if (c >= 'a' && c <= 'z')
            c += 'A' - 'a';
    return str;
}

If so, that's not a bug, it's intentional. Line 4141 of expression.c.

- Dave
Jul 18 2006
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
 What do you mean by external methods?

 This?

Positive.
 import std.stdio;
 void main()
 {
     char[] str = "abc".dup;
     writefln(str.ucase()); // "ABC"
 }
 char[] ucase(char[] str)
 {
     foreach(inout char c; str) if(c >= 'a' && c <= 'z') c += 'A' - 'a';
     return str;
 }

 If so, that's not a bug, it's intentional. Line 4141 of expression.c.

People look in the docs / language specification first.

Line 4141 of expression.c is the last place where someone will try to find an answer about what language features D has.

Andrew Fedoniouk.
http://terrainformatica.com
Jul 18 2006
parent Dave <Dave_member pathlink.com> writes:
Andrew Fedoniouk wrote:
 
 People are looking in the doc/ language specification first.
 
 Line 4141 of expression.c is the last place where someone will
 try to find answer on what language features D has.
 

I agree; just pointing out that it is there by design even if that design hasn't been codified in the docs. <g>
 Andrew Fedoniouk.
 http://terrainformatica.com

Jul 18 2006
prev sibling next sibling parent reply xs0 <xs0 xs0.com> writes:
Andrew Fedoniouk wrote:
 Dynamic constness versus static (compile time) constness is not new.

Never said it was.
 For example, in Ruby you can dynamically declare an object/array readonly,
 and its runtime will control all modifications - and note, in full, as
 Ruby's sandbox (like any other VM-based runtime) has all the facilities
 to fully control the immutability of such objects.

Cool! OTOH, I'm proposing making the reference readonly, not the data itself.
 In case of runtimes like D (natively compileable) such control is not an
 option.

Because?
 I believe that the proposed runtime flag a) is not constness in any sense

It's more like readonlyness.
 b) does not solve compile verification of readonlyness and

I said so myself :P

But the question is whether compile-time verification is better or not. In some cases it definitely isn't:

int[] cowFoo(int[] a) { if (whatever) { a = a.dup; a[0] = 5; } }
int[] cowBar(int[] a) { if (something) { a = a.dup; a[1] = 10; } }

int[] result = cowFoo(cowBar(whatever));

How can a compile-time check ever help you avoid the (unnecessary) second .dup when both funcs decide to modify the data?
 c) can be implemented now by defining:
 struct vector
 {
     bool readonly;
     T*  data;
     uint length;
 }

So? How does that help when using built-in arrays?
 Declarative constness prevents data misuse at compile time,
 while runtime constness moves the problem into execution time,
 when it is a) too late to do anything and b) expensive.

I disagree. A single .dup probably costs more than tens (if not hundreds) of checks of a single bit (which can even be disabled in release builds). And why would it be too late to do anything?
 I would mention old idea again - real solution would be in creating of
 mechanism of disabling exiting or creating new opertaions
 for intrinsic types.

Start your own thread :P xs0
Jul 19 2006
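For concreteness, the runtime readonly-bit scheme under discussion can be sketched in C++ (all names here - IntArray, readonlyRef, needToWrite - are hypothetical stand-ins; in D the bit would live in the array's own .length): the flag travels with the reference, reads are unaffected, and a write helper dups only when the reference is marked readonly.

```cpp
#include <cstddef>
#include <cstring>
#include <cassert>

// Hypothetical array reference carrying a readonly flag in the top bit of length.
struct IntArray {
    int*        data;
    std::size_t len_and_flag;      // top bit = readonly flag

    static constexpr std::size_t FLAG = (std::size_t)1 << (sizeof(std::size_t) * 8 - 1);

    std::size_t length() const { return len_and_flag & ~FLAG; }   // mask the bit out
    bool isReadonly() const    { return (len_and_flag & FLAG) != 0; }
    void lock()                { len_and_flag |= FLAG; }

    IntArray readonlyRef() const {             // copy of the reference, bit set
        IntArray r = *this; r.len_and_flag |= FLAG; return r;
    }
    IntArray dup() const {                     // fresh writable copy (bit cleared)
        IntArray r; r.len_and_flag = length();
        r.data = new int[length()];
        std::memcpy(r.data, data, length() * sizeof(int));
        return r;
    }
    void needToWrite() { if (isReadonly()) *this = dup(); }   // CoW on demand
};

IntArray makeArray(std::size_t n) {
    IntArray a; a.data = new int[n]; a.len_and_flag = n;
    for (std::size_t i = 0; i < n; ++i) a.data[i] = (int)i;
    return a;
}

// A CoW-style function: dups only if handed a readonly reference.
IntArray setFirst(IntArray a, int v) {
    a.needToWrite();
    a.data[0] = v;
    return a;
}
```

Passing arr.readonlyRef() to setFirst leaves the original untouched; passing the writable reference mutates in place, which is exactly the unnecessary second .dup the thread wants to avoid.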
next sibling parent xs0 <xs0 xs0.com> writes:
 int[] cowFoo(int[] a) { if (whatever) { a=a.dup; a[0] = 5; } }
 int[] cowBar(int[] a) { if (something) { a=a.dup; a[1] = 10; } }

Of course, both return a as well :) xs0
Jul 19 2006
prev sibling parent Bruno Medeiros <brunodomedeirosATgmail SPAM.com> writes:
xs0 wrote:
 Andrew Fedoniouk wrote:
 b) does not solve compile verification of readonlyness and

I said so myself :P

But the question is whether compile-time verification is better or not. In some cases it definitely isn't:

int[] cowFoo(int[] a) { if (whatever) { a = a.dup; a[0] = 5; } }
int[] cowBar(int[] a) { if (something) { a = a.dup; a[1] = 10; } }

int[] result = cowFoo(cowBar(whatever));

How can a compile-time check ever help you avoid the (unnecessary) second .dup when both funcs decide to modify the data?

For that case (how to avoid the unnecessary dups):

int[] cowFoo(int[] a) { if (whatever) { a[0] = 5; return a; } }
int[] cowBar(int[] a) { if (something) { a[1] = 10; return a; } }

int[] result = cowFoo(cowBar(someintar));

What's the a.dup for? Do you realize that if the parameter (a) is non-const then that means the function is allowed to change it? Perhaps you meant a different use case?

-- 
Bruno Medeiros - CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D
Jul 20 2006
prev sibling parent reply Reiner Pope <reiner.pope gmail.com> writes:
Andrew Fedoniouk wrote:
 Dynamic constness versus static (compile time) constness is not new.

 In case of runtimes like D (natively compileable) such control is not an
 option.

control the code, as we know, such control could be forced at compile time. As to the fact that the runtime could be subverted, well, since we have assembly in D, static const can similarly be circumvented. If speed issues are the concern, read on.
 I beleive that proposed runtime flag a) is not a constness in any sense

could be caught in debug builds? That effectively ensures that the arrays are kept *constant*, doesn't it?
 b) does not solve compile verification of readonlyness and

readonlyness: speed and certainty. For reasons outlined below, speed is actually likely to be _greater_ with runtime const than with compile-time const.

As for certainty, readonlyness is just one of many bug-catching mechanisms. Others include:

- Design by Contract (pre- and post-conditions and invariants)
- Unit testing
- Typing mechanism (partial type safety)
- Array bounds checking
- GC (catches memory and type-safety errors)

All of these checking mechanisms other than type safety are implemented at runtime, yet there is not much debate about that fact, even though they *could* be checked for at compile time using theorem proving (see http://en.wikipedia.org/wiki/SPARK_programming_language for a programming language that does this). The fact that they are checked at runtime means that, like runtime const-ness, the certainty of static checking isn't present. However, it still allows many more bugs to be caught than no const system at all, and I would even go so far as to say that it would catch *most* const violations when combined with good unit tests.

The main advantage of runtime checking is flexibility/speed, as well as no 'const-pollution', as xs0 put it. You get the speed gains from avoiding all unnecessary duplications, a feat which simple (a la C++) static const-checking can't achieve. Imagine that we had a static const-checking system in D:

const char[] tolower(const char[] input)
// The input must be const, because we agree with CoW, so we won't change it.
// Because of below, we also declare the output of the function const.
{
    // do some stuff
    if ( a write is necessary )
    {
        // copy it into another variable, since we can't change input (it's const)
    }
    return something;
    // This something could possibly be input, so it also needs to be declared
    // const. So we go back and make the return value of the function also const.
}

// Now, since the return value is const, we *must* dup it whenever we call it.

This is *very* inefficient if we own the string, because we get two unnecessary dups. This is a big price to pay just to keep static const-checking.
 c) can be implemented now by defining:
 struct vector
 {
     bool readonly;
     T*  data;
     uint length;
 }

effectively copy exactly what an array does already, but a) it takes up more memory than what xs0 proposed, and b) it isn't supported natively by the language's arrays, so it is less likely to be used.
 
 Declarative contness prevents data misuse at compile time
 when runtime constness moves problem into execution time
 when is a) too late to do anything and b) expensive.

coverage tool included in DMD, should pick up most, if not all, of the const violations in your code, while you still have a chance to do something about it. It's impossible to rely on the compiler to pick up all your bugs in any situation.

b) It's not expensive, because it avoids unnecessary duplications, and there should be a compiler switch to turn off the readonly checks in release builds once you're sure of safety. xs0 covered the costs and concluded they weren't many.
 I would mention old idea again - real solution would be in creating of
 mechanism of disabling exiting or creating new opertaions
 for intrinsic types.
 
 For example string definition might look like as:
 
 typedef  string char[]
 {
     disable opAssign;
     ....
     char[] tolower() { ..... }
 }

data-protection is just WAY TOO inflexible, and it removes the areas where D's string (and array) processing is so powerful.

Cheers,
Reiner
Jul 19 2006
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Reiner Pope" <reiner.pope gmail.com> wrote in message 
news:e9kunq$qli$1 digitaldaemon.com...
 You get the speed gains from avoiding all unnecessary duplications, a feat 
 which simple (a la C++) static const-checking can't achieve. Imagine that 
 we had a static const-checking system in D:

 const char[] tolower(const char[] input)
 // the input must be const, because we agree with CoW, so we won't change 
 it
 // Because of below, we also declare the output of the function const
 {
   // do some stuff
   if ( a write is necessary )
   { // copy it into another variable, since we can't change input (it's 
 const)
   }
   return something;
 // This something could possibly be input, so it also needs to be declared 
 const. So we go back and make the return value of the function also a 
 const.
 }

 // Now, since the return value is const, we *must* dup it whenever we call 
 it. This is *very* inefficient if we own the string, because we get two 
 unnecessary dups. This is a big price to pay just to keep static 
 const-checking.


 c) can be implemented now by defining:
 struct vector
 {
     bool readonly;
     T*  data;
     uint length;
 }

copy exactly what an array does already, but a) it takes up more memory than what xs0 proposed, and b) it isn't supported natively by the language's arrays, so it is less likely to be used.

The proposed readonly flag solves one particular, pretty narrow case of COW (only for arrays, and only in functions aware of this flag).

C++ has a better and more universal mechanism for this:

inline string&
string::operator= ( const string& s )
{
    release_data();
    set_data( s.data );
    return *this;
}

inline string& string::operator+= ( const string& s )
{
    mutate(*this);
    resize( length() + s.length() );
    .....
    return *this;
}

I believe that COW arrays (strings in particular), if they are needed, cannot be made without operator= in structures in D. Reference counting cannot be done in D with the same elegance as in C++.

But in the pure GC world, COW strings are not used. Strings in Java, C#, JavaScript, etc. are immutable character ranges - string as a type simply has no such thing as str[i] = 'o'; There are strong reasons for that.

Extended typedef and alias would allow D to have strings as value types without any additional runtime cost.

Andrew Fedoniouk.
http://terrainformatica.com
Jul 19 2006
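For concreteness, the C++ mechanism described above can be shown with a compilable sketch (a toy CowString with hypothetical names, not HTMLayout's actual string class): operator= shares the buffer and bumps a reference count, and every mutating operation first calls mutate() to obtain a private copy.

```cpp
#include <cstring>
#include <cstddef>
#include <cassert>

// Minimal reference-counted copy-on-write string (illustrative names only).
class CowString {
    struct Data { int refs; std::size_t len; char* chars; };
    Data* d;

    void release() { if (d && --d->refs == 0) { delete[] d->chars; delete d; } }
    void mutate() {                        // ensure we own the only copy
        if (d->refs > 1) {
            Data* nd = new Data{1, d->len, new char[d->len + 1]};
            std::memcpy(nd->chars, d->chars, d->len + 1);
            --d->refs;
            d = nd;
        }
    }
public:
    CowString(const char* s) {
        std::size_t n = std::strlen(s);
        d = new Data{1, n, new char[n + 1]};
        std::memcpy(d->chars, s, n + 1);
    }
    CowString(const CowString& s) : d(s.d) { ++d->refs; }
    ~CowString() { release(); }

    CowString& operator=(const CowString& s) {   // share, don't copy
        ++s.d->refs; release(); d = s.d; return *this;
    }
    void setChar(std::size_t i, char c) { mutate(); d->chars[i] = c; }

    const char* c_str() const { return d->chars; }
    bool sharesBufferWith(const CowString& s) const { return d == s.d; }
};
```

Copies are O(1) until one side writes; only the writer pays for the duplication, which is the point being argued here.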
parent Reiner Pope <reiner.pope gmail.com> writes:
Andrew Fedoniouk wrote:
 "Reiner Pope" <reiner.pope gmail.com> wrote in message 
 news:e9kunq$qli$1 digitaldaemon.com...
 You get the speed gains from avoiding all unnecessary duplications, a feat 
 which simple (a la C++) static const-checking can't achieve. Imagine that 
 we had a static const-checking system in D:

 const char[] tolower(const char[] input)
 // the input must be const, because we agree with CoW, so we won't change 
 it
 // Because of below, we also declare the output of the function const
 {
   // do some stuff
   if ( a write is necessary )
   { // copy it into another variable, since we can't change input (it's 
 const)
   }
   return something;
 // This something could possibly be input, so it also needs to be declared 
 const. So we go back and make the return value of the function also a 
 const.
 }

 // Now, since the return value is const, we *must* dup it whenever we call 
 it. This is *very* inefficient if we own the string, because we get two 
 unnecessary dups. This is a big price to pay just to keep static 
 const-checking.


 c) can be implemented now by defining:
 struct vector
 {
     bool readonly;
     T*  data;
     uint length;
 }

copy exactly what an array does already, but a) it takes up more memory than what xs0 proposed, and b) it isn't supported natively by the language's arrays, so it is less likely to be used.

The proposed readonly flag solves one particular, pretty narrow case of COW (only for arrays, and only in functions aware of this flag).

saying that the opIndexAssign property of arrays is limited because it can only be used by the functions that know about it. I see this proposal as an alternative to C++-style const, and with regard to functions being aware of the feature, xs0's solution is better because it avoids const propagation throughout the code.
 
 C++ has better and more universal mechanism for this.
 
 inline string &
     string::operator= ( const string &s )
   {
     release_data();
     set_data ( s.data );
     return *this;
   }
 

is a duplication, which is the runtime cost we are trying to avoid.
 inline string & string::operator += ( const string &s )
   {
     mutate(*this);
     resize( length() + s.length() );
     .....
     return *this;
   }
 

The other point to make is that this seems not to be a C++ feature, but rather a library feature. I'm probably not understanding your examples, but can you, say, provide C++ code to match the following D code's functionality while avoiding unnecessary duplicates _and having const safety_:

char[] foo = "foo";
foo = tolower(toupper(foo));

I don't see how you can manage that with static const-checking. Please explain, and maybe then I can understand how the C++ solution is 'better'.
 I beleive that COW arrays (strings in particular) if they needed cannot
 be made without operator= in structures in D.

you haven't outlined a technical reason for it not working. If Walter integrates it into D, then that isn't going to cause any problems.
 Reference counting cannot be made in D with the same elegancy as in C++.

need ref-counting for CoW strings. Doesn't mark-and-sweep manage it better?
 But in pure GC world COW strings are not used.
 Strings in Java, C#, JavaScript, etc. are immutable character ranges -

fast string processing is largely diminished. However, D's string processing capabilities are good, and since it is possible to keep them, why shouldn't we?
 string as a type simply has no such things as str[i] = 'o';
 There are strong reasons for that.

cumbersome to process strings, with all the calls to foo.substring(0, 2); and so on. The other downside is that the processing is *slow* *as*.

Cheers,
Reiner
Jul 20 2006
prev sibling parent reply "Craig Black" <cblack ara.com> writes:
Sounds like a great idea to me.  Easy to implement, improves correctness and 
performance.  What are we waiting for?

-Craig 
Jul 19 2006
parent reply xs0 <xs0 xs0.com> writes:
Craig Black wrote:
 Sounds like a great idea to me.  Easy to implement, improves correctness and 
 performance.  What are we waiting for?

Personally, I'm waiting/hoping for Walter to see the proposal and say what he thinks :)

I'm also wondering whether the "overwhelming" response to the proposal is because:

- I didn't write "proposal" in the subject
- it's from me (I used to argue in a bad way too much, I'm sure I'm being filtered at least by some people :)
- it's so bad it's not even worth a comment
- it's so good everybody is already waiting for Walter to say yes ;)

xs0
Jul 19 2006
parent reply Don Clugston <dac nospam.com.au> writes:
xs0 wrote:
 Craig Black wrote:
 Sounds like a great idea to me.  Easy to implement, improves 
 correctness and performance.  What are we waiting for?

Personally, I'm waiting/hoping for Walter to see the proposal and say what he thinks :)

I'm also wondering whether the "overwhelming" response to the proposal is because:

- I didn't write "proposal" in the subject
- it's from me (I used to argue in a bad way too much, I'm sure I'm being filtered at least by some people :)
- it's so bad it's not even worth a comment
- it's so good everybody is already waiting for Walter to say yes ;)

Maybe you just need some better terminology. How about arr.clone, to replace arr with a writable copy of arr (instead of "needToWrite")? (You don't care if it's the original arr or a dup.) And turn it into a proposal about a more efficient dup: Modify Only One Copy On Write (MOO COW). <g>.
Jul 20 2006
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Don Clugston" <dac nospam.com.au> wrote in message 
news:e9nvee$2h4m$1 digitaldaemon.com...
 xs0 wrote:
 Craig Black wrote:
 Sounds like a great idea to me.  Easy to implement, improves correctness 
 and performance.  What are we waiting for?

Personally, I'm waiting/hoping for Walter to see the proposal and say what he thinks :)

I'm also wondering whether the "overwhelming" response to the proposal is because:

- I didn't write "proposal" in the subject
- it's from me (I used to argue in a bad way too much, I'm sure I'm being filtered at least by some people :)
- it's so bad it's not even worth a comment
- it's so good everybody is already waiting for Walter to say yes ;)

Maybe you just need some better terminology. How about arr.clone, to replace arr with a writable copy of arr (instead of "needToWrite")? (You don't care if it's the original arr or a dup.) And turn it into a proposal about a more efficient dup: Modify Only One Copy On Write (MOO COW). <g>.

Don, I think that reference counting (MOO COW) has the same set of "civil rights" as GC, so it probably makes sense to look at this from a language-design perspective in a more universal fashion. Refcounting of arrays is only one particular thing; I mean the language shall support this idiom with the same quality as GC.

In fact, for a typical and effective refcounting implementation it is enough to have ctors/dtors/assignment in structs. Having them, MOO COW can be implemented easily without any need for runtime model changes.

And MOO COW is somewhat orthogonal to constness. Again, I would try to find a more universal solution here rather than solving the particular array problem.

I believe that a "smart pointer" as an entity would cover the MOO COW cases. But D has no facilities for smart pointers at all now.

Andrew Fedoniouk.
http://terrainformatica.com
Jul 20 2006
parent reply Don Clugston <dac nospam.com.au> writes:
Andrew Fedoniouk wrote:
 Don, I think that reference counting (MOO COW) has the
 same set of "civil rights" as GC so probably it makes sense
 to look on this from language design perspective in more universal
 fashion. RefCounting of arrays is only one particular thing I mean -
 language shall support this idiom with the same quality as GC.

I agree that something more universal would be better. But there's a really interesting feature: when you have GC, you don't need full reference counting, because you don't need deterministic destruction. You only need a single bit. (I think this is correct, but it needs more thought).
 In fact for typical and effective refcounting implementation it is
 enough to have ctors/dtors/assignement in structs.

I don't quite agree with this. I think that arrays in D are fundamentally different from arrays in C/C++. In C, they're little more than syntactic sugar for pointers, whereas in D, they are more like very important, built-in structs. If refcounting were more integral in the language, it would need to be available for built-in arrays.
 Having them MOO COW can be implemented easily without
 need of runtime model changes.
 
 And MOO COW is somehow orthogonal to constness.

Yes, that's the point I was trying to make. I thought it was an interesting proposal, but doesn't have much to do with compile-time constness, except insofar as it reduces the need for full const.
 Again, I would try to find here more universal solution
 rather than particular array problem.
 
 I beleive that "smart pointer" as an entity will cover
 MOO COW cases. But D does not have facilities
 now for smart pointers at all.

Walter seems to have vehement opposition to operator =. I wonder if it is really necessary. Maybe a single opXXX() function could do the job, if the compiler had some extra intelligence. (Much as opCmp does all of the >,<, >=, <=. Every op= and copy constructor I've ever seen in C++ was very tedious, I wonder if that design pattern could be factored into a single function).
Jul 21 2006
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Don Clugston" <dac nospam.com.au> wrote in message 
news:e9qb6h$2r4m$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:
 Don, I think that reference counting (MOO COW) has the
 same set of "civil rights" as GC so probably it makes sense
 to look on this from language design perspective in more universal
 fashion. RefCounting of arrays is only one particular thing I mean -
 language shall support this idiom with the same quality as GC.

I agree that something more universal would be better. But there's a really interesting feature: when you have GC, you don't need full reference counting, because you don't need deterministic destruction. You only need a single bit. (I think this is correct, but it needs more thought).

"But there's a really interesting feature: when you have GC, you don't need full reference counting, because you don't need deterministic destruction."

Theoretically, yes. Practically... I would change it to "when you have a perfect GC".

And about GC, here is a sample I know pretty well: there is no HTML rendering engine in the wild based on GC memory management (http://en.wikipedia.org/wiki/List_of_layout_engines). I've seen attempts to do them in Java or C# - not even close. (In Harmonia I've decided not to use the GC heap for the DOM either.)

Deterministic memory management has one big benefit - it is manageable and predictable.
 In fact for typical and effective refcounting implementation it is
 enough to have ctors/dtors/assignement in structs.

I don't quite agree with this. I think that arrays in D are fundamentally different from arrays in C/C++. In C, they're little more than syntactic sugar for pointers, whereas in D, they are more like very important, built-in structs. If refcounting were more integral in the language, it would need to be available for built-in arrays.
 Having them MOO COW can be implemented easily without
 need of runtime model changes.

 And MOO COW is somehow orthogonal to constness.

Yes, that's the point I was trying to make. I thought it was an interesting proposal, but doesn't have much to do with compile-time constness, except insofar as it reduces the need for full const.

There is a string struct in Harmonia using such a bit (string.d):

struct tstring(CHAR)
{
    CHAR[] chars;
    bit mutable = true; // by default an empty string is mutable
    ......
}

and this is exactly what was proposed (except for the placement of the bit). Without constness and the ability to define methods, it is worth nothing for arrays.
 Again, I would try to find here more universal solution
 rather than particular array problem.

 I beleive that "smart pointer" as an entity will cover
 MOO COW cases. But D does not have facilities
 now for smart pointers at all.

Walter seems to have vehement opposition to operator =. I wonder if it is really necessary. Maybe a single opXXX() function could do the job, if the compiler had some extra intelligence. (Much as opCmp does all of the >,<, >=, <=. Every op= and copy constructor I've ever seen in C++ was very tedious, I wonder if that design pattern could be factored into a single function).

operator "=" IS really necessary, as there is currently no method in D to guard assignment to a variable (memory location). Again, without it a good chunk of RAII methods and smart pointers are not implementable in D.

In the HTMLayout SDK I have a dom::element object which is a wrapper around an internal DOM element handle. It is in C++. I physically cannot write something as close and as easy to use in D.

/** DOM element. */
class element
{
protected:
    HELEMENT he;

    void use(HELEMENT h)   { he = (HTMLayout_UseElement(h) == HLDOM_OK) ? h : 0; }
    void unuse()           { if (he) HTMLayout_UnuseElement(he); he = 0; }
    void set(HELEMENT h)   { unuse(); use(h); }

public:
    element(): he(0)            { }
    element(HELEMENT h)         { use(h); }
    element(const element& e)   { use(e.he); }
    operator HELEMENT() const   { return he; }
    ~element()                  { unuse(); }

    element& operator= (HELEMENT h)       { set(h); return *this; }
    element& operator= (const element& e) { set(e.he); return *this; }
    ....
};
Jul 21 2006
parent reply Ben Phillips <Ben_member pathlink.com> writes:
operator "="  IS really necessary.  As there is no method in D
currently to guard assignment to variable (memory location).
Again without it good chunk of RAII methods and smart pointers
are not implementable in D.

It is impossible to allow operator "=" to be overloaded without totally killing the way D works, because D uses references. Example:

ClassA a = new ClassA();
ClassA b = a; // b now refers to a
b.mutate(); // both 'b' and 'a' are changed since they refer to the same object

What is possible is to define a new operator (such as ":=") that means copy assignment, but I don't see how this differs from creating a method that does the same thing.
Jul 21 2006
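The reference-semantics point above can be made concrete in C++ (ClassA here is a hypothetical stand-in): with reference-style assignment both names alias one object, while a value copy - what a ':=' operator would mean - leaves the original untouched.

```cpp
#include <cassert>

// Hypothetical stand-in for the ClassA in the example.
struct ClassA {
    int value;
    ClassA() : value(0) {}
    void mutate() { value = 42; }
};

// Reference-style copy (what D class assignment does): both names alias one object.
int mutateThroughAlias() {
    ClassA* a = new ClassA();
    ClassA* b = a;          // b now refers to a
    b->mutate();            // both 'b' and 'a' are changed
    int seen = a->value;
    delete a;
    return seen;
}

// Value copy (what a hypothetical ":=" would mean): the original is unaffected.
int mutateThroughCopy() {
    ClassA a;
    ClassA b = a;           // independent copy
    b.mutate();
    return a.value;         // still the original value
}
```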
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Ben Phillips" <Ben_member pathlink.com> wrote in message 
news:e9rc1u$1g71$1 digitaldaemon.com...
operator "="  IS really necessary.  As there is no method in D
currently to guard assignment to variable (memory location).
Again without it good chunk of RAII methods and smart pointers
are not implementable in D.

It is impossible to allow operator "=" to be overloaded without totally killing the way D works, because D uses references. Example:

ClassA a = new ClassA();
ClassA b = a; // b now refers to a
b.mutate(); // both 'b' and 'a' are changed since they refer to the same object

I think that operator= should be available only for structs and probably other value types, so there will be no conflict with the current situation.
 What is possible is to define a new operator (such as ":=") that means 
 copy
 assignment, but I don't see how this differs from creating a method that 
 does
 the same thing.

A method is not an option at all. operator= is a guard of a memory location, and a method, well, is a method.

struct guard {
    int v;
    void opAssign(int nv) { alarm("value 'v' is about to change"); v = nv; }
}

guard gv;
gv = 12;

As you may see, operator= guards the memory location, allowing you to intercept all assignments into the variable. Too many things (RAII, smart pointers) were built around this in C++.

A method of the struct will not help you here in principle.

Andrew Fedoniouk.
http://terrainformatica.com
Jul 21 2006
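The guard struct above has a direct, compilable C++ analogue (alarm() here is just a counter, standing in for the hypothetical alarm call): operator= fires on every assignment to the variable, which an ordinary method cannot guarantee.

```cpp
#include <cassert>

static int alarms = 0;                 // stand-in for the hypothetical alarm()
void alarm() { ++alarms; }

struct Guard {
    int v;
    Guard() : v(0) {}
    Guard& operator=(int nv) {         // intercepts every assignment to the variable
        alarm();
        v = nv;                        // store the new value
        return *this;
    }
};
```

Every `g = x;` goes through the guard; there is no way to assign around it, which is the property RAII wrappers and smart pointers rely on.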
parent reply "Derek Parnell" <derek psych.ward> writes:
On Sat, 22 Jul 2006 06:27:24 +1000, Andrew Fedoniouk  =

<news terrainformatica.com> wrote:

 "Ben Phillips" <Ben_member pathlink.com> wrote in message
 news:e9rc1u$1g71$1 digitaldaemon.com...
 operator "=" IS really necessary. As there is no method in D
 currently to guard assignment to variable (memory location).
 Again without it good chunk of RAII methods and smart pointers
 are not implementable in D.

 It is impossible to allow operator "=" to be overloaded without totally
 killing the way D works because D uses references.
 Example:
 ClassA a = new ClassA();
 ClassA b = a; // b now refers to a
 b.mutate(); // both 'b' and 'a' are changed since they refer to the
 same object
But some of the side effects of a new operator ':=' would be that it
could be used with value types and reference types alike, and it would
mean that the information contained in the right-hand-side member is
copied to the left-hand-side member. It would remove the need for
".dup", for example:

    char[] a;
    a := toString(4);
 method is not an option at all.
 operator= is a guard of memory location and method, well, is method.

 struct guard {
    int v;
    void opAssign(int nv) { alarm("value 'v' is about to change"); v = nv; }
 }

 guard gv;
 gv = 12;

 As you may see operator= guards memory location allowing you to intercept
 all assignments into the variable. Too many things (RAII, smart pointers)
 were built around this in C++.

 Method of the struct will not help you here in principle.

struct guard {
   private int _v;
   void v(int nv) { alarm("value 'v' is about to change"); _v = nv; }
}

guard gv;
gv.v = 12;

-- 
Derek Parnell
Melbourne, Australia
Jul 21 2006
parent reply "Andrew Fedoniouk" <news terrainformatica.com> writes:
"Derek Parnell" <derek psych.ward> wrote in message 
news:op.tc2lghnq6b8z09 ginger.vic.bigpond.net.au...
[skipped]
 method is not an option at all.
 operator= is a guard of memory location and method, well, is method.

 struct guard {
    int v;
    void opAssign(int nv) { alarm("value 'v' is about to change"); v = nv; }
 }

 guard gv;
 gv = 12;

 As you may see operator= guards memory location allowing you to intercept
 all assignments into the variable. Too many things (RAII, smart pointers)
 were built
 around this in C++.

 Method of the struct will not help you here in principle.

struct guard {
   private int _v;
   void v(int nv) { alarm("value 'v' is about to change"); _v = nv; }
}

guard gv;
gv.v = 12;

Consider this:

    guard gv, gv1;
    gv1.v = 24;
    gv.v = 12;
    gv = gv1; // oops, where is my alarm()?

Again, there is no method in "modern D" to catch assignment to the
variable itself.

In my case (a wrapper of htmlayout), in the following assignment:

    dom::element root = dom::element::get_root(hWnd);

operator= calls HTMLayout_useElement on the HELEMENT returned by
get_root (C++). And C++ will call the destructor for 'root' at the end
of the block, and in the destructor HTMLayout_unuseElement happens.
Such a use case allows holding resources for a limited (deterministic)
time. In the end the system is more responsive than any GCable one.

There is no way in D to implement this. Sorry, but this is true.

Andrew Fedoniouk.
http://terrainformatica.com
Jul 22 2006
parent "Derek Parnell" <derek psych.ward> writes:
On Sat, 22 Jul 2006 18:03:33 +1000, Andrew Fedoniouk
<news terrainformatica.com> wrote:


 There is no way in D to implement this. Sorry, but this is true.

Agreed. The ':=' operator could be one solution. I suggest that when
applied, it should only copy one level deep, and if one needs deeper
copies then the opCopy() function could be overloaded to provide that
functionality. Of course, we should also have opXXX functionality when
using basic types and arrays.

Hopefully, this concept can be seriously considered for v2.0.

-- 
Derek Parnell
Melbourne, Australia
Jul 22 2006