digitalmars.D - Java String vs wchar[] Was: Re: inner classes

Andrew Fedoniouk (54/59) May 30 2005 string does not need to be a class.

Derek Parnell (13/21) May 30 2005 On Mon, 30 May 2005 22:50:29 -0700, Andrew Fedoniouk wrote:

Andrew Fedoniouk (10/26) May 31 2005 Thanks, Derek.

Walter (23/34) May 31 2005 A number of languages use the immutable string idiom, and its corollary

Andrew Fedoniouk (24/66) May 31 2005 Gotcha.

Anders F Björklund (19/34) May 31 2005 Magic Eight Ball says:

Andrew Fedoniouk (12/45) May 31 2005 Yep, Anders, seems like we are on the same track with you.

Kris (10/63) May 31 2005 Aye. There's a fair number of people who share this concern, and we all ...

Sean Kelly (52/56) May 31 2005 FWIW, logical const behavior can *almost* be modeled decently using a so...

Brad Beveridge (12/77) May 31 2005 Do you have any ideas on how this could work with basic data types and

Andrew Fedoniouk (9/85) May 31 2005 The main problem that for basic types there are no solution in principle...
Sean Kelly (65/77) May 31 2005 I don't know that this is worth doing for POD types because they're copi...

Andrew Fedoniouk (5/76) May 31 2005 Cool! :)

Walter (12/26) May 31 2005 Third party functions should follow the COW principle too. They should n...

Derek Parnell (11/30) May 31 2005 Yes, and cyclists shouldn't run red lights either.

Ben Hinkle (13/41) May 31 2005 hmm. around here it isn't the cyclists that run red lights - it's the th...

kris (24/65) May 31 2005 Ben; Walter;

Andrew Fedoniouk (58/124) Jun 05 2005 proposed const costs nothing in runtime. Even better - it helps

Ben Hinkle (17/71) Jun 05 2005 Are the comments in the code above editorial by you or are they actually...

Andrew Fedoniouk (33/121) Jun 05 2005 Yes, comments are mine. I took a look again:

Ben Hinkle (11/33) Jun 06 2005 That's what the "readonly" property does. For example

Andrew Fedoniouk (2/41) Jun 06 2005
Andrew Fedoniouk (17/55) Jun 06 2005 Yep. It will work. It will increase size of generated code twice as you

U.Baumanis (4/41) May 31 2005 How about immutable final String for general stuff end StringBuffer (or

Anders F Björklund (8/10) May 31 2005 If you want some Java-like string classes, I hacked some stuff together:

U.Baumanis (6/16) May 31 2005 Thanks! It would be nice to have it in std.string.

Anders F Björklund (10/18) May 31 2005 Don't misunderstand me, I do not think that D needs a String class...

U.Baumanis (6/24) May 31 2005 You are right! I forgot why i wanted a String class...

Eugene Pelekhay (8/55) May 31 2005 May be I'm dummy, but I don't see in this example why this other

Walter (13/20) May 31 2005 You're right that you can avoid excessive copying by doing ref counting.

Andrew Fedoniouk (58/81) Jun 01 2005 Yes, GC does good job. In some places. In other places ref-counting is

Kramer (13/101) Jun 01 2005 Well put. Again, I think a point has been made for having a facility in...

Andrew Fedoniouk (11/141) Jun 01 2005 exactly.

Thomas Kuehne (25/35) Jun 02 2005 -----BEGIN PGP SIGNED MESSAGE-----

kris (9/19) Jun 02 2005 Doesn't have to prove anything if the mutable aspect is part of the

Andrew Fedoniouk (8/27) Jun 02 2005 This "all arrays comming from the closed source part as immutable." soun...

Thomas Kuehne (13/23) Jun 02 2005 -----BEGIN PGP SIGNED MESSAGE-----

Andrew Fedoniouk (4/24) Jun 02 2005 I think I see now what you mean ...
Brad Beveridge (59/59) Jun 02 2005 Here are my thoughts - if people would like to hear them :)

Andrew Fedoniouk (14/74) Jun 02 2005 Exactly. And again, we don't need it at the scale it made in C++ .
Sean Kelly (8/15) Jun 02 2005 My impression of Walter is that he doesn't like solutions that are just ...

Andrew Fedoniouk (30/48) Jun 02 2005 Ummm... Is it in fact so complex? I think everything needed is already

Sean Kelly (6/33) Jun 02 2005 I don't think it is complex. I was thinking of object const-ness when I...

Andrew Fedoniouk (17/58) Jun 02 2005 Why sorry? Anyway...

Regan Heath (16/85) Jun 02 2005 Ok, side question, assume we're going to use 'const' (or any keyword) fo...

Brad Beveridge (10/14) Jun 02 2005 I like

Andrew Fedoniouk (11/25) Jun 02 2005 Yep, only one:

Regan Heath (21/50) Jun 02 2005 I don't understand what you're saying. Which of these, if any, do you

Andrew Fedoniouk (9/61) Jun 02 2005 I mean #2.

Regan Heath (10/25) Jun 02 2005 Yet 'in' does not prevent you from changing the parameter (a copy of the...

Brad Beveridge (13/24) Jun 02 2005 Though I think I get your point - this just doesn't feel right to me.

Andrew Fedoniouk (14/38) Jun 02 2005 Take a look on this:

Brad Beveridge (4/16) Jun 02 2005 You have convinced me :)

Regan Heath (22/36) Jun 02 2005 I wasn't concerned so much with 'where' to type the keyword, but rather ...

Brad Beveridge (11/36) Jun 02 2005 I think so, though I don't feel strongly - we may need #2. I don't see

Regan Heath (25/52) Jun 02 2005 I cannot think of a specific use off the top of my head for #1, but, I

Derek Parnell (10/15) Jun 02 2005 I'm getting confused now; sorry. Are these three things mean ...

Regan Heath (3/15) Jun 02 2005 Yep. Assuming we need all 3 of them.

Andrew Fedoniouk (15/34) Jun 02 2005 Let's just have

Andrew Fedoniouk (4/42) Jun 02 2005 Sorry, above shall be read as:

Tom S (8/15) Jun 02 2005 What about:

Sean Kelly (46/59) Jun 02 2005 I was thinking more of something like this:
Sean Kelly (3/6) Jun 02 2005 Actually, this would work for classes as well.

Thomas Kuehne (54/74) Jun 02 2005 -----BEGIN PGP SIGNED MESSAGE-----

Andrew Fedoniouk (16/97) Jun 02 2005 class C {

Andrew Fedoniouk (33/69) Jun 02 2005 Thomas, I am not proposing here any flags changeable in runtime or exist...

Thomas Kuehne (15/39) Jun 02 2005 -----BEGIN PGP SIGNED MESSAGE-----

Andrew Fedoniouk (21/63) Jun 02 2005 When: In compile time. How: by generating compile time error.

Eugene Pelekhay (32/59) Jun 02 2005 I know about price I pay for ref counting. But in my field as in many

Derek Parnell (21/32) May 31 2005 On Tue, 31 May 2005 00:46:35 -0700, Walter wrote:

Andrew Fedoniouk (8/47) May 31 2005 Derek, this is exactly what const does in C++.

Jan-Eric Duden (11/58) Jun 03 2005 If D had a standard string class there wouldn't be any problem!

Derek Parnell (38/59) Jun 01 2005 Given the current semantics of D, could a workaround be that we give the

Andrew Fedoniouk (17/76) Jun 01 2005 Derek it is not a choice.

"Andrew Fedoniouk" <news terrainformatica.com> writes:

 Are you going to have string constants castable to String, BTW?
 Or any other class? That would be nice...


 Walter asks:

 What advantage does java.lang.String have? Why does string need to be a
 class?

string does not need to be a class.
It is nice to be able to declare methods for it though.
At least for the sake of Java-2-D tool or so.

java.lang.String class has a) methods b) String owns buffer - it controls
buffer.

In D this is possible:
int[char[]] map;
char[] s = "something";
map[s] = 1;
s[0] = '?'; // I have no idea what the result will be. Surely nothing good.

And you can bump into this problem quite easily in D. I personally
have many times, and sometimes the source is too hard to find.

In Java such a collision is not possible in principle: String is final and
immutable.
Java strings are more (I would say too) greedy but more robust. In D I tried
to create something like String, but declarations like
str = new String("real string");
or, with structs, str = String("mmmm");
are just boring and aesthetically disastrous.

For Java guys such D strings will be just a source of permanent errors.

To prevent collisions Mango library (nice one!)
uses two versions of classes e.g. Dictionary/MutableDictionary -
a bit overkill, imho, but works.

Ideally in D it should be possible to reproduce
at least std::string. (I am not even talking about a copy-on-write version yet.)
I tried four times and have not yet found a reliable
solution. I am pretty sure it is impossible to implement the same
abstraction in D with the same overhead.
A struct was a good candidate for such a wrapper, but it has no copy constructor.
A class needs an allocation.
So far there is only dup. But the solution with dup is even worse than in Java. See:

class Url
{
   char[] _hostname;
   ...
   char[] hostname() { return _hostname.dup; } // Doh!
}

if( url.hostname == "terrainformatica.com" )
// 32 bytes of memory spent, just to compare it!
  ....



Ideal from many points of view would be a solution with const

class Url {
  char[] _hostname;

  const char[] hostname() { return _hostname; } // Yep! This is exactly what we need.

}

I think it would be enough to be able to declare variables of simple
types, arrays and pointers, as const.

Generally speaking, const does not imply better assembler code.
But const helps to build optimal and fast systems where the GC spends
1% of the time and not 20%.
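
For readers following along with a later D (2.x), which did gain a const qualifier, the accessor above can be sketched roughly as follows. This is only a sketch under assumptions: the constructor and the static assert check are added for illustration and are not part of the original proposal.

```d
class Url
{
    private char[] _hostname;

    // Hypothetical constructor: takes a private copy so the object owns its buffer.
    this(const(char)[] host) { _hostname = host.dup; }

    // Read-only view of the buffer: no allocation, and callers cannot write through it.
    const(char)[] hostname() const { return _hostname; }
}

void main()
{
    auto url = new Url("terrainformatica.com");

    // Compared in place; no .dup is needed just to read the value.
    assert(url.hostname == "terrainformatica.com");

    // Writing through the returned slice is rejected at compile time.
    static assert(!__traits(compiles, url.hostname[0] = '?'));
}
```

The check costs nothing at run time, which is the point being argued: the protection comes from the type, not from copying.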

Andrew.

May 30 2005

Derek Parnell <derek psych.ward> writes:

On Mon, 30 May 2005 22:50:29 -0700, Andrew Fedoniouk wrote:


[snip]
 
 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.
 
 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.

I'm sure you already know this, but for the benefit of others, you can
avoid this trap by coding ...

 int[char[]] map;
 char[] s = "something";
 map[s.dup] = 1; // NB: .dup call.
 s[0] = '?'; // Does not mess up the index to map.
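
The aliasing behind this trap can be shown without an associative array at all. The sketch below assumes a present-day D (2.x) compiler, where string literals are immutable, so it starts from a .dup of the literal; the variable names are made up for illustration.

```d
void main()
{
    // Literals are immutable in D 2.x, so begin with a mutable copy.
    char[] s = "something".dup;

    char[] view  = s;       // second reference to the same buffer (what map[s] keeps)
    char[] owned = s.dup;   // independent copy (what map[s.dup] keeps)

    s[0] = '?';

    assert(view == "?omething");    // the alias sees the write
    assert(owned == "something");   // the copy does not
}
```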

-- 
Derek
Melbourne, Australia
31/05/2005 4:05:26 PM

May 30 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message 
news:1p5feg14mh412.1x6qgemuugouf$.dlg 40tude.net...
 On Mon, 30 May 2005 22:50:29 -0700, Andrew Fedoniouk wrote:


 [snip]

 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.

 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.

 I'm sure you already know this, but for the benefit of others, you can
 avoid this trap by coding ...

 int[char[]] map;
 char[] s = "something";
 map[s.dup] = 1; // NB: .dup call.
 s[0] = '?'; // Does not mess up the index to map.

Thanks, Derek.
But shall I put your recommendation into the comments of each function returning
a string?
Don't store, don't modify, etc.?
If you put my string into your map, always dup it, etc.

This is not what I would consider a technically
correct solution.

Andrew.

May 31 2005

"Walter" <newshound digitalmars.com> writes:

"Andrew Fedoniouk" <news terrainformatica.com> wrote in message
news:d7gtvf$qs0$1 digitaldaemon.com...
 java.lang.String class has a) methods b) String owns buffer - it controls
 buffer.

 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.

 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.

 In Java such collision is not possible in principle: String is final and
 immutable.

A number of languages use the immutable string idiom, and its corollary
"always implicitly copy the string when writing to it". They all share
another common characteristic - they're slow, and they're slow in a manner
that is *not fixable*. And they're not just slower by a factor, many
algorithms run *exponentially* slower because of the copying.

D must be fast, and the only way to be fast with strings (and arrays) is to
not have the language implicitly copy them, but to allow the programmer the
flexibility to copy or not copy. To know when to copy, use the Copy On Write
principle (COW). That is, if you're not *sure* you've got the only copy of a
string, .dup it before modifying it.

So why isn't that just as bad as the languages that implicitly copy on
write? The answer is that often, you know that you are the sole owner, such
as:

    char[] s = new char[10];
    for (int i = 0; i < 10; i++)
        s[i] = 'c';

Those other languages are doomed to make 10 copies of s. The D programmer
needs to make 0 copies.

As to your example above, when you pass a reference to a string to an
associative array, then you aren't the sole owner of that string anymore.
Don't change it. .dup it.
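
A minimal sketch of the COW rule described above, written for present-day D (2.x). The function name lowerCow is hypothetical and this is not the actual std.string code: it only illustrates "copy before the first write, and not at all if nothing changes".

```d
// Hypothetical helper illustrating copy-on-write; ASCII only.
const(char)[] lowerCow(const(char)[] s)
{
    char[] copy;                      // stays null until a write is needed
    foreach (i, c; s)
    {
        if (c >= 'A' && c <= 'Z')
        {
            if (copy is null)
                copy = s.dup;         // we may not own s: copy before the first write
            copy[i] = cast(char)(c + ('a' - 'A'));
        }
    }
    if (copy is null)
        return s;                     // nothing changed: hand back the input, zero copies
    return copy;
}

void main()
{
    const(char)[] plain = "already lower";
    assert(lowerCow(plain).ptr is plain.ptr);   // same buffer comes back

    char[] mixed = "MiXed".dup;
    assert(lowerCow(mixed) == "mixed");
    assert(mixed == "MiXed");                   // the caller's buffer is left alone
}
```

At most one copy is made per call, and none when the input is already lower case.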

May 31 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Walter" <newshound digitalmars.com> wrote in message 
news:d7h4rf$1345$1 digitaldaemon.com...
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7gtvf$qs0$1 digitaldaemon.com...
 java.lang.String class has a) methods b) String owns buffer - it controls
 buffer.

 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.

 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.

 In Java such collision is not possible in principle: String is final and
 immutable.

 A number of languages use the immutable string idiom, and its corollary
 "always implicitly copy the string when writing to it". They all share
 another common characteristic - they're slow, and they're slow in a manner
 that is *not fixable*. And they're not just slower by a factor, many
 algorithms run *exponentially* slower because of the copying.

 D must be fast, and the only way to be fast with strings (and arrays) is 
 to
 not have the language implicitly copy them, but to allow the programmer 
 the
 flexibility to copy or not copy. To know when to copy, use the Copy On 
 Write
 principle (COW). That is, if you're not *sure* you've got the only copy of 
 a
 string, .dup it before modifying it.

 So why isn't that just as bad as the languages that implicitly copy on
 write? The answer is that often, you know that you are the sole owner, 
 such
 as:

    char[] s = new char[10];
    for (int i = 0; i < 10; i++)
        s[i] = 'c';

 Those other languages are doomed to make 10 copies of s. The D programmer
 needs to make 0 copies.

 As to your example above, when you pass a reference to a string to an
 associative array, then you aren't the sole owner of that string anymore.
 Don't change it. .dup it.

Gotcha.

And what will be your advice then for:

class Url {
  char[] _hostname;
  char[] hostname() { return _hostname; }
}

_hostname should not be changeable, neither intentionally
nor accidentally.
The hostname access pattern is primarily read. But it could possibly be
passed to some third-party functions.

I am serious. I really want to know how to design it better.

I've made an ugly

struct string {
   wchar[] chars;
   bool      mutable;
}

But this does not work in 15% of cases.

I remember the good old days of C programming with these char[]s.
Damned fast but not maintainable.

In C++ I have my own nice tool::string with reliable
copy-on-write..... sigh.

Andrew.

May 31 2005

Anders F Björklund <afb algonet.se> writes:

Andrew Fedoniouk wrote:

 And what will be your advice then for:
 
 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }
 
 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.
 
 I am serious. I really want to know how to design it better.

Magic Eight Ball says:
                ___
               /   \
              /     \
             /  ASK  \
            /  AGAIN  \
           /   LATER   \
           \___________/

My own prediction is that we argue about it for a few months more,
and then Walter caves in and adds a "readonly" keyword to D... :-)

For the time being, I think returning the string and asking
others to be nice is better than using a Class or a struct ?

 I remember the good old days of C programming with these char[]s.
 Damned fast but not maintainable.

That's where we are at now, I suppose.

I've already run into some things regarding string literals.
And that was even before any potential class library user...

Copy on Write is currently just a Gentlemen's Agreement.
And it needs the client using Url.hostname to play along.

--anders

May 31 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Anders F Björklund" <afb algonet.se> wrote in message
news:d7ha36$18df$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:

 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.

 Magic Eight Ball says:
                ___
               /   \
              /     \
             /  ASK  \
            /  AGAIN  \
           /   LATER   \
           \___________/

 My own prediction is that we argue about it for a few months more,
 and then Walter caves in and adds a "readonly" keyword to D... :-)

 For the time being, I think returning the string and asking
 others to be nice is better than using a Class or a struct ?

 I remember the good old days of C programming with these char[]s.
 Damned fast but not maintainable.

 That's where we are at now, I suppose.

 I've already run into some things regarding string literals.
 And that was even before any potential class library user...

 Copy on Write is currently just a Gentlemen's Agreement.
 And it needs the client using Url.hostname to play along.

Yep, Anders, seems like we are on the same track with you.

The thing is: I really don't know of a good style of
library/component design in D.

If I am writing all the code in one EXE by myself -
fine - I am a gentleman with myself. Well... even with
myself... 'Today me' and 'yesterday me' are frequently
different persons.

But! If I am designing a library for common use....

I can imagine: documentation starts from:

"For gentlemen only...."

May 31 2005

"Kris" <fu bar.com> writes:

"Andrew Fedoniouk" <news terrainformatica.com> wrote ...
 "Anders F Björklund" <afb algonet.se> wrote in message
 news:d7ha36$18df$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:

 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.

 Magic Eight Ball says:
                ___
               /   \
              /     \
             /  ASK  \
            /  AGAIN  \
           /   LATER   \
           \___________/

 My own prediction is that we argue about it for a few months more,
 and then Walter caves in and adds a "readonly" keyword to D... :-)

 For the time being, I think returning the string and asking
 others to be nice is better than using a Class or a struct ?

 I remember the good old days of C programming with these char[]s.
 Damned fast but not maintainable.

 That's where we are at now, I suppose.

 I've already run into some things regarding string literals.
 And that was even before any potential class library user...

 Copy on Write is currently just a Gentlemen's Agreement.
 And it needs the client using Url.hostname to play along.

 Yep, Anders, seems like we are on the same track with you.

Aye. There's a fair number of people who share this concern, and we all seem
to be asking for the same thing. I think/hope it's a matter of time rather
than legitimacy.


 The thing is: I really don't know of a good style of
 library/component design in D.

FWIW: I've been forced down the path of the Gentleman's agreement, with the
expectation that 'readonly' will materialize in some form. It is possible to
minimise the occurrence of such things, but it doesn't always provide the
most lightweight implementation (as you noted elsewhere).


 If I am writing all the code in one EXE by myself -
 fine - I am a gentleman with myself. Well... even with
 myself... 'Today me' and 'yesterday me' are frequently
 different persons.

 But! If I am designing a library for common use....

 I can imagine: documentation starts from:

 "For gentlemen only...."

Good one! Perhaps we should come up with a GLA: "Gentleman's License
Agreement" :-)

May 31 2005

Sean Kelly <sean f4.ca> writes:

In article <d7ic55$2df4$1 digitaldaemon.com>, Kris says...
FWIW: I've been forced down the path of the Gentleman's agreement, with the
expectation that 'readonly' will materialize in some form. It is possible to
minimise the occurrence of such things, but it doesn't always provide the
most lightweight implementation (as you noted elsewhere).

FWIW, logical const behavior can *almost* be modeled decently using a sort of
honor system as well:

[code sample missing from the archived post]

The obvious problems with the above are that (1) it requires a lot of
cooperation and (2) the read function has no easy way of storing whether C was
mutable *before* the function was called, so it may 'restore' the wrong state
(though this issue could be solved with thread local storage, at the expense of
added complexity).
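
Since the code from this post did not survive, here is only a rough sketch of the general idea and of one way around problem (2), written for present-day D (2.x). Everything in it is an assumption for illustration: the class C, its mutable flag, and the helper withReadOnly are hypothetical names, not the original code. The helper saves the flag in a local and restores it with scope (exit), so no thread local storage is needed.

```d
class C
{
    bool mutable = true;
    private int _value;

    int value() const { return _value; }

    // Honor system: every mutator has to check the flag itself.
    void value(int v)
    {
        assert(mutable, "C is currently read-only");
        _value = v;
    }
}

// Hypothetical helper: run reader with c marked read-only, then restore
// whatever state the caller had, rather than forcing it back to true.
void withReadOnly(C c, void delegate(C) reader)
{
    immutable saved = c.mutable;       // remember the caller's state
    c.mutable = false;
    scope (exit) c.mutable = saved;    // restored even if reader throws
    reader(c);
}

void main()
{
    auto c = new C;
    c.value(42);                       // allowed: c starts out mutable

    c.mutable = false;                 // the caller had already frozen the object
    withReadOnly(c, (C x) { assert(x.value() == 42); });
    assert(!c.mutable);                // prior state restored, not forced to true
}
```

The check is still a runtime assert, so it vanishes in release builds and problem (1), the need for cooperation, remains.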


Sean

May 31 2005

Brad Beveridge <brad somewhere.net> writes:

Sean Kelly wrote:
 In article <d7ic55$2df4$1 digitaldaemon.com>, Kris says...
 
FWIW: I've been forced down the path of the Gentleman's agreement, with the
expectation that 'readonly' will materialize in some form. It is possible to
minimise the occurrence of such things, but it doesn't always provide the
most lightweight implementation (as you noted elsewhere).

 
 
 FWIW, logical const behavior can *almost* be modeled decently using a sort of
 honor system as well:
 
 
 The obvious problems with the above are that (1) it requires a lot of
 cooperation and (2) the read function has no easy way of storing whether C was
 mutable *before* the function was called, so it may 'restore' the wrong state
 (though this issue could be solved with thread local storage, at the expense of
 added complexity).
 
 
 Sean
 
 

Do you have any ideas on how this could work with basic data types and 
arrays?  With classes, is it possible for the compiler to generate 
accessors for all members automatically?  That would ease implementation 
details.  Perhaps a template could be created so that class code could 
look like
class C
{
	bit mutable;
	mixin constcapable!(int) someInt;
}

Brad

May 31 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Brad Beveridge" <brad somewhere.net> wrote in message 
news:d7ifkp$2goj$1 digitaldaemon.com...
 Sean Kelly wrote:
 In article <d7ic55$2df4$1 digitaldaemon.com>, Kris says...

FWIW: I've been forced down the path of the Gentleman's agreement, with 
the
expectation that 'readonly' will materialize in some form. It is possible 
to
minimise the occurrence of such things, but it doesn't always provide the
most lightweight implementation (as you noted elsewhere).


 FWIW, logical const behavior can *almost* be modeled decently using a 
 sort of
 honor system as well:

 The obvious problems with the above are that (1) it requires a lot of
 cooperation and (2) the read function has no easy way of storing whether 
 C was
 mutable *before* the function was called, so it may 'restore' the wrong 
 state
 (though this issue could be solved with thread local storage, at the 
 expense of
 added complexity).


 Sean

 Do you have any ideas on how this could work with basic data types and 
 arrays?  With classes, is it possible for the compiler to generate 
 accessors for all members automatically?  That would ease implementation 
 details.  Perhaps a template could be created so that class code could 
 look like
 class C
 {
 bit mutable;
 mixin constcapable!(int) someInt;
 }

 Brad

The main problem is that for basic types there is no solution in principle.
At least I did not find a proper idiom.

I've ended up with a "solution" of patching std.internals to add
a readonly flag to arrays. But this does not work for pointers.

The only feasible solution is const: checks at compile time
and not at runtime.

Andrew.

May 31 2005

Sean Kelly <sean f4.ca> writes:

In article <d7ifkp$2goj$1 digitaldaemon.com>, Brad Beveridge says...
Do you have any ideas on how this could work with basic data types and 
arrays?

I don't know that this is worth doing for POD types because they're copied when
passed as 'in' parameters anyway.  But I suppose strings could be an issue.  One
possibility would be to fingerprint the memory using a checksum and verify that
fingerprint in the out clause.  Perhaps someone has a better suggestion?
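
A rough sketch of the fingerprint idea, assuming present-day D (2.x). The names fingerprint and leavesUnchanged are hypothetical, and FNV-1a is just one convenient hash picked for the example, not something proposed in the thread. The check wraps the call rather than living in an out clause, because the out clause has no direct access to the value computed before the call.

```d
// 32-bit FNV-1a over the bytes of the string.
uint fingerprint(const(char)[] data)
{
    uint h = 2166136261u;
    foreach (char c; data)
    {
        h ^= c;
        h *= 16777619u;    // wraps around on overflow, as intended
    }
    return h;
}

// Hypothetical checker: true if fn left the contents of buf as they were.
bool leavesUnchanged(char[] buf, void function(char[]) fn)
{
    immutable before = fingerprint(buf);
    fn(buf);
    return fingerprint(buf) == before;
}

void main()
{
    // A well-behaved callee that only reads.
    assert(leavesUnchanged("hostname".dup, (char[] b) { auto n = b.length; }));

    // A callee that breaks the agreement.
    assert(!leavesUnchanged("hostname".dup, (char[] b) { b[0] = '?'; }));
}
```

This costs a full pass over the data before and after every call, and a hash can in principle miss a change, so it is a debugging aid at best.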

With classes, is it possible for the compiler to generate 
accessors for all members automatically?

Certainly.  If DBC became a popular means for verifying const correctness I'd
suggest that the compiler offer a means to do this.  I don't think it would be
too terribly complicated, but I haven't given it much thought.

That would ease implementation 
details.  Perhaps a template could be created so that class code could 
look like
class C
{
	bit mutable;
	mixin constcapable!(int) someInt;
}

Definitely a possibility.  My only concern with the DBC method is that it
requires the library writer to build stuff in to support it to keep the code
clean.  The client could do the switching instead:

c.mutable = false;
func( c, d );
c.mutable = true;

but this is obviously pretty clunky.  It might be possible to do this with auto
classes:

[code sample missing from the archived post]

but this is very clunky and doesn't even offer the potential to streamline it
because of auto lifetime rules, though it's worth noting that the above example
prints this:

pre
ctor
post
dtor

and wrapping the function call in its own scope doesn't help:

{ func( cast(C)(new SetConst!(C)( c )) ); }

It's little things like this that have me cursing the restriction that structs
can't have ctors.  Not only does the above code require a completely pointless
memory allocation, but the lifetime of the class isn't even what it should be
(though I'd consider this latter issue to be a bug).


Sean

May 31 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Kris" <fu bar.com> wrote in message news:d7ic55$2df4$1 digitaldaemon.com...
 "Andrew Fedoniouk" <news terrainformatica.com> wrote ...
 "Anders F Björklund" <afb algonet.se> wrote in message
 news:d7ha36$18df$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:

 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.

 Magic Eight Ball says:
                ___
               /   \
              /     \
             /  ASK  \
            /  AGAIN  \
           /   LATER   \
           \___________/

 My own prediction is that we argue about it for a few months more,
 and then Walter caves in and adds a "readonly" keyword to D... :-)

 For the time being, I think returning the string and asking
 others to be nice is better than using a Class or a struct ?

 I remember the good old days of C programming with these char[]s.
 Damned fast but not maintainable.

 That's where we are at now, I suppose.

 I've already run into some things regarding string literals.
 And that was even before any potential class library user...

 Copy on Write is currently just a Gentlemen's Agreement.
 And it needs the client using Url.hostname to play along.

 Yep, Anders, seems like we are on the same track with you.

 Aye. There's a fair number of people who share this concern, and we all 
 seem
 to be asking for the same thing. I think/hope it's a matter of time rather
 than legitimacy.


 The thing is: I really don't know of a good style of
 library/component design in D.

 FWIW: I've been forced down the path of the Gentleman's agreement, with 
 the
 expectation that 'readonly' will materialize in some form. It is possible 
 to
 minimise the occurrence of such things, but it doesn't always provide the
 most lightweight implementation (as you noted elsewhere).


 If I am writing all the code in one EXE by myself -
 fine - I am a gentleman with myself. Well... even with
 myself... 'Today me' and 'yesterday me' are frequently
 different persons.

 But! If I am designing a library for common use....

 I can imagine: documentation starts from:

 "For gentlemen only...."

 Good one! Perhaps we should come up with a GLA: "Gentleman's License
 Agreement" :-)

Cool! :)

My variation:

"Dgentleman's License Agreement"

Andrew.

May 31 2005

"Walter" <newshound digitalmars.com> writes:

"Andrew Fedoniouk" <news terrainformatica.com> wrote in message
news:d7h66r$14s5$1 digitaldaemon.com...
 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.

Third party functions should follow the COW principle too. They should not
modify strings that they don't know they are the owner of. Look at
std.string.tolower for an example of this.


 I remember the good old days of C programming with these char[]s.
 Damned fast but not maintainable.
 In C++ I have my own nice tool::string with reliable
 copy-on-write..... sigh.

The C++ std::string is slower than D strings.
www.digitalmars.com/d/cppstrings.html

C++ strings have another serious problem (that D doesn't have): you have to
keep track of who the owner is, so it can be deleted (else you get a memory
leak). In my experience, that is a LOT harder to get right than adhering to
COW. Trying to absolutely determine ownership, like C++ does, is much harder
than just being able to assume you don't own it.

May 31 2005

Derek Parnell <derek psych.ward> writes:

On Tue, 31 May 2005 19:37:03 -0700, Walter wrote:

 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7h66r$14s5$1 digitaldaemon.com...
 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.

 
 Third party functions should follow the COW principle too. They should not
 modify strings that they don't know they are the owner of.

Yes, and cyclists shouldn't run red lights either.

We have to code in a world in which many people using our libraries don't
care about what they 'should' do; they use anything that seems like an
expedient idea at the time. Yes I know that not following the CoW rules is
dangerous, but it's not as dangerous as cyclists running red lights and they
continue to do that.

-- 
Derek
Melbourne, Australia
1/06/2005 12:56:20 PM

May 31 2005

"Ben Hinkle" <ben.hinkle gmail.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message 
news:7aqw8u524dge$.1hmvgp4jz3dvc.dlg 40tude.net...
 On Tue, 31 May 2005 19:37:03 -0700, Walter wrote:

 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7h66r$14s5$1 digitaldaemon.com...
 And what will be your advice then for:

 class Url {
   char[] _hostname;
   char[] hostname() { return _hostname; }
 }

 _hostname should not be changeable nor intentionally
 nor accidentally.
 hostname access pattern is primarily read. But it could possibly be
 passed in some third party functions.

 I am serious. I really want to know how to design it better.

 Third party functions should follow the COW principle too. They should 
 not
 modify strings that they don't know they are the owner of.

 Yes, and cyclists shouldn't run red lights either.

 We have to code in a world in which many people using our libraries don't
 care about what they 'should' do; they use anything that seems like an
 expedient idea at the time. Yes I know that not following the CoW rules is
 dangerous, but its not as dangerous as cyclists running red lights and 
 they
 continue to do that.

hmm. around here it isn't the cyclists that run red lights - it's the things 
with 4 wheels and that unused pedal called the "brake". :-P

But more to topic I'm with Walter that when you look at the big picture COW 
is a reasonable balance of trade-offs. The only suggestion I have is to put 
COW more front-and-center in the array help so that people see it from the 
start and it becomes second nature. Compiler protection against malicious 
code isn't that important to me since people will go out of their way to 
write malicious code no matter what the compiler does. I'm more worried 
about the accidental D-newbie who doesn't know about arrays or COW. For 
those cases talking about COW right away in the doc will decrease the 
likelihood of newbie errors.

May 31 2005

kris <fu bar.org> writes:

Ben Hinkle wrote:
And what will be your advice then for:

class Url {
  char[] _hostname;
  char[] hostname() { return _hostname; }
}

_hostname should not be changeable nor intentionally
nor accidentally.
hostname access pattern is primarily read. But it could possibly be
passed in some third party functions.

I am serious. I really want to know how to design it better.

Third party functions should follow the COW principle too. They should 
not
modify strings that they don't know they are the owner of.

Yes, and cyclists shouldn't run red lights either.

We have to code in a world in which many people using our libraries don't
care about what they 'should' do; they use anything that seems like an
expedient idea at the time. Yes I know that not following the CoW rules is
dangerous, but its not as dangerous as cyclists running red lights and 
they
continue to do that.

 
 
 hmm. around here it isn't the cyclists that run red lights - it's the things 
 with 4 wheels and that unused pedal called the "brake". :-P
 
 But more to topic I'm with Walter that when you look at the big picture COW 
 is a reasonable balance of trade-offs. The only suggestion I have is to put 
 COW more front-and-center in the array help so that people see it from the 
 start and it becomes second nature. Compiler protection against malicious 
 code isn't that important to me since people will go out of their way to 
 write malicious code no matter what the compiler does. I'm more worried 
 about the accidental D-newbie who doesn't know about arrays or COW. For 
 those cases talking about COW right away in the doc will decrease the 
 likelihood of newbie errors. 


Ben; Walter;

I think perhaps you're missing a significant point being made? CoW is 
not the issue at stake ~ instead, what's being asked for is a mechanism 
to /enforce/ CoW.

For example: the little example above should not be dup'ing the content 
before return, if it's only being used for reference (read-only) 
purposes by both parties (caller and callee). I think we can all agree 
on that? Yes?

What's being asked for is a means whereby the compiler will 'prohibit' 
some other caller from using the returned array as a writable lValue; at 
compile time. That is, the CoW should be performed by the caller (not 
the callee), /if and when the caller needs to perform a write upon it/. 
And only at that time.

Again, CoW is not being questioned. It's the total lack of enforcement 
that would be good to do something about. The compiler goes out of its 
way to catch out-of-bounds errors WRT arrays ~ we're asking for 
something similar here to avoid a source of silly, easily preventable, 
and hard to track down bugs. It would add some noticeable weight to any 
story regarding robustness.

Turn things around for a minute, and assume such a facility was 
available. It's not hard to see how this would be viewed in a most 
favourable light. And there's no downside for the code, or for the 
developer. Best of all worlds?
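
A minimal sketch of the enforcement asked for here, assuming a later D compiler in which `const(char)[]` exists (the compiler discussed in this thread had no such qualifier). The `Url` constructor, the `upperFirst` helper and the host name are made up for illustration; the point is only that the callee returns its array without a copy, and the caller pays for a `.dup` only when it actually wants to write.

```d
import std.stdio;

class Url
{
    private char[] _hostname;

    this(const(char)[] host) { _hostname = host.dup; }

    // Read-only view: no copy is made, and callers cannot write through it.
    const(char)[] hostname() const { return _hostname; }
}

// Caller-side copy-on-write: duplicate only because a write is needed.
char[] upperFirst(const(char)[] s)
{
    import std.ascii : toUpper;
    char[] copy = s.dup;
    if (copy.length)
        copy[0] = toUpper(copy[0]);
    return copy;
}

void main()
{
    auto url = new Url("example.org");
    const(char)[] h = url.hostname;

    // h[0] = 'X';   // rejected at compile time
    static assert(!__traits(compiles, h[0] = 'X'));

    char[] mine = upperFirst(h);
    assert(mine == "Example.org");
    assert(url.hostname == "example.org");   // the owner's data is untouched
    writeln(h, " -> ", mine);
}
```

Nothing here happens at runtime beyond the one explicit `.dup`; the check itself is purely a compile-time matter.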

May 31 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"kris" <fu bar.org> wrote in message news:d7jemq$k2n$1 digitaldaemon.com...
 Ben Hinkle wrote:
And what will be your advice then for:

class Url {
  char[] _hostname;
  char[] hostname() { return _hostname; }
}

_hostname should not be changeable nor intentionally
nor accidentally.
hostname access pattern is primarily read. But it could possibly be
passed in some third party functions.

I am serious. I really want to know how to design it better.

Third party functions should follow the COW principle too. They should 
not
modify strings that they don't know they are the owner of.

Yes, and cyclists shouldn't run red lights either.

We have to code in a world in which many people using our libraries don't
care about what they 'should' do; they use anything that seems like an
expedient idea at the time. Yes I know that not following the CoW rules 
is
dangerous, but its not as dangerous as cyclists running red lights and 
they
continue to do that.


 hmm. around here it isn't the cyclists that run red lights - it's the 
 things with 4 wheels and that unused pedal called the "brake". :-P

 But more to topic I'm with Walter that when you look at the big picture 
 COW is a reasonable balance of trade-offs. The only suggestion I have is 
 to put COW more front-and-center in the array help so that people see it 
 from the start and it becomes second nature. Compiler protection against 
 malicious code isn't that important to me since people will go out of 
 their way to write malicious code no matter what the compiler does. I'm 
 more worried about the accidental D-newbie who doesn't know about arrays 
 or COW. For those cases talking about COW right away in the doc will 
 decrease the likelihood of newbie errors.


 Ben; Walter;

 I think perhaps you're missing a significant point being made? CoW is not 
 the issue at stake ~ instead, what's being asked for is a mechanism to 
 /enforce/ CoW.

 For example: the little example above should not be dup'ing the content 
 before return, if it's only being used for reference (read-only) purposes 
 by both parties (caller and callee). I think we can all agree on that? 
 Yes?

 What's being asked for is a means whereby the compiler will 'prohibit' 
 some other caller from using the returned array as a writable lValue; at 
 compile time. That is, the CoW should be performed by the caller (not the 
 callee), /if and when the caller needs to perform a write upon it/. And 
 only at that time.

*exactly*.

 Again, CoW is not being questioned. It's the total lack of enforcement 
 that would be good to do something about. The compiler goes out of its way 
 to catch out-of-bounds errors WRT arrays ~ we're asking for something 
 similar here to avoid a source of silly, easily preventable, and hard to 
 track down bugs. It would add some noticable weight to any story regarding 
 robustness.

 Turn things around for a minute, and assume such a facility was available. 
 It's not hard to see how this would be viewed in a most favourable light. 
 And there's no downside for the code, or for the developer. Best of all 
 worlds?

The proposed const costs nothing at runtime. Even better, it helps
to reduce unnecessary allocations.

Let's take a look in some code fragments of Phobos:

module std.openrj -----------------------------------

class Record
{
    Field[] fields() { return m_fields.dup;  } // just in case?
    // probably following is better?
    // const Field[] fields() { return m_fields;  }
}

class Database
{
    Record[]  records()  {   return m_records.dup;   } // the same
    Field[]      fields()      {   return m_fields.dup;  }
}

std.file----------------------------------------------

class FileException : Exception
{
    this(char[] name, uint errno)
    {
        char* s = strerror(errno);              // I have no idea
        this(name, std.string.toString(s).dup); // what is going on here.
        this.errno = errno;
    }
}

void listdir(char[] pathname, bool delegate(char[] filename) callback)
{
 ....
     int len = std.string.strlen(fdata.d_name);
     if (!callback(fdata.d_name[0 .. len].dup)) // is dup really needed here???
             // allocation of new string on each entry! Doh!

 ....
}

std.loader ----------------------------------------------
public class ExeModuleException : Exception
{
    this(uint errcode)
    {
 super(std.string.toString(strerror(errcode)).dup); // why?
    }
}

std.socket ----------------------------------------------

void populate(protoent* proto)
 {
  type = cast(ProtocolType)proto.p_proto;
  name = std.string.toString(proto.p_name).dup; // why?
....
   aliases = new char[][i];
   for(i = 0; i != aliases.length; i++)
   {
    aliases[i] = std.string.toString(proto.p_aliases[i]).dup; // what for?
   }
....

}
----------------------------------------
etc.

Jun 05 2005

Ben Hinkle <Ben_member pathlink.com> writes:

[snip]
Let's take a look in some code fragments of Phobos:

module std.openrj -----------------------------------

class Record
{
    Field[] fields() { return m_fields.dup;  } // just in case?
    // probably following is better?
    // const Field[] fields() { return m_fields;  }
}

class Database
{
    Record[]  records()  {   return m_records.dup;   } // the same
    Field[]      fields()      {   return m_fields.dup;  }
}

std.file----------------------------------------------

class FileException : Exception
{
    this(char[] name, uint errno)
    { char* s = strerror(errno);             //  I have no idea
 this(name, std.string.toString(s).dup); //  what is going on here.
 this.errno = errno;
    }
}

void listdir(char[] pathname, bool delegate(char[] filename) callback)
{
 ....
     int len = std.string.strlen(fdata.d_name);
     if (!callback(fdata.d_name[0 .. len].dup)) // is dup really needed here???
             // allocation of new string on each entry! Doh!

 ....
}

std.loader ----------------------------------------------
public class ExeModuleException : Exception
{
    this(uint errcode)
    {
 super(std.string.toString(strerror(errcode)).dup); // why?
    }
}

std.socket ----------------------------------------------

void populate(protoent* proto)
 {
  type = cast(ProtocolType)proto.p_proto;
  name = std.string.toString(proto.p_name).dup; // why?
....
   aliases = new char[][i];
   for(i = 0; i != aliases.length; i++)
   {
    aliases[i] = std.string.toString(proto.p_aliases[i]).dup; // what for?
   }
....

}
----------------------------------------
etc.

Are the comments in the code above editorial by you or are they actually in the
code? I'd say someone needs to look at phobos to clean up the dups. If you've
already done the sweep then sending Walter the fixes would be helpful. Phobos
can contain non-D programming styles occasionally - each module has a strong
indication of the author's attitudes IMHO.

ps - for kicks this weekend I've been adding a parameter to the MinTL containers
to indicate read-only vs read-write. For example
struct List(Value, bit ReadOnly = false) {
  static if (!ReadOnly) {
    void addTail(Value v){...}
    .. other functions that modify the list ...
  }
  .. functions that don't modify the list ...
}
You get a read-only view of a container by using the "readonly" property. I'll
be finishing this stuff up soon and post to D.dtl later in the week.

Jun 05 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Ben Hinkle" <Ben_member pathlink.com> wrote in message 
news:d80fq2$2gfv$1 digitaldaemon.com...
 [snip]
Let's take a look in some code fragments of Phobos:

module std.openrj -----------------------------------

class Record
{
    Field[] fields() { return m_fields.dup;  } // just in case?
    // probably following is better?
    // const Field[] fields() { return m_fields;  }
}

class Database
{
    Record[]  records()  {   return m_records.dup;   } // the same
    Field[]      fields()      {   return m_fields.dup;  }
}

std.file----------------------------------------------

class FileException : Exception
{
    this(char[] name, uint errno)
    { char* s = strerror(errno);             //  I have no idea
 this(name, std.string.toString(s).dup); //  what is going on here.
 this.errno = errno;
    }
}

void listdir(char[] pathname, bool delegate(char[] filename) callback)
{
 ....
     int len = std.string.strlen(fdata.d_name);
     if (!callback(fdata.d_name[0 .. len].dup)) // is dup really needed here???
             // allocation of new string on each entry! Doh!

 ....
}

std.loader ----------------------------------------------
public class ExeModuleException : Exception
{
    this(uint errcode)
    {
 super(std.string.toString(strerror(errcode)).dup); // why?
    }
}

std.socket ----------------------------------------------

void populate(protoent* proto)
 {
  type = cast(ProtocolType)proto.p_proto;
  name = std.string.toString(proto.p_name).dup; // why?
....
   aliases = new char[][i];
   for(i = 0; i != aliases.length; i++)
   {
    aliases[i] = std.string.toString(proto.p_aliases[i]).dup; // what for?
   }
....

}
----------------------------------------
etc.

 Are the comments in the code above editorial by you or are they actually 
 in the
 code? I'd say someone needs to look at phobos to clean up the dups. If 
 you've
 already done the sweep then sending Walter the fixes would be helpful. 
 Phobos
 can contain non-D programming styles occasionally - each module has a 
 strong
 indication of the author's attitudes IMHO.

Yes, the comments are mine. I took another look:

1) std.socket is fine: std.string.toString(proto.p_aliases[i]).dup is really
   needed, as proto.p_aliases[i] is a temporary string coming from the system
   (sockets).
2) std.string.toString(strerror(errcode)).dup probably makes sense, as the
   string there comes from
     extern (C) char* strerror(int);
3) For std.openrj I don't really know Matthew's intention. It seems he needs
   such an implementation for some reason, or he is following the
   .dup advice for safe library module implementation.
   No idea, in short. const would help in such cases.
4) std.file -> void listdir(char[] pathname, bool delegate(char[] filename) callback)
   The way it is implemented, and without const char[] filename, .dup seems to
   be the only reliable way to accomplish this, as fdata.d_name comes from a
   static system buffer.
   The other option would be to create a temp buffer on the stack, copy the
   string value there and pass a reference to this buffer. But this is still a
   copy on each iteration.
   With const such a string could be passed 'as is'.
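
Point 4 can be sketched as follows, assuming a later D compiler that has `const(char)[]`. This is not the std.file code: `eachName`, `totalLength` and `collect` are hypothetical names, and a fixed array of strings stands in for the system's directory scan. Each name is written into one reused buffer and handed to the callback as a read-only slice, so a copy is made only by a callback that wants to keep the name.

```d
import std.stdio;

// Stand-in for a directory scan: every name lands in one reused buffer,
// much as a system call would reuse its static buffer.
void eachName(string[] source, bool delegate(const(char)[] name) callback)
{
    char[64] buffer;
    foreach (entry; source)
    {
        assert(entry.length <= buffer.length);
        buffer[0 .. entry.length] = entry[];        // simulate the system filling it
        if (!callback(buffer[0 .. entry.length]))   // pass a slice, no .dup
            break;
    }
}

// A callback that only reads: no allocation per entry.
size_t totalLength(string[] names)
{
    size_t total;
    eachName(names, (const(char)[] name) { total += name.length; return true; });
    return total;
}

// A callback that retains the names: it copies, because the buffer is reused.
string[] collect(string[] names)
{
    string[] kept;
    eachName(names, (const(char)[] name) { kept ~= name.idup; return true; });
    return kept;
}

void main()
{
    auto names = ["a.txt", "bb.d"];
    assert(totalLength(names) == 9);
    assert(collect(names) == names);
    writeln(collect(names));
}
```

The read-only parameter type stops the callback from writing into the shared buffer; it does not stop it from holding on to the slice, which is why `collect` still has to copy.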

 ps - for kicks this weekend I've been adding a parameter to the MinTL 
 containers
 to indicate read-only vs read-write. For example
 struct List(Value, bit ReadOnly = false) {
 static if (!ReadOnly) {
 void addTail(Value v){...}
 .. other functions that modify the list ...
 }
 .. functions that don't modify the list ...
 }
 You get a read-only view of a container by using the "readonly" property. 
 I'll
 be finishing this stuff up soon and post to D.dtl later in the week.

(no offence, just wondering)

I really don't know how such static flag would work.
Suppose I declared:

   List!(int, true /*ReadOnly*/) readOnlyList;

How to fill this list then? You will need some mechanism
allowing to do mutable->immutable conversion, right?
My perception is that just something like const_iterator
needs to be designed. Or ConstList without modification
methods which will have sole ctor ConstList(List data).


Andrew.

Jun 05 2005

"Ben Hinkle" <ben.hinkle gmail.com> writes:

 ps - for kicks this weekend I've been adding a parameter to the MinTL
 containers
 to indicate read-only vs read-write. For example
 struct List(Value, bit ReadOnly = false) {
 static if (!ReadOnly) {
 void addTail(Value v){...}
 .. other functions that modify the list ...
 }
 .. functions that don't modify the list ...
 }
 You get a read-only view of a container by using the "readonly" property. 
 I'll
 be finishing this stuff up soon and post to D.dtl later in the week.

 (no offence, just wondering)

 I really don't know how such static flag would work.
 Suppose I declared:

   List(int, true /*ReadOnly*/) readOnlyList;

 How to fill this list then? You will need some mechanism
 allowing to do mutable->immutable conversion, right?

That's what the "readonly" property does. For example

  void foo(List!(int,ReadOnly) y) { ... }
  List!(int) x;
  x.add(10,20,30);
  foo(x.readonly);

The 'ReadOnly' in the code above is a constant defined to be true. I think 
it improves readability of the code. If one wants to "cast away the const" I 
also added a property x.readwrite.

 My perception is that just something like const_iterator
 needs to be designed. Or ConstList without modification
 methods which will have sole ctor ConstList(List data).

Those are equivalent to iterator!(true) where iterator(bit Const = false) 
and List!(true) where List(bit Const = false). The "constness" becomes part 
of the type just like the types List/ConstList or iterator/const_iterator.
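
A rough sketch of how such a flag can work, written for a later D compiler (`bool` instead of `bit`). It is not the MinTL code: the array-backed storage, the parameter name `IsReadOnly` and the `sum` function are assumptions made for the example. The mutating methods and the `readonly` property exist only in the read-write instantiation, so calling them on the read-only type is a compile-time error.

```d
enum ReadOnly = true;   // named constant, used for readability at the call site

struct List(Value, bool IsReadOnly = false)
{
    private Value[] items;

    static if (!IsReadOnly)
    {
        // Mutating operations exist only in the read-write instantiation.
        void addTail(Value v) { items ~= v; }

        // Same storage, viewed through the read-only type.
        List!(Value, true) readonly() { return List!(Value, true)(items); }
    }

    // Operations that do not modify the list (sufficient for value types).
    size_t length() const { return items.length; }
    Value opIndex(size_t i) const { return items[i]; }
}

size_t sum(List!(int, ReadOnly) y)
{
    size_t total;
    foreach (i; 0 .. y.length)
        total += y[i];
    // y.addTail(1);   // would not compile: the method does not exist here
    return total;
}

void main()
{
    List!int x;
    x.addTail(10);
    x.addTail(20);
    x.addTail(30);
    assert(sum(x.readonly) == 60);
    static assert(!__traits(compiles, x.readonly.addTail(1)));
}
```

The view shares the storage it was created from, so it is a snapshot of the elements present at that moment; later appends to `x` are not guaranteed to show up in it.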

Jun 06 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Ben Hinkle" <ben.hinkle gmail.com> wrote in message 
news:d81fb2$7el$1 digitaldaemon.com...
 ps - for kicks this weekend I've been adding a parameter to the MinTL
 containers
 to indicate read-only vs read-write. For example
 struct List(Value, bit ReadOnly = false) {
 static if (!ReadOnly) {
 void addTail(Value v){...}
 .. other functions that modify the list ...
 }
 .. functions that don't modify the list ...
 }
 You get a read-only view of a container by using the "readonly" 
 property. I'll
 be finishing this stuff up soon and post to D.dtl later in the week.

 (no offence, just wondering)

 I really don't know how such static flag would work.
 Suppose I declared:

   List(int, true /*ReadOnly*/) readOnlyList;

 How to fill this list then? You will need some mechanism
 allowing to do mutable->immutable conversion, right?

 That's what the "readonly" property does. For example

  void foo(List!(int,ReadOnly) y) { ... }
  List!(int) x;
  x.add(10,20,30);
  foo(x.readonly);

 The 'ReadOnly' in the code above is a constant defined to be true. I think 
 it improves readability of the code. If one wants to "cast away the const" 
 I also added a property x.readwrite.

 My perception is that just something like const_iterator
 needs to be designed. Or ConstList without modification
 methods which will have sole ctor ConstList(List data).

 Those are equivalent to iterator!(true) where iterator(bit Const = false) 
 and List!(true) where List(bit Const = false). The "constness" becomes 
 part of the type just like the types List/ConstList or 
 iterator/const_iterator.

Yep. It will work. It will double the size of the generated code, as you
need a second instantiation of the template (the readonly version).
But such a specialized wrapper is definitely a solution for containers in
places where robustness and modularity are the main requirements.

I wish it were possible to say something like char[].readonly
for primary containers and pointers, as in most cases they are
sufficient and more efficient.

I took a look at the compiler code. It seems I can implement constness for
arrays and pointers pretty easily, as everything is already there.

Another option would be an extended/selective typedef notation:
typedef  char[] cstring:  opSlice, opIndexAssign; // list of operations derived from base type
A bit ugly, but it would work.

Andrew.
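
The "selected operations" idea can be approximated by hand with a small wrapper struct. The sketch below assumes a later D compiler; `CString` and its `view` method are hypothetical names, and only read operations (length, indexing, slicing) are forwarded, so there is nothing to write through.

```d
struct CString
{
    private const(char)[] data;

    this(const(char)[] s) { data = s; }

    // Only read operations are forwarded to the underlying array.
    size_t length() const { return data.length; }
    char opIndex(size_t i) const { return data[i]; }
    CString opSlice(size_t lo, size_t hi) const { return CString(data[lo .. hi]); }

    // Read-only access to the characters.
    const(char)[] view() const { return data; }
}

void main()
{
    auto c = CString("hello");
    assert(c.length == 5);
    assert(c[1] == 'e');
    assert(c[1 .. 3].view == "el");

    // No opIndexAssign is defined, so element writes do not compile.
    static assert(!__traits(compiles, c[0] = 'x'));
}
```

The struct is just a pointer and a length, so passing it around costs the same as passing the array itself.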

Jun 06 2005

U.Baumanis <U.Baumanis_member pathlink.com> writes:

How about an immutable final String for general stuff and a StringBuffer (or
whatever) for performance needs?

ubau

In article <d7h4rf$1345$1 digitaldaemon.com>, Walter says...
"Andrew Fedoniouk" <news terrainformatica.com> wrote in message
news:d7gtvf$qs0$1 digitaldaemon.com...
 java.lang.String class has a) methods b) String owns buffer - it controls
 buffer.

 In D is possible:
 int[char[]] map;
 char[] s = "something";
 map[s] = 1;
 s[0] = '?'; // I have no idea what result will be. sure not good.

 And you can bump into such problem quite easily in D. I personally
 did many times. And too hard to find source sometimes.

 In Java such collision is not possible in principle: String is final and
 immutable.

A number of languages use the immutable string idiom, and its corollary
"always implicitly copy the string when writing to it". They all share
another common characteristic - they're slow, and they're slow in a manner
that is *not fixable*. And they're not just slower by a factor, many
algorithms run *exponentially* slower because of the copying.

D must be fast, and the only way to be fast with strings (and arrays) is to
not have the language implicitly copy them, but to allow the programmer the
flexibility to copy or not copy. To know when to copy, use the Copy On Write
principle (COW). That is, if you're not *sure* you've got the only copy of a
string, .dup it before modifying it.

So why isn't that just as bad as the languages that implicitly copy on
write? The answer is that often, you know that you are the sole owner, such
as:

    char[] s = new char[10];
    for (i = 0; i < 10; i++)
        s[i] = 'c';

Those other languages are doomed to make 10 copies of s. The D programmer
needs to make 0 copies.

As to your example above, when you pass a reference to a string to an
associative array, then you aren't the sole owner of that string anymore.
Don't change it. .dup it.
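
The two cases above can be put side by side in a short sketch. It assumes a later D compiler, where associative array keys have to be immutable, so the key is copied explicitly with `.idup`; `questionFirst` is a made-up helper name.

```d
// COW by convention: this function does not own `s`, so it copies before writing.
char[] questionFirst(const(char)[] s)
{
    char[] copy = s.dup;
    if (copy.length)
        copy[0] = '?';
    return copy;
}

void main()
{
    int[string] map;
    char[] s = "something".dup;   // we own this buffer
    map[s.idup] = 1;              // the map gets its own immutable copy as key
    s[0] = '?';                   // safe: the key is a separate array
    assert("something" in map);
    assert(s == "?omething");

    // Sole owner: fill in place, zero copies.
    char[] t = new char[10];
    foreach (ref c; t)
        c = 'c';
    assert(t == "cccccccccc");

    // Not the owner: copy first, then write.
    assert(questionFirst("abc") == "?bc");
}
```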

May 31 2005

=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:

U.Baumanis wrote:

 How about immutable final String for general stuff end StringBuffer (or
 whatever) for performance needs.

If you want some Java-like string classes, I hacked some stuff together:
http://www.algonet.se/~afb/d/dcaf/html/class_string.html
http://www.algonet.se/~afb/d/dcaf/html/class_string_buffer.html

That doesn't change the "readonly" (was: const) needs of the built-in
string types of D (code unit arrays) ? Just something of a workaround.
Kris has a much nicer wrapper (with ICU features) under the Mango Tree.

--anders

May 31 2005

U.Baumanis <U.Baumanis_member pathlink.com> writes:

In article <d7hagf$194p$1 digitaldaemon.com>,
=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= says...
U.Baumanis wrote:

 How about immutable final String for general stuff end StringBuffer (or
 whatever) for performance needs.

If you want some Java-like string classes, I hacked some stuff together:
http://www.algonet.se/~afb/d/dcaf/html/class_string.html
http://www.algonet.se/~afb/d/dcaf/html/class_string_buffer.html

That doesn't change the "readonly" (was: const) needs of the built-in
string types of D (code unit arrays) ? Just something of a workaround.
Kris has a much nicer wrapper (with ICU features) under the Mango Tree.

--anders

Thanks! It would be nice to have it in std.string.
Well, better somewhere than nowhere. :-)

--
ubau

May 31 2005

=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:

U.Baumanis wrote:

If you want some Java-like string classes, I hacked some stuff together:

That doesn't change the "readonly" (was: const) needs of the built-in
string types of D (code unit arrays) ? Just something of a workaround.
Kris has a much nicer wrapper (with ICU features) under the Mango Tree.

 
 Thanks! It would be nice to have it in std.string.
 Well, better somewhere than nowhere. :-)

Don't misunderstand me, I do not think that D needs a String class...
(it was just a small example on how one could implement such a beast)

The default D string type is still "char[]".

Just wished there was a simple way to preserve the readonly-ness of it,
without resorting to using a fullblown wrapper class - like Java does ?

Something like: readonly char[] s = "hello";

--anders

PS. Name "const" has been renamed for political reasons.
     Kinda like typedef, which changed name into "alias".

May 31 2005

U.Baumanis <U.Baumanis_member pathlink.com> writes:

In article <d7hl28$1ikj$1 digitaldaemon.com>,
=?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= says...
U.Baumanis wrote:

If you want some Java-like string classes, I hacked some stuff together:

That doesn't change the "readonly" (was: const) needs of the built-in
string types of D (code unit arrays) ? Just something of a workaround.
Kris has a much nicer wrapper (with ICU features) under the Mango Tree.

 
 Thanks! It would be nice to have it in std.string.
 Well, better somewhere than nowhere. :-)

Don't misunderstand me, I do not think that D needs a String class...
(it was just a small example on how one could implement such a beast)

The default D string type is still "char[]".

Just wished there was a simple way to preserve the readonly-ness of it,
without resorting to using a fullblown wrapper class - like Java does ?

Something like: readonly char[] s = "hello";

--anders

PS. Name "const" has been renamed for political reasons.
     Kinda like typedef, which changed name into "alias".

You are right! I forgot why I wanted a String class...
Maybe because I have to use Java at work. ;-)

--
ubau

May 31 2005

Eugene Pelekhay <pelekhay gmail.com> writes:

Walter wrote:
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7gtvf$qs0$1 digitaldaemon.com...
 
java.lang.String class has a) methods b) String owns buffer - it controls
buffer.

In D is possible:
int[char[]] map;
char[] s = "something";
map[s] = 1;
s[0] = '?'; // I have no idea what result will be. sure not good.

And you can bump into such problem quite easily in D. I personally
did many times. And too hard to find source sometimes.

In Java such collision is not possible in principle: String is final and
immutable.

 
 
 A number of languages use the immutable string idiom, and its corollary
 "always implicitly copy the string when writing to it". They all share
 another common characteristic - they're slow, and they're slow in a manner
 that is *not fixable*. And they're not just slower by a factor, many
 algorithms run *exponentially* slower because of the copying.
 
 D must be fast, and the only way to be fast with strings (and arrays) is to
 not have the language implicitly copy them, but to allow the programmer the
 flexibility to copy or not copy. To know when to copy, use the Copy On Write
 principle (COW). That is, if you're not *sure* you've got the only copy of a
 string, .dup it before modifying it.
 
 So why isn't that just as bad as the languages that implicitly copy on
 write? The answer is that often, you know that you are the sole owner, such
 as:
 
     char[] s = new char[10];
     for (i = 0; i < 10; i++)
         s[i] = 'c';

Maybe I'm a dummy, but I don't see in this example why those other
languages must copy it 10 times. With the reference-counted string
implementation in my C++ project, the copy would also be performed 0 times.
And if more than one reference to the instance exists, only one
copy operation would be performed. I see only one advantage in the current
implementation of strings: there is no need to check or increment/decrement
a reference counter, but in exchange string duplication is required.

 
 Those other languages are doomed to make 10 copies of s. The D programmer
 needs to make 0 copies.
 
 As to your example above, when you pass a reference to a string to an
 associative array, then you aren't the sole owner of that string anymore.
 Don't change it. .dup it.

May 31 2005

"Walter" <newshound digitalmars.com> writes:

"Eugene Pelekhay" <pelekhay gmail.com> wrote in message
news:d7hfuh$1ejl$1 digitaldaemon.com...
 May be I'm dummy, but I don't see in this example why this other
 languages must copy it 10 times. For my implementation of reference
 counted string in my C++ project, copy will be performed also 0 times.
 And if there is more then 1 reference to instance exsits it's only one
 copy operation will be performed. I see only one advantage in current
 implementation of string - not need to check or increment/decrement
 reference counter, but instead of this string duplication is required

You're right that you can avoid excessive copying by doing ref counting.

Reference counting carries with it other penalties - storage must be
allocated for the ref count, every copy increments the count, and every
reference that goes out of scope must decrement the count. Add in exception
handling, and the price is high (although C++'s mechanisms hide that price
from you).

Ref counting would make it impractical to do D's array slices.

Furthermore, in the presence of garbage collection, layering on top a
reference counting mechanism probably means you'll want to ditch the gc and
go with a full ref counting architecture for every object. In my experience,
such is slower than using mark/sweep gc.
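
For readers unfamiliar with slices, a small sketch of why they sit awkwardly with reference counting: a slice is only a pointer and a length into memory that someone else allocated, and it may point into the middle of that block. The helper name `isSliceOf` is made up for the example.

```d
// True when `part` lies entirely inside the memory of `whole`.
bool isSliceOf(const(char)[] part, const(char)[] whole)
{
    return part.ptr >= whole.ptr
        && part.ptr + part.length <= whole.ptr + whole.length;
}

void main()
{
    char[] whole = "hello world".dup;
    char[] word = whole[6 .. 11];          // no allocation, no counter to update

    assert(word == "world");
    assert(word.ptr == whole.ptr + 6);     // same memory, different start
    assert(isSliceOf(word, whole));

    word[0] = 'W';
    assert(whole == "hello World");        // a write through the slice is visible

    // A slice is two machine words: length and pointer.
    static assert((char[]).sizeof == 2 * size_t.sizeof);

    // A .dup is a separate allocation, not a view.
    assert(!isSliceOf(word.dup, whole));
}
```

Since a slice can start anywhere inside a block, there is no fixed place next to it where a count could live, and the garbage collector is what keeps the underlying block alive.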

May 31 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Walter" <newshound digitalmars.com> wrote in message 
news:d7j6na$d2f$1 digitaldaemon.com...
 "Eugene Pelekhay" <pelekhay gmail.com> wrote in message
 news:d7hfuh$1ejl$1 digitaldaemon.com...
 May be I'm dummy, but I don't see in this example why this other
 languages must copy it 10 times. For my implementation of reference
 counted string in my C++ project, copy will be performed also 0 times.
 And if there is more then 1 reference to instance exsits it's only one
 copy operation will be performed. I see only one advantage in current
 implementation of string - not need to check or increment/decrement
 reference counter, but instead of this string duplication is required

 You're right that you can avoid excessive copying by doing ref counting.

 Reference counting carries with it other penalties - storage must be
 allocated for the ref count, every copy increments the count, and every
 reference that goes out of scope must decrement the count. Add in 
 exception
 handling, and the price is high (although C++'s mechanisms hide that price
 from you).

 Ref counting would make it impractical to do D's array slices.

 Furthermore, in the presence of garbage collection, layering on top a
 reference counting mechanism probably means you'll want to ditch the gc 
 and
 go with a full ref counting architecture for every object. In my 
 experience,
 such is slower than using mark/sweep gc.

Yes, GC does a good job. In some places. In other places ref-counting is
better.
An ideal language should allow both. Period.
Everything has its price:
the more objects are allocated (your .dup advice), the slower the GC's
scanning of them will be.
And I am not sure which is actually faster in the big picture - ref-counting
for strings or GC.

rather
defeat. At least in the real-life projects I can test by hand. But in
abstract tests - everything
is just perfect.
The one thing I know: the fewer GC cycles, the better, as a cycle locks
everything at an unpredictable moment. Ref-counting has a price, but
this price is acceptable as it is predictable, accountable and
evenly spread.

The best solution is, as always, in the middle - in the balance
between GC and non-GC.

If I have a vector of passive elements (chars), I would go with
ref-counting to create an envelope that is safe to pass back and forth.
If I have a container of active elements (objects) with a
complex and sometimes unknown system of relationships,
I'll go with GC to avoid headaches with cyclic references,
broken pointers and so on.

Strings are strange types; they are both wave and particle -
scalar and aggregate at the same time.

A string as a wrapper-owner of a character buffer allows you
somehow (not ideally!) to work with the string in both of its
forms, balancing between str1 = str2, str1 == str2 and
str1.ptr == str2.ptr.

Back to const.
Having built-in arrays and slicing creates the *prerequisites*
for optimal or near-optimal string handling.
But slicing, for example, is worth nothing without const (for strings
especially).
See: I've found some string fragment and passed it to some function.
This function does something and passes it further. All these
functions were written with good intentions by good programmers.
But these programmers live 12 time zones apart.
The only feasible way for them to communicate is self-documenting code.
Someone decided that this particular string was safe to zero-terminate
in place. Everything is ruined. Finding the source of it is not trivial.
I bet that the second time it happens, D will be dead
for the project. When it happened to me the first time,
I decided to write a string wrapper emulating constness.
JUST NO WAY IN D. Neither technically nor theoretically.
Neither '=' overload (to implement ownership and refcounting)
nor const. Nothing.  Dead end.
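
A minimal sketch of the scenario above, in present-day D. The function names firstWord and terminateInPlace are hypothetical and stand in for code written by two different people. The slice is only a view, so the in-place zero termination writes into the caller's original buffer.

```d
import std.stdio;

// Hypothetical helper written by a teammate who assumes the buffer
// is theirs to edit.
void terminateInPlace(char[] s)
{
    if (s.length > 0)
        s[$ - 1] = '\0';   // writes through to the caller's memory
}

// Returns a slice (a view) of the first word; no copy is made.
char[] firstWord(char[] text)
{
    foreach (i, c; text)
        if (c == ' ')
            return text[0 .. i];
    return text;
}

void main()
{
    char[] line = "hello world".dup;  // mutable copy of the literal
    char[] word = firstWord(line);    // aliases line[0 .. 5]
    terminateInPlace(word);
    assert(line[4] == '\0');          // the original line was modified
    writeln(line.ptr == word.ptr);    // both start at the same address
}
```

Nothing in either signature tells the caller that terminateInPlace modifies its argument, which is the gap a const qualifier on char[] is meant to close.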

char[] is not a string - it is an array of chars.

The usage pattern of a string is quite different from that of an array.
As a rule, an array is the heart of some container and is pretty
frequently already wrapped. But strings are flying
everywhere. D must have const for arrays and pointers
to be considered a language for teams and serious
projects.

IMHO.

Jun 01 2005

Kramer <Kramer_member pathlink.com> writes:

Well put.  Again, I think a point has been made for having a facility in the
language to say "this thing shouldn't change value".  I understand that some
devious programmer can find a way to change something that the compiler verified
shouldn't be changed.  But I think that programmer is in the minority and the
majority of programmers could use some self-documenting help from the
language/compiler (and the devious programmer is specifically and intentionally
going outside of the program specification, which I'd reckon could somehow be
done in any language).

Again, my $0.02.  But I think many have put in their $0.02 and will continue to
do so because many believe it's an important concept.  When will all the $0.02
contributions add up to be enough?

-Kramer

Jun 01 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

 ....for having a facility in the
 language to say "this thing shouldn't change value".

exactly.

It is enough to have  const T[]  and const T* as types distinct from plain
T[]  and T* .

A const T[] type has no opIndexAssign, no length(int) setter, and cannot be
an lvalue at all.
Simple as 1-2-3. I really don't understand the motivation for not
having them.

String literals are const char[] by definition.
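
A rough sketch of the operations such a type would and would not expose, modelled as a library struct in present-day D. ReadOnlyChars and slice are hypothetical names, and this is only an approximation of the proposal: it is a wrapper, not a language-level const T[], and 'private' in D is module-wide, so code in the same module can still reach the buffer.

```d
import std.stdio;

// Hypothetical read-only view over a char buffer.
struct ReadOnlyChars
{
    private char[] data;

    this(char[] source) { data = source; }

    // Element read is allowed; the char is returned by value.
    char opIndex(size_t i) const { return data[i]; }

    // Length can be queried; there is deliberately no setter.
    size_t length() const { return data.length; }

    // A sub-range is again a read-only view, not a bare char[].
    ReadOnlyChars slice(size_t lo, size_t hi)
    {
        return ReadOnlyChars(data[lo .. hi]);
    }
}

void main()
{
    // Writes through the view are rejected at compile time.
    static assert(!__traits(compiles, { ReadOnlyChars v; v[0] = 'x'; }));
    static assert(!__traits(compiles, { ReadOnlyChars v; v.length = 3; }));

    auto v = ReadOnlyChars("hello world".dup);
    auto w = v.slice(0, 5);
    assert(w.length == 5 && w[4] == 'o');
    writeln(w[0]);
}
```

The wrapper has neither an opIndexAssign nor a length setter, so both writes fail to compile.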

Andrew.

Jun 01 2005

Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:

Andrew Fedoniouk wrote on Wed, 1 Jun 2005 17:55:04 -0700:
 ....for having a facility in the
 language to say "this thing shouldn't change value".

 exactly.

 It is enough to have  const T[]  and const T* as distinct types from just
 T[]  and T* .

 const T[] type has no opIndexAssign, length(int) and cannot be lvalue at 
 all.
 Simple as 1-2-3. I really don't understand what is the motivation to do not 
 have them.

 String literals are const char[] by definition.

Sadly it's not that simple. The contents of string literals are known at
compile time and can thus be placed in an OS-protected memory area.

If array contents are mutable by default, the const attribute for
arrays is a mere suggestion. Do a bit of pointer math or store arrays as
elements in other arrays and the const attribute loses its effect.
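
A small sketch of the pointer-math escape hatch, in present-day D. The function name scribble is hypothetical. It receives only a three-element slice, yet it reaches the rest of the parent buffer through the raw pointer, where no bounds check applies.

```d
import std.stdio;

// Hypothetical helper: receives a 3-element slice, but reaches past its
// end through the raw pointer. Pointer indexing is not bounds-checked.
void scribble(char[] head)
{
    char* p = head.ptr;
    p[4] = 'X';   // lands in the parent buffer, outside head
}

void main()
{
    char[] buf = "abcdef".dup;
    scribble(buf[0 .. 3]);
    assert(buf[4] == 'X');   // memory the slice never covered has changed
    writeln(buf);
}
```

Any attribute attached to the slice type says nothing about what is done with the raw pointer obtained from it.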

If array contents are immutable by default - that is, once an element
is set it can't be changed - a "mutable" attribute could be used to
allow the editing of array elements. In contrast to the current system,
the compiler could then allow only those cases where it can _prove_ that
the array is mutable. What happens if you store pointers/object
references in the array?!
In addition, this has some negative impact on mixed closed/open
source projects, as the compiler would have to treat all arrays coming
from the closed-source part as immutable.

The third way is to allow only assigning to a variable, never changing
it. It's neat, but pointer math gets in the way again.

Thomas

Jun 02 2005

kris <fu bar.org> writes:

Thomas Kuehne wrote:
 If the array content is by default immutable - that is, once an element
 is set it can't be changed - the "mutable" attribute could be used to
 allow the editing of array element. 

Now there's a different idea. Talk about CoW enforcement :-)

 In difference to the current system
 the compiler could only allow those cases where it can _prove_ that the
 array is mutable. What happens if you store pointers/object
 references in the array?!

Doesn't have to prove anything if the mutable aspect is part of the 
type; right, Thomas? Aliases upon the array still have to go through the 
same type-matching procedure; Yes?

If you cast an immutable type to a mutable type, then all bets are off; 
just as they are with *cast(void *)0 = 0;

 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.

Is it still an issue if third-party code can be declared/proto-typed 
appropriately?

Jun 02 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"kris" <fu bar.org> wrote in message news:d7mbac$c2d$1 digitaldaemon.com...
 Thomas Kuehne wrote:
 If the array content is by default immutable - that is, once an element
 is set it can't be changed - the "mutable" attribute could be used to
 allow the editing of array element.

 Now there's a different idea. Talk about CoW enforcement :-)

 In difference to the current system
 the compiler could only allow those cases where it can _prove_ that the
 array is mutable. What happens if you store pointers/object
 references in the array?!

 Doesn't have to prove anything if the mutable aspect is part of the type; 
 right, Thomas? Aliases upon the array still have to go through the same 
 type-matching procedure; Yes?

Thanks, Kris. This is exactly what I have in my mind.

 If you cast an immutable type to a mutable type, then all bets are off; 
 just as they are with *cast(void *)0 = 0;

 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.

 Is it still an issue if third-party code can be declared/proto-typed 
 appropriately?

This "all arrays coming from the closed source part as immutable" sounds
like Wagner's Ride of the Valkyries.
I can feel two layers of meaning there but only managed to get one :).
Thomas, for D's sake, what was it all about?

Andrew.

Jun 02 2005

Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:

Andrew Fedoniouk wrote on Thu, 2 Jun 2005 00:51:30 -0700:
 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.

 Is it still an issue if third-party code can be declared/proto-typed 
 appropriately?

 This "all arrays comming from the closed source part as immutable." sounds
 like Ride of the Valkyries from Wagner.
 I can feel two layers of sense there but only managed to get one :).
 Thomas, for the D sake, what it was all about?

If mutability is a "suggestion" then there is no problem with open
source projects that use closed-source libs. However, if mutability is to
be strictly enforced, we would either need runtime checks or would have
to treat the arrays from the closed-source part as immutable.

Thomas

Jun 02 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Thomas Kuehne" <thomas-dloop kuehne.this-is.spam.cn> wrote in message 
news:eb43n2-na4.ln1 lnews.kuehne.cn...
 Andrew Fedoniouk wrote on Thu, 2 Jun 2005 00:51:30 -0700:
 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays comming
 from the closed source part as immutable.

 Is it still an issue if third-party code can be declared/proto-typed
 appropriately?

 This "all arrays comming from the closed source part as immutable." 
 sounds
 like Ride of the Valkyries from Wagner.
 I can feel two layers of sense there but only managed to get one :).
 Thomas, for the D sake, what it was all about?

 If mutability is a "suggestion" than there is no problem with open
 source projects that use close source libs. However if mutability is to
 be strictly enforced, we would either need runtime checks or would have
 to treat the arrays from the closed source part as immutable.

 Thomas

I think I see now what you mean ...
Thanks, Thomas.

Jun 02 2005

Brad Beveridge <brad somewhere.net> writes:

Here are my thoughts - if people would like to hear them :)

1) Everyone that has posted to this thread agrees that there needs to be 
a language feature that prevents code from changing the value of a 
variable/array.  Call it readonly, const, whatever.

2) There have been no suggestions on how to actually enforce 
immutability in a manner that is 100% correct - there are always 
loopholes that a devious programmer can go through.  From my point of 
view, the only truly correct way of enforcing constness is to place the 
memory in question under the protection of the hardware MMU - which IMHO 
would be very wasteful of memory & probably non-trivial to implement.

3) C++ errors at compile time for const violations and that is generally 
what people would like from D also.

I think that we need to face reality - there is no easy solution that is 
actually 100% correct.  However, since D is a practical language we 
don't need to be 100% correct, we only need enough protection to prevent 
accidental bugs.  Given a language that allows pointers, a programmer 
will be able to break constness.  If you need to write a completely 
robust library that malicious programmers will _try_ to break, then 
relying on const will never cut it.  You must add your own protection 
mechanisms - like always duping memory before returning it.  But I would 
say that this kind of environment is extremely rare.

So I think that a simple const mechanism that aims only to help prevent 
bugs, and offers no actual assurances about true constness, ought to be 
enough to satisfy most people.

I am sure that I don't grasp the full implications of how to implement 
this, but honestly it doesn't sound that hard.  I've attached a simple 
example that very nearly does simple protection by only using the 
built-in type system.  If you could implicitly promote char -> 
const_char then this would pretty much work.

Does this sound workable - or am I being totally naive?

Brad

import std.stdio;

typedef char const_char;

int main (char[][] arg)
{
     char[] plain = "This is a plain char string";
     const_char[] strange = cast(const_char[])"This is a const_char string";

     plain = foo(plain);
     strange = foo(strange);

     // no implicit casting, so the below doesn't work
     //char[] test = strange;
     //const_char[] test2 = plain;
     return 0;
}

char[] foo(char[] f)
{
     // normally, writefln would only take const_char[]'s
     // but char[] would be allowed to be implicitly promoted to
     // const_char[]
     writefln ("char[] is ", f);
     return f;
}

const_char[] foo(const_char[] f)
{
     // normally, writefln would only take const_char[]'s - so
     // wouldn't need to cast
     writefln ("const_char[] is ", cast(char[])f);
     return f;
}

Jun 02 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Brad Beveridge" <brad somewhere.net> wrote in message 
news:d7nbhs$1p2o$1 digitaldaemon.com...
 Here are my thoughts - if people would like to hear them :)

 1) Everyone that has posted to this thread agrees that there needs to be a 
 language feature that prevents code from changing the value of a 
 variable/array.  Call it readonly, const, whatever.

 2) There have been no suggestions on how to actually enforce immutability 
 in a manner that is 100% correct - there are always loopholes that a 
 devious programmer can go through.  From my point of view, the only truely 
 correct way of enforcing constness is to place the memory in question 
 under the protection of the hardware MMU - which IMHO will be very 
 wasteful on memory & probably non-trivial to implement.

 3) C++ errors at compile time for const violations and that is generally 
 what people would like from D also.

 I think that we need to face reality - there is no easy solution that is 
 actually 100% correct.  However, since D is a practical language we don't 
 need to be 100% correct, we only need enough protection to prevent 
 accidental bugs.  Given a language that allows pointers, a programmer will 
 be able to break constness.  If you need to write a completely robust 
 library that malicious programmers will _try_ to break, then relying on 
 const will never cut it.  You must add your own protection mechanisms - 
 like always duping memory before returning it.  But I would say that this 
 kind of environment is extremely rare.

 So I think that a simple const mechanism that aims only to help prevent 
 bugs, and offers no actual assurances about true constness, ought to be 
 enough to satisfy most people.

Exactly. And again, we don't need it at the scale it was done in C++.
Classes can be protected "by hand", and in fact protection for
class instances is already there: setters/getters, private/public, etc.

Again, we just need const for T[] and T*.

See: char[] is an atomic type in the sense that it is built into D.
But arrays are implemented differently from other scalars -
they are references to memory locations. And this referencing adds
one more 'dimension' of requirements - read-onlyness.

This is exactly like the function parameter attributes in and out.
Walter said A but is shy to say B.
Or I don't understand something.

Andrew.

Jun 02 2005

Sean Kelly <sean f4.ca> writes:

In article <d7nbhs$1p2o$1 digitaldaemon.com>, Brad Beveridge says...
I think that we need to face reality - there is no easy solution that is 
actually 100% correct.  However, since D is a practical language we 
don't need to be 100% correct, we only need enough protection to prevent 
accidental bugs.

..
So I think that a simple const mechanism that aims only to help prevent 
bugs, and offers no actual assurances about true constness, ought to be 
enough to satisfy most people.

My impression of Walter is that he doesn't like solutions that are just okay,
particularly if they add significant compiler complexity.  And while I think
this is probably feasible for arrays, it's a slippery slope from there to
logical const-ness for user defined objects.  Personally, I'll take anything I
can get, but I think the likelihood we'll see this for 1.0 is quite slim.


Sean

Jun 02 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Sean Kelly" <sean f4.ca> wrote in message 
news:d7nkra$22fh$1 digitaldaemon.com...
 In article <d7nbhs$1p2o$1 digitaldaemon.com>, Brad Beveridge says...
I think that we need to face reality - there is no easy solution that is
actually 100% correct.  However, since D is a practical language we
don't need to be 100% correct, we only need enough protection to prevent
accidental bugs.

 ..
So I think that a simple const mechanism that aims only to help prevent
bugs, and offers no actual assurances about true constness, ought to be
enough to satisfy most people.

 My impression of Walter is that he doesn't like solutions that are just 
 okay,
 particularly if they add significant compiler complexity.

Ummm... Is it in fact so complex? I think everything needed is already 
there.

 And while I think
 this is probably feasible for arrays, it's a slippery slope from there to
 logical const-ness for user defined objects.

We don't need this for UDOs. Such objects already have all that is needed
for encapsulation and protection. Basic types, on the contrary, are naked
now. You cannot define methods/envelopes for them. const is not a 100%
solution in terms of real protection. But there are no 100% solutions at
all, even in D. It is a compromise and a good one - it costs nothing at
runtime and almost nothing at compile time.

Personally, I'll take anything I
 can get, but I think the likelihood we'll see this for 1.0 is quite slim.

Too bad if we don't have this in 1.0. Without const for arrays and
pointers, D's feature set is incomplete. You can write word counters
spanning two or three pages, but for serious projects with many developers
involved, const (especially for strings) is a must-have feature. I have
been managing GUI projects for the last 10 years and I am certain.
Fighting three bugs in Harmonia took me one week. Two of them
were string/array corruptions. And I was designing it all by myself!
I don't want to share this experience with other team members, where
the probability of such things is higher.

Again, D is a good language. I would say D is near perfect, especially
for GUI.
But this particular area (constness of built-in types) is not finished,
as there are no workarounds at all to protect references.
Advice to spread .dups everywhere I will leave to the .NET/Java
community - there it will be accepted.

Andrew.

Jun 02 2005

Sean Kelly <sean f4.ca> writes:

In article <d7nmto$24n4$1 digitaldaemon.com>, Andrew Fedoniouk says...
"Sean Kelly" <sean f4.ca> wrote in message 
news:d7nkra$22fh$1 digitaldaemon.com...
 In article <d7nbhs$1p2o$1 digitaldaemon.com>, Brad Beveridge says...
I think that we need to face reality - there is no easy solution that is
actually 100% correct.  However, since D is a practical language we
don't need to be 100% correct, we only need enough protection to prevent
accidental bugs.

 ..
So I think that a simple const mechanism that aims only to help prevent
bugs, and offers no actual assurances about true constness, ought to be
enough to satisfy most people.

 My impression of Walter is that he doesn't like solutions that are just 
 okay,
 particularly if they add significant compiler complexity.

Ummm... Is it in fact so complex? I think everything needed is already 
there.

I don't think it is complex.  I was thinking of object const-ness when I wrote
this.  Sorry for the confusion.
 And while I think
 this is probably feasible for arrays, it's a slippery slope from there to
 logical const-ness for user defined objects.

Don't need this for UDO. Such objects have all needed for encapsulation/
protection already. Basic types in contrary are naked now. You cannot
define methods/envelopes for them. const is not 100% solution in terms
of real protection. But there are no 100% solutions at all, even in D.
It is a compromise and a good one - it costs nothing in runtime and
almost nothing in compile time.

I disagree that UDO have what's needed in terms of protection.  Though logical
const-ness is far from simple to implement.


Sean

Jun 02 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Sean Kelly" <sean f4.ca> wrote in message 
news:d7nod5$263o$1 digitaldaemon.com...
 In article <d7nmto$24n4$1 digitaldaemon.com>, Andrew Fedoniouk says...
"Sean Kelly" <sean f4.ca> wrote in message
news:d7nkra$22fh$1 digitaldaemon.com...
 In article <d7nbhs$1p2o$1 digitaldaemon.com>, Brad Beveridge says...
I think that we need to face reality - there is no easy solution that is
actually 100% correct.  However, since D is a practical language we
don't need to be 100% correct, we only need enough protection to prevent
accidental bugs.

 ..
So I think that a simple const mechanism that aims only to help prevent
bugs, and offers no actual assurances about true constness, ought to be
enough to satisfy most people.

 My impression of Walter is that he doesn't like solutions that are just okay,
 particularly if they add significant compiler complexity.

Ummm... Is it in fact so complex? I think everything needed is already there.

 I don't think it is complex.  I was thinking of object const-ness when I wrote
 this.  Sorry for the confusion.

Why sorry? Anyway...

 And while I think
 this is probably feasible for arrays, it's a slippery slope from there to
 logical const-ness for user defined objects.

Don't need this for UDO. Such objects have all needed for encapsulation/
protection already. Basic types in contrary are naked now. You cannot
define methods/envelopes for them. const is not 100% solution in terms
of real protection. But there are no 100% solutions at all, even in D.
It is a compromise and a good one - it costs nothing in runtime and
almost nothing in compile time.

 I disagree that UDO have what's needed in terms of protection.  Though logical
 const-ness is far from simple to implement.

In user-defined objects you can at least say 'protect': protect
some operations and functions from the outside.
This is compile-time protection, not runtime protection! At runtime you can
relatively easily get a pointer to a private variable and change it from outside.

Following the logic that we cannot reach 100% protection, we should remove
'private', 'package', etc. What were they introduced for? They
are not 100% protecting either!

You can do dynamic protection for UDOs: you can naturally
implement
"in this particular state of the object
this particular property is immutable, etc." Right?

What we all want is simple:
"by using const for arrays and pointers we want to make
some methods of const types 'private', i.e. not accessible". That is it.


 Sean

Jun 02 2005

"Regan Heath" <regan netwin.co.nz> writes:

Ok, side question, assume we're going to use 'const' (or any keyword) for  
array contents (something I've asked for in the past), how do we write it?

As it stands:
   const char[] bob;

'currently' makes the "array reference" constant. Would we then need:
   char[] const bob;

or perhaps
   char[const] bob;

or maybe as it's const it has to have an initialiser:
   char[] bob = "test";

(assuming of course that "test" is automatically constant)
or would that give an error and require the const keyword somewhere, as in  
one of the first two ideas.

Regan

Jun 02 2005

Brad Beveridge <brad somewhere.net> writes:

Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword) 
 for  array contents (something I've asked for in the past), how do we 
 write it?
 

I like
const char[] bob;

since we aren't C++, I don't think that we should split hairs quite so 
much about where the const (or readonly) keyword goes.
In D "const char[] bob", means that the contents of bob cannot be 
altered, the reference can't be changed, and you cannot slice out a 
chunk of bob unless you are slicing into another const char[].
Well it could mean that :)  It could also mean anything else!

Brad

Jun 02 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Brad Beveridge" <brad somewhere.net> wrote in message 
news:d7ntg7$2b5e$1 digitaldaemon.com...
 Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword) for 
 array contents (something I've asked for in the past), how do we write 
 it?

 I like
 const char[] bob;

 since we aren't C++, I don't think that we should split hairs quite so 
 much about where the const (or readonly) keyword goes.
 In D "const char[] bob", means that the contents of bob cannot be altered, 
 the reference can't be changed, and you cannot slice out a chunk of bob 
 unless you are slicing into another const char[].
 Well it could mean that :)  It could also mean anything else!

 Brad

Yep, only one:
"the reference can't be changed"
I think this is too strict.

const char[] Dolli = "McArtur"; // fine
/// marriage happens
Dolli = "O'Connor"; // should also be fine.
/// but an attempt to break Dolli's "private parts" ((C) Booch), i.e. to change
/// the value itself, will be an error:
Dolli[0] = '\0'; /// ERROR

Jun 02 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Thu, 2 Jun 2005 14:40:33 -0700, Andrew Fedoniouk  
<news terrainformatica.com> wrote:
 "Brad Beveridge" <brad somewhere.net> wrote in message
 news:d7ntg7$2b5e$1 digitaldaemon.com...
 Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword)  
 for
 array contents (something I've asked for in the past), how do we write
 it?

 I like
 const char[] bob;

 since we aren't C++, I don't think that we should split hairs quite so
 much about where the const (or readonly) keyword goes.
 In D "const char[] bob", means that the contents of bob cannot be  
 altered,
 the reference can't be changed, and you cannot slice out a chunk of bob
 unless you are slicing into another const char[].
 Well it could mean that :)  It could also mean anything else!

 Brad

 Yep, only one:
 "the reference can't be changed"
 I think this is too strict.

 const char[] Dolli = "McArtur"; // fine
 /// marriage happens
 Dolli = "O'Connor"; // should also be fine.
 /// but an attempt to break Dolli's "private parts" ((C) Booch), i.e. to change
 /// the value itself, will be an error:
 Dolli[0] = '\0'; /// ERROR

I don't understand what you're saying. Which of these, if any, do you  
think we need to be able to do:

1 - have a constant reference, to non-constant data
2 - have a non-constant reference, to constant data
3 - have a constant reference, to constant data

Bear in mind, when I say:

"constant reference" I mean a char[] which cannot be assigned to, or have its
length changed, e.g.

char[] foo = "bar";
foo = foo[1..3]; //illegal
foo.length = foo.length + 10; //illegal
foo[0] = 'a'; //ok

"constant data" I mean a char[] whose referenced data cannot be modified,
e.g.

char[] foo = "bar";
foo = foo[1..3]; //ok
foo.length = foo.length + 10; //ok
foo[0] = 'a'; //illegal

Regan

Jun 02 2005
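
A side note for readers mapping Regan's three cases onto present-day D. The following is only a sketch, assuming a D2 compiler, which postdates this thread; it is not the syntax anyone proposes here. D2 spells case 2 as const(char)[] and case 3 as const(char[]), and as far as I know it has no direct spelling for case 1, because its const is transitive. The helper prefix is a hypothetical name used for illustration.

```d
import std.stdio : writeln;

// Hypothetical helper: returns the first n characters as a read-only view.
// No copy is made; the result aliases the argument's data.
const(char)[] prefix(const(char)[] s, size_t n)
{
    return s[0 .. n];
}

void main()
{
    // Case 2: non-constant reference to constant data.
    const(char)[] name = "McArtur";
    name = "O'Connor";       // rebinding the slice compiles
    name = prefix(name, 2);  // so does reslicing
    static assert(!__traits(compiles, name[0] = 'x')); // element write is rejected

    // Case 3: constant reference to constant data.
    const(char[]) fixedName = "bar";
    static assert(!__traits(compiles, fixedName = "baz"));    // no rebinding
    static assert(!__traits(compiles, fixedName[0] = 'x'));   // no element write

    writeln(name, " / ", fixedName);
}
```

The static asserts pass only because the compiler rejects the guarded expressions, so the check costs nothing at runtime, which is the property Andrew keeps asking for.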

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Regan Heath" <regan netwin.co.nz> wrote in message 
news:opsrrjkrfx23k2f5 nrage.netwin.co.nz...
 On Thu, 2 Jun 2005 14:40:33 -0700, Andrew Fedoniouk 
 <news terrainformatica.com> wrote:
 "Brad Beveridge" <brad somewhere.net> wrote in message
 news:d7ntg7$2b5e$1 digitaldaemon.com...
 Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword) 
 for
 array contents (something I've asked for in the past), how do we write
 it?

 I like
 const char[] bob;

 since we aren't C++, I don't think that we should split hairs quite so
 much about where the const (or readonly) keyword goes.
 In D "const char[] bob", means that the contents of bob cannot be 
 altered,
 the reference can't be changed, and you cannot slice out a chunk of bob
 unless you are slicing into another const char[].
 Well it could mean that :)  It could also mean anything else!

 Brad

 Yep, only one:
 "the reference can't be changed"
 I think this is too strict.

 const char[] Dolli = "McArtur"; // fine
 /// marriage happens
 Dolli = "O'Connor"; // should also be fine.
 /// but an attempt to break Dolli's "private parts" ((C) Booch), i.e. to change
 /// the value itself, will be an error:
 Dolli[0] = '\0'; /// ERROR

 I don't understand what you're saying. Which of these, if any, do you 
 think we need to be able to do:

 1 - have a constant reference, to non-constant data
 2 - have a non-constant reference, to constant data
 3 - have a constant reference, to constant data




You can always wrap references in something.
Or you can choose to use in, inout, etc. to pass references.

But the values they refer to are not protected.


Makes sense?


Jun 02 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Thu, 2 Jun 2005 15:16:00 -0700, Andrew Fedoniouk  
<news terrainformatica.com> wrote:
 "Regan Heath" <regan netwin.co.nz> wrote in message
 news:opsrrjkrfx23k2f5 nrage.netwin.co.nz...
 I don't understand what you're saying. Which of these, if any, do you
 think we need to be able to do:

 1 - have a constant reference, to non-constant data
 2 - have a non-constant reference, to constant data
 3 - have a constant reference, to constant data




 you can always wrap references in something.
 Or you can choose to use in, inout, etc to passing references.

Yet 'in' does not prevent you from changing the parameter (a copy of the  
real reference) so you often get bugs where the programmer has desired  
change, implemented it, seen no errors from the compiler, and yet it  
fails. It would be nice to prevent this.

It seems a simple change to have the compiler treat 'in' as 'const in'  
erroring on changes to the parameter itself.

 But values they are referring to are not protected.


 Makes sense?

Perfect. I'm just not convinced we don't want all 3, yet.

Regan

Jun 02 2005
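
Regan's point about parameter copies can be seen in a few lines. This is a minimal sketch assuming present-day D (D2); shorten is a hypothetical helper, and the plain by-value parameter stands in for the 'in' of 2005, whose meaning has since changed.

```d
// Hypothetical helper: the author wants to shorten the caller's array,
// but the assignment only changes the local copy of the slice.
void shorten(char[] s)
{
    s = s[0 .. 1];
}

void main()
{
    char[] buf = "hello".dup;
    shorten(buf);
    assert(buf == "hello"); // the caller's slice is unchanged, with no diagnostic

    // If the slice itself is const, the same assignment is rejected at compile time.
    const char[] view = buf;
    static assert(!__traits(compiles, view = view[0 .. 1]));
}
```

A const parameter would turn the silent no-op in shorten into a compile error, which is the behaviour Regan is asking 'in' to have.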

Brad Beveridge <brad somewhere.net> writes:

Andrew Fedoniouk wrote:

 
 Yep, only one:
 "the reference can't be changed"
 I think this is too strict.
 
 const char[] Dolli = "McArtur"; // fine
 /// marriage happens
 Dolli = "O'Connor"; // should also be fine.
 /// but an attempt to break Dolli's "private parts" ((C) Booch), i.e. to change
 /// the value itself, will be an error:
 Dolli[0] = '\0'; /// ERROR

Though I think I get your point - this just doesn't feel right to me.
Assume for a moment that "const" in D means the simplest thing - the 
reference cannot be changed and the data cannot be changed.  Your 
example can be rewritten as:

char[] Dolli = "McArtur";
const char[] safeDolli = Dolli;
// do things that aren't allowed to change safeDolli
// Dolli gets married
Dolli = "O'Connor"; // safeDolli also is now "O'Connor"

I am all for things being simple, and to me the simplest use of const is 
to make both the reference and the data immutable.

Brad

Jun 02 2005
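
How a mutable array and a read-only view of it interact depends on the semantics chosen. As a point of comparison only, here is a sketch of what happens in present-day D (D2), which is an assumption about the compiler and not about the 2005 proposal. const(char)[] stands in for the hypothetical 'const char[]', and .dup is needed because string literals are immutable there.

```d
void main()
{
    char[] dolli = "McArtur".dup;
    const(char)[] safeDolli = dolli;   // read-only view of the same characters

    // Writes through the mutable alias are visible through the view.
    dolli[0] = 'm';
    assert(safeDolli == "mcArtur");

    // Rebinding the mutable variable does not carry over: the view still
    // refers to the old characters.
    dolli = "O'Connor".dup;
    assert(safeDolli == "mcArtur");

    // The view itself cannot be used to modify the data.
    static assert(!__traits(compiles, safeDolli[0] = 'M'));
}
```

Under these semantics a const view protects against writes through that view only; it makes no promise that the data stays the same.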

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Brad Beveridge" <brad somewhere.net> wrote in message 
news:d7o3d9$2fqb$1 digitaldaemon.com...
 Andrew Fedoniouk wrote:

 Yep, only one:
 "the reference can't be changed"
 I think this is too strict.

 const char[] Dolli = "McArtur"; // fine
 /// marriage happens
 Dolli = "O'Connor"; // should also be fine.
 /// but an attempt to break Dolli's "private parts" ((C) Booch), i.e. to change
 /// the value itself, will be an error:
 Dolli[0] = '\0'; /// ERROR

 Though I think I get your point - this just doesn't feel right to me.
 Assume for a moment that "const" in D means the simplest thing - the 
 reference cannot be changed and the data cannot be changed.  Your example 
 can be rewritten as:

 char[] Dolli = "McArtur";
 const char[] safeDolli = Dolli;
 // do things that aren't allowed to change safeDolli
 // Dolli gets married
 Dolli = "O'Connor"; // safeDolli also is now "O'Connor"

 I am all for things being simple, and to me the simplest use of const is 
 to make both the reference and the data immutable.

Take a look at this:

for( const int* p = ...; p < end; ++p )
  {
  }

You can enumerate but you cannot change.

Again, there are mechanisms for practical implementations of const
references in D now,
e.g. in, inout and out for parameters,

but there are no convenient and effective ways to protect the values they refer to.

Also, if you have a const ref to const data then you will not be able to do

foo( out const char[]  ), which is a rare but desirable use case.

Andrew.

Jun 02 2005

Brad Beveridge <brad somewhere.net> writes:

Andrew Fedoniouk wrote:

 You can enumerate but you cannot change.
 
 Again, there are mechanisms for practical implementations of const 
 references in D now
 e.g.: in, inout, out for parameters
 
 but there are no convenient and effective ways to protect reference values.
 
 Also if you have const ref on const data then you will not be able to do:
 
 foo( out const char[]  ) which is a rare but desirable use case.
 

You have convinced me :)
Const reference + const data is too simplistic.

Brad

Jun 02 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Thu, 02 Jun 2005 14:25:28 -0700, Brad Beveridge <brad somewhere.net>  
wrote:
 Regan Heath wrote:
 Ok, side question, assume we're going to use 'const' (or any keyword)  
 for  array contents (something I've asked for in the past), how do we  
 write it?

 I like
 const char[] bob;

 since we aren't C++, I don't think that we should split hairs quite so  
 much about where the const (or readonly) keyword goes.

I wasn't concerned so much with 'where' to type the keyword, but rather  
whether we need to be able to:

1 - have a constant reference, to non-constant data
2 - have a non-constant reference, to constant data
3 - have a constant reference, to constant data

If we need all of them, then:

"const char[] bob" - to me, means const ref (as it currently does in D)
"char[const] bob"  - (my fav suggestion so far) to me, means const data
"const char[const] bob" - const ref, const data

 In D "const char[] bob", means that the contents of bob cannot be  
 altered, the reference can't be changed



 , and you cannot slice out a chunk of bob unless you are slicing into  
 another const char[].

That would be illegal, surely?

"another const char[]" is a constant reference, so you cannot change it -  
unless you were thinking we might lift this restriction at initialisation  
time, eg.

const char[] abc = "abc";
const char[] bob = abc[0..2];

 Well it could mean that :)  It could also mean anything else!

Yeah, but we do have to make some temporary decisions in order to come to  
a temporary solution/idea in order to test it, before we can make any  
decisions.

Regan

Jun 02 2005

Brad Beveridge <brad somewhere.net> writes:

Regan Heath wrote:

 1 - have a constant reference, to non-constant data
 2 - have a non-constant reference, to constant data
 3 - have a constant reference, to constant data
 
 If we need all of them, then:
 
 "const char[] bob" - to me, means const ref (as it currently does in D)
 "char[const] bob"  - (my fav suggestion so far) to me, means const data
 "const char[const] bob" - const ref, const data
 
 In D "const char[] bob", means that the contents of bob cannot be  
 altered, the reference can't be changed

 
 




Sorry, I don't like "char[const] bob", simply because I feel it is too 
close to the AA syntax, eg, what is "char[const int] bob" going to do?

 
 , and you cannot slice out a chunk of bob unless you are slicing into  
 another const char[].

 
 
 That would be illegal, surely?
 
 "another const char[]" is a constant reference, so you cannot change it 
 -  unless you were thinking we might lift this restriction at 
 initialisation  time, eg.

I was thinking more along the lines of
const char[] str = "This is a string";

void foo (const char[] f){...}
void foo (char[] f){...}

foo(str[0..4]); // slices and calls the foo(const char[]) func

Brad

Jun 02 2005
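
Brad's overloading idea can be tried out in present-day D (D2). This is a sketch under that assumption; which is a hypothetical function name, and const(char)[] again stands in for the 'const char[]' discussed in the thread.

```d
// Two overloads that differ only in the constness of the element type.
string which(const(char)[] f) { return "const"; }
string which(char[] f)        { return "mutable"; }

void main()
{
    const(char)[] str = "This is a string";
    char[] buf = "This is a string".dup;

    // A slice of read-only data can only bind to the const overload.
    assert(which(str[0 .. 4]) == "const");

    // A slice of mutable data matches both; the exact match is preferred.
    assert(which(buf[0 .. 4]) == "mutable");
}
```

The slice keeps the constness of the array it was taken from, so no separate rule for slicing is needed in this model.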

"Regan Heath" <regan netwin.co.nz> writes:

On Thu, 02 Jun 2005 15:07:42 -0700, Brad Beveridge <brad somewhere.net>  
wrote:
 Regan Heath wrote:

 1 - have a constant reference, to non-constant data
 2 - have a non-constant reference, to constant data
 3 - have a constant reference, to constant data
  If we need all of them, then:
  "const char[] bob" - to me, means const ref (as it currently does in D)
 "char[const] bob"  - (my fav suggestion so far) to me, means const data
 "const char[const] bob" - const ref, const data

 In D "const char[] bob", means that the contents of bob cannot be   
 altered, the reference can't be changed







would err on the side of caution and, presuming no reason against, make it  
possible.

 Sorry, I don't like "char[const] bob", simply because I feel it is too  
 close to the AA syntax, eg, what is "char[const int] bob" going to do?

Good question. Options are AFAICS:

- illegal (raising the question of how to define constant data for an AA)
- keys to AA are const (not sure what that might mean yet)

I don't like either option myself..

So, exploring a syntax for enabling all 3 options, it looks like we have:





Presuming we need all 3, any problems with this syntax?

If not, the next question this raises is how/when is 'y' initialised. I  
recall Matthew (I think - don't quote me) asking for a 'const' that would  
allow initialisation in a constructor (class, or static module).

 , and you cannot slice out a chunk of bob unless you are slicing into   
 another const char[].

   That would be illegal, surely?
  "another const char[]" is a constant reference, so you cannot change  
 it -  unless you were thinking we might lift this restriction at  
 initialisation  time, eg.

 I was thinking more along the lines of
 const char[] str = "This is a string";

 void foo (const char[] f){...}
 void foo (char[] f){...}

 foo(str[0..4]); // slices and calls the foo(const char[]) func

Ahh, that is in fact the same thing :)

The slice is creating a new array, which is then initialised. So the rule  
might be "const arrays can be assigned only when they are initialised."

The passing of the new const char[] into the function would follow normal
rules from then on, as in, it's a const array so it's fine. Aside: a non-const
char[] could be passed to this function, as non-const can implicitly
be promoted to const (but not vice-versa).

Regan

Jun 02 2005

Derek Parnell <derek psych.ward> writes:

On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:

[snip]

 So, exploring a syntax for enabling all 3 options, it looks like we have:
 




I'm getting confused now; sorry. Do these three things mean ...






-- 
Derek
Melbourne, Australia
3/06/2005 8:53:25 AM

Jun 02 2005

"Regan Heath" <regan netwin.co.nz> writes:

On Fri, 3 Jun 2005 08:55:55 +1000, Derek Parnell <derek psych.ward> wrote:
 On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:

 [snip]

 So, exploring a syntax for enabling all 3 options, it looks like we  
 have:





 I'm getting confused now; sorry. Are these three things mean ...





Yep. Assuming we need all 3 of them.

Regan

Jun 02 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Regan Heath" <regan netwin.co.nz> wrote in message 
news:opsrrmmoii23k2f5 nrage.netwin.co.nz...
 On Fri, 3 Jun 2005 08:55:55 +1000, Derek Parnell <derek psych.ward> wrote:
 On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:

 [snip]

 So, exploring a syntax for enabling all 3 options, it looks like we 
 have:





 I'm getting confused now; sorry. Are these three things mean ...





 Yep. Assuming we need all 3 of them.

 Regan

Let's just have



It is quite enough.


always can be expressed by other methods.

E.g. in D, if you want to have a non-changeable reference field in a class,
you can always do

class Foo
{
    private char[] _bar;

    const char[] bar() { return _bar; }
}

and this is it.

Jun 02 2005
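
Andrew's accessor pattern, written out as a sketch for present-day D (D2), which is an assumption; the constructor and the main function are additions for illustration, and const(char)[] plays the role of his 'const char[]' return type.

```d
class Foo
{
    private char[] _bar;

    // Added for illustration: copies the text so the object owns its data.
    this(string s) { _bar = s.dup; }

    // Read-only view of the private field; callers cannot write through it.
    const(char)[] bar() const { return _bar; }
}

void main()
{
    auto f = new Foo("abc");
    assert(f.bar() == "abc");

    // Writing through the returned view is rejected at compile time.
    static assert(!__traits(compiles, f.bar()[0] = 'x'));
}
```

Because there is no setter, the field cannot be rebound from outside the module, and because the getter returns a const view, the characters cannot be changed through it either.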

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Andrew Fedoniouk" <news terrainformatica.com> wrote in message 
news:d7o4tf$2h13$1 digitaldaemon.com...
 "Regan Heath" <regan netwin.co.nz> wrote in message 
 news:opsrrmmoii23k2f5 nrage.netwin.co.nz...
 On Fri, 3 Jun 2005 08:55:55 +1000, Derek Parnell <derek psych.ward> 
 wrote:
 On Fri, 03 Jun 2005 10:20:53 +1200, Regan Heath wrote:

 [snip]

 So, exploring a syntax for enabling all 3 options, it looks like we 
 have:





 I'm getting confused now; sorry. Are these three things mean ...





 Yep. Assuming we need all 3 of them.

 Regan

 Let's just have



Sorry, above shall be read as:




 It is quite enough.


 always can be expressed by other methods .

 E.g. in D if you want to have non-changeable reference field in the class
 you can always do

 class Foo
 {
    private char[] _bar;

    const char[] bar() { return _bar; }
 }

 and this is it.

Jun 02 2005

Tom S <h3r3tic remove.mat.uni.torun.pl> writes:

Regan Heath wrote:
 So, exploring a syntax for enabling all 3 options, it looks like we have:
 
 <snip />
 
 1 - have a constant reference, to non-constant data
 2 - have a non-constant reference, to constant data
 3 - have a constant reference, to constant data 


What about:





It's consistent with the way D declarations are parsed by myBrain(tm)


-- 
Tomasz Stachowiak  /+ a.k.a. h3r3tic +/

Jun 02 2005

Sean Kelly <sean f4.ca> writes:

In article <d7nra6$291a$1 digitaldaemon.com>, Andrew Fedoniouk says...
In user defined objects at least you can say 'protect'. Protect
some operations and functions from outside
This is compile time protection and not runtime! . In runtime you can
relatively easy get a pointer to private variable and change it outside.

Following logic that we cannot reach 100% protection we shall remove
this 'private', 'package' etc. What for they were introduced? They
are not 100% protecting!.

I was thinking more of something like this:














'val' might be protected, but it can still be altered by calling setVal.  Since
D has no concept of logical const-ness, there is no way to verify at
compile-time that doNotChangeC actually did not change C.  Though as I
demonstrated in another thread, it is possible to verify this somewhat at
compile-time using DBC.
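
The listing this paragraph refers to did not survive in the archive. What follows is not Sean's code but a hypothetical sketch of the situation he describes, reusing only the names that appear in his prose (val, setVal, doNotChangeC, C); getVal and main are assumptions added so that it runs.

```d
class C
{
    private int val;

    int getVal() { return val; }      // assumed accessor, for the check below
    void setVal(int v) { val = v; }
}

// Nothing in the signature promises that c is left untouched.
void doNotChangeC(C c)
{
    c.setVal(42); // compiles, despite the function's name
}

void main()
{
    auto c = new C;
    doNotChangeC(c);
    assert(c.getVal() == 42); // the object was changed through its public setter
}
```

The field is private, yet the compiler has no way to tell the caller whether a function receiving the object will call the setter.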

You can do dynamic protection for UDO : you can naturally
implement:"in this particular state of object
this particular property is immutable, etc." Right?

Along those lines, I suppose this is an option:


























What we all want to have is simple:
"by using const for arrays and pointers we want to make
some methods of const types 'private' - not accessible". This is it.

True enough.  And this shouldn't be very hard to do.


Sean

Jun 02 2005

Sean Kelly <sean f4.ca> writes:

In article <d7nra6$291a$1 digitaldaemon.com>, Andrew Fedoniouk says...
What we all want to have is simple:
"by using const for arrays and pointers we want to make
some methods of const types 'private' - not accessible". This is it.

Actually, this would work for classes as well.


Sean

Jun 02 2005

Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:

kris wrote on Thu, 02 Jun 2005 00:15:46 -0700:
 Thomas Kuehne wrote:
 If the array content is by default immutable - that is, once an element
 is set it can't be changed - the "mutable" attribute could be used to
 allow the editing of array elements. 

 Now there's a different idea. Talk about CoW enforcement :-)

 In difference to the current system
 the compiler could only allow those cases where it can _prove_ that the
 array is mutable. What happens if you store pointers/object
 references in the array?!

 Doesn't have to prove anything if the mutable aspect is part of the 
 type; right, Thomas?

This isn't sufficient.

class C{
	<immutable> int i;
}

void bugger(){
	C c = new C;
	c.i = 2; // **ok**
	<mutable> int* i = &c.i;
	*i = 1; // **bug**
}

 Aliases upon the array still have to go through the 
 same type-matching procedure; Yes?

In the sense that "<mutable> t" and "<immutable> t" are two distinct
types.

 If you cast an immutable type to a mutable type, then all bets are off; 
 just as they are with *cast(void *)0 = 0;

How about enforcing that immutable types can only be cast to immutable
types?

<immutable> type[] a;
<immutable> void* b = cast(<immutable> void*) a; // legal, the target is immutable
<mutable> void* c = cast(<mutable> void*) a; // illegal, the target is mutable.

This isn't sufficient. Let's rewrite the sample.

class C{
	<immutable> int i;
}

void bugger2(){
	C c = new C;
	c.i = 2; // **ok**
	<immutable> size_t ptrI = cast(<immutable> size_t)(cast(<immutable> void*) &c.i);
	<mutable> size_t ptrM = ptrI;
	<mutable> void* v = cast(<mutable> void*) ptrM;
	<mutable> int* i = cast(<mutable> int*) v;
	*i = 1; // **bug, but legal if mutability is a plain attribute**
}

 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays coming
 from the closed source part as immutable.

 Is it still an issue if third-party code can be declared/proto-typed
 appropriately?

This would reduce the protection to a suggestion.

As can be seen: a simple <mutable> attribute isn't a sufficient
protection.

The compiler would have to do a quite extensive flow analysis to
provide even very limited mutable access.

If the "default <immutable>" is limited to arrays, how deep would the
array be protected? What about pointers as array elements?

class D{
	<mutable> int i;
}

<immutable> D[] o;
o.length=1;
o[0]= new D; // **ok**

o[0]= new D; // **bug**

o[0].i = 1; // legal or illegal?

Thomas


Jun 02 2005
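
Thomas's bugger2 argument, that a chain of casts through an integer defeats a purely type-level attribute, can be reproduced in present-day D (D2). This is a sketch under that assumption; launder is a hypothetical helper name, and const(int)* stands in for his '<immutable> int*'.

```d
// Hypothetical helper: strips const by going through an integer,
// following the same route as bugger2.
int* launder(const(int)* p)
{
    auto address = cast(size_t) p;   // pointer to integer
    return cast(int*) address;       // integer back to a mutable pointer
}

void main()
{
    int value = 2;
    const(int)* readOnly = &value;

    // The direct write is rejected by the type system.
    static assert(!__traits(compiles, *readOnly = 1));

    // The laundered write compiles and takes effect.
    *launder(readOnly) = 1;
    assert(value == 1);
}
```

The type check catches the accidental write and lets the deliberate one through, which is the trade-off the two sides of this exchange are arguing about.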

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Thomas Kuehne" <thomas-dloop kuehne.this-is.spam.cn> wrote in message 
news:a333n2-t84.ln1 lnews.kuehne.cn...
 kris wrote on Thu, 02 Jun 2005 00:15:46 -0700:
 Thomas Kuehne wrote:
 If the array content is by default immutable - that is, once an element
 is set it can't be changed - the "mutable" attribute could be used to
 allow the editing of array elements.

 Now there's a different idea. Talk about CoW enforcement :-)

 In difference to the current system
 the compiler could only allow those cases where it can _prove_ that the
 array is mutable. What happens if you store pointers/object
 references in the array?!

 Doesn't have to prove anything if the mutable aspect is part of the
 type; right, Thomas?

 This isn't sufficient.

 class C{
 <immutable> int i;
 }

 void bugger(){
 C c = new C;
 c.i = 2; // **ok**
 <mutable> int* i = &c.i;
 *i = 1; // **bug**
 }

 Aliases upon the array still have to go through the
 same type-matching procedure; Yes?

 In the sense that "<mutable> t" and "<immutable> t" are two distinct
 types.

 If you cast an immutable type to a mutable type, then all bets are off;
 just as they are with *cast(void *)0 = 0;

 How about enforcing that immutable types can only be cast to immutable
 types?

 <immutable> type[] a;
 <immutable> void* b = cast(<immutable> void*) a; // legal, the target is immutable
 <mutable> void* c = cast(<mutable> void*) a; // illegal, the target is mutable.

 This isn't sufficient. Let's rewrite the sample.

 class C{
 <immutable> int i;
 }

 void bugger2(){
 C c = new C;
 c.i = 2; // **ok**

 <immutable> size_t ptrI = cast(<immutable> size_t)(cast(<immutable> void*) &c.i);
 <mutable> size_t ptrM = ptrI;
 <mutable> void* v = cast(<mutable> void*) ptrM;
 <mutable> int* i = cast(<mutable> int*) v;
 *i = 1; // **bug, but legal if mutability is a plain attribute**
 }

class C {
   const int i = 20; // This already works perfectly today. It is a compile-time
                     // constant, and such a variable may not even have a
                     // location at runtime.
}

The only thing needed:

<immutable> int* ptrI = &someintvar;
*ptrI = 20; // here the compiler must generate an error: not an l-value.

And that is it. Enough for most cases. Casting, slicing and dicing to
remove constness should also be possible for masochistic use cases.

We should help good people instead of fighting bad ones.
Only then will we have a chance to see the transformation
of bad guys into good guys.  ( Pure Canadian statement :-)

Andrew.


Jun 02 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Thomas Kuehne" <thomas-dloop kuehne.this-is.spam.cn> wrote in message 
news:hsr2n2-ic3.ln1 lnews.kuehne.cn...
 Andrew Fedoniouk wrote on Wed, 1 Jun 2005 17:55:04 -0700:
 ....for having a facility in the
 language to say "this thing shouldn't change value".

 exactly.

 It is enough to have  const T[]  and const T* as distinct types from just
 T[]  and T* .

 const T[] type has no opIndexAssign, length(int) and cannot be lvalue at
 all.
 Simple as 1-2-3. I really don't understand what the motivation is for
 not having them.

 String literals are const char[] by definition.

 Sadly it's not that simple. The content of string literals is known at
 compile time and can thus be placed in an OS-protected memory area.

 If the array content is mutable by default, the const attribute for
 arrays is a mere suggestion. Do a bit of pointer math or store arrays as
 elements in other arrays and the const attribute loses its effect.

 If the array content is by default immutable - that is, once an element
 is set it can't be changed - the "mutable" attribute could be used to
 allow the editing of array elements. Unlike the current system,
 the compiler could only allow those cases where it can _prove_ that the
 array is mutable. What happens if you store pointers/object
 references in the array?!
 In addition this has some negative impact for mixed closed-open
 source projects as the compiler would have to treat all arrays coming
 from the closed source part as immutable.

 The third way is to allow only assigning and not changing any var. It's
 neat but pointer math gets in the way again.

 Thomas

Thomas, I am not proposing any flags here, whether changeable at run time or
merely existing there.
I was thinking aloud about them before, but not in this message.
I have already done tests with flags, and indeed they do not work in some cases.
Let's forget about them and focus on just a const type modifier.

const T[] and T[] have exactly the same binary layout at run time.

The only difference is at the compiler level (as in C++).
Let's imagine that we have

const char[] t = some_other_array;

uint find( const char[] where, const char[] what)
{
.....
}

uint replace( char[] where, const char[] from, const char[] to)
{
.....
}

find(t, "hello"); // fine
replace( t, "c++", "d" ); // bang! compile time error: cannot change const value.

The const type modifier is not too complicated.
Currently it is supported for scalars. We need to extend it
to arrays and pointers with a slightly modified meaning.
Close to what C++ has, but not exactly.
I think we don't need to go as far as const methods and the like; they are
too shaky even in C++.

Just a type modifier for POD types (as currently) and for
arrays and pointers, which in D are, generally speaking, also POD
types.

That is it. Not rocket science.
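
A minimal sketch of the find/replace contract above, assuming present-day D (D2), where the type is spelled const(char)[] rather than the `const char[]` proposed in this thread. The function bodies are hypothetical fill-ins for the elided ones (this `replace` only handles patterns of equal length), so read it as an illustration of the compile-time check, not as the proposal's actual semantics.

```d
// Index of the first occurrence of `what` in `where`, or where.length if absent.
size_t find(const(char)[] where, const(char)[] what)
{
    if (what.length > where.length)
        return where.length;
    foreach (i; 0 .. where.length - what.length + 1)
        if (where[i .. i + what.length] == what)
            return i;
    return where.length;
}

// Overwrites every occurrence of `from` with `to` in place and returns the
// number of replacements. Hypothetical simplification: equal lengths only.
size_t replace(char[] where, const(char)[] from, const(char)[] to)
{
    assert(from.length == to.length && from.length > 0);
    size_t count = 0;
    for (size_t i = 0; i + from.length <= where.length; )
    {
        if (where[i .. i + from.length] == from)
        {
            where[i .. i + from.length] = to[];
            i += from.length;
            ++count;
        }
        else
            ++i;
    }
    return count;
}

void main()
{
    char[] buf = "I like c++".dup;
    const(char)[] t = buf;          // read-only view, same binary layout

    assert(find(t, "c++") == 7);    // fine: find only reads

    // Passing the const view where a mutable array is required is rejected
    // at compile time.
    static assert(!__traits(compiles, replace(t, "c++", "dee")));

    // The owner of the mutable array may still change it; the view sees it.
    assert(replace(buf, "c++", "dee") == 1);
    assert(t == "I like dee");
}
```

Note that the const view does not freeze the data: it only stops writes made through that view.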

Andrew.

Jun 02 2005

Thomas Kuehne <thomas-dloop kuehne.this-is.spam.cn> writes:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Andrew Fedoniouk schrieb am Thu, 2 Jun 2005 00:38:54 -0700:
<snip>
 Thomas, I am not proposing here any flags changeable in runtime or existing 
 there.
 I was thinking aloud before about them but not in this message.
 I did tests already with flags - indeed they do not work in some cases.

The only way I am aware of to enforce const at runtime could be compared
to Java's String pool...

 Let's forget about them and focus on just a const type modifier

 const T[] and T[] has exactly the same binary layout in runtime.

 The only difference is on compiler level (as in C++).
 Lets imagine that we have

 const char[] t = some_other_array;

 uint find( const char[] where, const char[] what)
 {
 .....
 }

 uint replace( char[] where, const char[] from, const char[] to)
 {
 .....
 }

 find(t, "hello"); // fine
 replace( t, "c++", "d" ); // bang! compile time error: cannot change const value.

 const type modifier is not to much complicated
 currently it is supported for scalars. We need to extend it
 to arrays and pointers with slightly modified meaning.
 Close to what C++ has. But not exactly.

Now that is the tricky part. ;)
What would be the const rules for arrays and pointers/references?
How/When would those rules be checked and/or enforced?

Thomas


-----BEGIN PGP SIGNATURE-----

iD8DBQFCnuIR3w+/yD4P9tIRApaoAKCfmszRUHdzY+D1ZqBE28fWkQ1cBQCcDouw
Jjmq2a1Wkah6dhYchFArC+g=
=VHXo
-----END PGP SIGNATURE-----

Jun 02 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Thomas Kuehne" <thomas-dloop kuehne.this-is.spam.cn> wrote in message 
news:h943n2-na4.ln1 lnews.kuehne.cn...
 -----BEGIN PGP SIGNED MESSAGE-----
 Hash: SHA1

 Andrew Fedoniouk schrieb am Thu, 2 Jun 2005 00:38:54 -0700:
 <snip>
 Thomas, I am not proposing here any flags changeable in runtime or 
 existing
 there.
 I was thinking aloud before about them but not in this message.
 I did tests already with flags - indeed they do not work in some cases.

 The only way I am aware of to enforce const at runtime could be compared
 to Java's String pool...

 Let's forget about them and focus on just a const type modifier

 const T[] and T[] has exactly the same binary layout in runtime.

 The only difference is on compiler level (as in C++).
 Lets imagine that we have

 const char[] t = some_other_array;

 uint find( const char[] where, const char[] what)
 {
 .....
 }

 uint replace( char[] where, const char[] from, const char[] to)
 {
 .....
 }

 find(t, "hello"); // fine
 replace( t, "c++", "d" ); // bang! compile time error: cannot change const value.

 const type modifier is not to much complicated
 currently it is supported for scalars. We need to extend it
 to arrays and pointers with slightly modified meaning.
 Close to what C++ has. But not exactly.

 Now that is the tricky part. ;)
 What would be the const rules for arrays and pointers/references?
 How/When would those rules be checked and/or enforced?

When: at compile time. How: by generating a compile-time error.

How would they be checked? Exactly as right now. Imagine that
const char[] is just a typedef'ed char[].
Can the compiler check casting problems of typedef'ed types now? Yes.
That is it. Nothing more is needed here.

Just one more thing: string literals are const char arrays by their
nature and definition.

See, I am clearly expressing my intentions by defining contracts on parameters:
--------------------------------
foo(char[] ms)
foo(const char[] ims)
--------------------------------

const char[] s1 = url.hostname;
char[] s2 = url.hostname; // compile time error - cast to non-const

but this should be possible too (my guess):

char[] s2 = cast(char[]) url.hostname; // const is a recommendation, nothing more!

It would be better to have constcast() for such cases, but we can live without that.

Andrew.

Jun 02 2005

Eugene Pelekhay <pelekhay gmail.com> writes:

Walter wrote:
 "Eugene Pelekhay" <pelekhay gmail.com> wrote in message
 news:d7hfuh$1ejl$1 digitaldaemon.com...
 
Maybe I'm being dense, but I don't see in this example why these other
languages must copy it 10 times. With the reference-counted string
implementation in my C++ project, the copy will also be performed 0 times.
And if more than one reference to the instance exists, only one
copy operation will be performed. I see only one advantage in the current
string implementation: no need to check or increment/decrement a
reference counter. But in exchange, string duplication is required.

 
 
 You're right that you can avoid excessive copying by doing ref counting.
 
 Reference counting carries with it other penalties - storage must be
 allocated for the ref count, every copy increments the count, and every
 reference that goes out of scope must decrement the count. Add in exception
 handling, and the price is high (although C++'s mechanisms hide that price
 from you).

I know the price I pay for ref counting. But in my field, as in many
others, it is a reasonable price for being sure that my application will not
freeze for some time to perform garbage collection. Another thing which
is significant for me is destructor calls from the garbage collector. I
don't care when memory is released, but I want to be sure that the
destructor is called as soon as all references to an object are gone. If
the latter is not guaranteed then my code will look like Java, with an
enormous number of *try finally* blocks with forced calls to cleanup
methods. IMHO, to be successful the D language must shorten the development
cycle, and this is the only reason big bosses will understand.
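
A minimal sketch of the deterministic cleanup asked for above, assuming present-day D (D2), where a struct destructor runs when the value leaves its scope, including during exception unwinding, so no explicit try/finally is needed. `Handle` and `openHandles` are hypothetical names used only for this illustration.

```d
int openHandles; // counts live handles, to make destruction observable

struct Handle
{
    private bool live;

    this(int id)
    {
        live = true;
        ++openHandles;
    }

    // Runs deterministically at scope exit, independent of the GC.
    ~this()
    {
        if (live)
            --openHandles;
    }

    @disable this(this); // forbid copies so each handle is released once
}

void main()
{
    {
        auto h = Handle(1);
        assert(openHandles == 1);
    } // destructor runs here
    assert(openHandles == 0);

    try
    {
        auto h = Handle(2);
        assert(openHandles == 1);
        throw new Exception("boom");
    }
    catch (Exception e)
    {
        // The handle was already released while the stack unwound.
        assert(openHandles == 0);
    }
}
```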

 
 Ref counting would make it impractical to do D's array slices.

Yes, this is a nice feature, but someone (like me) can say that bitfields
in C are also a very nice feature that I use quite often (in the binary
exchange protocol between my devices). This is all about constness and CoW,
and without it I only see hard-to-find errors in code (if the code is more
than a simple one-page test.d). Don't think I'm against this nice feature.
 
 Furthermore, in the presence of garbage collection, layering on top a
 reference counting mechanism probably means you'll want to ditch the gc and
 go with a full ref counting architecture for every object. In my experience,
 such is slower than using mark/sweep gc.
  

Just remember all the fields where performance is important. I can list some
of them:
1) real-time systems programming (GC is not acceptable if it is not
deterministic)
2) games that make extensive use of the latest hardware (freezing for some
time at unpredictable moments: who will play such a game?)
3) embedded systems (resources are limited in most cases, so you need to
utilize all the resources you have)
4) scientific calculations

As you see, high performance requirements often go hand in hand with
deterministic execution time, which is not guaranteed in the case of GC,
or at least the existing implementation.

PS: For anyone who sees this post and thinks I want to rip the D language
apart: I don't. I have been praying for the success of D for the last ~5 years
and I'm here to bring all my experience to make it successful. I'm not
starting a flame war here, because I really like the language.

Jun 02 2005

Derek Parnell <derek psych.ward> writes:

On Tue, 31 May 2005 00:46:35 -0700, Walter wrote:


[snip]
 
 A number of languages use the immutable string idiom, and its corollary
 "always implicitly copy the string when writing to it". They all share
 another common characteristic - they're slow, and they're slow in a manner
 that is *not fixable*. And they're not just slower by a factor, many
 algorithms run *exponentially* slower because of the copying.
 
 D must be fast, and the only way to be fast with strings (and arrays) is to
 not have the language implicitly copy them, but to allow the programmer the
 flexibility to copy or not copy. To know when to copy, use the Copy On Write
 principle (COW). That is, if you're not *sure* you've got the only copy of a
 string, .dup it before modifying it.

I think there are two distinct aspects that are sometimes being confused or
mingled. One is the idea that the compiler must *prevent* read-only
variables from being modified, and the other is that the compiler must
*report* when it detects (during compilation) code that attempts (i.e. would
attempt at run time) to write to a read-only item.

The first idea is the subject of the CoW proposition above; that D takes
the position that the compiler is not responsible for this but that the
coder is. 
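
The copy-on-write convention described here can be sketched as follows. This is a minimal illustration assuming present-day D (D2) syntax; `toUpperAscii` is a hypothetical helper, not a library function. It returns the caller's array untouched when nothing needs to change and takes a private copy with .dup only before the first write.

```d
// Upper-cases ASCII letters, copying only if something must change.
const(char)[] toUpperAscii(const(char)[] s)
{
    char[] copy = null; // stays null until a write is actually needed
    foreach (i, c; s)
    {
        if (c >= 'a' && c <= 'z')
        {
            if (copy is null)
                copy = s.dup; // we are not the sole owner: copy first
            copy[i] = cast(char)(c - 'a' + 'A');
        }
    }
    if (copy is null)
        return s;   // nothing changed: zero copies made
    return copy;
}

void main()
{
    const(char)[] shouting = "HELLO";
    // No lowercase letters: the very same array comes back.
    assert(toUpperAscii(shouting).ptr is shouting.ptr);

    const(char)[] mixed = "Hello";
    auto upper = toUpperAscii(mixed);
    assert(upper == "HELLO");
    assert(mixed == "Hello");          // the input was left alone
    assert(upper.ptr !is mixed.ptr);   // exactly one copy was made
}
```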

But I still think that we haven't heard Walter's final position on the
second idea. So Walter, what if we could indicate which items we would like
to be read-only and if the compiler detects code which is writing to them,
it issues an error message? I know this would not cause these items to be
read-only, but it may help prevent silly coding errors such as the other
silly coding errors you've already added protection for in D.

-- 
Derek Parnell
Melbourne, Australia
31/05/2005 11:02:10 PM

May 31 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message 
news:djl0rio88ubd$.1v38dy1thwnze.dlg 40tude.net...
 On Tue, 31 May 2005 00:46:35 -0700, Walter wrote:


 [snip]

 A number of languages use the immutable string idiom, and its corollary
 "always implicitly copy the string when writing to it". They all share
 another common characteristic - they're slow, and they're slow in a 
 manner
 that is *not fixable*. And they're not just slower by a factor, many
 algorithms run *exponentially* slower because of the copying.

 D must be fast, and the only way to be fast with strings (and arrays) is 
 to
 not have the language implicitly copy them, but to allow the programmer 
 the
 flexibility to copy or not copy. To know when to copy, use the Copy On 
 Write
 principle (COW). That is, if you're not *sure* you've got the only copy 
 of a
 string, .dup it before modifying it.

 I think there are two distinct aspects that are sometimes being confused 
 or
 mingled. One is the idea that the compiler must *prevent* read-only
 variables from being modified, and the other is that the compiler must
 *report* when it detects (during compilation) code that attempts (i.e 
 would
 attempt at run time) to write to a read-only item.

 The first idea is the subject of the CoW proposition above; that D takes
 the position that the compiler is not responsible for this but that the
 coder is.

 But I still think that we haven't heard Walter's final position on the
 second idea. So Walter, what if we could indicate which items we would 
 like
 to be read-only and if the compiler detects code which is writing to them,
 it issues an error message? I know this would not cause these items to be
 read-only, but it may help prevent silly coding errors such as the other
 silly coding errors you've already added protection for in D.

Derek, this is exactly what const does in C++.
It does one more thing in fact. char[] and const char[] are
distinct types so you can do:

void foo( char[] s) { ... }
void foo( const char[] s)  { ... }

This also allows you to clearly show your intentions and so on.
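
A minimal sketch of that overload pair, assuming present-day D (D2), where the const array type is written const(char)[]. The return values are hypothetical markers added only to show which overload is selected.

```d
// Two overloads distinguished only by the constness of the elements.
string foo(char[] s)        { return "mutable"; }
string foo(const(char)[] s) { return "const"; }

void main()
{
    char[] m = "abc".dup;    // mutable array
    const(char)[] c = m;     // read-only view of the same data

    assert(foo(m) == "mutable"); // exact match on char[]
    assert(foo(c) == "const");   // a const view cannot bind to char[]
    assert(foo("literal") == "const"); // string literals are not mutable
}
```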






 -- 
 Derek Parnell
 Melbourne, Australia
 31/05/2005 11:02:10 PM

May 31 2005

Jan-Eric Duden <jeduden whisset.com> writes:

Walter wrote:
 "Andrew Fedoniouk" <news terrainformatica.com> wrote in message
 news:d7gtvf$qs0$1 digitaldaemon.com...
 
java.lang.String class has a) methods b) String owns buffer - it controls
buffer.

In D is possible:
int[char[]] map;
char[] s = "something";
map[s] = 1;
s[0] = '?'; // I have no idea what result will be. sure not good.

And you can bump into such a problem quite easily in D. I personally
have, many times. And the source is sometimes too hard to find.

In Java such collision is not possible in principle: String is final and
immutable.

 
 
 A number of languages use the immutable string idiom, and its corollary
 "always implicitly copy the string when writing to it". They all share
 another common characteristic - they're slow, and they're slow in a manner
 that is *not fixable*. And they're not just slower by a factor, many
 algorithms run *exponentially* slower because of the copying.
 
 D must be fast, and the only way to be fast with strings (and arrays) is to
 not have the language implicitly copy them, but to allow the programmer the
 flexibility to copy or not copy. To know when to copy, use the Copy On Write
 principle (COW). That is, if you're not *sure* you've got the only copy of a
 string, .dup it before modifying it.
 
 So why isn't that just as bad as the languages that implicitly copy on
 write? The answer is that often, you know that you are the sole owner, such
 as:
 
     char[] s = new char[10];
     for (i = 0; i < 10; i++)
         s[i] = 'c';
 
 Those other languages are doomed to make 10 copies of s. The D programmer
 needs to make 0 copies.
 
 As to your example above, when you pass a reference to a string to an
 associative array, then you aren't the sole owner of that string anymore.
 Don't change it. .dup it.
 
 

If D had a standard string class there wouldn't be any problem!

The string class would implement immutable strings without COW.

That's what you need in 99% of all applications.
Applications that are an exception to the rule should use char arrays
with dup if needed.

That's a win-win situation, with no performance hit.
The only thing that people need to get used to is that "normal" strings
are immutable, but that's not really hard to accept.
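
For comparison, a minimal sketch assuming present-day D (D2), where `string` is an alias for immutable(char)[] rather than a class. With an immutable key type, the associative-array problem quoted above goes away: the map keeps its own immutable snapshot, taken explicitly with .idup, and later edits to the mutable buffer cannot disturb it.

```d
void main()
{
    int[string] map;                 // keys are immutable(char)[]
    char[] s = "something".dup;      // a mutable working buffer

    map[s.idup] = 1;                 // store an immutable copy as the key
    s[0] = '?';                      // mutate the buffer afterwards

    assert(s == "?omething");
    assert(("something" in map) !is null); // the key is unaffected
    assert(map["something"] == 1);
    assert(("?omething" in map) is null);
}
```

The copy is explicit and happens once, at the point where ownership is shared, which matches the "dup it when you are not the sole owner" advice in the quoted text.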

Cheers,
Jan

Jun 03 2005

Derek Parnell <derek psych.ward> writes:

On Mon, 30 May 2005 22:50:29 -0700, Andrew Fedoniouk wrote:


[snip]

 class Url
 {
    char[] _hostname;
    ...
    char[] hostname() { return _hostname.dup; } // Doh!
 }
 
 if( url.hostname == "terrainformatica.com" )
 // 32 bytes less in memory, just to compare it!
   ....
 
 Ideal from many points of view would be a solution with const
 
 class Url {
   char[] _hostname;
 
   const char[] hostname() { return _hostname; } // Yep! this exactly what we 
 need.
 
 }
 

Given the current semantics of D, could a workaround be to give the
caller the choice, thus making them take explicit responsibility for their
usage?

 class Url
 {
    private char[] _hostname;
    ...
    char[] hostname_unsafe() { return _hostname; }
    char[] hostname()   { return _hostname.dup; }
 }

 char[] a;
 char[] b;
 a = url.hostname; // Gets a string with safety
 b = url.hostname_unsafe; // Gets a string without safety
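
A runnable version of this workaround might look like the sketch below, written in present-day D (D2) syntax. The constructor and the sample host name are assumptions added so the example is self-contained; the two accessors follow the names used above.

```d
class Url
{
    private char[] _hostname;

    // Hypothetical constructor, added only to make the sketch runnable.
    this(string host) { _hostname = host.dup; }

    // Hands out the internal buffer itself: fast, but the caller can
    // corrupt the object's state.
    char[] hostname_unsafe() { return _hostname; }

    // Hands out a private copy: safe, at the cost of one allocation.
    char[] hostname() { return _hostname.dup; }
}

void main()
{
    auto url = new Url("example.org");

    char[] a = url.hostname();        // a string with safety
    a[0] = 'X';
    assert(url.hostname() == "example.org");  // object unaffected

    char[] b = url.hostname_unsafe(); // a string without safety
    b[0] = 'X';
    assert(url.hostname() == "Xxample.org");  // object was modified
}
```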

<offtopic>
Of course, if we had function matching on return type, this would be a whole
lot easier and more legible.

 typedef char[] safe_string;
 
 class Url
 {
    private char[] _hostname;
    ...
    char[] hostname()      { return _hostname; }
    safe_string hostname() { return cast(safe_string)_hostname.dup; }
 }

 safe_string a;
 char[] b;
 a = url.hostname; // Gets a string with safety
 b = url.hostname; // Gets a string without safety

But I should wake up from this dream now ... -)
</offtopic>

-- 
Derek
Melbourne, Australia
2/06/2005 10:12:16 AM

Jun 01 2005

"Andrew Fedoniouk" <news terrainformatica.com> writes:

"Derek Parnell" <derek psych.ward> wrote in message 
news:lktv44kgbu1j.1tp9nxs3uqvmr.dlg 40tude.net...
 On Mon, 30 May 2005 22:50:29 -0700, Andrew Fedoniouk wrote:


 [snip]

 class Url
 {
    char[] _hostname;
    ...
    char[] hostname() { return _hostname.dup; } // Doh!
 }

 if( url.hostname == "terrainformatica.com" )
 // 32 bytes less in memory, just to compare it!
   ....

 Ideal from many points of view would be a solution with const

 class Url {
   char[] _hostname;

   const char[] hostname() { return _hostname; } // Yep! this exactly what 
 we
 need.

 }

 Given the current semantics of D, could a workaround be that we give the
 caller the choice, thus making them take explicit responsibility for their
 usage.

 class Url
 {
    private char[] _hostname;
    ...
    char[] hostname_unsafe() { return _hostname; }
    char[] hostname()   { return _hostname.dup; }
 }

 char[] a;
 char[] b;
 a = url.hostname; // Gets a string with safety
 b = url.hostname_unsafe; // Gets a string without safety

Derek, that is not a real choice.
Nobody in good mental health will write such a double implementation.
It is easier to switch to C++.

const char[] hostname()   { return _hostname; }
const char[] hostname = url.hostname;
const char[] level1 = hostname[$-3..$];
char[] level2 = hostname[0..$-3]; // bang!

We do have const for int, double, etc.
Why not for arrays and pointers? They are also primitive types
built into the language, right? Yes, this is a bit different in implementation,
but for the sake of consistency?

Jeez, I am losing a big project for D and Harmonia....
The team is voting for C++. Only because of const! :(
I simply do not have the moral right to insist further.
That was a really good chance to feed Harmonia....
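
A minimal sketch of this accessor, assuming present-day D (D2), where the return type is spelled const(char)[]. The constructor is a hypothetical addition so the example runs on its own. The accessor returns the internal buffer without copying, slices of it stay const, and conversion back to a mutable char[] is rejected at compile time.

```d
class Url
{
    private char[] _hostname;

    // Hypothetical constructor, added only to make the sketch runnable.
    this(string host) { _hostname = host.dup; }

    // No .dup: callers get a read-only view of the internal buffer.
    const(char)[] hostname() const { return _hostname; }
}

void main()
{
    auto url = new Url("terrainformatica.com");

    const(char)[] hostname = url.hostname();
    const(char)[] level1 = hostname[$ - 3 .. $]; // slicing keeps constness
    assert(level1 == "com");

    // Same memory on every call, so comparing costs no allocation.
    assert(hostname.ptr is url.hostname().ptr);
    assert(url.hostname() == "terrainformatica.com");

    // The "bang!" cases: no implicit conversion to a mutable array,
    // and no writes through the const view.
    static assert(!is(const(char)[] : char[]));
    static assert(!__traits(compiles, hostname[0] = 'X'));
}
```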


 <offtopic>
 Of course, if we had return type function matching this would be a whole
 lot easier and legible.

 typedef char[] safe_string;

 class Url
 {
    private char[] _hostname;
    ...
    char[] hostname()      { return _hostname; }
    safe_string hostname() { return cast(safe_string)_hostname.dup; }
 }

 safe_string a;
 char[] b;
 a = url.hostname; // Gets a string with safety
 b = url.hostname; // Gets a string without safety

 But I should wake up from this dream now ... -)
 </offtopic>

 -- 
 Derek
 Melbourne, Australia
 2/06/2005 10:12:16 AM

Jun 01 2005
