
digitalmars.D - OO & D strings

gbatyan yahoo.com writes:
D strings are dynamic char arrays.

There is no string class. I'm not yet sure whether that's good, but I'm slowly
coming to think it is, or at least that it's not necessarily bad.

On the other hand, there are plenty of operations one might wish to perform on
strings. If you work with strings in D as char[]'s, the code will look
more or less the same as in C, with loads of functions and char[]'s passed
to them.

My opinion: ugly code, non-OO, naming problems (whole lots of str... funcs).

I've seen a workaround attempt that introduces a String class (I've briefly
looked at dool).

Hmm, I think I have a problem with that idea. There are plenty of different
visions of string design (for example, whether or not to use copy-on-write, etc.).

What I'm pretty sure of is that I wouldn't like to be doomed to either:

- adhere to some library's string philosophy and use its string
handling throughout my whole code, or

- use my own string philosophy and always convert to/from other string
classes when using other libraries.

Besides, if I have a String class wrapping a char[], here's how I get
to the string contents, following all the indirections:

- Reference to String object
- real object ptr
- reference to char[]
- pointer to string bytes

Isn't that a bit too much, and inefficient?

------------
My solution proposal...

Apologies in advance for the poor wording, but you get the idea...

Wouldn't it be nice to have language constructs for being able to
_dynamically_ "bind" functions to some array TYPE[] in a particular scope
(module / {...} / .d file)?

Below is an imaginary syntax (the TYPE I'm interested in is char[], but there
is no reason why it couldn't be any other kind of array):


array_binding XYZ( TYPE[] )
{
    int index(char c) // body in place
    {
        // NOTE: 'this' here is of type TYPE[]
        return std.string.index(toStringz(this), c);
    }
    TYPE[][] split(char[] delim) : pkg.blah.blah.index; // bound to an existing function
}

.. and then use it like this....

..
apply_array_binding XYZ(char[]);

char[] foo = "I'm a simple char[]";

int pos = foo.index('a');
char[][] words = foo.split(" ");
..


-----------------------------
This way:
- You still work with char[]'s, so it's efficient, since everything
translates at compile time to real function calls.
- Your code looks more like OOP and more abstract; you can always
rewrite / extend your array_binding XYZ(...).
- No one is impacted by your vision of strings: no wrapper classes,
just char[].

Regards!
Dec 07 2004
pragma <pragma_member pathlink.com> writes:
------------
My solution proposal...

In advance my apologies for poor expression. but you get the idea...

Wouldn't it be nice to have language constructs for being able to
_dynamically_ "bind" functions to some array TYPE[] in a particular scope
(module / {...} / .d file)?

Now dynamic bindings are a cool idea, but I think that OO pundits (myself included) would probably cite this as a form of polymorphism. I know that what you're proposing is fundamentally different in the compiled code, but the overall effect could be achieved using more traditional constructs (at the cost of additional syntax of course). D, however, provides another way to do this. It has a quasi-documented feature that helps achieve what you want, at least where the end syntax is concerned:
 import std.stdio;   // writefln
 import std.string;  // toupper
 alias char[] string;
 string upcase(string str){ return std.string.toupper(str); }

 void main(){
     string foo = "hello world";
     writefln(foo.upcase()); // output: HELLO WORLD
 }

You don't need a full-blown class, since D will allow the first argument of a function to be used as an 'object' if that methodology is used. In this case, "foo.upcase()" is equivalent to "upcase(foo)". I also used the "string" alias here to better encapsulate "char[]", so it feels more like a class. It's as close as you'll get to attaching a 'method' to a primitive type in D.

The catch with this technique is that it only works for array types. Plus, there is no way to force users of the library to use the object-style technique; they can still use the free-function style all they want.

- Pragma [ ericanderton at yahoo ]
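
To make the equivalence between "foo.upcase()" and "upcase(foo)" concrete, here is a minimal sketch of the same technique. It is an illustration only, not code from the post above. It assumes a present-day D compiler, where string is already a built-in alias and upper-casing is done with std.uni.toUpper rather than the std.string.toupper used in this thread; countChar is a hypothetical helper made up for the example.

```d
import std.stdio : writeln;
import std.uni : toUpper;

// Free function: the array it "belongs" to is simply the first parameter.
string upcase(string str)
{
    return toUpper(str);
}

// Hypothetical helper (not part of Phobos): counts occurrences of c in str.
size_t countChar(string str, char c)
{
    size_t n = 0;
    foreach (char ch; str)
        if (ch == c)
            ++n;
    return n;
}

void main()
{
    string foo = "hello world";

    // Both spellings resolve to the same free function.
    assert(foo.upcase() == upcase(foo));
    assert(foo.countChar('o') == countChar(foo, 'o'));

    writeln(foo.upcase());       // HELLO WORLD
    writeln(foo.countChar('o')); // 2
}
```

Nothing ties upcase or countChar to the array type itself; the dot syntax is only another way of writing the call, so callers remain free to use either style.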
Dec 07 2004
gbatyan yahoo.com writes:
!!! This is great !!!

Why not stop the String class war and convince people to write library code this way? It's surely not very OO, but can someone give me a reason why a class wrapper around char[] should exist? (IMHO it's only OK if the class needs member variables beyond the information implicitly or explicitly contained in the char[].)

Well, it's fine to contribute a cool string class to the community, but why base huge libraries on it, thereby forcing the library user either to use this particular string class or to relentlessly convert string classes back and forth? It's rather likely for a project to use a couple of libraries that use different String objects. It's OK to implement some handy class and use it throughout a big library, but not for such a fundamental thing as strings, IMHO... (especially if such an elegant solution exists). Nearly 100% of software deals with strings in some way.

The catch with this technique is that it only works for array types.  Plus,
there is no way to force users of the library to use the object-style
technique;

they can still use the free-function style all they want.

Well, I'd prefer convincing someone rather than forcing :-) Besides, if a user 'misuses' strings this way (free-function style), it's more harmless than a library forcing plenty of users to use a particular string class. Normally, if I use someone's function, I'm not interested in what the code inside the function looks like; what's absolutely important to me is to have a uniform string interface throughout the majority of existing functions. In C, everyone knows what char* is, but I don't know and don't really want to know what String<this<and<that>>> is in some particular library :=)

Apologies for the somewhat radical way of putting it, but often enough I've had to deal with different implementations of some fundamental thing, and if there is a nice compromise that avoids introducing ANOTHER frequently used class to represent even more frequently used material, I'd hold to it, regardless of whether I'm writing one-way-ticket apps or reusable code.

I'd appreciate any kind of comments on my point of view, don't want to miss anything :)

Kind regards!
Dec 07 2004
"Regan Heath" <regan netwin.co.nz> writes:
On Tue, 7 Dec 2004 18:11:39 +0000 (UTC), <gbatyan yahoo.com> wrote:
 The catch with this technique is that it only works for array types.   
 Plus,
 there is no way to force users of the library to use the object-style
 technique;

:-)
 they can still use the free-function style all they want.

The debate on the String class has raged far and long.. basically it comes down to the following:

There exist 3 string types, built on char, wchar, and dchar. If I choose to use wchar in my library and someone else uses dchar in theirs and you write a program using char, then your program has to 'transcode' to/from all 3 string types everywhere you call my or the other guy's/gal's lib. Further, OS-level functions on different OSes use different string types; the later versions of Windows use wchar strings, for example.

This sort of transcoding is inefficient; ideally you only want to transcode on input and output, and only if you have to. So the appeal of a single String class is that everyone will use it and there will be no transcoding outside of input/output.

This could also be achieved with an 'official' alias, i.e.
 alias char[] string;

As D is not a single-paradigm language, the 'class' solution is not going to be accepted by the function & data people, so perhaps the alias is the only D solution. Looking at phobos, char[] appears to be the 'default' or 'recommended' string type. That said, if you're writing an international app with multiple languages etc., you'll probably want to use wchar or even dchar.

Regan
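
The transcoding cost described above can be illustrated with a short sketch. It is not taken from the thread. It assumes a present-day D compiler and Phobos, where the three string types are spelled string, wstring and dstring (char[], wchar[] and dchar[] in this discussion) and std.utf provides toUTF8, toUTF16 and toUTF32; the sample text is arbitrary.

```d
import std.stdio : writeln;
import std.utf : toUTF8, toUTF16, toUTF32;

void main()
{
    // "hello" with an accented e, written as an escape so the source stays ASCII.
    string  s8  = "h\u00e9llo";   // UTF-8 code units
    wstring s16 = toUTF16(s8);    // new array, re-encoded as UTF-16
    dstring s32 = toUTF32(s8);    // new array, re-encoded as UTF-32

    // Same text, different number of code units in each encoding.
    writeln(s8.length, " ", s16.length, " ", s32.length); // 6 5 5

    // Converting back gives the original text, at the price of another copy.
    assert(toUTF8(s16) == s8);
    assert(toUTF8(s32) == s8);
}
```

Every conversion here allocates and re-encodes the whole string, which is why keeping to one string type inside a program and converting only at the input/output boundary is attractive.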
Dec 07 2004
James McComb <ned jamesmccomb.id.au> writes:
Regan Heath wrote:

 This could also be achieved with an 'official' alias i.e.
 alias char[] string;

+1 for this idea. This is the same way Walter handled the bool issue. Should bools be bits, ints or a special class? D lets you use any of these but indicates that bit is the 'default' or 'preferred' type by making bool an alias for it.

My /personal/ preference is for distinct boolean and string types, but that could just be my object-oriented prejudice. This 'alias solution' would work well, and seems more in the 'Spirit of D'.

James McComb
Dec 07 2004
"Ben Hinkle" <bhinkle mathworks.com> writes:
<gbatyan yahoo.com> wrote in message news:cp3sp3$91o$1 digitaldaemon.com...
 D strings are dynamic char arrays.

 There is no string class. I'm not yet sure if it's good, but I slowly tend to
 think it's good and at least not necessarily bad.

 On the other hand there are plenty operations one might wish to perform with
 strings. If you work with strings in D as with char[]'s the code will look
 more or less the same as in c, having loads of functions and passing char[]'s
 to them.

 My opinion - ugly code, non-OO, naming problems (whole lots of str... funcs)

Have you seen http://www.digitalmars.com/d/phobos.html#string? It doesn't have str... funcs. If you run into name clashing, use the fully qualified name (e.g. std.string.find).

The argument for OO function-call syntax like str.func(...) instead of func(str,...) ranks up there with brace placement (i.e. on the same line or on the next line) in terms of importance IMO. The fact that arrays can use str.func currently is most likely a bug and is deceptive for code maintainers, since they might wonder which func gets called when dealing with overload resolution. Personally I'd only use foo.bar notation for structs and classes, as intended.

[snip]
Dec 07 2004