
digitalmars.D - D shouldn't be different from Java without good reason

reply James McComb <ned jamesmccomb.id.au> writes:
Hi. I wonder if there are any other developers out there who, like me, 
were initially very excited about D, but are now starting to worry that 
the language is maybe starting to go off the rails a bit...

Since D is advertised as a simpler, garbage-collected C++, it's natural 
to compare it with Java. So when D includes a Java-like 
feature, many developers (such as me) assume that the feature is going 
to be similar to the same feature in Java, unless Walter has some 
compelling reason to do things differently. Seen from this perspective, 
some features in D stick out like a sore thumb:

FEATURES I WOULD HAVE EXPECTED TO BE LIKE JAVA
(Is there really a compelling reason to do it differently?)

1. A type-safe boolean. Java has one, D doesn't. I know this has been 
discussed to death, but people coming from a Java background want to 
know: What is the compelling reason to not have a type-safe boolean?

2. Built-in string types. Java has one, D has THREE. This feels wrong, 
because it feels like a failure of encapsulation (the encoding of the 
string should be hidden from the programmer) and it doesn't scale (there 
is no utf-7 type). To put it another way... What is the compelling 
problem to which having three built-in string types is the solution?

I don't think that these differences from Java are necessary or beneficial.

James McComb
Nov 14 2004
next sibling parent reply Ben Hinkle <bhinkle4 juno.com> writes:
sigh.

oh well. here we go again...

 1. A type-safe boolean. Java has one, D doesn't. I know this has been
 discussed to death, but people coming from a Java background want to
 know: What is the compelling reason to not have a type-safe boolean?
Is type-safety really that important? eh. I don't notice. I can't recall making a type-safety bug when writing my D code like MinTL or the gmp wrappers or anything else. Maybe I've been lucky. I just type "bool" and get on with my coding.
 2. Built-in string types. Java has one, D has THREE. This feels wrong,
 because it feels like a failure of encapsulation (the encoding of the
 string should be hidden from the programmer) and it doesn't scale (there
 is no utf-7 type). To put it another way... What is the compelling
 problem to which having three built-in string types is the solution?
Java has StringBuffers and char[], too. So it has three types just to do what D does with one - in this case wchar[]. Actually in JDK 5 they introduced a new StringBuilder (a single-threaded version of StringBuffer). So now Java has four types for wchar[]. Given all the platform differences in defining char and wchar I think D's choices make a nice balance between platform dependence and platform independence. They are simple and fast - just right for good string handling. -Ben
Nov 14 2004
parent reply "Glen Perkins" <please.dont email.com> writes:
"Ben Hinkle" <bhinkle4 juno.com> wrote in message 
news:cn98ni$n3l$1 digitaldaemon.com...
 sigh.

 oh well. here we go again...
I haven't seen the reasoning behind either decision presented in the FAQ. If these are asked frequently enough to make you start your answers this way, wouldn't that be the fault of the FAQ, not the questioner? Clearly the rightness of these design decisions is not obvious, unlike so many other aspects of D's design, so most developers investigating D will at least wonder about them even if they don't ask.
 1. A type-safe boolean. Java has one, D doesn't. I know this has 
 been
 discussed to death, but people coming from a Java background want 
 to
 know: What is the compelling reason to not have a type-safe 
 boolean?
Is type-safety really that important? eh. I don't notice.
Well, hold on a minute. I have no position on this issue since I know so little about it, but I keep seeing "type safety" listed as a reason for why some feature in D is superior to its equivalent in C. If type safety isn't really important, then maybe D isn't, either. If it is important enough to cite repeatedly as an advantage, then what's the story with booleans? Again, I'm not arguing for or against this design decision. I haven't even looked at it. I have wondered about it, though, and if the justification is "type safety doesn't matter", then I have to wonder even more.
 2. Built-in string types. Java has one, D has THREE. This feels 
 wrong,
 because it feels like a failure of encapsulation (the encoding of 
 the
 string should be hidden from the programmer) and it doesn't scale 
 (there
 is no utf-7 type). To put it another way... What is the compelling
 problem to which having three built-in string types is the 
 solution?
Java has StringBuffers and char[], too. So it has three types just to do what D does with one - in this case wchar[]. Actually in JDK 5 they introduced a new StringBuilder (a single-threaded version of StringBuffer). So now Java has four types for wchar[].
This isn't quite right. The point is that Java has a single *default* string type, and that's a huge advantage. The number of non-defaults used for special cases is almost irrelevant. The default String in Java is used for almost everything and is almost always fast enough, which I define very practically. If the performance of String could improve dramatically without improving the app itself, then String is "fast enough". Anything more has no benefit, so if it has any cost, it is a net negative.

I said "almost always fast enough" because by my definition there are times when it is not: at many bottlenecks. In those cases, and only in those cases, I wish there were a lot more options. Java's performance problem is caused by severe constraints on optimization options that come from the unfortunate requirement that it be able to run identically and safely--as object code--anywhere. Thank goodness, D hasn't chosen to burden itself with such constraints. That means that D could go ahead and have a good default that was powerful, easy, and consistent, supplemented by a great toolbox of optimizations that would be used only at bottlenecks. After all, performance is only relevant at bottlenecks, while simplicity is relevant everywhere.

Unfortunately, having no default string type means that integrating code from multiple designers (as in using libraries, for example) will almost always result in the needless complexity of multiple text formats in non-bottleneck code. It won't do me much good to declare a personal default, either, if I use libraries written by others. I'm not sure what would make the best default string type for D, but I think that not having one will have costs. Whether those costs will turn out to be significant is hard to say for sure.
I suspect that D people will do what C people have done for years: overweight the things that are easy to measure (time to execute a toy for loop a million times) and underweight more important things that are harder to measure (time wasted debating which string type to use in non-bottleneck code, bugs introduced when trying to change string type used in module B source code to match module A, etc.) If so, the costs of this approach could end up being serious without it being recognized, even after the fact. Well, we'll see. Any way you look at it, it's a huge improvement over both C++ and Java in so many ways....
Nov 15 2004
parent Ben Hinkle <bhinkle4 juno.com> writes:
Glen Perkins wrote:

 
 "Ben Hinkle" <bhinkle4 juno.com> wrote in message
 news:cn98ni$n3l$1 digitaldaemon.com...
 sigh.

 oh well. here we go again...
I haven't seen the reasoning behind either decision presented in the FAQ. If these are asked frequently enough to make you start your answers this way, wouldn't that be the fault of the FAQ, not the questioner? Clearly the rightness of these design decisions is not obvious, unlike so many other aspects of D's design, so most developers investigating D will at least wonder about them even if they don't ask.
I haven't looked at the FAQ in a while. I'll check it out and add some rants there, too ;-)
 1. A type-safe boolean. Java has one, D doesn't. I know this has
 been
 discussed to death, but people coming from a Java background want
 to
 know: What is the compelling reason to not have a type-safe
 boolean?
Is type-safety really that important? eh. I don't notice.
Well, hold on a minute. I have no position on this issue since I know so little about it, but I keep seeing "type safety" listed as a reason for why some feature in D is superior to its equivalent in C. If type safety isn't really important, then maybe D isn't, either. If it is important enough to cite repeatedly as an advantage, then what's the story with booleans? Again, I'm not arguing for or against this design decision. I haven't even looked at it. I have wondered about it, though, and if the justification is "type safety doesn't matter", then I have to wonder even more.
Removing all implicit conversions would be a pain. Imagine one had an int "x" and a short "y" and wanted to add them: without implicit conversions one would have to write x+cast(int)y. So D must have some implicit conversions, and the question is which ones. Java chose not to have implicit conversions between numeric types and bool; C++ and D chose to allow them. For D it makes even more sense, since "bool" is "bit" and naturally one wants to be able to implicitly convert between bits, bytes, ints, etc. To me it's not a big deal.
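For instance, the same conversion rules can be sketched in C++, from which D largely inherits them (a C++ illustration, not D code; D would spell the explicit cast as cast(int)y):

```cpp
#include <cassert>

// Implicit integer promotion: adding an int and a short needs no cast,
// because the short is silently widened to int first.
int addMixed(int x, short y) {
    return x + y;   // without implicit conversions this would be x + (int)y
}

// Implicit numeric <-> bool conversion, as C++ and D allow and Java
// forbids: nonzero converts to true, and true converts back to 1.
int boolRoundTrip(int v) {
    bool b = v;     // int -> bool
    return b + b;   // bool -> int; for any nonzero v this is 2
}
```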
 2. Built-in string types. Java has one, D has THREE. This feels
 wrong,
 because it feels like a failure of encapsulation (the encoding of
 the
 string should be hidden from the programmer) and it doesn't scale
 (there
 is no utf-7 type). To put it another way... What is the compelling
 problem to which having three built-in string types is the
 solution?
Java has StringBuffers and char[], too. So it has three types just to do what D does with one - in this case wchar[]. Actually in JDK 5 they introduced a new StringBuilder (a single-threaded version of StringBuffer). So now Java has four types for wchar[].
This isn't quite right. The point is that Java has a single *default* string type, and that's a huge advantage. The number of non-defaults used for special cases is almost irrelevant. The default String in Java is used for almost everything and is almost always fast enough, which I define very practically. If the performance of String could improve dramatically without improving the app itself, then String is "fast enough". Anything more has no benefit, so if it has any cost, it is a net negative.

I said "almost always fast enough" because by my definition there are times when it is not: at many bottlenecks. In those cases, and only in those cases, I wish there were a lot more options. Java's performance problem is caused by severe constraints on optimization options that come from the unfortunate requirement that it be able to run identically and safely--as object code--anywhere. Thank goodness, D hasn't chosen to burden itself with such constraints. That means that D could go ahead and have a good default that was powerful, easy, and consistent, supplemented by a great toolbox of optimizations that would be used only at bottlenecks. After all, performance is only relevant at bottlenecks, while simplicity is relevant everywhere.

Unfortunately, having no default string type means that integrating code from multiple designers (as in using libraries, for example) will almost always result in the needless complexity of multiple text formats in non-bottleneck code. It won't do me much good to declare a personal default, either, if I use libraries written by others. I'm not sure what would make the best default string type for D, but I think that not having one will have costs. Whether those costs will turn out to be significant is hard to say for sure.
I suspect that D people will do what C people have done for years: overweight the things that are easy to measure (time to execute a toy for loop a million times) and underweight more important things that are harder to measure (time wasted debating which string type to use in non-bottleneck code, bugs introduced when trying to change string type used in module B source code to match module A, etc.) If so, the costs of this approach could end up being serious without it being recognized, even after the fact. Well, we'll see. Any way you look at it, it's a huge improvement over both C++ and Java in so many ways....
It's clear char[] is very common in phobos and in the documentation. It is, in practice, the default string type. I would expect any library that deals with strings to take at least char[] or have some story about how to use it with char[]. If the library has some parts where the conversion to char[] would be a large performance hit, then it should also support wchar[] and maybe dchar[].

The ICU library internally uses wchar[], so if one is writing an application that deals heavily with the ICU library I would use wchar[] in my app instead of char[]. Otherwise I would use char[]. It all depends on the situation. The dchar[] type is really just for Linux C calls and for people who want character indexing to be the same as code-unit indexing, so I wouldn't seriously consider it in the same league as char[] and wchar[].
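For example, the "convert at the library boundary" approach looks roughly like this. Sketched in C++ rather than D, with a hypothetical helper named widenAscii; it handles ASCII-only data for brevity, where code points map 1:1 onto UTF-16 code units, whereas real code would need a full UTF-8 decoder:

```cpp
#include <string>

// Hypothetical boundary shim: the app keeps narrow strings (the char[]
// analogue) everywhere and widens only when calling into a UTF-16
// library (the wchar[] analogue, e.g. ICU). Each ASCII byte becomes
// one UTF-16 code unit; non-ASCII input would need real UTF-8 decoding.
std::u16string widenAscii(const std::string& s) {
    std::u16string out;
    out.reserve(s.size());
    for (unsigned char c : s)
        out.push_back(static_cast<char16_t>(c));
    return out;
}
```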
Nov 15 2004
prev sibling next sibling parent Dave <Dave_member pathlink.com> writes:
In article <cn93su$h4o$1 digitaldaemon.com>, James McComb says...
Hi. I wonder if there are any other developers out there who, like me, 
were initially very excited about D, but are now starting to worry that 
the language is maybe starting to go off the rails a bit...

Since D is advertised as a simpler, garbage-collected C++, it's natural 
to compare it with Java. So when D includes a Java-like 
feature, many developers (such as me) assume that the feature is going 
to be similar to the same feature in Java, unless Walter has some 
compelling reason to do things differently. Seen from this perspective, 
some features in D stick out like a sore thumb:

FEATURES I WOULD HAVE EXPECTED TO BE LIKE JAVA
(Is there really a compelling reason to do it differently?)

1. A type-safe boolean. Java has one, D doesn't. I know this has been 
discussed to death, but people coming from a Java background want to 
know: What is the compelling reason to not have a type-safe boolean?

2. Built-in string types. Java has one, D has THREE. This feels wrong, 
because it feels like a failure of encapsulation (the encoding of the 
string should be hidden from the programmer) and it doesn't scale (there 
is no utf-7 type). To put it another way... What is the compelling 
problem to which having three built-in string types is the solution?
There is a lot more on this topic in the archives, but to sum up: One of the primary complaints about Java is performance, especially for memory management, and one of the reasons for that is the 'one character size fits all' approach to strings. Depending on the encoding, it may be more efficient to use any one of the three string types D offers (after all, the conversions can be handled by library functionality), which is why all three are supported by the core language. Since D is intended as an all-around systems-type language, this (efficiency & flexibility) makes perfect sense, IMHO.

One of the problems with C/C++ is that strings are not built-in. One of the problems with Java is that the majority of programs where UTF-16 is not optimal end up suffering for it. D tries to rectify both of these problems by offering all three built-in types.

- Dave
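To make the efficiency point concrete, the same ASCII text costs one, two, or four bytes per character depending on the code-unit width, and UTF-8/UTF-16/UTF-32 correspond to D's char[]/wchar[]/dchar[]. A small C++ sketch (not D), assuming ASCII content:

```cpp
#include <cstddef>
#include <string>

// Bytes consumed by the same logical text at each code-unit width.
// For pure ASCII, UTF-16 doubles and UTF-32 quadruples the storage;
// that doubling is the cost a one-size-fits-all UTF-16 String pays.
std::size_t utf8Bytes(const std::string& s)     { return s.size() * sizeof(char); }
std::size_t utf16Bytes(const std::u16string& s) { return s.size() * sizeof(char16_t); }
std::size_t utf32Bytes(const std::u32string& s) { return s.size() * sizeof(char32_t); }
```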
I don't think that these differences from Java are necessary or beneficial.

James McComb
Nov 14 2004
prev sibling next sibling parent =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
James McComb wrote:

 FEATURES I WOULD HAVE EXPECTED TO BE LIKE JAVA
 (Is there really a compelling reason to do it differently?)
 
 1. A type-safe boolean. Java has one, D doesn't. I know this has been 
 discussed to death, but people coming from a Java background want to 
 know: What is the compelling reason to not have a type-safe boolean?
D doesn't have boolean *conditionals*, either. They're arithmetic. So things like "if (2)" and "if (pointer)" are perfectly legal... (So far I haven't heard any better reasons than that it makes D code more similar to C, and thus easier to adopt for old-timers?)

In light of that, it has the same kind of booleans that C/C++ has, i.e. "zero is false, non-zero is true". And "true + true == 2". The default boolean type in D is "bit" (which has an alias of "bool").
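The same behaviour can be demonstrated in C++, shown here because its rules match D's in this respect (this is a C++ sketch, not D source):

```cpp
// "Arithmetic" conditionals: any nonzero value or non-null pointer
// tests true, exactly as in C and in D as described above.
bool truthy(int v) {
    if (v) return true;   // "if (2)" is perfectly legal
    return false;
}

bool truthyPtr(const int* p) {
    if (p) return true;   // non-null pointer converts to true
    return false;
}

int twoTrues() {
    bool t = true;
    return t + t;         // bool promotes to int: true + true == 2
}
```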
 2. Built-in string types. Java has one, D has THREE. This feels wrong, 
 because it feels like a failure of encapsulation (the encoding of the 
 string should be hidden from the programmer) and it doesn't scale (there 
 is no utf-7 type). To put it another way... What is the compelling 
 problem to which having three built-in string types is the solution?
As has been pointed out by others, Java has *more* than just String. Performance-conscious people have been using e.g. byte[] for ASCII? And with the new surrogates, functions now take both "int" and "char": see http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Character.html

So the exact same things occur in Java too, as in all Unicode handling. Another main difference is that String is a class, while D has a type? The default string type in D is "char[]" (I suggested the alias "string").
 I don't think that these differences from Java are necessary or beneficial.
But it's still on a higher plane than old C and C++ :-)

The main difference between D and Java is now that Java uses objects for things like strings and arrays, while D doesn't? D is a lot more concerned about performance and implementation, and therefore leaves a lot of such things to the end programmer. Walter has chosen to position D in between these two "sides". I was puzzled by this too, but it's done that way by choice. And it could be what I *like* about D. It's half-C and half-Java. (and mainly because I think it's more elegant than what C++ is...)

Maybe we just need some new FAQ entries:

Q: What's the default boolean type in D?    A: bit. (bool is an "alias")
Q: Is that really type-safe?                A: No. (just as in C99/C++)
Q: What's the default string type in D?     A: char[]. (since main() uses it)
Q: Is that a single class?                  A: No. (it's a primitive type)
Q: Was this done by accident or by choice?  A: choice. (by Walter Bright)
Q: Will this change before D version 1.0?   A: No. (at least unlikely)

And we could all get along with our lives...

--anders

PS. See also http://www.prowiki.org/wiki4d/wiki.cgi?FeatureRequestList
Nov 15 2004
prev sibling parent reply Bastiaan Veelo <Bastiaan.N.Veelo ntnu.no> writes:
James McComb wrote:
 1. A type-safe boolean. Java has one, D doesn't. I know this has been 
 discussed to death, but people coming from a Java background want to 
 know: What is the compelling reason to not have a type-safe boolean?
Maybe this page carries an answer? http://www.prowiki.org/wiki4d/wiki.cgi?BooleanNotEquBit regards, Bastiaan.
Nov 15 2004
parent reply Ben Hinkle <bhinkle4 juno.com> writes:
Bastiaan Veelo wrote:

 James McComb wrote:
 1. A type-safe boolean. Java has one, D doesn't. I know this has been
 discussed to death, but people coming from a Java background want to
 know: What is the compelling reason to not have a type-safe boolean?
Maybe this page carries an answer? http://www.prowiki.org/wiki4d/wiki.cgi?BooleanNotEquBit regards, Bastiaan.
A few of the posts listed on that page are about bool/bit not being addressable. That got me thinking about adding (either builtin or through aliases or typedefs or something) a separate type for addressable bools/bits called... drum roll please... wbit and wbool (naturally wbool is an alias for wbit). It would have the size of a byte (hence the "w") and otherwise behave like bit (which pretty much makes it behave like C++'s bool). Making bit addressable would mess up the rule that pointers all have the same size and are convertible to void* and back.

-Ben

ps - I can just picture Elmer Fudd singing "kill the wbit, kill the wbit"
Nov 15 2004
next sibling parent "Walter" <newshound digitalmars.com> writes:
"Ben Hinkle" <bhinkle4 juno.com> wrote in message
news:cnaqh1$65e$1 digitaldaemon.com...
 ps - I can just picture Elmer Fudd singing "kill the wbit, kill the wbit"
Thanks for the chuckle <g>.
Nov 15 2004
prev sibling parent reply =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
Ben Hinkle wrote:

 A few of the posts listed on that page are about bool/bit not being
 addressable. That got me thinking about adding (either builtin or through
 aliases or typedefs or something) a separate type for addressable
 bools/bits called ... drum roll please... wbit and wbool (naturally wbool
 is an alias for wbit). It would have the size of a byte (hence the "w") and
 otherwise behave like bit (which pretty much makes it behave like C++'s
 bool). Making bit addressable would mess up the rule that pointers all have
 the same size and are convertible to void* and back.
You can take the address of bit variables now, and they should also work as "out" parameters. You can't take the address of bits within arrays (or actually you can, but the pointer doesn't work).

A single bit field/var occupies a byte in memory, and a bit[] array is padded out to (length+31)/32 32-bit words. So "bit a; bit b;" is 2 bytes, while "bit c[2];" is 4 (the multiple of four is for access-speed reasons).

--anders

PS. In the Mac OS X C++ compiler (g++), a "bool" is 4 bytes. A "_Bool", as used in C99, also occupies a full four (that is, both have the same size as an "int" does...). On Linux, they seem to have the usual sizeof() of 1 byte?
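C++ has a close analogue of a packed bit[]: std::vector<bool>. A standalone bool is a real, addressable object, but a packed element is not, so indexing yields a proxy rather than a bool&. A sketch of the difference:

```cpp
#include <cstddef>
#include <vector>

// A lone bool occupies at least a whole byte and is fully addressable.
bool viaPointer() {
    bool b = true;
    bool* p = &b;        // fine: b is a real object in memory
    return *p;
}

// Packed storage: vector<bool> stores its elements as bits, so v[0]
// yields a proxy object, not a bool&; "&v[0]" would not give a usable
// bool* -- the same pitfall as pointers into a packed bit array.
std::size_t flipFirst(std::size_t n) {
    std::vector<bool> v(n, false);
    v[0] = true;                  // writing through the proxy works
    std::size_t count = 0;
    for (std::size_t i = 0; i < n; ++i)
        if (v[i]) ++count;
    return count;                 // exactly one bit is set
}
```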
Nov 15 2004
parent reply "Ben Hinkle" <bhinkle mathworks.com> writes:
"Anders F Björklund" <afb algonet.se> wrote in message
news:cnavkn$e0i$1 digitaldaemon.com...
 Ben Hinkle wrote:

 A few of the posts listed on that page are about bool/bit not begin
 addressable. That got me thinking about adding (either builtin or
through
 aliases or typedefs or something) a separate type for addressable
 bools/bits called ... drum roll please... wbit and wbool (naturally
wbool
 is an alias for wbit). It would have the size of a byte (hence the "w")
and
 otherwise behave like bit (which pretty much makes it behave like C++'s
 bool). Making bit addressable would mess up the rule that pointers all
have
 the same size and are convertable to void* and back.
You can take the address of bit variables now, and they should also work as "out" parameters. You can't take the address of bits within arrays (or actually you can, but the pointer doesn't work). A single bit field/var occupies a byte in memory, and a bit[] array is padded out to (length+31)/32 32-bit words. So "bit a; bit b;" is 2 bytes, "bit c[2];" is 4 (the multiple of four is for access-speed reasons).
I was being too vague. You are right that bits by themselves are addressable and can be used as out parameters. It's only when they get packed that life gets interesting - for better or worse.
 --anders

 PS.
 In the Mac OS X C++ compiler (g++) a "bool" is 4 bytes.
 A "_Bool", as used in C99, also occupies a full four.
 (that is, both have the same size as an "int" does...)
 On Linux, they seems to have a usual sizeof() 1 byte ?
Interesting. I hadn't really thought about bools having different sizes, but I suppose there isn't anything stopping it. Maybe int is better than byte. Dare I suggest "dbit" and "dbool" for int-sized bits and bools? :-)
Nov 15 2004
parent reply =?ISO-8859-1?Q?Anders_F_Bj=F6rklund?= <afb algonet.se> writes:
Ben Hinkle wrote:

 Interesting. I hadn't really thought that bools can have different sizes but
 I suppose there isn't anything stopping it. Maybe int is better than byte.
 Dare I suggest "dbit" and "dbool" for int sized bits and bools? :-)
Enough! Enough! (my poor stomach) :-D

Hats off for that most excellent suggestion! Henceforth, byte/char shall be known as a "wbit" when used as a bool, and int/long shall similarly be known as a "dbit" when used as a bool. Thus, one can choose between bit, wbit and dbit for storing booleans. This makes it consistent with the other missing type, namely strings.

Oh, the humanity

--anders
Nov 15 2004
parent "Ben Hinkle" <bhinkle mathworks.com> writes:
"Anders F Björklund" <afb algonet.se> wrote in message
news:cnb2jm$ic4$1 digitaldaemon.com...
 Ben Hinkle wrote:

 Interesting. I hadn't really thought that bools can have different sizes
but
 I suppose there isn't anything stopping it. Maybe int is better than
byte.
 Dare I suggest "dbit" and "dbool" for int sized bits and bools? :-)
Enough! Enough! (my poor stomach) :-D Hat's off for that most excellent suggestion! Henceforth, byte/char shall be known as a "wbit" when used as a bool and int/long shall similarly be known as a "dbit" when used as a bool. Thus, one can choose between bit, wbit and dbit for storing booleans. This makes it consistent with the other missing type, namely strings. Oh, the humanity --anders
Actually, I was being semi-serious! It is kind of overkill, but I think explicit types with different behaviors are preferable to hacking up pointers.
Nov 15 2004