
D - drop ASCII characters from D?

reply "J. Daniel Smith" <j_daniel_smith HoTMaiL.com> writes:
Walter's comment in the "Delegates" thread about the code

    void foo(char[]);
    void foo(wchar[]);
    ...
    foo("hello");
being ambiguous made me wonder about the point of supporting ASCII
characters in the first place.



Why not drop "wchar" and make "char" always mean a 2-byte UNICODE character
(or even a 4-byte ISO10646 character)?  With the release of Windows XP last
fall, the need for ASCII support is going to diminish as WinXP replaces
Win98/WinME.



If you really need a single-byte character to interface with legacy APIs,
use "ubyte" (or "ulong" for "wchar_t" APIs on *IX) and convert to/from "char" as
needed.  Yeah, that makes such code more difficult, but it should all be
buried in some class anyway.



It seems that supporting both in D goes against current trends (Java, VB and
C# are all UNICODE-only); it also implicitly encourages the continued use of
ASCII, a decision that is usually regretted in real-world applications.



   Dan
Apr 05 2002
parent reply "Pavel Minayev" <evilone omen.ru> writes:
"J. Daniel Smith" <j_daniel_smith HoTMaiL.com> wrote in message
news:a8kprh$4h2$1 digitaldaemon.com...

 Walter's comment in the "Delegates" thread about the code

     void foo(char[]);
     void foo(wchar[]);
     ...
     foo("hello");
 being ambiguous made me wonder about the point of supporting ASCII
 characters in the first place.



 Why not drop "wchar" and make "char" always mean a 2-byte UNICODE character
 (or even a 4-byte ISO10646 character)?  With the release of Windows XP last
 fall, the need for ASCII support is going to diminish as WinXP replaces
 Win98/WinME.

I believe no more than 10% of my friends have WinNT, 2K or XP. Your suggestion would make it very hard to write programs that run on the 9x series, which is still the most popular.
 If you really need a single-byte character to interface with legacy APIs,
 use "ubyte" (or "ulong" for "wchar_t" APIs on *IX) and convert to/from "char" as
 needed.  Yeah, that makes such code more difficult, but it should all be
 buried in some class anyway.

You can convert a single char; but what about strings? D doesn't convert arrays, AFAIK...
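Pavel's point can be sketched as follows (in Python here, purely as an illustration; the `widen`/`narrow` helpers are hypothetical names, not anything from D's library): widening a single-byte string to wide characters is an explicit element-by-element copy, and narrowing back can lose information for non-ASCII code points.

```python
# Hypothetical sketch: a single char widens trivially, but a whole string
# needs an explicit per-element conversion, and narrowing back to a
# single-byte encoding can fail for characters outside ASCII.

def widen(narrow_str: bytes) -> str:
    """ASCII char[] -> wchar[]: one explicit copy per element."""
    return "".join(chr(b) for b in narrow_str)

def narrow(wide_str: str) -> bytes:
    """wchar[] -> char[]: reject characters that don't fit in ASCII."""
    if any(ord(c) > 127 for c in wide_str):
        raise ValueError("string not representable in ASCII")
    return bytes(ord(c) for c in wide_str)

print(widen(b"hello"))   # the same text, now as wide characters
print(narrow("hello"))   # back to single-byte characters
```

This is exactly the kind of loop that, as Dan says, would end up buried in some class; the question in the thread is whether the language should do it implicitly for arrays.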
 It seems that supporting both in D goes against current trends (Java, VB and
 C# are all UNICODE-only); it also implicitly encourages the continued use of
 ASCII, a decision that is usually regretted in real-world applications.
VB is bloated, partially because of UNICODE-only strings. Java is
platform-independent and doesn't really care about the underlying system.
C# is Microsoft's reply to Java, and is bloated as well.

D is a practical tool. Since most systems and most programs today still
work with ASCII strings, they should be in the language.
Apr 05 2002
parent reply "J. Daniel Smith" <j_daniel_smith HoTMaiL.com> writes:
Today there are still a lot of Win9x/WinME boxes out there, but that's not
going to be the case for long.  I don't know what Walter's timeline is for
officially releasing D to the world, but let's just say it's 1-Jan-2003.
Add another year onto that for people to actually start adopting the
language and developing/shipping programs en masse and we're at 2004.  I
think the Win9x/WinME numbers will look a lot different in 18+ months.  I
don't think it's much of a stretch to say that in the not-too-distant
future, ASCII will largely be considered legacy.

If you don't want to drop ASCII support completely from D, how about making
it (much) more difficult to use by making UNICODE the default? "char" is
UNICODE, "achar" is ASCII; a string/character literal is UNICODE, you have
to use an ugly A prefix to get ASCII; there are no implicit conversions
between UNICODE/ASCII - you've got to call some library routine (or maybe
cast) instead.

D is a new language, it should look to the future.

   Dan

"Pavel Minayev" <evilone omen.ru> wrote in message
news:a8kttp$ued$1 digitaldaemon.com...
 "J. Daniel Smith" <j_daniel_smith HoTMaiL.com> wrote in message
 news:a8kprh$4h2$1 digitaldaemon.com...

 Walter's comment in the "Delegates" thread about the code

     void foo(char[]);
     void foo(wchar[]);
     ...
     foo("hello");
 being ambiguous made me wonder about the point of supporting ASCII
 characters in the first place.



 Why not drop "wchar" and make "char" always mean a 2-byte UNICODE character
 (or even a 4-byte ISO10646 character)?  With the release of Windows XP last
 fall, the need for ASCII support is going to diminish as WinXP replaces
 Win98/WinME.

I believe no more than 10% of my friends have WinNT, 2K or XP. Your suggestion would make it very hard to write programs that run on the 9x series, which is still the most popular.
 If you really need a single-byte character to interface with legacy APIs,
 use "ubyte" (or "ulong" for "wchar_t" APIs on *IX) and convert to/from "char" as
 needed.  Yeah, that makes such code more difficult, but it should all be
 buried in some class anyway.

You can convert a single char; but what about strings? D doesn't convert arrays, AFAIK...
 It seems that supporting both in D goes against current trends (Java, VB and
 C# are all UNICODE-only); it also implicitly encourages the continued use of
 ASCII, a decision that is usually regretted in real-world applications.

 VB is bloated, partially because of UNICODE-only strings. Java is
 platform-independent and doesn't really care of underlying system.
 C# is Microsoft's reply to Java, and is bloated as well.

 D is a practical tool. Since most systems and most programs today still
 work with ASCII strings, they should be in the language.

Apr 05 2002
next sibling parent reply roland <nancyetroland free.fr> writes:
"J. Daniel Smith" wrote:

 Today there are still a lot of Win9x/WinME boxes out there, but that's not
 going to be the case for long.  I don't know what Walter's timeline is for
 officially releasing D to the world, but let's just say it's 1-Jan-2003.
 Add another year onto that for people to actually start adopting the
 language and developing/shipping programs en-masse and we're to 2004.  I
 think the Win9x/WinME numbers will look a lot different in 18+ months.  I
 don't think it's much of a stretch to say that in the not too distant
 future, ASCII will largely be considered legacy.

 If you don't want to drop ASCII support completely from D, how about making
 it (much) more difficult to use by making UNICODE the default? "char" is
 UNICODE, "achar" is ASCII; a string/character literal is UNICODE, you have
 to use an ugly A prefix to get ASCII; there are no implicit conversions
 between UNICODE/ASCII - you've got to call some library routine (or maybe
 cast) instead.

 D is a new language, it should look to the future.

    Dan

 "Pavel Minayev" <evilone omen.ru> wrote in message
 news:a8kttp$ued$1 digitaldaemon.com...
 "J. Daniel Smith" <j_daniel_smith HoTMaiL.com> wrote in message
 news:a8kprh$4h2$1 digitaldaemon.com...

 Walter's comment in the "Delegates" thread about the code

     void foo(char[]);
     void foo(wchar[]);
     ...
     foo("hello");
 being ambiguous made me wonder about the point of supporting ASCII
 characters in the first place.



 Why not drop "wchar" and make "char" always mean a 2-byte UNICODE character
 (or even a 4-byte ISO10646 character)?  With the release of Windows XP last
 fall, the need for ASCII support is going to diminish as WinXP replaces
 Win98/WinME.

I believe no more than 10% of my friends have WinNT, 2K or XP. Your suggestion would make it very hard to write programs that run on the 9x series, which is still the most popular.
 If you really need a single-byte character to interface with legacy APIs,
 use "ubyte" (or "ulong" for "wchar_t" APIs on *IX) and convert to/from "char" as
 needed.  Yeah, that makes such code more difficult, but it should all be
 buried in some class anyway.

You can convert a single char; but what about strings? D doesn't convert arrays, AFAIK...
 It seems that supporting both in D goes against current trends (Java, VB and
 C# are all UNICODE-only); it also implicitly encourages the continued use of
 ASCII, a decision that is usually regretted in real-world applications.

 VB is bloated, partially because of UNICODE-only strings. Java is
 platform-independent and doesn't really care of underlying system.
 C# is Microsoft's reply to Java, and is bloated as well.

 D is a practical tool. Since most systems and most programs today still
 work with ASCII strings, they should be in the language.


How does Linux handle character size? I personally see my future nearer Linux than XP.

roland
Apr 05 2002
parent reply "Walter" <walter digitalmars.com> writes:
"roland" <nancyetroland free.fr> wrote in message
news:3CAE2667.1375F510 free.fr...
 How does Linux handle character size?
 I personally see my future nearer Linux than XP.

Linux uses 4 byte wchars. This uses up memory real fast. Unicode may be the future, but it is still many years away, and D should be agnostic about whether the app is ASCII or Unicode.
Apr 05 2002
next sibling parent reply "OddesE" <OddesE_XYZ hotmail.com> writes:
"Walter" <walter digitalmars.com> wrote in message
news:a8lbqb$1d7s$1 digitaldaemon.com...
 "roland" <nancyetroland free.fr> wrote in message
 news:3CAE2667.1375F510 free.fr...
 How does Linux handle character size?
 I personally see my future nearer Linux than XP.

 Linux uses 4 byte wchars. This uses up memory real fast. Unicode may be the
 future, but it is still many years away, and D should be agnostic about
 whether the app is ASCII or Unicode.

I think memory shouldn't be a concern. I don't think text, as in characters
and strings of characters, is the real memory user in today's computing, is
it? A typical e-mail or document isn't big at all. It's things like images,
textures in games, and audio and video that consume most space on disk or
in memory.

I agree that ASCII, although it isn't dead, deserves to die. I think the
idea to make 32-bit characters the standard is a good one, although I have
to admit I don't know much about the standardisation that is going on in
that field. But I do know that 256 characters is way too little! Maybe it
is just too early to make a decision, when the standardisation hasn't
settled down...

--
Stijn
OddesE_XYZ hotmail.com
http://OddesE.cjb.net
_________________________________________________
Remove _XYZ from my address when replying by mail
Apr 06 2002
parent "Walter" <walter digitalmars.com> writes:
"OddesE" <OddesE_XYZ hotmail.com> wrote in message
news:a8n3tp$18pi$1 digitaldaemon.com...
 I think memory shouldn't be a concern. I
 don't think text, as in characters and
 strings of characters, is the real memory
 user in today's computing is it? A
 typical e-mail or document isn't big at
 all. It's things like images, textures in
 games and audio and video that consume
 most space on disk or in memory.

It still is a concern. I have an app on linux with wchars, and it still uses 200 megs of ram, mostly because of the 4 bytes per char. Secondly, if you're distributing an executable with a lot of text strings in it, it can bloat up the download size quite a bit. I can also neatly fit all my source code on a CD. I don't want it 4 times bigger <g>.
 I agree that ASCII, although it isn't
 dead, deserves to die. I think the idea
 to make 32-bit characters the standard
 is a good one, although I have to admit
 I don't know much about the standardisation
 that is going on in that field.
 But I do know that 256 characters is way
 too little!
 maybe it is just too early to make a
 decision, when the standardisation
 hasn't settled down...

Another huge reason to support ASCII in D is because D is meant to interface with C functions. C apps are nearly all written to use ASCII. ASCII support isn't going away anytime soon in operating systems, so D must support it easily.
Apr 06 2002
prev sibling parent reply "J. Daniel Smith" <j_daniel_smith HoTMaiL.com> writes:
So what about my suggestion of making ASCII a bit more difficult to use -
that is, Unicode is the preferred/default character type in D.  'char' is a
Unicode character and "abc" is a Unicode string.

With the release of Windows XP, it's not going to be very long (months, not
years) before a Unicode-enabled platform is the norm for most people.

And I'm not sure I buy the "memory" argument - my PocketPC, which is easily
more memory constrained than any desktop PC, only supports Unicode.

   Dan

"Walter" <walter digitalmars.com> wrote in message
news:a8lbqb$1d7s$1 digitaldaemon.com...
 "roland" <nancyetroland free.fr> wrote in message
 news:3CAE2667.1375F510 free.fr...
 how is linux concerning caractere size ?
 i personaly see my future rather near linux than XP

 Linux uses 4 byte wchars. This uses up memory real fast. Unicode may be the
 future, but it is still many years away, and D should be agnostic about
 whether the app is ASCII or Unicode.

Apr 08 2002
parent reply "Walter" <walter digitalmars.com> writes:
"J. Daniel Smith" <j_daniel_smith HoTMaiL.com> wrote in message
news:a8s2ng$1jg0$1 digitaldaemon.com...
 So what about my suggestion of making ASCII a bit more difficult to use -
 that is, Unicode is the prefered/default character type in D.  'char' is a
 Unicode character and "abc" is a Unicode string.

There is no default char type in D. char is ascii, wchar is unicode. The type of "abc" depends on context. The source text can be ascii or unicode (try it!).
 With the release of Windows XP, it's not going to be very long (months, not
 years) before a Unicode-enabled platform is the norm for most people.

All win32 platforms support unicode already.
 And I'm not sure I buy the "memory" argument - my PocketPC, which is easily
 more memory constrained than any desktop PC, only supports Unicode.

You can shrink down the memory for unicode quite a bit by using UTF8, at the expense of slowing things down.
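Walter's trade-off is easy to quantify. A rough illustration (Python is used here just to count bytes; the exact figures depend on the text being mostly ASCII):

```python
# Byte cost of the same text under three encodings, showing why UTF-8
# saves memory for ASCII-heavy text while 4-byte wide chars (as on
# Linux) quadruple it.
text = "hello, world"  # 12 ASCII characters

utf8 = len(text.encode("utf-8"))       # 1 byte per ASCII character
utf16 = len(text.encode("utf-16-le"))  # 2 bytes per character here
utf32 = len(text.encode("utf-32-le"))  # 4 bytes per character, like a Linux wchar

print(utf8, utf16, utf32)  # 12 24 48
```

The flip side, as Walter notes, is speed: UTF-8's variable width means indexing the n-th character or taking a length in characters becomes a scan over the bytes rather than simple pointer arithmetic.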
Apr 08 2002
parent reply "J. Daniel Smith" <j_daniel_smith HoTMaiL.com> writes:
But then aren't we full-circle back to
    void foo(char[]);
    void foo(wchar[]);
    ...
    foo("hello");
being ambiguous?  Although it sounds like you can (almost?) support both
Unicode and ASCII transparently, I'd still like to see Unicode implicitly
given more emphasis; for example, the above code snippet would be
    void foo(achar[]);    // ASCII
    void foo(char[]);    // Unicode
    ...
    foo("hello");    // Unicode string - calls foo(char[])
    foo(A"hello");    // ASCII string - calls foo(achar[])

As far as Win32 platforms go, I guess it depends on what one means by
"supporting Unicode."  Only a small handful of Win32 APIs are Unicode on
Win9x (although the recently released MSLU expands that list considerably).

Dell has a complete 1.8GHz system with 256MB of RAM for $999; given numbers
like that, I'm not overly concerned with either processing power or memory.

   Dan

"Walter" <walter digitalmars.com> wrote in message
news:a8t2j5$1mc$1 digitaldaemon.com...
 "J. Daniel Smith" <j_daniel_smith HoTMaiL.com> wrote in message
 news:a8s2ng$1jg0$1 digitaldaemon.com...
 So what about my suggestion of making ASCII a bit more difficult to use -
 that is, Unicode is the preferred/default character type in D.  'char' is a
 Unicode character and "abc" is a Unicode string.

There is no default char type in D. char is ascii, wchar is unicode. The type of "abc" depends on context. The source text can be ascii or unicode (try it!).
 With the release of Windows XP, it's not going to be very long (months, not
 years) before a Unicode-enabled platform is the norm for most people.

All win32 platforms support unicode already.
 And I'm not sure I buy the "memory" argument - my PocketPC, which is easily
 more memory constrained than any desktop PC, only supports Unicode.

 You can shrink down the memory for unicode quite a bit by using UTF8, at the
 expense of slowing things down.

Apr 09 2002
parent "Walter" <walter digitalmars.com> writes:
"J. Daniel Smith" <j_daniel_smith HoTMaiL.com> wrote in message
news:a8und0$3e7$1 digitaldaemon.com...
 But then aren't we full-circle back to
     void foo(char[]);
     void foo(wchar[]);
     ...
     foo("hello");
 being ambiguous?

Yes, but that doesn't in any way impede a programmer who wants to write a full unicode app.
Apr 09 2002
prev sibling parent "Pavel Minayev" <evilone omen.ru> writes:
"J. Daniel Smith" <j_daniel_smith HoTMaiL.com> wrote in message
news:a8l5o5$17di$1 digitaldaemon.com...

 D is a new language, it should look to the future.

Unicode is not necessarily the future. Not in the next few years, at least. And after all, you can always use alias in your programs:

    alias wchar Char;
Apr 05 2002