
D - Conversion of char* to wchar*

reply Russ Lewis <russ deming-os.org> writes:
What happens in this cast:

char *myString = "asdf";
wchar *myUnicodeString = (wchar*)myString;

If we do a simple pointer conversion, then we have lost the string
meaning of the pointer, which, while technically correct, is most likely
not what 95% of the code writers would have wanted.
Aug 18 2001
parent reply "Walter" <walter digitalmars.com> writes:
Russ Lewis wrote in message <3B7F3BCB.6C07277D deming-os.org>...
What happens in this cast:

char *myString = "asdf";
wchar *myUnicodeString = (wchar*)myString;

If we do a simple pointer conversion, then we have lost the string
meaning of the pointer, which, while technically correct, is most likely
not what 95% of the code writers would have wanted.

It undergoes a simple pointer conversion, and is pretty obviously a coding bug.

Doing a cast on a string literal *will* convert the string itself, as:

    char *astring = "asdf";        // an ASCII version of "asdf"
    wchar *wstring = "asdf";    // makes a unicode version of "asdf"

No need to put the L prefix on the string.
Aug 18 2001
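To make the difference between the two cases concrete, here is a rough sketch in C rather than D. It assumes a 16-bit unit (uint16_t) as a stand-in for wchar, and both helper names are hypothetical, not anything from the D runtime. The first helper shows what a reinterpreted pointer would see in its first element (two adjacent ASCII bytes packed into one unit), and the second shows the element-by-element widening that most people probably expect.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the 16-bit value a reinterpreted pointer would
 * read at index 0. memcpy is used instead of dereferencing a cast
 * pointer so the sketch avoids alignment and aliasing problems.
 * The caller must supply at least two readable bytes. */
static uint16_t reinterpret_first_unit(const char *src)
{
    uint16_t unit;
    memcpy(&unit, src, sizeof unit);
    return unit;    /* two chars packed together; byte order is platform-dependent */
}

/* Hypothetical helper: widen each 8-bit ASCII character into its own
 * 16-bit unit. dst must have room for n units. */
static void widen_ascii(const char *src, uint16_t *dst, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = (uint16_t)(unsigned char)src[i];
}
```

With "asdf" as input, the reinterpreted first unit combines 'a' and 's' into one value, so the string is effectively halved in length and its contents are scrambled, while the widened buffer holds 'a', 's', 'd', 'f' as four separate units.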
parent reply weingart cs.ualberta.ca (Tobias Weingartner) writes:
In article <9lnimq$1ht9$1 digitaldaemon.com>, Walter wrote:
 
 Doing a cast on a string literal *will* convert the string itself, as:
 
     char *astring = "asdf";        // an ASCII version of "asdf"
     wchar *wstring = "asdf";    // makes a unicode version of "asdf"

That begs the question, "What character set is the language written in?" Will that be configurable? Will it be ascii? Etc, etc...

--Toby.
Aug 20 2001
parent Russell Bornschlegel <kaleja estarcion.com> writes:
Tobias Weingartner wrote:
 That begs the question, "What character set is the language written
 in?"  Will that be configurable?  Will it be ascii?  Etc, etc...

If you mean "what character set does the language expect source to appear in," that's addressed in http://www.digitalmars.com/d/lex.html :

# "The source file is checked to see if it is in ascii or unicode, and
# the appropriate scanner is loaded ... D source text consists of Unicode
# characters. If the source text consists of ASCII characters, they are
# treated as the first 128 Unicode characters. Multibyte and UTF8
# character sets are not supported, although nothing precludes them
# from being supported."
Aug 20 2001
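As a small illustration of the quoted rule that ASCII source is treated as the first 128 Unicode characters, here is a sketch in C. The spec excerpt does not say how the compiler actually tells ASCII from Unicode input, so the byte-range check below is only an assumption, and both function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: returns 1 if every byte is in the 7-bit ASCII
 * range (0x00..0x7F), 0 otherwise. */
static int is_ascii(const unsigned char *buf, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (buf[i] > 0x7F)
            return 0;
    return 1;
}

/* An ASCII byte maps to the Unicode code point with the same numeric
 * value, which is why no table lookup is needed for this case. */
static uint32_t ascii_to_code_point(unsigned char c)
{
    return (uint32_t)c;
}
```

A buffer that passes is_ascii can be handed to the Unicode path one byte at a time through ascii_to_code_point; anything with a byte above 0x7F would need a real decoder, which the quoted text says was not supported at the time.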