D.gnu - invalid UTF-8 sequence compiler error

reply Cesar Rabak <crabak acm.org> writes:
Doing some tests on the gdc-0.22-1 for Linux (result of uname -a: Linux 
fuba 2.6.12-12mdksmp #1 SMP Fri Sep 9 17:43:23 CEST 2005 i686 Intel(R) 
Pentium(R) 4 CPU 3.00GHz GNU/Linux), I got the error "invalid UTF-8 
sequence" even if the 'offending' character is in comments.

Since the gcc counterpart does not complain on similar code with locale 
specific characters (mainly accented chars), I ponder:

Is there a way to have a gdc that can work with accented characters in 
strings and comments?

Regards,

--
Cesar Rabak
Apr 01 2007
parent reply Carlos Santander <csantander619 gmail.com> writes:
Cesar Rabak wrote:
> Doing some tests on the gdc-0.22-1 for Linux (result of uname -a: Linux
> fuba 2.6.12-12mdksmp #1 SMP Fri Sep 9 17:43:23 CEST 2005 i686 Intel(R)
> Pentium(R) 4 CPU 3.00GHz GNU/Linux), I got the error "invalid UTF-8
> sequence" even if the 'offending' character is in comments.
>
> Since the gcc counterpart does not complain on similar code with locale
> specific characters (mainly accented chars), I ponder:
>
> Is there a way to have a gdc that can work with accented characters in
> strings and comments?
>
> Regards,
>
> --
> Cesar Rabak

This is a D feature, not a GDC problem. Save your file as UTF-8, -16, or -32, and it'll work.

--
Carlos Santander Bernal
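
One way to follow this advice is to re-encode the file before handing it to the compiler. The following is only a minimal sketch in Python, not anything GDC provides. It assumes the offending source file is stored as ISO-8859-1 (Latin-1), which the thread does not actually state, and the names `transcode_to_utf8` and `convert_file` are hypothetical.

```python
def transcode_to_utf8(data: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Interpret raw bytes in the assumed legacy encoding and return UTF-8 bytes."""
    # Latin-1 maps every byte value, so it never fails; other encodings
    # can raise UnicodeDecodeError here on invalid input.
    return data.decode(source_encoding).encode("utf-8")


def convert_file(src_path: str, dst_path: str,
                 source_encoding: str = "iso-8859-1") -> None:
    """Write a UTF-8 copy of src_path to dst_path, leaving the original untouched."""
    # Binary mode keeps line endings and all ASCII bytes exactly as stored.
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(transcode_to_utf8(data, source_encoding))
```

A call such as `convert_file("hello_latin1.d", "hello.d")` (file names made up for illustration) would produce a copy that the compiler should accept, provided the Latin-1 assumption holds for the input.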
Apr 01 2007
parent reply Cesar Rabak <crabak acm.org> writes:
Carlos Santander wrote:
> [snipped]
>
> This is a D feature, not a GDC problem. Save your file as UTF-8, -16, or
> -32, and it'll work.

It will take time to convince my folks it is a feature, but I'll try :-)

Thanks,

--
Cesar Rabak
Apr 01 2007
parent reply Cesar Rabak <crabak acm.org> writes:
Cesar Rabak wrote:
> Carlos Santander wrote:
>> [snipped]
>> This is a D feature, not a GDC problem. Save your file as UTF-8, -16,
>> or -32, and it'll work.
>
> It will take time to convince my folks it is a feature, but I'll try :-)

I want to, but for the message strings I'll need to abide by the sysadmin's decisions, which may not be UTF-. How is this solved with the D compiler?

--
Cesar Rabak
Apr 06 2007
parent reply Thomas Kuehne <thomas-dloop kuehne.cn> writes:
Cesar Rabak wrote on 2007-04-06:
> Cesar Rabak wrote:
>> Carlos Santander wrote:
>>> [snipped]
>>> This is a D feature, not a GDC problem. Save your file as UTF-8, -16,
>>> or -32, and it'll work.
>>
>> It will take time to convince my folks it is a feature, but I'll try :-)
>
> I want to, but for the message strings I'll need to abide by the sysadmin's decisions, which may not be UTF-. How is this solved with the D compiler?

If you simply store and print messages, you could use "ubyte[]". For everything else, like searching, concatenating, etc., more information about the encoding(s) is required.

Thomas
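
The distinction between passing bytes through and working on text can be illustrated outside D. The sketch below is an assumption-laden illustration in Python, using `bytes` as a stand-in for D's `ubyte[]`. The message content, the ISO-8859-1 encoding, and the names `MESSAGE`, `emit`, and `contains_text` are all hypothetical.

```python
import sys

# Hypothetical message: "Atenção\n" stored as ISO-8859-1 bytes.
MESSAGE = b"Aten\xe7\xe3o\n"


def emit(message: bytes, out=None) -> None:
    """Write the bytes through unchanged; no knowledge of the encoding is needed."""
    if out is None:
        out = sys.stdout.buffer  # binary view of standard output
    out.write(message)


def contains_text(message: bytes, needle: str, encoding: str) -> bool:
    """Searching for text means decoding first, so the encoding must be known."""
    return needle in message.decode(encoding)
```

`emit` works whatever encoding the sysadmin picked, because the bytes are never interpreted. `contains_text` only gives a meaningful answer when the caller supplies the right encoding, which is the extra information the reply refers to.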
Apr 07 2007
parent reply Cesar Rabak <crabak acm.org> writes:
Thomas Kuehne wrote:
> Cesar Rabak wrote on 2007-04-06:
>> Cesar Rabak wrote:
>>> Carlos Santander wrote:
>>>> [snipped]
>>>> This is a D feature, not a GDC problem. Save your file as UTF-8, -16,
>>>> or -32, and it'll work.
>>
>> I want to, but for the message strings I'll need to abide by the sysadmin's decisions, which may not be UTF-. How is this solved with the D compiler?
>
> If you simply store and print messages, you could use "ubyte[]". For everything else, like searching, concatenating, etc., more information about the encoding(s) is required.

Thanks Thomas.

For going further in the tests for the consideration of D for programming in new projects, storing and printing messages is all I need for now.

If too much text data starts to appear to be processed, I might have to consider an input method that makes the appropriate conversion to an internal representation (and an output one, too).

--
Cesar Rabak
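
The input/output conversion idea can be sketched as a thin boundary layer: decode on the way in, work on the internal representation, encode on the way out. This is a hedged illustration in Python, not a D or Mango API. The function names `to_internal`, `to_external`, and `shout`, and the choice of upper-casing as the processing step, are hypothetical.

```python
def to_internal(raw: bytes, external_encoding: str) -> str:
    """Input side: convert externally encoded bytes to the internal Unicode form."""
    return raw.decode(external_encoding)


def to_external(text: str, external_encoding: str) -> bytes:
    """Output side: convert internal text back to the externally mandated encoding."""
    return text.encode(external_encoding)


def shout(raw: bytes, external_encoding: str) -> bytes:
    """Example processing step: upper-case a message, converting only at the edges."""
    text = to_internal(raw, external_encoding)
    return to_external(text.upper(), external_encoding)
```

Keeping the conversions at the edges means the rest of the program never needs to know which encoding the sysadmin chose; only the two boundary functions take it as a parameter.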
Apr 07 2007
parent reply Lars Ivar Igesund <larsivar igesund.net> writes:
Cesar Rabak wrote:

> Thomas Kuehne wrote:
>> Cesar Rabak wrote on 2007-04-06:
>>> Cesar Rabak wrote:
>>>> Carlos Santander wrote:
>>>>> [snipped]
>>>>> This is a D feature, not a GDC problem. Save your file as UTF-8, -16,
>>>>> or -32, and it'll work.
>>>
>>> I want to, but for the message strings I'll need to abide by the sysadmin's decisions, which may not be UTF-. How is this solved with the D compiler?
>>
>> If you simply store and print messages, you could use "ubyte[]". For everything else, like searching, concatenating, etc., more information about the encoding(s) is required.
>
> For going further in the tests for the consideration of D for programming in new projects, storing and printing messages is all I need for now.
>
> If too much text data starts to appear to be processed, I might have to consider an input method that makes the appropriate conversion to an internal representation (and an output one, too).
>
> --
> Cesar Rabak

If you should find yourself in need of such conversion routines, consider using the ICU (IBM package for such things) bindings in the Mango library, see http://www.dsource.org/projects/mango

--
Lars Ivar Igesund
blog at http://larsivi.net
DSource, #d.tango & #D: larsivi
Dancing the Tango
Apr 07 2007
parent Cesar Rabak <crabak acm.org> writes:
Lars Ivar Igesund wrote:
> Cesar Rabak wrote:
>
>> Thomas Kuehne wrote:
>>
>> Thanks Thomas.
>>
>> For going further in the tests for the consideration of D for
>> programming in new projects, storing and printing messages is all I need
>> for now.
>>
>> If too much text data starts to appear to be processed, I might have to
>> consider an input method that makes the appropriate conversion to an
>> internal representation (and an output one, too).
>>
>> --
>> Cesar Rabak
>
> If you should find yourself in need of such conversion routines, consider using the ICU (IBM package for such things) bindings in the Mango library, see http://www.dsource.org/projects/mango

Thank you, Lars.
Apr 08 2007