
digitalmars.D - telling D to ignore utf-8

"Ameer Armaly" <ameer_armaly hotmail.com> writes:
Hi.
Is there any way to tell the D compiler to treat *all* text as ASCII, with 
no consideration of UTF-8?  I will be receiving some text that it thinks 
contains bad UTF-8 sequences.

-- 
---
Life is either tragedy or comedy.
 Usually it's your choice. You can whine or you can laugh.
--Animorphs 
Jul 13 2005
Stewart Gordon <smjg_1998 yahoo.com> writes:
Ameer Armaly wrote:
 Hi.
 Is there any way to tell the D compiler to treat *all* text as ASCII, with 
 no consideration of UTF-8?  I will be receiving some text that it thinks 
 contains bad UTF-8 sequences.

I'm not sure what you mean. ASCII is a subset of UTF-8, so if your code file is ASCII then it won't have any UTF-8 sequences, bad or otherwise.

Stewart.

-- 
-----BEGIN GEEK CODE BLOCK-----
Version: 3.1
GCS/M d- s:- a->--- UB P+ L E W++ N+++ o K- w++ O? M V? PS- PE- Y? PGP- t- 5? X? R b DI? D G e++>++++ h-- r-- !y
------END GEEK CODE BLOCK------

My e-mail is valid but not my primary mailbox. Please keep replies on the 'group where everyone may benefit.
Jul 13 2005
AJG <AJG_member pathlink.com> writes:
Hi,

Maybe he means binary, instead of ASCII. Do you? In that case UTF-8 makes no
difference, since binary data will likely come out fubared as UTF-8 _or_ ASCII.
The solution is to use ubyte instead of char and co.

--AJG.
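
A minimal sketch of the ubyte suggestion above, assuming a current D compiler with Phobos (the thread predates it, so details may differ on older toolchains). The helper name `isValidUtf8` and the sample bytes are made up for illustration; `std.utf.validate` is the Phobos routine that throws `UTFException` on malformed input.

```d
import std.stdio : writeln;
import std.utf : validate, UTFException;

// Hypothetical helper: reports whether a byte buffer happens to be valid UTF-8.
bool isValidUtf8(const(ubyte)[] data)
{
    try
    {
        // Reinterpret the bytes as UTF-8 code units without copying.
        validate(cast(const(char)[]) data);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // Illustrative data: 0xFF never occurs in well-formed UTF-8.
    ubyte[] raw = [0x48, 0x69, 0xFF, 0xFE];

    // Held as ubyte[], the data is just numbers; no decoding is attempted.
    foreach (b; raw)
        writeln(b);

    // Only an explicit check treats the bytes as text.
    writeln(isValidUtf8(raw));
}
```

Keeping incoming data as `ubyte[]` means nothing in the program assumes it is text until you explicitly ask for that.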

In article <db3lj8$o5q$1 digitaldaemon.com>, Stewart Gordon says...
Ameer Armaly wrote:
 Hi.
 Is there any way to tell the D compiler to treat *all* text as ASCII, with 
 no consideration of UTF-8?  I will be receiving some text that it thinks 
 contains bad UTF-8 sequences.

I'm not sure what you mean. ASCII is a subset of UTF-8, so if your code file is ASCII then it won't have any UTF-8 sequences, bad or otherwise.

Stewart.

Jul 13 2005
Stewart Gordon <smjg_1998 yahoo.com> writes:
AJG wrote:
 Hi,
 
 Maybe he means binary, instead of ASCII. Do you? In that case UTF-8 
 makes no difference, since binary data will likely come out fubared 
 as UTF-8 _or_ ASCII. The solution is to use ubyte instead of char and 
 co.

Notice that the OP talks about telling the _compiler_ to treat all text as ASCII. By which I understand treating source code files as being in ASCII. Since ASCII is a subset of UTF-8, any code file containing only ASCII characters is implicitly treated as ASCII.

Stewart.
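
As a rough illustration of the point above: a file is pure ASCII exactly when every byte is in the range 0x00 to 0x7F, so the bytes a compiler might reject as bad UTF-8 can be located with a simple scan. This is only a sketch assuming a current D compiler; the helper name `firstNonAscii` and the sample bytes are hypothetical.

```d
import std.stdio : writeln;

// Hypothetical helper: index of the first byte outside the ASCII range
// (0x00 to 0x7F), or -1 if every byte is ASCII.
ptrdiff_t firstNonAscii(const(ubyte)[] data)
{
    foreach (i, b; data)
    {
        if (b > 0x7F)
            return cast(ptrdiff_t) i;
    }
    return -1;
}

void main()
{
    // Illustrative bytes: "caf" followed by 0xE9, which is a Latin-1
    // e-acute but not a complete UTF-8 sequence on its own.
    ubyte[] source = [0x63, 0x61, 0x66, 0xE9];

    writeln(firstNonAscii(source));
}
```

Running such a scan over a source file (for example, over the bytes returned by `std.file.read`) would show where any non-ASCII text sits.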
Jul 15 2005