digitalmars.D.learn - ANSI - output with phobos

me <me nospamusa.com> writes:

for(char c = 0; c < c.max; c++)
    writefln(c);

In a not too distant past the above code could produce the entire ANSI table,
however this is not the case today. Today it peters out at 127 and any code
beyond that cannot be displayed. The error message produced is:

  Error: 4invalid UTF-8 sequence

Please provide some guidance on how to accomplish this in present D.

Thanks,
Drew

Apr 03 2007

me <me nospamusa.com> writes:

me Wrote:

 for(char c = 0; c < c.max; c++)
     writefln(c);
 
 In a not too distant past the above code could produce the entire ANSI table,
however this is not the case today. Today it peters out at 127 and any code
beyond that cannot be displayed. The error message produced is:
 
   Error: 4invalid UTF-8 sequence
 
 Please provide some guidance on how to accomplish this in present D.
 
 Thanks,
 Drew


First let me apologize for the double post.

I am aware that printf() can still be used to achieve the desired result.
However, I'm interested in accomplishing this through writef()/writefln().

Thanks again,
Drew

Apr 03 2007

Derek Parnell <derek nomail.afraid.org> writes:

On Tue, 03 Apr 2007 20:26:49 -0400, me wrote:

 me Wrote:
 
 for(char c = 0; c < c.max; c++)
     writefln(c);
 
 In a not too distant past the above code could produce the entire ANSI table,
however this is not the case today. Today it peters out at 127 and any code
beyond that cannot be displayed. The error message produced is:
 
   Error: 4invalid UTF-8 sequence
 
 Please provide some guidance on how to accomplish this in present D.
 
 Thanks,
 Drew

 

You seem to be wanting to display the characters of the console's current
code-page.

 I am aware that printf() can still be used to achieve the desired result.

Yes, because it's a C routine and not D.

So I guess, the issue you are trying to resolve is how to convert code-page
characters into UTF-8 form. Character values 128-255 are displayed on the
Windows console using the console's current code-page to select the
appropriate glyph. To get the same glyph to display using Unicode (which is
the only character set that D supports) would mean that you have to set the
console to a Unicode "code-page" and manually convert the character values
from the code-page you were assuming, to the equivalent Unicode value. 

Not a trivial task at all.

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Justice for David Hicks!"
4/04/2007 11:05:14 AM

Apr 03 2007

Deewiant <deewiant.doesnotlike.spam gmail.com> writes:

me wrote:
 for(char c = 0; c < c.max; c++)
     writefln(c);

 In a not too distant past the above code could produce the entire ANSI table,
however this is not the case today. Today it peters out at 127 and any code
beyond that cannot be displayed. The error message produced is:

   Error: 4invalid UTF-8 sequence

 Please provide some guidance on how to accomplish this in present D.

 
 
 First let me apologize for the double post.
 
 I am aware that printf() can still be used to achieve the desired result.
However, I'm interested in accomplishing this through writef()/writefln().
 

Not possible. Just use the C library, writing a wrapper around it if you don't
want to worry about whether strings are zero-terminated all the time.
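
A minimal sketch of such a wrapper, assuming a current D compiler where the C
bindings live in core.stdc.stdio (at the time of this thread the module was
std.c.stdio). The name writeRaw is hypothetical. It passes a length-carrying D
slice to C's fwrite, so no zero terminator is needed and no UTF-8 validation
takes place:

```d
import core.stdc.stdio : fwrite, stdout;

/// Hypothetical helper: writes the bytes of a D slice to C's stdout unchanged.
/// Returns the number of bytes that fwrite reports as written.
size_t writeRaw(const(ubyte)[] data)
{
    return fwrite(data.ptr, 1, data.length, stdout);
}

void main()
{
    // Fill a table with the byte values 128..255.
    ubyte[128] table;
    foreach (i, ref b; table)
        b = cast(ubyte)(128 + i);

    writeRaw(table[]);
    writeRaw(cast(const(ubyte)[]) "\n");
}
```

Which glyphs appear depends entirely on the console's current code page.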

-- 
Remove ".doesnotlike.spam" from the mail address.

Apr 04 2007

Derek Parnell <derek nomail.afraid.org> writes:

On Tue, 03 Apr 2007 20:16:06 -0400, me wrote:

 for(char c = 0; c < c.max; c++)
     writefln(c);
 
 In a not too distant past the above code could produce the entire ANSI table,
however this is not the case today. Today it peters out at 127 and any code
beyond that cannot be displayed. The error message produced is:
 
   Error: 4invalid UTF-8 sequence
 
 Please provide some guidance on how to accomplish this in present D.
 

Values above 127 and below 256 are not valid UTF-8 on their own, and the
function 'writefln' expects 'char' values to be UTF-8. So, to do what you
want, you must either not use writefln or not use 'char' types.

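import std.stdio;

void main()
{
    // Iterate with an int so that 255 is included; "c < c.max" stops at 254.
    for (int i = 0; i <= ubyte.max; i++)
    {
        ubyte c = cast(ubyte) i;
        // Only 0-127 are valid as single-byte UTF-8, so only those are shown
        // as characters; every value is also printed as a number.
        if (c <= 127) writef("'%s' ", cast(char) c);
        writefln("%s", c);
    }
}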

-- 
Derek
(skype: derek.j.parnell)
Melbourne, Australia
"Justice for David Hicks!"
4/04/2007 10:22:49 AM

Apr 03 2007

Juan Jose Comellas <jcomellas gmail.com> writes:

The problem is that a 'char' holds a single UTF-8 code unit (one *byte*),
not a whole *character*. A character in UTF-8 is composed of 1 to 4 bytes,
and not every byte value is valid on its own: a byte above 127 is only ever
part of a multi-byte sequence. You have two options: 1) use the wchar type
(the Latin 1/ISO8859-1 character set is very similar to ANSI and each of its
characters fits in one 2-byte UTF-16 code unit); 2) manually convert the
'ANSI' value into UTF-8.

For more information I suggest reading this:

http://en.wikipedia.org/wiki/Utf-8
http://en.wikipedia.org/wiki/Utf-16
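
A rough sketch of the manual conversion, under the assumption that the input
bytes are Latin-1 (ISO 8859-1). Latin-1 values equal the first 256 Unicode
code points, so each byte can be cast to dchar and appended to a string, which
encodes it as UTF-8. The name latin1ToUTF8 is hypothetical, and the code
targets a current D compiler. Note that Windows-1252 differs from Latin-1 in
the range 0x80-0x9F, and console code pages such as 437 or 850 would need a
real lookup table instead of a cast:

```d
import std.stdio;

/// Hypothetical helper: converts Latin-1 bytes to a UTF-8 string.
string latin1ToUTF8(const(ubyte)[] data)
{
    string result;
    foreach (b; data)
        result ~= cast(dchar) b; // appending a dchar encodes it as UTF-8
    return result;
}

void main()
{
    // Printable upper half of Latin-1: 0xA0..0xFF.
    ubyte[] upper;
    foreach (i; 0xA0 .. 0x100)
        upper ~= cast(ubyte) i;

    // The terminal must be set to UTF-8 for these to display correctly.
    writeln(latin1ToUTF8(upper));
}
```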


me wrote:

 for(char c = 0; c < c.max; c++)
     writefln(c);
 
 In a not too distant past the above code could produce the entire ANSI
 table, however this is not the case today. Today it peters out at 127 and
 any code beyond that cannot be displayed. The error message produced is:
 
   Error: 4invalid UTF-8 sequence
 
 Please provide some guidance on how to accomplish this in present D.
 
 Thanks,
 Drew

Apr 04 2007

Daniel Keep <daniel.keep.lists gmail.com> writes:

Juan Jose Comellas wrote:
 The problem is that a 'char' holds a single UTF-8 code unit (one *byte*),
 not a whole *character*. A character in UTF-8 is composed of 1 to 4 bytes,
 and not every byte value is valid on its own: a byte above 127 is only ever
 part of a multi-byte sequence. You have two options: 1) use the wchar type
 (the Latin 1/ISO8859-1 character set is very similar to ANSI and each of its
 characters fits in one 2-byte UTF-16 code unit); 2) manually convert the
 'ANSI' value into UTF-8.
 
 For more information I suggest reading this:
 
 http://en.wikipedia.org/wiki/Utf-8
 http://en.wikipedia.org/wiki/Utf-16

Here's another one (shameless plug):

http://www.prowiki.org/wiki4d/wiki.cgi?DanielKeep/TextInD

	-- Daniel

 me wrote:
 
 for(char c = 0; c < c.max; c++)
     writefln(c);

 In a not too distant past the above code could produce the entire ANSI
 table, however this is not the case today. Today it peters out at 127 and
 any code beyond that cannot be displayed. The error message produced is:

   Error: 4invalid UTF-8 sequence

 Please provide some guidance on how to accomplish this in present D.

 Thanks,
 Drew

 

-- 
int getRandomNumber()
{
    return 4; // chosen by fair dice roll.
              // guaranteed to be random.
}

http://xkcd.com/

v2sw5+8Yhw5ln4+5pr6OFPma8u6+7Lw4Tm6+7l6+7D
i28a2Xs3MSr2e4/6+7t4TNSMb6HTOp5en5g6RAHCP  http://hackerkey.com/

Apr 04 2007

Deewiant <deewiant.doesnotlike.spam gmail.com> writes:

Juan Jose Comellas wrote:
 The problem is that a 'char' holds a single UTF-8 code unit (one *byte*),
 not a whole *character*. A character in UTF-8 is composed of 1 to 4 bytes,
 and not every byte value is valid on its own: a byte above 127 is only ever
 part of a multi-byte sequence. You have two options: 1) use the wchar type
 (the Latin 1/ISO8859-1 character set is very similar to ANSI and each of its
 characters fits in one 2-byte UTF-16 code unit); 2) manually convert the
 'ANSI' value into UTF-8.

3) Use ubyte (or use char, but be careful about what functions you pass
non-UTF-8 chars to), and print using the C standard library.

One might have to output a string without knowing its encoding, thus making it
impossible to convert it to a UTF encoding reliably.

-- 
Remove ".doesnotlike.spam" from the mail address.

Apr 04 2007

Don Clugston <dac nospam.com.au> writes:

Deewiant wrote:
 One might have to output a string without knowing its encoding, thus making it
 impossible to convert it to a UTF encoding reliably.

Then how can you know how to output it?

Apr 05 2007

Deewiant <deewiant.doesnotlike.spam gmail.com> writes:

Don Clugston wrote:
 Deewiant wrote:
 One might have to output a string without knowing its encoding, thus making it
 impossible to convert it to a UTF encoding reliably.

 
 Then how can you know how to output it?

Just pass the bytes to the console, and let the user worry about how it's
displayed.

If you write "\xe4" to a file, you expect the file to contain the byte 0xE4. If
you write it in a console, the console should display the character in the
current character set which 0xE4 is mapped to.
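
For readers on a current D compiler, a sketch of the same idea without the C
library: std.stdio's File.rawWrite writes a buffer to the stream byte for byte
and does no UTF-8 validation. This API postdates the Phobos discussed in this
thread, and highBytes is a hypothetical helper:

```d
import std.stdio;

/// Hypothetical helper: builds the byte values 128..255.
ubyte[] highBytes()
{
    ubyte[] bytes;
    foreach (i; 128 .. 256)
        bytes ~= cast(ubyte) i;
    return bytes;
}

void main()
{
    // The bytes reach the console unchanged; the console's current
    // character set decides which glyphs are shown.
    stdout.rawWrite(highBytes());
    stdout.rawWrite("\n");
}
```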

-- 
Remove ".doesnotlike.spam" from the mail address.

Apr 05 2007
