digitalmars.D.learn - Displaying non UTF-8 8 bit character codes with writefln()
- Graham (20/20) Oct 04 2007 Is there an easy way of displaying non UTF-8 8 bit codes with writefln()...
- Regan Heath (23/23) Oct 04 2007 Try printf and saving the file as a UTF-8 encoded text file...
- Graham (2/33) Oct 04 2007 Thanks, I was hoping for something more elegant but if all char variable...
- Stewart Gordon (13/19) Oct 05 2007 Why, exactly, are you advocating going back to the printf abomination?
- Graham (8/15) Oct 05 2007 Thanks, that's nice.
- Stewart Gordon (9/16) Oct 05 2007 "Graham >" wrote in message
- Regan Heath (13/31) Oct 05 2007 Well.. there were 2 ways to solve his problem:
- Stewart Gordon (18/28) Oct 05 2007 "Regan Heath" wrote in message
- Regan Heath (12/42) Oct 05 2007 Sure, except the OP wanted formatting. End of the day, as long as you
- Stewart Gordon (18/20) Oct 05 2007 "Regan Heath" wrote in message
- Graham (6/6) Oct 04 2007 After searching back a bit further than before I see this was discussed
- Aziz K. (7/13) Oct 04 2007 Hi,
Is there an easy way of displaying non-UTF-8 8-bit codes with writefln()? E.g. code like:

    writefln("elapsed time %.9f \&micro;S", elapsed_time);

on a Windows system displays output like:

    elapsed time 2.598202392 µS

(displayed when running in a cmd.exe window). The µ is character codes 0xC2 0xB5, the UTF-8 encoding of µ. Code like:

    writefln("elapsed time %.9f \u00B5S", elapsed_time);

displays the same, and code like:

    writefln("elapsed time %.9f \xB5S", elapsed_time);

understandably displays the run-time error:

    Error: 4invalid UTF-8 sequence

Trying a WYSIWYG string like:

    writefln("elapsed time %.9f " r"µ" "S", elapsed_time);

displays a compiler error:

    invalid UTF-8 sequence

Is there any simple way to output a non-UTF-8 string containing the B5 character code without the C2 prefix?
Oct 04 2007
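For readers following the byte values in the question, here is a minimal sketch of where the C2 prefix comes from. It assumes a current D2 compiler with Phobos (newer than the 2007 toolchain used in this thread), and `hexBytes` is a made-up helper name. A D `string` holds UTF-8 code units, so U+00B5 occupies two bytes:

```d
import std.stdio : writeln;
import std.format : format;
import std.string : representation;

// Hypothetical helper: render the UTF-8 code units of a string as hex.
string hexBytes(string s)
{
    string result;
    foreach (ubyte b; s.representation)
        result ~= format("%02X ", b);
    return result;
}

void main()
{
    writeln(hexBytes("\u00B5"));   // micro sign, U+00B5: two code units
    writeln(hexBytes("S"));        // plain ASCII: one code unit
}
```

The first line printed is `C2 B5`, which is exactly the pair of bytes a non-UTF-8 console shows as two separate characters.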
Try printf and saving the file as a UTF-8 encoded text file...

--[b5.d]--
import std.stdio;

void main()
{
    printf("\&micro;\n");
    printf("\u00B5\n");
    printf("\xB5\n");  //doesn't output anything
    writefln("µ");
}

Using this source saved as b5.d as a UTF-8 encoded text file (IMPORTANT) I can set my command prompt font to "Lucida Console" and execute the following commands:

E:\D\src\tmp>chcp 65001
Active code page: 65001

E:\D\src\tmp>dmd -run b5.d
µ
µ
µ

The 3rd printf doesn't output anything, not sure why, the others all output the same character.

chcp 65001 changes to UTF-8 code page :)

Regan
Oct 04 2007
Regan Heath Wrote:
> Try printf and saving the file as a UTF-8 encoded text file...
>
> --[b5.d]--
> import std.stdio;
>
> void main()
> {
>     printf("\&micro;\n");
>     printf("\u00B5\n");
>     printf("\xB5\n");  //doesn't output anything
>     writefln("µ");
> }
>
> Using this source saved as b5.d as a UTF-8 encoded text file (IMPORTANT) I can set my command prompt font to "Lucida Console" and execute the following commands:
>
> E:\D\src\tmp>chcp 65001
> Active code page: 65001
>
> E:\D\src\tmp>dmd -run b5.d
> µ
> µ
> µ
>
> The 3rd printf doesn't output anything, not sure why, the others all output the same character.
>
> chcp 65001 changes to UTF-8 code page :)
>
> Regan

Thanks, I was hoping for something more elegant but if all char variables in phobos have to be UTF-8 I guess this is the only way.
Oct 04 2007
"Regan Heath" <regan netmail.co.nz> wrote in message news:fe2uf5$2gsa$1 digitalmars.com...
> Try printf and saving the file as a UTF-8 encoded text file...

Why, exactly, are you advocating going back to the printf abomination?

<snip>
> Using this source saved as b5.d as a UTF-8 encoded text file (IMPORTANT) I can set my command prompt font to "Lucida Console" and execute the following commands:
>
> E:\D\src\tmp>chcp 65001
> Active code page: 65001
<snip>

This misses the point slightly. The user shouldn't have to change the codepage just to get someone else's application to work properly.

What you want is my utility library:
http://pr.stewartsplace.org.uk/d/sutil/

Stewart.

--
My e-mail address is valid but not my primary mailbox. Please keep replies on the 'group where everybody may benefit.
Oct 05 2007
Stewart Gordon Wrote:
> What you want is my utility library:
> http://pr.stewartsplace.org.uk/d/sutil/
>
> Stewart.

Thanks, that's nice.

By the way, I spotted some minor errors on a couple of your documentation pages: ConsoleOutput referring to ConsoleInput in second column on
http://pr.stewartsplace.org.uk/d/sutil/ref/annotated.html
and the subtitle on
http://pr.stewartsplace.org.uk/d/sutil/ref/classsmjg_1_1libs_1_1util_1_1console_1_1ConsoleOutput.html
is ConsoleInput instead of ConsoleOutput
Oct 05 2007
"Graham >" <GC <grahamc001uk nospam-yahoo.co.uk> wrote in message news:fe5cp5$bp$1 digitalmars.com...
<snip>
> By the way, I spotted some minor errors on a couple of your documentation pages: ConsoleOutput referring to ConsoleInput in second column on
> http://pr.stewartsplace.org.uk/d/sutil/ref/annotated.html
> and the subtitle on
> http://pr.stewartsplace.org.uk/d/sutil/ref/classsmjg_1_1libs_1_1util_1_1console_1_1ConsoleOutput.html
> is ConsoleInput instead of ConsoleOutput

Good catch. Also noticed quite a few cases where the automatic removal of words like "The ConsoleInput class" in the brief description hasn't worked.

Stewart.
Oct 05 2007
Stewart Gordon wrote:
> "Regan Heath" <regan netmail.co.nz> wrote in message news:fe2uf5$2gsa$1 digitalmars.com...
>> Try printf and saving the file as a UTF-8 encoded text file...
>
> Why, exactly, are you advocating going back to the printf abomination?

Well.. there were 2 ways to solve his problem:

1. avoid the valid utf-8 character check.
2. make the console display utf-8 correctly.

    printf("%c\n", 230);

    writefln("\u00B5");

or save the file as UTF-8 and use

    writefln("µ");

> <snip>
>> Using this source saved as b5.d as a UTF-8 encoded text file (IMPORTANT) I can set my command prompt font to "Lucida Console" and execute the following commands:
>>
>> E:\D\src\tmp>chcp 65001
>> Active code page: 65001
> <snip>
>
> This misses the point slightly. The user shouldn't have to change the codepage just to get someone else's application to work properly.

Sadly, if the application is outputting UTF-8 you don't have a choice.

> What you want is my utility library:
> http://pr.stewartsplace.org.uk/d/sutil/

Cool. You're converting UTF-8 to the console code page I assume.

Regan
Oct 05 2007
"Regan Heath" <regan netmail.co.nz> wrote in message news:fe5d88$15l$1 digitalmars.com...
<snip>
> 1. avoid the valid utf-8 character check.
> 2. make the console display utf-8 correctly.
>
>     printf("%c\n", 230);

No I gottan't. I could use putchar, puts or OutputStream.writeString for example.

<snip>
>> This misses the point slightly. The user shouldn't have to change the codepage just to get someone else's application to work properly.
>
> Sadly, if the application is outputting UTF-8 you don't have a choice.

But how many DOS or Windows console apps in the real world output UTF-8?

Presumably not many, considering that no versions of DOS and only a few versions of Windows support it. There's also a causal loop in that even modern Windows versions don't come with the console code page set to 65001 by default. I don't know what is likely to break this loop, but I doubt that the restrictiveness of one language's standard library is going to do it.

>> What you want is my utility library:
>> http://pr.stewartsplace.org.uk/d/sutil/
>
> Cool. You're converting UTF-8 to the console code page I assume.

Exactly. (Well, as exactly as is possible under the constraints.)

Stewart.
Oct 05 2007
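The conversion Regan and Stewart agree on here (UTF-8 text to whatever code page the console is using) can be sketched as follows. This is not the sutil implementation, which the thread does not show. It is only an illustration that assumes a current D2 compiler and the Win32 bindings in druntime's `core.sys.windows.windows`. The helper name `toConsoleBytes` is hypothetical.

```d
import std.stdio : stdout;
import std.utf : toUTF16;

version (Windows)
{
    import core.sys.windows.windows;

    // Hypothetical helper: convert UTF-8 text to the bytes of the
    // active console output code page (for example 850 or 437).
    ubyte[] toConsoleBytes(string s)
    {
        wstring w = toUTF16(s);
        if (w.length == 0)
            return null;

        UINT cp = GetConsoleOutputCP();

        // First call: ask how many bytes the converted text needs.
        int n = WideCharToMultiByte(cp, 0, w.ptr, cast(int) w.length,
                                    null, 0, null, null);
        if (n <= 0)
            return null;

        // Second call: perform the conversion into the buffer.
        auto buf = new ubyte[n];
        WideCharToMultiByte(cp, 0, w.ptr, cast(int) w.length,
                            cast(char*) buf.ptr, n, null, null);
        return buf;
    }
}
else
{
    // Assumption: outside Windows the terminal accepts UTF-8 unchanged.
    ubyte[] toConsoleBytes(string s)
    {
        return cast(ubyte[]) s.dup;
    }
}

void main()
{
    // rawWrite sends the converted bytes without any UTF-8 validation.
    stdout.rawWrite(toConsoleBytes("elapsed time 2.598202392 \u00B5S\n"));
}
```

On a console set to code page 850 or 437, the micro sign should come out as the single byte 0xE6, which is the 230 that Regan passes to printf above. Characters the code page cannot represent are replaced by a default character, which is presumably what "as exactly as is possible under the constraints" refers to.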
Stewart Gordon wrote:
> "Regan Heath" <regan netmail.co.nz> wrote in message news:fe5d88$15l$1 digitalmars.com...
> <snip>
>> 1. avoid the valid utf-8 character check.
>> 2. make the console display utf-8 correctly.
>>
>>     printf("%c\n", 230);
>
> No I gottan't. I could use putchar, puts or OutputStream.writeString for example.

Sure, except the OP wanted formatting. End of the day, as long as you know what you're doing using printf isn't going to kill you.

> <snip>
>>> This misses the point slightly. The user shouldn't have to change the codepage just to get someone else's application to work properly.
>>
>> Sadly, if the application is outputting UTF-8 you don't have a choice.
>
> But how many DOS or Windows console apps in the real world output UTF-8?

Everything written in D using writefln from phobos ;)

Even if you're only outputting ASCII characters (a subset of UTF-8 - as I'm sure you know) you have the ability to output the full range of UTF-8 codepoints and really we need a console which can handle that.

> Presumably not many, considering that no versions of DOS and only a few versions of Windows support it. There's also a causal loop in that even modern Windows versions don't come with the console code page set to 65001 by default. I don't know what is likely to break this loop, but I doubt that the restrictiveness of one language's standard library is going to do it.

True. I wonder what the vista console defaults to? Are they still using local code pages or are they using UTF-8 or UTF-16 (perhaps more likely) :)

>>> What you want is my utility library:
>>> http://pr.stewartsplace.org.uk/d/sutil/
>>
>> Cool. You're converting UTF-8 to the console code page I assume.
>
> Exactly. (Well, as exactly as is possible under the constraints.)

Regan
Oct 05 2007
"Regan Heath" <regan netmail.co.nz> wrote in message news:fe5g9k$5i6$1 digitalmars.com...
<snip>
> True. I wonder what the vista console defaults to? Are they still using local code pages or are they using UTF-8 or UTF-16 (perhaps more likely)
<snip>

Mine defaults to 850. (Strange - British installations of MS-DOS back in the day always defaulted to 437 as far as my experience goes. Sometimes under Win9x, you would get the anomaly of 437 in full screen mode, but a console font in windowed mode that's set up for 850.)

But having it use UTF-16 would break far too many programs. There is, however, a function ReadConsoleW, which reads characters in UTF-16 regardless of the active code page. But it doesn't work if stdin is redirected. But I also found that ReadFile doesn't handle UTF-8 console input properly. Look at the way my library uses the two functions, each to get around the problems with the other depending on circumstance.

Stewart.
Oct 05 2007
After searching back a bit further than before I see this was discussed in April and the answer was to use printf for the 8 bit string. Something like:

    writef("elapsed time %.9f", elapsed_time);
    printf(" \xB5S\n");

does work, but if anybody has a more elegant solution please let me know.
Oct 04 2007
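The same split between formatted text and a raw byte can be written without printf. This is a minimal sketch assuming a current D2 compiler with Phobos, where `File.rawWrite` sends bytes to stdout with no UTF-8 check. `microByte` is a hypothetical helper, included because the right byte depends on the console code page: 0xB5 in Windows-1252 and ISO 8859-1, but 0xE6 (decimal 230) in the OEM code pages 437 and 850.

```d
import std.stdio : stdout, writef, writeln;

// Hypothetical helper: the single byte that means µ in a few common
// 8-bit code pages. Returns '?' (0x3F) for anything not listed here.
ubyte microByte(uint codePage)
{
    switch (codePage)
    {
        case 437, 850:    return 0xE6;  // OEM code pages
        case 1252, 28591: return 0xB5;  // Windows-1252, ISO 8859-1
        default:          return 0x3F;
    }
}

void main()
{
    double elapsed_time = 2.598202392;  // sample value from the first post

    writef("elapsed time %.9f ", elapsed_time);

    // Assumption: the console uses code page 1252, matching the \xB5
    // in the post above. Pass 850 or 437 for a default OEM console.
    ubyte[1] raw = [microByte(1252)];
    stdout.rawWrite(raw[]);

    writeln("S");
}
```

Like the printf version, this hard-codes knowledge of the console encoding into the program, so it only displays correctly on a console set to the assumed code page.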
Graham wrote:
> After searching back a bit further than before I see this was discussed in April and the answer was to use printf for the 8 bit string. Something like:
>
>     writef("elapsed time %.9f", elapsed_time);
>     printf(" \xB5S\n");
>
> does work, but if anybody has a more elegant solution please let me know.

Hi,

There's a better solution. You could switch to the Tango library which uses WriteConsoleW() internally to correctly write Unicode characters on the Windows console.

Regards,
Aziz
Oct 04 2007
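For anyone who wants to see the WriteConsoleW() route without switching libraries, here is a rough sketch. It is not Tango's code. It assumes a current D2 compiler and the Win32 bindings in druntime's `core.sys.windows.windows`, and `consoleWrite` is a hypothetical helper name. The text is converted to UTF-16 and handed straight to the console, so the active code page does not matter.

```d
import std.stdio : stdout;
import std.utf : toUTF16;

version (Windows) import core.sys.windows.windows;

// Hypothetical helper: print text via WriteConsoleW when stdout is a
// real console, otherwise fall back to ordinary UTF-8 output.
void consoleWrite(string s)
{
    version (Windows)
    {
        HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
        DWORD mode;

        // GetConsoleMode succeeds only for a console handle, so it
        // doubles as a check that output is not redirected.
        if (GetConsoleMode(h, &mode))
        {
            wstring w = toUTF16(s);
            DWORD written;
            WriteConsoleW(h, w.ptr, cast(DWORD) w.length, &written, null);
            return;
        }
    }

    // Redirected output, or a non-Windows system: write UTF-8 as usual.
    stdout.write(s);
}

void main()
{
    consoleWrite("elapsed time 2.598202392 \u00B5S\n");
}
```

The fallback matters because WriteConsoleW only works on console handles. This is the output-side counterpart of the ReadConsoleW limitation with redirected stdin that Stewart describes earlier in the thread. The console font still has to contain the character for it to display.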