digitalmars.D.learn - unicode characters are not printed correctly on the windows command
- moth (15/15) Dec 21 2019 hi all.
- rikki cattermole (12/34) Dec 21 2019 This is not nonsense. This is the correct solution if that is what you
- Mike Parker (11/18) Dec 22 2019 Yes, and it's not just D programs. And setting the code page
- Adam D. Ruppe (8/9) Dec 22 2019 No, Phobos is *clearly* in the wrong here. There is a proper fix.
- Steven Schveighoffer (6/11) Dec 22 2019 Phobos doesn't call the wrong function, libc does. Phobos uses fwrite
- Adam D. Ruppe (20/22) Dec 22 2019 There is allegedly a way to set fwrite to do the translations on
- Steven Schveighoffer (35/54) Dec 22 2019 Looks like you need to switch to "wprintf". I'm not sure, but I think we...
- Symphony (11/24) Dec 22 2019 I don't have the ingenuity, intelligence, nor experience that
- Steven Schveighoffer (27/52) Dec 23 2019 I really appreciate the enthusiasm here, but at the risk of being
- bachmeier (4/13) Dec 23 2019 Just out of curiosity, what would be the advantage of having
- Steven Schveighoffer (6/20) Dec 23 2019 It means that all of Phobos can take advantage of the better performance...
- Symphony (9/38) Dec 23 2019 Pardon my ignorance, but wouldn't the inclusion of a std.io (e.g.
- Steven Schveighoffer (18/26) Dec 23 2019 Well, that's certainly a lot easier project. But one might question
- H. S. Teoh (20/23) Dec 23 2019 [...]
- Steven Schveighoffer (14/36) Dec 23 2019 That means we have to buffer separately, which means we have a problem
- Adam D. Ruppe (10/12) Dec 23 2019 Or simply don't buffer. Any call you get, flush the C buffer and
- Steven Schveighoffer (9/20) Dec 23 2019 Unbuffered output would perform badly, especially if you are writing
- H. S. Teoh (8/11) Dec 23 2019 [...]
- Adam D. Ruppe (6/9) Dec 22 2019 It isn't the language/compiler per se, it is the library calling
hi all.

been learning d for the last few years but suddenly realised...

when i use this code:

writeln('♥');

the output displayed on the windows command line is "ÔÖÑ" [it works fine when piped directly into a text file, however].

i've looked about in this forum, but all that i could find was people in 2016[!] saying the codepage had to be altered - clearly nonsense, since Rust [which i am also learning] has no problem whatsoever displaying "♥".

is there any function i can call or setting i can adjust to get D to do the same, or do i have to wait for something to be fixed in the language / compiler itself?

best regards

moth [su.angel-island.zone]
Dec 21 2019
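The "ÔÖÑ" in the question is consistent with the three UTF-8 bytes of '♥' being displayed one at a time through a legacy OEM code page. The sketch below only shows the bytes; the helper name `utf8Bytes` is made up for illustration, and the mapping of 0xE2, 0x99, 0xA5 to Ô, Ö, Ñ assumes code page 850, which the post does not state.

```d
import std.stdio : writefln;
import std.string : representation;

/// Returns the UTF-8 code units that make up `s`.
/// (Hypothetical helper name, not part of Phobos.)
immutable(ubyte)[] utf8Bytes(string s)
{
    return s.representation;
}

void main()
{
    // U+2665 is encoded as three bytes in UTF-8.
    writefln("%(%02X %)", utf8Bytes("♥")); // prints: E2 99 A5
}
```

Piping to a file works because those same three bytes land in the file untouched; only the console's interpretation of them differs.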
On 22/12/2019 7:11 PM, moth wrote:
> hi all. been learning d for the last few years but suddenly realised... when i use this code: writeln('♥'); the output displayed on the windows command line is "ÔÖÑ" [it works fine when piped directly into a text file, however]. i've looked about in this forum, but all that i could find was people in 2016[!] saying the codepage had to be altered - clearly nonsense, since Rust [which i am also learning] has no problem whatsoever displaying "♥".

This is not nonsense. This is the correct solution if that is what you intend for your program to do.

Not everybody will want this. They may have set the code page themselves in some way. It may not have even occurred within a D application!

It's best we leave it as the default to play nice with other applications and libraries.

> is there any function i can call or setting i can adjust to get D to do the same, or do i have to wait for something to be fixed in the language / compiler itself? best regards moth [su.angel-island.zone]

Not a bug. This is a known issue on the Windows side for people new to developing natively for it.

I just checked the terminal emulator I use, ConEmu, and yeah, it doesn't have to do anything settings-wise to make Unicode "just work". It's conhost with its legacy which is what you are facing.
Dec 21 2019
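For readers who do want the code-page route described above, a minimal sketch follows. It assumes the Win32 functions `GetConsoleOutputCP` and `SetConsoleOutputCP` as declared in druntime's `core.sys.windows.wincon`; `withUtf8Console` is a hypothetical helper name, not a Phobos function. It restores the previous code page afterwards, in the spirit of playing nice with whatever else shares the console.

```d
import std.stdio : stdout, writeln;

version (Windows)
{
    import core.sys.windows.wincon : GetConsoleOutputCP, SetConsoleOutputCP;
}

/// Runs `work` with the console output code page set to UTF-8 and
/// puts the previous code page back afterwards.
void withUtf8Console(scope void delegate() work)
{
    version (Windows)
    {
        enum uint utf8CodePage = 65001; // numeric value of CP_UTF8
        immutable previous = GetConsoleOutputCP();
        SetConsoleOutputCP(utf8CodePage);
        scope (exit)
        {
            stdout.flush(); // push buffered bytes out before switching back
            SetConsoleOutputCP(previous);
        }
        work();
    }
    else
    {
        work(); // other platforms: nothing to change
    }
}

void main()
{
    withUtf8Console({ writeln('♥'); });
}
```

Whether the glyph actually renders still depends on the console font, as noted elsewhere in the thread.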
On Sunday, 22 December 2019 at 06:25:42 UTC, rikki cattermole wrote:
> On 22/12/2019 7:11 PM, moth wrote:
>> is there any function i can call or setting i can adjust to get D to do the same, or do i have to wait for something to be fixed in the language / compiler itself?
> Not a bug. This is a known issue on the Windows side for people new to developing natively for it.

Yes, and it's not just D programs. And setting the code page isn't always perfect, as it matters which font cmd is configured to use. Google for "windows command prompt unicode output".

MS has updated the command prompt to support Unicode, but I don't know how to use it:

https://devblogs.microsoft.com/commandline/windows-command-line-unicode-and-utf-8-output-text-buffer/

If you're on Windows 10, there's also Windows Terminal, which was released on the app store in June:

https://devblogs.microsoft.com/commandline/windows-terminal-preview-v0-7-release/
Dec 22 2019
On Sunday, 22 December 2019 at 06:25:42 UTC, rikki cattermole wrote:
> Not a bug.

No, Phobos is *clearly* in the wrong here. There is a proper fix.

http://dpldocs.info/this-week-in-d/Blog.Posted_2019_11_25.html#unicode

Use the correct WriteConsoleW api instead of the ancient ascii api. WriteConsoleW works without changing any settings. (on old versions of Windows, you may have to install fonts to display it, but new ones come with it all preinstalled).
Dec 22 2019
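As a rough illustration of calling WriteConsoleW directly (this is not the code from the linked blog post), the sketch below converts a D string to UTF-16 with `std.utf.toUTF16` and hands it to the console. `writeToConsole` is a made-up name, and the sketch assumes standard output really is a console, since WriteConsoleW fails on handles redirected to a file or pipe.

```d
import std.utf : toUTF16;

version (Windows) import core.sys.windows.windows;

/// Writes `text` to the console as UTF-16 via WriteConsoleW.
/// Hypothetical helper; assumes stdout has not been redirected.
void writeToConsole(string text)
{
    version (Windows)
    {
        const wide = text.toUTF16;
        DWORD written;
        WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE),
            wide.ptr, cast(DWORD) wide.length, &written, null);
    }
    else
    {
        import std.stdio : write;
        write(text); // non-Windows fallback so the sketch still runs
    }
}

void main()
{
    writeToConsole("♥\n");
}
```

No code page is touched here; the console receives UTF-16 code units rather than bytes that need interpreting.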
On 12/22/19 8:40 AM, Adam D. Ruppe wrote:
> On Sunday, 22 December 2019 at 06:25:42 UTC, rikki cattermole wrote:
>> Not a bug.
> No, Phobos is *clearly* in the wrong here. There is a proper fix.

Phobos doesn't call the wrong function, libc does. Phobos uses fwrite for output.

> http://dpldocs.info/this-week-in-d/Blog.Posted_2019_11_25.html#unicode

You need to address that in DMC. I wonder, does MSVCRT have the same problem?

-Steve
Dec 22 2019
On Sunday, 22 December 2019 at 18:41:16 UTC, Steven Schveighoffer wrote:
> Phobos doesn't call the wrong function, libc does. Phobos uses fwrite for output.

There is allegedly a way to set fwrite to do the translations on MSVCRT:

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=vs-2019

but trying it here it throws invalid parameter exception so idk.

Regardless, I'm pretty well of the opinion that fwrite is the wrong thing to do anyway. fwrite writes bytes to a file, but we want to write strings to the console. There's other functions that do that.

There is the worry of mixing stuff from C and keeping the buffer consistent, but it could always just flush() before doing its thing too. Or maybe even merge the buffers, idk what the MS runtime supports for that.

or maybe i'm missing something and _setmode is a viable solution.

But whatever we do, passing the buck isn't solving anything. Windows has supported Unicode console output since NT 4.0 in 1996.. just have to call the right function, and whether it is Phobos calling it or druntime or the CRT, someone just needs to do it!
Dec 22 2019
On 12/22/19 5:04 PM, Adam D. Ruppe wrote:
> On Sunday, 22 December 2019 at 18:41:16 UTC, Steven Schveighoffer wrote:
>> Phobos doesn't call the wrong function, libc does. Phobos uses fwrite for output.
> There is allegedly a way to set fwrite to do the translations on MSVCRT: https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/setmode?view=vs-2019

Looks like you need to switch to "wprintf". I'm not sure, but I think we rely only on fwrite, for which there is no "w" equivalent.

> but trying it here it throws invalid parameter exception so idk.

Not surprised ;)

Here's a cool feature of Windows:

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fwide?view=vs-2019

Basically does nothing, all parameters ignored (and yes, we use this function in Phobos, assuming it does something).

But let me just say, the fact that there is some "mode" you have to set, like binary mode, that makes unicode work is unsettling. I hate libc streams...

> Regardless, I'm pretty well of the opinion that fwrite is the wrong thing to do anyway. fwrite writes bytes to a file, but we want to write strings to the console. There's other functions that do that.

Preaching to the choir here. I wanted to rip out libc reliance a decade ago.

> There is the worry of mixing stuff from C and keeping the buffer consistent, but it could always just flush() before doing its thing too. Or maybe even merge the buffers, idk what the MS runtime supports for that.

This is the crux. Some people gotta have their printf. And if you do different types of buffered streams, the result even from single-threaded output looks like garbage.

The only solution is to wrap FILE *. And I do mean only. I looked into trying to hook the buffers. There's no reliable way without knowing all the implementation details.

> or maybe i'm missing something and _setmode is a viable solution.

_setmode is on a file descriptor. That already is a red flag to me, as there are no file descriptors in the OS. Windows uses handles. So this has some weird library "translation" happening underneath. Ugh.

> But whatever we do, passing the buck isn't solving anything. Windows has supported Unicode console output since NT 4.0 in 1996.. just have to call the right function, and whether it is Phobos calling it or druntime or the CRT, someone just needs to do it!

Hey, you can always just call the function yourself! Just make an output stream that writes with the right function, and then you can use formattedWrite instead of writef.

To fix Phobos, we just(!) need to remove libc as the underlying stream implementation. I had at one point agreement from Walter to make a "backwards-compatible-ish" mechanism for file/streams. But it's not pretty, and was convoluted. At the time, I was struggling getting what would become iopipe to be usable on its own, and I eventually quit worrying about that aspect of it.

We have the basic building blocks with https://github.com/MartinNowak/io and https://github.com/schveiguy/iopipe. It would be cool to get this into Phobos, but it's a lot of work.

I bet Rust just skips libc altogether.

-Steve
Dec 22 2019
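The "make an output stream and use formattedWrite" suggestion could look roughly like the sketch below. `ConsoleSink`, `pending` and `flush` are hypothetical names invented here, not Phobos or iopipe APIs. The sink collects formatted text in a buffer and converts it in one piece, on the assumption that converting each small chunk separately could split a multi-byte UTF-8 sequence.

```d
import std.array : Appender;
import std.format : formattedWrite;
import std.utf : toUTF16;

version (Windows) import core.sys.windows.windows;

/// Hypothetical output range: collects UTF-8 text and hands it to
/// the console in one piece when `flush` is called.
struct ConsoleSink
{
    private Appender!(char[]) buffer;

    /// Output-range primitive used by formattedWrite.
    void put(const(char)[] chunk)
    {
        buffer.put(chunk);
    }

    /// Text collected so far and not yet written.
    const(char)[] pending()
    {
        return buffer.data;
    }

    /// Writes the collected text and empties the buffer.
    void flush()
    {
        version (Windows)
        {
            // Assumes stdout is a console; WriteConsoleW fails otherwise.
            const wide = buffer.data.toUTF16;
            DWORD written;
            WriteConsoleW(GetStdHandle(STD_OUTPUT_HANDLE),
                wide.ptr, cast(DWORD) wide.length, &written, null);
        }
        else
        {
            import std.stdio : write;
            write(buffer.data); // non-Windows fallback
        }
        buffer.clear();
    }
}

void main()
{
    ConsoleSink sink;
    formattedWrite(sink, "%s x %d\n", "♥", 3);
    sink.flush();
}
```

Because this sink keeps its own buffer, output mixed with printf can come out in a different order, which is the interleaving concern raised in the post.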
On Sunday, 22 December 2019 at 22:47:43 UTC, Steven Schveighoffer wrote:
> To fix Phobos, we just(!) need to remove libc as the underlying stream implementation. I had at one point agreement from Walter to make a "backwards-compatible-ish" mechanism for file/streams. But it's not pretty, and was convoluted. At the time, I was struggling getting what would become iopipe to be usable on its own, and I eventually quit worrying about that aspect of it. We have the basic building blocks with https://github.com/MartinNowak/io and https://github.com/schveiguy/iopipe. It would be cool to get this into Phobos, but it's a lot of work. I bet Rust just skips libc altogether. -Steve

I don't have the ingenuity, intelligence, nor experience that many of you possess, but I have *a lot* of time on my hands for something like this. I assume I should start with std.stdio's source code and the aforementioned projects' source code, but some guidance on this would be very helpful, if not needed. D has been quite useful to me since I stumbled upon it, and I think it's time to give back in some way. (I'd do it financially, but I'm poor, haha)

Anyway, if anybody wants to take me up on this offer, just let me know!
Dec 22 2019
On 12/22/19 11:53 PM, Symphony wrote:
> On Sunday, 22 December 2019 at 22:47:43 UTC, Steven Schveighoffer wrote:
>> To fix Phobos, we just(!) need to remove libc as the underlying stream implementation. I had at one point agreement from Walter to make a "backwards-compatible-ish" mechanism for file/streams. But it's not pretty, and was convoluted. At the time, I was struggling getting what would become iopipe to be usable on its own, and I eventually quit worrying about that aspect of it. We have the basic building blocks with https://github.com/MartinNowak/io and https://github.com/schveiguy/iopipe. It would be cool to get this into Phobos, but it's a lot of work. I bet Rust just skips libc altogether.
> I don't have the ingenuity, intelligence, nor experience that many of you possess, but I have *a lot* of time on my hands for something like this. I assume I should start with std.stdio's source code and the aforementioned projects' source code, but some guidance on this would be very helpful, if not needed. D has been quite useful to me since I stumbled upon it, and I think it's time to give back in some way. (I'd do it financially, but I'm poor, haha) Anyway, if anybody wants to take me up on this offer, just let me know!

I really appreciate the enthusiasm here, but at the risk of being cynical, I see little chance that this gets accepted. Before you spend any time on actual code, a DIP is going to be required, as this would be a huge change to the language. I'm sure you have a lot of time, but I don't want you to waste it on something that is likely to be rejected.

If you still want to proceed, even at the risk of doing a lot of work for nothing (or at least, a lot of work that ends up being just on code.dlang.org instead of Phobos), I can tell you what my plan was:

1. std.stdio.File was going to be set up to source from either an iopipe-based io subsystem, or a FILE *.
2. The standard handles would be open with the default C FILE * standard handles as the source/target.
3. Upon using any "d-like" features on a File that is sourced from a FILE * (i.e. byline), the File would be switched to a newly-created iopipe-based source. The theory is here, that once you do something like this, you commit to using D on that, and I'd much rather use a higher performing subsystem (iopipe beats Phobos right now by 2x performance). This only counts for things that make the File unusable on its own anyway. So writefln and writeln would NOT switch the source, neither would lockingTextReader/Writer.
4. Any new File that is opened using any constructor other than passing in a FILE * will be opened with an iopipe source.
5. The iopipe and io subsystems can be used directly instead of with File, as a lot of times you don't need that overhead.

Let me know if you decide to do this, I can guide you.

-Steve
Dec 23 2019
On Monday, 23 December 2019 at 15:34:13 UTC, Steven Schveighoffer wrote:
> I really appreciate the enthusiasm here, but at the risk of being cynical, I see little chance that this gets accepted. Before you spend any time on actual code, a DIP is going to be required, as this would be a huge change to the language. I'm sure you have a lot of time, but I don't want you to waste it on something that is likely to be rejected. If you still want to proceed, even at the risk of doing a lot of work for nothing (or at least, a lot of work that ends up being just on code.dlang.org instead of Phobos)

Just out of curiosity, what would be the advantage of having something like this in Phobos rather than as a separate package?
Dec 23 2019
On 12/23/19 10:48 AM, bachmeier wrote:
> On Monday, 23 December 2019 at 15:34:13 UTC, Steven Schveighoffer wrote:
>> I really appreciate the enthusiasm here, but at the risk of being cynical, I see little chance that this gets accepted. Before you spend any time on actual code, a DIP is going to be required, as this would be a huge change to the language. I'm sure you have a lot of time, but I don't want you to waste it on something that is likely to be rejected. If you still want to proceed, even at the risk of doing a lot of work for nothing (or at least, a lot of work that ends up being just on code.dlang.org instead of Phobos)
> Just out of curiosity, what would be the advantage of having something like this in Phobos rather than as a separate package?

It means that all of Phobos can take advantage of the better performance and other benefits. For instance, std.process uses File (and therefore FILE *) as its streams for the pipes to the child process. This has huge limitations.

-Steve
Dec 23 2019
On Monday, 23 December 2019 at 15:34:13 UTC, Steven Schveighoffer wrote:
> I really appreciate the enthusiasm here, but at the risk of being cynical, I see little chance that this gets accepted. Before you spend any time on actual code, a DIP is going to be required, as this would be a huge change to the language. I'm sure you have a lot of time, but I don't want you to waste it on something that is likely to be rejected.
>
> If you still want to proceed, even at the risk of doing a lot of work for nothing (or at least, a lot of work that ends up being just on code.dlang.org instead of Phobos), I can tell you what my plan was:
>
> 1. std.stdio.File was going to be set up to source from either an iopipe-based io subsystem, or a FILE *.
> 2. The standard handles would be open with the default C FILE * standard handles as the source/target.
> 3. Upon using any "d-like" features on a File that is sourced from a FILE * (i.e. byline), the File would be switched to a newly-created iopipe-based source. The theory is here, that once you do something like this, you commit to using D on that, and I'd much rather use a higher performing subsystem (iopipe beats Phobos right now by 2x performance). This only counts for things that make the File unusable on its own anyway. So writefln and writeln would NOT switch the source, neither would lockingTextReader/Writer.
> 4. Any new File that is opened using any constructor other than passing in a FILE * will be opened with an iopipe source.
> 5. The iopipe and io subsystems can be used directly instead of with File, as a lot of times you don't need that overhead.
>
> Let me know if you decide to do this, I can guide you. -Steve

Pardon my ignorance, but wouldn't the inclusion of a std.io (e.g. Martin Nowak's io library) into Phobos be an easier and cleaner move? Other Phobos modules that require std.stdio could be gradually changed so that they use std.io instead.

There would be the issue of two coexisting IO libraries in std, but issuing some warnings whenever std.stdio is imported wouldn't be too bad in my view; that is unless Mr. Bright's opposition is the main blocker.
Dec 23 2019
On 12/23/19 2:52 PM, Symphony wrote:
> Pardon my ignorance, but wouldn't the inclusion of a std.io (e.g. Martin Nowak's io library) into Phobos be an easier and cleaner move? Other Phobos modules that require std.stdio could be gradually changed so that they use std.io instead.

Well, that's certainly a lot easier project. But one might question whether we should do it unless we have a reason to have Phobos start using it. As bachmeier mentioned, it can happily exist in its own location. The "gradual change" thing, I don't know how that works.

Also note that std.io has no buffering. You need something like iopipe on top of it for it to be reasonably usable.

> There would be the issue of two coexisting IO libraries in std, but issuing some warnings whenever std.stdio is imported wouldn't be too bad in my view; that is unless Mr. Bright's opposition is the main blocker.

It's not without precedent though. There actually was an alternate stream system in Phobos, now in undead: https://github.com/dlang/undeaD/blob/master/src/undead/stream.d

But I think before we think about making the attempt to get this accepted, we really need to flesh out the end goal. The maintainers have soured a bit I think on the std.experimental location, especially since we do have code.dlang.org. The bar for entry is high for Phobos.

My recommendation is to focus on getting the std.io project and the iopipe project to be usable and fully featured. Then it may be a much easier task to convince leadership that they should be in Phobos.

-Steve
Dec 23 2019
On Sun, Dec 22, 2019 at 10:04:20PM +0000, Adam D. Ruppe via Digitalmars-d-learn wrote:
[...]
> Regardless, I'm pretty well of the opinion that fwrite is the wrong thing to do anyway. fwrite writes bytes to a file, but we want to write strings to the console. There's other functions that do that.
[...]

Would it make sense for std.stdio.write* (the package global functions, as opposed to File.write*) to use the Windows console output functions instead of proxying to libc?

Alternatively, we could change std.stdio.File to check if the current file descriptor is the console (fd == stdout && stdout == console, however you figure that out in Windows), and silently switch to the Windows console output functions instead of libc. We *are* already wrapping libc's FILE*, why not wrap the Windows console output functions as well.

Mixing raw libc printf with std.stdio.write* is a bad idea anyway; do we really need to support that?? Though calling fflush(stdout) may not be amiss, just to alleviate sudden breakage and ensuing complaints.

And of course, this only applies to Windows. On Posix libc is pretty much still the standard way of working with console output.

T

--
VI = Visual Irritation
Dec 23 2019
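On the "however you figure that out in Windows" part: one common way, sketched here under the assumption that `GetConsoleMode` succeeds only for console handles, is to probe the handle behind a `File`. `isConsole` is a hypothetical helper name; the non-Windows branch uses `isatty` so the sketch also runs elsewhere.

```d
import std.stdio : File, stdout, writeln;

version (Windows)
{
    import core.sys.windows.windows;
}
else
{
    import core.sys.posix.unistd : isatty;
}

/// True when `f` is attached to an interactive console or terminal.
/// (Hypothetical helper, not part of std.stdio.)
bool isConsole(File f)
{
    version (Windows)
    {
        DWORD mode;
        // GetConsoleMode fails for files and pipes.
        return GetConsoleMode(f.windowsHandle, &mode) != 0;
    }
    else
    {
        return isatty(f.fileno) == 1;
    }
}

void main()
{
    writeln(isConsole(stdout) ? "console" : "redirected");
}
```

A check like this is what would let a writer choose between the console functions and the ordinary libc path on a per-handle basis.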
On 12/23/19 10:25 AM, H. S. Teoh wrote:
> On Sun, Dec 22, 2019 at 10:04:20PM +0000, Adam D. Ruppe via Digitalmars-d-learn wrote:
> [...]
>> Regardless, I'm pretty well of the opinion that fwrite is the wrong thing to do anyway. fwrite writes bytes to a file, but we want to write strings to the console. There's other functions that do that.
> [...]
> Would it make sense for std.stdio.write* (the package global functions, as opposed to File.write*) to use the Windows console output functions instead of proxying to libc?

That means we have to buffer separately, which means we have a problem interleaving printf with writef. It would be awful.

> Alternatively, we could change std.stdio.File to check if the current file descriptor is the console (fd == stdout && stdout == console, however you figure that out in Windows), and silently switch to the Windows console output functions instead of libc. We *are* already wrapping libc's FILE*, why not wrap the Windows console output functions as well.

Again, the docs say you have to use wprintf, not fwrite. We would have to switch to using wprintf, and I'm not sure it's a very easy thing to do. It might be possible though.

> Mixing raw libc printf with std.stdio.write* is a bad idea anyway; do we really need to support that?? Though calling fflush(stdout) may not be amiss, just to alleviate sudden breakage and ensuing complaints.

There's this guy, his name is Walter. He likes printf. I'm pretty sure when he's buried, his cold dead fingers will be tightly and inextricably wrapped around printf.

> And of course, this only applies to Windows. On Posix libc is pretty much still the standard way of working with console output.

The source of this thread is for valid unicode to come out on the screen, which I'm pretty sure Posix systems support just fine.

Other than that, there are good reasons NOT to use libc, but this is disruptive and difficult to get right as a "drop in".

-Steve
Dec 23 2019
On Monday, 23 December 2019 at 15:41:33 UTC, Steven Schveighoffer wrote:
> That means we have to buffer separately, which means we have a problem interleaving printf with writef. It would be awful.

Or simply don't buffer. Any call you get, flush the C buffer and write the D stuff immediately.

Remember, this code branch is only called if we already know it is an interactive console. They're usually flushed frequently (at least at every line) anyway... so especially with writeln / writefln those are virtually guaranteed and certainly expected to flush at the end anyway.

I really don't think any performance concern would be significant.
Dec 23 2019
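A minimal sketch of the "flush the C buffer, then write immediately" idea, assuming it applies only when standard output is a console: `writeNow` is a made-up name, and the fallback path simply goes through fwrite. It is meant to show why printf output stays in order, not how Phobos would actually implement it.

```d
import core.stdc.stdio : fflush, fwrite, printf, stdout;
import std.utf : toUTF16;

version (Windows) import core.sys.windows.windows;

/// Hypothetical helper: flush libc's buffer, then write `text` right away.
void writeNow(string text)
{
    fflush(stdout); // keep earlier printf output ahead of ours
    version (Windows)
    {
        auto handle = GetStdHandle(STD_OUTPUT_HANDLE);
        DWORD mode;
        if (GetConsoleMode(handle, &mode)) // only true for a console
        {
            const wide = text.toUTF16;
            DWORD written;
            WriteConsoleW(handle, wide.ptr, cast(DWORD) wide.length,
                &written, null);
            return;
        }
    }
    // Not a console (or not Windows): plain bytes through libc.
    fwrite(text.ptr, 1, text.length, stdout);
    fflush(stdout);
}

void main()
{
    printf("from printf\n");
    writeNow("♥ from D\n");
    printf("from printf again\n");
}
```

Every call pays for a flush and a system call, which is the performance trade-off debated in the surrounding posts.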
On 12/23/19 11:02 AM, Adam D. Ruppe wrote:
> On Monday, 23 December 2019 at 15:41:33 UTC, Steven Schveighoffer wrote:
>> That means we have to buffer separately, which means we have a problem interleaving printf with writef. It would be awful.
> Or simply don't buffer. Any call you get, flush the C buffer and write the D stuff immediately.

Unbuffered output would perform badly, especially if you are writing characters at a time (which is what formattedWrite does). But I think this would solve the interleaving problem.

> Remember, this code branch is only called if we already know it is an interactive console. They're usually flushed frequently (at least at every line) anyway... so especially with writeln / writefln those are virtually guaranteed and certainly expected to flush at the end anyway. I really don't think any performance concern would be significant.

Honestly, I think it sounds horrible to have yet another special case for this specific situation. But also, I almost never use Windows for D work, so I'm fine if you want to duct tape some more cruft onto that branch. std.stdio is already a pretty big mess.

-Steve
Dec 23 2019
On Mon, Dec 23, 2019 at 10:41:33AM -0500, Steven Schveighoffer via Digitalmars-d-learn wrote:
[...]
> There's this guy, his name is Walter. He likes printf. I'm pretty sure when he's buried, his cold dead fingers will be tightly and inextricably wrapped around printf.
[...]

But that's not a problem; since he loves printf so much, he'd never use std.stdio.write* in the first place. No conflict there. :-D

T

--
INTEL = Only half of "intelligence".
Dec 23 2019
On Sunday, 22 December 2019 at 06:11:13 UTC, moth wrote:
> is there any function i can call or setting i can adjust to get D to do the same, or do i have to wait for something to be fixed in the language / compiler itself?

It isn't the language/compiler per se, it is the library calling the wrong function.

See the code in the link in my last email - if you call the Windows WriteConsoleW function directly it will do what you want. The rest of the surrounding code in the link is to handle conversions and pipes to files.
Dec 22 2019