digitalmars.D.bugs - [Issue 8642] New: Fix `fopen` and friends functions signatures on Windows

d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642

           Summary: Fix `fopen` and friends functions signatures on
                    Windows
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: major
          Priority: P2
         Component: druntime
        AssignedTo: nobody puremagic.com
        ReportedBy: verylonglogin.reg gmail.com



21:39:07 MSD ---
`fopen` and friends are a really nasty source of unportable code and encoding
issues on Windows.

I hope we will eventually change their signatures so they no longer accept
`char*`, going through our usual deprecation process.

These functions work on POSIX systems but only work in many cases on Windows
(read: hard to debug), so the situation is too dangerous to keep ignoring. (How
about counting the druntime/Phobos bugs caused by misunderstanding this issue?)

-- 
Configure issuemail: http://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642


Jonathan M Davis <jmdavisProg gmx.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jmdavisProg gmx.com



PDT ---
fopen is standard C, and druntime simply provides the bindings for the C
library as well as the system calls specific to the OS. Most code should be
using D functions, not the C ones anyway. Providing bindings to standard C
functions or system call APIs is _not_ a bug. If you don't want to use them,
don't use them.

Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642




22:00:12 MSD ---
> druntime simply provides the bindings for the C library

With wrong signatures.
Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642




PDT ---
In what way are the signatures wrong?

According to the Linux man page, digitalmars' documentation, _and_ MSDN, fopen
is

FILE *fopen(const char *path, const char *mode);

And in druntime, it's

FILE* fopen(in char* filename, in char* mode);

The only difference is that `in` is `const scope` instead of just `const`. `in`
shouldn't have been used, but it won't affect the kind of stuff that you're
talking about.

Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642




01:50:52 MSD ---
The meaning of `char` is different.
It is ASCII for C standard, CP_ACP for Windows, and UTF-8 for D and POSIX
systems.
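
To make the difference concrete, here is a minimal C sketch (not code from druntime or from this report). The helper `is_ascii` and the two sample names are hypothetical, and Windows-1252 is only assumed as an example ANSI code page. It shows why such calls "work in many cases": a pure-ASCII file name has the same bytes in UTF-8 and in the common ASCII-compatible code pages, while a name containing U+00E9 does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true when every byte of s is 7-bit ASCII.
   Such names have identical bytes in UTF-8 and in ASCII-compatible
   ANSI code pages, so the encoding mismatch stays hidden. */
bool is_ascii(const char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if ((unsigned char)*s > 0x7F)
            return false;
    }
    return true;
}

/* The name "é.txt", spelled out byte by byte so that the encoding of
   this source file does not matter. */
const char name_utf8[]   = "\xC3\xA9.txt"; /* U+00E9 as UTF-8: two bytes        */
const char name_cp1252[] = "\xE9.txt";     /* U+00E9 as Windows-1252: one byte  */
```

Only names for which `is_ascii` returns true are byte-for-byte the same under both interpretations; the two encodings of "é.txt" above already differ in length.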

Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642




PDT ---
> The meaning of `char` is different.
> It is ASCII for C standard, CP_ACP for Windows, and UTF-8 for D and POSIX
> systems.

That doesn't change the function signature, just what encoding you should be
passing in.
Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642




02:00:07 MSD ---
> That doesn't change the function signature, just what encoding you should be
> passing in.

`char` is UTF-8 codepoint in D. It is specified and there is no choice. And you
know it. So I don't understand your comment.
Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642




PDT ---
> `char` is UTF-8 codepoint in D. It is specified and there is no choice. And
> you know it. So I don't understand your comment.

Yes. But it's expected to use char when dealing with C's char. Any necessary
conversion should be done at the call site. At most what you'd do is make C
signatures take ubyte instead of char, which would cause all kinds of confusion.

Yes. You need to be careful when passing strings to C functions which take
char, because Microsoft was stupid, and ideally you'd use the w* functions on
Windows precisely because of this nonsense, but that's something that the
programmer needs to know and handle appropriately. The function signature
itself is fine. Making it ubyte wouldn't solve anything, and you'd basically be
arguing that ubyte should always be used instead of char in C bindings, and I
don't think that you're going to find much traction on that.

Another thing to remember is that we currently use digitalmars' C runtime, so
stuff like fopen is provided by _it_ and not Microsoft, which could introduce
its own set of quirks (and also means that the w* functions aren't even
available for anything in the C runtime). And that situation is about to become
that much more complicated when we start supporting Microsoft's runtime for
64-bit.

If there's a bug, it's in the usage of fopen and friends, not in fopen itself
(unless you count Microsoft's choice of CP_ACP as a bug, but that's not in our
control in either case).
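
As an illustration of doing the conversion at the call site, here is a rough C sketch rather than anything taken from druntime or Phobos. `fopen_utf8` and `utf8_to_utf16` are hypothetical names, and the Windows branch assumes a Microsoft-compatible C runtime (MSVC or MinGW) that provides `_wfopen`, which may not hold for every runtime discussed in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

/* Convert a NUL-terminated UTF-8 string to a malloc'd UTF-16 string.
   Returns NULL on invalid UTF-8 or allocation failure. */
static wchar_t *utf8_to_utf16(const char *s)
{
    int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, NULL, 0);
    if (n <= 0)
        return NULL;
    wchar_t *w = malloc((size_t)n * sizeof *w);
    if (w != NULL &&
        MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, w, n) <= 0) {
        free(w);
        w = NULL;
    }
    return w;
}
#endif

/* Hypothetical helper (not part of any C runtime): open a file whose
   name and mode are given as UTF-8 on every platform. */
FILE *fopen_utf8(const char *path, const char *mode)
{
    assert(path != NULL && mode != NULL);
#ifdef _WIN32
    /* Windows: convert to UTF-16 and use the wide-character variant,
       so the ANSI code page (CP_ACP) never gets involved. */
    FILE *f = NULL;
    wchar_t *wpath = utf8_to_utf16(path);
    wchar_t *wmode = utf8_to_utf16(mode);
    if (wpath != NULL && wmode != NULL)
        f = _wfopen(wpath, wmode);
    free(wpath);
    free(wmode);
    return f;
#else
    /* POSIX: the bytes are passed through to the file system unchanged. */
    return fopen(path, mode);
#endif
}
```

A caller would use it like plain `fopen`, for example `FILE *f = fopen_utf8("\xC3\xA9.txt", "r");`, and must still check the result for NULL.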
Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642




02:23:02 MSD ---
> Making it ubyte wouldn't solve anything, and you'd basically be
> arguing that ubyte should always be used instead of char in C bindings, and I
> don't think that you're going to find much traction on that.

IMHO it's a solution for such cases. Just my opinion.

> the w* functions aren't even available for anything in the C runtime

They are, by the way (see Issue 8643). MinGW also provides these functions. It
looks like they are an 'unofficial' standard for C file I/O on Windows.
Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |bugzilla digitalmars.com
         Resolution|                            |WONTFIX



15:55:14 PDT ---
C Standard library functions have always had character encoding problems,
as there are innumerable encodings that C calls "char", including UTF-8
encoding.

D has a policy of not attempting to fix, refactor, reengineer, paper over,
improve, etc., Standard C functions nor operating system API functions. D
merely provides a straightforward, direct call to them.

It's up to the caller of those functions to understand them and call them
correctly.

Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642




2012-09-12 03:00:31 MSD ---
> D merely provides a straightforward, direct call to them.

And this is the point. We understand it in different ways.
Sep 11 2012
d-bugmail puremagic.com writes:
http://d.puremagic.com/issues/show_bug.cgi?id=8642




PDT ---
> > D merely provides a straightforward, direct call to them.
> And this is the point. We understand it in different ways.

I have no idea how you could misunderstand that. I only see one way to
interpret that, which is that we simply provide C bindings and you have to deal
with whatever quirks the C function has, just like you would have to in C.

The bindings in druntime are provided so that Phobos can build better
abstractions on them and so that D programmers have direct access to system
functions where necessary. If you want a cleaner API around a C function, then
create a D wrapper. In the case of fopen, that's done with std.stdio.File.
We're not trying to clean up C APIs or make them easy-to-use, just provide
bindings for them.
Sep 11 2012