digitalmars.D - Let's schedule WinAPI ASCII functions for deprecation!
- Denis Shelomovskij <verylonglogin.reg gmail.com> May 22 2012
- Dmitry Olshansky <dmitry.olsh gmail.com> May 22 2012
- Dmitry Olshansky <dmitry.olsh gmail.com> May 22 2012
- "Roman D. Boiko" <rb d-coding.com> May 22 2012
- "Roman D. Boiko" <rb d-coding.com> May 22 2012
- Stewart Gordon <smjg_1998 yahoo.com> May 23 2012
- Denis Shelomovskij <verylonglogin.reg gmail.com> May 22 2012
- "Martin Nowak" <dawg dawgfoto.de> May 22 2012
- Dmitry Olshansky <dmitry.olsh gmail.com> May 22 2012
- Walter Bright <newshound2 digitalmars.com> May 22 2012
- Trass3r <un known.com> May 22 2012
- Walter Bright <newshound2 digitalmars.com> May 22 2012
- Dmitry Olshansky <dmitry.olsh gmail.com> May 22 2012
- Denis Shelomovskij <verylonglogin.reg gmail.com> May 24 2012
- Gor Gyolchanyan <gor.f.gyolchanyan gmail.com> May 22 2012
- "Mehrdad" <wfunction hotmail.com> May 22 2012
- Stewart Gordon <smjg_1998 yahoo.com> May 23 2012
- Jacob Carlborg <doob me.com> May 23 2012
- "Kagamin" <spam here.lot> May 23 2012
- "Michael" <pr m1xa.com> May 23 2012
- Dmitry Olshansky <dmitry.olsh gmail.com> May 23 2012
- Dmitry Olshansky <dmitry.olsh gmail.com> May 23 2012
- "Michael" <pr m1xa.com> May 23 2012
- "Regan Heath" <regan netmail.co.nz> May 24 2012
- "Regan Heath" <regan netmail.co.nz> May 24 2012
- "Michael" <pr m1xa.com> May 24 2012
Since Win9x isn't supported any more why do we have ASCII WinAPI
functions in druntime's core.sys.windows.windows (and, possibly, other
places)?
Reasons against *A functions:
* using any such function is unsafe (with rare exceptions like
LoadLibraryA("ntdll")) because the inability to encode non-ASCII
characters in the OEM encoding will almost always give results the
programmer cannot predict (simple test: you, reader, do you know what
will happen?);
* in D it's too easy to make a mistake by passing a UTF-8 string pointer
to such a function, because D has no string types other than UTF ones, and
eliminating these functions is the only solution unless an ASCII string
type is created;
* they perform worse because Windows has to convert the ASCII string to
UTF-16 first.
And yes, druntime already has encoding bugs because of using such functions.
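To make the first two points concrete, here is a minimal sketch. It is written in Python only so that it runs anywhere, and it does not call WinAPI at all. It assumes that an *A function interprets the bytes it receives in the machine's legacy code page. The helper names and the choice of cp1252 and cp1251 are illustrative assumptions, not anything taken from druntime.

```python
def seen_by_ansi_api(text, codepage="cp1252"):
    """Return what a code-page-based API would read if handed UTF-8 bytes."""
    utf8_bytes = text.encode("utf-8")
    # Bytes that have no mapping in the code page become U+FFFD here.
    return utf8_bytes.decode(codepage, errors="replace")


def representable(text, codepage):
    """True if every character of `text` exists in the given code page."""
    try:
        text.encode(codepage)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for name in ("ntdll", "Денис"):
        print(repr(name), "->", repr(seen_by_ansi_api(name)))
    # The same name fits one code page and not another.
    print(representable("Денис", "cp1251"), representable("Денис", "cp1252"))
```

Pure ASCII such as "ntdll" comes through unchanged, which is why LoadLibraryA("ntdll") is a safe exception. Anything else depends on which code page the machine happens to use.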
P.S.
Let's finally solve encoding problems that should have been solved 10 years
ago! By the way, Git+TortoiseGit still has encoding problems on Windows
and it is awful (see its changelog).
--
Денис В. Шеломовский
Denis V. Shelomovskij
May 22 2012
On 22.05.2012 22:11, Denis Shelomovskij wrote:
Since Win9x isn't supported any more why do we have ASCII WinAPI functions in druntime's core.sys.windows.windows (and, possibly, other places)? Reasons against *A functions:
* using of every such function is unsafe (with really seldom exceptions like LoadLibraryA("ntdll")) because inability to encode non-ASCII characters to OEM encoding will almost always give unpredictable results for programmer (simple test: you, reader, what will happen?);
* in D it's too easy to make a mistake by passing UTF-8 string pointer to such function because D has no string types other than UTF and elimination of such function is the only solution unless ASCII string type is created
* it performs worse because Windows has to convert ASCII string to UTF-16 first
And yes, druntime already has encoding bugs because of using such functions.
Yes, let them burn! Burn, burn, burn! Seriously. For those that are bent on compatibility, *A functions also are:
- security disasters
- limited in more than just one way: 256 max path, and so on and so forth
And last but not least:
- *W were supported on Win98+ Second Edition with an official addon - Unicode Layer for Windows ;)
Not to mention that the OEM encoding was never supported properly by D.
P.S. Let's finally solve encoding problems that should be solved 10 years ago! By the way, Git+TurtoiseGit still has encoding problems on Windows and it is awful (see its changelog).
-- Dmitry Olshansky
May 22 2012
P.S. Let's finally solve encoding problems that should be solved 10 years ago! By the way, Git+TurtoiseGit still has encoding problems on Windows and it is awful (see its changelog).
Forgot to mention that my GSOC project has support for legacy encodings as its secondary goal. Check out: TODOs, synopsis & status: https://github.com/blackwhale/phobos/wiki/GSOC-Unicode-support/tree/gsoc-uni Original proposal: http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/dolsh/20002# -- Dmitry Olshansky
May 22 2012
On Tuesday, 22 May 2012 at 18:39:46 UTC, Dmitry Olshansky wrote:
P.S. Let's finally solve encoding problems that should be solved 10 years ago! By the way, Git+TurtoiseGit still has encoding problems on Windows and it is awful (see its changelog).
forgot to mention that my GSOC project has support for legacy encodings as it's secondary goal. Check out: TODOs, synopsis & status: https://github.com/blackwhale/phobos/wiki/GSOC-Unicode-support/tree/gsoc-uni Original proposal: http://www.google-melange.com/gsoc/proposal/review/google/gsoc2012/dolsh/20002#
Dmitry, your project looks really cool. As for the topic, I would vote for that, too, but don't have enough knowledge to understand all possible consequences...
May 22 2012
On Tuesday, 22 May 2012 at 18:43:58 UTC, Roman D. Boiko wrote:
Dmitry, your project looks really cool. As for the topic, I would vote for that, too, but don't have enough knowledge to understand all possible consequences...
relevant tradeoffs".
May 22 2012
On 22/05/2012 19:24, Dmitry Olshansky wrote:
<snip>
* in D it's too easy to make a mistake by passing UTF-8 string pointer to such function
That's just as easy in almost any language. It's part of why so many websites have character encoding bugs.
<snip>
And last but not least: - *W were supported on Win98+ Second Edition with official addon - Unicode Layer for Windows ;)
I've heard of MS Layer for Unicode - don't know if that's what you meant or you're talking about something else. From what I recall reading, MSLU had the problem that EXEs have to be explicitly built to depend on it. So a typical app targeted at Win2000 and above wouldn't work with it, and you can't (at least easily) make an app detect whether Unicode is available and use it if it's there.
Stewart.
May 23 2012
The LPTSTR issue (it aliases char*) has already been filed by Martin Nowak: Issue 8132 - LPTSTR always aliases to LPSTR http://d.puremagic.com/issues/show_bug.cgi?id=8132 -- Денис В. Шеломовский Denis V. Shelomovskij
May 22 2012
* it performs worse because Windows has to convert ASCII string to UTF-16 first
P.S. Let's finally solve encoding problems that should be solved 10 years ago! By the way, Git+TurtoiseGit still has encoding problems on Windows and it is awful (see its changelog).
Given that it only requires a 'w' suffix for literals it's a good choice.
May 22 2012
On 22.05.2012 23:32, Martin Nowak wrote:
* it performs worse because Windows has to convert ASCII string to UTF-16 first
P.S. Let's finally solve encoding problems that should be solved 10 years ago! By the way, Git+TurtoiseGit still has encoding problems on Windows and it is awful (see its changelog).
Given that it only requires a 'w' suffix for literals it's a good choice.
http://stackoverflow.com/questions/7950271/windows-uses-utf-16-as-its-internal-encoding-what-exactly-does-this-mean The second answer sheds some light on the topic. From what I know of Windows NT, the kernel doesn't even use Z-strings most of the time. Everything that can be called a syscall uses a variation of L-strings with 16-bit wide chars. -- Dmitry Olshansky
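For readers unfamiliar with the terms: a Z-string is NUL-terminated, while an L-string carries an explicit length. The sketch below contrasts the two for UTF-16 text. The byte layout (a 16-bit length in bytes followed by the code units) is an illustrative assumption in the spirit of a counted string, not the actual kernel structure, and all function names are made up for this example.

```python
import struct


def z_string(text):
    """UTF-16LE code units followed by a 16-bit NUL terminator."""
    return text.encode("utf-16-le") + b"\x00\x00"


def counted_string(text):
    """A 16-bit byte length followed by UTF-16LE code units, no terminator."""
    payload = text.encode("utf-16-le")
    return struct.pack("<H", len(payload)) + payload


def read_z_string(buf):
    """Read code units up to the first 16-bit NUL."""
    units = []
    for i in range(0, len(buf) - 1, 2):
        unit = buf[i:i + 2]
        if unit == b"\x00\x00":
            break
        units.append(unit)
    return b"".join(units).decode("utf-16-le")


def read_counted_string(buf):
    """Read exactly as many bytes as the length prefix announces."""
    (length,) = struct.unpack_from("<H", buf, 0)
    return buf[2:2 + length].decode("utf-16-le")
```

One practical difference: a counted string can hold an embedded NUL, whereas reading the same text as a Z-string stops at the first NUL.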
May 22 2012
On 5/22/2012 12:32 PM, Martin Nowak wrote:
* it performs worse because Windows has to convert ASCII string to UTF-16 first
Yes. Windows internally is all 16 bit Unicode.
May 22 2012
On 5/22/2012 11:11 AM, Denis Shelomovskij wrote:
Since Win9x isn't supported any more why do we have ASCII WinAPI functions in druntime's core.sys.windows.windows (and, possibly, other places)? Reasons against *A functions:
* using of every such function is unsafe (with really seldom exceptions like LoadLibraryA("ntdll")) because inability to encode non-ASCII characters to OEM encoding will almost always give unpredictable results for programmer (simple test: you, reader, what will happen?);
* in D it's too easy to make a mistake by passing UTF-8 string pointer to such function because D has no string types other than UTF and elimination of such function is the only solution unless ASCII string type is created
* it performs worse because Windows has to convert ASCII string to UTF-16 first
And yes, druntime already has encoding bugs because of using such functions.
First off, I agree that druntime and phobos must not use the A functions without a very, very good reason. Secondly, as a matter of principle, we are not going to fix, improve, refactor, or re-engineer the Windows API, nor any other operating system API, nor the C Standard Library, no matter how tempting that may be. The job of the D interface modules is to simply provide an interface to them, as thin and direct as possible, without editorial comment. The user can decide what to use or not use from it.
May 22 2012
On 23.05.2012 0:41, Walter Bright wrote:
On 5/22/2012 11:11 AM, Denis Shelomovskij wrote:
Since Win9x isn't supported any more why do we have ASCII WinAPI functions in druntime's core.sys.windows.windows (and, possibly, other places)? Reasons against *A functions:
* using of every such function is unsafe (with really seldom exceptions like LoadLibraryA("ntdll")) because inability to encode non-ASCII characters to OEM encoding will almost always give unpredictable results for programmer (simple test: you, reader, what will happen?);
* in D it's too easy to make a mistake by passing UTF-8 string pointer to such function because D has no string types other than UTF and elimination of such function is the only solution unless ASCII string type is created
* it performs worse because Windows has to convert ASCII string to UTF-16 first
And yes, druntime already has encoding bugs because of using such functions.
First off, I agree that druntime and phobos must not use the A functions without a very, very good reason.
Right.
Secondly, as a matter of principle, we are not going to fix, improve, refactor, or re-engineer the Windows API, nor any other operating system API, nor the C Standard Library, no matter how tempting that may be. The job of the D interface modules is to simply provide an interface to them, as thin and direct as possible, without editorial comment. The user can decide what to use or not use from it.
Again correct. The trick is that the way *A functions are provided is in fact a wrong edit! Their signatures are basically saying "hello, I'm an explicit Win32 API multi-byte string binding and I accept a UTF-8 string" ... WTF?! The fact that they are horribly outdated makes this the perfect moment to both fix the issue and get rid of junk. -- Dmitry Olshansky
May 22 2012
23.05.2012 0:41, Walter Bright wrote:
Secondly, as a matter of principle, we are not going to fix, improve, refactor, or re-engineer the Windows API, nor any other operating system API, nor the C Standard Library, no matter how tempting that may be. The job of the D interface modules is to simply provide an interface to them, as thin and direct as possible, without editorial comment. The user can decide what to use or not use from it.
The key point is: what does "interface" mean?
The ability to load a DLL and get symbols from it is enough to use every function. Is it an interface? You say "no".
It's common in C/C++ to use WinAPI functions without the A/W postfixes because the preprocessor defines them according to your preferences. Is it an interface? You say "no".
Functions like C's memmove are deprecated in VC headers on Windows because they are unsafe. Is it an interface? You say "no".
WinAPI functions are more than just C definitions; they have IDL to allow the user to avoid pointers and exit code checking. Is it an interface? You say "no".
There are no such macros in the Windows headers even for dmc, and there is no talk at all of generating good D wrappers for WinAPI functions based on their IDL.
*A functions are in the WinAPI headers obviously for backward compatibility only. Are their definitions an interface? You say "yes".
And I completely disagree with the last 2 points. I just want to show that this "principle" isn't as well-shaped as it can look at first sight.
-- Денис В. Шеломовский Denis V. Shelomovskij
May 24 2012
On Wed, May 23, 2012 at 12:31 AM, Trass3r <un known.com> wrote:
Yeah let 'em burn!
Kill it! Kill it with fire!!! +1 -- Bye, Gor Gyolchanyan.
May 22 2012
I hope this includes SNN.lib, which also uses ANSI functions...
May 22 2012
On 23/05/2012 15:16, Kagamin wrote:
<snip>
Well, you can't fix C because C explicitly ignores string encoding and thoughtlessly passes strings around without any transcoding. Though, D bindings suggest that C functions accept utf-8 strings
A lot of C functions do. Indeed, this is one of the considerations made in the design of UTF-8.
which leads to assumption that those functions will act properly on utf-8 strings. I'd say that's a bug in bindings: C strings are specified to be in C encoding,
What is "C encoding"?
not utf-8 encoding. I think, conversion from D string to C string should require at least a cast.
Several people have dealt with this by using byte or ubyte as D's equivalent of the C char type.
Stewart.
May 23 2012
On 2012-05-23 20:34, Stewart Gordon wrote:
What is "C encoding"?
Since C doesn't really have a concept of encodings it would be whatever a given application/library decides it is. -- /Jacob Carlborg
May 23 2012
On Wednesday, 23 May 2012 at 04:01:05 UTC, Mehrdad wrote:
I hope this includes SNN.lib, which also uses ANSI functions...
Well, you can't fix C because C explicitly ignores string encoding and thoughtlessly passes strings around without any transcoding. Though, D bindings suggest that C functions accept utf-8 strings which leads to assumption that those functions will act properly on utf-8 strings. I'd say that's a bug in bindings: C strings are specified to be in C encoding, not utf-8 encoding. I think, conversion from D string to C string should require at least a cast.
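The idea that handing text to a C API should require an explicit step can be sketched as follows. This is only an analogy in Python, where text (str) and encoded bytes are distinct types. `call_c_api` is a hypothetical stand-in for a binding that takes `char*`, not a real function from any library, and `to_c_string` is likewise made up for illustration.

```python
def to_c_string(text, encoding):
    """Explicitly encode text and append the NUL terminator a C API expects."""
    return text.encode(encoding) + b"\x00"


def call_c_api(raw):
    """Hypothetical stand-in for a C function taking `char*`.

    It refuses text outright, so the caller must choose an encoding first.
    Returns the number of bytes it was given, terminator included.
    """
    if not isinstance(raw, (bytes, bytearray)):
        raise TypeError("expected encoded bytes, got " + type(raw).__name__)
    return len(raw)


if __name__ == "__main__":
    # The same one-character string has different sizes per encoding,
    # so the choice cannot be left implicit.
    print(call_c_api(to_c_string("é", "utf-8")))
    print(call_c_api(to_c_string("é", "cp1252")))
```

Using byte or ubyte for C's char in D bindings, as suggested elsewhere in this thread, would have a similar effect: a UTF-8 string would no longer convert silently.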
May 23 2012
In WinAPI we have LoadLibraryA/W, but not GetProcAddressA/W, because PE COFF limitations exist.
Walter Bright: The user can decide what to use or not use from it.
256 max path
May 23 2012
On 23.05.2012 23:29, Michael wrote:
In WinAPI we have: LoadLibraryA/W, but not GetProcAddressA/W because PE COFF limitations exists.Walter Bright The user can decide what to use or not use from it.
256 max path
Nope. Quoting random top hit from google: Individual components of a filename (i.e. each subdirectory along the path, and the final filename) are limited to 255 characters, and the total path length is limited to approximately 32,000 characters. However, you should generally try to limit path lengths to below 260 characters (MAX_PATH) when possible. See http://msdn.microsoft.com/en-us/library/aa365247.aspx for full details. -- Dmitry Olshansky
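The limits quoted above can be turned into a rough check. This is a sketch under stated assumptions: 260 is MAX_PATH and 255 is the per-component limit from the quote, while 32767 is an assumed concrete value for the "approximately 32,000" figure. Lengths are counted in UTF-16 code units, and `classify_path` is a hypothetical helper, not a WinAPI function.

```python
MAX_PATH = 260          # classic limit; leaves room for the terminating NUL
MAX_COMPONENT = 255     # per-component limit from the quote above
EXTENDED_LIMIT = 32767  # assumption for "approximately 32,000 characters"


def utf16_units(text):
    """Length of `text` in UTF-16 code units, which is what *W functions count."""
    return len(text.encode("utf-16-le")) // 2


def classify_path(path):
    """Roughly classify a backslash-separated path against the quoted limits."""
    parts = [p for p in path.split("\\") if p]
    if any(utf16_units(p) > MAX_COMPONENT for p in parts):
        return "component-too-long"
    total = utf16_units(path)
    if total < MAX_PATH:
        return "ok"
    if total < EXTENDED_LIMIT:
        return "needs-long-path-form"
    return "too-long"
```

Paths in the "needs-long-path-form" range are the ones that only the *W functions can reach, using the extended-length syntax described on the MSDN page linked above.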
May 23 2012
On 24.05.2012 0:13, Michael wrote:
approximately 32,000 characters...
I know it ;) But it's platform specific kung-fu.
It's the only game in M$ town ;) -- Dmitry Olshansky
May 23 2012
approximately 32,000 characters...
I know it ;) But it's platform specific kung-fu.
May 23 2012
On Wed, 23 May 2012 20:54:44 +0100, Jacob Carlborg <doob me.com> wrote:
On 2012-05-23 20:34, Stewart Gordon wrote:
What is "C encoding"?
Since C doesn't really have a concept of encodings it would be whatever a given application/library decides it is.
All the more reason to use byte/ubyte as D's equivalent to C's char. R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
May 24 2012
On Wed, 23 May 2012 21:13:47 +0100, Michael <pr m1xa.com> wrote:
approximately 32,000 characters...
I know it ;) But it's platform specific kung-fu.
And, if you start to dig a bit things can get a bit hairy in places: http://blogs.msdn.com/b/bclteam/archive/2007/02/13/long-paths-in-net-part-1-of-3-kim-hamilton.aspx http://blogs.msdn.com/b/bclteam/archive/2007/03/26/long-paths-in-net-part-2-of-3-long-path-workarounds-kim-hamilton.aspx http://blogs.msdn.com/b/bclteam/archive/2008/07/07/long-paths-in-net-part-3-of-3-redux-kim-hamilton.aspx R -- Using Opera's revolutionary email client: http://www.opera.com/mail/
May 24 2012
I have known it since before the .NET era. The main point is that even Windows may handle it the wrong way. WinAPI is an interface "as is". So let the user decide whether to use it or not.
May 24 2012








