digitalmars.D.bugs - [Issue 11837] New: String literals should convert to const(void)*

d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837

           Summary: String literals should convert to const(void)*
           Product: D
           Version: D2
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P2
         Component: DMD
        AssignedTo: nobody puremagic.com
        ReportedBy: yebblies gmail.com


--- Comment #0 from yebblies <yebblies gmail.com> 2013-12-29 03:08:31 EST ---
Code like this is perfectly valid:

memcmp(ptr, "abc");

But it fails in D because although string literals convert to const(char)*, and
const(char)* converts to const(void)*, string literals do not convert to
const(void)*

-- 
Configure issuemail: https://d.puremagic.com/issues/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
Dec 28 2013
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837


bearophile_hugs eml.cc changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bearophile_hugs eml.cc


--- Comment #1 from bearophile_hugs eml.cc 2013-12-28 08:33:27 PST ---
(In reply to comment #0)
 Code like this is perfectly valid:
 
 memcmp(ptr, "abc");
 
 But it fails in D because although string literals convert to const(char)*, and
 const(char)* converts to const(void)*, string literals do not convert to
 const(void)*

What are the advantages, disadvantages and possible risks of this change?
Dec 28 2013
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #2 from yebblies <yebblies gmail.com> 2013-12-29 03:37:19 EST ---
Code like this will compile:

memcmp(ptr, "abc", 3);

Dec 28 2013
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837


monarchdodra gmail.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |monarchdodra gmail.com


--- Comment #3 from monarchdodra gmail.com 2013-12-28 12:54:42 PST ---
(In reply to comment #2)
 Code like this will compile:
 
 memcmp(ptr, "abc", 3);

What's wrong with `memcmp(ptr, "abc".ptr, 3)`?

I seem to remember there is an issue with null termination in this kind of usage?
Dec 28 2013
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #4 from yebblies <yebblies gmail.com> 2013-12-29 14:53:09 EST ---
(In reply to comment #3)
 (In reply to comment #2)
 Code like this will compile:
 
 memcmp(ptr, "abc", 3);

 What's wrong with `memcmp(ptr, "abc".ptr, 3)`?

Adding .ptr loses the guarantee that the string will be 0-terminated, e.g.

// enum x = "abc";
// immutable x = "abc";
auto x = "abc";
memcmp(ptr, x.ptr, 4); // oops, no guarantee x is 0-terminated, but the compiler has no way to know that's what you wanted.

 I seem to remember there is an issue with null termination in this kind of
 usage?

...?

The fact that string literals don't convert to const(void)* is IMO an annoying special case.

This works:
const(char)* x = "askjldfg";
const(void)* y = x;

But this doesn't:
const(void)* y = "askjldfg";

Unless there's a good reason this has to be prevented...
Dec 28 2013
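
The two-step conversion described in comment #4 can be tried with a small program. This is only a sketch: the names `x`, `y`, `needle` and `direct` are made up for illustration, and the one-step conversion is probed with `__traits(compiles)` rather than asserted, since the outcome depends on the compiler version in use.

```d
import core.stdc.string : memcmp;
import std.stdio : writeln;

void main()
{
    const(char)* x = "abcdef";  // string literal -> const(char)*
    const(void)* y = x;         // const(char)* -> const(void)*

    // Probe the one-step conversion instead of assuming either outcome.
    enum direct = __traits(compiles, { const(void)* z = "abcdef"; });
    writeln("literal -> const(void)* in one step: ", direct);

    // Workaround that avoids .ptr: name the const(char)* first, then let
    // it convert to the const(void)* parameter of memcmp.
    const(char)* needle = "abc";
    writeln("first three bytes equal: ", memcmp(y, needle, 3) == 0);
}
```

Going through a named `const(char)*` keeps the literal-only implicit conversion in play, so nothing that lacks a terminator can slip through the way it could with `.ptr`.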
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837


yebblies <yebblies gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |pull


--- Comment #5 from yebblies <yebblies gmail.com> 2013-12-29 16:24:04 EST ---
https://github.com/D-Programming-Language/dmd/pull/3044

Dec 28 2013
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #6 from monarchdodra gmail.com 2013-12-29 09:33:33 PST ---
(In reply to comment #4)
 (In reply to comment #3)
 I seem to remember there is an issue with null termination in this kind of
 usage?

 ...?

What I meant here is what you explained just above:
 What's wrong with `memcmp(ptr, "abc".ptr, 3)`?

 Adding .ptr loses the guarantee that the string will be 0-terminated, e.g.

 // enum x = "abc";
 // immutable x = "abc";
 auto x = "abc";
 memcmp(ptr, x.ptr, 4); // oops, no guarantee x is 0-terminated, but the compiler has no way to know that's what you wanted.

This may be a bit off topic, but what is the rationale behind this behavior? Why can't *all* string literals be 0-terminated, even if you explicitly extract a pointer out of them with ".ptr"?
 The fact that string literals don't convert to const(void)* is IMO an annoying
 special case.
 
 This works:
 const(char)* x = "askjldfg";
 const(void)* y = x;
 
 But this doesn't:
 const(void)* y = "askjldfg";
 
 Unless there's a good reason this has to be prevented...

If a string literal implicitly casts to "const(char)*", then it absolutely 100% must be implicitly castable to "const(void)*". It only makes sense.

Though personally, I find that the fact that you can *implicitly* extract any pointer from a string literal to be suboptimal :/
Dec 29 2013
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #7 from yebblies <yebblies gmail.com> 2013-12-30 04:55:36 EST ---
(In reply to comment #6)
 
 memcmp(ptr, x.ptr, 4); // oops, no guarantee x is 0-terminated, but the
 compiler has no way to know that's what you wanted.

 This may be a bit off topic, but what is the rationale behind this behavior? Why can't *all* string literals be 0-terminated, even if you explicitly extract a pointer out of them with ".ptr"?

All string literals are guaranteed to be 0-terminated, even if you use .ptr on them. The thing is, manifest constants that expand to string literals also behave like this, so if this compiles you know it is safe:

printf(formatstr, ...);

But in this case, you can't tell:

printf(formatstr.ptr, ...); // was it really a string literal?

 If a string literal implicitly casts to "const(char)*", then it absolutely 100%
 must be implicitly castable to "const(void)*". It only makes sense.

OK, good. This is pretty much just a convenience for porting C/C++ code, and it removes what I see as an unnecessary limitation.

 Though personally, I find that the fact that you can *implicitly* extract any
 pointer from a string literal to be suboptimal :/

If it's safe, I don't see the harm.
Dec 29 2013
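
The difference between the implicit conversion and `.ptr` described in comment #7 can be sketched as follows. The names `fmt`, `runtime` and `implicitOk` are hypothetical, and `std.string.toStringz` is brought in here only as one explicit way to obtain a 0-terminated pointer; it is not mentioned in the thread. The result of the probe is printed rather than assumed.

```d
import core.stdc.stdio : printf;
import std.string : toStringz;

enum fmt = "value: %d\n";   // manifest constant that expands to a string literal

void main()
{
    // Implicit conversion to const(char)*: if this compiles, the argument
    // was a literal (or a manifest constant), so it is 0-terminated.
    printf(fmt, 1);

    // Once the text lives in a variable, the compiler no longer knows
    // whether it came from a literal.
    string runtime = "value: %d\n";

    // Probe, rather than assert, whether a string variable converts implicitly.
    enum implicitOk = __traits(compiles, printf(runtime, 2));
    printf("string variable converts implicitly: %d\n", cast(int) implicitOk);

    // runtime.ptr would compile but carries no terminator guarantee;
    // toStringz yields a pointer to 0-terminated data, copying if it has to.
    printf(toStringz(runtime), 2);
}
```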
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837


Walter Bright <bugzilla digitalmars.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |bugzilla digitalmars.com


--- Comment #8 from Walter Bright <bugzilla digitalmars.com> 2013-12-29
23:09:38 PST ---
There's code in dmd to specifically disallow this. I believe the reason is
because of function and template overloading. I'm not content with this change
simply passing the existing test suite. It's a more subtle, substantive change
than that.

Dec 29 2013
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #9 from yebblies <yebblies gmail.com> 2013-12-30 18:37:33 EST ---
(In reply to comment #8)
 There's code in dmd to specifically disallow this. I believe the reason is
 because of function and template overloading.

Do you know what the actual problem is/was? The code in dmd that disallows this is ancient, and may well address a problem that no longer exists. Can you remember why you disabled it in the first place? Did you document this anywhere?

As for overloading, this code works as expected, as the conversion to const(char)* is preferred.

import core.stdc.stdio;

void call(const(char)* str) { printf("const(char)*\n"); }
void call(const(void)* str) { printf("const(void)*\n"); }
void call(const(int)[] arr) { printf("const(int)[]\n"); }
void call(const(void)[] arr) { printf("const(void)[]\n"); }

void main()
{
    call("abc"); // prints const(char)*
    call([1, 2, 3]); // prints const(int)[]
}

 I'm not content with this change
 simply passing the existing test suite. It's a more subtle, substantive change
 than that.

What _would_ you be content with? Unless someone can come up with an actual problem, putting this on hold is simply a waste of time. If this does turn out to cause a regression, it can trivially be rolled back.
Dec 29 2013
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #10 from Walter Bright <bugzilla digitalmars.com> 2014-01-12
11:35:38 PST ---
Changing the way overloading works can have far reaching consequences,
including issues like template matching, virtual functions, covariance,
contravariance, and __traits(compiles). I am not at all comfortable with just
throwing it in with the idea that it can be backed out. This proposal has not
received much of any discussion.

I also don't see memcmp usage as a compelling must-have use case.

Jan 12 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #11 from yebblies <yebblies gmail.com> 2014-01-13 14:27:10 EST ---
(In reply to comment #10)
 Changing the way overloading works can have far reaching consequences,
 including issues like template matching, virtual functions, covariance,
 contravariance, and __traits(compiles).

Irrelevant, as I'm not changing the way overloading works. Listing parts of the compiler is not the same as pointing out an actual problem.
 I am not at all comfortable with just
 throwing it in with the idea that it can be backed out.

Your response consisted of "there could be problems" without specifying any actual problems. In the face of unspecified and potentially non-existent problems, putting the code in the compiler and waiting for feedback seems completely reasonable to me.
 This proposal has not
 received much of any discussion.

That's what we're doing now...
 I also don't see memcmp usage as a compelling must-have use case.

Given that A converts to B, and B converts to C, why doesn't A convert to C? memcmp is a symptom of this strange limitation.
Jan 12 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #12 from Walter Bright <bugzilla digitalmars.com> 2014-02-24
01:52:41 PST ---
(In reply to comment #11)
 Irrelevant, as I'm not changing the way overloading works.  Listing parts of the
 compiler is not the same as pointing out an actual problem.

You are changing the way overloading works. Any time the implicit conversion rules are changed, and this is an implicit overloading rule change, the overloading changes, because overloading is all about implicit conversions.
 Your response consisted of "there could be problems" without specifying any
 actual problems.  In the face of unspecified and potentially non-existent
 problems, putting the code in the compiler and waiting for feedback seems
 completely reasonable to me.
 
 This proposal has not
 received much of any discussion.

 That's what we're doing now...

It's pretty much just you and I, hardly representative.
 I also don't see memcmp usage as a compelling must-have use case.

 Given that A converts to B, and B converts to C, why doesn't A convert to C? memcmp is a symptom of this strange limitation.

Changing overloading rules can have unexpected and far reaching consequences. It is not very knowable in advance. I have severe reservations about doing this just for memcmp(). Need a better reason.
Feb 24 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #13 from yebblies <yebblies gmail.com> 2014-02-24 22:04:19 EST ---
(In reply to comment #12)
 
 I also don't see memcmp usage as a compelling must-have use case.

 Given that A converts to B, and B converts to C, why doesn't A convert to C? memcmp is a symptom of this strange limitation.

 [snip] I have severe reservations about doing this just for memcmp(). Need a better reason.

I gave you a reason; in fact, you quoted it. A converts to B, and B converts to C, but A doesn't convert to C. Why shouldn't A convert to C????? I'm not proposing a new special case, I'm trying to remove one that was introduced for reasons forgotten.
 Changing overloading rules can have unexpected and far reaching consequences.
 It is not very knowable in advance.

The same reasoning could be used to block every change to the compiler. Every non-trivial change could potentially affect something unintended. The only way forward is to do your best to identify problems, then implement it and wait for regression reports.
Feb 24 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837


Andrei Alexandrescu <andrei erdani.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
                 CC|                            |andrei erdani.com
         Resolution|                            |WONTFIX


--- Comment #14 from Andrei Alexandrescu <andrei erdani.com> 2014-03-08
21:22:18 PST ---
I agree it's an exception that "str" converts to const(char)* but not
subsequently to const(void)*. However, the conversion to char* is already a
known concession for the sake of C string APIs. I don't think we need to go all
the way into the rabbit hole. (Also the example is obscure.)

 yebblies sorry I'll close this and the pull request. Feel free to reopen if
you feel strongly about this.

Mar 08 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837


yebblies <yebblies gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|WONTFIX                     |


--- Comment #15 from yebblies <yebblies gmail.com> 2014-03-09 18:46:08 EST ---
(In reply to comment #14)
 I agree it's an exception that "str" converts to const(char)* but not
 subsequently to const(void)*. However, the conversion to char* is already a
 known concession for the sake of C string APIs. I don't think we need to go all
 the way into the rabbit hole. (Also the example is obscure.)
 

You seem to be saying it's not worth the effort to fix, if I understand correctly. We've already spent a lot more time arguing about it than I spent fixing it, so I'd really like to know why you think preventing the fix is worth all this effort?
  yebblies sorry I'll close this and the pull request. Feel free to reopen if
 you feel strongly about this.

The usual arguments for rejecting an enhancement are that it breaks existing code, or it complicates the language. So far it seems this does neither, in fact it simplifies the language.

Just like Walter, you've failed to provide a single reason why this special case should exist.

I just can't accept that - it doesn't make any sense.

I appreciate you taking the time to look at this, but without any evidence that this is a bad change I think you are drawing the wrong conclusion.
Mar 08 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837


Iain Buclaw <ibuclaw ubuntu.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ibuclaw ubuntu.com


--- Comment #16 from Iain Buclaw <ibuclaw ubuntu.com> 2014-03-09 04:37:16 PDT
---
The way I see it, relying on implicit conversion should be avoided where
possible. And where implicit conversion is allowed, enforce that only one path
could be taken. In this example, that means string -> const(char*), or string
-> const(void*), but not both.

Assuming this is for DDMD, then I'd suggest either grin and bear it, this kind
of code will be cleaned up.  Or use strcmp, which IIRC takes a const(char*) as
its parameters - and if the operation is comparing a (void*) with a (char*),
then explicitly cast the (void*) up.

As for Walter and Andrei's reasoning.  I am not opined in that way, but I would
suggest that you prove that this change doesn't break eg:

int foo(in void*);
int foo(in char*);

And if it does break overloading, provide good reasoning why this should be
invalid.

Mar 09 2014
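
The overload check suggested in comment #16 could be run along these lines. This is a sketch only: the bodies of `foo` and the variables `typed` and `untyped` are invented for illustration, and the program prints whichever overload the compiler selects instead of assuming a result.

```d
import std.stdio : writeln;

// The two declarations from comment #16, given bodies that report
// which overload was chosen.
string foo(in void* p) { return "in void*"; }
string foo(in char* p) { return "in char*"; }

void main()
{
    const(char)* typed = "abc";
    const(void)* untyped = typed;

    writeln("string literal -> ", foo("abc"));
    writeln("const(char)*   -> ", foo(typed));
    writeln("const(void)*   -> ", foo(untyped));
}
```

Running the same file with and without the proposed compiler change and comparing the three lines would show whether overload selection for literals is affected.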
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #17 from Andrei Alexandrescu <andrei erdani.com> 2014-03-09
09:48:56 PDT ---
(In reply to comment #15)
 (In reply to comment #14)
 I agree it's an exception that "str" converts to const(char)* but not
 subsequently to const(void)*. However, the conversion to char* is already a
 known concession for the sake of C string APIs. I don't think we need to go all
 the way into the rabbit hole. (Also the example is obscure.)
 

 You seem to be saying it's not worth the effort to fix, if I understand correctly. We've already spent a lot more time arguing about it than I spent fixing it, so I'd really like to know why you think preventing the fix is worth all this effort?

I'm saying we shouldn't have a compromise force others after it. Conversion to untyped pointers is bad and should be avoided. So I'm arguing against what I believe is a bad thing. Also the supporting examples are specious and non-idiomatic D.
  yebblies sorry I'll close this and the pull request. Feel free to reopen if
 you feel strongly about this.

 The usual arguments for rejecting an enhancement are that it breaks existing code, or it complicates the language. So far it seems this does neither, in fact it simplifies the language.

It makes the language worse.
 Just like Walter, you've failed to provide a single reason why this special
 case should exist.
 
 I just can't accept that - it doesn't make any sense.
 
 I appreciate you taking the time to look at this, but without any evidence that
 this is a bad change I think you are drawing the wrong conclusion.

Untyped pointers are bad. We are providing conversion to immutable(char)* as a compromise. Please let's not make nice string literals implicitly convert all the way down to void*.

I'll leave it to you to close this.
Mar 09 2014
d-bugmail puremagic.com writes:
https://d.puremagic.com/issues/show_bug.cgi?id=11837



--- Comment #18 from Andrei Alexandrescu <andrei erdani.com> 2014-03-09
10:03:44 PDT ---
A few more thoughts. 

Conversion of string literals to const(char)* is irregular and inconsistent. No
other array literals convert that way, and variables, enums etc also don't
convert that way. It's a kludge - albeit a clever one - that we accept as
convenience for the reality we must occasionally interface with C strings and
it's nice to not need to add the .ptr.

Now in invoking language consistency "post-kludge" we are in fact being
consistent with the kludge more than the language rules that it disobeys.

Mar 09 2014