digitalmars.D - Feedback Thread: DIP 1036--String Interpolation Tuple

reply Mike Parker <aldacron gmail.com> writes:
This is the feedback thread for the second round of Community 
Review of DIP 1036, "String Interpolation Tuple Literals".

===================================
**THIS IS NOT A DISCUSSION THREAD**

Posts in this thread must adhere to the feedback thread rules 
outlined in the Reviewer Guidelines (and listed at the bottom of 
this post).

https://github.com/dlang/DIPs/blob/master/docs/guidelines-reviewers.md

That document also provides guidelines on contributing feedback 
to a DIP review. Please read it before posting here. If you would 
like to discuss this DIP, please do so in the discussion thread:

https://forum.dlang.org/post/uhueqnulcsskznsyuhwx forum.dlang.org
==================================

You can find DIP 1036 here:

https://github.com/dlang/DIPs/blob/344e00ee2d6683d61ee019d5ef6c1a0646570093/DIPs/DIP1036.md

The review period will end at 11:59 PM ET on February 10, or when 
I make a post declaring it complete. Feedback posted to this 
thread after that point may be ignored.

At the end of this review round, the DIP will be moved into the 
Post-Community Round 2 state. Significant revisions resulting 
from this review round may cause the DIP manager to require 
another round of Community Review, otherwise the DIP will be 
queued for the Final Review.

==================================
Posts in this thread that do not adhere to the following rules 
will be deleted at the DIP author's discretion:

* All posts must be a direct reply to the DIP manager's initial 
post, with only two exceptions:

     - Any commenter may reply to their own posts to retract 
feedback contained in the original post

     - The DIP author may (and is encouraged to) reply to any 
feedback solely to acknowledge the feedback with agreement or 
disagreement (preferably with supporting reasons in the latter 
case)

* Feedback must be actionable, i.e., there must be some action 
the DIP author can choose to take in response to the feedback, 
such as changing details, adding new information, or even 
retracting the proposal.

* Feedback related to the merits of the proposal rather than to 
the contents of the DIP (e.g., "I'm against this DIP.") is 
allowed in Community Review (not Final Review), but must be 
backed by supporting arguments (e.g., "I'm against this DIP 
because..."). The supporting arguments must be reasonable. 
Obviously frivolous arguments waste everyone's time.

* Feedback should be clear and concise, preferably listed as 
bullet points (those who take the time to do an in-depth review 
and provide feedback in the form of answers to the questions in 
the documentation linked above will receive much gratitude). 
Information irrelevant to the DIP or which is not provided in 
service of clarifying the feedback is unwelcome.
Jan 27
next sibling parent reply Petar Kirov [ZombineDev] <petar.p.kirov gmail.com> writes:
On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 [..]
The sentence:
 A string interpolation tuple literal, as defined below, allows 
 one to interleave non-string data inside a string literal.
Is repeated twice in this section: https://github.com/dlang/DIPs/blob/344e00ee2d6683d61ee019d5ef6c1a0646570093/DIPs/DIP1036.md#description
Jan 27
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/27/21 6:25 AM, Petar Kirov [ZombineDev] wrote:
 On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 [..]
The sentence:
 A string interpolation tuple literal, as defined below, allows one to 
 interleave non-string data inside a string literal.
Is repeated twice in this section: https://github.com/dlang/DIPs/blob/344e00ee2d6683d61ee019d5ef6c1a0646570093/DIPs/DIP1036.md#description
Thanks, will fix. -Steve
Jan 27
prev sibling next sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 [snip]
Typo in the second sentence:
 The DIP does not account for operator overloading on strings. A DIP author that the specific example given was not possible to support, but it can work with custom types.
Jan 27
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/27/21 6:33 AM, jmh530 wrote:
 On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 [snip]
Typo in the second sentence:
 The DIP does not account for operator overloading on strings. A DIP author that the specific example given was not possible to support, but it can work with custom types.
You had me confused for a while, I didn't remember writing anything like that. But it's part of the review piece I didn't touch that Mike added after the first round. Should say "A DIP author *replied* that..." -Steve
Jan 27
prev sibling next sibling parent reply jmh530 <john.michael.hall gmail.com> writes:
On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 [snip]
More typos ("which" should be "that" in all):
 Functions which accept an appropriate string type will work with string interpolation literals due to the rewrite by the compiler to the idup call.
 The second use case is for functions which can process the data without needing a string translation.
 However, functions which accept interp literals will work in BetterC as long as the interp template is available.
 The interpolated string struct should contain its own arguments; the proposed approach of rewriting to <interplationSpec>, arg1, arg2 is confusing and prohibits implementing functions which accept two interpolated string arguments.
Jan 27
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/27/21 6:43 AM, jmh530 wrote:
 On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 [snip]
More typos ("which" should be "that" in all):
 Functions which accept an appropriate string type will work with string interpolation literals due to the rewrite by the compiler to the idup call.
 The second use case is for functions which can process the data without needing a string translation.
 However, functions which accept interp literals will work in BetterC as long as the interp template is available.
 The interpolated string struct should contain its own arguments; the proposed approach of rewriting to <interplationSpec>, arg1, arg2 is confusing and prohibits implementing functions which accept two interpolated string arguments.
Thanks. As I said, grammar is not my strong suit :P -Steve
Jan 27
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 This is the feedback thread for the second round of Community 
 Review of DIP 1036, "String Interpolation Tuple Literals".

 [...]
I thought "the library" was confusing. I guess Phobos is what's meant? How would that work without an import? Shouldn't it be "the runtime" instead?

The DIP says "// auto variables cannot be assigned to a interpolation sequence" but earlier on there's this:

auto msg1 = i"Hello, ${name}, this is your ${visits}${post(visits)} time visiting";
auto msg2 = ir"Hello, ${name}, this is your ${visits}${post(visits)} time visiting";
auto msg3 = i`Hello, ${name}, this is your ${visits}${post(visits)} time visiting`;
auto msg3 = q{Hello, ${name}, this is your ${visits}${post(visits)} time visiting};

I'm guessing msg3 was supposed to be `iq{...}`?
Jan 27
next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/27/21 7:09 AM, Atila Neves wrote:
 On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 This is the feedback thread for the second round of Community Review 
 of DIP 1036, "String Interpolation Tuple Literals".

 [...]
I thought "the library" was confusing. I guess Phobos is what's meant? How would that work without an import? Shouldn't it be "the runtime" instead?
Yes, it's meant to be the piece of the language that lives in the library. I meant the runtime. Will change.
 The DIP says "// auto variables cannot be assigned to a interpolation 
 sequence" but earlier on there's this:
 
 auto msg1 = i"Hello, ${name}, this is your ${visits}${post(visits)} time 
 visiting";
 auto msg2 = ir"Hello, ${name}, this is your ${visits}${post(visits)} 
 time visiting";
 auto msg3 = i`Hello, ${name}, this is your ${visits}${post(visits)} time 
 visiting`;
 auto msg3 = q{Hello, ${name}, this is your ${visits}${post(visits)} time 
 visiting};
I can see how the statement is awkward. I'm trying to convey that the *direct translation* of the interpolation tuple literal to interleaving InterpolationLiterals and expressions cannot be assigned to an auto variable, therefore the rewrite to idup is used.
 
 I'm guessing msg3 was supposed to be `iq{...}`?
Yes, it should be that. Will correct. -Steve
Jan 27
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 1/27/2021 4:09 AM, Atila Neves wrote:
 auto msg1 = i"Hello, ${name}, this is your ${visits}${post(visits)} time
visiting";
 auto msg2 = ir"Hello, ${name}, this is your ${visits}${post(visits)} time 
 visiting";
 auto msg3 = i`Hello, ${name}, this is your ${visits}${post(visits)} time
visiting`;
 auto msg3 = q{Hello, ${name}, this is your ${visits}${post(visits)} time
visiting};
 
 I'm guessing msg3 was supposed to be `iq{...}`?
And shouldn't it be msg4?
Jan 27
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/28/21 1:00 AM, Walter Bright wrote:
 On 1/27/2021 4:09 AM, Atila Neves wrote:
 auto msg1 = i"Hello, ${name}, this is your ${visits}${post(visits)} 
 time visiting";
 auto msg2 = ir"Hello, ${name}, this is your ${visits}${post(visits)} 
 time visiting";
 auto msg3 = i`Hello, ${name}, this is your ${visits}${post(visits)} 
 time visiting`;
 auto msg3 = q{Hello, ${name}, this is your ${visits}${post(visits)} 
 time visiting};

 I'm guessing msg3 was supposed to be `iq{...}`?
And shouldn't it be msg4?
Yes it should. I clearly didn't proofread this part very well. :( -Steve
Jan 28
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
#DIP1036

Full Disclosure: I am not favorably disposed to this, as it is fairly 
complicated and uses the GC.

 It can bind to a parameter list, but it does not have a type by itself.
Makes no sense. What is it doing by "binding" to a parameter list? The examples make no sense, either, because assert doesn't have a parameter list.
 idup
What does this function look like?
 requires the GC
D needs to move away from such constructs.
 interp and idup
Not clear when interp is called and when idup is called.
 With proper library definitions, if usage of a string interpolation is an 
 error, this DIP does not specify the language of the error condition. It is 
 our preference that the resulting error of the idup call is emitted instead 
 of the failed sequence match.
Finish this rather than hand wave.
 functions which accept interp literals
what are "interp literals" ?
 Because the type interp!"..." is not implicitly convertible to any other type
Why wouldn't it be?
 This design is intentional to trigger the implicit idup call whenever it is 
 used for conventional string-accepting functions.
I don't know how this might fit in with overload resolution.
 "Best effort" functions
I don't know what the definition of "best effort" is when applied to a function.
 What became clear as the prior version was reviewed was that the complexity 
 of specifying format while transforming into a parameter sequence was not 
 worth adding to the language.
I didn't think that was the conclusion. This DIP is much more complicated.
 Because the interp template type will provide a toString member, it will pass 
 properly to functions such as writeln or text and work as expected without 
 any changes to the existing functions.
It won't work generally, however:

     void foo(string);
     struct S { string toString(); }
     void test() { S s; foo(s); }

fails.
 To pass two sequential interpolation strings to a function that accepts 
 interpolation strings, concatenation is not needed—separating the string 
 literals by a comma will suffice.
This will have weird consequences for overloading, i.e. distinguishing one combined argument from two distinct arguments.
 The complete specification of these translations is left up to the eventual 
 implementors and language maintainers.
In my experience, doing the detail design of things often reveals a fatal flaw.
 Compiler implementation
This section appears to confuse a definition of the feature with its implementation. It really should be labeled "Overload Resolution".

I am totally confused why it refers to InterpolationString for matching purposes, and yet says InterpolationSequence and InterpolationLiteral are used for function overloading. Can't have it both ways.
 with no further attempt to rewrite the sequence.
Does that mean there are multiple rewrites under other conditions?
 In the case where it does not match, the InterpolationString will be 
 rewritten as a call to a druntime library function named idup.
Does this imply a two-pass approach to overload resolutions? Try and fail, then try again with rewrites?
 If multiple InterpolationString tokens are used in a parameter list, the call 
 must match for the resulting expansion of all InterpolationString tokens, or 
 the entire expression will fail to match.
Which expansion, as there are two different expansions?

What about variadic parameters? Lazy parameters?

No examples given of trivial and non-trivial overload matches illustrating each step of this process.

The reason I'm being pedantic on the overloading is we've done hand-wavy overload rules before (alias this, cough cough) and eventually found out it was unworkable.
Jan 27
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/28/21 2:39 AM, Walter Bright wrote:
 #DIP1036
 
 Full Disclosure: I am not favorably disposed to this, as it is fairly 
 complicated and uses the GC.
I hope to alleviate your concerns. From the responses below, it seems like I have poorly conveyed the intentions of the DIP in many parts.
  > It can bind to a parameter list, but it does not have a type by itself.
 
 Makes no sense. What is it doing by "binding" to a parameter list? The 
 examples make no sense, either, because assert doesn't have a parameter 
 list.
Forgive my ignorance of the language spec and terminology. I want to say that basically if you write:

foo(i"Hello, ${name}")

It translates to:

foo(interp!"Hello, "(), name, interp!""())

Unless that doesn't match a valid overload, and if not, then it translates to:

foo(idup(interp!"Hello, "(), name, interp!""()))

Clearly I don't know how to say that properly. I'm thinking of a new way to say this with overloads (see overload blurb below). Hopefully this is better.

On assert, the fact that it won't match the expanded form means it will use the idup rewrite. That is intentional. I will make it clear that the rewrite will only happen for function or template argument lists.
 
  > idup
 
 What does this function look like?
The signature would look like:

S idup(Args...)(Args args) if (is(Args[0] : interp!S, S))

And it would be roughly equivalent to std.conv.text, but without much of the cruft of Phobos (likely it reuses some features already in druntime, such as miniFormat).
  > requires the GC
 
 D needs to move away from such constructs.
First, the DIP only requires the GC if idup is used. SOME form of allocation is needed. Follow the logic: you need a string from a set of arguments. This set of arguments is only knowable at runtime. Therefore you need a runtime allocation to hold the resulting string. Where should that allocated space come from? There is no possible string interpolation feature that results in an actual string that can be done without either adding a new allocation scheme to the language (i.e. reference counting), or using the GC. And it was very clear from the previous review, a string interpolation feature that cannot simply be assigned to or used as a string is a failed feature.
 
  > interp and idup
 
 Not clear when interp is called and when idup is called.
See overload blurb below.
 
  > With proper library definitions, if usage of a string interpolation 
 is an error, this DIP does not specify the language of the error 
 condition. It is our preference that the resulting error of the idup 
 call is emitted instead of the failed sequence match.
 
 Finish this rather than hand wave.
I can do this even though it's an implementation detail.
 
  > functions which accept interp literals
 
 what are "interp literals" ?
That should say InterpolationLiterals as defined in the description. It's an instantiation of the `interp` struct.
  > Because the type interp!"..." is not implicitly convertible to any 
 other type
 
 Why wouldn't it be?
I don't understand the question. D does not allow implicit conversion of library types without either alias this or inheritance.
 
  > This design is intentional to trigger the implicit idup call whenever 
 it is used for conventional string-accepting functions.
 
 I don't know how this might fit in with overload resolution.
See my blurb about overload resolution below.
 
  > "Best effort" functions
 
 I don't know what the definition of "best effort" is when applied to a 
 function.
Functions that accept any and all types of arguments, like writeln, and use a best effort to do something with them. These will never trigger the idup rewrite, which is why I talk about them in the DIP. You may roughly define a best effort function as one that accepts a vararg template parameter, and has no template constraints related to that list.
 
  > What became clear as the prior version was reviewed was that the 
 complexity of specifying format while transforming into a parameter 
 sequence was not worth adding to the language.
 
 I didn't think that was the conclusion. This DIP is much more complicated.
I disagree. This DIP is much simpler to use. It may be more complicated to implement, but that doesn't matter to the user of the language. The overload resolution is likely the only truly complex part to implement, since the rules are not easy to fit into the existing ones. The translation of the literal to InterpolationLiterals and expressions should actually be simpler than the previous DIP because no formatting specifiers are involved.
 
  > Because the interp template type will provide a toString member, it 
 will pass properly to functions such as writeln or text and work as 
 expected without any changes to the existing functions.
 
 It won't work generally, however:
 
      void foo(string);
      struct S { string toString(); }
      void test() { S s; foo(s); }
 
 fails.
I'm not sure if you understand the point of the statement. Functions such as writeln or text will work with interpolation literals. There is no attempt to say that it works with all functions, or that functions which accept strings will work with all types that define a toString member.

However, this will work with your foo and S above:

void test() { S s; foo(i"${s}"); }
 
  > To pass two sequential interpolation strings to a function that 
 accepts interpolation strings, concatenation is not needed—separating 
 the string literals by a comma will suffice.
 
 This will have weird consequences for overloading, i.e. distinguishing 
 one combined argument from two distinct arguments.
Identifying specific weird consequences would be most helpful.
  > The complete specification of these translations is left up to the 
 eventual implementors and language maintainers.
 
 In my experience, doing the detail design of things often reveals a 
 fatal flaw.
We are willing to write a library implementation for discussion. But the actual implementation does not affect the DIP. We are 100% confident an implementation of idup is possible (simply for the fact that std.conv.text exists).
 
  > Compiler implementation
 
 This section appears to confuse a definition of the feature with its 
 implementation. It really should be labeled "Overload Resolution".
OK, thank you for giving me the correct term! And also, this is a better frame of view than what I originally wrote from. See my new suggestion below.
 
 I am totally confused why it refers to InterpolationString for matching 
 purposes, and yet says InterpolationSequence and InterpolationLiteral 
 are used for function overloading. Can't have it both ways.
I'll make sure this is clearer.
 
  > with no further attempt to rewrite the sequence.
 
 Does that mean there are multiple rewrites under other conditions?
No. The point of this clarification is because the idup rewrite itself still is a function call that goes through the overload rules. I do not want to get into a recursive situation in the compiler where it tries foo(<expanded form>) then foo(idup(<expanded form>)), which for some reason doesn't match, and then tries foo(idup(idup(<expanded form>))) etc. The idup rewrite should contain no further possibility of rewriting.
 
  > In the case where it does not match, the InterpolationString will be 
 rewritten as a call to a druntime library function named idup.
 
 Does this imply a two-pass approach to overload resolutions? Try and 
 fail, then try again with rewrites?
My intention was for this to happen. But it only fails and tries the rewrite if there is no match (for function and template argument lists).
 
  > If multiple InterpolationString tokens are used in a parameter list, 
 the call must match for the resulting expansion of all 
 InterpolationString tokens, or the entire expression will fail to match.
 
 Which expansion, as there are two different expansions?
If you pass multiple interpolation string parameters into a function, then either all must be expanded or all must be rewritten to idup calls. There cannot be a mix of both rewritten or expanded forms matching.
 
 What about variadic parameters? Lazy parameters?
Good point on variadic parameters. We think they should not match the expanded form. The point here is that the function is likely not equipped to handle these things, and so passing a string instead will be more compatible. If you want to match the expanded form, you must use a variadic template.

Are there different overload rules for lazy parameters? I would expect:

foo(lazy string s)
bar(Args...)(lazy Args args) if (is(Args[0] : interp!S, string S))

to both accept string interpolation literals the same as the non-lazy equivalents would.
 
 No examples given of trivial and non-trivial overload matches 
 illustrating each step of this process.
I will add this.
 
 The reason I'm being pedantic on the overloading is we've done hand-wavy 
 overload rules before (alias this, cough cough) and eventually found out 
 it was unworkable.
I'm sorry for not being more detailed here. I am not experienced in the underlying details of overloads. In particular I would like to know cases that break this scheme either by making something not match when it should, or by using the wrong mechanism than is expected.

I can appreciate the point of view from the compiler side, and it's something we are lacking in experience. I am mostly focused on usability. I want to get it right, so that it's feasible to implement, whatever that takes.

--

Redo using Overloading instead of Compiler Implementation

I propose that instead of discussing the compiler implementation (that clearly was a mistake), the DIP should discuss the usage within the context of the existing overload rules. Here is what I would propose:

1. If a StringInterpolation token appears anywhere other than an argument to a function call or template, the idup rewrite is always done. This includes for assert and mixin.

2. If a StringInterpolation token appears in an argument list to a function or template, the compiler shall try overloads with the StringInterpolation token expanded into InterpolationLiteral and Expression data. If there are any matches to the call, overload resolution processes as normal, and no rewrite is performed.

3. If no matches are found in step 2, then the compiler retries the overload search substituting a call to idup with the sequence for each of the parameters.

I will have to come up with a list of examples to clarify.

-Steve
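The three proposed rules can be approximated in today's D by writing out by hand the two forms the compiler would try. This is only a sketch under stated assumptions: `interp` and `idupSketch` are hypothetical stand-ins rather than the DIP's actual druntime definitions, `idupSketch` simply defers to std.conv.text (which the thread names as a rough equivalent), and `takesString` and `takesExpanded` are illustrative names that appear nowhere in the DIP.

```d
import std.conv : text;

// Hypothetical stand-in for the DIP's InterpolationLiteral type.
struct interp(string s)
{
    string toString() const { return s; }
}

// Hypothetical stand-in for the proposed idup; assumes std.conv.text
// is an acceptable approximation of the eventual druntime function.
string idupSketch(Args...)(Args args)
    if (is(Args[0] : interp!S, string S))
{
    return text(args);
}

// Accepts only a plain string, so the expanded form cannot match it.
string takesString(string s)
{
    return "string: " ~ s;
}

// Accepts the expanded form directly through a variadic template.
string takesExpanded(Args...)(Args args)
    if (is(Args[0] : interp!S, string S))
{
    return text("pieces: ", Args.length);
}

void main()
{
    string name = "Walter";

    // Rule 1: outside an argument list, the idup rewrite is always used.
    string msg = idupSketch(interp!"Hello, "(), name, interp!""());
    assert(msg == "Hello, Walter");

    // Rule 2: the expanded form is tried first and matches takesExpanded.
    static assert(__traits(compiles,
        takesExpanded(interp!"Hello, "(), name, interp!""())));
    assert(takesExpanded(interp!"Hello, "(), name, interp!""()) == "pieces: 3");

    // Rule 3: the expanded form does not match takesString, so the call
    // would be retried with the idup rewrite.
    static assert(!__traits(compiles,
        takesString(interp!"Hello, "(), name, interp!""())));
    assert(takesString(idupSketch(interp!"Hello, "(), name, interp!""()))
        == "string: Hello, Walter");
}
```

The `__traits(compiles, ...)` checks only imitate the "try, then retry" order described above; how a real compiler would integrate this with overload resolution is exactly the open question in this exchange.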
Jan 28
prev sibling next sibling parent reply Dukc <ajieskola gmail.com> writes:
On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 [snip]
The DIP states that foo(i"a:${a}, ${b}.") is rewritten as `foo(Interp!"a:", a, Interp!", ", b, Interp!".")`. I think it's better to rewrite it as `foo(Interp!"a:", Interp!typeof(a)(a), Interp!", ", Interp!typeof(b)(b), Interp!".")`. That way, `foo` has easier time introspecting which came from the interpolated string.

The type of interpolated string literal is very special cased. The DIP states it is not an alias sequence, but that it behaves like one when passed to a function. And if that does not compile, it is treated as string instead. This is going to be full of all sorts of corner cases.

Let me suggest an alternative: the user manually chooses the type. For example, `i"hello ${world}"` would be rewritten as `idup(Interp!"hello ", Interp!typeof(world)(world))`, and `I"hello ${world}"` would be `AliasSeq!(Interp!"hello ", Interp!typeof(world)(world))`. And with latter I mean an honest alias sequence, not one with a special cased `.length` or anything like that.
Jan 28
next sibling parent Dukc <ajieskola gmail.com> writes:
On Thursday, 28 January 2021 at 08:35:34 UTC, Dukc wrote:
 On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker 
 wrote:
 [snip]
That way, `foo` has easier time introspecting which came from the interpolated string.
Meant: that way `foo` has easier time introspecting which arguments came from the interpolated string. I meant that the interpolated string might not be the only argument passed to `foo`.
Jan 28
prev sibling next sibling parent Dukc <ajieskola gmail.com> writes:
On Thursday, 28 January 2021 at 08:35:34 UTC, Dukc wrote:
 `I"hello ${world}"` would be `AliasSeq!(Interp!"hello ", 
 Interp!typeof(world)(world))`. And with latter I mean an honest 
 alias sequence, not one with a special cased `.length` or 
 anything like that.
Error again, this would not compile. Replace "alias sequence" with "expanded tuple" so that the rewritten snippet would be `tuple(Interp!"hello ", Interp!typeof(world)(world)).expand`. I don't mean that the compiler would rewrite the string to use `std.typecons.Tuple`, but that the resulting expanded tuple would be implemented just the same way.
Jan 28
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/28/21 3:35 AM, Dukc wrote:
 On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 [snip]
The DIP states that foo(i"a:${a}, ${b}.") is rewritten as `foo(Interp!"a:", a, Interp!", ", b, Interp!".")`. I think it's better to rewrite it as `foo(Interp!"a:", Interp!typeof(a)(a), Interp!", ", Interp!typeof(b)(b), Interp!".")`. That way, `foo` has easier time introspecting which came from the interpolated string.
First, I don't think it's critical for overloading, and will simply add to the template bloat. What are you going to do differently with `a` than you would with `Interp!(typeof(a))(a)`? Second, this removes any ref possibilities for the parameters. The parameters are guaranteed to start and end with an InterpolationLiteral, so one can assume that non-literal arguments are interspersed inside the literal.
 The type of interpolated string literal is very special cased. The DIP 
 states it is not an alias sequence, but that it behaves like one when 
 passed to a function. And if that does not compile, it is treated as 
 string instead. This is going to be full of all sorts of corner cases.
I was fully aware that this would be the most controversial part. I feel like it will not be full of corner cases, but I'm not sure. Can you specify any? Consider a normal string literal can be used as a string, immutable(char)*, wstring, or dstring. I find it very similar to this feature, and I don't feel like there are a lot of corner cases there.
 Let me suggest an alternative: the user manually chooses the type. For 
 example, `i"hello ${world}"` would be rewritten as `idup(Interp!"hello 
 ", Interp!typeof(world)(world))`, and `I"hello ${world}"` would be 
 `AliasSeq!(Interp!"hello ", Interp!typeof(world)(world))`. And with 
 latter I mean an honest alias sequence, not one with a special cased 
 `.length` or anything like that.
 
We have considered that. The problem is that people will use the string interpolation form without realizing the dangers or resulting bloat. For instance, writeln(i"Hello, ${name}"), if made to proactively generate a string just to send it to writeln is extremely wasteful when writeln(I"Hello, ${name}") is not. I feel like the auto rewrite is a better option because it does the right thing in all cases. The beauty of it is that the library author gets to decide whether it makes sense to accept the expanded form, the user is just saying "here's something string-like I want you to handle". It puts the decision in the right hands, while not being intrusive in case the library author doesn't want to deal with it. Consider also that code which uses a dual-literal system might have to use the string interpolation form because the library only allows that. Then at some point in the future, the library adds support for the expanded form. Now the user would have to go back and switch all usage to that new form, whereas an auto-rewrite would just work without changes. -Steve
Jan 28
prev sibling next sibling parent reply Kagamin <spam here.lot> writes:
auto convoluted = i"${ir"`${"{"}`"}"; // nested string interpolations work.
assert(convoluted == "`{`");

+InterpolatedString:
+    InterpolatedDoubleQuotedString
+    InterpolatedWysiwygString
+    InterpolatedAlternateWysiwygString
+    InterpolatedTokenString

Interpolated string should obey all escaping rules of the string 
literal it's derived from, and initial lexing of such string 
should be done with the same logic, and handling of interpolation 
sequences should be done on raw content of the lexed string after 
all due unescaping.

i`\${.}`
i"\\${.}"
These two should have the same meaning of escaped interpolation 
dollar sign, the escaped backslash becomes just backslash after 
double quote string unescaping, and this backslash is interpreted 
as interpolation escape sequence.
Jan 28
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/28/21 3:58 PM, Kagamin wrote:
 auto convoluted = i"${ir"`${"{"}`"}"; // nested string interpolations work.
 assert(convoluted == "`{`");
 
 +InterpolatedString:
 +    InterpolatedDoubleQuotedString
 +    InterpolatedWysiwygString
 +    InterpolatedAlternateWysiwygString
 +    InterpolatedTokenString
 
 Interpolated string should obey all escaping rules of the string literal 
 it's derived from, and initial lexing of such string should be done with 
 the same logic, and handling of interpolation sequences should be done 
 on raw content of the lexed string after all due unescaping.
 
 i`\${.}`
 i"\\${.}"
 These two should have the same meaning of escaped interpolation dollar 
 sign, the escaped backslash becomes just backslash after double quote 
 string unescaping, and this backslash is interpreted as interpolation 
 escape sequence.
DoubleQuotedString is the only string with EscapeSequence processing. Therefore we continued that same expectation. The Wysiwyg string types specifically allow single backslash to represent a backslash, and we did not want to change that behavior.

In TokenString, the sequence ${ tokens } are not valid D tokens, so escaping the sequence isn't fruitful. It may be something that is reasonable inside a string literal inside the token string, but I don't think that's worth the complexity. If you want escapes, use the double quoted form.

Note also, to wait until the entire string is lexed to process the interpolation sequences means the sequences would have to obey the rules of the string. This means something like:

i"hello ${firstname ~ " " ~ lastname}"

if processed as a raw string first, would look to the lexer like 2 string literals, one that is i"hello ${firstname ~ ", and one that is " ~ lastname}". You would have to escape the quotation marks, e.g.:

i"hello ${firstname ~ \" \" ~ lastname}"

making the example more convoluted than just passing the parameters directly. In some string types, this wouldn't be possible (i.e. inside a wysiwyg string, you could not use the end quote type (` or ") in your expression).

The lexer already has the capability of processing token strings, which is essentially what this is. It's just in the middle of another string sequence.

-Steve
Jan 29
prev sibling next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 This is the feedback thread for the second round of Community 
 Review of DIP 1036, "String Interpolation Tuple Literals".
DIP 1036 takes two different approaches to string interpolation and attempts to merge them together into a single proposal. In broad terms, those approaches can be characterized as follows:

1. The convenient approach: the language and runtime take care of string conversion and memory allocation for you, and you don't have to worry about any of the details.
2. The flexible approach: the language splits the string apart into interpolated and non-interpolated pieces, and it's up to you to decide what to do with them.

while missing some important details, appears to be fundamentally on the right track. Either one of these proposals would make a fine DIP on its own. The problem with DIP 1036 is in the way it attempts to combine the two.

Fundamentally, the goal that DIP 1036 is aiming for is to give the programmers who want convenience the convenient version, and to give programmers who want flexibility the flexible version. While this is an admirable goal, fully achieving it requires reading the programmer's mind, which is infeasible given D's current level of compiler technology.

So what DIP 1036 does is attempt to *guess* what the programmer wants, using a rather crude heuristic: if the code compiles with the flexible version, the compiler is to assume that's what the programmer wants; otherwise, it assumes they want the convenient version.

As with any heuristic or approximation, there are edge cases where this breaks down. One of them is called out in the DIP itself--type inference via `auto`--but it is not hard to imagine others. For example, a programmer who writes

    tuple(i"Good morning ${name}", i"Good evening ${name}")

...is probably not going to get what they intended, even though their code compiles.

Every D programmer who wants to make effective use of DIP 1036's interpolation literals will have to go through the process of learning when .idup is required, when it's optional, when it's allowed-but-unnecessary, and when it's forbidden--which means that, in practice, they will have to learn how it actually works, under the hood. This is not a desirable trait for a language feature that's intended to make programming *easier*. Ultimately, I think attempting to guess the programmer's intent is the wrong way to go here. Either force them to spell it out explicitly (with a call to .idup, .text, etc.), or take away the choice and give up on one of the two approaches.
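
To make the tuple edge case concrete, here is a minimal sketch in present-day D. It assumes the lowering shown elsewhere in this thread (literal pieces become `interp!"..."()` values, expressions are passed through) and writes that lowering out by hand. The `interp` struct and the `lowered` helper are hypothetical stand-ins for illustration, not the DIP's actual definitions.

```d
import std.typecons : tuple;

// Hypothetical stand-in for the DIP's `interp` template (assumption, not a real library type).
struct interp(string s) {}

// Hand-written approximation of how
//   tuple(i"Good morning ${name}", i"Good evening ${name}")
// would look if each literal expanded into its pieces.
auto lowered(string name)
{
    return tuple(interp!"Good morning "(), name,
                 interp!"Good evening "(), name);
}

void main()
{
    auto t = lowered("Alice");

    // Four fields instead of the two strings the programmer probably expected.
    static assert(typeof(t).Types.length == 4);
    static assert(!is(typeof(t[0]) == string));
    static assert(is(typeof(t[1]) == string));
    assert(t[1] == "Alice" && t[3] == "Alice");
}
```

Under these assumptions the call compiles, but the result is a four-field tuple rather than a pair of strings, which is the surprise described above.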
Jan 28
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/28/21 6:06 PM, Paul Backus wrote:
 On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 This is the feedback thread for the second round of Community Review 
 of DIP 1036, "String Interpolation Tuple Literals".
DIP 1036 takes two different approaches to string interpolation and attempts to merge them together into a single proposal. In broad terms, those approaches can be characterized as follows: 1. The convenient approach: the language and runtime take care of string conversion and memory allocation for you, and you don't have to worry about any of the details. 2. The flexible approach: the language splits the string apart into interpolated and non-interpolated pieces, and it's up to you to decide what to do with them. missing some important details, appears to be fundamentally on the right track. Either one of these proposals would make a fine DIP on its own. The problem with DIP 1036 is in the way it attempts to combine the two. Fundamentally, the goal that DIP 1036 is aiming for is to give the programmers who want convenience the convenient version, and to give programmers who want flexibility the flexible version. While this is an admirable goal, fully achieving it requires reading the programmer's mind, which is infeasible given D's current level of compiler technology. So what DIP 1036 does is attempt to *guess* what the programmer wants, using a rather crude heuristic: if the code compiles with the flexible version, the compiler is to assume that's what the programmer wants; otherwise, it assumes they want the convenient version. As with any heuristic or approximation, there are edge cases where this breaks down. One of them is called out in the DIP itself--type inference via `auto`--but it is not hard to imagine others. For example, a programmer who writes     tuple(i"Good morning ${name}", i"Good evening ${name}") ...is probably not going to get what they intended, even though their code compiles.
This is quite the unique edge case though. Any proposal that provides the flexible version is going to have trouble with tuple as it is now. DIP1027, Jonathan Marler's PR, etc would all do something "unintended" (though one could argue it may be intended, depending on the usage). There are solutions that can be had. For instance, tuple could be instrumented never to accept parameters that contain interpolation literals. Therefore, the idup rewrite happens, and they get what they expect. This is not hard to solve. But sure, one can definitely find edge cases for any proposal, especially in D where one can just write: auto foo(T...)(T args) { ... }
 
 Every D programmer who wants to make effective use of DIP 1036's 
 interpolation literals will have to go through the process of learning 
 when .idup is required, when it's optional, when it's 
 allowed-but-unnecessary, and when it's forbidden--which means that, in 
 practice, they will have to learn how it actually works, under the hood. 
This is not my interpretation at all. I can't think of a reasonable case aside from your tuple example where idup is required (if that's what you want). Can you? And the tuple issue can be solved quite easily.
 This is not a desirable trait for a language feature that's intended to 
 make programming *easier*.
Your logic is not very sound. It's ironic to say the language doing what you expect for 99% of cases is a higher burden than requiring you to write it yourself for 100% of cases.
 
 Ultimately, I think attempting to guess the programmer's intent is the 
 wrong way to go here. Either force them to spell it out explicitly (with 
 a call to .idup, .text, etc.), or take away the choice and give up on 
 one of the two approaches.
I want to say something about this idea of doing only one or the other. Let's say you want to create a language feature for string interpolation that always results in a string. Because you need to process the values and convert them to string data at runtime, you need a library function that accepts the data. How do you write such a library function? If this was D1, the way would be to use variadic parameters, passing the TypeInfo of each item, and a void *, and then you'd have to provide a universal way to convert all types to strings. But this is D2. So the correct way to do this is to pass a template variadic argument list to a runtime function, and have the runtime function do the conversion. If you are doing that, you are ALREADY having the compiler generate the expanded form. It's just always preventing anyone from using it directly. I can't imagine a DIP with this kind of "implementation detail" hiding the actual treasure of the feature being acceptable to the community. If you want to implement just the tuple version, now you run into the unpleasant result of having the simplest expected thing fail. That is, when you assign what looks like a string to a string, it fails. When passing what looks like a string into a string parameter, it fails. Both of these options are feasible, and can be argued for, but are not what I would expect for a D language feature. The first option is hiding everything useful from the user instead of using the power of D metaprogramming. The second option is right up D's alley, but places a heavy burden on those who want to just use strings. The combination of both is a cohesive whole which leaves very small edge cases to be dealt with by function authors (and provides a straightforward clear way to do so). If this DIP gets accepted that does not mean the work is done. It means the opportunity is opened for people to take advantage of it. 
Yes, there are edge cases, but the solutions are apparent and readily available, and the default behavior without such changes is still reasonable. And this does not take away from the fact that the vast, vast majority of non-edge cases *just work*. -Steve
Jan 29
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
 provides a call that is free of sql injection attacks
This is a strong claim that requires substantiation, especially since sql injection attacks are a critical problem.
Jan 29
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/29/21 3:28 AM, Walter Bright wrote:
  > provides a call that is free of sql injection attacks
 
 This is a strong claim that requires substantiation, especially since 
 sql injection attacks are a critical problem.
It's trivially true. The mysql_query function can know that interp!"SELECT * FROM" type cannot be from sql injection because the string was known at compile time. All runtime parameters are identified because they are NOT interp structs, and therefore can use the correct mechanism (prepared statements) that does not suffer from sql injection attacks. -Steve
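
The argument above can be sketched as follows. This is only an illustration under stated assumptions: `interp`, `Prepared`, and `prepare` are hypothetical names, the lowering is written by hand, and no real MySQL binding is involved. The point is that pieces typed as `interp!"..."` are known at compile time and go into the SQL text, while every other argument becomes a bound parameter.

```d
import std.conv : to;
import std.traits : isInstanceOf;

// Hypothetical stand-in for the DIP's `interp` template (assumption).
struct interp(string s)
{
    enum value = s;
}

// Hypothetical result type: SQL text with placeholders plus the bound values.
struct Prepared
{
    string sql;
    string[] params;
}

// Compile-time literal pieces are appended to the SQL text verbatim;
// every runtime argument is replaced by a `?` placeholder and bound separately.
Prepared prepare(Args...)(Args args)
{
    Prepared p;
    foreach (arg; args) // unrolled at compile time over the argument sequence
    {
        static if (isInstanceOf!(interp, typeof(arg)))
            p.sql ~= typeof(arg).value;
        else
        {
            p.sql ~= "?";
            p.params ~= arg.to!string;
        }
    }
    return p;
}

void main()
{
    string name = "Robert'); DROP TABLE students;--";

    // Hand-written lowering of i"SELECT * FROM users WHERE name = ${name}".
    auto q = prepare(interp!"SELECT * FROM users WHERE name = "(), name);

    // The hostile input never reaches the SQL text.
    assert(q.sql == "SELECT * FROM users WHERE name = ?");
    assert(q.params == [name]);
}
```

Whether this is sufficient in practice still depends on the function author actually using placeholders for every non-`interp` argument; the type distinction only makes that possible.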
Jan 29
prev sibling next sibling parent reply SealabJaster <sealabjaster gmail.com> writes:
On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 ...
Reposting this here at Walter's request:

Finally, and I'm sorry if I missed this being described in the DIP, but consider this case (and similar):

```
void f(T)(T array) if(isArray!T)
{
    //...
}

f(i"...");
```

What would happen here? My best guess is that it won't compile unless written as `f(i"...".idup)` which then introduces an odd discrepancy where there's certain usages that still require explicitly calling .idup, e.g. `i"...".idup.splitter`
Jan 29
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/29/21 6:49 AM, SealabJaster wrote:
 On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 ...
Reposting this here at Walter's request: Finally, and I'm sorry if I missed this being described in the DIP, but consider this case (and similar): ``` void f(T)(T array) if(isArray!T) {     //... } f(i"..."); ``` What would happen here? My best guess is that it won't compile unless written as `f(i"...".idup)` which then introduces an odd discrepancy where there's certain usages that still require explicitly calling .idup, e.g. `i"...".idup.splitter`
Sorry for not responding in the discussion thread. The above will work fine with the DIP. What will happen is that isArray!(interp!"...") will be false, and there will be no match for f. Therefore, the compiler will rewrite as f(i"...".idup), which will translate correctly to f("..."). If you still have questions, please ask on discussion thread and I can answer them. -Steve
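
The first half of that explanation can be checked in present-day D with a hand-written stand-in. In this sketch, `interp` is a hypothetical placeholder for the DIP's type, and `f` returns a length only so the call can be verified. The fallback rewrite to `.idup` is a proposed compiler behavior and is simulated here by passing a plain string.

```d
import std.traits : isArray;

// Hypothetical stand-in for the DIP's `interp` template (assumption).
struct interp(string s) {}

// The constrained template from the question, returning a value so it can be checked.
size_t f(T)(T array) if (isArray!T)
{
    return array.length;
}

void main()
{
    // The constraint rejects the interp type, so the expanded form finds no match.
    static assert(!isArray!(interp!"hello"));
    static assert(!__traits(compiles, f(interp!"hello"())));

    // What the proposed fallback would effectively call: f with an ordinary string.
    assert(f("hello") == 5);
}
```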
Jan 29
prev sibling next sibling parent reply Q. Schroll <qs.il.paperinik gmail.com> writes:
On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 * All posts must be a direct reply to the DIP manager's initial 
 post, with only two exceptions:
A problem I see is that i"..." becomes an interp sequence too easily. D's templates have a low entrance barrier and passing an interpolated string to a variadic function template is something not too outlandish to do. Then, by the current form of the proposal, the interpolated string becomes an `interp` sequence. This may be completely unexpected as most variadic function templates won't be written with an interp sequence in mind at all. Consider using

    auto tup = tuple(1, i"I'm ${name}.", 2.3)

expecting Tuple!(int, string, double). First, tup.length != 3 and its contents are wild.

Basically, an interpolated string should be a string, except in very specific circumstances; variadic templates aren't even close to being a very specific circumstance.

The typing is better done akin to slices vs static arrays. As a reminder, to a newcomer,

    auto xs = [ 1, 2, 3 ];

looks like it would infer int[3] as the type of xs. It is obviously the most descriptive type for that literal. Why would it infer int[], forgetting its compile-time known length, and even do an allocation? That seems so much worse. At least, for consistency, typeof([1,2,3]) is int[], too, and not int[3]. We know why D does it the way it does, and why it uses int[3] with no allocation only when requested explicitly. You can do that with a template with a flexible length like this:

    void takesStaticArray(T, size_t n)(T[n] staticArray);

Here, `T` and `n` can usually be inferred from the argument:

    takesStaticArray([1, 2, 3]);

won't allocate.

Interpolated strings should behave similarly:

1. typeof(i"...") is `string`.
2. auto str = i"..." infers string (cf. typeof) and gc-allocates if necessary.
3. i"..." has a secondary type akin to [1,2,3] having int[3] as a secondary type. (The secondary type is a sequence, but that's a detail.)

If an interpolated string is bound to a parameter of that secondary type (`interp` in the DIP) it uses its secondary type (cf. calling takesStaticArray with [1,2,3]). In any other case, e.g. 
`auto` or otherwise generic template parameters will infer string. My suggestion is that one cannot do this: ResultSeq mysql_query(Args...)(Args args) if (Args.length > 0 && isInstanceOf!(interp, Args[0])) { ... } because Args... is too unspecific to force i"..." to become an interp sequence. Therefore auto rseq = mysql_query(i"..."); results in mysql_query!string and then fails the constraint. You have to request an interp sequence explicitly like this ResultSeq mysql_query(string first, Args...)(interp!first arg, Args args) to force the secondary type. This is akin to takesStaticArray(size_t n, T)(T[n]). Since it cannot bind an int[] However, if you already have an interp!"..." object in hand, of course passing it to a template will use its static type interp!"..." (and not string), like an int[3] variable will, too. This is fine, because at this point, you know what you're dealing with and that it is not a string. Akin to `staticArray`, Phobos can (should) supply a template `asInterp` that forces a i"..." literal to be statically typed as a tuple around the respective `interp` sequence. auto asInterp(string str, Args...)(interp!str first, Args rest) { import std.typecons : tuple; return tuple(first, rest); } Function templates handling interpolated strings can either provide an overload taking a tuple containing an interp!"..." as its first entry (likely result of asInterp) or require the user to use `expand` on the tuple. Example: https://run.dlang.io/is/h5LgF4 Getting a string is probably what most users expect most of the time, even in templates. Handling the secondary type must be explicit. It is almost an implementation detail that shouldn't be exposed to the user too easily.
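
The slice versus static array behavior that this analogy rests on can be verified directly. The sketch below uses only the `takesStaticArray` signature from the post, given a return value so the inferred length is observable; nothing in it depends on the DIP.

```d
// Returns the statically inferred length so the inference can be checked.
size_t takesStaticArray(T, size_t n)(T[n] staticArray)
{
    return n;
}

void main()
{
    // With `auto`, an array literal is typed as a slice.
    auto xs = [1, 2, 3];
    static assert(is(typeof(xs) == int[]));
    static assert(is(typeof([1, 2, 3]) == int[]));

    // Bound to a T[n] parameter, the same literal is taken as a static array,
    // and both T and n are inferred from it.
    assert(takesStaticArray([1, 2, 3]) == 3);

    // An explicitly typed variable keeps its static type when passed on.
    int[3] ys = [1, 2, 3];
    assert(takesStaticArray(ys) == 3);
}
```

The suggestion in the post is that i"..." should pick between `string` and the `interp` sequence by the same kind of rule.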
Feb 03
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 3 February 2021 at 17:47:19 UTC, Q. Schroll wrote:
 Akin to `staticArray`, Phobos can (should) supply a template 
 `asInterp` that forces a i"..." literal to be statically typed 
 as a tuple around the respective `interp` sequence.
Something to keep in mind is that tuples cannot actually be wrapped in D. As soon as they are anything other than slices of themselves or directly used as function parameters, they start losing information.
     auto asInterp(string str, Args...)(interp!str first, Args 
 rest)
     {
         import std.typecons : tuple;
         return tuple(first, rest);
     }
Like here, if there was any refness in rest, it is gone in the return value. If it was used as a template argument and had alias parameters, that information is lost. That's the big driver toward the naked tuple - it is just the *only* way to express all the possibilities D has to offer.

Conceptually, dip 1036 is trying to express i"foo" becoming:

    struct Interpolated(Pieces...) {
        Pieces... pieces;
        string toString() { return text(pieces); }
        alias toString this;
        alias args this;
    }

That is, a new type with two different implicit conversions depending on appropriate context. But the fact is that's flat-out impossible in today's D, so the dip makes some compromises to get as close as it can with the simple naked tuple on one hand and the implicit conversion rules specified on the other.
 Getting a string is probably what most users expect most of the 
 time, even in templates.
That's why the implicit conversion part is in the dip, but your point about the static arrays is an interesting alternative approach. BTW this part of the argument barely matters to me. I will probably *never* use the implicit to string thing since it is hard for me to think of a situation where actually passing the tuple to the function isn't a better approach.
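
The loss of ref-ness when a sequence is wrapped can be shown with a small sketch. `bumpAll` is a hypothetical helper introduced only for this illustration; the wrapping uses `std.typecons.tuple` as in the quoted `asInterp` example.

```d
import std.typecons : tuple;

// Increments every argument in place; `auto ref` keeps lvalue arguments as references.
void bumpAll(Args...)(auto ref Args args)
{
    static foreach (i; 0 .. Args.length)
        ++args[i];
}

void main()
{
    int a = 1;

    // Passed directly as a function argument, `a` is modified through the reference.
    bumpAll(a);
    assert(a == 2);

    // Wrapped in a tuple, the value is copied: later changes hit the copy only.
    auto t = tuple(a);
    bumpAll(t.expand);
    assert(t[0] == 3);
    assert(a == 2);
}
```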
Feb 03
parent Q. Schroll <qs.il.paperinik gmail.com> writes:
On Wednesday, 3 February 2021 at 18:17:03 UTC, Adam D. Ruppe 
wrote:
 On Wednesday, 3 February 2021 at 17:47:19 UTC, Q. Schroll wrote:
 Akin to `staticArray`, Phobos can (should) supply a template 
 `asInterp` that forces a i"..." literal to be statically typed 
 as a tuple around the respective `interp` sequence.
Something to keep in mind is that tuples cannot actually be wrapped in D. As soon as they are anything other than slices of themselves or directly used as function parameters, they start losing information.
     auto asInterp(string str, Args...)(interp!str first, Args 
 rest)
     {
         import std.typecons : tuple;
         return tuple(first, rest);
     }
Like here, if there was any refness in rest, it is gone in the return value. If it was used as a template argument and had alias parameters, that information is lost.
I never thought of that `asInterp` as the summit of creation but rather as a proof of concept. I thought about it for like 3 minutes. The much better solution would be something akin to alias asInterp(alias interp!str first, string str, rest...) = AliasSeq!(first, rest); used as asInterp!i".." but unfortunately, this does not compile. (This was my first idea.) D cannot infer `str` from the value passed to it. It can for `interp!str` a run-time parameter type, but not for a template parameter. Since aliases are mere names, `ref`ness and stuff would be preserved (I guess so, since otherwise, forward wouldn't work). Another, actually very simple, idea would be to give an i"..." a property asInterp that (cf. tuple's expand) returns the interp sequence. I would have spelled out my critique differently had I read all feedback before; tuple was already mentioned. But the main point is: variadic templates are too common to assume they're all ready to take an interp sequence. As a very simple example different from tuple, consider emplacement functions that forward their arguments to constructors; all of them are variadic templates taking on almost anything. String arguments aren't too uncommon and wherever strings are used, we must assume that sooner or later someone uses an interpolated string. All of those variadic function templates would have to anticipate interpolated strings. When one thinks about handling strings, interpolated ones aren't too far fetched, but when one thinks about arbitrary types, interpolated strings aren't really the first or second thought. Variadic templates are the prime example of thinking of arbitrary types. The nogc people will hate it, but immediately making i"..." a string through allocation -- except "very specific circumstances" that leave absolutely no room for doubt -- is the only way this will interact with other language features in an intuitive manner. Having to use idup explicitly in any case is bad design. 
It makes interpolated strings inaccessible. They'd become a tool of the professionals and gurus, although being first and foremost designed to be an accessible productivity feature. If code can be nogc, programmers should have means to do that -- but that's it. It need not be stupidly easy at the expense of making an accessible productivity feature worse than it need be.
Feb 03
prev sibling parent reply Daniel N <no public.email> writes:
On Wednesday, 27 January 2021 at 10:33:53 UTC, Mike Parker wrote:
 This is the feedback thread for the second round of Community 
 Review of DIP 1036, "String Interpolation Tuple Literals".
There are many great qualities of DIP1036 and I hope it gets accepted, good luck! But one area which I think could need improvement is:

1) The lowering (either that or I totally overlooked something obvious again).

    writeln(i"I ate ${apples} apples and ${bananas} bananas totalling ${apples + bananas} fruit.");

DIP1036 proposes the following rewrite:

    writeln(interp!"I ate "(), apples, interp!" apples and "(), bananas, interp!" bananas totalling "(), apples + bananas, interp!" fruit."());

Wouldn't this lowering be both simpler and more efficient?

    writeln(interp!"I ate "(), apples, " apples and ", bananas, " bananas totalling ", apples + bananas, " fruit.");

Actually couldn't we go even further?

    immutable interp interp_tag; // an instance of some unique type with as low overhead as possible
    writeln(interp_tag, "I ate ", apples, " apples and ", bananas, " bananas totalling ", apples + bananas, " fruit.");

This way you can overload with a normal function argument instead of using a template constraint and NO single template instance is created.

2) Also no fan of using abbreviated type names (interp) in user-facing APIs.
Feb 03
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 3 February 2021 at 20:02:57 UTC, Daniel N wrote:
 Wouldn't this lowering be both simpler and more efficient?
Your change wouldn't expose the string for compile time processing. That's the real benefit of interp!"string" - it is available for CTFE rewriting by the function. So putting that on the whole thing means a whole format string or translation string or whatever else can be CTFE generated, inspected, etc.
 2) Also no fan of using abbreviated type names (interp) in 
 user-facing APIs.
yeah, it is ambiguous with "interpretation" too.
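
The compile-time processing point above can be sketched like this. Because each literal piece is carried in the type as `interp!"..."`, a function can assemble an entire format string during compilation from the argument types alone. `interp` and `buildFormat` are hypothetical names for illustration; the sketch does not escape `%` characters in the literal pieces, which a real implementation would need to handle.

```d
import std.format : format;
import std.traits : isInstanceOf;

// Hypothetical stand-in for the DIP's `interp` template (assumption).
struct interp(string s)
{
    enum value = s;
}

// Builds a format string from the argument types only, so it can run in CTFE.
// Literal pieces are copied through; every other type becomes a %s placeholder.
string buildFormat(Args...)()
{
    string fmt;
    foreach (T; Args) // unrolled over the type sequence
    {
        static if (isInstanceOf!(interp, T))
            fmt ~= T.value;
        else
            fmt ~= "%s";
    }
    return fmt;
}

void main()
{
    // Types of a hand-written lowering of i"I ate ${apples} apples and ${bananas} bananas."
    enum fmt = buildFormat!(interp!"I ate ", int,
                            interp!" apples and ", int,
                            interp!" bananas.")();

    // The whole format string exists at compile time.
    static assert(fmt == "I ate %s apples and %s bananas.");
    assert(format(fmt, 3, 4) == "I ate 3 apples and 4 bananas.");
}
```

With plain `string` arguments for the literal pieces, as in the alternative lowering, the same text would only be available at run time.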
Feb 03