digitalmars.D - Multiline string literal improvements

Igor <stojkovic.igor gmail.com> writes:
D has a very nice feature of token strings:
string a = q{
                looksLikeCode();
             };

It is useful for writing mixins and for getting syntax highlighting
in editors. Although I like it, it is not something I ever missed
in other languages. What I do always miss are these two
options:

1. Have something like this:

string a = |q{
                  firstLine();
                  if (cond) {
                      secondLine()
                  }
               };

mean: count the number of whitespace characters at the start of the
first new line of the string literal, then strip up to that many
whitespace characters from the start of each line.

2. If we put, for example, "-" instead of "|" in the above example,
have that mean: replace every run of whitespace in the following
string literal with a single space.

I think it is clear why these would be useful, but if you want
I can add a few examples. This would not introduce any breaking
changes to the language, and it should be possible to implement
it wholly in the lexer.

So what do you think?
Oct 10
captaindet <2krnk gmx.net> writes:
 string a = |q{
                   firstLine();
                   if (cond) {
                       secondLine()
                   }
                };
you could write your own string processing function according to your needs to filter the code string, and use it like

string a = inject(q{...})
//or
string a = inject!(formatOpts)(q{...})

i have done this myself and also included positional argument formatting to my liking, optimized for CT code generation. don't have the code at my hands ATM though. could post it later if you are interested.

/det
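The actual inject code was never posted in the thread, so the following is only a guess at what such a helper might look like. It assumes a hypothetical `$0`, `$1`, ... placeholder convention for the positional arguments and leans on `std.string.outdent` for the dedenting; the name `inject` is taken from the post, everything else is an assumption.

```d
import std.array : replace;
import std.conv : to;
import std.string : outdent, strip;

// Hypothetical helper, NOT captaindet's code: dedent a code string and
// substitute the placeholders $0, $1, ... with the given arguments.
string inject(string code, string[] args...)
{
    string result = code.outdent().strip();
    // Substitute the highest index first so "$1" never clobbers part of "$10".
    foreach_reverse (i, arg; args)
        result = result.replace("$" ~ i.to!string, arg);
    return result;
}

// Initializing an enum forces compile-time evaluation (CTFE).
enum decl = inject(q{
    enum $0 = $1;
}, "answer", "42");

static assert(decl == "enum answer = 42;");

mixin(decl); // declares `enum answer = 42;` at module scope
static assert(answer == 42);
```

Because the whole thing is an ordinary function, the placeholder syntax and the dedent rules can be changed freely without touching the language.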
Oct 10
sarn <sarn theartofmachinery.com> writes:
On Tuesday, 10 October 2017 at 21:38:41 UTC, captaindet wrote:
 string a = |q{
                   firstLine();
                   if (cond) {
                       secondLine()
                   }
                };
 you could write your own string processing function according to your needs

FWIW, that's the solution in Python:
https://docs.python.org/release/3.6.3/library/textwrap.html#textwrap.dedent

Works even better in D because it can run at compile time.
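For comparison, Phobos already has a rough counterpart to textwrap.dedent in `std.string.outdent`. A minimal sketch, assuming the literal is indented with spaces only and that the leading and trailing blank lines should be dropped with `strip`:

```d
import std.string : outdent, strip;

// Initializing an enum forces evaluation at compile time (CTFE).
enum code = q{
    firstLine();
    if (cond) {
        secondLine();
    }
}.outdent().strip();

// The common four-space indent is gone; the relative indent is kept.
static assert(code == "firstLine();\nif (cond) {\n    secondLine();\n}");
```

Note that outdent throws on inconsistent indentation (for example mixed tabs and spaces), which at compile time shows up as a compile error.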
Oct 10
Walter Bright <newshound2 digitalmars.com> writes:
On 10/10/2017 3:16 PM, sarn wrote:
 Works even better in D because it can run at compile time.
Yes, I see no need for a language feature for what can be easily and far more flexibly done with a regular function - especially since the syntax gives no clue as to what |q{ and -q{ do.
Oct 11
Igor <stojkovic.igor gmail.com> writes:
On Wednesday, 11 October 2017 at 08:35:51 UTC, Walter Bright wrote:
 Yes, I see no need for a language feature for what can be easily and far more flexibly done with a regular function - especially since the syntax gives no clue as to what |q{ and -q{ do.
You are right. My mind is just still not used to the power of D templates so I didn't think of this. On the other hand that is why D is still making me say "WOW!" on a regular basis :).

Just to confirm I understand, for example the following would give me compile time stripping of white space:

template stripws(string l) {
    enum stripws = l.replaceAll(regex("\s+", "g"), " ");
}

string variable = stripws(q{
    whatever and ever;
});

And I would get variable to be equal to " whatever and ever; ". Right?
Oct 11
Meta <jared771 gmail.com> writes:
On Wednesday, 11 October 2017 at 09:56:52 UTC, Igor wrote:
 You are right. My mind is just still not used to the power of D templates so I didn't think of this. On the other hand that is why D is still making me say "WOW!" on a regular basis :).

 Just to confirm I understand, for example the following would give me compile time stripping of white space:

 template stripws(string l) {
     enum stripws = l.replaceAll(regex("\s+", "g"), " ");
 }

 string variable = stripws(q{
     whatever and ever;
 });

 And I would get variable to be equal to " whatever and ever; ". Right?
Even better, you could write the same code that you would for doing this at runtime and it'll Just Work:

string variable = q{
    whatever and ever;
}.replaceAll(regex(`\s+`, "g"), " ");
Oct 11
Igor <stojkovic.igor gmail.com> writes:
On Wednesday, 11 October 2017 at 14:28:32 UTC, Meta wrote:
 Even better, you could write the same code that you would for doing this at runtime and it'll Just Work:

 string variable = q{
     whatever and ever;
 }.replaceAll(regex(`\s+`, "g"), " ");
I tried this but Disassembly view shows:

call std.regex.regex!string.regex
and
call std.regex.replaceAll!(string, char, std.regex.internal.ir.Regex!char).replaceAll

which means that replaceAll with regex is done at runtime, not compile time. Also when I just added enum in front of string variable then I got this:

Error: malloc cannot be interpreted at compile time, because it has no available source code
Oct 12
Meta <jared771 gmail.com> writes:
On Thursday, 12 October 2017 at 08:08:17 UTC, Igor wrote:
 I tried this but Disassembly view shows:

 call std.regex.regex!string.regex
 and
 call std.regex.replaceAll!(string, char, 
 std.regex.internal.ir.Regex!char).replaceAll

 which means that replaceAll with regex is done at runtime, not 
 compile time. Also when I just added enum in front of string 
 variable then I got this:

 Error: malloc cannot be interpreted at compile time, because it 
 has no available source code
Hmm, you're right. I could've sworn that std.regex is CTFE-friendly but it looks like I was wrong. If it used the GC instead of malloc this would probably work.
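One way around the std.regex limitation is to skip regex for this particular job. A minimal sketch, assuming only ASCII whitespace needs collapsing; `collapseWhitespace` is a made-up name, not a Phobos function:

```d
import std.ascii : isWhite;

// Hypothetical helper: replace every run of ASCII whitespace with one space.
// It only appends to a string, so CTFE can interpret it (no malloc involved).
string collapseWhitespace(string s) pure @safe
{
    string result;
    bool inRun = false;
    foreach (char c; s)
    {
        if (isWhite(c))
        {
            // Emit a single space for the first character of a run only.
            if (!inRun)
                result ~= ' ';
            inRun = true;
        }
        else
        {
            result ~= c;
            inRun = false;
        }
    }
    return result;
}

// Initializing an enum forces the call to run at compile time.
enum collapsed = collapseWhitespace("  whatever\n\t and   ever;  ");
static assert(collapsed == " whatever and ever; ");
```

Declaring the result as `enum` (or using it in a `static assert`) is what guarantees compile-time evaluation; a plain `string variable = ...` initializer inside a function is evaluated at runtime, which matches what the disassembly showed.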
Oct 12
Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On Thursday, 12 October 2017 at 16:59:46 UTC, Meta wrote:
 On Thursday, 12 October 2017 at 08:08:17 UTC, Igor wrote:
 I tried this but Disassembly view shows:
[snip]
 Hmm, you're right. I could've sworn that std.regex is 
 CTFE-friendly but it looks like I was wrong. If it used the GC 
 instead of malloc this would probably work.
Indeed, it's been ongoing work to make regex match at CTFE. I considered peppering the code paths with

__ctfe ? new[] ... : malloc ...

but there are a lot of corner cases, and the run-time optimized code path is already quite impenetrable.
Oct 12
Jacob Carlborg <doob me.com> writes:
On 2017-10-11 10:35, Walter Bright wrote:
 On 10/10/2017 3:16 PM, sarn wrote:
 Works even better in D because it can run at compile time.
 Yes, I see no need for a language feature for what can be easily and far more flexibly done with a regular function - especially since the syntax gives no clue as to what |q{ and -q{ do.
Unfortunately it doesn't work for the other multiline syntax:

void main()
{
    auto a = q"FOO
        int b = 3;
    FOO";
}

The above fails to compile [1]. The trailing FOO cannot be indented. This works:

void main()
{
    auto a = q"FOO
        int b = 3;
FOO";
}

Which in my opinion doesn't look as good as the first example. It gets worse if "a" is indented even more, because it's nested in a class, in a method, in an if statement and so on.

[1] main.d(3,14): Error: unterminated delimited string constant starting at main.d(3,15)

--
/Jacob Carlborg
Oct 12
Biotronic <simen.kjaras gmail.com> writes:
On Tuesday, 10 October 2017 at 22:16:00 UTC, sarn wrote:
 FWIW, that's the solution in Python:
 https://docs.python.org/release/3.6.3/library/textwrap.html#textwrap.dedent

 Works even better in D because it can run at compile time.
D version that works in CTFE:

import std.ascii : newline;

string dedent(string s, string newline = newline)
{
    import std.string : strip, splitLines, front, join;
    import std.uni : isWhite;
    import std.array : array;
    import std.algorithm : until, startsWith;

    auto lines = s.strip().splitLines();
    if (lines.length == 0) return "";
    if (lines.length == 1) return lines[0];

    // The indentation of the second line is the prefix removed from each line.
    auto whitespace = lines[1].until!(a => !a.isWhite).array;
    foreach (ref line; lines)
    {
        if (line.startsWith(whitespace))
        {
            line = line[whitespace.length..$];
        }
        // Throw if line doesn't start with correct amount of whitespace?
    }
    return lines.join(newline);
}

unittest
{
    assert(dedent("a") == "a");
    assert(dedent("a\n a") == "a"~newline~"a");
    string a = q{
        firstLine();
        if (cond) {
            secondLine();
        }
    };
    string b = "firstLine();"~newline~"if (cond) {"~newline~"    secondLine();"~newline~"}";
    assert(a.dedent == b);
}

template dedent(string s, string newline = newline)
{
    enum dedent = .dedent(s, newline);
}

unittest
{
    assert(dedent!("a\n a", "\n") == "a\na");
}

Treats all whitespace the same ('\t' == ' ' == MONGOLIAN VOWEL SEPARATOR), which might not be optimal, but I'm not gonna touch that can of worms.

--
   Biotronic
Oct 13
Igor <stojkovic.igor gmail.com> writes:
On Friday, 13 October 2017 at 07:59:36 UTC, Biotronic wrote:
 D version that works in CTFE:
Thanks Biotronic! This is just what I had in mind.
Oct 13