digitalmars.D - Multiline string literal improvements

Igor <stojkovic.igor gmail.com> writes:

D has a very nice feature of token strings:
string a = q{
                looksLikeCode();
             };

It is useful for writing mixins and for getting syntax highlighting 
in editors. Although I like it, it is not something I ever missed 
in other languages. What I always miss are these two options:

1. Have something like this:

string a = |q{
                  firstLine();
                  if (cond) {
                      secondLine()
                  }
               };

mean: count the whitespace characters at the start of the first 
new line of the string literal, then strip up to that many 
whitespace characters from the start of each line.

2. If we put, for example, "-" instead of "|" in the above example, 
have that mean: replace every run of whitespace in the following 
string literal with a single space.

I think it is clear why these would be useful, but if you want I 
can add a few examples. This would not make any breaking changes 
to the language, and it should be possible to implement it wholly 
in the lexer.

So what do you think?

Oct 10 2017

captaindet <2krnk gmx.net> writes:

 string a = |q{
                   firstLine();
                   if (cond) {
                       secondLine()
                   }
                };

you could write your own string processing function according to your 
needs to filter the code string, and use it like
string a = inject(q{...})  			//or
string a = inject!(formatOpts)(q{...})

i have done this myself and also included positional argument formatting 
to my liking, optimized for CT code generation. don't have the code at 
my hands ATM though. could post it later if you are interested.

/det

Oct 10 2017

sarn <sarn theartofmachinery.com> writes:

On Tuesday, 10 October 2017 at 21:38:41 UTC, captaindet wrote:
 string a = |q{
                   firstLine();
                   if (cond) {
                       secondLine()
                   }
                };

 you could write your own string processing function according 
 to your needs

FWIW, that's the solution in Python:
https://docs.python.org/release/3.6.3/library/textwrap.html#textwrap.dedent

Works even better in D because it can run at compile time.

Oct 10 2017

Walter Bright <newshound2 digitalmars.com> writes:

On 10/10/2017 3:16 PM, sarn wrote:
 Works even better in D because it can run at compile time.

Yes, I see no need for a language feature for what can be easily and far more 
flexibly done with a regular function - especially since the syntax gives no 
clue as to what |q{ and -q{ do.

Oct 11 2017

Igor <stojkovic.igor gmail.com> writes:

On Wednesday, 11 October 2017 at 08:35:51 UTC, Walter Bright 
wrote:
 On 10/10/2017 3:16 PM, sarn wrote:
 Works even better in D because it can run at compile time.

 Yes, I see no need for a language feature what can be easily 
 and far more flexibly done with a regular function - especially 
 since what |q{ and -q{ do gives no clue from the syntax.

You are right. My mind is just still not used to the power of D 
templates so I didn't think of this. On the other hand that is 
why D is still making me say "WOW!" on a regular basis :).

Just to confirm I understand, for example the following would 
give me compile time stripping of white space:

template stripws(string l) {
     enum stripws = l.replaceAll(regex(`\s+`, "g"), " ");
}

string variable = stripws!(q{
    whatever and    ever;
});

And I would get variable to be equal to " whatever and ever; ". 
Right?

Oct 11 2017

Meta <jared771 gmail.com> writes:

On Wednesday, 11 October 2017 at 09:56:52 UTC, Igor wrote:
 On Wednesday, 11 October 2017 at 08:35:51 UTC, Walter Bright 
 wrote:
 On 10/10/2017 3:16 PM, sarn wrote:
 Works even better in D because it can run at compile time.

 Yes, I see no need for a language feature what can be easily 
 and far more flexibly done with a regular function - 
 especially since what |q{ and -q{ do gives no clue from the 
 syntax.

 You are right. My mind is just still not used to the power of D 
 templates so I didn't think of this. On the other hand that is 
 why D is still making me say "WOW!" on a regular basis :).

 Just to confirm I understand, for example the following would 
 give me compile time stripping of white space:

 template stripws(string l) {
     enum stripws = l.replaceAll(regex("\s+", "g"), " ");
 }

 string variable = stripws(q{
    whatever and    ever;
 });

 And I would get variable to be equal to " whatever and ever; ". 
 Right?

Even better, you could write the same code that you would for 
doing this at runtime and it'll Just Work:

string variable = q{
     whatever and  ever;
}.replaceAll(regex(`\s+`, "g"), " ");

Oct 11 2017

Igor <stojkovic.igor gmail.com> writes:

On Wednesday, 11 October 2017 at 14:28:32 UTC, Meta wrote:
 On Wednesday, 11 October 2017 at 09:56:52 UTC, Igor wrote:
 On Wednesday, 11 October 2017 at 08:35:51 UTC, Walter Bright 
 wrote:
 On 10/10/2017 3:16 PM, sarn wrote:
 Works even better in D because it can run at compile time.

 Yes, I see no need for a language feature what can be easily 
 and far more flexibly done with a regular function - 
 especially since what |q{ and -q{ do gives no clue from the 
 syntax.

 You are right. My mind is just still not used to the power of 
 D templates so I didn't think of this. On the other hand that 
 is why D is still making me say "WOW!" on a regular basis :).

 Just to confirm I understand, for example the following would 
 give me compile time stripping of white space:

 template stripws(string l) {
     enum stripws = l.replaceAll(regex("\s+", "g"), " ");
 }

 string variable = stripws(q{
    whatever and    ever;
 });

 And I would get variable to be equal to " whatever and ever; 
 ". Right?

 Even better, you could write the same code that you would for 
 doing this at runtime and it'll Just Work:

 string variable = q{
     whatever and  ever;
 }.replaceAll(regex(`\s+`, "g"), " ");

I tried this but Disassembly view shows:

call std.regex.regex!string.regex
and
call std.regex.replaceAll!(string, char, 
std.regex.internal.ir.Regex!char).replaceAll

which means that replaceAll with regex is done at runtime, not 
compile time. Also when I just added enum in front of string 
variable then I got this:

Error: malloc cannot be interpreted at compile time, because it 
has no available source code

Oct 12 2017

Meta <jared771 gmail.com> writes:

On Thursday, 12 October 2017 at 08:08:17 UTC, Igor wrote:
 I tried this but Disassembly view shows:

 call std.regex.regex!string.regex
 and
 call std.regex.replaceAll!(string, char, 
 std.regex.internal.ir.Regex!char).replaceAll

 which means that replaceAll with regex is done at runtime, not 
 compile time. Also when I just added enum in front of string 
 variable then I got this:

 Error: malloc cannot be interpreted at compile time, because it 
 has no available source code

Hmm, you're right. I could've sworn that std.regex is 
CTFE-friendly but it looks like I was wrong. If it used the GC 
instead of malloc this would probably work.

Oct 12 2017

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On Thursday, 12 October 2017 at 16:59:46 UTC, Meta wrote:
 On Thursday, 12 October 2017 at 08:08:17 UTC, Igor wrote:
 I tried this but Disassembly view shows:


[snip]
 Hmm, you're right. I could've sworn that std.regex is 
 CTFE-friendly but it looks like I was wrong. If it used the GC 
 instead of malloc this would probably work.

Indeed, it's been ongoing work to make regex match at CTFE. I 
considered peppering the code paths with __ctfe ? new[] : 
malloc ...
But it has a lot of corner cases, and the run-time optimized code 
path is already quite impenetrable.
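
Since std.regex could not be evaluated at compile time here, one way to get the whitespace collapsing asked about earlier without it is to split on whitespace and join with single spaces. This is only a minimal sketch: the helper name collapseWhitespace is hypothetical (not from the thread), and it assumes that dropping leading and trailing whitespace is acceptable, so the result differs slightly from the " whatever and ever; " expected above.

```d
import std.array : join, split;

// Hypothetical helper (name not from the thread): collapses each run of
// whitespace into one space and drops leading/trailing whitespace.
string collapseWhitespace(string s)
{
    // split() without a separator splits on whitespace and skips empty pieces.
    return s.split.join(" ");
}

// Initializing an enum forces the call to run at compile time (CTFE).
enum variable = collapseWhitespace(q{
    whatever and    ever;
});

static assert(variable == "whatever and ever;");

void main() {}
```

Because it is an ordinary function, the same call also works at run time on strings that are not known until then.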

Oct 12 2017

Jacob Carlborg <doob me.com> writes:

On 2017-10-11 10:35, Walter Bright wrote:
 On 10/10/2017 3:16 PM, sarn wrote:
 Works even better in D because it can run at compile time.

 
 Yes, I see no need for a language feature what can be easily and far 
 more flexibly done with a regular function - especially since what |q{ 
 and -q{ do gives no clue from the syntax.

Unfortunately it doesn't work for the other multiline syntax:

void main()
{
     auto a = q"FOO
         int b = 3;
     FOO";
}

The above fails to compile [1]. The trailing FOO cannot be indented. 
This works:

void main()
{
     auto a = q"FOO
         int b = 3;
FOO";
}

Which in my opinion doesn't look as good as the first example. It gets 
worse if "a" is indented even more, because it's nested in a class, in a 
method, in an if statement and so on.

[1] main.d(3,14): Error: unterminated delimited string constant starting 
at main.d(3,15)

-- 
/Jacob Carlborg

Oct 12 2017

Biotronic <simen.kjaras gmail.com> writes:

On Tuesday, 10 October 2017 at 22:16:00 UTC, sarn wrote:
 On Tuesday, 10 October 2017 at 21:38:41 UTC, captaindet wrote:
 string a = |q{
                   firstLine();
                   if (cond) {
                       secondLine()
                   }
                };

 you could write your own string processing function according 
 to your needs

 FWIW, that's the solution in Python:
 https://docs.python.org/release/3.6.3/library/textwrap.html#textwrap.dedent

 Works even better in D because it can run at compile time.

D version that works in CTFE:

import std.ascii : newline;

string dedent(string s, string newline = newline) {
     import std.string : strip, stripLeft, splitLines;
     import std.array : join;
     import std.algorithm : startsWith;

     auto lines = s.strip().splitLines();
     if (lines.length == 0) return "";
     if (lines.length == 1) return lines[0];

     // Leading whitespace of the second line, kept as a slice of the
     // line so its length is in code units and is safe for slicing.
     auto whitespace = lines[1][0 .. $ - lines[1].stripLeft().length];

     foreach (ref line; lines) {
         if (line.startsWith(whitespace)) {
             line = line[whitespace.length..$];
         }
         // Throw if line doesn't start with correct amount of whitespace?
     }

     return lines.join(newline);
} unittest {
     assert(dedent("a") == "a");
     assert(dedent("a\n a") == "a"~newline~"a");
     string a = q{
                  firstLine();
                  if (cond) {
                      secondLine();
                  }
               };
     string b = "firstLine();"~newline~"if (cond) {"~newline
         ~"    secondLine();"~newline~"}";
     assert(a.dedent == b);
}

template dedent(string s, string newline = newline) {
     enum dedent = .dedent(s, newline);
} unittest {
     assert(dedent!("a\n a", "\n") == "a\na");
}

Treats all whitespace the same ('\t' == ' ' == MONGOLIAN VOWEL 
SEPARATOR), which might not be optimal, but I'm not gonna touch 
that can of worms.
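
For comparison, Phobos already ships std.string.outdent, which removes the common leading indentation from a multi-line string and can also be evaluated at compile time. Unlike the function above, it keeps the leading and trailing newlines of the token string, and it throws on inconsistent indentation. A small sketch (the enum name is only illustrative, and LF line endings in the source file are assumed):

```d
import std.string : outdent, strip;

// Initializing an enum forces evaluation at compile time (CTFE).
enum code = q{
    firstLine();
    if (cond) {
        secondLine();
    }
}.outdent;

// The surrounding newlines survive outdent; strip removes them if unwanted.
// Assumes the source file uses LF line endings.
static assert(code.strip == "firstLine();\nif (cond) {\n    secondLine();\n}");

void main() {}
```

Whether to use outdent or a hand-written dedent mostly comes down to how the first line and mismatched indentation should be treated.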

--
   Biotronic

Oct 13 2017

Igor <stojkovic.igor gmail.com> writes:

On Friday, 13 October 2017 at 07:59:36 UTC, Biotronic wrote:
 D version that works in CTFE:

Thanks Biotronic! This is just what I had in mind.

Oct 13 2017
