
digitalmars.D - Some Thoughts On String Interpolation [l10n, restricting access, AA]

reply kdevel <kdevel vogtner.de> writes:
**Localization**

A few days ago this example was posted in the "Learn" group [1]:

```d
writeln(i"You drink $coffees cups a day and it gives you 
$(coffees + iq) IQ");
```

A German version would read

```d
writeln(i"Sie trinken $coffees Tassen Kaffee am Tag. Dadurch 
erhöht sich Ihr IQ auf $(coffees + iq).");
```

How could the language version be selected (at runtime)? BTW: Is 
there a "D way of localization"?

**Restricting Access**

What about

```d
writeln(i"You drink $coffees cups a day and it gives you 
$(password) IQ");
```

How is it prevented that the person doing the localization puts 
arbitrary variable or code references into the localized strings?

Is this attack vector already known? And if so: Has it been named?

**Accessing Fields Of A Struct**

Having data in a struct Variable `s`

```d
struct S {
    string value;
}

S s;
```

would this work out-of-the-box?:

```d
with (s) writeln(i"The name in s is $(value)");
```

**Accessing Elements Of An AA**

```d
string[string] aa;
aa["name"] = "value";
writeln (i"The name in s is $(aa[\"name\"])"
```

That typing is laborious. Isn't there a way to bind the 
expression to the keys of the AA?

**Nesting**

Should it nest? How deep? What is the syntax?

```d
writeln(i"does this work $(a + \"$(b)\" + c)");
```

*to be continued*

[1] 
https://forum.dlang.org/post/rkevblyfgvmvmsuhtqmr forum.dlang.org
Oct 26 2023
next sibling parent Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:
 **Localization**

 A few days ago this example was posted in the "Learn" group [1]:

 [...]
Answers to all the questions anyone has, or will have, are more or less already available if we copy what some other languages do.
Oct 26 2023
prev sibling next sibling parent reply Adam D Ruppe <destructionator gmail.com> writes:
On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:
 **Localization**

 A few days ago this example was posted in the "Learn" group [1]:

 ```d
 writeln(i"You drink $coffees cups a day and it gives you 
 $(coffees + iq) IQ");
 ```

 A German version would read

 ```d
 writeln(i"Sie trinken $coffees Tassen Kaffee am Tag. Dadurch 
 erhöht sich Ihr IQ auf $(coffees + iq).");
 ```

 How could the language version be selected (at runtime)? BTW: 
 Is there a "D way of localization"?
With gnu gettext, you'd first pass the string through a tr() function, which lets it swap out at runtime. (You'd have to remember to do this though, since writeln will accept a generic string without this step.... unless writeln itself started wrapping through a standard translator function... but that's another story.)

I wrote a sample for the interpolated version here, but the translations might not be obvious. Let me add your example as a concrete thing.

https://github.com/adamdruppe/interpolation-examples/blob/master/04-internationalization.d

It is now added there; running that program (at the time of this writing) gives:

I, Adam, have a singular apple.
I, Adam, have a singular apple.
I, Adam, have 5 apples.
I, Adam, have 5 apples.
GG Adam
GG Adam
ggs 5, Adam
ggs 5, Adam
You drink 5 cups a day and it gives you -25 IQ
Sie trinken 5 Tassen Kaffee am Tag. Dadurch erhöht sich Ihr IQ auf -25.

Note that the translator could also change word order; it uses positional params here. (I'm not entirely happy with this specific syntax but it is just a demo to show that you can do all these things.)
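To make the runtime swap concrete, here is a minimal sketch of a catalog lookup followed by positional substitution. It is not the code from the linked repository: `fill`, `tr` and the in-memory `german` table are hypothetical stand-ins for what a gettext wrapper and a loaded .mo file would provide, and the arguments are passed as pre-formatted strings to keep the example short.

```d
import std.ascii : isDigit;
import std.conv : to;
import std.stdio : writeln;

/// Replace $1, $2, ... in `tmpl` with the matching entry of `args`.
/// Hypothetical helper, not part of any gettext library.
string fill(string tmpl, string[] args)
{
    string result;
    size_t i = 0;
    while (i < tmpl.length)
    {
        if (tmpl[i] == '$' && i + 1 < tmpl.length && isDigit(tmpl[i + 1]))
        {
            // Read the whole number so that "$10" is not taken for "$1".
            size_t j = i + 1;
            while (j < tmpl.length && isDigit(tmpl[j]))
                j++;
            const n = tmpl[i + 1 .. j].to!size_t;
            // A position nobody passed stays as literal text.
            result ~= (n >= 1 && n <= args.length) ? args[n - 1] : tmpl[i .. j];
            i = j;
        }
        else
        {
            result ~= tmpl[i];
            i++;
        }
    }
    return result;
}

/// Look the msgid up in the catalog; fall back to the source string.
string tr(string[string] catalog, string msgid, string[] args)
{
    auto translated = msgid in catalog;
    return fill(translated ? *translated : msgid, args);
}

void main()
{
    // Assumed catalog content, mirroring the strings quoted in this thread.
    string[string] german = [
        "You drink $1 cups a day and it gives you $2 IQ":
        "Sie trinken $1 Tassen Kaffee am Tag. Dadurch erhöht sich Ihr IQ auf $2."
    ];
    string[string] none; // empty catalog: source language

    int coffees = 5, iq = -30;
    auto args = [coffees.to!string, (coffees + iq).to!string];
    enum id = "You drink $1 cups a day and it gives you $2 IQ";

    writeln(tr(none, id, args));   // You drink 5 cups a day and it gives you -25 IQ
    writeln(tr(german, id, args)); // Sie trinken 5 Tassen Kaffee am Tag. Dadurch erhöht sich Ihr IQ auf -25.
}
```

Because the placeholders are positions rather than names, a translation is free to write `$2` before `$1`.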
 **Restricting Access**

 What about

 ```d
 writeln(i"You drink $coffees cups a day and it gives you 
 $(password) IQ");
 ```

 How is it prevented that the person doing the localization puts 
 arbitrary variable or code references into the localized 
 strings?
The localization thing is done at runtime and only has access to the variables passed to it. If a programmer wrote "it gives you $password" then yes, password would be available to the translator, same as any other argument, but just... don't do that? Notice how in the example linked above, the translator uses $1 and $2 rather than the variable name, since that string is handled by the library code, not the D language.
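A library could go one step further and reject catalog entries that mention a position the source string never used. The following is only a sketch of such a check under the `$1`, `$2` convention described above; `placeholders` and `translationIsSafe` are invented names, and nothing here is claimed to exist in the D gettext library. A name such as `$password` in a translation is not a position at all, so a purely positional substitution step has nothing to bind it to.

```d
import std.algorithm.searching : all, canFind;
import std.ascii : isDigit;
import std.conv : to;

/// Collect the positional indices ($1, $2, ...) that occur in `s`.
size_t[] placeholders(string s)
{
    size_t[] found;
    size_t i = 0;
    while (i < s.length)
    {
        if (s[i] == '$' && i + 1 < s.length && isDigit(s[i + 1]))
        {
            size_t j = i + 1;
            while (j < s.length && isDigit(s[j]))
                j++;
            found ~= s[i + 1 .. j].to!size_t;
            i = j;
        }
        else
            i++;
    }
    return found;
}

/// True if the translation only refers to positions the msgid also uses.
bool translationIsSafe(string msgid, string msgstr)
{
    auto allowed = placeholders(msgid);
    return placeholders(msgstr).all!(n => allowed.canFind(n));
}

void main()
{
    // Reordering the known positions is fine.
    assert(translationIsSafe("drink $1, get $2", "bekomme $2, trinke $1"));
    // Referring to a third argument that was never passed is rejected.
    assert(!translationIsSafe("drink $1, get $2", "trinke $1, bekomme $3"));
}
```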
 **Accessing Fields Of A Struct**
 would this work out-of-the-box?:
Yes, of course, exactly the same as if you passed `"name", value` to the function. (That's literally what the compiler's rewrite does.)
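A small sketch of the struct case, assuming a compiler that implements the interpolation syntax from the linked PR (to my knowledge this shipped in DMD 2.108). `std.conv.text` is used to flatten the sequence into a plain string so the result can be checked.

```d
import std.conv : text;
import std.stdio : writeln;

struct S
{
    string value;
}

void main()
{
    auto s = S("Arthur");

    // Inside $(...) the expression is ordinary D code, so `with (s)`
    // resolves `value` to `s.value` just as it would anywhere else.
    with (s)
        writeln(i"The name in s is $(value)");

    // std.conv.text joins the pieces of the sequence into one string.
    string line;
    with (s)
        line = text(i"The name in s is $(value)");
    assert(line == "The name in s is Arthur");
}
```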
 **Accessing Elements Of An AA**
 writeln (i"The name in s is $(aa[\"name\"])"
This is wrong though, it should be:

writeln (i"The name in s is $(aa["name"])"

Once you're inside the $(..) region, it is read as D code, not as part of a string. (This is pretty standard for language support of interpolation.) So you don't want extra \ in there.

I'll add these two to basics.
 That typing is laborious. Isn't there a way to bind the 
 expression to the keys of the AA?
I don't know what this means.
 **Nesting**
 Should it nest? How deep? What is the syntax?
It does, and as deep as you want. Remember, what's inside the $() is D code, not string, so you'd do: i"thing $(i"thing $(i"thing"))" etc. The processing function has the info it needs to support this but may have to do extra work with it.

BTW you don't have to ask me, you can ask the compiler; this is all fully implemented already. https://github.com/dlang/dmd/pull/15715

But let me add a few of these to the examples repo.... and done: https://github.com/adamdruppe/interpolation-examples/blob/master/01-basics.d shows all these.

If you compile the dmd from the PR you can build and run all these examples yourself.
Oct 26 2023
parent reply kdevel <kdevel vogtner.de> writes:
On Thursday, 26 October 2023 at 15:34:52 UTC, Adam D Ruppe wrote:
 [...]
 How could the language version be selected (at runtime)? BTW: 
 Is there a "D way of localization"?
With gnu gettext, you'd first pass the string through a tr() function, which lets it swap out at runtime.
Okay. Usually there is a source string in English which is subject to translation:

```d
int n = 3;
writefln (_("n is %d"), n);
```

The source string in this example is `n is %d`. Now for every target language one creates separate .po files [1]. These files essentially contain pairs of source/target strings:

```
msgid "n is %d"
msgstr "n ist %d"
```

These .po files are compiled into .mo files, from which the strings are read at runtime. If interpolation now comes into play

```d
int n = 3;
writefln (_(i"n is $(n)"));
```

how is the translation workflow organized? What is put in the .po files?

[1] https://www.gnu.org/software/gettext/manual/html_node/PO-Files.html
Oct 27 2023
parent reply Adam D Ruppe <destructionator gmail.com> writes:
On Friday, 27 October 2023 at 11:03:55 UTC, kdevel wrote:
 How is the translation workflow organized now? What is put in 
 the po-Files?
I literally have an example of this: https://github.com/adamdruppe/interpolation-examples/

There's a few different ways we could do it; this here is working with the existing D gettext lib, which married itself to std.format (much to my chagrin), but it still wasn't hard to make it work.

Remember, the library can tell the difference between the string literal segments and the interpolated segments, and it can work with the string literal segments at compile time. So it can CTFE msgids out of it in whatever format it wants and list all the possible things for the .pot file through compile time reflection (aggregated at runtime).

These are all solved problems - my thing is a small wrapper around the existing gettext D lib, which is a small wrapper around GNU gettext. All you have to do is arrange the data available to you.

In my example, I used $1, $2, etc. as placeholders for the parameters in the msgid/msgstrs, except for the plural param, which had to be %d because the D gettext lib made that assumption and I'm trying to be compatible with it. This allows easy reordering etc. by the translator, without conflicting with std.format's %s stuff.

Look at that repo to see for yourself.
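The compile-time part can be sketched as follows. This is an illustration, not the code from the repository: `msgidOf` and `msgid` are made-up names, and the example assumes a compiler with the interpolation feature and the marker types from `core.interpolation` (`InterpolatedLiteral`, `InterpolatedExpression`) as they exist in released DMD versions. The literal segments arrive as template arguments of those marker types, so a msgid with `$1`, `$2` placeholders can be assembled without evaluating any of the interpolated expressions.

```d
import core.interpolation;
import std.conv : to;
import std.stdio : writeln;

/// Build a msgid from the *types* of an interpolated sequence.
string msgidOf(Args...)()
{
    string id;
    size_t position;
    foreach (T; Args) // unrolled at compile time, one step per type
    {
        static if (is(T == InterpolatedLiteral!lit, string lit))
            id ~= lit; // literal text between the interpolations
        else static if (is(T == InterpolatedExpression!code, string code))
            id ~= "$" ~ (++position).to!string; // one placeholder per $(...)
        // header, footer and the runtime values contribute nothing
    }
    return id;
}

/// Accept an i"..." sequence and return its msgid.
string msgid(Args...)(Args)
{
    enum id = msgidOf!Args(); // `enum` forces evaluation via CTFE
    return id;
}

void main()
{
    int coffees = 5, iq = -30;
    writeln(msgid(i"You drink $coffees cups a day and it gives you $(coffees + iq) IQ"));
    // You drink $1 cups a day and it gives you $2 IQ
}
```

Since the msgid is an `enum`, it is available to any compile-time machinery that wants to collect the strings for a .pot file.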
Oct 27 2023
parent reply kdevel <kdevel vogtner.de> writes:
On Friday, 27 October 2023 at 11:20:04 UTC, Adam D Ruppe wrote:
 On Friday, 27 October 2023 at 11:03:55 UTC, kdevel wrote:
 How is the translation workflow organized now? What is put in 
 the po-Files?
I literally have an example of this: https://github.com/adamdruppe/interpolation-examples/
Sorry for not having looked that up in the first place. `german.po` reads as follows:

```
msgid "You drink $1 cups a day and it gives you $2 IQ"
msgstr "Sie trinken $1 Tassen Kaffee am Tag. Dadurch erhöht sich Ihr IQ auf $2."
```

while the source code `04-internationalization.d` reads

```d
writeln(tr(i"You drink $coffees cups a day and it gives you $(coffees + iq) IQ"));
```

Does this allow the use of the `msgmerge` program [1] for changes in the source code?

[1] https://www.gnu.org/software/gettext/manual/gettext.html#msgmerge-Invocation
Oct 27 2023
parent Adam D Ruppe <destructionator gmail.com> writes:
On Friday, 27 October 2023 at 12:48:07 UTC, kdevel wrote:
 Does this allow the use of the `msgmerge` program [1] for 
 changes in the source code?
Yes, in fact, I used that when adding that example to the existing file. Now, if the string itself changed, that would probably be a different msgid, but even that depends. (I mean the literal text, NOT what is interpolated in between; this impl ignores that, though it could be processed if we choose. I was thinking about using it as a comment to translators.)
Oct 27 2023
prev sibling next sibling parent reply Nick Treleaven <nick geany.org> writes:
On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:
 **Accessing Elements Of An AA**

 ```d
 string[string] aa;
 aa["name"] = "value";
 writeln (i"The name in s is $(aa[\"name\"])"
 ```

 That typing is laborious. Isn't there a way to bind the 
 expression to the keys of the AA?
I think that is orthogonal to the DIP (though having to escape the `"` does make it worse). You can have a local reference in D:

```d
string[string] aa;
ref elem() => aa["name"];
aa["name"] = "value"; // insert the first key
writeln("The name in s is ", elem);
```

Note it's not possible to use `elem` to insert the key because `aa["name"]` without `=` is a lookup (not an insertion), even though it is being returned by ref. So `elem = "blah";` only works when the key already exists in `aa`.
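The lookup-versus-insertion distinction can be checked with the built-in AA operations. A small sketch: the `in` operator and `require` are standard D, while the choice of `require` for inserting a missing key is just one option and not something proposed in this thread.

```d
import std.stdio : writeln;

void main()
{
    string[string] aa;

    // `in` yields a pointer to the value, or null when the key is absent.
    assert(("name" in aa) is null);

    // `require` returns the element by reference and inserts the given
    // value first if the key does not exist yet.
    aa.require("name", "value");
    assert(aa["name"] == "value");

    // With the key present, the ref-returning helper can read and assign.
    ref elem() => aa["name"];
    elem = "blah";
    assert(aa["name"] == "blah");

    writeln("The name in s is ", elem); // The name in s is blah
}
```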
Oct 27 2023
parent kdevel <kdevel vogtner.de> writes:
On Friday, 27 October 2023 at 12:22:00 UTC, Nick Treleaven wrote:
 On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:
 **Accessing Elements Of An AA**

 ```d
 string[string] aa;
 aa["name"] = "value";
 writeln (i"The name in s is $(aa[\"name\"])"
 ```

 That typing is laborious. Isn't there a way to bind the 
 expression to the keys of the AA?
I think that is orthogonal to the DIP
Maybe. I am under the impression that there are no real use cases for string interpolation. Currently I have "HTML", "SQL" and "composition of filesystem paths" on my notepad, none of which I would like to do using string interpolation.
 (though having to escape the `"` does make it worse).
If I understood Adam correctly, my code must read

```d
writeln (i"The name in s is $(aa["name"])");
//                            ^^^^^^^^^^
```

The closing parenthesis and the final semicolon were missing, too. According to Adam in [1] the marked part (^^^) is plain D code, so the quotation marks must not be escaped.

[1] https://forum.dlang.org/post/wbcvuejmwircauzgxmdh forum.dlang.org
Oct 27 2023
prev sibling next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:
 **Accessing Elements Of An AA**

 ```d
 string[string] aa;
 aa["name"] = "value";
 writeln (i"The name in s is $(aa[\"name\"])"
 ```

 That typing is laborious. Isn't there a way to bind the 
 expression to the keys of the AA?
You can do this with existing language features: https://run.dlang.io/is/QRFNpg Although I wouldn't really recommend it, since it forces you to write fully-qualified names to access anything that *isn't* an associative-array key.
Oct 28 2023
next sibling parent monkyyy <crazymonkyyy gmail.com> writes:
On Saturday, 28 October 2023 at 13:13:16 UTC, Paul Backus wrote:
 On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:
 **Accessing Elements Of An AA**

 ```d
 string[string] aa;
 aa["name"] = "value";
 writeln (i"The name in s is $(aa[\"name\"])"
 ```

 That typing is laborious. Isn't there a way to bind the 
 expression to the keys of the AA?
You can do this with existing language features: https://run.dlang.io/is/QRFNpg Although I wouldn't really recommend it, since it forces you to write fully-qualified names to access anything that *isn't* an associative-array key.
A different solution: https://run.dlang.io/is/jyNmOi
Oct 28 2023
prev sibling parent reply kdevel <kdevel vogtner.de> writes:
On Saturday, 28 October 2023 at 13:13:16 UTC, Paul Backus wrote:
 [...]
 That typing is laborious. Isn't there a way to bind the 
 expression to the keys of the AA?
You can do this with existing language features: https://run.dlang.io/is/QRFNpg
Amazing!
 Although I wouldn't really recommend it, since it forces you to 
 write fully-qualified names to access anything that *isn't* an 
 associative-array key.
The simple name (`x`) seems to work: https://run.dlang.io/is/Nzne84

```d
import std.stdio;

void main()
{
    string[string] aa;
    string x = "xxx";
    aa["name"] = "Arthur";
    aa["quest"] = "seek the Holy Grail";
    aa["favoriteColor"] = "blue";

    with (aa.keysAsVars)
        .writefln!"My name is %s, I %s, and my favorite color is %s."(
            name, quest, x);
}

struct KeysAsVars(K, V)
{
    V[K] aa;
    V opDispatch(string key)() => aa[key];
}

KeysAsVars!(K, V) keysAsVars(K, V)(V[K] aa)
{
    return typeof(return)(aa);
}
```
Nov 01 2023
parent Paul Backus <snarwin gmail.com> writes:
On Wednesday, 1 November 2023 at 22:22:42 UTC, kdevel wrote:
 On Saturday, 28 October 2023 at 13:13:16 UTC, Paul Backus wrote:
 Although I wouldn't really recommend it, since it forces you 
 to write fully-qualified names to access anything that *isn't* 
 an associative-array key.
The simple name (`x`) seems to work: https://run.dlang.io/is/Nzne84
Interesting. It seems like symbols in the current module take precedence over `opDispatch`, but symbols in other modules don't. I wonder if that's intentional?
Nov 01 2023
prev sibling parent JN <666total wp.pl> writes:
On Thursday, 26 October 2023 at 11:18:46 UTC, kdevel wrote:
 **Accessing Elements Of An AA**

 ```d
 string[string] aa;
 aa["name"] = "value";
 writeln (i"The name in s is $(aa[\"name\"])"
 ```
I always thought string interpolation is literally some basic syntax sugar, like:

string s = "$foo ${bar+1}";

would be lowered to:

string s = foo.to!string ~ " " ~ (bar+1).to!string;

I guess it would get problematic when people want to use strings other than the built-in one though.
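For comparison, the concatenation described above can be written by hand in today's D, next to a call that receives the pieces as separate arguments and lets the callee decide how to join them. The variable values are invented for the example; `to` and `text` are the standard `std.conv` functions.

```d
import std.conv : text, to;

void main()
{
    string foo = "coffee";
    int bar = 41;

    // The string-only lowering, written out by hand.
    string viaConcat = foo.to!string ~ " " ~ (bar + 1).to!string;

    // std.conv.text takes the pieces as separate arguments and does the
    // conversion and joining itself; another callee could do something else.
    string viaText = text(foo, " ", bar + 1);

    assert(viaConcat == "coffee 42");
    assert(viaText == viaConcat);
}
```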
Nov 01 2023