digitalmars.D - missing HexString documentation

reply Ralph Doncaster <nerdralph github.com> writes:
It is mentioned in the literals section, but not documented:
https://dlang.org/spec/lex.html#string_literals

 From reading forum posts I managed to figure out that HexStrings 
are prefixed with an x.  i.e. x"deadbeef"
Feb 07
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/7/18 9:59 AM, Ralph Doncaster wrote:
 It is mentioned in the literals section, but not documented:
 https://dlang.org/spec/lex.html#string_literals
 
  From reading forum posts I managed to figure out that HexStrings are 
 prefixed with an x.  i.e. x"deadbeef"
 
Good catch! Even the grammar says nothing about what it is, except it has HexString as a possible literal. Can you file an issue? https://issues.dlang.org -Steve
Feb 07
parent reply Seb <seb wilzba.ch> writes:
On Wednesday, 7 February 2018 at 15:25:05 UTC, Steven 
Schveighoffer wrote:
 On 2/7/18 9:59 AM, Ralph Doncaster wrote:
 It is mentioned in the literals section, but not documented:
 https://dlang.org/spec/lex.html#string_literals
 
  From reading forum posts I managed to figure out that 
 HexStrings are prefixed with an x.  i.e. x"deadbeef"
 
Good catch! Even the grammar says nothing about what it is, except it has HexString as a possible literal. Can you file an issue? https://issues.dlang.org -Steve
They are deprecated: https://dlang.org/changelog/pending.html#hexstrings https://dlang.org/deprecate.html#Hexstring%20literals Hence, the grammar has been incompletely updated. As it's not an error to use them now, it should have stated that they are deprecated. Anyhow, you can always go back in time: https://docarchives.dlang.io/v2.078.0/spec/lex.html#HexString
Feb 07
next sibling parent reply Ralph Doncaster <nerdralph github.com> writes:
On Wednesday, 7 February 2018 at 15:41:37 UTC, Seb wrote:
 On Wednesday, 7 February 2018 at 15:25:05 UTC, Steven 
 Schveighoffer wrote:
 On 2/7/18 9:59 AM, Ralph Doncaster wrote:
 It is mentioned in the literals section, but not documented:
 https://dlang.org/spec/lex.html#string_literals
 
  From reading forum posts I managed to figure out that 
 HexStrings are prefixed with an x.  i.e. x"deadbeef"
 
Good catch! Even the grammar says nothing about what it is, except it has HexString as a possible literal. Can you file an issue? https://issues.dlang.org -Steve
They are deprecated: https://dlang.org/changelog/pending.html#hexstrings https://dlang.org/deprecate.html#Hexstring%20literals Hence, the grammar has been incompletely updated. As it's not an error to use them now, it should have stated that they are deprecated. Anyhow, you can always go back in time: https://docarchives.dlang.io/v2.078.0/spec/lex.html#HexString
Doesn't that go against the idea of -betterC, or will std.conv work with -betterC?

P.S. Contrary to what the deprecation notice says, hex strings are very often used in crypto/hashing test cases. Most hash specs have example hash strings to verify implementation code.
Feb 07
parent reply Ralph Doncaster <nerdralph github.com> writes:
On Wednesday, 7 February 2018 at 15:54:05 UTC, Ralph Doncaster 
wrote:
 Doesn't that go against the idea of -betterC, or will std.conv 
 work with -betterC.

 p.s. contrary to what the deprecation notice says, hex strings 
 are very often used in crypto/hashing test cases.  Most hash 
 specs have example hash strings to verify implementation code.
As expected, auto data = cast(ubyte[]) x"deadbeef"; works with -betterC, but auto data = cast(ubyte[]) hexString!"deadbeef"; does not.
Feb 07
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 7 February 2018 at 16:03:17 UTC, Ralph Doncaster 
wrote:
 As expected,
 auto data = cast(ubyte[]) x"deadbeef";
 works with -betterC, but
 auto data = cast(ubyte[]) hexString!"deadbeef";
 does not.
That's just because -betterC is buggy and extremely incomplete (this is why I'm so annoyed that it is getting advertised, it is nowhere near ready for use). there's no reason why it shouldn't work once those bugs get fixed.
Feb 07
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 8:05 AM, Adam D. Ruppe wrote:
 That's just because -betterC is buggy and extremely incomplete
Can you please provide a list of these issues, and file issues that aren't on bugzilla yet, and tag them with the betterC keyword? I see only one: https://issues.dlang.org/buglist.cgi?quicksearch=%5Bbetterc%5D&list_id=219382
Feb 07
parent reply Mike Franklin <slavo5150 yahoo.com> writes:
On Wednesday, 7 February 2018 at 23:30:57 UTC, Walter Bright 
wrote:

 Can you please provide a list of these issues, and file issues 
 that aren't on bugzilla yet, and tag them with the betterC 
 keyword?

 I see only one:

 https://issues.dlang.org/buglist.cgi?quicksearch=%5Bbetterc%5D&list_id=219382
Don't search for "[betterC]". Instead, use "betterC" (without the brackets). https://issues.dlang.org/buglist.cgi?quicksearch=betterc&list_id=219390 We can't reliably rely on informal conventions. Mike
Feb 07
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 4:08 PM, Mike Franklin wrote:
 Don't search for "[betterC]".  Instead, use "betterC" (without the brackets).
 
 https://issues.dlang.org/buglist.cgi?quicksearch=betterc&list_id=219390
 
 We can't reliably rely on informal conventions.
I used the wrong URL. This is the right one (a keyword search, not a text search): https://issues.dlang.org/buglist.cgi?keywords=betterC&list_id=219394&resolution=--- which lists 13 issues. Two of them were missing, and I annotated them with the keyword betterC, so it's 15.
Feb 07
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 8:03 AM, Ralph Doncaster wrote:
 As expected,
 auto data = cast(ubyte[]) x"deadbeef";
 works with -betterC, but
 auto data = cast(ubyte[]) hexString!"deadbeef";
 does not.
 
When I tried it:

  import std.conv;

  void test() {
    auto data = cast(ubyte[]) hexString!"deadbeef";
  }

with:

  dmd -c -betterC test2.d

it compiled without complaint. Are you doing something different?

(This is why posting complete examples, not snippets, is better. That way I don't have to fill in the blanks with guesswork.)
Feb 07
next sibling parent reply Seb <seb wilzba.ch> writes:
On Thursday, 8 February 2018 at 00:24:22 UTC, Walter Bright wrote:
 On 2/7/2018 8:03 AM, Ralph Doncaster wrote:
 As expected,
 auto data = cast(ubyte[]) x"deadbeef";
 works with -betterC, but
 auto data = cast(ubyte[]) hexString!"deadbeef";
 does not.
 
When I tried it: import std.conv; void test() { auto data = cast(ubyte[]) hexString!"deadbeef"; } with: dmd -c -betterC test2.d it compiled without complaint. Are you doing something different? (This is why posting complete examples, not snippets, is better. That way I don't have to fill in the blanks with guesswork.)
https://run.dlang.io/is/TEJDZO and hit "Run". I also opened a Bugzilla issue, s.t. it doesn't get lost https://issues.dlang.org/show_bug.cgi?id=18395
Feb 07
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 5:03 PM, Seb wrote:
 On Thursday, 8 February 2018 at 00:24:22 UTC, Walter Bright wrote:
 On 2/7/2018 8:03 AM, Ralph Doncaster wrote:
 As expected,
 auto data = cast(ubyte[]) x"deadbeef";
 works with -betterC, but
 auto data = cast(ubyte[]) hexString!"deadbeef";
 does not.
When I tried it:   import std.conv;   void test() {     auto data = cast(ubyte[]) hexString!"deadbeef";   } with:   dmd -c -betterC test2.d it compiled without complaint. Are you doing something different? (This is why posting complete examples, not snippets, is better. That way I don't have to fill in the blanks with guesswork.)
https://run.dlang.io/is/TEJDZO and hit "Run".
I wish people would say "does not link" or "links with undefined symbols" or something more helpful than "does not work" leaving me to guess.
 I also opened a Bugzilla issue, s.t. it doesn't get lost 
 https://issues.dlang.org/show_bug.cgi?id=18395
Feb 07
prev sibling next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 8 February 2018 at 00:24:22 UTC, Walter Bright wrote:
   dmd -c -betterC test2.d
Don't use -c with -betterC when doing tests. The majority of troubles we have are likely to be linker errors (undefined references to missing runtime) and that silences them.
Feb 07
prev sibling parent Ralph Doncaster <nerdralph github.com> writes:
On Thursday, 8 February 2018 at 00:24:22 UTC, Walter Bright wrote:
 On 2/7/2018 8:03 AM, Ralph Doncaster wrote:
 As expected,
 auto data = cast(ubyte[]) x"deadbeef";
 works with -betterC, but
 auto data = cast(ubyte[]) hexString!"deadbeef";
 does not.
 
When I tried it: import std.conv; void test() { auto data = cast(ubyte[]) hexString!"deadbeef"; } with: dmd -c -betterC test2.d it compiled without complaint. Are you doing something different? (This is why posting complete examples, not snippets, is better. That way I don't have to fill in the blanks with guesswork.)
I didn't think it would be that hard to guess I'm trying to make an executable.

  ralphdoncaster gl1u:~/code/d$ dmd -betterC hex.d
  hex.o: In function `_D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa':
  hex.d:(.text._D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa[_D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa]+0x2e): undefined reference to `_D11TypeInfo_Aa6__initZ'
  hex.d:(.text._D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa[_D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa]+0x33): undefined reference to `_d_arraysetlengthiT'
  hex.d:(.text._D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa[_D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa]+0x7c): undefined reference to `_D3std5ascii10isHexDigitFNaNbNiNfwZb'
  hex.d:(.text._D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa[_D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa]+0x160): undefined reference to `_D11TypeInfo_Aa6__initZ'
  hex.d:(.text._D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa[_D3std4conv__T10hexStrImplTAyaZQrFNaNbNfMQoZAa]+0x165): undefined reference to `_d_arraysetlengthiT'
  collect2: error: ld returned 1 exit status
  Error: linker exited with status 1
  ralphdoncaster gl1u:~/code/d$ cat hex.d
  import std.conv;

  extern (C) int main() {
      //auto data = cast(ubyte[]) x"deadbeef";
      auto data = cast(ubyte[]) hexString!"deadbeef";
      return cast(int) data[0];
  }

While the string hex literal version works fine:

  ralphdoncaster gl1u:~/code/d$ dmd -betterC hex.d
  ralphdoncaster gl1u:~/code/d$ ./hex
  ralphdoncaster gl1u:~/code/d$ echo $?
  222
  ralphdoncaster gl1u:~/code/d$ cat hex.d
  //import std.conv;

  extern (C) int main() {
      auto data = cast(ubyte[]) x"deadbeef";
      //auto data = cast(ubyte[]) hexString!"deadbeef";
      return cast(int) data[0];
  }
Feb 07
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/7/18 10:41 AM, Seb wrote:
 On Wednesday, 7 February 2018 at 15:25:05 UTC, Steven Schveighoffer wrote:
 On 2/7/18 9:59 AM, Ralph Doncaster wrote:
 It is mentioned in the literals section, but not documented:
 https://dlang.org/spec/lex.html#string_literals

  From reading forum posts I managed to figure out that HexStrings are 
 prefixed with an x.  i.e. x"deadbeef"
Good catch! Even the grammar says nothing about what it is, except it has HexString as a possible literal. Can you file an issue? https://issues.dlang.org
They are deprecated: https://dlang.org/changelog/pending.html#hexstrings https://dlang.org/deprecate.html#Hexstring%20literals
Wow, that's... a little superfluous. So we support this: "\xde\xad\xbe\xef" but not this? x"deadbeef" Seems like the same code you would need to parse the first is reusable for the second, no? I don't see why this deprecation was necessary, and now we have more library/template baggage. -Steve
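
For readers comparing the two spellings, here is a minimal sketch, assuming a compiler and Phobos recent enough to ship std.conv.hexString. It only checks that the escape-sequence form and the library form yield the same four bytes; it does not use the x"..." literal at all.

```d
import std.conv : hexString;

void main()
{
    // Escape-sequence form: handled by the lexer itself.
    immutable(ubyte)[] viaEscapes = cast(immutable(ubyte)[]) "\xde\xad\xbe\xef";

    // Library form from std.conv, the suggested replacement for x"deadbeef".
    immutable(ubyte)[] viaTemplate = cast(immutable(ubyte)[]) hexString!"deadbeef";

    assert(viaEscapes == viaTemplate);
    assert(viaEscapes.length == 4);
    assert(viaEscapes[0] == 0xde && viaEscapes[3] == 0xef);
}
```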
Feb 07
next sibling parent reply Seb <seb wilzba.ch> writes:
On Wednesday, 7 February 2018 at 16:03:36 UTC, Steven 
Schveighoffer wrote:
 On 2/7/18 10:41 AM, Seb wrote:
 On Wednesday, 7 February 2018 at 15:25:05 UTC, Steven 
 Schveighoffer wrote:
 On 2/7/18 9:59 AM, Ralph Doncaster wrote:
 It is mentioned in the literals section, but not documented:
 https://dlang.org/spec/lex.html#string_literals

  From reading forum posts I managed to figure out that 
 HexStrings are prefixed with an x.  i.e. x"deadbeef"
Good catch! Even the grammar says nothing about what it is, except it has HexString as a possible literal. Can you file an issue? https://issues.dlang.org
They are deprecated: https://dlang.org/changelog/pending.html#hexstrings https://dlang.org/deprecate.html#Hexstring%20literals
Wow, that's... a little superfluous. So we support this: "\xde\xad\xbe\xef" but not this? x"deadbeef" Seems like the same code you would need to parse the first is reusable for the second, no? I don't see why this deprecation was necessary, and now we have more library/template baggage. -Steve
For the same reason why octal literals have been deprecated years ago: https://dlang.org/deprecate.html#Octal%20literals The library solution works as well and it's one of the features that are rarely used and add up to the steep learning curve.
Feb 07
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 7 February 2018 at 16:51:02 UTC, Seb wrote:
 For the same reason why octal literals have been deprecated 
 years ago:

 https://dlang.org/deprecate.html#Octal%20literals

 The library solution works as well and it's one of the features 
 that are rarely used and add up to the steep learning curve.
That's actually not the reason given. Octal literals had the stupid leading 0. We should have just made it 0o instead.

The library solution does not work just as well, since it doesn't work at all in some places. Behold:

http://dpldocs.info/experimental-docs/source/core.sys.posix.fcntl.d.html#L123

  version (X86)
  {
      enum O_CREAT  = 0x40;   // octal 0100
      enum O_EXCL   = 0x80;   // octal 0200
      enum O_NOCTTY = 0x100;  // octal 0400
      enum O_TRUNC  = 0x200;  // octal 01000

That's from druntime. The comments being there indicate the hex is not obvious in this context; the octal would be more illustrative. But the lack of use of std.conv shows it wasn't applicable where the literal was (since this is druntime, phobos isn't available).

The octal library solution is brilliant. The genius who wrote that code is clearly god-like and we should all fall to our knees and worship his superior intellect. That pattern DOES have uses.

But for octal? It was a mistake. We should have just made it 0o.

Similarly, I think the mistake of hex strings is that they are typed char[] instead of ubyte[]. Otherwise... they work ok.

And when learning, you don't need to know every bit. You'd just ignore it unless you hit upon the niche where it matters. (That's the way I learned basically all of D. My early D code is virtually identical to my C code, a bit later, similar to old style Java code. Only after being in it for a while did I go nuts mastering the language.)
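
As a rough illustration of the library alternative being discussed, here is a sketch using std.conv.octal. The enum names are made up for this example and are not the druntime declarations; the point is only that the octal spelling and the hex spelling denote the same value.

```d
import std.conv : octal;

// Hypothetical names for illustration; druntime itself cannot import std.conv.
enum CREAT_AS_HEX   = 0x40;       // what the binding writes today
enum CREAT_AS_OCTAL = octal!100;  // the same value, spelled in octal

static assert(CREAT_AS_HEX == CREAT_AS_OCTAL);

// The template also accepts a string argument.
static assert(octal!"644" == 420);

void main() {}
```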
Feb 07
next sibling parent reply Seb <seb wilzba.ch> writes:
On Wednesday, 7 February 2018 at 17:01:54 UTC, Adam D. Ruppe 
wrote:
 The octal library solution is brilliant. The genius who wrote 
 that code is clearly god-like and we should all fall to our 
 knees and worship his superior intellect. That pattern DOES 
 have uses.
Octal predates GitHub, hexString is new: https://github.com/dlang/phobos/pull/3133
Feb 07
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 7 February 2018 at 17:36:56 UTC, Seb wrote:
 Octal predates GitHub, hexString is new:
Yes, I know, I was there :) Heck, in the hexString forum thread, I argued that people knowing this pattern is really useful because then they can do all kinds of custom literals like stripping hexdumps.

Back in the octal days, I was thinking we should replace several literals with the new pattern and do more user-defined stuff. Notice who is cited in this old article: http://www.drdobbs.com/tools/user-defined-literals-in-the-d-programmi/229401068

But, in the years since, I've changed my mind... somewhat. The pattern is still good and it being customizable is awesome, but it is a minor hassle and even that minor hassle has hurt the use in practice, like in the druntime examples. We can do user defined literals for base X if we need more. But since octal is used by operating system APIs and that's under Phobos... the Phobos solution isn't great.

Hex strings I think are going to be basically the same in time. The library artifact will sit there, unused.
Feb 07
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 12:13 PM, Adam D. Ruppe wrote:
 even that minor hassle has hurt the use in practice, like in the druntime examples.
hexString is in Phobos, and druntime can't use Phobos.
Feb 07
parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Feb 07, 2018 at 04:11:19PM -0800, Walter Bright via Digitalmars-d wrote:
 On 2/7/2018 12:13 PM, Adam D. Ruppe wrote:
 even that minor hassle has hurt the use in practice, like in the
 druntime examples.
hexString is in Phobos, and druntime can't use Phobos.
Should templates like octal and hexString be in druntime instead? T -- There are 10 kinds of people in the world: those who can count in binary, and those who can't.
Feb 07
next sibling parent Mike Franklin <slavo5150 yahoo.com> writes:
On Thursday, 8 February 2018 at 00:25:21 UTC, H. S. Teoh wrote:

 hexString is in Phobos, and druntime can't use Phobos.
Should templates like octal and hexString be in druntime instead?
IMO, no. I think the interdependencies between the compiler, druntime, phobos, and even the packages contained within need some remodeling. I posted some of my initial thoughts here: https://forum.dlang.org/post/wvmgimzlwuwywxhhyhpi forum.dlang.org Mike
Feb 07
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 4:25 PM, H. S. Teoh wrote:
 Should templates like octal and hexString be in druntime instead?
No, because their usage by druntime is nearly nonexistent.
Feb 07
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 8 February 2018 at 01:55:19 UTC, Walter Bright wrote:
 No, because their usage by druntime is nearly nonexistent.
Only because they're not supported! Code like `0xsomething // octal something else` is found a whopping 200 times in druntime (granted btw all in the core.sys bindings). By contrast, the word "octal" only occurs 100 times through all of Phobos, and the octal template is used only 15 times, excluding its own unit tests.
Feb 07
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Feb 08, 2018 at 02:39:50AM +0000, Adam D. Ruppe via Digitalmars-d wrote:
 On Thursday, 8 February 2018 at 01:55:19 UTC, Walter Bright wrote:
 No, because their usage by druntime is nearly nonexistent.
Only because they're not supported! Code like `0xsomething // octal something else` is found a whopping 200 times in druntime (granted btw all in the core.sys bindings).
I'm guessing most of those occurrences are in interfacing with Posix (or other OS) calls involving bitmasks, like umask().
 By contrast, the word "octal" only occurs 100 times through all of
 Phobos, and the octal template is used only 15 times, excluding its
 own unit tests.
Ironically, octal literals are probably most used in OS API calls like umask(), so .octal really should be in druntime rather than Phobos! T -- What are you when you run out of Monet? Baroque.
Feb 07
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 6:39 PM, Adam D. Ruppe wrote:
 On Thursday, 8 February 2018 at 01:55:19 UTC, Walter Bright wrote:
 No, because their usage by druntime is nearly nonexistent.
Only because they're not supported! Code like `0xsomething // octal something else` is found a whopping 200 times in druntime (granted btw all in the core.sys bindings).
Nearly all of that is in 3 files, and most are copy/pasta of the same groups of lines for different systems. I didn't find any uses of x"string" at all, or my grep-fu is wanting.
Feb 07
prev sibling next sibling parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/7/18 12:01 PM, Adam D. Ruppe wrote:
 On Wednesday, 7 February 2018 at 16:51:02 UTC, Seb wrote:
 For the same reason why octal literals have been deprecated years ago:

 https://dlang.org/deprecate.html#Octal%20literals
Not even close. Octal literals are a disaster, because putting a leading 0 should never change the base of a number. Basically, causing bugs everywhere for a small corner case in the real world. The octal literal library solution is good, and it's fine to have something in the library for this, as octal values are extremely rare to need. But in this case, there is no ambiguity. x"..." is not obvious syntax for anything else. Not only that, but the code to parse hex data into a string is still in there for "\x..." So we didn't even remove anything significant.
 The library solution works as well and it's one of the features that 
 are rarely used and add up to the steep learning curve.
How so? If you see a hex string literal, you look it up, and now your learning curve is over.
 That's actually not the reason given. Octal literals had the stupid 
 leading 0. We should have just made it 0o instead.
This has its own problems (e.g. 0O), but definitely would have solved the issue. However, octal numbers are way less common than strings of hexadecimal data bytes. The difference for me isn't how the problem is solved, but that there was a problem for octals (error prone sinister errors) but there isn't/wasn't one for hex strings. Not only that, but the removal from the language doesn't really buy us any savings in the compiler. It's basically removing things for the sake of removing them.
 Similarly, I think the mistake of hex strings is that they are typed 
 char[] instead of ubyte[]. Otherwise... they work ok.
Yes, they would be better as ubyte[], but this problem is not the end of the world. I don't consider it the same level as thinking 012 is 12. -Steve
Feb 07
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 7 February 2018 at 18:59:38 UTC, Steven 
Schveighoffer wrote:
 Not even close. Octal literals are a disaster, because putting 
 a leading 0 should never change the base of a number.
I agree the leading 0 is terrible. But that's not the real question here: it is 0o100 vs import std.conv. Note it isn't the syntax - octal!100 is quite nice to me - but rather the requirement to import. That is why it isn't used in druntime... and low level code interfacing with external OS or hardware APIs are the most common place for octal, and also where we can't use it. I fear hex will fall into the same pit.
 This has its own problems (e.g. 0O)
That's why I specifically wrote `0o`. I wouldn't allow `0O`, just like D doesn't allow `1l`: "Error: lower case integer suffix 'l' is not allowed. Please use 'L' instead"
 The difference for me isn't how the problem is solved, but that 
 there was a problem for octals (error prone sinister errors) 
 but there isn't/wasn't one for hex strings.
You and I are on the same side :) I also think they should stay (I just want to see them retyped as immutable(ubyte)[] instead of immutable(char)[], we always cast anyway). I'd repurpose the library hexString to actually read in hex dump, stripping offsets, etc, off. Demonstrate that you can strip other stuff from the string with CTFE as an example of what we can do so people can customize that (that's a big advantage of the function over the literal btw, you can feed stuff through ctfe modifier functions before it is parsed. Can't do that with a literal!) But also keep the x"" literal for the simple cases we already have.
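
A minimal sketch of that "feed it through a CTFE function first" idea, assuming std.conv.hexString is available. The helper stripColons is hypothetical and only removes ':' separators; a real hexdump cleaner would also have to drop offsets and ASCII columns.

```d
import std.conv : hexString;

// Hypothetical helper: remove the ':' separators seen in some hex dumps.
// It is evaluated at compile time when used to initialise an enum.
string stripColons(string s)
{
    string result;
    foreach (c; s)
        if (c != ':')
            result ~= c;
    return result;
}

enum cleaned = stripColons("de:ad:be:ef");
static assert(cleaned == "deadbeef");

void main()
{
    // The cleaned string is a compile-time constant, so it can be a template argument.
    auto data = cast(immutable(ubyte)[]) hexString!cleaned;
    immutable(ubyte)[] expected = [0xde, 0xad, 0xbe, 0xef];
    assert(data == expected);
}
```

A plain literal cannot be preprocessed this way, which is the advantage of the function form mentioned above.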
Feb 07
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/7/18 3:24 PM, Adam D. Ruppe wrote:
 On Wednesday, 7 February 2018 at 18:59:38 UTC, Steven Schveighoffer wrote:
 Not even close. Octal literals are a disaster, because putting a 
 leading 0 should never change the base of a number.
I agree the leading 0 is terrible. But that's not the real question here: it is 0o100 vs import std.conv. Note it isn't the syntax - octal!100 is quite nice to me - but rather the requirement to import. That is why it isn't used in druntime... and low level code interfacing with external OS or hardware APIs are the most common place for octal, and also where we can't use it. I fear hex will fall into the same pit.
So you think it should go into druntime? I don't see why it wasn't in there in the first place to be honest. But there is no "decision" on whether to import or not, it's not possible in druntime to import from phobos. So saying the lack of use of octal in druntime is somehow a detraction on the import is incorrect. If you could have imported std.conv in druntime, it would have been done.
 This has its own problems (e.g. 0O)
That's why I specifically wrote `0o`. I wouldn't allow `0O`, just like D doesn't allow `1l`: "Error: lower case integer suffix 'l' is not allowed. Please use 'L' instead"
I'm still not in love with the little-o syntax, but this definitely would be necessary.
 The difference for me isn't how the problem is solved, but that there 
 was a problem for octals (error prone sinister errors) but there 
 isn't/wasn't one for hex strings.
You and I are on the same side :) I also think they should stay (I just want to see them retyped as immutable(ubyte)[] instead of immutable(char)[], we always cast anyway).
To me, it is a shortcut for specifying hex for every character. The cast isn't that horrible, and probably can be abstracted away into a function if you want. -Steve
Feb 08
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 8 February 2018 at 13:06:44 UTC, Steven 
Schveighoffer wrote:
 So you think it should go into druntime? I don't see why it 
 wasn't in there in the first place to be honest.
Yeah, probably. I might even publicly import it when you import the posix header so it just works in the most common place. Of course, it is important then that the compile-time thing doesn't cause a link time error when you just import and don't compile it in.... but that should be the case anyway (and the other posts in this thread show Walter is working on that so yay)
 If you could have imported std.conv in druntime, it would have 
 been done.
That's my point. We keep clashing despite being on the same side! When I say the import is the problem, I don't mean the syntax or literal line of code. I mean the whole concept of depending on the Phobos module and all the stuff that brings. druntime can't have that dependency. Neither can a few other specialized low-level cases. And specialized low-level cases are where you find 95% of octal literals. (well ok 50% of octal literals, where the other 50% are bugs cuz someone wrote 010 to line up leading zeros... )
Feb 08
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/8/18 9:44 AM, Adam D. Ruppe wrote:
 On Thursday, 8 February 2018 at 13:06:44 UTC, Steven Schveighoffer wrote:
 So you think it should go into druntime? I don't see why it wasn't in 
 there in the first place to be honest.
Yeah, probably. I might even publically import it when you import the posix header so it just works in the most common place. Of course, it is important then that the compile-time thing doesn't cause a link time error when you just import and don't compile it in.... but that should be the case anyway (and the other posts in this thread show Walter is working on that so yay)
 If you could have imported std.conv in druntime, it would have been done.
That's my point. We keep clashing despite being on the same side!
Your statement before: "it is 0o100 vs import std.conv" and "That is why it isn't used in druntime" I thought it meant there was some sort of decision made to not use the import because it would be too costly. But really, there was no decision to be made. Sorry about the misunderstanding!
 When I say the import is the problem, I don't mean the syntax or literal 
 line of code. I mean the whole concept of depending on the Phobos module 
 and all the stuff that brings. druntime can't have that dependency. 
 Neither can a few other specialized low-level cases. And specialized 
 low-level cases are where you find 95% of octal literals. (well ok 50% 
 of octal literals, where the other 50% are bugs cuz someone wrote 010 to 
 line up leading zeros... )
I agree, you could implement the octal template in druntime without too much issue. The octal!"100" would have been easy-to-parse, the octal!100 version would be more difficult, but nothing impossible that requires the whole of phobos to do so. My concern in the hexString case is the sheer requirement of CTFE for something that is so easy to do in the compiler, already *done* in the compiler, and has another form specifically for hex strings (the "\xde\xad\xbe\xef" form) that isn't going away. It makes me laugh actually that Walter is now replacing the implementation with a mixin of that other form, incurring all the cost of CTFE so you can transform the string, while breaking existing code in the process: https://github.com/dlang/phobos/pull/6138 -Steve
Feb 08
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/8/2018 7:07 AM, Steven Schveighoffer wrote:
 My concern in the hexString case is the sheer requirement of CTFE for something 
 that is so easy to do in the compiler, already *done* in the compiler, and has 
 another form specifically for hex strings (the "\xde\xad\xbe\xef" form) that 
 isn't going away. It makes me laugh actually that Walter is now replacing the 
 implementation with a mixin of that other form, incurring all the cost of CTFE 
 so you can transform the string, while breaking existing code in the process: 
 https://github.com/dlang/phobos/pull/6138
The breakage was due to the original implementation of hexString not producing a string literal like "abc", but producing an array literal like ['a', 'b', 'c'], which was not what the documentation said it did. And naturally, some uses wound up relying on the array behavior. What the PR does is fix hexString so that hexString!"deadbeef" rewrites it to the string literal "\xde\xad\xbe\xef". It's classic "lowering". Isn't it amazing that D can even do this? Simplifying the compiler and pushing things off into the library makes the compiler and spec smaller and less buggy. It also has the nice feature of providing a simple path for anyone who wants to write their own custom string syntax, such as EBCDIC string literals (!).
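
To make the lowering idea concrete, here is a sketch of the general technique. This is not the Phobos source or the code in the PR; toEscapedLiteral and myHexString are hypothetical names, and the helper assumes an even number of hex digits with no whitespace or validation.

```d
// Hypothetical helper: build the source text of a D string literal,
// e.g. "deadbeef" becomes the 18 characters  "\xde\xad\xbe\xef"  (quotes included).
string toEscapedLiteral(string hex)
{
    string code = `"`;
    for (size_t i = 0; i + 1 < hex.length; i += 2)
        code ~= `\x` ~ hex[i .. i + 2];
    code ~= `"`;
    return code;
}

// Hypothetical template: mix the generated source back in, so the result
// is an ordinary string literal rather than an array literal.
template myHexString(string hex)
{
    enum myHexString = mixin(toEscapedLiteral(hex));
}

static assert(toEscapedLiteral("deadbeef") == `"\xde\xad\xbe\xef"`);
static assert(myHexString!"deadbeef" == "\xde\xad\xbe\xef");
static assert(is(typeof(myHexString!"deadbeef") == string));

void main() {}
```

The same shape (a CTFE function that emits literal source, plus a mixin) is what would carry over to other custom literal syntaxes.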
Feb 08
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/8/18 1:25 PM, Walter Bright wrote:
 On 2/8/2018 7:07 AM, Steven Schveighoffer wrote:
 My concern in the hexString case is the sheer requirement of CTFE for 
 something that is so easy to do in the compiler, already *done* in the 
 compiler, and has another form specifically for hex strings (the 
 "\xde\xad\xbe\xef" form) that isn't going away. It makes me laugh 
 actually that Walter is now replacing the implementation with a mixin 
 of that other form, incurring all the cost of CTFE so you can 
 transform the string, while breaking existing code in the process: 
 https://github.com/dlang/phobos/pull/6138
The breakage was due to the original implementation of hexString not producing a string literal like "abc", but producing an array literal like ['a', 'b', 'c'], which was not what the documentation said it did. And naturally, some uses wound up relying on the array behavior.
"abc" is an array (it's an immutable(char)[]). There's no reason why ['a','b','c'] should be different than "abc" (other than the hidden null character, which is irrelevant here). Perhaps the fact that using a string rather than an array causes code to fail should be addressed?
 
 What the PR does is fix hexString so that hexString!"deadbeef" rewrites 
 it to the string literal "\xde\xad\xbe\xef". It's classic "lowering". 
 Isn't it amazing that D can even do this?
It's great that D has this power, and would be really useful if D's language didn't already have a way to do this in a builtin way.
 Simplifying the compiler and pushing things off into the library makes 
 the compiler and spec smaller and less buggy. It also has the nice 
 feature of providing a simple path for anyone who wants to write their 
 own custom string syntax, such as EBCDIC string literals (!).
How can this be a huge simplification? I mean you already have code that parses hex characters in a string array, all you need is one flag that assumes all character pairs have been preceded by \x. I think this will save probably 4 or 5 lines of code? It also doesn't preclude at all someone writing library code to make their own custom string syntax. -Steve
Feb 08
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/8/2018 10:42 AM, Steven Schveighoffer wrote:
 On 2/8/18 1:25 PM, Walter Bright wrote:
 "abc" is an array (it's an immutable(char)[]). There's no reason why 
 ['a','b','c'] should be different than "abc" (other than the hidden null 
 character, which is irrelevant here).
['a','b','c'] is mutable, a string literal is immutable.
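A small sketch of the typing difference being discussed, using only core language features:

```d
void main()
{
    // A string literal is typed immutable(char)[], i.e. `string`.
    static assert(is(typeof("abc") == string));
    // An array literal of chars is typed as a mutable char[].
    static assert(is(typeof(['a', 'b', 'c']) == char[]));

    char[] m = ['a', 'b', 'c'];
    m[0] = 'x';                 // fine: the elements are mutable
    assert(m == "xbc");

    string s = "abc";
    // Writing through a string does not compile: its elements are immutable.
    static assert(!__traits(compiles, s[0] = 'x'));
}
```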
 Perhaps the fact that using a string rather than an array causes code to fail 
 should be addressed?
That would be a language change proposal or bug report. By all means, please do so.
 How can this be a huge simplification? I mean you already have code that parses
 hex characters in a string array, all you need is one flag that assumes all
 character pairs have been preceded by \x. I think this will save probably 4 or 5
 lines of code?
hexStringConstant() was 79 lines of code, not including comments and blank lines. I also showed how:

    x"deadbeef"

can be replaced with:

    hexString!"deadbeef"

with no overhead. If you hate typing hexString, you can always write:

    alias x = hexString;

and then you've got:

    x"deadbeef"
    x!"deadbeef"

which seems an inconsequential difference. (The generated code is the same.)
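For readers who want to try the alias, a minimal sketch follows. It assumes a Phobos version in which `std.conv.hexString` yields a `string`, as described in this thread.

```d
import std.conv : hexString;

// Short alias, so the library form reads almost like the old literal.
alias x = hexString;

static assert(x!"deadbeef" == "\xde\xad\xbe\xef");
static assert(x!"deadbeef".length == 4);

void main()
{
    string s = x!"deadbeef";
    assert(cast(ubyte) s[0] == 0xde);
}
```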
 It also doesn't preclude at all someone writing library code to make their own 
 custom string syntax.
You're right it doesn't. But people don't do it, because it is neither obvious that D can do such a thing (it relies on a combination of features) nor is it obvious how to do it correctly (as the earlier hexString implementation shows and nobody seemed able to fix it but me). What Phobos provides is working, professional quality code that should serve as a user resource for "how to do things and how to do them right". I.e. having hexString as a library function is a good advertisement for what D can do. After all, how many languages can do this sort of thing?
Feb 08
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/8/18 3:49 PM, Walter Bright wrote:
 On 2/8/2018 10:42 AM, Steven Schveighoffer wrote:
 On 2/8/18 1:25 PM, Walter Bright wrote:
 "abc" is an array (it's an immutable(char)[]). There's no reason why 
 ['a','b','c'] should be different than "abc" (other than the hidden 
 null character, which is irrelevant here).
['a','b','c'] is mutable, a string literal is immutable.
OK. alias IC = immutable char; ubyte[3] x = [IC('a'), IC('b'), IC('c')]; works just fine.
 
 
 Perhaps the fact that using a string rather than an array causes code 
 to fail should be addressed?
That would be a language change proposal or bug report. By all means, please do so.
https://issues.dlang.org/show_bug.cgi?id=18420
 How can this be a huge simplification? I mean you already have code 
 that parses hex characters in a string array, all you need is one flag 
 that assumes all character pairs have been preceded by \x. I think 
 this will save probably 4 or 5 lines of code?
hexStringConstant() was 79 lines of code, not including comments and blank lines.
My mistake, I assumed the code to parse hex digits would be reused between both string parsing with \x escapes and the hex string parser. I also notice that hex strings are not simply equivalent to strings with \x in them -- the latter is more constrained, as it must be a pair of hex digits per \x. hex strings allow spaces between them.
 I also showed how:
 
     x"deadbeef"
 
 can be replaced with:
 
     hexString!"deadbeef"
 
 with no overhead. 
I wouldn't call invoking CTFE "no overhead". I tested it out, and generating a hex string of about 600 bytes took 3x as long as using builtin hex strings.
 If you hate typing hexString, you can always write:
 
     alias x = hexString;
 
 and then you've got:
 
     x"deadbeef"
     x!"deadbeef"
 
 which seems an inconsequential difference. (The generated code is the 
 same.)
Again, this is about the compile time penalty.
 It also doesn't preclude at all someone writing library code to make 
 their own custom string syntax.
You're right it doesn't. But people don't do it, because it is neither obvious that D can do such a thing (it relies on a combination of features)
This isn't really about having hexString in phobos, I think it's fine to have it, even if it's redundant, since it can be more customized than a builtin language feature. All I was saying is that the language feature and the library function are not mutually exclusive.
 nor is it obvious how to do it correctly (as the earlier 
 hexString implementation shows and nobody seemed able to fix it but me).
Well, nobody asked :) Besides, it's still not "fixed", as it has the same poor performance as the previous version. And the new version has broken existing code. What the update shows is that you have to jump through incredible hoops to get the compiler not to include your compile-time only generation code in the resulting binary.
 What Phobos provides is working, professional quality code that should 
 serve as a user resource for "how to do things and how to do them right".
It worked before, at pretty much the same performance as it does now. The mitigating measures (using string literals instead of array literals, splitting the implementation into hand-written functions to avoid the D template penalty) are a good demonstration of how much work we still have to do on the CTFE front.
 I.e. having hexString as a library function is a good advertisement for 
 what D can do. After all, how many languages can do this sort of thing?
And nothing has changed here, it's still a library function, as it was before. I agree, it's great to have the ability to do library functions that can do compiler features. But if you already have the compiler feature, I don't see why we should remove it because a slower library version exists. If we did not have the feature in the language, and we were talking about adding it, I'd totally be on the other side. In fact, it's a motivating factor to make CTFE code compile faster as it takes away arguments of adding more things to the compiler. -Steve
Feb 11
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/11/2018 6:09 AM, Steven Schveighoffer wrote:
 On 2/8/18 3:49 PM, Walter Bright wrote:
 That would be a language change proposal or bug report. By all means, please 
 do so.
https://issues.dlang.org/show_bug.cgi?id=18420
Good!
 I also notice that hex strings are not simply equivalent to strings with \x in 
 them -- the latter is more constrained, as it must be a pair of hex digits per 
 \x. hex strings allow spaces between them.
The idea was to be able to cut&paste text from things like hex dumps, which include whitespace formatting.
 I wouldn't call invoking CTFE "no overhead"
It is no overhead in the generated code.
 I tested it out, and generating a hex string of about 600 bytes took 3x as long
 as using builtin hex strings.
That's only a potential issue if you've got a very, very large number of hex strings. And if you do, those strings can be put in a separate module and compiled separately.
 Again, this is about the compile time penalty.
Ok.
 Well, nobody asked :) Besides, it's still not "fixed", as it has the same poor
 performance as the previous version. And the new version has broken existing code.
It didn't break code that used x"deadbeef", it broke code that used the broken hexString.
 What the update shows is that you have to jump through incredible hoops to get 
 the compiler not to include your compile-time only generation code in the 
 resulting binary.
With a language that supports both templates and separate compilation, this will always be an issue. The solution here is not "incredible", it is just not obvious.
 And nothing has changed here, it's still a library function, as it was before.
What's changed is it works now with -betterC, and it doesn't produce bloat in the executable.
 But if you already have the compiler feature, I don't see why 
 we should remove it because a slower library version exists.
It was not an arbitrary and capricious decision, and the rationale behind it was presented here multiple times. If you are not convinced, that's cool, but the "why" should be pretty clear.
 In fact, it's a motivating factor to make
 CTFE code compile faster as it takes away arguments of adding more things to the
 compiler.
I agree that speeding up CTFE will make it more useful.
Feb 11
parent Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/11/18 4:48 PM, Walter Bright wrote:
 I also notice that hex strings are not simply equivalent to strings 
 with \x in them -- the latter is more constrained, as it must be a 
 pair of hex digits per \x. hex strings allow spaces between them.
The idea was to be able to cut&paste text from things like hex dumps, which include whitespace formatting.
I've never seen a hex dump where the individual nibbles were separated by spaces in odd ways. In other words, what I was saying is that: "\x12\x34" could be written as: x"1 23 4" which is... odd. What it does is make it so you can't reuse the parsing code inside the string escape processor to handle the hex string, necessitating an extra 80-line function.
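To make the whitespace rule concrete, here is a small sketch using the library template. It assumes `std.conv.hexString` pairs up digits the same way the builtin x"..." literal does, ignoring whitespace wherever it falls.

```d
import std.conv : hexString;

// Hex-dump style spacing between bytes gives the same result as no spacing.
static assert(hexString!"de ad be ef" == hexString!"deadbeef");

// Whitespace is simply skipped, so even this odd nibble grouping
// pairs up as 12 34.
static assert(hexString!"1 23 4" == "\x12\x34");

void main() {}
```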
 I wouldn't call invoking CTFE "no overhead"
It is no overhead in the generated code.
It's overhead that adds up, memory and time-wise. Really, the memory concerns of using CTFE are a bigger problem than the extra tenths of a second of compile time.
 
 I tested it out, and generating a hex string of about 600 bytes took 
 3x as long as using builtin hex strings.
That's only a potential issue if you've got a very, very large number of hex strings. And if you do, those strings can be put in a separate module and compiled separately.
Or a very large hex string (very feasible). But very true that there are mitigating methods that can be used.
 Well, nobody asked :) Besides, it's still not "fixed", as it has the 
 same poor performance as the previous version. And the new version has 
 broken existing code.
It didn't break code that used x"deadbeef", it broke code that used the broken hexString.
In the past, we have not broken code, even when it depends on known bugs, if we can help it. But maybe if we can fix the bug I filed above, it won't matter.
 What the update shows is that you have to jump through incredible 
 hoops to get the compiler not to include your compile-time only 
 generation code in the resulting binary.
With a language that supports both templates and separate compilation, this will always be an issue.
Essentially, you have instantiated the template eagerly, which kind of, sort of, defeats the purpose of a template. Though, you still do get the benefit of code generation, it's just not an "open-ended" template that you can instantiate with any type. Perhaps we should be using this pattern all throughout phobos where strings are involved, since there are ever only 3 types you instantiate with.
 The solution here is not "incredible", it is just not obvious.
The solution isn't incredible, but the fact that this solution is the only way to get the CTFE-only code not to creep into your object file is a bit dissatisfying. You would think the linker/compiler would not inject the unused function into the object file without having to do this.
 And nothing has changed here, it's still a library function, as it was 
 before.
What's changed is it works now with -betterC, and it doesn't produce bloat in the executable.
I think this is due to functions-that-aren't-used being included. In other words, there was nothing inherent in the old library code that created a requirement for druntime to be included. The bloat is also a deficiency of the compiler, not the code itself.
 But if you already have the compiler feature, I don't see why we 
 should remove it because a slower library version exists.
It was not an arbitrary and capricious decision, and the rationale behind it was presented here multiple times. If you are not convinced, that's cool, but the "why" should be pretty clear.
I missed the discussion (there are times where I can't pay attention to D for a month or so due to being busy). But in any case, sure I understand the "why", but the cost/benefit for me was not high enough, and in some aspects, it is all cost, no benefit. In any case, it isn't a decision that needs to be reversed, as there is a workable solution in the library, even if it's sub-optimal. I just think it's not as beneficial as has been reported. -Steve
Feb 13
prev sibling next sibling parent reply =?UTF-8?Q?Ali_=c3=87ehreli?= <acehreli yahoo.com> writes:
On 02/07/2018 09:01 AM, Adam D. Ruppe wrote:

 But for octal? It was a mistake. We should have just made it 0o.
It sounds so natural. I forgot; what was the argument against it? Ali
Feb 07
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Wednesday, 7 February 2018 at 19:38:35 UTC, Ali Çehreli wrote:
 It sounds so natural. I forgot; what was the argument against 
 it?
0o was denied basically just because we felt it wasn't necessary to have in the language at all; that it was rare enough and the library *can* do it, so the library *should* do it. And at the time, I totally agreed! And in some cases, I still do - I think D programmers ought to know the technique so they can use it for their own niches. Just in the years since, we see `0x40; // octal 0100` instead of `octal!100` since the cost of the library import is higher than the cost of converting by hand to hex or binary, which are still built into the language.
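For comparison, a short sketch of the library form next to the hand-converted hex values; the enum names are only borrowed from the druntime excerpt quoted in this thread.

```d
import std.conv : octal;

// The same constants written with the library template instead of
// converting to hex by hand.
enum O_CREAT  = octal!100;
enum O_EXCL   = octal!200;
enum O_NOCTTY = octal!400;
enum O_TRUNC  = octal!1000;

static assert(O_CREAT  == 0x40);
static assert(O_EXCL   == 0x80);
static assert(O_NOCTTY == 0x100);
static assert(O_TRUNC  == 0x200);

void main() {}
```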
Feb 07
prev sibling parent reply Kagamin <spam here.lot> writes:
On Wednesday, 7 February 2018 at 17:01:54 UTC, Adam D. Ruppe 
wrote:
 http://dpldocs.info/experimental-docs/source/core.sys.posix.fcntl.d.html#L123

    version (X86)
     {
         enum O_CREAT        = 0x40;     // octal     0100
         enum O_EXCL         = 0x80;     // octal     0200
         enum O_NOCTTY       = 0x100;    // octal     0400
         enum O_TRUNC        = 0x200;    // octal    01000
Dunno, hex reads better here. Octal is only good for unix permissions which are grouped by 3 bits, which is not the case for io constants - these are usual ungrouped flags that are always done with hex and are easier to understand in hex. If you're desperate, octal can be also written as (1<<6)|(2<<3)|(4).
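The shift form works because each octal digit occupies three bits. A tiny sketch, where `fromOctalDigits` is a hypothetical helper written for this illustration:

```d
import std.conv : octal;

// Hypothetical helper: combine three octal digits, three bits per digit.
int fromOctalDigits(int hi, int mid, int lo) pure nothrow @nogc @safe
{
    return (hi << 6) | (mid << 3) | lo;
}

// (1<<6)|(2<<3)|(4) is octal 124.
static assert(((1 << 6) | (2 << 3) | 4) == octal!124);
static assert(fromOctalDigits(1, 2, 4) == 0x54);

void main() {}
```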
Feb 08
parent reply Kagamin <spam here.lot> writes:
Or have a function specifically for unix permissions, like
int unix(int r, int w, int x, int gr, int gw, int gx, int ur, int 
uw, int ux);
It might be even more readable.
Feb 08
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Thursday, 8 February 2018 at 10:52:35 UTC, Kagamin wrote:
 Or have a function specifically for unix permissions, like
 int unix(int r, int w, int x, int gr, int gw, int gx, int ur, 
 int uw, int ux);
 It might be even more readable.
I actually personally prefer binary: 0b_1_111_101_101 which visually corresponds with ls's output: drwxr-xr-x. But octal is the way they are usually done in C. The comments in druntime are because that had to be translated from the common convention.
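A quick sketch of how the three notations line up for a typical permission set (rwxr-xr-x, without the directory flag); the enum name is made up for this example.

```d
import std.conv : octal;

// One bit per permission, grouped like ls prints them: rwx r-x r-x
enum rwxr_xr_x = 0b111_101_101;

static assert(rwxr_xr_x == octal!755);  // the conventional C spelling, 0755
static assert(rwxr_xr_x == 0x1ED);      // the same value in hex

void main() {}
```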
Feb 08
prev sibling parent reply Ralph Doncaster <nerdralph github.com> writes:
On Wednesday, 7 February 2018 at 16:51:02 UTC, Seb wrote:
 On Wednesday, 7 February 2018 at 16:03:36 UTC, Steven 
 Schveighoffer wrote:
 Seems like the same code you would need to parse the first is 
 reusable for the second, no? I don't see why this deprecation 
 was necessary, and now we have more library/template baggage.

 -Steve
For the same reason why octal literals have been deprecated years ago: https://dlang.org/deprecate.html#Octal%20literals The library solution works as well and it's one of the features that are rarely used and add up to the steep learning curve.
I, like Steve, disagree. Coming from C/C++ (and some Java), this was really simple to understand:

    x"deadbeef"

While this took a lot more time to understand:

    hexString!"deadbeef"

For hexString, I had to understand that ! is for function template instantiation, and I also had to find out what library to import.
Feb 07
parent reply Ralph Doncaster <nerdralph github.com> writes:
On Wednesday, 7 February 2018 at 19:25:37 UTC, Ralph Doncaster 
wrote:
 On Wednesday, 7 February 2018 at 16:51:02 UTC, Seb wrote:
 On Wednesday, 7 February 2018 at 16:03:36 UTC, Steven 
 Schveighoffer wrote:
 Seems like the same code you would need to parse the first is 
 reusable for the second, no? I don't see why this deprecation 
 was necessary, and now we have more library/template baggage.

 -Steve
For the same reason why octal literals have been deprecated years ago: https://dlang.org/deprecate.html#Octal%20literals The library solution works as well and it's one of the features that are rarely used and add up to the steep learning curve.
I, like Steve, disagree. Coming from c/c++ (and some Java), this was really simple to understand: x"deadbeef" While this took a lot more time to understand: hexString!"deadbeef" For hexString, I had to understand that ! is for function template instantiation, and I also had to find out what library to import.
I just did a quick check, and with DMD v2.078.1, the hexString template increases code size by ~300 bytes vs the hex literal. So yet one more reason to prefer the hex literals.
Feb 07
next sibling parent "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Feb 07, 2018 at 07:29:10PM +0000, Ralph Doncaster via Digitalmars-d
wrote:
[...]
 I just did a quick check, and with DMD v2.078.1, the hexString
 template increases code size by ~300 bytes vs the hex literal.  So yet
 one more reason to prefer the hex literals.
Arguably, this is a QoI issue. We seriously need to take a closer look at the current implementation of templates and consider how to improve it. There is definitely plenty of room for improvement. T -- Computers are like a jungle: they have monitor lizards, rams, mice, c-moss, binary trees... and bugs.
Feb 07
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 11:29 AM, Ralph Doncaster wrote:
 I just did a quick check, and with DMD v2.078.1, the hexString template
 increases code size by ~300 bytes vs the hex literal. So yet one more reason to
 prefer the hex literals.
Indeed it does, and that is the result of a poor implementation of hexString. I've figured out how to fix that, and hope to make a PR for it shortly. https://issues.dlang.org/show_bug.cgi?id=18397
Feb 07
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Wed, Feb 07, 2018 at 05:53:43PM -0800, Walter Bright via Digitalmars-d wrote:
 On 2/7/2018 11:29 AM, Ralph Doncaster wrote:
 I just did a quick check, and with DMD v2.078.1, the hexString
 template increases code size by ~300 bytes vs the hex literal. So
 yet one more reason to prefer the hex literals.
Indeed it does, and that is the result of a poor implementation of hexString. I've figured out how to fix that, and hope to make a PR for it shortly. https://issues.dlang.org/show_bug.cgi?id=18397
The bug report didn't explain what exactly in the implementation wasn't done right. :-/ Another data point: instantiating 10000 hex literals causes compilation time to bloat to 10 seconds. While I'm not saying we should expect user code to have so many hex literals, the point is that that's unacceptably slow for D, given our motto of fast-this and fast-that. T -- Recently, our IT department hired a bug-fix engineer. He used to work for Volkswagen.
Feb 07
parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 6:39 PM, H. S. Teoh wrote:
 and hope to make a PR for it shortly.

    https://issues.dlang.org/show_bug.cgi?id=18397
The bug report didn't explain what exactly in the implementation wasn't done right. :-/
The PR does. https://github.com/dlang/phobos/pull/6138
 Another data point: instantiating 10000 hex literals causes compilation
 time to bloat to 10 seconds.  While I'm not saying we should expect user
 code to have so many hex literals, the point is that that's unacceptably
 slow for D, given our motto of fast-this and fast-that.
Try it with the new PR.
Feb 07
prev sibling parent reply Ralph Doncaster <nerdralph github.com> writes:
On Thursday, 8 February 2018 at 01:53:43 UTC, Walter Bright wrote:
 On 2/7/2018 11:29 AM, Ralph Doncaster wrote:
 I just did a quick check, and with DMD v2.078.1, the hexString 
 template increases code size by ~300 bytes vs the hex literal. 
 So yet one more reason to prefer the hex literals.
Indeed it does, and that is the result of a poor implementation of hexString. I've figured out how to fix that, and hope to make a PR for it shortly. https://issues.dlang.org/show_bug.cgi?id=18397
While the fix is a huge improvement, it doesn't match the code generated by the hex literals. hexString!"deadbeef" stores the null-terminated string in the data section of the object file, while x"deadbeef" only stores 4 bytes in the data section.
Feb 07
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/7/2018 9:45 PM, Ralph Doncaster wrote:
 While the fix is a huge improvement, it doesn't match the code generated by
 the hex literals. hexString!"deadbeef" stores the null-terminated string in
 the data section of the object file, while x"deadbeef" only stores 4 bytes in
 the data section.

    string s = x"deadbeef";

stores a null terminated string, too. If you want only 4 bytes,

    __gshared immutable char[4] s = hexString!"deadbeef";

just as you'd do for any string literal.
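The two declaration styles can be put side by side in a small sketch. The variable names are made up for illustration, and whether a trailing null byte is emitted for the slice form is the compiler's business and is not checked here.

```d
import std.conv : hexString;

// A slice: pointer plus length, referring to the literal's data.
string asSlice = hexString!"deadbeef";

// A fixed-size array: the variable itself is exactly four bytes.
__gshared immutable char[4] asArray = hexString!"deadbeef";

static assert(asArray.sizeof == 4);
static assert(asArray.length == 4);

void main()
{
    assert(asSlice.length == 4);
    assert(asSlice == asArray[]);
}
```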
Feb 07
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/8/18 1:10 AM, Walter Bright wrote:
 On 2/7/2018 9:45 PM, Ralph Doncaster wrote:
  > While the fix is a huge improvement, it doesn't match the code 
 generated by the hex literals.  hexString!"deadbeef" stores the 
 null-terminated string in the data section of the object file, while 
 x"deadbeef" only stores 4 bytes in the data section.
 
    string s = x"deadbeef";
 
 stores a null terminated string, too.
 
 If you want only 4 bytes,
 
    __gshared immutable char[4] s = hexString!"deadbeef";
 
 just as you'd do for any string literal.
The extra data in the object file comes from the inclusion of the hexStringImpl function, and from the template parameter (the symbol _D3std4conv__T9hexStringVAyaa8_6465616462656566ZQBiyAa is in there as well, which will always be larger than the actual string passed to hexString). I also see the data in there twice for some reason. -Steve
Feb 08
next sibling parent reply "H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Thu, Feb 08, 2018 at 08:26:03AM -0500, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 The extra data in the object file comes from the inclusion of the
 hexStringImpl function, and from the template parameter (the symbol
 _D3std4conv__T9hexStringVAyaa8_6465616462656566ZQBiyAa is in there as
 well, which will always be larger than the actual string passed to
 hexString).
[...] This is one area that really should be improved. Is there some easy way in the compiler to mark a template function as "only used in CTFE", and not emit it into the object file if there are no other runtime references to it? I'm thinking of some kind of boolean attribute that defaults to false, and gets set if the function is referenced by runtime code. During codegen, any function that doesn't have this attribute set will be skipped over. My speculation is that this would lead to a good amount of reduction in template bloat, given how pervasively CTFE is used in Phobos (and idiomatic D in general). T -- He who does not appreciate the beauty of language is not worthy to bemoan its flaws.
Feb 08
parent Ralph Doncaster <nerdralph github.com> writes:
On Thursday, 8 February 2018 at 17:06:55 UTC, H. S. Teoh wrote:
 On Thu, Feb 08, 2018 at 08:26:03AM -0500, Steven Schveighoffer 
 via Digitalmars-d wrote: [...]
 The extra data in the object file comes from the inclusion of 
 the hexStringImpl function, and from the template parameter 
 (the symbol 
 _D3std4conv__T9hexStringVAyaa8_6465616462656566ZQBiyAa is in 
 there as well, which will always be larger than the actual 
 string passed to hexString).
[...] This is one area that really should be improved. Is there some easy way in the compiler to mark a template function as "only used in CTFE", and not emit it into the object file if there are no other runtime references to it? I'm thinking of some kind of boolean attribute that defaults to false, and gets set if the function is referenced by runtime code. During codegen, any function that doesn't have this attribute set will be skipped over. My speculation is that this would lead to a good amount of reduction in template bloat, given how pervasively CTFE is used in Phobos (and idiomatic D in general).
Or maybe you can get away with just using a good compiler/linker that supports LTO. It's quite mature in GCC now, so it's probably worth trying with GDC. http://hubicka.blogspot.ca/2014/04/linktime-optimization-in-gcc-1-brief.html
Feb 08
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 2/8/2018 5:26 AM, Steven Schveighoffer wrote:
 The extra data in the object file comes from the inclusion of the hexStringImpl
 function, and from the template parameter (the symbol
 _D3std4conv__T9hexStringVAyaa8_6465616462656566ZQBiyAa is in there as well,
 which will always be larger than the actual string passed to hexString).
 
 I also see the data in there twice for some reason.
This is no longer the case with the PR.

    import std.conv;

    void test() {
        __gshared immutable char[4] s = hexString!"deadbeef";
    }

produces the following, with no sign of the template and the data is there only once:

    _TEXT   segment dword use32 public 'CODE'   ;size is 0
    _TEXT   ends
    _DATA   segment para use32 public 'DATA'    ;size is 4
    _DATA   ends
    CONST   segment para use32 public 'CONST'   ;size is 14
    CONST   ends
    _BSS    segment para use32 public 'BSS'     ;size is 0
    _BSS    ends
    FLAT    group
        extrn   _D5test24testFZv
        public  _D5test24testFZ1syG4a
    FMB     segment dword use32 public 'DATA'   ;size is 0
    FMB     ends
    FM      segment dword use32 public 'DATA'   ;size is 4
    FM      ends
    FME     segment dword use32 public 'DATA'   ;size is 0
    FME     ends
        public  _D5test212__ModuleInfoZ
    _D5test24testFZv    COMDAT flags=x0 attr=x0 align=x0
    _TEXT   segment
        assume  CS:_TEXT
    _TEXT   ends
    _DATA   segment
    _D5test24testFZ1syG4a:
        db  0ffffffdeh,0ffffffadh,0ffffffbeh,0ffffffefh ;....
    _DATA   ends
    CONST   segment
    _D5test212__ModuleInfoZ:
        db  004h,010h,000h,000h,000h,000h,000h,000h ;........
        db  074h,065h,073h,074h,032h,000h   ;test2.
    CONST   ends
    _BSS    segment
    _BSS    ends
    FMB     segment
    FMB     ends
    FM      segment
        dd  offset FLAT:_D5test212__ModuleInfoZ
    FM      ends
    FME     segment
    FME     ends
    _D5test24testFZv    comdat
        assume  CS:_D5test24testFZv
        ret
    _D5test24testFZv    ends
        end
Feb 08
parent reply Ralph Doncaster <nerdralph github.com> writes:
On Thursday, 8 February 2018 at 18:31:06 UTC, Walter Bright wrote:
 On 2/8/2018 5:26 AM, Steven Schveighoffer wrote:
 The extra data in the object file comes from the inclusion of 
 the hexStringImpl function, and from the template parameter 
 (the symbol 
 _D3std4conv__T9hexStringVAyaa8_6465616462656566ZQBiyAa is in 
 there as well, which will always be larger than the actual 
 string passed to hexString).
 
 I also see the data in there twice for some reason.
This is no longer the case with the PR. import std.conv; void test() { __gshared immutable char[4] s = hexString!"deadbeef"; } produces the following, with no sign of the template and the data is there only once: _DATA segment _D5test24testFZ1syG4a: db 0ffffffdeh,0ffffffadh,0ffffffbeh,0ffffffefh ;.... _DATA ends
But it looks like they are all dchar, so 4x the space vs x"deadbeef"?
Feb 08
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
On 2/8/18 1:42 PM, Ralph Doncaster wrote:
 On Thursday, 8 February 2018 at 18:31:06 UTC, Walter Bright wrote:
 On 2/8/2018 5:26 AM, Steven Schveighoffer wrote:
 The extra data in the object file comes from the inclusion of the 
 hexStringImpl function, and from the template parameter (the symbol 
 _D3std4conv__T9hexStringVAyaa8_6465616462656566ZQBiyAa is in there as 
 well, which will always be larger than the actual string passed to 
 hexString).

 I also see the data in there twice for some reason.
This is no longer the case with the PR.   import std.conv;   void test() {     __gshared immutable char[4] s = hexString!"deadbeef";   } produces the following, with no sign of the template and the data is there only once: _DATA    segment _D5test24testFZ1syG4a:     db    0ffffffdeh,0ffffffadh,0ffffffbeh,0ffffffefh    ;.... _DATA    ends
But it looks like they are all dchar, so 4x the space vs x"deadbeef"?
I was looking at that too when I was testing the differences, but actually, it's the same when you use x"deadbeef". I wonder if it's an issue with how obj2asm prints it out? Surely, that data array must be contiguous, and they must be bytes. Otherwise the resulting code would be wrong. -Steve
Feb 08
next sibling parent Ralph Doncaster <nerdralph github.com> writes:
On Thursday, 8 February 2018 at 18:49:51 UTC, Steven 
Schveighoffer wrote:
 I wonder if it's an issue with how obj2asm prints it out? 
 Surely, that data array must be contiguous, and they must be 
 bytes. Otherwise the resulting code would be wrong.
OK. I didn't even know about obj2asm until you mentioned it. objdump seems to work perfectly fine on the .o's that dmd generates, and I can tell that x"deadbeef" generates 4 contiguous bytes (objdump -D):

    Disassembly of section .rodata.str1.1:

    0000000000000000 <_TMP0>:
       0:   de      .byte 0xde
       1:   ad      lods   %ds:(%rsi),%eax
       2:   be      .byte 0xbe
       3:   ef      out    %eax,(%dx)
        ...
Feb 08
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 2/8/2018 10:49 AM, Steven Schveighoffer wrote:
 On 2/8/18 1:42 PM, Ralph Doncaster wrote:
 On Thursday, 8 February 2018 at 18:31:06 UTC, Walter Bright wrote:
     db    0ffffffdeh,0ffffffadh,0ffffffbeh,0ffffffefh    ;....
But it looks like they are all dchar, so 4x the space vs x"deadbeef"?
The 'db' means 'define byte'. dw for words, dd for 32 bit words.
 I was looking at that too when I was testing the differences, but actually, it's
 the same when you use x"deadbeef".
Yes.
 I wonder if it's an issue with how obj2asm prints it out? Surely, that data 
 array must be contiguous, and they must be bytes. Otherwise the resulting code 
 would be wrong.
Yes. I just never bothered to fix it.
Feb 08
prev sibling parent Mike Franklin <slavo5150 yahoo.com> writes:
On Wednesday, 7 February 2018 at 16:03:36 UTC, Steven 
Schveighoffer wrote:
 
 They are deprecated:
 
 https://dlang.org/changelog/pending.html#hexstrings
 https://dlang.org/deprecate.html#Hexstring%20literals
Wow, that's... a little superfluous.
I agree with the notion that the language should be an aggregate of primitives, and anything that can be composed of those primitives should be implemented in a library (unless a compelling reason is found to justify otherwise).

The deprecation of hex string literals has exposed flaws in the library implementation and the compiler's template implementation. That doesn't mean deprecation was the wrong thing to do; it just brings the aforementioned flaws to the forefront, so let's not shoot the messenger.

Here are a few fundamental flaws I see in our library implementations.

* Some library implementations are not very cohesive, and have too many interdependencies. This is what seems to prevent `HexString` from being used in -betterC.
* Some Phobos implementations would be quite useful in Druntime and in code that doesn't want to employ the runtime (e.g. libraries consumed by other languages, resource-constrained systems, and bare-metal programming), but alas, Druntime can't have a circular dependency on Phobos (nor should it).

This is a difficult problem, and I don't have any solutions; just ideas. Maybe Phobos and Druntime should be divided into 3 libraries:

1. A library with no dependencies whatsoever, not even Druntime, the C runtime, or the C standard library. Some stuff in `std.traits`, `std.conv`, and even `HexString` could go here. Let's call this library DLib.
2. Druntime would only depend on DLib, but never publicly expose it.
3. Phobos could depend on DLib or Druntime, but again, never publicly expose it.
4. Phobos could be refactored by identifying packages that have too much coupling, and factoring out the dependencies into a 3rd, more cohesive library, imported by the previous two.

For extra credit:

2a. Move C/C++ standard library bindings to Deimos, and have the desktop OS ports import it privately. There's no reason to impose this interface on bare-metal ports, and it's a superficial dependency anyway.
3a. Phobos shouldn't have any dependency on C/C++ language bindings. Druntime should expose an idiomatic D (and preferably safe) interface for Phobos to use.

DLib could then be used in -betterC and other use cases where Druntime is more of a liability than an asset.

Mike
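For readers who have not seen the library replacement under discussion, here is a minimal sketch, assuming a regular (non -betterC) build with Phobos available. It uses `std.conv.hexString`, the template the deprecation notice points to; the helper name `deadbeefBytes` is hypothetical and only there for illustration.

```d
import std.conv : hexString;

// Library counterpart of the deprecated literal x"deadbeef".
// hexString!"..." is evaluated at compile time and yields a string,
// so a cast is needed when raw bytes are wanted.
// (deadbeefBytes is a hypothetical helper name, not part of Phobos.)
immutable(ubyte)[] deadbeefBytes()
{
    return cast(immutable(ubyte)[]) hexString!"deadbeef";
}

void main()
{
    auto bytes = deadbeefBytes();
    assert(bytes.length == 4);
    assert(bytes[0] == 0xde && bytes[3] == 0xef);
}
```

Whether this compiles under -betterC depends on the compiler and Phobos version, which is the coupling problem described above.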
Feb 07
prev sibling parent reply Seb <seb wilzba.ch> writes:
On Wednesday, 7 February 2018 at 15:41:37 UTC, Seb wrote:
 On Wednesday, 7 February 2018 at 15:25:05 UTC, Steven 
 Schveighoffer wrote:
 On 2/7/18 9:59 AM, Ralph Doncaster wrote:
 [...]
Good catch! Even the grammar says nothing about what it is, except it has HexString as a possible literal. Can you file an issue? https://issues.dlang.org -Steve
They are deprecated: https://dlang.org/changelog/pending.html#hexstrings https://dlang.org/deprecate.html#Hexstring%20literals Hence, the grammar has been incompletely updated. As it's not an error to use them now, it should have stated that they are deprecated. Anyhow, you can always go back in time: https://docarchives.dlang.io/v2.078.0/spec/lex.html#HexString
PR: https://github.com/dlang/dlang.org/pull/2190
Feb 07
parent reply Seb <seb wilzba.ch> writes:
On Thursday, 8 February 2018 at 00:55:28 UTC, Seb wrote:
 On Wednesday, 7 February 2018 at 15:41:37 UTC, Seb wrote:
 On Wednesday, 7 February 2018 at 15:25:05 UTC, Steven 
 Schveighoffer wrote:
 [...]
They are deprecated: https://dlang.org/changelog/pending.html#hexstrings https://dlang.org/deprecate.html#Hexstring%20literals Hence, the grammar has been incompletely updated. As it's not an error to use them now, it should have stated that they are deprecated. Anyhow, you can always go back in time: https://docarchives.dlang.io/v2.078.0/spec/lex.html#HexString
PR: https://github.com/dlang/dlang.org/pull/2190
... and back online: http://dlang.org/spec/lex.html#hex_strings
Feb 07
parent Ralph Doncaster <nerdralph github.com> writes:
On Thursday, 8 February 2018 at 01:27:46 UTC, Seb wrote:
 On Thursday, 8 February 2018 at 00:55:28 UTC, Seb wrote:
 On Wednesday, 7 February 2018 at 15:41:37 UTC, Seb wrote:
 On Wednesday, 7 February 2018 at 15:25:05 UTC, Steven 
 Schveighoffer wrote:
 [...]
They are deprecated: https://dlang.org/changelog/pending.html#hexstrings https://dlang.org/deprecate.html#Hexstring%20literals Hence, the grammar has been incompletely updated. As it's not an error to use them now, it should have stated that they are deprecated. Anyhow, you can always go back in time: https://docarchives.dlang.io/v2.078.0/spec/lex.html#HexString
PR: https://github.com/dlang/dlang.org/pull/2190
... and back online: http://dlang.org/spec/lex.html#hex_strings
I'm impressed. I think I'll keep using D for at least a little while longer. While it has its warts, I'm attracted to a language that has an intelligent group of people working to cauterize those warts.
Feb 07