digitalmars.D - No we should not support enum types derived from strings

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
We should remove all that rot from phobos pronto.

https://github.com/dlang/phobos/pull/8029
May 06
next sibling parent reply evilrat <evilrat666 gmail.com> writes:
On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029
Just a commoner here, can you explain for stupid what makes enum string a no go and why it should begone?
May 06
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 2:03 AM, evilrat wrote:
 On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029
Just a commoner here, can you explain for stupid what makes enum string a no go and why it should begone?
Heavy toll on the infra for a very niche use case with trivial workarounds on the user side.
May 07
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
 Heavy toll on the infra for a very niche use case with trivial 
 workarounds on the user side.
It seems like the toll comes from isSomeString returning false for these enums, no? What is the root cause of this not working? It doesn't seem like this should be a special case anywhere; it should just work.
May 07
parent reply Paul Backus <snarwin gmail.com> writes:
On Friday, 7 May 2021 at 12:06:43 UTC, deadalnix wrote:
 On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu 
 wrote:
 Heavy toll on the infra for a very niche use case with trivial 
 workarounds on the user side.
It seems like the toll comes from isSomeString returning false for these enums, no? What is the root cause of this not working? It doesn't seem like this should be a special case anywhere; it should just work.
"Is a string type" and "is implicitly convertible to a string type" are not the same thing.
May 07
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 10:16 AM, Paul Backus wrote:
 On Friday, 7 May 2021 at 12:06:43 UTC, deadalnix wrote:
 On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
 Heavy toll on the infra for a very niche use case with trivial 
 workarounds on the user side.
It seems like the toll comes from isSomeString returning false for these enums, no? What is the root cause of this not working? It doesn't seem like this should be a special case anywhere; it should just work.
"Is a string type" and "is implicitly convertible to a string type" are not the same thing.
Yah. It's really been a string (heh!) of suboptimal decisions.

1. We wanted strings to be synonymous with built-in slices of char. "Users should not need to define their own string type!" This has been D's billion-dollar mistake.
2. Representing strings as char[] meant GC is a must, and also there's long-distance coupling between callers and callees whenever strings are passed about: a callee may modify characters in the caller's string. Such changes could have been absolutely trivially disallowed with a user-defined string type, but see (1) and did I mention D's billion-dollar mistake?
3. So yours truly (shudder) came up with the idea of doing strings as immutable(char)[] so that people can pass strings around, no coupling, no problem. GC is still a must. That satisfies (1) but bought us into the entire qualifiers business, which, any way I look at it, did not produce enough dividends compared to the effort put into it and the massive complications added to the language. (Aside: inout is the weirdest thing. How could we ever think that that was a good idea.)
4. When doing generic string functions for phobos, it made sense to support... oh wait a second we have so many string types. char[], wchar[], dchar[], each in triplicate because of const and immutable. So right off the bat we decided to support 9 string types. That was another mistake because nobody cares about wchar and dchar. Anyway, that's how isSomeChar and isSomeString were born.
5. Then came the question of ranges that have one of those 9 character types as elements... those should be supported too, no? IIRC at least a subset of phobos supports that stuff.
6. Then of course someone figured, wait a second, what about enums derived from strings and user-defined types that have an alias this as string... those deserve attention too, right? And right here we had descended into madness.

Compare all that with:

0. We put a String type in the standard library. It uses UTF8 inside and supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages its own memory so no need for the GC. It disallows remote coupling across callers/callees. Case closed.
May 07
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2021-05-07 17:24, Andrei Alexandrescu wrote:

 Compare all that with:
 
 0. We put a String type in the standard library. It uses UTF8 inside and 
 supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages 
 its own memory so no need for the GC. It disallows remote coupling 
 across callers/callees. Case closed.
You can have enums with the base type being a struct or a class. How does putting a String type in the standard library help with the enum problem you're describing? -- /Jacob Carlborg
May 07
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 2:22 PM, Jacob Carlborg wrote:
 On 2021-05-07 17:24, Andrei Alexandrescu wrote:
 
 Compare all that with:

 0. We put a String type in the standard library. It uses UTF8 inside 
 and supports iteration by either bytes, UTF8, UTF16, or UTF32. It 
 manages its own memory so no need for the GC. It disallows remote 
 coupling across callers/callees. Case closed.
You can have enums with the base type being a struct or a class. How does putting a String type in the standard library help with the enum problem you're describing?
The solution to that is "We do not support enums". But if you use a non-templated class String, you won't feel much of a pain in the first place because the enums will be converted to String objects upon call. The String type solves all other problems mentioned.
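The conversion "upon call" mentioned here can be sketched with the built-in `string` standing in for the proposed `String` type (which does not exist). The names `Color`, `plainLength` and `exactLength` are hypothetical, and the `is(S == string)` constraint is only a simplified stand-in for Phobos constraints such as `isSomeString`, which inspect the exact argument type.

```d
import std.traits : OriginalType;

// Hypothetical enum whose base type is string.
enum Color : string { red = "red", green = "green" }

// "Is a string type" and "converts to a string type" differ:
static assert(!is(Color == string));              // not the same type
static assert( is(Color : string));               // but implicitly convertible
static assert( is(OriginalType!Color == string)); // base type is string

// Non-templated: the enum is converted to string at the call site.
size_t plainLength(string s) { return s.length; }

// Templated on the exact type: the constraint sees Color, not string.
size_t exactLength(S)(S s) if (is(S == string)) { return s.length; }

void main()
{
    assert(plainLength(Color.green) == 5);

    // The template rejects the enum because S is deduced as Color.
    static assert(!__traits(compiles, exactLength(Color.green)));

    // User-side workaround: convert to the base type before the call.
    assert(exactLength(cast(string) Color.green) == 5);
}
```

The last line is one form the "trivial workaround on the user side" could take; assigning the enum value to a `string` variable first would work as well.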
May 07
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2021-05-07 17:24, Andrei Alexandrescu wrote:

 0. We put a String type in the standard library.
If you're going to make strings a user defined type, how are you planning to support things like switch statements with strings? It's not currently possible to have switch statements with user defined types. -- /Jacob Carlborg
May 07
next sibling parent Meta <jared771 gmail.com> writes:
On Friday, 7 May 2021 at 18:25:57 UTC, Jacob Carlborg wrote:
 On 2021-05-07 17:24, Andrei Alexandrescu wrote:

 0. We put a String type in the standard library.
If you're going to make strings a user defined type, how are you planning to support things like switch statements with strings? It's not currently possible to have switch statements with user defined types.
It really, really should be. Pattern matching and destructuring are two of my most wanted features in D.
May 07
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 2:25 PM, Jacob Carlborg wrote:
 On 2021-05-07 17:24, Andrei Alexandrescu wrote:
 
 0. We put a String type in the standard library.
If you're going to make strings a user defined type, how are you planning to support things like switch statements with strings?
Built-in strings remain as they are.
May 07
prev sibling next sibling parent reply Jon Degenhardt <jond noreply.com> writes:
On Friday, 7 May 2021 at 15:24:42 UTC, Andrei Alexandrescu wrote:
 0. We put a String type in the standard library. It uses UTF8 
 inside and supports iteration by either bytes, UTF8, UTF16, or 
 UTF32. It manages its own memory so no need for the GC. It 
 disallows remote coupling across callers/callees. Case closed.
This is a bit orthogonal, but... An important characteristic of utf-8 arrays is that they are simultaneously a random access range of bytes and an input range of utf-8 characters. For efficiency it's often important to switch back and forth between these two interpretations. `byLine` is one type of example, where a byte oriented search is done (e.g. with `memchr`), but afterward the representation array is accessed as a utf-8 input range. `byLine` implementations will usually work by iterating forward, but there are random access use cases as well. For example, it is perfectly reasonable to divide a utf-8 array roughly in half using byte offsets, then search for the nearest utf-8 character boundary. After this, both halves are treated as utf-8 input ranges, not random access. This switching between interpretations doesn't fit well with the current distinction between `char[]` and `byte[]`. A number of algorithms in phobos operate on one or the other, but not both. It'd be very useful to have an approach to utf-8 strings that enabled switching interpretations easily, without casting. --Jon
May 07
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 6:34 PM, Jon Degenhardt wrote:
 On Friday, 7 May 2021 at 15:24:42 UTC, Andrei Alexandrescu wrote:
 0. We put a String type in the standard library. It uses UTF8 inside 
 and supports iteration by either bytes, UTF8, UTF16, or UTF32. It 
 manages its own memory so no need for the GC. It disallows remote 
 coupling across callers/callees. Case closed.
This is a bit orthogonal, but... An important characteristic of utf-8 arrays is that they are simultaneously a random access range of bytes and an input range of utf-8 characters. For efficiency it's often important to switch back and forth between these two interpretations. `byLine` is one type of example, where a byte oriented search is done (e.g. with `memchr`), but afterward the representation array is accessed as a utf-8 input range. `byLine` implementations will usually work by iterating forward, but there are random access use cases as well. For example, it is perfectly reasonable to divide a utf-8 array roughly in half using byte offsets, then search for the nearest utf-8 character boundary. After this, both halves are treated as utf-8 input ranges, not random access. This switching between interpretations doesn't fit well with the current distinction between `char[]` and `byte[]`. A number of algorithms in phobos operate on one or the other, but not both. It'd be very useful to have an approach to utf-8 strings that enabled switching interpretations easily, without casting.
String s; func1(s.bytes); func2(s.dchars);
May 07
next sibling parent Jon Degenhardt <jond noreply.com> writes:
On Saturday, 8 May 2021 at 02:05:42 UTC, Andrei Alexandrescu 
wrote:
 On 5/7/21 6:34 PM, Jon Degenhardt wrote:
 It'd be very useful to have an approach to utf-8 strings that 
 enabled switching interpretations easily, without casting.
String s; func1(s.bytes); func2(s.dchars);
That's not quite what I was getting at. But that's my fault. A hastily written message that muddled a couple of concepts. Sorry about that, I need to write up a better description. But there are two underlying thoughts. One is being able to convert from a random access byte array to a char input range (e.g. `byUTF`), do something with it (e.g. `popFront`), then convert that form back to a random access byte range. This is logically doable because both are views on the same physical array. However, once something is an input range it doesn't convert simply to a random access range. This first one strikes me as potentially challenging because this dual view on the underlying data is not common, so there's not a lot of incentive to support it as a general concept. The second issue is more about current Phobos algorithms that specialize their implementations depending on whether the argument is a `char[]` or a `byte[]`. This normally involves conditioning on `isSomeString` or `isSomeChar`. `char[]` / `char` pass these tests, `byte[]` / `byte` do not. The cases I remember are cases where the string form was specialized to have better performance than the byte form. Look through searching.d for `isSomeString` use to see this. The trouble with this is that at the application level it can be necessary to use a byte array when working with a number of facilities. This often involves I/O, e.g. reading fixed-size blocks from an input stream (`File.byChunk`). This operates on `ubyte[]` arrays. It can be cast to a `char[]`. But this can run afoul of autodecoding related routines that expect correctly formed utf-8 characters. When reading fixed-size buffers, the starts and ends of the buffer will often not fall on utf-8 boundaries, so examining the bytes is necessary to handle these cases. (And input streams may contain corrupt utf-8 characters.) I know the above is still not an adequate description. At some point I'll try to write up something more compelling. --Jon
May 07
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/7/2021 7:05 PM, Andrei Alexandrescu wrote:
 String s;
 func1(s.bytes);
 func2(s.dchars);
Already done:

s.byCodeUnit
s.byChar
s.byWchar
s.byDchar
s.byUTF

https://dlang.org/phobos/std_utf.html
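A minimal sketch of how these `std.utf` views behave on a built-in `string`, assuming a sample text with one two-byte character. `std.string.representation` is not in the list above; it is added here only to show the raw byte view next to the range adaptors.

```d
import std.range : walkLength;
import std.string : representation;
import std.utf : byCodeUnit, byDchar, byWchar;

void main()
{
    string s = "h\u00E9llo"; // U+00E9 takes two UTF-8 code units

    // Byte-level views: no decoding takes place.
    assert(s.length == 6);
    assert(s.representation.length == 6);
    static assert(is(typeof(s.representation) == immutable(ubyte)[]));
    assert(s.byCodeUnit.walkLength == 6);

    // Decoded views of the same data.
    assert(s.byWchar.walkLength == 5); // UTF-16 code units
    assert(s.byDchar.walkLength == 5); // code points
}
```

All of these are views over the same `immutable(char)[]`; none of them introduces a separate string type.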
May 09
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/9/21 5:04 AM, Walter Bright wrote:
 On 5/7/2021 7:05 PM, Andrei Alexandrescu wrote:
 String s;
 func1(s.bytes);
 func2(s.dchars);
Already done: s.byCodeUnit, s.byChar, s.byWchar, s.byDchar, s.byUTF (https://dlang.org/phobos/std_utf.html)
Problem being of course that there's no UDT String type, only the crappy immutable(char)[].
May 09
prev sibling parent reply guai <guai inbox.ru> writes:
On Friday, 7 May 2021 at 22:34:19 UTC, Jon Degenhardt wrote:
 `byLine` implementations will usually work by iterating 
 forward, but there are random access use cases as well. For 
 example, it is perfectly reasonable to divide a utf-8 array 
 roughly in half using byte offsets, then search for the 
 nearest utf-8 character boundary. After this, both halves are 
 treated as utf-8 input ranges, not random access.
In my experience, treating a string as a byte array is almost never a good thing. The person doing it must be very careful and truly understand what they are doing. What are those use cases other than `byLine` where this is useful? Dividing a utf-8 array and searching for the nearest char may split inside a combining character, which isn't a thing you usually want, especially when a human will read this text. Conceptually, a string is a sequence of characters. A range of dchar in D's terms.
May 08
next sibling parent reply Berni44 <someone somemail.com> writes:
On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 Dividing utf-8 array and searching for the nearest char may 
 split inside a combining character which isn't a thing you 
 usually want.
It is not difficult to recognize this case and go back 1 to 3 bytes to reach a correct splitting place. UTF-8 was designed with this in mind.

- I can imagine that this can be useful in divide-and-conquer algorithms, like binary search.
- Or when you have, for whatever reason, the possibility to do larger jumps while scanning a string, e.g. when you know that the next 50 letters do not contain a certain token you are looking for, you can safely jump 50 bytes, go back to the next splitting point and continue the linear search there.
- Or you want to cut a string into pieces of a certain length (again 50?), where the exact length is not so important. So you just jump ahead 50, go back again and split at this point. If there are a lot of non-ascii characters in between, this is of course shorter, but maybe ok, because speed is more important.
- You want to process pieces of a string in parallel: Cut it in 16 pieces and let your 16 cores work on each of them.
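A minimal sketch of the "go back 1 to 3 bytes" step, assuming the input is valid UTF-8. The function name `backToBoundary` is hypothetical. It relies only on the fact that UTF-8 continuation bytes have the bit pattern `10xxxxxx`.

```d
import std.utf : validate;

/// Returns the largest index <= i at which a UTF-8 sequence starts.
/// Assumes `s` holds valid UTF-8 (hypothetical helper, not a Phobos function).
size_t backToBoundary(const(char)[] s, size_t i) pure nothrow @nogc @safe
{
    if (i >= s.length)
        return s.length;
    // Step back over continuation bytes (10xxxxxx).
    while (i > 0 && (s[i] & 0xC0) == 0x80)
        --i;
    return i;
}

void main()
{
    // 'a' (1 byte), U+00E9 (2 bytes), U+20AC (3 bytes), 'b' (1 byte)
    string s = "a\u00E9\u20ACb";
    assert(s.length == 7);

    // Byte offset 4 falls inside the 3-byte sequence; back up to its start.
    immutable mid = backToBoundary(s, 4);
    assert(mid == 3);

    auto left = s[0 .. mid];
    auto right = s[mid .. $];
    assert(left == "a\u00E9");
    assert(right == "\u20ACb");

    // Both halves are still well-formed UTF-8.
    validate(left);
    validate(right);
}
```

This finds code point boundaries only. It does nothing about combining characters or other grapheme-level concerns raised elsewhere in the thread.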
May 08
next sibling parent Jon Degenhardt <jond noreply.com> writes:
On Saturday, 8 May 2021 at 16:25:31 UTC, Berni44 wrote:
 On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 Dividing utf-8 array and searching for the nearest char may 
 split inside a combining character which isn't a thing you 
 usually want.
It is not difficult to recognize this case and go back 1 to 3 bytes to reach a correct splitting place. UTF-8 was designed with this in mind. - I can imagine, that this can be useful in divide-and-conquer algorithms, like binary search. ... (more examples) .. - You want to process pieces of a string in parallel: Cut it in 16 pieces and let your 16 cores work on each of them.
Exactly. All the ideas you listed apply. Parallelization is very often useful.
May 08
prev sibling parent reply guai <guai inbox.ru> writes:
On Saturday, 8 May 2021 at 16:25:31 UTC, Berni44 wrote:
 On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 Dividing utf-8 array and searching for the nearest char may 
 split inside a combining character which isn't a thing you 
 usually want.
It is not difficult to recognize this case and go back 1 to 3 bytes to reach a correct splitting place. UTF-8 was designed with this in mind.
I meant this: [combining characters](https://en.wikipedia.org/wiki/Combining_character). They are language-specific, but most of the time the string does not contain any clue as to which language it is.
 - I can imagine, that this can be useful in divide-and-conquer 
 algorithms, like binary search.
They must be applied with great care to non-ascii texts. What about RTL, for example? You cannot split inside an RTL block.
 - Or you want to cut a string into pieces of a certain length 
 (again 50?), where the exact length is not so much important.
For what business task would I do that? I may want to split a string on some char subsequence for lexing. But one cannot assume lengths of those chunks.
 So you just jump ahead 50, go back again and split at this 
 point. If there are a lot of non ascii characters in between, 
 this is of course shorter, but maybe ok, because speed is more 
 important.
Not sure if speed is more important than correctness.
 - You want to process pieces of a string in parallel: Cut it in 
 16 pieces and let your 16 cores work on each of them.
I'm not sure if this is possible with all the quirks of unicode. I have never even heard of parallel processors for structured texts like xml.
May 08
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 8 May 2021 at 19:06:48 UTC, guai wrote:
 I meant this: [combining 
 characters](https://en.wikipedia.org/wiki/Combining_character). 
 They are language-specific, but most of the time the string 
 does not contain any clue as to which language it is.
The thing is making the range be of dchars doesn't help with this. This kind of thinking is why Phobos does the autodecoding thing it does now, converting utf-8 to a range of dchar as it sees it... but those combining characters are still (or rather can be) two separate dchars! So right now Phobos does something that seems useful... but actually isn't. All of the bad, none of the good. BTW I also like to point out that Ascii actually has a lot of the same mysteries we ascribe to unicode. Like variable width chars: \t is an ascii char. Zero width char, ascii has \0 and \a. Negative width char? Is \b one? idk. But there's still a lot of times you can treat it as bytes and get away with it. This is why I'm not sold on Andrei's new String idea myself. I totally agree making char[] a range of dchars is a bad idea. But I think the only right thing to do is to expose what it actually is and then both educate and empower the user to do what they need themselves.
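The point about combining characters can be checked with a short sketch, assuming a sample string made of `e` followed by U+0301 (combining acute accent). It compares the byte count, the code point count, and the grapheme count reported by `std.uni.byGrapheme`.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar;

void main()
{
    // One user-perceived character: 'e' plus U+0301 COMBINING ACUTE ACCENT.
    string s = "e\u0301";

    assert(s.length == 3);                // UTF-8 code units
    assert(s.byDchar.walkLength == 2);    // still two separate dchars
    assert(s.byGrapheme.walkLength == 1); // a single grapheme cluster
}
```

Iterating by `dchar` therefore does not protect against splitting between a base character and its combining mark; only grapheme-level iteration does.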
May 08
parent reply guai <guai inbox.ru> writes:
On Saturday, 8 May 2021 at 19:30:03 UTC, Adam D. Ruppe wrote:
 On Saturday, 8 May 2021 at 19:06:48 UTC, guai wrote:
 I meant this: [combining 
 characters](https://en.wikipedia.org/wiki/Combining_character). 
 They are language-specific, but most of the time the string 
 does not contain any clue as to which language it is.
The thing is making the range be of dchars doesn't help with this.
At least it won't induce more problems
May 08
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Saturday, 8 May 2021 at 20:06:35 UTC, guai wrote:
 The thing is making the range be of dchars doesn't help with 
 this.
At least it won't induce more problems
This is what Phobos already does and it has already created more problems. It was a mistake to do it this way. But if string was just an opaque(ish) blob with a variety of accessor properties it would work better then. The big mistake Phobos made was trying to automatically do something and causing friction by that automatic thing not being right.
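As a rough illustration of the "opaque(ish) blob with accessor properties" idea, here is a minimal sketch. The `String` struct below is hypothetical and is not an existing or proposed Phobos API. For brevity it wraps a GC-backed `immutable(char)[]`, so it does not show the own-memory-management part of the proposal, and the accessor names `bytes` and `dchars` are borrowed from the earlier `func1(s.bytes); func2(s.dchars);` snippet.

```d
import std.range : walkLength;
import std.string : representation;
import std.utf : byDchar;

// Hypothetical type for illustration only.
struct String
{
    private immutable(char)[] data; // UTF-8 payload, hidden from callers

    this(string s) { data = s; }

    /// Random-access view of the raw UTF-8 code units.
    immutable(ubyte)[] bytes() { return data.representation; }

    /// Decoded view: an input range of code points.
    auto dchars() { return data.byDchar; }
}

void main()
{
    auto s = String("h\u00E9llo");

    // The caller picks the interpretation explicitly; nothing is automatic.
    assert(s.bytes.length == 6);
    assert(s.dchars.walkLength == 5);
}
```

Because the payload is private, the representation could later change without affecting callers that only use the accessors.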
May 08
parent Max Haughton <maxhaton gmail.com> writes:
On Saturday, 8 May 2021 at 21:54:28 UTC, Adam D. Ruppe wrote:
 On Saturday, 8 May 2021 at 20:06:35 UTC, guai wrote:
 The thing is making the range be of dchars doesn't help with 
 this.
At least it won't induce more problems
This is what Phobos already does and it has already created more problems. It was a mistake to do it this way. But if string was just an opaque(ish) blob with a variety of accessor properties it would work better then. The big mistake Phobos made was trying to automatically do something and causing friction by that automatic thing not being right.
The opaque blob model also allows SSO much more easily.
May 08
prev sibling parent reply Berni44 <someone somemail.com> writes:
On Saturday, 8 May 2021 at 19:06:48 UTC, guai wrote:
 I meant this: [combining 
 characters](https://en.wikipedia.org/wiki/Combining_character). 
 They are language-specific, but most of the time the string 
 does not contain any clue as to which language it is.
You are talking about generic algorithms that work for every script. But unicode allows for algorithms that support only subsets. If your subset doesn't contain combining characters, you don't need to care about them. Otherwise you may need to go back to the next base character. It depends on the use case.
 - I can imagine, that this can be useful in divide-and-conquer 
 algorithms, like binary search.
They must be applied with great care to non-ascii texts. What about RTL, for example? You cannot split inside an RTL block.
Oh, yes, you can! Think of an algorithm which is doing cryptographic analysis and counting consecutive pairs of ascii characters. For that it doesn't matter if there is RTL text cut into pieces.
 - Or you want to cut a string into pieces of a certain length 
 (again 50?), where the exact length is not so much important.
For what business task would I do that?
Simple wrapping to avoid losing text when printing, or to avoid having to scroll vertically. It is probably not useful for a high quality program...
 I may want to split a string on some char subsequence for 
 lexing. But one cannot assume lengths of those chunks.
Depending on the use case you may know ahead.
 So you just jump ahead 50, go back again and split at this 
 point. If there are a lot of non ascii characters in between, 
 this is of course shorter, but maybe ok, because speed is more 
 important.
Not sure if speed is more important than correctness.
Of course, this again depends on the use case. You can't say that in general.
 - You want to process pieces of a string in parallel: Cut it 
 in 16 pieces and let your 16 cores work on each of them.
I'm not sure if this is possible with all the quirks of unicode.
Think again of the cryptographic analysis above, for an example. (Or checking wikipedia entries for whatever automatically.) Keep in mind, that we do not always have to support everything of unicode. If we know ahead, that our text contains mainly ascii and aside from this only a few base characters, but never combining characters and so on, we can use different algorithms which might be simpler or faster or both. To make sure, that this constraint holds, is then something, that has to be done outside of the algorithm.
 I have never even heard of parallel processors for structured texts 
 like xml.
I would judge it much more difficult to process xml in parallel than to do the same with unicode.
May 08
parent guai <guai inbox.ru> writes:
On Saturday, 8 May 2021 at 20:19:51 UTC, Berni44 wrote:
 Oh, yes, you can! Think of an algorithm which is doing 
 cryptographic analysis and counting consecutive pairs of ascii 
 characters. For that it doesn't matter if there is RTL text cut 
 into pieces.
No cryptography is done on strings; it is done on byte arrays. Why would you even want to use a string here? Its methods won't be of any help.
May 08
prev sibling parent reply Jon Degenhardt <jond noreply.com> writes:
On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 On Friday, 7 May 2021 at 22:34:19 UTC, Jon Degenhardt wrote:
 `byLine` implementations will usually work by iterating 
 forward, but there are random access use cases as well. For 
 example, it is perfectly reasonable to divide a utf-8 array 
 roughly in half using byte offsets, then search for the 
 nearest utf-8 character boundary. After this, both halves 
 are treated as utf-8 input ranges, not random access.
In my experience, treating a string as a byte array is almost never a good thing. The person doing it must be very careful and truly understand what they are doing. What are those use cases other than `byLine` where this is useful? Dividing a utf-8 array and searching for the nearest char may split inside a combining character, which isn't a thing you usually want, especially when a human will read this text. Conceptually, a string is a sequence of characters. A range of dchar in D's terms.
Data and log file processing are common cases. Single byte ascii characters are normally used to delimit structure in such files. Record delimiters, field delimiters, name-value pair delimiters, escape syntax, etc. A common way to operate on such files is to identify structural boundaries by finding the requisite single byte ascii characters and treating the contained data as opaque (uninterpreted) sequences of utf-8 bytes. The details depend on the file format. But the key part is that single byte ascii characters can be unambiguously identified without interpreting other characters in a utf-8 data stream. Of course, when it comes time to interpreting the data inside these data streams it is necessary to operate on cohesive blocks. Yes graphemes, but also things like numbers. It's not useful to split a number in the middle and then call `std.conv.to!double` on it. Operating on the single byte structural elements allows deferring interpretation of multi-byte unicode content until it is needed. This is why it's useful to switch back and forth between a byte-oriented view and a UTF character view. Operating on bytes is faster (e.g. `memchr`, no utf-8 decoding), enables parallelization (depending on the type of file), and can be used with fixed size buffer reads and writes. --Jon
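A minimal sketch of this byte-then-text pattern, assuming tab-delimited records. The function name `splitFields` is hypothetical. The search for the delimiter runs over the raw bytes, and each field is only reinterpreted as UTF-8 text afterwards; this is sound because bytes below 0x80 never occur inside a multi-byte UTF-8 sequence.

```d
import std.algorithm.iteration : map, splitter;
import std.array : array;
import std.conv : to;
import std.string : representation;

/// Splits a UTF-8 line on a single-byte ASCII delimiter without decoding.
/// Hypothetical helper for illustration only.
string[] splitFields(string line, char delim = '\t')
{
    assert(delim < 0x80, "delimiter must be a single-byte ASCII character");
    return line.representation          // immutable(ubyte)[] view, no decoding
        .splitter(cast(ubyte) delim)    // byte-oriented search
        .map!(field => cast(string) field) // back to a text view of the same memory
        .array;
}

void main()
{
    auto fields = splitFields("na\u00EFve\tcaf\u00E9\t42");
    assert(fields == ["na\u00EFve", "caf\u00E9", "42"]);

    // Interpretation happens later, on a cohesive block.
    assert(fields[2].to!double == 42.0);
}
```

The cast back to `string` assumes the input line was valid UTF-8 to begin with; data read from an untrusted stream would need validation first.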
May 08
parent reply guai <guai inbox.ru> writes:
On Saturday, 8 May 2021 at 18:44:00 UTC, Jon Degenhardt wrote:
 On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 On Friday, 7 May 2021 at 22:34:19 UTC, Jon Degenhardt wrote:
 `byLine` implementations will usually work by iterating 
 forward, but there are random access use cases as well. For 
 example, it is perfectly reasonable to divide a utf-8 array 
 roughly in half using byte offsets, then search for the 
 nearest utf-8 character boundary. After this, both halves 
 are treated as utf-8 input ranges, not random access.
In my experience, treating a string as a byte array is almost never a good thing. The person doing it must be very careful and truly understand what they are doing. What are those use cases other than `byLine` where this is useful? Dividing a utf-8 array and searching for the nearest char may split inside a combining character, which isn't a thing you usually want, especially when a human will read this text. Conceptually, a string is a sequence of characters. A range of dchar in D's terms.
Data and log file processing are common cases. Single byte ascii characters are normally used to delimit structure in such files. Record delimiters, field delimiters, name-value pair delimiters, escape syntax, etc. A common way to operate on such files is to identify structural boundaries by finding the requisite single byte ascii characters and treating the contained data as opaque (uninterpreted) sequences of utf-8 bytes. The details depend on the file format. But the key part is that single byte ascii characters can be unambiguously identified without interpreting other characters in a utf-8 data stream. Of course, when it comes time to interpreting the data inside these data streams it is necessary to operate on cohesive blocks. Yes graphemes, but also things like numbers. It's not useful to split a number in the middle and then call `std.conv.to!double` on it. Operating on the single byte structural elements allows deferring interpretation of multi-byte unicode content until it is needed. This is why it's useful to switch back and forth between a byte-oriented view and a UTF character view. Operating on bytes is faster (e.g. `memchr`, no utf-8 decoding), enables parallelization (depending on the type of file), and can be used with fixed size buffer reads and writes. --Jon
When you work with log files first you pull it in as a byte stream, split in chunks. Then make a string out of each of them. Once you've done it, you process it like a string with all the rules of unicode. For example split it into words. And then you may want to convert a word to bytes back again. But you cannot split a string wherever you want treating it as bytes. It most certainly wouldn't work with all the languages out there. With string you cannot get a char by index, you must read them sequentially. You can search, you can tokenize, rewind and reinterpret maybe.
May 08
parent reply Jon Degenhardt <jond noreply.com> writes:
On Saturday, 8 May 2021 at 19:33:45 UTC, guai wrote:
 ...
 But you cannot split a string wherever you want treating it as 
 bytes. It most certainly wouldn't work with all the languages 
 out there.
Sure you can. It's necessary to take advantage of the properties of utf-8 encoding to do it. That is, it's necessary to find a nearby utf-8 character boundary, but utf-8 is defined in a manner that enables this. Take a look at [section 2.5 Encoding Forms](http://www.unicode.org/versions/Unicode13.0.0/ch02.pdf#G13708) in the Unicode Standards doc. It describes exactly this.
 With string you cannot get a char by index, you must read them 
 sequentially.
Correct, you cannot find a unicode character using a character based index without processing sequentially. But for large classes of algorithms this is not necessary. That is, there is often no need to find, for example, the 100th character. If all an algorithm needs to do is split a string roughly in half, then use the byte offsets to find the halfway point and then look for a utf-8 character boundary. If the algorithm is based on some other boundary, say, token boundaries, then find one of those boundaries.
May 08
parent reply guai <guai inbox.ru> writes:
On Saturday, 8 May 2021 at 20:22:28 UTC, Jon Degenhardt wrote:
 If all an algorithm needs to do is split a string roughly in 
 half, then use the byte offsets to find the halfway point and 
 then look for a utf-8 character boundary. If the algorithm is 
 based on some other boundary, say, token boundaries, then find 
 one of those boundaries.
The algorithms you are talking about either don't need strings at all, but byte/char arrays instead, or would produce garbage for any input other than ascii. Your example with log files mixes binary data with text. A properly done logger will escape delimiters inside text chunks, so it isn't even a string per se; it's some binary data from which you need to extract a string first. A lot of bugs are caused by this mixing of text with binary. And I think it is better to distinguish them properly on a type level.
May 08
parent Jon Degenhardt <jond noreply.com> writes:
On Saturday, 8 May 2021 at 21:47:21 UTC, guai wrote:
 On Saturday, 8 May 2021 at 20:22:28 UTC, Jon Degenhardt wrote:
 If all an algorithm needs to do is split a string roughly in 
 half, then use the byte offsets to find the halfway point and 
 then look for a utf-8 character boundary. If the algorithm is 
 based on some other boundary, say, token boundaries, then find 
 one of those boundaries.
Those algorithms you're talking about either don't need strings at all, but rather byte/char arrays, or would produce garbage for any input other than ascii.
I don't understand the point you are trying to make. Perhaps you could rephrase. I've implemented any number of these types of algorithms. It's very common to mix interpretation as unicode strings with interpretation as utf-8 bytes. e.g. Maybe it's necessary to do case-conversion at some stage of processing. This has to be done on unicode characters, not bytes. But needing to do such processing at some point does not exclude treating the data as utf-8 bytes for other purposes. Also, a `char[]` in D is defined to be utf-8, and a `string` is an `immutable(char)[]`. So why would utf-8 data, including non-ascii characters, read into a `char[]` produce garbage? The answer is that it wouldn't. No, you cannot simply start on an arbitrary byte boundary, but nobody has suggested this.
 Your example with log files mixes binary data with text. 
 Properly done logger will escape delimiters inside text chunks, 
 so it isn't even a string per se, it's some binary data from 
 which you need to extract a string first.
Again, I'm not following the logic. Log files may or may not include binary data. But I'm not sure why that matters. I'm talking about log files where the text portions are encoded as utf-8.
 A lot of bugs are caused by this mixing of text with binary. 
 And I think it is better to distinguish them properly on a type 
 level.
Perhaps it would help if you described what you mean by "binary". I tend to think of "binary" as things like image data, binary serialization formats, base-64 coding, compressed or encrypted text. These are quite different than utf-8 encoded unicode text.
May 08
prev sibling parent Q. Schroll <qs.il.paperinik gmail.com> writes:
On Friday, 7 May 2021 at 15:24:42 UTC, Andrei Alexandrescu wrote:
 Compare all that with:

 We put a String type in the standard library. It uses UTF8 
 inside and supports iteration by either bytes, UTF8, UTF16, or 
 UTF32. It manages its own memory so no need for the GC. It 
 disallows remote coupling across callers/callees. Case closed.
True. But why have it easy when you can have it complicated?
May 07
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/7/2021 7:16 AM, Paul Backus wrote:
 "Is a string type" and "is implicitly convertible to a string type" are not
the 
 same thing.
Language lawyer point: An enum can be implicitly converted to its base type, but it's a match level 2: https://dlang.org/spec/function.html#function-overloading (Agreeing with Paul)
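A small sketch of what that match level means in practice. The enum `Color` and the function `pick` are made up for illustration; the point is only that an exact match on the enum type beats the implicit conversion to the base type:

```d
enum Color : string { red = "red" }   // hypothetical enum

int pick(Color c)  { return 1; }  // exact match for a Color argument
int pick(string s) { return 2; }  // reachable from Color only via implicit conversion

// The enum converts implicitly to its base type, but it is not the same type.
static assert( is(Color : string));
static assert(!is(Color == string));

void main()
{
    assert(pick(Color.red) == 1);  // the exact match wins
    assert(pick("red") == 2);
    string s = Color.red;          // the implicit conversion itself is allowed
    assert(s == "red");
}
```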
May 08
parent deadalnix <deadalnix gmail.com> writes:
On Sunday, 9 May 2021 at 02:57:42 UTC, Walter Bright wrote:
 On 5/7/2021 7:16 AM, Paul Backus wrote:
 "Is a string type" and "is implicitly convertible to a string 
 type" are not the same thing.
Language lawyer point: An enum can be implicitly converted to its base type, but it's a match level 2: https://dlang.org/spec/function.html#function-overloading (Agreeing with Paul)
Sorry to be blunt, but this is a complete language lawyering fail. Classes implementing an interface are a subtype and are match level 2 (implicit conversion) when matching against the interface. In fact, any subtype is expected to be a match level 2. Arguably, this isn't bijective, as not all level 2 matches will be subtypes, so that doesn't definitively nail the topic at hand, but the arguments made in this thread are disturbingly unsound.
May 09
prev sibling parent reply Jon Degenhardt <jond noreply.com> writes:
On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
 On 5/7/21 2:03 AM, evilrat wrote:
 On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu 
 wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029
Just a commoner here, can you explain for stupid what makes enum string a no go and why it should begone?
Heavy toll on the infra for a very niche use case with trivial workarounds on the user side.
To try to put some focus on the user perspective, here's a sample program:

```
import std.stdio;
import std.array;
import std.range;

void main()
{
    writefln!"%d"(0);

    immutable string f1 = "%d";
    writefln!f1(1);

    enum f2 = "%d";
    writefln!f2(2);

    enum string f3 = "%d";
    writefln!f3(3);

    enum { f4 = "%d" }
    writefln!f4(4);

    enum : string { f5 = "%d" }
    writefln!f5(5);

    enum X { f6 = "%d" }
    writefln!(X.f6)(6);   // Compilation error

    enum Y : string { f7 = "%d" }
    writefln!(Y.f7)(7);   // Compilation error
}
```

All but the named enums (last two) are fine. These fail with similar compilation errors:

```
Error: template std.stdio.writefln cannot deduce function from argument types !("%d")(int), candidates are:
dmd-2.095.1/osx/bin/../../src/phobos/std/stdio.d(4258):        writefln(alias fmt, A...)(A args)
  with fmt = f6,
       A = (int)
  must satisfy the following constraint:
       isSomeString!(typeof(fmt))
dmd-2.095.1/osx/bin/../../src/phobos/std/stdio.d(4269):        writefln(Char, A...)(in Char[] fmt, A args
```

This is at least a potentially confusing situation for users. The error message indicates that `f6` should be a "string" of some kind, and it looks like one. One needs to be very familiar with the details to understand why it does not satisfy `isSomeString`. Similarly with understanding why anonymous enums are fine but named enums are not.

The error message is also not particularly helpful in determining what the available workarounds are. They may be trivial once understood, but there's non-trivial learning to get there. Note that slicing (`[]`) and `.representation()` do not work for the template argument. Casting does. e.g. The following is fine:

```
writefln!(cast(string)X.f6)(6);
```

It can be argued that this case is rare enough in user code that the ROI from either making the case work or improving the compiler error message is too low to devote time to this now. But maybe there are other cheap options that could help users. A documentation note perhaps. A FAQ somewhere on the D site that would surface in searches.
May 11
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 3:43 PM, Jon Degenhardt wrote:
 
      enum { f4 = "%d" }
      writefln!f4(4);
 
      enum : string { f5 = "%d" }
      writefln!f5(5);
 
      enum X { f6 = "%d" }
      writefln!(X.f6)(6);   // Compilation error
 
      enum Y : string { f7 = "%d" }
      writefln!(Y.f7)(7);   // Compilation error
Thanks. I agree it's confusing. The mystery gets elucidated with some ease if we write the types involved: f4 and f5 have type string, f6 has type X, and f7 has type Y. It's unpleasant that `enum : string { f5 = "%d" }` is really the same as `enum f5 = "%d"`. I expected that some anonymous enum type would be generated.
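The types can be checked at compile time. The following is only a sketch that restates the declarations from the sample program at module scope and asserts what is described above, including the cast workaround:

```d
import std.traits : isSomeString;

enum { f4 = "%d" }             // anonymous enum: a manifest constant
enum : string { f5 = "%d" }    // same, with the type spelled out
enum X { f6 = "%d" }           // named enum: introduces the distinct type X
enum Y : string { f7 = "%d" }  // named enum: introduces the distinct type Y

static assert(is(typeof(f4) == string));
static assert(is(typeof(f5) == string));
static assert(is(typeof(X.f6) == X));
static assert(is(typeof(Y.f7) == Y));

// This is the constraint that rejects the named enums in writefln.
static assert( isSomeString!(typeof(f5)));
static assert(!isSomeString!(typeof(X.f6)));
static assert(!isSomeString!(typeof(Y.f7)));

// The cast restores a plain string, which is why it works as a template argument.
static assert(isSomeString!(typeof(cast(string) X.f6)));

void main() {}
```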
May 11
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 7:00 PM, Andrei Alexandrescu wrote:
 On 5/11/21 3:43 PM, Jon Degenhardt wrote:
      enum { f4 = "%d" }
      writefln!f4(4);

      enum : string { f5 = "%d" }
      writefln!f5(5);

      enum X { f6 = "%d" }
      writefln!(X.f6)(6);   // Compilation error

      enum Y : string { f7 = "%d" }
      writefln!(Y.f7)(7);   // Compilation error
Thanks. I agree it's confusing. The mystery gets elucidated with some ease if we write the types involved: f4 and f5 have type string, f6 has type X, and f7 has type Y. It's unpleasant that `enum : string { f5 = "%d" }` is really the same as `enum f5 = "%d"`. I expected that some anonymous enum type would be generated.
Another unpleasant issue:

    enum Y : string { f7 = "%d" }
    writeln(typeof(Y.f7.representation).stringof);

prints immutable(ubyte)[], not immutable(char)[]. So not even Y.f7.representation is usable. Sigh.
May 11
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:
 
      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);
 
 prints immutable(ubyte)[], not immutable(char)[]. So not even 
 Y.f7.representation is usable. Sigh.
The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.
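For reference, a sketch of the behavior under discussion, using `std.string.representation` and the same enum `Y` as above. The assertions restate what the thread reports; they are not a claim about what the function ought to do:

```d
import std.string : representation;

enum Y : string { f7 = "%d" }

// For a plain string, representation yields the code units as ubytes.
static assert(is(typeof("%d".representation) == immutable(ubyte)[]));

// For the named enum the result is the same type: the call goes through
// the implicit conversion to string, so the enum step is not visible.
static assert(is(typeof(Y.f7.representation) == immutable(ubyte)[]));

// A cast is what stops at the base type.
static assert(is(typeof(cast(string) Y.f7) == string));

void main() {}
```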
May 11
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
 On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:
 
      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);
 
 prints immutable(ubyte)[], not immutable(char)[]. So not even 
 Y.f7.representation is usable. Sigh.
The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.
Y.f7 is of type Y. Its representation is string, not immutable(ubyte)[]. typeof(Y.f7.representation) ought to be string. typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[]. Unless I'm missing something, that would be the consistent behavior. Unless representation is supposed to recurse up to the bottom turtle?
May 11
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/11/2021 7:04 PM, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
 On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:

      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);

 prints immutable(ubyte)[], not immutable(char)[]. So not even 
 Y.f7.representation is usable. Sigh.
The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.
Y.f7 is of type Y. Its representation is string, not immutable(ubyte)[]. typeof(Y.f7.representation) ought to be string. typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[].
That's what I said.
May 11
parent Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Wednesday, 12 May 2021 at 02:56:49 UTC, Walter Bright wrote:
 On 5/11/2021 7:04 PM, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
 On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:

      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);

 prints immutable(ubyte)[], not immutable(char)[]. So not 
 even Y.f7.representation is usable. Sigh.
The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.
Y.f7 is of type Y. Its representation is string, not immutable(ubyte)[]. typeof(Y.f7.representation) ought to be string. typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[].
That's what I said.
🍿
May 11
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 10:04 PM, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
 On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:

      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);

 prints immutable(ubyte)[], not immutable(char)[]. So not even 
 Y.f7.representation is usable. Sigh.
The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.
Y.f7 is of type Y. Its representation is string, not immutable(ubyte)[]. typeof(Y.f7.representation) ought to be string. typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[]. Unless I'm missing something, that would be the consistent behavior. Unless representation is supposed to recurse up to the bottom turtle?
`representation` is a library function, so in a way we get to have a say in what it does. I would have expected it doesn't go all the way to primitive types, but if it does, that's not necessarily incorrect.
May 12
prev sibling parent Walter Bright <newshound2 digitalmars.com> writes:
On 5/11/2021 4:00 PM, Andrei Alexandrescu wrote:
 It's unpleasant that `enum : string { f5 = "%d" }` is really the same as `enum 
 f5 = "%d"`. I expected that some anonymous enum type would be generated.
That came about due to the decision to overload enum to create manifest constants. This way, a block of manifest constants can be created.
May 11
prev sibling next sibling parent Per =?UTF-8?B?Tm9yZGzDtnc=?= <per.nordlow gmail.com> writes:
On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029
Can you describe the scope of the rottenness in terms of contexts and arguments? Are you referring to enums derived from aggregates as well? And how does this rottenness relate to the discrepancy in behavior between builtin `__traits(X, ...)` and `std.traits.X!(...)` for enum arguments?
May 07
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/6/21 11:48 PM, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.
 
 https://github.com/dlang/phobos/pull/8029
What do you mean "not support"? The language has enums derived from strings. Did you mean remove it from the language? That would be a severe penalty.

Did you mean that Phobos routines just should error whenever you use enum types derived from strings? That's also a severe penalty.

If you mean we shouldn't support it (as an ambiguous case) in *conversion* utilities (i.e. to/from string), then this makes some sense. But it's also not straightforward. Sometimes you WANT to convert from the enum to the base type. Sometimes you want to convert to the enum name. Going backwards (string to enum), which one makes more sense? It depends on context. It also doesn't help that a string enum implicitly converts to a string. The language is going to circumvent any policies Phobos has on that front.

For an example, in the serializers I have written, I usually have a "treat this enum type as its base type" UDA, because the data inside the serialized format is the base type, but I want it as an enum in d-land. But it depends on the situation.

I think it makes possible sense to require either wrappers that clarify intent, or always treat enums the same way (as an enum). I think Phobos *mostly* does the latter. Erroring for ambiguity might be more disruptive than it's worth.

-Steve
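To make the two directions of conversion concrete, here is a small sketch. The enum `Level` is hypothetical; the behavior shown is that of `std.conv.to`, which treats the value as an enum (member name), versus a cast or implicit conversion, which yields the base-type value:

```d
import std.conv : to;

enum Level : string { low = "L", high = "H" }  // hypothetical enum

void main()
{
    // std.conv.to treats the value as an enum and yields the member name.
    assert(Level.low.to!string == "low");

    // A cast, or the implicit conversion, yields the base-type value.
    assert(cast(string) Level.low == "L");
    string s = Level.high;
    assert(s == "H");

    // Going backwards, to!Level looks the string up by member name.
    assert("high".to!Level == Level.high);
}
```

A serializer that stores the base-type value would want the cast in one direction and a lookup by value in the other, which is the kind of context dependence described above.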
May 07
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
 On 5/6/21 11:48 PM, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029
What do you mean "not support"? The language has enums derived from strings. Did you mean remove it from the language? That would be a severe penalty.
Enums derived from strings should not be supported as strings in the standard library.
 Did you mean that Phobos routines just should error whenever you use 
 enum types derived from strings? That's also a severe penalty.
No it isn't.
May 07
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
 Enums derived from strings should not be supported as strings 
 in the standard library.
I don't think the stdlib should special case much of anything. Special casing enums is a mistake. If the user wants it treated as a string, they can cast it to a string. Special casing static arrays is a mistake. The user can just slice it out the outside. Special casing alias this is a mistake. The user can pass what they meant to pass. The phobos templates should work like all other templates - on the exact type passed. Other functions work with the normal overloading and implicit conversion rules. Kill all the special cases!
May 07
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 11:33 AM, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
 Enums derived from strings should not be supported as strings in the 
 standard library.
I don't think the stdlib should special case much of anything. Special casing enums is a mistake. If the user wants it treated as a string, they can cast it to a string.
yes
 Special casing static arrays is a mistake. The user can just slice it 
 out the outside.
Yes
 Special casing alias this is a mistake. The user can pass what they 
 meant to pass.
YES
 The phobos templates should work like all other templates - on the exact 
 type passed. Other functions work with the normal overloading and 
 implicit conversion rules.
 
 Kill all the special cases!
YES!!!
May 07
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Friday, May 7, 2021 9:39:40 AM MDT Andrei Alexandrescu via Digitalmars-d 
wrote:
 On 5/7/21 11:33 AM, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
 Enums derived from strings should not be supported as strings in the
 standard library.
I don't think the stdlib should special case much of anything. Special casing enums is a mistake. If the user wants it treated as a string, they can cast it to a string.
yes
 Special casing static arrays is a mistake. The user can just slice it
 out the outside.
Yes
 Special casing alias this is a mistake. The user can pass what they
 meant to pass.
YES
 The phobos templates should work like all other templates - on the exact
 type passed. Other functions work with the normal overloading and
 implicit conversion rules.

 Kill all the special cases!
YES!!!
Agreed. While implicit conversions can at times be useful, they cause a ton of problems when templates are involved. Ideally, we should accept no implicit conversions of any kind with templated code. And honestly, I wish that the language had fewer implicit conversions in it. In particular, I think that implicitly slicing static arrays was a big mistake, and we've had a number of issues in Phobos because of it when trying to later generalize functions that originally just took strings. - Jonathan M Davis
May 12
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Friday, 7 May 2021 at 15:33:56 UTC, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu 
 wrote:
 Enums derived from strings should not be supported as strings 
 in the standard library.
I don't think the stdlib should special case much of anything. Special casing enums is a mistake. If the user wants it treated as a string, they can cast it to a string. [...] Kill all the special cases!
100% agreed, but, back to my original point, why is the enum thing a special case to begin with? The fact that it is a special case to begin with flies in the face of Liskov's substitution principle - the enum type clearly is a subtype of string. You got to wonder how it came to be that it just doesn't work automatically to begin with. Adding special cases is indeed the wrong path. There is something deeper rotten here, and just saying, no, this shouldn't work is just not cutting it. Note that there should be special cases, but it'd be good to understand why these are special cases to begin with, and fix this. Alternatively, we decide enums are not subtypes, in which case they shouldn't be implicitly convertible either. That wouldn't be such a bad idea as I've often missed the ability to do opaque type aliasing in D, but that seems way more disruptive than just admitting that "enum strings" are indeed a subtype of string.
May 09
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
 If you mean we shouldn't support it (as an ambiguous case) in 
 *conversion* utilities (i.e. to/from string), then this makes some 
 sense. But it's also not straightforward. Sometimes you WANT to convert 
 from the enum to the base type. Sometimes you want to convert to the 
 enum name. Going backwards (string to enum), which one makes more sense? 
 It depends on context. It also doesn't help that a string enum 
 implicitly converts to a string. The language is going to circumvent any 
 policies Phobos has on that front.
Enums are poorly designed, but that's only a small part of the problem. The bigger problem is the corruption of a noble principle. We wanted to be as generic as possible, and indeed in the beginning that seemed not only possible, but also easy. I don't think there's any other language or library supporting different character widths with this little aggravation. Then this whole "be as generic as possible" became a slippery slope of inclusion. Allow enum strings. Allow alias this strings.

How about no.

User: "I have this enum string str and phobos won't consider it a string. Help!"

Another user: "Just use str.representation if you want to pass str around as a string."

User: "Cool."

Case closed.
May 07
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/7/21 11:30 AM, Andrei Alexandrescu wrote:
 On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
 If you mean we shouldn't support it (as an ambiguous case) in 
 *conversion* utilities (i.e. to/from string), then this makes some 
 sense. But it's also not straightforward. Sometimes you WANT to 
 convert from the enum to the base type. Sometimes you want to convert 
 to the enum name. Going backwards (string to enum), which one makes 
 more sense? It depends on context. It also doesn't help that a string 
 enum implicitly converts to a string. The language is going to 
 circumvent any policies Phobos has on that front.
Enums are poorly designed, but that's only a small part of the problem. The bigger problem is the corruption of a noble principle. We wanted to be as generic as possible, and indeed in the beginning that seemed not only possible, but also easy. I don't think there's any other language or library supporting different character widths with this little aggravation. Then this whole "be as generic as possible" became a slippery slope of inclusion. Allow enum strings. Allow alias this strings.
But an enum with base string type can be passed as a string. The PR in question is working around a limitation of the Phobos trait that says something derived from a string isn't really usable as a string (when it is).

The problem I see is, when phobos says something isn't true, when it really is, causes no end of confusion (*cough* autodecoding).

    static assert(!isSomeString!T);
    // yet...
    string s = someT;
 
 How about no.
 
 User: "I have this enum string str and phobos won't consider it a 
 string. Help!"
 
 Another user: "Just use str.representation if you want to pass str 
 around as a string."
 
User: "OK, but when should I use representation? I already pass it around as a string and it works fine. Why can't phobos comprehend that, when the language has no problems with it?" -Steve
May 07
next sibling parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 7 May 2021 at 15:51:39 UTC, Steven Schveighoffer wrote:
 But an enum with base string type can be passed as a string.
"Can be passed as a" is not the same as "is a". There's a conversion involved. For better or for worse, D templates do not participate in conversion and we shouldn't pretend that they do. This is often times very useful - you don't want to lose information in many templates. But there's other times when that information doesn't matter and it would be nice if you didn't have to think about it.... ...so maybe we should consider changing templates so they can participate at the language level... it would be interesting if the compiler did the conversions BEFORE instantiating any template. Then it can reuse the instances more easily too. I think it actually does for const params for example, but it could do more.
 User: "OK, but when should I use representation? I already pass 
 it around as a string and it works fine. Why can't phobos 
 comprehend that, when the language has no problems with it?"
But the language DOES have problems with it for certain types of functions. Phobos is trying to deny that reality.
May 07
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 12:30 PM, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:51:39 UTC, Steven Schveighoffer wrote:
 But an enum with base string type can be passed as a string.
"Can be passed as a" is not the same as "is a". There's a conversion involved.
YES! Int is not floating point, but yes you can initialize a floating point from an int. BTW it's worse than I feared. There are 104 occurrences of StringTypeOf in phobos. There should be 0.
May 07
prev sibling next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Friday, 7 May 2021 at 16:30:26 UTC, Adam D. Ruppe wrote:
 For better or for worse, D templates do not participate in 
 conversion and we shouldn't pretend that they do. This is often 
 times very useful - you don't want to lose information in many 
 templates. But there's other times when that information 
 doesn't matter and it would be nice if you didn't have to think 
 about it....
We can already *almost* express this in the language. This code works:

    void fun(T : string, T val)()
    {
        pragma(msg, "instantiated with ", T.stringof);
    }

    enum E : string { x = "hello" }

    alias test = fun!(E, E.x); // prints: instantiated with E

But if you try to write it the more natural way, with the value parameter first, and have the compiler deduce the type, you get an error:

    void fun(T val, T : string)()
    {
        pragma(msg, "instantiated with ", T.stringof);
    }
    // Error: undefined identifier `T`
May 07
parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 7 May 2021 at 17:02:17 UTC, Paul Backus wrote:
 We can already *almost* express this in the language. This code 
 works:
eeeeh that's a compile time argument and it still isn't actually a string. What I'm talking about is like in the normal function:

    void test(string s) { writeln(s); }

    enum Test : string { a = "foo" }

    test(Test.a);

The conversion to string happens outside `test`. So caller instead of callee, whereas with a template - any template - the exact type is passed, what T:string is saying is that the callee *can* do the conversion if it wants to inside, but the compiler won't actually do it for you.

This is very useful in a lot of cases. Like if you do

    void foo(T : SomeBase)(T t) {}

and pass foo(new Derived()), you can still see the whole Derived type and thus do some reflection and such over it, with the compiler promising that it can be converted to SomeBase if you want to.

Of course, in this case, it is not really different than a template constraint. You could do

    void foo(T)(T t) if(is(T : SomeBase)) {}

and get that same rejection behavior. But of course what's nice about specialization is you can then add an overload

    void foo(T : SomeBase)(T t) {}
    void foo(T : Derived)(T t) {}

And if you get like

    class Derived : SomeBase {}
    class OtherBranch : SomeBase {}

and call

    foo(new Derived()); // goes to second overload as it is a more specific match
    foo(new OtherBranch()); // goes to first overload as it is the best option available

But it still can see it is OtherBranch inside there, unlike a normal interface cast where you'd only have that detail at runtime.
May 07
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/7/21 12:30 PM, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:51:39 UTC, Steven Schveighoffer wrote:
 But an enum with base string type can be passed as a string.
"Can be passed as a" is not the same as "is a". There's a conversion involved.
But that's the intention of the function. format doesn't care what the expression really is, it wants some type of string. How do you say "I want to accept something that's a string, but I want it as a string please"?
 For better or for worse, D templates do not participate in conversion 
 and we shouldn't pretend that they do. This is often times very useful - 
 you don't want to lose information in many templates. But there's other 
 times when that information doesn't matter and it would be nice if you 
 didn't have to think about it....
e.g. format.
 ...so maybe we should consider changing templates so they can 
 participate at the language level... it would be interesting if the 
 compiler did the conversions BEFORE instantiating any template. Then it 
 can reuse the instances more easily too. I think it actually does for 
 const params for example, but it could do more.
Interesting idea!
 
 User: "OK, but when should I use representation? I already pass it 
 around as a string and it works fine. Why can't phobos comprehend 
 that, when the language has no problems with it?"
But the language DOES have problems with it for certain types of functions. Phobos is trying to deny that reality.
What I mean is, I can write: void foo(string s); and it works for enums that are string-based. Why doesn't format work with that same principle? The answer is because there isn't a good way to do it. -Steve
May 07
parent reply Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 7 May 2021 at 17:11:32 UTC, Steven Schveighoffer wrote:
 How do you say "I want to accept something that's a string, but 
 I want it as a string please"
Well, one way we can do that today is to have the template forward to a normal function, or a normal function forward to a template.

    void format(T...)(const char[] s, T args) {
        format(asRangeOfDchar(s), args);
    }

    void format(Range, T...)(Range r, T args) if(isAppropriateRange!Range) {
        // actual impl based on the range interface
        // and actually tbh I'd personally take another step
        // and collapse all these down even more.
    }

Then a whole bunch of conversions are done to match `const char[]` and the template is then working with that entry point instead of the whole plate.

This of course assumes isAppropriateRange is false for anything that isn't actually already a range. And I'm assuming string is not already a range. Otherwise you enter back into the hell of not only saying what you accept, but having to exclude things too.

So let me rant. I think it was actually a mistake for Phobos to UFCS shoe-horn in range functions on arrays too - this includes strings as well as int[] and such as well. Lots of new users ask why they can't do the same thing. And like Phobos took this opportunity to do silly things like autodecoding which we all hate now, but I don't think the freestanding ufcs range functions should exist at all. Just have the user fetch a range out of the container. Then they get in that habit with other containers too and it moves a bunch of ugly code out of every consuming function.

Heck the `asRange` thing itself might have a variety of overloads it forwards to.

    MyRange asRangeHelper(const char[] s) { return MyRange(s); }

    auto asRange(T)(T t) { /* generic stuff */ }
    auto asRange(T : const char[])(T t) { return asRangeHelper(t); } // let the language convert it in these specializations

and so on and so forth. This is a half-baked rant im sure you can destroy at will. But like I'm pretty sure if we did develop this it would be nicer overall than what we have now.
 The answer is because there isn't a good way to do it.
And it is possible the language could insert some magic to make it easier if we really put our thinking caps on.
May 07
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 7 May 2021 at 18:17:31 UTC, Adam D. Ruppe wrote:
 void format(T...)(const char[] s, T args) {
       format(asRangeOfDchar(s), args);
 }
oh i should have added of course you can do the wchar and dchar overloads here too. yeah yeah i know "DRY" but like it is a trivial forwarder, get over it.
May 07
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/7/21 2:17 PM, Adam D. Ruppe wrote:
 I think it was actually a mistake for Phobos to UFCS shoe-horn in range 
 functions on arrays too - this includes strings as well as int[] and 
 such as well.
The most common range BY FAR in all of D code is an array. The end result of something like you allude to would result in nearly all of phobos NOT working with arrays. Just a taste:

    int[] arr = genArray;
    arr.sort(); // fail.

I don't want to go to that place, ever.

-Steve
May 07
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 7 May 2021 at 18:44:26 UTC, Steven Schveighoffer wrote:
 The end result of something like you allude to would result in 
 nearly all of phobos NOT working with arrays.
    int[5] arr;
    arr.sort(); // fails, you need to use []

    Array!int arr;
    arr.sort(); // fails, you need to use []

Some random phobos functions special-case this to make it work, which is the real wtf, and those should be undone; just get the user to slice a static array. So I'd just make it all consistent.

But tbh I don't feel that strongly about it... except for string. string should no longer be a range. Delete its popFront overload and let the user pick byCodeUnit or byCodePoint or whatever. Just rip that band aid right off.

Just even for the others, even if the [] was deemed unacceptable, I don't love the ufcs solution. So many people try to do freestanding functions for other types, inspired by the phobos popFront... and isInputRange fails because phobos itself must import the ufcs module. Other new people do foo.empty and it fails because they didn't import the module.

So like even if the behavior remained the same as today, I'd like to define it a little differently. But meh, I don't wanna continue too far down this particular thing since it is the part of my rant I care the least about.
May 07
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 2:44 PM, Steven Schveighoffer wrote:
 On 5/7/21 2:17 PM, Adam D. Ruppe wrote:
 I think it was actually a mistake for Phobos to UFCS shoe-horn in 
 range functions on arrays too - this includes strings as well as int[] 
 and such as well.
The most common range BY FAR in all of D code is an array. The end result of something like you allude to would result in nearly all of phobos NOT working with arrays. Just a taste:

    int[] arr = genArray;
    arr.sort(); // fail.

I don't want to go to that place, ever.

-Steve
Yah, ranges are a generalization of arrays. It would be odd if the generalization of arrays didn't work when tried with arrays.
May 07
parent reply NonNull <non-null use.startmail.com> writes:
On Friday, 7 May 2021 at 20:53:08 UTC, Andrei Alexandrescu wrote:
 On 5/7/21 2:44 PM, Steven Schveighoffer wrote:
 On 5/7/21 2:17 PM, Adam D. Ruppe wrote:
 I think it was actually a mistake for Phobos to UFCS 
 shoe-horn in range functions on arrays too - this includes 
 strings as well as int[] and such as well.
The most common range BY FAR in all of D code is an array. The end result of something like you allude to would result in nearly all of phobos NOT working with arrays.
Yah, ranges are a generalization of arrays. It would be odd if the generalization of arrays didn't work when tried with arrays.
No. Ranges are not a generalization of arrays unless you ignore the most important feature of the notion of a Range. An array is a sequence of things in space: a spatial container (all values stored) that happens to be a sequence. A Range is a sequence of things in time. (Purist definition, often true in practice.) A spatial container can be /exploded/ into a sequence in time. And a sequence in time can be /accreted/ into a spatial container (whether it has sequence or not). Explode is a natural idea and could be defined for any spatial container, producing a Range from a spatial container, and specifically from an array. Making a distinction of spatial and temporal makes sense.
May 12
parent reply Paul Backus <snarwin gmail.com> writes:
On Wednesday, 12 May 2021 at 14:49:35 UTC, NonNull wrote:
 On Friday, 7 May 2021 at 20:53:08 UTC, Andrei Alexandrescu
 Yah, ranges are a generalization of arrays. It would be odd if 
 the generalization of arrays didn't work when tried with 
 arrays.
No. Ranges are not a generalization of arrays unless you ignore the most important feature of the notion of a Range. An array is a sequence of things in space: a spatial container (all values stored) that happens to be a sequence. A Range is a sequence of things in time. (Purist definition, often true in practice.)
Ranges are a generalization of arrays (or slices, if you prefer) in the same way that iterators are a generalization of pointers. In both cases, certain features of the specialized version are ignored or left out in the generalized version. As you've correctly pointed out, one of those ignored features is the array's layout in memory. A range *may* store all of its elements in memory, or it may not; as users of the range API, we are not supposed to know or care.
May 12
parent NonNull <non-null use.startmail.com> writes:
On Wednesday, 12 May 2021 at 15:08:46 UTC, Paul Backus wrote:
 On Wednesday, 12 May 2021 at 14:49:35 UTC, NonNull wrote:
 No. Ranges are not a generalization of arrays unless you 
 ignore the most important feature of the notion of a Range. An 
 array is a sequence of things in space: a spatial container 
 (all values stored) that happens to be a sequence. A Range is 
 a sequence of things in time. (Purist definition, often true 
 in practice.)
Ranges are a generalization of arrays (or slices, if you prefer) in the same way that iterators are a generalization of pointers. In both cases, certain features of the specialized version are ignored or left out in the generalized version. As you've correctly pointed out, one of those ignored features is the array's layout in memory. A range *may* store all of its elements in memory, or it may not; as users of the range API, we are not supposed to know or care.
This is the standard pattern of the interpretation of the meaning of Range. It is more concrete. I want the idea of range to escape its historical semantic origins. I am suggesting a different and cleaner interpretation of that meaning. One that draws a deliberate line between space and time as a means of motivating language design. Instead of regarding the psychological process of regarding a spatial data structure as a range as being the psychological process of simply ignoring other non-range features and just using range operations, I am suggesting a semantic hard line be drawn between the two. A range could be obtained by exploding a spatial data structure (array say) and regarded as a distinct entity. Concretely the latent temporal sequence of things taken from the spatial data structure (the derived range) could be regarded as semantically quite different and separate from that data structure. While some may consider this a distinction without a difference, it does nevertheless change how one might relate a range to a spatial data structure in a programming language. My view leads to an explicit explode operation of some kind on all occasions, whereas yours can munge together range stuff with other operations on spatial data structures, so that your spatial structure IS a range and abstraction is avoided. Moving away from the historical semantics to the semantics I suggest above and having that guide language design separates those concerns. The idea of /explode/ is a nice intuitive fundamental concept that is concealed and entangled in D right now. Things could be less baroque. Specifically, arrays would then be treated the same way as any other spatial data structure. They would not be ranges.
May 12
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 11:51 AM, Steven Schveighoffer wrote:
 On 5/7/21 11:30 AM, Andrei Alexandrescu wrote:
 On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
 If you mean we shouldn't support it (as an ambiguous case) in 
 *conversion* utilities (i.e. to/from string), then this makes some 
 sense. But it's also not straightforward. Sometimes you WANT to 
 convert from the enum to the base type. Sometimes you want to convert 
 to the enum name. Going backwards (string to enum), which one makes 
 more sense? It depends on context. It also doesn't help that a string 
 enum implicitly converts to a string. The language is going to 
 circumvent any policies Phobos has on that front.
Enums are poorly designed, but that's only a small part of the problem. The bigger problem is the corruption of a noble principle. We wanted to be as generic as possible, and indeed in the beginning that seemed not only possible, but also easy. I don't think there's any other language or library supporting different character widths with this little aggravation. Then this whole "be as generic as possible" became a slippery slope of inclusion. Allow enum strings. Allow alias this strings.
But an enum with base string type can be passed as a string. The PR in question is working around a limitation of the Phobos trait that says something derived from a string isn't really usable as a string (when it is).
Well you see here is the problem. An enum with base string can be coerced to a string, but is not a true subtype of string. This came to a head with ranges, too - you can pop off the head of a string and still have a string, but if you pop off the head of an enum string you get some enum value that is not present in the set of enum values. Concatenation has similar problems, e.g. s ~ s for enum strings yields string, not an enum string. (Weirdly s ~= s works...) So enum strings break ISA/Liskov. Alias this also does due to an overwhelming number of errors in its design and implementation.
 The problem I see is, when phobos says something isn't true, when it 
 really is, causes no end of confusion (*cough* autodecoding)
 
 static assert(!isSomeString!T);
 // yet...
 string s = someT;
This only shows that we have a baroque language that allows user-defined conversions from non-strings to strings. The code above is NO PROOF that T is supposed to be a string.
 How about no.

 User: "I have this enum string str and phobos won't consider it a 
 string. Help!"

 Another user: "Just use str.representation if you want to pass str 
 around as a string."
User: "OK, but when should I use representation? I already pass it around as a string and it works fine. Why can't phobos comprehend that, when the language has no problems with it?"
"When you want a string".
May 07
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/7/21 12:43 PM, Andrei Alexandrescu wrote:
 On 5/7/21 11:51 AM, Steven Schveighoffer wrote:
 On 5/7/21 11:30 AM, Andrei Alexandrescu wrote:
 User: "I have this enum string str and phobos won't consider it a 
 string. Help!"

 Another user: "Just use str.representation if you want to pass str 
 around as a string."
User: "OK, but when should I use representation? I already pass it around as a string and it works fine. Why can't phobos comprehend that, when the language has no problems with it?"
"When you want a string".
Sorry, let's jump out of the fake dialog here for a second. The problem I have is, you have a function like:

    foo(T)(T s) if (isSomeString!T)

The *intention* here is that, I want to NOT have to write:

    foo(string s) { impl }
    foo(wstring s) { impl }
    foo(dstring s) { impl }
    ... // etc with const, mutable

BUT, if I have an enum that converts to a string, then if I actually DID write all those, then it would compile. However, the template version does not. This is the confusion that a user and library author has.

I think the problem here is that the language doesn't give you a good way to express that. So we rely on template constraints that both can't exactly express that intention, and where the approximations create various template instantiations that cause strange problems (i.e. if you accept an enum that converts to string, it's still an enum inside the template). Whereas the language

I'm not suggesting any specific changes here, but I recognize there is a disconnect from what we *want* to express, and what the language provides.

-Steve
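The mismatch described in this post can be reproduced in a few lines. This is only a sketch: `Greeting`, `viaOverload` and `viaTemplate` are hypothetical names, and `isSomeString` is the `std.traits` trait under discussion.

```D
import std.traits : isSomeString;

enum Greeting : string { hello = "hello" }

// Plain overload: the argument is implicitly converted to string.
size_t viaOverload(string s) { return s.length; }

// Constrained template: T is deduced as Greeting and the constraint rejects it.
size_t viaTemplate(T)(T s) if (isSomeString!T) { return s.length; }

static assert( __traits(compiles, viaOverload(Greeting.hello)));
static assert(!__traits(compiles, viaTemplate(Greeting.hello)));
static assert( __traits(compiles, viaTemplate("hello")));

void main()
{
    assert(viaOverload(Greeting.hello) == 5);

    // Converting at the call site makes the template usable again.
    string s = Greeting.hello;
    assert(viaTemplate(s) == 5);
}
```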
May 07
next sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
 I think the problem here is that the language doesn't give you a good 
 way to express that. So we rely on template constraints that both can't 
 exactly express that intention, and where the approximations create 
 various template instantiations that cause strange problems (i.e. if you 
 accept an enum that converts to string, it's still an enum inside the 
 template). Whereas the language
I forgot to finish this thought, got interrupted. Whereas the language (with non-template parameters) does the matching and conversion simultaneously without needing special cases. -Steve
May 07
parent reply Daniel N <no public.email> writes:
On Friday, 7 May 2021 at 17:16:06 UTC, Steven Schveighoffer wrote:
 On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
 I think the problem here is that the language doesn't give you 
 a good way to express that. So we rely on template constraints 
 that both can't exactly express that intention, and where the 
 approximations create various template instantiations that 
 cause strange problems (i.e. if you accept an enum that 
 converts to string, it's still an enum inside the template). 
 Whereas the language
I forgot to finish this thought, got interrupted. Whereas the language (with non-template parameters) does the matching and conversion simultaneously without needing special cases. -Steve
What's wrong with this? void fun(T : string)(T t)
May 07
next sibling parent Adam D. Ruppe <destructionator gmail.com> writes:
On Friday, 7 May 2021 at 17:27:18 UTC, Daniel N wrote:
 What's wrong with this?

 void fun(T : string)(T t)
That doesn't convert to string. It allows it to compile because T *can* be converted to string and thus it is the closest specialization it can get, but it does NOT actually convert it.

----
import std.stdio;

enum Test : string { a = "foo" }

void test2(T:string)(T t) {
    pragma(msg, T); // Test, not string!
    writeln(t);
}

void main() {
    test2(Test.a);
}
-----
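A compile-time variant of the same check, plus one possible way to get at the base type, might look like the sketch below. `deducedAsString` and `asBase` are hypothetical helpers; `OriginalType` is the existing `std.traits` template, used here on the assumption that stripping the enum is what the caller wants.

```D
import std.traits : OriginalType;

enum Test : string { a = "foo" }

// The specialization accepts anything convertible to string,
// but T is still deduced as the argument's own type.
bool deducedAsString(T : string)(T t) { return is(T == string); }

static assert( deducedAsString("foo"));
static assert(!deducedAsString(Test.a));

// OriginalType recovers the base type of an enum.
static assert(is(OriginalType!Test == string));

// Hypothetical helper: hand the value on as its base type.
OriginalType!T asBase(T)(T t) if (is(T == enum)) { return t; }

static assert(is(typeof(asBase(Test.a)) == string));

void main()
{
    assert(asBase(Test.a) == "foo");
}
```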
May 07
prev sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 5/7/21 1:27 PM, Daniel N wrote:
 On Friday, 7 May 2021 at 17:16:06 UTC, Steven Schveighoffer wrote:
 On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
 I think the problem here is that the language doesn't give you a good 
 way to express that. So we rely on template constraints that both 
 can't exactly express that intention, and where the approximations 
 create various template instantiations that cause strange problems 
 (i.e. if you accept an enum that converts to string, it's still an 
 enum inside the template). Whereas the language
I forgot to finish this thought, got interrupted. Whereas the language (with non-template parameters) does the matching and conversion simultaneously without needing special cases.
What's wrong with this? void fun(T : string)(T t)
Because T is not a string. e.g. for a string-based enum, t.popFront won't work. -Steve
May 07
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
 The problem I have is, you have a function like:
 
 foo(T)(T s) if (isSomeString!T)
 
 The *intention* here is that, I want to NOT have to write:
 
 foo(string s) { impl }
 foo(wstring s) { impl }
 foo(dstring s) { impl }
 ... // etc with const, mutable
 
 BUT, if I have an enum that converts to a string, then if I actually DID 
 write all those, then it would compile. However, the template version 
 does not. This is the confusion that a user and library author has.
Of course. I understand that very well. But that's a minor confusion and inconvenience; people understand very well that e.g. this won't work:

    void foo(float);
    void foo(double);
    void main() { foo(1); }

The reason is slightly different but the point is the same: convertibility has its subtleties and programming languages comprehend small surprises. Supporting enum strings and alias this at the huge cost we incur now is definitely over two standard deviations away from what's reasonable.
 I think the problem here is that the language doesn't give you a good 
 way to express that. So we rely on template constraints that both can't 
 exactly express that intention, and where the approximations create 
 various template instantiations that cause strange problems (i.e. if you 
 accept an enum that converts to string, it's still an enum inside the 
 template). Whereas the language
 
 I'm not suggesting any specific changes here, but I recognize there is a 
 disconnect from what we *want* to express, and what the language provides.
That I am on board with.
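The `foo(1)` case from this post can be checked at compile time. The sketch below gives the two overloads empty bodies so that it links; everything else follows the post.

```D
void foo(float x) {}
void foo(double x) {}

// int converts implicitly to both float and double, and neither overload
// is a better match, so the call is rejected as ambiguous.
static assert(!__traits(compiles, foo(1)));

// Stating the intended type at the call site resolves it.
static assert(__traits(compiles, foo(1.0f)));
static assert(__traits(compiles, foo(1.0)));
static assert(__traits(compiles, foo(cast(double) 1)));

void main()
{
    foo(1.0);
}
```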
May 07
prev sibling parent Q. Schroll <qs.il.paperinik gmail.com> writes:
On Friday, 7 May 2021 at 17:05:08 UTC, Steven Schveighoffer wrote:
 The problem I have is, you have a function like:
 ```D
 auto foo(T)(T s) if (isSomeString!T) { impl }
 ```
 The *intention* here is that, I want to NOT have to write:
 ```D
 auto foo(string s) { impl }
 auto foo(wstring s) { impl }
 auto foo(dstring s) { impl }
 ... // etc with const, mutable
 ```
 BUT, if I have an enum that converts to a string, then if I 
 actually DID write all those, then it would compile. However, 
 the template version does not. This is the confusion that a 
 user and library author has.
Maybe this is special casing here, but if you have a finite list of types you want to support, it might be easier to add an `AliasSeq` of all string types to `std.traits` or so and use
```D
static foreach (String; Strings)
    auto foo(String s) { impl }
```
Looks generic, but actually isn't. The implementation bloat is a different beast though.
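Spelled out as a complete program, that suggestion could look like the following sketch. The `Strings` alias is defined locally here because, as far as this thread says, no such list exists in `std.traits`; `Name` and `demo` are hypothetical.

```D
import std.meta : AliasSeq;

alias Strings = AliasSeq!(string, wstring, dstring);

// One ordinary (non-template) overload per listed type.
static foreach (String; Strings)
    size_t foo(String s) { return s.length; }

enum Name : string { abc = "abc" }

size_t demo()
{
    Name n = Name.abc;
    return foo(n); // n converts implicitly to string
}

static assert(demo() == 3);
static assert(foo("abcd"w) == 4);

void main()
{
    assert(demo() == 3);
}
```

Because each `foo` is a plain function, the enum argument goes through normal overload resolution and arrives as a `string`, at the cost of compiling one body per listed type.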
May 07
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Friday, 7 May 2021 at 16:43:20 UTC, Andrei Alexandrescu wrote:
 Well you see here is the problem. An enum with base string can 
 be coerced to a string, but is not a true subtype of string. 
 This came to a head with ranges, too - you can pop off the head 
 of a string still have a string, but if you pop off the head of 
 an enum string you get some enum value that is not present in 
 the set of enum values. Concatenation has similar problems, 
 e.g. s ~ s for enum strings yields string, not an enum string. 
 (Weirdly s ~= s works...)
Popping the head out of an enum value ought to be a string, not that enum's value. I don't really see where the problem is here, this is subtyping 101. I raised a few times in the past that there were unsound operations performed (as in "Weirdly s ~= s works...") but I don't think turning compiler bugs into standard library policies is going to lead to better tomorrows.
May 09
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/9/21 8:57 PM, deadalnix wrote:
 On Friday, 7 May 2021 at 16:43:20 UTC, Andrei Alexandrescu wrote:
 Well you see here is the problem. An enum with base string can be 
 coerced to a string, but is not a true subtype of string. This came to 
 a head with ranges, too - you can pop off the head of a string still 
 have a string, but if you pop off the head of an enum string you get 
 some enum value that is not present in the set of enum values. 
 Concatenation has similar problems, e.g. s ~ s for enum strings yields 
 string, not an enum string. (Weirdly s ~= s works...)
Popping the head out of an enum value ought to be a string, not that enum's value. I don't really see where the problem is here, this is subtyping 101.
So you have a range r of type T.

You call r.popFront().

Obviously the type of r should stay the same because in D variables don't change type.

So... what gives, young Padawan?

No, this is not subtyping 101.
May 09
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D 
 variables don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.
If you have a range of T, then you got to return a T. I'm not sure what the problem is here. Do you have a concrete example? All I can think of are things like slicing and the like, and they should obviously return a string, not a T.
May 10
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 10 May 2021 at 12:19:07 UTC, deadalnix wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu 
 wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D 
 variables don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.
If you have a range of T, then you got to return a T. I'm not sure what the problem is here. Do you have a concrete example? All I can think of are things like slicing and the like, and they should obviously return a string, not a T.
popFront doesn't return a value, it mutates. So `r` before popFront and `r` after popFront must be the same type, because they are the same variable. If popFront for a string enum is `r = r[1 .. $]`, and typeof(r[1 .. $]) != typeof(r), then it doesn't work, and string enums can't be ranges (from which it follows that they are not Liskov-substitutable for strings).
May 10
parent reply deadalnix <deadalnix gmail.com> writes:
On Monday, 10 May 2021 at 13:30:52 UTC, Paul Backus wrote:
 popFront doesn't return a value, it mutates. So `r` before 
 popFront and `r` after popFront must be the same type, because 
 they are the same variable.

 If popFront for a string enum is `r = r[1 .. $]`, and 
 typeof(r[1 .. $]) != typeof(r), then it doesn't work, and 
 string enums can't be ranges (from which it follows that they 
 are not Liskov-substitutable for strings).
r = r[1 .. $] is an error unless r actually is a string. You cannot mutate an enum value and have it stay an enum.

If you think that invalidates the LSP, I'm afraid there is a big misunderstanding about the LSP. Not all operations on a subtype have to return said subtype. It is made clearer if you consider the slicing operation as a member function on an object instead - as it seems classes and inheritance are the only way OOP is understood these days.

    class A {
        A slice(int start, int end) { ... }
    }

    class B : A {}

Where is it implied that B's version of the slice operation must return an A? Nowhere, the LSP absolutely doesn't mandate that. It mandates that you can pass a B to something that expects an A, and that thing will behave the way you'd expect. And it does!

If your code needs an A, then you mark it as accepting an A as input. If I have a B and want to pass it to your code, I can too, transparently. You do not need to even know about the existence of B when you wrote your code. This is what the LSP is at its core.

Back to our string example, the code should accept string (A), with zero knowledge of the existence of any enum string (B). You should be able to pass a B to that code and have everything work as expected.

The argument that the enum string is not a subtype because it breaks the LSP is nonsense; this in fact demonstrates that the type system is unsound and breaks the LSP. And this is why people end up desperately trying to re-implement it in libraries, which results in a ton more work and complexity for everybody involved.
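The class analogy in this post compiles as written once `slice` is given a body. In the sketch below the body (`return this;`) and the `usesA` function are placeholders added for illustration.

```D
class A
{
    A slice(int start, int end) { return this; } // placeholder body
}

class B : A {}

// Code written against A, with no knowledge of B.
bool usesA(A a) { return a.slice(0, 1) !is null; }

static assert(is(B : A));                            // a B is accepted where an A is expected
static assert(is(typeof((new B).slice(0, 1)) == A)); // yet slicing a B yields an A

void main()
{
    assert(usesA(new B));
}
```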
May 10
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/10/21 5:55 PM, deadalnix wrote:
 On Monday, 10 May 2021 at 13:30:52 UTC, Paul Backus wrote:
 popFront doesn't return a value, it mutates. So `r` before popFront 
 and `r` after popFront must be the same type, because they are the 
 same variable.

 If popFront for a string enum is `r = r[1 .. $]`, and typeof(r[1 .. 
 $]) != typeof(r), then it doesn't work, and string enums can't be 
 ranges (from which it follows that they are not Liskov-substitutable 
 for strings).
r = r[1 .. $] is an error unless r actually is a string. You cannot mutate an enum value and have it stay an enum.

If you think that invalidates the LSP, I'm afraid there is a big misunderstanding about the LSP. Not all operations on a subtype have to return said subtype. It is made clearer if you consider the slicing operation as a member function on an object instead - as it seems classes and inheritance are the only way OOP is understood these days.

    class A {
        A slice(int start, int end) { ... }
    }

    class B : A {}

Where is it implied that B's version of the slice operation must return an A?
If we move the goalposts we can with certain ease create the illusion that a lot of things are possible and even easy. This works very well in forum discussions where all that is needed is eloquence and the perseverance to answer every post with one that just slightly moves the discussion around so it appears to have answers to every objection and have the last word on any topic. This is exactly what happens here - half of your points contradict the other half, but never in the same post and the appearance is you seem to have easy answers to everything.

In the initial days of ranges we actually considered that popFront() would actually be tail() that returns by value. So instead of today's form (given a range r):

    for (; !r.empty; r.popFront) {
        ... use r.front ...
    }

we'd have had:

    for (; !r.empty; r = r.tail) {
        ... use r.front ...
    }

This doesn't change things much (and wouldn't improve the situation with enums) but does open up the possibility - what if r.tail() actually returns a type different from r? In all interesting cases that means r = r.tail wouldn't work anymore, which complicates range algorithms A LOT. They'd need to use recursion instead of iteration:

    void someRangeFunction(R)(R range) {
        if (range.empty) {
            ... empty case ...
        } else {
            ... do some work for range.front ...
            return someRangeFunction(range.tail);
        }
    }

(I should note that that's actually of interest for immutable ranges, for the simple reason they aren't assignable.)

At any rate, we decided this would complicate everything in Phobos way too much (and I think that was a correct prediction) so we chose to have popFront() mutate the current range.
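The two styles contrasted in this post can be put side by side. In the sketch below `tail` is a hypothetical helper built on top of `popFront` purely for illustration (Phobos ranges have no such primitive), and `sumIter` and `sumRec` are made-up names.

```D
import std.range.primitives : empty, front, popFront;

// Hypothetical by-value tail(), built on popFront for illustration only.
R tail(R)(R r)
{
    r.popFront();
    return r;
}

// Iterative style: relies on r keeping its type across popFront.
int sumIter(R)(R r)
{
    int total;
    for (; !r.empty; r.popFront())
        total += r.front;
    return total;
}

// Recursive style: never reassigns r, so tail() could return another type.
int sumRec(R)(R r)
{
    return r.empty ? 0 : r.front + sumRec(r.tail);
}

static assert(sumIter([1, 2, 3]) == 6);
static assert(sumRec([1, 2, 3]) == 6);

void main()
{
    assert(sumIter([4, 5]) == sumRec([4, 5]));
}
```

Here `tail` returns the same type it receives, so both versions work; the recursive form is the only one that would survive if `tail` returned something else.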
May 11
parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 15:33:45 UTC, Andrei Alexandrescu 
wrote:
 If we move the goalposts we can with certain ease create the 
 illusion that a lot of things are possible and even easy.

 [...]

 At any rate, we decided this would complicate everything in 
 Phobos way too much (and I think that was a correct prediction) 
 so we chose to have popFront() mutate the current range.
I don't think that any of what you wrote is incorrect, and these are even reasonable tradeoffs as far as I can tell.

I however would like to remind everyone where this whole thing starts from:

format!SomeEnumString(...) is expected to work for users.

Not that SomeEnumString is a full fledged range or anything, simply that you can pass it down to phobos, or anything else for that matter, in place where a string is expected.

This is a reasonable expectation. It is also a reasonable expectation that this shouldn't require a ton of scaffolding to work, in phobos or elsewhere.

Therefore, the fact that phobos required scaffolding to make this work is indicative that there is a deeper problem. Focusing on finding what that deeper problem is and fixing it seems like a healthier path forward than simply pretending there is no problem and pushing it all on the users.

In this case, it was noted here ( https://forum.dlang.org/post/umndraexmrxiyrmfpcyo forum.dlang.org ) that the root cause of the problem might be that there is a conflation between the container and the range. I think this is a reasonable hypothesis. Having two things trying to do one thing is a very typical source of such problems.
May 11
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 12:26 PM, deadalnix wrote:
 I however would like to remind where this whole thing starts from:
 
 format!SomeEnumString(...) is expected to work for users.
Reasonable, though I should add that it's a decision made by the author of the format() API.
 Not that SomeEnumString is a full fledged range or anything, simply that 
 you can pass it down to phobos, or anything else for that matter, in 
 place where a string is expected.
Reasonable, though again a matter of API definition. Would you expect this to work?

    float sin(float x);
    double sin(double x);
    real sin(real x);
    ...
    auto x = sin(1);

Shouldn't that work? Not that int is a full fledged floating point number or anything, simply that you can pass it down to phobos, or anything else for that matter, in place where a floating point number is expected.

Oh, but wait, it's the templates. Great.

    T sin(T)(T x) if (isFloatingPoint!T);
    ...
    auto x = sin(1);

Shouldn't that work? Not that int is a full fledged floating point number or anything, simply that you can pass it down to phobos, or anything else for that matter, in place where a floating point number is expected.

Well an argument can be made that it should work, or the API designer can wisely choose to NOT yield true from isFloatingPoint!int.

And if we explore this madness further, we get to an enormity just as awful as StringTypeOf:

    template FloatingPointTypeOf(T) {
        static if (isIntegral!T) {
            alias FloatingPointTypeOf = T;
        } else ...
    }

And then whenever we need a floating point type we use is(FloatingPointTypeOf!T) like a bunch of dimwits.

What use case does that help? Who is helped by that? Someone who can't bring themselves to convert whatever they have to double prior to using the standard library.

Arguably not a good design.
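The alternative this post argues for, a strict constraint with the conversion left to the caller, can be sketched as follows. `half` is a hypothetical stand-in for a function like `sin`; `isFloatingPoint` is the `std.traits` trait.

```D
import std.traits : isFloatingPoint;

// The constraint admits only built-in floating point types.
T half(T)(T x) if (isFloatingPoint!T) { return x / 2; }

static assert(!__traits(compiles, half(1)));    // int is rejected
static assert(half(cast(double) 1) == 0.5);     // the caller converts first
static assert(is(typeof(half(1.0f)) == float)); // the argument type is preserved

void main()
{
    assert(half(3.0) == 1.5);
}
```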
May 11
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 12:57 PM, Andrei Alexandrescu wrote:
 
 template FloatingPointTypeOf(T) {
      static if (isIntegral!T) {
          alias FloatingPointTypeOf = T;
      } else ...
 }
Correx:

    template FloatingPointTypeOf(T) {
        static if (isIntegral!T) {
            alias FloatingPointTypeOf = double;
        } else ...
    }
May 11
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 16:57:13 UTC, Andrei Alexandrescu 
wrote:
 Reasonable, though again a matter of API definition. Would you 
 expect this to work?

 float sin(float x);
 double sin(double x);
 real sin(real x);
 ...
 auto x = sin(1);

 Shouldn't that work? Not that int is a full fledged floating 
 point number or anything, simply that you can pass it down to 
 phobos, or anything else for that matter, in place where a 
 floating point number is expected.
It's debatable. There are many languages out there where it doesn't.

I think your case here is disingenuous, because an int is not a special kind of float. We are explicitly outside of the scope of the argument being made to begin with. Whatever conclusion we reach using int and float would have no bearing on what should happen for string and SomeEnumString.

However, in D, it is possible to do:

    enum SomeEnumInt : int;

This is for instance used in std.encoding. I'm not sure if this works with float or not, but assuming that it does, then this absolutely and unambiguously works:

    enum SomeEnumFloat : float;
    SomeEnumFloat f = ...;
    auto x = sin(f);

Here, x would have type float, based on `float sin(float x)`.
 Well an argument can be made that it should work, or the API 
 designer can wisely choose to NOT yield true from 
 isFloatingPoint!int.
An argument could be made, however, this is not the argument I am making, so I don't really see the point of bringing this up.
 And if we explore this madness further, we get to an enormity 
 just as awful as StringTypeOf:

 template FloatingPointTypeOf(T) {
     static if (isIntegral!T) {
         alias FloatingPointTypeOf = T;
     } else ...
 }

 And then whenever we need a floating point type we use 
 is(FloatingPointTypeOf!T) like a bunch of dimwits.

 What use case does that helps? Who is helped by that? Someone 
 who can't bring themselves to convert whatever they have to 
 double prior to using the standard library.

 Arguably not a good design.
This is indeed not a good design, but also isn't really required if the places requiring a float can consistently accept SomeEnumFloat, because in this case, it turtles transparently all the way down.
May 11
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/11/2021 12:14 PM, deadalnix wrote:
 I think your case here is disingenuous, because an int is not a special kind
of 
 float.
D has no notion of a "special kind of type". It only has a notion of "implicitly convertible".

* An int is implicitly convertible to a float.
* An enum is implicitly convertible to its base type.

The two *must* behave the same way, or the language falls apart with hackish special cases that will never work in a predictable manner.

One could design a language with two kinds of conversions:

1. is-a-special-case-of
2. is-implicitly-convertible-to

but D isn't it.
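The conversions listed above can be checked with `is(From : To)`, which tests implicit convertibility. This is only a sketch; `Color`, `A` and `B` are hypothetical types introduced for the example.

```d
enum Color : string { red = "red" }   // hypothetical enum with a string base type
class A {}
class B : A {}

static assert(is(int : float));       // int -> float
static assert(is(Color : string));    // enum -> its base type
static assert(is(B : A));             // derived class -> base class

// None of these conversions is implicit in the other direction.
static assert(!is(float : int));
static assert(!is(string : Color));
static assert(!is(A : B));

void main() {}
```

All three pass the same `is(From : To)` test, which is the sense in which the language sees them uniformly; whether they are also equivalent in every other respect is what the rest of the thread argues about.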
May 11
parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 19:37:55 UTC, Walter Bright wrote:
 On 5/11/2021 12:14 PM, deadalnix wrote:
 I think your case here is disingenuous, because an int is not 
 a special kind of float.
 D has no notion of a "special kind of type". It only has a
 notion of "implicitly convertible".

 * An int is implicitly convertible to a float.
 * An enum is implicitly convertible to its base type.

 The two *must* behave the same way, or the language falls apart
 with hackish special cases that will never work in a predictable
 manner.

 One could design a language with two kinds of conversions:

 1. is-a-special-case-of
 2. is-implicitly-convertible-to

 but D isn't it.
Except, it is. D has numerous instances of both already and pretending it doesn't really isn't going to lead anywhere useful. And this very thread is indeed proof that "the language falls apart with hackish special cases that will never work in a predictable manner."

The fact is that you can't get rid of 1. and support OOP, because polymorphism is a key ingredient of OOP. And we even go as far as to talk about some of the metaprogramming techniques in D as being compile time polymorphism, so this is clearly a road we want to embark on.

The alternative is to go full functional on these things, and, as Andrei explained with the tail example, this is an option that works as well, but you have to write everything in functional style, which makes some code harder to write. Personally, I'm not interested in D going full functional, because I appreciate that different ideas are better expressed in different paradigms. But I understand that it means that we must have 1.

Now, do we need 2.? Strictly speaking, we do not. We could just say that string float conversion and vice versa must be explicit. We can remove alias this, and whatever other feature of the language does implicit conversion. I'm actually confident that in some cases, that would be a win, but also that we are too far gone to realistically be able to remove 2.

So we have both, we need to live with both, and make sensible decisions based on that. Pretending that we don't have both only leads to the guarantee that we'll make more bad decisions on that front in the future.
May 11
next sibling parent 12345swordy <alexanderheistermann gmail.com> writes:
On Tuesday, 11 May 2021 at 19:56:05 UTC, deadalnix wrote:
 On Tuesday, 11 May 2021 at 19:37:55 UTC, Walter Bright wrote:
 [...]
Except, it is. D has numerous instance of both already and pretending it doesn't really isn't going to lead anywhere useful. [...]
Remove alias this support for classes and replace it with compile time default interface methods. -Alex
May 11
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 5/11/2021 12:56 PM, deadalnix wrote:
 The fact is that you can't get rid of 1. and support OOP, because polymorphism 
 is a key ingredient of OOP.
Converting a derived class reference to a base class reference is an "implicitly convert" operation, not a special-kind-of conversion.
May 11
parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 00:22:56 UTC, Walter Bright wrote:
 On 5/11/2021 12:56 PM, deadalnix wrote:
 The fact is that you can't get rid of 1. and support OOP, 
 because polymorphism is a key ingredient of OOP.
Converting a derived class reference to a base class reference is an "implicitly convert" operation, not a special-kind-of conversion.
That is trivially demonstrably false. Consider:

    class A {}
    class B : A {}

B function() implicitly converts to A function()

But byte function() doesn't implicitly convert to int function()

Clearly, the implicit conversion from byte to int is of a different nature than the one from B to A, and one doesn't have to dig very deep to find these differences.

Now, mind you, this is not a problem. At all. After all, B is a subtype of A, while byte is not a subtype of int. There are different kinds of implicit conversions. This is perfectly sound and required if D wants to have implicit conversion of things which aren't subtypes of each other. There is no way around it.

Let's just not pretend it's the same, because it is from these erroneous assumptions that bad design grows.
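The function pointer claim can be written down as compile-time checks. This is a sketch based on the behaviour described in this post; `makeB` is a hypothetical helper added so that there is something to point at.

```d
class A {}
class B : A {}

B makeB() { return new B; }   // hypothetical helper

// Covariant return type: accepted, because B is a subtype of A.
static assert( is(B function() : A function()));

// byte converts to int as a value, but the function pointer type does not follow.
static assert( is(byte : int));
static assert(!is(byte function() : int function()));

void main()
{
    A function() fp = &makeB;   // a B-returning function used where an A-returning one is expected
    A a = fp();
    assert(cast(B) a !is null); // the object is still a B
}
```

Both `B : A` and `byte : int` pass the plain convertibility test, yet only the first one carries over to the function pointer type.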
May 11
prev sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Monday, 10 May 2021 at 21:55:54 UTC, deadalnix wrote:
 If you think that invalidate the LSP, I'm afraid there is a big 
 misunderstanding about the LSP. Not all operation on a subtype 
 have to return said subtype. It is made clearer if you consider 
 the slicing operationa s a member function on an object instead 
 - as I seems classes and inheritance is the only way OPP is 
 understood these days.

 class A {
    A slice(int start, int end) { ... }
 }

 class B : A {}

 Where is it implied that B's version of the slice operation 
 must return an A? Nowhere, the LSP absolutely doesn't mandate 
 that. It mandate that you can pass a B to something that 
 expects an A, and that thing will behave the way you'd expect.

 And it does!

 If your code needs an A, then you mark it as accepting an A as 
 input. If I have a B and want to pass it to your code, I can 
 too, transparently. You do not need to even know about the 
 existence of B when your wrote your code. This is what the LSP 
 is at its core.

 Back to our string example, the code should accept string (A), 
 with zero knowledge of the existence of any enum string (B). 
 You should be able to pass a B to that code and have everything 
 work as expected.
I concede the points that enum strings do not violate the LSP, and that they are subtypes of string. You're right, and I was wrong.

The point I should have made is that, at least in D, the LSP is not universal. There are situations where it simply does not apply. In particular, it does not guarantee that a substitution which changes the arguments used to instantiate a template will succeed; e.g.,

    class A { int x; }
    class B : A { int y; }

    void example(T)(T obj)
    {
        static assert(!__traits(hasMember, T, "y"));
    }

`example(new A)` will compile, but `example(new B)` will not--because they are not actually calling the same function. One calls `example!A` and the other calls `example!B`.

This is an unavoidable consequence of the expressive power of D's templates: without specific knowledge about `example`'s implementation, we cannot guarantee anything about the relationship between `example!A` and `example!B`.

All of which is to say, the fact that you can pass a string as an argument to a template does not *necessarily* imply that you can pass an enum string as an argument to the same template. That `format` handles them differently does not "fly in the face of Liskov's substitution principle" [1], any more than my example above does.

[1] https://forum.dlang.org/post/fnibsejuozasspsggxie forum.dlang.org
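A variant of the same idea that compiles for both types may make the point easier to try out. It is only a sketch: `describe` is a hypothetical function, and it uses `static if` instead of `static assert` so that both instantiations build and their difference shows up at run time.

```d
class A { int x; }
class B : A { int y; }

// One template, but each T produces a separate function with its own body.
string describe(T)(T obj)
{
    static if (__traits(hasMember, T, "y"))
        return "has y";
    else
        return "no y";
}

void main()
{
    assert(describe(new A) == "no y");   // instantiates describe!A
    assert(describe(new B) == "has y");  // instantiates describe!B
}
```

Nothing ties the body of `describe!B` to the body of `describe!A`, so substituting a B for an A at the call site changes which function is called.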
May 12
next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 22:00:57 UTC, Paul Backus wrote:
 I concede the points that enum strings do not violate the LSP, 
 and that they are subtypes of string. You're right, and I was 
 wrong.
Thanks.
 The point I should have made is that, at least in D, the LSP is 
 not universal. There are situations where it simply does not 
 apply. In particular, it does not guarantee that a substitution 
 which changes the arguments used to instantiate a template will 
 succeed; e.g.,

 [...]

 All of which is to say, the fact that you can pass a string as 
 an argument to a template does not *necessarily* imply that you 
 can pass an enum string as an argument to the same template. 
 That `format` handles them differently does not "fly in the 
 face of Liskov's substitution principle" [1], any more than my 
 example above does.

 [1] 
 https://forum.dlang.org/post/fnibsejuozasspsggxie forum.dlang.org
That is true, and there are definitely cases where it is unavoidable. However, I don't think format fits that bill, because format does expect a string, not any random type.

What I'm getting at is a bit complicated to express clearly, because types are effectively also "values" that you can pass around at compile time, but let me try. We should reasonably expect the LSP to work when what is passed down is the value of the enum, but not when it is its type - which, in fact, isn't too surprising because the type itself isn't subject to the LSP.

Consider:

    class A {}
    class B : A {}

    // We should expect the LSP to hold true here, because the value
    // is the only argument passed down to foo.
    void foo(A a);

    // There is no expectation that bar(new A) and bar(new B) behave
    // consistently, because not only the value is passed down, but
    // also the type. While we expect passing down the value to
    // respect the LSP, no such expectation can exist for the type.
    void bar(T)(T t);

So in the second example, while we expect the runtime parameter `t` to conform to the LSP, we do not expect the compile time parameter `T` to do so. However, if we do not change the value of `T` but pass a B down to `t`, then we should get back to a situation where the LSP is respected. For instance:

    // We expect this to be well behaved when it comes to the LSP,
    // vs say bar(new A()), because the only change happened to the
    // value parameter, which is supposed to uphold the LSP.
    bar!A(new B());

So far, so good, I don't think this is too controversial, even though it is confusing to express that concept clearly.

Now, with enum string, there is an interesting twist, because they can be passed at compile time too. In theory, that should not change anything when it comes to the LSP, but in practice, it seems like it does, which is IMO where the root of the problem is.

Consider:

    string format(string S, A...)(A args);

While S is a compile time parameter, it is not a type parameter, but a value parameter. In that case, it is expected as per the LSP that I can pass down string, or any subtype of strings as the first compile time parameter of format, and this ought to work as expected.
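The difference between changing the type argument and changing only the value can be sketched as follows. The body given to `bar` here is an assumption made for the demonstration: it returns the name of `T` so that the chosen instantiation is observable.

```d
class A {}
class B : A {}

// Hypothetical body: report which T this instantiation was built for.
string bar(T)(T t) { return T.stringof; }

void main()
{
    assert(bar(new A) == "A");
    assert(bar(new B) == "B");      // T is inferred as B: a different instantiation
    assert(bar!A(new B) == "A");    // T is fixed to A: only the runtime value changed
}
```

With `T` pinned to `A`, passing a `B` behaves like passing it to an ordinary function taking `A`.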
May 12
parent reply Paul Backus <snarwin gmail.com> writes:
On Wednesday, 12 May 2021 at 23:08:24 UTC, deadalnix wrote:
 Now, with enum string, there is an interesting twist, because 
 they can be passed at compile time too. in theory, that should 
 not change anything when it comes to the LSP, but in practice, 
 it seems like it does, which is IMO where the root of the 
 problem is.

 Consider:

 string format(string S, A...)(A args);

 While S is a compile time parameter, it is not a type 
 parameter, but a value parameter. In that case, it is expected 
 as per the LSP that I can pass down string, or any subtype of 
 strings as the first compile time parameter of format, and this 
 ought to work as expected.
This *does* work as expected: https://run.dlang.io/is/Ru9phk The issue with `format` is that it takes an alias parameter, not a value parameter--and the reason it does *that* is to support string, wstring, and dstring with a single overload.
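The contrast between the two kinds of parameter can be sketched without touching `format` itself. `byValue`, `byAlias` and `Fmt` are hypothetical names; the constraint on `byAlias` mirrors the `isSomeString!(typeof(fmt))` check mentioned in this thread, and is an assumption about the shape of the real signature rather than a copy of it.

```d
import std.traits : isSomeString;

enum Fmt : string { greeting = "hello" }   // hypothetical enum

// Value parameter: the argument is converted to string when the template is instantiated.
string byValue(string s)() { return s; }

// Alias parameter: the argument keeps its own type, so the constraint sees the enum type.
string byAlias(alias s)() if (isSomeString!(typeof(s))) { return s; }

static assert(!isSomeString!Fmt);
static assert( __traits(compiles, byValue!(Fmt.greeting)()));
static assert( __traits(compiles, byAlias!"hello"()));
static assert(!__traits(compiles, byAlias!(Fmt.greeting)()));

void main()
{
    assert(byValue!(Fmt.greeting)() == "hello");
}
```

The value parameter fixes the type up front and lets the usual implicit conversion apply, while the alias parameter defers the decision to a trait that rejects enum types.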
May 12
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 23:31:21 UTC, Paul Backus wrote:
 This *does* work as expected: https://run.dlang.io/is/Ru9phk

 The issue with `format` is that it takes an alias parameter, 
 not a value parameter--and the reason it does *that* is to 
 support string, wstring, and dstring with a single overload.
Yes, so we are getting at the root of this. I know these things work, this is why I stated that SomeEnumString is a subtype of string to begin with, it has all the properties. If that wasn't working, then I would have been mistaken when making such assertions.

It is working in the simple case, it is expected to work from the caller's standpoint due to the LSP, but it doesn't work in practice due to some obscure implementation detail that is of little concern to the user.

Pushing this on the user is not the way to go.

If the library writer desires to bundle string/dstring/wstring in the same implementation, this doesn't change the fact that it ought to work with subtypes. Choosing to break this is what "flies in the face of the LSP".

I would also like to see people think about what makes respecting the LSP challenging in such a case, and see what can be done at a systemic level. It's kind of a bummer that the path of least resistance is to break the LSP when going for more genericity in another dimension.
May 12
parent reply Paul Backus <snarwin gmail.com> writes:
On Wednesday, 12 May 2021 at 23:42:11 UTC, deadalnix wrote:
 It is working in the simple case, it is expected to work from 
 the caller's standpoint due to the LSP, but it doesn't work in 
 practice due to some obscure implementation detail that is of 
 little concern to the user.

 Pushing this on the user is not the way to go.

 If the library writer desire to bundle string/dstring/wstring 
 in the same implementation, this doesn't change the fact that 
 it ought to work with subtypes. Choosing to break this is what 
 "flies in the face of the LSP".
Well, no, it doesn't--because, again, the LSP doesn't apply here in the first place, and never has.

Flies in the face of user expectations, perhaps--though even then, if the user looks at the documentation and sees `isSomeString!(typeof(fmt))`, is it really reasonable for them to expect that a non-string type will be accepted?

I think it's a reasonable API design decision to support any type that implicitly converts to a string type, but it's not the *only* reasonable decision, and we ought to acknowledge the costs as well as the benefits. Personally, my inclination is to err on the side of making the standard library a little more complex so that user code can be simpler, but Andrei makes a convincing argument that this tendency has gotten us into trouble before [1].

How do we decide where to draw the line? There has to be some principle here beyond just "users expect it" and "respect the LSP."

[1] https://forum.dlang.org/thread/q6plhj$1l9$1 digitalmars.com
 I would also like to see people think what make respecting the 
 LSP challenging in such case, and see what can be done at a 
 systemic level. It's kind of a bummer that the path of least 
 resistance is to break the LSP when going for more genericity 
 in another dimension.
IMO this is all downstream of D's choice to use untyped templates as opposed to typed generics (a tradeoff that goes all the way back to Lisp vs. ML). It's a fun thought experiment to imagine a version of D that took the other path, but there's not much we can do about it now.
May 12
parent deadalnix <deadalnix gmail.com> writes:
On Thursday, 13 May 2021 at 01:03:19 UTC, Paul Backus wrote:
 On Wednesday, 12 May 2021 at 23:42:11 UTC, deadalnix wrote:
 It is working in the simple case, it is expected to work from 
 the caller's standpoint due to the LSP, but it doesn't work in 
 practice due to some obscure implementation detail that is of 
 little concern to the user.

 Pushing this on the user is not the way to go.

 If the library writer desire to bundle string/dstring/wstring 
 in the same implementation, this doesn't change the fact that 
 it ought to work with subtypes. Choosing to break this is what 
 "flies in the face of the LSP".
Well, no, it doesn't--because, again, the LSP doesn't apply here in the first place, and never has. Flies in the face of user expectations, perhaps--though even then, if the user looks at the documentation and see `isSomeString!(typeof(fmt))`, is it really reasonable for them to expect that a non-string type will be accepted? I think it's a reasonable API design decision to support any type that implicitly converts to a string type, but it's not the *only* reasonable decision, and we ought to acknowledge the costs as well as the benefits. Personally, my inclination is to err on the side of making the standard library a little more complex so that user code can be simpler, but Andrei makes a convincing argument that this tendency has gotten us into trouble before [1]. How do we decide where to draw the line? There has to be some principle here beyond just "users expect it" and "respect the LSP."
While what you say is correct, I'm not convinced it is right.

We established before that effectively, we should expect the LSP to hold when values are passed down, but not when types are. Which I think we both agree is the reasonable thing to do here, because B being a subtype of A doesn't say anything about meta_typeof(B) being a subtype of meta_typeof(A), and therefore there is no expectation that the LSP holds.

So it is correct to assert that if format takes the type as a parameter, then there is no expectation that the LSP holds. It is also correct to say that the documentation describes things accurately.

But I strongly disagree with the claim that it is right. To use an analogy, I could make a car where the gas and brake pedals are swapped, and explain as much in the user manual, yet I fully expect people would crash such cars at a higher rate than the alternative.

In the case of format, we need to ask ourselves what the user expects: to pass a value down, or to pass a type (plus possibly a value) down? Because if it is the first, then it is reasonable from the user's standpoint that the LSP works, and if it is the second, then there isn't such an expectation.

The fact that we see people trying to do format!SomeEnumString, but not something like format!42, provides a good answer to that question. Format's parameter is expected to be a string, not any random type. And if that is the case, then it is reasonable to expect the LSP to hold.

Now, the matter of cost is an interesting one. But I argue that doing what the user expects ought to be cheap, if not the cheapest option available. This is simply the difference between a language that helps its users and a language that gets in the way. So if the cost is high, then we need to consider this high cost a serious problem to solve.
May 13
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/12/21 6:00 PM, Paul Backus wrote:
 On Monday, 10 May 2021 at 21:55:54 UTC, deadalnix wrote:
 If you think that invalidate the LSP, I'm afraid there is a big 
 misunderstanding about the LSP. Not all operation on a subtype have to 
 return said subtype. It is made clearer if you consider the slicing 
 operationa s a member function on an object instead - as I seems 
 classes and inheritance is the only way OPP is understood these days.

 class A {
    A slice(int start, int end) { ... }
 }

 class B : A {}

 Where is it implied that B's version of the slice operation must 
 return an A? Nowhere, the LSP absolutely doesn't mandate that. It 
 mandate that you can pass a B to something that expects an A, and that 
 thing will behave the way you'd expect.

 And it does!

 If your code needs an A, then you mark it as accepting an A as input. 
 If I have a B and want to pass it to your code, I can too, 
 transparently. You do not need to even know about the existence of B 
 when your wrote your code. This is what the LSP is at its core.

 Back to our string example, the code should accept string (A), with 
 zero knowledge of the existence of any enum string (B). You should be 
 able to pass a B to that code and have everything work as expected.
I concede the points that enum strings do not violate the LSP, and that they are subtypes of string. You're right, and I was wrong.
I was all over run.dlang.org like "Sure that's not going to work... wait a second, it does! But that other thing's not going to work... what, that works too!" I didn't know D's enums are _that_ odd.

It seems you can do almost everything with an enum that you can do with its base type. Keyword being "almost". For example,

    x ~= "asd";

works whether x is a string or an enum based on string. However,

    x = x ~ "asd";

works if x is a string and does not work if x is an enum derived from string. Therefore, a function using that expression works for strings but not for enum strings.

Similarly:

    x += 3;

works for int and enums derived from int. However,

    x = x + 3;

does not.

So you can't transparently substitute enums for their base type. I suspect there'd be other cases, too.
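The failing half of each pair can be traced to the type of the binary expression. The sketch below uses two hypothetical enums, `Name` and `Count`, and only checks the assignments reported above as not working; the `~=` and `+=` forms are left out because their acceptance is exactly the oddity under discussion.

```d
enum Name : string { base = "x" }   // hypothetical enums
enum Count : int { zero = 0 }

void main()
{
    Name x = Name.base;
    Count n = Count.zero;

    // With a base-type operand, the binary operators yield the base type...
    static assert(is(typeof(x ~ "asd") == string));
    static assert(is(typeof(n + 3) == int));

    // ...and the base type does not convert back to the enum implicitly,
    // so assigning the result to the enum variable is rejected.
    static assert(!is(string : Name));
    static assert(!is(int : Count));
    static assert(!__traits(compiles, x = x ~ "asd"));
    static assert(!__traits(compiles, n = n + 3));
}
```

Generic code written as `x = x ~ "asd"` therefore compiles for string but not for an enum based on string.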
May 12
next sibling parent reply Paul Backus <snarwin gmail.com> writes:
On Thursday, 13 May 2021 at 01:16:42 UTC, Andrei Alexandrescu 
wrote:
 It seems you can do almost everything with an enum that you can 
 do with its base type. Keyword being "almost". For example,

 x ~= "asd";

 works whether x is a string or an enum based on string. However,

 x = x ~ "asd";

 works if x is a string and does not work if x is an enum 
 derived from string. Therefore, a function using that 
 expression works for strings but not for enum strings.
A template function, you mean? Because (as the rest of the post you quoted demonstrates) the LSP does not and has never applied (in D) to substitutions that involve different instantiations of the same template. If you explicitly instantiate `func!string`, then it will work exactly as the LSP dictates, but if you substitute `func!string(x)` with `func!E(x)`, you have no guarantee. Granted, the fact that `x ~= "asd"` works and `x = x ~ "asd"` doesn't is definitely a bug.
May 12
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/12/21 9:41 PM, Paul Backus wrote:
 On Thursday, 13 May 2021 at 01:16:42 UTC, Andrei Alexandrescu wrote:
 It seems you can do almost everything with an enum that you can do 
 with its base type. Keyword being "almost". For example,

 x ~= "asd";

 works whether x is a string or an enum based on string. However,

 x = x ~ "asd";

 works if x is a string and does not work if x is an enum derived from 
 string. Therefore, a function using that expression works for strings 
 but not for enum strings.
A template function, you mean? Because (as the rest of the post you quoted demonstrates) the LSP does not and has never applied (in D) to substitutions that involve different instantiations of the same template. If you explicitly instantiate `func!string`, then it will work exactly as the LSP dictates, but if you substitute `func!string(x)` with `func!E(x)`, you have no guarantee. Granted, the fact that `x ~= "asd"` works and `x = x ~ "asd"` doesn't is definitely a bug.
Well the problem is that the choice of covariance of results for operations on enums vs their "base" is quite arbitrary. For strings, the result of "~" is not covariant but the result of "~=" is - not only it works, but it returns a reference to the enum type, not the base type.

However, for enums derived from integrals the result of "+" is not covariant when adding an enum with an integral, but covariant when two enums are added together. Same goes for "-", "/", "*", but oddly not for "^^". I suspect nobody thought of trying to raise an enum to the power of an enum.

The plot thickens when considering enums derived from user-defined types:

    void main() {
        import std;
        struct S {
            void fun() { writefln("%s", &this); }
            int min = -1;
        }
        enum X : S { x = S() }
        X x;
        x.fun;
        (cast(S*) &x).fun;
        writeln(x.min);
    }

The two addresses are the same, meaning the enum value gets to call the base member's function, in a subtyping manner. However, the last line doesn't compile, which breaks subtyping.

On the face of it, enums are defined by the language, so whatever choices are made are... there. I understand the practicality of some choices, but overall the entire enum algebra is quirky and difficult to maneuver around in generic code.

Which harkens back to the opener of this thread - Phobos should not go out of its way to support enumerated types everywhere, when a trivial recourse exists on the caller side - pass value.representation instead of value.

A much stronger argument could be made against supporting convertibility (to e.g. strings or ranges) by means of alias this. Callers should convert to the needed type prior to calling into the standard library.
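One way the caller-side recourse might look is sketched below. It converts through an ordinary `string` variable rather than through `representation`, so it is an alternative under the same idea and not the exact spelling suggested above; `Message` is a hypothetical enum.

```d
import std.format : format;
import std.traits : OriginalType;

enum Message : string { greet = "hello %s" }   // hypothetical enum

// OriginalType recovers the base type of an enum.
static assert(is(OriginalType!Message == string));

void main()
{
    // Convert at the call site; the library then only ever sees a string.
    string fmt = Message.greet;
    assert(format(fmt, "world") == "hello world");
}
```

The conversion costs the caller one line, and the library function needs no enum-specific handling.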
May 12
prev sibling next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, May 12, 2021 7:16:42 PM MDT Andrei Alexandrescu via Digitalmars-
d wrote:
 I was all over run.dlang.org like "Sure that's not going to work... wait
 a second, it does! But that other thing's not going to work... what,
 that works too!" I didn't know D's enums are _that_ odd.

 It seems you can do almost everything with an enum that you can do with
 its base type. Keyword being "almost".
Yeah, if enums are supposed to only have a fixed set of values, then they're completely broken. The language does almost nothing to guarantee it. One result of that is that you have to be _very_ careful about how you use something like final switch - especially since it's not checked with -release.

Of course, if enums are just named values without caring about whether it's possible to have an enum with a different value than the ones listed, then the fact that the enum is even treated differently from the base type causes other problems. So, ultimately, I think that D enums are pretty schizophrenic and not particularly well-designed.

I've argued in the past that the language should disallow all operations on enums (aside from casts) which aren't guaranteed to result in a valid value for that enum type, but not everyone agrees with that stance.

- Jonathan M Davis
May 12
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, May 12, 2021 8:13:04 PM MDT Jonathan M Davis via Digitalmars-d 
wrote:
 I've argued in the past that the language should disallow all operations on
 enums (aside from casts) which aren't guaranteed to result in a valid value
 for that enum type, but not everyone agrees with that stance.
Or more accurately, all operations on an enum which are not guaranteed to result in a valid enum value should result in the base type (and thus not be assignable to a variable of that enum type without a cast), and operations which mutate the enum should not be allowed unless they're guaranteed to result in a valid enum value. But regardless, the point is that ideally, unless a cast is used, it should be impossible to have something typed as an enum without it being guaranteed that the value be one of the enumerated values for that enum type. But that's definitely not how D enums work... - Jonathan M Davis
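A small sketch of the situation being described, with a hypothetical `Flag` enum: combining two members yields a value that is still typed as the enum but is not one of its members, and no cast is involved.

```d
enum Flag : int { a = 1, b = 2 }   // hypothetical enum

void main()
{
    auto both = Flag.a | Flag.b;
    static assert(is(typeof(both) == Flag));   // still typed as the enum
    assert(both == 3);                         // but 3 is not a declared member
    assert(both != Flag.a && both != Flag.b);
}
```

Under the rule proposed above, `Flag.a | Flag.b` would instead have type int, and storing it in a `Flag` would require a cast.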
May 12
next sibling parent Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:
On Thursday, 13 May 2021 at 02:43:41 UTC, Jonathan M Davis wrote:
 On Wednesday, May 12, 2021 8:13:04 PM MDT Jonathan M Davis via 
 Digitalmars-d wrote:
 Or more accurately, all operations on an enum which are not 
 guaranteed to result in a valid enum value should result in the 
 base type (and thus not be assignable to a variable of that 
 enum type without a cast), and operations which mutate the enum 
 should not be allowed unless they're guaranteed to result in a 
 valid enum value. But regardless, the point is that ideally, 
 unless a cast is used, it should be impossible to have 
 something typed as an enum without it being guaranteed that the 
 value be one of the enumerated values for that enum type. But 
 that's definitely not how D enums work...

 - Jonathan M Davis
So basically enum should implicitly be declared to be immutable right?
May 12
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Thursday, 13 May 2021 at 02:43:41 UTC, Jonathan M Davis wrote:
 On Wednesday, May 12, 2021 8:13:04 PM MDT Jonathan M Davis via 
 Digitalmars-d wrote:
 I've argued in the past that the language should disallow all 
 operations on enums (aside from casts) which aren't guaranteed 
 to result in a valid value for that enum type, but not 
 everyone agrees with that stance.
Or more accurately, all operations on an enum which are not guaranteed to result in a valid enum value should result in the base type (and thus not be assignable to a variable of that enum type without a cast), and operations which mutate the enum should not be allowed unless they're guaranteed to result in a valid enum value. But regardless, the point is that ideally, unless a cast is used, it should be impossible to have something typed as an enum without it being guaranteed that the value be one of the enumerated values for that enum type. But that's definitely not how D enums work... - Jonathan M Davis
YES!
May 13
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Monday, 10 May 2021 at 12:19:07 UTC, deadalnix wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu 
 wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obvioulsly the type of r should stay the same because in D 
 variables don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.
If you have a range of T, then you got to return a T. I'm not sure what the problem is here. Do you have a concrete example? All I can think of are things like slicing and alike, and they should obviously return a string, not a T.
More to the point, consider this:

    class String {
    private:
        immutable(char)[] value;

    public:
        this(immutable(char)[] value) {
            this.value = value;
        }

        // ...
    }

    class EnumString : String {
    public:
        static EnumString value1() {
            return new EnumString("value1");
        }

        static EnumString value2() {
            return new EnumString("value2");
        }

    private:
        this(immutable(char)[] value) {
            super(value);
        }
    }

While the implementation differs, conceptually, from the theory standpoint, this is the same. This is using a subtype to constrain instances of a type (String here) to a certain set of possible values. When using the subtype (EnumString) you have the knowledge that it is limited to some values, and you lose that knowledge as soon as you convert to the parent type.

But instead, we get some bastardised monster from the compiler, that's not quite a subtype, but that's not quite something else that really makes sense either. As expected, this nonsense ends up spilling into user code, and then the standard lib, based on user constraints, and everybody is left choosing between bad tradeoffs down the road because the whole house of cards is built on shaky foundations.

The bad news is, there is already a language like this. It's called C++, and it's actually quite successful. With all due respect to you and Walter, you guys are legends, but I think there is also a bit of learned helplessness coming from both of you due to a lifetime of exposure to the soul corroding effects of C++.

This attitude pervades everything, and most language constructs suffer from some form of it in one way or another, causing a cascade of bad side effects, starting with this whole thread. A few examples en vrac for instance: DIP1000, delegate context qualifiers, functions vs first class functions, etc...

Back to the case of enum, it is obviously and trivially a subtype. In fact, even the syntax is the same:

    enum Foo : string { ... }

Handling enum strings should never have been a special case that was added to phobos, because it should never have been a special case to begin with, in phobos or elsewhere.
May 10
next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Monday, 10 May 2021 at 21:44:02 UTC, deadalnix wrote:
 The bad news is, there is already a language like this. It's 
 called C++, and it's actually quite successful. With all due 
 respect to you and Walter, you guys are legends, but I think 
 there is also a bit of learned helplessness coming from both of 
 you due to a lifetime of exposure to the soul corroding effects 
 of C++.
Not sure how this applies to C++, what subtyping issues are you having with C++?
 This attitudes pervades everything, and most language 
 constructs suffer of some form of it in one way or another, 
 causing a cascade of bad side effects, starting with this whole 
 thread. A few examples en vrac for instance: DIP1000, delegate 
 context qualifiers, functions vs first class functions, etc...
That's a direct result of the process. Features have always been added as an experiment rather than being completed on paper, even the ones with a DIP. At this point, this pretty much defines what D is...

Just look at the addition of a C compiler that is being advanced right now. It is being added because there might be some benefits from it in the future, perhaps. Of course, you also have the side effect that the AST becomes more resistant to change... and refactoring costs double...

So that is why D has these issues. People wanted something, and it was added in an experimental way, not in an analytical way. That is the way of D. Experiment in features.

Ideally D should have boosted meta programming and cut down on features to the bare minimum. Literals should have been a compile time type... and alias should bind to them, strings should've been a library construct, etc etc. But if you look at the features being added, meta programming is not in focus. So this won't change. Features are being added that have nothing to do with metaprogramming (memory safety, C interop etc).

D will continue to evolve experimentally. So there will never be a small core language that is consistent. It is what it is, at this point.
May 10
parent reply deadalnix <deadalnix gmail.com> writes:
On Monday, 10 May 2021 at 22:37:51 UTC, Ola Fosheim Grøstad wrote:
 On Monday, 10 May 2021 at 21:44:02 UTC, deadalnix wrote:
 The bad news is, there is already a language like this. It's 
 called C++, and it's actually quite successful. With all due 
 respect to you and Walter, you guys are legends, but I think 
 there is also a bit of learned helplessness coming from both 
 of you due to a lifetime of exposure to the soul corroding 
 effects of C++.
Not sure how this applies to C++, what subtyping issues are you having with C++?
Function types don't have the right covariance/contravariance, you can slice subtypes, and there are more, but this is not my point. My point is that we already have a language that is a mixed bag of accidentally defined features that don't compose properly with each other. I don't need one more of these, I already have one, and, let's be frank, it has at the very least an order of magnitude more support in the wild, in tools and so on. Doing the same thing with less manpower is a futile exercise.
 This attitudes pervades everything, and most language 
 constructs suffer of some form of it in one way or another, 
 causing a cascade of bad side effects, starting with this 
 whole thread. A few examples en vrac for instance: DIP1000, 
 delegate context qualifiers, functions vs first class 
 functions, etc...
That's a direct result of the process. Features have always been added as an experiment rather than being completed on paper, even the ones with a DIP. At this point, this pretty much defines what D is... Just look at the addition of a C compiler that is being advanced right now. It is being added because there might be some benefits from it in the future, perhaps. Of course, you also have the side effect that the AST becomes more resistant to change... and refactoring costs double... So that is why D has these issues. People wanted something, and it was added in an experimental way, not in an analytical way. That is the way of D. Experiment in features.
Sure, but look at this thread. D is crumbling under the weight, not of the number of features, but of the fact that a large portion of them simply are unsound. At this point, the decision made is to push the madness on the user. Fair enough, but if the standard lib devs are not willing to put up with it, why in hell would you expect anyone else to? Just look at what's in the C++ standard lib or boost and compare to your average C++ project to see the kind of gap, in terms of motivation to put up with bullshit, that exists between standard lib devs and Joe coder. It's not even close. "This stuff ain't working properly, so let's just give up on getting it to work at all" is not how you iterate toward a great, useful product.
May 10
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:
 My point is that we already have a language that is a mixed bag 
 of accidentally defined features that don't compose properly 
 with each others. I don't need one more of these, I already 
 have one, and, let's be frank, it has at the very least an 
 order of magnitude more support in the wild, in tools and so on.
Yes, I think everyone can agree with this. A good starting point would be to implement proper unification, as was discussed some months ago. This is critical for composing types in a sensible manner (composing templates of templates and binding them to a simple name that is exported). Then one can look and see if some types/features that are builtins can be expressed with the same building blocks in a unification process (somehow). When you see what cannot fit into this machinery you get a feeling for which features need to be redesigned. Something like that.
 Doing the same thing with less manpower is a futile exercise.
Yes.
 Sure, but look at this thread. D is crumbling under the weight, 
 not of the number of features, but of the fact that a large 
 portion of them simply are unsound.
Yes, but designing something that is sound is best done by having a tiny set of (theoretical) mechanisms that all other features can be expressed with (even though that might not be visible to the end user). It is very difficult to even discuss soundness with no constructive framework to represent ideas with.
 Just look at what's in the C++ standard lib or boost and 
 compare to your average C++ project to see the kind of gap in 
 term of motivation to put up with bullshit exists between 
 standard lib devs and Joe coder. It's not even close.
Yes, even stuff that is well designed in C++ is a lot of work. Implementing a new container library with all the iterators is quite verbose, tedious and typos will happen... I think defining protocols and making mechanisms available that can extend types with protocols is the way to go (concepts is one step in the right direction). How to do it? Not sure, but it seems like templating by itself is not enough really. E.g. if ranges-functionality should be available to everything that can be treated like a sequence, then this should be a protocol that is present in all the builtin types that are sequential. Or somehow bound to them in some global fashion (kinda like injected into the type). Nothing should be special cased. Ideally. But there is no clear model for how to do that, I think. However it is tied to unification. Deduce the protocol if possible.
May 10
prev sibling next sibling parent Imperatorn <johan_forsberg_86 hotmail.com> writes:
On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:

 At this point, the decision made is to push the madness on the 
 user. Fair enough, but if the standard lib devs are not willing 
 to put up with it, why in hell would you expect anyone else to? 
 Just look at what's in the C++ standard lib or boost and 
 compare to your average C++ project to see the kind of gap in 
 term of motivation to put up with bullshit exists between 
 standard lib devs and Joe coder. It's not even close.

 "This stuff ain't working properly, so let's just give up on 
 getting it to work at all" is not how you iterate toward a 
 great, useful product.
+1 We *must* focus more on consistency and soundness imo. I've heard several users talk about this. So it's nice to see it being talked about here. The way for D forward is to polish up D2. Maybe have 2.100.0 as a goal. Like any project, it needs milestones. We should take a pause, look around and see, we're now in the "optimizing" phase. We can at least try. Then after 2.100.0 for example we can start talking about new cool features again.
May 11
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/10/21 6:58 PM, deadalnix wrote:

 
 Sure, but look at this thread. D is crumbling under the weight, not of 
 the number of features, but of the fact that a large portion of them 
 simply are unsound.
 
 At this point, the decision made is to push the madness on the user. 
 Fair enough, but if the standard lib devs are not willing to put up with 
 it, why in hell would you expect anyone else to? Just look at what's in 
 the C++ standard lib or boost and compare to your average C++ project to 
 see the kind of gap in term of motivation to put up with bullshit exists 
 between standard lib devs and Joe coder. It's not even close.
 
 "This stuff ain't working properly, so let's just give up on getting it 
 to work at all" is not how you iterate toward a great, useful product.
In case you're referring to deprecating support for enum strings in phobos - definitely that's not pushing any madness anywhere. Adding said support was a mistake in the first place.
May 11
prev sibling parent reply Mathias LANG <geod24 gmail.com> writes:
On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:
 Sure, but look at this thread. D is crumbling under the weight, 
 not of the number of features, but of the fact that a large 
 portion of them simply are unsound.

 At this point, the decision made is to push the madness on the 
 user. Fair enough, but if the standard lib devs are not willing 
 to put up with it, why in hell would you expect anyone else to? 
 Just look at what's in the C++ standard lib or boost and 
 compare to your average C++ project to see the kind of gap in 
 term of motivation to put up with bullshit exists between 
 standard lib devs and Joe coder. It's not even close.

 "This stuff ain't working properly, so let's just give up on 
 getting it to work at all" is not how you iterate toward a 
 great, useful product.
Well, this thread is 11 pages and shows no sign of winding down. In the meantime, has anyone looked at the code that sparked this outrage? [As I mentioned in the PR](https://github.com/dlang/phobos/pull/8029#issuecomment-834221552), the issue wouldn't have happened if the `fmt` template parameter was a `string` and not an `alias`.
 Q: why is fmt an alias and not a simple string ?
 A: No real reason.
The way I see it, the issue is valid, the fix wasn't. The `format` API should have accepted a `string` and let the compiler perform any allowed implicit conversion, instead of taking exactly the type via `alias`. I wish our most competent contributors would find it more interesting to direct their attention to Github or promote the language to their large Twitter following over engaging in a flamewar.
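The difference between the two kinds of template parameter can be sketched in a few lines. This is only an illustration, not the actual Phobos `format` code; `Fmt`, `lengthViaString` and `isStringViaAlias` are names made up for the example.

```d
// Hypothetical enum with a string base type; not taken from Phobos.
enum Fmt : string { hex = "%x", dec = "%d" }

// Template value parameter of type string: the argument is implicitly
// converted to string before the template body sees it.
enum size_t lengthViaString(string fmt) = fmt.length;

// Alias parameter: the argument keeps its exact type.
enum bool isStringViaAlias(alias fmt) = is(typeof(fmt) == string);

static assert(lengthViaString!(Fmt.hex) == 2);  // Fmt converts to string
static assert(isStringViaAlias!"%x");           // a literal is a string
static assert(!isStringViaAlias!(Fmt.hex));     // the enum stays an Fmt
```

With the `string` parameter the template never learns that an enum was involved, so no enum-specific handling is needed inside it.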
May 11
next sibling parent cmyka <mauricehuuskes hotmail.com> writes:
On Wednesday, 12 May 2021 at 05:25:59 UTC, Mathias LANG wrote:
 On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:
 ...
Well, this thread is 11 pages and show no sign of winding down. In the meantime, has anyone looked at the code that sparked this outrage ? ... I wish our most competent contributors would find it more interesting to direct their attention to Github or promote the language to their large Twitter following over engaging in flamewar.
I support bringing these types of discussions to github (not Reddit/Twitter) instead where people can respond to a comment directly, or through thumbs up or down and at least edit their comments rather than piling on emails sequentially. Or a different type of discussion platform entirely. (That said I am with Adam Ruppe's take on this matter)
May 11
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 05:25:59 UTC, Mathias LANG wrote:
 Well, this thread is 11 pages and show no sign of winding down.
 In the meantime, has anyone looked at the code that sparked 
 this outrage ?

 [As I mentioned in the PR](https://github.com/dlang/phobos/pull/8029#issuecomment-834221552), the issue wouldn't have happened if the `fmt` template parameter was a `string` and not an `alias`.

 Q: why is fmt an alias and not a simple string ?
 A: No real reason.
The way I see it, the issue is valid, the fix wasn't. `format` API should have accepted a `string` and let the compiler perform any allowed implicit conversion, instead of taking exactly the type via `alias`. I wish our most competent contributors would find it more interesting to direct their attention to Github or promote the language to their large Twitter following over engaging in flamewar.
If format expects a string, then it is indeed the right thing to accept a string :) But that discussion goes further than this, and is necessary, IMO.
May 12
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/10/21 5:44 PM, deadalnix wrote:
 On Monday, 10 May 2021 at 12:19:07 UTC, deadalnix wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D variables 
 don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.
If you have a range of T, then you got to return a T. I'm not sure what the problem is here. Do you have a concrete example? All I can think of are things like slicing and the like, and they should obviously return a string, not a T.
More to the point, consider this:

class String {
private:
    immutable(char)[] value;
public:
    this(immutable(char)[] value) { this.value = value; }
    // ...
}

class EnumString : String {
public:
    static EnumString value1() { return new EnumString("value1"); }
    static EnumString value2() { return new EnumString("value2"); }
private:
    this(immutable(char)[] value) { super(value); }
}

While the implementation differs, conceptually, from a theory standpoint, this is the same.
No it isn't. EnumString and String are reference types. A reference to an enum value does not convert to a reference to its representation. Very very very VERY different.
 This is using a subtype to constrain 
 instances of a type (String here) to a certain set of possible values. When 
 using the subtype (EnumString) you have the knowledge that it is limited 
 to some value, and you lose that knowledge as soon as you convert to the 
 parent type.
One question that you keep not answering (Paul and I both asked it) is how you'd implement the range primitive popFront.
 But instead, we get some bastardised monster from the compiler, that's 
 not quite a subtype, but that's not quite something else that really makes 
 sense either. As expected, this nonsense ends up spilling into user code, 
 and then the standard lib, based on user constraints, and everybody is 
 left choosing between bad tradeoffs down the road because the whole house 
 of cards is built on shaky foundations.
 
 The bad news is, there is already a language like this. It's called C++, 
 and it's actually quite successful. With all due respect to you and 
 Walter, you guys are legends, but I think there is also a bit of learned 
 helplessness coming from both of you due to a lifetime of exposure to 
 the soul corroding effects of C++.
 
 This attitudes pervades everything, and most language constructs suffer 
 of some form of it in one way or another, causing a cascade of bad side 
 effects, starting with this whole thread. A few examples en vrac for 
 instance: DIP1000, delegate context qualifiers, functions vs first class 
 functions, etc...
I very much agree Walter and I have brought C++ bias into D, sometimes in a detrimental way.
 Back to the case of enum, it is obviously and trivially a subtype.
No it isn't. How many times do I need to explain that?
 In 
 fact, even the syntax is the same:
 
 enum Foo: string { ... }
It doesn't matter. It's not a subtype.
 Handling enum strings should never have been a special case that was added to 
 phobos, because it should never have been a special case to begin with, in 
 phobos or elsewhere.
Clearly enums have their own oddities, most inherited from C++. Perhaps we should do what C++ did, add a new "enum class" construct that fixes its issues. But I don't know of a perfect design, and I very much would love to see one.
May 11
parent deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 13:50:46 UTC, Andrei Alexandrescu 
wrote:
 No it isn't.

 EnumString and String are reference types. A reference to an 
 enum value does not convert to a reference to its 
 representation. Very very very VERY different.
Here we hit the core of the problem. A reference to a type B that is a subtype of A is not a subtype of ref A. Or, in simpler terms, B being a subtype of A doesn't imply that ref B is a subtype of ref A.

This means that you can pass a B where an A is expected, but not a ref B where a ref A is expected. You'll note that the example I provided with classes for understanding will also demonstrate the same behavior:

class A { ... }
class B : A { ... }

void foo(ref A a) { ... }

B b = ...;
foo(b); // This must be an error because, while B is a subtype of A, ref B is not a subtype of ref A.

That means that you shouldn't be able to pass SomeEnumString to any function, in phobos or elsewhere, that will mutate it, such as popFront. But you should be able to do so, transparently, to any function that won't. That includes all compile time parameters.
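The class half of this argument can be checked with a small compilable sketch. The bodies of `takeByValue` and `reset` are assumptions added only so the example builds; the original post leaves them elided.

```d
class A {}
class B : A {}

// By-value parameter: a B can be passed where an A is expected.
void takeByValue(A a) {}

// By-ref parameter: the callee may rebind the caller's variable.
void reset(ref A a) { a = new A; }

// Passing a B by value compiles.
static assert(__traits(compiles, { B b = new B; takeByValue(b); }));

// Passing a B lvalue by ref is rejected; if it were allowed, `b` would
// end up referring to a plain A after the call.
static assert(!__traits(compiles, { B b = new B; reset(b); }));

unittest
{
    A a = new B;
    reset(a);                        // fine: the variable itself is typed A
    assert(typeid(a) == typeid(A));  // it now refers to a plain A
}
```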
May 11
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/10/21 8:19 AM, deadalnix wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D variables 
 don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.
If you have a range of T, then you got to return a T.
There's no return. The range is being mutated.
 I'm not sure 
 what's the problem is here. Do you have a concrete example?
Of course. A range must implement popFront with the signature:

void popFront(ref SomeEnumString s) {
    ... please fill in the implementation ...
}
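A rough sketch of the conversions involved, assuming a hypothetical `Greeting` enum in place of SomeEnumString and a made-up helper `dropFirst` that plays the role of popFront for plain strings:

```d
// Hypothetical enum standing in for SomeEnumString.
enum Greeting : string { hello = "hello", bye = "bye" }

// A popFront-like helper written against plain strings.
void dropFirst(ref string s) { s = s[1 .. $]; }

// The enum converts to its base type, but not the other way round.
static assert(is(Greeting : string));
static assert(!is(string : Greeting));

// An enum lvalue cannot be bound to `ref string`.
static assert(!__traits(compiles, { Greeting g = Greeting.hello; dropFirst(g); }));

unittest
{
    string s = Greeting.hello;  // copy into a plain string first
    dropFirst(s);
    assert(s == "ello");
}
```

The sketch only covers the conversion rules; it makes no claim about what the Phobos range traits report for such an enum.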
May 11
parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 12:05:18 UTC, Andrei Alexandrescu 
wrote:
 I'm not sure what's the problem is here. Do you have a 
 concrete example?
Of course. A range must implement popFront with the signature: void popFront(ref SomeEnumString s) { ... please fill in the implementation ... }
That must be a type error, this is a feature, not a bug. This is not expected to work.
May 11
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 8:14 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 12:05:18 UTC, Andrei Alexandrescu wrote:
 I'm not sure what's the problem is here. Do you have a concrete example?
Of course. A range must implement popFront with the signature: void popFront(ref SomeEnumString s) {     ... please fill in the implementation ... }
That must be a type error, this is a feature, not a bug. This is not expected to work.
Then enum strings are not ranges, correct?
May 11
parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu 
wrote:
 Then enum strings are not ranges, correct?
They are not. But they are strings. Which implies that strings aren't ranges, which is right: `ref string`s are ranges, not strings.
May 11
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 10:34 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote:
 Then enum strings are not ranges, correct?
They are not. But they are strings. Which imply that string aren't ranges, which is right, `ref strings` are ranges, not strings.
`ref string` is not a type.
May 11
parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 15:19:05 UTC, Andrei Alexandrescu 
wrote:
 On 5/11/21 10:34 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu 
 wrote:
 Then enum strings are not ranges, correct?
They are not. But they are strings. Which imply that string aren't ranges, which is right, `ref strings` are ranges, not strings.
`ref string` is not a type.
This is just denial. There are many examples of conversions that differ between string and ref string which do not involve enums. For instance, immutable(string) -> string is a valid conversion, but immutable(string) -> ref string isn't. Call it something other than a type if you want; nevertheless, the conversion rules are simply different, even if you abstract the notion of rvalue/lvalue from the whole thing, so it is clearly more than just a regular storage class. When you say ref, you say "I do not want a subtype". Saying B isn't a subtype of A because I can't pass a B to what expects a ref A is just fallacious.
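The immutable(string) case can be written down as a short sketch; `byValue` and `byRef` are hypothetical helpers introduced only for the illustration.

```d
void byValue(string s) {}
void byRef(ref string s) { s = "changed"; }

// immutable(string) -> string: accepted, the callee gets its own copy of the slice.
static assert(__traits(compiles, { immutable(string) s = "abc"; byValue(s); }));

// immutable(string) -> ref string: rejected, the callee could rebind an immutable variable.
static assert(!__traits(compiles, { immutable(string) s = "abc"; byRef(s); }));

unittest
{
    string s = "abc";
    byRef(s);               // a mutable string lvalue binds to ref string
    assert(s == "changed");
}
```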
May 11
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 12:13 PM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 15:19:05 UTC, Andrei Alexandrescu wrote:
 On 5/11/21 10:34 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote:
 Then enum strings are not ranges, correct?
They are not. But they are strings. Which imply that string aren't ranges, which is right, `ref strings` are ranges, not strings.
`ref string` is not a type.
This is just denial.
It's simple fact.
 There are many exemple of conversions that differs with string and ref 
 strings which do not involve enums. For instance, immutable(string) -> 
 string is a valid conversion, but immutable(string) -> ref string isn't.
 
 Call it something else than a type if you want, nevertheless, 
 conversions rules are simply different, even if you abstract the notion 
 of rvalue/lvalue from the whole thing, so it is clearly more than just a 
 regular storage class.
 
 When you say ref, you say "I do not want a subtype". Saying B isn't a 
 subtype of A because I can't pass a B to what expects a ref A is just 
 fallacious.
Again with moving the goalposts.
May 11
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 12:39 PM, Andrei Alexandrescu wrote:
 On 5/11/21 12:13 PM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 15:19:05 UTC, Andrei Alexandrescu wrote:
 On 5/11/21 10:34 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote:
 Then enum strings are not ranges, correct?
They are not. But they are strings. Which imply that string aren't ranges, which is right, `ref strings` are ranges, not strings.
`ref string` is not a type.
This is just denial.
It's simple fact.
 There are many exemple of conversions that differs with string and ref 
 strings which do not involve enums. For instance, immutable(string) -> 
 string is a valid conversion, but immutable(string) -> ref string isn't.

 Call it something else than a type if you want, nevertheless, 
 conversions rules are simply different, even if you abstract the 
 notion of rvalue/lvalue from the whole thing, so it is clearly more 
 than just a regular storage class.

 When you say ref, you say "I do not want a subtype". Saying B isn't a 
 subtype of A because I can't pass a B to what expects a ref A is just 
 fallacious.
Again with moving the goalposts.
To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.
May 11
next sibling parent reply Meta <jared771 gmail.com> writes:
On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu 
wrote:
 Again with moving the goalposts.
To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.
I apologize for injecting myself into this conversation, but with all due respect, what the hell are you talking about? Everything Deadalnix is saying makes perfect sense - it's basic type theory, and yet you're accusing him of moving goalposts and making up definitions, etc. The problem is that `isSomeString` doesn't respect the LSP and the template constraints on the relevant stdlib functions for enums are a hack to work around that. End of story. If `isSomeString` was defined sensibly, these template constraint hacks would not have to exist. All the bluster about `popFront` on enum strings, etc. is completely irrelevant, and is a red herring anyway (as was already explained). I'm sorry for being so blunt, but this conversation is painful to read.
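For reference, the mismatch between the language-level conversion and the `isSomeString` trait looks roughly like this. `Color`, `plainLength` and `checkedLength` are made-up names for the sketch; only `isSomeString` comes from Phobos.

```d
import std.traits : isSomeString;

// Hypothetical enum with a string base type.
enum Color : string { red = "red", green = "green" }

static assert(is(Color : string));    // the language converts Color to string
static assert(isSomeString!string);
static assert(!isSomeString!Color);   // the trait says an enum is not a string

// A plain string parameter accepts the enum through implicit conversion.
size_t plainLength(string s) { return s.length; }

// A template constrained on isSomeString rejects it.
size_t checkedLength(S)(S s) if (isSomeString!S) { return s.length; }

static assert(__traits(compiles, plainLength(Color.red)));
static assert(__traits(compiles, checkedLength("red")));
static assert(!__traits(compiles, checkedLength(Color.red)));
```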
May 11
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 2:37 PM, Meta wrote:
 On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote:
 Again with moving the goalposts.
To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.
I apologize for injecting myself into this conversation, but with all due respect, what the hell are you talking about? Everything Deadalnix is saying makes perfect sense - it's basic type theory, and yet you're accusing him of moving goalposts and making up definitions, etc. The problem is that `isSomeString` doesn't respect the LSP and the template constraints on the relevant stdlib functions for enums are a hack to work around that. End of story. if `isSomeString` was defined sensibly, these template constraint hacks would not have to exist. All the bluster about `popFront` on enum strings, etc. is completely irrelevant, and is a red herring anyway (as was already explained). I'm sorry for being so blunt, but this conversation is painful to read.
Being blunt is totally cool, but that doesn't make you right.

There's no true subtyping or polymorphism with value semantics. This has been common knowledge in C++ - inheriting a value type is an antipattern for many reasons, and conversion operators are to be used carefully (and not as a substitute for subtyping) for many other reasons.

With value types, it's all static typing, no polymorphism, no LSP beyond what's called ad-hoc polymorphism in the classic Cardelli et al paper (http://poincare.matf.bg.ac.rs/~smalkov/files/old/fp.r344.2016/public/predavanja/FP.cas.2016.07%20-%20p471-cardelli.pdf).

What can be aimed for with values is called "parametric polymorphism" (which is NOT subtyping) by the same paper: "Parametric polymorphism is obtained when a function works uniformly on a range of types; these types normally exhibit some common structure."

That works if and only if you can reasonably supplant the same primitives across said range of types. With enums that's onerous; as soon as you "derive" an enum from int you figure that ++x can't reasonably be implemented. Same goes for enum strings - you can't implement the expected string primitives so substitutability is out the window.

Values are monomorphic. Years ago I found a bug in a large C++ system that went like this:

class Widget : BaseWidget {
    ...
    Widget* clone() {
        assert(typeid(this) == typeid(Widget*));
        return new Widget(*this);
    }
};

The assert was a _monomorphism test_, i.e. it made sure that the current object is actually a Widget and not something derived from it, who forgot to override clone() once again.

The problem was the code was doing exactly what it shouldn't have, yet the assert was puzzlingly passing. Since everyone here is great at teaching basic type theory, it's an obvious problem - the fix is:

        assert(typeid(*this) == typeid(Widget));

Then the assertion started failing as expected. Following that, I've used that example for years in teaching and invariably there are eyes going wide when they hear that C++ pointers are monomorphic, it's the pointed-to values that are polymorphic, and that's an essential distinction. (In D, just like in Java, classes take care of that indirection automatically, which can get some confused.)
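The closing parenthetical about D can be illustrated with a small sketch: `typeid` applied to a class reference reports the dynamic type, so the monomorphism test needs no explicit dereference. `FancyWidget` and `isExactlyWidget` are hypothetical names, not part of the C++ system described above.

```d
class BaseWidget {}
class Widget : BaseWidget {}
class FancyWidget : Widget {}

// The static type of `w` is Widget; typeid(w) looks at the object it refers to.
bool isExactlyWidget(Widget w) { return typeid(w) == typeid(Widget); }

unittest
{
    assert(isExactlyWidget(new Widget));
    assert(!isExactlyWidget(new FancyWidget));  // the derived object is detected
}
```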
May 11
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
wrote:
 There's no true subtyping or polymorphism with value semantics.
I think you guys need to agree on what you mean by "type" and "subtype". Mathematically a type would be a set of states and a set of operators that can take you between the states. A subtype is just a reduced set of states/operators where operators keep you within the set of states. In OO a type is an abstraction (reduced set) of the states that the entity you model in The Real World has. A subclass in OO is increasing the number of states/operators, but decreasing the number of Real World entities covered. So these two notions of "subtype" are opposite.
 primitives across said range of types. With enums that's 
 onerous; as soon as you "derive" an enum from int you figure 
 that ++x can't reasonably be implemented. Same goes for enum
In C enums are subtypes of int. You reduce the number of states. C enums are not sound, because operators can take you out of the allowed set of states in a heartbeat. Anyway, I've given up following this discussion. Just define the desirable outcome (practical design) and forget about the theoretical aspects... then others might be able to understand where the viewpoints differ.
May 11
prev sibling next sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
wrote:
 Values are monomorphic. Years ago I found a bug in a large C++ 
 system that went like this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that 
 the current object is actually a Widget and not something 
 derived from it, who forgot to override clone() once again.

 The problem was the code was doing exactly what it shouldn't 
 have, yet the assert was puzzlingly passing. Since everyone 
 here is great at teaching basic type theory, it's an obvious 
 problem - the fix is:

         assert(typeid(*this) == typeid(Widget));

 Then the assertion started failing as expected. Following that, 
 I've used that example for years in teaching and to invariably 
 there are eyes going wide when they hear that C++ pointers are 
 monomorphic, it's the pointed-to values that are polymorphic, 
 and that's an essential distinction. (In D, just like in Java, 
 classes take care of that indirection automatically, which can 
 get some confused.)
While this is indeed very interesting, this is missing the larger point. This whole model in C++ is unsound. It's easy to show. In your above example, the this pointer, typed as Widget*, points to an instance of a subclass of Widget. If you were to assign a Widget to that pointer (which you can do, this is a pointer to a mutable widget), then any reference to that widget using a subtype of Widget is now invalid. There is no such thing as a monomorphic pointer to a polymorphic type in any sound type system. That cannot be made to work. It is pointer and the pointed data, as a package, being half a value, half a reference type in the process. This is unavoidable, you can't unbundle it or everything breaks down. So why is there an indirection in there? Simply because you cannot know the layout of the object at compile time when you are doing runtime polymorphism. But even then, you could decide to make it behave as a value type with eager deep copy or copy on write and that would work too, and it would still be polymorphic. But we get back to square one: this has nothing to do with the type, which holds a reference to a payload. And the whole typing and subtyping business happens on these value types.
May 11
next sibling parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Wednesday, 12 May 2021 at 01:46:25 UTC, deadalnix wrote:
 On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
 wrote:
 Values are monomorphic. Years ago I found a bug in a large C++ 
 system that went like this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that 
 the current object is actually a Widget and not something 
 derived from it, who forgot to override clone() once again.

 The problem was the code was doing exactly what it shouldn't 
 have, yet the assert was puzzlingly passing. Since everyone 
 here is great at teaching basic type theory, it's an obvious 
 problem - the fix is:

         assert(typeid(*this) == typeid(Widget));

 Then the assertion started failing as expected. Following 
 that, I've used that example for years in teaching and to 
 invariably there are eyes going wide when they hear that C++ 
 pointers are monomorphic, it's the pointed-to values that are 
 polymorphic, and that's an essential distinction. (In D, just 
 like in Java, classes take care of that indirection 
 automatically, which can get some confused.)
While this is indeed very interesting, this is missing the larger point. This whole model in C++ is unsound. It's easy to show. In you above example, the this pointer, typed as Widget*, points to an instance of a subclass of Widget. If you were to assign a Widget to that pointer (which you can do, this is a pointer to a mutable widget), then any references to that widget using a subtype of Widget is now invalid. There is no such thing as a monomorphic pointer to a polymorphic type in any sound type system. That cannot be made represent both the pointer and the pointed data, as a package, being half a value, half a reference type in the process. This is unavoidable, you can't unbundle it or everything breaks down. So why is there an indirection in there? Simply because you cannot know the layout of the object at compile time when you are doing runtime polymorphism. But even then, you could decide to make it behave as a value type with eager deep copy or copy on write and that would work too, and it would still be polymorphic. But we get back to square one: this has nothing to do with the value type, which hold a reference to a payload. And the whole typing and subtyping business happen on these value types.
-Alex
May 11
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:


 -Alex
No, both are value types, but in the case of the class, the value contains a reference to the payload that you describe in the class's body. Consider:

class A {}
A a = new A();

void foo(A ainfoo) {
    ainfoo = new A();
}

foo(a);

Was "a" modified here? No, it wasn't.
May 11
parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Wednesday, 12 May 2021 at 02:07:14 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:
 No, classes are reference types, structs are values types in 


 -Alex
No, both are value type,
Wrong. https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/reference-types
 but in the case of the class, the value contains a reference to 
 the payload that you describe in the class's body. Consider:

 class A {}
 A a = new A();

 void foo(A ainfoo) {
     ainfooo = new A();
 }

 foo(a);

 Was "a" modified here?
Yes. A is being replaced with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - Alex
May 11
next sibling parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:07:14 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:
 No, classes are reference types, structs are values types in 


 -Alex
No, both are value type,
Wrong. https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/reference-types
 but in the case of the class, the value contains a reference 
 to the payload that you describe in the class's body. Consider:

 class A {}
 A a = new A();

 void foo(A ainfoo) {
     ainfooo = new A();
 }

 foo(a);

 Was "a" modified here?
Yes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - Alex
In layman terms, just because I can replace the item in the box with the exact same box, it does not mean the box hasn't been modified. - Alex
May 11
parent 12345swordy <alexanderheistermann gmail.com> writes:
On Wednesday, 12 May 2021 at 02:22:52 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:07:14 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:
 No, classes are reference types, structs are values types in 


 -Alex
No, both are value type,
Wrong. https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/reference-types
 but in the case of the class, the value contains a reference 
 to the payload that you describe in the class's body. 
 Consider:

 class A {}
 A a = new A();

 void foo(A ainfoo) {
     ainfooo = new A();
 }

 foo(a);

 Was "a" modified here?
Yes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - Alex
In layman terms, just because I can replace the item in the box with the exact same box, it does not mean the box hasn't been modified. - Alex
Woops, meant to say "with the exact same item." -Alex
May 11
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that happens 
 to have the same value here. There is no guarantee that they 
 will share the same address.

 - Alex
You might want to reconsider how sure of yourself you are. For instance by opening https://replit.com/languages/csharp and running the following code in there:

using System;

class A {
    int i;
    public A(int i_) {
        i = i_;
    }

    public int getI() {
        return i;
    }
}

class Program {
    static void Main(string[] args) {
        A a = new A(15);
        Console.WriteLine(a.getI());
        foo(a);
        Console.WriteLine(a.getI());
    }

    static void foo(A ainfoo) {
        ainfoo = new A(23);
        Console.WriteLine(ainfoo.getI());
    }
}
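
The distinction the C# program relies on can also be sketched in C++, where the reference is an explicit pointer. This is only an analogy, not a translation of the C# semantics: the struct `A` and the functions `foo` and `bar` below are illustrative, and `bar` is an addition that shows the contrasting case of mutating through the reference.

```cpp
// Minimal stand-in for the class in the post.
struct A {
    int i;
};

// The pointer is passed by value: rebinding the parameter only changes
// the local copy, so the caller's pointer and object are untouched.
void foo(A* ainfoo) {
    static A other{23};
    ainfoo = &other;
    (void)ainfoo;  // silence unused-value warnings
}

// Writing through the pointer changes the object both sides share.
void bar(A* ainfoo) {
    ainfoo->i = 23;
}
```

After `foo(p)` the caller's `p` still points at the original object with its original field value; only `bar(p)` makes a change the caller can observe.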
May 11
next sibling parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that 
 happens to have the same value here. There is no guarantee 
 that they will share the same address.

 - Alex
You might want to reconsider how sure of yourself you are.
The code you posted does not support your claim whatsoever. When I talk about an address, I am literally talking about a virtual memory address here, such as 0x40000 or something similar. You do not know the actual virtual memory address of variable 'a' for class 'b', as the GC takes care of it for you. So when A is being replaced with a new instance of A that happens to have the same value as the one being replaced, the virtual memory address that A holds from the function parameter will change. -Alex
May 11
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 02:41:31 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that 
 happens to have the same value here. There is no guarantee 
 that they will share the same address.

 - Alex
You might want to reconsider how sure of yourself you are.
The code you posted, do not support your claim what so ever. When I am talk about address I am literally talking about virtual memory address here, such as 0x40000 or something similar to that. You do not know what the actual virtual memory address of variable of 'a' for class 'b', as the GC takes it care of it for you. So when A is being replace with the new instance of A that happens to have the same value that is being replace, the virtual memory that A holds from the function parameter currently holds will change. -Alex
Before posting that email was the best time to run the code, look at the output and deduce what it means. The second best time is now. In any case, I will disengage from that subthread with you, because it has reached its conclusion, and the point has been demonstrably made with actual code. Arguing about what the code does really is pointless when you can simply run it and look at the result.
May 12
parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Wednesday, 12 May 2021 at 11:45:52 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:41:31 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that 
 happens to have the same value here. There is no guarantee 
 that they will share the same address.

 - Alex
You might want to reconsider how sure of yourself you are.
The code you posted, do not support your claim what so ever. When I am talk about address I am literally talking about virtual memory address here, such as 0x40000 or something similar to that. You do not know what the actual virtual memory address of variable of 'a' for class 'b', as the GC takes it care of it for you. So when A is being replace with the new instance of A that happens to have the same value that is being replace, the virtual memory that A holds from the function parameter currently holds will change. -Alex
Before posting that email was the best time to run the code, look at the output and deduce what it means.
Like I said before, it does not support your claims whatsoever. Classes are reference types, not value types, end of discussion. the language if you don't believe me.
 In any case, I will disengage from that subthread with you, 
 because it has reached its conclusion, and the point has been 
 demonstrably made with actual code.
Replacing the item in the box with the different yet exact same item, doesn't mean that you didn't modify the box. Again, print the object memory address, and you will see what I am talking about. -Alex
May 12
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 14:35:27 UTC, 12345swordy wrote:
 Replacing the item in the box with the different yet exact same 
 item, doesn't mean that you didn't modify the box. Again, print 
 the object memory address, and you will see what I am talking 
 about.

 -Alex
I legitimately can't tell if you are an idiot or a troll.
May 12
parent reply 12345swordy <alexanderheistermann gmail.com> writes:
On Wednesday, 12 May 2021 at 15:31:29 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 14:35:27 UTC, 12345swordy wrote:
 Replacing the item in the box with the different yet exact 
 same item, doesn't mean that you didn't modify the box. Again, 
 print the object memory address, and you will see what I am 
 talking about.

 -Alex
I legitimately can't tell if you are an idiot or a troll.
What kind of idiot that ignores official documentation provided by Microsoft that clearly states that classes are reference types not value types!? Your coding examples does NOT DISPROVE THIS NOTATION WHATSOEVER!!!! -Alex
May 12
parent Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:
On Wednesday, 12 May 2021 at 15:41:31 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 15:31:29 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 14:35:27 UTC, 12345swordy wrote:
 Replacing the item in the box with the different yet exact 
 same item, doesn't mean that you didn't modify the box. 
 Again, print the object memory address, and you will see what 
 I am talking about.

 -Alex
I legitimately can't tell if you are an idiot or a troll.
What kind of idiot that ignores official documentation provided by Microsoft that clearly states that classes are reference types not value types!? Your coding examples does NOT DISPROVE THIS NOTATION WHATSOEVER!!!! -Alex
I think you are both talking about the same thing. I think what he meant by half value type, half reference type is that the variables/function parameters themselves are references to the data an object has, and that reference is basically a value type, while the actual object data is stored in memory at the address found in the variable/parameter, and these half value/half reference semantics are packaged in a single type, which cannot be broken apart. I.e. you can't have a variable that is just a simple pointer to some heap memory, and you also can't have a variable that actually contains the object's data on the stack, like in C++ for example. This is the same thing as what you meant by classes being reference types; he just went a level lower into the implementation of so-called reference types. Best regards, Alexandru.
May 12
prev sibling parent 12345swordy <alexanderheistermann gmail.com> writes:
On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that 
 happens to have the same value here. There is no guarantee 
 that they will share the same address.

 - Alex
You might want to reconsider how sure of yourself you are. For instance by opening https://replit.com/languages/csharp and running the following code in there:

using System;

class A {
    int i;
    public A(int i_) {
        i = i_;
    }

    public int getI() {
        return i;
    }
}

class Program {
    static void Main(string[] args) {
        A a = new A(15);
        Console.WriteLine(a.getI());
        foo(a);
        Console.WriteLine(a.getI());
    }

    static void foo(A ainfoo) {
        ainfoo = new A(23);
        Console.WriteLine(ainfoo.getI());
    }
}
You are conflating passing an argument by value/reference with the concept of value/reference types. They are not the same thing. "Don't confuse the concept of passing by reference with the concept of reference types. The two concepts are not the same. A method parameter can be modified by ref regardless of whether it is a value type or a reference type. There is no boxing of a value type when it is passed by reference." https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/ref -Alex
May 12
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 9:46 PM, deadalnix wrote:
 This whole model in C++ is unsound. It's easy to show. In you above 
 example, the this pointer, typed as Widget*, points to an instance of a 
 subclass of Widget. If you were to assign a Widget to that pointer 
 (which you can do, this is a pointer to a mutable widget), then any 
 references to that widget using a subtype of Widget is now invalid.
All of this is bizarrely incorrect. Care to elaborate?
May 12
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 11:41:20 UTC, Andrei Alexandrescu 
wrote:
 On 5/11/21 9:46 PM, deadalnix wrote:
 This whole model in C++ is unsound. It's easy to show. In you 
 above example, the this pointer, typed as Widget*, points to 
 an instance of a subclass of Widget. If you were to assign a 
 Widget to that pointer (which you can do, this is a pointer to 
 a mutable widget), then any references to that widget using a 
 subtype of Widget is now invalid.
All of this is bizarrely incorrect. Care to elaborate?
Consider the following: https://godbolt.org/z/8vzx9W56a This is a clear demonstration that C++'s type system is unsound here. It is unsound because it has the property that you mentioned in your post: the pointer is monomorphic and the value this pointer points to is polymorphic. This is simply unsound, you cannot separate the two (unless you make everything immutable). The pointer and the value must come together as a bundle, and that whole bundle (which is a value type containing a reference) is right.
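
The scenario described in the quoted paragraph (assigning a plain Widget through a Widget* that actually points at a subclass instance) can be sketched as below. This is not the code behind the godbolt link, whose contents are not reproduced in this thread; `Gadget`, its members and `overwrite` are hypothetical names, and whether the resulting state deserves to be called unsound is exactly what is being debated.

```cpp
struct Widget {
    int id = 0;
    virtual ~Widget() = default;
    virtual int extra() const { return 0; }
};

// Hypothetical subclass with state of its own.
struct Gadget : Widget {
    int payload = 42;
    int extra() const override { return payload; }
};

// Assigns a plain Widget through a pointer that may refer to a subclass.
// Only the Widget subobject is copied; the dynamic type and the subclass
// fields of the target are left as they were.
void overwrite(Widget* w) {
    Widget plain;
    plain.id = 7;
    *w = plain;
}
```

Calling `overwrite` on a `Gadget` leaves an object whose base part comes from one value and whose derived part comes from another, which is the mixed state the post points at.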
May 12
next sibling parent reply Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 12 May 2021 at 12:11:22 UTC, deadalnix wrote:
 This is a clear demonstration that C++'s type system is unsound 
 here.
In fairness all generic low level programming languages that are practical to use have somewhat unsound type systems. Only high level languages can be fully sound (detect invalid programs at runtime and abort). C++ was forced into this mold by C though... (You can have heavily constrained low level languages that are sound)
 It is unsound because it has the property that you mentioned in 
 your post: the pointer is monomorphic and the value this
What does monomorphic mean in this context? Why would not this hold: *Singleton <: *Anything I find the discussion very confusing at this point.
May 12
parent Ola Fosheim =?UTF-8?B?R3LDuHN0YWQ=?= <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 12 May 2021 at 13:35:11 UTC, Ola Fosheim Grøstad 
wrote:
 *Singleton <: *Anything
Typo :-D, I meant pointer-to-Singleton is subtype of pointer-to-Anything: Singleton* <: Anything*
May 12
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/12/21 8:11 AM, deadalnix wrote:
 On Wednesday, 12 May 2021 at 11:41:20 UTC, Andrei Alexandrescu wrote:
 On 5/11/21 9:46 PM, deadalnix wrote:
 This whole model in C++ is unsound. It's easy to show. In you above 
 example, the this pointer, typed as Widget*, points to an instance of 
 a subclass of Widget. If you were to assign a Widget to that pointer 
 (which you can do, this is a pointer to a mutable widget), then any 
 references to that widget using a subtype of Widget is now invalid.
All of this is bizarrely incorrect. Care to elaborate?
Consider the following: https://godbolt.org/z/8vzx9W56a This is a clear demonstration that C++'s type system is unsound here. It is unsound because it has the property that you mentioned in your post: the pointer is monomorphic and the value this pointers points to is polymorphic. This is simply unsound, you cannot separate the two (unless you make everything immutable). The pointer and the value must come together as a bundle, and that whole bundle (which is a value type containing a reference) is itself what is
Ah, now we're at slicing. Love these forum discussions!
May 12
prev sibling next sibling parent reply Meta <jared771 gmail.com> writes:
On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
wrote:
 I apologize for injecting myself into this conversation, but 
 with all due respect, what the hell are you talking about? 
 Everything Deadalnix is saying makes perfect sense - it's 
 basic type theory, and yet you're accusing him of moving 
 goalposts and making up definitions, etc. The problem is that 
 `isSomeString` doesn't respect the LSP and the template 
 constraints on the relevant stdlib functions for enums are a 
 hack to work around that. End of story. if `isSomeString` was 
 defined sensibly, these template constraint hacks would not 
 have to exist.
 
 All the bluster about `popFront` on enum strings, etc. is 
 completely irrelevant, and is a red herring anyway (as was 
 already explained).
 
 I'm sorry for being so blunt, but this conversation is painful 
 to read.
Being blunt is totally cool, but that doesn't make you right. There's no true subtyping or polymorphism with value semantics. This has been common knowledge in C++ - inheriting a value type is an antipattern for many reasons, and conversion operators are to be used carefully (and not as a substitute for subtyping) for many other reasons. With value types, it's all static typing, no polymorphism, no LSP beyond what's called ad-hoc polymorphism in the classic Cardelli et al paper (http://poincare.matf.bg.ac.rs/~smalkov/files/old/fp.r344.2016/public/predavanja/FP.cas.2016.07%20-%20p471-cardelli.pdf).
Of course, but I thought the conversation was about strings, not value types. Last I checked, strings are reference types, in the same way that Java objects are reference types.
 What can be aimed for with values is called "parametric 
 polymorphism" (which is NOT subtyping) by the same paper:
The nice thing about D's template constraints, though, is that it allows us to impose subtype polymorphism on a parametrically polymorphic function.
 "Parametric polymorphism is obtained when a function works 
 uniformly on a range of types; these types normally exhibit 
 some common structure."

 That works if and only if you can reasonably supplant the same 
 primitives across said range of types. With enums that's 
 onerous; as soon as you "derive" an enum from int you figure 
 that ++x can't reasonably be implemented. Same goes for enum 
 strings - you can't implement the expected string primitives so 
 substitutability is out the window.
++x still fulfills the contract that the derived enum has inherited from `int`: `++: int -> int`. It easily passes the substitutability test. Likewise, enums with a base type of string fulfill all the same contracts that `string` does. Nowhere in the contract of the string type does it specify that `s[1..$]` returns a value of the same type as `s`, just of type `string`, which a string enum does.
 Values are monomorphic.
Are you saying that all values are monomorphic, or that _value types_ are monomorphic?
 Years ago I found a bug in a large C++ system that went like 
 this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that 
 the current object is actually a Widget and not something 
 derived from it, who forgot to override clone() once again.

 The problem was the code was doing exactly what it shouldn't 
 have, yet the assert was puzzlingly passing. Since everyone 
 here is great at teaching basic type theory
Just so we're clear, my previous post was not trying to insinuate that I am an expert in type theory and you are just too ignorant to understand the arguments presented. I don't claim to be anything close to an expert and only know the basics, and you're the one with the doctorate here.
 it's an obvious problem - the fix is:

         assert(typeid(*this) == typeid(Widget));

 Then the assertion started failing as expected. Following that, 
 I've used that example for years in teaching and to invariably 
 there are eyes going wide when they hear that C++ pointers are 
 monomorphic, it's the pointed-to values that are polymorphic, 
 and that's an essential distinction. (In D, just like in Java, 
 classes take care of that indirection automatically, which can 
 get some confused.)
You just said a paragraph back that values are monomorphic. So are pointed-to values monomorphic or polymorphic? This isn't a gotcha; I'm just confused about which you meant. I think the point you are trying to make with this story is that an operation on an enum that returns the base type will lead to confusing/wrong behaviour and allowing it for template functions which are meant to take strings would be bad design, just like it was with Widget.clone(). Is that right?
May 11
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 10:36 PM, Meta wrote:
 Last I checked, strings are reference types, in the same way that Java 
 objects are reference types.
Just by means of clarification, that's not true because the length is stored with the pointer. This occasionally trips folks starting with D.
May 12
parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Wednesday, May 12, 2021 5:52:28 AM MDT Andrei Alexandrescu via Digitalmars-
d wrote:
 On 5/11/21 10:36 PM, Meta wrote:
 Last I checked, strings are reference types, in the same way that Java
 objects are reference types.
Just by means of clarification, that's not true because the length is stored with the pointer. This occasionally trips folks starting with D.
To be more precise, a dynamic array in D is essentially

struct Array(T)
{
    size_t length;
    T* ptr;
}

So, the length is stored directly in the struct, and the data is referenced via the pointer stored in the struct. As such, we often refer to a D dynamic array as a pseudo-reference type. Either way, while the way it's put together has some very useful properties (like making it so that multiple dynamic arrays can be slices of the same data), there's no question that it can be confusing at first. And of course, that extends to strings, since D strings are dynamic arrays.

- Jonathan M Davis
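
The "pseudo-reference" behaviour of that layout can be sketched with a rough C++ model. This is an assumption-laden analogy, not D's actual runtime type: the `Array` template mirrors the two fields named above, and `drop_first` is a hypothetical helper that imitates slicing off the first element.

```cpp
#include <cstddef>

// Rough C++ model of the described layout: a length plus a pointer.
template <typename T>
struct Array {
    std::size_t length;
    T* ptr;
};

// Takes the struct by value, so the length and pointer are copied.
// Adjusting them affects only the copy; the elements stay shared.
template <typename T>
Array<T> drop_first(Array<T> a) {
    ++a.ptr;
    --a.length;
    return a;
}
```

The caller's length is unaffected by `drop_first`, yet a write through the returned slice is visible through the original, which is why neither "value type" nor "reference type" describes it fully.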
May 12
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 10:36 PM, Meta wrote:
 ++x still fulfills the contract that the derived enum has inherited from 
 `int`: `++: int -> int`.
No, that would be ref int -> ref int, which has consequences.
May 12
prev sibling next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 11.05.21 23:36, Andrei Alexandrescu wrote:
 On 5/11/21 2:37 PM, Meta wrote:
 On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote:
 Again with moving the goalposts.
To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.
I apologize for injecting myself into this conversation, but with all due respect, what the hell are you talking about? Everything Deadalnix is saying makes perfect sense - it's basic type theory, and yet you're accusing him of moving goalposts and making up definitions, etc. The problem is that `isSomeString` doesn't respect the LSP and the template constraints on the relevant stdlib functions for enums are a hack to work around that. End of story. if `isSomeString` was defined sensibly, these template constraint hacks would not have to exist. All the bluster about `popFront` on enum strings, etc. is completely irrelevant, and is a red herring anyway (as was already explained). I'm sorry for being so blunt, but this conversation is painful to read.
Being blunt is totally cool, but that doesn't make you right. ...
Deadalnix is saying that there is a subtyping relationship for rvalues, while you are pointing out that there is no subtyping relationship for lvalues. I think those are both correct. (Type theory has no notion of lvalues or rvalues, so those would indeed have to be interpreted as different types.) I fail to see why the semantics of lvalues should have any bearing on format strings even though I understand why most of Phobos might want to assume isSomeString talks about lvalues of the type.
 There's no true subtyping or polymorphism with value semantics.
 ...
There's certainly subtyping. The point about "polymorphism" (in type theory, polymorphism typically refers to parametric polymorphism, but I guess you mean existential types), is a bit more tricky. I guess the point is that a language that wants `f(σ) ⊆ ∃τ. f(τ)` to hold without any runtime semantics can't support data types whose values do not embed runtime type info. However, it can certainly support value types, even value types that are stored without indirections.
 the assert was puzzlingly passing. Since everyone here is great at 
 teaching basic type theory, it's an obvious problem - the fix is:
 
          assert(typeid(*this) == typeid(Widget));
 ...
That's a C++ quirk. Not much to do with type theory. In fact, C++ may not be a great example for illustration, as its type system is unsound.
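
The difference between the two asserts quoted above can be reproduced in a small sketch. `FancyWidget` is a hypothetical subclass standing in for "something derived from it", and the two member functions are illustrative names that return the comparison results instead of asserting.

```cpp
#include <typeinfo>

struct BaseWidget {
    virtual ~BaseWidget() = default;
};

struct Widget : BaseWidget {
    // Compares the static type of the pointer itself. Inside Widget,
    // `this` always has type Widget*, so this is true for subclasses too.
    bool pointer_check() { return typeid(this) == typeid(Widget*); }

    // Compares the dynamic type of the pointed-to object, which is
    // what a monomorphism test needs.
    bool object_check() { return typeid(*this) == typeid(Widget); }
};

// Hypothetical subclass that does not override anything.
struct FancyWidget : Widget {};
```

For a `FancyWidget`, `pointer_check()` still returns true while `object_check()` returns false, matching the report that the first assert passed and the corrected one started failing.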
May 11
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/11/21 10:39 PM, Timon Gehr wrote:
 Deadalnix is saying that there is a subtyping relationship for rvalues, 
 while you are pointing out that there is no subtyping relationship for 
 lvalues. I think those are both correct.
Well put. Rvalues can afford the luxury to change representation (e.g. from byte to int or float to double) because they're only used once. So a passable polymorphism scheme can be implemented via coercion.
 (Type theory has no notion of 
 lvalues or rvalues, so those would indeed have to be interpreted as 
 different types.)
Hmmm... haven't looked in a while, but don't some of Java formalizations account for int, double etc. being values and consequently rvalues when passed around? (Though they can't be passed by reference so a formalization could get away with assuming int is a reference, e.g. ++x means "rebind reference x to a new reference to the value x + 1").
 I fail to see why the semantics of lvalues should have any bearing on 
 format strings even though I understand why most of Phobos might want to 
 assume isSomeString talks about lvalues of the type.
It doesn't, the format string is just a symptom. The problem is that we change (already did, and massively... >100 instances of StringTypeOf) the standard library to accommodate what I think is an unproductive form of genericity.
 There's no true subtyping or polymorphism with value semantics.
 ...
There's certainly subtyping. The point about "polymorphism" (in type theory, polymorphism typically refers to parametric polymorphism, but I guess you mean existential types), is a bit more tricky. I guess the point is that a language that wants `f(σ) ⊆ ∃τ. f(τ)` to hold without any runtime semantics can't support data types whose values do not embed runtime type info. However, it can certainly support value types, even value types that are stored without indirections.
One matter is to distinguish what can be done from what D has already done and cannot change. For example, I tried some code just now and was... surprised.

Meta mentioned that increment works with enums, and lo and behold it does:

void main() {
    import std;
    enum X : int { x = 10, y = 20 }
    X x;
    writeln(x);
    ++x;
    writeln(x);
}

That prints "x" and then "cast(X)11". Meaning you can easily write a program that takes you outside enumerated values without a cast, which somewhat dilutes the value of "final switch" and the general notion that enumerated types are a small closed set. Arguably ++ should not be allowed on enumerated values.

Surprises go on:

void main() {
    import std;
    enum X : string { x = "Hello, world!", y = "xyz" }
    X x;
    writeln(x);
    x = x[1 .. $];
    writeln(x);
}

That prints:

x
cast(X)ello, world!

which showcases, as a little distraction, a bug in the formatting of enums - the string should be quoted properly.

But the larger point is that enum types derived from string actually allow, again, stepping outside their universe with ease. This cramps my style somewhat because during the whole discussion I assume that doesn't work, or at least shouldn't. I guess an argument could be built that its semantics is what it is.

Anyway, the other side of the argument that got ignored is the alias this thing:

void main() {
    import std;
    static struct X {
        string fun();
        alias fun this;
    }
    X x;
    x = x[1 .. $];
}

This doesn't compile; the slice does, but the assignment doesn't. Which means there are differences in what would be expected of a string (or, as it turns out, an enum string) and what would be expected of a type that converts to string by means of alias this.
May 12
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 13:39:50 UTC, Andrei Alexandrescu 
wrote:
 One matter is to distinguish what can be done from what D has 
 already done and cannot change. For example, I tried some code 
 just now and was... surprised.

 Meta mentioned that increment works with enums, and lo and 
 behold it does:

 void main() {
     import std;
     enum X : int { x = 10, y = 20 }
     X x;
     writeln(x);
     ++x;
     writeln(x);
 }

 That prints "x" and then "cast(X)11". Meaning you can easily 
 write a program that takes you outside enumerated values 
 without a cast, which somewhat dilutes the value of "final 
 switch" and the general notion that enumerated types are a 
 small closed set. Arguably ++ should not be allowed on 
 enumerated values.

 Surprises go on:

 void main() {
     import std;
     enum X : string { x = "Hello, world!", y = "xyz" }
     X x;
     writeln(x);
     x = x[1 .. $];
     writeln(x);
 }

 That prints:

 x
 cast(X)ello, world!

 which showcases, as a little distraction, a bug in the 
 formatting of enums - the string should be quoted properly.

 But the larger point is that enum types derived from string 
 actually allow, again, stepping outside their universe with 
 ease.
I've raised these problems on a regular basis for years now. This is obviously another instance where things are unsound, and it needs to be fixed.

Last time we had a discussion on the matter, it went in a loop that is best summarized as this:

    enum E : int { A, B, C }

    while (true) {
       Me: A | B ought to be an int, not an E.
       W&A: But you need it to be an enum, so that you can do things like combining flags and stay. As in:
          enum Mode { Read, Write }
          openFile(file, Mode.Read | Mode.Write);
       Me: Well then, you can't have final switch, because you don't have the guarantee it relies on.
       W&A: final switch is very much needed, for X, Y, Z reasons.
    }

This is extremely tiresome and kinda looks like the current discussion (or another one would be the in contract needing to be statically bound, where Timon and myself had to fish for Bertrand Meyer because nothing short of an argument from authority could do the trick).

So if we get nothing else out of that discussion, fixing enums so that they don't go out of the allowed set of values would be nice. It's just unfortunate that it takes literally 5 years+ to get to a point where this is even acknowledged as being an issue.

I hope we can somehow shorten that process, because it's not workable as it is. You have people around like Timon and myself who have an eye for this. It's free brainpower you are not leveraging.
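For comparison only, here is how C++ (not D) types the expression discussed above. This is a minimal sketch under stated assumptions: `Mode`, `combine`, and `as_mode` are hypothetical names, and the enumerators are given the values 1 and 2 so they behave as flags. With an unscoped C++ enum, `Read | Write` is promoted to `int`, and getting back to the enum takes an explicit cast, which is roughly the typing argued for in this post.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical flag enum, loosely mirroring the Mode example above.
// The values 1 and 2 are an assumption made so the bits do not overlap.
enum Mode : int { Read = 1, Write = 2 };

// Combining two enumerators promotes both operands to int.
static_assert(std::is_same<decltype(Read | Write), int>::value,
              "Read | Write has type int, not Mode");

// Returns the combined flags as a plain int; no cast is needed.
inline int combine(Mode a, Mode b) { return a | b; }

// Going back to Mode requires an explicit cast, so stepping outside the
// set of named enumerators is at least visible in the source.
inline Mode as_mode(int bits) { return static_cast<Mode>(bits); }
```

The point of the comparison is only that the result of `|` does not silently remain an enum; whether D should adopt the same rule is exactly what the thread is debating.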
May 12
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/12/21 11:30 AM, deadalnix wrote:
 Last time we had a discussion on the matter, it went in a loop that is 
 best summarized as this:
 enum E : int { A, B, C }
 
 while (true) {
    Me: A | B ought to be an int, not an E.
    W&A: But you need it to be an enum, so that you can do things like 
 combining flags and stay. As in:
      enum Mode { Read, Write }
      openFile(file, Mode.Read | Mode.Write);
    Me: Well then, you can't have final switch, because you don't have the 
 guarantee it relies on.
    W&A: final switch is very much needed, for X, Y, Z reasons.
 }
I know this is Walter's take, but please don't ascribe it to me as well. I could at the very best give a nod to practicality, but I very much think that typing binary "or" on enums as the enum is a kludge.
 This is extremely tiresome and kinda looks like the current discussion 
 (or another one would be the in contract needing to be statically bound, 
 where Timon and myself had to fish for Bertrand Meyer because nothing 
 short of an argument from authority could do the trick).
 
 So if we get nothing else out of that discussion, fixing enum so that 
 they don't go out of the allowed set of values would be nice. It's just 
 unfortunate that it takes literally 5 years+ to get to a point where 
 this is even acknowledged as being an issue.
This reach for credit here does not seem very well deserved.
 I hope we can somehow shorten that process, because it's not workable as 
 it is. You have people around like Timon and myself who have an eye for 
 this. It's free brainpower you are not leveraging.
I will say what follows with the utmost respect. I think Timon is way better at these things (like in, incomparably better) than you and me combined. He most certainly is less skilled than you at other things, but as far as PL theory in this group goes, he and Paul are the only game in town.
May 12
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 12 May 2021 at 18:35:49 UTC, Andrei Alexandrescu 
wrote:
 I will say what follows with the utmost respect. I think Timon 
 is way better at these things (like in, incomparably better) 
 than you and me combined. He most certainly is less skilled 
 than you at other things, but as far as PL theory in this group 
 goes, he and Paul are the only game in town.
You are so wonderful at being inclusive... :-P

I have never seen anyone in these forums who hasn't said things about PL theory that are either wrong or lack nuance. That applies to Andreis, Timons, Pauls alike...

However, since most people here do not have a comp.sci. background, it would be nice if we stopped hiding behind terminology (which people will perceive differently even if they have a comp.sci. background, which is why papers use references). deadalnix is explaining how he uses the terms, which makes the thread more inclusive for all. Your dismissal is not helpful.
May 12
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 18:35:49 UTC, Andrei Alexandrescu 
wrote:
 I will say what follows with the utmost respect. I think Timon 
 is way better at these things (like in, incomparably better) 
 than you and me combined. He most certainly is less skilled 
 than you at other things, but as far as PL theory in this group 
 goes, he and Paul are the only game in town.
It's fine, then just listen to him and not to me. That already would be a vast improvement over the current state of affairs.
May 12
prev sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 12 May 2021 at 02:39:23 UTC, Timon Gehr wrote:
          assert(typeid(*this) == typeid(Widget));
 ...
That's a C++ quirk. Not much to do with type theory. In fact, C++ may not be a great example for illustration, as its type system is unsound.
It isn't a quirk. To get dynamic lookup you need to add a virtual member.

    #include <iostream>
    #include <typeinfo>

    class A {
    public:
        // The virtual member makes A polymorphic, so typeid(*this)
        // is resolved at run time.
        virtual void nothing() {}
        void test() {
            std::cout << typeid(*this).name() << std::endl;
            std::cout << typeid(A).name() << std::endl;
        }
    };

    class B : public A {};

    void test_typeinfo() {
        B b{};
        b.test();
    }
May 12
prev sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
wrote:
 Values are monomorphic. Years ago I found a bug in a large C++ 
 system that went like this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that 
 the current object is actually a Widget and not something 
 derived from it, who forgot to override clone() once again.
I don't understand what you mean by pointers being monomorphic. this will always have the type of the class it was defined in. So the assert will always hold. How is this surprising??? What is more dangerous is that if you forget to add a virtual member then *this will also always hold as being a Widget. That is the result of C++ being a low-level language. No sensible high level language would allow such semantics.
May 12
parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 14:52:42 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
 wrote:
 Values are monomorphic. Years ago I found a bug in a large C++ 
 system that went like this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that 
 the current object is actually a Widget and not something 
 derived from it, who forgot to override clone() once again.
I don't understand what you mean by pointers being monomorphic.
Ok, consider the following.

    class A {};
    class B: public A {};

    A *a = new B();

typeid(a) is A*. In C++, a is monomorphic.
typeid(*a) is B. In C++, *a is polymorphic.
May 12
parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 12 May 2021 at 15:35:26 UTC, deadalnix wrote:
 Ok, consider the following.

 class A {};
 class B: public A {};

 A *a = new B();

 typeid(a) is A*. In C++, a is monomorphic.
 typeid(*a) is B. In C++, *a is polymorphic.
Sadly, IIRC typeid(*a) is A, because A does not contain a virtual member...

typeid(a) is A*, because that is the type of the pointer. However, the relationship between B* and A* is polymorphic, because you can use B* in the context where you expect A*? E.g. you can call a function that expects parameter A* with a pointer B*. So that makes the relationship polymorphic?

I have to admit I never use the terminology monomorphic and polymorphic, so my understanding could be wrong. If so, I am probably not alone in the thread, so for the sake of other readers, maybe someone can provide a definition for monomorphic?
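The difference between the two readings above can be checked directly. The following is a small C++ sketch; the type and function names (`PlainBase`, `VirtualBase`, and so on) are hypothetical and chosen only for illustration. It shows that `typeid(*p)` reports the derived type only when the base class has at least one virtual member, while `typeid(p)` is always the declared pointer type.

```cpp
#include <cassert>
#include <typeinfo>

// No virtual members: typeid(*p) is resolved from the static type.
struct PlainBase {};
struct PlainDerived : PlainBase {};

// One virtual member makes the class polymorphic, so typeid(*p)
// is looked up at run time.
struct VirtualBase { virtual ~VirtualBase() {} };
struct VirtualDerived : VirtualBase {};

// Does typeid(*p) report the derived type through a base pointer
// when the base has no virtual member?
inline bool plain_sees_derived() {
    PlainDerived d;
    PlainBase* p = &d;
    return typeid(*p) == typeid(PlainDerived);
}

// Same question for a base class with a virtual member.
inline bool virtual_sees_derived() {
    VirtualDerived d;
    VirtualBase* p = &d;
    return typeid(*p) == typeid(VirtualDerived);
}

// The pointer itself always has its declared type.
inline bool pointer_type_is_static() {
    VirtualDerived d;
    VirtualBase* p = &d;
    return typeid(p) == typeid(VirtualBase*);
}
```

Under these definitions `plain_sees_derived()` is false and the other two functions return true, which matches the observation that a class without virtual members gets no dynamic type lookup.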
May 12
next sibling parent reply Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad 
wrote:
 typeid(a) is A*, because that is the type of the pointer. 
 However, the relationship between B* and A* is polymorphic, 
 because you can use B* in the context where you expect A*? E.g. 
 you can call a function that expects parameter A* with a 
 pointer B*. So that makes the relationship polymorphic?
To be more precise. B* is a subtype of A* if you can use B* in contexts where A* is expected, which is polymorphic in nature. More interestingly, pure OO-languages like Beta provide type-variables. C++/D lack those. So in such languages you can bind new types to type-variables and therefore change the typing of elements of arrays and such in subclasses. (Which leads to other challenges, all languages seem to have some kind of challenge associated with them once they allow polymorphisms)
May 12
parent deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 16:46:40 UTC, Ola Fosheim Grøstad 
wrote:
 On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad 
 wrote:
 typeid(a) is A*, because that is the type of the pointer. 
 However, the relationship between B* and A* is polymorphic, 
 because you can use B* in the context where you expect A*? 
 E.g. you can call a function that expects parameter A* with a 
 pointer B*. So that makes the relationship polymorphic?
To be more precise. B* is a subtype of A* if you can use B* in contexts where A* is expected, which is polymorphic in nature.
I would say it is a subtype, yes, but polymorphism implies that there are several ways to see the same thing, which, as Andrei points out, implies that you go through a reference somewhere.
May 12
prev sibling parent reply deadalnix <deadalnix gmail.com> writes:
On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad 
wrote:
 I have to admit I never use the terminology monomorphic and 
 polymorphic, so my understanding could be wrong. If so, I am 
 probably not alone in the thread, so for the sake of other 
 readers, maybe someone can provide a definition for monomorphic?
It's quite simple. *a is polymorphic, because it is an object of type A as far as the user of *a is concerned, but it is actually an object of type B (or any other subtype of A).

a itself isn't polymorphic, because it is a pointer to an A no matter what. It is not a pointer to a B that is observed as if it was a pointer to an A. There is nothing more in it to be discovered at run time, it's just a pointer. Even if you do

    B *b = ...;
    A *a = b;

then you do not have an instance of polymorphism; you simply had a pointer to a B, and now you also have a pointer to an A.
May 12
parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Wednesday, 12 May 2021 at 16:49:12 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad 
 wrote:
 I have to admit I never use the terminology monomorphic and 
 polymorphic, so my understanding could be wrong. If so, I am 
 probably not alone in the thread, so for the sake of other 
 readers, maybe someone can provide a definition for 
 monomorphic?
It's quite simple. *a is polymorphic, because it is an object of type A as far as the user of *a is concerned, but it is actually an object of type B (or any other subtype of A). a itself isn't polymorphic, because it is a pointer to an A no matter what. It is not a pointer to a B that is observed as if it was a pointer to an A. There is nothing more in it to be discovered at run time, it's just a pointer.
I think I understand what you mean, but the terminology used is confusing me. A monomorphic function/operator works on only one type, but a polymorphic function/operator works on many types. It seems to me that A* can work on many types, but B* can only work on one type (if it has no subclasses). So wouldn't that make A* polymorphic in nature, but B* monomorphic in nature?

I've recently found it better (less baggage) to think in terms of protocols than classes. Then A* would be a pointer to something that provides the A-protocols. So when an A* pointer points to a B instance, we can think of it as if it points to the A-protocols that B provides. Maybe then you could claim that it is monomorphic, as it only binds to A-protocols. But that is not actually the case, as you have the ability to cast A* to B*. So then it would be polymorphic...?

I dunno. It seems to be a matter of perspective, if "monomorphic" means "of one form".
May 12
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, May 11, 2021 12:37:20 PM MDT Meta via Digitalmars-d wrote:
 On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu

 wrote:
 Again with moving the goalposts.
To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.
I apologize for injecting myself into this conversation, but with all due respect, what the hell are you talking about? Everything Deadalnix is saying makes perfect sense - it's basic type theory, and yet you're accusing him of moving goalposts and making up definitions, etc. The problem is that `isSomeString` doesn't respect the LSP and the template constraints on the relevant stdlib functions for enums are a hack to work around that. End of story. if `isSomeString` was defined sensibly, these template constraint hacks would not have to exist. All the bluster about `popFront` on enum strings, etc. is completely irrelevant, and is a red herring anyway (as was already explained). I'm sorry for being so blunt, but this conversation is painful to read.
Having isSomeString accept types that implicitly converted to string would be a disaster. Templates do not operate on implicit conversions - or even on subtypes. They operate on the exact type they're given. You can, of course, write a template constraint which checks for implicit conversions, but you still don't get the implicit conversion when the template is instantiated. You get the original type.

This has a number of implications, but in general, it leads to bugs if templates check for implicit conversions instead of exact types. In particular, any templated function which checks for an implicit conversion then needs to force the implicit conversion, or it will likely not work properly - be it because you get compilation errors, or because the original type compiles with the same code but does not behave the same way as the type from the implicit conversion which was not actually made.

In fact, IIRC, at one point, isSomeString _did_ work with enums, and we fixed it so that it didn't, because it was causing problems. Also, IIRC, it was my fault that it was ever made to work with enums, and I very much regret that.

In general, implicit conversions have no business in template constraints. Obviously, there are exceptions to that, but in general, there will be fewer bugs if the conversions are done explicitly by the code instantiating the template. The reason that it's done in Phobos as much as it is is primarily because of code that was originally not generic which was later templatized (often because it took string and was changed to work on multiple string types or to work on general ranges of characters). And in most cases where we've tried to templatize functions without breaking code, we've had problems because of the implicit conversions that worked before. std.traits.isConvertibleToString is one such abomination which came out of that (its use usually results in code that slices local variables and escapes them, which is really bad). IIRC, that was done by Walter, and if he's making mistakes like that with regards to implicit conversions and templated code, what do you think the average D programmer is doing?

The main reason for bringing up popFront and enums is to show that enums with a base type of string are not actually strings, and treating them as if they were causes serious problems. There are of course places where that sub-typing results in implicit conversions, but templates do not work that way, and trying to force it is very problematic. The proliferation of template constraint and static if complexity that Andrei is complaining about with regards to stuff like format is the result of that, and it's the kind of code that's very hard to get right. Simply not trying to support those implicit conversions with templated functions _significantly_ reduces the complexity of such code, with the only cost being that the code instantiating the template will have to use cast(string) on the enum value.

- Jonathan M Davis
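The same effect exists in C++ templates, which may make the argument easier to see for readers who know that language better. The sketch below is an analogy, not D code and not anything from Phobos: `Label`, `sample_label`, `accepts_convertible`, `instantiated_with_string`, and `length_of` are hypothetical names. `Label` stands in for an enum with a string base type, in that it converts implicitly to `std::string` without being one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical type that converts implicitly to std::string,
// standing in for an enum with a string base type.
struct Label {
    std::string text;
    operator std::string() const { return text; }
};

// Helper that builds a sample value for the checks below.
inline Label sample_label() { return Label{"abc"}; }

// A constraint that accepts anything convertible to std::string...
template <class T>
constexpr bool accepts_convertible() {
    return std::is_convertible<T, std::string>::value;
}

// ...does not change what the template is instantiated with:
// T is deduced as the original type, never as std::string.
template <class T>
bool instantiated_with_string(const T&) {
    return std::is_same<T, std::string>::value;
}

// A non-template function does perform the implicit conversion,
// so the body really sees a std::string.
inline std::size_t length_of(const std::string& s) { return s.size(); }
```

Here `accepts_convertible<Label>()` is true, yet `instantiated_with_string(sample_label())` is false: passing the constraint does not perform the conversion. Converting at the call site, the counterpart of `cast(string)` in D, is what hands the template an actual string.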
May 12
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/12/21 6:36 AM, Jonathan M Davis wrote:
 Having isSomeString accept types that implicitly converted to string would
 be a disaster.
Sadly that's exactly what StringTypeOf does: https://run.dlang.io/is/8xqPKr We should eliminate all uses of StringTypeOf from phobos.
May 12
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu 
wrote:
 By the same token you can't make up your own definition of what 
 subtyping is and isn't. Value types and reference types are 
 well-trodden ground. You can't just claim new terminology and 
 then prove your own point by using it.
I simply removed an assumption that isn't relevant to the case I'm making, namely whether you consider ref string to be a type or not, because it doesn't affect the conclusion and therefore isn't a debate worth getting into.

You made the point that SomeEnumString cannot be considered a subtype of string because things start breaking when it is passed by ref, and I retort that the exact same things break in the exact same way for subtypes, making your argument moot.

You say "B is not a subtype of A because it exhibits behavior X when passed by ref".
I say "D is a known subtype of C, and it also exhibits behavior X when passed by ref, therefore X cannot be used as a justification that B isn't a subtype of A".

We can argue to no end about what is the right definition that should be used for X, but it really doesn't change the overall point that is being made.
May 11
prev sibling parent deadalnix <deadalnix gmail.com> writes:
On Tuesday, 11 May 2021 at 12:14:42 UTC, deadalnix wrote:
 On Tuesday, 11 May 2021 at 12:05:18 UTC, Andrei Alexandrescu 
 wrote:
 I'm not sure what's the problem is here. Do you have a 
 concrete example?
Of course. A range must implement popFront with the signature: void popFront(ref SomeEnumString s) { ... please fill in the implementation ... }
That must be a type error, this is a feature, not a bug. This is not expected to work.
I realize that this requires further explanation.

The fact that B is a subtype of A doesn't imply that a type constructed from B is a subtype of that same construction using A. For instance, A function() would be a subtype of B function(), the relation reversed in that example.

In your example, you are constructing a ref SomeEnumString and expecting it to be a subtype of string (or maybe ref string), but both are incorrect assumptions. This is because you can execute operations that require covariance as well as operations that require contravariance on a ref; therefore, it needs to be exactly the same type.

This is hardly an exceptional situation; this also happens when taking an array: B being a subtype of A doesn't mean that B[] is a subtype of A[]. Interestingly, it is the case for const ref, or const arrays, which is where the push toward handling const ref differently comes from.

In any case, format is not expected to modify the pattern it takes as an input. In fact, it is a god damn compile time parameter, it is not mutable to begin with. It is therefore expected that this works.
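The invariance of mutable references can be checked mechanically in C++, used here only as a stand-in for the D situation. In this sketch `Base` and `Derived` are hypothetical types, and the three constants are names made up for illustration. A pointer to `Derived` converts to a pointer to `Base`, but neither a mutable reference to that pointer nor a pointer to that pointer converts.

```cpp
#include <cassert>
#include <type_traits>

struct Base {};
struct Derived : Base {};

// Plain subtyping: a Derived* can be used where a Base* is expected.
constexpr bool pointer_converts =
    std::is_convertible<Derived*, Base*>::value;

// A mutable reference must match exactly. If a Base*& could bind to a
// Derived* variable, code could store a plain Base* into it (a write),
// and the variable would then point at something that is not a Derived.
constexpr bool ref_to_pointer_converts =
    std::is_convertible<Derived*&, Base*&>::value;

// The same reasoning rules out Derived** to Base**, which is the
// pointer-level analogue of the array case mentioned above.
constexpr bool pointer_to_pointer_converts =
    std::is_convertible<Derived**, Base**>::value;

static_assert(pointer_converts, "Derived* converts to Base*");
static_assert(!ref_to_pointer_converts, "Derived*& does not bind as Base*&");
static_assert(!pointer_to_pointer_converts, "Derived** is not a Base**");
```

Reading through the reference needs the conversion in one direction and writing needs it in the other, so only the exact type satisfies both.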
May 11
prev sibling parent reply Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:
On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 Popping the head out of an enum value ought to be a string, 
 not that enum's value. I don't really see where the problem is 
 here, this is subtyping 101.
So you have a range r of type T. You call r.popFront(). Obviously the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan? No, this is not subtyping 101.
This feels a bit like the real problem might be in the conflation of the container (the enum or the string) and the range? Cf. the way this is handled in Rust, where there is a clear distinction between a container, versus an iterator over that container: https://doc.rust-lang.org/rust-by-example/flow_control/for.html Note also the different ways that the iterator can be generated: either using a reference to the container itself, or by moving the container into the iterator so the container itself is consumed by the iteration.
May 10
next sibling parent Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:
On Monday, 10 May 2021 at 17:09:37 UTC, Joseph Rushton Wakeling 
wrote:
 Cf. the way this is handled in Rust, where there is a clear 
 distinction between a container, versus an iterator over that 
 container:
That is true for C++ and Python as well. C++ has begin(object)/end(object) and Python has iter(object).
May 10
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 5/10/21 1:09 PM, Joseph Rushton Wakeling wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 Popping the head out of an enum value ought to be a string, not that 
 enum's value. I don't really see where the problem is here, this is 
 subtyping 101.
So you have a range r of type T. You call r.popFront(). Obviously the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan? No, this is not subtyping 101.
This feels a bit like the real problem might be in the conflation of the container (the enum or the string) and the range? Cf. the way this is handled in Rust, where there is a clear distinction between a container, versus an iterator over that container: https://doc.rust-lang.org/rust-by-example/flow_control/for.html Note also the different ways that the iterator can be generated: either using a reference to the container itself, or by moving the container into the iterator so the container itself is consumed by the iteration.
True, D has only "orphan" ranges, no containers. std.container is not working out and with current D technology we can't define containers that work with safe/pure/nogc at the same time (two out of three we can). If you consider the enum string value a container and the string extracted from it a range of that container, I think that would be a valid way to look at the matter.
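The container/range reading suggested above can be sketched as follows. This is a C++ illustration rather than D, and `Greeting`, `text_of`, and `pop_front_of` are hypothetical names; the string values are borrowed from the enum example earlier in the thread. The enum value plays the role of the container, and the string extracted from it is the range that gets consumed.

```cpp
#include <cassert>
#include <string>

// Hypothetical enum whose values name fixed strings; the enum value
// plays the role of the "container".
enum class Greeting { hello, xyz };

// Returns the string associated with an enumerator.
inline std::string text_of(Greeting g) {
    return g == Greeting::hello ? "Hello, world!" : "xyz";
}

// Drops the first character of the extracted string. Only the string
// (the "range") shrinks; the enum value passed in is untouched, so the
// result is a plain string and never an out-of-set enum value.
inline std::string pop_front_of(Greeting g) {
    std::string range = text_of(g);
    if (!range.empty()) range.erase(0, 1);
    return range;
}
```

With this split, `pop_front_of(Greeting::hello)` yields `"ello, world!"` as an ordinary string, and `text_of(Greeting::hello)` still returns the full text afterwards.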
May 11
parent reply Paul Backus <snarwin gmail.com> writes:
On Tuesday, 11 May 2021 at 13:41:53 UTC, Andrei Alexandrescu 
wrote:
 True, D has only "orphan" ranges, no containers. std.container 
 is not working out and with current D technology we can't 
 define containers that work with safe/pure/nogc at the same 
 time (two out of three we can).
How much value does pure have here anyway? Typical container usage involves allocating from the global (!) heap, which arguably *should* be impure, hacks like `pureMalloc` notwithstanding.
May 11
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 11.05.21 16:38, Paul Backus wrote:
 allocating from the global (!) heap, which arguably *should* be impure
I think this is confusing different levels of abstraction. What should be impure is accessing memory addresses as integers.
May 11
prev sibling parent ruheladev40 <ruheladev400 gmail.com> writes:
I think it possibly makes sense to require either wrappers that 
clarify intent, or always treat enums the same way (as an enum). 
I think Phobos *mostly* does the latter. Erroring for ambiguity 
might be more disruptive than it's worth.
May 11