digitalmars.D - No we should not support enum types derived from strings

Andrei Alexandrescu (2/2) May 06 2021 We should remove all that rot from phobos pronto.

evilrat (3/5) May 06 2021 Just a commoner here, can you explain for stupid what makes enum

Andrei Alexandrescu (3/10) May 07 2021 Heavy toll on the infra for a very niche use case with trivial

deadalnix (5/7) May 07 2021 It seems like the toll comes from isSomeString to return false

Paul Backus (3/11) May 07 2021 "Is a string type" and "is implicitly convertible to a string

Andrei Alexandrescu (36/49) May 07 2021 Yah. It's really been a string (heh!) of suboptimal decisions.

Jacob Carlborg (6/12) May 07 2021 You can have enums with the base type being a struct or a class. How

Andrei Alexandrescu (5/17) May 07 2021 The solution to that is "We do not support enums". But if you use a

Jacob Carlborg (6/7) May 07 2021 If you're going to make strings a user defined type, how are you

Meta (3/9) May 07 2021 It really, really should be. Pattern matching and destructuring
Andrei Alexandrescu (2/8) May 07 2021 Built-in strings remain as they are.

Jon Degenhardt (21/25) May 07 2021 This is a bit orthogonal, but... An important characteristic of

Andrei Alexandrescu (4/32) May 07 2021 String s;

Jon Degenhardt (36/42) May 07 2021 That's not quite what I was getting at. But that's my fault. A
Walter Bright (8/11) May 09 2021 Already done:

Andrei Alexandrescu (3/17) May 09 2021 Problem being of course that there's no UDT String type, only the crappy...

guai (10/16) May 08 2021 In my experience treating a string as byte array is almost never

Berni44 (18/21) May 08 2021 It is not difficult to recognize this case and go back 1 to 3

Jon Degenhardt (3/15) May 08 2021 Exactly. All the ideas you listed apply. Parallelization is very
guai (14/31) May 08 2021 I ment this [combining

Adam D. Ruppe (19/23) May 08 2021 The thing is making the range be of dchars doesn't help with this.

guai (2/7) May 08 2021 At least it won't induce more problems

Adam D. Ruppe (7/10) May 08 2021 This is what Phobos already does and it has already created more

Max Haughton (2/13) May 08 2021 The opaque blob model also allows SSO much more easily.

Berni44 (27/50) May 08 2021 You are talking about generic algorithms that work for every

guai (4/8) May 08 2021 No cryptography is done on strings but instead on byte arrays.

Jon Degenhardt (23/40) May 08 2021 Data and log file processing are common cases. Single byte ascii

guai (12/55) May 08 2021 When you work with log files first you pull it in as a byte

Jon Degenhardt (15/21) May 08 2021 Sure you can. It's necessary to take of advantage of the

guai (10/15) May 08 2021 Those algorithms you talking about are either doesn't need

Jon Degenhardt (22/38) May 08 2021 I don't understand the point you are trying to make. Perhaps you

Q. Schroll (2/7) May 07 2021 True. But why have it easy when you can have it complicated?

Walter Bright (5/7) May 08 2021 Language lawyer point:

deadalnix (8/16) May 09 2021 Sorry to be blunt, but this is complete language layering fail.

Jon Degenhardt (60/71) May 11 2021 To try to put some focus on the user perspective, here's a sample

Andrei Alexandrescu (7/19) May 11 2021 Thanks. I agree it's confusing. The mystery gets elucidated with some

Andrei Alexandrescu (6/27) May 11 2021 Another unpleasant issue:

Walter Bright (4/11) May 11 2021 The representation of a named enum is its base type.

deadalnix (9/20) May 11 2021 Y.f7 is of type Y. It's representation is string, not

Walter Bright (2/22) May 11 2021 That's what I said.

Imperatorn (2/26) May 11 2021 🍿

Andrei Alexandrescu (4/27) May 12 2021 `representation` is a library function, so in a way we get to have a say...

Walter Bright (3/5) May 11 2021 That came about due to the decision to overload enum to create manifest

Per Nordlöw (7/9) May 07 2021 Can you describe the scope of the rottenness in terms of contexts
Steven Schveighoffer (23/26) May 07 2021 What do you mean "not support"? The language has enums derived from

Andrei Alexandrescu (4/14) May 07 2021 Enums derived from strings should not be supported as strings in the

Adam D. Ruppe (12/14) May 07 2021 I don't think the stdlib should special case much of anything.

Andrei Alexandrescu (5/22) May 07 2021 Yes

Jonathan M Davis (10/32) May 12 2021 Agreed. While implicit conversions can at times be useful, they cause a ...

deadalnix (17/26) May 09 2021 100% agreed, but, back to my original point, why is the enum

Andrei Alexandrescu (16/24) May 07 2021 Enums are poorly designed, but that's only a small part of the problem.

Steven Schveighoffer (14/43) May 07 2021 But an enum with base string type can be passed as a string. The PR in

Adam D. Ruppe (17/21) May 07 2021 "Can be passed as a" is not the same as "is a". There's a

Andrei Alexandrescu (5/10) May 07 2021 YES! Int is not floating point, but yes you can initiate a floating
Paul Backus (16/22) May 07 2021 We can already *almost* express this in the language. This code

Adam D. Ruppe (39/41) May 07 2021 eeeeh that's a compile time argument and it still isn't actually

Steven Schveighoffer (13/35) May 07 2021 But that's the intention of the function. format doesn't care what the

Adam D. Ruppe (42/45) May 07 2021 Well, one way we can do that today is to have the template

Adam D. Ruppe (4/7) May 07 2021 oh i should have added of course you can do the wchar and dchar
Steven Schveighoffer (9/12) May 07 2021 The most common range BY FAR in all of D code is an array.

Adam D. Ruppe (23/25) May 07 2021 int[5] arr;
Andrei Alexandrescu (3/21) May 07 2021 Yah, ranges are a generalization of arrays. It would be odd if the

NonNull (13/25) May 12 2021 No. Ranges are not a generalization of arrays unless you ignore

Paul Backus (9/19) May 12 2021 Ranges are a generalization of arrays (or slices, if you prefer)

NonNull (30/45) May 12 2021 This is the standard pattern of the interpretation of the meaning

Andrei Alexandrescu (14/58) May 07 2021 Well you see here is the problem. An enum with base string can be

Steven Schveighoffer (21/36) May 07 2021 Sorry, let's jump out of the fake dialog here for a second.

Steven Schveighoffer (5/11) May 07 2021 I forgot to finish this thought, got interrupted.

Daniel N (3/16) May 07 2021 What's wrong with this?

Adam D. Ruppe (17/19) May 07 2021 That doesn't convert to string. It allows it to compile because T
Steven Schveighoffer (4/22) May 07 2021 Because T is not a string.

Andrei Alexandrescu (12/35) May 07 2021 Of course. I understand that very well. But that's a minor confusion and...
Q. Schroll (10/25) May 07 2021 Maybe this is special casing here, but if you have a finite list

deadalnix (8/16) May 09 2021 Popping the head out of an enum value ought to be a string, not

Andrei Alexandrescu (7/20) May 09 2021 So you have a range r of type T.

deadalnix (5/11) May 10 2021 If you have a range of T, then you got to return a T. I'm not

Paul Backus (8/24) May 10 2021 popFront doesn't return a value, it mutates. So `r` before

deadalnix (33/40) May 10 2021 r = r[1 .. $] is an error unless r actually is a string. You

Andrei Alexandrescu (38/65) May 11 2021 If we move the goalposts we can with certain ease create the illusion

deadalnix (23/29) May 11 2021 I don't think that any of what you wrote is incorrect, and these

Andrei Alexandrescu (37/43) May 11 2021 Reasonable, though I should add that it's a decision made by the author

Andrei Alexandrescu (7/13) May 11 2021 Correx:
deadalnix (24/51) May 11 2021 It's debatable. There are many languages out there where it

Walter Bright (11/13) May 11 2021 D has no notion of a "special kind of type". It only has a notion of "im...

deadalnix (30/44) May 11 2021 Except, it is.

12345swordy (4/10) May 11 2021 Remove alias this support for classes and replace it with compile
Walter Bright (3/5) May 11 2021 Converting a derived class reference to a base class reference is an "im...

deadalnix (17/23) May 11 2021 That is trivially demonstrably false. Consider:

Paul Backus (29/53) May 12 2021 I concede the points that enum strings do not violate the LSP,

deadalnix (46/63) May 12 2021 That is true, and there are definitively cases where it is

Paul Backus (5/17) May 12 2021 This *does* work as expected: https://run.dlang.io/is/Ru9phk

deadalnix (20/24) May 12 2021 Yes, so we are getting at the root of this.

Paul Backus (23/37) May 12 2021 Well, no, it doesn't--because, again, the LSP doesn't apply here

deadalnix (33/61) May 13 2021 While what you say is correct, I'm not convinced it is right.

Andrei Alexandrescu (18/49) May 12 2021 I was all over run.dlang.org like "Sure that's not going to work... wait...

Paul Backus (11/19) May 12 2021 A template function, you mean? Because (as the rest of the post

Andrei Alexandrescu (37/60) May 12 2021 Well the problem is that the choice of covariance of results for

Jonathan M Davis (16/21) May 12 2021 Yeah, if enums are supposed to only have a fixed set of values, then the...
Jonathan M Davis (11/14) May 12 2021 Or more accurately, all operations on an enum which are not guaranteed t...

Alexandru Ermicioi (3/16) May 12 2021 So basically enum should implicitly be declared to be immutable
deadalnix (2/19) May 13 2021 YES!

deadalnix (48/64) May 10 2021 More to the point, consider this:

Ola Fosheim Grøstad (25/36) May 10 2021 Not sure how this applies to C++, what subtyping issues are you

deadalnix (21/48) May 10 2021 Function type don't have the right covariance/contravariance, you

Ola Fosheim Grøstad (35/48) May 10 2021 Yes, I think everyone can agree with this. A good starting point
Imperatorn (10/20) May 11 2021 +1
Andrei Alexandrescu (4/19) May 11 2021 In case you're referring to deprecating support for enum strings in
Mathias LANG (14/29) May 11 2021 Well, this thread is 11 pages and show no sign of winding down.

cmyka (7/17) May 11 2021 I support bringing these types of discussions to github (not
deadalnix (4/19) May 12 2021 If formats expects a string, then it is indeed the right thing to

Andrei Alexandrescu (15/88) May 11 2021 No it isn't.

deadalnix (21/25) May 11 2021 Here we hit at the core of the problem. A reference to a type B

Andrei Alexandrescu (6/21) May 11 2021 Of course. A range must implement popFront with the signature:

deadalnix (4/10) May 11 2021 That must be a type error, this is a feature, not a bug. This is

Andrei Alexandrescu (2/14) May 11 2021 Then enum strings are not ranges, correct?

deadalnix (5/6) May 11 2021 They are not. But they are strings. Which imply that string

Andrei Alexandrescu (2/7) May 11 2021 `ref string` is not a type.

deadalnix (14/23) May 11 2021 This is just denial.

Andrei Alexandrescu (3/26) May 11 2021 Again with moving the goalposts.

Andrei Alexandrescu (8/37) May 11 2021 To clarify: you can't make up your own definitions as you go so as to

Meta (16/25) May 11 2021 I apologize for injecting myself into this conversation, but with

Andrei Alexandrescu (41/67) May 11 2021 Being blunt is totally cool, but that doesn't make you right.

Ola Fosheim Grøstad (20/24) May 11 2021 I think you guys need to agree on what you mean by "type" and
deadalnix (26/50) May 11 2021 While this is indeed very interesting, this is missing the larger

12345swordy (3/59) May 11 2021 No, classes are reference types, structs are values types in c#.

deadalnix (11/13) May 11 2021 No, both are value type, but in the case of the class, the value

12345swordy (7/22) May 11 2021 Wrong.

12345swordy (5/32) May 11 2021 In layman terms, just because I can replace the item in the box

12345swordy (3/39) May 11 2021 Woops, meant to say "with the exact same item."

deadalnix (22/26) May 11 2021 You might want to reconsider how sure of yourself you are. For

12345swordy (12/19) May 11 2021 The code you posted, do not support your claim what so ever. When

deadalnix (9/29) May 12 2021 Before posting that email was the best time to run the code, look

12345swordy (12/39) May 12 2021 Like I said before, it does not support your claims, whatsoever.

deadalnix (2/7) May 12 2021 I legitimately can't tell if you are an idiot or a troll.

12345swordy (6/14) May 12 2021 What kind of idiot that ignores official documentation provided

Alexandru Ermicioi (20/35) May 12 2021 I think, you both talking about same thing. I think what he meant

12345swordy (10/37) May 12 2021 You are conflicting passing an argument by value/reference with

Andrei Alexandrescu (2/7) May 12 2021 All of this is bizarrely incorrect. Care to elaborate?

deadalnix (13/21) May 12 2021 Consider the following: https://godbolt.org/z/8vzx9W56a

Ola Fosheim Grøstad (11/15) May 12 2021 In fairness all generic low level programming languages that are

Ola Fosheim Grøstad (5/6) May 12 2021 Typo :-D, I meant pointer-to-Singeltong is subtype of

Andrei Alexandrescu (2/24) May 12 2021 Ah, now we're at slicing. Love these forum discussions!

Meta (30/93) May 11 2021 Of course, but I thought the conversation was about strings, not

Andrei Alexandrescu (3/5) May 12 2021 Just by means of clarification, that's not true because the length is

Jonathan M Davis (16/21) May 12 2021 To be more precise, a dynamic array in D is essentially

Andrei Alexandrescu (2/4) May 12 2021 No, that would be ref int -> ref int, which has consequences.

Timon Gehr (18/55) May 11 2021 Deadalnix is saying that there is a subtyping relationship for rvalues,

Andrei Alexandrescu (64/83) May 12 2021 Well put. Rvalues can afford the luxury to change representation (e.g.

deadalnix (31/67) May 12 2021 I've raised these problem on a regular basis for years now.

Andrei Alexandrescu (10/36) May 12 2021 I know this is Walter's take, but please don't ascribe it to me as well....

Ola Fosheim Grøstad (13/18) May 12 2021 You are so wonderful at being inclusive... :-P Never seen anyone
deadalnix (4/9) May 12 2021 It's fine, then just listen to him and not to me. That already

Ola Fosheim Grøstad (17/22) May 12 2021 It isn't a quirk. To get dynamic lookup you need to add a virtual

Ola Fosheim Grøstad (10/22) May 12 2021 I don't understand what you mean by pointers being monomorphic.

deadalnix (8/25) May 12 2021 Ok, consider the following.

Ola Fosheim Grøstad (12/18) May 12 2021 Sadly, IIRC typeid(*a) is A, because A does not contain a virtual

Ola Fosheim Grøstad (11/16) May 12 2021 To be more precise. B* is a subtype of A* if you can use B* in

deadalnix (5/14) May 12 2021 I would say it is a sybtype, yes, but polymorphism imply that

deadalnix (15/19) May 12 2021 It's quite simple.

Ola Fosheim Grøstad (17/32) May 12 2021 I think I understand what you mean, but the terminology used is

Jonathan M Davis (44/71) May 12 2021 Having isSomeString accept types that implicitly converted to string wou...

Andrei Alexandrescu (3/5) May 12 2021 Sadly that's exactly what StringTypeOf does: https://run.dlang.io/is/8xq...

deadalnix (18/22) May 11 2021 I simply removed an assumption that isn't relevant to the case

deadalnix (22/34) May 11 2021 I realize that this require further explanations.

Joseph Rushton Wakeling (11/20) May 10 2021 This feels a bit like the real problem might be in the conflation

Ola Fosheim Grøstad (4/7) May 10 2021 That is true for C++ and Python as well. C++ has
Andrei Alexandrescu (7/33) May 11 2021 True, D has only "orphan" ranges, no containers. std.container is not

Paul Backus (6/10) May 11 2021 How much value does pure have here anyway? Typical container

Timon Gehr (3/4) May 11 2021 I think this is confusing different levels of abstraction. What should

ruheladev40 (4/4) May 11 2021 I think it makes possible sense to require either wrappers that

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

We should remove all that rot from phobos pronto.

https://github.com/dlang/phobos/pull/8029

May 06 2021

evilrat <evilrat666 gmail.com> writes:

On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029

Just a commoner here, can you explain for stupid what makes enum 
string a no go and why it should begone?

May 06 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 2:03 AM, evilrat wrote:
 On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029

 
 Just a commoner here, can you explain for stupid what makes enum string 
 a no go and why it should begone?

Heavy toll on the infra for a very niche use case with trivial 
workarounds on the user side.

May 07 2021

deadalnix <deadalnix gmail.com> writes:

On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
 Heavy toll on the infra for a very niche use case with trivial 
 workarounds on the user side.

It seems like the toll comes from isSomeString returning false 
for these enums, no? What is the root cause of this not working?

It doesn't seem like this should be a special case anywhere; it 
should just work.

May 07 2021

Paul Backus <snarwin gmail.com> writes:

On Friday, 7 May 2021 at 12:06:43 UTC, deadalnix wrote:
 On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu 
 wrote:
 Heavy toll on the infra for a very niche use case with trivial 
 workarounds on the user side.

 It seems like the toll comes from isSomeString returning false 
 for these enums, no? What is the root cause of this not working?

 It doesn't seem like this should be a special case anywhere; it 
 should just work.

"Is a string type" and "is implicitly convertible to a string 
type" are not the same thing.
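
The distinction can be checked directly with `is` expressions. The sketch below is illustrative only: the enum `Color` and the helpers `takesString` and `deducedAsString` are made-up names for this example, not anything taken from Phobos or the linked PR.

```d
// Hypothetical enum with a string base type, for illustration only.
enum Color : string { red = "red", green = "green" }

// A non-templated function: the enum converts to string at the call site.
size_t takesString(string s) { return s.length; }

// A template sees the enum type itself, not string.
bool deducedAsString(T)(T) { return is(T == string); }

void main()
{
    static assert(!is(Color == string)); // Color is not the type string...
    static assert( is(Color : string));  // ...but it implicitly converts to it

    assert(takesString(Color.green) == 5);
    assert(deducedAsString("red"));
    assert(!deducedAsString(Color.red));

    string s = Color.red;                // fine: enum to its base type
    assert(s == "red");

    // The conversion only goes one way: a string is not a Color.
    static assert(!__traits(compiles, { Color c = "red"; }));
}
```

A non-templated function never notices the enum, while a template that inspects `T` does, which is where generic library code has to make a choice.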

May 07 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 10:16 AM, Paul Backus wrote:
 On Friday, 7 May 2021 at 12:06:43 UTC, deadalnix wrote:
 On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
 Heavy toll on the infra for a very niche use case with trivial 
 workarounds on the user side.

 It seems like the toll comes from isSomeString returning false for 
 these enums, no? What is the root cause of this not working?

 It doesn't seem like this should be a special case anywhere; it should 
 just work.

 
 "Is a string type" and "is implicitly convertible to a string type" are 
 not the same thing.

Yah. It's really been a string (heh!) of suboptimal decisions.

1. We wanted strings to be synonymous with built-in slices of char. "Users 
should not need to define their own string type!" This has been D's 
billion-dollar mistake.

2. Representing strings as char[] meant GC is a must and also there's 
long-distance coupling between callers and callees whenever strings are 
passed about: a callee may modify characters in the caller's string. 
Such changes could have been absolutely trivially disallowed with a 
user-defined string type, but see (1) and did I mention D's billion-dollar 
mistake?

3. So yours truly (shudder) came up with the idea of doing strings as 
immutable(char)[] so that people can pass strings around, no coupling, 
no problem. GC is still a must. That satisfies (1) but bought us into 
the entire qualifiers business, which, any way I look at it, did not 
produce enough dividends compared to the effort put into it and the 
massive complications added to the language. (Aside: inout is the 
weirdest thing. How could we ever think that that was a good idea.)

4. When doing generic string functions for phobos, it made sense to 
support... oh wait a second we have so many string types. char[], 
wchar[], dchar[], each in triplicate because of const and immutable. So 
right off the bat we decided to support 9 string types. That was another 
mistake because nobody cares about wchar and dchar. Anyway, that's how 
isSomeChar and isSomeString were born.

5. Then came the question of ranges that have one of those 9 character 
types as elements... those should be supported too, no? IIRC at least a 
subset of phobos supports that stuff.

6. Then of course someone figured, wait a second, what about enums 
derived from strings and user-defined types that have an alias this as 
string... those deserve attention too, right? And right here we had 
descended into madness.
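
Points 2 to 4 can be made concrete with a small sketch. The function `shout` is a hypothetical callee invented for this example; the `isSomeString` checks use the real `std.traits` template.

```d
import std.meta : AliasSeq;
import std.traits : isSomeString;

// Hypothetical callee that writes through the slice it was given.
void shout(char[] s)
{
    if (s.length) s[0] = 'X';
}

// The nine built-in string types: three character widths times three qualifiers.
static foreach (C; AliasSeq!(char, wchar, dchar))
{
    static assert(isSomeString!(C[]));
    static assert(isSomeString!(const(C)[]));
    static assert(isSomeString!(immutable(C)[]));
}

void main()
{
    // Point 2: with char[], the callee's write is visible to the caller.
    char[] buf = "hello".dup;
    shout(buf);
    assert(buf == "Xello");

    // Point 3: string is immutable(char)[], so neither the call nor a
    // direct element write compiles.
    string s = "hello";
    static assert(!__traits(compiles, shout(s)));
    static assert(!__traits(compiles, s[0] = 'X'));
}
```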


Compare all that with:

0. We put a String type in the standard library. It uses UTF8 inside and 
supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages 
its own memory so no need for the GC. It disallows remote coupling 
across callers/callees. Case closed.

May 07 2021

Jacob Carlborg <doob me.com> writes:

On 2021-05-07 17:24, Andrei Alexandrescu wrote:

 Compare all that with:
 
 0. We put a String type in the standard library. It uses UTF8 inside and 
 supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages 
 its own memory so no need for the GC. It disallows remote coupling 
 across callers/callees. Case closed.

You can have enums with the base type being a struct or a class. How 
does putting a String type in the standard library help with the enum 
problem you're describing?

-- 
/Jacob Carlborg

May 07 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 2:22 PM, Jacob Carlborg wrote:
 On 2021-05-07 17:24, Andrei Alexandrescu wrote:
 
 Compare all that with:

 0. We put a String type in the standard library. It uses UTF8 inside 
 and supports iteration by either bytes, UTF8, UTF16, or UTF32. It 
 manages its own memory so no need for the GC. It disallows remote 
 coupling across callers/callees. Case closed.

 
 You can have enums with the base type being a struct or a class. How 
 does putting a String type in the standard library help with the enum 
 problem you're describing?

The solution to that is "We do not support enums". But if you use a 
non-templated class String, you won't feel much of a pain in the first 
place because the enums will be converted to String objects upon call.

The String type solves all other problems mentioned.

May 07 2021

Jacob Carlborg <doob me.com> writes:

On 2021-05-07 17:24, Andrei Alexandrescu wrote:

 0. We put a String type in the standard library.

If you're going to make strings a user defined type, how are you 
planning to support things like switch statements with strings? It's not 
currently possible to have switch statements with user defined types.

-- 
/Jacob Carlborg

May 07 2021

Meta <jared771 gmail.com> writes:

On Friday, 7 May 2021 at 18:25:57 UTC, Jacob Carlborg wrote:
 On 2021-05-07 17:24, Andrei Alexandrescu wrote:

 0. We put a String type in the standard library.

 If you're going to make strings a user defined type, how are 
 you planning to support things like switch statements with 
 strings? It's not currently possible to have switch statements 
 with user defined types.

It really, really should be. Pattern matching and destructuring 
are two of my most wanted features in D.

May 07 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 2:25 PM, Jacob Carlborg wrote:
 On 2021-05-07 17:24, Andrei Alexandrescu wrote:
 
 0. We put a String type in the standard library.

 
 If you're going to make strings a user defined type, how are you 
 planning to support things like switch statements with strings?

Built-in strings remain as they are.
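
For reference, a minimal sketch of the built-in feature in question: `switch` accepts a built-in string directly. The function name `unitToGrams` and its cases are invented for illustration.

```d
// Hypothetical lookup using a switch over a built-in string.
int unitToGrams(string unit)
{
    switch (unit)
    {
        case "kg": return 1000;
        case "g":  return 1;
        default:   return 0; // unknown unit
    }
}

void main()
{
    assert(unitToGrams("kg") == 1000);
    assert(unitToGrams("g") == 1);
    assert(unitToGrams("lb") == 0);
}
```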

May 07 2021

Jon Degenhardt <jond noreply.com> writes:

On Friday, 7 May 2021 at 15:24:42 UTC, Andrei Alexandrescu wrote:
 0. We put a String type in the standard library. It uses UTF8 
 inside and supports iteration by either bytes, UTF8, UTF16, or 
 UTF32. It manages its own memory so no need for the GC. It 
 disallows remote coupling across callers/callees. Case closed.

This is a bit orthogonal, but... An important characteristic of 
utf-8 arrays is that they are simultaneously a random access 
range of bytes and an input range of utf-8 characters. For 
efficiency it's often important to switch back and forth between 
these two interpretations.

`byLine` is one type of example, where a byte oriented search is 
done (e.g. with `memchr`), but afterward the representation array 
is accessed as a utf-8 input range.

`byLine` implementations will usually work by iterating forward, 
but there are random access use cases as well. For example, it is 
perfectly reasonable to divide a utf-8 array roughly in half 
using byte offsets, then search for the nearest utf-8 
character boundary. After this, both halves are treated as 
utf-8 input ranges, not random access.

This switching between interpretations doesn't fit well with the 
current distinction between `char[]` and `byte[]`. A number of 
algorithms in phobos operate on one or the other, but not both.

It'd be very useful to have an approach to utf-8 strings that 
enabled switching interpretations easily, without casting.

--Jon

May 07 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 6:34 PM, Jon Degenhardt wrote:
 On Friday, 7 May 2021 at 15:24:42 UTC, Andrei Alexandrescu wrote:
 0. We put a String type in the standard library. It uses UTF8 inside 
 and supports iteration by either bytes, UTF8, UTF16, or UTF32. It 
 manages its own memory so no need for the GC. It disallows remote 
 coupling across callers/callees. Case closed.

 
 This is a bit orthogonal, but... An important characteristic of utf-8 
 arrays is that they are simultaneously a random access range of bytes 
 and an input range of utf-8 characters. For efficiency it's often 
 important to switch back and forth between these two interpretations.
 
 `byLine` is one type of example, where a byte oriented search is done 
 (e.g. with `memchr`), but afterward the representation array is accessed 
 as a utf-8 input range.
 
 `byLine` implementations will usually work by iterating forward, but 
 there are random access use cases as well. For example, it is perfectly 
 reasonable to divide a utf-8 array roughly in half using byte 
 offsets, then search for the nearest utf-8 character boundary. 
 After this, both halves are treated as utf-8 input ranges, not random 
 access.
 
 This switching between interpretations doesn't fit well with the current 
 distinction between `char[]` and `byte[]`. A number of algorithms in 
 phobos operate on one or the other, but not both.
 
 It'd be very useful to have an approach to utf-8 strings that enabled 
 switching interpretations easily, without casting.

String s;
func1(s.bytes);
func2(s.dchars);

May 07 2021

Jon Degenhardt <jond noreply.com> writes:

On Saturday, 8 May 2021 at 02:05:42 UTC, Andrei Alexandrescu 
wrote:
 On 5/7/21 6:34 PM, Jon Degenhardt wrote:
 It'd be very useful to have an approach to utf-8 strings that 
 enabled switching interpretations easily, without casting.

 String s;
 func1(s.bytes);
 func2(s.dchars);

That's not quite what I was getting at. But that's my fault. A 
hastily written message that muddled a couple of concepts. Sorry 
about that, I need to write up a better description. But there 
are two underlying thoughts.

One is being able to convert from a random access byte array to 
char input range (e.g. `byUTF`), do something with it (e.g. 
`popFront`), then convert that form back to a random access byte 
range. This is logically doable because both are views on the 
same physical array. However, once something is an input range it 
doesn't convert simply to a random access range.

This first one strikes me as potentially challenging because this 
dual view on the underlying data is not common, so there's not a 
lot of incentive to support it as a general concept.

The second issue is more about current Phobos algorithms that 
specialize their implementations depending on whether the 
argument is a `char[]` or a `byte[]`. This normally involves 
conditioning on `isSomeString` or `isSomeChar`. `char[]` / `char` 
pass these tests, `byte[]` / `byte` do not. The cases I remember 
are cases where the string form was specialized to have better 
performance than the byte form. Look through searching.d for 
`isSomeString` use to see this.

The trouble with this is that at the application level it can be 
necessary to use a byte array when working with a number of 
facilities. This often involves I/O, e.g. reading fixed-size 
blocks from an input stream (`File.byChunk`). This operates on 
`ubyte[]` arrays. It can be cast to a `char[]`. But, this can run 
afoul of autodecoding related routines that expect correctly 
formed utf-8 characters. When reading fixed size buffers, the 
starts and ends of the buffer will often not fall on utf-8 
boundaries, so examining the bytes is necessary to handle these 
cases. (And input streams may contain corrupt utf-8 characters.)
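
A minimal sketch of the buffer-boundary problem, using an in-memory array in place of a real `File.byChunk` read. The helper `isValidUtf8` is a name made up for this example; `std.utf.validate` and `UTFException` are the real Phobos facilities.

```d
import std.utf : UTFException, validate;

// Hypothetical helper: reinterpret bytes as chars and check well-formedness.
bool isValidUtf8(const(ubyte)[] bytes)
{
    try
    {
        validate(cast(const(char)[]) bytes);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    // 'a' followed by the three-byte encoding of U+20AC (euro sign).
    immutable(ubyte)[] data = [0x61, 0xE2, 0x82, 0xAC];

    assert(isValidUtf8(data));           // the whole buffer is well-formed
    assert(!isValidUtf8(data[0 .. 2]));  // a 2-byte block ends mid code point
    assert(isValidUtf8(data[0 .. 1]));   // a block ending on a boundary is fine
}
```

A block that ends inside a multi-byte sequence is not valid on its own, so code that casts each block to `char[]` has to inspect the trailing bytes first.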

I know the above is still not an adequate description. At some 
point I'll try to write up something more compelling.

--Jon

May 07 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/7/2021 7:05 PM, Andrei Alexandrescu wrote:
 String s;
 func1(s.bytes);
 func2(s.dchars);

Already done:

s.byCodeUnit
s.byChar
s.byWchar
s.byDchar
s.byUTF

https://dlang.org/phobos/std_utf.html
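
A short sketch of how these views behave on a built-in `string`, assuming nothing beyond `std.utf`, `std.range` and `std.string` from Phobos. The helper names `codeUnits` and `codePoints` are invented for the example.

```d
import std.range : walkLength;
import std.string : representation;
import std.utf : byCodeUnit, byDchar;

// Hypothetical helpers that count elements in each view.
size_t codeUnits(string s)  { return s.byCodeUnit.walkLength; }
size_t codePoints(string s) { return s.byDchar.walkLength; }

void main()
{
    string s = "h\u00E9llo"; // U+00E9 takes two UTF-8 code units

    assert(s.length == 6);
    assert(codeUnits(s) == 6);   // view as UTF-8 code units
    assert(codePoints(s) == 5);  // view as decoded code points

    // The same data as raw bytes, without a cast.
    immutable(ubyte)[] bytes = s.representation;
    assert(bytes.length == 6);
    assert(bytes[0] == 'h');
}
```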

May 09 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/9/21 5:04 AM, Walter Bright wrote:
 On 5/7/2021 7:05 PM, Andrei Alexandrescu wrote:
 String s;
 func1(s.bytes);
 func2(s.dchars);

 
 Already done:
 
 s.byCodeUnit
 s.byChar
 s.byWchar
 s.byDchar
 s.byUTF
 
 https://dlang.org/phobos/std_utf.html

Problem being of course that there's no UDT String type, only the crappy 
immutable(char)[].

May 09 2021

guai <guai inbox.ru> writes:

On Friday, 7 May 2021 at 22:34:19 UTC, Jon Degenhardt wrote:
 `byLine` implementations will usually work by iterating 
 forward, but there are random access use cases as well. For 
 example, it is perfectly reasonable to divide a utf-8 array 
 roughly in half using byte offsets, then search for the 
 nearest utf-8 character boundary. After this, both halves are 
 treated as utf-8 input ranges, not random access.

In my experience treating a string as a byte array is almost never 
a good thing. The person doing it must be very careful and truly 
understand what they are doing.
What are those use cases other than `byLine` where this is useful?
Dividing a utf-8 array and searching for the nearest char may split 
inside a combining character sequence, which isn't a thing you usually 
want, especially when a human will read the text.
Conceptually a string is a sequence of characters. A range of dchar 
in D's terms.
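
To illustrate the combining-character concern: a base letter plus a combining mark is two code points but one user-perceived character. The sketch below uses `std.uni.byGrapheme` from Phobos; the helper names are invented for the example.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byDchar;

// Hypothetical helpers counting at two different levels.
size_t codePoints(string s) { return s.byDchar.walkLength; }
size_t graphemes(string s)  { return s.byGrapheme.walkLength; }

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT.
    string s = "e\u0301";

    assert(s.length == 3);       // 1 + 2 UTF-8 code units
    assert(codePoints(s) == 2);  // two dchars
    assert(graphemes(s) == 1);   // one user-perceived character
}
```

So a split that lands on a code point boundary can still separate the accent from its letter; a range of dchar has the same limitation.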

May 08 2021

Berni44 <someone somemail.com> writes:

On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 Dividing utf-8 array and searching for the nearest char may 
 split inside a combining character which isn't a thing you 
 usually want.

It is not difficult to recognize this case and go back 1 to 3 
bytes to reach a correct splitting place. UTF-8 was designed with 
this in mind.

- I can imagine that this can be useful in divide-and-conquer 
algorithms, like binary search.
- Or when, for whatever reason, you can make larger jumps while 
scanning a string, e.g. when you know that the next 50 letters do 
not contain a certain token you are looking for, you can safely 
jump 50 bytes, go back to the previous splitting point and continue 
the linear search there.
- Or you want to cut a string into pieces of a certain length 
(again 50?), where the exact length is not so important. So 
you just jump ahead 50, go back again and split at this point. If 
there are a lot of non-ascii characters in between, the piece is of 
course shorter, but maybe that's ok, because speed is more important.
- You want to process pieces of a string in parallel: cut it into 
16 pieces and let your 16 cores work on each of them.
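
The back-up step described above can be sketched as follows. This is a minimal illustration that assumes the input is well-formed UTF-8; `alignToCodePoint` is a hypothetical helper, not a Phobos function.

```d
import std.string : representation;

// Hypothetical helper: largest index <= i at which a code point starts.
// Continuation bytes in UTF-8 have the bit pattern 10xxxxxx.
size_t alignToCodePoint(const(ubyte)[] data, size_t i)
{
    while (i > 0 && i < data.length && (data[i] & 0xC0) == 0x80)
        --i;
    return i;
}

void main()
{
    string s = "a\u20ACb"; // 'a', euro sign (3 bytes), 'b': 5 bytes in total
    auto bytes = s.representation;
    assert(bytes.length == 5);

    assert(alignToCodePoint(bytes, 2) == 1); // inside the euro sign
    assert(alignToCodePoint(bytes, 3) == 1); // still inside it
    assert(alignToCodePoint(bytes, 4) == 4); // already on a boundary

    // Split near the middle without cutting a code point.
    auto cut = alignToCodePoint(bytes, bytes.length / 2);
    assert(s[0 .. cut] == "a");
    assert(s[cut .. $] == "\u20ACb");
}
```

The loop moves back at most three bytes on valid input, since a UTF-8 sequence has at most three continuation bytes.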

May 08 2021

Jon Degenhardt <jond noreply.com> writes:

On Saturday, 8 May 2021 at 16:25:31 UTC, Berni44 wrote:
 On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 Dividing utf-8 array and searching for the nearest char may 
 split inside a combining character which isn't a thing you 
 usually want.

 It is not difficult to recognize this case and go back 1 to 3 
 bytes to reach a correct splitting place. UTF-8 was designed 
 with this in mind.

 - I can imagine, that this can be useful in divide-and-conquer 
 algorithms, like binary search.
 ... (more examples) ..
 - You want to process pieces of a string in parallel: Cut it in 
 16 pieces and let your 16 cores work on each of them.

Exactly. All the ideas you listed apply. Parallelization is very 
often useful.

May 08 2021

guai <guai inbox.ru> writes:

On Saturday, 8 May 2021 at 16:25:31 UTC, Berni44 wrote:
 On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 Dividing utf-8 array and searching for the nearest char may 
 split inside a combining character which isn't a thing you 
 usually want.

 It is not difficult to recognize this case and go back 1 to 3 
 bytes to reach a correct splitting place. UTF-8 was designed 
 with this in mind.


I meant these [combining 
characters](https://en.wikipedia.org/wiki/Combining_character). 
They are language-specific, but most of the time the string does 
not contain any clue as to which language it is.

 - I can imagine, that this can be useful in divide-and-conquer 
 algorithms, like binary search.

They must be applied with great care to non-ascii texts. What 
about RTL for example? You cannot split inside an RTL block.

 - Or you want to cut a string into pieces of a certain length 
 (again 50?), where the exact length is not so much important.

For what business task would I do that? I may want to split a 
string on some char subsequence for lexing. But one cannot assume 
lengths of those chunks.

 So you just jump ahead 50, go back again and split at this 
 point. If there are a lot of non ascii characters in between, 
 this is of course shorter, but maybe ok, because speed is more 
 important.

Not sure if speed is more important than correctness.

 - You want to process pieces of a string in parallel: Cut it in 
 16 pieces and let your 16 cores work on each of them.

I'm not sure if this is possible with all the quirks of unicode. 
I have never even heard of parallel processors of structured texts 
like xml.

May 08 2021

Adam D. Ruppe <destructionator gmail.com> writes:

On Saturday, 8 May 2021 at 19:06:48 UTC, guai wrote:
 I ment this [combining 
 characters](https://en.wikipedia.org/wiki/Combining_character). 
 they are language-specific, but most of the time the string 
 does not contain any clue which language is it.

The thing is making the range be of dchars doesn't help with this.

This kind of thinking is why Phobos does the autodecoding thing 
it does now, converting utf-8 to a range of dchar as it sees 
it... but those combining characters are still (or rather can be) 
two separate dchars!
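
A small sketch of that point, using `std.uni.byGrapheme` for the comparison (the example string is just an illustration):

```d
import std.range : walkLength;
import std.uni : byGrapheme;

void main()
{
    // 'e' followed by U+0301 COMBINING ACUTE ACCENT:
    // one user-perceived character.
    string s = "e\u0301";

    assert(s.length == 3);                 // UTF-8 code units
    assert(s.walkLength == 2);             // autodecoded dchars
    assert(s.byGrapheme.walkLength == 1);  // grapheme clusters
}
```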

So right now Phobos does something that seems useful... but 
actually isn't. All of the bad, none of the good.

BTW I also like to point out that Ascii actually has a lot of the 
same mysteries we ascribe to unicode. Like variable width chars: 
\t is an ascii char. Zero width char, ascii has \0 and \a. 
Negative width char? Is \b one? idk.

But there's still a lot of times you can treat it as bytes and 
get away with it.

This is why I'm not sold on Andrei's new String idea myself. I 
totally agree making char[] a range of dchars is a bad idea. But 
I think the only right thing to do is to expose what it actually 
is and then both educate and empower the user to do what they 
need themselves.

May 08 2021

guai <guai inbox.ru> writes:

On Saturday, 8 May 2021 at 19:30:03 UTC, Adam D. Ruppe wrote:
 On Saturday, 8 May 2021 at 19:06:48 UTC, guai wrote:
 I ment this [combining 
 characters](https://en.wikipedia.org/wiki/Combining_character). 
 they are language-specific, but most of the time the string 
 does not contain any clue which language is it.

 The thing is making the range be of dchars doesn't help with 
 this.

At least it won't induce more problems

May 08 2021

Adam D. Ruppe <destructionator gmail.com> writes:

On Saturday, 8 May 2021 at 20:06:35 UTC, guai wrote:
 The thing is making the range be of dchars doesn't help with 
 this.

 At least it won't induce more problems

This is what Phobos already does and it has already created more 
problems. It was a mistake to do it this way.

But if string was just an opaque(ish) blob with a variety of 
accessor properties it would work better then. The big mistake 
Phobos made was trying to automatically do something and causing 
friction by that automatic thing not being right.

May 08 2021

Max Haughton <maxhaton gmail.com> writes:

On Saturday, 8 May 2021 at 21:54:28 UTC, Adam D. Ruppe wrote:
 On Saturday, 8 May 2021 at 20:06:35 UTC, guai wrote:
 The thing is making the range be of dchars doesn't help with 
 this.

 At least it won't induce more problems

 This is what Phobos already does and it has already created 
 more problems. It was a mistake to do it this way.

 But if string was just an opaque(ish) blob with a variety of 
 accessor properties it would work better then. The big mistake 
 Phobos made was trying to automatically do something and 
 causing friction by that automatic thing not being right.

The opaque blob model also allows SSO much more easily.

May 08 2021

Berni44 <someone somemail.com> writes:

On Saturday, 8 May 2021 at 19:06:48 UTC, guai wrote:
 I ment this [combining 
 characters](https://en.wikipedia.org/wiki/Combining_character). 
 they are language-specific, but most of the time the string 
 does not contain any clue which language is it.

You are talking about generic algorithms that work for every 
script. But unicode allows for algorithms only supporting 
subsets. If your subset doesn't contain combining characters, you 
don't need to care about them. Otherwise you may need to go back 
to the nearest base character. It depends on the use case.

 - I can imagine, that this can be useful in divide-and-conquer 
 algorithms, like binary search.

 They must be applied with great careful to non-ascii texts. 
 What about RTL for example? You cannot split inside RTL block

Oh, yes, you can! Think of an algorithm which is doing 
cryptographic analysis and counting consecutive pairs of ascii 
characters. For that it doesn't matter if there is RTL text cut 
into pieces.

 - Or you want to cut a string into pieces of a certain length 
 (again 50?), where the exact length is not so much important.

 For what business task would I do that?

Simple wrapping to avoid losing text when printing, or to avoid 
having to scroll vertically. It is probably not useful for a high 
quality program...

 I may want to split a string on some char subsequence for 
 lexing. But one cannot assume lengths of those chunks.

Depending on the use case you may know ahead.

 So you just jump ahead 50, go back again and split at this 
 point. If there are a lot of non ascii characters in between, 
 this is of course shorter, but maybe ok, because speed is more 
 important.

 Not sure if speed is more important than correctness.

Of course, this again depends on the use case. You can't say that 
in general.

 - You want to process pieces of a string in parallel: Cut it 
 in 16 pieces and let your 16 cores work on each of them.

 I'm not sure if this is possible with all the quirks of unicode.

Think again of the cryptographic analysis above, for an example. 
(Or checking wikipedia entries for whatever automatically.)

Keep in mind, that we do not always have to support everything of 
unicode. If we know ahead, that our text contains mainly ascii 
and aside from this only a few base characters, but never 
combining characters and so on, we can use different algorithms 
which might be simpler or faster or both. To make sure, that this 
constraint holds, is then something, that has to be done outside 
of the algorithm.

 Never herd even of parallel processors of structured texts like 
 xml.

I would judge it much more difficult to process xml in parallel 
than to do the same with unicode.

May 08 2021

guai <guai inbox.ru> writes:

On Saturday, 8 May 2021 at 20:19:51 UTC, Berni44 wrote:
 Oh, yes, you can! Think of an algorithm which is doing 
 cryptographic analysis and counting consecutive pairs of ascii 
 characters. For that it doesn't matter if there is RTL text cut 
 into pieces.

No cryptography is done on strings, but instead on byte arrays. 
Why would you even want to use string here? Its methods won't be 
of any help.

May 08 2021

Jon Degenhardt <jond noreply.com> writes:

On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 On Friday, 7 May 2021 at 22:34:19 UTC, Jon Degenhardt wrote:
 `byLine` implementations will usually work by iterating 
 forward, but there are random access use cases as well. For 
 example, it is perfectly reasonable to divide a utf-8 array in 
 roughly in half using byte offsets, then searching for the 
 nearest utf-8 character boundary. At after this both halves 
 are treated as utf-8 input ranges, not random access.

 In my experience treating a string as byte array is almost 
 never a good thing. Person doing it must be very careful and 
 truly understand what they are doing.
 What are those use cases other than `byLine` where this is 
 useful?
 Dividing utf-8 array and searching for the nearest char may 
 split inside a combining character which isn't a thing you 
 usually want. Especially when human would read this text.
 Conceptually string is a sequence of characters. A range of 
 dchar in D's terms.

Data and log file processing are common cases. Single byte ascii 
characters are normally used to delimit structure in such files. 
Record delimiters, field delimiters, name-value pair delimiters, 
escape syntax, etc. A common way to operate on such files is to 
identify structural boundaries by finding the requisite single 
byte ascii characters and treating the contained data as opaque 
(uninterpreted) sequences of utf-8 bytes.

The details depend on the file format. But the key part is that 
single byte ascii characters can be unambiguously identified 
without interpreting other characters in a utf-8 data stream. Of 
course, when it comes time to interpreting the data inside these 
data streams it is necessary to operate on cohesive blocks. Yes 
graphemes, but also things like numbers. It's not useful to split 
a number in the middle and then call `std.conv.to!double` on it.

Operating on the single byte structural elements allows deferring 
interpretation of multi-byte unicode content until it is needed. 
This is why it's useful to switch back and forth between a 
byte-oriented view and a UTF character view. Operating on bytes 
is faster (e.g. `memchr`, no utf-8 decoding), enables 
parallelization (depending on the type of file), and can be used 
with fixed size buffer reads and writes.
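
A rough sketch of that switch between views, assuming a tab-delimited record. `splitFields` is a hypothetical helper, not a library function:

```d
import std.algorithm.iteration : map, splitter;
import std.array : array;
import std.conv : to;
import std.string : representation;

/// Hypothetical helper: split a tab-delimited record on raw bytes.
string[] splitFields(string record)
{
    // Byte view: '\t' is a single-byte ASCII character and never occurs
    // inside a multi-byte UTF-8 sequence, so no decoding is needed.
    return record.representation
        .splitter(cast(ubyte) '\t')
        .map!(f => cast(string) f)   // back to the character view
        .array;
}

void main()
{
    // The second field contains a two-byte character (ü).
    auto fields = splitFields("id42\tZ\u00FCrich\t3.5");
    assert(fields == ["id42", "Z\u00FCrich", "3.5"]);

    // Interpret content only once a whole field is in hand.
    assert(fields[2].to!double == 3.5);
}
```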

--Jon

May 08 2021

guai <guai inbox.ru> writes:

On Saturday, 8 May 2021 at 18:44:00 UTC, Jon Degenhardt wrote:
 On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
 On Friday, 7 May 2021 at 22:34:19 UTC, Jon Degenhardt wrote:
 `byLine` implementations will usually work by iterating 
 forward, but there are random access use cases as well. For 
 example, it is perfectly reasonable to divide a utf-8 array 
 in roughly in half using byte offsets, then searching for the 
 nearest utf-8 character boundary. At after this both halves 
 are treated as utf-8 input ranges, not random access.

 In my experience treating a string as byte array is almost 
 never a good thing. Person doing it must be very careful and 
 truly understand what they are doing.
 What are those use cases other than `byLine` where this is 
 useful?
 Dividing utf-8 array and searching for the nearest char may 
 split inside a combining character which isn't a thing you 
 usually want. Especially when human would read this text.
 Conceptually string is a sequence of characters. A range of 
 dchar in D's terms.

 Data and log file processing are common cases. Single byte 
 ascii characters are normally used to delimit structure in such 
 files. Record delimiters, field delimiters, name-value pair 
 delimiters, escape syntax, etc. A common way to operate on such 
 files is to identify structural boundaries by finding the 
 requisite single byte ascii characters and treating the 
 contained data as opaque (uninterpreted) sequences of utf-8 
 bytes.

 The details depend on the file format. But the key part is that 
 single byte ascii characters can be unambiguously identified 
 without interpreting other characters in a utf-8 data stream. 
 Of course, when it comes time to interpreting the data inside 
 these data streams it is necessary to operate on cohesive 
 blocks. Yes graphemes, but also things like numbers. It's not 
 useful to split a number in the middle and then call 
 `std.conv.to!double` on it.

 Operating on the single byte structural elements allows 
 deferring interpretation of multi-byte unicode content until it 
 is needed. This is why it's useful to switch back and forth 
 between a byte-oriented view and a UTF character view. 
 Operating on bytes is faster (e.g. `memchr`, no utf-8 
 decoding), enables parallelization (depending on the type of 
 file), and can be used with fixed size buffer reads and writes.

 --Jon

When you work with log files, you first pull the data in as a byte 
stream and split it into chunks. Then you make a string out of each 
chunk. Once you've done that, you process it as a string with all 
the rules of unicode, for example splitting it into words. And then 
you may want to convert a word back to bytes again.
But you cannot split a string wherever you want by treating it as 
bytes. It most certainly wouldn't work with all the languages out 
there.
With a string you cannot get a char by index; you must read the 
chars sequentially. You can search, you can tokenize, rewind and 
reinterpret maybe.

May 08 2021

Jon Degenhardt <jond noreply.com> writes:

On Saturday, 8 May 2021 at 19:33:45 UTC, guai wrote:
 ...
 But you cannot split a string wherever you want treating it as 
 bytes. It most certainly wouldn't work with all the languages 
 out there.

Sure you can. It's necessary to take advantage of the 
properties of utf-8 encoding to do it. That is, it's necessary to 
find a nearby utf-8 character boundary, but utf-8 is defined in a 
manner that enables this. Take a look at [section 2.5 Encoding 
Forms](http://www.unicode.org/versions/Unicode13.0.0/ch02.pdf#G13708) in the
Unicode Standards doc. It describes exactly this.

 With string you cannot get a char by index, you must read them 
 sequentially.

Correct, you cannot find a unicode character using a character 
based index without processing sequentially. But for large 
classes of algorithms this is not necessary. That is, there is 
often no need to find, for example, the 100th character. If all 
an algorithm needs to do is split a string roughly in half, then 
use the byte offsets to find the halfway point and then look for 
a utf-8 character boundary. If the algorithm is based on some 
other boundary, say, token boundaries, then find one of those 
boundaries.

May 08 2021

guai <guai inbox.ru> writes:

On Saturday, 8 May 2021 at 20:22:28 UTC, Jon Degenhardt wrote:
 If all an algorithm needs to do is split a string roughly in 
 half, then use the byte offsets to find the halfway point and 
 then look for a utf-8 character boundary. If the algorithm is 
 based on some other boundary, say, token boundaries, then find 
 one of those boundaries.

The algorithms you are talking about either don't need 
strings at all, only byte/char arrays, or would produce 
garbage for any input other than ascii.
Your example with log files mixes binary data with text. A properly 
done logger will escape delimiters inside text chunks, so it 
isn't even a string per se; it's binary data from which you 
need to extract a string first.
A lot of bugs are caused by this mixing of text with binary, and 
I think it is better to distinguish them properly at the type level.

May 08 2021

Jon Degenhardt <jond noreply.com> writes:

On Saturday, 8 May 2021 at 21:47:21 UTC, guai wrote:
 On Saturday, 8 May 2021 at 20:22:28 UTC, Jon Degenhardt wrote:
 If all an algorithm needs to do is split a string roughly in 
 half, then use the byte offsets to find the halfway point and 
 then look for a utf-8 character boundary. If the algorithm is 
 based on some other boundary, say, token boundaries, then find 
 one of those boundaries.

 Those algorithms you talking about are either doesn't need 
 strings at all but instead byte/char arrays or would produce 
 garbage for any input other than ascii.

I don't understand the point you are trying to make. Perhaps you 
could rephrase.

I've implemented any number of these types of algorithms. It's 
very common to mix interpretation as unicode strings with 
interpretation as utf-8 bytes. For example, it may be necessary to 
do case conversion at some stage of processing. This has to be done 
on unicode characters, not bytes. But needing to do such 
processing at some point does not exclude treating the data as 
utf-8 bytes for other purposes.

Also, a `char[]` in D is defined to be utf-8, and a `string` is 
an `immutable(char)[]`. So why would utf-8 data, including 
non-ascii characters, read into a `char[]` produce garbage? The 
answer is that it wouldn't. No, you cannot simply start on an 
arbitrary byte boundary, but nobody has suggested this.

 Your example with log files mixes binary data with text. 
 Properly done logger will escape delimiters inside text chunks, 
 so it isn't even a string per se, it's some binary data from 
 which you need to extract a string first.

Again, I'm not following the logic. Log files may or may not 
include binary data. But I'm not sure why that matters. I'm talking 
about log files where the text portions are encoded as utf-8.

 A lot of bugs are caused by this mixing of text with binary. 
 And I think it is better to distinguish them properly on a type 
 level.

Perhaps it would help if you described what you mean by "binary". 
I tend to think of "binary" as things like image data, binary 
serialization formats, base-64 coding, compressed or encrypted 
text. These are quite different than utf-8 encoded unicode text.

May 08 2021

Q. Schroll <qs.il.paperinik gmail.com> writes:

On Friday, 7 May 2021 at 15:24:42 UTC, Andrei Alexandrescu wrote:
 Compare all that with:

 We put a String type in the standard library. It uses UTF8 
 inside and supports iteration by either bytes, UTF8, UTF16, or 
 UTF32. It manages its own memory so no need for the GC. It 
 disallows remote coupling across callers/callees. Case closed.

True. But why have it easy when you can have it complicated?

May 07 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/7/2021 7:16 AM, Paul Backus wrote:
 "Is a string type" and "is implicitly convertible to a string type" are not
the 
 same thing.

Language lawyer point:

An enum can be implicitly converted to its base type, but it's a match level 2:

https://dlang.org/spec/function.html#function-overloading

(Agreeing with Paul)

May 08 2021

deadalnix <deadalnix gmail.com> writes:

On Sunday, 9 May 2021 at 02:57:42 UTC, Walter Bright wrote:
 On 5/7/2021 7:16 AM, Paul Backus wrote:
 "Is a string type" and "is implicitly convertible to a string 
 type" are not the same thing.

 Language lawyer point:

 An enum can be implicitly converted to its base type, but it's 
 a match level 2:

 https://dlang.org/spec/function.html#function-overloading

 (Agreeing with Paul)

Sorry to be blunt, but this is a complete language layering fail.

Classes implementing an interface are a subtype and are match 
level 2 (implicit conversion) when matching against the interface.

In fact, any subtype is expected to be a match level 2. 
Arguably, this isn't bijective, as not all level 2 matches will be 
subtypes. That doesn't definitively nail the topic at hand, but 
the arguments made in this thread are disturbingly unsound.

May 09 2021

Jon Degenhardt <jond noreply.com> writes:

On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
 On 5/7/21 2:03 AM, evilrat wrote:
 On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu 
 wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029

 
 Just a commoner here, can you explain for stupid what makes 
 enum string a no go and why it should begone?

 Heavy toll on the infra for a very niche use case with trivial 
 workarounds on the user side.

To try to put some focus on the user perspective, here's a sample 
program:

```
import std.stdio;
import std.array;
import std.range;

void main()
{
     writefln!"%d"(0);

     immutable string f1 = "%d";
     writefln!f1(1);

     enum f2 = "%d";
     writefln!f2(2);

     enum string f3 = "%d";
     writefln!f3(3);

     enum { f4 = "%d" }
     writefln!f4(4);

     enum : string { f5 = "%d" }
     writefln!f5(5);

     enum X { f6 = "%d" }
     writefln!(X.f6)(6);   // Compilation error

     enum Y : string { f7 = "%d" }
     writefln!(Y.f7)(7);   // Compilation error
}
```

All but the named enums (last two) are fine. These fail with 
similar compilation errors:

```
Error: template std.stdio.writefln cannot deduce function from 
argument types !("%d")(int), candidates are:
dmd-2.095.1/osx/bin/../../src/phobos/std/stdio.d(4258):        
writefln(alias fmt, A...)(A args)
   with fmt = f6,
        A = (int)
   must satisfy the following constraint:
        isSomeString!(typeof(fmt))
dmd-2.095.1/osx/bin/../../src/phobos/std/stdio.d(4269):        
writefln(Char, A...)(in Char[] fmt, A args
```

This is at least a potentially confusing situation for users. The 
error message indicates that `f6` should be a "string" of some 
kind, and it looks like one. One needs to be very familiar with 
the details to understand why it does not satisfy `isSomeString`. 
Similarly with understanding why anonymous enums are fine but 
named enums are not.

The error message is also not particularly helpful in determining 
what the available workarounds are. They may be trivial once 
understood, but there's non-trivial learning to get there. Note 
that slicing (`[]`) and `.representation()` do not work for the 
template argument. Casting does. e.g. The following is fine:

```
     writefln!(cast(string)X.f6)(6);
```

It can be argued that this case is rare enough in user code that 
the ROI from either making the case work or improving the 
compiler error message is too low to devote time to this now. But 
maybe there are other cheap options that could help users. A 
documentation note perhaps. A FAQ somewhere on the D site that 
would surface in searches.

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 3:43 PM, Jon Degenhardt wrote:
 
      enum { f4 = "%d" }
      writefln!f4(4);
 
      enum : string { f5 = "%d" }
      writefln!f5(5);
 
      enum X { f6 = "%d" }
      writefln!(X.f6)(6);   // Compilation error
 
      enum Y : string { f7 = "%d" }
      writefln!(Y.f7)(7);   // Compilation error

Thanks. I agree it's confusing. The mystery gets elucidated with some 
ease if we write the types involved: f4 and f5 have type string, f6 has 
type X, and f7 has type Y.

It's unpleasant that `enum : string { f5 = "%d" }` is really the same as 
`enum f5 = "%d"`. I expected that some anonymous enum type would be 
generated.
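
The types involved can be checked at compile time. A small sketch using the declarations from the sample program, moved to module scope:

```d
import std.traits : isSomeString;

enum { f4 = "%d" }
enum : string { f5 = "%d" }
enum X { f6 = "%d" }
enum Y : string { f7 = "%d" }

// Anonymous enums only declare manifest constants of the base type.
static assert(is(typeof(f4) == string));
static assert(is(typeof(f5) == string));
static assert(isSomeString!(typeof(f5)));

// Named enums introduce a distinct type...
static assert(is(typeof(X.f6) == X));
static assert(is(typeof(Y.f7) == Y));
static assert(!isSomeString!X);
static assert(!isSomeString!Y);

// ...that is still implicitly convertible to string.
static assert(is(X : string));
static assert(is(Y : string));

void main() {}
```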

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 7:00 PM, Andrei Alexandrescu wrote:
 On 5/11/21 3:43 PM, Jon Degenhardt wrote:
      enum { f4 = "%d" }
      writefln!f4(4);

      enum : string { f5 = "%d" }
      writefln!f5(5);

      enum X { f6 = "%d" }
      writefln!(X.f6)(6);   // Compilation error

      enum Y : string { f7 = "%d" }
      writefln!(Y.f7)(7);   // Compilation error

 
 Thanks. I agree it's confusing. The mystery gets elucidated with some 
 ease if we write the types involved: f4 and f5 have type string, f6 has 
 type X, and f7 have type Y.
 
 It's unpleasant that `enum : string { f5 = "%d" }` is really the same as 
 `enum f5 = "%d"`. I expected that some anonymous enum type would be 
 generated.

Another unpleasant issue:

     enum Y : string { f7 = "%d" }
     writeln(typeof(Y.f7.representation).stringof);

prints immutable(ubyte)[], not immutable(char)[]. So not even 
Y.f7.representation is usable. Sigh.

May 11 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:
 
      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);
 
 prints immutable(ubyte)[], not immutable(char)[]. So not even 
 Y.f7.representation is usable. Sigh.

The representation of a named enum is its base type.

The representation of a string type is immutable(ubyte)[].

It's consistent.

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
 On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:
 
      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);
 
 prints immutable(ubyte)[], not immutable(char)[]. So not even 
 Y.f7.representation is usable. Sigh.

 The representation of a named enum is its base type.

 The representation of a string type is immutable(ubyte)[].

 It's consistent.

Y.f7 is of type Y. Its representation is string, not 
immutable(ubyte)[]

typeof(Y.f7.representation) ought to be string.
typeof(Y.f7.representation.representation) ought to be 
immutable(ubyte)[]

Unless I'm missing something, that would be the consistent 
behavior. Unless representation is supposed to recurse up to the 
bottom turtle?

May 11 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/11/2021 7:04 PM, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
 On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:

      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);

 prints immutable(ubyte)[], not immutable(char)[]. So not even 
 Y.f7.representation is usable. Sigh.

 The representation of a named enum is its base type.

 The representation of a string type is immutable(ubyte)[].

 It's consistent.

 
 Y.f7 is of type Y. It's representation is string, not immutable(ubyte)[]
 
 typeof(Y.f7.representation) ought to be string.
 typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[]

That's what I said.

May 11 2021

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Wednesday, 12 May 2021 at 02:56:49 UTC, Walter Bright wrote:
 On 5/11/2021 7:04 PM, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
 On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:

      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);

 prints immutable(ubyte)[], not immutable(char)[]. So not 
 even Y.f7.representation is usable. Sigh.

 The representation of a named enum is its base type.

 The representation of a string type is immutable(ubyte)[].

 It's consistent.

 
 Y.f7 is of type Y. It's representation is string, not 
 immutable(ubyte)[]
 
 typeof(Y.f7.representation) ought to be string.
 typeof(Y.f7.representation.representation) ought to be 
 immutable(ubyte)[]

 That's what I said.

🍿

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 10:04 PM, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
 On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
 Another unpleasant issue:

      enum Y : string { f7 = "%d" }
      writeln(typeof(Y.f7.representation).stringof);

 prints immutable(ubyte)[], not immutable(char)[]. So not even 
 Y.f7.representation is usable. Sigh.

 The representation of a named enum is its base type.

 The representation of a string type is immutable(ubyte)[].

 It's consistent.

 
 Y.f7 is of type Y. It's representation is string, not immutable(ubyte)[]
 
 typeof(Y.f7.representation) ought to be string.
 typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[]
 
 Unless I'm missing something, that wold b the consistent behavior. 
 Unless representation is supposed to recurse up to the bottom turtle?

`representation` is a library function, so in a way we get to have a say 
in what it does. I would have expected it doesn't go all the way to 
primitive types, but if it does, that's not necessarily incorrect.

May 12 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/11/2021 4:00 PM, Andrei Alexandrescu wrote:
 It's unpleasant that `enum : string { f5 = "%d" }` is really the same as `enum 
 f5 = "%d"`. I expected that some anonymous enum type would be generated.

That came about due to the decision to overload enum to create manifest 
constants. This way, a block of manifest constants can be created.

May 11 2021

Per Nordlöw <per.nordlow gmail.com> writes:

On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029

Can you describe the scope of the rottenness in terms of contexts 
and arguments?

Are you referring to enums derived from aggregates as well?

And how does this rottenness relate to the discrepancy in 
behavior between builtin `__traits(X, ...)` and 
`std.traits.X!(...)` for enum arguments?

May 07 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 5/6/21 11:48 PM, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.
 
 https://github.com/dlang/phobos/pull/8029

What do you mean "not support"? The language has enums derived from 
strings. Did you mean remove it from the language? That would be a 
severe penalty.

Did you mean that Phobos routines just should error whenever you use 
enum types derived from strings? That's also a severe penalty.

If you mean we shouldn't support it (as an ambiguous case) in 
*conversion* utilities (i.e. to/from string), then this makes some 
sense. But it's also not straightforward. Sometimes you WANT to convert 
from the enum to the base type. Sometimes you want to convert to the 
enum name. Going backwards (string to enum), which one makes more sense? 
It depends on context. It also doesn't help that a string enum 
implicitly converts to a string. The language is going to circumvent any 
policies Phobos has on that front.
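
To make the ambiguity concrete, here is a small sketch of how I understand `std.conv.to` and the language to behave today for a string enum (the enum `Y` is borrowed from elsewhere in the thread):

```d
import std.conv : to;

enum Y : string { f7 = "%d" }

void main()
{
    // Conversion utility: yields the member name.
    assert(Y.f7.to!string == "f7");

    // Cast or implicit conversion: yields the base-type value.
    assert(cast(string) Y.f7 == "%d");
    string s = Y.f7;
    assert(s == "%d");

    // Going backwards, std.conv parses by member name.
    assert("f7".to!Y == Y.f7);
}
```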

For an example, in the serializers I have written, I usually have a 
"treat this enum type as its base type" UDA, because the data inside 
the serialized format is the base type, but I want it as an enum in 
d-land. But it depends on the situation.

I think it possibly makes sense to require either wrappers that clarify 
intent, or always treat enums the same way (as an enum). I think Phobos 
*mostly* does the latter. Erroring for ambiguity might be more 
disruptive than it's worth.

-Steve

May 07 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
 On 5/6/21 11:48 PM, Andrei Alexandrescu wrote:
 We should remove all that rot from phobos pronto.

 https://github.com/dlang/phobos/pull/8029

 
 What do you mean "not support"? The language has enums derived from 
 strings. Did you mean remove it from the language? That would be a 
 severe penalty.

Enums derived from strings should not be supported as strings in the 
standard library.

 Did you mean that Phobos routines just should error whenever you use 
 enum types derived from strings? That's also a severe penalty.

No it isn't.

May 07 2021

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
 Enums derived from strings should not be supported as strings 
 in the standard library.

I don't think the stdlib should special case much of anything.

Special casing enums is a mistake. If the user wants it treated 
as a string, they can cast it to a string.

Special casing static arrays is a mistake. The user can just 
slice it on the outside.

Special casing alias this is a mistake. The user can pass what 
they meant to pass.

The phobos templates should work like all other templates - on 
the exact type passed. Other functions work with the normal 
overloading and implicit conversion rules.

Kill all the special cases!

May 07 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 11:33 AM, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
 Enums derived from strings should not be supported as strings in the 
 standard library.

 
 I don't think the stdlib should special case much of anything.
 
 Special casing enums is a mistake. If the user wants it treated as a 
 string, they can cast it to a string.

yes

 Special casing static arrays is a mistake. The user can just slice it 
 on the outside.

Yes

 Special casing alias this is a mistake. The user can pass what they 
 meant to pass.

YES

 The phobos templates should work like all other templates - on the exact 
 type passed. Other functions work with the normal overloading and 
 implicit conversion rules.
 
 Kill all the special cases!

YES!!!

May 07 2021

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Friday, May 7, 2021 9:39:40 AM MDT Andrei Alexandrescu via Digitalmars-d 
wrote:
 On 5/7/21 11:33 AM, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
 Enums derived from strings should not be supported as strings in the
 standard library.

 I don't think the stdlib should special case much of anything.

 Special casing enums is a mistake. If the user wants it treated as a
 string, they can cast it to a string.

 yes

 Special casing static arrays is a mistake. The user can just slice it
 on the outside.

 Yes

 Special casing alias this is a mistake. The user can pass what they
 meant to pass.

 YES

 The phobos templates should work like all other templates - on the exact
 type passed. Other functions work with the normal overloading and
 implicit conversion rules.

 Kill all the special cases!

 YES!!!

Agreed. While implicit conversions can at times be useful, they cause a ton
of problems when templates are involved. Ideally, we should accept no
implicit conversions of any kind with templated code. And honestly, I wish
that the language had fewer implicit conversions in it. In particular, I
think that implicitly slicing static arrays was a big mistake, and we've had
a number of issues in Phobos because of it when trying to later generalize
functions that originally just took strings.

- Jonathan M Davis

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Friday, 7 May 2021 at 15:33:56 UTC, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu 
 wrote:
 Enums derived from strings should not be supported as strings 
 in the standard library.

 I don't think the stdlib should special case much of anything.

 Special casing enums is a mistake. If the user wants it treated 
 as a string, they can cast it to a string.

 [...]

 Kill all the special cases!

100% agreed, but, back to my original point, why is the enum 
thing a special case to begin with?

The fact that it is a special case to begin with flies in the 
face of Liskov's substitution principle - the enum type clearly 
is a subtype of string.

You got to wonder how it came to be that it just doesn't work 
automatically to begin with. Adding special cases is indeed the 
wrong path. There is something deeper rotten here, and just 
saying, no, this shouldn't work is just not cutting it.

Note that there should be special cases, but it'd be good to 
understand why these are special cases to begin with, and fix this.

Alternatively, we decide enums are not subtypes, in which case 
they shouldn't be implicitly convertible either. That wouldn't be 
such a bad idea as I've often missed the ability to do opaque 
type aliasing in D, but that seems way more disruptive than just 
admitting that "enum strings" are indeed a subtype of string.

May 09 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
 If you mean we shouldn't support it (as an ambiguous case) in 
 *conversion* utilities (i.e. to/from string), then this makes some 
 sense. But it's also not straightforward. Sometimes you WANT to convert 
 from the enum to the base type. Sometimes you want to convert to the 
 enum name. Going backwards (string to enum), which one makes more sense? 
 It depends on context. It also doesn't help that a string enum 
 implicitly converts to a string. The language is going to circumvent any 
 policies Phobos has on that front.

Enums are poorly designed, but that's only a small part of the problem.

The bigger problem is the corruption of a noble principle. We wanted to 
be as generic as possible, and indeed in the beginning that seemed not 
only possible, but also easy. I don't think there's any other language 
or library supporting different character widths with this little 
aggravation.

Then this whole "be as generic as possible" became a slippery slope of 
inclusion. Allow enum strings. Allow alias this strings.

How about no.

User: "I have this enum string str and phobos won't consider it a 
string. Help!"

Another user: "Just use str.representation if you want to pass str 
around as a string."

User: "Cool."

Case closed.

May 07 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 5/7/21 11:30 AM, Andrei Alexandrescu wrote:
 On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
 If you mean we shouldn't support it (as an ambiguous case) in 
 *conversion* utilities (i.e. to/from string), then this makes some 
 sense. But it's also not straightforward. Sometimes you WANT to 
 convert from the enum to the base type. Sometimes you want to convert 
 to the enum name. Going backwards (string to enum), which one makes 
 more sense? It depends on context. It also doesn't help that a string 
 enum implicitly converts to a string. The language is going to 
 circumvent any policies Phobos has on that front.

 
 Enums are poorly designed, but that's only a small part of the problem.
 
 The bigger problem is the corruption of a noble principle. We wanted to 
 be as generic as possible, and indeed in the beginning that seemed not 
 only possible, but also easy. I don't think there's any other language 
 or library supporting different character widths with this little 
 aggravation.
 
 Then this whole "be as generic as possible" became a slippery slope of 
 inclusion. Allow enum strings. Allow alias this strings.

But an enum with base string type can be passed as a string. The PR in 
question is working around a limitation of the Phobos trait that says 
something derived from a string isn't really usable as a string (when it 
is).

The problem I see is that when Phobos says something isn't true when it 
really is, it causes no end of confusion (*cough* autodecoding).

static assert(!isSomeString!T);
// yet...
string s = someT;
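
A compilable version of the same observation, as a sketch. The enum `Greeting` is a hypothetical stand-in for `T`:

```d
import std.traits : isSomeString;

// Hypothetical string-based enum standing in for T.
enum Greeting : string { hello = "hello" }

// Phobos does not classify the enum as a string type...
static assert(!isSomeString!Greeting);
// ...yet the language considers it implicitly convertible to string.
static assert(is(Greeting : string));

void main()
{
    string s = Greeting.hello; // compiles via implicit conversion
    assert(s == "hello");
}
```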

 
 How about no.
 
 User: "I have this enum string str and phobos won't consider it a 
 string. Help!"
 
 Another user: "Just use str.representation if you want to pass str 
 around as a string."
 

User: "OK, but when should I use representation? I already pass it 
around as a string and it works fine. Why can't phobos comprehend that, 
when the language has no problems with it?"

-Steve

May 07 2021

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 7 May 2021 at 15:51:39 UTC, Steven Schveighoffer wrote:
 But an enum with base string type can be passed as a string.

"Can be passed as a" is not the same as "is a". There's a 
conversion involved.

For better or for worse, D templates do not participate in 
conversion and we shouldn't pretend that they do. This is often 
times very useful - you don't want to lose information in many 
templates. But there's other times when that information doesn't 
matter and it would be nice if you didn't have to think about 
it....

...so maybe we should consider changing templates so they can 
participate at the language level... it would be interesting if 
the compiler did the conversions BEFORE instantiating any 
template. Then it can reuse the instances more easily too. I 
think it actually does for const params for example, but it could 
do more.

 User: "OK, but when should I use representation? I already pass 
 it around as a string and it works fine. Why can't phobos 
 comprehend that, when the language has no problems with it?"

But the language DOES have problems with it for certain types of 
functions. Phobos is trying to deny that reality.

May 07 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 12:30 PM, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:51:39 UTC, Steven Schveighoffer wrote:
 But an enum with base string type can be passed as a string.

 
 "Can be passed as a" is not the same as "is a". There's a conversion 
 involved.

YES! Int is not floating point, but yes you can initialize a floating 
point from an int.

BTW it's worse than I feared. There are 104 occurrences of StringTypeOf 
in phobos. There should be 0.

May 07 2021

Paul Backus <snarwin gmail.com> writes:

On Friday, 7 May 2021 at 16:30:26 UTC, Adam D. Ruppe wrote:
 For better or for worse, D templates do not participate in 
 conversion and we shouldn't pretend that they do. This is often 
 times very useful - you don't want to lose information in many 
 templates. But there's other times when that information 
 doesn't matter and it would be nice if you didn't have to think 
 about it....

We can already *almost* express this in the language. This code 
works:

     void fun(T : string, T val)() {
         pragma(msg, "instantiated with ", T.stringof);
     }

     enum E : string { x = "hello" }

     alias test = fun!(E, E.x);
     // prints: instantiated with E

But if you try to write it the more natural way, with the value 
parameter first, and have the compiler deduce the type, you get 
an error:

     void fun(T val, T : string)() {
         pragma(msg, "instantiated with ", T.stringof);
     }
     // Error: undefined identifier `T`

May 07 2021

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 7 May 2021 at 17:02:17 UTC, Paul Backus wrote:
 We can already *almost* express this in the language. This code 
 works:

eeeeh that's a compile time argument and it still isn't actually 
a string.

What I'm talking about is like in the normal function:

void test(string s) {
         writeln(s);
}

enum Test : string {
         a = "foo"
}

test(Test.a);


The conversion to string happens outside `test`. So caller 
instead of callee, whereas with a template - any template - the 
exact type is passed, what T:string is saying is that the callee 
*can* do the conversion if it wants to inside, but the compiler 
won't actually do it for you.

This is very useful in a lot of cases. Like if you do

void foo(T : SomeBase)(T t) {}

and pass foo(new Derived()), you can still see the whole Derived 
type and thus do some reflection and such over it, with the 
compiler promising that it can be converted to SomeBase if you 
want to.

Of course, in this case, it is not really different than a 
template constraint. You could do

void foo(T)(T t) if(is(T : SomeBase)) {}

and get that same rejection behavior. But of course what's nice 
about specialization is you can then add an overload

void foo(T : SomeBase)(T t) {}
void foo(T : Derived)(T t) {}

And if you get like

class Derived : SomeBase {}
class OtherBranch : SomeBase{}

and call

foo(new Derived()); // goes to second overload as it is a more
                    // specific match
foo(new OtherBranch()); // goes to first overload as it is the
                        // best option available, but it still can see
                        // it is OtherBranch inside there, unlike a
                        // normal interface cast where you'd only
                        // have that detail at runtime.
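
The pieces above can be assembled into one runnable sketch. The string return values are an addition made purely so the chosen overload and the deduced `T` can be checked; the class and function names follow the post:

```d
class SomeBase {}
class Derived : SomeBase {}
class OtherBranch : SomeBase {}

// Each overload reports which one was picked and the static type it saw.
string foo(T : SomeBase)(T t) { return "base overload, T = " ~ T.stringof; }
string foo(T : Derived)(T t)  { return "derived overload, T = " ~ T.stringof; }

void main()
{
    // The more specialized overload wins when both match.
    assert(foo(new Derived()) == "derived overload, T = Derived");

    // Only the SomeBase overload matches, but T is still OtherBranch.
    assert(foo(new OtherBranch()) == "base overload, T = OtherBranch");
}
```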

May 07 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 5/7/21 12:30 PM, Adam D. Ruppe wrote:
 On Friday, 7 May 2021 at 15:51:39 UTC, Steven Schveighoffer wrote:
 But an enum with base string type can be passed as a string.

 
 "Can be passed as a" is not the same as "is a". There's a conversion 
 involved.

But that's the intention of the function. format doesn't care what the 
expression really is, it wants some type of string.

How do you say "I want to accept something that's a string, but I want 
it as a string please"

 For better or for worse, D templates do not participate in conversion 
 and we shouldn't pretend that they do. This is often times very useful - 
 you don't want to lose information in many templates. But there's other 
 times when that information doesn't matter and it would be nice if you 
 didn't have to think about it....

e.g. format.

 ...so maybe we should consider changing templates so they can 
 participate at the language level... it would be interesting if the 
 compiler did the conversions BEFORE instantiating any template. Then it 
 can reuse the instances more easily too. I think it actually does for 
 const params for example, but it could do more.

Interesting idea!

 
 User: "OK, but when should I use representation? I already pass it 
 around as a string and it works fine. Why can't phobos comprehend 
 that, when the language has no problems with it?"

 
 But the language DOES have problems with it for certain types of 
 functions. Phobos is trying to deny that reality.

What I mean is, I can write:

void foo(string s);

and it works for enums that are string-based. Why doesn't format work 
with that same principle? The answer is because there isn't a good way 
to do it.

-Steve

May 07 2021

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 7 May 2021 at 17:11:32 UTC, Steven Schveighoffer wrote:
 How do you say "I want to accept something that's a string, but 
 I want it as a string please"

Well, one way we can do that today is to have the template 
forward to a normal function, or a normal function forward to a 
template.

void format(T...)(const char[] s, T args) {
       format(asRangeOfDchar(s), args);
}

void format(Range, T...)(Range r, T args) 
if(isAppropriateRange!Range) {
        // actual impl based on the range interface
        // and actually tbh I'd personally take another step
        // and collapse all these down even more.
}

Then a whole bunch of conversions are done to match `const 
char[]` and the template is then working with that entry point 
instead of the whole plate.

This of course assumes isAppropriateRange is false for anything 
that isn't actually already a range. And I'm assuming string is 
not already a range. Otherwise you enter back into the hell of 
not only saying what you accept, but having to exclude things too.
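
A runnable sketch of the forwarding idea, under some assumptions: `countUnits` and `countUnitsImpl` are hypothetical names invented here (deliberately distinct, since in today's Phobos a string already counts as a range), and `std.utf.byCodeUnit` stands in for the `asRangeOfDchar` helper mentioned above:

```d
import std.range.primitives : isInputRange;
import std.utf : byCodeUnit;

// Hypothetical thin front end: an ordinary function, so the language
// applies its implicit conversions at the call site.
size_t countUnits(const(char)[] s)
{
    return countUnitsImpl(s.byCodeUnit);
}

// Hypothetical implementation: it sees only a range, never the
// original argument type.
size_t countUnitsImpl(R)(R r)
if (isInputRange!R)
{
    size_t n;
    foreach (unit; r)
        ++n;
    return n;
}

enum Test : string { a = "foo" }

void main()
{
    char[3] buf = "bar";
    assert(countUnits(Test.a) == 3);  // enum converts to its string base
    assert(countUnits(buf) == 3);     // static array is sliced implicitly
    assert(countUnits("hello") == 5); // plain string
}
```

The template is instantiated once for the code-unit range, no matter which of the three argument types the caller started with.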


So let me rant.

I think it was actually a mistake for Phobos to UFCS shoe-horn in 
range functions on arrays too - this includes strings as well as 
int[] and such as well. Lots of new users ask why they can't do 
the same thing. And like Phobos took this opportunity to do silly 
things like autodecoding which we all hate now, but I don't think 
the freestanding ufcs range functions should exist at all.

Just have the user fetch a range out of the container. Then they 
get in that habit with other containers too and it moves a bunch 
of ugly code out of every consuming function.

Heck the `asRange` thing itself might have a variety of overloads 
it forwards to.


MyRange asRangeHelper(const char[] s) { return MyRange(s); }

auto asRange(T)(T t) { /* generic stuff */ }
auto asRange(T : const char[])(T t) { return asRangeHelper(t); } 
// let the language convert it in these specializations

and so on and so forth.



This is a half-baked rant im sure you can destroy at will. But 
like I'm pretty sure if we did develop this it would be nicer 
overall than what we have now.


 The answer is because there isn't a good way to do it.

And it is possible the language could insert some magic to make 
it easier if we really put our thinking caps on.

May 07 2021

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 7 May 2021 at 18:17:31 UTC, Adam D. Ruppe wrote:
 void format(T...)(const char[] s, T args) {
       format(asRangeOfDchar(s), args);
 }

oh i should have added of course you can do the wchar and dchar 
overloads here too. yeah yeah i know "DRY" but like it is a 
trivial forwarder, get over it.

May 07 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 5/7/21 2:17 PM, Adam D. Ruppe wrote:
 I think it was actually a mistake for Phobos to UFCS shoe-horn in range 
 functions on arrays too - this includes strings as well as int[] and 
 such as well.

The most common range BY FAR in all of D code is an array.

The end result of something like you allude to would result in nearly 
all of phobos NOT working with arrays.

Just a taste:

int[] arr = genArray;
arr.sort(); // fail.

I don't want to go to that place, ever.

-Steve

May 07 2021

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 7 May 2021 at 18:44:26 UTC, Steven Schveighoffer wrote:
 The end result of something like you allude to would result in 
 nearly all of phobos NOT working with arrays.

int[5] arr;
arr.sort(); // fails, you need to use []

Array!int arr;
arr.sort(); // fails, you need to use []

some random phobos functions special-case this to make it work 
which is the real wtf and those should be undone, just get the 
user to slice a static array.

So I'd just make it all consistent.
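
For the static array case, a minimal sketch of what "use []" looks like in practice (the array contents are arbitrary):

```d
import std.algorithm.sorting : sort;
import std.range.primitives : isInputRange;

void main()
{
    int[5] arr = [5, 3, 1, 4, 2];

    // A static array is not itself a range...
    static assert(!isInputRange!(int[5]));
    // ...but the slice obtained with [] is one.
    static assert(isInputRange!(int[]));

    arr[].sort(); // sorts the static array in place through the slice
    assert(arr == [1, 2, 3, 4, 5]);
}
```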


But tbh I don't feel that strongly about it... except for string. 
string should no longer be a range. Delete its popFront overload 
and let the user pick byCodeUnit or byCodePoint or whatever. Just 
rip that band aid right off.
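
A small sketch of the difference between the implicit behaviour and the explicit choices. It uses `byCodeUnit` and `byDchar` from `std.utf`; `byDchar` is picked here as the existing code point adapter, which is an assumption about what "byCodePoint" would mean:

```d
import std.range.primitives : walkLength;
import std.utf : byCodeUnit, byDchar;

void main()
{
    string s = "\u00e9"; // one code point, two UTF-8 code units

    assert(s.length == 2);                // array length counts code units
    assert(s.walkLength == 1);            // range interface autodecodes
    assert(s.byCodeUnit.walkLength == 2); // explicit: iterate code units
    assert(s.byDchar.walkLength == 1);    // explicit: iterate code points
}
```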

Just even for the others, even if the [] was deemed unacceptable, 
i don't love the ufcs solution.

So many people try to do freestanding functions for other types, 
inspired by the phobos popFront.. and isInputRange fails because 
phobos itself must import the ufcs module. Other new people do 
foo.empty and it fails because they didn't import the module.

So like even if the behavior remained the same as today, I'd like 
to define it a little differently.

but meh dont wanna continue too far down this particular thing 
since it is the part of my rant i care the least about.

May 07 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 2:44 PM, Steven Schveighoffer wrote:
 On 5/7/21 2:17 PM, Adam D. Ruppe wrote:
 I think it was actually a mistake for Phobos to UFCS shoe-horn in 
 range functions on arrays too - this includes strings as well as int[] 
 and such as well.

 
 The most common range BY FAR in all of D code is an array.
 
 The end result of something like you allude to would result in nearly 
 all of phobos NOT working with arrays.
 
 Just a taste:
 
 int[] arr = genArray;
 arr.sort(); // fail.
 
 I don't want to go to that place, ever.
 
 -Steve

Yah, ranges are a generalization of arrays. It would be odd if the 
generalization of arrays didn't work when tried with arrays.

May 07 2021

NonNull <non-null use.startmail.com> writes:

On Friday, 7 May 2021 at 20:53:08 UTC, Andrei Alexandrescu wrote:
 On 5/7/21 2:44 PM, Steven Schveighoffer wrote:
 On 5/7/21 2:17 PM, Adam D. Ruppe wrote:
 I think it was actually a mistake for Phobos to UFCS 
 shoe-horn in range functions on arrays too - this includes 
 strings as well as int[] and such as well.

 
 The most common range BY FAR in all of D code is an array.
 
 The end result of something like you allude to would result in 
 nearly all of phobos NOT working with arrays.

 Yah, ranges are a generalization of arrays. It would be odd if 
 the generalization of arrays didn't work when tried with arrays.

No. Ranges are not a generalization of arrays unless you ignore 
the most important feature of the notion of a Range. An array is 
a sequence of things in space: a spatial container (all values 
stored) that happens to be a sequence. A Range is a sequence of 
things in time. (Purist definition, often true in practice.)

A spatial container can be /exploded/ into a sequence in time. 
And a sequence in time can be /accreted/ into a spatial container 
(whether it has sequence or not).

Explode is a natural idea and could be defined for any spatial 
container, producing a Range from a spatial container, and 
specifically from an array.

Making a distinction of spatial and temporal makes sense.

May 12 2021

Paul Backus <snarwin gmail.com> writes:

On Wednesday, 12 May 2021 at 14:49:35 UTC, NonNull wrote:
 On Friday, 7 May 2021 at 20:53:08 UTC, Andrei Alexandrescu
 Yah, ranges are a generalization of arrays. It would be odd if 
 the generalization of arrays didn't work when tried with 
 arrays.

 No. Ranges are not a generalization of arrays unless you ignore 
 the most important feature of the notion of a Range. An array 
 is a sequence of things in space: a spatial container (all 
 values stored) that happens to be a sequence. A Range is a 
 sequence of things in time. (Purist definition, often true in 
 practice.)

Ranges are a generalization of arrays (or slices, if you prefer) 
in the same way that iterators are a generalization of pointers. 
In both cases, certain features of the specialized version are 
ignored or left out in the generalized version. As you've 
correctly pointed out, one of those ignored features is the 
array's layout in memory. A range *may* store all of its elements 
in memory, or it may not; as users of the range API, we are not 
supposed to know or care.

May 12 2021

NonNull <non-null use.startmail.com> writes:

On Wednesday, 12 May 2021 at 15:08:46 UTC, Paul Backus wrote:
 On Wednesday, 12 May 2021 at 14:49:35 UTC, NonNull wrote:
 No. Ranges are not a generalization of arrays unless you 
 ignore the most important feature of the notion of a Range. An 
 array is a sequence of things in space: a spatial container 
 (all values stored) that happens to be a sequence. A Range is 
 a sequence of things in time. (Purist definition, often true 
 in practice.)

 Ranges are a generalization of arrays (or slices, if you 
 prefer) in the same way that iterators are a generalization of 
 pointers. In both cases, certain features of the specialized 
 version are ignored or left out in the generalized version. As 
 you've correctly pointed out, one of those ignored features is 
 the array's layout in memory. A range *may* store all of its 
 elements in memory, or it may not; as users of the range API, 
 we are not supposed to know or care.

This is the standard pattern of the interpretation of the meaning 
of Range. It is more concrete. I want the idea of range to escape 
its historical semantic origins.

I am suggesting a different and cleaner interpretation of that 
meaning. One that draws a deliberate line between space and time 
as a means of motivating language design.

Instead of treating a spatial data structure as a range by simply 
ignoring its non-range features and using only range operations, 
I am suggesting a semantic hard line be drawn between the two. A 
range could be obtained by exploding a 
spatial data structure (array say) and regarded as a distinct 
entity. Concretely the latent temporal sequence of things taken 
from the spatial data structure (the derived range) could be 
regarded as semantically quite different and separate from that 
data structure.

While some may consider this a distinction without a difference, 
it does nevertheless change how one might relate a range to a 
spatial data structure in a programming language.

My view leads to an explicit explode operation of some kind on 
all occasions, whereas yours can munge together range stuff with 
other operations on spatial data structures, so that your spatial 
structure IS a range and abstraction is avoided. Moving away from 
the historical semantics to the semantics I suggest above and 
having that guide language design separates those concerns.

The idea of /explode/ is a nice intuitive fundamental concept 
that is concealed and entangled in D right now. Things could be 
less baroque. Specifically, arrays would then be treated the same 
way as any other spatial data structure. They would not be ranges.

May 12 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 11:51 AM, Steven Schveighoffer wrote:
 On 5/7/21 11:30 AM, Andrei Alexandrescu wrote:
 On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
 If you mean we shouldn't support it (as an ambiguous case) in 
 *conversion* utilities (i.e. to/from string), then this makes some 
 sense. But it's also not straightforward. Sometimes you WANT to 
 convert from the enum to the base type. Sometimes you want to convert 
 to the enum name. Going backwards (string to enum), which one makes 
 more sense? It depends on context. It also doesn't help that a string 
 enum implicitly converts to a string. The language is going to 
 circumvent any policies Phobos has on that front.

 Enums are poorly designed, but that's only a small part of the problem.

 The bigger problem is the corruption of a noble principle. We wanted 
 to be as generic as possible, and indeed in the beginning that seemed 
 not only possible, but also easy. I don't think there's any other 
 language or library supporting different character widths with this 
 little aggravation.

 Then this whole "be as generic as possible" became a slippery slope of 
 inclusion. Allow enum strings. Allow alias this strings.

 
 But an enum with base string type can be passed as a string. The PR in 
 question is working around a limitation of the Phobos trait that says 
 something derived from a string isn't really usable as a string (when it 
 is).

Well you see here is the problem. An enum with base string can be 
coerced to a string, but is not a true subtype of string. This came to a 
head with ranges, too - you can pop off the head of a string and still have 
a string, but if you pop off the head of an enum string you get some 
enum value that is not present in the set of enum values. Concatenation 
has similar problems, e.g. s ~ s for enum strings yields string, not an 
enum string. (Weirdly s ~= s works...)

So enum strings break ISA/Liskov. Alias this also does due to an 
overwhelming number of errors in its design and implementation.
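
A sketch of the two observations about string enums in the paragraphs above, using a hypothetical enum `E`:

```d
import std.range.primitives : popFront;

// Hypothetical string-based enum, used only for illustration.
enum E : string { a = "foo" }

void main()
{
    E e = E.a;

    // Concatenation leaves the enum type: the result is a plain string.
    auto joined = e ~ e;
    static assert(is(typeof(joined) == string));
    assert(joined == "foofoo");

    // popFront cannot shrink an E in place.
    static assert(!__traits(compiles, e.popFront()));

    // After converting to string, the usual range operations apply.
    string s = e;
    s.popFront();
    assert(s == "oo");
}
```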

 The problem I see is that when Phobos says something isn't true when it 
 really is, it causes no end of confusion (*cough* autodecoding).
 
 static assert(!isSomeString!T);
 // yet...
 string s = someT;

This only shows that we have a baroque language that allows user-defined 
conversions from non-strings to strings. The code above is NO PROOF that 
T is supposed to be a string.

 How about no.

 User: "I have this enum string str and phobos won't consider it a 
 string. Help!"

 Another user: "Just use str.representation if you want to pass str 
 around as a string."

 
 User: "OK, but when should I use representation? I already pass it 
 around as a string and it works fine. Why can't phobos comprehend that, 
 when the language has no problems with it?"

"When you want a string".

May 07 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 5/7/21 12:43 PM, Andrei Alexandrescu wrote:
 On 5/7/21 11:51 AM, Steven Schveighoffer wrote:
 On 5/7/21 11:30 AM, Andrei Alexandrescu wrote:
 User: "I have this enum string str and phobos won't consider it a 
 string. Help!"

 Another user: "Just use str.representation if you want to pass str 
 around as a string."

 User: "OK, but when should I use representation? I already pass it 
 around as a string and it works fine. Why can't phobos comprehend 
 that, when the language has no problems with it?"

 
 "When you want a string".
 

Sorry, let's jump out of the fake dialog here for a second.

The problem I have is, you have a function like:

foo(T)(T s) if (isSomeString!T)

The *intention* here is that, I want to NOT have to write:

foo(string s) { impl }
foo(wstring s) { impl }
foo(dstring s) { impl }
... // etc with const, mutable

BUT, if I have an enum that converts to a string, then if I actually DID 
write all those, then it would compile. However, the template version 
does not. This is the confusion that a user and library author has.
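
The mismatch can be reproduced with a short sketch. `viaOverload`, `viaTemplate` and the enum `E` are hypothetical names chosen for illustration:

```d
import std.traits : isSomeString;

// Hypothetical string-based enum.
enum E : string { a = "foo" }

// Hand-written overload: the language converts the enum at the call site.
size_t viaOverload(string s) { return s.length; }

// Constrained template: receives the exact type, so the enum is rejected.
size_t viaTemplate(T)(T s)
if (isSomeString!T)
{
    return s.length;
}

void main()
{
    assert(viaOverload(E.a) == 3);
    static assert(!__traits(compiles, viaTemplate(E.a)));

    // An explicit conversion at the call site satisfies the template.
    assert(viaTemplate(cast(string) E.a) == 3);
}
```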

I think the problem here is that the language doesn't give you a good 
way to express that. So we rely on template constraints that both can't 
exactly express that intention, and where the approximations create 
various template instantiations that cause strange problems (i.e. if you 
accept an enum that converts to string, it's still an enum inside the 
template). Whereas the language

I'm not suggesting any specific changes here, but I recognize there is a 
disconnect from what we *want* to express, and what the language provides.

-Steve

May 07 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
 I think the problem here is that the language doesn't give you a good 
 way to express that. So we rely on template constraints that both can't 
 exactly express that intention, and where the approximations create 
 various template instantiations that cause strange problems (i.e. if you 
 accept an enum that converts to string, it's still an enum inside the 
 template). Whereas the language

I forgot to finish this thought, got interrupted.

Whereas the language (with non-template parameters) does the matching 
and conversion simultaneously without needing special cases.

-Steve

May 07 2021

Daniel N <no public.email> writes:

On Friday, 7 May 2021 at 17:16:06 UTC, Steven Schveighoffer wrote:
 On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
 I think the problem here is that the language doesn't give you 
 a good way to express that. So we rely on template constraints 
 that both can't exactly express that intention, and where the 
 approximations create various template instantiations that 
 cause strange problems (i.e. if you accept an enum that 
 converts to string, it's still an enum inside the template). 
 Whereas the language

 I forgot to finish this thought, got interrupted.

 Whereas the language (with non-template parameters) does the 
 matching and conversion simultaneously without needing special 
 cases.

 -Steve

What's wrong with this?

void fun(T : string)(T t)

May 07 2021

Adam D. Ruppe <destructionator gmail.com> writes:

On Friday, 7 May 2021 at 17:27:18 UTC, Daniel N wrote:
 What's wrong with this?

 void fun(T : string)(T t)

That doesn't convert to string. It allows it to compile because T 
*can* be converted to string and thus it is the closest 
specialization it can get, but it does NOT actually convert it.

----
import std.stdio;

enum Test : string {
         a = "foo"
}

void test2(T:string)(T t) {
         pragma(msg, T); // Test, not string!
         writeln(t);
}

void main() {
         test2(Test.a);
}
-----

May 07 2021

Steven Schveighoffer <schveiguy gmail.com> writes:

On 5/7/21 1:27 PM, Daniel N wrote:
 On Friday, 7 May 2021 at 17:16:06 UTC, Steven Schveighoffer wrote:
 On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
 I think the problem here is that the language doesn't give you a good 
 way to express that. So we rely on template constraints that both 
 can't exactly express that intention, and where the approximations 
 create various template instantiations that cause strange problems 
 (i.e. if you accept an enum that converts to string, it's still an 
 enum inside the template). Whereas the language

 I forgot to finish this thought, got interrupted.

 Whereas the language (with non-template parameters) does the matching 
 and conversion simultaneously without needing special cases.

 
 What's wrong with this?
 
 void fun(T : string)(T t)

Because T is not a string.

e.g. for a string-based enum, t.popFront won't work.
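
A minimal sketch of that failure, assuming a Phobos where `std.range.primitives.popFront` takes its string by `ref`. The names `Color`, `canPop` and `dropFirst` are made up for illustration and are not part of any library:

```d
import std.range.primitives : popFront;

// Hypothetical string-based enum, used only for illustration.
enum Color : string { red = "red" }

// T is deduced as Color, not string, so the range primitive does not apply.
bool canPop(T : string)(T t)
{
    return __traits(compiles, t.popFront());
}

// Converting to the base type first gives a real string that can be popped.
string dropFirst(string s)
{
    s.popFront();
    return s;
}

void main()
{
    static assert(!canPop(Color.red)); // a Color lvalue cannot bind to `ref string`
    static assert(canPop("red"));      // a plain string works
    assert(dropFirst(Color.red) == "ed"); // implicit conversion at the call site
}
```

The non-template `dropFirst` accepts the enum because the conversion happens at the parameter boundary; inside the template no conversion ever takes place.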

-Steve

May 07 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
 The problem I have is, you have a function like:
 
 foo(T)(T s) if (isSomeString!T)
 
 The *intention* here is that, I want to NOT have to write:
 
 foo(string s) { impl }
 foo(wstring s) { impl }
 foo(dstring s) { impl }
 ... // etc with const, mutable
 
 BUT, if I have an enum that converts to a string, then if I actually DID 
 write all those, then it would compile. However, the template version 
 does not. This is the confusion that a user and library author has.

Of course. I understand that very well. But that's a minor confusion and 
inconvenience; people understand very well that e.g. this won't work:

void foo(float);
void foo(double);
void main() { foo(1); }

The reason is slightly different but the point is the same: 
convertibility has its subtleties and programming languages comprehend 
small surprises.

Supporting enum strings and alias this at the huge cost we incur now is 
definitely over two standard deviations away from what's reasonable.

 I think the problem here is that the language doesn't give you a good 
 way to express that. So we rely on template constraints that both can't 
 exactly express that intention, and where the approximations create 
 various template instantiations that cause strange problems (i.e. if you 
 accept an enum that converts to string, it's still an enum inside the 
 template). Whereas the language
 
 I'm not suggesting any specific changes here, but I recognize there is a 
 disconnect from what we *want* to express, and what the language provides.

That I am on board with.

May 07 2021

Q. Schroll <qs.il.paperinik gmail.com> writes:

On Friday, 7 May 2021 at 17:05:08 UTC, Steven Schveighoffer wrote:
 The problem I have is, you have a function like:
 ```D
 auto foo(T)(T s) if (isSomeString!T) { impl }
 ```
 The *intention* here is that, I want to NOT have to write:
 ```D
 auto foo(string s) { impl }
 auto foo(wstring s) { impl }
 auto foo(dstring s) { impl }
 ... // etc with const, mutable
 ```
 BUT, if I have an enum that converts to a string, then if I 
 actually DID write all those, then it would compile. However, 
 the template version does not. This is the confusion that a 
 user and library author has.

Maybe this is special casing here, but if you have a finite list 
of types you want to support, it might be easier to add an 
`AliasSeq` of all string types to `std.traits` or so and use

```D
static foreach (String; Strings)
auto foo(String s) { impl }
```

Looks generic, but actually isn't. The implementation bloat is a 
different beast though.
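
A runnable sketch of that idea, under the assumption that a `Strings` list exists (Phobos does not define one; it is declared locally here). `codeUnits` and `Color` are placeholder names:

```d
import std.meta : AliasSeq;

// Assumed list of supported string types; not an actual std.traits symbol.
alias Strings = AliasSeq!(string, wstring, dstring);

// One ordinary (non-template) overload per listed type.
static foreach (String; Strings)
{
    size_t codeUnits(String s) { return s.length; }
}

// Hypothetical string-based enum, used only for illustration.
enum Color : string { red = "red" }

void main()
{
    string  s = "abc";
    wstring w = "abc"w;
    dstring d = "abc"d;
    Color   c = Color.red;

    assert(codeUnits(s) == 3);
    assert(codeUnits(w) == 3);
    assert(codeUnits(d) == 3);
    // The enum converts implicitly because these are ordinary functions.
    assert(codeUnits(c) == 3);
}
```

Each overload is compiled separately, which is the implementation bloat mentioned above.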

May 07 2021

deadalnix <deadalnix gmail.com> writes:

On Friday, 7 May 2021 at 16:43:20 UTC, Andrei Alexandrescu wrote:
 Well you see here is the problem. An enum with base string can 
 be coerced to a string, but is not a true subtype of string. 
 This came to a head with ranges, too - you can pop off the head 
 of a string still have a string, but if you pop off the head of 
 an enum string you get some enum value that is not present in 
 the set of enum values. Concatenation has similar problems, 
 e.g. s ~ s for enum strings yields string, not an enum string. 
 (Weirdly s ~= s works...)

Popping the head out of an enum value ought to be a string, not 
that enum's value. I don't really see where the problem is here, 
this is subtyping 101.

I have pointed out a few times in the past that unsound 
operations are allowed (as in "Weirdly s ~= s 
works..."), but I don't think turning compiler bugs into standard 
library policies is going to lead to better tomorrows.

May 09 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/9/21 8:57 PM, deadalnix wrote:
 On Friday, 7 May 2021 at 16:43:20 UTC, Andrei Alexandrescu wrote:
 Well you see here is the problem. An enum with base string can be 
 coerced to a string, but is not a true subtype of string. This came to 
 a head with ranges, too - you can pop off the head of a string still 
 have a string, but if you pop off the head of an enum string you get 
 some enum value that is not present in the set of enum values. 
 Concatenation has similar problems, e.g. s ~ s for enum strings yields 
 string, not an enum string. (Weirdly s ~= s works...)

 
 Popping the head out of an enum value ought to be a string, not that 
 enum's value. I don't really see where the problem is here, this is 
 subtyping 101.

So you have a range r of type T.

You call r.popFront().

Obviously the type of r should stay the same because in D variables 
don't change type.

So... what gives, young Padawan?

No, this is not subtyping 101.

May 09 2021

deadalnix <deadalnix gmail.com> writes:

On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D 
 variables don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.

If you have a range of T, then you've got to return a T. I'm not 
sure what the problem is here. Do you have a concrete example?

All I can think of are things like slicing and the like, and they 
should obviously return a string, not a T.

May 10 2021

Paul Backus <snarwin gmail.com> writes:

On Monday, 10 May 2021 at 12:19:07 UTC, deadalnix wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu 
 wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D 
 variables don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.

 If you have a range of T, then you've got to return a T. I'm not 
 sure what the problem is here. Do you have a concrete example?

 All I can think of are things like slicing and the like, and they 
 should obviously return a string, not a T.

popFront doesn't return a value, it mutates. So `r` before 
popFront and `r` after popFront must be the same type, because 
they are the same variable.

If popFront for a string enum is `r = r[1 .. $]`, and typeof(r[1 
.. $]) != typeof(r), then it doesn't work, and string enums can't 
be ranges (from which it follows that they are not 
Liskov-substitutable for strings).

May 10 2021

deadalnix <deadalnix gmail.com> writes:

On Monday, 10 May 2021 at 13:30:52 UTC, Paul Backus wrote:
 popFront doesn't return a value, it mutates. So `r` before 
 popFront and `r` after popFront must be the same type, because 
 they are the same variable.

 If popFront for a string enum is `r = r[1 .. $]`, and 
 typeof(r[1 .. $]) != typeof(r), then it doesn't work, and 
 string enums can't be ranges (from which it follows that they 
 are not Liskov-substitutable for strings).

r = r[1 .. $] is an error unless r actually is a string. You 
cannot mutate an enum value and have it stay an enum.

If you think that invalidates the LSP, I'm afraid there is a big 
misunderstanding about the LSP. Not all operations on a subtype 
have to return said subtype. It is made clearer if you consider 
the slicing operation as a member function on an object instead, 
as it seems classes and inheritance are the only way OOP is 
understood these days.

class A {
    A slice(int start, int end) { ... }
}

class B : A {}

Where is it implied that B's version of the slice operation must 
return an A? Nowhere, the LSP absolutely doesn't mandate that. It 
mandates that you can pass a B to something that expects an A, and 
that thing will behave the way you'd expect.

And it does!

If your code needs an A, then you mark it as accepting an A as 
input. If I have a B and want to pass it to your code, I can too, 
transparently. You do not even need to know about the existence 
of B when you write your code. This is what the LSP is at its 
core.

Back to our string example, the code should accept string (A), 
with zero knowledge of the existence of any enum string (B). You 
should be able to pass a B to that code and have everything work 
as expected.

The argument that the enum string is not a subtype because it 
breaks the LSP is nonsense; this in fact demonstrates that the type 
system is unsound, and that it is the type system that breaks the 
LSP. And this is why people end up desperately trying to 
re-implement it in libraries, which results in a ton more work and 
complexity for everybody involved.

May 10 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/10/21 5:55 PM, deadalnix wrote:
 On Monday, 10 May 2021 at 13:30:52 UTC, Paul Backus wrote:
 popFront doesn't return a value, it mutates. So `r` before popFront 
 and `r` after popFront must be the same type, because they are the 
 same variable.

 If popFront for a string enum is `r = r[1 .. $]`, and typeof(r[1 .. 
 $]) != typeof(r), then it doesn't work, and string enums can't be 
 ranges (from which it follows that they are not Liskov-substitutable 
 for strings).

 
 r = r[1 .. $] is an error unless r actually is a string. You cannot 
 mutate an enum value and have it stay an enum.
 
 If you think that invalidates the LSP, I'm afraid there is a big 
 misunderstanding about the LSP. Not all operations on a subtype have to 
 return said subtype. It is made clearer if you consider the slicing 
 operation as a member function on an object instead, as it seems classes 
 and inheritance are the only way OOP is understood these days.
 
 class A {
     A slice(int start, int end) { ... }
 }
 
 class B : A {}
 
 Where is it implied that B's version of the slice operation must return 
 an A?

If we move the goalposts we can with certain ease create the illusion 
that a lot of things are possible and even easy. This works very well in 
forum discussions where all that is needed is eloquence and the perseverance to 
answer every post with one that just slightly moves the discussion 
around so it appears to have answers to every objection and have the 
last word on any topic. This is exactly what happens here - half of your 
points contradict the other half, but never in the same post and the 
appearance is you seem to have easy answers to everything.

In the initial days of ranges we actually considered making popFront() 
a tail() that returns by value instead. So instead of today's 
form (given a range r):

for (; !r.empty; r.popFront) {
    ... use r.front ...
}

we'd have had:

for (; !r.empty; r = r.tail) {
    ... use r.front ...
}

This doesn't change things much (and wouldn't improve the situation with 
enums) but does open up the possibility - what if r.tail() actually 
returns a type different from r?

In all interesting cases that means r = r.tail wouldn't work anymore, 
which complicates range algorithms A LOT. They'd need to use recursion 
instead of iteration:

void someRangeFunction(R)(R range) {
     if (range.empty) {
         ... empty case ...
     } else {
         ... do some work for range.front ...
         return someRangeFunction(range.tail);
     }
}

(I should note that that's actually of interest for immutable ranges, 
for the simple reason they aren't assignable.)

At any rate, we decided this would complicate everything in Phobos way 
too much (and I think that was a correct prediction) so we chose to have 
popFront() mutate the current range.
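
To make the trade-off concrete, here is a small sketch of a tail-style primitive whose result type differs from its argument type. It is not how Phobos works; `isEmpty`, `tail`, `countCodeUnits` and `Color` are hypothetical names chosen for this illustration:

```d
// Hypothetical string-based enum, used only for illustration.
enum Color : string { red = "red" }

// Hypothetical primitives defined on the base type only. An enum value
// converts implicitly on the way in, but the result is a plain string.
bool isEmpty(string s) { return s.length == 0; }
string tail(string s)  { return s[1 .. $]; }

// Recursive formulation: each call may instantiate with a different R,
// so it does not matter that tail(r) has a different type than r.
size_t countCodeUnits(R)(R r)
{
    if (isEmpty(r))
        return 0;
    return 1 + countCodeUnits(tail(r));
}

void main()
{
    Color c = Color.red;

    // The iterative form needs `c = tail(c)`, which does not type-check:
    // a string does not convert implicitly back to Color.
    static assert(!__traits(compiles, c = tail(c)));

    assert(countCodeUnits(c) == 3);      // instantiated with Color, then string
    assert(countCodeUnits("abcd") == 4);
}
```

The recursion compiles for the enum, while the assignment-based loop does not, which is the complication described above.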

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 15:33:45 UTC, Andrei Alexandrescu 
wrote:
 If we move the goalposts we can with certain ease create the 
 illusion that a lot of things are possible and even easy.

 [...]

 At any rate, we decided this would complicate everything in 
 Phobos way too much (and I think that was a correct prediction) 
 so we chose to have popFront() mutate the current range.

I don't think that any of what you wrote is incorrect, and these 
are even reasonable tradeoffs as far as I can tell.

I would however like to recall where this whole thing started:

format!SomeEnumString(...) is expected to work for users.

Not that SomeEnumString is a full fledged range or anything, 
simply that you can pass it down to phobos, or anything else for 
that matter, in place where a string is expected.

This is a reasonable expectation.

It is also a reasonable expectation that this shouldn't require a 
ton of scaffolding to work, in phobos or elsewhere.

Therefore, the fact that phobos required scaffolding to make this 
work is indicative that there is a deeper problem. Focusing on 
finding what that deeper problem is and fixing it seems like a 
healthier path forward than simply pretending there is no problem 
and pushing it all on the users.

In this case, it was noted here ( 
https://forum.dlang.org/post/umndraexmrxiyrmfpcyo forum.dlang.org 
) that the root cause of the problem might be that there is a 
conflation between the container and the range. I think this is a 
reasonable hypothesis. Having two things trying to do one thing 
is a very typical source of such problems.

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 12:26 PM, deadalnix wrote:
 I however would like to remind where this whole thing starts from:
 
 format!SomeEnumString(...) is expected to work for users.

Reasonable, though I should add that it's a decision made by the author 
of the format() API.

 Not that SomeEnumString is a full fledged range or anything, simply that 
 you can pass it down to phobos, or anything else for that matter, in 
 place where a string is expected.

Reasonable, though again a matter of API definition. Would you expect 
this to work?

float sin(float x);
double sin(double x);
real sin(real x);
...
auto x = sin(1);

Shouldn't that work? Not that int is a full fledged floating point 
number or anything, simply that you can pass it down to phobos, or 
anything else for that matter, in place where a floating point number is 
expected.

Oh, but wait, it's the templates. Great.

T sin(T)(T x) if (isFloatingPoint!T);
...
auto x = sin(1);

Shouldn't that work? Not that int is a full fledged floating point 
number or anything, simply that you can pass it down to phobos, or 
anything else for that matter, in place where a floating point number is 
expected.

Well an argument can be made that it should work, or the API designer 
can wisely choose to NOT yield true from isFloatingPoint!int.

And if we explore this madness further, we get to an enormity just as 
awful as StringTypeOf:

template FloatingPointTypeOf(T) {
     static if (isIntegral!T) {
         alias FloatingPointTypeOf = T;
     } else ...
}

And then whenever we need a floating point type we use 
is(FloatingPointTypeOf!T) like a bunch of dimwits.

What use case does that help? Who is helped by that? Someone who can't 
bring themselves to convert whatever they have to double prior to using 
the standard library.

Arguably not a good design.

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 12:57 PM, Andrei Alexandrescu wrote:
 
 template FloatingPointTypeOf(T) {
      static if (isIntegral!T) {
          alias FloatingPointTypeOf = T;
      } else ...
 }

Correx:


template FloatingPointTypeOf(T) {
     static if (isIntegral!T) {
         alias FloatingPointTypeOf = double;
     } else ...
}

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 16:57:13 UTC, Andrei Alexandrescu 
wrote:
 Reasonable, though again a matter of API definition. Would you 
 expect this to work?

 float sin(float x);
 double sin(double x);
 real sin(real x);
 ...
 auto x = sin(1);

 Shouldn't that work? Not that int is a full fledged floating 
 point number or anything, simply that you can pass it down to 
 phobos, or anything else for that matter, in place where a 
 floating point number is expected.


It's debatable. There are many languages out there where it 
doesn't.

I think your case here is disingenuous, because an int is not a 
special kind of float. We are explicitly outside of the scope of 
the argument being made to begin with. Whatever conclusion we 
reach using int and float would have no bearing on what should 
happen for string and SomeEnumString.

However, in D, it is possible to do:

enum SomeEnumInt : int;

This is for instance used in std.encoding. I'm not sure if this 
works with float or not, but assuming that it does, then this 
absolutely and unambiguously works:

enum SomeEnumFloat : float;
SomeEnumFloat f = ...;
auto x = sin(f);

Here, x would have type float, based on `float sin(float x)`.

 Well an argument can be made that it should work, or the API 
 designer can wisely choose to NOT yield true from 
 isFloatingPoint!int.

An argument could be made, however, this is not the argument I am 
making, so I don't really see the point of bringing this up.

 And if we explore this madness further, we get to an enormity 
 just as awful as StringTypeOf:

 template FloatingPointTypeOf(T) {
     static if (isIntegral!T) {
         alias FloatingPointTypeOf = T;
     } else ...
 }

 And then whenever we need a floating point type we use 
 is(FloatingPointTypeOf!T) like a bunch of dimwits.

 What use case does that help? Who is helped by that? Someone 
 who can't bring themselves to convert whatever they have to 
 double prior to using the standard library.

 Arguably not a good design.

This is indeed not a good design, but also isn't really required 
if the places requiring a float can consistently accept 
SomeEnumFloat, because in this case, it turtles transparently all 
the way down.

May 11 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/11/2021 12:14 PM, deadalnix wrote:
 I think your case here is disingenuous, because an int is not a special kind 
 of float.

D has no notion of a "special kind of type". It only has a notion of 
"implicitly convertible".

* An int is implicitly convertible to a float.

* An enum is implicitly convertible to its base type.

The two *must* behave the same way, or the language falls apart with hackish 
special cases that will never work in a predictable manner.

One could design a language with two kinds of conversions:

1. is-a-special-case-of
2. is-implicitly-convertible-to

but D isn't it.

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 19:37:55 UTC, Walter Bright wrote:
 On 5/11/2021 12:14 PM, deadalnix wrote:
 I think your case here is disingenuous, because an int is not 
 a special kind of float.

 D has no notion of a "special kind of type". It only has a 
 notion of "implicitly convertible".

 * An int is implicitly convertible to a float.

 * An enum is implicitly convertible to its base type.

 The two *must* behave the same way, or the language falls apart 
 with hackish special cases that will never work in a 
 predictable manner.

 One could design a language with two kinds of conversions:

 1. is-a-special-case-of
 2. is-implicitly-convertible-to

 but D isn't it.

Except, it is.

D has numerous instances of both already and pretending it doesn't 
really isn't going to lead anywhere useful.

And this very thread is indeed proof that "the language falls 
apart with hackish special cases that will never work in a 
predictable manner."

The fact is that you can't get rid of 1. and support OOP, because 
polymorphism is a key ingredient of OOP. And we even go as far as 
to talk about some of the metaprogramming techniques in D as 
being compile time polymorphism, so this is clearly a road we 
want to embark on.

The alternative is to go full functional on these things, and, as 
Andrei explained with the tail example, this is an option that 
works as well, but you have to write everything in functional 
style, which makes some code harder to write.

Personally, I'm not interested in D going full functional, 
because I appreciate that different ideas are better expressed in 
different paradigms. But I understand that it means that we must 
have 1.

Now, do we need 2. ? Strictly speaking, we do not. We could just 
say that string float conversion and vice versa must be explicit. 
We can remove alias this, and whatever other feature of the 
language does implicit conversion. I'm actually confident that in 
some cases, that would be a win, but also that we are too far 
gone to realistically be able to remove 2.

So we have both, we need to live with both, and make sensible 
decisions based on that. Pretending that we don't have both only 
leads to the guarantee that we'll make more bad decisions on that 
front in the future.

May 11 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Tuesday, 11 May 2021 at 19:56:05 UTC, deadalnix wrote:
 On Tuesday, 11 May 2021 at 19:37:55 UTC, Walter Bright wrote:
 [...]

 Except, it is.

 D has numerous instances of both already and pretending it 
 doesn't really isn't going to lead anywhere useful.

 [...]

Remove alias this support for classes and replace it with compile 
time default interface methods.

-Alex

May 11 2021

Walter Bright <newshound2 digitalmars.com> writes:

On 5/11/2021 12:56 PM, deadalnix wrote:
 The fact is that you can't get rid of 1. and support OOP, because polymorphism 
 is a key ingredient of OOP.

Converting a derived class reference to a base class reference is an 
"implicitly convert" operation, not a special-kind-of conversion.

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 00:22:56 UTC, Walter Bright wrote:
 On 5/11/2021 12:56 PM, deadalnix wrote:
 The fact is that you can't get rid of 1. and support OOP, 
 because polymorphism is a key ingredient of OOP.

 Converting a derived class reference to a base class reference 
 is an "implicitly convert" operation, not a special-kind-of 
 conversion.

That is trivially demonstrably false. Consider:

class A {}
class B : A {}

B function() implicitly converts to A function()

But

byte function() doesn't implicitly convert to int function()

Clearly, the implicit conversion from byte to int is of a different 
nature than the one from B to A, and one doesn't have to dig very 
deep to find these differences.

Now, mind you, this is not a problem. At all. After all, B is a 
subtype of A, while byte is not a subtype of int. There are 
different kinds of implicit conversions. This is perfectly sound 
and required if D wants to have implicit conversion of things 
which aren't subtypes of each other. There is no way around it.

Let's just not pretend it's the same, because it is from these 
erroneous assumptions that bad design grows.
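
The distinction can be checked with a handful of `is` expressions. This is only a sketch of the claim above; the enum `Color` is a made-up example type:

```d
class A {}
class B : A {}

// Hypothetical string-based enum, used only for illustration.
enum Color : string { red = "red" }

// All of these are implicit conversions on values...
static assert(is(byte : int));
static assert(is(int : float));
static assert(is(B : A));
static assert(is(Color : string));

// ...but only the class conversion carries over to function return types.
static assert( is(B function() : A function()));
static assert(!is(byte function() : int function()));

void main() {}
```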

May 11 2021

Paul Backus <snarwin gmail.com> writes:

On Monday, 10 May 2021 at 21:55:54 UTC, deadalnix wrote:
 If you think that invalidates the LSP, I'm afraid there is a big 
 misunderstanding about the LSP. Not all operations on a subtype 
 have to return said subtype. It is made clearer if you consider 
 the slicing operation as a member function on an object instead, 
 as it seems classes and inheritance are the only way OOP is 
 understood these days.

 class A {
    A slice(int start, int end) { ... }
 }

 class B : A {}

 Where is it implied that B's version of the slice operation 
 must return an A? Nowhere, the LSP absolutely doesn't mandate 
 that. It mandates that you can pass a B to something that 
 expects an A, and that thing will behave the way you'd expect.

 And it does!

 If your code needs an A, then you mark it as accepting an A as 
 input. If I have a B and want to pass it to your code, I can 
 too, transparently. You do not even need to know about the 
 existence of B when you write your code. This is what the LSP 
 is at its core.

 Back to our string example, the code should accept string (A), 
 with zero knowledge of the existence of any enum string (B). 
 You should be able to pass a B to that code and have everything 
 work as expected.

I concede the points that enum strings do not violate the LSP, 
and that they are subtypes of string. You're right, and I was 
wrong.

The point I should have made is that, at least in D, the LSP is 
not universal. There are situations where it simply does not 
apply. In particular, it does not guarantee that a substitution 
which changes the arguments used to instantiate a template will 
succeed; e.g.,

     class A { int x; }
     class B : A { int y; }

     void example(T)(T obj) {
         static assert(!__traits(hasMember, T, "y"));
     }

`example(new A)` will compile, but `example(new B)` will 
not--because they are not actually calling the same function. One 
calls `example!A` and the other calls `example!B`. This is an 
unavoidable consequence of the expressive power of D's templates: 
without specific knowledge about `example`'s implementation, we 
cannot guarantee anything about the relationship between 
`example!A` and `example!B`.

All of which is to say, the fact that you can pass a string as an 
argument to a template does not *necessarily* imply that you can 
pass an enum string as an argument to the same template. That 
`format` handles them differently does not "fly in the face of 
Liskov's substitution principle" [1], any more than my example 
above does.

[1] 
https://forum.dlang.org/post/fnibsejuozasspsggxie forum.dlang.org

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 22:00:57 UTC, Paul Backus wrote:
 I concede the points that enum strings do not violate the LSP, 
 and that they are subtypes of string. You're right, and I was 
 wrong.

Thanks.

 The point I should have made is that, at least in D, the LSP is 
 not universal. There are situations where it simply does not 
 apply. In particular, it does not guarantee that a substitution 
 which changes the arguments used to instantiate a template will 
 succeed; e.g.,

 [...]

 All of which is to say, the fact that you can pass a string as 
 an argument to a template does not *necessarily* imply that you 
 can pass an enum string as an argument to the same template. 
 That `format` handles them differently does not "fly in the 
 face of Liskov's substitution principle" [1], any more than my 
 example above does.

 [1] 
 https://forum.dlang.org/post/fnibsejuozasspsggxie forum.dlang.org

That is true, and there are definitely cases where it is 
unavoidable.

However, I don't think format fits that bill, because format does 
expect a string, not any random type. What I'm getting at is a 
bit complicated to express clearly, because types are effectively 
also "values" that you can pass around at compile time, but let 
me try.

We should reasonably expect the LSP to work when what is passed 
down is the value of the enum, but not when it is its type, which, 
in fact, isn't too surprising because the type itself isn't 
subject to the LSP.

Consider:

class A{}
class B : A {}

// We should expect the LSP to hold true here, because the value is
// the only argument passed down to foo.
void foo(A a);
// There is no expectation that bar(new A) and bar(new B) behave
// consistently, because not only the value is passed down, but also
// the type.
void bar(T)(T t);

While we expect passing down the value to respect the LSP, no 
such expectation can exist for the type. So in the second 
example, while we expect the runtime parameter `t` to conform to 
the LSP, we do not expect the compile time parameter `T` to do 
so. However, if we do not change the value of `T` but pass a B 
down to `t`, then we should get back to a situation where the LSP 
is respected.

For instance:

// We expect this to be well behaved when it comes to the LSP, vs. say
// bar(new A()), because the only change happened to the value
// parameter, which is supposed to uphold the LSP.
bar!A(new B());

So far, so good, I don't think this is too controversial, even 
though it is confusing to express that concept clearly.
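
A small sketch of the difference, with a hypothetical template `instantiatedWith` that simply reports the type it was instantiated with:

```d
class A { int x; }
class B : A { int y; }

// Hypothetical template: returns the name of its type parameter.
string instantiatedWith(T)(T obj) { return T.stringof; }

void main()
{
    // Type deduced from the argument: two different instantiations.
    assert(instantiatedWith(new A) == "A");
    assert(instantiatedWith(new B) == "B");

    // Type fixed explicitly: only the runtime value changes, and the B is
    // passed through the ordinary subtype conversion to A.
    assert(instantiatedWith!A(new B) == "A");
}
```

In the last call the template cannot observe that a `B` was passed, which is why substitutability is expected to hold there.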

Now, with enum string, there is an interesting twist, because 
they can be passed at compile time too. In theory, that should 
not change anything when it comes to the LSP, but in practice, it 
seems like it does, which is IMO where the root of the problem is.

Consider:

string format(string S, A...)(A args);

While S is a compile time parameter, it is not a type parameter, 
but a value parameter. In that case, it is expected as per the 
LSP that I can pass down string, or any subtype of strings as the 
first compile time parameter of format, and this ought to work as 
expected.

May 12 2021

Paul Backus <snarwin gmail.com> writes:

On Wednesday, 12 May 2021 at 23:08:24 UTC, deadalnix wrote:
 Now, with enum string, there is an interesting twist, because 
 they can be passed at compile time too. In theory, that should 
 not change anything when it comes to the LSP, but in practice, 
 it seems like it does, which is IMO where the root of the 
 problem is.

 Consider:

 string format(string S, A...)(A args);

 While S is a compile time parameter, it is not a type 
 parameter, but a value parameter. In that case, it is expected 
 as per the LSP that I can pass down string, or any subtype of 
 strings as the first compile time parameter of format, and this 
 ought to work as expected.

This *does* work as expected: https://run.dlang.io/is/Ru9phk

The issue with `format` is that it takes an alias parameter, not 
a value parameter--and the reason it does *that* is to support 
string, wstring, and dstring with a single overload.
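
The contrast can be sketched without involving `format` itself. In the example below, `Fmt`, `lenByValue` and `lenByAlias` are hypothetical names, and the `is(typeof(S) == string)` check merely stands in for an `isSomeString`-style constraint:

```d
// Hypothetical string-based enum holding a format string.
enum Fmt : string { greeting = "hello %s" }

// Value parameter: the argument is converted to string when it is bound.
size_t lenByValue(string S)() { return S.length; }

// Alias parameter with a strict type check; the enum type stays visible.
size_t lenByAlias(alias S)() if (is(typeof(S) == string)) { return S.length; }

static assert(lenByValue!(Fmt.greeting)() == 8);
static assert(lenByAlias!"hello %s"() == 8);
static assert(!__traits(compiles, lenByAlias!(Fmt.greeting)()));

void main() {}
```

The value-parameter form accepts the enum but is tied to a single string type, which is the trade-off described above.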

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 23:31:21 UTC, Paul Backus wrote:
 This *does* work as expected: https://run.dlang.io/is/Ru9phk

 The issue with `format` is that it takes an alias parameter, 
 not a value parameter--and the reason it does *that* is to 
 support string, wstring, and dstring with a single overload.

Yes, so we are getting at the root of this.

I know these things work; this is why I stated that SomeEnumString 
is a subtype of string to begin with: it has all the properties. 
If that wasn't working, then I would have been mistaken when 
making such assertions.

It is working in the simple case, it is expected to work from the 
caller's standpoint due to the LSP, but it doesn't work in 
practice due to some obscure implementation detail that is of 
little concern to the user.

Pushing this on the user is not the way to go.

If the library writer desires to bundle string/dstring/wstring in 
the same implementation, this doesn't change the fact that it 
ought to work with subtypes. Choosing to break this is what 
"flies in the face of the LSP".

I would also like to see people think about what makes respecting 
the LSP challenging in such cases, and see what can be done at a 
systemic level. It's kind of a bummer that the path of least 
resistance is to break the LSP when going for more genericity in 
another dimension.

May 12 2021

Paul Backus <snarwin gmail.com> writes:

On Wednesday, 12 May 2021 at 23:42:11 UTC, deadalnix wrote:
 It is working in the simple case, it is expected to work from 
 the caller's standpoint due to the LSP, but it doesn't work in 
 practice due to some obscure implementation detail that is of 
 little concern to the user.

 Pushing this on the user is not the way to go.

 If the library writer desires to bundle string/dstring/wstring 
 in the same implementation, this doesn't change the fact that 
 it ought to work with subtypes. Choosing to break this is what 
 "flies in the face of the LSP".

Well, no, it doesn't--because, again, the LSP doesn't apply here 
in the first place, and never has. Flies in the face of user 
expectations, perhaps--though even then, if the user looks at the 
documentation and sees `isSomeString!(typeof(fmt))`, is it really 
reasonable for them to expect that a non-string type will be 
accepted?

I think it's a reasonable API design decision to support any type 
that implicitly converts to a string type, but it's not the 
*only* reasonable decision, and we ought to acknowledge the costs 
as well as the benefits.

Personally, my inclination is to err on the side of making the 
standard library a little more complex so that user code can be 
simpler, but Andrei makes a convincing argument that this 
tendency has gotten us into trouble before [1]. How do we decide 
where to draw the line? There has to be some principle here 
beyond just "users expect it" and "respect the LSP."

[1] https://forum.dlang.org/thread/q6plhj$1l9$1 digitalmars.com

 I would also like to see people think about what makes respecting the 
 LSP challenging in such cases, and see what can be done at a 
 systemic level. It's kind of a bummer that the path of least 
 resistance is to break the LSP when going for more genericity 
 in another dimension.

IMO this is all downstream of D's choice to use untyped templates 
as opposed to typed generics (a tradeoff that goes all the way 
back to Lisp vs. ML). It's a fun thought experiment to imagine a 
version of D that took the other path, but there's not much we 
can do about it now.

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Thursday, 13 May 2021 at 01:03:19 UTC, Paul Backus wrote:
 On Wednesday, 12 May 2021 at 23:42:11 UTC, deadalnix wrote:
 It is working in the simple case, it is expected to work from 
 the caller's standpoint due to the LSP, but it doesn't work in 
 practice due to some obscure implementation detail that is of 
 little concern to the user.

 Pushing this on the user is not the way to go.

 If the library writer desires to bundle string/dstring/wstring 
 in the same implementation, this doesn't change the fact that 
 it ought to work with subtypes. Choosing to break this is what 
 "flies in the face of the LSP".

 Well, no, it doesn't--because, again, the LSP doesn't apply 
 here in the first place, and never has. Flies in the face of 
 user expectations, perhaps--though even then, if the user looks 
 at the documentation and sees `isSomeString!(typeof(fmt))`, is 
 it really reasonable for them to expect that a non-string type 
 will be accepted?

 I think it's a reasonable API design decision to support any 
 type that implicitly converts to a string type, but it's not 
 the *only* reasonable decision, and we ought to acknowledge the 
 costs as well as the benefits.

 Personally, my inclination is to err on the side of making the 
 standard library a little more complex so that user code can be 
 simpler, but Andrei makes a convincing argument that this 
 tendency has gotten us into trouble before [1]. How do we 
 decide where to draw the line? There has to be some principle 
 here beyond just "users expect it" and "respect the LSP."

While what you say is correct, I'm not convinced it is right.

We established before that effectively, we should expect the LSP 
to hold when values are passed down, but not when types are. 
Which I think we both agree is the reasonable thing to do here, 
because B being a subtype of A doesn't say anything about 
meta_typeof(B) being a subtype of meta_typeof(A), and therefore 
there is no expectation that the LSP holds.

So it is correct to assert that if format takes the type as a 
parameter, then there is no expectation that the LSP holds. It is 
also correct to say that the documentation describes things 
accurately.

But I strongly disagree that it is right.

To use an analogy, I could make a car where the gas and brake 
pedals are swapped, and explain as much in the user manual, yet I 
fully expect people would crash such cars at a higher rate than 
the alternative.

In the case of format, we need to ask ourselves what the user 
expects: to pass a value down, or to pass a type (plus possibly a 
value) down? Because if it is the first, then it is reasonable 
from the user's standpoint that the LSP works, and if it is the 
second, then there isn't such an expectation.

The fact that we see people trying to do format!SomeEnumString, 
but not something like format!42, provides a good answer to that 
question. Format's parameter is expected to be a string, not any 
random type. And if that is the case, then it is reasonable to 
expect the LSP to hold.

Now, the matter of cost is an interesting one. But I argue that 
doing what the user expects ought to be cheap, if not the cheapest 
option available. This is simply the difference between a 
language that helps its users and a language that gets in the 
way. So if the cost is high, then we need to consider this high 
cost a serious problem to solve.

May 13 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/12/21 6:00 PM, Paul Backus wrote:
 On Monday, 10 May 2021 at 21:55:54 UTC, deadalnix wrote:
 If you think that invalidates the LSP, I'm afraid there is a big 
 misunderstanding about the LSP. Not all operations on a subtype have to 
 return said subtype. It is made clearer if you consider the slicing 
 operation as a member function on an object instead - as it seems 
 classes and inheritance is the only way OOP is understood these days.

 class A {
    A slice(int start, int end) { ... }
 }

 class B : A {}

 Where is it implied that B's version of the slice operation must 
 return an A? Nowhere, the LSP absolutely doesn't mandate that. It 
 mandates that you can pass a B to something that expects an A, and that 
 thing will behave the way you'd expect.

 And it does!

 If your code needs an A, then you mark it as accepting an A as input. 
 If I have a B and want to pass it to your code, I can too, 
 transparently. You do not even need to know about the existence of B 
 when you write your code. This is what the LSP is at its core.

 Back to our string example, the code should accept string (A), with 
 zero knowledge of the existence of any enum string (B). You should be 
 able to pass a B to that code and have everything work as expected.

 
 I concede the points that enum strings do not violate the LSP, and that 
 they are subtypes of string. You're right, and I was wrong.

I was all over run.dlang.org like "Sure that's not going to work... wait 
a second, it does! But that other thing's not going to work... what, 
that works too!" I didn't know D's enums are _that_ odd.

It seems you can do almost everything with an enum that you can do with 
its base type. Keyword being "almost". For example,

x ~= "asd";

works whether x is a string or an enum based on string. However,

x = x ~ "asd";

works if x is a string and does not work if x is an enum derived from 
string. Therefore, a function using that expression works for strings 
but not for enum strings.

Similarly:

x += 3;

works for int and enums derived from int. However,

x = x + 3;

does not. So you can't transparently substitute enums for their base 
type. I suspect there'd be other cases, too.
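
The asymmetry described above can be probed at compile time. The following is only a sketch: the enums `Color` and `Level` are made up for illustration, and the results for the compound operators are printed rather than asserted, because they reflect what the post reports and may differ between compiler versions.

```d
import std.stdio : writeln;

// Hypothetical enums, made up for this illustration.
enum Color : string { red = "red", green = "green" }
enum Level : int { low = 1, high = 2 }

// Compound assignment on an enum variable.
enum bool enumAppend = __traits(compiles, (Color x) { x ~= "asd"; });
enum bool enumPlusEq = __traits(compiles, (Level n) { n += 3; });

// Spelled-out form: the right-hand side has the base type, and the base
// type does not implicitly convert back to the enum.
enum bool enumConcatAssign = __traits(compiles, (Color x) { x = x ~ "asd"; });
enum bool enumAddAssign    = __traits(compiles, (Level n) { n = n + 3; });

// The same spelled-out form on the base types, for comparison.
enum bool stringConcatAssign = __traits(compiles, (string x) { x = x ~ "asd"; });
enum bool intAddAssign       = __traits(compiles, (int n) { n = n + 3; });

void main()
{
    writeln("enum:   x ~= \"asd\"     compiles: ", enumAppend);
    writeln("enum:   x = x ~ \"asd\"  compiles: ", enumConcatAssign);
    writeln("string: x = x ~ \"asd\"  compiles: ", stringConcatAssign);
    writeln("enum:   n += 3         compiles: ", enumPlusEq);
    writeln("enum:   n = n + 3      compiles: ", enumAddAssign);
    writeln("int:    n = n + 3      compiles: ", intAddAssign);
}
```

Using `__traits(compiles, ...)` keeps the whole comparison in one program that builds regardless of which forms the compiler accepts.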

May 12 2021

Paul Backus <snarwin gmail.com> writes:

On Thursday, 13 May 2021 at 01:16:42 UTC, Andrei Alexandrescu 
wrote:
 It seems you can do almost everything with an enum that you can 
 do with its base type. Keyword being "almost". For example,

 x ~= "asd";

 works whether x is a string or an enum based on string. However,

 x = x ~ "asd";

 works if x is a string and does not work if x is an enum 
 derived from string. Therefore, a function using that 
 expression works for strings but not for enum strings.

A template function, you mean? Because (as the rest of the post 
you quoted demonstrates) the LSP does not and has never applied 
(in D) to substitutions that involve different instantiations of 
the same template. If you explicitly instantiate `func!string`, 
then it will work exactly as the LSP dictates, but if you 
substitute `func!string(x)` with `func!E(x)`, you have no 
guarantee.

Granted, the fact that `x ~= "asd"` works and `x = x ~ "asd"` 
doesn't is definitely a bug.
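
A small sketch of the distinction being drawn here, assuming a hypothetical template `exclaim` and a made-up enum `Color` (neither comes from Phobos): the explicit `!string` instantiation accepts the enum value through ordinary implicit conversion, while the deduced instantiation binds `T` to the enum type and its body stops type-checking.

```d
import std.stdio : writeln;

// Hypothetical enum, made up for this illustration.
enum Color : string { red = "red" }

// Hypothetical generic function: the body is checked separately for
// every T it is instantiated with.
T exclaim(T)(T x)
{
    x = x ~ "!";   // the result of ~ has the base string type
    return x;
}

// Explicit instantiation with the base type: the enum value converts
// to string on the way in, as with any non-template function.
enum bool viaBase = __traits(compiles, exclaim!string(Color.red));

// Deduced instantiation: T becomes Color, and the assignment in the
// body is rejected, so this is a different function that fails to build.
enum bool viaEnum = __traits(compiles, exclaim(Color.red));

void main()
{
    writeln(exclaim!string(Color.red)); // prints "red!"
    writeln("exclaim!string(Color.red) compiles: ", viaBase);
    writeln("exclaim(Color.red) compiles:        ", viaEnum);
}
```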

May 12 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/12/21 9:41 PM, Paul Backus wrote:
 On Thursday, 13 May 2021 at 01:16:42 UTC, Andrei Alexandrescu wrote:
 It seems you can do almost everything with an enum that you can do 
 with its base type. Keyword being "almost". For example,

 x ~= "asd";

 works whether x is a string or an enum based on string. However,

 x = x ~ "asd";

 works if x is a string and does not work if x is an enum derived from 
 string. Therefore, a function using that expression works for strings 
 but not for enum strings.

 
 A template function, you mean? Because (as the rest of the post you 
 quoted demonstrates) the LSP does not and has never applied (in D) to 
 substitutions that involve different instantiations of the same 
 template. If you explicitly instantiate `func!string`, then it will work 
 exactly as the LSP dictates, but if you substitute `func!string(x)` with 
 `func!E(x)`, you have no guarantee.
 
 Granted, the fact that `x ~= "asd"` works and `x = x ~ "asd"` doesn't is 
 definitely a bug.

Well the problem is that the choice of covariance of results for 
operations on enums vs their "base" is quite arbitrary.

For strings, the result of "~" is not covariant but the result of "~=" 
is - not only it works, but it returns a reference to the enum type, not 
the base type.

However, for enums derived from integrals the result of "+" is not 
covariant when adding an enum with an integral, but covariant when two 
enums are added together. Same goes for "-", "/", "*", but oddly not for 
"^^". I suspect nobody thought of trying to raise an enum to the power 
of an enum.

The plot thickens when considering enums derived from user-defined types:

void main() {
     import std;
     struct S {
         void fun() { writefln("%s", &this); }
         int min = -1;
     }
     enum X : S { x = S() }
     X x;
     x.fun;
     (cast(S*) &x).fun;
     writeln(x.min);
}

The two addresses are the same, meaning the enum value gets to call the 
base member's function, in a subtyping manner. However, the last line 
doesn't compile, which breaks subtyping.

On the face of it, enums are defined by the language, so whatever 
choices are made are... there. I understand the practicality of some 
choices, but overall the entire enum algebra is quirky and difficult to 
maneuver around in generic code. Which harkens back to the opener of 
this thread - Phobos should not go out of its way to support enumerated 
types everywhere, when a trivial recourse exists on the caller side - 
pass value.representation instead of value.

A much stronger argument could be made against supporting convertibility 
(to e.g. strings or ranges) by means of alias this. Callers should 
convert to the needed type prior to calling into the standard library.

May 12 2021

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Wednesday, May 12, 2021 7:16:42 PM MDT Andrei Alexandrescu via Digitalmars-
d wrote:
 I was all over run.dlang.org like "Sure that's not going to work... wait
 a second, it does! But that other thing's not going to work... what,
 that works too!" I didn't know D's enums are _that_ odd.

 It seems you can do almost everything with an enum that you can do with
 its base type. Keyword being "almost".

Yeah, if enums are supposed to only have a fixed set of values, then they're
completely broken. The language does almost nothing to guarantee it. One
result of that is that you have to be _very_ careful about how you use
something like final switch - especially since it's not checked with
-release.

Of course, if enums are just named values without caring about whether it's
possible to have an enum with a different value than the ones listed, then
the fact that the enum is even treated differently from the base type causes
other problems. So, ultimately, I think that D enums are pretty
schizophrenic and not particularly well-designed.

I've argued in the past that the language should disallow all operations on
enums (aside from casts) which aren't guaranteed to result in a valid value
for that enum type, but not everyone agrees with that stance.
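
As a rough illustration of how an enum variable can end up holding a value that is not one of its members without any cast, consider the sketch below. The flag-style enum `Perm` and the function `describe` are hypothetical names chosen for the example.

```d
import std.algorithm.searching : canFind;
import std.stdio : writeln;
import std.traits : EnumMembers;

// Hypothetical flag-style enum, made up for this illustration.
enum Perm { read = 1, write = 2 }

// final switch assumes p is always one of the listed members.
string describe(Perm p)
{
    final switch (p)
    {
        case Perm.read:  return "read";
        case Perm.write: return "write";
    }
}

void main()
{
    // No cast involved: combining two members of the same enum is
    // still typed as Perm, yet 3 is not one of its members.
    Perm p = Perm.read | Perm.write;

    writeln(cast(int) p);                    // 3
    writeln([EnumMembers!Perm].canFind(p));  // false
    writeln(describe(Perm.read));            // read
    // describe(p) is deliberately not called: p matches no case.
}
```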

- Jonathan M Davis

May 12 2021

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Wednesday, May 12, 2021 8:13:04 PM MDT Jonathan M Davis via Digitalmars-d 
wrote:
 I've argued in the past that the language should disallow all operations on
 enums (aside from casts) which aren't guaranteed to result in a valid value
 for that enum type, but not everyone agrees with that stance.

Or more accurately, all operations on an enum which are not guaranteed to
result in a valid enum value should result in the base type (and thus not be
assignable to a variable of that enum type without a cast), and operations
which mutate the enum should not be allowed unless they're guaranteed to
result in a valid enum value. But regardless, the point is that ideally,
unless a cast is used, it should be impossible to have something typed as an
enum without it being guaranteed that the value be one of the enumerated
values for that enum type. But that's definitely not how D enums work...

- Jonathan M Davis

May 12 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Thursday, 13 May 2021 at 02:43:41 UTC, Jonathan M Davis wrote:
 On Wednesday, May 12, 2021 8:13:04 PM MDT Jonathan M Davis via 
 Digitalmars-d wrote:
 Or more accurately, all operations on an enum which are not 
 guaranteed to result in a valid enum value should result in the 
 base type (and thus not be assignable to a variable of that 
 enum type without a cast), and operations which mutate the enum 
 should not be allowed unless they're guaranteed to result in a 
 valid enum value. But regardless, the point is that ideally, 
 unless a cast is used, it should be impossible to have 
 something typed as an enum without it being guaranteed that the 
 value be one of the enumerated values for that enum type. But 
 that's definitely not how D enums work...

 - Jonathan M Davis


So basically enum should implicitly be declared to be immutable 
right?

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Thursday, 13 May 2021 at 02:43:41 UTC, Jonathan M Davis wrote:
 On Wednesday, May 12, 2021 8:13:04 PM MDT Jonathan M Davis via 
 Digitalmars-d wrote:
 I've argued in the past that the language should disallow all 
 operations on enums (aside from casts) which aren't guaranteed 
 to result in a valid value for that enum type, but not 
 everyone agrees with that stance.

 Or more accurately, all operations on an enum which are not 
 guaranteed to result in a valid enum value should result in the 
 base type (and thus not be assignable to a variable of that 
 enum type without a cast), and operations which mutate the enum 
 should not be allowed unless they're guaranteed to result in a 
 valid enum value. But regardless, the point is that ideally, 
 unless a cast is used, it should be impossible to have 
 something typed as an enum without it being guaranteed that the 
 value be one of the enumerated values for that enum type. But 
 that's definitely not how D enums work...

 - Jonathan M Davis

YES!

May 13 2021

deadalnix <deadalnix gmail.com> writes:

On Monday, 10 May 2021 at 12:19:07 UTC, deadalnix wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu 
 wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D 
 variables don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.

 If you have a range of T, then you've got to return a T. I'm not 
 sure what the problem is here. Do you have a concrete example?

 All I can think of are things like slicing and alike, and they 
 should obviously return a string, not a T.

More to the point, consider this:

class String {
private:
     immutable(char)[] value;

public:
     this(immutable(char)[] value) { this.value = value; }

     // ...
}

class EnumString : String {
public:
     static EnumString value1() { return new EnumString("value1"); }
     static EnumString value2() { return new EnumString("value2"); }

private:
     this(immutable(char)[] value) { super(value); }
}

While the implementation differs, conceptually, from a theory 
standpoint, this is the same. This is using a subtype to 
constrain instances of a type (String here) to a certain set of 
possible values. When using the subtype (EnumString) you have the 
knowledge that it is limited to some values, and you lose that 
knowledge as soon as you convert to the parent type.

But instead, we get some bastardised monster from the compiler 
that's not quite a subtype, but not quite something else 
that really makes sense either. As expected, this nonsense ends up 
spilling into user code, and then the standard lib, based on user 
constraints, and everybody is left choosing between bad tradeoffs 
down the road because the whole house of cards is built on shaky 
foundations.

The bad news is, there is already a language like this. It's 
called C++, and it's actually quite successful. With all due 
respect to you and Walter, you guys are legends, but I think 
there is also a bit of learned helplessness coming from both of 
you due to a lifetime of exposure to the soul corroding effects 
of C++.

This attitude pervades everything, and most language constructs 
suffer from some form of it in one way or another, causing a 
cascade of bad side effects, starting with this whole thread. A 
few examples, in no particular order: DIP1000, delegate context 
qualifiers, functions vs first class functions, etc...

Back to the case of enum, it is obviously and trivially a 
subtype. In fact, even the syntax is the same:

enum Foo: string { ... }

Handling enum strings should never have been a special case that 
was added to phobos, because it should never have been a special 
case to begin with, in phobos or elsewhere.

May 10 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Monday, 10 May 2021 at 21:44:02 UTC, deadalnix wrote:
 The bad news is, there is already a language like this. It's 
 called C++, and it's actually quite successful. With all due 
 respect to you and Walter, you guys are legends, but I think 
 there is also a bit of learned helplessness coming from both of 
 you due to a lifetime of exposure to the soul corroding effects 
 of C++.

Not sure how this applies to C++, what subtyping issues are you 
having with C++?


 This attitude pervades everything, and most language 
 constructs suffer from some form of it in one way or another, 
 causing a cascade of bad side effects, starting with this whole 
 thread. A few examples, in no particular order: DIP1000, delegate 
 context qualifiers, functions vs first class functions, etc...

That's a direct result of the process. Features have always been 
added as an experiment rather than being completed on paper, even 
the ones with a DIP. At this point, this pretty much defines what 
D is... Just look at the addition of a C compiler that is being 
advanced right now. It is being added because there might be some 
benefits from it in the future, perhaps. Of course, you also have 
the side effect that the AST becomes more resistant to change... 
and refactoring costs double...

So that is why D has these issues. People wanted something, and 
it was added in an experimental way, not in an analytical way. 
That is the way of D. Experiment in features.

Ideally D should have boosted meta programming and cut down on 
features to the bare minimum. Literals should have been a compile 
time type... and alias should bind to them, strings should've 
been a library construct, etc etc.

But if you look at the features being added, meta programming is 
not in focus. So this won't change. Features are being added that 
have nothing to do with metaprogramming (memory safety, C interop, 
etc.).

D will continue to evolve experimentally.

So there will never be a small core language that is consistent.

It is what it is, at this point.

May 10 2021

deadalnix <deadalnix gmail.com> writes:

On Monday, 10 May 2021 at 22:37:51 UTC, Ola Fosheim Grøstad wrote:
 On Monday, 10 May 2021 at 21:44:02 UTC, deadalnix wrote:
 The bad news is, there is already a language like this. It's 
 called C++, and it's actually quite successful. With all due 
 respect to you and Walter, you guys are legends, but I think 
 there is also a bit of learned helplessness coming from both 
 of you due to a lifetime of exposure to the soul corroding 
 effects of C++.

 Not sure how this applies to C++, what subtyping issues are you 
 having with C++?

Function types don't have the right covariance/contravariance, you 
can slice subtypes, and there are more, but this is not my point.

My point is that we already have a language that is a mixed bag 
of accidentally defined features that don't compose properly with 
each other. I don't need one more of these, I already have one, 
and, let's be frank, it has at the very least an order of 
magnitude more support in the wild, in tools and so on.

Doing the same thing with less manpower is a futile exercise.

 This attitude pervades everything, and most language 
 constructs suffer from some form of it in one way or another, 
 causing a cascade of bad side effects, starting with this 
 whole thread. A few examples, in no particular order: DIP1000, 
 delegate context qualifiers, functions vs first class 
 functions, etc...

 That's a direct result of the process. Features have always 
 been added as an experiment rather than being completed on 
 paper, even the ones with a DIP. At this point, this pretty 
 much defines what D is... Just look at the addition of a C 
 compiler that is being advanced right now. It is being added 
 because there might be some benefits from it in the future, 
 perhaps. Of course, you also have the side effect that the AST 
 becomes more resistant to change... and refactoring costs 
 double...

 So that is why D has these issues. People wanted something, and 
 it was added in an experimental way, not in an analytical way. 
 That is the way of D. Experiment in features.

Sure, but look at this thread. D is crumbling under the weight, 
not of the number of features, but of the fact that a large portion 
of them simply are unsound.

At this point, the decision made is to push the madness on the 
user. Fair enough, but if the standard lib devs are not willing 
to put up with it, why in hell would you expect anyone else to? 
Just look at what's in the C++ standard lib or boost and compare 
to your average C++ project to see the kind of gap that exists, 
in terms of motivation to put up with bullshit, between standard 
lib devs and Joe coder. It's not even close.

"This stuff ain't working properly, so let's just give up on 
getting it to work at all" is not how you iterate toward a great 
useful product.

May 10 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:
 My point is that we already have a language that is a mixed bag 
 of accidentally defined features that don't compose properly 
 with each other. I don't need one more of these, I already 
 have one, and, let's be frank, it has at the very least an 
 order of magnitude more support in the wild, in tools and so on.

Yes, I think everyone can agree with this. A good starting point 
would be to implement proper unification, as was discussed some 
months ago. This is critical for composing types in a sensible 
manner (composing templates of templates and binding them to a 
simple name that is exported).

Then one can look and see if some types/features that are 
builtins can be expressed with the same building blocks in a 
unification process (somehow).

When you see what cannot fit into this machinery you get a 
feeling for which features need to be redesigned.

Something like that.

 Doing the same thing with less manpower is a futile exercise.

Yes.


 Sure, but look at this thread. D is crumbling under the weight, 
 not of the number of features, but of the fact that a large 
 portion of them simply are unsound.

Yes, but designing something that is sound is best done by having 
a tiny set of (theoretical) mechanisms that all other features 
can be expressed with (even though that might not be visible to 
the end user).

It is very difficult to even discuss soundness with no 
constructive framework to represent ideas with.

 Just look at what's in the C++ standard lib or boost and 
 compare to your average C++ project to see the kind of gap that 
 exists, in terms of motivation to put up with bullshit, between 
 standard lib devs and Joe coder. It's not even close.

Yes, even stuff that is well designed in C++ is a lot of work. 
Implementing a new container library with all the iterators is 
quite verbose, tedious and typos will happen...

I think defining protocols and making mechanisms available that 
can extend types with protocols is the way to go (concepts is one 
step in the right direction). How to do it? Not sure, but it 
seems like templating by itself is not enough really.

E.g. if ranges-functionality should be available to everything 
that can be treated like a sequence, then this should be a 
protocol that is present in all the builtin types that are 
sequential. Or somehow bound to them in some global fashion 
(kinda like injected into the type). Nothing should be special 
cased. Ideally.

But there is no clear model for how to do that, I think.

However it is tied to unification. Deduce the protocol if 
possible.

May 10 2021

Imperatorn <johan_forsberg_86 hotmail.com> writes:

On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:

 At this point, the decision made is to push the madness on the 
 user. Fair enough, but if the standard lib devs are not willing 
 to put up with it, why in hell would you expect anyone else to? 
 Just look at what's in the C++ standard lib or boost and 
 compare to your average C++ project to see the kind of gap that 
 exists, in terms of motivation to put up with bullshit, between 
 standard lib devs and Joe coder. It's not even close.

 "This stuff ain't working properly, so let's just give up on 
 getting it to work at all" is not how you iterate toward a great 
 useful product.

+1

We *must* focus more on consistency and soundness imo. I've heard 
several users talk about this. So it's nice to see it being 
talked about here. The way for D forward is to polish up D2. 
Maybe have 2.100.0 as a goal. Like any project, it needs 
milestones. We should take a pause, look around and see, we're 
now in the "optimizing" phase. We can at least try. Then after 
2.100.0 for example we can start talking about new cool features 
again.

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/10/21 6:58 PM, deadalnix wrote:

 
 Sure, but look at this thread. D is crumbling under the weight, not of 
 the number of features, but of the fact that a large portion of them 
 simply are unsound.
 
 At this point, the decision made is to push the madness on the user. 
 Fair enough, but if the standard lib devs are not willing to put up with 
 it, why in hell would you expect anyone else to? Just look at what's in 
 the C++ standard lib or boost and compare to your average C++ project to 
 see the kind of gap that exists, in terms of motivation to put up with 
 bullshit, between standard lib devs and Joe coder. It's not even close.
 
 "This stuff ain't working properly, so let's just give up on getting it 
 to work at all" is not how you iterate toward a great useful product.

In case you're referring to deprecating support for enum strings in 
phobos - definitely that's not pushing any madness anywhere. Adding said 
support was a mistake in the first place.

May 11 2021

Mathias LANG <geod24 gmail.com> writes:

On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:
 Sure, but look at this thread. D is crumbling under the weight, 
 not of the number of features, but of the fact that a large 
 portion of them simply are unsound.

 At this point, the decision made is to push the madness on the 
 user. Fair enough, but if the standard lib devs are not willing 
 to put up with it, why in hell would you expect anyone else to? 
 Just look at what's in the C++ standard lib or boost and 
 compare to your average C++ project to see the kind of gap that 
 exists, in terms of motivation to put up with bullshit, between 
 standard lib devs and Joe coder. It's not even close.

 "This stuff ain't working properly, so let's just give up on 
 getting it to work at all" is not how you iterate toward a great 
 useful product.

Well, this thread is 11 pages and shows no sign of winding down.
In the meantime, has anyone looked at the code that sparked this 
outrage?

[As I mentioned in the PR](https://github.com/dlang/phobos/pull/8029#issuecomment-834221552), 
the issue wouldn't have happened if the `fmt` template parameter 
was a `string` and not an `alias`.

 Q: why is fmt an alias and not a simple string ?
 A: No real reason.

The way I see it, the issue is valid, the fix wasn't. `format` 
API should have accepted a `string` and let the compiler perform 
any allowed implicit conversion, instead of taking exactly the 
type via `alias`.
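
The difference between the two parameter styles can be sketched as follows. `takesAlias` and `takesString` are hypothetical stand-ins, not the actual Phobos `format` overloads, and the enum `Fmt` is made up; the sketch also assumes `std.traits.isSomeString` rejects enum types, as discussed in this thread.

```d
import std.stdio : writeln;
import std.traits : isSomeString;

// Hypothetical enum of format strings, made up for this illustration.
enum Fmt : string { greeting = "hello %s" }

// Stand-in for the alias style: the constraint inspects the exact type
// of the argument, which for an enum member is the enum type itself.
string takesAlias(alias fmt)() if (isSomeString!(typeof(fmt)))
{
    return fmt;
}

// Stand-in for the value-parameter style: the argument only has to
// convert implicitly to string.
string takesString(string fmt)()
{
    return fmt;
}

enum bool aliasAcceptsEnum  = __traits(compiles, takesAlias!(Fmt.greeting)());
enum bool stringAcceptsEnum = __traits(compiles, takesString!(Fmt.greeting)());

void main()
{
    writeln("alias parameter accepts the enum:  ", aliasAcceptsEnum);
    writeln("string parameter accepts the enum: ", stringAcceptsEnum);
}
```

With the value parameter, the conversion happens at the template boundary, so the body only ever sees a plain `string`.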

I wish our most competent contributors would find it more 
interesting to direct their attention to Github or promote the 
language to their large Twitter following over engaging in 
flamewar.

May 11 2021

cmyka <mauricehuuskes hotmail.com> writes:

On Wednesday, 12 May 2021 at 05:25:59 UTC, Mathias LANG wrote:
 On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:
 ...

 Well, this thread is 11 pages and shows no sign of winding down.
 In the meantime, has anyone looked at the code that sparked 
 this outrage?

 ...

 I wish our most competent contributors would find it more 
 interesting to direct their attention to Github or promote the 
 language to their large Twitter following over engaging in 
 flamewar.

I support bringing these types of discussions to github (not 
Reddit/Twitter) instead where people can respond to a comment 
directly, or through thumbs up or down and at least edit their 
comments rather than piling on emails sequentially. Or a 
different type of discussion platform entirely.

(That said I am with Adam Ruppe's take on this matter)

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 05:25:59 UTC, Mathias LANG wrote:
 Well, this thread is 11 pages and shows no sign of winding down.
 In the meantime, has anyone looked at the code that sparked 
 this outrage?

 [As I mentioned in the PR](https://github.com/dlang/phobos/pull/8029#issuecomment-834221552), 
 the issue wouldn't have happened if the `fmt` template parameter 
 was a `string` and not an `alias`.

 Q: why is fmt an alias and not a simple string ?
 A: No real reason.

 The way I see it, the issue is valid, the fix wasn't. `format` 
 API should have accepted a `string` and let the compiler 
 perform any allowed implicit conversion, instead of taking 
 exactly the type via `alias`.

 I wish our most competent contributors would find it more 
 interesting to direct their attention to Github or promote the 
 language to their large Twitter following over engaging in 
 flamewar.

If format expects a string, then it is indeed the right thing to 
accept a string :)

But that discussion goes further than this, and is necessary, IMO.

May 12 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/10/21 5:44 PM, deadalnix wrote:
 On Monday, 10 May 2021 at 12:19:07 UTC, deadalnix wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D variables 
 don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.

 If you have a range of T, then you've got to return a T. I'm not sure 
 what the problem is here. Do you have a concrete example?

 All I can think of are things like slicing and alike, and they should 
 obviously return a string, not a T.

 
 More to the point, consider this:
 
 class String {
 private:
      immutable(char)[] value;
 
 public:
      this(immutable(char)[] value) { this.value = value; }
 
      // ...
 }
 
 class EnumString : String {
 public:
      static EnumString value1() { return new EnumString("value1"); }
      static EnumString value2() { return new EnumString("value2"); }
 
 private:
      this(immutable(char)[] value) { super(value); }
 }
 
 While the implementation differs, conceptually, from a theory 
 standpoint, this is the same.

No it isn't.

EnumString and String are reference types. A reference to an enum value 
does not convert to a reference to its representation. Very very very 
VERY different.

 This is using a subtype to constrain 
 instances of a type (String here) to a certain set of possible values. When 
 using the subtype (EnumString) you have the knowledge that it is limited 
 to some value, and you lose that knowledge as soon as you convert to the 
 parent type.

One question that you keep not answering (Paul and I both asked it) is 
how you'd implement the range primitive popFront.

 But instead, we get some bastardised monster from the compiler, that's 
 not quite a subtype, but that's not quite something else that really makes 
 sense either. As expected, this nonsense ends up spilling into user code, 
 and then the standard lib, based on user constraints, and everybody is 
 left choosing between bad tradeoffs down the road because the whole house 
 of cards is built on shaky foundations.
 
 The bad news is, there is already a language like this. It's called C++, 
 and it's actually quite successful. With all due respect to you and 
 Walter, you guys are legends, but I think there is also a bit of learned 
 helplessness coming from both of you due to a lifetime of exposure to 
 the soul corroding effects of C++.
 
 This attitude pervades everything, and most language constructs suffer 
 from some form of it in one way or another, causing a cascade of bad side 
 effects, starting with this whole thread. A few examples, in no 
 particular order: DIP1000, delegate context qualifiers, functions vs first class 
 functions, etc...

I very much agree Walter and I have brought C++ bias into D, sometimes 
in a detrimental way.

 Back to the case of enum, it is obviously and trivially a subtype.

No it isn't. How many times do I need to explain that?

 In 
 fact, even the syntax is the same:
 
 enum Foo: string { ... }

It doesn't matter. It's not a subtype.

 Handling enum strings should never have been a special case that was added to 
 phobos, because it should never have been a special case to begin with, in 
 phobos or elsewhere.

Clearly enums have their own oddities, most inherited from C++. Perhaps 
we should do what C++ did, add a new "enum class" construct that fixes 
its issues. But I don't know of a perfect design, and I very much would 
love to see one.

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 13:50:46 UTC, Andrei Alexandrescu 
wrote:
 No it isn't.

 EnumString and String are reference types. A reference to an 
 enum value does not convert to a reference to its 
 representation. Very very very VERY different.

Here we hit the core of the problem. A reference to a type B 
that is a subtype of A is not a subtype of ref A. Or, in simpler 
terms, B being a subtype of A doesn't imply that ref B is a subtype 
of ref A.

This means that you can pass a B where an A is expected, but not 
a ref B where a ref A is expected.

You'll note that the example I provided with classes for 
understanding will also demonstrate the same behavior:

class A { ... }
class B : A { ... }

void foo(ref A a) { ... }

B b = ...;
foo(b); // This must be an error because, while B is a subtype of 
A, ref B is not a subtype of ref A.

That means that you shouldn't be able to pass SomeEnumString to 
any function, in phobos or elsewhere, that will mutate it, such 
as popFront. But you should be able to pass it, transparently, to 
any function that won't. That includes all compile time 
parameters.
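
The distinction drawn above can be checked with a small sketch. The enum `Name` and the four functions are hypothetical, introduced only to show which calls the compiler accepts; the checks use `__traits(compiles, ...)` so that the rejected calls do not stop the build.

```d
class A {}
class B : A {}

// Hypothetical enum with a string base type, for illustration only.
enum Name : string { alice = "alice" }

void takesA(A a) {}
void takesRefA(ref A a) {}
void takesString(string s) {}
void takesRefString(ref string s) {}

void main()
{
    B b = new B;
    Name n = Name.alice;

    // By value, the argument converts to the parent class or base type.
    static assert(__traits(compiles, takesA(b)));
    static assert(__traits(compiles, takesString(n)));

    // By ref, neither the derived-to-base nor the enum-to-base
    // conversion is applied, so these calls are rejected.
    static assert(!__traits(compiles, takesRefA(b)));
    static assert(!__traits(compiles, takesRefString(n)));
}
```

Classes and string enums behave alike here: both convert when passed by value, and both are rejected where a `ref` to the parent or base type is expected.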

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/10/21 8:19 AM, deadalnix wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D variables 
 don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.

 
 If you have a range of T, then you got to return a T.

There's no return. The range is being mutated.

 I'm not sure 
 what's the problem is here. Do you have a concrete example?

Of course. A range must implement popFront with the signature:

void popFront(ref SomeEnumString s) {
     ... please fill in the implementation ...
}

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 12:05:18 UTC, Andrei Alexandrescu 
wrote:
 I'm not sure what's the problem is here. Do you have a 
 concrete example?

 Of course. A range must implement popFront with the signature:

 void popFront(ref SomeEnumString s) {
     ... please fill in the implementation ...
 }

That must be a type error, this is a feature, not a bug. This is 
not expected to work.

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 8:14 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 12:05:18 UTC, Andrei Alexandrescu wrote:
 I'm not sure what's the problem is here. Do you have a concrete example?

 Of course. A range must implement popFront with the signature:

 void popFront(ref SomeEnumString s) {
     ... please fill in the implementation ...
 }

 
 That must be a type error, this is a feature, not a bug. This is not 
 expected to work.
 

Then enum strings are not ranges, correct?

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu 
wrote:
 Then enum strings are not ranges, correct?

They are not. But they are strings. Which implies that strings 
aren't ranges, which is right: `ref string`s are ranges, not 
strings.

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 10:34 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote:
 Then enum strings are not ranges, correct?

 
 They are not. But they are strings. Which imply that string aren't 
 ranges, which is right, `ref strings` are ranges, not strings.

`ref string` is not a type.

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 15:19:05 UTC, Andrei Alexandrescu 
wrote:
 On 5/11/21 10:34 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu 
 wrote:
 Then enum strings are not ranges, correct?

 
 They are not. But they are strings. Which imply that string 
 aren't ranges, which is right, `ref strings` are ranges, not 
 strings.

 `ref string` is not a type.

This is just denial.

There are many examples of conversions that differ between string 
and ref string which do not involve enums. For instance, 
immutable(string) -> string is a valid conversion, but 
immutable(string) -> ref string isn't.

Call it something other than a type if you want; nevertheless, 
the conversion rules are simply different, even if you abstract the 
notion of rvalue/lvalue from the whole thing, so it is clearly 
more than just a regular storage class.

When you say ref, you say "I do not want a subtype". Saying B 
isn't a subtype of A because I can't pass a B to what expects a 
ref A is just fallacious.
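
The immutable(string) case mentioned above can be sketched as follows. The two functions are hypothetical helpers, named only for this illustration.

```d
void takesString(string s) {}
void takesRefString(ref string s) {}

void main()
{
    immutable(string) s = "abc";

    // Copying the slice is fine: the copy cannot be used to rebind s.
    string copy = s;
    static assert(__traits(compiles, takesString(s)));

    // Binding s to a mutable ref would let the callee reassign it,
    // so the compiler rejects the call.
    static assert(!__traits(compiles, takesRefString(s)));
}
```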

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 12:13 PM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 15:19:05 UTC, Andrei Alexandrescu wrote:
 On 5/11/21 10:34 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote:
 Then enum strings are not ranges, correct?

 They are not. But they are strings. Which imply that string aren't 
 ranges, which is right, `ref strings` are ranges, not strings.

 `ref string` is not a type.

 
 This is just denial.

It's simple fact.

 There are many exemple of conversions that differs with string and ref 
 strings which do not involve enums. For instance, immutable(string) -> 
 string is a valid conversion, but immutable(string) -> ref string isn't.
 
 Call it something else than a type if you want, nevertheless, 
 conversions rules are simply different, even if you abstract the notion 
 of rvalue/lvalue from the whole thing, so it is clearly more than just a 
 regular storage class.
 
 When you say ref, you say "I do not want a subtype". Saying B isn't a 
 subtype of A because I can't pass a B to what expects a ref A is just 
 fallacious.

Again with moving the goalposts.

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 12:39 PM, Andrei Alexandrescu wrote:
 On 5/11/21 12:13 PM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 15:19:05 UTC, Andrei Alexandrescu wrote:
 On 5/11/21 10:34 AM, deadalnix wrote:
 On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote:
 Then enum strings are not ranges, correct?

 They are not. But they are strings. Which imply that string aren't 
 ranges, which is right, `ref strings` are ranges, not strings.

 `ref string` is not a type.

 This is just denial.

 
 It's simple fact.
 
 There are many exemple of conversions that differs with string and ref 
 strings which do not involve enums. For instance, immutable(string) -> 
 string is a valid conversion, but immutable(string) -> ref string isn't.

 Call it something else than a type if you want, nevertheless, 
 conversions rules are simply different, even if you abstract the 
 notion of rvalue/lvalue from the whole thing, so it is clearly more 
 than just a regular storage class.

 When you say ref, you say "I do not want a subtype". Saying B isn't a 
 subtype of A because I can't pass a B to what expects a ref A is just 
 fallacious.

 
 Again with moving the goalposts.

To clarify: you can't make up your own definitions as you go so as to 
support the point you're making at the moment. You can't go "oh, call it 
something else than a type, my point stays". No. Your point doesn't stay.

By the same token you can't make up your own definition of what 
subtyping is and isn't. Value types and reference types are well-trodden 
ground. You can't just claim new terminology and then prove your own 
point by using it.

May 11 2021

Meta <jared771 gmail.com> writes:

On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu 
wrote:
 Again with moving the goalposts.

 To clarify: you can't make up your own definitions as you go so 
 as to support the point you're making at the moment. You can't 
 go "oh, call it something else than a type, my point stays". 
 No. Your point doesn't stay.

 By the same token you can't make up your own definition of what 
 subtyping is and isn't. Value types and reference types are 
 well-trodden ground. You can't just claim new terminology and 
 then prove your own point by using it.

I apologize for injecting myself into this conversation, but with 
all due respect, what the hell are you talking about? Everything 
Deadalnix is saying makes perfect sense - it's basic type theory, 
and yet you're accusing him of moving goalposts and making up 
definitions, etc. The problem is that `isSomeString` doesn't 
respect the LSP and the template constraints on the relevant 
stdlib functions for enums are a hack to work around that. End of 
story. If `isSomeString` was defined sensibly, these template 
constraint hacks would not have to exist.
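
For readers who want to see the mismatch, a minimal sketch follows. `Color` is a hypothetical enum; `isSomeString` is the `std.traits` template under discussion.

```d
import std.traits : isSomeString;

// Hypothetical enum with a string base type, for illustration only.
enum Color : string { red = "red" }

// The enum converts implicitly to string...
static assert(is(Color : string));

// ...yet the trait answers "no" for it, and "yes" for string itself.
static assert(!isSomeString!Color);
static assert(isSomeString!string);

void main() {}
```

A constraint written as `isSomeString!T` therefore rejects a type that is implicitly convertible to `string`, which is the gap the extra template constraints were added to cover.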

All the bluster about `popFront` on enum strings, etc. is 
completely irrelevant, and is a red herring anyway (as was 
already explained).

I'm sorry for being so blunt, but this conversation is painful to 
read.

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 2:37 PM, Meta wrote:
 On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote:
 Again with moving the goalposts.

 To clarify: you can't make up your own definitions as you go so as to 
 support the point you're making at the moment. You can't go "oh, call 
 it something else than a type, my point stays". No. Your point doesn't 
 stay.

 By the same token you can't make up your own definition of what 
 subtyping is and isn't. Value types and reference types are 
 well-trodden ground. You can't just claim new terminology and then 
 prove your own point by using it.

 
 I apologize for injecting myself into this conversation, but with all 
 due respect, what the hell are you talking about? Everything Deadalnix 
 is saying makes perfect sense - it's basic type theory, and yet you're 
 accusing him of moving goalposts and making up definitions, etc. The 
 problem is that `isSomeString` doesn't respect the LSP and the template 
 constraints on the relevant stdlib functions for enums are a hack to 
 work around that. End of story. if `isSomeString` was defined sensibly, 
 these template constraint hacks would not have to exist.
 
 All the bluster about `popFront` on enum strings, etc. is completely 
 irrelevant, and is a red herring anyway (as was already explained).
 
 I'm sorry for being so blunt, but this conversation is painful to read.

Being blunt is totally cool, but that doesn't make you right.

There's no true subtyping or polymorphism with value semantics. This has 
been common knowledge in C++ - inheriting a value type is an antipattern 
for many reasons, and conversion operators are to be used carefully (and 
not as a substitute to subtyping) for many other reasons.

With value types, it's all static typing, no polymorphism, no LSP beyond 
what's called ad-hoc polymorphism in the classic Cardelli et al paper 
(http://poincare.matf.bg.ac.rs/~smalkov/files/old/fp.r344.2016/public/predavanja/FP.cas.2016.07%20-%20p471-cardelli.pdf).

What can be aimed for with values is called "parametric polymorphism" 
(which is NOT subtyping) by the same paper: "Parametric polymorphism is 
obtained when a function works uniformly on a range of types; these 
types normally exhibit some common structure."

That works if and only if you can reasonably supplant the same 
primitives across said range of types. With enums that's onerous; as 
soon as you "derive" an enum from int you figure that ++x can't 
reasonably be implemented. Same goes for enum strings - you can't 
implement the expected string primitives so substitutability is out the 
window.

Values are monomorphic. Years ago I found a bug in a large C++ system 
that went like this:

class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
};

The assert was a _monomorphism test_, i.e. it made sure that the current 
object is actually a Widget and not something derived from it, who 
forgot to override clone() once again.

The problem was the code was doing exactly what it shouldn't have, yet 
the assert was puzzlingly passing. Since everyone here is great at 
teaching basic type theory, it's an obvious problem - the fix is:

         assert(typeid(*this) == typeid(Widget));

Then the assertion started failing as expected. Following that, I've 
used that example for years in teaching, and invariably there are eyes 
going wide when they hear that C++ pointers are monomorphic, it's the 
pointed-to values that are polymorphic, and that's an essential 
distinction. (In D, just like in Java, classes take care of that 
indirection automatically, which can get some confused.)
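
As a rough illustration of the parenthetical about D, the sketch below contrasts the static type of a class reference with the dynamic type of the object it refers to. The class names are reused from the C++ example purely for illustration; this is not a port of that code.

```d
class BaseWidget {}
class Widget : BaseWidget {}

void main()
{
    BaseWidget w = new Widget;

    // The static type of the reference is fixed at compile time.
    static assert(is(typeof(w) == BaseWidget));

    // typeid on a class reference reports the dynamic type of the object.
    assert(typeid(w) == typeid(Widget));
    assert(typeid(w) != typeid(BaseWidget));
}
```

In D there is no separate pointer expression to ask about, so `typeid(w)` already looks through the reference.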

May 11 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
wrote:
 There's no true subtyping or polymorphism with value semantics.

I think you guys need to agree on what you mean by "type" and 
"subtype".

Mathematically a type would be a set of states and a set of 
operators that can take you between the states.  A subtype is 
just a reduced set of states/operators where operators keep you 
within the set of states.

In OO a type is an abstraction (reduced set) of the states that 
the entity you model in The Real World has. A subclass in OO is 
increasing the number of states/operators, but decreasing the 
number of Real World entities covered.

So these two notions of "subtype" are opposite.

 primitives across said range of types. With enums that's 
 onerous; as soon as you "derive" an enum from int you figure 
 that ++x can't reasonably be implemented. Same goes for enum

In C enums are subtypes of int. You reduce the number of states.

C enums are not sound, because operators can take you out of the 
allowed set of states in a heartbeat.

Anyway, I've given up following this discussion.

Just define the desirable outcome (practical design) and forget 
about the theoretical aspects... then others might be able to 
understand where the viewpoints differ.

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
wrote:
 Values are monomorphic. Years ago I found a bug in a large C++ 
 system that went like this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that 
 the current object is actually a Widget and not something 
 derived from it, who forgot to override clone() once again.

 The problem was the code was doing exactly what it shouldn't 
 have, yet the assert was puzzlingly passing. Since everyone 
 here is great at teaching basic type theory, it's an obvious 
 problem - the fix is:

         assert(typeid(*this) == typeid(Widget));

 Then the assertion started failing as expected. Following that, 
 I've used that example for years in teaching and to invariably 
 there are eyes going wide when they hear that C++ pointers are 
 monomorphic, it's the pointed-to values that are polymorphic, 
 and that's an essential distinction. (In D, just like in Java, 
 classes take care of that indirection automatically, which can 
 get some confused.)

While this is indeed very interesting, this is missing the larger 
point.

This whole model in C++ is unsound. It's easy to show. In your 
above example, the this pointer, typed as Widget*, points to an 
instance of a subclass of Widget. If you were to assign a Widget 
to that pointer (which you can do, this is a pointer to a mutable 
widget), then any reference to that widget using a subtype of 
Widget is now invalid.

There is no such thing as a monomorphic pointer to a polymorphic 
type in any sound type system. That cannot be made to work. It is 

pointer and the pointed data, as a package, being half a value, 
half a reference type in the process. This is unavoidable, you 
can't unbundle it or everything breaks down.

So why is there an indirection in there? Simply because you 
cannot know the layout of the object at compile time when you are 
doing runtime polymorphism. But even then, you could decide to 
make it behave as a value type with eager deep copy or copy on 
write and that would work too, and it would still be polymorphic.



But we get back to square one: this has nothing to do with the 

type, which hold a reference to a payload. And the whole typing 
and subtyping business happen on these value types.

May 11 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Wednesday, 12 May 2021 at 01:46:25 UTC, deadalnix wrote:
 On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
 wrote:
 Values are monomorphic. Years ago I found a bug in a large C++ 
 system that went like this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that 
 the current object is actually a Widget and not something 
 derived from it, who forgot to override clone() once again.

 The problem was the code was doing exactly what it shouldn't 
 have, yet the assert was puzzlingly passing. Since everyone 
 here is great at teaching basic type theory, it's an obvious 
 problem - the fix is:

         assert(typeid(*this) == typeid(Widget));

 Then the assertion started failing as expected. Following 
 that, I've used that example for years in teaching and to 
 invariably there are eyes going wide when they hear that C++ 
 pointers are monomorphic, it's the pointed-to values that are 
 polymorphic, and that's an essential distinction. (In D, just 
 like in Java, classes take care of that indirection 
 automatically, which can get some confused.)

 While this is indeed very interesting, this is missing the 
 larger point.

 This whole model in C++ is unsound. It's easy to show. In you 
 above example, the this pointer, typed as Widget*, points to an 
 instance of a subclass of Widget. If you were to assign a 
 Widget to that pointer (which you can do, this is a pointer to 
 a mutable widget), then any references to that widget using a 
 subtype of Widget is now invalid.

 There is no such thing as a monomorphic pointer to a 
 polymorphic type in any sound type system. That cannot be made 

 represent both the pointer and the pointed data, as a package, 
 being half a value, half a reference type in the process. This 
 is unavoidable, you can't unbundle it or everything breaks down.

 So why is there an indirection in there? Simply because you 
 cannot know the layout of the object at compile time when you 
 are doing runtime polymorphism. But even then, you could decide 
 to make it behave as a value type with eager deep copy or copy 
 on write and that would work too, and it would still be 
 polymorphic.



 But we get back to square one: this has nothing to do with the 

 value type, which hold a reference to a payload. And the whole 
 typing and subtyping business happen on these value types.



-Alex

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:


 -Alex

No, both are value type, but in the case of the class, the value 
contains a reference to the payload that you describe in the 
class's body. Consider:

class A {}
A a = new A();

void foo(A ainfoo) {
     ainfoo = new A();
}

foo(a);

Was "a" modified here? No it wasn't.
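
The same experiment can be sketched in D, where class variables are also references passed by value. The three helper functions are hypothetical and exist only to separate rebinding a parameter from mutating the object it refers to.

```d
class A
{
    int i;
    this(int i) { this.i = i; }
}

// Rebinds only the callee's copy of the reference.
void rebind(A a) { a = new A(23); }

// Mutates the object that both references point to.
void mutate(A a) { a.i = 23; }

// Rebinds the caller's variable, because the reference itself is passed by ref.
void rebindRef(ref A a) { a = new A(42); }

void main()
{
    A a = new A(15);
    A original = a;

    rebind(a);
    assert(a is original && a.i == 15);  // caller's variable untouched

    mutate(a);
    assert(a is original && a.i == 23);  // same object, new contents

    rebindRef(a);
    assert(a !is original && a.i == 42); // caller now sees a new object
}
```

Only the `ref` version changes which object the caller's variable refers to; the plain by-value version leaves it alone.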

May 11 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Wednesday, 12 May 2021 at 02:07:14 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:
 No, classes are reference types, structs are values types in 


 -Alex

 No, both are value type,

Wrong.

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/reference-types

 but in the case of the class, the value contains a reference to 
 the payload that you describe in the class's body. Consider:

 class A {}
 A a = new A();

 void foo(A ainfoo) {
     ainfoo = new A();
 }

 foo(a);

 Was "a" modified here?

Yes. A is being replaced with the new instance of A that happens 
to have the same value here. There is no guarantee that they will 
share the same address.

- Alex

May 11 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:07:14 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:
 No, classes are reference types, structs are values types in 


 -Alex

 No, both are value type,

 Wrong.

 https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/reference-types

 but in the case of the class, the value contains a reference 
 to the payload that you describe in the class's body. Consider:

 class A {}
 A a = new A();

 void foo(A ainfoo) {
     ainfoo = new A();
 }

 foo(a);

 Was "a" modified here?

 Yes. A is being replace with the new instance of A that happens 
 to have the same value here. There is no guarantee that they 
 will share the same address.

 - Alex

In layman's terms, just because I can replace the item in the box 
with the exact same box, it does not mean the box hasn't been 
modified.

- Alex

May 11 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Wednesday, 12 May 2021 at 02:22:52 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:07:14 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:
 No, classes are reference types, structs are values types in 


 -Alex

 No, both are value type,

 Wrong.

 https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/reference-types

 but in the case of the class, the value contains a reference 
 to the payload that you describe in the class's body. 
 Consider:

 class A {}
 A a = new A();

 void foo(A ainfoo) {
     ainfoo = new A();
 }

 foo(a);

 Was "a" modified here?

 Yes. A is being replace with the new instance of A that 
 happens to have the same value here. There is no guarantee 
 that they will share the same address.

 - Alex

 In layman terms, just because I can replace the item in the box 
 with the exact same box, it does not mean the box hasn't been 
 modified.

 - Alex

Woops, meant to say "with the exact same item."

-Alex

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that happens 
 to have the same value here. There is no guarantee that they 
 will share the same address.

 - Alex

You might want to reconsider how sure of yourself you are. For 
instance by opening https://replit.com/languages/csharp and 
running the following code in there:

using System;

class A {
   int i;
   public A(int i_) { i = i_; }
   public int getI() { return i; }
}

class Program {
   static void Main(string[] args) {
     A a = new A(15);
     Console.WriteLine(a.getI());
     foo(a);
     Console.WriteLine(a.getI());
   }

   static void foo(A ainfoo) {
     ainfoo = new A(23);
     Console.WriteLine(ainfoo.getI());
   }
}

May 11 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that 
 happens to have the same value here. There is no guarantee 
 that they will share the same address.

 - Alex

 You might want to reconsider how sure of yourself you are.

The code you posted does not support your claim whatsoever. When 
I talk about an address, I am literally talking about a virtual 
memory address here, such as 0x40000 or something similar. You do 
not know the actual virtual memory address of variable 'a' for 
class 'b', as the GC takes care of it for you.
So when A is replaced with a new instance of A that happens to 
have the same value as the one being replaced, the virtual memory 
address that the function parameter holds will change.

-Alex

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 02:41:31 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that 
 happens to have the same value here. There is no guarantee 
 that they will share the same address.

 - Alex

 You might want to reconsider how sure of yourself you are.

 The code you posted, do not support your claim what so ever. 
 When I am talk about address I am literally talking about 
 virtual memory address here, such as 0x40000 or something 
 similar to that. You do not know what the actual virtual memory 
 address of variable of 'a' for class 'b', as the GC takes it 
 care of it for you.
 So when A is being replace with the new instance of A that 
 happens to have the same value that is being replace, the 
 virtual memory that A holds from the function parameter 
 currently holds will change.

 -Alex

Before posting that email was the best time to run the code, look 
at the output and deduce what it means.

The second best time is now.

In any case, I will disengage from that subthread with you, 
because it has reached its conclusion, and the point has been 
demonstrably made with actual code.

Arguing about what the code does really is pointless when you can 
simply run it and look at the result.

May 12 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Wednesday, 12 May 2021 at 11:45:52 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:41:31 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that 
 happens to have the same value here. There is no guarantee 
 that they will share the same address.

 - Alex

 You might want to reconsider how sure of yourself you are.

 The code you posted, do not support your claim what so ever. 
 When I am talk about address I am literally talking about 
 virtual memory address here, such as 0x40000 or something 
 similar to that. You do not know what the actual virtual 
 memory address of variable of 'a' for class 'b', as the GC 
 takes it care of it for you.
 So when A is being replace with the new instance of A that 
 happens to have the same value that is being replace, the 
 virtual memory that A holds from the function parameter 
 currently holds will change.

 -Alex

 Before posting that email was the best time to run the code, 
 look at the output and deduce what it means.

Like I said before, it does not support your claims, whatsoever. 

not support your claims whatsoever.

Classes are reference types not value types, end of discussion. 

the language if you don't believe me.


 In any case, I will disengage from that subthread with you, 
 because it has reached its conclusion, and the point has been 
 demonstrably made with actual code.

Replacing the item in the box with the different yet exact same 
item, doesn't mean that you didn't modify the box. Again, print 
the object memory address, and you will see what I am talking 
about.

-Alex

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 14:35:27 UTC, 12345swordy wrote:
 Replacing the item in the box with the different yet exact same 
 item, doesn't mean that you didn't modify the box. Again, print 
 the object memory address, and you will see what I am talking 
 about.

 -Alex

I legitimately can't tell if you are an idiot or a troll.

May 12 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Wednesday, 12 May 2021 at 15:31:29 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 14:35:27 UTC, 12345swordy wrote:
 Replacing the item in the box with the different yet exact 
 same item, doesn't mean that you didn't modify the box. Again, 
 print the object memory address, and you will see what I am 
 talking about.

 -Alex

 I legitimately can't tell if you are an idiot or a troll.

What kind of idiot ignores official documentation provided 
by Microsoft that clearly states that classes are reference types, 
not value types!? Your code examples do NOT DISPROVE THIS 
NOTION WHATSOEVER!!!!

-Alex

May 12 2021

Alexandru Ermicioi <alexandru.ermicioi gmail.com> writes:

On Wednesday, 12 May 2021 at 15:41:31 UTC, 12345swordy wrote:
 On Wednesday, 12 May 2021 at 15:31:29 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 14:35:27 UTC, 12345swordy wrote:
 Replacing the item in the box with the different yet exact 
 same item, doesn't mean that you didn't modify the box. 
 Again, print the object memory address, and you will see what 
 I am talking about.

 -Alex

 I legitimately can't tell if you are an idiot or a troll.

 What kind of idiot that ignores official documentation provided 
 by Microsoft that clearly states that classes are reference 
 types not value types!? Your coding examples does NOT DISPROVE 
 THIS NOTATION WHATSOEVER!!!!

 -Alex

I think you are both talking about the same thing. I think what he
meant by half value type, half reference type is that the
variables/function parameters themselves are references to the
data an object has, and that reference is basically a value type,
while the actual object data is stored in memory at the address
held in the variable/parameter. These half value/half reference
semantics are packaged in a single type, which cannot be broken
apart.

I.e. you can't have a variable that is just a simple pointer to some
heap memory, and you also can't have a variable that actually
contains the object's data on the stack, as you can in C++ for
example.

This is the same thing you meant by classes being
reference types; he just went a level lower, into the
implementation of so-called reference types.
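
For readers who want to see the two halves separately, here is a minimal C++ sketch, since C++ lets you pick either one on its own. The names (`Point` and the two helper functions) are made up for illustration and do not come from the thread.

```cpp
#include <cassert>

// Hypothetical type, used only for this illustration.
struct Point { int x; };

// The variable holds the object itself: copying it creates an
// independent object, so changing the copy leaves the original alone.
int original_after_changing_object_copy() {
    Point a{1};
    Point b = a;    // copies the data
    b.x = 2;
    assert(b.x == 2);
    return a.x;     // still 1
}

// The variable holds only an address: copying it creates a second
// pointer to the same object, so the change is visible through both.
int original_after_changing_through_pointer_copy() {
    Point a{1};
    Point* p = &a;
    Point* q = p;   // copies the address, not the data
    q->x = 2;
    assert(p == q);
    return p->x;    // now 2
}
```

The pointer is itself copied like any other value, which is the "value type containing a reference" reading described above.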


roots.

Best regards,
Alexandru.

May 12 2021

12345swordy <alexanderheistermann gmail.com> writes:

On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:
 Yes. A is being replace with the new instance of A that 
 happens to have the same value here. There is no guarantee 
 that they will share the same address.

 - Alex

 You might want to reconsider how sure of yourself you are. For 
 instance by opening https://replit.com/languages/csharp and 
 running the following code in there:

 using System;

 class A {
   int i;
   public A(int i_) { i = i_; }
   public int getI() { return i; }
 }

 class Program {
   static void Main(string[] args) {
     A a = new A(15);
     Console.WriteLine(a.getI());
     foo(a);
     Console.WriteLine(a.getI());
   }

   static void foo(A ainfoo) {
     ainfoo = new A(23);
     Console.WriteLine(ainfoo.getI());
   }
 }

You are conflating passing an argument by value/reference with 
the concept of value/reference types. They are not the same thing.

"Don't confuse the concept of passing by reference with the 
concept of reference types. The two concepts are not the same. A 
method parameter can be modified by ref regardless of whether it 
is a value type or a reference type. There is no boxing of a 
value type when it is passed by reference."

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/ref

-Alex
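
The same distinction can be sketched in C++, where the reference is an explicit pointer. This is only an analogy to the C# snippet quoted above, not a translation of it; the helper names are hypothetical.

```cpp
#include <cassert>

struct A { int i; };

// The pointer is copied into `a`; rebinding that copy is invisible to the caller.
void rebind_by_value(A* a, A* replacement) {
    a = replacement;
    (void)a;  // silence "set but not used" warnings
}

// The caller's own pointer is passed; rebinding it is visible to the caller.
void rebind_by_reference(A*& a, A* replacement) {
    a = replacement;
}

int value_seen_by_caller_after_by_value() {
    A first{15}, second{23};
    A* p = &first;
    rebind_by_value(p, &second);
    assert(p == &first);   // the caller still points at the first object
    return p->i;           // 15
}

int value_seen_by_caller_after_by_reference() {
    A first{15}, second{23};
    A* p = &first;
    rebind_by_reference(p, &second);
    assert(p == &second);  // the caller now points at the second object
    return p->i;           // 23
}
```

In both cases `A` is reached through a reference. What differs is whether the reference itself is handed over by value or by reference.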

May 12 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 9:46 PM, deadalnix wrote:
 This whole model in C++ is unsound. It's easy to show. In your above 
 example, the this pointer, typed as Widget*, points to an instance of a 
 subclass of Widget. If you were to assign a Widget to that pointer 
 (which you can do, this is a pointer to a mutable widget), then any 
 references to that widget using a subtype of Widget is now invalid.

All of this is bizarrely incorrect. Care to elaborate?

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 11:41:20 UTC, Andrei Alexandrescu 
wrote:
 On 5/11/21 9:46 PM, deadalnix wrote:
 This whole model in C++ is unsound. It's easy to show. In your 
 above example, the this pointer, typed as Widget*, points to 
 an instance of a subclass of Widget. If you were to assign a 
 Widget to that pointer (which you can do, this is a pointer to 
 a mutable widget), then any references to that widget using a 
 subtype of Widget is now invalid.

 All of this is bizarrely incorrect. Care to elaborate?

Consider the following: https://godbolt.org/z/8vzx9W56a

This is a clear demonstration that C++'s type system is unsound 
here.

It is unsound because it has the property that you mentioned in 
your post: the pointer is monomorphic and the value this pointer
points to is polymorphic. This is simply unsound, you cannot 
separate the two (unless you make everything immutable).

The pointer and the value must come together as a bundle, and 
that whole bundle (which is a value type containing a reference) 

is right.

May 12 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 12 May 2021 at 12:11:22 UTC, deadalnix wrote:
 This is a clear demonstration that C++'s type system is unsound 
 here.

In fairness all generic low level programming languages that are 
practical to use have somewhat unsound type systems. Only high 
level languages can be fully sound (detect invalid programs at 
runtime and abort). C++ was forced into this mold by C though...

(You can have heavily constrained low level languages that are 
sound)

 It is unsound because it has the property that you mentioned in 
 your post: the pointer is monomorphic and the value this

What does monomorphic mean in this context? Why would not this 
hold:

*Singleton <: *Anything

I find the discussion very confusing at this point.

May 12 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 12 May 2021 at 13:35:11 UTC, Ola Fosheim Grøstad 
wrote:
 *Singleton <: *Anything

Typo :-D, I meant pointer-to-Singleton is a subtype of
pointer-to-Anything:

Singleton* <: Anything*

May 12 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/12/21 8:11 AM, deadalnix wrote:
 On Wednesday, 12 May 2021 at 11:41:20 UTC, Andrei Alexandrescu wrote:
 On 5/11/21 9:46 PM, deadalnix wrote:
 This whole model in C++ is unsound. It's easy to show. In your above 
 example, the this pointer, typed as Widget*, points to an instance of 
 a subclass of Widget. If you were to assign a Widget to that pointer 
 (which you can do, this is a pointer to a mutable widget), then any 
 references to that widget using a subtype of Widget is now invalid.

 All of this is bizarrely incorrect. Care to elaborate?

 
 Consider the following: https://godbolt.org/z/8vzx9W56a
 
 This is a clear demonstration that C++'s type system is unsound here.
 
 It is unsound because it has the property that you mentioned in your 
 post: the pointer is monomorphic and the value this pointer points to 
 is polymorphic. This is simply unsound, you cannot separate the two 
 (unless you make everything immutable).
 
 The pointer and the value must come together as a bundle, and that whole 
 bundle (which is a value type containing a reference) is itself what is 


Ah, now we're at slicing. Love these forum discussions!
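
The linked godbolt snippet is not reproduced in this archive. As a guess at the kind of code under discussion, here is a minimal sketch of assigning a base object through a base pointer that actually points at a derived object. All names are hypothetical.

```cpp
#include <cassert>

struct Base {
    int x = 0;
    virtual int kind() const { return 0; }
};

struct Derived : Base {
    int y = 0;
    int kind() const override { return 1; }
};

struct SliceOutcome { int x; int y; int kind; };

SliceOutcome assign_base_through_pointer() {
    Derived d;
    d.x = 10;
    d.y = 20;
    Base* p = &d;       // static type Base*, dynamic type Derived

    Base other;
    other.x = 99;
    *p = other;         // Base::operator=: copies only Base's members

    assert(p->kind() == 1);  // the object is still a Derived
    return SliceOutcome{d.x, d.y, p->kind()};
}
```

After the assignment the object is a mix: the `Base` part comes from `other`, while `y` and the dynamic type still belong to the original `Derived`. If `Derived` relied on an invariant tying `x` to `y`, that invariant could now be broken without any cast.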

May 12 2021

Meta <jared771 gmail.com> writes:

On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu
wrote:
 I apologize for injecting myself into this conversation, but
 with all due respect, what the hell are you talking about?
 Everything Deadalnix is saying makes perfect sense - it's
 basic type theory, and yet you're accusing him of moving
 goalposts and making up definitions, etc. The problem is that
 `isSomeString` doesn't respect the LSP and the template
 constraints on the relevant stdlib functions for enums are a
 hack to work around that. End of story. if `isSomeString` was
 defined sensibly, these template constraint hacks would not
 have to exist.

 All the bluster about `popFront` on enum strings, etc. is
 completely irrelevant, and is a red herring anyway (as was
 already explained).

 I'm sorry for being so blunt, but this conversation is painful
 to read.

 Being blunt is totally cool, but that doesn't make you right.

 There's no true subtyping or polymorphism with value semantics.
 This has been common knowledge in C++ - inheriting a value type
 is an antipattern for many reasons, and conversion operators
 are to be used carefully (and not as a substitute to subtyping)
 for many other reasons.

 With value types, it's all static typing, no polymorphism, no
 LSP beyond what's called ad-hoc polymorphism in the classic
 Cardelli et al paper
 (http://poincare.matf.bg.ac.rs/~smalkov/files/old/fp.r344.2016/public/predavanja/FP.cas.2016.07%20-%20p471-cardelli.pdf).

Of course, but I thought the conversation was about strings, not
value types. Last I checked, strings are reference types, in the
same way that Java objects are reference types.

 What can be aimed for with values is called "parametric
 polymorphism" (which is NOT subtyping) by the same paper:

The nice thing about D's template constraints, though, is that it
allows us to impose subtype polymorphism on a parametrically
polymorphic function.

 "Parametric polymorphism is obtained when a function works
 uniformly on a range of types; these types normally exhibit
 some common structure."

 That works if and only if you can reasonably supplant the same
 primitives across said range of types. With enums that's
 onerous; as soon as you "derive" an enum from int you figure
 that ++x can't reasonably be implemented. Same goes for enum
 strings - you can't implement the expected string primitives so
 substitutability is out the window.

++x still fulfills the contract that the derived enum has
inherited from `int`: `++: int -> int`. It easily passes the
substitutability test. Likewise, enums with a base type of string
fulfill all the same contracts that `string` does.

Nowhere in the contract of the string type does it specify that
`s[1..$]` returns a value of the same type as `s`, just of type
`string`, which a string enum does.

 Values are monomorphic.

Are you saying that all values are monomorphic, or that _value
types_ are monomorphic?

 Years ago I found a bug in a large C++ system that went like
 this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that
 the current object is actually a Widget and not something
 derived from it, who forgot to override clone() once again.

 The problem was the code was doing exactly what it shouldn't
 have, yet the assert was puzzlingly passing. Since everyone
 here is great at teaching basic type theory

Just so we're clear, my previous post was not trying to insinuate
that I am an expert in type theory and you are just too ignorant
to understand the arguments presented. I don't claim to be
anything close to an expert and only know the basics, and you're
the one with the doctorate here.

 it's an obvious problem - the fix is:

     assert(typeid(*this) == typeid(Widget));

 Then the assertion started failing as expected. Following that,
 I've used that example for years in teaching and invariably
 there are eyes going wide when they hear that C++ pointers are
 monomorphic, it's the pointed-to values that are polymorphic,
 and that's an essential distinction. (In D, just like in Java,
 classes take care of that indirection automatically, which can
 get some confused.)

You just said a paragraph back that values are monomorphic. So
are pointed-to values monomorphic or polymorphic? This isn't a
gotcha; I'm just confused about which you meant.

I think the point you are trying to make with this story is that
an operation on an enum that returns the base type will lead to
confusing/wrong behaviour and allowing it for template functions
which are meant to take strings would be bad design, just like it
was with Widget.clone(). Is that right?

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 10:36 PM, Meta wrote:
 Last I checked, strings are reference types, in the same way that Java 
 objects are reference types.

Just by means of clarification, that's not true because the length is 
stored with the pointer. This occasionally trips folks starting with D.

May 12 2021

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Wednesday, May 12, 2021 5:52:28 AM MDT Andrei Alexandrescu via
Digitalmars-d wrote:
 On 5/11/21 10:36 PM, Meta wrote:
 Last I checked, strings are reference types, in the same way that Java
 objects are reference types.

 Just by means of clarification, that's not true because the length is
 stored with the pointer. This occasionally trips folks starting with D.

To be more precise, a dynamic array in D is essentially

struct Array(T)
{
    size_t length;
    T* ptr;
}

So, the length is stored directly in the struct, and the data is referenced
via the pointer stored in the struct. As such, we often refer to a D dynamic
array as a pseudo-reference type. Either way, while the way it's put
together has some very useful properties (like making it so that multiple
dynamic arrays can be slices of the same data), there's no question that it
can be confusing at first. And of course, that extends to strings, since D
strings are dynamic arrays.

- Jonathan M Davis
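
The layout above can be imitated in C++ to show why copies behave partly like references and partly like values. This is only a sketch of the idea: `Slice` is a hypothetical name, and real D arrays also involve the runtime (appending, capacity), which is ignored here.

```cpp
#include <cassert>
#include <cstddef>

// Rough C++ analogue of the struct shown above.
template <typename T>
struct Slice {
    std::size_t length;
    T* ptr;
};

// Copying a Slice copies `ptr`, so both copies refer to the same elements.
int element_write_is_shared() {
    int storage[3] = {1, 2, 3};
    Slice<int> a{3, storage};
    Slice<int> b = a;
    b.ptr[0] = 42;
    return a.ptr[0];   // 42: the write is visible through `a`
}

// Each copy has its own `length` field, so shrinking one copy
// does not shrink the other.
std::size_t length_change_is_not_shared() {
    int storage[3] = {1, 2, 3};
    Slice<int> a{3, storage};
    Slice<int> b = a;
    b.length = 1;
    assert(b.length == 1);
    return a.length;   // still 3
}
```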

May 12 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 10:36 PM, Meta wrote:
 ++x still fulfills the contract that the derived enum has inherited from 
 `int`: `++: int -> int`.

No, that would be ref int -> ref int, which has consequences.
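
C++ has the same shape for the built-in prefix increment, which may help show what the "ref" in that signature implies. A small sketch, with a made-up function name:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Prefix ++ on an int lvalue produces an int lvalue (int&), not a new int value.
static_assert(std::is_same<decltype(++std::declval<int&>()), int&>::value,
              "prefix ++ maps int& to int&");

int increment_in_place() {
    int x = 10;
    int& r = ++x;      // r is bound to x itself
    assert(&r == &x);
    assert(x == 11);
    r = 42;            // writing through the result changes x
    return x;          // 42
}
```

Because the operand is modified in place, the operation needs the storage of the operand and not just its value. That is where a "derived" type and its base type stop being interchangeable.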

May 12 2021

Timon Gehr <timon.gehr gmx.ch> writes:

On 11.05.21 23:36, Andrei Alexandrescu wrote:
 On 5/11/21 2:37 PM, Meta wrote:
 On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote:
 Again with moving the goalposts.

 To clarify: you can't make up your own definitions as you go so as to 
 support the point you're making at the moment. You can't go "oh, call 
 it something else than a type, my point stays". No. Your point 
 doesn't stay.

 By the same token you can't make up your own definition of what 
 subtyping is and isn't. Value types and reference types are 
 well-trodden ground. You can't just claim new terminology and then 
 prove your own point by using it.

 I apologize for injecting myself into this conversation, but with all 
 due respect, what the hell are you talking about? Everything Deadalnix 
 is saying makes perfect sense - it's basic type theory, and yet you're 
 accusing him of moving goalposts and making up definitions, etc. The 
 problem is that `isSomeString` doesn't respect the LSP and the 
 template constraints on the relevant stdlib functions for enums are a 
 hack to work around that. End of story. if `isSomeString` was defined 
 sensibly, these template constraint hacks would not have to exist.

 All the bluster about `popFront` on enum strings, etc. is completely 
 irrelevant, and is a red herring anyway (as was already explained).

 I'm sorry for being so blunt, but this conversation is painful to read.

 
 Being blunt is totally cool, but that doesn't make you right.
 ...

Deadalnix is saying that there is a subtyping relationship for rvalues, 
while you are pointing out that there is no subtyping relationship for 
lvalues. I think those are both correct. (Type theory has no notion of 
lvalues or rvalues, so those would indeed have to be interpreted as 
different types.)

I fail to see why the semantics of lvalues should have any bearing on 
format strings even though I understand why most of Phobos might want to 
assume isSomeString talks about lvalues of the type.

 There's no true subtyping or polymorphism with value semantics.
 ...

There's certainly subtyping. The point about "polymorphism" (in type 
theory, polymorphism typically refers to parametric polymorphism, but I 
guess you mean existential types), is a bit more tricky. I guess the 
point is that a language that wants `f(σ) ⊆ ∃τ. f(τ)` to hold without 
any runtime semantics can't support data types whose values do not embed 
runtime type info. However, it can certainly support value types, even 
value types that are stored without indirections.

 the assert was puzzlingly passing. Since everyone here is great at 
 teaching basic type theory, it's an obvious problem - the fix is:
 
          assert(typeid(*this) == typeid(Widget));
 ...


That's a C++ quirk. Not much to do with type theory. In fact, C++ may 
not be a great example for illustration, as its type system is unsound.

May 11 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/11/21 10:39 PM, Timon Gehr wrote:
 Deadalnix is saying that there is a subtyping relationship for rvalues, 
 while you are pointing out that there is no subtyping relationship for 
 lvalues. I think those are both correct.

Well put. Rvalues can afford the luxury to change representation (e.g. 
from byte to int or float to double) because they're only used once. So 
a passable polymorphism scheme can be implemented via coercion.
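
The same asymmetry can be checked in C++ with standard type traits. This is an analogy to the point above and not D code; `widen` and `pass_float_by_value` are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// As a value, a float can stand in for a double: it is simply converted.
static_assert(std::is_convertible<float, double>::value,
              "float converts to double by value");

// As a mutable lvalue it cannot: a double& cannot refer to a float's storage.
static_assert(!std::is_convertible<float&, double&>::value,
              "float& does not convert to double&");

double widen(double d) { return d; }

double pass_float_by_value() {
    float f = 1.5f;
    double d = widen(f);   // f is converted to a temporary double
    assert(d == 1.5);      // 1.5 is exactly representable in both types
    return d;
}
```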

 (Type theory has no notion of 
 lvalues or rvalues, so those would indeed have to be interpreted as 
 different types.)

Hmmm... haven't looked in a while, but don't some of Java formalizations 
account for int, double etc. being values and consequently rvalues when 
passed around? (Though they can't be passed by reference so a 
formalization could get away with assuming int is a reference, e.g. ++x 
means "rebind reference x to a new reference to the value x + 1").

 I fail to see why the semantics of lvalues should have any bearing on 
 format strings even though I understand why most of Phobos might want to 
 assume isSomeString talks about lvalues of the type.

It doesn't, the format string is just a symptom. The problem is that we 
change (already did, and massively... >100 instances of StringTypeOf) 
the standard library to accommodate what I think is an unproductive form 
of genericity.

 There's no true subtyping or polymorphism with value semantics.
 ...

 
 There's certainly subtyping. The point about "polymorphism" (in type 
 theory, polymorphism typically refers to parametric polymorphism, but I 
 guess you mean existential types), is a bit more tricky. I guess the 
 point is that a language that wants `f(σ) ⊆ ∃τ. f(τ)` to hold without 
 any runtime semantics can't support data types whose values do not embed 
 runtime type info. However, it can certainly support value types, even 
 value types that are stored without indirections.

One matter is to distinguish what can be done from what D has already 
done and cannot change. For example, I tried some code just now and 
was... surprised.

Meta mentioned that increment works with enums, and lo and behold it does:

void main() {
     import std;
     enum X : int { x = 10, y = 20 }
     X x;
     writeln(x);
     ++x;
     writeln(x);
}

That prints "x" and then "cast(X)11". Meaning you can easily write a 
program that takes you outside enumerated values without a cast, which 
somewhat dilutes the value of "final switch" and the general notion that 
enumerated types are a small closed set. Arguably ++ should not be 
allowed on enumerated values.

Surprises go on:

void main() {
     import std;
     enum X : string { x = "Hello, world!", y = "xyz" }
     X x;
     writeln(x);
     x = x[1 .. $];
     writeln(x);
}

That prints:

x
cast(X)ello, world!

which showcases, as a little distraction, a bug in the formatting of 
enums - the string should be quoted properly.

But the larger point is that enum types derived from string actually 
allow, again, stepping outside their universe with ease.

This cramps my style somewhat because during the whole discussion I 
assume that doesn't work, or at least shouldn't. I guess an argument 
could be built that its semantics is what it is.

Anyway, the other side of the argument that got ignored is the alias 
this thing:

void main() {
     import std;
     static struct X {
         string fun();
         alias fun this;
     }
     X x;
     x = x[1 .. $];
}

This doesn't compile; the slice does, but the assignment doesn't. Which 
means there are differences in what would be expected of a string (or, 
as it turns out, an enum string) and what would be expected of a type 
that converts to string by means of alias this.
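
A loose C++ analogy, using a conversion operator in place of alias this, shows the same gap between "converts to a string" and "is a string". `Wrapper` is a hypothetical type and the parallel with D's alias this is only approximate.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical type that merely converts to std::string.
struct Wrapper {
    operator std::string() const { return "Hello, world!"; }
};

// Reading works: a Wrapper can be used where a string value is expected.
static_assert(std::is_convertible<Wrapper, std::string>::value,
              "Wrapper converts to std::string");

// Writing does not: a string cannot be assigned back into a Wrapper,
// while a real string accepts it.
static_assert(!std::is_assignable<Wrapper&, std::string>::value,
              "std::string is not assignable to Wrapper");
static_assert(std::is_assignable<std::string&, std::string>::value,
              "std::string is assignable to std::string");

std::string tail_via_conversion() {
    Wrapper w;
    std::string s = w;             // implicit conversion
    assert(s == "Hello, world!");
    return s.substr(1);            // the result is a plain string
}
```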

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 13:39:50 UTC, Andrei Alexandrescu 
wrote:
 One matter is to distinguish what can be done from what D has 
 already done and cannot change. For example, I tried some code 
 just now and was... surprised.

 Meta mentioned that increment works with enums, and lo and 
 behold it does:

 void main() {
     import std;
     enum X : int { x = 10, y = 20 }
     X x;
     writeln(x);
     ++x;
     writeln(x);
 }

 That prints "x" and then "cast(X)11". Meaning you can easily 
 write a program that takes you outside enumerated values 
 without a cast, which somewhat dilutes the value of "final 
 switch" and the general notion that enumerated types are a 
 small closed set. Arguably ++ should not be allowed on 
 enumerated values.

 Surprises go on:

 void main() {
     import std;
     enum X : string { x = "Hello, world!", y = "xyz" }
     X x;
     writeln(x);
     x = x[1 .. $];
     writeln(x);
 }

 That prints:

 x
 cast(X)ello, world!

 which showcases, as a little distraction, a bug in the 
 formatting of enums - the string should be quoted properly.

 But the larger point is that enum types derived from string 
 actually allow, again, stepping outside their universe with 
 ease.

I've raised these problem on a regular basis for years now.

This is obviously another instance where things are unsound, and 
needs to be fixed.

Last time we had a discussion on the matter, it went in a loop 
that is best summarized as this:
enum E : int { A, B, C }

while (true) {
   Me: A | B ought to be an int, not an E.
   W&A: But you need it to be an enum, so that you can do things 
like combining flags and stay. As in:
     enum Mode { Read, Write }
     openFile(file, Mode.Read | Mode.Write);
   Me: Well then, you can't have final switch, because you don't
have the guarantee it relies on.
   W&A: final switch is very much needed, for X, Y, Z reasons.
}
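
For comparison, C++ unscoped enums already type `A | B` as an integer, which is the behaviour argued for in the "Me" lines above. The sketch below is a C++ analogy only. The enumerator values 1 and 2 are an assumption made so that the flags combine meaningfully, and `describe` is a hypothetical function.

```cpp
#include <cassert>
#include <type_traits>

enum Mode { Read = 1, Write = 2 };

// The operands are promoted, so the result of | is an int, not a Mode.
static_assert(std::is_same<decltype(Read | Write), int>::value,
              "Read | Write has type int");

// An int does not silently convert back to the enum.
static_assert(!std::is_convertible<int, Mode>::value,
              "int does not implicitly convert to Mode");

// A switch that covers every enumerator.
int describe(Mode m) {
    switch (m) {
        case Read:  return 1;
        case Write: return 2;
    }
    return -1;  // reached only for values outside the enumerators
}

int describe_combined() {
    // Getting the combined value back into a Mode takes an explicit cast.
    Mode both = static_cast<Mode>(Read | Write);
    assert(static_cast<int>(both) == 3);
    return describe(both);  // -1: the "exhaustive" switch did not cover it
}
```

Leaving the enumerated set takes a visible cast here, which is the guarantee that a final-switch-like construct would need.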

This is extremely tiresome and kinda looks like the current 
discussion (or another one would be the in contract needing to be 
statically bound, where Timon and Myself had to fish for Bertrand 
Meyer because nothing short of an argument from authority could 
do the trick).

So if we get nothing else out of that discussion, fixing enums so
that they don't go out of the allowed set of values would be nice. 
It's just unfortunate that it takes literally 5 years+ to get to 
a point where this is even acknowledged as being an issue.

I hope we can somehow shorten that process, because it's not 
workable as it is. You have people around like Timon and myself 
who have an eye for this. It's free brainpower you are not
leveraging.

May 12 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/12/21 11:30 AM, deadalnix wrote:
 Last time we had a discussion on the matter, it went in a loop that is 
 best summarized as this:
 enum E : int { A, B, C }
 
 while (true) {
    Me: A | B ought to be an int, not an E.
    W&A: But you need it to be an enum, so that you can do things like 
 combining flags and stay. As in:
      enum Mode { Read, Write }
      openFile(file, Mode.Read | Mode.Write);
    Me: Well then, you can't have final switch, because you don't have the
 guarantee it relies on.
    W&A: final switch is very much needed, for X, Y, Z reasons.
 }

I know this is Walter's take, but please don't ascribe it to me as well. 
I could at the very best give a nod to practicality, but I very much 
think that typing binary "or" on enums as the enum is a kludge.

 This is extremely tiresome and kinda looks like the current discussion 
 (or another one would be the in contract needing to be statically bound, 
 where Timon and Myself had to fish for Bertrand Meyer because nothing 
 short of an argument from authority could do the trick).
 
 So if we get nothing else out of that discussion, fixing enums so that
 they don't go out of the allowed set of values would be nice. It's just 
 unfortunate that it takes literally 5 years+ to get to a point where 
 this is even acknowledged as being an issue.

This reach for credit here does not seem very well deserved.

 I hope we can somehow shorten that process, because it's not workable as 
 it is. You have people around like Timon and myself who have an eye for 
 this. It's free brainpower you are not leveraging.

I will say what follows with the utmost respect. I think Timon is way 
better at these things (like in, incomparably better) than you and me 
combined. He most certainly is less skilled than you at other things, 
but as far as PL theory in this group goes, he and Paul are the only 
game in town.

May 12 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 12 May 2021 at 18:35:49 UTC, Andrei Alexandrescu 
wrote:
 I will say what follows with the utmost respect. I think Timon 
 is way better at these things (like in, incomparably better) 
 than you and me combined. He most certainly is less skilled 
 than you at other things, but as far as PL theory in this group 
 goes, he and Paul are the only game in town.

You are so wonderful at being inclusive... :-P Never seen anyone
in these forums who hasn't said things about PL theory that are
either wrong or lack nuance. Applies to Andreis, Timons, Pauls
alike...

However, since most here do not have a comp.sci. background it
would be nice if we stopped hiding behind terminology (which people
will perceive differently even if they have a comp.sci. background,
which is why papers use references).

deadalnix is explaining how he uses the terms which makes the 
thread more inclusive for all.

Your dismissal is not helpful.

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 18:35:49 UTC, Andrei Alexandrescu 
wrote:
 I will say what follows with the utmost respect. I think Timon 
 is way better at these things (like in, incomparably better) 
 than you and me combined. He most certainly is less skilled 
 than you at other things, but as far as PL theory in this group 
 goes, he and Paul are the only game in town.

It's fine, then just listen to him and not to me. That already 
would be vast improvement over the current state of affairs.

May 12 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 12 May 2021 at 02:39:23 UTC, Timon Gehr wrote:
          assert(typeid(*this) == typeid(Widget));
 ...


 That's a C++ quirk. Not much to do with type theory. In fact, 
 C++ may not be a great example for illustration, as its type 
 system is unsound.

It isn't a quirk. To get dynamic lookup you need to add a virtual 
member.

class A {
public:
     virtual void nothing(){}
     void test(){
         std::cout << typeid(*this).name() << std::endl;
         std::cout << typeid(A).name() << std::endl;
     }
};
class B : public A {
};


void test_typeinfo(){
     B b{};
     b.test();
}

May 12 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
wrote:
 Values are monomorphic. Years ago I found a bug in a large C++ 
 system that went like this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that 
 the current object is actually a Widget and not something 
 derived from it, who forgot to override clone() once again.

I don't understand what you mean by pointers being monomorphic.

this will always have the type of the class it was defined in. So 
the assert will always hold.

How is this surprising???

What is more dangerous is that if you forget to add a virtual 
member then *this will also always hold as being a Widget.

That is the result of C++ being a low-level language. No sensible 
high level language would allow such semantics.
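
To make the two checks concrete, here is a small self-contained sketch. `FancyWidget` is a hypothetical subclass standing in for "something derived from Widget", and the method names are made up; the original code base is not shown in the thread.

```cpp
#include <cassert>
#include <typeinfo>

struct BaseWidget { virtual ~BaseWidget() {} };

struct Widget : BaseWidget {
    // Compares the static type of `this`, which is always Widget* here.
    bool pointer_check() { return typeid(this) == typeid(Widget*); }
    // Compares the dynamic type of the object `this` points to.
    bool object_check() { return typeid(*this) == typeid(Widget); }
};

// Hypothetical subclass that "forgot" to override anything.
struct FancyWidget : Widget {};

bool pointer_check_on_derived() {
    FancyWidget f;
    return f.pointer_check();    // true: the check cannot see FancyWidget
}

bool object_check_on_derived() {
    FancyWidget f;
    assert(f.pointer_check());   // the pointer-based check still passes
    return f.object_check();     // false: the dynamic type is FancyWidget
}

bool object_check_on_widget() {
    Widget w;
    return w.object_check();     // true
}
```

The second check only works because `BaseWidget` has a virtual member. Without one, `typeid(*this)` would also report the static type.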

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 14:52:42 UTC, Ola Fosheim Grøstad 
wrote:
 On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu 
 wrote:
 Values are monomorphic. Years ago I found a bug in a large C++ 
 system that went like this:

 class Widget : BaseWidget {
     ...
     Widget* clone() {
         assert(typeid(this) == typeid(Widget*));
         return new Widget(*this);
     }
 };

 The assert was a _monomorphism test_, i.e. it made sure that 
 the current object is actually a Widget and not something 
 derived from it, who forgot to override clone() once again.

 I don't understand what you mean by pointers being monomorphic.

Ok, consider the following.

class A {};
class B: public A {};

A *a = new B();

typeid(a) is A*. In C++, a is monomorphic.
typeid(*a) is B. In C++, *a is polymorphic.

May 12 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 12 May 2021 at 15:35:26 UTC, deadalnix wrote:
 Ok, consider the following.

 class A {};
 class B: public A {};

 A *a = new B();

 typeid(a) is A*. In C++, a is monomorphic.
 typeid(*a) is B. In C++, *a is polymorphic.

Sadly, IIRC typeid(*a) is A, because A does not contain a virtual 
member...
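
A short sketch of the difference, with hypothetical class names for the two variants (without and with a virtual member):

```cpp
#include <cassert>
#include <typeinfo>

// No virtual members: typeid(*a) is resolved from the static type.
struct PlainA {};
struct PlainB : PlainA {};

// At least one virtual member: typeid(*a) looks up the dynamic type.
struct VirtualA { virtual ~VirtualA() {} };
struct VirtualB : VirtualA {};

bool plain_deref_reports_derived() {
    PlainB b;
    PlainA* a = &b;
    assert(typeid(*a) == typeid(PlainA));   // reports the static type
    return typeid(*a) == typeid(PlainB);    // false
}

bool virtual_deref_reports_derived() {
    VirtualB b;
    VirtualA* a = &b;
    return typeid(*a) == typeid(VirtualB);  // true
}

bool pointer_reports_static_type() {
    VirtualB b;
    VirtualA* a = &b;
    return typeid(a) == typeid(VirtualA*);  // true in either variant
}
```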

typeid(a) is A*, because that is the type of the pointer. 
However, the relationship between B* and A* is polymorphic, 
because you can use B* in the context where you expect A*? E.g. 
you can call a function that expects parameter A* with a pointer 
B*. So that makes the relationship polymorphic?

I have to admit I never use the terminology monomorphic and 
polymorphic, so my understanding could be wrong. If so, I am 
probably not alone in the thread, so for the sake of other 
readers, maybe someone can provide a definition for monomorphic?

May 12 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad 
wrote:
 typeid(a) is A*, because that is the type of the pointer. 
 However, the relationship between B* and A* is polymorphic, 
 because you can use B* in the context where you expect A*? E.g. 
 you can call a function that expects parameter A* with a 
 pointer B*. So that makes the relationship polymorphic?


To be more precise. B* is a subtype of A* if you can use B* in 
contexts where A* is expected, which is polymorphic in nature.
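
This can be checked mechanically in C++ with standard type traits. The sketch below uses made-up helper names. It also shows that the relation holds for pointer values but not one level up, for pointers to those pointers.

```cpp
#include <cassert>
#include <type_traits>

struct A {};
struct B : A {};

// A B* value can be used where an A* value is expected, but not the reverse.
static_assert(std::is_convertible<B*, A*>::value, "B* converts to A*");
static_assert(!std::is_convertible<A*, B*>::value, "A* does not convert to B*");

// One level up the relation disappears: if B** converted to A**, one could
// store an A* into a variable that is supposed to hold a B*.
static_assert(!std::is_convertible<B**, A**>::value,
              "B** does not convert to A**");

int takes_a(A* a) { return a != nullptr ? 1 : 0; }

int call_with_b() {
    B b;
    B* pb = &b;
    A* pa = pb;            // a new A* value produced from the B* value
    assert(pa != nullptr);
    return takes_a(pb);    // accepted: implicit B* to A* conversion
}
```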

More interestingly, pure OO-languages like Beta provide 
type-variables. C++/D lack those. So in such languages you can 
bind new types to type-variables and therefore change the typing 
of elements of arrays and such in subclasses.

(Which leads to other challenges, all languages seem to have some 
kind of challenge associated with them once they allow 
polymorphisms)

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 16:46:40 UTC, Ola Fosheim Grøstad 
wrote:
 On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad 
 wrote:
 typeid(a) is A*, because that is the type of the pointer. 
 However, the relationship between B* and A* is polymorphic, 
 because you can use B* in the context where you expect A*? 
 E.g. you can call a function that expects parameter A* with a 
 pointer B*. So that makes the relationship polymorphic?


 To be more precise. B* is a subtype of A* if you can use B* in 
 contexts where A* is expected, which is polymorphic in nature.

I would say it is a subtype, yes, but polymorphism implies that
there are several ways to see the same thing, which, as Andrei
points out, implies that you go through a reference somewhere.

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad 
wrote:
 I have to admit I never use the terminology monomorphic and 
 polymorphic, so my understanding could be wrong. If so, I am 
 probably not alone in the thread, so for the sake of other 
 readers, maybe someone can provide a definition for monomorphic?

It's quite simple.

*a is polymorphic, because it is an object of type A as far as
the user of *a is concerned, but it is actually an object of type
B (or any other subtype of A).

a itself isn't polymorphic, because it is a pointer to an A no 
matter what. It is not a pointer to a B that is observed as if it 
was a pointer to an A. There is nothing more in it to be 
discovered at run time, it's just a pointer.

Even if you do

B *b = ...;
A *a = b;

Then you have not an instance of polymorphism, simply that you 
had a pointer to a B, and now you also have a pointer to an A.

May 12 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Wednesday, 12 May 2021 at 16:49:12 UTC, deadalnix wrote:
 On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad 
 wrote:
 I have to admit I never use the terminology monomorphic and 
 polymorphic, so my understanding could be wrong. If so, I am 
 probably not alone in the thread, so for the sake of other 
 readers, maybe someone can provide a definition for 
 monomorphic?

 It's quite simple.

 *a is polymorphic, because it is an object of type A as far as 
 the user of *a is concerned, but it is actually an object of 
 type B (or any other subtype of A).

 a itself isn't polymorphic, because it is a pointer to an A no 
 matter what. It is not a pointer to a B that is observed as if 
 it was a pointer to an A. There is nothing more in it to be 
 discovered at run time, it's just a pointer.

I think I understand what you mean, but the terminology used is 
confusing me. A monomorphic function/operator works on only one 
type, but a polymorphic function/operator works on many types.

Seems to me that A* can work on many types, but B* can only work 
on one type (if it has no subclasses). So wouldn't that make A* 
polymorphic in nature, but B* monomorphic in nature?

I've recently found it better (less baggage) to think in terms of 
protocols than classes. Then A* would be a pointer to something 
that provides the A-protocols. So when an A* pointer points to a 
B instance then we can think of it as if it points to the 
A-protocols that B provides. Maybe then you could claim that it 
is monomorphic as it only binds to A-protocols.

But that is not actually the case, as you have the ability to 
cast A* to B*. So then it would be polymorphic...? I dunno. Seems 
it is a matter of perspective, if "monomorphic" means "of one 
form".

May 12 2021

Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:

On Tuesday, May 11, 2021 12:37:20 PM MDT Meta via Digitalmars-d wrote:
 On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote:
 Again with moving the goalposts.

 To clarify: you can't make up your own definitions as you go so
 as to support the point you're making at the moment. You can't
 go "oh, call it something else than a type, my point stays".
 No. Your point doesn't stay.

 By the same token you can't make up your own definition of what
 subtyping is and isn't. Value types and reference types are
 well-trodden ground. You can't just claim new terminology and
 then prove your own point by using it.

 I apologize for injecting myself into this conversation, but with
 all due respect, what the hell are you talking about? Everything
 Deadalnix is saying makes perfect sense - it's basic type theory,
 and yet you're accusing him of moving goalposts and making up
 definitions, etc. The problem is that `isSomeString` doesn't
 respect the LSP and the template constraints on the relevant
 stdlib functions for enums are a hack to work around that. End of
 story. If `isSomeString` was defined sensibly, these template
 constraint hacks would not have to exist.

 All the bluster about `popFront` on enum strings, etc. is
 completely irrelevant, and is a red herring anyway (as was
 already explained).

 I'm sorry for being so blunt, but this conversation is painful to
 read.

Having isSomeString accept types that implicitly convert to string would
be a disaster. Templates do not operate on implicit conversions - or even on
subtypes. They operate on the exact type they're given. You can, of course,
write a template constraint which checks for implicit conversions, but you
still don't get the implicit conversion when the template is instantiated.
You get the original type. This has a number of implications, but in
general, it leads to bugs if templates check for implicit conversions
instead of exact types. In particular, any templated function which checks
for an implicit conversion then needs to force the implicit conversion, or
it will likely not work properly - be it because you get compilation errors,
or because the original type compiles with the same code but does not behave
the same way as the type from the implicit conversion which was not actually
made.
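
A small D sketch of the point above, using only core language features. The enum `Color` and the functions `seenType` and `normalized` are hypothetical names chosen for illustration; nothing here is Phobos code.

```d
// Hypothetical enum whose base type is string.
enum Color : string { red = "red" }

// The enum converts implicitly to string, but it is not the same type.
static assert(is(Color : string));
static assert(!is(Color == string));

// The constraint checks for an implicit conversion, yet the template
// is still instantiated with the exact argument type.
string seenType(T)(T value)
if (is(T : string))
{
    return T.stringof;
}

// Forcing the conversion inside the body, so the rest of the code
// works on a real string rather than on the original type.
string normalized(T)(T value)
if (is(T : string))
{
    string s = value; // explicit step to the base type
    return s;
}

static assert(is(typeof(normalized(Color.red)) == string));
```

With these definitions, `seenType(Color.red)` is instantiated with `T = Color`, not `T = string`, so any code in the body that needs an actual string has to perform the conversion itself, as `normalized` does.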

In fact, IIRC, at one point, isSomeString _did_ work with enums, and we
fixed it so that it didn't, because it was causing problems. Also, IIRC, it
was my fault that it was ever made to work with enums, and I very much
regret that.

In general, implicit conversions have no business in template constraints.
Obviously, there are exceptions to that, but in general, there will be fewer
bugs if the conversions are done explicitly by the code instantiating the
template. The reason that it's done in Phobos as much as it is is primarily
because of code that was originally not generic which was later templatized
(often because it took string and was changed to work on multiple string
types or to work on general ranges of characters). And in most cases where
we've tried to templatize functions without breaking code, we've had
problems because of the implicit conversions that worked before.
std.traits.isConvertibleToString is one such abomination which came out of
that (its use usually results in code that slices local variables and
escapes them, which is really bad). IIRC, that was done by Walter, and if
he's making mistakes like that with regards to implicit conversions and
templated code, what do you think the average D programmer is doing?

The main reason for bringing up popFront and enums is to show that
enums with a base type of string are not actually strings, and treating them
as if they were causes serious problems. There are of course places where
that sub-typing results in implicit conversions, but templates do not work
that way, and trying to force it is very problematic. The proliferation of
template constraint and static if complexity that Andrei is complaining
about with regards to stuff like format is the result of that, and it's the
kind of code that's very hard to get right. Simply not trying to support
those implicit conversions with templated functions _significantly_ reduces
the complexity of such code with the only cost being that the code
instantiating the template will have to use cast(string) on the enum value.

- Jonathan M Davis

May 12 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/12/21 6:36 AM, Jonathan M Davis wrote:
 Having isSomeString accept types that implicitly convert to string would
 be a disaster.

Sadly that's exactly what StringTypeOf does: https://run.dlang.io/is/8xqPKr

We should eliminate all uses of StringTypeOf from phobos.

May 12 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu 
wrote:
 By the same token you can't make up your own definition of what 
 subtyping is and isn't. Value types and reference types are 
 well-trodden ground. You can't just claim new terminology and 
 then prove your own point by using it.

I simply removed an assumption that isn't relevant to the case 
I'm making, namely whether you consider ref string to be a type or 
not, because it doesn't affect the conclusion and therefore isn't 
a debate worth getting into.

You made the point that SomeEnumString cannot be considered a 
subtype of string because things start breaking when it is passed 
by ref, and I retort that the exact same things break in the 
exact same way for subtypes, making your argument moot.

You say "B is not a subtype of A because it exhibits behavior X 
when passed by ref"
I say "D is a known subtype of C, and it also exhibits behavior X 
when passed by ref, therefore X cannot be used as a justification 
that B isn't a subtype of A"

We can argue to no end about what is the right definition that 
should be used for X, but it really doesn't change the overall 
point that is being made.

May 11 2021

deadalnix <deadalnix gmail.com> writes:

On Tuesday, 11 May 2021 at 12:14:42 UTC, deadalnix wrote:
 On Tuesday, 11 May 2021 at 12:05:18 UTC, Andrei Alexandrescu 
 wrote:
 I'm not sure what the problem is here. Do you have a 
 concrete example?

 Of course. A range must implement popFront with the signature:

 void popFront(ref SomeEnumString s) {
     ... please fill in the implementation ...
 }

 That must be a type error, this is a feature, not a bug. This 
 is not expected to work.

I realize that this requires further explanation.

The fact that B is a subtype of A doesn't imply that a type 
constructed from B is a subtype of that same construction using 
A. For instance,

A function() would be a subtype of B function(), the relation 
reversed in that example.

In your example, you are constructing a ref SomeEnumString and 
expecting it to be a subtype of string (or maybe ref string) but 
both are incorrect assumptions. This is because you can execute 
operations that require covariance as well as operations that 
require contravariance on a ref, therefore, it needs to be 
exactly the same type. This is hardly an exceptional situation, 
this also happens when taking an array, B being a subtype of A 
doesn't mean that B[] is a subtype of A[].

Interestingly, it is the case for const ref, or const arrays, 
which is where the push toward handling const ref differently 
comes from.
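
The array case can be sketched in D with two classes. This is an illustration under the assumption that class inheritance is an acceptable stand-in for the subtype relation discussed here; `A`, `B` mirror the post and `countAll` is a hypothetical function.

```d
class A {}
class B : A {}

// B is a subtype of A...
static assert(is(B : A));
// ...but a mutable B[] is not accepted where an A[] is expected,
// because the callee could store a plain A into it.
static assert(!is(B[] : A[]));
// A read-only view is fine: nothing can be written through it.
static assert(is(B[] : const(A)[]));

// Hypothetical read-only consumer that accepts arrays of any subtype of A.
size_t countAll(const(A)[] items)
{
    return items.length;
}
```

Passing a `B[]` to `countAll` compiles because the parameter only permits reads, which is the same reasoning behind treating const ref differently from ref.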

In any case, format is not expected to modify the 
pattern it takes as an input. In fact, it is a god damn compile 
time parameter, it is not mutable to begin with. It is therefore 
expected that this works.

May 11 2021

Joseph Rushton Wakeling <joseph.wakeling webdrake.net> writes:

On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 Popping the head out of an enum value ought to be a string, 
 not that enum's value. I don't really see where the problem is 
 here, this is subtyping 101.

 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D 
 variables don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.

This feels a bit like the real problem might be in the conflation 
of the container (the enum or the string) and the range?

Cf. the way this is handled in Rust, where there is a clear 
distinction between a container, versus an iterator over that 
container:
https://doc.rust-lang.org/rust-by-example/flow_control/for.html

Note also the different ways that the iterator can be generated: 
either using a reference to the container itself, or by moving 
the container into the iterator so the container itself is 
consumed by the iteration.

May 10 2021

Ola Fosheim Grøstad <ola.fosheim.grostad gmail.com> writes:

On Monday, 10 May 2021 at 17:09:37 UTC, Joseph Rushton Wakeling 
wrote:
 Cf. the way this is handled in Rust, where there is a clear 
 distinction between a container, versus an iterator over that 
 container:

That is true for C++ and Python as well. C++ has 
begin(object)/end(object) and Python has iter(object).

May 10 2021

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 5/10/21 1:09 PM, Joseph Rushton Wakeling wrote:
 On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
 Popping the head out of an enum value ought to be a string, not that 
 enum's value. I don't really see where the problem is here, this is 
 subtyping 101.

 So you have a range r of type T.

 You call r.popFront().

 Obviously the type of r should stay the same because in D variables 
 don't change type.

 So... what gives, young Padawan?

 No, this is not subtyping 101.

 
 This feels a bit like the real problem might be in the conflation of the 
 container (the enum or the string) and the range?
 
 Cf. the way this is handled in Rust, where there is a clear distinction 
 between a container, versus an iterator over that container:
 https://doc.rust-lang.org/rust-by-example/flow_control/for.html
 
 Note also the different ways that the iterator can be generated: either 
 using a reference to the container itself, or by moving the container 
 into the iterator so the container itself is consumed by the iteration.

True, D has only "orphan" ranges, no containers. std.container is not 
working out and with current D technology we can't define containers 
that work with safe/pure/nogc at the same time (two out of three we can).

If you consider the enum string value a container and the string 
extracted from it a range of that container, I think that would be a 
valid way to look at the matter.
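
One way to picture that view in D, as a sketch only: the enum value plays the container and a plain string obtained from it plays the range. `Greeting` and `dropFirst` are hypothetical names; `popFront` is the one from `std.range.primitives`.

```d
import std.range.primitives : popFront;

// Hypothetical string-based enum acting as the "container".
enum Greeting : string { hello = "hello" }

// Hypothetical helper: iterate over a string extracted from the enum
// value instead of over the enum value itself.
string dropFirst(Greeting g)
{
    string r = g;   // the "range" over the container's characters
    r.popFront();   // advances r only; g keeps its enum value
    return r;
}
```

The variable `r` changes as it is consumed while `g` keeps both its type and its value, so the question of what type an enum value has after `popFront` does not arise.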

May 11 2021

Paul Backus <snarwin gmail.com> writes:

On Tuesday, 11 May 2021 at 13:41:53 UTC, Andrei Alexandrescu 
wrote:
 True, D has only "orphan" ranges, no containers. std.container 
 is not working out and with current D technology we can't 
 define containers that work with safe/pure/nogc at the same 
 time (two out of three we can).

How much value does pure have here anyway? Typical container 
usage involves allocating from the global (!) heap, which 
arguably *should* be impure, hacks like `pureMalloc` 
notwithstanding.

May 11 2021

Timon Gehr <timon.gehr gmx.ch> writes:

On 11.05.21 16:38, Paul Backus wrote:
 allocating from the global (!) heap, which arguably *should* be impure

I think this is confusing different levels of abstraction. What should 
be impure is accessing memory addresses as integers.

May 11 2021

ruheladev40 <ruheladev400 gmail.com> writes:

I think it may make sense to require either wrappers that 
clarify intent, or always treat enums the same way (as an enum). 
I think Phobos *mostly* does the latter. Erroring for ambiguity 
might be more disruptive than it's worth.

May 11 2021
