digitalmars.D - No we should not support enum types derived from strings
- Andrei Alexandrescu (2/2) May 06 2021 We should remove all that rot from phobos pronto.
- evilrat (3/5) May 06 2021 Just a commoner here, can you explain for stupid what makes enum
- Andrei Alexandrescu (3/10) May 07 2021 Heavy toll on the infra for a very niche use case with trivial
- deadalnix (5/7) May 07 2021 It seems like the toll comes from isSomeString to return false
- Paul Backus (3/11) May 07 2021 "Is a string type" and "is implicitly convertible to a string
- Andrei Alexandrescu (36/49) May 07 2021 Yah. It's really been a string (heh!) of suboptimal decisions.
- Jacob Carlborg (6/12) May 07 2021 You can have enums with the base type being a struct or a class. How
- Andrei Alexandrescu (5/17) May 07 2021 The solution to that is "We do not support enums". But if you use a
- Jacob Carlborg (6/7) May 07 2021 If you're going to make strings a user defined type, how are you
- Meta (3/9) May 07 2021 It really, really should be. Pattern matching and destructuring
- Andrei Alexandrescu (2/8) May 07 2021 Built-in strings remain as they are.
- Jon Degenhardt (21/25) May 07 2021 This is a bit orthogonal, but... An important characteristic of
- Andrei Alexandrescu (4/32) May 07 2021 String s;
- Jon Degenhardt (36/42) May 07 2021 That's not quite what I was getting at. But that's my fault. A
- Walter Bright (8/11) May 09 2021 Already done:
- Andrei Alexandrescu (3/17) May 09 2021 Problem being of course that there's no UDT String type, only the crappy...
- guai (10/16) May 08 2021 In my experience treating a string as byte array is almost never
- Berni44 (18/21) May 08 2021 It is not difficult to recognize this case and go back 1 to 3
- Jon Degenhardt (3/15) May 08 2021 Exactly. All the ideas you listed apply. Parallelization is very
- guai (14/31) May 08 2021 I ment this [combining
- Adam D. Ruppe (19/23) May 08 2021 The thing is making the range be of dchars doesn't help with this.
- guai (2/7) May 08 2021 At least it won't induce more problems
- Adam D. Ruppe (7/10) May 08 2021 This is what Phobos already does and it has already created more
- Max Haughton (2/13) May 08 2021 The opaque blob model also allows SSO much more easily.
- Berni44 (27/50) May 08 2021 You are talking about generic algorithms that work for every
- guai (4/8) May 08 2021 No cryptography is done on strings but instead on byte arrays.
- Jon Degenhardt (23/40) May 08 2021 Data and log file processing are common cases. Single byte ascii
- guai (12/55) May 08 2021 When you work with log files first you pull it in as a byte
- Jon Degenhardt (15/21) May 08 2021 Sure you can. It's necessary to take of advantage of the
- guai (10/15) May 08 2021 Those algorithms you talking about are either doesn't need
- Jon Degenhardt (22/38) May 08 2021 I don't understand the point you are trying to make. Perhaps you
- Q. Schroll (2/7) May 07 2021 True. But why have it easy when you can have it complicated?
- Walter Bright (5/7) May 08 2021 Language lawyer point:
- deadalnix (8/16) May 09 2021 Sorry to be blunt, but this is complete language layering fail.
- Jon Degenhardt (60/71) May 11 2021 To try to put some focus on the user perspective, here's a sample
- Andrei Alexandrescu (7/19) May 11 2021 Thanks. I agree it's confusing. The mystery gets elucidated with some
- Andrei Alexandrescu (6/27) May 11 2021 Another unpleasant issue:
- Walter Bright (4/11) May 11 2021 The representation of a named enum is its base type.
- deadalnix (9/20) May 11 2021 Y.f7 is of type Y. It's representation is string, not
- Walter Bright (2/22) May 11 2021 That's what I said.
- Imperatorn (2/26) May 11 2021 🍿
- Andrei Alexandrescu (4/27) May 12 2021 `representation` is a library function, so in a way we get to have a say...
- Walter Bright (3/5) May 11 2021 That came about due to the decision to overload enum to create manifest
- Per Nordlöw (7/9) May 07 2021 Can you describe the scope of the rottenness in terms of contexts
- Steven Schveighoffer (23/26) May 07 2021 What do you mean "not support"? The language has enums derived from
- Andrei Alexandrescu (4/14) May 07 2021 Enums derived from strings should not be supported as strings in the
- Adam D. Ruppe (12/14) May 07 2021 I don't think the stdlib should special case much of anything.
- Andrei Alexandrescu (5/22) May 07 2021 Yes
- Jonathan M Davis (10/32) May 12 2021 Agreed. While implicit conversions can at times be useful, they cause a ...
- deadalnix (17/26) May 09 2021 100% agreed, but, back to my original point, why is the enum
- Andrei Alexandrescu (16/24) May 07 2021 Enums are poorly designed, but that's only a small part of the problem.
- Steven Schveighoffer (14/43) May 07 2021 But an enum with base string type can be passed as a string. The PR in
- Adam D. Ruppe (17/21) May 07 2021 "Can be passed as a" is not the same as "is a". There's a
- Andrei Alexandrescu (5/10) May 07 2021 YES! Int is not floating point, but yes you can initiate a floating
- Paul Backus (16/22) May 07 2021 We can already *almost* express this in the language. This code
- Adam D. Ruppe (39/41) May 07 2021 eeeeh that's a compile time argument and it still isn't actually
- Steven Schveighoffer (13/35) May 07 2021 But that's the intention of the function. format doesn't care what the
- Adam D. Ruppe (42/45) May 07 2021 Well, one way we can do that today is to have the template
- Adam D. Ruppe (4/7) May 07 2021 oh i should have added of course you can do the wchar and dchar
- Steven Schveighoffer (9/12) May 07 2021 The most common range BY FAR in all of D code is an array.
- Adam D. Ruppe (23/25) May 07 2021 int[5] arr;
- Andrei Alexandrescu (3/21) May 07 2021 Yah, ranges are a generalization of arrays. It would be odd if the
- NonNull (13/25) May 12 2021 No. Ranges are not a generalization of arrays unless you ignore
- Paul Backus (9/19) May 12 2021 Ranges are a generalization of arrays (or slices, if you prefer)
- NonNull (30/45) May 12 2021 This is the standard pattern of the interpretation of the meaning
- Andrei Alexandrescu (14/58) May 07 2021 Well you see here is the problem. An enum with base string can be
- Steven Schveighoffer (21/36) May 07 2021 Sorry, let's jump out of the fake dialog here for a second.
- Steven Schveighoffer (5/11) May 07 2021 I forgot to finish this thought, got interrupted.
- Daniel N (3/16) May 07 2021 What's wrong with this?
- Adam D. Ruppe (17/19) May 07 2021 That doesn't convert to string. It allows it to compile because T
- Steven Schveighoffer (4/22) May 07 2021 Because T is not a string.
- Andrei Alexandrescu (12/35) May 07 2021 Of course. I understand that very well. But that's a minor confusion and...
- Q. Schroll (10/25) May 07 2021 Maybe this is special casing here, but if you have a finite list
- deadalnix (8/16) May 09 2021 Popping the head out of an enum value ought to be a string, not
- Andrei Alexandrescu (7/20) May 09 2021 So you have a range r of type T.
- deadalnix (5/11) May 10 2021 If you have a range of T, then you got to return a T. I'm not
- Paul Backus (8/24) May 10 2021 popFront doesn't return a value, it mutates. So `r` before
- deadalnix (33/40) May 10 2021 r = r[1 .. $] is an error unless r actually is a string. You
- Andrei Alexandrescu (38/65) May 11 2021 If we move the goalposts we can with certain ease create the illusion
- deadalnix (23/29) May 11 2021 I don't think that any of what you wrote is incorrect, and these
- Andrei Alexandrescu (37/43) May 11 2021 Reasonable, though I should add that it's a decision made by the author
- Andrei Alexandrescu (7/13) May 11 2021 Correx:
- deadalnix (24/51) May 11 2021 It's debatable. There are many languages out there where it
- Walter Bright (11/13) May 11 2021 D has no notion of a "special kind of type". It only has a notion of "im...
- deadalnix (30/44) May 11 2021 Except, it is.
- 12345swordy (4/10) May 11 2021 Remove alias this support for classes and replace it with compile
- Walter Bright (3/5) May 11 2021 Converting a derived class reference to a base class reference is an "im...
- deadalnix (17/23) May 11 2021 That is trivially demonstrably false. Consider:
- Paul Backus (29/53) May 12 2021 I concede the points that enum strings do not violate the LSP,
- deadalnix (46/63) May 12 2021 That is true, and there are definitively cases where it is
- Paul Backus (5/17) May 12 2021 This *does* work as expected: https://run.dlang.io/is/Ru9phk
- deadalnix (20/24) May 12 2021 Yes, so we are getting at the root of this.
- Paul Backus (23/37) May 12 2021 Well, no, it doesn't--because, again, the LSP doesn't apply here
- deadalnix (33/61) May 13 2021 While what you say is correct, I'm not convinced it is right.
- Andrei Alexandrescu (18/49) May 12 2021 I was all over run.dlang.org like "Sure that's not going to work... wait...
- Paul Backus (11/19) May 12 2021 A template function, you mean? Because (as the rest of the post
- Andrei Alexandrescu (37/60) May 12 2021 Well the problem is that the choice of covariance of results for
- Jonathan M Davis (16/21) May 12 2021 Yeah, if enums are supposed to only have a fixed set of values, then the...
- Jonathan M Davis (11/14) May 12 2021 Or more accurately, all operations on an enum which are not guaranteed t...
- Alexandru Ermicioi (3/16) May 12 2021 So basically enum should implicitly be declared to be immutable
- deadalnix (2/19) May 13 2021 YES!
- deadalnix (48/64) May 10 2021 More to the point, consider this:
- Ola Fosheim Grøstad (25/36) May 10 2021 Not sure how this applies to C++, what subtyping issues are you
- deadalnix (21/48) May 10 2021 Function type don't have the right covariance/contravariance, you
- Ola Fosheim Grøstad (35/48) May 10 2021 Yes, I think everyone can agree with this. A good starting point
- Imperatorn (10/20) May 11 2021 +1
- Andrei Alexandrescu (4/19) May 11 2021 In case you're referring to deprecating support for enum strings in
- Mathias LANG (14/29) May 11 2021 Well, this thread is 11 pages and show no sign of winding down.
- Andrei Alexandrescu (15/88) May 11 2021 No it isn't.
- deadalnix (21/25) May 11 2021 Here we hit at the core of the problem. A reference to a type B
- Andrei Alexandrescu (6/21) May 11 2021 Of course. A range must implement popFront with the signature:
- deadalnix (4/10) May 11 2021 That must be a type error, this is a feature, not a bug. This is
- Andrei Alexandrescu (2/14) May 11 2021 Then enum strings are not ranges, correct?
- deadalnix (5/6) May 11 2021 They are not. But they are strings. Which imply that string
- Andrei Alexandrescu (2/7) May 11 2021 `ref string` is not a type.
- deadalnix (14/23) May 11 2021 This is just denial.
- Andrei Alexandrescu (3/26) May 11 2021 Again with moving the goalposts.
- Andrei Alexandrescu (8/37) May 11 2021 To clarify: you can't make up your own definitions as you go so as to
- Meta (16/25) May 11 2021 I apologize for injecting myself into this conversation, but with
- Andrei Alexandrescu (41/67) May 11 2021 Being blunt is totally cool, but that doesn't make you right.
- Ola Fosheim Grøstad (20/24) May 11 2021 I think you guys need to agree on what you mean by "type" and
- deadalnix (26/50) May 11 2021 While this is indeed very interesting, this is missing the larger
- 12345swordy (3/59) May 11 2021 No, classes are reference types, structs are values types in c#.
- deadalnix (11/13) May 11 2021 No, both are value type, but in the case of the class, the value
- 12345swordy (7/22) May 11 2021 Wrong.
- 12345swordy (5/32) May 11 2021 In layman terms, just because I can replace the item in the box
- 12345swordy (3/39) May 11 2021 Woops, meant to say "with the exact same item."
- deadalnix (22/26) May 11 2021 You might want to reconsider how sure of yourself you are. For
- 12345swordy (12/19) May 11 2021 The code you posted, do not support your claim what so ever. When
- deadalnix (9/29) May 12 2021 Before posting that email was the best time to run the code, look
- 12345swordy (12/39) May 12 2021 Like I said before, it does not support your claims, whatsoever.
- deadalnix (2/7) May 12 2021 I legitimately can't tell if you are an idiot or a troll.
- 12345swordy (6/14) May 12 2021 What kind of idiot that ignores official documentation provided
- Alexandru Ermicioi (20/35) May 12 2021 I think, you both talking about same thing. I think what he meant
- 12345swordy (10/37) May 12 2021 You are conflicting passing an argument by value/reference with
- Andrei Alexandrescu (2/7) May 12 2021 All of this is bizarrely incorrect. Care to elaborate?
- deadalnix (13/21) May 12 2021 Consider the following: https://godbolt.org/z/8vzx9W56a
- Ola Fosheim Grøstad (11/15) May 12 2021 In fairness all generic low level programming languages that are
- Ola Fosheim Grøstad (5/6) May 12 2021 Typo :-D, I meant pointer-to-Singeltong is subtype of
- Andrei Alexandrescu (2/24) May 12 2021 Ah, now we're at slicing. Love these forum discussions!
- Meta (30/93) May 11 2021 Of course, but I thought the conversation was about strings, not
- Andrei Alexandrescu (3/5) May 12 2021 Just by means of clarification, that's not true because the length is
- Jonathan M Davis (16/21) May 12 2021 To be more precise, a dynamic array in D is essentially
- Andrei Alexandrescu (2/4) May 12 2021 No, that would be ref int -> ref int, which has consequences.
- Timon Gehr (18/55) May 11 2021 Deadalnix is saying that there is a subtyping relationship for rvalues,
- Andrei Alexandrescu (64/83) May 12 2021 Well put. Rvalues can afford the luxury to change representation (e.g.
- deadalnix (31/67) May 12 2021 I've raised these problem on a regular basis for years now.
- Andrei Alexandrescu (10/36) May 12 2021 I know this is Walter's take, but please don't ascribe it to me as well....
- Ola Fosheim Grøstad (13/18) May 12 2021 You are so wonderful at being inclusive... :-P Never seen anyone
- deadalnix (4/9) May 12 2021 It's fine, then just listen to him and not to me. That already
- Ola Fosheim Grøstad (17/22) May 12 2021 It isn't a quirk. To get dynamic lookup you need to add a virtual
- Ola Fosheim Grøstad (10/22) May 12 2021 I don't understand what you mean by pointers being monomorphic.
- deadalnix (8/25) May 12 2021 Ok, consider the following.
- Ola Fosheim Grøstad (12/18) May 12 2021 Sadly, IIRC typeid(*a) is A, because A does not contain a virtual
- Ola Fosheim Grøstad (11/16) May 12 2021 To be more precise. B* is a subtype of A* if you can use B* in
- deadalnix (5/14) May 12 2021 I would say it is a sybtype, yes, but polymorphism imply that
- deadalnix (15/19) May 12 2021 It's quite simple.
- Ola Fosheim Grøstad (17/32) May 12 2021 I think I understand what you mean, but the terminology used is
- Jonathan M Davis (44/71) May 12 2021 Having isSomeString accept types that implicitly converted to string wou...
- Andrei Alexandrescu (3/5) May 12 2021 Sadly that's exactly what StringTypeOf does: https://run.dlang.io/is/8xq...
- deadalnix (18/22) May 11 2021 I simply removed an assumption that isn't relevant to the case
- deadalnix (22/34) May 11 2021 I realize that this require further explanations.
- Joseph Rushton Wakeling (11/20) May 10 2021 This feels a bit like the real problem might be in the conflation
- Ola Fosheim Grøstad (4/7) May 10 2021 That is true for C++ and Python as well. C++ has
- Andrei Alexandrescu (7/33) May 11 2021 True, D has only "orphan" ranges, no containers. std.container is not
- Paul Backus (6/10) May 11 2021 How much value does pure have here anyway? Typical container
- Timon Gehr (3/4) May 11 2021 I think this is confusing different levels of abstraction. What should
- ruheladev40 (4/4) May 11 2021 I think it makes possible sense to require either wrappers that
We should remove all that rot from phobos pronto. https://github.com/dlang/phobos/pull/8029
May 06 2021
On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
> We should remove all that rot from phobos pronto. https://github.com/dlang/phobos/pull/8029

Just a commoner here: can you explain, in simple terms, what makes enum strings a no-go and why they should be gone?
May 06 2021
On 5/7/21 2:03 AM, evilrat wrote:
> On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
>> We should remove all that rot from phobos pronto. https://github.com/dlang/phobos/pull/8029
>
> Just a commoner here: can you explain, in simple terms, what makes enum strings a no-go and why they should be gone?

Heavy toll on the infra for a very niche use case with trivial workarounds on the user side.
May 07 2021
On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
> Heavy toll on the infra for a very niche use case with trivial workarounds on the user side.

It seems like the toll comes from isSomeString returning false for these enums, no? What is the root cause of this not working? It doesn't seem like this should be a special case anywhere; it should just work.
May 07 2021
On Friday, 7 May 2021 at 12:06:43 UTC, deadalnix wrote:
> On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
>> Heavy toll on the infra for a very niche use case with trivial workarounds on the user side.
>
> It seems like the toll comes from isSomeString returning false for these enums, no? What is the root cause of this not working? It doesn't seem like this should be a special case anywhere; it should just work.

"Is a string type" and "is implicitly convertible to a string type" are not the same thing.
May 07 2021
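The distinction can be checked at compile time. The following is a minimal sketch, assuming a hypothetical enum `Color` with base type `string` and a hypothetical function `byteLength`; only `isSomeString` comes from `std.traits`.

```d
import std.traits : isSomeString;

// Hypothetical enum with a string base type (not from the thread).
enum Color : string { red = "red", green = "green" }

// Hypothetical function that accepts only the built-in string type.
size_t byteLength(string s) { return s.length; }

// "Is implicitly convertible to a string type": holds for the enum.
static assert(is(Color : string));

// "Is a string type": does not hold for the enum, only for the built-in.
static assert(!isSomeString!Color);
static assert(isSomeString!string);
```

A call such as `byteLength(Color.red)` compiles because the enum value converts to `string` at the call site, while a template constrained on `isSomeString!T` would reject `Color` as `T`.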
On 5/7/21 10:16 AM, Paul Backus wrote:
> On Friday, 7 May 2021 at 12:06:43 UTC, deadalnix wrote:
>> On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
>>> Heavy toll on the infra for a very niche use case with trivial workarounds on the user side.
>>
>> It seems like the toll comes from isSomeString returning false for these enums, no? What is the root cause of this not working? It doesn't seem like this should be a special case anywhere; it should just work.
>
> "Is a string type" and "is implicitly convertible to a string type" are not the same thing.

Yah. It's really been a string (heh!) of suboptimal decisions.

1. We wanted strings to be synonymous with built-in slices of char. "Users should not need to define their own string type!" This has been D's billion-dollar mistake.

2. Representing strings as char[] meant GC is a must, and also there's long-distance coupling between callers and callees whenever strings are passed about: a callee may modify characters in the caller's string. Such changes could have been absolutely trivially disallowed with a user-defined string type, but see (1) and did I mention D's billion-dollar mistake?

3. So yours truly (shudder) came up with the idea of doing strings as immutable(char)[] so that people can pass strings around, no coupling, no problem. GC is still a must. That satisfies (1) but bought us into the entire qualifiers business, which, any way I look at it, did not produce enough dividends compared to the effort put into it and the massive complications added to the language. (Aside: inout is the weirdest thing. How could we ever think that that was a good idea.)

4. When doing generic string functions for phobos, it made sense to support... oh wait a second we have so many string types. char[], wchar[], dchar[], each in triplicate because of const and immutable. So right off the bat we decided to support 9 string types. That was another mistake because nobody cares about wchar and dchar. Anyway, that's how isSomeChar and isSomeString were born.

5. Then came the question of ranges that have one of those 9 character types as elements... those should be supported too, no? IIRC at least a subset of phobos supports that stuff.

6. Then of course someone figured, wait a second, what about enums derived from strings and user-defined types that have an alias this as string... those deserve attention too, right?

And right here we had descended into madness. Compare all that with:

0. We put a String type in the standard library. It uses UTF8 inside and supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages its own memory so no need for the GC. It disallows remote coupling across callers/callees. Case closed.
May 07 2021
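Points 2 and 3 are about aliasing between caller and callee. A minimal sketch of that coupling, using the hypothetical helper names `shoutInPlace` and `shouted` (neither exists in Phobos):

```d
import std.ascii : toUpper;

// Takes a mutable slice: the callee writes into the caller's memory.
void shoutInPlace(char[] s)
{
    foreach (ref c; s)
        c = toUpper(c);
}

// Takes immutable(char)[]: the callee cannot touch the caller's data,
// so it has to work on its own copy.
string shouted(string s)
{
    char[] copy = s.dup;
    shoutInPlace(copy);
    return copy.idup;
}
```

Passing a `string` to `shoutInPlace` is rejected by the compiler, which is the decoupling that point 3 describes; both helpers still allocate through the GC, as the post notes.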
On 2021-05-07 17:24, Andrei Alexandrescu wrote:
> Compare all that with:
>
> 0. We put a String type in the standard library. It uses UTF8 inside and supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages its own memory so no need for the GC. It disallows remote coupling across callers/callees. Case closed.

You can have enums with the base type being a struct or a class. How does putting a String type in the standard library help with the enum problem you're describing?

-- /Jacob Carlborg
May 07 2021
On 5/7/21 2:22 PM, Jacob Carlborg wrote:
> On 2021-05-07 17:24, Andrei Alexandrescu wrote:
>> Compare all that with:
>>
>> 0. We put a String type in the standard library. It uses UTF8 inside and supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages its own memory so no need for the GC. It disallows remote coupling across callers/callees. Case closed.
>
> You can have enums with the base type being a struct or a class. How does putting a String type in the standard library help with the enum problem you're describing?

The solution to that is "We do not support enums". But if you use a non-templated class String, you won't feel much pain in the first place because the enums will be converted to String objects upon call. The String type solves all other problems mentioned.
May 07 2021
On 2021-05-07 17:24, Andrei Alexandrescu wrote:
> 0. We put a String type in the standard library.

If you're going to make strings a user-defined type, how are you planning to support things like switch statements with strings? It's not currently possible to have switch statements with user-defined types.

-- /Jacob Carlborg
May 07 2021
On Friday, 7 May 2021 at 18:25:57 UTC, Jacob Carlborg wrote:
> On 2021-05-07 17:24, Andrei Alexandrescu wrote:
>> 0. We put a String type in the standard library.
>
> If you're going to make strings a user-defined type, how are you planning to support things like switch statements with strings? It's not currently possible to have switch statements with user-defined types.

It really, really should be. Pattern matching and destructuring are two of my most wanted features in D.
May 07 2021
On 5/7/21 2:25 PM, Jacob Carlborg wrote:
> On 2021-05-07 17:24, Andrei Alexandrescu wrote:
>> 0. We put a String type in the standard library.
>
> If you're going to make strings a user-defined type, how are you planning to support things like switch statements with strings?

Built-in strings remain as they are.
May 07 2021
On Friday, 7 May 2021 at 15:24:42 UTC, Andrei Alexandrescu wrote:
> 0. We put a String type in the standard library. It uses UTF8 inside and supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages its own memory so no need for the GC. It disallows remote coupling across callers/callees. Case closed.

This is a bit orthogonal, but... An important characteristic of utf-8 arrays is that they are simultaneously a random access range of bytes and an input range of utf-8 characters. For efficiency it's often important to switch back and forth between these two interpretations. `byLine` is one type of example, where a byte-oriented search is done (e.g. with `memchr`), but afterward the same array is accessed as a utf-8 input range.

`byLine` implementations will usually work by iterating forward, but there are random access use cases as well. For example, it is perfectly reasonable to divide a utf-8 array roughly in half using byte offsets, then search for the nearest utf-8 character boundary. After this, both halves are treated as utf-8 input ranges, not random access.

This switching between interpretations doesn't fit well with the current distinction between `char[]` and `byte[]`. A number of algorithms in phobos operate on one or the other, but not both. It'd be very useful to have an approach to utf-8 strings that enabled switching interpretations easily, without casting.

--Jon
May 07 2021
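The `byLine`-style switch between the two views can be sketched as follows. This is only an illustration: `firstLine` is a hypothetical helper, and `countUntil` over the byte view stands in for the `memchr` call mentioned above. `representation` is the Phobos function that reinterprets a `string` as `immutable(ubyte)[]` without copying.

```d
import std.algorithm.searching : countUntil;
import std.string : representation;

// Hypothetical helper: locate the newline on the byte view, then hand
// the result back as a slice of the original UTF-8 string.
string firstLine(string text)
{
    auto bytes = text.representation;                 // byte view, no copy
    auto idx = bytes.countUntil(cast(ubyte) '\n');    // byte-oriented search
    return idx < 0 ? text : text[0 .. idx];           // back to the string view
}
```

Because `'\n'` is a single ASCII byte that never appears inside a multi-byte UTF-8 sequence, the slice always ends on a character boundary.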
On 5/7/21 6:34 PM, Jon Degenhardt wrote:
> On Friday, 7 May 2021 at 15:24:42 UTC, Andrei Alexandrescu wrote:
>> 0. We put a String type in the standard library. It uses UTF8 inside and supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages its own memory so no need for the GC. It disallows remote coupling across callers/callees. Case closed.
>
> This is a bit orthogonal, but... An important characteristic of utf-8 arrays is that they are simultaneously a random access range of bytes and an input range of utf-8 characters. For efficiency it's often important to switch back and forth between these two interpretations. `byLine` is one type of example, where a byte-oriented search is done (e.g. with `memchr`), but afterward the same array is accessed as a utf-8 input range.
>
> `byLine` implementations will usually work by iterating forward, but there are random access use cases as well. For example, it is perfectly reasonable to divide a utf-8 array roughly in half using byte offsets, then search for the nearest utf-8 character boundary. After this, both halves are treated as utf-8 input ranges, not random access.
>
> This switching between interpretations doesn't fit well with the current distinction between `char[]` and `byte[]`. A number of algorithms in phobos operate on one or the other, but not both. It'd be very useful to have an approach to utf-8 strings that enabled switching interpretations easily, without casting.

String s;
func1(s.bytes);
func2(s.dchars);
May 07 2021
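No such `String` type exists in Phobos. To make the `s.bytes` / `s.dchars` idea concrete, here is a rough sketch of the two accessors only, assuming the type simply wraps a built-in `string`; the memory management described earlier in the thread is deliberately left out, and every name in the struct is hypothetical.

```d
import std.string : representation;
import std.utf : byDchar;

// Hypothetical wrapper; only the two views are sketched here.
struct String
{
    private string data;

    this(string s) { data = s; }

    // Random-access view of the raw UTF-8 code units as bytes.
    immutable(ubyte)[] bytes() { return data.representation; }

    // Input-range view that decodes to UTF-32 code points.
    auto dchars() { return data.byDchar; }
}
```

With this shape, a caller picks the interpretation explicitly at each use, and nothing is decoded unless `dchars` is requested.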
On Saturday, 8 May 2021 at 02:05:42 UTC, Andrei Alexandrescu wrote:
> On 5/7/21 6:34 PM, Jon Degenhardt wrote:
>> It'd be very useful to have an approach to utf-8 strings that enabled switching interpretations easily, without casting.
>
> String s;
> func1(s.bytes);
> func2(s.dchars);

That's not quite what I was getting at. But that's my fault. A hastily written message that muddled a couple of concepts. Sorry about that, I need to write up a better description. But there are two underlying thoughts.

One is being able to convert from a random access byte array to a char input range (e.g. `byUTF`), do something with it (e.g. `popFront`), then convert that form back to a random access byte range. This is logically doable because both are views on the same physical array. However, once something is an input range it doesn't convert simply to a random access range. This first one strikes me as potentially challenging because this dual view on the underlying data is not common, so there's not a lot of incentive to support it as a general concept.

The second issue is more about current Phobos algorithms that specialize their implementations depending on whether the argument is a `char[]` or a `byte[]`. This normally involves conditioning on `isSomeString` or `isSomeChar`. `char[]` / `char` pass these tests, `byte[]` / `byte` do not. The cases I remember are ones where the string form was specialized to have better performance than the byte form. Look through searching.d for `isSomeString` use to see this.

The trouble with this is that at the application level it can be necessary to use a byte array when working with a number of facilities. This often involves I/O, e.g. reading fixed-size blocks from an input stream (`File.byChunk`). This operates on `ubyte[]` arrays. It can be cast to a `char[]`. But this can run afoul of autodecoding-related routines that expect correctly formed utf-8 characters. When reading fixed-size buffers, the starts and ends of the buffer will often not fall on utf-8 boundaries, so examining the bytes is necessary to handle these cases. (And input streams may contain corrupt utf-8 characters.)

I know the above is still not an adequate description. At some point I'll try to write up something more compelling.

--Jon
May 07 2021
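The fixed-size buffer problem can be illustrated with a small byte-level check. This is a sketch under stated assumptions: `completePrefixLength` is a hypothetical helper, not a Phobos function, and it only looks for a truncated sequence at the end of a chunk. It does not validate the rest of the data, so corrupt input still needs a separate check.

```d
// Hypothetical helper: how many leading bytes of `chunk` can be handed
// to UTF-8 code without cutting the last code point in half. The
// remaining tail would be carried over to the front of the next chunk.
size_t completePrefixLength(const(ubyte)[] chunk)
{
    // Step back over at most 3 trailing continuation bytes (10xxxxxx).
    size_t i = chunk.length;
    size_t back = 0;
    while (i > 0 && back < 3 && (chunk[i - 1] & 0xC0) == 0x80)
    {
        --i;
        ++back;
    }
    if (i == 0)
        return chunk.length; // empty or no lead byte in sight: leave to validation

    // The byte before the continuation run should be a lead byte;
    // its high bits say how long the sequence is meant to be.
    immutable lead = chunk[i - 1];
    size_t need;
    if (lead < 0x80)                need = 1;
    else if ((lead & 0xE0) == 0xC0) need = 2;
    else if ((lead & 0xF0) == 0xE0) need = 3;
    else if ((lead & 0xF8) == 0xF0) need = 4;
    else
        return chunk.length; // malformed: leave to validation

    // If the final sequence is shorter than announced, cut before it.
    return back + 1 < need ? i - 1 : chunk.length;
}
```

In a `File.byChunk` loop, the bytes past the returned length would be prepended to the next chunk before either piece is treated as `char[]`.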
On 5/7/2021 7:05 PM, Andrei Alexandrescu wrote:
> String s;
> func1(s.bytes);
> func2(s.dchars);

Already done:

s.byCodeUnit
s.byChar
s.byWchar
s.byDchar
s.byUTF

https://dlang.org/phobos/std_utf.html
May 09 2021
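For reference, a short sketch of what those `std.utf` adaptors yield on a built-in `string`. The three counting helpers are hypothetical names introduced only for illustration; `byChar`, `byWchar` and `byDchar` are the Phobos functions listed above.

```d
import std.range : walkLength;
import std.utf : byChar, byWchar, byDchar;

// Hypothetical helpers: count the code units of each encoding.
size_t utf8Units(string s)  { return s.byChar.walkLength; }
size_t utf16Units(string s) { return s.byWchar.walkLength; }
size_t utf32Units(string s) { return s.byDchar.walkLength; }
```

For a string holding the single code point U+1F600, these report 4, 2 and 1 units respectively, which shows that each adaptor is a lazy re-encoding of the same data rather than a different string type.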
On 5/9/21 5:04 AM, Walter Bright wrote:
> On 5/7/2021 7:05 PM, Andrei Alexandrescu wrote:
>> String s;
>> func1(s.bytes);
>> func2(s.dchars);
>
> Already done:
>
> s.byCodeUnit
> s.byChar
> s.byWchar
> s.byDchar
> s.byUTF
>
> https://dlang.org/phobos/std_utf.html

Problem being of course that there's no UDT String type, only the crappy immutable(char)[].
May 09 2021
On Friday, 7 May 2021 at 22:34:19 UTC, Jon Degenhardt wrote:
> `byLine` implementations will usually work by iterating forward, but there are random access use cases as well. For example, it is perfectly reasonable to divide a utf-8 array roughly in half using byte offsets, then search for the nearest utf-8 character boundary. After this, both halves are treated as utf-8 input ranges, not random access.

In my experience treating a string as a byte array is almost never a good thing. The person doing it must be very careful and truly understand what they are doing. What are the use cases other than `byLine` where this is useful?

Dividing a utf-8 array and searching for the nearest char may split inside a combining character sequence, which isn't something you usually want, especially when a human will read the text.

Conceptually a string is a sequence of characters: a range of dchar in D's terms.
May 08 2021
On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
> Dividing a utf-8 array and searching for the nearest char may split inside a combining character sequence, which isn't something you usually want.

It is not difficult to recognize this case and go back 1 to 3 bytes to reach a correct splitting place. UTF-8 was designed with this in mind.

- I can imagine that this can be useful in divide-and-conquer algorithms, like binary search.
- Or when you've got, for whatever reason, the possibility to do larger jumps while scanning a string, e.g. when you know that the next 50 letters do not contain a certain token you are looking for, you can safely jump 50 bytes, go back to the next splitting point and continue the linear search there.
- Or you want to cut a string into pieces of a certain length (again 50?), where the exact length is not that important. So you just jump ahead 50, go back again and split at this point. If there are a lot of non-ascii characters in between, the piece is of course shorter, but maybe that's ok, because speed is more important.
- You want to process pieces of a string in parallel: cut it into 16 pieces and let your 16 cores work on each of them.
May 08 2021
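The "go back 1 to 3 bytes" step can be sketched as below, assuming the input is already valid UTF-8. Both function names are hypothetical. The sketch finds code point boundaries only; it says nothing about combining sequences, which span several code points.

```d
// Hypothetical helper: the largest index <= pos at which a UTF-8
// sequence starts. Continuation bytes all match the pattern 10xxxxxx.
size_t boundaryAtOrBefore(string s, size_t pos)
{
    if (pos >= s.length)
        return s.length;
    while (pos > 0 && (s[pos] & 0xC0) == 0x80)
        --pos;
    return pos;
}

// Hypothetical helper: split roughly in half by byte offset without
// cutting a code point apart.
string[2] splitNearMiddle(string s)
{
    immutable mid = boundaryAtOrBefore(s, s.length / 2);
    return [s[0 .. mid], s[mid .. $]];
}
```

On valid input the loop runs at most three times, since a UTF-8 sequence has at most three continuation bytes, so the adjustment costs constant time regardless of string length.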
On Saturday, 8 May 2021 at 16:25:31 UTC, Berni44 wrote:
> On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
>> Dividing a utf-8 array and searching for the nearest char may split inside a combining character sequence, which isn't something you usually want.
>
> It is not difficult to recognize this case and go back 1 to 3 bytes to reach a correct splitting place. UTF-8 was designed with this in mind.
>
> - I can imagine that this can be useful in divide-and-conquer algorithms, like binary search.
> ... (more examples) ..
> - You want to process pieces of a string in parallel: cut it into 16 pieces and let your 16 cores work on each of them.

Exactly. All the ideas you listed apply. Parallelization is very often useful.
May 08 2021
On Saturday, 8 May 2021 at 16:25:31 UTC, Berni44 wrote:
> On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
>> Dividing a utf-8 array and searching for the nearest char may split inside a combining character sequence, which isn't something you usually want.
>
> It is not difficult to recognize this case and go back 1 to 3 bytes to reach a correct splitting place. UTF-8 was designed with this in mind.

I meant these [combining characters](https://en.wikipedia.org/wiki/Combining_character). They are language-specific, but most of the time the string does not contain any clue as to which language it is.

> - I can imagine that this can be useful in divide-and-conquer algorithms, like binary search.

They must be applied with great care to non-ascii texts. What about RTL, for example? You cannot split inside an RTL block.

> - Or you want to cut a string into pieces of a certain length (again 50?), where the exact length is not that important.

For what business task would I do that? I may want to split a string on some char subsequence for lexing. But one cannot assume the lengths of those chunks.

> So you just jump ahead 50, go back again and split at this point. If there are a lot of non-ascii characters in between, the piece is of course shorter, but maybe that's ok, because speed is more important.

Not sure if speed is more important than correctness.

> - You want to process pieces of a string in parallel: cut it into 16 pieces and let your 16 cores work on each of them.

I'm not sure if this is possible with all the quirks of unicode. I have never even heard of parallel processors for structured texts like xml.
May 08 2021
On Saturday, 8 May 2021 at 19:06:48 UTC, guai wrote:
> I ment this [combining characters](https://en.wikipedia.org/wiki/Combining_character). they are language-specific, but most of the time the string does not contain any clue which language is it.

The thing is, making the range be of dchars doesn't help with this. This kind of thinking is why Phobos does the autodecoding thing it does now, converting utf-8 to a range of dchar as it sees it... but those combining characters are still (or rather can be) two separate dchars!

So right now Phobos does something that seems useful... but actually isn't. All of the bad, none of the good.

BTW I also like to point out that Ascii actually has a lot of the same mysteries we ascribe to unicode. Like variable width chars: \t is an ascii char. Zero width char, ascii has \0 and \a. Negative width char? Is \b one? idk. But there's still a lot of times you can treat it as bytes and get away with it.

This is why I'm not sold on Andrei's new String idea myself. I totally agree making char[] a range of dchars is a bad idea. But I think the only right thing to do is to expose what it actually is and then both educate and empower the user to do what they need themselves.
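
The point that a combining sequence is still two dchars can be checked with a minimal sketch. It assumes a recent Phobos with `std.uni.byGrapheme`; the test string is an illustrative choice, not something from the thread.

```d
import std.range : walkLength;
import std.uni : byGrapheme;

void main()
{
    // "e" followed by U+0301 COMBINING ACUTE ACCENT: one user-perceived character.
    string s = "e\u0301";

    assert(s.length == 3);                 // UTF-8 code units (bytes)
    assert(s.walkLength == 2);             // autodecoded range: still two dchars
    assert(s.byGrapheme.walkLength == 1);  // grapheme clusters: one visible character
}
```

Decoding to `dchar` pays for the decode but still does not give "characters" in the sense a reader would expect; only the grapheme view does.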
May 08 2021
On Saturday, 8 May 2021 at 19:30:03 UTC, Adam D. Ruppe wrote:
> On Saturday, 8 May 2021 at 19:06:48 UTC, guai wrote:
>> I ment this [combining characters](https://en.wikipedia.org/wiki/Combining_character). they are language-specific, but most of the time the string does not contain any clue which language is it.
>
> The thing is making the range be of dchars doesn't help with this.

At least it won't induce more problems
May 08 2021
On Saturday, 8 May 2021 at 20:06:35 UTC, guai wrote:
>> The thing is making the range be of dchars doesn't help with this.
>
> At least it won't induce more problems

This is what Phobos already does and it has already created more problems. It was a mistake to do it this way.

But if string was just an opaque(ish) blob with a variety of accessor properties it would work better then. The big mistake Phobos made was trying to automatically do something and causing friction by that automatic thing not being right.
May 08 2021
On Saturday, 8 May 2021 at 21:54:28 UTC, Adam D. Ruppe wrote:
> On Saturday, 8 May 2021 at 20:06:35 UTC, guai wrote:
>>> The thing is making the range be of dchars doesn't help with this.
>>
>> At least it won't induce more problems
>
> This is what Phobos already does and it has already created more problems. It was a mistake to do it this way. But if string was just an opaque(ish) blob with a variety of accessor properties it would work better then. The big mistake Phobos made was trying to automatically do something and causing friction by that automatic thing not being right.

The opaque blob model also allows SSO much more easily.
May 08 2021
On Saturday, 8 May 2021 at 19:06:48 UTC, guai wrote:
> I ment this [combining characters](https://en.wikipedia.org/wiki/Combining_character). they are language-specific, but most of the time the string does not contain any clue which language is it.

You are talking about generic algorithms that work for every script. But unicode allows for algorithms that only support subsets. If your subset doesn't contain combining characters, you don't need to care about them. Otherwise you may need to go back to the nearest preceding base character. It depends on the use case.

>> - I can imagine, that this can be useful in divide-and-conquer algorithms, like binary search.
>
> They must be applied with great careful to non-ascii texts. What about RTL for example? You cannot split inside RTL block

Oh, yes, you can! Think of an algorithm which is doing cryptographic analysis and counting consecutive pairs of ascii characters. For that it doesn't matter if there is RTL text cut into pieces.

>> - Or you want to cut a string into pieces of a certain length (again 50?), where the exact length is not so much important.
>
> For what business task would I do that?

Simple wrapping to avoid losing text when printing, or to avoid having to scroll vertically. It is probably not useful for a high quality program...

> I may want to split a string on some char subsequence for lexing. But one cannot assume lengths of those chunks.

Depending on the use case you may know ahead.

>> So you just jump ahead 50, go back again and split at this point. If there are a lot of non ascii characters in between, this is of course shorter, but maybe ok, because speed is more important.
>
> Not sure if speed is more important than correctness.

Of course, this again depends on the use case. You can't say that in general.

>> - You want to process pieces of a string in parallel: Cut it in 16 pieces and let your 16 cores work on each of them.
>
> I'm not sure if this is possible with all the quirks of unicode.

Think again of the cryptographic analysis above, for an example. (Or checking wikipedia entries for whatever automatically.) Keep in mind that we do not always have to support everything of unicode. If we know ahead that our text contains mainly ascii and aside from this only a few base characters, but never combining characters and so on, we can use different algorithms which might be simpler or faster or both. Making sure that this constraint holds is then something that has to be done outside of the algorithm.

> Never herd even of parallel processors of structured texts like xml.

I would judge it much more difficult to process xml in parallel than to do the same with unicode.
May 08 2021
On Saturday, 8 May 2021 at 20:19:51 UTC, Berni44 wrote:
> Oh, yes, you can! Think of an algorithm which is doing cryptographic analysis and counting consecutive pairs of ascii characters. For that it doesn't matter if there is RTL text cut into pieces.

No cryptography is done on strings but instead on byte arrays. Why would you even want to use string here? Its methods won't be of any help.
May 08 2021
On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
> On Friday, 7 May 2021 at 22:34:19 UTC, Jon Degenhardt wrote:
>> `byLine` implementations will usually work by iterating forward, but there are random access use cases as well. For example, it is perfectly reasonable to divide a utf-8 array roughly in half using byte offsets, then searching for the nearest utf-8 character boundary. After this both halves are treated as utf-8 input ranges, not random access.
>
> In my experience treating a string as byte array is almost never a good thing. Person doing it must be very careful and truly understand what they are doing. What are those use cases other than `byLine` where this is useful? Dividing utf-8 array and searching for the nearest char may split inside a combining character which isn't a thing you usually want. Especially when human would read this text. Conceptually string is a sequence of characters. A range of dchar in D's terms.

Data and log file processing are common cases. Single byte ascii characters are normally used to delimit structure in such files. Record delimiters, field delimiters, name-value pair delimiters, escape syntax, etc. A common way to operate on such files is to identify structural boundaries by finding the requisite single byte ascii characters and treating the contained data as opaque (uninterpreted) sequences of utf-8 bytes. The details depend on the file format. But the key part is that single byte ascii characters can be unambiguously identified without interpreting other characters in a utf-8 data stream.

Of course, when it comes time to interpret the data inside these data streams it is necessary to operate on cohesive blocks. Yes graphemes, but also things like numbers. It's not useful to split a number in the middle and then call `std.conv.to!double` on it. Operating on the single byte structural elements allows deferring interpretation of multi-byte unicode content until it is needed.

This is why it's useful to switch back and forth between a byte-oriented view and a UTF character view. Operating on bytes is faster (e.g. `memchr`, no utf-8 decoding), enables parallelization (depending on the type of file), and can be used with fixed size buffer reads and writes.

--Jon
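
The byte-oriented field splitting described in this post might look like the following minimal sketch. `splitFields` is a hypothetical helper introduced here for illustration, and the tab-delimited record is an assumed format; only the Phobos calls (`representation`, `splitter`, `map`, `array`, `to`) are real library functions.

```d
import std.algorithm.iteration : map, splitter;
import std.array : array;
import std.conv : to;
import std.string : representation;

/// Hypothetical helper: splits a record on a single-byte ASCII delimiter
/// without decoding any UTF-8. Field contents are treated as opaque bytes.
string[] splitFields(string record, char delimiter = '\t')
{
    assert(delimiter < 0x80, "delimiter must be a single-byte ASCII character");
    return record.representation          // view as immutable(ubyte)[]
        .splitter(cast(ubyte) delimiter)  // byte comparison only
        .map!(f => cast(string) f)        // reinterpret each slice as UTF-8 text
        .array;
}

void main()
{
    // The second field contains multi-byte characters (u-umlaut, sharp s).
    auto fields = splitFields("id42\tGr\u00FC\u00DFe\t3.5");
    assert(fields.length == 3);
    assert(fields[1] == "Gr\u00FC\u00DFe");
    assert(fields[2].to!double == 3.5);   // interpretation deferred until needed
}
```

This relies on the UTF-8 property mentioned above: bytes below 0x80 never occur inside a multi-byte sequence, so the byte search cannot cut a code point in half.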
May 08 2021
On Saturday, 8 May 2021 at 18:44:00 UTC, Jon Degenhardt wrote:
> On Saturday, 8 May 2021 at 16:04:24 UTC, guai wrote:
>> On Friday, 7 May 2021 at 22:34:19 UTC, Jon Degenhardt wrote:
>>> `byLine` implementations will usually work by iterating forward, but there are random access use cases as well. For example, it is perfectly reasonable to divide a utf-8 array in roughly in half using byte offsets, then searching for the nearest utf-8 character boundary. At after this both halves are treated as utf-8 input ranges, not random access.
>>
>> In my experience treating a string as byte array is almost never a good thing. Person doing it must be very careful and truly understand what they are doing. What are those use cases other than `byLine` where this is useful? Dividing utf-8 array and searching for the nearest char may split inside a combining character which isn't a thing you usually want. Especially when human would read this text. Conceptually string is a sequence of characters. A range of dchar in D's terms.
>
> Data and log file processing are common cases. Single byte ascii characters are normally used to delimit structure in such files. Record delimiters, field delimiters, name-value pair delimiters, escape syntax, etc. A common way to operate on such files is to identify structural boundaries by finding the requisite single byte ascii characters and treating the contained data as opaque (uninterpreted) sequences of utf-8 bytes. The details depend on the file format. But the key part is that single byte ascii characters can be unambiguously identified without interpreting other characters in a utf-8 data stream. Of course, when it comes time to interpreting the data inside these data streams it is necessary to operate on cohesive blocks. Yes graphemes, but also things like numbers. It's not useful to split a number in the middle and then call `std.conv.to!double` on it. Operating on the single byte structural elements allows deferring interpretation of multi-byte unicode content until it is needed. This is why it's useful to switch back and forth between a byte-oriented view and a UTF character view. Operating on bytes is faster (e.g. `memchr`, no utf-8 decoding), enables parallelization (depending on the type of file), and can be used with fixed size buffer reads and writes. --Jon

When you work with log files, first you pull the data in as a byte stream and split it into chunks. Then you make a string out of each of them. Once you've done that, you process it like a string with all the rules of unicode. For example, split it into words. And then you may want to convert a word back to bytes again. But you cannot split a string wherever you want treating it as bytes. It most certainly wouldn't work with all the languages out there.

With string you cannot get a char by index, you must read them sequentially. You can search, you can tokenize, rewind and reinterpret maybe.
May 08 2021
On Saturday, 8 May 2021 at 19:33:45 UTC, guai wrote:
> ... But you cannot split a string wherever you want treating it as bytes. It most certainly wouldn't work with all the languages out there.

Sure you can. It's necessary to take advantage of the properties of utf-8 encoding to do it. That is, it's necessary to find a nearby utf-8 character boundary, but utf-8 is defined in a manner that enables this. Take a look at [section 2.5 Encoding Forms](http://www.unicode.org/versions/Unicode13.0.0/ch02.pdf#G13708) in the Unicode Standards doc. It describes exactly this.

> With string you cannot get a char by index, you must read them sequentially.

Correct, you cannot find a unicode character using a character based index without processing sequentially. But for large classes of algorithms this is not necessary. That is, there is often no need to find, for example, the 100th character. If all an algorithm needs to do is split a string roughly in half, then use the byte offsets to find the halfway point and then look for a utf-8 character boundary. If the algorithm is based on some other boundary, say, token boundaries, then find one of those boundaries.
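
As a rough illustration of the "split roughly in half, then find a boundary" idea, here is a minimal sketch. `alignToCodePoint` is a hypothetical helper, not a Phobos function; it assumes the input is valid UTF-8 and it finds code point boundaries only, not grapheme boundaries.

```d
import std.utf : validate;

/// Hypothetical helper: moves byte index `i` back to the first byte of the
/// code point that contains it. UTF-8 continuation bytes have the bit
/// pattern 10xxxxxx, so at most three steps back are needed on valid input.
size_t alignToCodePoint(const(char)[] s, size_t i) pure nothrow @nogc @safe
{
    while (i > 0 && i < s.length && (s[i] & 0xC0) == 0x80)
        --i;
    return i;
}

void main()
{
    // Bytes: 'a' at 0, U+00F1 at 1..2, 'b' at 3, U+20AC at 4..6, 'c' at 7.
    string s = "a\u00F1b\u20ACc";
    assert(s.length == 8);

    assert(alignToCodePoint(s, 2) == 1); // inside the 2-byte sequence
    assert(alignToCodePoint(s, 6) == 4); // inside the 3-byte sequence
    assert(alignToCodePoint(s, 3) == 3); // already on a boundary

    // Both halves are valid UTF-8 on their own (validate throws otherwise).
    auto mid = alignToCodePoint(s, 5);
    validate(s[0 .. mid]);
    validate(s[mid .. $]);
}
```

Whether a code point boundary is good enough (as opposed to a grapheme or token boundary) is exactly the use-case question being debated in this subthread.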
May 08 2021
On Saturday, 8 May 2021 at 20:22:28 UTC, Jon Degenhardt wrote:
> If all an algorithm needs to do is split a string roughly in half, then use the byte offsets to find the halfway point and then look for a utf-8 character boundary. If the algorithm is based on some other boundary, say, token boundaries, then find one of those boundaries.

The algorithms you are talking about either don't need strings at all, but byte/char arrays instead, or would produce garbage for any input other than ascii.

Your example with log files mixes binary data with text. A properly done logger will escape delimiters inside text chunks, so it isn't even a string per se, it's some binary data from which you need to extract a string first.

A lot of bugs are caused by this mixing of text with binary. And I think it is better to distinguish them properly on a type level.
May 08 2021
On Saturday, 8 May 2021 at 21:47:21 UTC, guai wrote:
> On Saturday, 8 May 2021 at 20:22:28 UTC, Jon Degenhardt wrote:
>> If all an algorithm needs to do is split a string roughly in half, then use the byte offsets to find the halfway point and then look for a utf-8 character boundary. If the algorithm is based on some other boundary, say, token boundaries, then find one of those boundaries.
>
> Those algorithms you talking about are either doesn't need strings at all but instead byte/char arrays or would produce garbage for any input other than ascii.

I don't understand the point you are trying to make. Perhaps you could rephrase. I've implemented any number of these types of algorithms. It's very common to mix interpretation as unicode strings with interpretation as utf-8 bytes. E.g. maybe it's necessary to do case-conversion at some stage of processing. This has to be done on unicode characters, not bytes. But needing to do such processing at some point does not exclude treating the data as utf-8 bytes for other purposes.

Also, a `char[]` in D is defined to be utf-8, and a `string` is an `immutable(char)[]`. So why would utf-8 data, including non-ascii characters, read into a `char[]` produce garbage? The answer is that it wouldn't. No, you cannot simply start on an arbitrary byte boundary, but nobody has suggested this.

> Your example with log files mixes binary data with text. Properly done logger will escape delimiters inside text chunks, so it isn't even a string per se, it's some binary data from which you need to extract a string first.

Again, I'm not following the logic. Log files may or may not include binary data. But I'm not sure why that matters. I'm talking about log files where the text portions are encoded as utf-8.

> A lot of bugs are caused by this mixing of text with binary. And I think it is better to distinguish them properly on a type level.

Perhaps it would help if you described what you mean by "binary". I tend to think of "binary" as things like image data, binary serialization formats, base-64 coding, compressed or encrypted text. These are quite different than utf-8 encoded unicode text.
May 08 2021
On Friday, 7 May 2021 at 15:24:42 UTC, Andrei Alexandrescu wrote:
> Compare all that with: We put a String type in the standard library. It uses UTF8 inside and supports iteration by either bytes, UTF8, UTF16, or UTF32. It manages its own memory so no need for the GC. It disallows remote coupling across callers/callees. Case closed.

True. But why have it easy when you can have it complicated?
May 07 2021
On 5/7/2021 7:16 AM, Paul Backus wrote:
> "Is a string type" and "is implicitly convertible to a string type" are not the same thing.

Language lawyer point: An enum can be implicitly converted to its base type, but it's a match level 2:

https://dlang.org/spec/function.html#function-overloading

(Agreeing with Paul)
May 08 2021
On Sunday, 9 May 2021 at 02:57:42 UTC, Walter Bright wrote:
> On 5/7/2021 7:16 AM, Paul Backus wrote:
>> "Is a string type" and "is implicitly convertible to a string type" are not the same thing.
>
> Language lawyer point: An enum can be implicitly converted to its base type, but it's a match level 2: https://dlang.org/spec/function.html#function-overloading (Agreeing with Paul)

Sorry to be blunt, but this is a complete language lawyering fail.

Classes implementing an interface are a subtype and are match level 2 (implicit conversion) when matching against the interface. In fact, any subtype is expected to be a match level 2. Arguably, this isn't bijective, as not all level 2 matches will be subtypes, so that doesn't definitively nail the topic at hand, but the arguments made in this thread are disturbingly unsound.
May 09 2021
On Friday, 7 May 2021 at 11:55:53 UTC, Andrei Alexandrescu wrote:
> On 5/7/21 2:03 AM, evilrat wrote:
>> On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
>>> We should remove all that rot from phobos pronto. https://github.com/dlang/phobos/pull/8029
>>
>> Just a commoner here, can you explain for stupid what makes enum string a no go and why it should begone?
>
> Heavy toll on the infra for a very niche use case with trivial workarounds on the user side.

To try to put some focus on the user perspective, here's a sample program:

```
import std.stdio;
import std.array;
import std.range;

void main()
{
    writefln!"%d"(0);

    immutable string f1 = "%d";
    writefln!f1(1);

    enum f2 = "%d";
    writefln!f2(2);

    enum string f3 = "%d";
    writefln!f3(3);

    enum { f4 = "%d" }
    writefln!f4(4);

    enum : string { f5 = "%d" }
    writefln!f5(5);

    enum X { f6 = "%d" }
    writefln!(X.f6)(6);    // Compilation error

    enum Y : string { f7 = "%d" }
    writefln!(Y.f7)(7);    // Compilation error
}
```

All but the named enums (last two) are fine. These fail with similar compilation errors:

```
Error: template std.stdio.writefln cannot deduce function from argument types !("%d")(int), candidates are:
dmd-2.095.1/osx/bin/../../src/phobos/std/stdio.d(4258):        writefln(alias fmt, A...)(A args)
  with fmt = f6,
       A = (int)
  must satisfy the following constraint:
       isSomeString!(typeof(fmt))
dmd-2.095.1/osx/bin/../../src/phobos/std/stdio.d(4269):        writefln(Char, A...)(in Char[] fmt, A args
```

This is at least a potentially confusing situation for users. The error message indicates that `f6` should be a "string" of some kind, and it looks like one. One needs to be very familiar with the details to understand why it does not satisfy `isSomeString`. Similarly with understanding why anonymous enums are fine but named enums are not.

The error message is also not particularly helpful in determining what the available workarounds are. They may be trivial once understood, but there's non-trivial learning to get there. Note that slicing (`[]`) and `.representation()` do not work for the template argument. Casting does. E.g. the following is fine:

```
writefln!(cast(string)X.f6)(6);
```

It can be argued that this case is rare enough in user code that the ROI from either making the case work or improving the compiler error message is too low to devote time to this now. But maybe there are other cheap options that could help users. A documentation note perhaps. A FAQ somewhere on the D site that would surface in searches.
May 11 2021
On 5/11/21 3:43 PM, Jon Degenhardt wrote:
>     enum { f4 = "%d" }
>     writefln!f4(4);
>
>     enum : string { f5 = "%d" }
>     writefln!f5(5);
>
>     enum X { f6 = "%d" }
>     writefln!(X.f6)(6);    // Compilation error
>
>     enum Y : string { f7 = "%d" }
>     writefln!(Y.f7)(7);    // Compilation error

Thanks. I agree it's confusing. The mystery gets elucidated with some ease if we write the types involved: f4 and f5 have type string, f6 has type X, and f7 has type Y.

It's unpleasant that `enum : string { f5 = "%d" }` is really the same as `enum f5 = "%d"`. I expected that some anonymous enum type would be generated.
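
The types listed in this post can be spelled out as compile-time checks. This is a minimal sketch reusing the names from the quoted program; the `isSomeString` results reflect the Phobos behavior reported in the error message earlier in the thread (dmd 2.095.1).

```d
import std.traits : isSomeString;

enum { f4 = "%d" }             // anonymous enum: a manifest constant
enum : string { f5 = "%d" }    // anonymous enum with base type: same thing
enum X { f6 = "%d" }           // named enum: introduces the type X
enum Y : string { f7 = "%d" }  // named enum with explicit base type

// Anonymous enum members are plain strings.
static assert(is(typeof(f4) == string));
static assert(is(typeof(f5) == string));

// Named enum members have the enum type, not string.
static assert(is(typeof(X.f6) == X));
static assert(is(typeof(Y.f7) == Y));

// X and Y convert implicitly to string, yet are not "some string" to Phobos.
static assert(is(X : string) && is(Y : string));
static assert(isSomeString!(typeof(f5)));
static assert(!isSomeString!X && !isSomeString!Y);

void main() {}
```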
May 11 2021
On 5/11/21 7:00 PM, Andrei Alexandrescu wrote:
> On 5/11/21 3:43 PM, Jon Degenhardt wrote:
>>     enum { f4 = "%d" }
>>     writefln!f4(4);
>>
>>     enum : string { f5 = "%d" }
>>     writefln!f5(5);
>>
>>     enum X { f6 = "%d" }
>>     writefln!(X.f6)(6);    // Compilation error
>>
>>     enum Y : string { f7 = "%d" }
>>     writefln!(Y.f7)(7);    // Compilation error
>
> Thanks. I agree it's confusing. The mystery gets elucidated with some ease if we write the types involved: f4 and f5 have type string, f6 has type X, and f7 have type Y. It's unpleasant that `enum : string { f5 = "%d" }` is really the same as `enum f5 = "%d"`. I expected that some anonymous enum type would be generated.

Another unpleasant issue:

    enum Y : string { f7 = "%d" }
    writeln(typeof(Y.f7.representation).stringof);

prints immutable(ubyte)[], not immutable(char)[]. So not even Y.f7.representation is usable. Sigh.
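
The observation about `representation` can be restated as a compile-time check, alongside the cast that does yield a `string`. This is a minimal sketch; the first assertion simply encodes the output reported in this post rather than a separately verified guarantee across Phobos versions.

```d
import std.string : representation;

enum Y : string { f7 = "%d" }

// As reported above: representation goes straight to the raw code units.
static assert(is(typeof(Y.f7.representation) == immutable(ubyte)[]));

// An explicit cast to the base type gives an ordinary string.
static assert(is(typeof(cast(string) Y.f7) == string));

void main()
{
    string s = Y.f7;   // the implicit conversion also works for plain variables
    assert(s == "%d");
}
```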
May 11 2021
On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
> Another unpleasant issue:
>
>     enum Y : string { f7 = "%d" }
>     writeln(typeof(Y.f7.representation).stringof);
>
> prints immutable(ubyte)[], not immutable(char)[]. So not even Y.f7.representation is usable. Sigh.

The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.
May 11 2021
On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
> On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
>> Another unpleasant issue:
>>
>>     enum Y : string { f7 = "%d" }
>>     writeln(typeof(Y.f7.representation).stringof);
>>
>> prints immutable(ubyte)[], not immutable(char)[]. So not even Y.f7.representation is usable. Sigh.
>
> The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.

Y.f7 is of type Y. Its representation is string, not immutable(ubyte)[].

typeof(Y.f7.representation) ought to be string.
typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[]

Unless I'm missing something, that would be the consistent behavior. Unless representation is supposed to recurse up to the bottom turtle?
May 11 2021
On 5/11/2021 7:04 PM, deadalnix wrote:
> On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
>> On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
>>> Another unpleasant issue: enum Y : string { f7 = "%d" } writeln(typeof(Y.f7.representation).stringof); prints immutable(ubyte)[], not immutable(char)[]. So not even Y.f7.representation is usable. Sigh.
>>
>> The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.
>
> Y.f7 is of type Y. It's representation is string, not immutable(ubyte)[] typeof(Y.f7.representation) ought to be string. typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[]

That's what I said.
May 11 2021
On Wednesday, 12 May 2021 at 02:56:49 UTC, Walter Bright wrote:
> On 5/11/2021 7:04 PM, deadalnix wrote:
>> On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
>>> On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
>>>> Another unpleasant issue: enum Y : string { f7 = "%d" } writeln(typeof(Y.f7.representation).stringof); prints immutable(ubyte)[], not immutable(char)[]. So not even Y.f7.representation is usable. Sigh.
>>>
>>> The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.
>>
>> Y.f7 is of type Y. It's representation is string, not immutable(ubyte)[] typeof(Y.f7.representation) ought to be string. typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[]
>
> That's what I said.

🍿
May 11 2021
On 5/11/21 10:04 PM, deadalnix wrote:
> On Wednesday, 12 May 2021 at 01:06:29 UTC, Walter Bright wrote:
>> On 5/11/2021 5:04 PM, Andrei Alexandrescu wrote:
>>> Another unpleasant issue: enum Y : string { f7 = "%d" } writeln(typeof(Y.f7.representation).stringof); prints immutable(ubyte)[], not immutable(char)[]. So not even Y.f7.representation is usable. Sigh.
>>
>> The representation of a named enum is its base type. The representation of a string type is immutable(ubyte)[]. It's consistent.
>
> Y.f7 is of type Y. It's representation is string, not immutable(ubyte)[] typeof(Y.f7.representation) ought to be string. typeof(Y.f7.representation.representation) ought to be immutable(ubyte)[] Unless I'm missing something, that wold b the consistent behavior. Unless representation is supposed to recurse up to the bottom turtle?

`representation` is a library function, so in a way we get to have a say in what it does. I would have expected it doesn't go all the way to primitive types, but if it does, that's not necessarily incorrect.
May 12 2021
On 5/11/2021 4:00 PM, Andrei Alexandrescu wrote:
> It's unpleasant that `enum : string { f5 = "%d" }` is really the same as `enum f5 = "%d"`. I expected that some anonymous enum type would be generated.

That came about due to the decision to overload enum to create manifest constants. This way, a block of manifest constants can be created.
May 11 2021
On Friday, 7 May 2021 at 03:48:47 UTC, Andrei Alexandrescu wrote:
> We should remove all that rot from phobos pronto. https://github.com/dlang/phobos/pull/8029

Can you describe the scope of the rottenness in terms of contexts and arguments? Are you referring to enums derived from aggregates as well? And how does this rottenness relate to the discrepancy in behavior between builtin `__traits(X, ...)` and `std.traits.X!(...)` for enum arguments?
May 07 2021
On 5/6/21 11:48 PM, Andrei Alexandrescu wrote:
> We should remove all that rot from phobos pronto. https://github.com/dlang/phobos/pull/8029

What do you mean "not support"? The language has enums derived from strings. Did you mean remove it from the language? That would be a severe penalty.

Did you mean that Phobos routines just should error whenever you use enum types derived from strings? That's also a severe penalty.

If you mean we shouldn't support it (as an ambiguous case) in *conversion* utilities (i.e. to/from string), then this makes some sense. But it's also not straightforward. Sometimes you WANT to convert from the enum to the base type. Sometimes you want to convert to the enum name. Going backwards (string to enum), which one makes more sense? It depends on context. It also doesn't help that a string enum implicitly converts to a string. The language is going to circumvent any policies Phobos has on that front.

For an example, in the serializers I have written, I usually have a "treat this enum type as its base type" UDA, because the data inside the serialized format is the base type, but I want it as an enum in d-land. But it depends on the situation.

I think it possibly makes sense to require either wrappers that clarify intent, or always treat enums the same way (as an enum). I think Phobos *mostly* does the latter. Erroring for ambiguity might be more disruptive than it's worth.

-Steve
May 07 2021
On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
> On 5/6/21 11:48 PM, Andrei Alexandrescu wrote:
>> We should remove all that rot from phobos pronto. https://github.com/dlang/phobos/pull/8029
>
> What do you mean "not support"? The language has enums derived from strings. Did you mean remove it from the language? That would be a severe penalty.

Enums derived from strings should not be supported as strings in the standard library.

> Did you mean that Phobos routines just should error whenever you use enum types derived from strings? That's also a severe penalty.

No it isn't.
May 07 2021
On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
> Enums derived from strings should not be supported as strings in the standard library.

I don't think the stdlib should special case much of anything.

Special casing enums is a mistake. If the user wants it treated as a string, they can cast it to a string.

Special casing static arrays is a mistake. The user can just slice it out the outside.

Special casing alias this is a mistake. The user can pass what they meant to pass.

The phobos templates should work like all other templates - on the exact type passed. Other functions work with the normal overloading and implicit conversion rules.

Kill all the special cases!
May 07 2021
On 5/7/21 11:33 AM, Adam D. Ruppe wrote:
> On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
>> Enums derived from strings should not be supported as strings in the standard library.
>
> I don't think the stdlib should special case much of anything. Special casing enums is a mistake. If the user wants it treated as a string, they can cast it to a string.

yes

> Special casing static arrays is a mistake. The user can just slice it out the outside.

Yes

> Special casing alias this is a mistake. The user can pass what they meant to pass.

YES

> The phobos templates should work like all other templates - on the exact type passed. Other functions work with the normal overloading and implicit conversion rules. Kill all the special cases!

YES!!!
May 07 2021
On Friday, May 7, 2021 9:39:40 AM MDT Andrei Alexandrescu via Digitalmars-d wrote:
> On 5/7/21 11:33 AM, Adam D. Ruppe wrote:
>> On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
>>> Enums derived from strings should not be supported as strings in the standard library.
>>
>> I don't think the stdlib should special case much of anything. Special casing enums is a mistake. If the user wants it treated as a string, they can cast it to a string.
>
> yes
>
>> Special casing static arrays is a mistake. The user can just slice it out the outside.
>
> Yes
>
>> Special casing alias this is a mistake. The user can pass what they meant to pass.
>
> YES
>
>> The phobos templates should work like all other templates - on the exact type passed. Other functions work with the normal overloading and implicit conversion rules. Kill all the special cases!
>
> YES!!!

Agreed. While implicit conversions can at times be useful, they cause a ton of problems when templates are involved. Ideally, we should accept no implicit conversions of any kind with templated code.

And honestly, I wish that the language had fewer implicit conversions in it. In particular, I think that implicitly slicing static arrays was a big mistake, and we've had a number of issues in Phobos because of it when trying to later generalize functions that originally just took strings.

- Jonathan M Davis
May 12 2021
On Friday, 7 May 2021 at 15:33:56 UTC, Adam D. Ruppe wrote:
> On Friday, 7 May 2021 at 15:25:30 UTC, Andrei Alexandrescu wrote:
>> Enums derived from strings should not be supported as strings in the standard library.
>
> I don't think the stdlib should special case much of anything. Special casing enums is a mistake. If the user wants it treated as a string, they can cast it to a string. [...] Kill all the special cases!

100% agreed, but, back to my original point, why is the enum thing a special case to begin with?

The fact that it is a special case to begin with flies in the face of Liskov's substitution principle - the enum type clearly is a subtype of string. You've got to wonder how it came to be that it just doesn't work automatically to begin with.

Adding special cases is indeed the wrong path. There is something deeper rotten here, and just saying, no, this shouldn't work is just not cutting it. Note that there should be special cases, but it'd be good to understand why these are special cases to begin with, and fix this.

Alternatively, we decide enums are not subtypes, in which case they shouldn't be implicitly convertible either. That wouldn't be such a bad idea as I've often missed the ability to do opaque type aliasing in D, but that seems way more disruptive than just admitting that "enum strings" are indeed a subtype of string.
May 09 2021
On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
> If you mean we shouldn't support it (as an ambiguous case) in *conversion* utilities (i.e. to/from string), then this makes some sense. But it's also not straightforward. Sometimes you WANT to convert from the enum to the base type. Sometimes you want to convert to the enum name. Going backwards (string to enum), which one makes more sense? It depends on context. It also doesn't help that a string enum implicitly converts to a string. The language is going to circumvent any policies Phobos has on that front.

Enums are poorly designed, but that's only a small part of the problem.

The bigger problem is the corruption of a noble principle. We wanted to be as generic as possible, and indeed in the beginning that seemed not only possible, but also easy. I don't think there's any other language or library supporting different character widths with this little aggravation.

Then this whole "be as generic as possible" became a slippery slope of inclusion. Allow enum strings. Allow alias this strings.

How about no.

User: "I have this enum string str and phobos won't consider it a string. Help!"

Another user: "Just use str.representation if you want to pass str around as a string."

User: "Cool."

Case closed.
May 07 2021
On 5/7/21 11:30 AM, Andrei Alexandrescu wrote:
> On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
>> If you mean we shouldn't support it (as an ambiguous case) in *conversion* utilities (i.e. to/from string), then this makes some sense. But it's also not straightforward. Sometimes you WANT to convert from the enum to the base type. Sometimes you want to convert to the enum name. Going backwards (string to enum), which one makes more sense? It depends on context. It also doesn't help that a string enum implicitly converts to a string. The language is going to circumvent any policies Phobos has on that front.
>
> Enums are poorly designed, but that's only a small part of the problem. The bigger problem is the corruption of a noble principle. We wanted to be as generic as possible, and indeed in the beginning that seemed not only possible, but also easy. I don't think there's any other language or library supporting different character widths with this little aggravation. Then this whole "be as generic as possible" became a slippery slope of inclusion. Allow enum strings. Allow alias this strings.

But an enum with base string type can be passed as a string. The PR in question is working around a limitation of the Phobos trait that says something derived from a string isn't really usable as a string (when it is).

The problem I see is that when phobos says something isn't true when it really is, it causes no end of confusion (*cough* autodecoding):

    static assert(!isSomeString!T);
    // yet...
    string s = someT;

> How about no. User: "I have this enum string str and phobos won't consider it a string. Help!" Another user: "Just use str.representation if you want to pass str around as a string."

User: "OK, but when should I use representation? I already pass it around as a string and it works fine. Why can't phobos comprehend that, when the language has no problems with it?"

-Steve
May 07 2021
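A minimal sketch of the mismatch described above, assuming a hypothetical enum `Color` and a hypothetical helper `takesString` (neither appears in the thread). It only illustrates that the Phobos trait rejects the enum while the language still converts it at the call site:

```D
import std.traits : isSomeString;

// Hypothetical string-based enum, for illustration only.
enum Color : string { red = "red", green = "green" }

// Hypothetical plain (non-template) function: the caller converts
// the enum to string before the call is made.
size_t takesString(string s) { return s.length; }

// Phobos does not classify the enum as a string type...
static assert(!isSomeString!Color);
// ...yet the language converts it to string implicitly,
// even though the two types are distinct.
static assert(is(Color : string));
static assert(!is(Color == string));

unittest
{
    Color c = Color.green;
    string s = c;                 // implicit conversion, no cast needed
    assert(s == "green");
    assert(takesString(c) == 5);  // conversion happens at the call site
}
```

Compiling with `-unittest -main` runs the checks; the `static assert`s are verified at compile time either way.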
On Friday, 7 May 2021 at 15:51:39 UTC, Steven Schveighoffer wrote:
> But an enum with base string type can be passed as a string.

"Can be passed as a" is not the same as "is a". There's a conversion involved.

For better or for worse, D templates do not participate in conversion and we shouldn't pretend that they do. This is often times very useful - you don't want to lose information in many templates. But there's other times when that information doesn't matter and it would be nice if you didn't have to think about it....

...so maybe we should consider changing templates so they can participate at the language level... it would be interesting if the compiler did the conversions BEFORE instantiating any template. Then it can reuse the instances more easily too. I think it actually does for const params for example, but it could do more.

> User: "OK, but when should I use representation? I already pass it around as a string and it works fine. Why can't phobos comprehend that, when the language has no problems with it?"

But the language DOES have problems with it for certain types of functions. Phobos is trying to deny that reality.
May 07 2021
On 5/7/21 12:30 PM, Adam D. Ruppe wrote:
> On Friday, 7 May 2021 at 15:51:39 UTC, Steven Schveighoffer wrote:
>> But an enum with base string type can be passed as a string.
>
> "Can be passed as a" is not the same as "is a". There's a conversion involved.

YES! Int is not floating point, but yes you can initialize a floating point from an int.

BTW it's worse than I feared. There are 104 occurrences of StringTypeOf in phobos. There should be 0.
May 07 2021
On Friday, 7 May 2021 at 16:30:26 UTC, Adam D. Ruppe wrote:
> For better or for worse, D templates do not participate in conversion and we shouldn't pretend that they do. This is often times very useful - you don't want to lose information in many templates. But there's other times when that information doesn't matter and it would be nice if you didn't have to think about it....

We can already *almost* express this in the language. This code works:

    void fun(T : string, T val)()
    {
        pragma(msg, "instantiated with ", T.stringof);
    }

    enum E : string { x = "hello" }

    alias test = fun!(E, E.x); // prints: instantiated with E

But if you try to write it the more natural way, with the value parameter first, and have the compiler deduce the type, you get an error:

    void fun(T val, T : string)()
    {
        pragma(msg, "instantiated with ", T.stringof);
    }
    // Error: undefined identifier `T`
May 07 2021
On Friday, 7 May 2021 at 17:02:17 UTC, Paul Backus wrote:
> We can already *almost* express this in the language. This code works:

eeeeh that's a compile time argument and it still isn't actually a string. What I'm talking about is like in the normal function:

    void test(string s) { writeln(s); }
    enum Test : string { a = "foo" }
    test(Test.a);

The conversion to string happens outside `test`. So caller instead of callee, whereas with a template - any template - the exact type is passed. What T:string is saying is that the callee *can* do the conversion if it wants to inside, but the compiler won't actually do it for you.

This is very useful in a lot of cases. Like if you do

    void foo(T : SomeBase)(T t) {}

and pass foo(new Derived()), you can still see the whole Derived type and thus do some reflection and such over it, with the compiler promising that it can be converted to SomeBase if you want to.

Of course, in this case, it is not really different than a template constraint. You could do

    void foo(T)(T t) if(is(T : SomeBase)) {}

and get that same rejection behavior. But of course what's nice about specialization is you can then add an overload

    void foo(T : SomeBase)(T t) {}
    void foo(T : Derived)(T t) {}

And if you get like

    class Derived : SomeBase {}
    class OtherBranch : SomeBase {}

and call

    foo(new Derived());     // goes to second overload as it is a more specific match
    foo(new OtherBranch()); // goes to first overload as it is the best option available,
                            // but it still can see it is OtherBranch inside there,
                            // unlike a normal interface cast where you'd only have
                            // that detail at runtime.
May 07 2021
On 5/7/21 12:30 PM, Adam D. Ruppe wrote:
> On Friday, 7 May 2021 at 15:51:39 UTC, Steven Schveighoffer wrote:
>> But an enum with base string type can be passed as a string.
>
> "Can be passed as a" is not the same as "is a". There's a conversion involved.

But that's the intention of the function. format doesn't care what the expression really is, it wants some type of string. How do you say "I want to accept something that's a string, but I want it as a string please"

> For better or for worse, D templates do not participate in conversion and we shouldn't pretend that they do. This is often times very useful - you don't want to lose information in many templates. But there's other times when that information doesn't matter and it would be nice if you didn't have to think about it....

e.g. format.

> ...so maybe we should consider changing templates so they can participate at the language level... it would be interesting if the compiler did the conversions BEFORE instantiating any template. Then it can reuse the instances more easily too. I think it actually does for const params for example, but it could do more.

Interesting idea!

>> User: "OK, but when should I use representation? I already pass it around as a string and it works fine. Why can't phobos comprehend that, when the language has no problems with it?"
>
> But the language DOES have problems with it for certain types of functions. Phobos is trying to deny that reality.

What I mean is, I can write:

    void foo(string s);

and it works for enums that are string-based. Why doesn't format work with that same principle?

The answer is because there isn't a good way to do it.

-Steve
May 07 2021
On Friday, 7 May 2021 at 17:11:32 UTC, Steven Schveighoffer wrote:
> How do you say "I want to accept something that's a string, but I want it as a string please"

Well, one way we can do that today is to have the template forward to a normal function, or a normal function forward to a template.

    void format(T...)(const char[] s, T args)
    {
        format(asRangeOfDchar(s), args);
    }

    void format(Range, T...)(Range r, T args) if(isAppropriateRange!Range)
    {
        // actual impl based on the range interface
        // and actually tbh I'd personally take another step
        // and collapse all these down even more.
    }

Then a whole bunch of conversions are done to match `const char[]` and the template is then working with that entry point instead of the whole plate.

This of course assumes isAppropriateRange is false for anything that isn't actually already a range. And I'm assuming string is not already a range. Otherwise you enter back into the hell of not only saying what you accept, but having to exclude things too.

So let me rant. I think it was actually a mistake for Phobos to UFCS shoe-horn in range functions on arrays too - this includes strings as well as int[] and such as well. Lots of new users ask why they can't do the same thing. And like Phobos took this opportunity to do silly things like autodecoding, which we all hate now, but I don't think the freestanding ufcs range functions should exist at all. Just have the user fetch a range out of the container. Then they get in that habit with other containers too and it moves a bunch of ugly code out of every consuming function.

Heck the `asRange` thing itself might have a variety of overloads it forwards to.

    MyRange asRangeHelper(const char[] s) { return MyRange(s); }

    auto asRange(T)(T t) { /* generic stuff */ }

    // let the language convert it in these specializations
    auto asRange(T : const char[])(T t) { return asRangeHelper(t); }

and so on and so forth.

This is a half-baked rant I'm sure you can destroy at will. But like I'm pretty sure if we did develop this it would be nicer overall than what we have now.

> The answer is because there isn't a good way to do it.

And it is possible the language could insert some magic to make it easier if we really put our thinking caps on.
May 07 2021
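A rough sketch of the forwarding idea above, under stated assumptions: `countSpaces`, `countSpacesImpl` and `Greeting` are hypothetical names invented for illustration, and `std.utf.byCodeUnit` stands in for the post's `asRangeOfDchar`. The thin non-template overloads let the language perform the conversion, so the implementation template only ever sees a range:

```D
import std.range.primitives : isInputRange;
import std.utf : byCodeUnit;

// Hypothetical implementation template: works on ranges only and
// never needs to know about enums or other convertible types.
size_t countSpacesImpl(R)(R r)
if (isInputRange!R)
{
    size_t n;
    foreach (c; r)
        if (c == ' ')
            ++n;
    return n;
}

// Thin non-template entry points. Because these are ordinary functions,
// the caller's argument is implicitly converted to the parameter type.
size_t countSpaces(const(char)[] s)  { return countSpacesImpl(s.byCodeUnit); }
size_t countSpaces(const(wchar)[] s) { return countSpacesImpl(s.byCodeUnit); }
size_t countSpaces(const(dchar)[] s) { return countSpacesImpl(s.byCodeUnit); }

// Hypothetical string-based enum used to exercise the forwarder.
enum Greeting : string { hi = "hi there you" }

unittest
{
    Greeting g = Greeting.hi;
    assert(countSpaces(g) == 2);   // enum converted to const(char)[] by the caller

    wstring w = "a b"w;
    assert(countSpaces(w) == 1);   // picks the wchar overload
}
```

The cost is the one noted in the thread: a trivial forwarder per character width.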
On Friday, 7 May 2021 at 18:17:31 UTC, Adam D. Ruppe wrote:
>     void format(T...)(const char[] s, T args) { format(asRangeOfDchar(s), args); }

oh i should have added of course you can do the wchar and dchar overloads here too. yeah yeah i know "DRY" but like it is a trivial forwarder, get over it.
May 07 2021
On 5/7/21 2:17 PM, Adam D. Ruppe wrote:
> I think it was actually a mistake for Phobos to UFCS shoe-horn in range functions on arrays too - this includes strings as well as int[] and such as well.

The most common range BY FAR in all of D code is an array. The end result of something like you allude to would result in nearly all of phobos NOT working with arrays.

Just a taste:

    int[] arr = genArray;
    arr.sort(); // fail.

I don't want to go to that place, ever.

-Steve
May 07 2021
On Friday, 7 May 2021 at 18:44:26 UTC, Steven Schveighoffer wrote:
> The end result of something like you allude to would result in nearly all of phobos NOT working with arrays.

    int[5] arr;
    arr.sort(); // fails, you need to use []

    Array!int arr;
    arr.sort(); // fails, you need to use []

Some random phobos functions special-case this to make it work, which is the real wtf, and those should be undone; just get the user to slice a static array. So I'd just make it all consistent.

But tbh I don't feel that strongly about it... except for string. string should no longer be a range. Delete its popFront overload and let the user pick byCodeUnit or byCodePoint or whatever. Just rip that band aid right off.

Just even for the others, even if the [] was deemed unacceptable, i don't love the ufcs solution. So many people try to do freestanding functions for other types, inspired by the phobos popFront.. and isInputRange fails because phobos itself must import the ufcs module. Other new people do foo.empty and it fails because they didn't import the module. So like even if the behavior remained the same as today, I'd like to define it a little differently.

but meh dont wanna continue too far down this particular thing since it is the part of my rant i care the least about.
May 07 2021
On 5/7/21 2:44 PM, Steven Schveighoffer wrote:
> On 5/7/21 2:17 PM, Adam D. Ruppe wrote:
>> I think it was actually a mistake for Phobos to UFCS shoe-horn in range functions on arrays too - this includes strings as well as int[] and such as well.
>
> The most common range BY FAR in all of D code is an array. The end result of something like you allude to would result in nearly all of phobos NOT working with arrays. Just a taste: int[] arr = genArray; arr.sort(); // fail. I don't want to go to that place, ever. -Steve

Yah, ranges are a generalization of arrays. It would be odd if the generalization of arrays didn't work when tried with arrays.
May 07 2021
On Friday, 7 May 2021 at 20:53:08 UTC, Andrei Alexandrescu wrote:
> On 5/7/21 2:44 PM, Steven Schveighoffer wrote:
>> On 5/7/21 2:17 PM, Adam D. Ruppe wrote:
>>> I think it was actually a mistake for Phobos to UFCS shoe-horn in range functions on arrays too - this includes strings as well as int[] and such as well.
>>
>> The most common range BY FAR in all of D code is an array. The end result of something like you allude to would result in nearly all of phobos NOT working with arrays.
>
> Yah, ranges are a generalization of arrays. It would be odd if the generalization of arrays didn't work when tried with arrays.

No. Ranges are not a generalization of arrays unless you ignore the most important feature of the notion of a Range.

An array is a sequence of things in space: a spatial container (all values stored) that happens to be a sequence. A Range is a sequence of things in time. (Purist definition, often true in practice.)

A spatial container can be /exploded/ into a sequence in time. And a sequence in time can be /accreted/ into a spatial container (whether it has sequence or not).

Explode is a natural idea and could be defined for any spatial container, producing a Range from a spatial container, and specifically from an array. Making a distinction of spatial and temporal makes sense.
May 12 2021
On Wednesday, 12 May 2021 at 14:49:35 UTC, NonNull wrote:
> On Friday, 7 May 2021 at 20:53:08 UTC, Andrei Alexandrescu wrote:
>> Yah, ranges are a generalization of arrays. It would be odd if the generalization of arrays didn't work when tried with arrays.
>
> No. Ranges are not a generalization of arrays unless you ignore the most important feature of the notion of a Range. An array is a sequence of things in space: a spatial container (all values stored) that happens to be a sequence. A Range is a sequence of things in time. (Purist definition, often true in practice.)

Ranges are a generalization of arrays (or slices, if you prefer) in the same way that iterators are a generalization of pointers. In both cases, certain features of the specialized version are ignored or left out in the generalized version.

As you've correctly pointed out, one of those ignored features is the array's layout in memory. A range *may* store all of its elements in memory, or it may not; as users of the range API, we are not supposed to know or care.
May 12 2021
On Wednesday, 12 May 2021 at 15:08:46 UTC, Paul Backus wrote:
> On Wednesday, 12 May 2021 at 14:49:35 UTC, NonNull wrote:
>> No. Ranges are not a generalization of arrays unless you ignore the most important feature of the notion of a Range. An array is a sequence of things in space: a spatial container (all values stored) that happens to be a sequence. A Range is a sequence of things in time. (Purist definition, often true in practice.)
>
> Ranges are a generalization of arrays (or slices, if you prefer) in the same way that iterators are a generalization of pointers. In both cases, certain features of the specialized version are ignored or left out in the generalized version. As you've correctly pointed out, one of those ignored features is the array's layout in memory. A range *may* store all of its elements in memory, or it may not; as users of the range API, we are not supposed to know or care.

This is the standard pattern of the interpretation of the meaning of Range. It is more concrete. I want the idea of range to escape its historical semantic origins.

I am suggesting a different and cleaner interpretation of that meaning. One that draws a deliberate line between space and time as a means of motivating language design.

Instead of treating a spatial data structure as a range by simply ignoring its non-range features and just using range operations, I am suggesting a semantic hard line be drawn between the two. A range could be obtained by exploding a spatial data structure (array say) and regarded as a distinct entity. Concretely the latent temporal sequence of things taken from the spatial data structure (the derived range) could be regarded as semantically quite different and separate from that data structure.

While some may consider this a distinction without a difference, it does nevertheless change how one might relate a range to a spatial data structure in a programming language. My view leads to an explicit explode operation of some kind on all occasions, whereas yours can munge together range stuff with other operations on spatial data structures, so that your spatial structure IS a range and abstraction is avoided.

Moving away from the historical semantics to the semantics I suggest above and having that guide language design separates those concerns. The idea of /explode/ is a nice intuitive fundamental concept that is concealed and entangled in D right now. Things could be less baroque.

Specifically, arrays would then be treated the same way as any other spatial data structure. They would not be ranges.
May 12 2021
On 5/7/21 11:51 AM, Steven Schveighoffer wrote:
> On 5/7/21 11:30 AM, Andrei Alexandrescu wrote:
>> On 5/7/21 11:20 AM, Steven Schveighoffer wrote:
>>> If you mean we shouldn't support it (as an ambiguous case) in *conversion* utilities (i.e. to/from string), then this makes some sense. But it's also not straightforward. Sometimes you WANT to convert from the enum to the base type. Sometimes you want to convert to the enum name. Going backwards (string to enum), which one makes more sense? It depends on context. It also doesn't help that a string enum implicitly converts to a string. The language is going to circumvent any policies Phobos has on that front.
>>
>> Enums are poorly designed, but that's only a small part of the problem. The bigger problem is the corruption of a noble principle. We wanted to be as generic as possible, and indeed in the beginning that seemed not only possible, but also easy. I don't think there's any other language or library supporting different character widths with this little aggravation. Then this whole "be as generic as possible" became a slippery slope of inclusion. Allow enum strings. Allow alias this strings.
>
> But an enum with base string type can be passed as a string. The PR in question is working around a limitation of the Phobos trait that says something derived from a string isn't really usable as a string (when it is).

Well you see here is the problem. An enum with base string can be coerced to a string, but is not a true subtype of string. This came to a head with ranges, too - you can pop off the head of a string and still have a string, but if you pop off the head of an enum string you get some enum value that is not present in the set of enum values.

Concatenation has similar problems, e.g. s ~ s for enum strings yields string, not an enum string. (Weirdly s ~= s works...)

So enum strings break ISA/Liskov. Alias this also does due to an overwhelming number of errors in its design and implementation.

> The problem I see is, when phobos says something isn't true, when it really is, causes no end of confusion (*cough* autodecoding) static assert(!isSomeString!T); // yet... string s = someT;

This only shows that we have a baroque language that allows user-defined conversions from non-strings to strings. The code above is NO PROOF that T is supposed to be a string.

>> How about no. User: "I have this enum string str and phobos won't consider it a string. Help!" Another user: "Just use str.representation if you want to pass str around as a string."
>
> User: "OK, but when should I use representation? I already pass it around as a string and it works fine. Why can't phobos comprehend that, when the language has no problems with it?"

"When you want a string".
May 07 2021
On 5/7/21 12:43 PM, Andrei Alexandrescu wrote:
> On 5/7/21 11:51 AM, Steven Schveighoffer wrote:
>> On 5/7/21 11:30 AM, Andrei Alexandrescu wrote:
>>> User: "I have this enum string str and phobos won't consider it a string. Help!" Another user: "Just use str.representation if you want to pass str around as a string."
>>
>> User: "OK, but when should I use representation? I already pass it around as a string and it works fine. Why can't phobos comprehend that, when the language has no problems with it?"
>
> "When you want a string".

Sorry, let's jump out of the fake dialog here for a second.

The problem I have is, you have a function like:

    foo(T)(T s) if (isSomeString!T)

The *intention* here is that, I want to NOT have to write:

    foo(string s) { impl }
    foo(wstring s) { impl }
    foo(dstring s) { impl }
    ... // etc with const, mutable

BUT, if I have an enum that converts to a string, then if I actually DID write all those, then it would compile. However, the template version does not. This is the confusion that a user and library author has.

I think the problem here is that the language doesn't give you a good way to express that. So we rely on template constraints that both can't exactly express that intention, and where the approximations create various template instantiations that cause strange problems (i.e. if you accept an enum that converts to string, it's still an enum inside the template). Whereas the language

I'm not suggesting any specific changes here, but I recognize there is a disconnect from what we *want* to express, and what the language provides.

-Steve
May 07 2021
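To make the disconnect concrete, here is a small sketch; `lenOverload`, `lenTemplate` and the enum `E` are hypothetical names, not anything from Phobos. The hand-written overload accepts the enum, while the constrained template rejects it:

```D
import std.traits : isSomeString;

// Hypothetical hand-written overload: accepts anything that
// implicitly converts to string, including a string-based enum.
size_t lenOverload(string s) { return s.length; }

// Hypothetical template in the style discussed above: the constraint
// is evaluated on the exact argument type, with no conversion applied.
size_t lenTemplate(T)(T s)
if (isSomeString!T)
{
    return s.length;
}

enum E : string { x = "hello" }

unittest
{
    E e = E.x;
    assert(lenOverload(e) == 5);                        // converted at the call site
    static assert(!__traits(compiles, lenTemplate(e))); // constraint fails for E
    assert(lenTemplate("hello") == 5);                  // a real string is accepted
}
```

Whether the template *should* accept `E` is exactly what the thread is arguing about; the sketch only shows the current asymmetry.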
On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
> I think the problem here is that the language doesn't give you a good way to express that. So we rely on template constraints that both can't exactly express that intention, and where the approximations create various template instantiations that cause strange problems (i.e. if you accept an enum that converts to string, it's still an enum inside the template). Whereas the language

I forgot to finish this thought, got interrupted.

Whereas the language (with non-template parameters) does the matching and conversion simultaneously without needing special cases.

-Steve
May 07 2021
On Friday, 7 May 2021 at 17:16:06 UTC, Steven Schveighoffer wrote:
> On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
>> I think the problem here is that the language doesn't give you a good way to express that. So we rely on template constraints that both can't exactly express that intention, and where the approximations create various template instantiations that cause strange problems (i.e. if you accept an enum that converts to string, it's still an enum inside the template). Whereas the language
>
> I forgot to finish this thought, got interrupted. Whereas the language (with non-template parameters) does the matching and conversion simultaneously without needing special cases. -Steve

What's wrong with this?

    void fun(T : string)(T t)
May 07 2021
On Friday, 7 May 2021 at 17:27:18 UTC, Daniel N wrote:
> What's wrong with this? void fun(T : string)(T t)

That doesn't convert to string. It allows it to compile because T *can* be converted to string and thus it is the closest specialization it can get, but it does NOT actually convert it.

----
import std.stdio;

enum Test : string { a = "foo" }

void test2(T:string)(T t)
{
    pragma(msg, T); // Test, not string!
    writeln(t);
}

void main()
{
    test2(Test.a);
}
----
May 07 2021
On 5/7/21 1:27 PM, Daniel N wrote:
> On Friday, 7 May 2021 at 17:16:06 UTC, Steven Schveighoffer wrote:
>> On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
>>> I think the problem here is that the language doesn't give you a good way to express that. So we rely on template constraints that both can't exactly express that intention, and where the approximations create various template instantiations that cause strange problems (i.e. if you accept an enum that converts to string, it's still an enum inside the template). Whereas the language
>>
>> I forgot to finish this thought, got interrupted. Whereas the language (with non-template parameters) does the matching and conversion simultaneously without needing special cases.
>
> What's wrong with this? void fun(T : string)(T t)

Because T is not a string. e.g. for a string-based enum, t.popFront won't work.

-Steve
May 07 2021
On 5/7/21 1:05 PM, Steven Schveighoffer wrote:
> The problem I have is, you have a function like: foo(T)(T s) if (isSomeString!T) The *intention* here is that, I want to NOT have to write: foo(string s) { impl } foo(wstring s) { impl } foo(dstring s) { impl } ... // etc with const, mutable BUT, if I have an enum that converts to a string, then if I actually DID write all those, then it would compile. However, the template version does not. This is the confusion that a user and library author has.

Of course. I understand that very well. But that's a minor confusion and inconvenience; people understand very well that e.g. this won't work:

    void foo(float);
    void foo(double);
    void main() { foo(1); }

The reason is slightly different but the point is the same: convertibility has its subtleties and programming languages comprehend small surprises. Supporting enum strings and alias this at the huge cost we incur now is definitely over two standard deviations away from what's reasonable.

> I think the problem here is that the language doesn't give you a good way to express that. So we rely on template constraints that both can't exactly express that intention, and where the approximations create various template instantiations that cause strange problems (i.e. if you accept an enum that converts to string, it's still an enum inside the template). Whereas the language I'm not suggesting any specific changes here, but I recognize there is a disconnect from what we *want* to express, and what the language provides.

That I am on board with.
May 07 2021
On Friday, 7 May 2021 at 17:05:08 UTC, Steven Schveighoffer wrote:
> The problem I have is, you have a function like:
> ```D
> auto foo(T)(T s) if (isSomeString!T) { impl }
> ```
> The *intention* here is that, I want to NOT have to write:
> ```D
> auto foo(string s) { impl }
> auto foo(wstring s) { impl }
> auto foo(dstring s) { impl }
> ... // etc with const, mutable
> ```
> BUT, if I have an enum that converts to a string, then if I actually DID write all those, then it would compile. However, the template version does not. This is the confusion that a user and library author has.

Maybe this is special casing here, but if you have a finite list of types you want to support, it might be easier to add an `AliasSeq` of all string types to `std.traits` or so and use
```D
static foreach (String; Strings)
    auto foo(String s) { impl }
```
Looks generic, but actually isn't. The implementation bloat is a different beast though.
May 07 2021
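A runnable sketch of the `static foreach` suggestion above. The `Strings` sequence is defined locally here as an assumption (no such alias is claimed to exist in `std.traits`), and `codeUnitSize` and `E` are hypothetical names chosen for illustration:

```D
import std.meta : AliasSeq;

// Assumed list of supported string types; const and mutable variants
// would have to be listed as well to cover every case.
alias Strings = AliasSeq!(string, wstring, dstring);

// Generates one ordinary (non-template) overload per listed type.
static foreach (String; Strings)
{
    size_t codeUnitSize(String s) { return typeof(s[0]).sizeof; }
}

enum E : string { x = "hello" }

unittest
{
    E e = E.x;
    assert(codeUnitSize(e) == 1);  // enum converts to string: char overload

    wstring w = "hello"w;
    assert(codeUnitSize(w) == 2);  // wchar overload

    dstring d = "hello"d;
    assert(codeUnitSize(d) == 4);  // dchar overload
}
```

Because the generated functions are plain overloads, the enum is converted by the caller, at the price of one copy of the body per type.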
On Friday, 7 May 2021 at 16:43:20 UTC, Andrei Alexandrescu wrote:
> Well you see here is the problem. An enum with base string can be coerced to a string, but is not a true subtype of string. This came to a head with ranges, too - you can pop off the head of a string still have a string, but if you pop off the head of an enum string you get some enum value that is not present in the set of enum values. Concatenation has similar problems, e.g. s ~ s for enum strings yields string, not an enum string. (Weirdly s ~= s works...)

Popping the head out of an enum value ought to be a string, not that enum's value. I don't really see where the problem is here, this is subtyping 101.

I raised a few times in the past that there were unsound operations performed (as in "Weirdly s ~= s works...") but I don't think turning compiler bugs into standard library policies is going to lead to better tomorrows.
May 09 2021
On 5/9/21 8:57 PM, deadalnix wrote:
> On Friday, 7 May 2021 at 16:43:20 UTC, Andrei Alexandrescu wrote:
>> Well you see here is the problem. An enum with base string can be coerced to a string, but is not a true subtype of string. This came to a head with ranges, too - you can pop off the head of a string still have a string, but if you pop off the head of an enum string you get some enum value that is not present in the set of enum values. Concatenation has similar problems, e.g. s ~ s for enum strings yields string, not an enum string. (Weirdly s ~= s works...)
>
> Popping the head out of an enum value ought to be a string, not that enum's value. I don't really see where the problem is here, this is subtyping 101.

So you have a range r of type T. You call r.popFront(). Obviously the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan?

No, this is not subtyping 101.
May 09 2021
On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
> So you have a range r of type T. You call r.popFront(). Obviously the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan? No, this is not subtyping 101.

If you have a range of T, then you've got to return a T. I'm not sure what the problem is here. Do you have a concrete example? All I can think of are things like slicing and the like, and they should obviously return a string, not a T.
May 10 2021
On Monday, 10 May 2021 at 12:19:07 UTC, deadalnix wrote:
> On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
>> So you have a range r of type T. You call r.popFront(). Obviously the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan? No, this is not subtyping 101.
>
> If you have a range of T, then you got to return a T. I'm not sure what's the problem is here. Do you have a concrete example? All I can think of are things like slicing and alike, and they should obviously return a string, not a T.

popFront doesn't return a value, it mutates. So `r` before popFront and `r` after popFront must be the same type, because they are the same variable.

If popFront for a string enum is `r = r[1 .. $]`, and typeof(r[1 .. $]) != typeof(r), then it doesn't work, and string enums can't be ranges (from which it follows that they are not Liskov-substitutable for strings).
May 10 2021
On Monday, 10 May 2021 at 13:30:52 UTC, Paul Backus wrote:
> popFront doesn't return a value, it mutates. So `r` before popFront and `r` after popFront must be the same type, because they are the same variable. If popFront for a string enum is `r = r[1 .. $]`, and typeof(r[1 .. $]) != typeof(r), then it doesn't work, and string enums can't be ranges (from which it follows that they are not Liskov-substitutable for strings).

r = r[1 .. $] is an error unless r actually is a string. You cannot mutate an enum value and have it stay an enum.

If you think that invalidates the LSP, I'm afraid there is a big misunderstanding about the LSP. Not all operations on a subtype have to return said subtype. It is made clearer if you consider the slicing operation as a member function on an object instead - as it seems classes and inheritance are the only way OOP is understood these days.

    class A { A slice(int start, int end) { ... } }
    class B : A {}

Where is it implied that B's version of the slice operation must return an A? Nowhere, the LSP absolutely doesn't mandate that. It mandates that you can pass a B to something that expects an A, and that thing will behave the way you'd expect. And it does!

If your code needs an A, then you mark it as accepting an A as input. If I have a B and want to pass it to your code, I can too, transparently. You do not need to even know about the existence of B when you wrote your code. This is what the LSP is at its core.

Back to our string example, the code should accept string (A), with zero knowledge of the existence of any enum string (B). You should be able to pass a B to that code and have everything work as expected.

The argument that the enum string is not a subtype because it breaks the LSP is nonsense; this in fact demonstrates that the type system is unsound and that it breaks the LSP. And this is why people end up desperately trying to re-implement it in libraries, which results in a ton more work and complexity for everybody involved.
May 10 2021
On 5/10/21 5:55 PM, deadalnix wrote:
> On Monday, 10 May 2021 at 13:30:52 UTC, Paul Backus wrote:
>> popFront doesn't return a value, it mutates. So `r` before popFront and `r` after popFront must be the same type, because they are the same variable. If popFront for a string enum is `r = r[1 .. $]`, and typeof(r[1 .. $]) != typeof(r), then it doesn't work, and string enums can't be ranges (from which it follows that they are not Liskov-substitutable for strings).
>
> r = r[1 .. $] is an error unless r actually is a string. You cannot mutate an enum value and have it stay an enum.
>
> If you think that invalidates the LSP, I'm afraid there is a big misunderstanding about the LSP. Not all operations on a subtype have to return said subtype. It is made clearer if you consider the slicing operation as a member function on an object instead, as it seems classes and inheritance are the only way OOP is understood these days.
>
>     class A { A slice(int start, int end) { ... } }
>     class B : A {}
>
> Where is it implied that B's version of the slice operation must return an A?

If we move the goalposts we can with certain ease create the illusion that a lot of things are possible and even easy. This works very well in forum discussions where all that's needed is eloquence and the perseverance to answer every post with one that just slightly moves the discussion around so it appears to have answers to every objection and have the last word on any topic. This is exactly what happens here - half of your points contradict the other half, but never in the same post, and the appearance is that you have easy answers to everything.

In the initial days of ranges we actually considered that popFront() would instead be tail(), which returns by value. So instead of today's form (given a range r):

    for (; !r.empty; r.popFront) { ... use r.front ... }

we'd have had:

    for (; !r.empty; r = r.tail) { ... use r.front ... }

This doesn't change things much (and wouldn't improve the situation with enums) but does open up the possibility - what if r.tail() actually returns a type different from r? In all interesting cases that means r = r.tail wouldn't work anymore, which complicates range algorithms A LOT. They'd need to use recursion instead of iteration:

    void someRangeFunction(R)(R range) {
        if (range.empty) {
            ... empty case ...
        } else {
            ... do some work for range.front ...
            return someRangeFunction(range.tail);
        }
    }

(I should note that that's actually of interest for immutable ranges, for the simple reason they aren't assignable.)

At any rate, we decided this would complicate everything in Phobos way too much (and I think that was a correct prediction) so we chose to have popFront() mutate the current range.
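The two styles contrasted in this post can be compared side by side on a plain `int[]`. This is a minimal sketch: `sumIterative` and `sumRecursive` are hypothetical names, and slicing with `r[1 .. $]` stands in for the `tail()` primitive that was considered but never adopted.

```d
import std.range.primitives : empty, front, popFront;

// Iterative form: relies on popFront mutating r in place,
// so r must keep the same type across the whole loop.
int sumIterative(int[] r)
{
    int total;
    for (; !r.empty; r.popFront())
        total += r.front;
    return total;
}

// Recursive form in the spirit of the hypothetical tail():
// r is const and never reassigned; each step passes on a new slice.
int sumRecursive(const int[] r)
{
    if (r.length == 0)
        return 0;
    return r[0] + sumRecursive(r[1 .. $]);
}

void main()
{
    assert(sumIterative([1, 2, 3]) == 6);
    assert(sumRecursive([1, 2, 3]) == 6);
}
```

The recursive version never needs the tail to have the same type as the original range, which is exactly the freedom the post describes, at the cost of restructuring every algorithm.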
May 11 2021
On Tuesday, 11 May 2021 at 15:33:45 UTC, Andrei Alexandrescu wrote:
> If we move the goalposts we can with certain ease create the illusion that a lot of things are possible and even easy. [...] At any rate, we decided this would complicate everything in Phobos way too much (and I think that was a correct prediction) so we chose to have popFront() mutate the current range.

I don't think that any of what you wrote is incorrect, and these are even reasonable tradeoffs as far as I can tell.

I however would like to remind where this whole thing starts from: format!SomeEnumString(...) is expected to work for users. Not that SomeEnumString is a full fledged range or anything, simply that you can pass it down to phobos, or anything else for that matter, in place where a string is expected.

This is a reasonable expectation. It is also a reasonable expectation that this shouldn't require a ton of scaffolding to work, in phobos or elsewhere. Therefore, the fact that phobos required scaffolding to make this work is indicative that there is a deeper problem. Focusing on finding what that deeper problem is and fixing it seems like a healthier path forward than simply pretending there is no problem and pushing it all on the users.

In this case, it was noted here ( https://forum.dlang.org/post/umndraexmrxiyrmfpcyo@forum.dlang.org ) that the root cause of the problem might be that there is a conflation between the container and the range. I think this is a reasonable hypothesis. Having two things trying to do one thing is a very typical source of such problems.
May 11 2021
On 5/11/21 12:26 PM, deadalnix wrote:
> I however would like to remind where this whole thing starts from: format!SomeEnumString(...) is expected to work for users.

Reasonable, though I should add that it's a decision made by the author of the format() API.

> Not that SomeEnumString is a full fledged range or anything, simply that you can pass it down to phobos, or anything else for that matter, in place where a string is expected.

Reasonable, though again a matter of API definition. Would you expect this to work?

    float sin(float x);
    double sin(double x);
    real sin(real x);
    ...
    auto x = sin(1);

Shouldn't that work? Not that int is a full fledged floating point number or anything, simply that you can pass it down to phobos, or anything else for that matter, in place where a floating point number is expected.

Oh, but wait, it's the templates. Great.

    T sin(T)(T x) if (isFloatingPoint!T);
    ...
    auto x = sin(1);

Shouldn't that work? Not that int is a full fledged floating point number or anything, simply that you can pass it down to phobos, or anything else for that matter, in place where a floating point number is expected.

Well an argument can be made that it should work, or the API designer can wisely choose to NOT yield true from isFloatingPoint!int.

And if we explore this madness further, we get to an enormity just as awful as StringTypeOf:

    template FloatingPointTypeOf(T) {
        static if (isIntegral!T) {
            alias FloatingPointTypeOf = T;
        } else ...
    }

And then whenever we need a floating point type we use is(FloatingPointTypeOf!T) like a bunch of dimwits. What use case does that help? Who is helped by that? Someone who can't bring themselves to convert whatever they have to double prior to using the standard library. Arguably not a good design.
May 11 2021
On 5/11/21 12:57 PM, Andrei Alexandrescu wrote:
>     template FloatingPointTypeOf(T) {
>         static if (isIntegral!T) {
>             alias FloatingPointTypeOf = T;
>         } else ...
>     }

Correx:

    template FloatingPointTypeOf(T) {
        static if (isIntegral!T) {
            alias FloatingPointTypeOf = double;
        } else ...
    }
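To make the two API styles in this exchange concrete, here is a small sketch under stated assumptions. `halfStrict`, `halfLoose`, and `LooseFloatTypeOf` are hypothetical names invented for illustration; `LooseFloatTypeOf` only mimics the corrected `FloatingPointTypeOf` above and is not a Phobos trait. `isFloatingPoint` and `isIntegral` are the real `std.traits` templates.

```d
import std.traits : isFloatingPoint, isIntegral;

// Strict API: only genuine floating-point types are accepted.
T halfStrict(T)(T x) if (isFloatingPoint!T)
{
    return x / 2;
}

// Hypothetical trait in the style of the corrected FloatingPointTypeOf:
// integral types are mapped to double, floating-point types map to themselves.
template LooseFloatTypeOf(T)
{
    static if (isIntegral!T)
        alias LooseFloatTypeOf = double;
    else static if (isFloatingPoint!T)
        alias LooseFloatTypeOf = T;
}

// Loose API: accepts anything the trait can map, converting internally.
auto halfLoose(T)(T x) if (is(LooseFloatTypeOf!T))
{
    LooseFloatTypeOf!T y = x;
    return y / 2;
}

void main()
{
    assert(halfStrict(1.0) == 0.5);

    // The strict constraint rejects an int argument.
    static assert(!__traits(compiles, halfStrict(1)));

    // The loose version accepts it and silently works in double.
    assert(halfLoose(1) == 0.5);
    static assert(is(typeof(halfLoose(1)) == double));

    // The caller-side alternative: convert first, then call the strict API.
    double one = 1;
    assert(halfStrict(one) == 0.5);
}
```

The loose version needs an extra trait and an internal conversion in every function that wants to be permissive, which is the scaffolding cost the post objects to; the strict version moves a one-line conversion to the caller.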
May 11 2021
On Tuesday, 11 May 2021 at 16:57:13 UTC, Andrei Alexandrescu wrote:
> Reasonable, though again a matter of API definition. Would you expect this to work?
>
>     float sin(float x);
>     double sin(double x);
>     real sin(real x);
>     ...
>     auto x = sin(1);
>
> Shouldn't that work? Not that int is a full fledged floating point number or anything, simply that you can pass it down to phobos, or anything else for that matter, in place where a floating point number is expected.

It's debatable. There are many languages out there where it doesn't.

I think your case here is disingenuous, because an int is not a special kind of float. We are explicitly outside of the scope of the argument being made to begin with. Whatever conclusion we reach using int and float would have no bearing on what should happen for string and SomeEnumString.

However, in D, it is possible to do:

    enum SomeEnumInt : int;

This is for instance used in std.encoding. I'm not sure if this works with float or not, but assuming that it does, then this absolutely and unambiguously works:

    enum SomeEnumFloat : float;
    SomeEnumFloat f = ...;
    auto x = sin(f);

Here, x would have type float, based on `float sin(float x)`.

> Well an argument can be made that it should work, or the API designer can wisely choose to NOT yield true from isFloatingPoint!int.

An argument could be made, however, this is not the argument I am making, so I don't really see the point of bringing this up.

> And if we explore this madness further, we get to an enormity just as awful as StringTypeOf:
>
>     template FloatingPointTypeOf(T) {
>         static if (isIntegral!T) {
>             alias FloatingPointTypeOf = T;
>         } else ...
>     }
>
> And then whenever we need a floating point type we use is(FloatingPointTypeOf!T) like a bunch of dimwits. What use case does that help? Who is helped by that? Someone who can't bring themselves to convert whatever they have to double prior to using the standard library. Arguably not a good design.

This is indeed not a good design, but also isn't really required if the places requiring a float can consistently accept SomeEnumFloat, because in this case, it turtles transparently all the way down.
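The float-based enum case from this post can be tried directly. In this sketch `Ratio` and `twice` are hypothetical names; a single non-template function stands in for `sin` so that overload resolution between `float`, `double`, and `real` does not enter the picture.

```d
// Hypothetical names; a single float function stands in for sin.
enum Ratio : float { half = 0.5f, one = 1.0f }

float twice(float x)
{
    return 2 * x;
}

void main()
{
    Ratio r = Ratio.half;

    // The enum value is passed where a float is expected.
    auto y = twice(r);
    static assert(is(typeof(y) == float));
    assert(y == 1.0f);

    // The conversion only goes one way.
    static assert(is(Ratio : float));
    static assert(!is(float : Ratio));
}
```

Because `twice` is not a template, the argument is converted at the call site and the function body never sees the enum type, which is the "turtles all the way down" behavior described above.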
May 11 2021
On 5/11/2021 12:14 PM, deadalnix wrote:
> I think your case here is disingenuous, because an int is not a special kind of float.

D has no notion of a "special kind of type". It only has a notion of "implicitly convertible".

* An int is implicitly convertible to a float.
* An enum is implicitly convertible to its base type.

The two *must* behave the same way, or the language falls apart with hackish special cases that will never work in a predictable manner.

One could design a language with two kinds of conversions:

1. is-a-special-case-of
2. is-implicitly-convertible-to

but D isn't it.
May 11 2021
On Tuesday, 11 May 2021 at 19:37:55 UTC, Walter Bright wrote:
> On 5/11/2021 12:14 PM, deadalnix wrote:
>> I think your case here is disingenuous, because an int is not a special kind of float.
>
> D has no notion of a "special kind of type". It only has a notion of "implicitly convertible".
>
> * An int is implicitly convertible to a float.
> * An enum is implicitly convertible to its base type.
>
> The two *must* behave the same way, or the language falls apart with hackish special cases that will never work in a predictable manner.
>
> One could design a language with two kinds of conversions:
>
> 1. is-a-special-case-of
> 2. is-implicitly-convertible-to
>
> but D isn't it.

Except, it is. D has numerous instances of both already, and pretending it doesn't really isn't going to lead anywhere useful. And this very thread is indeed proof that "the language falls apart with hackish special cases that will never work in a predictable manner."

The fact is that you can't get rid of 1. and support OOP, because polymorphism is a key ingredient of OOP. And we even go as far as to talk about some of the metaprogramming techniques in D as being compile time polymorphism, so this is clearly a road we want to embark on.

The alternative is to go full functional on these things, and, as Andrei explained with the tail example, this is an option that works as well, but you have to write everything in functional style, which makes some code harder to write.

Personally, I'm not interested in D going full functional, because I appreciate that different ideas are better expressed in different paradigms. But I understand that it means that we must have 1.

Now, do we need 2.? Strictly speaking, we do not. We could just say that string float conversion and vice versa must be explicit. We can remove alias this, and whatever other feature of the language does implicit conversion. I'm actually confident that in some cases, that would be a win, but also that we are too far gone to realistically be able to remove 2.

So we have both, we need to live with both, and make sensible decisions based on that. Pretending that we don't have both only leads to the guarantee that we'll make more bad decisions on that front in the future.
May 11 2021
On Tuesday, 11 May 2021 at 19:56:05 UTC, deadalnix wrote:
> On Tuesday, 11 May 2021 at 19:37:55 UTC, Walter Bright wrote:
>> [...]
>
> Except, it is. D has numerous instances of both already, and pretending it doesn't really isn't going to lead anywhere useful. [...]

Remove alias this support for classes and replace it with compile time default interface methods.

-Alex
May 11 2021
On 5/11/2021 12:56 PM, deadalnix wrote:
> The fact is that you can't get rid of 1. and support OOP, because polymorphism is a key ingredient of OOP.

Converting a derived class reference to a base class reference is an "implicitly convert" operation, not a special-kind-of conversion.
May 11 2021
On Wednesday, 12 May 2021 at 00:22:56 UTC, Walter Bright wrote:
> On 5/11/2021 12:56 PM, deadalnix wrote:
>> The fact is that you can't get rid of 1. and support OOP, because polymorphism is a key ingredient of OOP.
>
> Converting a derived class reference to a base class reference is an "implicitly convert" operation, not a special-kind-of conversion.

That is trivially demonstrably false. Consider:

    class A {}
    class B : A {}

B function() implicitly converts to A function()

But byte function() doesn't implicitly convert to int function()

Clearly, the implicit conversion from byte to int is of a different nature than the one from B to A, and one doesn't have to dig very deep to find these differences.

Now, mind you, this is not a problem. At all. After all, B is a subtype of A, while byte is not a subtype of int. There are different kinds of implicit conversions. This is perfectly sound and required if D wants to have implicit conversion of things which aren't subtypes of each other. There are no ways around it.

Let's just not pretend it's the same, because it is from these erroneous assumptions that bad design grows.
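The asymmetry claimed in this post can be written down as compile-time checks. This is a sketch that takes the post's statements at face value; `makeB` and `makeByte` are hypothetical helper functions introduced only to obtain function pointers of the two types.

```d
class A {}
class B : A {}

// Hypothetical helpers, declared at module level so that
// &makeB and &makeByte are plain function pointers.
B makeB() { return new B; }
byte makeByte() { return 1; }

void main()
{
    // Both value conversions are implicit.
    static assert(is(B : A));
    static assert(is(byte : int));

    // Per the post, only the class conversion carries over
    // to function pointer return types.
    A function() fa = &makeB;
    assert(fa() !is null);

    static assert(!__traits(compiles, { int function() fi = &makeByte; }));
}
```

A `B` reference can be used unchanged as an `A` reference, so the callee's result needs no adjustment; a `byte` result would have to be widened, which a function pointer conversion cannot do on the caller's behalf.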
May 11 2021
On Monday, 10 May 2021 at 21:55:54 UTC, deadalnix wrote:
> If you think that invalidates the LSP, I'm afraid there is a big misunderstanding about the LSP. Not all operations on a subtype have to return said subtype. It is made clearer if you consider the slicing operation as a member function on an object instead, as it seems classes and inheritance are the only way OOP is understood these days.
>
>     class A { A slice(int start, int end) { ... } }
>     class B : A {}
>
> Where is it implied that B's version of the slice operation must return an A? Nowhere, the LSP absolutely doesn't mandate that. It mandates that you can pass a B to something that expects an A, and that thing will behave the way you'd expect.
>
> And it does! If your code needs an A, then you mark it as accepting an A as input. If I have a B and want to pass it to your code, I can too, transparently. You do not even need to know about the existence of B when you write your code. This is what the LSP is at its core.
>
> Back to our string example, the code should accept string (A), with zero knowledge of the existence of any enum string (B). You should be able to pass a B to that code and have everything work as expected.

I concede the points that enum strings do not violate the LSP, and that they are subtypes of string. You're right, and I was wrong.

The point I should have made is that, at least in D, the LSP is not universal. There are situations where it simply does not apply. In particular, it does not guarantee that a substitution which changes the arguments used to instantiate a template will succeed; e.g.,

    class A { int x; }
    class B : A { int y; }

    void example(T)(T obj)
    {
        static assert(!__traits(hasMember, T, "y"));
    }

`example(new A)` will compile, but `example(new B)` will not--because they are not actually calling the same function. One calls `example!A` and the other calls `example!B`.

This is an unavoidable consequence of the expressive power of D's templates: without specific knowledge about `example`'s implementation, we cannot guarantee anything about the relationship between `example!A` and `example!B`.

All of which is to say, the fact that you can pass a string as an argument to a template does not *necessarily* imply that you can pass an enum string as an argument to the same template. That `format` handles them differently does not "fly in the face of Liskov's substitution principle" [1], any more than my example above does.

[1] https://forum.dlang.org/post/fnibsejuozasspsggxie@forum.dlang.org
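A runnable variant of the template example makes the "different function" point observable at run time. This is a sketch: `seesY` is a hypothetical name, and instead of failing to compile it simply reports what its own instantiation can see.

```d
class A { int x; }
class B : A { int y; }

// Hypothetical variant of the example above: rather than a static assert,
// return whether the instantiating type T has a member named "y".
bool seesY(T)(T obj)
{
    return __traits(hasMember, T, "y");
}

void main()
{
    assert(!seesY(new A));     // instantiates seesY!A
    assert(seesY(new B));      // instantiates seesY!B, a different function
    assert(!seesY!A(new B));   // seesY!A again: the B is passed as an A
}
```

The third call is the one where substitution in the LSP sense actually happens: the function is fixed as `seesY!A`, only the value changes, and the result matches the first call.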
May 12 2021
On Wednesday, 12 May 2021 at 22:00:57 UTC, Paul Backus wrote:
> I concede the points that enum strings do not violate the LSP, and that they are subtypes of string. You're right, and I was wrong.

Thanks.

> The point I should have made is that, at least in D, the LSP is not universal. There are situations where it simply does not apply. In particular, it does not guarantee that a substitution which changes the arguments used to instantiate a template will succeed; e.g., [...]
>
> All of which is to say, the fact that you can pass a string as an argument to a template does not *necessarily* imply that you can pass an enum string as an argument to the same template. That `format` handles them differently does not "fly in the face of Liskov's substitution principle" [1], any more than my example above does.
>
> [1] https://forum.dlang.org/post/fnibsejuozasspsggxie@forum.dlang.org

That is true, and there are definitely cases where it is unavoidable. However, I don't think format fits that bill, because format does expect a string, not any random type.

What I'm getting at is a bit complicated to express clearly, because types are effectively also "values" that you can pass around at compile time, but let me try.

We should reasonably expect the LSP to work when what is passed down is the value of the enum, but not when it is its type - which, in fact, isn't too surprising because the type itself isn't subject to the LSP. Consider:

    class A {}
    class B : A {}

    void foo(A a);
    // We should expect the LSP to hold true here, because the value
    // is the only argument passed down to foo.

    void bar(T)(T t);
    // There is no expectation that bar(new A) and bar(new B) behave
    // consistently, because not only the value is passed down, but
    // also the type. While we expect passing down the value to respect
    // the LSP, no such expectation can exist for the type.

So in the second example, while we expect the runtime parameter `t` to conform to the LSP, we do not expect the compile time parameter `T` to do so.

However, if we do not change the value of `T` but pass a B down to `t`, then we should get back to a situation where the LSP is respected. For instance:

    bar!A(new B());
    // We expect this to be well behaved when it comes to the LSP, vs say
    // bar(new A()), because the only change happened to the value
    // parameter, which is supposed to uphold the LSP.

So far, so good, I don't think this is too controversial, even though it is confusing to express that concept clearly.

Now, with enum strings, there is an interesting twist, because they can be passed at compile time too. In theory, that should not change anything when it comes to the LSP, but in practice, it seems like it does, which is IMO where the root of the problem is. Consider:

    string format(string S, A...)(A args);

While S is a compile time parameter, it is not a type parameter, but a value parameter. In that case, it is expected as per the LSP that I can pass down a string, or any subtype of string, as the first compile time parameter of format, and this ought to work as expected.
May 12 2021
On Wednesday, 12 May 2021 at 23:08:24 UTC, deadalnix wrote:
> Now, with enum strings, there is an interesting twist, because they can be passed at compile time too. In theory, that should not change anything when it comes to the LSP, but in practice, it seems like it does, which is IMO where the root of the problem is. Consider:
>
>     string format(string S, A...)(A args);
>
> While S is a compile time parameter, it is not a type parameter, but a value parameter. In that case, it is expected as per the LSP that I can pass down a string, or any subtype of string, as the first compile time parameter of format, and this ought to work as expected.

This *does* work as expected:

https://run.dlang.io/is/Ru9phk

The issue with `format` is that it takes an alias parameter, not a value parameter--and the reason it does *that* is to support string, wstring, and dstring with a single overload.
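The difference between the two kinds of compile-time parameter can be sketched as follows. `lengthOf` and `looksLikeString` are hypothetical names, not Phobos functions; the second only mirrors the shape of the `isSomeString!(typeof(fmt))` constraint mentioned later in the thread.

```d
import std.traits : isSomeString;

enum Fmt : string { greeting = "hello" }

// Value parameter: the argument is converted to string on the way in.
size_t lengthOf(string s)()
{
    return s.length;
}

// Alias parameter: the argument arrives with its own type intact,
// so a type-based check sees the enum rather than string.
bool looksLikeString(alias fmt)()
{
    return isSomeString!(typeof(fmt));
}

void main()
{
    static assert(lengthOf!"hello"() == 5);
    static assert(lengthOf!(Fmt.greeting)() == 5);

    static assert(looksLikeString!"hello"());
    static assert(!looksLikeString!(Fmt.greeting)());
}
```

With the value parameter the enum has already become a `string` by the time the template body sees it; with the alias parameter the template sees `Fmt`, and `isSomeString` reports false for enum types.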
May 12 2021
On Wednesday, 12 May 2021 at 23:31:21 UTC, Paul Backus wrote:
> This *does* work as expected:
>
> https://run.dlang.io/is/Ru9phk
>
> The issue with `format` is that it takes an alias parameter, not a value parameter--and the reason it does *that* is to support string, wstring, and dstring with a single overload.

Yes, so we are getting at the root of this.

I know these things work, this is why I stated that SomeEnumString is a subtype of string to begin with: it has all the properties. If that wasn't working, then I would have been mistaken when making such assertions.

It is working in the simple case, it is expected to work from the caller's standpoint due to the LSP, but it doesn't work in practice due to some obscure implementation detail that is of little concern to the user. Pushing this on the user is not the way to go. If the library writer desires to bundle string/dstring/wstring in the same implementation, this doesn't change the fact that it ought to work with subtypes. Choosing to break this is what "flies in the face of the LSP".

I would also like to see people think about what makes respecting the LSP challenging in such a case, and see what can be done at a systemic level. It's kind of a bummer that the path of least resistance is to break the LSP when going for more genericity in another dimension.
May 12 2021
On Wednesday, 12 May 2021 at 23:42:11 UTC, deadalnix wrote:
> It is working in the simple case, it is expected to work from the caller's standpoint due to the LSP, but it doesn't work in practice due to some obscure implementation detail that is of little concern to the user. Pushing this on the user is not the way to go. If the library writer desires to bundle string/dstring/wstring in the same implementation, this doesn't change the fact that it ought to work with subtypes. Choosing to break this is what "flies in the face of the LSP".

Well, no, it doesn't--because, again, the LSP doesn't apply here in the first place, and never has.

Flies in the face of user expectations, perhaps--though even then, if the user looks at the documentation and sees `isSomeString!(typeof(fmt))`, is it really reasonable for them to expect that a non-string type will be accepted?

I think it's a reasonable API design decision to support any type that implicitly converts to a string type, but it's not the *only* reasonable decision, and we ought to acknowledge the costs as well as the benefits. Personally, my inclination is to err on the side of making the standard library a little more complex so that user code can be simpler, but Andrei makes a convincing argument that this tendency has gotten us into trouble before [1]. How do we decide where to draw the line? There has to be some principle here beyond just "users expect it" and "respect the LSP."

> I would also like to see people think about what makes respecting the LSP challenging in such a case, and see what can be done at a systemic level. It's kind of a bummer that the path of least resistance is to break the LSP when going for more genericity in another dimension.

IMO this is all downstream of D's choice to use untyped templates as opposed to typed generics (a tradeoff that goes all the way back to Lisp vs. ML). It's a fun thought experiment to imagine a version of D that took the other path, but there's not much we can do about it now.

[1] https://forum.dlang.org/thread/q6plhj$1l9$1@digitalmars.com
May 12 2021
On Thursday, 13 May 2021 at 01:03:19 UTC, Paul Backus wrote:
> On Wednesday, 12 May 2021 at 23:42:11 UTC, deadalnix wrote:
>> It is working in the simple case, it is expected to work from the caller's standpoint due to the LSP, but it doesn't work in practice due to some obscure implementation detail that is of little concern to the user. Pushing this on the user is not the way to go. If the library writer desires to bundle string/dstring/wstring in the same implementation, this doesn't change the fact that it ought to work with subtypes. Choosing to break this is what "flies in the face of the LSP".
>
> Well, no, it doesn't--because, again, the LSP doesn't apply here in the first place, and never has.
>
> Flies in the face of user expectations, perhaps--though even then, if the user looks at the documentation and sees `isSomeString!(typeof(fmt))`, is it really reasonable for them to expect that a non-string type will be accepted?
>
> I think it's a reasonable API design decision to support any type that implicitly converts to a string type, but it's not the *only* reasonable decision, and we ought to acknowledge the costs as well as the benefits. Personally, my inclination is to err on the side of making the standard library a little more complex so that user code can be simpler, but Andrei makes a convincing argument that this tendency has gotten us into trouble before [1]. How do we decide where to draw the line? There has to be some principle here beyond just "users expect it" and "respect the LSP."

While what you say is correct, I'm not convinced it is right.

We established before that effectively, we should expect the LSP to hold when values are passed down, but not when types are. Which I think we both agree is the reasonable thing to do here, because B being a subtype of A doesn't say anything about meta_typeof(B) being a subtype of meta_typeof(A), and therefore there is no expectation that the LSP holds.

So it is correct to assert that if format takes the type as a parameter, then there is no expectation that the LSP holds. It is also correct to say that the documentation describes things accurately.

But I strongly disagree with the fact that it is right. To use an analogy, I could make a car where the gas and brake pedals are swapped, and explain as much in the user manual, yet I fully expect people would crash such cars at a higher rate than the alternative.

In the case of format, we need to ask ourselves what the user expects: to pass a value down, or to pass a type (plus possibly a value) down? Because if it is the first, then it is reasonable from the user's standpoint to expect that the LSP works, and if it is the second, then there isn't such an expectation.

The fact that we see people trying to do format!SomeEnumString, but not something like format!42, provides a good answer to that question. Format's parameter is expected to be a string, not any random type. And if that is the case, then it is reasonable to expect the LSP to hold.

Now, the matter of cost is an interesting one. But I argue that doing what the user expects ought to be cheap, if not the cheapest option available. This is simply the difference between a language that helps its users and a language that gets in the way. So if the cost is high, then we need to consider this high cost a serious problem to solve.
May 13 2021
On 5/12/21 6:00 PM, Paul Backus wrote:
> On Monday, 10 May 2021 at 21:55:54 UTC, deadalnix wrote:
>> If you think that invalidates the LSP, I'm afraid there is a big misunderstanding about the LSP. Not all operations on a subtype have to return said subtype. It is made clearer if you consider the slicing operation as a member function on an object instead, as it seems classes and inheritance are the only way OOP is understood these days.
>>
>>     class A { A slice(int start, int end) { ... } }
>>     class B : A {}
>>
>> Where is it implied that B's version of the slice operation must return an A? Nowhere, the LSP absolutely doesn't mandate that. It mandates that you can pass a B to something that expects an A, and that thing will behave the way you'd expect.
>>
>> And it does! If your code needs an A, then you mark it as accepting an A as input. If I have a B and want to pass it to your code, I can too, transparently. You do not even need to know about the existence of B when you write your code. This is what the LSP is at its core.
>>
>> Back to our string example, the code should accept string (A), with zero knowledge of the existence of any enum string (B). You should be able to pass a B to that code and have everything work as expected.
>
> I concede the points that enum strings do not violate the LSP, and that they are subtypes of string. You're right, and I was wrong.

I was all over run.dlang.org like "Sure that's not going to work... wait a second, it does! But that other thing's not going to work... what, that works too!" I didn't know D's enums are _that_ odd.

It seems you can do almost everything with an enum that you can do with its base type. Keyword being "almost". For example,

    x ~= "asd";

works whether x is a string or an enum based on string. However,

    x = x ~ "asd";

works if x is a string and does not work if x is an enum derived from string. Therefore, a function using that expression works for strings but not for enum strings.

Similarly:

    x += 3;

works for int and enums derived from int. However,

    x = x + 3;

does not. So you can't transparently substitute enums for their base type. I suspect there'd be other cases, too.
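The four expressions in this post can be probed with `__traits(compiles, ...)`. This is a sketch with hypothetical enum names `Name` and `Count`. The two rebinding forms are asserted, since their results have the base type; the two op-assign forms are only printed, because the thread itself calls that behavior a bug and it may differ between compiler versions.

```d
import std.stdio : writeln;

enum Name : string { a = "a" }
enum Count : int { zero = 0 }

void main()
{
    Name s = Name.a;
    Count n = Count.zero;

    // The rebinding forms produce the base type, so assigning the
    // result back to the enum variable is rejected.
    static assert(!__traits(compiles, s = s ~ "asd"));
    static assert(!__traits(compiles, n = n + 3));

    // The op-assign forms are the ones reported to compile in the post.
    // Printed rather than asserted: this may vary by compiler version.
    writeln(`s ~= "asd" compiles: `, __traits(compiles, s ~= "asd"));
    writeln("n += 3 compiles: ", __traits(compiles, n += 3));
}
```

Any generic code that picks one spelling over the other therefore behaves differently for an enum than for its base type, which is the substitution failure the post points at.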
May 12 2021
On Thursday, 13 May 2021 at 01:16:42 UTC, Andrei Alexandrescu wrote:
> It seems you can do almost everything with an enum that you can do with its base type. Keyword being "almost". For example,
>
>     x ~= "asd";
>
> works whether x is a string or an enum based on string. However,
>
>     x = x ~ "asd";
>
> works if x is a string and does not work if x is an enum derived from string. Therefore, a function using that expression works for strings but not for enum strings.

A template function, you mean? Because (as the rest of the post you quoted demonstrates) the LSP does not and has never applied (in D) to substitutions that involve different instantiations of the same template.

If you explicitly instantiate `func!string`, then it will work exactly as the LSP dictates, but if you substitute `func!string(x)` with `func!E(x)`, you have no guarantee.

Granted, the fact that `x ~= "asd"` works and `x = x ~ "asd"` doesn't is definitely a bug.
May 12 2021
On 5/12/21 9:41 PM, Paul Backus wrote:
> On Thursday, 13 May 2021 at 01:16:42 UTC, Andrei Alexandrescu wrote:
>> It seems you can do almost everything with an enum that you can do with its base type. Keyword being "almost". For example,
>>
>>     x ~= "asd";
>>
>> works whether x is a string or an enum based on string. However,
>>
>>     x = x ~ "asd";
>>
>> works if x is a string and does not work if x is an enum derived from string. Therefore, a function using that expression works for strings but not for enum strings.
>
> A template function, you mean? Because (as the rest of the post you quoted demonstrates) the LSP does not and has never applied (in D) to substitutions that involve different instantiations of the same template.
>
> If you explicitly instantiate `func!string`, then it will work exactly as the LSP dictates, but if you substitute `func!string(x)` with `func!E(x)`, you have no guarantee.
>
> Granted, the fact that `x ~= "asd"` works and `x = x ~ "asd"` doesn't is definitely a bug.

Well the problem is that the choice of covariance of results for operations on enums vs their "base" is quite arbitrary. For strings, the result of "~" is not covariant but the result of "~=" is - not only does it work, but it returns a reference to the enum type, not the base type.

However, for enums derived from integrals the result of "+" is not covariant when adding an enum with an integral, but covariant when two enums are added together. Same goes for "-", "/", "*", but oddly not for "^^". I suspect nobody thought of trying to raise an enum to the power of an enum.

The plot thickens when considering enums derived from user-defined types:

    void main() {
        import std;
        struct S {
            void fun() { writefln("%s", &this); }
            int min = -1;
        }
        enum X : S { x = S() }
        X x;
        x.fun;
        (cast(S*) &x).fun;
        writeln(x.min);
    }

The two addresses are the same, meaning the enum value gets to call the base member's function, in a subtyping manner. However, the last line doesn't compile, which breaks subtyping.

On the face of it, enums are defined by the language, so whatever choices are made are... there. I understand the practicality of some choices, but overall the entire enum algebra is quirky and difficult to maneuver around in generic code.

Which harkens back to the opener of this thread - Phobos should not go out of its way to support enumerated types everywhere, when a trivial recourse exists on the caller side - pass value.representation instead of value.

A much stronger argument could be made against supporting convertibility (to e.g. strings or ranges) by means of alias this. Callers should convert to the needed type prior to calling into the standard library.
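The caller-side recourse argued for in this post can look like the following. This is a sketch of one way to convert before calling into Phobos, not necessarily the exact conversion the post has in mind; `Fmt` is a hypothetical enum, while `format` and `OriginalType` are the real `std.format` and `std.traits` names.

```d
import std.format : format;
import std.traits : OriginalType;

enum Fmt : string { pair = "%s=%s" }

void main()
{
    // OriginalType names the enum's base type.
    static assert(is(OriginalType!Fmt == string));

    // Run-time format string: convert at the call site.
    string f = Fmt.pair;
    assert(format(f, "a", 1) == "a=1");

    // Compile-time format string: bind the value to a string-typed
    // manifest constant first, then pass that to the template.
    enum string ct = Fmt.pair;
    assert(format!ct("a", 1) == "a=1");
}
```

In both cases the library only ever sees a `string`, so it needs no enum-specific handling; the cost is one extra declaration at each call site.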
May 12 2021
On Wednesday, May 12, 2021 7:16:42 PM MDT Andrei Alexandrescu via Digitalmars-d wrote:
> I was all over run.dlang.org like "Sure that's not going to work... wait a second, it does! But that other thing's not going to work... what, that works too!" I didn't know D's enums are _that_ odd.
>
> It seems you can do almost everything with an enum that you can do with its base type. Keyword being "almost".

Yeah, if enums are supposed to only have a fixed set of values, then they're completely broken. The language does almost nothing to guarantee it. One result of that is that you have to be _very_ careful about how you use something like final switch - especially since it's not checked with -release.

Of course, if enums are just named values without caring about whether it's possible to have an enum with a different value than the ones listed, then the fact that the enum is even treated differently from the base type causes other problems.

So, ultimately, I think that D enums are pretty schizophrenic and not particularly well-designed. I've argued in the past that the language should disallow all operations on enums (aside from casts) which aren't guaranteed to result in a valid value for that enum type, but not everyone agrees with that stance.

- Jonathan M Davis
May 12 2021
On Wednesday, May 12, 2021 8:13:04 PM MDT Jonathan M Davis via Digitalmars-d wrote:I've argued in the past that the language should disallow all operations on enums (aside from casts) which aren't guaranteed to result in a valid value for that enum type, but not everyone agrees with that stance.Or more accurately, all operations on an enum which are not guaranteed to result in a valid enum value should result in the base type (and thus not be assignable to a variable of that enum type without a cast), and operations which mutate the enum should not be allowed unless they're guaranteed to result in a valid enum value. But regardless, the point is that ideally, unless a cast is used, it should be impossible to have something typed as an enum without it being guaranteed that the value be one of the enumerated values for that enum type. But that's definitely not how D enums work... - Jonathan M Davis
May 12 2021
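The point that a variable typed as an enum need not hold one of the enumerated values can be sketched as follows. `Color` and `isListed` are hypothetical names; `EnumMembers` is the existing helper from `std.traits`. This is only an illustration of the concern, not a proposed fix.

```d
import std.traits : EnumMembers;

enum Color { red, green, blue }   // hypothetical enum: values 0, 1, 2

/// Returns true when `v` equals one of the declared members of `E`.
bool isListed(E)(E v) if (is(E == enum))
{
    foreach (m; EnumMembers!E)    // unrolled at compile time over the members
        if (v == m)
            return true;
    return false;
}

void main()
{
    assert(isListed(Color.green));

    // With a cast, any value of the base type can be stored.
    auto fromCast = cast(Color) 42;
    assert(!isListed(fromCast));

    // Without a cast: an operation on two values of the same enum type
    // is still typed as the enum, even though 1 | 2 == 3 is not a member.
    Color mixed = Color.green | Color.blue;
    assert(!isListed(mixed));
}
```

A `final switch` over `Color` has no case for either `fromCast` or `mixed`, which is why the post above advises care when relying on it.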
On Thursday, 13 May 2021 at 02:43:41 UTC, Jonathan M Davis wrote:On Wednesday, May 12, 2021 8:13:04 PM MDT Jonathan M Davis via Digitalmars-d wrote: Or more accurately, all operations on an enum which are not guaranteed to result in a valid enum value should result in the base type (and thus not be assignable to a variable of that enum type without a cast), and operations which mutate the enum should not be allowed unless they're guaranteed to result in a valid enum value. But regardless, the point is that ideally, unless a cast is used, it should be impossible to have something typed as an enum without it being guaranteed that the value be one of the enumerated values for that enum type. But that's definitely not how D enums work... - Jonathan M DavisSo basically enum should implicitly be declared to be immutable right?
May 12 2021
On Thursday, 13 May 2021 at 02:43:41 UTC, Jonathan M Davis wrote:On Wednesday, May 12, 2021 8:13:04 PM MDT Jonathan M Davis via Digitalmars-d wrote:YES!I've argued in the past that the language should disallow all operations on enums (aside from casts) which aren't guaranteed to result in a valid value for that enum type, but not everyone agrees with that stance.Or more accurately, all operations on an enum which are not guaranteed to result in a valid enum value should result in the base type (and thus not be assignable to a variable of that enum type without a cast), and operations which mutate the enum should not be allowed unless they're guaranteed to result in a valid enum value. But regardless, the point is that ideally, unless a cast is used, it should be impossible to have something typed as an enum without it being guaranteed that the value be one of the enumerated values for that enum type. But that's definitely not how D enums work... - Jonathan M Davis
May 13 2021
On Monday, 10 May 2021 at 12:19:07 UTC, deadalnix wrote:On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:More to the point, consider this: class String { private: immutable(char)[] value; public: this(immutable(char)[] value) { this.value = value; } // ... } class EnumString : String { public: static EnumString value1() { return new EnumString("value1"); } static EnumString value2() { return new EnumString("value2"); } private: this(immutable(char)[] value) { super(value); } } While the implementation differs, conceptually, from a the theory standpoint, this is the same. This is using a subtype to constrain instance of type (String here) to a certain et of possible values. When using the subtype (EnumString) you have the knowledge that it is limited to some value, and you lose that knowledge as soon as you convert to the parent type. But instead, we gets some bastardised monster from the compiler, that's not quit a subtype, but that's not quite something else that really make sens either. As expected, this nonsense ends up spilling into user code, and then the standard lib, based on user constraints, and everybody is left choosing between bad tradeof down the road because the whole house of cards is built on shaky foundations. The bad news is, there is already a language like this. It's called C++, and it's actually quite successful. With all due respect to you and Walter, you guys are legends, but I think there is also a bit of learned helplessness coming from both of you due to a lifetime of exposure to the soul corroding effects of C++. This attitudes pervades everything, and most language constructs suffer of some form of it in one way or another, causing a cascade of bad side effects, starting with this whole thread. A few examples en vrac for instance: DIP1000, delegate context qualifiers, functions vs first class functions, etc... Back to the case of enum, it is obviously and trivially a subtype. 
In fact, even the syntax is the same: enum Foo: string { ... } Handling enum strings should never have been a special that was added to phobos, because it should never have been a special to begin with, in phobos or elsewhere.So you have a range r of type T. You call r.popFront(). Obvioulsly the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan? No, this is not subtyping 101.If you have a range of T, then you got to return a T. I'm not sure what's the problem is here. Do you have a concrete example? All I can think of are things like slicing and alike, and they should obviously return a string, not a T.
May 10 2021
On Monday, 10 May 2021 at 21:44:02 UTC, deadalnix wrote:The bad news is, there is already a language like this. It's called C++, and it's actually quite successful. With all due respect to you and Walter, you guys are legends, but I think there is also a bit of learned helplessness coming from both of you due to a lifetime of exposure to the soul corroding effects of C++.Not sure how this applies to C++, what subtyping issues are you having with C++?This attitudes pervades everything, and most language constructs suffer of some form of it in one way or another, causing a cascade of bad side effects, starting with this whole thread. A few examples en vrac for instance: DIP1000, delegate context qualifiers, functions vs first class functions, etc...That's a direct result of the process. Features have always been added as an experiment rather than being completed on paper, even the ones with a DIP. At this point, this pretty much defines what D is... Just look at the addition of a C compiler that is being advanced right now. It is being added because there might be some benefits from it the future, perhaps. Of course, you also have the side effect that the AST becomes more resistant to change... and refactoring costs doubles... So that is why D has these issues. People wanted something, and it was added in an experimental way, not in an analytical way. That is the way of D. Experiment in features. Ideally D should have boosted meta programming and cut down on features to the bare minimum. Literals should have been a compile time type... and alias should bind to them, strings should've been a library construct, etc etc. But if you look at the features being added, meta programming is not in focus. So this won't change. Features are being added that has nothing to do with metaprogramming (memory safety, C interop etc). D will continue to evolve experimentally. So there will never be a small core language that is consistent. It is what it is, at this point.
May 10 2021
On Monday, 10 May 2021 at 22:37:51 UTC, Ola Fosheim Grøstad wrote:On Monday, 10 May 2021 at 21:44:02 UTC, deadalnix wrote:Function type don't have the right covariance/contravariance, you can slice subtypes, and there are more, but this is not my point. My point is that we already have a language that is a mixed bag of accidentally defined features that don't compose properly with each others. I don't need one more of these, I already have one, and, let's be frank, it has at the very least an order of magnitude more support in the wild, in tools and so on. Doing the same thing with less manpower is a futile exercise.The bad news is, there is already a language like this. It's called C++, and it's actually quite successful. With all due respect to you and Walter, you guys are legends, but I think there is also a bit of learned helplessness coming from both of you due to a lifetime of exposure to the soul corroding effects of C++.Not sure how this applies to C++, what subtyping issues are you having with C++?Sure, but look at this thread. D is crumbling under the weight, not of the number f feature, but of the fact that a large portion of them simply are unsound. At this point, the decision made is to push the madness on the user. Fair enough, but if the standard lib devs are not willing to put up with it, why in hell would you expect anyone else to? Just look at what's in the C++ standard lib or boost and compare to your average C++ project to see the kind of gap in term of motivation to put up with bullshit exists between standard lib devs and Joe coder. It's not even close. This stuff ain't working properly so let's just given getting to work at all is not how you iterate toward a great useful product.This attitudes pervades everything, and most language constructs suffer of some form of it in one way or another, causing a cascade of bad side effects, starting with this whole thread. 
A few examples en vrac for instance: DIP1000, delegate context qualifiers, functions vs first class functions, etc...That's a direct result of the process. Features have always been added as an experiment rather than being completed on paper, even the ones with a DIP. At this point, this pretty much defines what D is... Just look at the addition of a C compiler that is being advanced right now. It is being added because there might be some benefits from it the future, perhaps. Of course, you also have the side effect that the AST becomes more resistant to change... and refactoring costs doubles... So that is why D has these issues. People wanted something, and it was added in an experimental way, not in an analytical way. That is the way of D. Experiment in features.
May 10 2021
On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:My point is that we already have a language that is a mixed bag of accidentally defined features that don't compose properly with each others. I don't need one more of these, I already have one, and, let's be frank, it has at the very least an order of magnitude more support in the wild, in tools and so on.Yes, I think everyone can agree with this. A good starting point would to implement proper unification of as was discussed some months ago. This is critical for composing types in a sensible manner (composing templates of templates and binding them to a simple name that is exported). Then one can look and see if some types/features that are builtins can be expressed with the same building blocks in a unification process (somehow). When you see what cannot fit into this machinery you get a feeling for which features needs to be redesigned. Something like that.Doing the same thing with less manpower is a futile exercise.Yes.Sure, but look at this thread. D is crumbling under the weight, not of the number f feature, but of the fact that a large portion of them simply are unsound.Yes, but designing something that is sound is best done by having a tiny set of (theoretical) mechanisms that all other features can be expressed with (even though that might not be visible to the end user). It is very difficult to even discuss soundness with no constructive framework to represent ideas with.Just look at what's in the C++ standard lib or boost and compare to your average C++ project to see the kind of gap in term of motivation to put up with bullshit exists between standard lib devs and Joe coder. It's not even close.Yes, even stuff that is well designed in C++ is a lot of work. Implementing a new container library with all the iterators is quite verbose, tedious and typos will happen... 
I think defining protocols and making mechanisms available that can extend types with protocols is the way to go (concepts is one step in the right direction). How to do it? Not sure, but it seems like templating by itself is not enough really. E.g. if ranges-functionality should be available to everything that can be treated like a sequence, then this should be a protocol that is present in all the builtin types that are sequential. Or somehow bound to them in some global fashion (kinda like injected into the type). Nothing should be special cased. Ideally. But there is no clear model for how to do that, I think. However it is tied to unification. Deduce the protocol if possible.
May 10 2021
On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:At this point, the decision made is to push the madness on the user. Fair enough, but if the standard lib devs are not willing to put up with it, why in hell would you expect anyone else to? Just look at what's in the C++ standard lib or boost and compare to your average C++ project to see the kind of gap in term of motivation to put up with bullshit exists between standard lib devs and Joe coder. It's not even close. This stuff ain't working properly so let's just given getting to work at all is not how you iterate toward a great useful product.+1 We *must* focus more on consistency and soundness imo. I've heard several users talk about this. So it's nice to see it being talked about here. The way for D forward is to polish up D2. Maybe have 2.100.0 as a goal. Like any project, it needs milestones. We should take a pause, look around and see, we're now in the "optimizing" phase. We can at least try. Then after 2.100.0 for example we can start talking about new cool features again.
May 11 2021
On 5/10/21 6:58 PM, deadalnix wrote:In case you're referring to deprecating support for enum strings in phobos - definitely that's not pushing any madness anywhere. Adding said support was a mistake in the first place.Sure, but look at this thread. D is crumbling under the weight, not of the number f feature, but of the fact that a large portion of them simply are unsound. At this point, the decision made is to push the madness on the user. Fair enough, but if the standard lib devs are not willing to put up with it, why in hell would you expect anyone else to? Just look at what's in the C++ standard lib or boost and compare to your average C++ project to see the kind of gap in term of motivation to put up with bullshit exists between standard lib devs and Joe coder. It's not even close. This stuff ain't working properly so let's just given getting to work at all is not how you iterate toward a great useful product.
May 11 2021
On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote: Sure, but look at this thread. D is crumbling under the weight, not of the number of features, but of the fact that a large portion of them simply are unsound. At this point, the decision made is to push the madness on the user. Fair enough, but if the standard lib devs are not willing to put up with it, why in hell would you expect anyone else to? Just look at what's in the C++ standard lib or boost and compare to your average C++ project to see the kind of gap in terms of motivation to put up with bullshit that exists between standard lib devs and Joe coder. It's not even close. "This stuff ain't working properly, so let's just give up on getting it to work at all" is not how you iterate toward a great useful product. Well, this thread is 11 pages long and shows no sign of winding down. In the meantime, has anyone looked at the code that sparked this outrage? [As I mentioned in the PR](https://github.com/dlang/phobos/pull/8029#issuecomment-834221552), the issue wouldn't have happened if the `fmt` template parameter was a `string` and not an `alias`. Q: Why is fmt an alias and not a simple string? A: No real reason. The way I see it, the issue is valid, the fix wasn't. The `format` API should have accepted a `string` and let the compiler perform any allowed implicit conversion, instead of taking exactly the type via `alias`. I wish our most competent contributors would find it more interesting to direct their attention to Github or promote the language to their large Twitter following over engaging in a flamewar.
May 11 2021
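The difference between the two kinds of template parameter mentioned in the post above can be sketched like this. `viaAlias` and `viaString` are hypothetical stand-ins and are not the actual Phobos `format` declaration; `Fmt` is a hypothetical enum. The sketch assumes only the language rule that a template value argument is implicitly converted to the declared parameter type.

```d
import std.traits : isSomeString;

enum Fmt : string { greeting = "hello %s" }   // hypothetical enum string

// Hypothetical stand-in for a template taking its argument "as is".
template viaAlias(alias fmt)
{
    alias Type = typeof(fmt);
}

// Hypothetical stand-in for a template declaring a string value parameter.
template viaString(string fmt)
{
    alias Type = typeof(fmt);
}

// The alias parameter sees the exact enum type...
static assert(is(viaAlias!(Fmt.greeting).Type == Fmt));
// ...while the value parameter sees the converted base type.
static assert(is(viaString!(Fmt.greeting).Type == string));

// Inside the string version, checks written for strings apply directly.
static assert(isSomeString!(viaString!(Fmt.greeting).Type));

void main() {}
```

With the `string` parameter the conversion happens once, at the template boundary, so the body never has to special-case enum types.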
On Wednesday, 12 May 2021 at 05:25:59 UTC, Mathias LANG wrote:On Monday, 10 May 2021 at 22:58:41 UTC, deadalnix wrote:I support bringing these types of discussions to github (not Reddit/Twitter) instead where people can respond to a comment directly, or through thumbs up or down and at least edit their comments rather than piling on emails sequentially. Or a different type of discussion platform entirely. (That said I am with Adam Ruppe's take on this matter)...Well, this thread is 11 pages and show no sign of winding down. In the meantime, has anyone looked at the code that sparked this outrage ? ... I wish our most competent contributors would find it more interesting to direct their attention to Github or promote the language to their large Twitter following over engaging in flamewar.
May 11 2021
On Wednesday, 12 May 2021 at 05:25:59 UTC, Mathias LANG wrote:Well, this thread is 11 pages and show no sign of winding down. In the meantime, has anyone looked at the code that sparked this outrage ? [As I mentioned in the PR](https://github.com/dlang/phobos/pull/8029#issuecomment-834221552), the issue wouldn't have happened if the `fmt` template parameter was a `string` and not an `alias`.If formats expects a string, then it is indeed the right thing to accept a string :) But that discussion goes further than this, and is necessary, IMO.Q: why is fmt an alias and not a simple string ? A: No real reason.The way I see it, the issue is valid, the fix wasn't. `format` API should have accepted a `string` and let the compiler perform any allowed implicit conversion, instead of taking exactly the type via `alias`. I wish our most competent contributors would find it more interesting to direct their attention to Github or promote the language to their large Twitter following over engaging in flamewar.
May 12 2021
On 5/10/21 5:44 PM, deadalnix wrote:On Monday, 10 May 2021 at 12:19:07 UTC, deadalnix wrote:No it isn't. EnumString and String are reference types. A reference to an enum value does not convert to a reference to its representation. Very very very VERY different.On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:More to the point, consider this: class String { private: immutable(char)[] value; public: this(immutable(char)[] value) { this.value = value; } // ... } class EnumString : String { public: static EnumString value1() { return new EnumString("value1"); } static EnumString value2() { return new EnumString("value2"); } private: this(immutable(char)[] value) { super(value); } } While the implementation differs, conceptually, from a the theory standpoint, this is the same.So you have a range r of type T. You call r.popFront(). Obvioulsly the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan? No, this is not subtyping 101.If you have a range of T, then you got to return a T. I'm not sure what's the problem is here. Do you have a concrete example? All I can think of are things like slicing and alike, and they should obviously return a string, not a T.This is using a subtype to constrain instance of type (String here) to a certain et of possible values. When using the subtype (EnumString) you have the knowledge that it is limited to some value, and you lose that knowledge as soon as you convert to the parent type.One question that you keep not answering (Paul and I both asked it) is how you'd implement the range primitive popFront.But instead, we gets some bastardised monster from the compiler, that's not quit a subtype, but that's not quite something else that really make sens either. 
As expected, this nonsense ends up spilling into user code, and then the standard lib, based on user constraints, and everybody is left choosing between bad tradeof down the road because the whole house of cards is built on shaky foundations. The bad news is, there is already a language like this. It's called C++, and it's actually quite successful. With all due respect to you and Walter, you guys are legends, but I think there is also a bit of learned helplessness coming from both of you due to a lifetime of exposure to the soul corroding effects of C++. This attitudes pervades everything, and most language constructs suffer of some form of it in one way or another, causing a cascade of bad side effects, starting with this whole thread. A few examples en vrac for instance: DIP1000, delegate context qualifiers, functions vs first class functions, etc...I very much agree Walter and I have brought C++ bias into D, sometimes in a detrimental way.Back to the case of enum, it is obviously and trivially a subtype.No it isn't. How many times do I need to explain that?In fact, even the syntax is the same: enum Foo: string { ... }It doesn't matter. It's not a subtype.Handling enum strings should never have been a special that was added to phobos, because it should never have been a special to begin with, in phobos or elsewhere.Clearly enums have their own oddities, most inherited from C++. Perhaps we should do what C++ did, add a new "enum class" construct that fixes its issues. But I don't know of a perfect design, and I very much would love to see one.
May 11 2021
On Tuesday, 11 May 2021 at 13:50:46 UTC, Andrei Alexandrescu wrote: No it isn't. EnumString and String are reference types. A reference to an enum value does not convert to a reference to its representation. Very very very VERY different. Here we hit the core of the problem. A reference to a type B that is a subtype of A is not a subtype of ref A. Or, in simpler terms, B being a subtype of A doesn't imply that ref B is a subtype of ref A. This means that you can pass a B where an A is expected, but not a ref B where a ref A is expected. You'll note that the example I provided with classes for understanding will also demonstrate the same behavior: class A { ... } class B : A { ... } void foo(ref A a) { ... } B b = ...; foo(b); // This must be an error because, while B is a subtype of A, ref B is not a subtype of ref A. That means that you shouldn't be able to pass SomeEnumString to any function, in phobos or elsewhere, that will mutate it, such as popFront. But you should be able to do so, transparently, to any function that won't. That includes all compile time parameters.
May 11 2021
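The class example in the post above can be made concrete with a short sketch. `takesValue` and `takesRef` are hypothetical names. The sketch shows why the compiler rejects passing a `B` variable where a `ref A` is expected: the callee may rebind the reference to an object that is not a `B`.

```d
class A {}
class B : A {}

void takesValue(A a) {}              // receives a copy of the reference
void takesRef(ref A a) { a = new A; } // may rebind the caller's variable

// A B can be passed where an A is expected...
static assert( __traits(compiles, { B b = new B; takesValue(b); }));
// ...but a B variable cannot be passed where a ref A is expected.
static assert(!__traits(compiles, { B b = new B; takesRef(b); }));

void main()
{
    A a = new B;
    takesRef(a);                      // fine: the variable is typed A
    // The variable now refers to a plain A. Had it been typed B,
    // it would be pointing at an object that is not a B.
    assert(typeid(a) == typeid(A));
}
```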
On 5/10/21 8:19 AM, deadalnix wrote: On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote: There's no return. The range is being mutated. So you have a range r of type T. You call r.popFront(). Obviously the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan? No, this is not subtyping 101. If you have a range of T, then you've got to return a T. I'm not sure what the problem is here. Do you have a concrete example? Of course. A range must implement popFront with the signature: void popFront(ref SomeEnumString s) { ... please fill in the implementation ... }
May 11 2021
On Tuesday, 11 May 2021 at 12:05:18 UTC, Andrei Alexandrescu wrote: That must be a type error; this is a feature, not a bug. This is not expected to work. I'm not sure what the problem is here. Do you have a concrete example? Of course. A range must implement popFront with the signature: void popFront(ref SomeEnumString s) { ... please fill in the implementation ... }
May 11 2021
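The same by-value versus by-`ref` distinction applies to enum strings, which is what the `popFront` exchange above turns on. In this sketch `len` and `dropFirst` are hypothetical helpers; `dropFirst` stands in for a mutating range primitive and is not the Phobos `popFront`.

```d
enum SomeEnumString : string { hello = "hello" }

size_t len(string s) { return s.length; }         // reads only
void dropFirst(ref string s) { s = s[1 .. $]; }   // mutates the caller's variable

// By value, the enum converts implicitly to its base type.
static assert( __traits(compiles, len(SomeEnumString.hello)));
// By ref, it does not: the callee could store a non-member value.
static assert(!__traits(compiles, {
    SomeEnumString e = SomeEnumString.hello;
    dropFirst(e);
}));

void main()
{
    assert(len(SomeEnumString.hello) == 5);

    // Caller-side conversion: copy into the base type, then mutate the copy.
    string s = SomeEnumString.hello;
    dropFirst(s);
    assert(s == "ello");
}
```

Both sides of the exchange agree on this observable behavior; the disagreement is over whether it means enum strings should count as strings, as ranges, or as neither.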
On 5/11/21 8:14 AM, deadalnix wrote: On Tuesday, 11 May 2021 at 12:05:18 UTC, Andrei Alexandrescu wrote: Then enum strings are not ranges, correct? That must be a type error; this is a feature, not a bug. This is not expected to work. I'm not sure what the problem is here. Do you have a concrete example? Of course. A range must implement popFront with the signature: void popFront(ref SomeEnumString s) { ... please fill in the implementation ... }
May 11 2021
On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote: Then enum strings are not ranges, correct? They are not. But they are strings. Which implies that strings aren't ranges, which is right: `ref string`s are ranges, not strings.
May 11 2021
On 5/11/21 10:34 AM, deadalnix wrote: On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote: `ref string` is not a type. Then enum strings are not ranges, correct? They are not. But they are strings. Which implies that strings aren't ranges, which is right: `ref string`s are ranges, not strings.
May 11 2021
On Tuesday, 11 May 2021 at 15:19:05 UTC, Andrei Alexandrescu wrote: On 5/11/21 10:34 AM, deadalnix wrote: This is just denial. There are many examples of conversions that differ between string and ref string which do not involve enums. For instance, immutable(string) -> string is a valid conversion, but immutable(string) -> ref string isn't. Call it something else than a type if you want; nevertheless, the conversion rules are simply different, even if you abstract the notion of rvalue/lvalue from the whole thing, so it is clearly more than just a regular storage class. When you say ref, you say "I do not want a subtype". Saying B isn't a subtype of A because I can't pass a B to what expects a ref A is just fallacious. On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote: `ref string` is not a type. Then enum strings are not ranges, correct? They are not. But they are strings. Which implies that strings aren't ranges, which is right: `ref string`s are ranges, not strings.
May 11 2021
On 5/11/21 12:13 PM, deadalnix wrote:On Tuesday, 11 May 2021 at 15:19:05 UTC, Andrei Alexandrescu wrote:It's simple fact.On 5/11/21 10:34 AM, deadalnix wrote:This is just denial.On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote:`ref string` is not a type.Then enum strings are not ranges, correct?They are not. But they are strings. Which imply that string aren't ranges, which is right, `ref strings` are ranges, not strings.There are many exemple of conversions that differs with string and ref strings which do not involve enums. For instance, immutable(string) -> string is a valid conversion, but immutable(string) -> ref string isn't. Call it something else than a type if you want, nevertheless, conversions rules are simply different, even if you abstract the notion of rvalue/lvalue from the whole thing, so it is clearly more than just a regular storage class. When you say ref, you say "I do not want a subtype". Saying B isn't a subtype of A because I can't pass a B to what expects a ref A is just fallacious.Again with moving the goalposts.
May 11 2021
On 5/11/21 12:39 PM, Andrei Alexandrescu wrote:On 5/11/21 12:13 PM, deadalnix wrote:To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.On Tuesday, 11 May 2021 at 15:19:05 UTC, Andrei Alexandrescu wrote:It's simple fact.On 5/11/21 10:34 AM, deadalnix wrote:This is just denial.On Tuesday, 11 May 2021 at 13:56:50 UTC, Andrei Alexandrescu wrote:`ref string` is not a type.Then enum strings are not ranges, correct?They are not. But they are strings. Which imply that string aren't ranges, which is right, `ref strings` are ranges, not strings.There are many exemple of conversions that differs with string and ref strings which do not involve enums. For instance, immutable(string) -> string is a valid conversion, but immutable(string) -> ref string isn't. Call it something else than a type if you want, nevertheless, conversions rules are simply different, even if you abstract the notion of rvalue/lvalue from the whole thing, so it is clearly more than just a regular storage class. When you say ref, you say "I do not want a subtype". Saying B isn't a subtype of A because I can't pass a B to what expects a ref A is just fallacious.Again with moving the goalposts.
May 11 2021
On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote:I apologize for injecting myself into this conversation, but with all due respect, what the hell are you talking about? Everything Deadalnix is saying makes perfect sense - it's basic type theory, and yet you're accusing him of moving goalposts and making up definitions, etc. The problem is that `isSomeString` doesn't respect the LSP and the template constraints on the relevant stdlib functions for enums are a hack to work around that. End of story. if `isSomeString` was defined sensibly, these template constraint hacks would not have to exist. All the bluster about `popFront` on enum strings, etc. is completely irrelevant, and is a red herring anyway (as was already explained). I'm sorry for being so blunt, but this conversation is painful to read.Again with moving the goalposts.To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.
May 11 2021
On 5/11/21 2:37 PM, Meta wrote: On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote: Being blunt is totally cool, but that doesn't make you right. There's no true subtyping or polymorphism with value semantics. This has been common knowledge in C++ - inheriting a value type is an antipattern for many reasons, and conversion operators are to be used carefully (and not as a substitute for subtyping) for many other reasons. With value types, it's all static typing, no polymorphism, no LSP beyond what's called ad-hoc polymorphism in the classic Cardelli et al paper (http://poincare.matf.bg.ac.rs/~smalkov/files/old/fp.r344.2016/public/predavanja/FP.cas.2016.07%20-%20p471-cardelli.pdf). What can be aimed for with values is called "parametric polymorphism" (which is NOT subtyping) by the same paper: "Parametric polymorphism is obtained when a function works uniformly on a range of types; these types normally exhibit some common structure." That works if and only if you can reasonably supplant the same primitives across said range of types. With enums that's onerous; as soon as you "derive" an enum from int you figure that ++x can't reasonably be implemented. Same goes for enum strings - you can't implement the expected string primitives so substitutability is out the window. Values are monomorphic. Years ago I found a bug in a large C++ system that went like this: class Widget : BaseWidget { ... Widget* clone() { assert(typeid(this) == typeid(Widget*)); return new Widget(*this); } }; The assert was a _monomorphism test_, i.e. it made sure that the current object is actually a Widget and not something derived from it that forgot to override clone() once again. The problem was the code was doing exactly what it shouldn't have, yet the assert was puzzlingly passing. Since everyone here is great at teaching basic type theory, it's an obvious problem - the fix is: assert(typeid(*this) == typeid(Widget)); Then the assertion started failing as expected. 
Following that, I've used that example for years in teaching and to invariably there are eyes going wide when they hear that C++ pointers are monomorphic, it's the pointed-to values that are polymorphic, and that's an essential distinction. (In D, just like in Java, classes take care of that indirection automatically, which can get some confused.)I apologize for injecting myself into this conversation, but with all due respect, what the hell are you talking about? Everything Deadalnix is saying makes perfect sense - it's basic type theory, and yet you're accusing him of moving goalposts and making up definitions, etc. The problem is that `isSomeString` doesn't respect the LSP and the template constraints on the relevant stdlib functions for enums are a hack to work around that. End of story. if `isSomeString` was defined sensibly, these template constraint hacks would not have to exist. All the bluster about `popFront` on enum strings, etc. is completely irrelevant, and is a red herring anyway (as was already explained). I'm sorry for being so blunt, but this conversation is painful to read.Again with moving the goalposts.To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.
May 11 2021
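For readers who want to try the typeid distinction from the post above, here is a minimal C++ sketch. It is not the original system's code: `FancyWidget`, `pointerCheck`, `objectCheck` and `demoTypeid` are hypothetical names invented for illustration. The idea is only that `typeid(this)` inspects the static pointer type, while `typeid(*this)` inspects the dynamic type of a polymorphic object.

```cpp
#include <cassert>
#include <typeinfo>

struct BaseWidget {
    virtual ~BaseWidget() = default;  // makes the hierarchy polymorphic
};

struct Widget : BaseWidget {
    // Broken monomorphism test: `this` is statically a Widget*, and typeid
    // applied to a pointer never looks at the pointed-to object.
    bool pointerCheck() { return typeid(this) == typeid(Widget*); }

    // Working test: *this is an lvalue of polymorphic class type, so typeid
    // reports the dynamic type of the object.
    bool objectCheck() { return typeid(*this) == typeid(Widget); }
};

// Hypothetical derived class that "forgot" to override anything.
struct FancyWidget : Widget {};

// Call from main() to run the checks.
void demoTypeid() {
    Widget w;
    FancyWidget f;
    assert(w.pointerCheck() && w.objectCheck());  // a real Widget passes both
    assert(f.pointerCheck());                     // broken test passes anyway
    assert(!f.objectCheck());                     // fixed test catches it
}
```

The broken check passes for every object, which matches the "puzzlingly passing" assert described in the post.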
On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu wrote:
There's no true subtyping or polymorphism with value semantics.

I think you guys need to agree on what you mean by "type" and "subtype". Mathematically a type would be a set of states and a set of operators that can take you between the states. A subtype is just a reduced set of states/operators where operators keep you within the set of states. In OO a type is an abstraction (reduced set) of the states that the entity you model in The Real World has. A subclass in OO is increasing the number of states/operators, but decreasing the number of Real World entities covered. So these two notions of "subtype" are opposite.

primitives across said range of types. With enums that's onerous; as soon as you "derive" an enum from int you figure that ++x can't reasonably be implemented. Same goes for enum

In C enums are subtypes of int. You reduce the number of states. C enums are not sound, because operators can take you out of the allowed set of states in a heartbeat.

Anyway, I've given up following this discussion. Just define the desirable outcome (practical design) and forget about the theoretical aspects... then others might be able to understand where the viewpoints differ.
May 11 2021
On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu wrote:Values are monomorphic. Years ago I found a bug in a large C++ system that went like this: class Widget : BaseWidget { ... Widget* clone() { assert(typeid(this) == typeid(Widget*)); return new Widget(*this); } }; The assert was a _monomorphism test_, i.e. it made sure that the current object is actually a Widget and not something derived from it, who forgot to override clone() once again. The problem was the code was doing exactly what it shouldn't have, yet the assert was puzzlingly passing. Since everyone here is great at teaching basic type theory, it's an obvious problem - the fix is: assert(typeid(*this) == typeid(Widget)); Then the assertion started failing as expected. Following that, I've used that example for years in teaching and to invariably there are eyes going wide when they hear that C++ pointers are monomorphic, it's the pointed-to values that are polymorphic, and that's an essential distinction. (In D, just like in Java, classes take care of that indirection automatically, which can get some confused.)While this is indeed very interesting, this is missing the larger point. This whole model in C++ is unsound. It's easy to show. In you above example, the this pointer, typed as Widget*, points to an instance of a subclass of Widget. If you were to assign a Widget to that pointer (which you can do, this is a pointer to a mutable widget), then any references to that widget using a subtype of Widget is now invalid. There is no such thing as a monomorphic pointer to a polymorphic type in any sound type system. That cannot be made to work. It is pointer and the pointed data, as a package, being half a value, half a reference type in the process. This is unavoidable, you can't unbundle it or everything breaks down. So why is there an indirection in there? Simply because you cannot know the layout of the object at compile time when you are doing runtime polymorphism. 
But even then, you could decide to make it behave as a value type with eager deep copy or copy on write and that would work too, and it would still be polymorphic. But we get back to square one: this has nothing to do with the type, which holds a reference to a payload. And the whole typing and subtyping business happens on these value types.
May 11 2021
On Wednesday, 12 May 2021 at 01:46:25 UTC, deadalnix wrote:On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu wrote:-AlexValues are monomorphic. Years ago I found a bug in a large C++ system that went like this: class Widget : BaseWidget { ... Widget* clone() { assert(typeid(this) == typeid(Widget*)); return new Widget(*this); } }; The assert was a _monomorphism test_, i.e. it made sure that the current object is actually a Widget and not something derived from it, who forgot to override clone() once again. The problem was the code was doing exactly what it shouldn't have, yet the assert was puzzlingly passing. Since everyone here is great at teaching basic type theory, it's an obvious problem - the fix is: assert(typeid(*this) == typeid(Widget)); Then the assertion started failing as expected. Following that, I've used that example for years in teaching and to invariably there are eyes going wide when they hear that C++ pointers are monomorphic, it's the pointed-to values that are polymorphic, and that's an essential distinction. (In D, just like in Java, classes take care of that indirection automatically, which can get some confused.)While this is indeed very interesting, this is missing the larger point. This whole model in C++ is unsound. It's easy to show. In you above example, the this pointer, typed as Widget*, points to an instance of a subclass of Widget. If you were to assign a Widget to that pointer (which you can do, this is a pointer to a mutable widget), then any references to that widget using a subtype of Widget is now invalid. There is no such thing as a monomorphic pointer to a polymorphic type in any sound type system. That cannot be made represent both the pointer and the pointed data, as a package, being half a value, half a reference type in the process. This is unavoidable, you can't unbundle it or everything breaks down. So why is there an indirection in there? 
Simply because you cannot know the layout of the object at compile time when you are doing runtime polymorphism. But even then, you could decide to make it behave as a value type with eager deep copy or copy on write and that would work too, and it would still be polymorphic. But we get back to square one: this has nothing to do with the value type, which hold a reference to a payload. And the whole typing and subtyping business happen on these value types.
May 11 2021
On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:
-Alex

No, both are value types, but in the case of the class, the value contains a reference to the payload that you describe in the class's body. Consider:

    class A {}
    A a = new A();
    void foo(A ainfoo) { ainfoo = new A(); }
    foo(a);

Was "a" modified here? No it wasn't.
May 11 2021
On Wednesday, 12 May 2021 at 02:07:14 UTC, deadalnix wrote:On Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:Wrong. https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/reference-typesNo, classes are reference types, structs are values types in -AlexNo, both are value type,but in the case of the class, the value contains a reference to the payload that you describe in the class's body. Consider: class A {} A a = new A(); void foo(A ainfoo) { ainfooo = new A(); } foo(a); Was "a" modified here?Yes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - Alex
May 11 2021
On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:On Wednesday, 12 May 2021 at 02:07:14 UTC, deadalnix wrote:In layman terms, just because I can replace the item in the box with the exact same box, it does not mean the box hasn't been modified. - AlexOn Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:Wrong. https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/reference-typesNo, classes are reference types, structs are values types in -AlexNo, both are value type,but in the case of the class, the value contains a reference to the payload that you describe in the class's body. Consider: class A {} A a = new A(); void foo(A ainfoo) { ainfooo = new A(); } foo(a); Was "a" modified here?Yes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - Alex
May 11 2021
On Wednesday, 12 May 2021 at 02:22:52 UTC, 12345swordy wrote:On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:Woops, meant to say "with the exact same item." -AlexOn Wednesday, 12 May 2021 at 02:07:14 UTC, deadalnix wrote:In layman terms, just because I can replace the item in the box with the exact same box, it does not mean the box hasn't been modified. - AlexOn Wednesday, 12 May 2021 at 01:58:39 UTC, 12345swordy wrote:Wrong. https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/reference-typesNo, classes are reference types, structs are values types in -AlexNo, both are value type,but in the case of the class, the value contains a reference to the payload that you describe in the class's body. Consider: class A {} A a = new A(); void foo(A ainfoo) { ainfooo = new A(); } foo(a); Was "a" modified here?Yes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - Alex
May 11 2021
On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:Yes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - AlexYou might want to reconsider how sure of yourself you are. For instance by opening https://replit.com/languages/csharp and running the following code in there: using System; class A { int i; public A(int i_) { i = i_; } public int getI() { return i; } } class Program { static void Main(string[] args) { A a = new A(15); Console.WriteLine(a.getI()); foo(a); Console.WriteLine(a.getI()); } static void foo(A ainfoo) { ainfoo = new A(23); Console.WriteLine(ainfoo.getI()); } }
May 11 2021
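As a rough C++ analogue of the C# program above (an assumption for illustration, not code from the thread), a class reference can be modeled as a pointer passed by value. The names `rebind`, `mutate` and `demoRebind` are hypothetical. Rebinding the parameter changes only the callee's copy of the pointer, while writing through the pointer changes the object that the caller also sees.

```cpp
#include <cassert>

struct A {
    int i;
};

// The pointer itself is copied into `p`. Pointing `p` at another object
// changes only this local copy; the caller's pointer and object are untouched.
int rebind(A* p) {
    static A other{23};
    p = &other;
    return p->i;  // 23, read through the local copy
}

// Writing through the pointer modifies the shared pointed-to object.
void mutate(A* p) {
    p->i = 23;
}

// Call from main() to run the checks.
void demoRebind() {
    A a{15};
    assert(rebind(&a) == 23);
    assert(a.i == 15);  // caller's object unchanged by the rebind
    mutate(&a);
    assert(a.i == 23);  // caller's object changed by the write-through
}
```

This mirrors the 15 / 23 / 15 sequence one would expect the C# program to print: the assignment inside `foo` is invisible to the caller.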
On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:

The code you posted does not support your claim whatsoever. When I talk about address I am literally talking about the virtual memory address here, such as 0x40000 or something similar to that. You do not know the actual virtual memory address of variable 'a' for class 'b', as the GC takes care of it for you. So when A is being replaced with the new instance of A that happens to have the same value as the one being replaced, the virtual memory address that A holds from the function parameter will change. -Alex

Yes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - Alex

You might want to reconsider how sure of yourself you are.
May 11 2021
On Wednesday, 12 May 2021 at 02:41:31 UTC, 12345swordy wrote:On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:Before posting that email was the best time to run the code, look at the output and deduce what it means. The second best time is now. In any case, I will disengage from that subthread with you, because it has reached its conclusion, and the point has been demonstrably made with actual code. Arguing about what the code does really is pointless when you can simply run it and look at the result.On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:The code you posted, do not support your claim what so ever. When I am talk about address I am literally talking about virtual memory address here, such as 0x40000 or something similar to that. You do not know what the actual virtual memory address of variable of 'a' for class 'b', as the GC takes it care of it for you. So when A is being replace with the new instance of A that happens to have the same value that is being replace, the virtual memory that A holds from the function parameter currently holds will change. -AlexYes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - AlexYou might want to reconsider how sure of yourself you are.
May 12 2021
On Wednesday, 12 May 2021 at 11:45:52 UTC, deadalnix wrote:On Wednesday, 12 May 2021 at 02:41:31 UTC, 12345swordy wrote:Like I said before, it does not support your claims, whatsoever. not support your claims whatsoever. Classes are reference types not value types, end of discussion. the language if you don't believe me.On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:Before posting that email was the best time to run the code, look at the output and deduce what it means.On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:The code you posted, do not support your claim what so ever. When I am talk about address I am literally talking about virtual memory address here, such as 0x40000 or something similar to that. You do not know what the actual virtual memory address of variable of 'a' for class 'b', as the GC takes it care of it for you. So when A is being replace with the new instance of A that happens to have the same value that is being replace, the virtual memory that A holds from the function parameter currently holds will change. -AlexYes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - AlexYou might want to reconsider how sure of yourself you are.In any case, I will disengage from that subthread with you, because it has reached its conclusion, and the point has been demonstrably made with actual code.Replacing the item in the box with the different yet exact same item, doesn't mean that you didn't modify the box. Again, print the object memory address, and you will see what I am talking about. -Alex
May 12 2021
On Wednesday, 12 May 2021 at 14:35:27 UTC, 12345swordy wrote:Replacing the item in the box with the different yet exact same item, doesn't mean that you didn't modify the box. Again, print the object memory address, and you will see what I am talking about. -AlexI legitimately can't tell if you are an idiot or a troll.
May 12 2021
On Wednesday, 12 May 2021 at 15:31:29 UTC, deadalnix wrote:On Wednesday, 12 May 2021 at 14:35:27 UTC, 12345swordy wrote:What kind of idiot that ignores official documentation provided by Microsoft that clearly states that classes are reference types not value types!? Your coding examples does NOT DISPROVE THIS NOTATION WHATSOEVER!!!! -AlexReplacing the item in the box with the different yet exact same item, doesn't mean that you didn't modify the box. Again, print the object memory address, and you will see what I am talking about. -AlexI legitimately can't tell if you are an idiot or a troll.
May 12 2021
On Wednesday, 12 May 2021 at 15:41:31 UTC, 12345swordy wrote:
On Wednesday, 12 May 2021 at 15:31:29 UTC, deadalnix wrote:

I think you are both talking about the same thing. I think what he meant by half value type, half reference type is that the variables/function parameters themselves are references to the data an object has, and that reference is basically a value type, while the actual object data is stored in memory at the address found in the variable/parameter, and these half value/half reference semantics are packaged in a single type, which cannot be broken apart. I.e. you can't have a variable that is just a simple pointer to some heap memory, and you also can't have a variable that actually contains the object's data on the stack, like in C++ for example. This is the same thing you meant by classes being reference types; he just went a level lower into the implementation of so-called reference types. roots. Best regards, Alexandru.

On Wednesday, 12 May 2021 at 14:35:27 UTC, 12345swordy wrote:
What kind of idiot that ignores official documentation provided by Microsoft that clearly states that classes are reference types not value types!? Your coding examples does NOT DISPROVE THIS NOTATION WHATSOEVER!!!! -Alex

Replacing the item in the box with the different yet exact same item, doesn't mean that you didn't modify the box. Again, print the object memory address, and you will see what I am talking about. -Alex

I legitimately can't tell if you are an idiot or a troll.
May 12 2021
On Wednesday, 12 May 2021 at 02:30:50 UTC, deadalnix wrote:
On Wednesday, 12 May 2021 at 02:21:06 UTC, 12345swordy wrote:

You are conflating passing an argument by value/reference with the concept of value/reference types. They are not the same thing.

"Don't confuse the concept of passing by reference with the concept of reference types. The two concepts are not the same. A method parameter can be modified by ref regardless of whether it is a value type or a reference type. There is no boxing of a value type when it is passed by reference."

https://docs.microsoft.com/en-us/dotnet/csharp/language-reference/keywords/ref

-Alex

Yes. A is being replace with the new instance of A that happens to have the same value here. There is no guarantee that they will share the same address. - Alex

You might want to reconsider how sure of yourself you are. For instance by opening https://replit.com/languages/csharp and running the following code in there:

    using System;

    class A {
        int i;
        public A(int i_) { i = i_; }
        public int getI() { return i; }
    }

    class Program {
        static void Main(string[] args) {
            A a = new A(15);
            Console.WriteLine(a.getI());
            foo(a);
            Console.WriteLine(a.getI());
        }

        static void foo(A ainfoo) {
            ainfoo = new A(23);
            Console.WriteLine(ainfoo.getI());
        }
    }
May 12 2021
On 5/11/21 9:46 PM, deadalnix wrote:
This whole model in C++ is unsound. It's easy to show. In you above example, the this pointer, typed as Widget*, points to an instance of a subclass of Widget. If you were to assign a Widget to that pointer (which you can do, this is a pointer to a mutable widget), then any references to that widget using a subtype of Widget is now invalid.

All of this is bizarrely incorrect. Care to elaborate?
May 12 2021
On Wednesday, 12 May 2021 at 11:41:20 UTC, Andrei Alexandrescu wrote:On 5/11/21 9:46 PM, deadalnix wrote:Consider the following: https://godbolt.org/z/8vzx9W56a This is a clear demonstration that C++'s type system is unsound here. It is unsound because it has the property that you mentioned in your post: the pointer is monomorphic and the value this pointers points to is polymorphic. This is simply unsound, you cannot separate the two (unless you make everything immutable). The pointer and the value must come together as a bundle, and that whole bundle (which is a value type containing a reference) is right.This whole model in C++ is unsound. It's easy to show. In you above example, the this pointer, typed as Widget*, points to an instance of a subclass of Widget. If you were to assign a Widget to that pointer (which you can do, this is a pointer to a mutable widget), then any references to that widget using a subtype of Widget is now invalid.All of this is bizarrely incorrect. Care to elaborate?
May 12 2021
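The contents of the godbolt link are not reproduced in this archive, so the following is only a guess at the kind of scenario being described: assigning a base-class value through a base pointer that actually points at a derived object. `Button`, `label`, `describe`, `assignThroughBase` and `demoSlicing` are hypothetical names. The assignment overwrites only the base subobject, leaving an object whose dynamic type is still the derived class but whose base state no longer matches what the derived class set up.

```cpp
#include <cassert>
#include <string>

struct Widget {
    std::string name = "widget";
    virtual std::string describe() const { return name; }
};

struct Button : Widget {
    std::string label = "OK";
    Button() { name = "button"; }  // derived class customizes the base state
    std::string describe() const override { return name + ":" + label; }
};

// Assigns a plain Widget through a Widget* that points at a Button.
// Only the Widget subobject is overwritten; the dynamic type and `label`
// are left as they were.
std::string assignThroughBase(Button& b) {
    Widget* p = &b;
    *p = Widget{};
    return b.describe();
}

// Call from main() to run the checks.
void demoSlicing() {
    Button b;
    assert(b.describe() == "button:OK");
    // Still dispatches to Button::describe, but the base part was reset.
    assert(assignThroughBase(b) == "widget:OK");
}
```

Whether this counts as unsoundness of the type system or as the well-known slicing pitfall is exactly what the two posters disagree about; the sketch takes no side.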
On Wednesday, 12 May 2021 at 12:11:22 UTC, deadalnix wrote:
This is a clear demonstration that C++'s type system is unsound here.

In fairness all generic low level programming languages that are practical to use have somewhat unsound type systems. Only high level languages can be fully sound (detect invalid programs at runtime and abort). C++ was forced into this mold by C though... (You can have heavily constrained low level languages that are sound)

It is unsound because it has the property that you mentioned in your post: the pointer is monomorphic and the value this

What does monomorphic mean in this context? Why would not this hold:

    *Singleton <: *Anything

I find the discussion very confusing at this point.
May 12 2021
On Wednesday, 12 May 2021 at 13:35:11 UTC, Ola Fosheim Grøstad wrote:
*Singleton <: *Anything

Typo :-D, I meant pointer-to-Singleton is subtype of pointer-to-Anything:

    Singleton* <: Anything*
May 12 2021
On 5/12/21 8:11 AM, deadalnix wrote:
On Wednesday, 12 May 2021 at 11:41:20 UTC, Andrei Alexandrescu wrote:

Ah, now we're at slicing. Love these forum discussions!

On 5/11/21 9:46 PM, deadalnix wrote:
Consider the following: https://godbolt.org/z/8vzx9W56a This is a clear demonstration that C++'s type system is unsound here. It is unsound because it has the property that you mentioned in your post: the pointer is monomorphic and the value this pointers points to is polymorphic. This is simply unsound, you cannot separate the two (unless you make everything immutable). The pointer and the value must come together as a bundle, and that whole bundle (which is a value type containing a reference) is itself what is

This whole model in C++ is unsound. It's easy to show. In you above example, the this pointer, typed as Widget*, points to an instance of a subclass of Widget. If you were to assign a Widget to that pointer (which you can do, this is a pointer to a mutable widget), then any references to that widget using a subtype of Widget is now invalid.

All of this is bizarrely incorrect. Care to elaborate?
May 12 2021
On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu wrote:Of course, but I thought the conversation was about strings, not value types. Last I checked, strings are reference types, in the same way that Java objects are reference types.I apologize for injecting myself into this conversation, but with all due respect, what the hell are you talking about? Everything Deadalnix is saying makes perfect sense - it's basic type theory, and yet you're accusing him of moving goalposts and making up definitions, etc. The problem is that `isSomeString` doesn't respect the LSP and the template constraints on the relevant stdlib functions for enums are a hack to work around that. End of story. if `isSomeString` was defined sensibly, these template constraint hacks would not have to exist. All the bluster about `popFront` on enum strings, etc. is completely irrelevant, and is a red herring anyway (as was already explained). I'm sorry for being so blunt, but this conversation is painful to read.Being blunt is totally cool, but that doesn't make you right. There's no true subtyping or polymorphism with value semantics. This has been common knowledge in C++ - inheriting a value type is an antipattern for many reasons, and conversion operators are to be used carefully (and not as a substitute to subtyping) for many other reasons. 
With value types, it's all static typing, no polymorphism, no LSP beyond what's called ad-hoc polymorphism in the classic Caderlli et al paper (http://poincare.matf.bg.ac.rs/~smalkov/files/old/fp.r344.2016/public/predavanja/FP.cas.2016.07%20-%20p471-cardelli.pdf).What can be aimed for with values is called "parametric polymorphism" (which is NOT subtyping) by the same paper:The nice thing about D's template constraints, though, is that it allows us to impose subtype polymorphism on a parametrically polymorphic function."Parametric polymorphism is obtained when a function works uniformly on a range of types; these types normally exhibit some common structure." That works if and only if you can reasonably supplant the same primitives across said range of types. With enums that's onerous; as soon as you "derive" an enum from int you figure that ++x can't reasonably be implemented. Same goes for enum strings - you can't implement the expected string primitives so substitutability is out the window.++x still fulfills the contract that the derived enum has inherited from `int`: `++: int -> int`. It easily passes the substitutability test. Likewise, enums with a base type of string fulfill all the same contracts that `string` does. Nowhere in the contract of the string type does it specify that `s[1..$]` returns a value of the same type as `s`, just of type `string`, which a string enum does.Values are monomorphic.Are you saying that all values are monomorphic, or that _value types_ are monomorphic?Years ago I found a bug in a large C++ system that went like this: class Widget : BaseWidget { ... Widget* clone() { assert(typeid(this) == typeid(Widget*)); return new Widget(*this); } }; The assert was a _monomorphism test_, i.e. it made sure that the current object is actually a Widget and not something derived from it, who forgot to override clone() once again. The problem was the code was doing exactly what it shouldn't have, yet the assert was puzzlingly passing. 
Since everyone here is great at teaching basic type theoryJust so we're clear, my previous post was not trying to insinuate that I am an expert in type theory and you are just too ignorant to understand the arguments presented. I don't claim to be anything close to an expert and only know the basics, and you're the one with the doctorate here.it's an obvious problem - the fix is: assert(typeid(*this) == typeid(Widget)); Then the assertion started failing as expected. Following that, I've used that example for years in teaching and to invariably there are eyes going wide when they hear that C++ pointers are monomorphic, it's the pointed-to values that are polymorphic, and that's an essential distinction. (In D, just like in Java, classes take care of that indirection automatically, which can get some confused.)You just said a paragraph back that values are monomorphic. So are pointed-to values monomorphic or polymorphic? This isn't a gotcha; I'm just confused about which you meant. I think the point you are trying to make with this story is that an operation on an enum that returns the base type will lead to confusing/wrong behaviour and allowing it for template functions which are meant to take strings would be bad design, just like it was with Widget.clone(). Is that right?
May 11 2021
On 5/11/21 10:36 PM, Meta wrote:
Last I checked, strings are reference types, in the same way that Java objects are reference types.

Just by means of clarification, that's not true because the length is stored with the pointer. This occasionally trips folks starting with D.
May 12 2021
On Wednesday, May 12, 2021 5:52:28 AM MDT Andrei Alexandrescu via Digitalmars-d wrote:
On 5/11/21 10:36 PM, Meta wrote:

To be more precise, a dynamic array in D is essentially

    struct Array(T)
    {
        size_t length;
        T* ptr;
    }

So, the length is stored directly in the struct, and the data is referenced via the pointer stored in the struct. As such, we often refer to a D dynamic array as a pseudo-reference type. Either way, while the way it's put together has some very useful properties (like making it so that multiple dynamic arrays can be slices of the same data), there's no question that it can be confusing at first. And of course, that extends to strings, since D strings are dynamic arrays.

- Jonathan M Davis

Last I checked, strings are reference types, in the same way that Java objects are reference types.

Just by means of clarification, that's not true because the length is stored with the pointer. This occasionally trips folks starting with D.
May 12 2021
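The pseudo-reference behaviour described above can be imitated in C++ with a small struct. This is only a sketch of the layout quoted in the post, not D's actual runtime type; `Slice`, `dropFirst` and `demoSlice` are hypothetical names. Copying the struct shares the elements, but each copy carries its own length and start pointer.

```cpp
#include <cassert>
#include <cstddef>

// Mirrors the { length, ptr } layout quoted in the post above.
template <typename T>
struct Slice {
    std::size_t length;
    T* ptr;
};

// Rough equivalent of D's s[1 .. $]: a new view that skips the first element.
// Precondition: s.length > 0.
template <typename T>
Slice<T> dropFirst(Slice<T> s) {
    return Slice<T>{s.length - 1, s.ptr + 1};
}

// Call from main() to run the checks.
void demoSlice() {
    int data[] = {1, 2, 3};
    Slice<int> a{3, data};
    Slice<int> b = a;       // copies length and pointer, not the elements

    b.ptr[0] = 42;          // element writes are visible through both views
    assert(a.ptr[0] == 42);

    b = dropFirst(b);       // changes only b's own length and pointer
    assert(b.length == 2 && b.ptr[0] == 2);
    assert(a.length == 3 && a.ptr[0] == 42);
}
```

That split (shared payload, private length) is why such a type behaves neither like a pure value nor like a pure reference.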
On 5/11/21 10:36 PM, Meta wrote:
++x still fulfills the contract that the derived enum has inherited from `int`: `++: int -> int`.

No, that would be ref int -> ref int, which has consequences.
May 12 2021
On 11.05.21 23:36, Andrei Alexandrescu wrote:On 5/11/21 2:37 PM, Meta wrote:Deadalnix is saying that there is a subtyping relationship for rvalues, while you are pointing out that there is no subtyping relationship for lvalues. I think those are both correct. (Type theory has no notion of lvalues or rvalues, so those would indeed have to be interpreted as different types.) I fail to see why the semantics of lvalues should have any bearing on format strings even though I understand why most of Phobos might want to assume isSomeString talks about lvalues of the type.On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote:Being blunt is totally cool, but that doesn't make you right. ...I apologize for injecting myself into this conversation, but with all due respect, what the hell are you talking about? Everything Deadalnix is saying makes perfect sense - it's basic type theory, and yet you're accusing him of moving goalposts and making up definitions, etc. The problem is that `isSomeString` doesn't respect the LSP and the template constraints on the relevant stdlib functions for enums are a hack to work around that. End of story. if `isSomeString` was defined sensibly, these template constraint hacks would not have to exist. All the bluster about `popFront` on enum strings, etc. is completely irrelevant, and is a red herring anyway (as was already explained). I'm sorry for being so blunt, but this conversation is painful to read.Again with moving the goalposts.To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.There's no true subtyping or polymorphism with value semantics. 
...There's certainly subtyping. The point about "polymorphism" (in type theory, polymorphism typically refers to parametric polymorphism, but I guess you mean existential types), is a bit more tricky. I guess the point is that a language that wants `f(σ) ⊆ ∃τ. f(τ)` to hold without any runtime semantics can't support data types whose values do not embed runtime type info. However, it can certainly support value types, even value types that are stored without indirections.the assert was puzzlingly passing. Since everyone here is great at teaching basic type theory, it's an obvious problem - the fix is: assert(typeid(*this) == typeid(Widget)); ...That's a C++ quirk. Not much to do with type theory. In fact, C++ may not be a great example for illustration, as its type system is unsound.
May 11 2021
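The rvalue/lvalue distinction drawn in the post above can be checked mechanically in C++, used here only as an analogy since D's enum rules differ in detail. `Level`, `twice`, `bump` and `demoEnumConversion` are hypothetical names. An unscoped enum value converts to its base type when passed by value, but an enum lvalue cannot be bound to a mutable reference to the base type, which is the "ref int -> ref int" case.

```cpp
#include <cassert>
#include <type_traits>

enum Level : int { Low = 10, High = 20 };

int twice(int v) { return 2 * v; }  // takes an int by value
void bump(int& v) { ++v; }          // needs a mutable int lvalue

// By value, a Level can stand in for an int (implicit conversion)...
static_assert(std::is_convertible<Level, int>::value,
              "enum value converts to int");
// ...but a Level lvalue cannot be viewed as an int lvalue.
static_assert(!std::is_convertible<Level&, int&>::value,
              "enum lvalue does not bind to int&");

// Call from main() to run the checks.
void demoEnumConversion() {
    Level l = High;
    assert(twice(l) == 40);  // accepted: the value is converted to int
    int n = l;               // explicit copy into a real int
    bump(n);                 // bump(l) would not compile
    assert(n == 21);
    assert(l == High);       // the enum variable itself is untouched
}
```

So substitutability holds for reads of the value and fails for operations that need to write back through a reference to the base type.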
On 5/11/21 10:39 PM, Timon Gehr wrote:
Deadalnix is saying that there is a subtyping relationship for rvalues, while you are pointing out that there is no subtyping relationship for lvalues. I think those are both correct.

Well put. Rvalues can afford the luxury to change representation (e.g. from byte to int or float to double) because they're only used once. So a passable polymorphism scheme can be implemented via coercion.

(Type theory has no notion of lvalues or rvalues, so those would indeed have to be interpreted as different types.)

Hmmm... haven't looked in a while, but don't some of the Java formalizations account for int, double etc. being values and consequently rvalues when passed around? (Though they can't be passed by reference so a formalization could get away with assuming int is a reference, e.g. ++x means "rebind reference x to a new reference to the value x + 1").

I fail to see why the semantics of lvalues should have any bearing on format strings even though I understand why most of Phobos might want to assume isSomeString talks about lvalues of the type.

It doesn't, the format string is just a symptom. The problem is that we change (already did, and massively... >100 instances of StringTypeOf) the standard library to accommodate what I think is an unproductive form of genericity.

One matter is to distinguish what can be done from what D has already done and cannot change. For example, I tried some code just now and was... surprised. Meta mentioned that increment works with enums, and lo and behold it does:

    void main() {
        import std;
        enum X : int { x = 10, y = 20 }
        X x;
        writeln(x);
        ++x;
        writeln(x);
    }

That prints "x" and then "cast(X)11". Meaning you can easily write a program that takes you outside enumerated values without a cast, which somewhat dilutes the value of "final switch" and the general notion that enumerated types are a small closed set. Arguably ++ should not be allowed on enumerated values.

Surprises go on:

    void main() {
        import std;
        enum X : string { x = "Hello, world!", y = "xyz" }
        X x;
        writeln(x);
        x = x[1 .. $];
        writeln(x);
    }

That prints:

    x
    cast(X)ello, world!

which showcases, as a little distraction, a bug in the formatting of enums - the string should be quoted properly. But the larger point is that enum types derived from string actually allow, again, stepping outside their universe with ease. This cramps my style somewhat because during the whole discussion I assumed that doesn't work, or at least shouldn't. I guess an argument could be built that its semantics is what it is.

Anyway, the other side of the argument that got ignored is the alias this thing:

    void main() {
        import std;
        static struct X {
            string fun();
            alias fun this;
        }
        X x;
        x = x[1 .. $];
    }

This doesn't compile; the slice does, but the assignment doesn't. Which means there are differences in what would be expected of a string (or, as it turns out, an enum string) and what would be expected of a type that converts to string by means of alias this.

There's no true subtyping or polymorphism with value semantics. ...

There's certainly subtyping. The point about "polymorphism" (in type theory, polymorphism typically refers to parametric polymorphism, but I guess you mean existential types), is a bit more tricky. I guess the point is that a language that wants `f(σ) ⊆ ∃τ. f(τ)` to hold without any runtime semantics can't support data types whose values do not embed runtime type info. However, it can certainly support value types, even value types that are stored without indirections.
May 12 2021
On Wednesday, 12 May 2021 at 13:39:50 UTC, Andrei Alexandrescu wrote:
> One matter is to distinguish what can be done from what D has already done and cannot change. For example, I tried some code just now and was... surprised. Meta mentioned that increment works with enums, and lo and behold it does:
>
>     void main() {
>         import std;
>         enum X : int { x = 10, y = 20 }
>         X x;
>         writeln(x);
>         ++x;
>         writeln(x);
>     }
>
> That prints "x" and then "cast(X)11". Meaning you can easily write a program that takes you outside enumerated values without a cast, which somewhat dilutes the value of "final switch" and the general notion that enumerated types are a small closed set. Arguably ++ should not be allowed on enumerated values. Surprises go on:
>
>     void main() {
>         import std;
>         enum X : string { x = "Hello, world!", y = "xyz" }
>         X x;
>         writeln(x);
>         x = x[1 .. $];
>         writeln(x);
>     }
>
> That prints:
>
>     x
>     cast(X)ello, world!
>
> which showcases, as a little distraction, a bug in the formatting of enums - the string should be quoted properly. But the larger point is that enum types derived from string actually allow, again, stepping outside their universe with ease.

I've raised these problems on a regular basis for years now. This is obviously another instance where things are unsound, and needs to be fixed. Last time we had a discussion on the matter, it went in a loop that is best summarized as this:

    enum E : int { A, B, C }

    while (true) {
        Me: A | B ought to be an int, not an E.
        W&A: But you need it to be an enum, so that you can do things like combining flags and stay. As in:
            enum Mode { Read, Write }
            openFile(file, Mode.Read | Mode.Write);
        Me: Well then, you can't have final switch, because you don't have the guarantee it relies on.
        W&A: final switch is very much needed, for reasons X, Y, Z.
    }

This is extremely tiresome and kinda looks like the current discussion (or another one would be the in contract needing to be statically bound, where Timon and myself had to fish for Bertrand Meyer because nothing short of an argument from authority could do the trick).

So if we get nothing else out of that discussion, fixing enums so that they don't go outside the allowed set of values would be nice. It's just unfortunate that it takes literally 5 years+ to get to a point where this is even acknowledged as being an issue. I hope we can somehow shorten that process, because it's not workable as it is. You have people around like Timon and myself who have an eye for this. It's free brainpower you are not leveraging.
May 12 2021
On 5/12/21 11:30 AM, deadalnix wrote:
> Last time we had a discussion on the matter, it went in a loop that is best summarized as this:
>
>     enum E : int { A, B, C }
>
>     while (true) {
>         Me: A | B ought to be an int, not an E.
>         W&A: But you need it to be an enum, so that you can do things like combining flags and stay. As in:
>             enum Mode { Read, Write }
>             openFile(file, Mode.Read | Mode.Write);
>         Me: Well then, you can't have final switch, because you don't have the guarantee it relies on.
>         W&A: final switch is very much needed, for reasons X, Y, Z.
>     }

I know this is Walter's take, but please don't ascribe it to me as well. I could at the very best give a nod to practicality, but I very much think that typing binary "or" on enums as the enum is a kludge.

> This is extremely tiresome and kinda looks like the current discussion (or another one would be the in contract needing to be statically bound, where Timon and myself had to fish for Bertrand Meyer because nothing short of an argument from authority could do the trick). So if we get nothing else out of that discussion, fixing enums so that they don't go outside the allowed set of values would be nice. It's just unfortunate that it takes literally 5 years+ to get to a point where this is even acknowledged as being an issue.

This reach for credit here does not seem very well deserved.

> I hope we can somehow shorten that process, because it's not workable as it is. You have people around like Timon and myself who have an eye for this. It's free brainpower you are not leveraging.

I will say what follows with the utmost respect. I think Timon is way better at these things (like in, incomparably better) than you and me combined. He most certainly is less skilled than you at other things, but as far as PL theory in this group goes, he and Paul are the only game in town.
May 12 2021
On Wednesday, 12 May 2021 at 18:35:49 UTC, Andrei Alexandrescu wrote:
> I will say what follows with the utmost respect. I think Timon is way better at these things (like in, incomparably better) than you and me combined. He most certainly is less skilled than you at other things, but as far as PL theory in this group goes, he and Paul are the only game in town.

You are so wonderful at being inclusive... :-P

Never seen anyone in these forums who hasn't said things about PL theory that are either wrong or lack nuance. Applies to Andreis, Timons, Pauls alike...

However, since most here do not have a comp.sci. background it would be nice if we stopped hiding behind terminology (which people will perceive differently even if they have a comp.sci. background, which is why papers use references).

deadalnix is explaining how he uses the terms, which makes the thread more inclusive for all. Your dismissal is not helpful.
May 12 2021
On Wednesday, 12 May 2021 at 18:35:49 UTC, Andrei Alexandrescu wrote:I will say what follows with the utmost respect. I think Timon is way better at these things (like in, incomparably better) than you and me combined. He most certainly is less skilled than you at other things, but as far as PL theory in this group goes, he and Paul are the only game in town.It's fine, then just listen to him and not to me. That already would be vast improvement over the current state of affairs.
May 12 2021
On Wednesday, 12 May 2021 at 02:39:23 UTC, Timon Gehr wrote:
>> assert(typeid(*this) == typeid(Widget)); ...
> That's a C++ quirk. Not much to do with type theory. In fact, C++ may not be a great example for illustration, as its type system is unsound.

It isn't a quirk. To get dynamic lookup you need to add a virtual member.

    #include <iostream>
    #include <typeinfo>

    class A {
    public:
        virtual void nothing(){}
        void test(){
            std::cout << typeid(*this).name() << std::endl;
            std::cout << typeid(A).name() << std::endl;
        }
    };

    class B : public A { };

    void test_typeinfo(){
        B b{};
        b.test();
    }
May 12 2021
On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu wrote:
> Values are monomorphic. Years ago I found a bug in a large C++ system that went like this:
>
>     class Widget : BaseWidget {
>         ...
>         Widget* clone() {
>             assert(typeid(this) == typeid(Widget*));
>             return new Widget(*this);
>         }
>     };
>
> The assert was a _monomorphism test_, i.e. it made sure that the current object is actually a Widget and not something derived from it, who forgot to override clone() once again.

I don't understand what you mean by pointers being monomorphic. this will always have the type of the class it was defined in. So the assert will always hold. How is this surprising???

What is more dangerous is that if you forget to add a virtual member then *this will also always hold as being a Widget. That is the result of C++ being a low-level language. No sensible high level language would allow such semantics.
May 12 2021
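The distinction argued over in the two posts above (typeid(this) versus typeid(*this)) can be sketched in a few lines. This is only an illustration under assumptions: FancyWidget, isExactlyWidget and thisPointerIsWidgetPtr are hypothetical names that do not appear in the thread, and BaseWidget is assumed to have a virtual member so that run-time type information is available.

```cpp
#include <cassert>
#include <typeinfo>

// Assumed to be polymorphic (has a virtual member); the thread does not show it.
struct BaseWidget {
    virtual ~BaseWidget() = default;
};

struct Widget : BaseWidget {
    // Compares the dynamic type of the object with Widget.
    // This is false when called on an object of a class derived from Widget.
    bool isExactlyWidget() { return typeid(*this) == typeid(Widget); }

    // Compares the type of the pointer `this` with Widget*.
    // Inside a non-const member of Widget, `this` is always a Widget*.
    bool thisPointerIsWidgetPtr() { return typeid(this) == typeid(Widget*); }
};

// Hypothetical subclass standing in for one that "forgot" to override clone().
struct FancyWidget : Widget {};
```

Called on a FancyWidget, the pointer-based check still succeeds while the check on *this fails, which is why only the latter works as a test for "exactly a Widget".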
On Wednesday, 12 May 2021 at 14:52:42 UTC, Ola Fosheim Grøstad wrote:
> On Tuesday, 11 May 2021 at 21:36:46 UTC, Andrei Alexandrescu wrote:
>> Values are monomorphic. Years ago I found a bug in a large C++ system that went like this:
>>
>>     class Widget : BaseWidget {
>>         ...
>>         Widget* clone() {
>>             assert(typeid(this) == typeid(Widget*));
>>             return new Widget(*this);
>>         }
>>     };
>>
>> The assert was a _monomorphism test_, i.e. it made sure that the current object is actually a Widget and not something derived from it, who forgot to override clone() once again.
> I don't understand what you mean by pointers being monomorphic.

Ok, consider the following.

    class A {};
    class B: public A {};
    A *a = new B();

typeid(a) is A*. In C++, a is monomorphic.
typeid(*a) is B. In C++, *a is polymorphic.
May 12 2021
On Wednesday, 12 May 2021 at 15:35:26 UTC, deadalnix wrote:
> Ok, consider the following.
>
>     class A {};
>     class B: public A {};
>     A *a = new B();
>
> typeid(a) is A*. In C++, a is monomorphic.
> typeid(*a) is B. In C++, *a is polymorphic.

Sadly, IIRC typeid(*a) is A, because A does not contain a virtual member...

typeid(a) is A*, because that is the type of the pointer. However, the relationship between B* and A* is polymorphic, because you can use B* in the context where you expect A*? E.g. you can call a function that expects parameter A* with a pointer B*. So that makes the relationship polymorphic?

I have to admit I never use the terminology monomorphic and polymorphic, so my understanding could be wrong. If so, I am probably not alone in the thread, so for the sake of other readers, maybe someone can provide a definition for monomorphic?
May 12 2021
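The point about the missing virtual member can be checked with a small sketch. The class names PlainA/PlainB and VirtA/VirtB and the three helper functions are hypothetical, introduced only for this illustration; the behaviour shown relies on the C++ rule that typeid reports the dynamic type only for glvalues of polymorphic class type.

```cpp
#include <cassert>
#include <typeinfo>

// No virtual members: typeid(*ptr) uses the static type of the expression.
struct PlainA {};
struct PlainB : PlainA {};

// The base has a virtual member: typeid(*ptr) looks up the dynamic type.
struct VirtA {
    virtual ~VirtA() = default;
};
struct VirtB : VirtA {};

// Does dereferencing a PlainA* that points at a PlainB report PlainB?
bool plainDerefReportsDerived() {
    PlainB b;
    PlainA* a = &b;
    return typeid(*a) == typeid(PlainB);
}

// Same question for the hierarchy with a virtual member.
bool virtDerefReportsDerived() {
    VirtB b;
    VirtA* a = &b;
    return typeid(*a) == typeid(VirtB);
}

// The pointer itself always has its declared type, whatever it points at.
bool pointerKeepsStaticType() {
    VirtB b;
    VirtA* a = &b;
    return typeid(a) == typeid(VirtA*);
}
```

Under these assumptions only the second function sees the derived type: the pointer is "just a pointer to A" in both hierarchies, and the pointee reveals more at run time only when the base class is polymorphic in the C++ sense.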
On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad wrote:
> typeid(a) is A*, because that is the type of the pointer. However, the relationship between B* and A* is polymorphic, because you can use B* in the context where you expect A*? E.g. you can call a function that expects parameter A* with a pointer B*. So that makes the relationship polymorphic?

To be more precise. B* is a subtype of A* if you can use B* in contexts where A* is expected, which is polymorphic in nature.

More interestingly, pure OO-languages like Beta provide type-variables. C++/D lack those. So in such languages you can bind new types to type-variables and therefore change the typing of elements of arrays and such in subclasses. (Which leads to other challenges, all languages seem to have some kind of challenge associated with them once they allow polymorphisms)
May 12 2021
On Wednesday, 12 May 2021 at 16:46:40 UTC, Ola Fosheim Grøstad wrote:
> On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad wrote:
>> typeid(a) is A*, because that is the type of the pointer. However, the relationship between B* and A* is polymorphic, because you can use B* in the context where you expect A*? E.g. you can call a function that expects parameter A* with a pointer B*. So that makes the relationship polymorphic?
> To be more precise. B* is a subtype of A* if you can use B* in contexts where A* is expected, which is polymorphic in nature.

I would say it is a subtype, yes, but polymorphism implies that there are several ways to see the same thing, which, as Andrei points out, implies that you go through a reference somewhere.
May 12 2021
On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad wrote:
> I have to admit I never use the terminology monomorphic and polymorphic, so my understanding could be wrong. If so, I am probably not alone in the thread, so for the sake of other readers, maybe someone can provide a definition for monomorphic?

It's quite simple.

*a is polymorphic, because it is an object of type A as far as the user of *a is concerned, but it is actually an object of type B (or any other subtype of A).

a itself isn't polymorphic, because it is a pointer to an A no matter what. It is not a pointer to a B that is observed as if it was a pointer to an A. There is nothing more in it to be discovered at run time, it's just a pointer.

Even if you do

    B *b = ...;
    A *a = b;

then you do not have an instance of polymorphism, simply that you had a pointer to a B, and now you also have a pointer to an A.
May 12 2021
On Wednesday, 12 May 2021 at 16:49:12 UTC, deadalnix wrote:
> On Wednesday, 12 May 2021 at 16:38:10 UTC, Ola Fosheim Grøstad wrote:
>> I have to admit I never use the terminology monomorphic and polymorphic, so my understanding could be wrong. If so, I am probably not alone in the thread, so for the sake of other readers, maybe someone can provide a definition for monomorphic?
> It's quite simple. *a is polymorphic, because it is an object of type A as far as the user of *a is concerned, but it is actually an object of type B (or any other subtype of A). a itself isn't polymorphic, because it is a pointer to an A no matter what. It is not a pointer to a B that is observed as if it was a pointer to an A. There is nothing more in it to be discovered at run time, it's just a pointer.

I think I understand what you mean, but the terminology used is confusing me. A monomorphic function/operator works on only one type, but a polymorphic function/operator works on many types. Seems to me that A* can work on many types, but B* can only work on one type (if it has no subclasses). So wouldn't that make A* be polymorphic in nature, but B* be monomorphic in nature?

I've recently found it better (less baggage) to think in terms of protocols than classes. Then A* would be a pointer to something that provides the A-protocols. So when an A* pointer points to a B instance then we can think of it as if it points to the A-protocols that B provides. Maybe then you could claim that it is monomorphic as it only binds to A-protocols. But that is not actually the case, as you have the ability to cast A* to B*. So then it would be polymorphic...?

I dunno. Seems it is a matter of perspective, if "monomorphic" means "of one form".
May 12 2021
On Tuesday, May 11, 2021 12:37:20 PM MDT Meta via Digitalmars-d wrote:
> On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote:
>>> Again with moving the goalposts.
>> To clarify: you can't make up your own definitions as you go so as to support the point you're making at the moment. You can't go "oh, call it something else than a type, my point stays". No. Your point doesn't stay. By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.
> I apologize for injecting myself into this conversation, but with all due respect, what the hell are you talking about? Everything Deadalnix is saying makes perfect sense - it's basic type theory, and yet you're accusing him of moving goalposts and making up definitions, etc. The problem is that `isSomeString` doesn't respect the LSP and the template constraints on the relevant stdlib functions for enums are a hack to work around that. End of story. If `isSomeString` was defined sensibly, these template constraint hacks would not have to exist. All the bluster about `popFront` on enum strings, etc. is completely irrelevant, and is a red herring anyway (as was already explained). I'm sorry for being so blunt, but this conversation is painful to read.

Having isSomeString accept types that implicitly convert to string would be a disaster. Templates do not operate on implicit conversions - or even on subtypes. They operate on the exact type they're given. You can, of course, write a template constraint which checks for implicit conversions, but you still don't get the implicit conversion when the template is instantiated. You get the original type. This has a number of implications, but in general, it leads to bugs if templates check for implicit conversions instead of exact types. In particular, any templated function which checks for an implicit conversion then needs to force the implicit conversion, or it will likely not work properly - be it because you get compilation errors, or because the original type compiles with the same code but does not behave the same way as the type from the implicit conversion which was not actually made.

In fact, IIRC, at one point, isSomeString _did_ work with enums, and we fixed it so that it didn't, because it was causing problems. Also, IIRC, it was my fault that it was ever made to work with enums, and I very much regret that.

In general, implicit conversions have no business in template constraints. Obviously, there are exceptions to that, but in general, there will be fewer bugs if the conversions are done explicitly by the code instantiating the template. The reason that it's done in Phobos as much as it is is primarily because of code that was originally not generic which was later templatized (often because it took string and was changed to work on multiple string types or to work on general ranges of characters). And in most cases where we've tried to templatize functions without breaking code, we've had problems because of the implicit conversions that worked before. std.traits.isConvertibleToString is one such abomination which came out of that (its use usually results in code that slices local variables and escapes them, which is really bad). IIRC, that was done by Walter, and if he's making mistakes like that with regards to implicit conversions and templated code, what do you think the average D programmer is doing?

The main reason for bringing up popFront and enums is to show that enums with a base type of string are not actually strings, and treating them as if they were causes serious problems. There are of course places where that sub-typing results in implicit conversions, but templates do not work that way, and trying to force it is very problematic. The proliferation of template constraint and static if complexity that Andrei is complaining about with regards to stuff like format is the result of that, and it's the kind of code that's very hard to get right. Simply not trying to support those implicit conversions with templated functions _significantly_ reduces the complexity of such code with the only cost being that the code instantiating the template will have to use cast(string) on the enum value.

- Jonathan M Davis
May 12 2021
On 5/12/21 6:36 AM, Jonathan M Davis wrote:
> Having isSomeString accept types that implicitly converted to string would be a disaster.

Sadly that's exactly what StringTypeOf does: https://run.dlang.io/is/8xqPKr

We should eliminate all uses of StringTypeOf from phobos.
May 12 2021
On Tuesday, 11 May 2021 at 16:44:03 UTC, Andrei Alexandrescu wrote:
> By the same token you can't make up your own definition of what subtyping is and isn't. Value types and reference types are well-trodden ground. You can't just claim new terminology and then prove your own point by using it.

I simply removed an assumption that isn't relevant to the case I'm making, namely whether you consider ref string to be a type or not, because it doesn't affect the conclusion and therefore isn't a debate worth getting into.

You made the point that SomeEnumString cannot be considered a subtype of string because things start breaking when it is passed by ref, and I retort that the exact same things break in the exact same way for subtypes, making your argument moot.

You say "B is not a subtype of A because it exhibits behavior X when passed by ref"

I say "D is a known subtype of C, and it also exhibits behavior X when passed by ref, therefore X cannot be used as a justification that B isn't a subtype of A"

We can argue to no end about what is the right definition that should be used for X, but it really doesn't change the overall point that is being made.
May 11 2021
On Tuesday, 11 May 2021 at 12:14:42 UTC, deadalnix wrote:
> On Tuesday, 11 May 2021 at 12:05:18 UTC, Andrei Alexandrescu wrote:
>>> I'm not sure what the problem is here. Do you have a concrete example?
>> Of course. A range must implement popFront with the signature:
>>
>>     void popFront(ref SomeEnumString s) { ... please fill in the implementation ... }
> That must be a type error, this is a feature, not a bug. This is not expected to work.

I realize that this requires further explanation.

The fact that B is a subtype of A doesn't imply that a type constructed from B is a subtype of that same construction using A. For instance, void function(A) would be a subtype of void function(B), the relation being reversed in that example.

In your example, you are constructing a ref SomeEnumString and expecting it to be a subtype of string (or maybe ref string), but both are incorrect assumptions. This is because you can execute operations that require covariance as well as operations that require contravariance on a ref; therefore, it needs to be exactly the same type.

This is hardly an exceptional situation. It also happens when taking an array: B being a subtype of A doesn't mean that B[] is a subtype of A[]. Interestingly, it is the case for const ref, or const arrays, which is where the push toward handling const ref differently comes from.

In any case, format is not expected to modify the pattern it takes as an input. In fact, it is a god damn compile time parameter, it is not mutable to begin with. It is therefore expected that this works.
May 11 2021
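The claim that a subtype relation between B and A does not carry over to mutable references can be illustrated with C++ pointers as a stand-in for D's ref and arrays. This is a sketch under that assumption only; the three function names are hypothetical, and C++ differs from D in the const cases mentioned above, so those are not shown.

```cpp
#include <cassert>
#include <type_traits>

struct A {};
struct B : A {};

// B* converts to A*: a B can be read wherever an A is expected.
constexpr bool pointerConverts() {
    return std::is_convertible<B*, A*>::value;
}

// B** does not convert to A**: through an A** one could store a plain A*
// into a slot that is required to hold a B*.
constexpr bool pointerToPointerConverts() {
    return std::is_convertible<B**, A**>::value;
}

// The same holds for a mutable reference to the pointer.
constexpr bool refToPointerConverts() {
    return std::is_convertible<B*&, A*&>::value;
}

static_assert(pointerConverts(), "read-only use of the subtype is accepted");
static_assert(!pointerToPointerConverts(), "a writable slot must match exactly");
```

A writable slot is both read from and written to, so neither direction of conversion is safe and the types have to match exactly, which is the situation a popFront(ref SomeEnumString) signature runs into.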
On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
>> Popping the head out of an enum value ought to be a string, not that enum's value. I don't really see where the problem is here, this is subtyping 101.
> So you have a range r of type T. You call r.popFront(). Obviously the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan? No, this is not subtyping 101.

This feels a bit like the real problem might be in the conflation of the container (the enum or the string) and the range?

Cf. the way this is handled in Rust, where there is a clear distinction between a container, versus an iterator over that container: https://doc.rust-lang.org/rust-by-example/flow_control/for.html

Note also the different ways that the iterator can be generated: either using a reference to the container itself, or by moving the container into the iterator so the container itself is consumed by the iteration.
May 10 2021
On Monday, 10 May 2021 at 17:09:37 UTC, Joseph Rushton Wakeling wrote:
> Cf. the way this is handled in Rust, where there is a clear distinction between a container, versus an iterator over that container:

That is true for C++ and Python as well. C++ has begin(object)/end(object) and Python has iter(object).
May 10 2021
On 5/10/21 1:09 PM, Joseph Rushton Wakeling wrote:
> On Monday, 10 May 2021 at 04:21:34 UTC, Andrei Alexandrescu wrote:
>>> Popping the head out of an enum value ought to be a string, not that enum's value. I don't really see where the problem is here, this is subtyping 101.
>> So you have a range r of type T. You call r.popFront(). Obviously the type of r should stay the same because in D variables don't change type. So... what gives, young Padawan? No, this is not subtyping 101.
> This feels a bit like the real problem might be in the conflation of the container (the enum or the string) and the range? Cf. the way this is handled in Rust, where there is a clear distinction between a container, versus an iterator over that container: https://doc.rust-lang.org/rust-by-example/flow_control/for.html Note also the different ways that the iterator can be generated: either using a reference to the container itself, or by moving the container into the iterator so the container itself is consumed by the iteration.

True, D has only "orphan" ranges, no containers. std.container is not working out and with current D technology we can't define containers that work with safe/pure/nogc at the same time (two out of three we can).

If you consider the enum string value a container and the string extracted from it a range of that container, I think that would be a valid way to look at the matter.
May 11 2021
On Tuesday, 11 May 2021 at 13:41:53 UTC, Andrei Alexandrescu wrote:
> True, D has only "orphan" ranges, no containers. std.container is not working out and with current D technology we can't define containers that work with safe/pure/nogc at the same time (two out of three we can).

How much value does pure have here anyway? Typical container usage involves allocating from the global (!) heap, which arguably *should* be impure, hacks like `pureMalloc` notwithstanding.
May 11 2021
On 11.05.21 16:38, Paul Backus wrote:
> allocating from the global (!) heap, which arguably *should* be impure

I think this is confusing different levels of abstraction. What should be impure is accessing memory addresses as integers.
May 11 2021
I think it makes possible sense to require either wrappers that clarify intent, or always treat enums the same way (as an enum). I think Phobos *mostly* does the latter. Erroring for ambiguity might be more disruptive than it's worth.
May 11 2021