digitalmars.D - Algorithms should be free from rich types
- Ali Çehreli (48/48) Jun 27 2023 My mind is not fully clear on this topic yet but some related things
- H. S. Teoh (84/129) Jun 27 2023 I can't resist me a Walter quote here:
- FeepingCreature (12/39) Jun 28 2023 I like this approach:
- Ali Çehreli (3/5) Jun 28 2023 I love it. And I actually tried but no, D does not have this yet. :D
- Richard (Rikki) Andrew Cattermole (32/32) Jun 28 2023 Oh how you dare me.
- Adam D Ruppe (5/6) Jun 28 2023 just do
- bachmeier (2/13) Jun 28 2023 This would be a good change to the language.
- Cecil Ward (2/17) Jun 28 2023 I’m not sure, but I’m thinking ‘yes’.
- Steven Schveighoffer (22/35) Jun 29 2023 ```d
- Steven Schveighoffer (5/26) Jun 29 2023 Oh wait, the `__traits(getMember)` trick doesn't work on member
- Max Samukha (4/16) Jun 28 2023 That's some poorly designed library (Phobos?). A decently
- Ali Çehreli (7/8) Jun 28 2023 Not in the D world at all.
- Hipreme (13/21) Jun 28 2023 I have had a rant with `private` since the time I used LibGDX
- bachmeier (3/24) Jun 28 2023 [Rich
- Ali Çehreli (29/31) Jun 28 2023 Hickey](https://harfangk.github.io/2017/12/08/rich-hickey-interview-from...
- Atila Neves (8/11) Jun 29 2023 I have lost count of how many times my life has been made
- Jonathan M Davis (27/39) Jun 29 2023 Yeah. As with many things, I think that it primarily comes down to good ...
- Steven Schveighoffer (8/23) Jun 29 2023 private is good for the library writer.
- H. S. Teoh (26/43) Jun 29 2023 [...]
- Atila Neves (7/37) Jun 30 2023 API design is indeed hard. Which makes it all the more imperative
- bachmeier (10/17) Jun 30 2023 Private is more like locking everyone else's doors for their own
- monkyyy (3/8) Jun 30 2023 Why do people make arguments about data ownership at all?
- Timon Gehr (5/15) Jul 03 2023 That's why functions are not making the arguments. API design is a
- H. S. Teoh (67/84) Jun 30 2023 The thing is, both of the above are true.
- bachmeier (27/43) Jun 30 2023 That can never happen if you have to explicitly override
- Meta (3/12) Jun 30 2023 IMO private is extremely important for maintaining the internal
- Dom DiSc (12/14) Jul 01 2023 Yes. And this is pretty much the only reason to use private.
- Dukc (4/7) Jul 02 2023 You do have it. `__traits(getMember, /+...+/)` as others have
- Atila Neves (15/30) Jul 03 2023 I don't see how - it only applies to your own code, adding
- Steven Schveighoffer (19/31) Jul 03 2023 That is the issue. For instance, if you do:
- H. S. Teoh (33/43) Jul 03 2023 [...]
- Steven Schveighoffer (22/57) Jul 03 2023 But that's the thing, there are parts that *simply must be private*. No
- H. S. Teoh (40/64) Jul 03 2023 We're actually agreeing with each other, y'know. :-D
- claptrap (5/13) Jul 03 2023 the correct usage of "ideal" is..
- Steven Schveighoffer (30/86) Jul 03 2023 Yeah kind of. It's just that there are 2 types of privacy labeling,
- H. S. Teoh (73/135) Jul 05 2023 Depends. D is flexible enough that public fields can be replaced with
- Dukc (18/34) Jul 02 2023 The language-agnostic answer is to patch the library yourself to
My mind is not fully clear on this topic yet but some related things have been brewing in me for years.

First, an aside: You may remember my minor complaint about 'private' during a DConf presentation years ago. Today, I feel even stronger that disallowing access to parts of software "just because" of good design is a mistake. I've seen multiple examples of this in professional life where a developer uses 'private' only because it is "of course" better to do so. (The Turkish word "işgüzar" and the German word "verschlimmbessern" describe the situation pretty well for me but the English language lacks such a word.)

To give an example from D's ecosystem, the D runtime's garbage collector statistics object used to be 'private'. (I think there is an interface for it now.) What an inconvenience it was to copy/paste that type's definition from the runtime to user code, get the compiled symbol of the object from the library, and pointer cast it to be able to access the members! A 'static assert' attempts to protect the project from changes to that type...

The idea of 'private' should be to just give the developer freedom to change the implementation in the future. It should not impede use cases that people come up with. That can be achieved practically with an underscore: Make everything 'public' and name your implementation details with an underscore. People who need them will surely know they are implementation details that can change in the future but they will be happy: They will get things done.

Ok, that rant is over.

The main topic here is about the harm caused by rich types surrounding algorithms. Let's say I am interested in using an open source algorithm that works with a memory area. (Not related to D.) We all know that a memory area can be described by a fat pointer like D's slices. So, that is what the algorithm should take.

Unfortunately, the poor little algorithm is not free to be used: It is written to work with a custom type of that library; let's call it MySlice, which is produced by MyMemoryMappedFile, which is produced by MyFile, which is initialized only by types like MyFilePath. (I may have gotten the relationships wrong there.)

But my data is already in a memory area that I own! How can I call that algorithm? Should I write it to a file first and then use those rich types to access the algorithm? That should not be necessary...

Of course I understand the benefits of all those types but the core algorithm should be as free as possible. So, this is simply wrong. I think us, software developers, have been on the wrong path. Our task should primarily be about getting things done first.

I could work with those types if they had virtual interfaces. But no. They are un-subtypable C++ 'class'es. I think it could also work if the algorithm was templatized; but again, no...

Hey! Thank you! I feel better already. :)

Ali
Jun 27 2023
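To make the point of the post above concrete, here is a minimal sketch, in D, of keeping the core algorithm on a plain slice and layering the rich type on top of it. The names `countByte` and this `MySlice` are hypothetical stand-ins invented for illustration; they are not the API of the library being discussed. The underscore-prefixed public field follows the naming convention suggested in the post.

```d
import std.stdio : writeln;

// Core algorithm: depends on nothing but a slice of bytes.
size_t countByte(const(ubyte)[] data, ubyte needle) pure nothrow @nogc @safe
{
    size_t n = 0;
    foreach (b; data)
        if (b == needle)
            ++n;
    return n;
}

// Hypothetical rich type. The field is public on purpose; the leading
// underscore marks it as an implementation detail that may change.
struct MySlice
{
    const(ubyte)[] _bytes;
}

// Thin adapter: the rich type forwards to the free algorithm,
// instead of the algorithm requiring the rich type.
size_t countByte(MySlice s, ubyte needle) pure nothrow @nogc @safe
{
    return countByte(s._bytes, needle);
}

void main()
{
    ubyte[] mine = [1, 2, 2, 3];           // memory the caller already owns
    writeln(countByte(mine, 2));           // direct call, no rich types needed
    writeln(countByte(MySlice(mine), 2));  // the rich path still works
}
```

With this layering, a caller whose data is already in memory never has to go through a file just to reach the algorithm.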
On Tue, Jun 27, 2023 at 02:53:59PM -0700, Ali Çehreli via Digitalmars-d wrote:
[...]
> First, an aside: You may remember my minor complaint about 'private' during a DConf presentation years ago. Today, I feel even stronger that disallowing access to parts of software "just because" of good design is a mistake. I've seen multiple examples of this in professional life where a developer uses 'private' only because it is "of course" better to do so. (The Turkish word "işgüzar" and the German word "verschlimmbessern" describe the situation pretty well for me but the English language lacks such a word.)

I can't resist me a Walter quote here:

I've been around long enough to have seen an endless parade of magic new techniques du jour, most of which purport to remove the necessity of thought about your programming problem. In the end they wind up contributing one or two pieces to the collective wisdom, and fade away in the rearview mirror.
-- Walter Bright

When you start doing something with the code because that's what everybody else does, or because it's what everyone else says is "the Right Thing(tm)", then it's just cargo-culting, which inevitably leads to problems down the road.

> To give an example from D's ecosystem, the D runtime's garbage collector statistics object used to be 'private'. (I think there is an interface for it now.) What an inconvenience it was to copy/paste that type's definition from the runtime to user code, get the compiled symbol of the object from the library, and pointer cast it to be able to access the members! A 'static assert' attempts to protect the project from changes to that type...

Thing is, things like these usually come from temporary hacks in the code that the original coder didn't want to set in stone, but that end up staying put because of inertia and becoming de facto set in stone.

> The idea of 'private' should be to just give the developer freedom to change the implementation in the future. It should not impede use cases that people come up with. That can be achieved practically with an underscore: Make everything 'public' and name your implementation details with an underscore. People who need them will surely know they are implementation details that can change in the future but they will be happy: They will get things done.

IOW, empower the user instead of straitjacketing them. My favorite programming modus operandi. Along the same lines as my philosophy of "everything should be a library, main() is just a convenient (thin) interface to access the library API".

[...]
> The main topic here is about the harm caused by rich types surrounding algorithms. Let's say I am interested in using an open source algorithm that works with a memory area. (Not related to D.) We all know that a memory area can be described by a fat pointer like D's slices. So, that is what the algorithm should take. Unfortunately, the poor little algorithm is not free to be used: It is written to work with a custom type of that library; let's call it MySlice, which is produced by MyMemoryMappedFile, which is produced by MyFile, which is initialized only by types like MyFilePath. (I may have gotten the relationships wrong there.)

That's a sign of poorly-factored code. The logically-separate parts of the code are not properly separated out, causing them to be dependent on each other where they technically should not be. Doing this right is actually a lot harder than it looks; it often requires significant amounts of refactoring after your initial implementation, because until you write the thing out in code, it isn't always clear which parts are actually dependent and which parts can be separated.

Idioms like pipeline programming with ranges help to identify independent pieces of the logic, and abstractions like the range API help you actually separate out the pieces in a clean way. Without a unifying common API like ranges, it's pretty tough to write code in composable pieces that can be freely mixed-and-matched with each other.

https://wiki.dlang.org/Component_programming_with_ranges

Well, obviously you already know about this article, but one of my motivations for writing that article was precisely what you describe above.

> But my data is already in a memory area that I own! How can I call that algorithm? Should I write it to a file first and then use those rich types to access the algorithm? That should not be necessary... Of course I understand the benefits of all those types but the core algorithm should be as free as possible. So, this is simply wrong. I think us, software developers, have been on the wrong path. Our task should primarily be about getting things done first.

Over the years, I've been dreaming about the ideal situation where there would be libraries of algorithms that are not tied to a specific implementation (i.e., bound to concrete types and parameter values), but are written in a form that encapsulates only its core logic. You'd then pull in the algorithm by specifying which concrete type(s) to bind its various parts to, and it'd Just Work(tm). That's the way things should have been from the beginning.

But the situation today is far from that ideal: you have libraries that solve some particular programming problem X, but to use the library's solution you need to use also Y, Z, and W that the author of that library happened to choose. For instance, the FreeType library implements rasterization algorithms, but you can't access those algorithms directly. You have to use the library API, which abstracts away file handling, memory management, image type, etc. In order to cater to different user needs, an entire complicated API is invented to allow the user to specify certain parameters the authors deem tweakable, while an elaborate scheme is designed to hide the rest of the information away. You can't effectively use the rasterization algorithm without also using all of these other peripheral types; and when you need to interface FreeType with another library that uses other, different concrete types, you end up having to write lots of shunt code whose sole purpose is to bridge between incompatible types that actually do equivalent things.

> I could work with those types if they had virtual interfaces. But no. They are un-subtypable C++ 'class'es. I think it could also work if the algorithm was templatized; but again, no...
[...]

In cases like this, I often get really tempted to copy-n-paste the code and templatize it myself. :-D

Of course, in practice that's usually impractical, so the next best thing is to use D's compile-time introspection capabilities to autogenerate boilerplate shunt code to work around API infelicities in the target library, and export a nicer API on the D side. :-D

Not always possible, of course, like in your case, where you'd have to either copy-n-paste code and do unsafe casts, or live with infelicities like writing stuff to a file and opening it via the official API.

(I had to do something similar once in my day job, interfacing with a grossly over-engineered C++ framework that nobody fully understood nor wanted anything to do with if they could help it -- I ended up having to write a hack where a single function call involved 7 layers of abstraction, one of which involved writing a struct to a temporary file on one side of an RPC call and having the other side (a daemon process) read from the file and cast it back to the struct. The result was the stuff of nightmares that, to everyone's great relief, was phased out a couple of releases later. We relished every moment of typing `\rm -rf` on that entire old codebase after its replacement became fully functional.)

T

--
2+2=4. 2*2=4. 2^2=4. Therefore, +, *, and ^ are the same operation.
Jun 27 2023
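The idea in the post above, an algorithm that carries only its core logic and is bound to concrete types at the call site, can be sketched with a D range template. The function `sumOfSquares` is a hypothetical example chosen for brevity, not something taken from Phobos or from the linked article; only `isInputRange`, `ElementType`, `iota` and `filter` are existing Phobos symbols.

```d
import std.algorithm.iteration : filter;
import std.range : iota;
import std.range.primitives : ElementType, isInputRange;

// Core logic only: accepts any input range whose elements convert to long.
// Nothing here knows about files, allocators or container types.
long sumOfSquares(R)(R values)
if (isInputRange!R && is(ElementType!R : long))
{
    long total = 0;
    foreach (v; values)
    {
        const long x = v;
        total += x * x;
    }
    return total;
}

void main()
{
    // Bound to a built-in array...
    assert(sumOfSquares([1, 2, 3]) == 14);
    // ...to a lazily generated range...
    assert(sumOfSquares(iota(1, 4)) == 14);
    // ...and to a filtering pipeline, without touching the algorithm.
    assert(sumOfSquares(iota(1, 7).filter!(n => n % 2 == 0)) == 56);
}
```

The template constraint is the whole "API": any type that satisfies it can be plugged in, so no shunt code is needed between the caller's types and the algorithm.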
On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
> My mind is not fully clear on this topic yet but some related things have been brewing in me for years.
>
> First, an aside: You may remember my minor complaint about 'private' during a DConf presentation years ago. Today, I feel even stronger that disallowing access to parts of software "just because" of good design is a mistake. I've seen multiple examples of this in professional life where a developer uses 'private' only because it is "of course" better to do so. (The Turkish word "işgüzar" and the German word "verschlimmbessern" describe the situation pretty well for me but the English language lacks such a word.)
>
> To give an example from D's ecosystem, the D runtime's garbage collector statistics object used to be 'private'. (I think there is an interface for it now.) What an inconvenience it was to copy/paste that type's definition from the runtime to user code, get the compiled symbol of the object from the library, and pointer cast it to be able to access the members! A 'static assert' attempts to protect the project from changes to that type...
>
> The idea of 'private' should be to just give the developer freedom to change the implementation in the future. It should not impede use cases that people come up with. That can be achieved practically with an underscore: Make everything 'public' and name your implementation details with an underscore. People who need them will surely know they are implementation details that can change in the future but they will be happy: They will get things done.

I like this approach:

```
class C { private int i; }
...
void main() @system {
    auto c = new C;
    c.private.i = 5;
}
```
Jun 28 2023
On 6/28/23 01:00, FeepingCreature wrote:
> auto c = new C;
> c.private.i = 5;

I love it. And I actually tried but no, D does not have this yet. :D

Ali
Jun 28 2023
Oh how you dare me.

--- app.d
module app;
import foo;

void main() {
    Foo foo = new Foo;
    foo.privateGet!"i" = 2;
    foo.say();
}

ref privateGet(string name, From)(ref From from) {
    static foreach(I; 0 .. from.tupleof.length) {
        {
            enum Name = __traits(identifier, from.tupleof[I]);
            static if (Name == name) {
                return from.tupleof[I];
            }
        }
    }
    assert(0);
}

--- foo.d
module foo;

class Foo {
    void say() {
        import std.stdio;
        writeln(i);
    }

private:
    int i;
    bool b;
}
Jun 28 2023
On Wednesday, 28 June 2023 at 17:06:43 UTC, Richard (Rikki) Andrew Cattermole wrote:
> Oh how you dare me.

just do

__traits(getMember, foo, "i") = 2;

reflection bypasses private
Jun 28 2023
On Wednesday, 28 June 2023 at 08:00:23 UTC, FeepingCreature wrote:
> I like this approach:
> ```
> class C { private int i; }
> ...
> void main() @system {
>     auto c = new C;
>     c.private.i = 5;
> }
> ```

This would be a good change to the language.
Jun 28 2023
On Wednesday, 28 June 2023 at 17:40:25 UTC, bachmeier wrote:
> On Wednesday, 28 June 2023 at 08:00:23 UTC, FeepingCreature wrote:
>> I like this approach:
>> ```
>> class C { private int i; }
>> ...
>> void main() @system {
>>     auto c = new C;
>>     c.private.i = 5;
>> }
>> ```
> This would be a good change to the language.

I’m not sure, but I’m thinking ‘yes’.
Jun 28 2023
On 6/28/23 4:00 AM, FeepingCreature wrote:
> I like this approach:
> ```
> class C { private int i; }
> ...
> void main() @system {
>     auto c = new C;
>     c.private.i = 5;
> }
> ```

```d
auto usePrivate(T)(ref T thing) @system
{
    static struct GetMeThePrivateStuff {
        @disable this(this); // shouldn't be copied about, meant to be a temporary access
        private T* _thing; // "private" lol
        auto ref opDispatch(string s, Args...)(Args args) {
            static if(Args.length == 0)
                return __traits(getMember, *_thing, s);
            else
                return __traits(getMember, *_thing, s)(args);
        }
    }
    return GetMeThePrivateStuff(&thing);
}
```

Yeah, yeah, it needs work. But you get the idea. D is all-powerful.

-Steve
Jun 29 2023
On 6/29/23 10:15 PM, Steven Schveighoffer wrote:
> ```d
> auto usePrivate(T)(ref T thing) @system
> {
>     static struct GetMeThePrivateStuff {
>         @disable this(this); // shouldn't be copied about, meant to be a temporary access
>         private T* _thing; // "private" lol
>         auto ref opDispatch(string s, Args...)(Args args) {
>             static if(Args.length == 0)
>                 return __traits(getMember, *_thing, s);
>             else
>                 return __traits(getMember, *_thing, s)(args);
>         }
>     }
>     return GetMeThePrivateStuff(&thing);
> }
> ```

Oh wait, the `__traits(getMember)` trick doesn't work on member functions, interesting...

So maybe half-powerful ;)

-Steve
Jun 29 2023
On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
> My mind is not fully clear on this topic yet but some related things have been brewing in me for years.
>
> Unfortunately, the poor little algorithm is not free to be used: It is written to work with a custom type of that library; let's call it MySlice, which is produced by MyMemoryMappedFile, which is produced by MyFile, which is initialized only by types like MyFilePath. (I may have gotten the relationships wrong there.) But my data is already in a memory area that I own! How can I call that algorithm? Should I write it to a file first and then use those rich types to access the algorithm? That should not be necessary...

That's some poorly designed library (Phobos?). A decently designed one would at least allow you to construct a MySlice instance from a (pointer, length) pair.
Jun 28 2023
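As a rough illustration of the suggestion above, this is what a rich type that can still be constructed from a (pointer, length) pair might look like in D. The type `MySlice` and its `fromRaw` function are hypothetical, made up for this sketch; the library under discussion is a C++ one and its real interface is not shown in the thread.

```d
// Hypothetical rich type that still accepts memory the caller already owns.
struct MySlice
{
    private const(ubyte)[] _data;

    // Wrap an existing D slice: no copy, no file involved.
    this(const(ubyte)[] data) pure nothrow @nogc @safe
    {
        _data = data;
    }

    // Wrap a raw (pointer, length) pair, e.g. memory obtained from C code.
    // Marked @system because the compiler cannot verify that the pair is valid.
    static MySlice fromRaw(const(ubyte)* ptr, size_t length) pure nothrow @nogc @system
    {
        return MySlice(ptr[0 .. length]);
    }

    size_t length() const pure nothrow @nogc @safe { return _data.length; }
    ubyte opIndex(size_t i) const pure nothrow @nogc @safe { return _data[i]; }
}

void main()
{
    ubyte[4] storage = [10, 20, 30, 40];

    auto a = MySlice(storage[]);                            // from a slice
    auto b = MySlice.fromRaw(storage.ptr, storage.length);  // from pointer + length

    assert(a.length == 4 && b.length == 4);
    assert(b[2] == 30);
}
```

Either entry point lets data that is already in memory reach the rest of the library without a detour through a file.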
On 6/28/23 02:25, Max Samukha wrote:
> That's some poorly designed library (Phobos?).

Not in the D world at all.

Ironically, I think the library's design is actually pretty good. And that's why I was motivated to write in the first place: Everything was done according to industry best practices but in the end all of that reduces the usability of the library.

Ali
Jun 28 2023
On Wednesday, 28 June 2023 at 17:00:44 UTC, Ali Çehreli wrote:
> On 6/28/23 02:25, Max Samukha wrote:
>> That's some poorly designed library (Phobos?).
> Not in the D world at all. Ironically, I think the library's design is actually pretty good. And that's why I was motivated to write in the first place: Everything was done according to industry best practices but in the end all of that reduces the usability of the library. Ali

I have had a rant with `private` since the time I used the LibGDX Particle System. I wasn't able to extend its particle system to add collision to it. Why? Because the particles were `private`. Since then, I have never used `private` without a very, very good reason to do so; the only place I use it right now is for intermediate processes on a full process.

People in industry know nothing about how to use `protected`. Protected IMO should be the industry standard. I have worked in a codebase which has been under refactoring for at least 3 years, and there are so many changes where `private` stops being used after some time. Why is that? Because programmers should not fear themselves most of the time.
Jun 28 2023
On Wednesday, 28 June 2023 at 17:12:17 UTC, Hipreme wrote:
> I have had a rant with `private` since the time I used the LibGDX Particle System. I wasn't able to extend its particle system to add collision to it. Why? Because the particles were `private`. Since then, I have never used `private` without a very, very good reason to do so; the only place I use it right now is for intermediate processes on a full process. People in industry know nothing about how to use `protected`. Protected IMO should be the industry standard. I have worked in a codebase which has been under refactoring for at least 3 years, and there are so many changes where `private` stops being used after some time. Why is that? Because programmers should not fear themselves most of the time.

[Rich Hickey](https://harfangk.github.io/2017/12/08/rich-hickey-interview-from-codequarterly.html):

"At some point though, someone is going to need to have access to the data. And if you have a notion of “private”, you need corresponding notions of privilege and trust. And that adds a whole ton of complexity and little value, creates rigidity in a system, and often forces things to live in places they shouldn’t."

"If people don’t have the sensibilities to desire to program to abstractions and to be wary of marrying implementation details, then they are never going to be good programmers."
Jun 28 2023
On 6/28/23 10:38, bachmeier wrote:
> [Rich Hickey](https://harfangk.github.io/2017/12/08/rich-hickey-interview-from-codequarterly.html):

Amen! I've just finished reading most of it (skipped some Clojure specific parts). The following part is worth quoting as well:

"When we drop down to the algorithm level, I think OO can seriously thwart reuse. In particular, the use of objects to represent simple informational data is almost criminal in its generation of per-piece-of-information micro-languages, i.e. the class methods, versus far more powerful, declarative, and generic methods like relational algebra. Inventing a class with its own interface to hold a piece of information is like inventing a new language to write every short story. This is anti-reuse, and, I think, results in an explosion of code in typical OO applications."

One more quote both to stay unkind to my ex-favorite language and to relate to our ever-present discussions on the GC's appropriateness in libraries:

"The complexity [of C++] is stunning. It failed as the library language it purported to be, due to lack of GC, in my opinion, and static typing failed to keep large OO systems from becoming wretched balls of mud. Large mutable object graphs are the sore point, and const is inadequate to address it. Once C++’s performance advantage eroded or became less important, you had to wonder: why bother? I can’t imagine working in a language without GC today, except in very special circumstances."

Ali
Jun 28 2023
On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
> My mind is not fully clear on this topic yet but some related things have been brewing in me for years. [...]

I have lost count of how many times my life has been made difficult by the lack of `private`. I have also lost count of how many times my life has been made easier by the fact that I ruthlessly declare everything `private` unless it has good reason not to be.

Ease of refactoring = good, ergo `private` = good and should be the default.
Jun 29 2023
On Thursday, June 29, 2023 8:44:05 AM MDT Atila Neves via Digitalmars-d wrote:
> On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
>> My mind is not fully clear on this topic yet but some related things have been brewing in me for years. [...]
> I have lost count of how many times my life has been made difficult by the lack of `private`. I have also lost count of how many times my life has been made easier by the fact that I ruthlessly declare everything `private` unless it has good reason not to be. Ease of refactoring = good, ergo `private` = good and should be the default.

Yeah. As with many things, I think that it primarily comes down to good API design (which can be hard). private prevents implementation details from being mucked with, which can be a lifesaver when refactoring and can be a big help with testing and ensuring that things work as expected when other folks use your code. On the other hand, if you fail to make it so that the API provides what your users need, then it could easily be the case that some stuff that should have been available is locked behind private, making their lives harder (or even impossible, depending on what they're trying to do).

Similarly, if you actually plan your API around generic types, then it's much easier for folks to make it work with their own types, but it's not always obvious when you should be doing that vs designing an API around more specific types - and it's often the case that code goes from using more specific types to being more flexible as it matures (though that's harder to do in cases where you can't reasonably make sure that all user code gets updated when you make changes, which can make fixing such issues in open source code harder than in company code).

So, I'm very much in favor of private being the default, but programmers need to be aware of API issues that can come from being too specific with APIs and locking away stuff that users may actually need. Experience can help a lot with that, though it isn't always easy, and there are plenty of folks out there who just put something together that "works" and leave folks to deal with the mess when something better thought out would have been far more useful. Actively trying to come up with good APIs instead of something that just works can go a long way.

- Jonathan M Davis
Jun 29 2023
On 6/29/23 10:44 AM, Atila Neves wrote:
> On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
>> My mind is not fully clear on this topic yet but some related things have been brewing in me for years. [...]
> I have lost count of how many times my life has been made difficult by the lack of `private`. I have also lost count of how many times my life has been made easier by the fact that I ruthlessly declare everything `private` unless it has good reason not to be. Ease of refactoring = good, ergo `private` = good and should be the default.

private is good for the library writer.

arbitrary access to private is good for the user/hacker.

Honestly though, since private data is accessible through an escape hatch hack (i.e. `__traits(getMember)`), and the library writer can just say "whatevs, you broke it, you bought it", I think we are in a reasonable space.

-Steve
Jun 29 2023
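The "you broke it, you bought it" arrangement can be made a little safer on the user's side by pairing the escape hatch with a compile-time guard, in the spirit of the 'static assert' mentioned in the opening post. The sketch below is only an illustration under stated assumptions: `Foo` is a stand-in for a library type, `peekI` is a hypothetical helper, and the access goes through `.tupleof` (the technique from the `privateGet` post) rather than `__traits(getMember)`. Because `private` in D is scoped to the module, this single-file version would compile even without the trick; in a real case `Foo` would live in another module.

```d
import std.traits : FieldNameTuple, Fields;

// Stand-in for a library type. It is defined here only to keep the
// sketch self-contained; normally it would come from another module.
class Foo
{
    private int i;
    private bool b;

    int value() const { return i; }
}

// Compile-time guard: stop the build if the layout we rely on changes,
// instead of silently reading the wrong field.
alias fooNames = FieldNameTuple!Foo;
alias FooTypes = Fields!Foo;
static assert(fooNames.length == 2 && fooNames[0] == "i" && is(FooTypes[0] == int),
    "Foo's private layout changed; revisit peekI");

// Escape hatch: a reference to the first field, obtained through .tupleof.
// Marked @system to signal that the caller takes responsibility.
ref int peekI(Foo foo) @system
{
    return foo.tupleof[0];
}

void main()
{
    auto foo = new Foo;
    peekI(foo) = 2;
    assert(foo.value == 2);
}
```

If the library author later renames or reorders the fields, the `static assert` turns a silent misbehavior into a build error with a clear message.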
On Thu, Jun 29, 2023 at 05:54:28PM -0600, Jonathan M Davis via Digitalmars-d wrote:
> On Thursday, June 29, 2023 8:44:05 AM MDT Atila Neves via Digitalmars-d wrote:
[...]
>> I have lost count of how many times my life has been made difficult by the lack of `private`. I have also lost count of how many times my life has been made easier by the fact that I ruthlessly declare everything `private` unless it has good reason not to be. Ease of refactoring = good, ergo `private` = good and should be the default.
>
> Yeah. As with many things, I think that it primarily comes down to good API design (which can be hard).
[...]

True. It comes down to good API design. Which, as you say, is very hard, probably harder than most people realize. It's easy to slap an ad hoc API onto your library functions, but over time it will prove inadequate for user needs and they will feel frustrated over why certain things are locked behind private.

IME, it takes several iterations of actually using a particular API before it becomes clear where the friction points are and what are possible alternative designs that may work better for user code. (And also, which parts of the API are perhaps needlessly complex and could probably be simplified.) The problem is that if you have actual users during this period of time, they will start writing code that depends on the current API, which obligates you to support an inferior API even after a better design emerges.

> Similarly, if you actually plan your API around generic types, then it's much easier for folks to make it work with their own types, but it's not always obvious when you should be doing that vs designing an API around more specific types
[...]

Yeah, there's definitely a danger of premature generalization. Before you have experience designing a certain library, it's hard to predict what's worth generalizing and what isn't. But it's hard to gain experience without people actually using your library, which then binds you to the non-optimal initial API. So it's a catch-22.

API design is hard.

T

--
What do you mean the Internet isn't filled with subliminal messages? What about all those buttons marked "submit"??
Jun 29 2023
On Friday, 30 June 2023 at 02:21:42 UTC, H. S. Teoh wrote:
> On Thu, Jun 29, 2023 at 05:54:28PM -0600, Jonathan M Davis via Digitalmars-d wrote:
> [...][...][...][...] True. It comes down to good API design. Which, as you say, is very hard, probably harder than most people realize. It's easy to slap an ad hoc API onto your library functions, but over time it will prove inadequate for user needs and they will feel frustrated over why certain things are locked behind private. IME, it takes several iterations of actually using a particular API before it becomes clear where the friction points are and what are possible alternative designs that may work better for user code. (And also, which parts of the API are perhaps needlessly complex and could probably be simplified.) The problem is that if you have actual users during this period of time, they will start writing code that depends on the current API, which obligates you to support an inferior API even after a better design emerges.
>
> [...][...] Yeah, there's definitely a danger of premature generalization. Before you have experience designing a certain library, it's hard to predict what's worth generalizing and what isn't. But it's hard to gain experience without people actually using your library, which then binds you to the non-optimal initial API. So it's a catch-22. API design is hard. T

API design is indeed hard. Which makes it all the more imperative to not accidentally design one with implementation details that users downstream start depending on.

That is: API design needs to be a conscious opt-in decision and not "I guess I didn't think about the consequences of leaving the door to my flat open all the time and now there are people camping in my living room".
Jun 30 2023
On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
> API design is indeed hard. Which makes it all the more imperative to not accidentally design one with implementation details that users downstream start depending on. That is: API design needs to be a conscious opt-in decision and not "I guess I didn't think about the consequences of leaving the door to my flat open all the time and now there are people camping in my living room".

Private is more like locking everyone else's doors for their own safety. In the cases that it keeps an intruder out, it was helpful to them. When grandma had to sleep on the sidewalk, not so much.

Many times library authors have prevented me from doing my work because of arbitrarily preventing access to implementation details. I should have the option to override those decisions. If something blows up, or if my code gets broken in the future, it's my fault, because I was the one that made that decision.
Jun 30 2023
On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
> On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
>> I didn't think about the consequences of leaving the door to my flat open all the time
> Private is more like locking everyone else's doors for their own safety.

Why do people make arguments about data ownership at all? Functions aren't people.
Jun 30 2023
On 6/30/23 17:57, monkyyy wrote:
> On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
>> On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
>>> I didn't think about the consequences of leaving the door to my flat open all the time
>>
>> Private is more like locking everyone else's doors for their own safety.
>
> Why do people make arguments about data ownership at all? Functions aren't people.

That's why functions are not making the arguments. API design is a social activity between programmers. Programmers are people. Simple.

Anyway, it's not like private actually prevents you from deliberately accessing things, it just makes clear that that's outside the supported API.
Jul 03 2023
On Fri, Jun 30, 2023 at 02:41:00PM +0000, bachmeier via Digitalmars-d wrote:On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:The thing is, both of the above are true. Private does have its uses: to hide implementation details from unrelated parts of the code so that, especially in a large project with many contributors, you don't end up with accidental dependencies between parts of the code that really shouldn't depend on each other. Hairball dependencies among unrelated modules is a major factor of unmaintainability in large projects, and preventing this goes a long way to reduce long-term maintenance costs. The other side to this, however, is that deciding what should be private and what shouldn't is a hard problem, and most people either can't figure it out, or can't be bothered to put in the effort to get it right, so they slap private on everything, making it hard to reuse their code outside of the narrow confines of how they initially envisioned it. So you end up with an API that covers the most common use cases but not others, which causes a lot of frustration when downstream code wants to do something but can't via the API, so they have to resort to copy-pasta or breaking private. (See: API design is hard.) Most people design APIs around how they envision the module would be (or ought to be) used, at a relatively high level of abstraction, without regard to the core algorithms that would be used to implement this. What we may call a "use-centric API". Contrary to popular belief, this is actually a mistake. It frequently leads to the situation where a useful algorithm that might benefit other parts of the code gets locked behind the private implementation of the module, because it doesn't directly map to the external API. 
This in turn promotes code duplication: if my module also needs some variant of the same algorithm, I have to copy-n-paste it or re-implement it from scratch in my own module -- usually also behind `private`, so the next person that comes along will need to do it again. It actually *reduces* code reuse. It also fosters the desire to break private: I realize that the algorithm is already implemented, so I wish I could break private in order to avoid rewriting it myself. A better approach is an algorithm-centric API design: in the course of implementing a module (or library), identify the core algorithms that solve the main problems that the module/library is trying to solve, and design the API around exposing this algorithm to user code. Then on top of that, add some syntactic sugar that maps this to the high-level usage of the algorithm (the use-centric API). There may still be private parts (internal details of the algorithms that the user really doesn't need to know), but these are confined to things that outside code truly doesn't need to know, not a blanket default that may unintentionally exclude certain unusual, but valid, use cases. There is an important philosophical difference between these two approaches. The first approach tends towards the philosophy of "you have problem X, no problem, hand it over to us (the library), we'll perform the magic to solve it, and we'll give you back the result Y". The method of solution is opaque and hidden from user code. IOW, the hood is welded shut; your only recourse in case of problems is to take it back to the dealer (the library author). The second approach has the philosophy "you have problem X, we (the library) will give you tools A, B, C, that you can use to solve problem X. In addition, we provide you special combo D (syntactic sugar functions) that will solve X the usual way without you having to figure out how to combine A, B, and C in the right way." 
The hood is open and you may fiddle with the things inside if you know what you're doing. But most of the time you won't need to -- the syntactic sugar functions handle the most common use cases for you. The first approach empowers the library writer, the second approach empowers the user. My argument is that the second approach is superior. No abstraction is perfect (otherwise it wouldn't be an abstraction!); there will always be cases where you need to go under the hood and do something the library author didn't envision initially. Give him the tools to do so without breaking encapsulation, instead of forcing him to come back to you for help. T -- Claiming that your operating system is the best in the world because more people use it is like saying McDonalds makes the best food in the world. -- Carl B. ConstantineAPI design is indeed hard. Which makes it all the more imperative to not accidentally design one with implementation details that users downstream start depending on. That is: API design needs to be a conscious opt-in decision and not "I guess I didn't think about the consequences of leaving the door to my flat open all the time and now there are people camping in my living room".Private is more like locking everyone else's doors for their own safety. In the cases that it keeps an intruder out, it was helpful to them. When grandma had to sleep on the sidewalk, not so much. Many times library authors have prevented me from doing my work because of arbitrarily preventing access to implementation details. I should have the option to override those decisions. If something blows up, or if my code gets broken in the future, it's my fault, because I was the one that made that decision.
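To make the "algorithm-centric" design described in this post concrete, here is a minimal sketch in D. The names `tally` and `wordFrequencies` are hypothetical and not taken from any library discussed in the thread: `tally` plays the role of the core algorithm exposed as a public tool, and `wordFrequencies` is the "special combo" sugar built on top of it for the common case.

```d
import std.array : split;
import std.range.primitives : ElementType, isInputRange;

/// Core algorithm, deliberately public: counts how often each element
/// occurs in any input range. (Hypothetical name, for illustration only.)
size_t[ElementType!R] tally(R)(R items)
if (isInputRange!R)
{
    size_t[ElementType!R] counts;
    foreach (item; items)
        counts[item]++; // a missing key starts at 0, then is incremented
    return counts;
}

/// Use-centric sugar for the usual case: word frequencies of a text,
/// splitting on whitespace. It is only a thin wrapper around `tally`.
size_t[string] wordFrequencies(string text)
{
    return tally(text.split);
}
```

A caller with the usual problem writes `wordFrequencies("the cat the hat")`; a caller with an unusual one (counting integers, or tokens produced some other way) can call `tally` directly instead of copying the counting loop out of the library.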
Jun 30 2023
On Friday, 30 June 2023 at 16:33:31 UTC, H. S. Teoh wrote:Private does have its uses: to hide implementation details from unrelated parts of the code so that, especially in a large project with many contributors, you don't end up with accidental dependencies between parts of the code that really shouldn't depend on each other.That can never happen if you have to explicitly override something that's been marked private - it's an intentional dependency.The other side to this, however, is that deciding what should be private and what shouldn't is a hard problem, and most people either can't figure it out, or can't be bothered to put in the effort to get it right, so they slap private on everything, making it hard to reuse their code outside of the narrow confines of how they initially envisioned it.It's worse than that. Saying something is private is used as a substitute for documenting or even commenting the code.So you end up with an API that covers the most common use cases but not others, which causes a lot of frustration when downstream code wants to do something but can't via the API, so they have to resort to copy-pasta or breaking private. (See: API design is hard.)It's hard not because you don't know what others need, but because you're marking stuff private and there's no way for anyone else to override that decision. One of the many examples related to the project I just released is the R shared library. The developers have not exported most of the functionality of the library. So when other developers created the Matrix package (now installed by default) for greatly expanded matrix types and operations, they had to resort to copying and pasting large amounts of C code for no obvious reason. Now there are two copies of all that code floating around, but they're probably out of sync. And as I noted above, private means the code is not documented or commented, so who knows if that hasn't resulted in bugs in hard-to-catch edge cases. 
I agree with the existence of private. In some cases, strictly enforcing privacy is a good thing (though you can't prevent copy and paste). It's difficult to justify the absence of a simple override mechanism. Where it gets really frustrating is when you've invested time getting to 95% of what you need. You're at the point where it almost works, but arbitrary decisions about private mean you'll never be able to achieve 100% of what you need.
Jun 30 2023
On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
> Private is more like locking everyone else's doors for their own safety. In the cases that it keeps an intruder out, it was helpful to them. When grandma had to sleep on the sidewalk, not so much. Many times library authors have prevented me from doing my work because of arbitrarily preventing access to implementation details. I should have the option to override those decisions. If something blows up, or if my code gets broken in the future, it's my fault, because I was the one that made that decision.

IMO private is extremely important for maintaining the internal invariants of a unit of encapsulation.
Jun 30 2023
On Friday, 30 June 2023 at 16:48:39 UTC, Meta wrote:
> IMO private is extremely important for maintaining the internal invariants of a unit of encapsulation.

Yes. And this is pretty much the only reason to use private. You have functions that don't keep the invariants for performance reasons, so you create public functions that call them in the correct order and with the correct parameters to keep the invariants. So private is there to hide unsafe interfaces, to prevent the user of a library from messing things up.

If you want to be able to mess things up, no API will ever be good enough for you: you simply need to take the source code and modify it. And then private won't hinder you; simply remove it.
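The pattern described above (a private helper that skips the invariant, wrapped by a public function that restores it) can be sketched in D roughly as follows. `SortedInts`, `appendUnchecked`, `insert` and `items` are hypothetical names invented for this illustration. Note that D's `private` is scoped to the module, so code in the same file could still reach `data`; the protection only applies to other modules.

```d
import std.algorithm.sorting : isSorted, sort;

struct SortedInts
{
    private int[] data; // intended invariant: always sorted ascending

    // Fast helper that does NOT keep the invariant by itself.
    // Being private, it is not subject to the invariant check below.
    private void appendUnchecked(int x)
    {
        data ~= x;
    }

    // Public entry point: calls the unchecked helper, then restores order.
    void insert(int x)
    {
        appendUnchecked(x);
        sort(data);
    }

    // Read-only view for callers.
    const(int)[] items() const
    {
        return data;
    }

    // Checked around public member functions in non-release builds.
    invariant()
    {
        assert(isSorted(data));
    }
}
```

With this layout, `SortedInts s; s.insert(3); s.insert(1);` leaves `s.items` sorted, while calling `appendUnchecked` from outside the module is rejected by the compiler. Re-sorting on every insert is chosen here for brevity, not efficiency.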
Jul 01 2023
On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
> I should have the option to override those decisions. If something blows up, or if my code gets broken in the future, it's my fault, because I was the one that made that decision.

You do have it: `__traits(getMember, /+...+/)`, as others have mentioned, or some ugly casting trickery. Or you can just patch the library yourself to make the member you want public.
Jul 02 2023
On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
> On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
>> API design is indeed hard. Which makes it all the more imperative to not accidentally design one with implementation details that users downstream start depending on. That is: API design needs to be a conscious opt-in decision and not "I guess I didn't think about the consequences of leaving the door to my flat open all the time and now there are people camping in my living room".
>
> Private is more like locking everyone else's doors for their own safety.

I don't see how - it only applies to your own code, adding private doesn't make someone else's code no longer accessible.

> In the cases that it keeps an intruder out, it was helpful to them. When grandma had to sleep on the sidewalk, not so much.

This is where the analogy breaks down. The whole point of private is to make a conscious choice over what is an implementation detail and what is part of the API. If it's the default, the programmer is nudged towards thinking of a good API instead of it being ad-hoc.

> I should have the option to override those decisions.

As a library author, I don't think you should. It's on me to support usage of private functions that I'm nominally allowed to delete, but not really if someone is going to complain.

> If something blows up, or if my code gets broken in the future, it's my fault, because I was the one that made that decision.

In theory, yes. In practice, yelling. We told people that `in` was in flux and because of that, to not use it. People (including me!) did it anyway. Some of them later complained when we decided what to do with it.
Jul 03 2023
On 7/3/23 3:57 AM, Atila Neves wrote:
> On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
>> I should have the option to override those decisions.
>
> As a library author, I don't think you should. It's on me to support usage of private functions that I'm nominally allowed to delete, but not really if someone is going to complain.

That is the issue. For instance, if you do:

```d
libFunction(cast(int *)0xdeadbeef);
```

And then complain that `libFunction`'s author didn't handle that case, you can rightfully be told to RTFM. Same thing with circumventing private. It should be *possible*, but absolutely unsupported.

>> If something blows up, or if my code gets broken in the future, it's my fault, because I was the one that made that decision.
>
> In theory, yes. In practice, yelling. We told people that `in` was in flux and because of that, to not use it. People (including me!) did it anyway. Some of them later complained when we decided what to do with it.

The definition of `private` shouldn't change at all. The ability to circumvent it still should remain for those wanting to muck with internal data, and I don't think there's any way to get around that (there's always reinterpret casting).

The thing is, it's important to identify the *consequences* of changing private data -- it can *never* be within spec for a library to allow private data access. So one can muck around with private data, and pay the cost of zero support (and rightfully so). Or one can petition the library author to provide access to that private data.

-Steve
Jul 03 2023
On Mon, Jul 03, 2023 at 12:32:43PM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...]The definition of `private` shouldn't change at all. The ability to circumvent it still should remain for those wanting to muck with internal data, and I don't think there's any way to get around that (there's always reinterpret casting). The thing is, it's important to identify the *consequences* of changing private data -- it can *never* be within spec for a library to allow private data access. So one can muck around with private data, and pay the cost of zero support (and rightfully so). Or one can petition the library author to provide access to that private data.[...] I think we all agree that the mechanics of this won't (and shouldn't) change. But I think the OP was arguing at a higher level of abstraction. It isn't so much about whether private should be overridable or not, or even whether some piece of data in an object should be private or not; the question IMO is whether the library could have been designed in such a way that there's no *need* for private data in the first place. Or at least, the need for such is minimized. A library with tons of private state and only a rudimentary public API is generally more likely to have situations where the user will be left wishing that there were a couple more knobs to turn that can be used to customize the library's behaviour. A library with less private state, or just as much private state but with a sophisticated API can lets you tweak more things, would be less likely to leave the user out in the cold with unusual use cases. However, it does risk having too many knobs to turn, causing the API to be far more complex than it ought to be. Which in turn can lead to unnecessary complexity: the combinatorial explosion of configurations make it hard for the author to test every combination, so there may be lots of bugs hidden behind uncommon corner cases. 
The ideal library is one where there's almost no private state because there's no need for it: the code Just Works(tm) for any combination of values one may assign to the public state. The API is simple and concise, yet easily composable, and naturally extends to all kinds of use cases, including unusual ones and ones the author himself never envisioned -- yet it all just works together naturally. This ideal may or may not be attainable, but the closer a library gets to this ideal, the better.

T

-- It always amuses me that Windows has a Safe Mode during bootup. Does that mean that Windows is normally unsafe?
Jul 03 2023
On 7/3/23 2:05 PM, H. S. Teoh wrote:On Mon, Jul 03, 2023 at 12:32:43PM -0400, Steven Schveighoffer via Digitalmars-d wrote: [...]But that's the thing, there are parts that *simply must be private*. No matter how you cut it, it has to have some level of privacy, because otherwise, you can't enforce semantic invariants with the type. Should array length (not the property, but the actual data field) be public? What about the pointer? Of course not. Yet, you still might want to access those things for some reason. That doesn't mean it's worth a change to public just for that one reason.The definition of `private` shouldn't change at all. The ability to circumvent it still should remain for those wanting to muck with internal data, and I don't think there's any way to get around that (there's always reinterpret casting). The thing is, it's important to identify the *consequences* of changing private data -- it can *never* be within spec for a library to allow private data access. So one can muck around with private data, and pay the cost of zero support (and rightfully so). Or one can petition the library author to provide access to that private data.[...] I think we all agree that the mechanics of this won't (and shouldn't) change. But I think the OP was arguing at a higher level of abstraction. It isn't so much about whether private should be overridable or not, or even whether some piece of data in an object should be private or not; the question IMO is whether the library could have been designed in such a way that there's no *need* for private data in the first place. Or at least, the need for such is minimized. 
A library with tons of private state and only a rudimentary public API is generally more likely to have situations where the user will be left wishing that there were a couple more knobs to turn that can be used to customize the library's behaviour.A library with less private state, or just as much private state but with a sophisticated API can lets you tweak more things, would be less likely to leave the user out in the cold with unusual use cases. However, it does risk having too many knobs to turn, causing the API to be far more complex than it ought to be. Which in turn can lead to unnecessary complexity: the combinatorial explosion of configurations make it hard for the author to test every combination, so there may be lots of bugs hidden behind uncommon corner cases.It's easy to talk about this in general terms, like "let you tweak more things", but when you start talking about non-abstract real cases, usually the reason for private data becomes obvious. The thing is, if it does make sense that something should just be public, making it public is easy, just make a PR to do it, and the benefits/drawbacks can be discussed, planned for, and agreed upon. Going the other way is much much worse. If you provide public access, it then becomes a supported API. I remember one case in the past, some type in phobos had undocumented members that were public due to laziness or carelessness. When the code had to change to a different implementation, we had to deprecate that access for years before actually changing. It was horrid. There is a real cost to careless publicity. -Steve
Jul 03 2023
On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer via Digitalmars-d wrote:On 7/3/23 2:05 PM, H. S. Teoh wrote:[...]We're actually agreeing with each other, y'know. :-D As I said, the *ideal* is that you wouldn't have private state, or that the private state would be minimal. In practice, of course, certain things *should* be private, and that's not a problem. The problems the OP described arise when either private is used carelessly, causing things to be private that really need not be, or the API is poorly designed, so that parts of the library that ought to be reusable aren't just because of some arbitrary decision made by the author. I've never heard people complaining about how the array length data field is private, for example. That's because it being private does not hinder the user from doing whatever he wants to do with the array (short of breaking the implementation and doing something involving UB, of course). That's an example of proper usage of private. An example of where private hinders what a user might wish to do is an algorithm used internally by the library, that for whatever reason is private and unusable outside of the library code, even though the algorithm itself is general and can be applied outside of the scope of the library. Often in such cases there are immediate pragmatic reasons for it -- the implementation of the algorithm is bound to internal implementation details of other library code, for example. So you can't actually make it public without also making lots of things public that probably shouldn't be. But at a higher level, one asks the question, why is that algorithm implemented in that way in the first place? It could have been implemented generically, and the library could have used just a specialized instance of it to solve whatever it is it needs to solve, but the algorithm itself should be available for user code to use. *That's* the proper design. 
But alas, all too often this is not done, and you end up with 5 different implementations of the same algorithm, each with different quirks (and often, different subsets of bugs), and all of them are locked up behind `private`, or require some tangential private structure as argument that isn't constructible except via a long-winded circuitous route that probably doesn't do what the user actually wants it to do, even though the algorithm itself doesn't actually depend on this. Ultimately these details are just the incidental symptoms. The underlying root cause is a poor design that doesn't correctly decouple orthogonal functionality into reusable pieces. --TI think we all agree that the mechanics of this won't (and shouldn't) change. But I think the OP was arguing at a higher level of abstraction. It isn't so much about whether private should be overridable or not, or even whether some piece of data in an object should be private or not; the question IMO is whether the library could have been designed in such a way that there's no *need* for private data in the first place. Or at least, the need for such is minimized. A library with tons of private state and only a rudimentary public API is generally more likely to have situations where the user will be left wishing that there were a couple more knobs to turn that can be used to customize the library's behaviour.But that's the thing, there are parts that *simply must be private*. No matter how you cut it, it has to have some level of privacy, because otherwise, you can't enforce semantic invariants with the type. Should array length (not the property, but the actual data field) be public? What about the pointer? Of course not. Yet, you still might want to access those things for some reason. That doesn't mean it's worth a change to public just for that one reason.
Jul 03 2023
On Monday, 3 July 2023 at 19:27:45 UTC, H. S. Teoh wrote:
> On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer via Digitalmars-d wrote:
>> On 7/3/23 2:05 PM, H. S. Teoh wrote:
>> [...]
>
> We're actually agreeing with each other, y'know. :-D As I said, the *ideal* is that you wouldn't have private state, or that the private state would be minimal.

The correct usage of "ideal" is: "Ideally we would do X, but we don't because the world is full of idiots" ;)
Jul 03 2023
On 7/3/23 3:27 PM, H. S. Teoh wrote:On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer via Digitalmars-d wrote:Yeah kind of. It's just that there are 2 types of privacy labeling, careless and designed.On 7/3/23 2:05 PM, H. S. Teoh wrote:[...]We're actually agreeing with each other, y'know. :-DI think we all agree that the mechanics of this won't (and shouldn't) change. But I think the OP was arguing at a higher level of abstraction. It isn't so much about whether private should be overridable or not, or even whether some piece of data in an object should be private or not; the question IMO is whether the library could have been designed in such a way that there's no *need* for private data in the first place. Or at least, the need for such is minimized. A library with tons of private state and only a rudimentary public API is generally more likely to have situations where the user will be left wishing that there were a couple more knobs to turn that can be used to customize the library's behaviour.But that's the thing, there are parts that *simply must be private*. No matter how you cut it, it has to have some level of privacy, because otherwise, you can't enforce semantic invariants with the type. Should array length (not the property, but the actual data field) be public? What about the pointer? Of course not. Yet, you still might want to access those things for some reason. That doesn't mean it's worth a change to public just for that one reason.As I said, the *ideal* is that you wouldn't have private state, or that the private state would be minimal. In practice, of course, certain things *should* be private, and that's not a problem. 
The problems the OP described arise when either private is used carelessly, causing things to be private that really need not be, or the API is poorly designed, so that parts of the library that ought to be reusable aren't just because of some arbitrary decision made by the author.If you carelessly label your fields as public, then realizing later they should have been private is costly, maybe impossible. If you carelessly label your fields as private, while it might upset some people, making them public later is easy. So if you are going to "not care" about public/private, technically the less risky choice is to make everything private, and worry about it later if it becomes an issue. So in that sense I disagree with the OP point. That being said, I've done a lot of libs where I just don't care and leave everything public. It's mostly because I don't expect widespread usage, and I also don't mind breaking peoples code (I don't think any of my projects that I started are past 1.0 yet). But something like Phobos shouldn't be so careless. We really should continue to make careless things private unless there is a good reason to make them public.I've never heard people complaining about how the array length data field is private, for example. That's because it being private does not hinder the user from doing whatever he wants to do with the array (short of breaking the implementation and doing something involving UB, of course). That's an example of proper usage of private.It's an obvious example that we all can agree on. If we agree there are clearly cases where private is important, than we start working our way back to where the line should be drawn.An example of where private hinders what a user might wish to do is an algorithm used internally by the library, that for whatever reason is private and unusable outside of the library code, even though the algorithm itself is general and can be applied outside of the scope of the library. 
Often in such cases there are immediate pragmatic reasons for it -- the implementation of the algorithm is bound to internal implementation details of other library code, for example. So you can't actually make it public without also making lots of things public that probably shouldn't be. But at a higher level, one asks the question, why is that algorithm implemented in that way in the first place? It could have been implemented generically, and the library could have used just a specialized instance of it to solve whatever it is it needs to solve, but the algorithm itself should be available for user code to use. *That's* the proper design.I agree that some things shouldn't be private. But what's the answer? When it should be public, just change it to public! An actual example of this in Phobos is the absence of a binary search algorithm. It's there, in SortedRange. But that implementation is private basically for no good reason (it can be trivially extracted into its own function). And SortedRange in itself is a schizophrenic meld of overbearing restrictions and puzzling allowances. The only reason I haven't made a PR for it is I just made a copy in my own code and have moved on. But it would probably be pretty trivial to expose. -Steve
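As a rough idea of what "extracted into its own function" could look like, here is a minimal free-standing binary search in D. This is a sketch written for this thread, not the code inside `SortedRange`; the names `lowerBoundIndex` and `binaryContains` are hypothetical, and the input is assumed to be sorted ascending under `<`.

```d
import std.range.primitives : hasLength, isRandomAccessRange;

/// Index of the first element that is not less than `value`,
/// or `haystack.length` if every element is less.
/// Assumes `haystack` is sorted ascending. (Hypothetical, not a Phobos function.)
size_t lowerBoundIndex(R, T)(R haystack, T value)
if (isRandomAccessRange!R && hasLength!R)
{
    size_t lo = 0;
    size_t hi = haystack.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2; // avoids overflow of lo + hi
        if (haystack[mid] < value)
            lo = mid + 1; // answer lies strictly to the right of mid
        else
            hi = mid;     // mid itself may be the answer
    }
    return lo;
}

/// True if `value` occurs in the sorted range.
bool binaryContains(R, T)(R haystack, T value)
if (isRandomAccessRange!R && hasLength!R)
{
    immutable i = lowerBoundIndex(haystack, value);
    return i < haystack.length && haystack[i] == value;
}
```

Nothing here depends on a wrapper type: any random-access range with a length can be searched directly, for example `binaryContains([1, 3, 5, 7], 5)`.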
Jul 03 2023
On Mon, Jul 03, 2023 at 10:14:38PM -0400, Steven Schveighoffer via Digitalmars-d wrote:On 7/3/23 3:27 PM, H. S. Teoh wrote:[...]Depends. D is flexible enough that public fields can be replaced with access functions, and almost all downstream code doesn't have to change to adapt to it. I've done it a lot in my own code, where some field, say mydata, was previously public but now needs to be private. No problem: just rename it to _mydata, and create access functions mydata() and mydata(typeof(_mydata)) to maintain compatibility with old code. Unless downstream code does something like take an address of the old field, this change will be transparent, a recompile will make it all work as before without requiring further changes.As I said, the *ideal* is that you wouldn't have private state, or that the private state would be minimal. In practice, of course, certain things *should* be private, and that's not a problem. The problems the OP described arise when either private is used carelessly, causing things to be private that really need not be, or the API is poorly designed, so that parts of the library that ought to be reusable aren't just because of some arbitrary decision made by the author.If you carelessly label your fields as public, then realizing later they should have been private is costly, maybe impossible.If you carelessly label your fields as private, while it might upset some people, making them public later is easy.The point is that it then bottlenecks on the author. If the author is not responsive for whatever reason (busy, abandoned the project, etc.) downstream users are stuck up the creek without a paddle.So if you are going to "not care" about public/private, technically the less risky choice is to make everything private, and worry about it later if it becomes an issue. So in that sense I disagree with the OP point.OK, I guess we differ on this point. 
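The field-to-accessor migration mentioned above (rename `mydata` to `_mydata`, then add `mydata()` and `mydata(typeof(_mydata))`) might look like the sketch below. The `Widget` type and the non-negative check are assumptions added purely for illustration.

```d
struct Widget
{
    // Before the change this was simply: `int mydata;` (a public field).
    private int _mydata;

    // Getter: existing code that reads `w.mydata` keeps compiling.
    @property typeof(_mydata) mydata() const
    {
        return _mydata;
    }

    // Setter: existing code that writes `w.mydata = 5` keeps compiling,
    // and the type can now validate the value (hypothetical constraint).
    @property void mydata(typeof(_mydata) value)
    {
        assert(value >= 0, "mydata must not be negative");
        _mydata = value;
    }
}
```

Code such as `Widget w; w.mydata = 5; auto x = w.mydata;` is unaffected after a recompile. As the post notes, the exception is code that took the address of the old field, since `mydata` is now a pair of functions rather than a variable.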
Given the choice between having to wait for a potentially MIA author to fix an issue and having the ability to go under the hood to manually work around the issue, I choose the latter.That being said, I've done a lot of libs where I just don't care and leave everything public. It's mostly because I don't expect widespread usage, and I also don't mind breaking peoples code (I don't think any of my projects that I started are past 1.0 yet). But something like Phobos shouldn't be so careless. We really should continue to make careless things private unless there is a good reason to make them public.I guess this has to be judged on a case-by-case basis.My personal criteria is, if something can be designed without private (and without opening up holes that may allow user code to break stuff), prefer that design. Barring that, prefer the design that has the least amount of private possible for it to work without opening up loopholes for breakage. In general, I don't quite agree with e.g. Java's approach of making everything private by default and having only member functions mediate access to private state. My approach is to prefer POD types that hold public data that anybody can safely mutate, and public functions that operate on said POD types, rather than the closed-box approach advocated by OO. There's a time and place for the closed-box approach, of course. But in my book, that's the less preferred option that you'd fall back on only if you couldn't do it another way. And even when you can't avoid the closed-box approach, my preference is to minimize the degree of closedness as much as possible.I've never heard people complaining about how the array length data field is private, for example. That's because it being private does not hinder the user from doing whatever he wants to do with the array (short of breaking the implementation and doing something involving UB, of course). 
That's an example of proper usage of private.It's an obvious example that we all can agree on. If we agree there are clearly cases where private is important, than we start working our way back to where the line should be drawn.It's not always so simple, though. The algorithm might have been implemented in a way that depends on private types and internal assumptions that may break in unforeseen ways if you use it without realizing what the assumptions are. Forcibly changing it to public may require you to make other stuff public that shouldn't be. Or it may be written in a way that's tightly coupled to other internal library code, such that you can't call it separately. This gets particularly frustrating when the core of the algorithm itself does *not* depend on these things, but the upstream author wrote it that way because "it's private, so nobody cares if this code is dirty and badly designed". Being able to hide bad code behind private encourages this kind of one-off hacks that avoids having to think about proper code decomposition.An example of where private hinders what a user might wish to do is an algorithm used internally by the library, that for whatever reason is private and unusable outside of the library code, even though the algorithm itself is general and can be applied outside of the scope of the library. Often in such cases there are immediate pragmatic reasons for it -- the implementation of the algorithm is bound to internal implementation details of other library code, for example. So you can't actually make it public without also making lots of things public that probably shouldn't be. But at a higher level, one asks the question, why is that algorithm implemented in that way in the first place? It could have been implemented generically, and the library could have used just a specialized instance of it to solve whatever it is it needs to solve, but the algorithm itself should be available for user code to use. 
*That's* the proper design.I agree that some things shouldn't be private. But what's the answer? When it should be public, just change it to public!An actual example of this in Phobos is the absence of a binary search algorithm. It's there, in SortedRange. But that implementation is private basically for no good reason (it can be trivially extracted into its own function). And SortedRange in itself is a schizophrenic meld of overbearing restrictions and puzzling allowances.Yeah, that binary search function really ought to be public. I think by now, experience has more than proven that SortedRange was a mistake. It was an attempt to encode the sortedness of a range in the type system such that Phobos would be able to take advantage of this to provide performance improvements, but D's type system simply isn't powerful enough to express what's needed for this without unnecessary limitations and the weird quirks you see in the current implementation of SortedRange. It was an interesting and ambitious experiment, but I think it has run its course and the conclusion is that it doesn't work in the current language. Or at least isn't pulling its own weight given its current limitations. Perhaps it's time to send it to the scrap yard.The only reason I haven't made a PR for it is I just made a copy in my own code and have moved on. But it would probably be pretty trivial to expose.[...] IMO, we should just get rid of SortedRange and make the binary search algo a public function. Or even if we don't get rid of SortedRange (breakage of existing code and all that), I don't see why the binary search function shouldn't be publicly available. This is exactly the kind of abuse of `private` I was talking about: the function is clearly there and ready to use, but the author for various reasons decided that no, you're not allowed to just call the function, you have to jump through this here set of hoops to prove your worthiness first. 
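For illustration only, here is a minimal sketch of what a free-standing binary search over any sorted random-access range could look like. The names `lowerBoundIndex` and `binaryContains` are hypothetical and are not Phobos APIs; this is not the code inside SortedRange, just one way the idea of "the algorithm as a plain public function" might be written, assuming the caller guarantees that the input is already sorted with respect to `less`.

```d
import std.range.primitives : hasLength, isRandomAccessRange;

/// Hypothetical free function (not part of Phobos): returns the index of the
/// first element of `r` that is not less than `value`, or `r.length` if every
/// element is less. Assumes `r` is already sorted according to `less`.
size_t lowerBoundIndex(alias less = (a, b) => a < b, R, T)(R r, T value)
if (isRandomAccessRange!R && hasLength!R)
{
    size_t lo = 0, hi = r.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2; // avoids overflow of lo + hi
        if (less(r[mid], value))
            lo = mid + 1;   // answer lies strictly to the right of mid
        else
            hi = mid;       // mid itself may be the answer
    }
    return lo;
}

/// Hypothetical helper built on top: true if `value` occurs in sorted `r`.
bool binaryContains(alias less = (a, b) => a < b, R, T)(R r, T value)
if (isRandomAccessRange!R && hasLength!R)
{
    immutable i = lowerBoundIndex!less(r, value);
    // r[i] is not less than value; equal if value is not less than r[i] either.
    return i < r.length && !less(value, r[i]);
}

void main()
{
    int[] sorted = [1, 3, 5, 7, 9];
    assert(lowerBoundIndex(sorted, 5) == 2);
    assert(binaryContains(sorted, 7));
    assert(!binaryContains(sorted, 4));

    // A custom ordering works the same way, here for descending data.
    int[] descending = [9, 7, 5, 3, 1];
    assert(lowerBoundIndex!((a, b) => a > b)(descending, 5) == 2);
}
```

Nothing here needs a wrapper type: the function takes the range and a predicate, and the sortedness precondition is stated in the documentation rather than encoded in the type system.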
T -- My father told me I wasn't at all afraid of hard work. I could lie down right next to it and go to sleep. -- Walter Bright
Jul 05 2023
On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
> The main topic here is about the harm caused by rich types surrounding algorithms.
>
> Let's say I am interested in using an open source algorithm that works with a memory area. (Not related to D.) We all know that a memory area can be described by a fat pointer like D's slices. So, that is what the algorithm should take.
>
> Unfortunately, the poor little algorithm is not free to be used: It is written to work with a custom type of that library; let's call it MySlice, which is produced by MyMemoryMappedFile, which is produced by MyFile, which is initialized only by types like MyFilePath. (I may have gotten the relationships wrong there.)
>
> But my data is already in a memory area that I own! How can I call that algorithm? Should I write it to a file first and then use those rich types to access the algorithm? That should not be necessary...

The language-agnostic answer is to patch the library yourself to do what you want. Since D is a systems programming language, you also have another choice: bypass the type system and create `MySlice` by pointer casting it from the data representing a D slice.

Now, neither of these solutions is exactly inviting. But they cannot be: to create `MySlice` in a way the library doesn't support, you have to know its private implementation details. Even if the language didn't give the library author a way to protect those details, you'd be relying on undocumented version-specific details.

Not having `private` would be better in the sense that you'd be more likely to get compiler errors instead of memory corruption if the private details change. Maybe `__traits(getMember, /+...+/)`, or declaring a private function as an external `extern(C)` function, CTFE-mangling the D name, would be safer than the pointer cast I proposed.
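To make the pointer-cast idea concrete, here is a rough sketch. Everything in it is an assumption for illustration: `MySlice` is a stand-in defined locally (in a real library it would live in another module, where `private` actually hides the fields), and `AssumedLayout` and `fromSlice` are hypothetical names. It only works as long as the guessed layout matches the real one, which is exactly the fragility described above.

```d
// Hypothetical stand-in for the library's rich type. In a real library the
// fields would be in another module and inaccessible to user code.
struct MySlice
{
    private const(ubyte)* ptr;
    private size_t len;

    size_t length() const { return len; }

    ubyte opIndex(size_t i) const
    {
        assert(i < len, "index out of bounds");
        return ptr[i];
    }
}

/// Mirror of the layout we *assume* MySlice has (an undocumented detail).
struct AssumedLayout
{
    const(ubyte)* ptr;
    size_t len;
}

/// Builds a MySlice from memory we already own by reinterpreting the bytes
/// of an AssumedLayout. @system: nothing here is checked by the compiler
/// beyond the size comparison below.
MySlice fromSlice(const(ubyte)[] data) @system
{
    // Catches a change in size, but not a reordering of same-sized fields.
    static assert(AssumedLayout.sizeof == MySlice.sizeof);

    auto raw = AssumedLayout(data.ptr, data.length);
    return *cast(MySlice*) &raw;
}

void main()
{
    ubyte[] mine = [10, 20, 30];
    auto s = fromSlice(mine);
    assert(s.length == 3);
    assert(s[1] == 20);
}
```

If the library later swaps the two fields or adds a same-sized one, this still compiles and silently corrupts memory, which is the trade-off being weighed against patching the library.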
Jul 02 2023