digitalmars.D - Algorithms should be free from rich types

reply Ali Çehreli <acehreli yahoo.com> writes:
My mind is not fully clear on this topic yet but some related things 
have been brewing in me for years.

First, an aside: You may remember my minor complaint about 'private' 
during a DConf presentation years ago. Today, I feel even more strongly
that disallowing access to parts of software "just because" of good design is
a mistake. I've seen multiple examples of this in professional life
where a developer uses 'private' only because it is "of course" better 
to do so. (The Turkish word "işgüzar" and the German word 
"verschlimmbessern" describe the situation pretty well for me but the 
English language lacks such a word.)

To give an example from D's ecosystem, the D runtime's garbage collector 
statistics object used to be 'private'. (I think there is an interface 
for it now.) What an inconvenience it was to copy/paste that type's 
definition from the runtime to user code, get the compiled symbol of the 
object from the library, and pointer cast it to be able to access the 
members! A 'static assert' attempts to protect the project from changes 
to that type...

The idea of 'private' should be to just give the developer freedom to 
change the implementation in the future. It should not impede use cases 
that people come up with. That can be achieved practically with an 
underscore: Make everything 'public' and name your implementation 
details with an underscore. People who need them will surely know they 
are implementation details that can change in the future but they will 
be happy: They will get things done.
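
A minimal sketch of that underscore convention in D, assuming a hypothetical `Cache` type (the type and its members are made up for illustration and do not come from any real library):

```d
import std.stdio : writeln;

// Hypothetical library type. Everything is public; the leading underscore
// only signals "implementation detail, may change without notice".
struct Cache
{
    // Supported API.
    size_t length() const { return _entries.length; }

    // Implementation detail, still reachable by users who need it.
    int[string] _entries;
}

void main()
{
    Cache c;
    c._entries["answer"] = 42; // deliberate use of an implementation detail
    writeln(c.length);         // prints 1
}
```

The compiler does not enforce anything here; the naming alone tells the user which members carry no stability promise.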

Ok, that rant is over.

The main topic here is about the harm caused by rich types surrounding 
algorithms. Let's say I am interested in using an open source algorithm 
that works with a memory area. (Not related to D.) We all know that a 
memory area can be described by a fat pointer like D's slices. So, that 
is what the algorithm should take.
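
As a sketch of what "taking a fat pointer" could look like in D, assuming a hypothetical `checksum` function that stands in for the library's real algorithm:

```d
import std.stdio : writeln;

// Hypothetical algorithm: a simple additive checksum over a memory area.
// It asks for nothing more than a slice (pointer + length).
uint checksum(const(ubyte)[] data) pure nothrow @nogc @safe
{
    uint sum = 0;
    foreach (b; data)
        sum += b;
    return sum;
}

void main()
{
    // Data the caller already owns, e.g. a GC-allocated array...
    ubyte[] owned = [1, 2, 3, 4];
    writeln(checksum(owned));      // prints 10

    // ...or a raw pointer and a length obtained from some other API.
    ubyte* p = owned.ptr;
    size_t n = owned.length;
    writeln(checksum(p[0 .. n]));  // prints 10
}
```

Because the parameter is a plain slice, the caller decides where the bytes come from: a file mapping, a network buffer, or an array already in memory.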

Unfortunately, the poor little algorithm is not free to be used: It is 
written to work with a custom type of that library; let's call it 
MySlice, which is produced by MyMemoryMappedFile, which is produced by 
MyFile, which is initialized only by types like MyFilePath. (I may have 
gotten the relationships wrong there.)

But my data is already in a memory area that I own! How can I call that 
algorithm? Should I write it to a file first and then use those rich 
types to access the algorithm? That should not be necessary...

Of course I understand the benefits of all those types but the core 
algorithm should be as free as possible. So, this is simply wrong. I
think we software developers have been on the wrong path. Our task
should primarily be about getting things done first.

I could work with those types if they had virtual interfaces. But no. 
They are un-subtypable C++ 'class'es.

I think it could also work if the algorithm was templatized; but again, 
no...
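
For illustration, a templatized version might look like the following in D. The name `checksum` and the constraint are assumptions made for this sketch, not the actual library's code:

```d
import std.range.primitives : isInputRange, ElementType;
import std.stdio : writeln;

// Hypothetical generic variant: accepts any input range whose elements
// convert to ubyte, not just one concrete slice type.
uint checksum(R)(R data)
if (isInputRange!R && is(ElementType!R : ubyte))
{
    uint sum = 0;
    foreach (b; data)
        sum += b;
    return sum;
}

void main()
{
    import std.algorithm.iteration : filter;

    ubyte[] owned = [1, 2, 3, 4];
    writeln(checksum(owned));                          // plain slice: prints 10
    writeln(checksum(owned.filter!(b => b % 2 == 0))); // lazy range: prints 6
}
```

The algorithm body is unchanged; only the parameter type has been opened up, so a user-defined view type works as long as it offers the input range primitives.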

Hey! Thank you! I feel better already. :)

Ali
Jun 27 2023
next sibling parent "H. S. Teoh" <hsteoh qfbox.info> writes:
On Tue, Jun 27, 2023 at 02:53:59PM -0700, Ali Çehreli via Digitalmars-d wrote:
[...]
 First, an aside: You may remember my minor complaint about 'private'
 during a DConf presentation years ago. Today, I feel even stronger
 that disallowing access to parts of software "just because" of good
 design is a mistake. I've seen multiple examples of this in
 professional life where a developer uses 'private' only because it is
 "of course" better to do so. (The Turkish word "işgüzar" and the
 German word "verschlimmbessern" describe the situation pretty well for
 me but the English language lacks such a word.)
I can't resist me a Walter quote here:

"I've been around long enough to have seen an endless parade of magic new techniques du jour, most of which purport to remove the necessity of thought about your programming problem. In the end they wind up contributing one or two pieces to the collective wisdom, and fade away in the rearview mirror." -- Walter Bright

When you start doing something with the code because that's what everybody else does, or because it's what everyone else says is "the Right Thing(tm)", then it's just cargo-culting, which inevitably leads to problems down the road.
 To give an example from D's ecosystem, the D runtime's garbage
 collector statistics object used to be 'private'. (I think there is an
 interface for it now.) What an inconvenience it was to copy/paste that
 type's definition from the runtime to user code, get the compiled
 symbol of the object from the library, and pointer cast it to be able
 to access the members! A 'static assert' attempts to protect the
 project from changes to that type...
Thing is, things like these usually come from temporary hacks in the code that the original coder didn't want to set in stone, but that end up staying put because of inertia and becoming de facto set in stone.
 The idea of 'private' should be to just give the developer freedom to
 change the implementation in the future. It should not impede use
 cases that people come up with. That can be achieved practically with
 an underscore: Make everything 'public' and name your implementation
 details with an underscore.  People who need them will surely know
 they are implementation details that can change in the future but they
 will be happy: They will get things done.
IOW, empower the user instead of straitjacketing them. My favorite programming modus operandi. Along the same lines as my philosophy of "everything should be a library, main() is just a convenient (thin) interface to access the library API". [...]
 The main topic here is about the harm caused by rich types surrounding
 algorithms. Let's say I am interested in using an open source
 algorithm that works with a memory area. (Not related to D.) We all
 know that a memory area can be described by a fat pointer like D's
 slices. So, that is what the algorithm should take.

 Unfortunately, the poor little algorithm is not free to be used: It is
 written to work with a custom type of that library; let's call it
 MySlice, which is produced by MyMemoryMappedFile, which is produced by
 MyFile, which is initialized only by types like MyFilePath. (I may
 have gotten the relationships wrong there.)
That's a sign of poorly-factored code. The logically-separate parts of the code are not properly separated out, causing them to be dependent on each other where they technically should not be.

Doing this right is actually a lot harder than it looks; it often requires significant amounts of refactoring after your initial implementation, because until you write the thing out in code, it isn't always clear which parts are actually dependent and which parts can be separated.

Idioms like pipeline programming with ranges help to identify independent pieces of the logic, and abstractions like the range API help you actually separate out the pieces in a clean way. Without a unifying common API like ranges, it's pretty tough to write code in composable pieces that can be freely mixed-and-matched with each other.

https://wiki.dlang.org/Component_programming_with_ranges

Well, obviously you already know about this article, but one of my motivations for writing that article was precisely what you describe above.
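
A small sketch of the pipeline idiom with Phobos ranges; the function name `sumOfEvenSquares` is made up for this example:

```d
import std.algorithm.iteration : filter, map, sum;
import std.range : iota;
import std.stdio : writeln;

// Each stage only relies on the range API, not on the concrete type
// produced by the stage before it.
auto sumOfEvenSquares(R)(R numbers)
{
    return numbers
        .filter!(n => n % 2 == 0) // keep the even numbers
        .map!(n => n * n)         // square them
        .sum;                     // add them up
}

void main()
{
    writeln(sumOfEvenSquares(iota(1, 11)));  // 4 + 16 + 36 + 64 + 100: prints 220
    writeln(sumOfEvenSquares([1, 2, 3, 4])); // 4 + 16: prints 20
}
```

The same function accepts a lazy `iota` range and a plain array, because neither stage names a concrete container type.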
 But my data is already in a memory area that I own! How can I call
 that algorithm? Should I write it to a file first and then use those
 rich types to access the algorithm? That should not be necessary...
 
 Of course I understand the benefits of all those types but the core
 algorithm should be as free as possible. So, this is simply wrong. I
 think us, software developers, have been on the wrong path. Our task
 should primarily be about getting things done first.
Over the years, I've been dreaming about the ideal situation where there would be libraries of algorithms that are not tied to a specific implementation (i.e., bound to concrete types and parameter values), but are written in a form that encapsulates only their core logic. You'd then pull in the algorithm by specifying which concrete type(s) to bind its various parts to, and it'd Just Work(tm). That's the way things should have been from the beginning.

But the situation today is far from that ideal: you have libraries that solve some particular programming problem X, but to use the library's solution you need to use also Y, Z, and W that the author of that library happened to choose.

For instance, the FreeType library implements rasterization algorithms, but you can't access those algorithms directly. You have to use the library API, which abstracts away file handling, memory management, image type, etc. In order to cater to different user needs, an entire complicated API is invented to allow the user to specify certain parameters the authors deem tweakable, while an elaborate scheme is designed to hide the rest of the information away. You can't effectively use the rasterization algorithm without also using all of these other peripheral types; and when you need to interface FreeType with another library that uses other, different concrete types, you end up having to write lots of shunt code whose sole purpose is to bridge between incompatible types that actually do equivalent things.
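
One way to picture "binding the parts at the point of use" in D is a template with an `alias` parameter. This is a minimal sketch under stated assumptions: `horizontalSpan` and `plot` are hypothetical names, and nothing here reflects FreeType's actual API.

```d
import std.stdio : writeln;

// Hypothetical core algorithm: visit every pixel of a horizontal span.
// It knows nothing about files, allocators, or image types; the caller
// binds `plot` to whatever sink it already has.
void horizontalSpan(alias plot)(int x0, int x1, int y)
{
    foreach (x; x0 .. x1)
        plot(x, y);
}

void main()
{
    // Binding 1: a fixed-size in-memory bitmap (4 rows of 8 pixels).
    bool[8][4] bitmap;
    horizontalSpan!((x, y) { bitmap[y][x] = true; })(2, 5, 3);
    writeln(bitmap[3]);

    // Binding 2: just count the pixels that would be touched.
    int touched;
    horizontalSpan!((x, y) { ++touched; })(0, 10, 0);
    writeln(touched); // prints 10
}
```

The core loop is written once; each caller supplies its own notion of a pixel sink without any bridging types in between.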
 I could work with those types if they had virtual interfaces. But no.
 They are un-subtypable C++ 'class'es.
 
 I think it could also work if the algorithm was templatized; but
 again, no...
[...] In cases like this, I often get really tempted to copy-n-paste the code and templatize it myself. :-D

Of course, in practice that's usually impractical, so the next best thing is to use D's compile-time introspection capabilities to autogenerate boilerplate shunt code to work around API infelicities in the target library, and export a nicer API on the D side. :-D

Not always possible, of course, like in your case, where you'd have to either copy-n-paste code and do unsafe casts, or live with infelicities like writing stuff to a file and opening it via the official API.

(I had to do something similar once in my day job, interfacing with a grossly over-engineered C++ framework that nobody fully understood nor wanted anything to do with if they could help it -- I ended up having to write a hack where a single function call involved 7 layers of abstraction, one of which involved writing a struct to a temporary file on one side of an RPC call and having the other side (a daemon process) read from the file and cast it back to the struct. The result was the stuff of nightmares that, to everyone's great relief, was phased out a couple of releases later. We relished every moment of typing `\rm -rf` on that entire old codebase after its replacement became fully functional.)

T

--
2+2=4. 2*2=4. 2^2=4. Therefore, +, *, and ^ are the same operation.
Jun 27 2023
prev sibling next sibling parent reply FeepingCreature <feepingcreature gmail.com> writes:
On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related 
 things have been brewing in me for years.

 First, an aside: You may remember my minor complaint about 
 'private' during a DConf presentation years ago. Today, I feel 
 even stronger that disallowing access to parts of software 
 "just because" of good design is a mistake. I've seen multiple 
 examples of this in professional life where a developer uses 
 'private' only because it is "of course" better to do so. (The 
 Turkish word "işgüzar" and the German word "verschlimmbessern" 
 describe the situation pretty well for me but the English 
 language lacks such a word.)

 To give an example from D's ecosystem, the D runtime's garbage 
 collector statistics object used to be 'private'. (I think 
 there is an interface for it now.) What an inconvenience it was 
 to copy/paste that type's definition from the runtime to user 
 code, get the compiled symbol of the object from the library, 
 and pointer cast it to be able to access the members! A 'static 
 assert' attempts to protect the project from changes to that 
 type...

 The idea of 'private' should be to just give the developer 
 freedom to change the implementation in the future. It should 
 not impede use cases that people come up with. That can be 
 achieved practically with an underscore: Make everything 
 'public' and name your implementation details with an 
 underscore. People who need them will surely know they are 
 implementation details that can change in the future but they 
 will be happy: They will get things done.
I like this approach:

```
class C {
    private int i;
}
...
void main() @system {
    auto c = new C;
    c.private.i = 5;
}
```
Jun 28 2023
next sibling parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 6/28/23 01:00, FeepingCreature wrote:

      auto c = new C;
      c.private.i = 5;
I love it. And I actually tried but no, D does not have this yet. :D

Ali
Jun 28 2023
parent reply "Richard (Rikki) Andrew Cattermole" <richard cattermole.co.nz> writes:
Oh how you dare me.

--- app.d
module app;
import foo;
void main()
{
     Foo foo = new Foo;
     foo.privateGet!"i" = 2;
     foo.say();
}

ref privateGet(string name, From)(ref From from) {
     static foreach(I; 0 .. from.tupleof.length) {
         {
             enum Name = __traits(identifier, from.tupleof[I]);

             static if (Name == name) {
                 return from.tupleof[I];
             }
         }
     }

     assert(0);
}

--- foo.d
module foo;
class Foo {
     void say() {
         import std.stdio;
         writeln(i);
     }

private:
     int i;
     bool b;
}
Jun 28 2023
parent Adam D Ruppe <destructionator gmail.com> writes:
On Wednesday, 28 June 2023 at 17:06:43 UTC, Richard (Rikki) 
Andrew Cattermole wrote:
 Oh how you dare me.
Just do `__traits(getMember, foo, "i") = 2;`

Reflection bypasses private.
Jun 28 2023
prev sibling next sibling parent reply bachmeier <no spam.net> writes:
On Wednesday, 28 June 2023 at 08:00:23 UTC, FeepingCreature wrote:

 I like this approach:

 ```
 class C {
     private int i;
 }
 ...
 void main() @system {
     auto c = new C;
     c.private.i = 5;
 }
 ```
This would be a good change to the language.
Jun 28 2023
parent Cecil Ward <cecil cecilward.com> writes:
On Wednesday, 28 June 2023 at 17:40:25 UTC, bachmeier wrote:
 On Wednesday, 28 June 2023 at 08:00:23 UTC, FeepingCreature 
 wrote:

 I like this approach:

 ```
 class C {
     private int i;
 }
 ...
 void main() @system {
     auto c = new C;
     c.private.i = 5;
 }
 ```
 This would be a good change to the language.
I’m not sure, but I’m thinking ‘yes’.
Jun 28 2023
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 6/28/23 4:00 AM, FeepingCreature wrote:

 I like this approach:
 
 ```
 class C {
      private int i;
 }
 ...
 void main() @system {
      auto c = new C;
      c.private.i = 5;
 }
 ```
 
```d
auto usePrivate(T)(ref T thing) @system
{
    static struct GetMeThePrivateStuff
    {
       @disable this(this); // shouldn't be copied about, meant to be a temporary access
       private T* _thing; // "private" lol
       auto ref opDispatch(string s, Args...)(Args args)
       {
          static if(Args.length == 0)
             return __traits(getMember, *_thing, s);
          else
             return __traits(getMember, *_thing, s)(args);
       }
    }

    return GetMeThePrivateStuff(&thing);
}
```

Yeah, yeah, it needs work. But you get the idea. D is all-powerful.

-Steve
Jun 29 2023
parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 6/29/23 10:15 PM, Steven Schveighoffer wrote:

 
 ```d
 auto usePrivate(T)(ref T thing) @system
 {
     static struct GetMeThePrivateStuff
     {
       @disable this(this); // shouldn't be copied about, meant to be a temporary access
       private T* _thing; // "private" lol
       auto ref opDispatch(string s, Args...)(Args args)
       {
          static if(Args.length == 0)
             return __traits(getMember, *_thing, s);
          else
             return __traits(getMember, *_thing, s)(args);
       }
     }
 
     return GetMeThePrivateStuff(&thing);
 }
 ```
Oh wait, the `__traits(getMember)` trick doesn't work on member functions, interesting... So maybe half-powerful ;)

-Steve
Jun 29 2023
prev sibling next sibling parent reply Max Samukha <maxsamukha gmail.com> writes:
On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related 
 things have been brewing in me for years.

 Unfortunately, the poor little algorithm is not free to be 
 used: It is written to work with a custom type of that library; 
 let's call it MySlice, which is produced by MyMemoryMappedFile, 
 which is produced by MyFile, which is initialized only by types 
 like MyFilePath. (I may have gotten the relationships wrong 
 there.)

 But my data is already in a memory area that I own! How can I 
 call that algorithm? Should I write it to a file first and then 
 use those rich types to access the algorithm? That should not 
 be necessary...
That's some poorly designed library (Phobos?). A decently designed one would at least allow you to construct a MySlice instance from a (pointer, length) pair.
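
A sketch of such a constructor, using a hypothetical `MySlice` written in D purely for illustration (the library under discussion is C++ and its real types are not shown in this thread):

```d
import std.stdio : writeln;

// Hypothetical stand-in for the library's slice type.
struct MySlice
{
    const(ubyte)* ptr;
    size_t length;

    // The escape hatch: wrap memory the caller already owns.
    this(const(ubyte)* ptr, size_t length)
    {
        this.ptr = ptr;
        this.length = length;
    }

    // View the wrapped memory as a built-in D slice.
    const(ubyte)[] bytes() const { return ptr[0 .. length]; }
}

void main()
{
    ubyte[] owned = [10, 20, 30];
    auto s = MySlice(owned.ptr, owned.length);
    writeln(s.bytes); // prints [10, 20, 30]
}
```

With this one constructor, the rich type can still be produced by the file-mapping machinery, but it no longer has to be.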
Jun 28 2023
parent reply Ali Çehreli <acehreli yahoo.com> writes:
On 6/28/23 02:25, Max Samukha wrote:

 That's some poorly designed library (Phobos?).
Not in the D world at all. Ironically, I think the library's design is actually pretty good. And that's why I was motivated to write in the first place: Everything was done according to industry best practices but in the end all of that reduces the usability of the library.

Ali
Jun 28 2023
parent reply Hipreme <msnmancini hotmail.com> writes:
On Wednesday, 28 June 2023 at 17:00:44 UTC, Ali Çehreli wrote:
 On 6/28/23 02:25, Max Samukha wrote:

 That's some poorly designed library (Phobos?).
 Not in the D world at all. Ironically, I think the library's design is actually pretty good. And that's why I was motivated to write in the first place: Everything was done according to industry best practices but in the end all of that reduces the usability of the library.

 Ali
I have had a grudge against `private` ever since I used the LibGDX particle system. I wasn't able to extend it to add collision. Why? Because the particles were `private`. Since then, I have never used `private` without a very, very good reason; the only place I use it right now is for the intermediate steps of a larger process. People in industry know nothing about how to use `protected`. Protected IMO should be the industry standard.

I have worked in a codebase that has been under refactoring for at least 3 years, and there are so many changes where `private` is no longer used after some time. Why is that? Because programmers should not fear themselves most of the time.
Jun 28 2023
parent reply bachmeier <no spam.net> writes:
On Wednesday, 28 June 2023 at 17:12:17 UTC, Hipreme wrote:

 I have had a rant with `private` since the time I used LibGDX 
 Particle System. I wasn't able to extend its particle system to 
 add collision to it, why? Because the particles were `private`. 
 Since that, I never used `private` anymore without a very very 
 good reason to do so, the only place I use it right now is for 
 intermediate processes on a full process. People in industry 
 knows nothing on how to use `protected`. Protected IMO should 
 be the industry standard.

 I have worked in a codebase which is being refactored for at 
 least 3 years, there's so many changes on `private` not being 
 used after some time. Why is that? Because programmers should 
 not fear themselves most of the time.
[Rich Hickey](https://harfangk.github.io/2017/12/08/rich-hickey-interview-from-codequarterly.html):
 At some point though, someone is going to need to have access 
 to the data. And if you have a notion of “private”, you need 
 corresponding notions of privilege and trust. And that adds a 
 whole ton of complexity and little value, creates rigidity in a 
 system, and often forces things to live in places they 
 shouldn’t.
 If people don’t have the sensibilities to desire to program to 
 abstractions and to be wary of marrying implementation details, 
 then they are never going to be good programmers.
Jun 28 2023
parent Ali Çehreli <acehreli yahoo.com> writes:
On 6/28/23 10:38, bachmeier wrote:

 [Rich Hickey](https://harfangk.github.io/2017/12/08/rich-hickey-interview-from-codequarterly.html):

Amen! I've just finished reading most of it (skipped some Clojure specific parts). The following part is worth quoting as well:

"When we drop down to the algorithm level, I think OO can seriously thwart reuse. In particular, the use of objects to represent simple informational data is almost criminal in its generation of per-piece-of-information micro-languages, i.e. the class methods, versus far more powerful, declarative, and generic methods like relational algebra. Inventing a class with its own interface to hold a piece of information is like inventing a new language to write every short story. This is anti-reuse, and, I think, results in an explosion of code in typical OO applications."

One more quote both to stay unkind to my ex-favorite language and to relate to our ever-present discussions on the GC's appropriateness in libraries:

"The complexity [of C++] is stunning. It failed as the library language it purported to be, due to lack of GC, in my opinion, and static typing failed to keep large OO systems from becoming wretched balls of mud. Large mutable object graphs are the sore point, and const is inadequate to address it. Once C++’s performance advantage eroded or became less important, you had to wonder—why bother? I can’t imagine working in a language without GC today, except in very special circumstances."

Ali
Jun 28 2023
prev sibling next sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related 
 things have been brewing in me for years.

 [...]
I have lost count of how many times my life has been made difficult by the lack of `private`.

I have also lost count of how many times my life has been made easier by the fact that I ruthlessly declare everything `private` unless it has good reason not to be.

Ease of refactoring = good, ergo `private` = good and should be the default.
Jun 29 2023
next sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Thursday, June 29, 2023 8:44:05 AM MDT Atila Neves via Digitalmars-d wrote:
 On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related
 things have been brewing in me for years.

 [...]
 I have lost count of how many times my life has been made difficult by the lack of `private`. I have also lost count of how many times my life has been made easier by the fact that I ruthlessly declare everything `private` unless it has good reason not to be. Ease of refactoring = good, ergo `private` = good and should be the default.
Yeah. As with many things, I think that it primarily comes down to good API design (which can be hard). private prevents implementation details from being mucked with, which can be a lifesaver when refactoring and can be a big help with testing and ensuring that things work as expected when other folks use your code.

On the other hand, if you fail to make it so that the API provides what your users need, then it could easily be the case that some stuff that should have been available is locked behind private, making their lives harder (or even impossible, depending on what they're trying to do).

Similarly, if you actually plan your API around generic types, then it's much easier for folks to make it work with their own types, but it's not always obvious when you should be doing that vs designing an API around more specific types - and it's often the case that code goes from using more specific types to being more flexible as it matures (though that's harder to do in cases where you can't reasonably make sure that all user code gets updated when you make changes, which can make fixing such issues in open source code harder than in company code).

So, I'm very much in favor of private being the default, but programmers need to be aware of API issues that can come from being too specific with APIs and locking away stuff that users may actually need. Experience can help a lot with that, though it isn't always easy, and there are plenty of folks out there who just put something together that "works" and leave folks to deal with the mess when something better thought out would have been far more useful. Actively trying to come up with good APIs instead of something that just works can go a long way.

- Jonathan M Davis
Jun 29 2023
prev sibling next sibling parent Steven Schveighoffer <schveiguy gmail.com> writes:
On 6/29/23 10:44 AM, Atila Neves wrote:
 On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 My mind is not fully clear on this topic yet but some related things 
 have been brewing in me for years.

 [...]
 I have lost count of how many times my life has been made difficult by the lack of `private`. I have also lost count of how many times my life has been made easier by the fact that I ruthlessly declare everything `private` unless it has good reason not to be. Ease of refactoring = good, ergo `private` = good and should be the default.
private is good for the library writer. Arbitrary access to private is good for the user/hacker.

Honestly though, since private data is accessible through an escape hatch hack (i.e. `__traits(getMember)`), and the library writer can just say "whatevs, you broke it, you bought it", I think we are in a reasonable space.

-Steve
Jun 29 2023
prev sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Thu, Jun 29, 2023 at 05:54:28PM -0600, Jonathan M Davis via Digitalmars-d
wrote:
 On Thursday, June 29, 2023 8:44:05 AM MDT Atila Neves via Digitalmars-d wrote:
[...]
 I have lost count of how many times my life has been made
 difficult by the lack of `private`.

 I have also lost count of how many times my life has been made
 easier by the fact that I ruthlessly declare everything `private`
 unless it has good reason not to be.

 Ease of refactoring = good, ergo `private` = good and should be
 the default.
 Yeah. As with many things, I think that it primarily comes down to good API design (which can be hard).
[...]

True. It comes down to good API design. Which, as you say, is very hard, probably harder than most people realize. It's easy to slap an ad hoc API onto your library functions, but over time it will prove inadequate for user needs and they will feel frustrated over why certain things are locked behind private.

IME, it takes several iterations of actually using a particular API before it becomes clear where the friction points are and what are possible alternative designs that may work better for user code. (And also, which parts of the API are perhaps needlessly complex and could probably be simplified.) The problem is that if you have actual users during this period of time, they will start writing code that depends on the current API, which obligates you to support an inferior API even after a better design emerges.
 Similarly, if you actually plan your API around generic types, then
 it's much easier for folks to make it work with their own types, but
 it's not always obvious when you should be doing that vs designing an
 API around more specific types
[...]

Yeah, there's definitely a danger of premature generalization. Before you have experience designing a certain library, it's hard to predict what's worth generalizing and what isn't. But it's hard to gain experience without people actually using your library, which then binds you to the non-optimal initial API. So it's a catch-22.

API design is hard.

T

--
What do you mean the Internet isn't filled with subliminal messages? What about all those buttons marked "submit"??
Jun 29 2023
parent reply Atila Neves <atila.neves gmail.com> writes:
On Friday, 30 June 2023 at 02:21:42 UTC, H. S. Teoh wrote:
 On Thu, Jun 29, 2023 at 05:54:28PM -0600, Jonathan M Davis via 
 Digitalmars-d wrote:
 [...]
[...]
 [...]
 [...] True. It comes down to good API design. Which, as you say, is very hard, probably harder than most people realize. It's easy to slap an ad hoc API onto your library functions, but over time it will prove inadequate for user needs and they will feel frustrated over why certain things are locked behind private. IME, it takes several iterations of actually using a particular API before it becomes clear where the friction points are and what are possible alternative designs that may work better for user code. (And also, which parts of the API are perhaps needlessly complex and could probably be simplified.) The problem is that if you have actual users during this period of time, they will start writing code that depends on the current API, which obligates you to support an inferior API even after a better design emerges.
 [...]
 [...] Yeah, there's definitely a danger of premature generalization. Before you have experience designing a certain library, it's hard to predict what's worth generalizing and what isn't. But it's hard to gain experience without people actually using your library, which then binds you to the non-optimal initial API. So it's a catch-22. API design is hard. T
API design is indeed hard. Which makes it all the more imperative to not accidentally design one with implementation details that users downstream start depending on. That is: API design needs to be a conscious opt-in decision and not "I guess I didn't think about the consequences of leaving the door to my flat open all the time and now there are people camping in my living room".
Jun 30 2023
parent reply bachmeier <no spam.net> writes:
On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:

 API design is indeed hard. Which makes it all the more 
 imperative to not accidentally design one with implementation 
 details that users downstream start depending on. That is: API 
 design needs to be a conscious opt-in decision and not "I guess 
 I didn't think about the consequences of leaving the door to my 
 flat open all the time and now there are people camping in my 
 living room".
Private is more like locking everyone else's doors for their own safety. In the cases that it keeps an intruder out, it was helpful to them. When grandma had to sleep on the sidewalk, not so much. Many times library authors have prevented me from doing my work because of arbitrarily preventing access to implementation details. I should have the option to override those decisions. If something blows up, or if my code gets broken in the future, it's my fault, because I was the one that made that decision.
Jun 30 2023
next sibling parent reply monkyyy <crazymonkyyy gmail.com> writes:
On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
 I didn't think about the consequences of leaving the door to 
 my flat open all the time
 Private is more like locking everyone else's doors for their own safety.
Why do people make arguments about data ownership at all? Functions aren't people.
Jun 30 2023
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 6/30/23 17:57, monkyyy wrote:
 On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
 I didn't think about the consequences of leaving the door to my flat 
 open all the time
 Private is more like locking everyone else's doors for their own safety.
 Why do people make arguments about data ownership at all? Functions aren't people.
That's why functions are not making the arguments. API design is a social activity between programmers. Programmers are people. Simple. Anyway, it's not like private actually prevents you from deliberately accessing things, it just makes clear that that's outside the supported API.
Jul 03 2023
prev sibling next sibling parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Fri, Jun 30, 2023 at 02:41:00PM +0000, bachmeier via Digitalmars-d wrote:
 On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:
 
 API design is indeed hard. Which makes it all the more imperative to
 not accidentally design one with implementation details that users
 downstream start depending on. That is: API design needs to be a
 conscious opt-in decision and not "I guess I didn't think about the
 consequences of leaving the door to my flat open all the time and
 now there are people camping in my living room".
Private is more like locking everyone else's doors for their own safety. In the cases that it keeps an intruder out, it was helpful to them. When grandma had to sleep on the sidewalk, not so much. Many times library authors have prevented me from doing my work because of arbitrarily preventing access to implementation details. I should have the option to override those decisions. If something blows up, or if my code gets broken in the future, it's my fault, because I was the one that made that decision.
The thing is, both of the above are true.

Private does have its uses: to hide implementation details from unrelated parts of the code so that, especially in a large project with many contributors, you don't end up with accidental dependencies between parts of the code that really shouldn't depend on each other. Hairball dependencies among unrelated modules are a major factor of unmaintainability in large projects, and preventing this goes a long way to reduce long-term maintenance costs.

The other side to this, however, is that deciding what should be private and what shouldn't is a hard problem, and most people either can't figure it out, or can't be bothered to put in the effort to get it right, so they slap private on everything, making it hard to reuse their code outside of the narrow confines of how they initially envisioned it. So you end up with an API that covers the most common use cases but not others, which causes a lot of frustration when downstream code wants to do something but can't via the API, so they have to resort to copy-pasta or breaking private. (See: API design is hard.)

Most people design APIs around how they envision the module would be (or ought to be) used, at a relatively high level of abstraction, without regard to the core algorithms that would be used to implement this. What we may call a "use-centric API". Contrary to popular belief, this is actually a mistake. It frequently leads to the situation where a useful algorithm that might benefit other parts of the code gets locked behind the private implementation of the module, because it doesn't directly map to the external API. This in turn promotes code duplication: if my module also needs some variant of the same algorithm, I have to copy-n-paste it or re-implement it from scratch in my own module -- usually also behind `private`, so the next person that comes along will need to do it again. It actually *reduces* code reuse. It also fosters the desire to break private: I realize that the algorithm is already implemented, so I wish I could break private in order to avoid rewriting it myself.

A better approach is an algorithm-centric API design: in the course of implementing a module (or library), identify the core algorithms that solve the main problems that the module/library is trying to solve, and design the API around exposing these algorithms to user code. Then on top of that, add some syntactic sugar that maps this to the high-level usage of the algorithm (the use-centric API). There may still be private parts (internal details of the algorithms that the user really doesn't need to know), but these are confined to things that outside code truly doesn't need to know, not a blanket default that may unintentionally exclude certain unusual, but valid, use cases.

There is an important philosophical difference between these two approaches. The first approach tends towards the philosophy of "you have problem X, no problem, hand it over to us (the library), we'll perform the magic to solve it, and we'll give you back the result Y". The method of solution is opaque and hidden from user code. IOW, the hood is welded shut; your only recourse in case of problems is to take it back to the dealer (the library author).

The second approach has the philosophy "you have problem X, we (the library) will give you tools A, B, C, that you can use to solve problem X. In addition, we provide you special combo D (syntactic sugar functions) that will solve X the usual way without you having to figure out how to combine A, B, and C in the right way." The hood is open and you may fiddle with the things inside if you know what you're doing. But most of the time you won't need to -- the syntactic sugar functions handle the most common use cases for you.

The first approach empowers the library writer, the second approach empowers the user. My argument is that the second approach is superior. No abstraction is perfect (otherwise it wouldn't be an abstraction!); there will always be cases where you need to go under the hood and do something the library author didn't envision initially. Give him the tools to do so without breaking encapsulation, instead of forcing him to come back to you for help.

T

-- 
Claiming that your operating system is the best in the world because more people use it is like saying McDonalds makes the best food in the world. -- Carl B. Constantine
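As a rough illustration of the "tools A, B, C plus combo D" layout described above, here is a minimal sketch in D. The names `tokenize`, `tally`, and `wordFrequencies` are hypothetical and were invented for this example; they are not taken from Phobos or any library mentioned in the thread.

```d
import std.array : split;
import std.uni : toLower;

/// Tool A (hypothetical): break text into whitespace-separated tokens.
string[] tokenize(string text)
{
    return text.split;
}

/// Tool B (hypothetical): the core counting algorithm, kept generic so
/// that it works on any array of hashable items, not just words.
size_t[T] tally(T)(T[] items)
{
    size_t[T] counts;
    foreach (item; items)
        counts[item]++;   // a missing key starts at 0, then is incremented
    return counts;
}

/// Combo D (hypothetical): sugar for the common case, built only from
/// the public tools above.
size_t[string] wordFrequencies(string text)
{
    return tally(tokenize(text.toLower));
}

unittest
{
    auto freq = wordFrequencies("the cat and The hat");
    assert(freq["the"] == 2);

    // The same algorithm is reusable on unrelated data.
    assert(tally([3, 1, 3])[3] == 2);
}
```

In this sketch, a caller who only wants word counts uses `wordFrequencies`, while a caller with an unusual need (counting integers, say) can reach `tally` directly instead of copying it.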
Jun 30 2023
parent bachmeier <no spam.net> writes:
On Friday, 30 June 2023 at 16:33:31 UTC, H. S. Teoh wrote:

 Private does have its uses: to hide implementation details from 
 unrelated parts of the code so that, especially in a large 
 project with many contributors, you don't end up with 
 accidental dependencies between parts of the code that really 
 shouldn't depend on each other.
That can never happen if you have to explicitly override something that's been marked private - it's an intentional dependency.
 The other side to this, however, is that deciding what should 
 be private and what shouldn't is a hard problem, and most 
 people either can't figure it out, or can't be bothered to put 
 in the effort to get it right, so they slap private on 
 everything, making it hard to reuse their code outside of the 
 narrow confines of how they initially envisioned it.
It's worse than that. Saying something is private is used as a substitute for documenting or even commenting the code.
 So you end up with an API that covers the most common use cases 
 but not others, which causes a lot of frustration when 
 downstream code wants to do something but can't via the API, so 
 they have to resort to copy-pasta or breaking private. (See: 
 API design is hard.)
It's hard not because you don't know what others need, but because you're marking stuff private and there's no way for anyone else to override that decision. One of the many examples related to the project I just released is the R shared library. The developers have not exported most of the functionality of the library. So when other developers created the Matrix package (now installed by default) for greatly expanded matrix types and operations, they had to resort to copying and pasting large amounts of C code for no obvious reason. Now there are two copies of all that code floating around, but they're probably out of sync. And as I noted above, private means the code is not documented or commented, so who knows if that hasn't resulted in bugs in hard-to-catch edge cases. I agree with the existence of private. In some cases, strictly enforcing privacy is a good thing (though you can't prevent copy and paste). It's difficult to justify the absence of a simple override mechanism. Where it gets really frustrating is when you've invested time getting to 95% of what you need. You're at the point where it almost works, but arbitrary decisions about private mean you'll never be able to achieve 100% of what you need.
Jun 30 2023
prev sibling next sibling parent reply Meta <jared771 gmail.com> writes:
On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 Private is more like locking everyone else's doors for their 
 own safety. In the cases that it keeps an intruder out, it was 
 helpful to them. When grandma had to sleep on the sidewalk, not 
 so much. Many times library authors have prevented me from 
 doing my work because of arbitrarily preventing access to 
 implementation details. I should have the option to override 
 those decisions. If something blows up, or if my code gets 
 broken in the future, it's my fault, because I was the one that 
 made that decision.
IMO private is extremely important for maintaining the internal invariants of a unit of encapsulation.
Jun 30 2023
parent Dom DiSc <dominikus scherkl.de> writes:
On Friday, 30 June 2023 at 16:48:39 UTC, Meta wrote:
 IMO private is extremely important for maintaining the internal 
 invariants of a unit of encapsulation.
Yes. And this is pretty much the only reason to use private. You have functions that don't keep the invariants for performance reasons, so you create public functions that call them in the correct order and with the correct parameters to keep the invariants. So private is there to hide unsafe interfaces and to prevent the user of a library from messing things up. If you want to be able to mess things up, no API will ever be good enough for you: you simply need the source code, and you modify it. And then private won't hinder you; simply remove it.
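The pattern described here (fast private helpers that temporarily break an invariant, wrapped by a public function that restores it) might look like the following sketch in D. `SortedBag` and all of its members are hypothetical names made up for illustration. Note also that D's `private` is module-level, so code in the same module can still call the helpers.

```d
import std.algorithm.sorting : isSorted, sort;

struct SortedBag
{
    private int[] items;

    // Fast, but unsafe on its own: leaves `items` possibly unsorted.
    private void appendRaw(int x) { items ~= x; }

    // Restores the ordering after one or more raw appends.
    private void restoreOrder() { sort(items); }

    /// Public entry point: batches the raw appends, then sorts once,
    /// so callers never observe an unsorted bag.
    void insertAll(const int[] xs)
    {
        foreach (x; xs)
            appendRaw(x);
        restoreOrder();
    }

    /// Read-only view of the contents.
    const(int)[] values() const { return items; }

    // Checked around public member functions in non-release builds.
    invariant()
    {
        assert(isSorted(items));
    }
}

unittest
{
    SortedBag bag;
    bag.insertAll([5, 1, 4]);
    assert(bag.values == [1, 4, 5]);
}
```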
Jul 01 2023
prev sibling next sibling parent Dukc <ajieskola gmail.com> writes:
On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 I should have the option to override those decisions. If 
 something blows up, or if my code gets broken in the future, 
 it's my fault, because I was the one that made that decision.
You do have it. `__traits(getMember, /+...+/)` as others have mentioned, or some ugly casting trickery. Or just patching the library yourself to make the member you want public.
Jul 02 2023
prev sibling parent reply Atila Neves <atila.neves gmail.com> writes:
On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 On Friday, 30 June 2023 at 11:07:33 UTC, Atila Neves wrote:

 API design is indeed hard. Which makes it all the more 
 imperative to not accidentally design one with implementation 
 details that users downstream start depending on. That is: API 
 design needs to be a conscious opt-in decision and not "I 
 guess I didn't think about the consequences of leaving the 
 door to my flat open all the time and now there are people 
 camping in my living room".
Private is more like locking everyone else's doors for their own safety.
I don't see how - it only applies to your own code, adding private doesn't make someone else's code no longer accessible.
 In the cases that it keeps an intruder out, it was helpful to 
 them. When grandma had to sleep on the sidewalk, not so much.
This is where the analogy breaks down. The whole point of private is to make a conscious choice over what is an implementation detail and what is part of the API. If it's the default, the programmer is nudged towards thinking of a good API instead of it being ad-hoc.
 I should have the option to override those decisions.
As a library author, I don't think you should. It's on me to support usage of private functions that I'm nominally allowed to delete, but not really if someone is going to complain.
 If something blows up, or if my code gets broken in the future, 
 it's my fault, because I was the one that made that decision.
In theory, yes. In practice, yelling. We told people that `in` was in flux and because of that, to not use it. People (including me!) did it anyway. Some of them later complained when we decided what to do with it.
Jul 03 2023
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/3/23 3:57 AM, Atila Neves wrote:
 On Friday, 30 June 2023 at 14:41:00 UTC, bachmeier wrote:
 I should have the option to override those decisions.
As a library author, I don't think you should. It's on me to support usage of private functions that I'm nominally allowed to delete, but not really if someone is going to complain.
That is the issue. For instance, if you do:

```d
libFunction(cast(int *)0xdeadbeef);
```

And then complain that `libFunction`'s author didn't handle that case, you can rightfully be told to RTFM. Same thing with circumventing private. It should be *possible*, but absolutely unsupported.
 If something blows up, or if my code gets broken in the future, it's 
 my fault, because I was the one that made that decision.
In theory, yes. In practice, yelling. We told people that `in` was in flux and because of that, to not use it. People (including me!) did it anyway. Some of them later complained when we decided what to do with it.
The definition of `private` shouldn't change at all. The ability to circumvent it still should remain for those wanting to muck with internal data, and I don't think there's any way to get around that (there's always reinterpret casting). The thing is, it's important to identify the *consequences* of changing private data -- it can *never* be within spec for a library to allow private data access. So one can muck around with private data, and pay the cost of zero support (and rightfully so). Or one can petition the library author to provide access to that private data. -Steve
Jul 03 2023
parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Mon, Jul 03, 2023 at 12:32:43PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
[...]
 The definition of `private` shouldn't change at all. The ability to
 circumvent it still should remain for those wanting to muck with
 internal data, and I don't think there's any way to get around that
 (there's always reinterpret casting). The thing is, it's important to
 identify the *consequences* of changing private data -- it can *never*
 be within spec for a library to allow private data access.
 
 So one can muck around with private data, and pay the cost of zero
 support (and rightfully so). Or one can petition the library author to
 provide access to that private data.
[...]

I think we all agree that the mechanics of this won't (and shouldn't) change. But I think the OP was arguing at a higher level of abstraction. It isn't so much about whether private should be overridable or not, or even whether some piece of data in an object should be private or not; the question IMO is whether the library could have been designed in such a way that there's no *need* for private data in the first place. Or at least, the need for such is minimized.

A library with tons of private state and only a rudimentary public API is generally more likely to have situations where the user will be left wishing that there were a couple more knobs to turn that can be used to customize the library's behaviour.

A library with less private state, or just as much private state but with a sophisticated API that lets you tweak more things, would be less likely to leave the user out in the cold with unusual use cases. However, it does risk having too many knobs to turn, causing the API to be far more complex than it ought to be. Which in turn can lead to unnecessary complexity: the combinatorial explosion of configurations makes it hard for the author to test every combination, so there may be lots of bugs hidden behind uncommon corner cases.

The ideal library is one where there's almost no private state because there's no need for it: the code Just Works(tm) for any combination of values one may assign to the public state. The API is simple and concise, yet easily composable and naturally extends to all kinds of use cases, including unusual ones and ones the author himself never envisioned -- yet it all just works together naturally.

This ideal may or may not be attainable, but the closer a library gets to this ideal, the better.

T

-- 
It always amuses me that Windows has a Safe Mode during bootup. Does that mean that Windows is normally unsafe?
Jul 03 2023
parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/3/23 2:05 PM, H. S. Teoh wrote:
 On Mon, Jul 03, 2023 at 12:32:43PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 [...]
 The definition of `private` shouldn't change at all. The ability to
 circumvent it still should remain for those wanting to muck with
 internal data, and I don't think there's any way to get around that
 (there's always reinterpret casting). The thing is, it's important to
 identify the *consequences* of changing private data -- it can *never*
 be within spec for a library to allow private data access.

 So one can muck around with private data, and pay the cost of zero
 support (and rightfully so). Or one can petition the library author to
 provide access to that private data.
[...] I think we all agree that the mechanics of this won't (and shouldn't) change. But I think the OP was arguing at a higher level of abstraction. It isn't so much about whether private should be overridable or not, or even whether some piece of data in an object should be private or not; the question IMO is whether the library could have been designed in such a way that there's no *need* for private data in the first place. Or at least, the need for such is minimized. A library with tons of private state and only a rudimentary public API is generally more likely to have situations where the user will be left wishing that there were a couple more knobs to turn that can be used to customize the library's behaviour.
But that's the thing, there are parts that *simply must be private*. No matter how you cut it, it has to have some level of privacy, because otherwise, you can't enforce semantic invariants with the type. Should array length (not the property, but the actual data field) be public? What about the pointer? Of course not. Yet, you still might want to access those things for some reason. That doesn't mean it's worth a change to public just for that one reason.
 
 A library with less private state, or just as much private state but
 with a sophisticated API that lets you tweak more things, would be less
 likely to leave the user out in the cold with unusual use cases.
 However, it does risk having too many knobs to turn, causing the API to
 be far more complex than it ought to be. Which in turn can lead to
 unnecessary complexity: the combinatorial explosion of configurations
 make it hard for the author to test every combination, so there may be
 lots of bugs hidden behind uncommon corner cases.
It's easy to talk about this in general terms, like "let you tweak more things", but when you start talking about non-abstract real cases, usually the reason for private data becomes obvious. The thing is, if it does make sense that something should just be public, making it public is easy, just make a PR to do it, and the benefits/drawbacks can be discussed, planned for, and agreed upon. Going the other way is much much worse. If you provide public access, it then becomes a supported API. I remember one case in the past, some type in phobos had undocumented members that were public due to laziness or carelessness. When the code had to change to a different implementation, we had to deprecate that access for years before actually changing. It was horrid. There is a real cost to careless publicity. -Steve
Jul 03 2023
parent reply "H. S. Teoh" <hsteoh qfbox.info> writes:
On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 7/3/23 2:05 PM, H. S. Teoh wrote:
[...]
 I think we all agree that the mechanics of this won't (and
 shouldn't) change. But I think the OP was arguing at a higher level
 of abstraction.  It isn't so much about whether private should be
 overridable or not, or even whether some piece of data in an object
 should be private or not; the question IMO is whether the library
 could have been designed in such a way that there's no *need* for
 private data in the first place. Or at least, the need for such is
 minimized.
 
 A library with tons of private state and only a rudimentary public
 API is generally more likely to have situations where the user will
 be left wishing that there were a couple more knobs to turn that can
 be used to customize the library's behaviour.
But that's the thing, there are parts that *simply must be private*. No matter how you cut it, it has to have some level of privacy, because otherwise, you can't enforce semantic invariants with the type. Should array length (not the property, but the actual data field) be public? What about the pointer? Of course not. Yet, you still might want to access those things for some reason. That doesn't mean it's worth a change to public just for that one reason.
We're actually agreeing with each other, y'know. :-D As I said, the *ideal* is that you wouldn't have private state, or that the private state would be minimal. In practice, of course, certain things *should* be private, and that's not a problem. The problems the OP described arise when either private is used carelessly, causing things to be private that really need not be, or the API is poorly designed, so that parts of the library that ought to be reusable aren't just because of some arbitrary decision made by the author. I've never heard people complaining about how the array length data field is private, for example. That's because it being private does not hinder the user from doing whatever he wants to do with the array (short of breaking the implementation and doing something involving UB, of course). That's an example of proper usage of private. An example of where private hinders what a user might wish to do is an algorithm used internally by the library, that for whatever reason is private and unusable outside of the library code, even though the algorithm itself is general and can be applied outside of the scope of the library. Often in such cases there are immediate pragmatic reasons for it -- the implementation of the algorithm is bound to internal implementation details of other library code, for example. So you can't actually make it public without also making lots of things public that probably shouldn't be. But at a higher level, one asks the question, why is that algorithm implemented in that way in the first place? It could have been implemented generically, and the library could have used just a specialized instance of it to solve whatever it is it needs to solve, but the algorithm itself should be available for user code to use. *That's* the proper design. 
But alas, all too often this is not done, and you end up with 5 different implementations of the same algorithm, each with different quirks (and often, different subsets of bugs), and all of them are locked up behind `private`, or require some tangential private structure as argument that isn't constructible except via a long-winded circuitous route that probably doesn't do what the user actually wants it to do, even though the algorithm itself doesn't actually depend on this. Ultimately these details are just the incidental symptoms. The underlying root cause is a poor design that doesn't correctly decouple orthogonal functionality into reusable pieces. --T
Jul 03 2023
next sibling parent claptrap <clap trap.com> writes:
On Monday, 3 July 2023 at 19:27:45 UTC, H. S. Teoh wrote:
 On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer 
 via Digitalmars-d wrote:
 On 7/3/23 2:05 PM, H. S. Teoh wrote:
[...]

 We're actually agreeing with each other, y'know. :-D

 As I said, the *ideal* is that you wouldn't have private state, 
 or that the private state would be minimal.
the correct usage of "ideal" is.. "Ideally we would do X but we don't because the world is full of idiots" ;)
Jul 03 2023
prev sibling parent reply Steven Schveighoffer <schveiguy gmail.com> writes:
On 7/3/23 3:27 PM, H. S. Teoh wrote:
 On Mon, Jul 03, 2023 at 02:30:14PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 7/3/23 2:05 PM, H. S. Teoh wrote:
[...]
 I think we all agree that the mechanics of this won't (and
 shouldn't) change. But I think the OP was arguing at a higher level
 of abstraction.  It isn't so much about whether private should be
 overridable or not, or even whether some piece of data in an object
 should be private or not; the question IMO is whether the library
 could have been designed in such a way that there's no *need* for
 private data in the first place. Or at least, the need for such is
 minimized.

 A library with tons of private state and only a rudimentary public
 API is generally more likely to have situations where the user will
 be left wishing that there were a couple more knobs to turn that can
 be used to customize the library's behaviour.
But that's the thing, there are parts that *simply must be private*. No matter how you cut it, it has to have some level of privacy, because otherwise, you can't enforce semantic invariants with the type. Should array length (not the property, but the actual data field) be public? What about the pointer? Of course not. Yet, you still might want to access those things for some reason. That doesn't mean it's worth a change to public just for that one reason.
We're actually agreeing with each other, y'know. :-D
Yeah kind of. It's just that there are 2 types of privacy labeling, careless and designed.
 As I said, the *ideal* is that you wouldn't have private state, or that
 the private state would be minimal.  In practice, of course, certain
 things *should* be private, and that's not a problem. The problems the
 OP described arise when either private is used carelessly, causing
 things to be private that really need not be, or the API is poorly
 designed, so that parts of the library that ought to be reusable aren't
 just because of some arbitrary decision made by the author.
If you carelessly label your fields as public, then realizing later they should have been private is costly, maybe impossible. If you carelessly label your fields as private, while it might upset some people, making them public later is easy. So if you are going to "not care" about public/private, technically the less risky choice is to make everything private, and worry about it later if it becomes an issue. So in that sense I disagree with the OP's point. That being said, I've done a lot of libs where I just don't care and leave everything public. It's mostly because I don't expect widespread usage, and I also don't mind breaking people's code (I don't think any of my projects that I started are past 1.0 yet). But something like Phobos shouldn't be so careless. We really should continue to make careless things private unless there is a good reason to make them public.
 
 I've never heard people complaining about how the array length data
 field is private, for example.  That's because it being private does not
 hinder the user from doing whatever he wants to do with the array (short
 of breaking the implementation and doing something involving UB, of
 course).  That's an example of proper usage of private.
It's an obvious example that we all can agree on. If we agree there are clearly cases where private is important, then we start working our way back to where the line should be drawn.
 An example of where private hinders what a user might wish to do is an
 algorithm used internally by the library, that for whatever reason is
 private and unusable outside of the library code, even though the
 algorithm itself is general and can be applied outside of the scope of
 the library.  Often in such cases there are immediate pragmatic reasons
 for it -- the implementation of the algorithm is bound to internal
 implementation details of other library code, for example. So you can't
 actually make it public without also making lots of things public that
 probably shouldn't be.  But at a higher level, one asks the question,
 why is that algorithm implemented in that way in the first place?  It
 could have been implemented generically, and the library could have used
 just a specialized instance of it to solve whatever it is it needs to
 solve, but the algorithm itself should be available for user code to
 use.  *That's* the proper design.
I agree that some things shouldn't be private. But what's the answer? When it should be public, just change it to public! An actual example of this in Phobos is the absence of a binary search algorithm. It's there, in SortedRange. But that implementation is private basically for no good reason (it can be trivially extracted into its own function). And SortedRange in itself is a schizophrenic meld of overbearing restrictions and puzzling allowances. The only reason I haven't made a PR for it is I just made a copy in my own code and have moved on. But it would probably be pretty trivial to expose. -Steve
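For readers wondering what a standalone binary search could look like, here is a minimal sketch of one as a free function in D. It is not the code inside Phobos's `SortedRange`, and the name `lowerBoundIndex` is hypothetical. It assumes the input is already sorted in ascending order.

```d
import std.range.primitives : hasLength, isRandomAccessRange;

/// Returns the index of the first element that is not less than `needle`,
/// or `haystack.length` if every element is smaller.
/// Assumption: `haystack` is sorted in ascending order.
size_t lowerBoundIndex(R, T)(R haystack, T needle)
if (isRandomAccessRange!R && hasLength!R)
{
    size_t lo = 0;
    size_t hi = haystack.length;
    while (lo < hi)
    {
        // Written this way to avoid overflow in `lo + hi`.
        immutable mid = lo + (hi - lo) / 2;
        if (haystack[mid] < needle)
            lo = mid + 1;   // answer lies strictly to the right of mid
        else
            hi = mid;       // mid itself may be the answer
    }
    return lo;
}

unittest
{
    auto data = [1, 3, 3, 7, 9];
    assert(lowerBoundIndex(data, 3) == 1);
    assert(lowerBoundIndex(data, 4) == 3);
    assert(lowerBoundIndex(data, 10) == data.length);
}
```

Because it takes any random-access range with a length, a function shaped like this can be used without first wrapping the data in another type.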
Jul 03 2023
parent "H. S. Teoh" <hsteoh qfbox.info> writes:
On Mon, Jul 03, 2023 at 10:14:38PM -0400, Steven Schveighoffer via
Digitalmars-d wrote:
 On 7/3/23 3:27 PM, H. S. Teoh wrote:
[...]
 As I said, the *ideal* is that you wouldn't have private state, or
 that the private state would be minimal.  In practice, of course,
 certain things *should* be private, and that's not a problem. The
 problems the OP described arise when either private is used
 carelessly, causing things to be private that really need not be, or
 the API is poorly designed, so that parts of the library that ought
 to be reusable aren't just because of some arbitrary decision made
 by the author.
If you carelessly label your fields as public, then realizing later they should have been private is costly, maybe impossible.
Depends. D is flexible enough that public fields can be replaced with access functions, and almost all downstream code doesn't have to change to adapt to it. I've done it a lot in my own code, where some field, say mydata, was previously public but now needs to be private. No problem: just rename it to _mydata, and create access functions mydata() and mydata(typeof(_mydata)) to maintain compatibility with old code. Unless downstream code does something like take an address of the old field, this change will be transparent, a recompile will make it all work as before without requiring further changes.
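A minimal sketch of that migration in D, assuming a hypothetical `Widget` type whose `mydata` field used to be a public `int`. The non-negative check in the setter is an invented example of what the indirection makes possible; it is not part of the original description.

```d
struct Widget
{
    // Before the change this was simply: `int mydata;`
    private int _mydata;

    /// Getter: keeps existing reads such as `w.mydata` compiling.
    @property typeof(_mydata) mydata() const
    {
        return _mydata;
    }

    /// Setter: keeps existing writes such as `w.mydata = 5` compiling,
    /// and gives the author a place to validate (hypothetical check).
    @property void mydata(typeof(_mydata) value)
    {
        assert(value >= 0, "mydata must be non-negative");
        _mydata = value;
    }
}

unittest
{
    Widget w;
    w.mydata = 42;           // calls the setter
    assert(w.mydata == 42);  // calls the getter
}
```

As noted above, code that takes the address of the old field (`&w.mydata`) is one case this does not keep working unchanged.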
 If you carelessly label your fields as private, while it might upset
 some people, making them public later is easy.
The point is that it then bottlenecks on the author. If the author is not responsive for whatever reason (busy, abandoned the project, etc.) downstream users are stuck up the creek without a paddle.
 So if you are going to "not care" about public/private, technically
 the less risky choice is to make everything private, and worry about
 it later if it becomes an issue. So in that sense I disagree with the
 OP point.
OK, I guess we differ on this point. Given the choice between having to wait for a potentially MIA author to fix an issue and having the ability to go under the hood to manually work around the issue, I choose the latter.
 That being said, I've done a lot of libs where I just don't care and
 leave everything public. It's mostly because I don't expect widespread
 usage, and I also don't mind breaking people's code (I don't think any
 of my projects that I started are past 1.0 yet). But something like
 Phobos shouldn't be so careless. We really should continue to make
 careless things private unless there is a good reason to make them
 public.
I guess this has to be judged on a case-by-case basis.
 I've never heard people complaining about how the array length data
 field is private, for example.  That's because it being private does
 not hinder the user from doing whatever he wants to do with the
 array (short of breaking the implementation and doing something
 involving UB, of course).  That's an example of proper usage of
 private.
It's an obvious example that we all can agree on. If we agree there are clearly cases where private is important, then we start working our way back to where the line should be drawn.
My personal criterion is, if something can be designed without private (and without opening up holes that may allow user code to break stuff), prefer that design. Barring that, prefer the design that has the least amount of private possible for it to work without opening up loopholes for breakage. In general, I don't quite agree with e.g. Java's approach of making everything private by default and having only member functions mediate access to private state. My approach is to prefer POD types that hold public data that anybody can safely mutate, and public functions that operate on said POD types, rather than the closed-box approach advocated by OO. There's a time and place for the closed-box approach, of course. But in my book, that's the less preferred option that you'd fall back on only if you couldn't do it another way. And even when you can't avoid the closed-box approach, my preference is to minimize the degree of closedness as much as possible.
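As a rough illustration of the POD-plus-free-functions style, assuming a made-up type `Vec2` and made-up functions `norm` and `scaled` (none of these names come from the thread or from Phobos):

```d
import std.math : sqrt;

// Hypothetical POD type: all state is public and every value is valid,
// so there is no invariant for user code to break.
struct Vec2
{
    double x = 0;
    double y = 0;
}

// Free functions operate on the data; UFCS lets callers write v.norm.
double norm(Vec2 v) { return sqrt(v.x * v.x + v.y * v.y); }

Vec2 scaled(Vec2 v, double k) { return Vec2(v.x * k, v.y * k); }

void main()
{
    auto v = Vec2(3, 4);
    assert(v.norm == 5.0);

    // Anybody may mutate the fields directly.
    v.x = 6;
    v.y = 8;
    assert(v.scaled(0.5).norm == 5.0);
}
```

Because the functions take plain data, user code can add its own operations on `Vec2` without asking the author to open anything up.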
 An example of where private hinders what a user might wish to do is
 an algorithm used internally by the library, that for whatever
 reason is private and unusable outside of the library code, even
 though the algorithm itself is general and can be applied outside of
 the scope of the library.  Often in such cases there are immediate
 pragmatic reasons for it -- the implementation of the algorithm is
 bound to internal implementation details of other library code, for
 example. So you can't actually make it public without also making
 lots of things public that probably shouldn't be.  But at a higher
 level, one asks the question, why is that algorithm implemented in
 that way in the first place?  It could have been implemented
 generically, and the library could have used just a specialized
 instance of it to solve whatever it is it needs to solve, but the
 algorithm itself should be available for user code to use.  *That's*
 the proper design.
I agree that some things shouldn't be private. But what's the answer? When it should be public, just change it to public!
It's not always so simple, though. The algorithm might have been implemented in a way that depends on private types and internal assumptions that may break in unforeseen ways if you use it without realizing what the assumptions are. Forcibly changing it to public may require you to make other stuff public that shouldn't be. Or it may be written in a way that's tightly coupled to other internal library code, such that you can't call it separately. This gets particularly frustrating when the core of the algorithm itself does *not* depend on these things, but the upstream author wrote it that way because "it's private, so nobody cares if this code is dirty and badly designed". Being able to hide bad code behind private encourages the kind of one-off hack that avoids having to think about proper code decomposition.
 An actual example of this in Phobos is the absence of a binary search
 algorithm. It's there, in SortedRange. But that implementation is
 private basically for no good reason (it can be trivially extracted
 into its own function). And SortedRange in itself is a schizophrenic
 meld of overbearing restrictions and puzzling allowances.
Yeah, that binary search function really ought to be public. I think by now, experience has more than proven that SortedRange was a mistake. It was an attempt to encode the sortedness of a range in the type system such that Phobos would be able to take advantage of this to provide performance improvements, but D's type system simply isn't powerful enough to express what's needed for this without unnecessary limitations and the weird quirks you see in the current implementation of SortedRange. It was an interesting and ambitious experiment, but I think it has run its course and the conclusion is that it doesn't work in the current language. Or at least isn't pulling its own weight given its current limitations. Perhaps it's time to send it to the scrap yard.
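For concreteness, here is a sketch of what a free-standing binary search over a plain slice could look like. The function name `lowerBoundIndex` is hypothetical and is not the private Phobos implementation; it is an ordinary textbook lower-bound search. The last assertion shows the existing route through `std.range.assumeSorted` and `SortedRange.lowerBound` for comparison.

```d
import std.range : assumeSorted;

// Hypothetical free function, not part of Phobos: index of the first
// element that is not less than `needle` in an already sorted slice.
size_t lowerBoundIndex(T)(const(T)[] sorted, const T needle)
{
    size_t lo = 0, hi = sorted.length;
    while (lo < hi)
    {
        immutable mid = lo + (hi - lo) / 2;
        if (sorted[mid] < needle)
            lo = mid + 1;   // answer lies to the right of mid
        else
            hi = mid;       // mid itself may be the answer
    }
    return lo;
}

void main()
{
    int[] data = [1, 3, 5, 7, 9];

    // Direct call on a plain slice.
    assert(data.lowerBoundIndex(5) == 2);
    assert(data.lowerBoundIndex(6) == 3);
    assert(data.lowerBoundIndex(100) == data.length);

    // The existing route: wrap in a SortedRange first, then count the
    // elements strictly less than the needle.
    assert(data.assumeSorted.lowerBound(5).length == 2);
}
```

The caller is responsible for the slice actually being sorted; the sketch does not check it.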
 The only reason I haven't made a PR for it is I just made a copy in my
 own code and have moved on. But it would probably be pretty trivial to
 expose.
[...] IMO, we should just get rid of SortedRange and make the binary search algo a public function. Or even if we don't get rid of SortedRange (breakage of existing code and all that), I don't see why the binary search function shouldn't be publicly available. This is exactly the kind of abuse of `private` I was talking about: the function is clearly there and ready to use, but the author for various reasons decided that no, you're not allowed to just call the function, you have to jump through this here set of hoops to prove your worthiness first.

T

--
My father told me I wasn't at all afraid of hard work. I could lie down right next to it and go to sleep. -- Walter Bright
Jul 05 2023
Dukc <ajieskola gmail.com> writes:
On Tuesday, 27 June 2023 at 21:53:59 UTC, Ali Çehreli wrote:
 The main topic here is about the harm caused by rich types 
 surrounding algorithms. Let's say I am interested in using an 
 open source algorithm that works with a memory area. (Not 
 related to D.) We all know that a memory area can be described 
 by a fat pointer like D's slices. So, that is what the 
 algorithm should take.

 Unfortunately, the poor little algorithm is not free to be 
 used: It is written to work with a custom type of that library; 
 let's call it MySlice, which is produced by MyMemoryMappedFile, 
 which is produced by MyFile, which is initialized only by types 
 like MyFilePath. (I may have gotten the relationships wrong 
 there.)

 But my data is already in a memory area that I own! How can I 
 call that algorithm? Should I write it to a file first and then 
 use those rich types to access the algorithm? That should not 
 be necessary...
The language-agnostic answer is to patch the library yourself to do what you want. Since D is a systems programming language, you also have another choice: bypass the type system and create `MySlice` by pointer casting it from the data representing a D slice. Now, neither of these solutions is exactly inviting. But they cannot be: to create `MySlice` in a way the library doesn't support, you have to know its private implementation details. Even if the language didn't give the library author a way to protect those details, you'd be relying on undocumented version-specific details. Not having `private` would be better in the sense that you'd be more likely to get compiler errors instead of memory corruption if the private details change. Maybe `__traits(getMember, /+...+/)`, or declaring a private function as an external `extern(C)` function, CTFE-mangling the D name, would be safer than the pointer cast I proposed.
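To make the pointer-cast option concrete, here is a sketch under strong assumptions. `MySlice` below is a made-up stand-in whose fields happen to match the layout the D ABI documents for dynamic arrays (length first, then pointer); a real library's type could be laid out differently, and in a real library it would live in another module where the private fields are out of reach.

```d
// Hypothetical stand-in for the library's type. The field order is an
// assumption chosen to match D's slice layout (length, then pointer).
struct MySlice
{
    private size_t _length;
    private ubyte* _ptr;

    size_t length() const { return _length; }
    ubyte opIndex(size_t i) const { return _ptr[i]; }
}

// Guard against layout drift. This catches size changes only,
// not reordered or retyped fields.
static assert(MySlice.sizeof == (ubyte[]).sizeof);

void main()
{
    ubyte[] mine = [10, 20, 30];

    // Reinterpret the slice's bytes as a MySlice. This is @system code:
    // the compiler verifies nothing about the cast.
    MySlice s = *cast(MySlice*) &mine;

    assert(s.length == 3);
    assert(s[1] == 20);
}
```

If the library ever changes the private fields in a way that keeps the size the same, this keeps compiling and silently reads garbage, which is the memory-corruption risk described above.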
Jul 02 2023