
digitalmars.D - std.fileformats?

berni44 <dlang d-ecke.de> writes:
In Phobos there are several top-level modules, each of which deals with 
one specific file format. Wouldn't it be better to put them all 
into std.fileformats (or some similar place)?

std.base64
std.csv
std.json
std.xml
std.zip
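
For readers who have not used these modules, here is a minimal sketch of what three of them do today (std.base64, std.csv and std.json; std.xml and std.zip are left out for brevity). The helper names `base64RoundTrip`, `sumSecondColumn` and `readPort` are made up for this illustration and are not part of Phobos:

```d
import std.base64 : Base64;
import std.csv : csvReader;
import std.json : parseJSON;
import std.typecons : Tuple;

/// Encodes a string with std.base64 and decodes it again.
string base64RoundTrip(string text)
{
    // Base64.encode works on ubyte arrays and returns a char[].
    char[] encoded = Base64.encode(cast(const(ubyte)[]) text);
    return cast(string) Base64.decode(encoded);
}

/// Sums the second column of a header-less two-column CSV text.
int sumSecondColumn(string csvText)
{
    int total = 0;
    // Each record is parsed into a Tuple!(string, int).
    foreach (record; csvReader!(Tuple!(string, int))(csvText))
        total += record[1];
    return total;
}

/// Reads the (hypothetical) integer field "port" from a JSON object.
long readPort(string jsonText)
{
    auto doc = parseJSON(jsonText);
    return doc["port"].integer;
}

void main()
{
    import std.stdio : writeln;

    writeln(base64RoundTrip("Phobos"));
    writeln(sumSecondColumn("a,1\nb,2\n"));
    writeln(readPort(`{"port": 8080}`));
}
```

All three are self-contained today, which is part of why the question of where they should live comes up at all.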

If yes, some more questions pop up:

a) Should we use the change as an opportunity to replace some of the 
modules? At least for std.xml I found std-experimental-xml [1], which 
seems to be intended to replace std.xml eventually. And there are lots 
of packages which address JSON. I don't know if there is one that 
is *the* candidate for replacement.

b) Should we instead remove some of these? std.zip is probably 
the first candidate here. (I put some work into it in the last few 
weeks, but I would be fine with throwing that away.)

This is just a gut feeling; I'm wondering what you think about it.

[1] https://code.dlang.org/packages/std-experimental-xml
Jan 06 2020
JN <666total wp.pl> writes:
On Monday, 6 January 2020 at 19:40:10 UTC, berni44 wrote:
 b) Should we instead remove some of these? Probably std.zip is 
 here the first candidate. (I put some work in it in the last 
 few weeks, but it would be fine for me throwing this away.)

 It's on a gut level - just wondering, what you think about this.

 [1] https://code.dlang.org/packages/std-experimental-xml
IMO all except base64 could be removed. Putting everything into the standard library made a lot of sense in the times before we got a package manager. Nowadays it might be better to simplify the standard library and just have XML, JSON, ZIP, CSV as "blessed" packages.
Jan 06 2020
Laeeth Isharc <laeeth kaleidic.io> writes:
On Monday, 6 January 2020 at 21:05:37 UTC, JN wrote:
 On Monday, 6 January 2020 at 19:40:10 UTC, berni44 wrote:
 b) Should we instead remove some of these? Probably std.zip is 
 here the first candidate. (I put some work in it in the last 
 few weeks, but it would be fine for me throwing this away.)

 It's on a gut level - just wondering, what you think about 
 this.

 [1] https://code.dlang.org/packages/std-experimental-xml
 IMO all except base64 could be removed. Putting everything into the standard library made a lot of sense in the times before we got a package manager. Nowadays it might be better to simplify the standard library and just have XML, JSON, ZIP, CSV as "blessed" packages.
Dependencies will be our doom. Whereas if you use something in Phobos then you have some confidence it will still build across platforms in a couple of years. It also seems to make sense to work on making code.dlang.org ultra-reliable before making it necessary to use libraries before doing even basic things.
Jan 06 2020
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Mon, Jan 06, 2020 at 11:38:25PM +0000, Laeeth Isharc via Digitalmars-d wrote:
 On Monday, 6 January 2020 at 21:05:37 UTC, JN wrote:
 On Monday, 6 January 2020 at 19:40:10 UTC, berni44 wrote:
 b) Should we instead remove some of these? Probably std.zip is
 here the first candidate. (I put some work in it in the last few
 weeks, but it would be fine for me throwing this away.)
 
 It's on a gut level - just wondering, what you think about this.
 
 [1] https://code.dlang.org/packages/std-experimental-xml
 IMO all except base64 could be removed. Putting everything into the standard library made a lot of sense in the times before we got a package manager. Nowadays it might be better to simplify the standard library and just have XML, JSON, ZIP, CSV as "blessed" packages.
 Dependencies will be our doom. Whereas if you use something in Phobos then you have some confidence it will still build across platforms in a couple of years.
 [...]
I agree, recent experiences have led me to the conclusion that dependencies are a liability rather than a benefit. The only exception is if you copy the code into your source tree, and periodically (manually) update it.

Dependency resolution is NP-complete (gives a whole new meaning to "dependency hell"); that should be a big red flag that dependencies are something to be avoided where possible, or at least treated with care, not relished.

Even std.zip has been the source of trouble in the past: users would install dmd, code away happily, then get slammed with linker errors they don't know how to resolve. Eventually, the solution was to bundle zlib with the dmd distribution packages. Again, you see, dependencies are a liability, and the solution is to bundle the dependency with your main package so that the user never has to do any dependency resolution.

This is why I've found that Adam's arsd libraries have been the best out of the D libraries out there: his philosophy of minimal (preferably no) dependencies, and everything bundled in a single source file, has been a boon. You just copy the source file into the right place in your source tree, check it in as part of your code repo, and never have to worry about sudden breakage beyond your control. Every now and then, just to stay up to date, git pull Adam's repo and copy the new file(s) over.

Doing this manually may seem tedious in this day and age of instant gratification, but it's actually a benefit:

(1) Since you manually copy the file(s) over, you're aware of what dependencies exactly you're pulling in, and (hopefully) are taking measures to prevent pulling in unnecessary extra cruft;

(2) You will likely have enough sense to make sure your code compiles with the new version of the file before committing to the repo, thus avoiding the all-too-common problem of code breakage due to incompatibility with newer versions of the dependency: if it's checked in, it compiles, 100% guaranteed. Your buildability does not depend on the volatile state of some random server somewhere out there on the Internet.

(3) Your collaborators will never be compiling using different versions of your dependencies and getting different results, which causes a lot of headaches trying to track down problems (everyone's build behaves slightly differently).

(4) Should the upstream authors of the dependency abandon their project, vanish into the ether, or the dependency otherwise become unavailable, your code will still compile and still work. Again, you remove the possibility of random breakage caused by random internet outages.

Future readers of your code will appreciate that *all* the code is there in the repo, without big missing chunks from dependencies that may no longer exist 10 or 20 years down the road. Even if the code will no longer compile by then, they can at least still see how it works. (And if you take this to the logical conclusion, bundling the exact version of the compiler you used to build the executables seems a logical possibility that will ensure compilability far into the future, even after your dependencies' maintainers have long abandoned them.)

After so many decades of academia and industry alike trumpeting code reuse, I'm starting to become skeptical that perhaps King Code Reuse has invisible clothes. Dependency hell is a smell suggesting that something is fundamentally wrong with the concept.

T

--
That's not a bug; that's a feature!
Jan 06 2020
berni44 <dlang d-ecke.de> writes:
On Tuesday, 7 January 2020 at 00:10:08 UTC, H. S. Teoh wrote:
 After so many decades of academia and industry alike trumpeting 
 code reuse, I'm starting to become skeptical that perhaps King 
 Code Reuse has invisible clothes. Dependency hell is a smell 
 suggesting that something is fundamentally wrong with the 
 concept.
Years ago I came to the same conclusion but in a completely different area: I was programming generators for sudoku puzzles and their variants. This looks like being ideal for OOP, because all those variations are quite similar, but it proved not to be. The puzzles the program produced were not really good. And the reason was that every variation needs its own optimization for good results, and that was almost impossible with all that code sharing. Eventually I rewrote everything with a single program for every variation. That worked much better, although it was code repetition over and over again.
Jan 07 2020
"H. S. Teoh" <hsteoh quickfur.ath.cx> writes:
On Tue, Jan 07, 2020 at 12:17:57PM +0000, berni44 via Digitalmars-d wrote:
 On Tuesday, 7 January 2020 at 00:10:08 UTC, H. S. Teoh wrote:
 After so many decades of academia and industry alike trumpeting code
 reuse, I'm starting to become skeptical that perhaps King Code Reuse
 has invisible clothes.
[...]
 Years ago I came to the same conclusion but in a completely different
 area: I was programming generators for sudoku puzzles and their
 variants. This looks like being ideal for OOP, because all those
 variations are quite similar, but it proved not to be.
Yeah, OOP is another of those things that IMNSHO has been overhyped far beyond its actual scope of usefulness. It *does* have its uses, that's undeniable, but only to certain classes (har har) of programming problems. In writing GUI components, for example, it seems to work really well. Trying to shoehorn *every* programming problem into an OO model, however, to me is a code smell. (Ironically, modern OO design advice is to avoid inheritance and embrace composition, which to me is a sign that polymorphism via inheritance, one of the foundational features of OOP, isn't as widely applicable as it's hyped to be.)
 The puzzles the program produced were not really good. And the
 reason was that every variation needs its own optimization for good
 results, and that was almost impossible with all that code sharing.
 Eventually I rewrote everything with a single program for every
 variation. That worked much better, although it was code repetition
 over and over again.
Mind you, I'm not against code reuse via refactoring per se. Too much code repetition is also a code smell... but refactoring and generalizing to the point of obsession is a sign that it has gone off the deep end.

Sometimes, it's just not worth the trouble, and the technical debt, to factor out the common code of two (or more) similar pieces of code. They may be similar but differ in complex ways that cannot be easily captured via a simple abstraction, and if you try to forcefully push the refactoring through anyway, you usually end up with a completely disproportionately more complex architecture that suffers from obvious symptoms of premature (or over-) generalization and over-engineering. The result usually is an order of magnitude more difficult to maintain, disproportionately harder to understand, and contains disproportionately more LOC.

In such cases, the best solution may actually be copy-n-paste + modify. The code may well turn out to be simpler and more maintainable that way!

T

--
Why do conspiracy theories always come from the same people??
Jan 07 2020
Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/6/20 6:38 PM, Laeeth Isharc wrote:
 On Monday, 6 January 2020 at 21:05:37 UTC, JN wrote:
 On Monday, 6 January 2020 at 19:40:10 UTC, berni44 wrote:
 b) Should we instead remove some of these? Probably std.zip is here 
 the first candidate. (I put some work in it in the last few weeks, 
 but it would be fine for me throwing this away.)

 It's on a gut level - just wondering, what you think about this.

 [1] https://code.dlang.org/packages/std-experimental-xml
 IMO all except base64 could be removed. Putting everything into the standard library made a lot of sense in the times before we got a package manager. Nowadays it might be better to simplify the standard library and just have XML, JSON, ZIP, CSV as "blessed" packages.
 Dependencies will be our doom. Whereas if you use something in Phobos then you have some confidence it will still build across platforms in a couple of years. It also seems to make sense to work on making code.dlang.org ultra-reliable before making it necessary to use libraries before doing even basic things.
Except you do need to depend on it to do basic things. Have you tried using std.xml?

Because something lives on code.dlang.org does not mean it is unreliable. The reliability depends on the developer, not the platform. I imagine vibe.d is going to be super-reliable for years, and it lives on code.dlang.org.

H.S. Teoh also talks about arsd. This is also available on code.dlang.org (though not exactly built for it), but the mechanism of copying license compatible code into your project so you can maintain it doesn't depend on it being outside of code.dlang.org. You can do that with anything! D modules are pretty movable.

One thing that is common between these two projects (and Phobos as well) -- They are both major projects with almost all their dependencies included in the project. That is, the project moves together, so you always have a cohesive "mini-standard" library.

The dependency problem really shows up when your dependencies depend on dependencies that depend on dependencies that are all written and maintained by various people. A great example of how this can be a problem was the story of npm left-pad:

https://qz.com/646467/how-one-programmer-broke-the-internet-by-deleting-a-tiny-piece-of-code/

There's something to be said about having things outside the standard library not for the reasons of stability but for maneuverability. One can easily add/improve/change projects that are outside the standard library, but poor decisions in the standard library are sometimes stuck there (see std.xml).

For file formats, or parsers, or really any kind of system that has innumerable ways of solving the problem, I really think the standard library is not where they should be. The benefits are cohesion with the library, and usage by the library. But there are many ways to skin the JSON cat. Especially when there's little reason for JSON to be included in Phobos other than a place for it to be.

Oh, and I agree that reliability of code.dlang.org (or having dub more immune to the web site going down) should be a high priority.

-Steve
Jan 06 2020
Adam D. Ruppe <destructionator gmail.com> writes:
On Tuesday, 7 January 2020 at 01:45:21 UTC, Steven Schveighoffer 
wrote:
 H.S. Teoh also talks about arsd. This is also available on 
 code.dlang.org (though not exactly built for it), but the 
 mechanism of copying license compatible code into your project 
 so you can maintain it doesn't depend on it being outside of 
 code.dlang.org. You can do that with anything! D modules are 
 pretty movable.
In theory yes, but in practice it can be very difficult because of the size of the dependency graph. You import foo which needs bar which needs baz which needs joe and sally which need fred, tom, dick, and harry... it is very, very easy to fall down that rabbit hole and package managers make it look enticing. Heck, even I am very, very tempted to introduce a few base modules to my libs, especially now that I use dmd -i more so it would be almost transparent. And I probably would have already if I didn't type out my modules each time for so long. So my policy isn't to package dependencies, it is to *eliminate* them. So the individual files mostly stand alone. You can import one without needing the others, not even from the same repo.
Jan 06 2020
Laeeth Isharc <Laeeth laeeth.com> writes:
On Tuesday, 7 January 2020 at 01:45:21 UTC, Steven Schveighoffer 
wrote:
 On 1/6/20 6:38 PM, Laeeth Isharc wrote:
 On Monday, 6 January 2020 at 21:05:37 UTC, JN wrote:
 On Monday, 6 January 2020 at 19:40:10 UTC, berni44 wrote:
 b) Should we instead remove some of these? Probably std.zip 
 is here the first candidate. (I put some work in it in the 
 last few weeks, but it would be fine for me throwing this 
 away.)

 It's on a gut level - just wondering, what you think about 
 this.

 [1] https://code.dlang.org/packages/std-experimental-xml
 IMO all except base64 could be removed. Putting everything into the standard library made a lot of sense in the times before we got a package manager. Nowadays it might be better to simplify the standard library and just have XML, JSON, ZIP, CSV as "blessed" packages.
 Dependencies will be our doom. Whereas if you use something in Phobos then you have some confidence it will still build across platforms in a couple of years. It also seems to make sense to work on making code.dlang.org ultra-reliable before making it necessary to use libraries before doing even basic things.
 Except you do need to depend on it to do basic things. Have you tried using std.xml?
Yes, and I didn't persist for long. I wonder if sometimes there can be a tendency to let the best be the enemy of the considerably better. I think you made an argument for replacing std.xml rather than XML not being in the standard library. Though maybe that's a better argument than for getting rid of JSON and CSV.
 Because something lives on code.dlang.org does not mean it is 
 unreliable. The reliability depends on the developer, not the 
 platform.
Yes, I agree, but people are wired in different ways and learned to program in different eras, and the cognitive cost of figuring out whether a code.dlang.org library is any good is much greater for some people than others. You don't even know if the project will build with the current release of DMD. That mostly doesn't bother me personally, but it definitely does bother others.
 "The reliability depends on the developer, not the
 platform."
I don't agree with this. Phobos will probably work on BSD or SmartOS or whatever. And quite well-written libraries may require tweaking to build even on 32-bit Windows. And when you have a reasonable number of dependencies there is always something breaking with new releases, so the cost may not be trivial for larger projects.
 I imagine vibe.d is going to be super-reliable for years, and 
 it lives on code.dlang.org.
Maybe. I am very grateful to Sonke for his contribution but vibe is one of the projects that breaks frequently and you can get into a mess where you need newer versions of the compiler for other reasons and vibe doesn't compile. Plus dub often can bring in dependencies spuriously, though that's a different problem.
 H.S. Teoh also talks about arsd. This is also available on 
 code.dlang.org (though not exactly built for it), but the 
 mechanism of copying license compatible code into your project 
 so you can maintain it doesn't depend on it being outside of 
 code.dlang.org. You can do that with anything! D modules are 
 pretty movable.
I like arsd and think we should use it more internally. It's pretty different though from having a module in Phobos, particularly if you want to get people who don't know D to use it, all the more so if they don't consider themselves programmers by trade. It can be too much to absorb in one go.
 There's something to be said about having things outside the 
 standard library not for the reasons of stability but for 
 maneuverability. One can easily add/improve/change projects 
 that are outside the standard library, but poor decisions in 
 the standard library are sometimes stuck there (see std.xml).
Yes I agree but isn't that an argument for adopting a better one as part of the standard library? Undead could also be more widely publicised and dub could be educated to ask if you want to fix old code by updating references.
 For file formats, or parsers, or really any kind of system that 
 has innumerable ways of solving the problem, I really think the 
 standard library is not where they should be. The benefits are 
 cohesion with the library, and usage by the library. But there 
 are many ways to skin the JSON cat. Especially when there's 
 little reason for JSON to be included in Phobos other than a 
 place for it to be.
A sensible default doesn't stop you from using something suited to your particular needs. It also saves time and cognitive effort to know what to expect when reading code.
 Oh, and I agree that reliability of code.dlang.org (or having 
 dub more immune to the web site going down) should be a high 
 priority.
I wonder what we could do about that. I could offer support if resources would help.
Jan 06 2020
Steven Schveighoffer <schveiguy gmail.com> writes:
On 1/6/20 10:48 PM, Laeeth Isharc wrote:
 On Tuesday, 7 January 2020 at 01:45:21 UTC, Steven Schveighoffer wrote:
 Except you do need to depend on it to do basic things. Have you tried 
 using std.xml?
 Yes, and I didn't persist for long. I wonder if sometimes there can be a tendency to let the best be the enemy of the considerably better. I think you made an argument for replacing std.xml rather than XML not being in the standard library. Though maybe that's a better argument than for getting rid of JSON and CSV.
Yes, for sure you can have a "basic" implementation for something, and then go elsewhere for more fancy implementations. Unit testing is a good example of this: the default works, and runs unit tests, but if you want fancy outputs, you go with one of the 3rd-party tools. The nice thing is that the runtime library allows you to swap out its dull minimal implementation with your fancy implementation.

xml and other file formats are not like that. First, there isn't a requirement in the library to have xml anywhere -- it's its own project, and simply lives in Phobos; nothing else in Phobos depends on it. Because it is of poor quality, we should replace or remove it.

Of course, there are many ways to do xml, and it's hard to agree on the "right" way, which is why it sits there basically untouched for over a decade. It's also hard to make a similar mechanism like the previously mentioned unittests where you can "swap out" the implementation with what you really want. It fits better as an add-on library. It doesn't have to be part of Phobos to be maintained by a quality team.

It's just kind of tacked on, an xml package, and we were doing what they did. Note that the bare minimum has been done to it, just to keep it building. I don't think that's desirable, and it's not a good look for the language.

Not only that, but there is a cost to having a poor library being the "default" one. People do not go looking right away for something else, so they waste time on it, and finally go elsewhere anyway (perhaps away from D, not even knowing there is something better for it). I'd say we are better off NOT having it in there.
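
As a rough illustration of the unit-test swap mentioned above, here is a minimal sketch that replaces druntime's default runner via `Runtime.extendedModuleUnitTester`. The function name `verboseRunner` and its output format are invented for this example; real third-party runners do considerably more. Build with `dmd -unittest` for the tests to be compiled in:

```d
import core.runtime : Runtime, UnitTestResult;
import std.stdio : writefln;

/// Hypothetical replacement runner: runs every module's unit tests
/// and prints one line per module instead of staying silent.
UnitTestResult verboseRunner()
{
    UnitTestResult result;
    foreach (m; ModuleInfo)
    {
        if (m is null)
            continue;
        auto test = m.unitTest;   // null if the module has no unittests
        if (test is null)
            continue;

        ++result.executed;
        try
        {
            test();
            ++result.passed;
            writefln("ok   %s", m.name);
        }
        catch (Throwable t)
        {
            writefln("FAIL %s: %s", m.name, t.msg);
        }
    }
    // Only continue into main() if every module passed.
    result.runMain = result.executed == result.passed;
    result.summarize = true;
    return result;
}

// Install the runner before druntime executes the unit tests.
shared static this()
{
    Runtime.extendedModuleUnitTester = &verboseRunner;
}

unittest
{
    assert(1 + 1 == 2);
}

void main()
{
}
```

Nothing comparable exists for std.xml: there is no hook through which a different XML implementation could be substituted behind the same import.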
 Because something lives on code.dlang.org does not mean it is 
 unreliable. The reliability depends on the developer, not the platform.
 Yes, I agree, but people are wired different ways and learnt to program in different eras and the cognitive cost of figuring out if a code.dlang.org library is any good is much greater for some people than others. You don't even know if the project will build with the current release of DMD. That mostly doesn't bother me personally, but it definitely does bother others.
I definitely think we should have a repository system that separates "blessed" and maintained projects from all the ones that are added by random people. Maybe like a certified project list, and in order to be on it, the project has to be maintained by the core team (and tested along with the core CI), or there has to be a promise to fix issues within a certain time period or whatnot. Look at boost for example -- not part of the standard library, but might as well be. It lives in a space where innovation needs to move faster than standards committees. D isn't in the same boat exactly, but we have much less leeway to make significant changes to Phobos packages than we would to make changes on external ones. It's also difficult to say "this is the one way you should use xml", when nobody is passionate enough about it.
 
 "The reliability depends on the developer, not the
 platform."
 I don't agree with this. Phobos will probably work on BSD or SmartOS or whatever. And quite well-written libraries may require tweaking to build even on 32-bit Windows.
This is due to the extensive CI system we have on there. This doesn't stop other projects from doing the same. What I meant was that code outside of Phobos can be just as reliable.
 And when you have a reasonable number of dependencies there is always 
 something breaking with new releases so the cost may not be trivial for 
 larger projects.
If we have the dependencies all covered, then there is not an issue. In other words, we could maintain a list of projects where all the dependencies are up to date.
 
 I imagine vibe.d is going to be super-reliable for years, and it lives 
 on code.dlang.org.
 Maybe. I am very grateful to Sonke for his contribution but vibe is one of the projects that breaks frequently and you can get into a mess where you need newer versions of the compiler for other reasons and vibe doesn't compile. Plus dub often can bring in dependencies spuriously, though that's a different problem.
I haven't had this experience. When I've updated for my project, the vibe framework seems to be very good about warning me with deprecations rather than breaking.
 There's something to be said about having things outside the standard 
 library not for the reasons of stability but for maneuverability. One 
 can easily add/improve/change projects that are outside the standard 
 library, but poor decisions in the standard library are sometimes 
 stuck there (see std.xml).
 Yes I agree but isn't that an argument for adopting a better one as part of the standard library?
But which one? Nobody can agree on what to put in there, so it stays. XML is not easy. JSON is easy (I wrote jsoniopipe in 1 week I think), and it is still in there despite other better systems existing.
 Undead could also be more widely publicised and dub could be educated to 
 ask if you want to fix old code by updating references.
You mean move it to undead? I'm fine with that, and I'm also fine with dub being proactive about projects that depend on it. But that doesn't mean we have to replace it.
 For file formats, or parsers, or really any kind of system that has 
 innumerable ways of solving the problem, I really think the standard 
 library is not where they should be. The benefits are cohesion with 
 the library, and usage by the library. But there are many ways to skin 
 the JSON cat. Especially when there's little reason for JSON to be 
 included in Phobos other than a place for it to be.
 A sensible default doesn't stop you from using something suited to your particular needs. It also saves time and cognitive effort to know what to expect when reading code.
The standard library should contain a minimal or best implementation of things that are core to the language. I don't think it needs a default implementation of everything. At one point we were looking at putting all sorts of things in there (std.database, std.io, std.kitchensink). I just think the advent of a build system that can pull in dependencies mitigates a lot of that need. It brings with it other problems, but if we can mow that grass down so you aren't wading through a forest, maybe it's not so bad.

-Steve
Jan 06 2020
berni44 <dlang d-ecke.de> writes:
On Tuesday, 7 January 2020 at 05:55:24 UTC, Steven Schveighoffer 
wrote:
 I definitely think we should have a repository system that 
 separates "blessed" and maintained projects from all the ones 
 that are added by random people.
The whole thing feels a little bit like an onion: the core is the core language, then the runtime, then Phobos; then there is a skin missing (the one you describe), then code.dlang.org, and maybe next the whole internet. The closer to the core, the more reliable, but less flexible. So in Phobos there should only be essential stuff, like "writeln".
Jan 07 2020
Walter Bright <newshound2 digitalmars.com> writes:
I'd be good with thinking about moving std.xml to dub. But not to unDead, as 
there is no replacement.
Jan 07 2020
Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, January 7, 2020 7:20:05 PM MST Walter Bright via Digitalmars-d 
wrote:
 I'd be good with thinking about moving std.xml to dub. But not to unDead,
 as there is no replacement.
Why? I thought that the whole point of undead was that stuff from Phobos that we wanted to get rid of, but still wanted to be available for older code, would go there. I don't see how whether there is a replacement in Phobos is relevant. e.g. std.stream and friends were put into undead, and we don't really have a replacement for those. You can solve the problems that those modules try to solve in a different way using what's in Phobos, but we don't really have a stream solution in Phobos right now. What would be better about putting std.xml in its own project instead of undead?

And honestly, from what I've seen of std.xml, I think that anyone using it should just plain seriously reconsider. I don't know how buggy it actually is, but it clearly was doing some stuff wrong such that it wouldn't be able to properly handle at least some XML files (though at the moment, I don't recall exactly what I found in there). And we do have multiple xml solutions available on code.dlang.org now. So, while we may not have a "blessed," default solution for people to switch to, there are better options available than std.xml.

Also, I think that putting std.xml in its own project makes it too likely that someone else would use it in a new project, whereas if it's in undead, it's unlikely to be used by new projects, just older projects that were already using it. We've considered std.xml to be a poor solution for years now (we've even had a warning in its documentation that it was going to go away because it isn't up to our current standards), and I don't think that we should be encouraging its use. Putting it in undead makes it available without presenting it as a solution that people should be using.

It's also where we've already been putting all of the modules that we decided to get rid of from Phobos, and we haven't really used the criterion of needing a replacement previously as far as whether it goes in undead or elsewhere. That's just been part of the question of whether we would remove it, and in some cases, we have removed stuff without really having a replacement in Phobos (most notably std.stream and friends).

- Jonathan M Davis
Jan 07 2020
Walter Bright <newshound2 digitalmars.com> writes:
On 1/7/2020 10:51 PM, Jonathan M Davis wrote:
 [...]
You raise some good points. I am not an XML user, so I don't have an informed opinion about this.
Jan 08 2020
berni44 <dlang d-ecke.de> writes:
On Wednesday, 8 January 2020 at 08:26:33 UTC, Walter Bright wrote:
 On 1/7/2020 10:51 PM, Jonathan M Davis wrote:
 [...]
 You raise some good points. I am not an XML user, so I don't have an informed opinion about this.
Filed a PR to add std.xml to undeaD: https://github.com/dlang/undeaD/pull/37 OK?
Jan 10 2020
rikki cattermole <rikki cattermole.co.nz> writes:
On 11/01/2020 3:11 AM, berni44 wrote:
 On Wednesday, 8 January 2020 at 08:26:33 UTC, Walter Bright wrote:
 On 1/7/2020 10:51 PM, Jonathan M Davis wrote:
 [...]
 You raise some good points. I am not an XML user, so I don't have an informed opinion about this.
 Filed a PR to add std.xml to undeaD: https://github.com/dlang/undeaD/pull/37 OK?
This should have a vote and then yes we can deprecate it.
Jan 10 2020
berni44 <dlang d-ecke.de> writes:
On Friday, 10 January 2020 at 14:21:44 UTC, rikki cattermole 
wrote:
 This should have a vote and then yes we can deprecate it.
How to start a vote?
Jan 10 2020
rikki cattermole <rikki cattermole.co.nz> writes:
On 11/01/2020 3:43 AM, berni44 wrote:
 On Friday, 10 January 2020 at 14:21:44 UTC, rikki cattermole wrote:
 This should have a vote and then yes we can deprecate it.
 How to start a vote?
Start a new thread; it's not an official process. It gives a chance for people to give a compelling reason not to split it out and gets people on board.
Jan 10 2020
prev sibling parent reply Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Monday, January 6, 2020 12:40:10 PM MST berni44 via Digitalmars-d wrote:
 In Phobos there are several toplevel modules which each are about
 one special file format. Wouldn't it be better to put them all
 into std.fileformats (or some similar place)?

 std.base64
 std.csv
 std.json
 std.xml
 std.zip

 If yes, there are some more questions popping up:

 a) Should we use the change to replace some of the modules? At
 least for std.xml I found std-experimental-xml [1], which seems
 to be thought to replace xml eventually. And there are lots of
 packages which address json. I don't know if there is one, which
 is *the* candidate for replacement.

 b) Should we instead remove some of these? Probably std.zip is
 here the first candidate. (I put some work in it in the last few
 weeks, but it would be fine for me throwing this away.)

 It's on a gut level - just wondering, what you think about this.

 [1] https://code.dlang.org/packages/std-experimental-xml
Are you talking about putting them all in a single module or all in a single package? Putting them in a single module would be the opposite of the direction that we've been going with Phobos over the past few years. We've generally been breaking up larger modules, not merging modules. So, if we do rearrange these modules, it would definitely not be by putting them all in one module.

However, even if we moved these modules into a sub-package, moving them around at this point would arguably just be unnecessary churn. It would break existing code for minimal benefit. If we replace any of them with new implementations (e.g. there's been talk of replacing std.xml and std.json for years now), then maybe they should go in a deeper package hierarchy, but I really don't think that it makes much sense to simply rearrange modules. It's also the sort of thing that Walter tends to be against.

Now, personally, I don't think that anything regarding file formats should have been in the standard library in the first place. IMHO, that's not the sort of thing that belongs in a standard library, and they really should have been on code.dlang.org. However, when they were originally written, we didn't have code.dlang.org, so they ended up in Phobos. Either way, based on previous discussions on adding stuff to Phobos, I think that it's pretty clear that any new, major additions would have to be on code.dlang.org and battle-tested there before being moved into Phobos, and I don't see any of these modules being replaced any time soon even if we want to replace them.

std.experimental.xml was a GSoC project that was not completed and is basically dead. It seems like the original author got too busy with school and never got back to it. For it to get anywhere, someone else would have to finish it. It was started with the intention of replacing std.xml, but like any other major additions, it would still have to go through the Phobos review process and be voted in.

http://code.dlang.org/packages/dxml might end up in Phobos at some point, but that's not a fight that I want to fight right now, and the only reason that I think that it would make sense to put any XML parser/writer in Phobos at this point is because we already have std.xml, and it really needs to either be replaced or removed.

http://code.dlang.org/packages/std_data_json was a candidate for replacing std.json, but IIRC, there was basically too much arguing over what the new std.json should look like, so Sönke gave up on getting it into Phobos. No one has tried to get a new JSON solution through the Phobos review process since then, and I think that for the most part, people have been happy to just put their stuff on code.dlang.org rather than trying to push anything through the Phobos review process.

For the most part, I don't see any point in removing any of these modules, since that would break existing code, and while I don't think that Phobos should have been implementing parsers for standard file formats, that doesn't necessarily mean that we should be breaking existing code to remove them. The primary exception would probably be std.xml, since it arguably does more harm than good, but there's never really been a consensus on ripping it out without a replacement being added to Phobos.

BTW, base64 isn't really a file format. It's an encoding. So, even if we were going to move all of these into a sub-package, std.base64 wouldn't belong with them.

- Jonathan M Davis
Jan 06 2020
parent reply berni44 <dlang d-ecke.de> writes:
On Tuesday, 7 January 2020 at 01:10:08 UTC, Jonathan M Davis 
wrote:
 Are you talking about putting them all in a single module or 
 all in a single package?
Sorry for being unclear here. I was thinking of a package.
 Now, personally, I don't think that anything regarding file 
 formats should have been in the standard library in the first 
 place.
Thinking about all of this, I noticed that there are two different points of view which should be kept separate: the idealist view and the pragmatic view. IMHO both are important. So if I understand you correctly, from an idealist's view you'd say these file formats should be removed from Phobos, but from a pragmatist's view this looks much more difficult. I think I share this point of view. But I'd like to get rid of them anyway.
 For the most part, I don't see any point in removing any of 
 these modules, since that would break existing code,
Well, every module that is kept inside Phobos produces (lots of) maintenance work. From my perspective, we are missing resources here. So I prefer a controlled breaking of code (with deprecation and all) instead of having the code rusting and breaking uncontrolled sooner or later.

I came up with this issue when I looked at my own comment on issue 17709 [1]: I found the reason for this issue, and I think I could fix it in a reasonable amount of time. But is it worth doing so if that module might be removed or replaced in the near future? Wouldn't it be much better to use that time to fix a bug in a more important place? But on the other hand: what does such a comment look like to someone who is using std.xml and found that issue because he stumbled over the same problem? Wouldn't it be better to remove std.xml completely in the first place?

[1] https://issues.dlang.org/show_bug.cgi?id=17709

And then there is something more to think about: the (public) perception of Phobos. If it contains parts that are more or less broken, useless or outdated, these will be the parts people look at when they judge. It won't help much to have other parts that work well.
 BTW, base64 isn't really a file format. It's an encoding.
Really? So why isn't it in std.encoding? :-) I know that base64 is somewhat different; maybe it's in a gray area... Or look at it the other way round: Isn't zipping also just encoding and unzipping decoding?
Jan 07 2020
next sibling parent rikki cattermole <rikki cattermole.co.nz> writes:
On 08/01/2020 1:44 AM, berni44 wrote:
 BTW, base64 isn't really a file format. It's an encoding.
 Really? So why isn't it in std.encoding? :-) I know that base64 is somewhat different; maybe it's in a gray area... Or look at it the other way round: Isn't zipping also just encoding and unzipping decoding?
Perhaps, just perhaps, std.encoding should be a package too ;)

A zip file is a container; an encoding is not a container, it is a representation (just with different assumptions, i.e. the base).
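
To make the container-versus-representation distinction concrete, here is a minimal sketch. It uses Python's standard library purely for illustration (an assumption of this example, not Phobos code; `std.base64` and `std.zip` play the corresponding roles in D). The sample payload and member names are made up.

```python
import base64
import io
import zipfile

payload = b"\x00\xffD lang"  # arbitrary sample bytes (hypothetical data)

# An encoding: the same bytes in a different representation.
# Nothing besides the data itself is stored.
encoded = base64.b64encode(payload)
decoded = base64.b64decode(encoded)

# A container: several named members, each with its own metadata,
# written into an in-memory archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("a.txt", "first member")
    archive.writestr("data/b.bin", payload)

# Reading the archive back exposes a directory of members,
# something a plain encoding has no notion of.
with zipfile.ZipFile(buffer) as archive:
    member_names = archive.namelist()
    restored = archive.read("data/b.bin")
```

The base64 round trip only ever deals with one stream of bytes, while the zip archive has to track names, order, and a compression method per member, which is roughly the extra work the term "container" refers to.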
Jan 07 2020
prev sibling parent Jonathan M Davis <newsgroup.d jmdavisprog.com> writes:
On Tuesday, January 7, 2020 5:44:40 AM MST berni44 via Digitalmars-d wrote:
 On Tuesday, 7 January 2020 at 01:10:08 UTC, Jonathan M Davis wrote:
 Now, personally, I don't think that anything regarding file
 formats should have been in the standard library in the first
 place.
 Thinking about all of this, I noticed that there are two different points of view which should be kept separate: the idealist view and the pragmatic view. IMHO both are important. So if I understand you correctly, from an idealist's view you'd say these file formats should be removed from Phobos, but from a pragmatist's view this looks much more difficult. I think I share this point of view. But I'd like to get rid of them anyway.
std.xml is probably the only one from your list that I'd argue should be seriously considered for being removed from Phobos and moved into undead sooner rather than later. I don't know quite what state std.json is in, so maybe it should have the same done to it, though it's had some work done on it recently to try to improve it. I see no reason to rip out stuff like std.csv or std.zip at this point though. They work and are useful. They also aren't fundamentally broken in the way that std.xml is AFAIK.
 For the most part, I don't see any point in removing any of
 these modules, since that would break existing code,
 Well, every module that is kept inside Phobos produces (lots of) maintenance work. From my perspective, we are missing resources here. So I prefer a controlled breaking of code (with deprecation and all) instead of having the code rusting and breaking uncontrolled sooner or later. I came up with this issue when I looked at my own comment on issue 17709 [1]: I found the reason for this issue, and I think I could fix it in a reasonable amount of time. But is it worth doing so if that module might be removed or replaced in the near future? Wouldn't it be much better to use that time to fix a bug in a more important place? But on the other hand: what does such a comment look like to someone who is using std.xml and found that issue because he stumbled over the same problem? Wouldn't it be better to remove std.xml completely in the first place?
 [1] https://issues.dlang.org/show_bug.cgi?id=17709
std.xml is broken. We've agreed for years now that it should go. It's just that there has been no agreement on removing it without having a replacement, which is why it's still there. I don't think that it's worth your time to work on it. However, out of the list of modules that you provided, std.json is the only other one where I recall any real discussion about replacing or removing. Certainly, spending time fixing a bug in something like std.base64 or std.zip is not a waste of time.
 BTW, base64 isn't really a file format. It's an encoding.
 Really? So why isn't it in std.encoding? :-) I know that base64 is somewhat different; maybe it's in a gray area... Or look at it the other way round: Isn't zipping also just encoding and unzipping decoding?
Encoding involves taking information and converting it into another format which contains the same information in a different manner. File formats may use encodings for some of the information that they contain, but an encoding has to do with how information is encoded. e.g. Unicode code points are encoded with UTF-8, UTF-16, or UTF-32. All three encodings contain exactly the same information, but the way that that information is encoded differs. And while UTF-8 may be used inside a file, it is in no way tied to files. It's just a way that Unicode character information is encoded.

base64 is a binary encoding, whereas std.encoding deals with character encodings. zip is a file / container format. It uses different compression algorithms to encode binary information internally, but zip itself is a container format, not an encoding. It does far more than encode a string of data in a different way like an encoding does.

std.encoding is also a bit of an oddball. It's an older module that probably needs to be revamped / redesigned. It has some level of support for various character encodings - including UTF-8, UTF-16, and UTF-32 - but we have std.utf for UTF handling, and std.utf is what gets used by Phobos for handling UTF encodings. std.encoding is class-based and has some range support, but it isn't really range-based aside from improvements that have been made to it over time. It does get some occasional tweaks, but largely, it's an older module with an older design that doesn't necessarily fit all that well into the rest of Phobos. I don't know what should be done with it though. Some of what it does is stuff that we really should have, but it probably needs to be redesigned. However, somebody would have to step up to do that.

- Jonathan M Davis
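
The point that UTF-8, UTF-16, and UTF-32 carry exactly the same information can be checked with a small sketch. Python is used here only as a neutral illustration (an assumption of this example; in D, `std.utf` provides the corresponding conversions), and the sample string is arbitrary.

```python
# Sample text containing ASCII and non-ASCII code points ("naïve €").
text = "na\u00efve \u20ac"
code_points = [ord(ch) for ch in text]

# Encode the same code points three ways (little-endian, no byte order mark).
encoded = {name: text.encode(name) for name in ("utf-8", "utf-16-le", "utf-32-le")}

# Decoding each byte sequence gives back the identical text...
decoded = {name: data.decode(name) for name, data in encoded.items()}

# ...even though the byte sequences themselves differ in length.
byte_lengths = {name: len(data) for name, data in encoded.items()}
```

Each decoded value equals the original text, so no information is gained or lost by switching encodings; only the number and layout of the bytes change.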
Jan 07 2020