digitalmars.D - std.stdio overhaul by Steve Schveighoffer

reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Hello,


There are a number of issues related to D's current handling of streams, 
including the existence of the imperfect etc.stream and the 
over-specialization of std.stdio.

Steve has worked on an extensive overhaul of std.stdio which would 
obviate the need for etc.stream and would improve both the generality 
and efficiency of std.stdio.

Please chime in with feedback; he's away from the Usenet but allowed me 
to post this on his behalf. I uploaded the docs to

http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


Thanks,

Andrei
Sep 03 2011
next sibling parent Jose Armando Garcia <jsancio gmail.com> writes:
On Sat, Sep 3, 2011 at 12:54 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would obviate
 the need for etc.stream and would improve both the generality and efficiency
 of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me to
 post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Interesting. How does this work with RAII? Where is the source code?
Sep 03 2011
prev sibling next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Hello,
 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.
 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.
 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html
 Thanks,
 Andrei

After a quick look, I have two concerns:

1. File is a class, not a struct. This precludes using reference counting as the current std.stdio.File does, meaning you have to close all your Files manually. I loved the reference counting semantics, especially the last few releases since most of the relevant compiler bugs have been fixed.

2. File(someFileName, someMode) needs to work. Not supporting this method of instantiating a File object would break way too much code.
Sep 03 2011
next sibling parent reply David Nadlinger <see klickverbot.at> writes:
On 9/3/11 11:20 PM, dsimcha wrote:
 2.  File(someFileName, someMode) needs to work.  Not supporting this method of
 instantiating a File object would break way too much code.

This one could easily be solved by aliasing File.open to (static) opCall(). David
Sep 03 2011
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from David Nadlinger (see klickverbot.at)'s article
 On 9/3/11 11:20 PM, dsimcha wrote:
 2.  File(someFileName, someMode) needs to work.  Not supporting this method of
 instantiating a File object would break way too much code.

 This one could easily be solved by aliasing File.open to (static) opCall().
 David

Agreed, but in the big picture this overhaul still breaks way too much code without either a clear migration path or a clear argument about why such extensive breakage is necessary. The part about File(someFileName, someMode) is just the first thing I noticed.
Sep 03 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/3/2011 3:53 PM, dsimcha wrote:
 Agreed, but in the big picture this overhaul still breaks way too much code
 without either a clear migration path or a clear argument about why such
 extensive breakage is necessary.  The part about File(someFileName, someMode) is just the
 first thing I noticed.

[rant]

I agree. I agree that std.stream should be replaced, but I have a lot of misgivings about replacing std.stdio. I do not want to rewrite every darn D program I've ever written. I think it is a bad idea to break everyone else's D program. Everything in dsource will break in non-trivial ways. I don't think we can afford this.

I do not know of any successful system or language that breaks user code with such aplomb as D does. Not even C++ dares to break that Piece Of S*** that everyone knows iostreams is. I can compile and run unix C code from 30 years ago on Linux with no changes at all. Same with DOS code. There needs to be huge improvement to justify such breakage.

[I also don't like it that all my code that uses std.path is now broken.]

I would prefer to see all the energy that is going into refactoring existing, working modules go into designing new, not existing, modules that there's a crying need for.

[/rant]

Enough ranting for now, as for the proposed std.stdio,

1. It does look fairly straightforward, but:

2. There is only one example. Have any commonly done programming tasks been tried out with it to see how they work?

3. There is no indication of how it interacts with C stdio. A primary goal of std.stdio was interoperability with C stdio.

4. There are no benchmarks. The current std.stdio was designed/written in parallel with some benchmarks Andrei and others cooked up, as a primary goal was performance.

5. flushCheck - flushing should be done based on the file type. tty's should be \n flushed, files when the buffer is full. I question the performance of using a delegate to check for flushing. How often will it be called?

6. There is no provision for multithreaded writing, i.e. what happens when two threads write to stdout. Ideally, there should be a way to 'lock' the stream to oneself, in order to appropriately interleave the output.

7. I see nothing for 'raw' character by character input.

8. I see nothing for determining if a char is available on the input. How would one implement "press any key to continue"?
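For point 6, here is a minimal C sketch of what "locking the stream to oneself" looks like with C stdio on a POSIX system. `flockfile` and `funlockfile` are real POSIX calls; `write_fragments_locked` is a hypothetical helper name used only for illustration, and nothing here is taken from the proposed std.stdio.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: emit several fragments as one uninterrupted unit.
   flockfile/funlockfile (POSIX) acquire and release the FILE's own lock,
   so fragments from other threads cannot land between ours. */
static int write_fragments_locked(FILE *out, const char *const *parts, size_t count)
{
    assert(out != NULL && parts != NULL);
    int status = 0;

    flockfile(out);                      /* other writers to 'out' now wait */
    for (size_t i = 0; i < count; ++i) {
        if (fputs(parts[i], out) == EOF) {
            status = -1;                 /* report the first write failure */
            break;
        }
    }
    funlockfile(out);                    /* let other writers proceed */
    return status;
}
```

A thread would call it as `write_fragments_locked(stdout, parts, n)`; each individual `fputs` is already atomic with respect to other stdio calls, and the explicit lock extends that guarantee to the whole group.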
Sep 03 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/3/2011 7:33 PM, Steven Schveighoffer wrote:
 Please, leave all pitchforks and torches at rest for the moment :)

I know what I wrote was a bit brutal, but this needs to be settled before we've gone so far down that path that turning away then would be horribly unfair to you. I think what you need is a marketing spiel to sell the concept of what you're trying to do. It should include:

1. The benefits over the current std.stdio
2. Why the new API is needed to achieve those benefits
3. A migration plan for existing std.stdio code

Just being more flexible isn't enough, it has to be more flexible in a way that matters, i.e. a real example showing how kickass it is compared to the current way.
 2. the performance. It's much better than current stdio. Aren't people
 continuously complaining at how slow i/o is in Phobos compared to other
 libraries?

Why is it faster? I.e. is a wholly new interface required to make it faster, or does it just need to be better under the hood?
 3. There is no indication of how it interacts with C stdio. A primary goal of
 std.stdio was interoperability with C stdio.

useCStdio();

For some reason that just seems like a giant wart with a hair sticking out of it. Why not just use the C stdio buffers?
 5. flushCheck - flushing should be done based on the file type. tty's should
 be \n flushed, files when the buffer is full. I question the performance of
 using a delegate to check for flushing. How often will it be called?

Once per write to the buffer. Data is only checked once (the delegate is never given the same data to check again). If you want, I can look at adding a means to avoid using a delegate when the trigger is a single character. And TextInput/TextOutput auto detect whether a device is a tty, and install the right flushcheck function if necessary.

Flushing once per write is wrong - consider the user who does a zillion putc's. I don't see a purpose to anything beyond the C stdio ones - per character, per \n, and per buffer.
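The three C stdio policies named here correspond to the `setvbuf` modes `_IONBF` (per character), `_IOLBF` (per \n) and `_IOFBF` (per buffer). A minimal sketch of choosing between the last two by file type, assuming a POSIX system for `isatty` and `fileno`; `apply_default_buffering` is a hypothetical name:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: line-buffer terminals, fully buffer everything else.
   Must be called before any other I/O on the stream (a setvbuf requirement).
   Returns the mode that was applied, or -1 if setvbuf failed. */
static int apply_default_buffering(FILE *stream)
{
    assert(stream != NULL);
    int mode = isatty(fileno(stream)) ? _IOLBF : _IOFBF;
    if (setvbuf(stream, NULL, mode, BUFSIZ) != 0)
        return -1;
    return mode;
}
```

The decision is made once, when the stream is set up, so nothing needs to be consulted on each individual write.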
 7. I see nothing for 'raw' character by character input.

The interface is geared to read by processing the buffer, not one character at a time. Given access to the buffer, you can process one character at a time if you want. See InputRange in TextInput to see how raw character-by-character input can be done.

Raw mode is more than that - you have to set the OS to raw mode, otherwise it won't give you any characters until a \n is typed.
 8. I see nothing for determining if a char is available on the input. How
 would one implement "press any key to continue"?

I need more information. I would probably implement this as a read(ubyte[1]), so I don't see why it can't be that way.

There's more to it than that. Try writing it in C and you'll see what I mean. (You have to set the io to "raw" mode, turn "echo" off, etc.)
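For reference, a minimal sketch of what that involves in C on a POSIX terminal, using the termios interface. `single_key_settings` and `wait_for_key` are hypothetical names; Windows would need a different mechanism entirely, which this sketch does not cover.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <termios.h>
#include <unistd.h>

/* Derive "deliver each key immediately, do not echo it" settings
   from the terminal's current settings. */
static struct termios single_key_settings(struct termios saved)
{
    struct termios raw = saved;
    raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO); /* no line assembly, no echo */
    raw.c_cc[VMIN] = 1;                        /* read() returns after 1 byte */
    raw.c_cc[VTIME] = 0;                       /* no inter-byte timeout */
    return raw;
}

/* Hypothetical helper: block until one key is pressed on fd.
   Returns the byte read, or -1 on error (including "fd is not a terminal"). */
static int wait_for_key(int fd)
{
    assert(fd >= 0);
    struct termios saved;
    if (tcgetattr(fd, &saved) != 0)
        return -1;

    struct termios raw = single_key_settings(saved);
    if (tcsetattr(fd, TCSANOW, &raw) != 0)
        return -1;

    unsigned char key = 0;
    ssize_t got = read(fd, &key, 1);
    tcsetattr(fd, TCSANOW, &saved);            /* always restore the terminal */
    return got == 1 ? (int)key : -1;
}
```

"Press any key to continue" then becomes `wait_for_key(STDIN_FILENO)` after printing the prompt. Note that the read bypasses the stdio buffer, which is part of why this does not fit neatly into a buffered stream API.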
Sep 03 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/3/2011 10:09 PM, Steven Schveighoffer wrote:
 I appreciate feedback, but I think there was a misunderstanding of what this
 "review" was for. I think people thought I was proposing this as a
 ready-to-pull replacement for std.stdio. That is not the case. It's very much
 up in the air
 and under development. I just wanted to show people some progress and get
 feedback (which I've gotten a lot of!)

I'm glad it's early in the process.
 For some reason that just seems like a giant wart with a hair sticking out of
 it. Why not just use the C stdio buffers?

1. Because most people don't care. I never ever use printf, except when I was testing my new stdio stuff, and I needed something that worked :) My opinion, if you are using this line, you are doing something weird, legacy related, or you are debugging something.

I still use printf a lot. One reason is because it is lightweight - using writeln blows up the size of your .obj file, making it hard to track down a back end bug. This is a long standing gripe I have with writeln. D is supposed to work well with existing C code. To me, that includes working smoothly with C stdio.
 If you read my response to the first post in this thread, you can see my
 rationale.

I understand the desire to do away with C stdio compatibility, but it needs to deliver a *lot* to justify that. I also don't mind if std.stdio needs to peek under the hood of C stdio to get there - yes, it'll be custom for each C library, but the user won't see that.
 a *check* to see if it should be flushed is done once per write. Not a flush. A
 flush is only done if the check says to (or the buffer is full). I think C's
 FILE * checks once per write as well, no?

No. It checks once per char for \n, and once per buffer overflow otherwise.
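A toy model of that behaviour, as a sketch only: the struct and function names are hypothetical, the "device" is just a counter, and real C libraries differ in detail. It shows the newline test living inside the per-character path, with the buffer-full test as the only other trigger.

```c
#include <assert.h>
#include <stddef.h>

enum flush_policy { FLUSH_PER_CHAR, FLUSH_PER_LINE, FLUSH_WHEN_FULL };

/* Toy stream: counts flushes instead of writing to a real device. */
struct mini_stream {
    char buf[64];
    size_t used;
    enum flush_policy policy;
    size_t flush_count;
};

static void mini_flush(struct mini_stream *s)
{
    if (s->used == 0)
        return;                       /* nothing buffered, nothing to do */
    /* a real stream would write(fd, s->buf, s->used) here */
    s->used = 0;
    s->flush_count++;
}

static void mini_putc(struct mini_stream *s, char c)
{
    assert(s != NULL);
    if (s->used == sizeof s->buf)     /* buffer-full check */
        mini_flush(s);
    s->buf[s->used++] = c;
    if (s->policy == FLUSH_PER_CHAR ||
        (s->policy == FLUSH_PER_LINE && c == '\n'))  /* per-char \n check */
        mini_flush(s);
}
```

With `FLUSH_PER_LINE`, writing the six characters of "ab\ncd\n" one at a time triggers two flushes; with `FLUSH_WHEN_FULL` it triggers none until the 64-byte buffer fills.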
 That is not an OS issue, that is a terminal issue.

It's an OS issue. The OS does the line buffering.
 Note that the current std.stdio does not provide this functionality. The only
 raw functions are rawRead and rawWrite, which set binary mode. All binary mode
 does is on windows enable or disable translation of \r\n to \n. They will not
 do what you are asking.

You're right, you have to dip under the hood to the OS protocol to do it.
Sep 03 2011
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-09-04 06:53:27 +0000, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Saturday, September 03, 2011 23:49:52 Walter Bright wrote:
 I still use printf a lot. One reason is because it is lightweight - using
 writeln blows up the size of your .obj file, making it hard to track down a
 back end bug. This is a long standing gripe I have with writeln.

Well, while that may be a good reason to use printf, it really doesn't apply to very many D programmers. Your average D programmer really has no need to use printf.

That may be true, but the average D programmer will also, directly or indirectly, call C APIs which may use printf to write things to the console. I'm not sure it's much of a problem though. For one thing, C APIs generally don't print things on their own. And also, I doubt using D IO by default will break printf that much: I mean if C IO is used to print lines, those lines will be flushed as they're emitted, with no possible weird interleaving unless the line is really too long. And if you use both D and C IO together, likely you're just logging things to the console line by line and not outputting things in a specific format where weird interleaving could cause major breakage. I'm making some assumptions here, so maybe I'm wrong, but I can't really see a use case where both IO systems would be used and where the fidelity of the output is that important… please correct me if I'm wrong. So in my opinion the default should be to use D streams, as I don't expect the drawbacks to be a major inconvenience, and the performance gain of being able to access the buffer directly would certainly be welcome. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 04 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/4/11 7:10 AM, Michel Fortin wrote:
 On 2011-09-04 06:53:27 +0000, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Saturday, September 03, 2011 23:49:52 Walter Bright wrote:
 I still use printf a lot. One reason is because it is lightweight -
 using
 writeln blows up the size of your .obj file, making it hard to track
 down a
 back end bug. This is a long standing gripe I have with writeln.

Well, while that may be a good reason to use printf, it really doesn't apply to very many D programmers. Your average D programmer really has no need to use printf.

That may be true, but the average D programmer will also, directly or indirectly, call C APIs which may use printf to write things to the console. I'm not sure it's much of a problem though. For one thing, C APIs generally don't print things on their own. And also, I doubt using D IO by default will break printf that much: I mean if C IO is used to print lines, those lines will be flushed as they're emitted, with no possible weird interleaving unless the line is really too long.

No, things are more complex; the interference will be major unless explicitly addressed. Andrei
Sep 04 2011
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-09-04 12:57:06 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 9/4/11 7:10 AM, Michel Fortin wrote:
 And also, I doubt using D IO by default will break printf that much: I
 mean if C IO is used to print lines, those lines will be flushed as
 they're emitted, with no possible weird interleaving unless the line is
 really too long.

No, things are more complex; the interference will be major unless explicitly addressed.

That doesn't really help understand the issue, you're just making it more obscure. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 04 2011
parent reply Steven Schveighoffer <schveiguy yahoo.com> writes:
Michel Fortin Wrote:

 On 2011-09-04 12:57:06 +0000, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:
 
 On 9/4/11 7:10 AM, Michel Fortin wrote:
 And also, I doubt using D IO by default will break printf that much: I
 mean if C IO is used to print lines, those lines will be flushed as
 they're emitted, with no possible weird interleaving unless the line is
 really too long.

No, things are more complex; the interference will be major unless explicitly addressed.

That doesn't really help understand the issue, you're just making it more obscure. -- Michel Fortin michel.fortin michelf.com http://michelf.com/

You are assuming each write flushes the buffer. That's not always the case.

-Steve
Sep 04 2011
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-09-04 16:08:47 +0000, Steven Schveighoffer <schveiguy yahoo.com> said:

 Michel Fortin Wrote:
 
 On 2011-09-04 12:57:06 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:
 
 On 9/4/11 7:10 AM, Michel Fortin wrote:
 And also, I doubt using D IO by default will break printf that much: I
 mean if C IO is used to print lines, those lines will be flushed as
 they're emitted, with no possible weird interleaving unless the line is
 really too long.

No, things are more complex; the interference will be major unless explicitly addressed.

That doesn't really help understand the issue, you're just making it more obscure.

You are assuming each write flushes the buffer. That's not always the case.

Not exactly. I am assuming each write flushes the buffer __up to the last newline__, and that most writes end with \n in a use case where you'd be intermixing the IO systems. That's what I read somewhere else in this discussion, but maybe I read it wrong. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 04 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/4/11 3:23 PM, Michel Fortin wrote:
 On 2011-09-04 16:08:47 +0000, Steven Schveighoffer <schveiguy yahoo.com>
 said:

 Michel Fortin Wrote:

 On 2011-09-04 12:57:06 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 9/4/11 7:10 AM, Michel Fortin wrote:
 And also, I doubt using D IO by default will break printf that much: I
 mean if C IO is used to print lines, those lines will be flushed as
 they're emitted, with no possible weird interleaving unless the
 line is
 really too long.

No, things are more complex; the interference will be major unless explicitly addressed.

That doesn't really help understand the issue, you're just making it more obscure.

You are assuming each write flushes the buffer. That's not always the case.

Not exactly. I am assuming each write flushes the buffer __up to the last newline__, and that most writes end with \n in a use case where you'd be intermixing the IO systems. That's what I read somewhere else in this discussion, but maybe I read it wrong.

It depends on the buffering mode of the stream, and also of the buffering mode of whatever alternative abstraction is being used. Sorry for being curt - I trusted Walter's earlier explanation would suffice. Andrei
Sep 04 2011
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2011-09-04 19:36:23 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 It depends on the buffering mode of the stream, and also of the 
 buffering mode of whatever alternative abstraction is being used.
 
 Sorry for being curt - I trusted Walter's earlier explanation would suffice.

Actually my assumption wasn't too bad within its own boundaries. I was only thinking about stdout and it being line-buffered by default. Looking into it a little more, I'm not sure what would happen to stdin, and stdout isn't always line-buffered by default anyway (if the output isn't a terminal, for instance). So I have to agree with you that mixing the two won't work well in most cases. Sorry for the distraction. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
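The non-terminal case is easy to reproduce in plain C. The sketch below is an illustration under stated assumptions (a POSIX system, and the hypothetical name `write_through_two_layers`): it sends one line through a fully buffered `FILE*` and a second line straight to the same descriptor with `write()`, standing in for a second I/O layer with its own buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical demo: two I/O layers sharing one file descriptor.
   Returns 0 on success, -1 on any failure. */
static int write_through_two_layers(FILE *f)
{
    assert(f != NULL);
    if (setvbuf(f, NULL, _IOFBF, BUFSIZ) != 0)  /* force full buffering */
        return -1;
    if (fputs("A\n", f) == EOF)                 /* sits in the stdio buffer */
        return -1;
    if (write(fileno(f), "B\n", 2) != 2)        /* reaches the file at once */
        return -1;
    if (fflush(f) != 0)                         /* only now is "A\n" written */
        return -1;
    return 0;
}
```

Although "A" was written first in program order, on a typical POSIX system the file ends up holding "B" before "A", because the stdio buffer is only drained at the `fflush`. With a line-buffered terminal the order would come out as written, which is why the problem is easy to miss when testing interactively.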
Sep 04 2011
prev sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/4/2011 2:17 AM, Lars T. Kyllingstad wrote:
 On Sat, 03 Sep 2011 18:23:26 -0700, Walter Bright wrote:

 [I also don't like it that all my code that uses std.path is now
 broken.]

What do you mean by "broken"? That it does not compile or work as expected, or that it spits out a bunch of annoying deprecation messages? If it is any of the former, that was not supposed to happen. The new std.path still contains all the functions of the old std.path and should therefore be backwards compatible. If the new std.path breaks existing code, I need to fix it before it is released. Please let me know what problems you are experiencing.

It prints out all the deprecation messages. It means I'll have to go edit existing, working code to change the names.

I know that the majority wants the name changes. I know the deprecation system gives people plenty of time to edit their code.

But I think the cost of breaking existing code is much higher than many realize, and a lot of that cost will be hidden. It'll come in the form of people deciding not to use D because it is "not stable". It'll come in the form of invalidating existing libraries and modules unless someone is regularly maintaining them. It'll come in the form of invalidating the mass of books, articles, blog postings, and presentations about D, and those will never get updated. People will type in the code examples, they will fail to compile, and they'll get turned off about D.

I'll again note that I know of no successful operating system or programming language that goes around breaking existing code unless it is really, really urgent.

Camel-casing a name doesn't meet that standard. So, yes, I don't like it.
Sep 05 2011
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Walter Bright (newshound2 digitalmars.com)'s article
 On 9/4/2011 2:17 AM, Lars T. Kyllingstad wrote:
 I'll again note that I know of no successful operating system or programming
 language that goes around breaking existing code unless it is really, really
 urgent.
 Camel-casing a name doesn't meet that standard. So, yes, I don't like it.

I agree that we've been overzealous lately in breaking code to fix small inconsistencies in style, etc. I think in a lot of cases the answer is permanent (or very long term, i.e. several years) soft deprecation, plus a real soft-deprecated language feature. This will lead to cruft accumulation but in some cases this cruft is less bad than the cruft caused by inconsistent naming conventions/style, etc. To make the docs seem less crufty to people browsing, we could even eventually remove the soft-deprecated functionality from the DDoc documentation so that people reading it can't even see the cruft, and move the code to the bottom of the source files so that people don't see it unless they go looking for it. We could also adopt a policy of zero maintenance for features that have been soft-deprecated for long periods of time, i.e. not even if they produce egregiously wrong results, security holes, etc.
Sep 05 2011
prev sibling next sibling parent reply Adam Ruppe <destructionator gmail.com> writes:
Count me as another who is sick and tired of the gratuitous breaking
changes every damned month.

The worst part is there's still some new stuff I actually want each
month, so I'm not doing my usual strategy of never, ever, ever updating
software.

It's just pain. Trivial changes are easy enough to fix, but are a
pain. More complex changes cost me time and money. (I'm still angry
about the removal of std.date. But soft deprecation is even worse -
I hate that so much the first thing I do when updating my dmd is to
edit the source to get that useless annoying shit out of there)
Sep 05 2011
parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Adam Ruppe" <destructionator gmail.com> wrote in message 
news:j43nl0$2f85$1 digitalmars.com...
 Count me as another who is sick and tired of the gratuitous breaking
 changes every damned month.

I understand this, and it's a pain to have to change code every release, but I don't think phobos is _anywhere near_ ready to stop breaking. The good news is that the pace of releases has slowed down so it's only every couple of months.
 The worst part is there's still some new stuff I actually want each
 month, so I'm not doing my usual strategy of never, ever, ever updating
 software.

 It's just pain. Trivial changes are easy enough to fix, but are a
 pain. More complex changes cost me time and money. (I'm still angry
 about the removal of std.date. But soft deprecation is even worse -
 I hate that so much the first thing I do when updating my dmd is to
 edit the source to get that useless annoying shit out of there)

How difficult is the process of moving std.date to your own code? (or any other phobos module) How could this be made easier? I don't think the answer is keeping these (broken) modules in phobos.
Sep 06 2011
parent Adam Ruppe <destructionator gmail.com> writes:
Daniel Murphy wrote:
 How could [moving a module to your own code] be made easier?

Actually, ironically enough, removing it from Phobos would make it easier, since then the file can simply be copied into my own tree without needing to rename it to avoid conflicts. This wouldn't apply to a hypothetical std.xml2 though, if it was still called std.xml. Then the old code would still need to find all the imports and rename it. (Renaming modules will probably get more annoying as we go forward, since function local imports might encourage more repetition of the module name.)
Sep 06 2011
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 09/05/2011 04:51 PM, Walter Bright wrote:
 If the new std.path breaks existing code, I need to fix it before it is
 released. Please let me know what problems you are experiencing.

It prints out all the deprecation messages. It means I'll have to go edit existing, working code to change the names.

I think it means it gives you time, on your own schedule with generous deadlines, to make the changes to your code.
 I know that the majority wants the name changes. I know the deprecation
 system gives people plenty of time to edit their code.

 But I think the cost of breaking existing code is much higher than many
 realize, and a lot of that cost will be hidden. It'll come in the form
 of people deciding not to use D because it is "not stable". It'll come
 in the form of invalidating existing libraries and modules unless
 someone is regularly maintaining them. It'll come in the form of
 invalidating the mass of books, articles, blog postings, and
 presentations about D, and those will never get updated. People will
 type in the code examples, they will fail to compile, and they'll get
 turned off about D.

 I'll again note that I know of no successful operating system or
 programming language that goes around breaking existing code unless it
 is really, really urgent.

 Camel-casing a name doesn't meet that standard. So, yes, I don't like it.

I agree with all of the above. However, as is often the case, there's more than one side to the story.

Bad APIs have their costs too. We can't afford to have an XML library that offers few and badly packaged features and comes at the tail of all benchmarks. We also can't afford a JSON library that is poorly designed and badly written. Ironically, the costs mostly manifest the same way: people will decide not to use D because it "lacks good libraries" and "is quirky to use". In many ways a language's standard library is a showcase of the language, and to a newcomer an inconsistent and awkward standard library affects the perception of the language's quality.

Stressing that breaking code has a cost and implying that keeping it with flaws has no cost is as mistaken as worrying in chess about the flank at the expense of the center.

The reality we need to face is, we are experiencing growth pains. What we must do is NOT lament about breaking this or keeping that. We must:

a) devise good language features to cope with deprecation, of which deprecation with message is one that I think we need to embrace and extend (I have a few ideas I'll discuss separately);

b) supplement that with a good policy for deprecating APIs and introducing new ones - in particular decide where to draw the line when introducing a breaking change;

c) possibly create programs a la gofix that help migration.

Andrei
Sep 05 2011
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 In many ways a language's standard library is a
 showcase of the language,

YES!!! I'm glad someone besides me finally realizes this. For example, whenever someone asks me about why D metaprogramming is so great, I just point them to a few std lib modules that showcase this, e.g.: http://stackoverflow.com/questions/7300298/metaprogramming-in-c-and-in-d/7300611#7300611
Sep 05 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's more than
 one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library that
 offers few and badly packaged features and comes at the tail of all benchmarks.
 We also can't afford a JSON library that is poorly designed and badly written.
 Ironically, the costs mostly manifest the same way: people will decide not to
 use D because it "lacks good libraries" and "is quirky to use". In many ways a
 language's standard library is a showcase of the language, and to a newcomer an
 inconsistent and awkward standard library affects the perception of the
 language's quality.

I agree that the XML and JSON libraries need to be scrapped and rewritten. But simply changing the names of otherwise successful APIs is not worth while.
 c) possibly create programs a la gofix that help migration.

gofix cannot fix books, articles, blogs, and presentations. Furthermore, in order to work successfully, gofix needs to be a complete D front end, capable of handling both the old and the new stuff. Doing a perl script would be a disaster. It's a substantial project, has a high risk of inadequacy, and I suspect our resources are better spent elsewhere. Considering also the problems people have running dmd and getting it to find their imports and libraries, add in having to run 'gofix' over their source code first, then patch up what gofix goofed up, seems a stretch.
Sep 05 2011
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 08:00, Walter Bright wrote:
 On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than
 one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library
 that
 offers few and badly packaged features and comes at the tail of all
 benchmarks.
 We also can't afford a JSON library that is poorly designed and badly
 written.
 Ironically, the costs mostly manifest the same way: people will decide
 not to
 use D because it "lacks good libraries" and "is quirky to use". In
 many ways a
 language's standard library is a showcase of the language, and to a
 newcomer an
 inconsistent and awkward standard library affects the perception of the
 language's quality.

I agree that the XML and JSON libraries need to be scrapped and rewritten. But simply changing the names of otherwise successful APIs is not worth while.

So we have to live with these naming conventions from C forever? -- /Jacob Carlborg
Sep 05 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-09-06 08:56, Jonathan M Davis wrote:
 On Tuesday, September 06, 2011 08:42:14 Jacob Carlborg wrote:
 On 2011-09-06 08:00, Walter Bright wrote:
 On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than
 one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library
 that
 offers few and badly packaged features and comes at the tail of all
 benchmarks.
 We also can't afford a JSON library that is poorly designed and badly
 written.
 Ironically, the costs mostly manifest the same way: people will decide
 not to
 use D because it "lacks good libraries" and "is quirky to use". In
 many ways a
 language's standard library is a showcase of the language, and to a
 newcomer an
 inconsistent and awkward standard library affects the perception of
 the
 language's quality.

I agree that the XML and JSON libraries need to be scrapped and rewritten. But simply changing the names of otherwise successful APIs is not worth while.

So we have to live with these naming conventions from C forever?

My take on it is that we need to figure out which pieces of Phobos need to be reworked or renamed and get it done as soon as possible. That way, everything follows the proper naming conventions (thus avoiding a mess like PHP) and is of an appropriately high level of quality. Then we can have an appropriately stable API which doesn't have to change often - if at all.

I think that the current problem with Phobos is primarily a combination of three things:

1. Older APIs which aren't in line with how D2 and Phobos have evolved (e.g. they don't use ranges when they should).

2. Some older stuff didn't get a thorough enough peer review before making it into Phobos and is not at a high enough level of quality, so it needs to be revised or replaced.

3. Too much of what has been done in the past has been a hodgepodge of naming conventions, making it very inconsistent in some places.

Once those have been sorted out (some of which can be done without breaking any existing code and some of which requires breaking changes), then we can have a stable API for Phobos which doesn't change much except where we're adding new functionality which doesn't break existing code. So ultimately, we _will_ have a stable API, but some breaking changes are required in the short term to resolve issues with Phobos which would cause problems in the long run.

- Jonathan M Davis

Yes, thank you, I agree. -- /Jacob Carlborg
Sep 06 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, September 06, 2011 08:42:14 Jacob Carlborg wrote:
 On 2011-09-06 08:00, Walter Bright wrote:
 On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than
 one side to the story.
 
 Bad APIs have their costs too. We can't afford to have an XML library
 that
 offers few and badly packaged features and comes at the tail of all
 benchmarks.
 We also can't afford a JSON library that is poorly designed and badly
 written.
 Ironically, the costs mostly manifest the same way: people will decide
 not to
 use D because it "lacks good libraries" and "is quirky to use". In
 many ways a
 language's standard library is a showcase of the language, and to a
 newcomer an
 inconsistent and awkward standard library affects the perception of
 the
 language's quality.

I agree that the XML and JSON libraries need to be scrapped and rewritten. But simply changing the names of otherwise successful APIs is not worth while.

So we have to live with these naming conventions from C forever?

My take on it is that we need to figure out which pieces of Phobos need to be reworked or renamed and get it done as soon as possible. That way, everything follows the proper naming conventions (thus avoiding a mess like PHP) and is of an appropriately high level of quality. Then we can have an appropriately stable API which doesn't have to change often - if at all.

I think that the current problem with Phobos is primarily a combination of three things:

1. Older APIs which aren't in line with how D2 and Phobos have evolved (e.g. they don't use ranges when they should).
2. Some older stuff didn't get a thorough enough peer review before making it into Phobos and is not at a high enough level of quality, so it needs to be revised or replaced.
3. Too much of what has been done in the past has been a hodgepodge of naming conventions, making it very inconsistent in some places.

Once those have been sorted out (some of which can be done without breaking any existing code and some of which requires breaking changes), then we can have a stable API for Phobos which doesn't change much except where we're adding new functionality which doesn't break existing code. So ultimately, we _will_ have a stable API, but some breaking changes are required in the short term to resolve issues with Phobos which would cause problems in the long run.

- Jonathan M Davis
Sep 05 2011
prev sibling next sibling parent reply Adam Ruppe <destructionator gmail.com> writes:
Walter Bright wrote:
 I agree that the XML and JSON libraries need to be scrapped and rewritten.

Ugh, I actually use the std.json.
 Furthermore, in order to work successfully, gofix [...]

The easiest way to do that is run the compiler. If an error occurs, go to the given line of the problem and maybe automatically replace with the spell checker's suggestion. Then in case that's wrong, ask the user to confirm it. ... which is pretty much what my editor (vim) already does! Changing names is annoying, but it's not a difficult task, with what we have now. The compiler already does 90% of the work, and even fairly simple editors will bring it to about 98%. I'll bitch about it, but it isn't a big enough deal to bother with a gofix. Trivial fixes are already trivial fixes. I'd prefer to avoid them, but let's not forget that the compiler already does most the work. The more annoying changes are where stuff changes wholesale, so the code needs to be rethought, data needs to be changed, and so on. These are just huge sinks of pain. And, no, a long deprecation time doesn't change anything. Whether I spend thousands of dollars today changing it or thousands of dollars in six months changing it, the fact is I'm still out thousands of dollars.
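To make the "replace with the spell checker's suggestion" step concrete, here is a minimal sketch of how such a suggestion could be picked by edit distance. The function name `suggest` and the sample identifiers are made up for illustration; this is not how dmd's own spell checker is implemented, just one plausible approach using `levenshteinDistance` from Phobos.

```d
import std.algorithm.comparison : levenshteinDistance;
import std.stdio : writeln;

// Return the known identifier closest to `unknown` by edit distance.
// Returns null when `known` is empty. (Hypothetical helper.)
string suggest(string unknown, string[] known)
{
    string best = null;
    size_t bestDistance = size_t.max;
    foreach (name; known)
    {
        immutable distance = levenshteinDistance(unknown, name);
        if (distance < bestDistance)
        {
            bestDistance = distance;
            best = name;
        }
    }
    return best;
}

void main()
{
    // Hypothetical rename: an old spelling that no longer resolves.
    string[] known = ["toLower", "toUpper", "strip"];
    writeln(suggest("tolower", known));
}
```

An editor macro would still want to ask for confirmation before applying the result, since the nearest name is not always the intended one.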
Sep 06 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 9:17 AM, Adam Ruppe wrote:
 And, no, a long deprecation time doesn't change anything.
 Whether I spend thousands of dollars today changing it or thousands
 of dollars in six months changing it, the fact is I'm still out
 thousands of dollars.

Basic economics indicate that the difference is large. Andrei
Sep 06 2011
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 16:17, Adam Ruppe wrote:
 Walter Bright wrote:
 I agree that the XML and JSON libraries need to be scrapped and rewritten.

Ugh, I actually use the std.json.
 Furthermore, in order to work successfully, gofix [...]

The easiest way to do that is run the compiler. If an error occurs, go to the given line of the problem and maybe automatically replace with the spell checker's suggestion. Then in case that's wrong, ask the user to confirm it. ... which is pretty much what my editor (vim) already does! Changing names is annoying, but it's not a difficult task, with what we have now. The compiler already does 90% of the work, and even fairly simple editors will bring it to about 98%. I'll bitch about it, but it isn't a big enough deal to bother with a gofix. Trivial fixes are already trivial fixes. I'd prefer to avoid them, but let's not forget that the compiler already does most the work. The more annoying changes are where stuff changes wholesale, so the code needs to be rethought, data needs to be changed, and so on. These are just huge sinks of pain. And, no, a long deprecation time doesn't change anything. Whether I spend thousands of dollars today changing it or thousands of dollars in six months changing it, the fact is I'm still out thousands of dollars.

You can always keep your own local copy of a module. -- /Jacob Carlborg
Sep 06 2011
parent Adam Ruppe <destructionator gmail.com> writes:
Jacob Carlborg wrote:
 You can always keep your own local copy of a module.

Yeah, though that comes with its own set of pains. But, let me ask you this. Which is better?

1) Ask an unknown number of people to change their code to keep up with your changes and/or distribute the old module, or
2) Ask just one person - who is making changes anyway - to make one more small change and distribute the old and new modules.

One counterpoint to this would be "why make people download old modules they don't need in the zip?" There are two answers to that too:

a) That's a trivial cost. The average Phobos module is about 3 or 4 kilobytes once zipped up. When you're grabbing a > 10 MB zip, three kilobytes is nothing to worry about.
b) If it is a big deal, changing to a download on demand system (like the DIP) can avoid this... especially if the old version is still around and easily accessible by name.
Sep 06 2011
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 1:00 AM, Walter Bright wrote:
 On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than
 one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library
 that
 offers few and badly packaged features and comes at the tail of all
 benchmarks.
 We also can't afford a JSON library that is poorly designed and badly
 written.
 Ironically, the costs mostly manifest the same way: people will decide
 not to
 use D because it "lacks good libraries" and "is quirky to use". In
 many ways a
 language's standard library is a showcase of the language, and to a
 newcomer an
 inconsistent and awkward standard library affects the perception of the
 language's quality.

I agree that the XML and JSON libraries need to be scrapped and rewritten. But simply changing the names of otherwise successful APIs is not worth while.

I agree we should be increasingly hawkish about such changes.
 c) possibly create programs a la gofix that help migration.

gofix cannot fix books, articles, blogs, and presentations. Furthermore, in order to work successfully, gofix needs to be a complete D front end, capable of handling both the old and the new stuff. Doing a perl script would be a disaster. It's a substantial project, has a high risk of inadequacy, and I suspect our resources are better spent elsewhere.

I'm not so sure. We're experiencing an unprecedented surge in participation to all aspects of the D programming language, and I believe you and I should start thinking differently. A certain change of phase happened to me a couple of months ago, when I commented about some fix that removed an undue limitation: "sounds good, but Walter has many other things on his plate more important than this". Within hours, the fix was available as a pull request. In this case, if one or more persons is/are determined enough to create dfix, it will happen regardless of whether you or I believe it's the optimal resource allocation.
 Considering also the problems people have running dmd and getting it to
 find their imports and libraries, add in having to run 'gofix' over
 their source code first, then patch up what gofix goofed up, seems a
 stretch.

I do agree dfix must work at least as well as Apple hardware :o). Andrei
Sep 06 2011
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 04:48, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library
 that offers few and badly packaged features and comes at the tail of all
 benchmarks. We also can't afford a JSON library that is poorly designed
 and badly written. Ironically, the costs mostly manifest the same way:
 people will decide not to use D because it "lacks good libraries" and
 "is quirky to use". In many ways a language's standard library is a
 showcase of the language, and to a newcomer an inconsistent and awkward
 standard library affects the perception of the language's quality.

 Stressing that breaking code has a cost and implying that keeping it
 with flaws has no cost is as mistaken as worrying in chess about the
 flank at the expense of the center.

 The reality we need to face is, we are experiencing growth pains. What
 we must do is NOT lament about breaking this or keeping that. We must:

 a) devise good language features to cope with deprecation, of which
 deprecation with message is one that I think we need to embrace and
 extend (I have a few ideas I'll discuss separately);

 b) supplement that with a good policy for deprecating APIs and
 introducing new ones - in particular decide where to draw the line when
 introducing a breaking change;

 c) possibly create programs a la gofix that help migration.


 Andrei
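For readers unfamiliar with point (a), here is a small sketch of what deprecation with a message looks like in current D compilers. The function names `parseConfig` and `parseConfig2` are hypothetical and only stand in for an old API and its replacement.

```d
import std.conv : to;
import std.stdio : writeln;
import std.string : strip;

// Hypothetical old API: kept so existing callers still build,
// but every use makes the compiler print the message below.
deprecated("use parseConfig2 instead; parseConfig is scheduled for removal")
int parseConfig(string text)
{
    return parseConfig2(text);
}

// Hypothetical replacement API.
int parseConfig2(string text)
{
    return text.strip.to!int;
}

void main()
{
    // New code calls the replacement directly.
    writeln(parseConfig2(" 42 "));
    // A call to parseConfig(" 42 ") would still compile by default,
    // but would emit the deprecation message at compile time.
}
```

The message is what gives the migration path: the caller learns what to switch to at the exact line that needs changing.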

We don't want to have a standard library like the one in PHP where there seems to be no naming conventions at all. -- /Jacob Carlborg
Sep 05 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/5/2011 11:39 PM, Jacob Carlborg wrote:
 We don't want to have a standard library like the one in PHP where there seems
 to be no naming conventions at all.

I don't think that is the reason PHP is such a bear to work with.
Sep 06 2011
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-06 09:35, Walter Bright wrote:
 On 9/5/2011 11:39 PM, Jacob Carlborg wrote:
 We don't want to have a standard library like the one in PHP where
 there seems
 to be no naming conventions at all.

I don't think that is the reason PHP is such a bear to work with.

I think that that is one reason, not the only one, not the biggest one, but one reason. -- /Jacob Carlborg
Sep 06 2011
prev sibling next sibling parent Adam Ruppe <destructionator gmail.com> writes:
Walter Bright wrote:
 I don't think that is the reason PHP is such a bear to work with.

It is one of the problems with PHP, but I'm not sure it applies to D the same way. Almost *every time* I write PHP, I either mess up a name or the argument order. (Sometimes, PHP functions go src, dest, and sometimes it's dest, src. Ugh! The worst part is it doesn't even catch this. A name mismatch throws an error when it's run. Reversed arguments just silently do the wrong thing.) The random argument order is a huge huge pain. Contrast with D, where I've almost never forgotten a name. Granted, it might be due to my auto-complete function so I type once and only once, but it just hasn't been a big deal. Thanks to the UFCS with arrays, the argument order is almost always the same to work with that, so the much bigger problem never occurs.
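A small sketch of the UFCS point, assuming a made-up helper `countAbove`: because the array is the first parameter, the call can be written as if it were a method on the array, which is what keeps the argument order predictable.

```d
import std.stdio : writeln;

// Hypothetical helper: the array being operated on is always
// the first parameter, so it can be called with method syntax.
size_t countAbove(const int[] data, int limit)
{
    size_t count = 0;
    foreach (value; data)
    {
        if (value > limit)
            ++count;
    }
    return count;
}

void main()
{
    int[] samples = [1, 5, 8, 3, 9];
    // Both calls are equivalent; UFCS rewrites the second into the first.
    writeln(countAbove(samples, 4));
    writeln(samples.countAbove(4));
}
```

Swapping the two arguments here would be a type error rather than a silent bug, which is the other half of the contrast with PHP.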
Sep 06 2011
prev sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 2:35 AM, Walter Bright wrote:
 On 9/5/2011 11:39 PM, Jacob Carlborg wrote:
 We don't want to have a standard library like the one in PHP where
 there seems
 to be no naming conventions at all.

I don't think that is the reason PHP is such a bear to work with.

Probably. At any rate, what I now think of as a promising path is with new module names. Let's leave the likes of std.xml and std.json in peace, then pick a naming convention for the new ones and create whole new modules replacing them. Then people who are ready for the migration change import std.xml; with import std.some_naming_convention_involving_xml; and fix whatever code breakages that entails. If they're pleased with std.xml, nobody's holding a gun to their head. Months and years go by, and nobody uses std.xml because the new module and the migration path are copiously advertised in the documentation. At that point we can discuss excising std.xml altogether and replacing it with the new one. And so the new becomes old, just like in dialectics. There's a successful precedent in C++ - stringstream vs. strstream. The only missing thing is that C++ did not choose a naming convention because they limited themselves to only one header. So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at Facebook to good effect. Or should we, au contraire, use "old_" for the old module and advise people who want to stick with the old modules to change their imports? Andrei
Sep 06 2011
next sibling parent Adam Ruppe <destructionator gmail.com> writes:
Andrei Alexandrescu wrote:
 What should we use? xml2?

That might be a good idea. If D modules were to get in the habit of writing their major version numbers as part of the name, it'd solve this as well as the dget automatic library downloading thingy in one go. Going with new and old won't work if there should ever be a version 3 written.
Sep 06 2011
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 16:51, Andrei Alexandrescu wrote:
 On 9/6/11 2:35 AM, Walter Bright wrote:
 On 9/5/2011 11:39 PM, Jacob Carlborg wrote:
 We don't want to have a standard library like the one in PHP where
 there seems
 to be no naming conventions at all.

I don't think that is the reason PHP is such a bear to work with.

Probably. At any rate, what I now think of as a promising path is with new module names. Let's leave the likes of std.xml and std.json in peace, then pick a naming convention for the new ones and create whole new modules replacing them. Then people who are ready for the migration change import std.xml; with import std.some_naming_convention_involving_xml; and fix whatever code breakages that entails. If they're pleased with std.xml, nobody's holding a gun to their head. Months and years go by, and nobody uses std.xml because the new module and the migration path are copiously advertised in the documentation. At that point we can discuss excising std.xml altogether and replacing it with the new one. And so the new becomes old, just like in dialectics. There's a successful precedent in C++ - stringstream vs. strstream. The only missing thing is that C++ did not choose a naming convention because they limited themselves to only one header. So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at Facebook to good effect. Or should we, au contraire, use "old_" for the old module and advise people who want to stick with the old modules to change their imports? Andrei

I prefer to use "old_". Depending on what XML functionality we want we maybe want to have an xml package. -- /Jacob Carlborg
Sep 06 2011
parent reply Adam Ruppe <destructionator gmail.com> writes:
Jacob Carlborg wrote:
 I prefer to use "old_".

There's two big problems with that though: 1) It still breaks the old code. It's an even easier fix, so this isn't too bad, but it is still broken. 2) What if a third version of a module comes along?
Sep 06 2011
next sibling parent Adam Ruppe <destructionator gmail.com> writes:
Andrej Mitrovic wrote:
 select deprecated functionality

The problem I have is old code isn't going to change itself to select old functions. New code, on the other hand, can decide to use new functions since someone is actively writing it. Therefore, it's less painful to opt in to using new code than to select to use old code.
 Naming modules "xml.old1" [...]

I'd do "xml" and "xml2" rather than "xml.old" since the name is already "xml", and in my view, that's immutable now. This might be a poor man's version control system, but is that bad? It's not uncommon for software (or books or movies...) to have major versions (sequel numbers) in the name. Thanks to D's module namespacing, that name is the only place it'd be too; the code that uses it still looks natural. It's not like they'd have to write parseXML2() all over the place - like is somewhat common in C. Would people find it weird that versions are in the name? Maybe, but again, that's common in a lot of places. Just make sure the Phobos docs point to the newest version by default in the left nav panel so people don't have to hunt for what's newest. Would this naming scheme be a hassle in the source control? I don't think so. 1) If it's a rewrite, it's a different file anyway, even if you gave it the same name; it's not like a patch could apply to both versions. 2) If it's a minor fork, you could surely just apply patches to two branches in git, right? (I don't really know how it works, but I can't imagine it'd be harder than any other branch which I hear git makes easy.) 3) If it's backward compatible, no need to change the number. My only regret is we didn't have the foresight to call it "std.xml1" in the first place.
Sep 06 2011
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-06 17:37, Adam Ruppe wrote:
 Jacob Carlborg wrote:
 I prefer to use "old_".

There's two big problems with that though: 1) It still breaks the old code. It's an even easier fix, so this isn't too bad, but it is still broken. 2) What if a third version of a module comes along?

What I don't like is that if there's a function/class/module that should be deprecated but has a good, proper name, we can't use that name for a new implementation. Another problem I see is that DMD, D, druntime and Phobos are released in one piece. You should be able to take an arbitrary version of Phobos and use it with the compiler and the language, just as you can with other libraries. Then there could be a better version scheme and you can stay on a given version if you really have to.

1.0.0
Major.minor.build

Increment major when introducing API breaking changes, e.g. removing a method.
Increment minor when introducing non-breaking API changes, e.g. adding a new method.
Increment build when changing implementation details, e.g. changing an internal data structure from an array to a linked list.

With this version scheme you would know that as long as you stay on 1.x.y your code won't break.

-- /Jacob Carlborg
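The Major.minor.build rule can be sketched as a compatibility check. The type `LibVersion` and the function `isCompatible` are hypothetical names, and the exact rule (same major, and at least the required minor/build) is one reading of the scheme, not something Phobos or dmd implements.

```d
import std.stdio : writeln;

// Hypothetical version triple following Major.minor.build.
struct LibVersion
{
    uint major;
    uint minor;
    uint build;
}

// True if code written against `required` should keep working with
// `available`: the major number must match (no breaking changes),
// and `available` must not be older than `required`.
bool isCompatible(LibVersion required, LibVersion available)
{
    if (available.major != required.major)
        return false;
    if (available.minor != required.minor)
        return available.minor > required.minor;
    return available.build >= required.build;
}

void main()
{
    auto required = LibVersion(1, 2, 0);
    writeln(isCompatible(required, LibVersion(1, 4, 3))); // newer minor, same major
    writeln(isCompatible(required, LibVersion(2, 0, 0))); // major bump
}
```

Under this reading, staying on 1.x.y only ever moves you to versions that add things or change internals.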
Sep 07 2011
prev sibling next sibling parent reply Adam Ruppe <destructionator gmail.com> writes:
Andrej Mitrovic:
 I assume people will just pick the first thing they see

That's why the links on the left should always point to the newest version, and there might be notes in the docs pointing people to newer and older versions.
 "std.xml" looks standard so they would pick that over "std.xml2"

Maybe, but if we were just consistent with a scheme, I think it'd be easy enough to learn. It's not uncommon to see sequels named "name 2".
Sep 06 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 11:27 AM, Adam Ruppe wrote:
 Andrej Mitrovic:
 I assume people will just pick the first thing they see

That's why the links on the left should always point to the newest version, and there might be notes in the docs pointing people to newer and older versions.
 "std.xml" looks standard so they would pick that over "std.xml2"

Maybe, but if we were just consistent with a scheme, I think it'd be easy enough to learn. It's not uncommon to see sequels named "name 2".

Yah, I also think the documentation makes it easy to clarify which module is the preferred one. I think there's a lot of merit to simply appending a '2' to the module name. The only place where the '2' occurs is in the name of the module, and there aren't many modules we need to replace like that. Andrei
Sep 06 2011
parent reply "Daniel Murphy" <yebblies nospamgmail.com> writes:
"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which module 
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module 
 name. The only place where the '2' occurs is in the name of the module, 
 and there aren't many modules we need to replace like that.

I still can never remember if I'm supposed to be using std.regex or std.regexp. When the new one is finished are we going to have 3? It's definitely beneficial to avoid breaking code, but I really disagree that phobos has reached that point yet. The breaking changes need to stop, but stopping prematurely will leave phobos permanently disfigured.
Sep 06 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 12:05 PM, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which module
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the module,
 and there aren't many modules we need to replace like that.

I still can never remember if I'm supposed to be using std.regex or std.regexp.

Yet another argument :o). I also don't quite remember right now whether strstream or stringstream is the new one (I think the latter).
 When the new one is finished are we going to have 3?

The new one will stay regex.
 It's definitely beneficial to avoid breaking code, but I really disagree
 that phobos has reached that point yet.  The breaking changes need to stop,
 but stopping prematurely will leave phobos permanently disfigured.

Agreed. Andrei
Sep 06 2011
prev sibling next sibling parent Dmitry Olshansky <dmitry.olsh gmail.com> writes:
On 06.09.2011 21:05, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which module
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the module,
 and there aren't many modules we need to replace like that.

I still can never remember if I'm supposed to be using std.regex or std.regexp.

Looking at the docs: std.regexp is scheduled for deprecation (in August ? hm... that was a bit harsh).
 When the new one is finished are we going to have 3?

To the best of my knowledge new one is supposed to be std.regex, and since API is essentially the same, chances are most users won't notice the change :) Speaking of the whole idea, I like '2' appended, it's clear that it's a new and better version, and it keeps the old code from unnecessary strain.
 It's definitely beneficial to avoid breaking code, but I really disagree
 that phobos has reached that point yet.  The breaking changes need to stop,
 but stopping prematurely will leave phobos permanently disfigured.

-- Dmitry Olshansky
Sep 06 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, September 06, 2011 21:42:09 Dmitry Olshansky wrote:
 On 06.09.2011 21:05, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 
 Yah, I also think the documentation makes it easy to clarify which
 module is the preferred one.
 
 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the
 module, and there aren't many modules we need to replace like that.

I still can never remember if I'm supposed to be using std.regex or std.regexp.

Looking at the docs: std.regexp is scheduled for deprecation (in August ? hm... that was a bit harsh).

std.regexp has been scheduled for deprecation for ages. It just hasn't had a date attached to it. It'll be deprecated in 2.055. - Jonathan M Davis
Sep 06 2011
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 19:05, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which module
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the module,
 and there aren't many modules we need to replace like that.

I still can never remember if I'm supposed to be using std.regex or std.regexp. When the new one is finished are we going to have 3? It's definitely beneficial to avoid breaking code, but I really disagree that phobos has reached that point yet. The breaking changes need to stop, but stopping prematurely will leave phobos permanently disfigured.

I agree. -- /Jacob Carlborg
Sep 07 2011
parent reply Sean Cavanaugh <WorksOnMyMachine gmail.com> writes:
On 9/7/2011 2:19 AM, Jacob Carlborg wrote:
 On 2011-09-06 19:05, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org> wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which
 module
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the
 module,
 and there aren't many modules we need to replace like that.

I still can never remember if I'm supposed to be using std.regex or std.regexp. When the new one is finished are we going to have 3? It's definitely beneficial to avoid breaking code, but I really disagree that phobos has reached that point yet. The breaking changes need to stop, but stopping prematurely will leave phobos permanently disfigured.

I agree.

In the COM based land for D3D, there is just a number tacked onto the class name. We are up to version 11 (e.g. ID3D11Device). It works well and is definitely nicer, once you are used to it, than calling everything New or FunctionEx and being left wondering what to do when you rev the interface again. Once you solve making 3 versions of an interface work cleanly, it should be a good system. Making all the modules versioned in some way would probably be ideal. The way Linux shared libraries are linked could be used as a model: just make the 'friendly unversioned' module name an alias of some sort to the latest version of the library. Any code needing the older version can specify it explicitly. An approach like this would need to be done within D, as symbolic links are a problem for some platforms (though at least it's possible on Windows these days).
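As a rough single-file sketch of the "friendly unversioned name is an alias to the latest version" idea: the types `XmlParser1` and `XmlParser2` are invented for illustration and have nothing to do with any real Phobos module. Across separate modules, a similar effect could presumably be had by making the unversioned module do a `public import` of the newest versioned one, but that is an assumption about how it would be organized, not an existing convention.

```d
import std.stdio : writeln;

// Hypothetical first-generation type, left untouched for old code.
struct XmlParser1
{
    string describe() const { return "parser v1"; }
}

// Hypothetical rewrite living under a versioned name.
struct XmlParser2
{
    string describe() const { return "parser v2"; }
}

// The friendly, unversioned name always points at the newest version.
// Revving the library means adding XmlParser3 and changing this one line.
alias XmlParser = XmlParser2;

void main()
{
    XmlParser current;  // gets whatever is newest
    XmlParser1 legacy;  // old code pins the version explicitly
    writeln(current.describe());
    writeln(legacy.describe());
}
```

Note the trade-off raised elsewhere in the thread: code that uses the unversioned name gets silently moved forward when the alias changes, so this only avoids breakage for code that names its version.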
Sep 08 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/8/11 1:28 PM, Marco Leise wrote:
 Am 08.09.2011, 18:52 Uhr, schrieb Simen Kjaeraas <simen.kjaras gmail.com>:

 On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh
 <WorksOnMyMachine gmail.com> wrote:

 In the COM based land for D3D, there is just a number tacked onto the
 class name. We are up to version 11 (e.g. ID3D11Device). It works
 well and is definitely nicer once you are used to it, than calling
 everything New or FunctionEx, and left wondering what to do when you
 rev the interface again

In the case of D3D though, D3D itself has a version number. The next version of std.xml will not be parsing XMLv2.0. When a version 2.0 of the XML spec shows up, what do we do about std.xml2, which parses version 1.1? And what do we call the new one? Should std.xml3 parse XMLv2.0?

That is late in the discussion, but a valid point.

Waiting for a suggestion from the XML experts. Andrei
Sep 08 2011
parent Alix Pexton <alix.DOT.pexton gmail.DOT.com> writes:
On 08/09/2011 21:02, Andrei Alexandrescu wrote:
 On 9/8/11 1:28 PM, Marco Leise wrote:
 Am 08.09.2011, 18:52 Uhr, schrieb Simen Kjaeraas
 <simen.kjaras gmail.com>:

 On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh
 <WorksOnMyMachine gmail.com> wrote:

 In the COM based land for D3D, there is just a number tacked onto the
 class name. We are up to version 11 (e.g. ID3D11Device). It works
 well and is definitely nicer once you are used to it, than calling
 everything New or FunctionEx, and left wondering what to do when you
 rev the interface again

In the case of D3D though, D3D itself has a version number. The next version of std.xml will not be parsing XMLv2.0. When a version 2.0 of the XML spec shows up, what do we do about std.xml2, which parses version 1.1? And what do we call the new one? Should std.xml3 parse XMLv2.0?

That is late in the discussion, but a valid point.

Waiting for a suggestion from the XML experts. Andrei

I'm not really an XML expert, but I do recall that the XML Core Working Group shelved their plans to develop "XML2.0". All enhancements that are in the pipeline are separate projects with their own acronyms. IMHO, even if there were an XML2.0 spec, I don't think it would affect the naming of the module in Phobos, because I doubt very much it would introduce anything that would require a complete rewrite. std.xml2 could just be extended to support the new features of the spec in the context of its existing architecture. But it is probably a moot point. A...
Sep 08 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules replacing them.

std.xml2 will do fine.
Sep 06 2011
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 12:54 PM, Walter Bright wrote:
 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules
 replacing them.

std.xml2 will do fine.

Since the BDFL and the majority of his constituents are in favor of this, it looks like the winner. Andrei
Sep 06 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/6/2011 12:03 PM, Andrei Alexandrescu wrote:
 Since the BDFL

Brain-Damaged Feckless Leader?
Sep 06 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/06/2011 09:09 PM, Walter Bright wrote:
 On 9/6/2011 12:03 PM, Andrei Alexandrescu wrote:
 Since the BDFL

Brain-Damaged Feckless Leader?

Benevolent Dictator For Life ;)
Sep 06 2011
prev sibling parent reply notna <notna.remove.this ist-einmalig.de> writes:
Sorry upfront, I didn't read this whole thread, so maybe I'm missing or 
mixing something...

How about a D binding for http://www.xmlsoft.org/ ?

In other words, taking the "curl or sqlite3 path", something like 
/etc/c/xml2

On 06.09.2011 19:54, Walter Bright wrote:
 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules
 replacing them.

std.xml2 will do fine.

Sep 06 2011
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

That is about 4 times slower than the Tango XML parser: http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/
 On 06.09.2011 19:54, Walter Bright wrote:
 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules
 replacing them.

std.xml2 will do fine.


Sep 06 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, September 06, 2011 23:51:48 Marco Leise wrote:
 Am 06.09.2011, 22:28 Uhr, schrieb Timon Gehr <timon.gehr gmx.ch>:
 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...
 
 How about a D binding for http://www.xmlsoft.org/ ?
 
 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

That is about 4 times slower than the Tango XML parser: http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

devs? Tango's XML parser should really make it into Phobos.

A new std.xml is already in the works. It'll be range-based, unlike the Tango parser. But there's no reason why Phobos shouldn't be able to have a similarly-fast XML parser. As I understand it, the primary reason that the current std.xml is slow is because it uses delegates quite a bit, but I haven't used it myself, so I don't know all of the details. - Jonathan M Davis
Sep 06 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-08 13:25, Steven Schveighoffer wrote:
 On Tue, 06 Sep 2011 17:59:44 -0400, Jonathan M Davis
 A new std.xml is already in the works. It'll be range-based, unlike
 the Tango
 parser. But there's no reason why Phobos shouldn't be able to have a
 similarly-fast XML parser. As I understand it, the primary reason that
 the
 current std.xml is slow is because it uses delegates quite a bit, but I
 haven't used it myself, so I don't know all of the details.

No, the issue is, and always will be, buffer access. C's FILE * just doesn't provide anything decent. It's the primary motivation for wanting to revamp it. With slicing and copy avoidance (i.e. only read into a buffer, never copy out), we can achieve the same with Phobos, but I think we have to replace C's buffering system (at least for this usage). Tango's I/O libraries use delegates and virtual functions galore. I think too big a stigma is attached to those. The difference between calling a virtual function/delegate and calling a normal function is very insignificant, the real savings for not using virtual functions is to allow inlining. However, in this case, I/O is so diverse that you *need* polymorphism. -Steve

The Tango XML parser doesn't read from a file, it takes the input as a string. The parser isn't affected by I/O at all. -- /Jacob Carlborg
Sep 08 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-08 15:22, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:
 The Tango XML parser doesn't read from a file, it takes the input as a
 string. The parser isn't affected by I/O at all.

So you have to read the entire file before sending it to the parser? Isn't that a bit limited? What if I have a 50MB file, I have to read it into a continuous memory block first? -Steve

I'm just telling how Tango currently works, not how the XML module in Phobos should work. But I guess it might be somewhat limited. 50MB isn't that big to read into memory? I think it would be nice to be able to do both. If you read the whole file before sending it to the parser you would know it doesn't perform any I/O operations. -- /Jacob Carlborg
Sep 08 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-09-08 21:54, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 15:38:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-08 15:22, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:
 The Tango XML parser doesn't read from a file, it takes the input as a
 string. The parser isn't affected by I/O at all.

So you have to read the entire file before sending it to the parser? Isn't that a bit limited? What if I have a 50MB file, I have to read it into a continuous memory block first? -Steve

I'm just telling how Tango currently works, not how the XML module in Phobos should work. But I guess it might be somewhat limited. 50MB isn't that big to read into memory?

Um... yeah, it is :) I have 1 GB of memory, my system starts thrashing with an app that consumes 750MB. So that's like 13 xml files read? Especially if I want to use DOM, I have to keep them around...

50MB is far from 750MB :), but I see your point.
 Not to mention that the GC has to allocate a contiguous space for it. So
 even if I have 100MB of garbage space, maybe none of it is usable, I
 still have to allocate a new block. I'm just surprised there isn't at
 least an option for a stream-based xml parser in Tango.

 One thing this does though, I always assumed it was Tango's I/O that
 accounts for its xml superiority. I wonder, does anyone count reading
 the file in any of the benchmarks?

As far as I know it's because of two reasons: it doesn't allocate any memory (uses slices) and all methods are final. I have no idea about the benchmarks.
 I still think we can come close without having to pre-read an entire file.

I hope so as well.
 I think it would be nice to be able to do both. If you read the whole
 file before sending it to the parser you would know it doesn't perform
 any I/O operations.

I totally agree. I think there are ways to abstract the functionality for both memory-based and device-based i/o into one interface (part of the reason for the revamp). -Steve

A range-based API, as Jonathan and others have said. -- /Jacob Carlborg
Sep 09 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 17:59:44 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Tuesday, September 06, 2011 23:51:48 Marco Leise wrote:
 Am 06.09.2011, 22:28 Uhr, schrieb Timon Gehr <timon.gehr gmx.ch>:
 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

That is about 4 times slower than the Tango XML parser: http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

devs? Tango's XML parser should really make it into Phobos.

A new std.xml is already in the works. It'll be range-based, unlike the Tango parser. But there's no reason why Phobos shouldn't be able to have a similarly-fast XML parser. As I understand it, the primary reason that the current std.xml is slow is because it uses delegates quite a bit, but I haven't used it myself, so I don't know all of the details.

No, the issue is, and always will be, buffer access. C's FILE * just doesn't provide anything decent. It's the primary motivation for wanting to revamp it. With slicing and copy avoidance (i.e. only read into a buffer, never copy out), we can achieve the same with Phobos, but I think we have to replace C's buffering system (at least for this usage). Tango's I/O libraries use delegates and virtual functions galore. I think too big a stigma is attached to those. The difference between calling a virtual function/delegate and calling a normal function is very insignificant, the real savings for not using virtual functions is to allow inlining. However, in this case, I/O is so diverse that you *need* polymorphism. -Steve
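
A rough sketch of the "only read into a buffer, never copy out" idea, for anyone who wants something concrete. This is not the proposed std.stdio code; MemorySource, SlicingBuffer, window, consume, fill and countLines are all made-up names, and the source is an in-memory stand-in for a real device. The point is only that the consumer sees a slice of the buffer instead of receiving a copy.

```d
import std.algorithm.comparison : min;

/// Hypothetical byte source; stands in for a file or socket.
struct MemorySource
{
    const(ubyte)[] data;

    /// Copies up to dest.length bytes into dest and returns the count (0 at end of input).
    size_t read(ubyte[] dest)
    {
        immutable n = min(dest.length, data.length);
        dest[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

/// Hypothetical buffered reader that hands out views of its own buffer.
struct SlicingBuffer(Source)
{
    Source source;
    ubyte[] buffer;
    size_t start, end; // unread data lives in buffer[start .. end]

    this(Source source, size_t capacity)
    {
        assert(capacity > 0);
        this.source = source;
        buffer = new ubyte[capacity];
    }

    /// The unread data as a slice of the buffer; nothing is copied out.
    const(ubyte)[] window() const { return buffer[start .. end]; }

    /// Marks the first n bytes of the window as used.
    void consume(size_t n)
    {
        assert(n <= end - start);
        start += n;
    }

    /// Pulls more data from the source; returns the number of bytes added (0 at end of input).
    size_t fill()
    {
        if (start == end)
            start = end = 0;                   // everything consumed: reuse the buffer
        else if (end == buffer.length)
            buffer.length = buffer.length * 2; // unread data reaches the end: grow
        immutable n = source.read(buffer[end .. $]);
        end += n;
        return n;
    }
}

/// Example consumer: counts newlines by scanning each window in place.
size_t countLines(const(ubyte)[] input, size_t capacity = 4)
{
    auto buf = SlicingBuffer!MemorySource(MemorySource(input), capacity);
    size_t lines;
    while (buf.fill() > 0)
    {
        foreach (b; buf.window)
            if (b == '\n') ++lines;
        buf.consume(buf.window.length);
    }
    return lines;
}
```

The one copy left is the device-to-buffer read; a parser working on window() never needs a second one unless it wants to keep data past the next fill().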
Sep 08 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-08 13:25, Steven Schveighoffer wrote:
 On Tue, 06 Sep 2011 17:59:44 -0400, Jonathan M Davis
 A new std.xml is already in the works. It'll be range-based, unlike
 the Tango
 parser. But there's no reason why Phobos shouldn't be able to have a
 similarly-fast XML parser. As I understand it, the primary reason that
 the
 current std.xml is slow is because it uses delegates quite a bit, but I
 haven't used it myself, so I don't know all of the details.

No, the issue is, and always will be, buffer access. C's FILE * just doesn't provide anything decent. It's the primary motivation for wanting to revamp it. With slicing and copy avoidance (i.e. only read into a buffer, never copy out), we can achieve the same with Phobos, but I think we have to replace C's buffering system (at least for this usage). Tango's I/O libraries use delegates and virtual functions galore. I think too big a stigma is attached to those. The difference between calling a virtual function/delegate and calling a normal function is very insignificant, the real savings for not using virtual functions is to allow inlining. However, in this case, I/O is so diverse that you *need* polymorphism. -Steve

The Tango XML parser doesn't read from a file, it takes the input as a string. The parser isn't affected by I/O at all.

So you have to read the entire file before sending it to the parser? Isn't that a bit limited? What if I have a 50MB file, I have to read it into a continuous memory block first? -Steve
Sep 08 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 08 Sep 2011 15:38:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-08 15:22, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:
 The Tango XML parser doesn't read from a file, it takes the input as a
 string. The parser isn't affected by I/O at all.

So you have to read the entire file before sending it to the parser? Isn't that a bit limited? What if I have a 50MB file, I have to read it into a continuous memory block first? -Steve

I'm just telling how Tango currently works, not how the XML module in Phobos should work. But I guess it might be somewhat limited. 50MB isn't that big to read into memory?

Um... yeah, it is :) I have 1 GB of memory, my system starts thrashing with an app that consumes 750MB. So that's like 13 xml files read? Especially if I want to use DOM, I have to keep them around... Not to mention that the GC has to allocate a contiguous space for it. So even if I have 100MB of garbage space, maybe none of it is usable, I still have to allocate a new block. I'm just surprised there isn't at least an option for a stream-based xml parser in Tango. One thing this does though, I always assumed it was Tango's I/O that accounts for its xml superiority. I wonder, does anyone count reading the file in any of the benchmarks? I still think we can come close without having to pre-read an entire file.
 I think it would be nice to be able to do both. If you read the whole  
 file before sending it to the parser you would know it doesn't perform  
 any I/O operations.

I totally agree. I think there are ways to abstract the functionality for both memory-based and device-based i/o into one interface (part of the reason for the revamp). -Steve
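
To illustrate what "one interface" for memory-based and device-based input might mean, here is a minimal sketch. It is an assumption about the shape, not the interface from the revamp: InputStream, MemoryInput, FileInput and countBytes are invented names. The only real library call is File.rawRead from the current std.stdio.

```d
import std.algorithm.comparison : min;
import std.stdio : File;

/// Hypothetical minimal input interface; the name and method are illustrative.
interface InputStream
{
    /// Fills dest with up to dest.length bytes and returns the part actually filled.
    ubyte[] read(ubyte[] dest);
}

/// Memory-based input: serves bytes from an array that is already loaded.
final class MemoryInput : InputStream
{
    private const(ubyte)[] data;

    this(const(ubyte)[] data) { this.data = data; }

    ubyte[] read(ubyte[] dest)
    {
        immutable n = min(dest.length, data.length);
        dest[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return dest[0 .. n];
    }
}

/// Device-based input: forwards to std.stdio.File.rawRead.
final class FileInput : InputStream
{
    private File file;

    this(File file) { this.file = file; }

    ubyte[] read(ubyte[] dest)
    {
        // rawRead rejects an empty buffer, so guard against it.
        return dest.length ? file.rawRead(dest) : dest;
    }
}

/// A consumer written once against the interface works with either source.
size_t countBytes(InputStream input, char value)
{
    ubyte[256] chunk;
    size_t count;
    for (auto got = input.read(chunk[]); got.length; got = input.read(chunk[]))
        foreach (b; got)
            if (b == value) ++count;
    return count;
}
```

With this shape, countBytes(new MemoryInput(bytes), '<') and countBytes(new FileInput(File("doc.xml")), '<') run the same code; "doc.xml" is just a placeholder file name.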
Sep 08 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, September 06, 2011 22:28:05 Timon Gehr wrote:
 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...
 
 How about a D binding for http://www.xmlsoft.org/ ?
 
 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

That is about 4 times slower than the Tango XML parser:

Yeah. Thanks to array slicing, parsing is actually one of the areas where D libraries should generally be able to beat C/C++ libraries in terms of speed. That being said, creating bindings and wrappers for existing libraries is a great way to increase Phobos' functionality without reinventing the wheel in many cases. But there are definitely cases where redoing something in D would actually be much better. It all depends on what you're trying to do and what libraries already exist in C or C++. - Jonathan M Davis
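
As a small illustration of why slicing helps parsing, here is a sketch of a range that splits markup into tags and text runs. It is a toy, not the planned std.xml and not Tango's parser; MarkupPieces is a hypothetical name and it ignores comments, CDATA, attributes containing '>' and so on. What it shows is that every element handed out is a slice of the input string, so no per-token allocation or copy takes place.

```d
import std.string : indexOf;

/// Hypothetical input range over tags and text runs.
/// Each element is a slice of the original input; nothing is copied.
struct MarkupPieces
{
    private string rest;
    private string current;

    this(string input)
    {
        rest = input;
        popFront(); // load the first piece
    }

    @property bool empty() { return current.length == 0; }
    @property string front() { return current; }

    void popFront()
    {
        if (rest.length == 0)
        {
            current = null;
            return;
        }
        size_t end;
        if (rest[0] == '<')
        {
            // A tag runs up to and including the next '>'.
            immutable close = rest.indexOf('>');
            end = close < 0 ? rest.length : close + 1;
        }
        else
        {
            // A text run stops just before the next '<'.
            immutable open = rest.indexOf('<');
            end = open < 0 ? rest.length : open;
        }
        current = rest[0 .. end];
        rest = rest[end .. $];
    }
}
```

A C library in the same situation typically has to copy each token out (or write NUL terminators into the input) because a C string carries no length.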
Sep 06 2011
prev sibling next sibling parent reply Mafi <mafi example.org> writes:
 Along these same lines I'm wondering why not simply call this new module
 std.io rather than use the existing name std.stdio?
   It'd avoid the code breaking issue and help reflect that this new
 module isn't based around C's stdio FILE (at least that's what I
 gather).  Also, the code is written from scratch so that's another
 reason for why I don't think it should have the same name.  The only
 reason I can think of is if it provided significant improvements over
 the existing std.stdio without causing massive breakage.

 Regards,
 Brad Anderson

I think this is a good idea. I think std.io sounds and feels much better. Mafi
Sep 06 2011
parent reply Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:
Mafi Wrote:

 Along these same lines I'm wondering why not simply call this new module
 std.io rather than use the existing name std.stdio?
   It'd avoid the code breaking issue and help reflect that this new
 module isn't based around C's stdio FILE (at least that's what I
 gather).  Also, the code is written from scratch so that's another
 reason for why I don't think it should have the same name.  The only
 reason I can think of is if it provided significant improvements over
 the existing std.stdio without causing massive breakage.

 Regards,
 Brad Anderson

I think this is a good idea. I think std.io sounds and feels much better. Mafi

I think this is a terrific suggestion. Paul
Sep 06 2011
next sibling parent bearophile <bearophileHUGS lycos.com> writes:
Paul D. Anderson:

 I think this is a terrific suggestion.

I have suggested std.io some time ago, but someone doesn't like it: http://d.puremagic.com/issues/show_bug.cgi?id=4718 Bye, bearophile
Sep 06 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, September 06, 2011 18:48:24 bearophile wrote:
 Paul D. Anderson:
 I think this is a terrific suggestion.

I have suggested std.io some time ago, but someone doesn't like it: http://d.puremagic.com/issues/show_bug.cgi?id=4718

It's not enough of an improvement to rename std.stdio to std.io just to rename it. However, if Steven's ultimate changes are different enough that a separate module is needed for a clean migration path, and those changes do get accepted into Phobos, then naming the new module std.io makes good sense. - Jonathan M Davis
Sep 06 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 18:57:00 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Tuesday, September 06, 2011 18:48:24 bearophile wrote:
 Paul D. Anderson:
 I think this is a terrific suggestion.

I have suggested std.io some time ago, but someone doesn't like it: http://d.puremagic.com/issues/show_bug.cgi?id=4718

It's not enough of an improvement to rename std.stdio to std.io just to rename it. However, if Steven's ultimate changes are different enough that a separate module is needed for a clean migration path, and those changes do get accepted into Phobos, then naming the new module std.io makes good sense.

When I get my re-revamped stdio working, it will likely involve std.io (which I independently decided to use). It will not be a replacement for stdio, but will be used by stdio. So I'm glad others think this is a good idea. -Steve
Sep 08 2011
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 17:53, Andrej Mitrovic wrote:
 On 9/6/11, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:
 Or should we, au contraire, use "old_" for the
 old module and advise people who want to stick with the old modules to
 change their imports?

I would say that's the right way to go. It's much easier to change an import than change code. Perhaps another alternative is to use version statements. DFL uses it for deprecated features that are still in the codebase and usable. We don't want to punish people for using newer modules, we should encourage it. If they're forced to import "std.xml_new", they'll eventually have to change those imports to "std.xml" down the road when the older std.xml gets replaced by the new one. I assume people will just pick the first thing that they see, "std.xml" looks standard so they would pick that over "std.xml2".

Yeah, I hate that with Java interfaces, appending a number. Just because the good proper name is already taken and they can't break existing code. -- /Jacob Carlborg
Sep 07 2011
parent reply David Gileadi <gileadis NSPMgmail.com> writes:
On 9/7/11 12:09 AM, Jacob Carlborg wrote:
 Yeah, I hate that with Java interfaces, appending a number. Just because
 the good proper name is already taken and they can't break existing code.

I've been happy to see less of this recently. Maybe it's just my imagination, but with build systems that have package management like Maven and Gradle becoming more widely adopted developers seem to be more content to let the package manager handle the versioning. I hope this becomes the case with D too.
Sep 07 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-09-07 16:33, David Gileadi wrote:
 On 9/7/11 12:09 AM, Jacob Carlborg wrote:
 Yeah, I hate that with Java interfaces, appending a number. Just because
 the good proper name is already taken and they can't break existing code.

I've been happy to see less of this recently. Maybe it's just my imagination, but with build systems that have package management like Maven and Gradle becoming more widely adopted developers seem to be more content to let the package manager handle the versioning. I hope this becomes the case with D too.

Me too, that's why I'm working on a package manager. -- /Jacob Carlborg
Sep 07 2011
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-09-05 21:51:16 +0000, Walter Bright <newshound2 digitalmars.com> said:

 I'll again note that I know of no successful operating system or 
 programming language that goes around breaking existing code unless it 
 is really, really urgent.

Apple has been deprecating things a lot in Mac OS X. Deprecated APIs generally continue to work fine for a long time and only trigger warnings when you compile something that uses them, effectively making them inconvenient. Some deprecations that can't be reported as compilation warnings (deprecated flags, for instance) are instead logged to the console when used, only once per process though.

Sometimes APIs are truly disabled, but they are not removed. For instance, the old API for accessing the screen's pixels has become non-functional in Mac OS X 10.7 Lion. Only the new API introduced in 10.6 works now; the old one is still there, but you just get a null pointer.

Sometimes APIs disappear when moving to a new architecture. For instance, Mac OS X still supports the old Carbon APIs, but only in 32-bit mode; those were never made available to 64-bit applications.

But what works well for an operating system is not necessarily what works well for a runtime and a standard library. What Apple does is meant to keep binary compatibility. Users are not expected to have the source code of their applications at hand, nor the expertise to fix them. Apple deprecates things so the OS can move forward and introduce new features, and using deprecated APIs generally means that your app will have trouble using new features or moving to new architectures in the future.

The situation for the D standard library is a little different. If you compile D code, you do have the source code at hand. My take is that we should not remove deprecated APIs and thus break old programs unless keeping those APIs really costs too much or impedes future improvements. Showing a deprecation message and marking them as deprecated in the documentation is important to incite people to use the non-deprecated APIs, but for simple things like name changes perhaps the deprecation message during compilation could be left out, as the improvement to annoyance ratio would be quite low.
-- Michel Fortin michel.fortin michelf.com http://michelf.com/
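
For reference, keeping an old name alive while steering people to the new one can be done with D's deprecated attribute. The sketch below uses invented function names (digitValue and toDigit are hypothetical, not Phobos symbols). With current dmd defaults, using the old name prints a deprecation message but still compiles; -de turns such uses into errors and -d silences them.

```d
/// Hypothetical new name that the library wants callers to use.
int digitValue(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}

/// Old name kept as a forwarding wrapper so existing code keeps building.
/// Callers get the message below at compile time instead of a hard break.
deprecated("use digitValue instead")
int toDigit(char c)
{
    return digitValue(c);
}
```

Whether a pure rename deserves even that message is the judgment call discussed above; the attribute only makes the softer options available.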
Sep 06 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

It doesn't work that well. dmd breaks with every new OS update. The winner with binary compatibility is, far and away, Microsoft.
Sep 06 2011
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

It doesn't work that well. dmd breaks with every new OS update. The winner with binary compatibility is, far and away, Microsoft.

Maybe it would work better if you would use the proper API instead of putting __name_beg and __name_end around sections in the binary, i.e. __minfo_beg and __minfo_end. -- /Jacob Carlborg
Sep 07 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/7/2011 12:35 AM, Jacob Carlborg wrote:
 On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

It doesn't work that well. dmd breaks with every new OS update. The winner with binary compatibility is, far and away, Microsoft.

Maybe it would work better if you would use the proper API instead of putting __name_beg and __name_end around sections in the binary, i.e. __minfo_beg and __minfo_end.

Actually, I did follow documented behavior of ld. Unfortunately, ld does not follow the documented behavior.
Sep 07 2011
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-07 11:35, Walter Bright wrote:
 On 9/7/2011 12:35 AM, Jacob Carlborg wrote:
 On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

It doesn't work that well. dmd breaks with every new OS update. The winner with binary compatibility is, far and away, Microsoft.

Maybe it would work better if you would use the proper API instead of putting __name_beg and __name_end around sections in the binary, i.e. __minfo_beg and __minfo_end.

Actually, I did follow documented behavior of ld. Unfortunately, ld does not follow the documented behavior.

I don't know exactly what documentation you've read but this is what I've found:

http://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html
http://developer.apple.com/library/mac/#documentation/DeveloperTools/Reference/MachOReference/Reference/reference.html

The second link contains documentation for "getsectbyname" and similar functions for getting information and data from sections and segments. By using these functions __minfo_beg and __minfo_end become unnecessary. I have a fork of druntime that uses these functions. But at the same time I'm trying to make it work with dynamic libraries and I can't get TLS to work with dynamic libraries.

-- /Jacob Carlborg
Sep 07 2011
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-09-07 09:35:26 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 9/7/2011 12:35 AM, Jacob Carlborg wrote:
 On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

It doesn't work that well. dmd breaks with every new OS update. The winner with binary compatibility is, far and away, Microsoft.

Maybe it would work better if you would use the proper API instead of putting __name_beg and __name_end around sections in the binary, i.e. __minfo_beg and __minfo_end.

Actually, I did follow documented behavior of ld. Unfortunately, ld does not follow the documented behavior.

Indeed. Although nowhere in the documentation does it say what the linker does with empty sections, it is reasonable to assume they'd be treated like other sections (kept in the right order). But this has been proven unreliable and it turns out there are proper APIs to do what this hack was meant to do, so we should use them instead. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 07 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-09-07 14:23, Michel Fortin wrote:
 On 2011-09-07 09:35:26 +0000, Walter Bright <newshound2 digitalmars.com>
 said:

 On 9/7/2011 12:35 AM, Jacob Carlborg wrote:
 On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

It doesn't work that well. dmd breaks with every new OS update. The winner with binary compatibility is, far and away, Microsoft.

Maybe it would work better if you would use the proper API instead of putting __name_beg and __name_end around sections in the binary, i.e. __minfo_beg and __minfo_end.

Actually, I did follow documented behavior of ld. Unfortunately, ld does not follow the documented behavior.

Indeed. Although nowhere in the documentation does it say what the linker does with empty sections, it is reasonable to assume they'd be treated like other sections (kept in the right order). But this has been proven unreliable and it turns out there are proper APIs to do what this hack was meant to do, so we should use them instead.

From the ld man page, section "Layout": "All zero fill sections will appear after all non-zero fill sections in their segments." -- /Jacob Carlborg
Sep 07 2011
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-09-06 18:00:36 +0000, Walter Bright <newshound2 digitalmars.com> said:

 The winner with binary compatibility is, far and away, Microsoft.

Indeed, I think you're right that they are better than Apple. But you have to keep in mind that DMD doesn't depend on Microsoft's linker, and doesn't depend on Microsoft's C runtime. I bet you'd see more breakages otherwise. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 07 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/7/2011 5:21 AM, Michel Fortin wrote:
 On 2011-09-06 18:00:36 +0000, Walter Bright <newshound2 digitalmars.com> said:

 The winner with binary compatibility is, far and away, Microsoft.

Indeed, I think you're right that they are better than Apple. But you have to keep in mind that DMD doesn't depend on Microsoft's linker, and doesn't depend on Microsoft's C runtime. I bet you'd see more breakages otherwise.

I used to know people who worked in Microsoft's "app compat" department. The lengths they would go to to maintain support for older apps were amazing. It wasn't about just supporting documented behavior, it was supporting undocumented behavior and gross misuse of the APIs.
Sep 07 2011
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
On 9/7/2011 6:22 PM, Walter Bright wrote:
 On 9/7/2011 5:21 AM, Michel Fortin wrote:
 On 2011-09-06 18:00:36 +0000, Walter Bright
 <newshound2 digitalmars.com> said:

 The winner with binary compatibility is, far and away, Microsoft.

Indeed, I think you're right that they are better than Apple. But you have to keep in mind that DMD doesn't depend on Microsoft's linker, and doesn't depend on Microsoft's C runtime. I bet you'd see more breakages otherwise.

I used to know people who worked in Microsoft's "app compat" department. The lengths they would go to to maintain support for older apps were amazing. It wasn't about just supporting documented behavior, it was supporting undocumented behavior and gross misuse of the APIs.

Yeh, the story of Raymond Chen working on a team that disassembled SimCity and inserted extra code to make it work even though it used previously freed memory comes to mind.
Sep 07 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 9/7/2011 4:06 PM, dsimcha wrote:
 On 9/7/2011 6:22 PM, Walter Bright wrote:
 On 9/7/2011 5:21 AM, Michel Fortin wrote:
 On 2011-09-06 18:00:36 +0000, Walter Bright
 <newshound2 digitalmars.com> said:

 The winner with binary compatibility is, far and away, Microsoft.

Indeed, I think you're right that they are better than Apple. But you have to keep in mind that DMD doesn't depend on Microsoft's linker, and doesn't depend on Microsoft's C runtime. I bet you'd see more breakages otherwise.

I used to know people who worked in Microsoft's "app compat" department. The lengths they would go to to maintain support for older apps were amazing. It wasn't about just supporting documented behavior, it was supporting undocumented behavior and gross misuse of the APIs.

Yeh, the story of Raymond Chen working on a team that disassembled SimCity and inserted extra code to make it work even though it used previously freed memory comes to mind.

I believe this was a large factor in the success of Microsoft Windows.
Sep 07 2011
prev sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2011-09-07 22:22:25 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 9/7/2011 5:21 AM, Michel Fortin wrote:
 On 2011-09-06 18:00:36 +0000, Walter Bright <newshound2 digitalmars.com> said:
 
 The winner with binary compatibility is, far and away, Microsoft.

Indeed, I think you're right that they are better than Apple. But you have to keep in mind that DMD doesn't depend on Microsoft's linker, and doesn't depend on Microsoft's C runtime. I bet you'd see more breakages otherwise.

I used to know people who worked in Microsoft's "app compat" department. The lengths they would go to to maintain support for older apps were amazing. It wasn't about just supporting documented behavior, it was supporting undocumented behavior and gross misuse of the APIs.

Well, sometimes Apple does support undocumented behaviour of previous versions of their OS too. Take this prototype from time.h for instance:

clock_t clock(void) __DARWIN_ALIAS(clock);

What this __DARWIN_ALIAS macro does is force the code to use "_clock$UNIX2003" as the symbol name for the clock() function instead of the standard "_clock" symbol name. That's because the older version of the function had some bug in it (it was not conformant to some UNIX standard) but they still wanted old binaries to continue using the old version (so they don't break). Code compiled with the newer header will link with the fixed "_clock$UNIX2003" function instead of the old buggy one.

But more generally, there's sometimes a long-term cost in supporting undocumented behaviour. If you let developers use undocumented things without consequence, you send the message that they can depend on them and they'll just depend on them more, and the more software that depends on undocumented behaviours the harder it becomes to tweak the API without breaking everything.

-- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 07 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-04 04:35, Steven Schveighoffer wrote:
 On Sat, 03 Sep 2011 18:55:08 -0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 I dislike naming things with a leading "D" like "DInput". Shouldn't we
 keep code that relies on C to be put in etc.c or somewhere?

I think the names are not great. The names are somewhat based on the metamorphosis of the entire interface structure. What about BufferedInput and BufferedOutput? Michel Fortin suggested those. -Steve

These names are a lot better. -- /Jacob Carlborg
Sep 04 2011
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-04 04:48, Steven Schveighoffer wrote:
 On Sat, 03 Sep 2011 18:57:06 -0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

Because it breaks runtime swapping of I/O. For example, if you wanted to change stdin to a network socket, it's simple, just assign another InputStream. However, if stdin is a templated struct, you cannot do this at runtime, you have to decide at compile time what your stdin is. Believe it or not, this is not dissimilar to FILE *, except we have more flexibility. But I realize the implications now. I think I have to revisit this decision. We definitely need classes at the lower level, but I think we can wrap them with structs that are commonly used for RAII and for not breaking existing code. -Steve

Tango has added a new method to Object, "dispose". The method is called by the runtime when a scoped class exits a scope:

void foo ()
{
    scope f = new File;
}

When "foo" exits File.dispose will be called and it can close any file handles. I think it's quite clever.

-- /Jacob Carlborg
Sep 04 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/4/11 7:11 AM, Jacob Carlborg wrote:
 On 2011-09-04 04:48, Steven Schveighoffer wrote:
 On Sat, 03 Sep 2011 18:57:06 -0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

Because it breaks runtime swapping of I/O. For example, if you wanted to change stdin to a network socket, it's simple, just assign another InputStream. However, if stdin is a templated struct, you cannot do this at runtime, you have to decide at compile time what your stdin is. Believe it or not, this is not dissimilar to FILE *, except we have more flexibility. But I realize the implications now. I think I have to revisit this decision. We definitely need classes at the lower level, but I think we can wrap them with structs that are commonly used for RAII and for not breaking existing code. -Steve

Tango has added a new method to Object, "dispose". The method is called by the runtime when a scoped class exits a scope:

void foo ()
{
    scope f = new File;
}

When "foo" exits File.dispose will be called and it can close any file handles. I think it's quite clever.

What happens if f is aliased beyond the existence of foo()? Andrei
Sep 04 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-04 14:59, Andrei Alexandrescu wrote:
 On 9/4/11 7:11 AM, Jacob Carlborg wrote:
 Tango has added a new method to Object, "dispose". The method is called
 by the runtime when a scoped class exits a scope:

 void foo ()
 {
 scope f = new File;
 }

 When "foo" exits File.dispose will be called and it can close any file
 handles. I think it's quite clever.

What happens if f is aliased beyond the existence of foo()? Andrei

I'm not sure if this is what you mean but:

File file;

void foo ()
{
    scope f = new File;
    file = f;
}

void main ()
{
    foo;
    // file is disposed here
}

In the above example "dispose" will be called when "foo" exits. After the call to "foo" in the main function "file" will refer to an object that is disposed, i.e. an object where the "dispose" method has been called. I don't know how bad this is or if it is bad at all. It would be the same as the following code:

File file;

void foo ()
{
    auto f = new File;
    f.close;
    file = f;
}

void main ()
{
    foo;
}

-- /Jacob Carlborg
Sep 04 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/4/11 2:58 PM, Jacob Carlborg wrote:
 On 2011-09-04 14:59, Andrei Alexandrescu wrote:
 On 9/4/11 7:11 AM, Jacob Carlborg wrote:
 Tango has added a new method to Object, "dispose". The method is called
 by the runtime when a scoped class exits a scope:

 void foo ()
 {
 scope f = new File;
 }

 When "foo" exits File.dispose will be called and it can close any file
 handles. I think it's quite clever.

What happens if f is aliased beyond the existence of foo()? Andrei

I'm not sure if this is what you mean but:

File file;

void foo ()
{
    scope f = new File;
    file = f;
}

void main ()
{
    foo;
    // file is disposed here
}

In the above example "dispose" will be called when "foo" exits. After the call to "foo" in the main function "file" will refer to an object that is disposed, i.e. an object where the "dispose" method has been called. I don't know how bad this is or if it is bad at all.

Well it's not bad but a bit underwhelming. Clearly it's better than the unsafe behavior of scope, but it's nothing to write home about. The grand save it makes is replacing "scope(exit) f.dispose();" with "scope" in front of the declaration. That does systematically save some typing, but it's a feature with only local, non-modular effect, and limited abstraction power. Andrei
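To make the comparison concrete, here is a minimal sketch of the "scope(exit) f.dispose();" spelling, including the escaped reference discussed above. `Resource` and its `dispose` method are hypothetical stand-ins; this is not Tango's actual File class or its Object.dispose hook.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a file-like class; not Tango's actual API.
class Resource
{
    bool disposed;

    // Releases the underlying handle (simulated here by a flag).
    void dispose() { disposed = true; }
}

Resource escaped; // module-level reference that outlives useResource()

void useResource()
{
    auto f = new Resource;
    scope(exit) f.dispose(); // runs when this scope exits, also on exceptions
    escaped = f;             // the reference escapes the scope
    assert(!f.disposed);     // still usable inside the scope
}

void main()
{
    useResource();
    // The object is still valid GC memory, but it has already been disposed.
    assert(escaped.disposed);
    writeln("disposed after scope exit: ", escaped.disposed);
}
```

The effect stays local to `useResource`: nothing stops the reference from escaping, and callers only find out by checking the disposed state.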
Sep 04 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-09-04 21:34, Andrei Alexandrescu wrote:
 On 9/4/11 2:58 PM, Jacob Carlborg wrote:
 I'm not sure if this is what you mean but:

 File file;

 void foo ()
 {
 scope f = new File;
 file = f;
 }

 void main ()
 {
 foo;
 // file is disposed here
 }

 In the above example "dispose" will be called when "foo" exits. After
 the call to "foo" in the main function "file" will refer to an object
 that is disposed, i.e. an object where the "dispose" method has been
 called.

 I don't know how bad this is or if it is bad at all.

Well it's not bad but a bit underwhelming. Clearly it's better than the unsafe behavior of scope, but it's nothing to write home about. The grand save it makes is replacing "scope(exit) f.dispose();" with "scope" in front of the declaration. That does systematically save some typing, but it's a feature with only local, non-modular effect, and limited abstraction power. Andrei

Yeah, a variable declared as "scope" shouldn't, preferably, exit its scope. The compiler will at least complain if you try to return a scoped variable. -- /Jacob Carlborg
Sep 04 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
I dislike naming things with a leading "D" like "DInput". Shouldn't we
keep code that relies on C to be put in etc.c or somewhere?
Sep 03 2011
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Also, changing structs to classes is gonna *massively* break code
everywhere. Why inheritance instead of a predicate like isInputStream
= is(typeof(T t; t.put; t.close)), you know the drill..
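A compile-time predicate of the kind alluded to here could be sketched roughly as follows. The required members (`read` and `close`) and the `MemoryStream` and `drain` names are assumptions made for illustration; they are not taken from the proposed module.

```d
// Compile-time predicate: T qualifies if it offers read(ubyte[]) returning
// something assignable to size_t, plus close(). Member names are assumptions.
enum isInputStream(T) = is(typeof((T t) {
    ubyte[] buf;
    size_t n = t.read(buf);
    t.close();
}));

// A plain struct that satisfies the predicate without inheriting from anything.
struct MemoryStream
{
    const(ubyte)[] data;

    size_t read(ubyte[] buf)
    {
        import std.algorithm.comparison : min;
        immutable n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }

    void close() { data = null; }
}

static assert(isInputStream!MemoryStream);
static assert(!isInputStream!int);

// Generic code is instantiated per stream type, so the type is fixed at compile time.
size_t drain(S)(ref S stream) if (isInputStream!S)
{
    ubyte[4] chunk;
    size_t total;
    while (true)
    {
        immutable n = stream.read(chunk[]);
        if (n == 0) break;
        total += n;
    }
    stream.close();
    return total;
}

void main()
{
    auto s = MemoryStream(cast(const(ubyte)[]) "hello, world");
    assert(drain(s) == 12);
}
```

This keeps structs usable as streams, at the cost that every consumer has to be a template.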
Sep 03 2011
next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/3/2011 5:58 PM, Jonathan M Davis wrote:
 However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So, complaints
 about code breakage are valid, but they aren't deal breaking.

The larger the amount of code that is broken, the more gain there must be to justify it. Breaking std.stdio, which is used everywhere, this thoroughly needs a very high bar of justification.
Sep 03 2011
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrej Mitrovic (andrej.mitrovich gmail.com)'s article
 Seems to me like virtually every module in Phobos gets a complete
 rewrite sooner or later. Yikes! Afaik the upcoming ones are also
 std.xml, std.variant, maybe std.json too? (can't recall). Was there
 really so much bad code written in Phobos all along that they all
 require a rewrite?

It's really amazing how much cruft 2-3 year old D code tends to have: Workarounds for compiler bugs, workarounds for previously missing features, a generally lower standard for quality before we implemented a proper review process, etc. Heck, I've got a pull request in Github that rewrites a substantial portion of std.parallelism to take advantage of better implementations I've found for parallel foreach and amap, fix a couple bugs and get rid of tons of cruft, and this module's only been in Phobos a few months. These changes are purely under the hood, though, and there should be zero code breakage.
Sep 03 2011
parent Walter Bright <newshound2 digitalmars.com> writes:
On 9/3/2011 7:27 PM, dsimcha wrote:
 These changes are purely under
 the hood, though, and there should be zero code breakage.

Those are the great kind of changes, and it's also nice in that it means the API was done reasonably right.
Sep 03 2011
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/3/11 9:53 PM, Walter Bright wrote:
 On 9/3/2011 5:58 PM, Jonathan M Davis wrote:
 However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So,
 complaints
 about code breakage are valid, but they aren't deal breaking.

The larger the amount of code that is broken, the more gain there must be to justify it. Breaking std.stdio, which is used everywhere, this thoroughly needs a very high bar of justification.

I agree. I'm hoping the new stuff could build on top of std.stdio. Andrei
Sep 03 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/3/11 11:33 PM, Steven Schveighoffer wrote:
 We have to break something in std.stdio, because it's fixated on FILE *.
 We need something that allows FILE * to play the game, but is focused on
 a D-based solution. Otherwise, we have no room for improvement.

I'm not 100% convinced of that. We can achieve a good deal of improvement by resorting to platform-specific code. Clearly that's not the best way to go but it's not difficult and it does have its merit. Overall I think the design of std.stdio should be followed:

1. User opens a File (or whatever), which is a struct. The struct uses RAII.

2. Using the struct you can directly call primitives to read and write stuff.

3. You can also decide you want a polymorphic stream out of it, and you get to decide the parameters of the stream (buffering, chunking, synchronicity and whatnot). byChunk and byLine are good examples, although they aren't polymorphic. Once you have such a stream you're in polyland so you get to use all of its goodies (look ma no templates etc).

4. Once all copies of the struct are destroyed, all streams derived from it are automatically closed and will issue errors when used.

That's pretty much it! It's a simple design that does all we need.

Andrei
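Steps 1 and 2, plus the byLine part of step 3, can be sketched with the existing std.stdio. The file name and the `roundTrip` helper are made up for this example; the polymorphic stream of step 3 does not exist in std.stdio and is not shown.

```d
import std.file : remove, tempDir;
import std.path : buildPath;
import std.stdio : File;

// Writes two lines through a File struct, then reads them back with byLine.
string[] roundTrip(string path)
{
    {
        auto output = File(path, "w"); // File is a struct with RAII semantics
        output.writeln("first");       // primitives are called directly on it
        output.writeln("second");
    } // the last copy of output dies here, so the file is flushed and closed

    string[] lines;
    auto input = File(path, "r");
    foreach (line; input.byLine())     // byLine reuses its buffer between iterations
        lines ~= line.idup;            // so each line is copied before being kept
    return lines;
}

void main()
{
    string path = buildPath(tempDir(), "stdio_design_sketch.txt");
    scope(exit) remove(path);
    assert(roundTrip(path) == ["first", "second"]);
}
```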
Sep 03 2011
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/3/11 10:02 PM, Andrej Mitrovic wrote:
 Seems to me like virtually every module in Phobos gets a complete
 rewrite sooner or later. Yikes! Afaik the upcoming ones are also
 std.xml, std.variant, maybe std.json too? (can't recall). Was there
 really so much bad code written in Phobos all along that they all
 require a rewrite?

It's not that bad. First, it's understandable that now there are considerably more contributors and it's a bit easier tinkering with existing stuff than coming up with all new stuff. Second, historically we're at an all-time high of talent involved in D. I'm sure it will go up much more, but previously we've had a more accepting attitude to new functionality at the cost of scrutiny (e.g. std.xml and std.json, both written by episodic contributors). (I really regret having had that attitude, it hurt us.) So now that there are so many eyeballs focused on the code, and not just any eyeballs but eyeballs connected to good brains, there is pressure building up.

There are quite a few pieces in Phobos that are withstanding scrutiny quite well: getopt, algorithm, variant (which can be, I think, safely extended to new great functionality), range, conv, random, and more. There are, unfortunately, others that didn't start off on the right foot and right now are somewhat of an eyesore. I trust we will figure out what to do about each on a case-by-case basis, though I agree with Walter that we should balance the breakage cost with correspondingly high rewards in terms of functionality improvements.

Andrei
Sep 03 2011
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 I'm sure it will go up much more, but previously we've had a more
 accepting attitude to new functionality at the cost of scrutiny (e.g.
 std.xml and std.json, both written by episodic contributors). (I really
 regret having had that attitude, it hurt us.) So now that there are so
 many eyeballs focused on the code, and not just any eyeballs but
 eyeballs connected to good brains, there is pressure building up.
 There are quite a few pieces in Phobos that are withstanding scrutiny
 quite well: getopt, algorithm, variant (which can be, I think, safely
 extended to new great functionality), range, conv, random, and more.
 There are, unfortunately, others that didn't start off the right foot
 and right now are somewhat of an eyesore. I trust we will figure what to
 do about each on a by-case basis, though I agree with Walter that we
 should balance the breakage cost with correspondingly high rewards in
 terms of functionality improvements.
 Andrei

Yes, the quality standard has gone up massively. When I was prepping std.parallelism for review a few months ago, I generally used the existing Phobos documentation as a guideline for what std.parallelism's docs should resemble. Andrei, of course, ripped the documentation apart. In hindsight it led to massive improvements and was for the better. It certainly set the tone for clear, precise documentation in the future and the same high standards were applied to std.path and std.curl. However, at the time I actually thought he just hated std.parallelism at a gut level and was looking for any excuse to keep it out of Phobos. (I apologize for having thought this and therefore taken a much more adversarial view of the review process than I should have.)
Sep 03 2011
parent reply Walter Bright <newshound2 digitalmars.com> writes:
On 9/3/2011 8:22 PM, dsimcha wrote:
 However, at the time I actually thought he just hated
 std.parallelism at a gut level and was looking for any excuse to keep it out of
 Phobos.  (I apologize for having thought this and therefore taken a much more
 adversarial view of the review process than I should have.)

I can vouch for Andrei's reviews appearing to be personal, but they are not. He's mercilessly ripped up some of my stuff, but I had to agree he was right and the resulting improvement was well worth it. I don't much care for blowing sunshine, flattery and false praise. Andrei sets a high bar, I'm glad he does, and we'll all be better off for it.
Sep 03 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/4/11 12:11 AM, Walter Bright wrote:
 On 9/3/2011 8:22 PM, dsimcha wrote:
 However, at the time I actually thought he just hated
 std.parallelism at a gut level and was looking for any excuse to keep
 it out of
 Phobos. (I apologize for having thought this and therefore taken a
 much more
 adversarial view of the review process than I should have.)

I can vouch for Andrei's reviews appearing to be personal, but they are not. He's mercilessly ripped up some of my stuff, but I had to agree he was right and the resulting improvement was well worth it. I don't much care for blowing sunshine, flattery and false praise. Andrei sets a high bar, I'm glad he does, and we'll all be better off for it.

This is a bit of a surprise for me because I fancy (fancied...) to see myself as this emotionless, rational reviewer. Thank you all for putting up with me. Andrei
Sep 04 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, September 04, 2011 03:22:21 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 
 I'm sure it will go up much more, but previously we've had a more
 accepting attitude to new functionality at the cost of scrutiny (e.g.
 std.xml and std.json, both written by episodic contributors). (I really
 regret having had that attitude, it hurt us.) So now that there are so
 many eyeballs focused on the code, and not just any eyeballs but
 eyeballs connected to good brains, there is pressure building up.
 There are quite a few pieces in Phobos that are withstanding scrutiny
 quite well: getopt, algorithm, variant (which can be, I think, safely
 extended to new great functionality), range, conv, random, and more.
 There are, unfortunately, others that didn't start off the right foot
 and right now are somewhat of an eyesore. I trust we will figure what to
 do about each on a by-case basis, though I agree with Walter that we
 should balance the breakage cost with correspondingly high rewards in
 terms of functionality improvements.
 Andrei

Yes, the quality standard has gone up massively. When I was prepping std.parallelism for review a few months ago, I generally used the existing Phobos documentation as a guideline for what std.parallelism's docs should resemble. Andrei, of course, ripped the documentation apart. In hindsight it led to massive improvements and was for the better. It certainly set the tone for clear, precise documentation in the future and the same high standards were applied to std.path and std.curl. However, at the time I actually thought he just hated std.parallelism at a gut level and was looking for any excuse to keep it out of Phobos. (I apologize for having thought this and therefore taken a much more adversarial view of the review process than I should have.)

std.datetime is far better for having gone through multiple reviews as well. The resulting code isn't perfect, and reviews don't always catch everything, but thorough reviews really help improve the quality of code. Even just having other contributors look over pull requests tends to find stuff that can and should be improved. So, while there will likely always be some issues with code that makes it into Phobos, the overall code quality is definitely improving. - Jonathan M Davis
Sep 03 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, September 04, 2011 04:02:17 Andrej Mitrovic wrote:
 Seems to me like virtually every module in Phobos gets a complete
 rewrite sooner or later. Yikes! Afaik the upcoming ones are also
 std.xml, std.variant, maybe std.json too? (can't recall). Was there
 really so much bad code written in Phobos all along that they all
 require a rewrite?

Most of it's older stuff which has been around since D1, I believe - either that or it came fairly early in D2. - Jonathan M Davis
Sep 03 2011
prev sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Jonathan M Davis (jmdavisProg gmx.com)'s article
 Any overhaul of existing functionality needs to improve on existing
 functionality. Changes just to change aren't valuable. So, changes should
 generally avoid breaking backwards compatibility unless we gain something
 from it. So, as long as these changes are an overall improvement, then we'll
 just have to deal with the code breakage. However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So, complaints
 about code breakage are valid, but they aren't deal breaking.
 - Jonathan M Davis

I mostly agree with what you said, except that this proposal breaks a frequently used standard library module severely and without a clear gradual migration path.
Sep 03 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 04.09.2011, 00:57 Uhr, schrieb Andrej Mitrovic  
<andrej.mitrovich gmail.com>:

 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

Wasn't this overhaul _meant_ to break existing code by offering a new API? Still that's a serious issue of course, but not too surprising. I'm ambivalent on the inheritance vs predicate debate. Interfaces are the way it is meant to be done and actually ensure correct types. Predicates work with structs as well. I don't know if this would be important.
Sep 03 2011
prev sibling next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 17:20:53 -0400, dsimcha <dsimcha yahoo.com> wrote:

 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s  
 article
 Hello,
 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.
 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.
 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html
 Thanks,
 Andrei

After a quick look, I have two concerns: 1. File is a class, not a struct. This precludes using reference counting as the current std.stdio.File does, meaning you have to close all your Files manually. I loved the reference counting semantics, especially the last few releases since most of the relevant compiler bugs have been fixed.

As long as a class can contain a File as a member, this argument makes no sense to me. In other words, it's impossible to remove the GC from the File destructor/refcounting system. I think what may end up happening, in terms of File being a scoped entity is: File becomes a struct. File's sole member is a class that implements InputStream, OutputStream, and ref counting. This would be roughly equivalent to today's File. Except it's not buffered. I think the names need work, and you are very right to point out that we should make existing code work as much as possible.
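The struct-around-a-class arrangement described above might look roughly like this. `FileImpl` and `FileHandle` are hypothetical names, the "file" is just a flag, and thread safety is ignored; it is only meant to show where the reference count lives.

```d
// Hypothetical low-level class; in the real design it would implement
// InputStream and OutputStream and own the OS handle.
class FileImpl
{
    int refs = 1;      // the count lives in the class, shared by all struct copies
    bool open = true;

    void close() { open = false; }
}

// Hypothetical struct wrapper giving value-style RAII over the class.
struct FileHandle
{
    private FileImpl impl;

    this(FileImpl impl) { this.impl = impl; }

    // Postblit: every copy of the struct adds a reference.
    this(this) { if (impl) ++impl.refs; }

    // The last copy to go away closes the file. If a FileHandle is itself a
    // member of a GC-collected class, this runs at collection time instead.
    ~this()
    {
        if (impl && --impl.refs == 0)
            impl.close();
    }

    bool isOpen() const { return impl !is null && impl.open; }
}

void main()
{
    auto impl = new FileImpl;
    {
        auto a = FileHandle(impl);
        {
            auto b = a;                 // copy: two references
            assert(impl.refs == 2);
        }                               // b destroyed: one reference left
        assert(impl.refs == 1 && a.isOpen);
    }                                   // a destroyed: count hits zero, file closed
    assert(!impl.open);
}
```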
 2.  File(someFileName, someMode) needs to work.  Not supporting this  
 method of
 instantiating a File object would break way too much code.

I can change File.open to File.opCall, that will fix that. -Steve
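For reference, a static opCall on a class lets the call-style syntax compile without `new`. The sketch below uses a made-up `SketchFile` class that only records its arguments; it does not open anything and is not the proposed File.

```d
// Hypothetical class; it stores the arguments instead of opening a file.
class SketchFile
{
    string name;
    string mode;

    private this(string name, string mode)
    {
        this.name = name;
        this.mode = mode;
    }

    // Named factory, analogous to the File.open mentioned above.
    static SketchFile open(string name, string mode)
    {
        return new SketchFile(name, mode);
    }

    // Makes SketchFile(name, mode) work, matching existing call sites.
    static SketchFile opCall(string name, string mode)
    {
        return open(name, mode);
    }
}

void main()
{
    auto f = SketchFile("data.txt", "r"); // resolved through static opCall
    assert(f.name == "data.txt" && f.mode == "r");
}
```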
Sep 03 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/3/11 10:11 PM, Steven Schveighoffer wrote:
 On Sat, 03 Sep 2011 17:20:53 -0400, dsimcha <dsimcha yahoo.com> wrote:

 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 article
 Hello,
 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.
 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.
 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html
 Thanks,
 Andrei

After a quick look, I have two concerns: 1. File is a class, not a struct. This precludes using reference counting as the current std.stdio.File does, meaning you have to close all your Files manually. I loved the reference counting semantics, especially the last few releases since most of the relevant compiler bugs have been fixed.

As long as a class can contain a File as a member, this argument makes no sense to me. In other words, it's impossible to remove the GC from the File destructor/refcounting system.

The meaning of the argument is that just because there is the possibility of a File leaking, we shouldn't increase the likelihood of such a leak. Andrei
Sep 03 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 21:23:26 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 9/3/2011 3:53 PM, dsimcha wrote:
 Agreed, but in the big picture this overhaul still breaks way too much  
 code
 without either a clear migration path or a clear argument about why  
 such extensive
 breakage is necessary.  The part about File(someFileName, someMode) is  
 just the
 first thing I noticed.

[rant] I agree. I agree that std.stream should be replaced, but I have a lot of misgivings about replacing std.stdio. I do not want to rewrite every darn D program I've ever written. I think it is a bad idea to break everyone else's D program. Everything in dsource will break in non-trivial ways. I don't think we can afford this. I do not know of any successful system or language that breaks user code with such aplomb as D does. Not even C++ dares to break that Piece Of S*** that everyone knows iostreams is. I can compile and run unix C code from 30 years ago on Linux with no changes at all. Same with DOS code. There needs to be huge improvement to justify such breakage. [I also don't like it that all my code that uses std.path is now broken.] I would prefer to see all the energy that is going into refactoring existing, working modules go into designing new, not existing, modules that there's a crying need for. [/rant]

Please, leave all pitchforks and torches at rest for the moment :) I want to stress, this is *NOT* a proposal for inclusion or generating a pull request tomorrow. It's a very very early version, almost a proof of concept, to show *why* we need to change things. Most of the library is up for debate. I agree it needs to be more compatible with current code. In hindsight, I probably should have said no when Andrei asked to post this on the NG, and done it myself when I could stress the state of it.

The two most important things are:

1. the interface additions, in particular the readUntil portion (which I think provides a very powerful interface for parsing systems).

2. the performance. It's much better than current stdio. Aren't people continuously complaining at how slow i/o is in Phobos compared to other libraries?
 Enough ranting for now, as for the proposed std.stdio,

 1. It does look fairly straightforward, but:

 2. There is only one example. Have any commonly done programming tasks  
 been tried out with it to see how they work?

My main testing has been for:

1. utf input/output correctness of all formats
2. implementing readf/writef
3. testing performance.

I have not written any "real world" tests. Probably the most interesting tests I've written are reading a UTF-X file and writing the data to a UTF-Y file (where X and Y are one of UTF-8, UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE).
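The in-memory part of such a UTF-X to UTF-Y round trip can be sketched with the existing std.utf functions. This is not the test code referred to above, and it leaves out file I/O and the little-endian/big-endian byte-order handling.

```d
import std.utf : toUTF8, toUTF16, toUTF32;

void main()
{
    // Ten code points; the two non-ASCII ones are written as escapes.
    string  utf8  = "na\u00EFve caf\u00E9";
    wstring utf16 = toUTF16(utf8);
    dstring utf32 = toUTF32(utf8);

    assert(utf8.length  == 12); // code units: two characters need 2 bytes each
    assert(utf16.length == 10); // every character fits in one 16-bit unit
    assert(utf32.length == 10); // one 32-bit unit per code point

    // Converting back must reproduce the original bytes exactly.
    assert(toUTF8(utf16) == utf8);
    assert(toUTF8(utf32) == utf8);
}
```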
 3. There is no indication of how it interacts with C stdio. A primary  
 goal of std.stdio was interoperability with C stdio.

useCStdio();
 4. There are no benchmarks. The current std.stdio was designed/written  
 in parallel with some benchmarks Andrei and others cooked up, as a  
 primary goal was performance.

I can include these.
 5. flushCheck - flushing should be done based on the file type. tty's  
 should be \n flushed, files when the buffer is full. I question the  
 performance of using a delegate to check for flushing. How often will it  
 be called?

Once per write to the buffer. Data is only checked once (the delegate is never given the same data to check again). If you want, I can look at adding a means to avoid using a delegate when the trigger is a single character. And TextInput/TextOutput auto detect whether a device is a tty, and install the right flushcheck function if necessary.
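A rough sketch of a delegate-driven flush check follows. Everything in it is hypothetical: the `BufferedSink` class, the delegate signature, and the `flushed` array standing in for the device are assumptions, not the proposed API. It only illustrates the "called once per write, on new data only" behaviour.

```d
import std.functional : toDelegate;

// Hypothetical buffered output; names and signatures are assumptions.
class BufferedSink
{
    // Returns how many leading bytes of the new data must be flushed
    // (0 means keep buffering).
    alias FlushCheck = size_t delegate(const(ubyte)[] newData);

    private ubyte[] buffer;
    private FlushCheck flushCheck;
    string[] flushed; // stands in for the underlying device

    this(FlushCheck check) { flushCheck = check; }

    void write(const(ubyte)[] data)
    {
        immutable upTo = flushCheck(data); // once per write, never on old data
        buffer ~= data;
        if (upTo > 0)
        {
            // Flush everything buffered earlier plus the triggering part.
            immutable cut = buffer.length - data.length + upTo;
            flushed ~= cast(string) buffer[0 .. cut].idup;
            buffer = buffer[cut .. $].dup;
        }
    }

    void write(string s) { write(cast(const(ubyte)[]) s); }
}

// Line-oriented check, as one would want for a tty: flush through the last newline.
size_t flushOnNewline(const(ubyte)[] data)
{
    foreach_reverse (i, b; data)
        if (b == '\n') return i + 1;
    return 0;
}

void main()
{
    auto sink = new BufferedSink(toDelegate(&flushOnNewline));
    sink.write("abc");        // no newline: stays buffered
    assert(sink.flushed.length == 0);
    sink.write("def\ngh");    // newline seen: flush through it
    assert(sink.flushed == ["abcdef\n"]);
}
```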
 6. There is no provision for multithreaded writing, i.e. what happens  
 when two threads write to stdout. Ideally, there should be a way to  
 'lock' the stream to oneself, in order to appropriately interleave the  
 output.

Again, I wish I had not told Andrei to post :( Multithreaded is not supported, but will be. When that is ready, a locking mechanism (and hopefully an auto-unlock mechanism) will be provided.
 7. I see nothing for 'raw' character by character input.

The interface is geared to read by processing the buffer, not one character at a time. Given access to the buffer, you can process one character at a time if you want. See InputRange in TextInput to see how raw character-by-character input can be done. That being said, I think I need to add a peek function.
 8. I see nothing for determining if a char is available on the input.  
 How would one implement "press any key to continue"?

I need more information. I would probably implement this as a read(ubyte[1]), so I don't see why it can't be that way. -Steve
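With today's std.stdio, a one-byte read of this kind can be written with File.rawRead. The sketch reads from a temporary file rather than the console, and the `readOneByte` helper is made up for the example. It does not answer the "is a char available" question: on a line-buffered terminal the read would still block until Enter is pressed.

```d
import std.file : remove, tempDir, write;
import std.path : buildPath;
import std.stdio : File;

// Reads at most one byte; an empty result means end of input.
ubyte[] readOneByte(ref File f)
{
    ubyte[1] buf;
    return f.rawRead(buf[]).dup; // rawRead returns the slice it actually filled
}

void main()
{
    string path = buildPath(tempDir(), "one_byte_sketch.bin");
    write(path, "k");
    scope(exit) remove(path);

    auto f = File(path, "rb");
    assert(readOneByte(f) == [cast(ubyte) 'k']);
    assert(readOneByte(f).length == 0); // nothing left to read
}
```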
Sep 03 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 18:55:08 -0400, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 I dislike naming things with a leading "D" like "DInput". Shouldn't we
 keep code that relies on C to be put in etc.c or somewhere?

I think the names are not great. The names are somewhat based on the metamorphosis of the entire interface structure. What about BufferedInput and BufferedOutput? Michel Fortin suggested those. -Steve
Sep 03 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Ah, reading your post I see this is just a start of the overhaul. I
assumed this was already getting ready for a review. Names can be
fixed eventually. :)
Sep 03 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 18:57:06 -0400, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

Because it breaks runtime swapping of I/O. For example, if you wanted to change stdin to a network socket, it's simple, just assign another InputStream. However, if stdin is a templated struct, you cannot do this at runtime, you have to decide at compile time what your stdin is. Believe it or not, this is not dissimilar to FILE *, except we have more flexibility. But I realize the implications now. I think I have to revisit this decision. We definitely need classes at the lower level, but I think we can wrap them with structs that are commonly used for RAII and for not breaking existing code. -Steve
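The runtime swap described here can be sketched with an interface and a variable that is reassigned while the program runs. Only the name InputStream comes from the thread; `ArrayInput`, the `input` variable, and `readAll` are hypothetical, and an in-memory string stands in for both the file and the socket.

```d
// Minimal interface; the real InputStream would have more members.
interface InputStream
{
    size_t read(ubyte[] buf);
}

// Hypothetical in-memory implementation standing in for a file or a socket.
class ArrayInput : InputStream
{
    private const(ubyte)[] data;

    this(string s) { data = cast(const(ubyte)[]) s; }

    size_t read(ubyte[] buf)
    {
        import std.algorithm.comparison : min;
        immutable n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

InputStream input; // hypothetical global playing the role of stdin

// Consumer code only knows the interface, so it needs no template parameter.
string readAll()
{
    ubyte[8] chunk;
    char[] result;
    while (true)
    {
        immutable n = input.read(chunk[]);
        if (n == 0) break;
        result ~= cast(char[]) chunk[0 .. n];
    }
    return result.idup;
}

void main()
{
    input = new ArrayInput("from a file");
    assert(readAll() == "from a file");

    input = new ArrayInput("from a socket"); // swapped at runtime
    assert(readAll() == "from a socket");
}
```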
Sep 03 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 04 Sep 2011 00:30:33 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 9/3/2011 7:33 PM, Steven Schveighoffer wrote:
 Please, leave all pitchforks and torches at rest for the moment :)

I know what I wrote was a bit brutal, but this needs to be settled before we've gone so far down that path that turning away then would be horribly unfair to you.

I appreciate feedback, but I think there was a misunderstanding of what this "review" was for. I think people thought I was proposing this as a ready-to-pull replacement for std.stdio. That is not the case. It's very much up in the air and under development. I just wanted to show people some progress and get feedback (which I've gotten a lot of!) The next version of it will look drastically different based on what's been said here. But it will still contain some of the basic designs.

In essence, I am very *early* in the path, and I *want* people to turn me in the right direction before I go too far the other way. This is the first version that *actually works*, which is why I wanted to share it :)

No one has really commented on the new interfaces. It's my fault, for letting Andrei post the documentation as the main subject, and also not fully documenting the module. I have no excuses, so I'll just have to take this as a "ok, we'll try this again later". But I did get some very good information, and know I have a lot of work to do.
 I think what you need is a marketing spiel to sell the concept of what  
 you're trying to do. It should include:

 1. The benefits over the current std.stdio
 2. Why the new API is needed to achieve those benefits
 3. A migration plan for existing std.stdio code

OK
 Just being more flexible isn't enough, it has to be more flexible in a  
 way that matters, i.e. a real example showing how kickass it is compared  
 to the current way.

I'll post some numbers.
 2. the performance. It's much better than current stdio. Aren't people
 continuously complaining at how slow i/o is in Phobos compared to other  
 libraries?

Why is it faster? I.e. is a wholly new interface required to make it faster, or does it just need to be better under the hood?

Yes, a new interface is required to make it faster. You need direct buffer access, and the current stdio does not provide that. That being said, I think this proposal goes nowhere unless it's a mostly drop-in replacement to the existing std.stdio. So I have to find a way to make it fit.
 3. There is no indication of how it interacts with C stdio. A primary  
 goal of
 std.stdio was interoperability with C stdio.

useCStdio();

For some reason that just seems like a giant wart with a hair sticking out of it. Why not just use the C stdio buffers?

1. Because most people don't care. I never ever use printf, except when I was testing my new stdio stuff, and I needed something that worked :) In my opinion, if you are using this line, you are doing something weird, legacy related, or you are debugging something.
2. Because C does not provide enough access to the buffers. With my library, you can read an entire xml file, for instance, and never copy any data out of the buffer. C never gives direct access to the buffers, and while we can hack our way into it, its interface is still kludgy. If I wanted to implement, for example, readUntil using C buffers, I'd have to reimplement almost all of FILE *'s functions so I could do it properly. And even then, I'd still have to sacrifice some things -- C is still going to want to use its way of doing things, and I'd have to respect that.

If you read my response to the first post in this thread, you can see my rationale.
 5. flushCheck - flushing should be done based on the file type. tty's  
 should
 be \n flushed, files when the buffer is full. I question the  
 performance of
 using a delegate to check for flushing. How often will it be called?

Once per write to the buffer. Data is only checked once (the delegate is never given the same data to check again). If you want, I can look at adding a means to avoid using a delegate when the trigger is a single character. And TextInput/TextOutput auto detect whether a device is a tty, and install the right flushcheck function if necessary.

Flushing once per write is wrong - consider the user who does a zillion putc's. I don't see a purpose to anything beyond the C stdio ones - per character, per \n, and per buffer.

A *check* to see if it should be flushed is done once per write. Not a flush. A flush is only done if the check says to (or the buffer is full). I think C's FILE * checks once per write as well, no? I also have thought of ways to optimize this so it's, say, once per call to writef.
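The distinction between checking and flushing can be sketched roughly as follows. Only the name `flushCheck` comes from the thread; the `BufferedOutput` struct, its fields, and the capacity value are assumptions made up for this example, and the flush itself is simulated by a counter.

```d
import std.stdio : writeln;

// Hypothetical buffered sink; not the proposed module's actual type.
struct BufferedOutput
{
    char[] buffer;                            // pending, unflushed data
    size_t capacity = 64;                     // flush once this much is pending
    bool delegate(const(char)[]) flushCheck;  // asked once per write, sees only new data
    size_t flushes;                           // number of flushes, for demonstration

    void write(const(char)[] data)
    {
        buffer ~= data;
        // One check per write call; previously checked data is never re-scanned.
        if (buffer.length >= capacity || (flushCheck !is null && flushCheck(data)))
            flush();
    }

    void flush()
    {
        // A real implementation would hand the buffer to the OS here.
        buffer.length = 0;
        ++flushes;
    }
}

void main()
{
    BufferedOutput tty;

    // Line-buffered policy, as would suit a terminal.
    tty.flushCheck = delegate bool(const(char)[] d) {
        foreach (c; d)
            if (c == '\n') return true;
        return false;
    };

    foreach (i; 0 .. 10)
        tty.write("x");          // ten writes, ten checks, no flush
    writeln(tty.flushes);        // prints 0

    tty.write("done\n");         // the newline triggers a single flush
    writeln(tty.flushes);        // prints 1
}
```

In this sketch a zillion single-character writes cost a zillion cheap checks, but only as many flushes as there are newlines or full buffers.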
 7. I see nothing for 'raw' character by character input.

The interface is geared to read by processing the buffer, not one character at a time. Given access to the buffer, you can process one character at a time if you want. See InputRange in TextInput to see how raw character-by-character input can be done.

Raw mode is more than that - you have to set the OS to raw mode, otherwise it won't give you any characters until a \n is typed.

That is not an OS issue, that is a terminal issue. Note that the current std.stdio does not provide this functionality. The only raw functions are rawRead and rawWrite, which set binary mode. On Windows, all binary mode does is enable or disable translation of \r\n to \n. They will not do what you are asking.
 8. I see nothing for determining if a char is available on the input.  
 How
 would one implement "press any key to continue"?

I need more information. I would probably implement this as a read(ubyte[1]), so I don't see why it can't be that way.

There's more to it than that. Try writing it in C and you'll see what I mean. (You have to set the io to "raw" mode, turn "echo" off, etc.)

File provides access to the OS handle, which can be used to set terminal settings. It might be good to add these settings as member functions of File. -Steve
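For reference, "press any key to continue" on a POSIX terminal can be sketched with the termios bindings that already ship in druntime (`core.sys.posix.termios`). This is a rough illustration under that assumption, not part of the proposed module; the helper names `makeRaw` and `waitForKey` are made up here, and Windows would need a different console API.

```d
import std.stdio : stdout, write, writeln;

version (Posix)
{
    import core.sys.posix.termios;
    import core.sys.posix.unistd : read, STDIN_FILENO;

    // Pure helper: turn off line buffering and echo, return after one byte.
    termios makeRaw(termios t)
    {
        t.c_lflag &= ~(ICANON | ECHO);
        t.c_cc[VMIN] = 1;   // read() returns as soon as one byte is available
        t.c_cc[VTIME] = 0;  // no timeout
        return t;
    }

    // Waits for a single key press. Returns false if stdin is not a
    // terminal or the read fails.
    bool waitForKey()
    {
        termios saved;
        if (tcgetattr(STDIN_FILENO, &saved) != 0)
            return false;
        // Always restore the original terminal settings.
        scope (exit) tcsetattr(STDIN_FILENO, TCSANOW, &saved);

        termios raw = makeRaw(saved);
        if (tcsetattr(STDIN_FILENO, TCSANOW, &raw) != 0)
            return false;

        ubyte[1] key;
        return read(STDIN_FILENO, key.ptr, 1) == 1;
    }
}
else
{
    // Placeholder for non-POSIX systems.
    bool waitForKey() { return false; }
}

void main()
{
    write("Press any key to continue...");
    stdout.flush();
    waitForKey();
    writeln();
}
```

Wrapping something like this in member functions of File, as suggested above, would hide the handle and the restore step from the caller.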
Sep 03 2011
prev sibling next sibling parent "Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:
On Sat, 03 Sep 2011 18:23:26 -0700, Walter Bright wrote:

 [I also don't like it that all my code that uses std.path is now
 broken.]

What do you mean by "broken"? That it does not compile or work as expected, or that it spits out a bunch of annoying deprecation messages? If it is either of the former, that was not supposed to happen. The new std.path still contains all the functions of the old std.path and should therefore be backwards compatible. If the new std.path breaks existing code, I need to fix it before it is released. Please let me know what problems you are experiencing. -Lars
Sep 04 2011
prev sibling next sibling parent reply Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/5/11, Walter Bright <newshound2 digitalmars.com> wrote:
 It prints out all the deprecation message. It means I'll have to go edit
 existing, working code to change the names.

It would really help out if we had some sort of semi-automated script that can do at least partial translation of code that uses old phobos functions to new ones. Maybe this wouldn't work 100% but at least it would help out. I'm thinking of something similar to what Python 2to3 does. I know for sure I could use this, so far I've had to fix the DWinSamples for every DMD/Phobos release.
Sep 05 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Andrej Mitrovic:

 It would really help out if we had some sort of semi-automated script
 that can do at least partial translation of code that uses old phobos
 functions to new ones. Maybe this wouldn't work 100% but at least it
 would help out. I'm thinking of something similar to what Python 2to3
 does.

You mean like the standard tool gofix: http://blog.golang.org/2011/04/introducing-gofix.html Bye, bearophile
Sep 05 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 06.09.2011, 00:05 Uhr, schrieb Andrej Mitrovic  
<andrej.mitrovich gmail.com>:

 On 9/5/11, Walter Bright <newshound2 digitalmars.com> wrote:
 It prints out all the deprecation message. It means I'll have to go edit
 existing, working code to change the names.

It would really help out if we had some sort of semi-automated script that can do at least partial translation of code that uses old phobos functions to new ones. Maybe this wouldn't work 100% but at least it would help out. I'm thinking of something similar to what Python 2to3 does. I know for sure I could use this, so far I've had to fix the DWinSamples for every DMD/Phobos release.

It would help to have a lexical analyzer of the kind that allows for the refactorings in for example Eclipse for Java. Without clear identification of symbols it is impossible to write such a script for every new D release. And there were other changes in the past where this would have been handy, although I have not been writing any D code lately to feel the breakage. I think of globals (__gshared). I'm on the extreme with my urge to rewrite things if they give me the slightest feeling that they could be more elegant or effective and I've thought of such a script as well that could be distributed with every new D version while there are still breaking changes to the language. Well, you cannot write a script without a solid foundation that can reliably identify and refactor symbols. But this doesn't work well for code you copy from blogs. You would have to know what D version it was written with and run the matching chain of conversion scripts. Anyway this feels like some crazy idea that can't make it into existence. Still I have that picture of downloading a new D release and running the obligatory dmdup script to replace deprecated functionality or names with the new versions. Sure at some point 'this is it' and features of D and Phobos become set in stone. "Hello world!" console output is one of those examples that many will try first. I would understand if it breaks between major versions of a language, but not from one revision to the next. YMMV :)
Sep 05 2011
prev sibling next sibling parent reply Josh Simmons <simmons.44 gmail.com> writes:
On Tue, Sep 6, 2011 at 12:48 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 On 09/05/2011 04:51 PM, Walter Bright wrote:
 If the new std.path breaks existing code, I need to fix it before it is
 released. Please let me know what problems you are experiencing.

It prints out all the deprecation message. It means I'll have to go edit existing, working code to change the names.

I think it means it gives you time, on your own schedule with generous deadlines, to make the changes to your code.
 I know that the majority wants the name changes. I know the deprecation
 system gives people plenty of time to edit their code.

 But I think the cost of breaking existing code is much higher than many
 realize, and a lot of that cost will be hidden. It'll come in the form
 of people deciding not to use D because it is "not stable". It'll come
 in the form of invalidating existing libraries and modules unless
 someone is regularly maintaining them. It'll come in the form of
 invalidating the mass of books, articles, blog postings, and
 presentations about D, and those will never get updated. People will
 type in the code examples, they will fail to compile, and they'll get
 turned off about D.

 I'll again note that I know of no successful operating system or
 programming language that goes around breaking existing code unless it
 is really, really urgent.

 Camel-casing a name doesn't meet that standard. So, yes, I don't like it.

I agree with all of the above. However, as is often the case, there's more than one side to the story. Bad APIs have their costs too. We can't afford to have an XML library that offers few and badly packaged features and comes at the tail of all benchmarks. We also can't afford a JSON library that is poorly designed and badly written.

Ironically, the costs mostly manifest the same way: people will decide not to use D because it "lacks good libraries" and "is quirky to use". In many ways a language's standard library is a showcase of the language, and to a newcomer an inconsistent and awkward standard library affects the perception of the language's quality.

Stressing that breaking code has a cost and implying that keeping it with flaws has no cost is as mistaken as worrying in chess about the flank at the expense of the center.

The reality we need to face is, we are experiencing growth pains. What we must do is NOT lament about breaking this or keeping that. We must:

a) devise good language features to cope with deprecation, of which deprecation with message is one that I think we need to embrace and extend (I have a few ideas I'll discuss separately);
b) supplement that with a good policy for deprecating APIs and introducing new ones - in particular decide where to draw the line when introducing a breaking change;
c) possibly create programs a la gofix that help migration.

Andrei
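As an illustration of point (a), a rename with a deprecation message could look roughly like this. The sketch assumes a compiler that accepts a message string on `deprecated`, which current D compilers do. The two functions are simplified local stand-ins for illustration only, not the actual std.path code.

```d
import std.stdio : writeln;

// New, camel-cased name (simplified stand-in, handles '/' only).
string baseName(string path)
{
    foreach_reverse (i, c; path)
        if (c == '/') return path[i + 1 .. $];
    return path;
}

// Old name kept as a thin forwarder; the message tells users what to switch to.
deprecated("use baseName instead")
string getBaseName(string path)
{
    return baseName(path);
}

void main()
{
    writeln(baseName("/usr/lib/libphobos2.a"));

    // Still compiles and runs, but the compiler reports the
    // deprecation message for this call.
    writeln(getBaseName("/usr/lib/libphobos2.a"));
}
```

Old code keeps working while the message points at the replacement, which is the "generous deadline" behaviour described above.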

My question is why do you even need a standard API for XML and JSON. Trying to support everything out of the box to a high degree of quality and provide enough generality that it's useful for everybody is just too much work and all you achieve is to discourage alternative implementations better suited to specific needs.
Sep 05 2011
parent reply bearophile <bearophileHUGS lycos.com> writes:
Josh Simmons:

 My question is why do you even need a standard API for XML and JSON.

It helps port your user code to other libs that use the same standard API. This is very useful. In D I'd like a basic standard API even for simple 2D graphics. Bye, bearophile
Sep 06 2011
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, September 06, 2011 18:54:45 Josh Simmons wrote:
 On Tue, Sep 6, 2011 at 6:37 PM, bearophile <bearophileHUGS lycos.com> wrote:
 Josh Simmons:
 My question is why do you even need a standard API for XML and JSON.

It helps port your user code to other libs that use the same standard API. This is very useful. In D I'd like a basic standard API even for simple 2D graphics. Bye, bearophile

This would be true if there were only implementation differences between libraries doing roughly the same thing (in which case you'd not need a new library anyway). Unfortunately this is not how things work. So simple 2d graphics ey? vector or raster based? immediate rendering or scene graph representation? animation? fonts? textures? XML ey? SAX, DOM, Pull, Data Binding? XPath? XSLT? The problem with php isn't just its awesome naming, it's the fact that anything that seemed like something somebody might use was added as opposed to limiting itself to the must-haves.

Other major languages (such as Java and C#) have large standard libraries and have done quite well with them. In fact, I believe that the large size of their standard libraries is generally seen as a major advantage of those languages. No, we can't have everything in the standard library. No, an XML parser in the standard library likely won't meet everyone's needs. However, having a large standard library can be of great benefit to the users of the language even if it doesn't solve every problem that they could possibly have. The question isn't really whether we should add stuff like XML parsing to Phobos. The question is what is the best general implementation for such a module and whether we can get an implementation of high enough quality to be able to go in the standard library. It's a question of time, man power, and quality. Obviously, Phobos is not going to explode in size overnight, but it _is_ going to grow in size, and eventually it should be fairly large. We already have several useful additions in the review queue which will likely make it into Phobos in one form or another over the next few months. - Jonathan M Davis
Sep 06 2011
parent Jacob Carlborg <doob me.com> writes:
On 2011-09-06 11:09, Jonathan M Davis wrote:
 On Tuesday, September 06, 2011 18:54:45 Josh Simmons wrote:
 This would be true if there were only implementation differences
 between libraries doing roughly the same thing (in which case you'd
 not need a new library anyway). Unfortunately this is not how things
 work.

 So simple 2d graphics ey? vector or raster based? immediate rendering
 or scene graph representation? animation? fonts? textures?

 XML ey? SAX, DOM, Pull, Data Binding? XPath? XSLT?

 The problem with php isn't just its awesome naming, it's the fact
 that anything that seemed like something somebody might use was added
 as opposed to limiting itself to the must-haves.

Other major languages (such as Java and C#) have large standard libraries and have done quite well with them. In fact, I believe that the large size of their standard libraries is generally seen as a major advantage of those languages. No, we can't have everything in the standard library. No, an XML parser in the standard library likely won't meet everyone's needs. However, having a large standard library can be of great benefit to the users of the language even if it doesn't solve every problem that they could possibly have. The question isn't really whether we should add stuff like XML parsing to Phobos. The question is what is the best general implementation for such a module and whether we can get an implementation of high enough quality to be able to go in the standard library. It's a question of time, man power, and quality.

Phobos could have a low level XML parsing module and on top of that other XML APIs can be built, like SAX, DOM and so on. This is how the XML modules in Tango are built. Tango has a low level XML pull parser. Built on top of that are a SAX API and a DOM document.
 Obviously, Phobos is not going to explode in size overnight, but it _is_ going
 to grow in size, and eventually it should be fairly large. We already have
 several useful additions in the review queue which will likely make it into
 Phobos in one form or another over the next few months.

 - Jonathan M Davis

-- /Jacob Carlborg
Sep 06 2011
prev sibling next sibling parent reply Josh Simmons <simmons.44 gmail.com> writes:
On Tue, Sep 6, 2011 at 7:09 PM, Jonathan M Davis <jmdavisProg gmx.com> wrote:
 Other major languages (such as Java and C#) have large standard libraries and
 have done quite well with them. In fact, I believe that the large size of
 their standard libraries is generally seen as a major advantage of those
 languages.

 No, we can't have everything in the standard library. No, an XML parser in the
 standard library likely won't meet everyone's needs. However, having a large
 standard library can be of great benefit to the users of the language even if
 it doesn't solve every problem that they could possibly have. The question
 isn't really whether we should add stuff like XML parsing to Phobos. The
 question is what is the best general implementation for such a module and
 whether we can get an implementation of high enough quality to be able to go
 in the standard library. It's a question of time, man power, and quality.

 Obviously, Phobos is not going to explode in size overnight, but it _is_ going
 to grow in size, and eventually it should be fairly large. We already have
 several useful additions in the review queue which will likely make it into
 Phobos in one form or another over the next few months.

 - Jonathan M Davis

Other languages like C# and Java have large enterprise outfits backing their massive standard libraries too. I just think the effort is better spent creating a solid language and encouraging third party libraries through better tools.
Sep 06 2011
parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 4:29 AM, Josh Simmons wrote:
 Other languages like C# and Java have large enterprise outfits backing
 their massive standard libraries too.

 I just think the effort is better spent creating a solid language and
 encouraging third party libraries through better tools.

As always finding the right balance is key. Community-grown languages such as PHP, Python, or Ruby also enjoy large libraries, so I don't think corporate support is a prerequisite. It would probably be a mistake to stop work on Phobos now. For all I can tell I'm more annoyed by the energy spent on what I'd call "isometric churn" like changing names (even those used internally, sigh) and changing comments from /** */ to /++ +/. All that energy could go into adding value. Andrei
Sep 06 2011
prev sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 06.09.2011, 11:09 Uhr, schrieb Jonathan M Davis <jmdavisProg gmx.com>:

 Other major languages (such as Java and C#) have large standard  
 libraries and
 have done quite well with them. In fact, I believe that the large size of
 their standard libraries is generally seen as a major advantage of those
 languages.

These languages are platforms with a complete abstraction from the underlying OS and libraries. My JDK installation is 170 MB in size. I would prefer thin wrappers over good existing libraries like curl. Possibly even msxml4/libxml2. Their API differs in some points, but they both offer XPath and other goodies, maybe they can be made compatible with a wrapper.
Sep 06 2011
prev sibling next sibling parent reply Josh Simmons <simmons.44 gmail.com> writes:
On Tue, Sep 6, 2011 at 6:37 PM, bearophile <bearophileHUGS lycos.com> wrote:
 Josh Simmons:

 My question is why do you even need a standard API for XML and JSON.

It helps port your user code to other libs that use the same standard API. This is very useful. In D I'd like a basic standard API even for simple 2D graphics. Bye, bearophile

This would be true if there were only implementation differences between libraries doing roughly the same thing (in which case you'd not need a new library anyway). Unfortunately this is not how things work. So simple 2d graphics ey? vector or raster based? immediate rendering or scene graph representation? animation? fonts? textures? XML ey? SAX, DOM, Pull, Data Binding? XPath? XSLT? The problem with php isn't just its awesome naming, it's the fact that anything that seemed like something somebody might use was added as opposed to limiting itself to the must-haves.
Sep 06 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Josh Simmons:

 So simple 2d graphics ey? vector or raster based? immediate rendering
 or scene graph representation? animation? fonts? textures?

Raster, immediate rendering, no need to specify animations, basic support for fonts and textures. Leaving most things out is not a problem here. Some time ago someone showed here a nice, short D module that works on Windows and Linux. This module doesn't replace other graphics libs or GUI modules, it's something simple and small for small purposes.
 XML ey? SAX, DOM, Pull, Data Binding? XPath? XSLT?
 
 The problem with php isn't just its awesome naming, it's the fact
 that anything that seemed like something somebody might use was added
 as opposed to limiting itself to the must-haves.

Some people like wide libraries like Python (batteries included), others don't like that. Both choices have their serious advantages and serious disadvantages. There are not general rules to solve this problem, each case needs to be discussed. I think a good JSON module needs to be in Phobos, while for XML maybe it just needs a standard D API (this also comes from practical size considerations: a JSON module is probably small. A good XML module will probably become large). Bye, bearophile
Sep 06 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/6/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Or should we, au contraire, use "old_" for the
 old module and advise people who want to stick with the old modules to
 change their imports?

I would say that's the right way to go. It's much easier to change an import than change code. Perhaps another alternative is to use version statements. DFL uses it for deprecated features that are still in the codebase and usable. We don't want to punish people for using newer modules, we should encourage it. If they're forced to import "std.xml_new", they'll eventually have to change those imports to "std.xml" down the road when the older std.xml gets replaced by the new one. I assume people will just pick the first thing that they see, "std.xml" looks standard so they would pick that over "std.xml2".
Sep 06 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 11:53:09 -0400, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 On 9/6/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Or should we, au contraire, use "old_" for the
 old module and advise people who want to stick with the old modules to
 change their imports?

I would say that's the right way to go. It's much easier to change an import than change code. Perhaps another alternative is to use version statements. DFL uses it for deprecated features that are still in the codebase and usable.

I agree. I'd hate for the current std.xml to squat that name forever... In a related story, why did digitalmars.com change D to mean D2 instead of just using new_D? :) Also new coke. -Steve
Sep 06 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/6/11, Adam Ruppe <destructionator gmail.com> wrote:
 Jacob Carlborg wrote:
 I prefer to use "old_".

There's two big problems with that though: 1) It still breaks the old code. It's an even easier fix, so this isn't too bad, but it is still broken. 2) What if a third version of a module comes along?

Isn't it time we start eating our own dogfood and introduce version statements into Phobos to select deprecated functionality? Naming modules "xml.old1", "xml.old2" reminds me of using zip files as a poor man's version control system.
Sep 06 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/6/11, Adam Ruppe <destructionator gmail.com> wrote:
 Andrej Mitrovic wrote:
 select deprecated functionality

The problem I have is old code isn't going to change itself to select old functions.

I mean hypothetically with a new version this could be a compile switch:

$ dmd main.d xml.d # use new version
$ dmd main.d xml.d -version=2.053 # use old version

I'm not exactly sure if it could work like that.. but anyway if it's just xml then we don't have to introduce a whole new way of using deprecated code I guess.
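A rough sketch of how such a switch could look inside a module, using D's `version` conditions. The identifier `OldXml` and the function `parserName` are invented for this example. As far as I know, dmd's `-version=` accepts an identifier (or an integer level), so a dotted value like `2.053` would need to be spelled differently.

```d
import std.stdio : writeln;

// Hypothetical: build with `dmd -version=OldXml app.d` to select the old behaviour.
version (OldXml)
{
    string parserName() { return "old std.xml behaviour"; }
}
else
{
    string parserName() { return "new std.xml behaviour"; }
}

void main()
{
    // Without the switch this prints the new behaviour.
    writeln(parserName());
}
```

The catch raised earlier still applies: old code only gets the old behaviour if whoever builds it passes the switch.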
Sep 06 2011
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Sep 6, 2011, at 10:16 AM, Andrei Alexandrescu wrote:

 On 9/6/11 12:05 PM, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in
 news:j45isu$2t3h$1 digitalmars.com...

 Yah, I also think the documentation makes it easy to clarify which
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the
 name. The only place where the '2' occurs is in the name of the [...]
 and there aren't many modules we need to replace like that.

I still can never remember if I'm supposed to be using std.regex or std.regexp.

Yet another argument :o). I also don't quite remember right now [...]

The latter. I never forget this one because the name we're supposed to use is annoyingly long :-)
Sep 06 2011
prev sibling next sibling parent "Martin Nowak" <dawg dawgfoto.de> writes:
On Tue, 06 Sep 2011 19:54:28 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules  
 replacing them.

std.xml2 will do fine.

Speaking of xml2, I would clearly like to see an attempt at buffered lookahead reading for a stream/stdio overhaul. Writing range adapters with even only fixed lookahead on top of the current stream API is painful. martin
Sep 06 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 06.09.2011, 16:51 Uhr, schrieb Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org>:

 So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at  
 Facebook to good effect. Or should we, au contraire, use "old_" for the  
 old module and advise people who want to stick with the old modules to  
 change their imports?


 Andrei

What about:

std.xml1
std.xml -> std.xml2

So std.xml is a symbolic link to std.xml2 in the next release or std.xml2 public imports std.xml? This is what /bin/python is on my computer.
Sep 06 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 14:07:01 -0400, Martin Nowak <dawg dawgfoto.de> wrote:

 On Tue, 06 Sep 2011 19:54:28 +0200, Walter Bright  
 <newshound2 digitalmars.com> wrote:

 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules  
 replacing them.

std.xml2 will do fine.

Speaking of xml2, I would clearly like to see an attempt at buffered lookahead reading for a stream/stdio overhaul. Writing range adapters with even only fixed lookahead on top of the current stream API is painful.

This is exactly the reason for the overhaul. I'm working on it, and I think my next version will be much more backwards compatible. See in the proposed documentation readUntil, and look at the byChunk implementation to see how it's used. The intention was to use the buffer as an expandable scratch space to do things like parsing xml files without copying. -Steve
Sep 06 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 14:13:45 -0400, Marco Leise <Marco.Leise gmx.de> wrote:

 Am 06.09.2011, 16:51 Uhr, schrieb Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org>:

 So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at  
 Facebook to good effect. Or should we, au contraire, use "old_" for the  
 old module and advise people who want to stick with the old modules to  
 change their imports?


 Andrei

What about:

std.xml1
std.xml -> std.xml2

So std.xml is a symbolic link to std.xml2 in the next release or std.xml2 public imports std.xml? This is what /bin/python is on my computer.

That only works/is worth it if std.xml2 is backwards compatible with std.xml1. -Steve
Sep 06 2011
prev sibling next sibling parent Brad Anderson <eco gnuk.net> writes:

On Tue, Sep 6, 2011 at 8:51 AM, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/6/11 2:35 AM, Walter Bright wrote:

 On 9/5/2011 11:39 PM, Jacob Carlborg wrote:

 We don't want to have a standard library like the one in PHP where
 there seems
 to be no naming conventions at all.

I don't think that is the reason PHP is such a bear to work with.

Probably. At any rate, what I now think as a promising path is with new module names. Let's leave the likes of std.xml and std.json in peace, then pick a naming convention for the new ones and create whole new modules replacing them. Then people who are ready for the migration change

import std.xml;

with

import std.some_naming_convention_involving_xml;

and fix whatever code breakages that entails. If they're pleased with std.xml, nobody's holding a gun to their head.

Months and years go by, and nobody uses std.xml because the new module and the migration path are copiously advertised in the documentation. At that point we can discuss excising std.xml altogether and replacing it with the new one. And so the new becomes old, just like in dialectics.

There's a successful precedent in C++ - stringstream vs. strstream. The only missing thing is that C++ did not choose a naming convention because they limited themselves to only one header.

So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at Facebook to good effect. Or should we, au contraire, use "old_" for the old module and advise people who want to stick with the old modules to change their imports?

Andrei

Along these same lines I'm wondering why not simply call this new module std.io rather than use the existing name std.stdio? It'd avoid the code breaking issue and help reflect that this new module isn't based around C's stdio FILE (at least that's what I gather). Also, the code is written from scratch so that's another reason for why I don't think it should have the same name. The only reason I can think of is if it provided significant improvements over the existing std.stdio without causing massive breakage.

Regards,
Brad Anderson
Sep 06 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 14:21:52 -0400, Brad Anderson <eco gnuk.net> wrote:

 On Tue, Sep 6, 2011 at 8:51 AM, Andrei Alexandrescu <
 SeeWebsiteForEmail erdani.org> wrote:

 Probably. At any rate, what I now think as a promising path is with new
 module names. Let's leave the likes of std.xml and std.json in peace,  
 then
 pick a naming convention for the new ones and create whole new modules
 replacing them. Then people who are ready for the migration change

 import std.xml;

 with

 import std.some_naming_convention_involving_xml;

Along these same lines I'm wondering why not simply call this new module std.io rather than use the existing name std.stdio? It'd avoid the code breaking issue and help reflect that this new module isn't based around C's stdio FILE (at least that's what I gather). Also, the code is written from scratch so that's another reason for why I don't think it should have the same name. The only reason I can think of is if it provided significant improvements over the existing std.stdio without causing massive breakage.

I think for something like std.xml which is somewhat of a standalone module, this is fine. However, i/o is used *everywhere*. It's the same situation with std.datetime. We can't duplicate all functions which deal with i/o in order to cater to both the stdio and the std.io folks, I think it's a waste of time, and it also looks bad. But I think I have come up with a plan (obviously not the one posted here) which keeps stdio's API compatible, yet can use the new stuff I've written if desired. i.e. provides improvements over the current std.stdio without causing massive breakage. Coincidentally, std.io is the name I chose for the new module ;) I'll post again when it's something that can be shared. I want to get all my ducks in a row first (obviously more than I did this time...) -Steve
Sep 06 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 06.09.2011, 22:28 Uhr, schrieb Timon Gehr <timon.gehr gmx.ch>:

 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

That is about 4 times slower than the Tango XML parser: http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

You are so right, Timon. How deep is the trench between Phobos and Tango devs? Tango's XML parser should really make it into Phobos.
Sep 06 2011
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Sep 6, 2011, at 2:51 PM, Marco Leise wrote:

 Am 06.09.2011, 22:28 Uhr, schrieb Timon Gehr <timon.gehr gmx.ch>:

 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

That is about 4 times slower than the Tango XML parser: http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

 You are so right, Timon. How deep is the trench between Phobos and Tango devs? Tango's XML parser should really make it into Phobos.

That will never happen. Though on a positive note, a major reason the Tango parser is so fast is that there's no copying or translation of the underlying data. Attributes are passed to the user as-is via a slice of the input range. Most parsers in other languages simply don't work this way.
Sep 06 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 07.09.2011, 00:23 Uhr, schrieb Sean Kelly <sean invisibleduck.org>:

 On Sep 6, 2011, at 2:51 PM, Marco Leise wrote:

 Am 06.09.2011, 22:28 Uhr, schrieb Timon Gehr <timon.gehr gmx.ch>:

 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

That is about 4 times slower than the Tango XML parser: http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

You are so right, Timon. How deep is the trench between Phobos and Tango devs? Tango's XML parser should really make it into Phobos.

That will never happen. Though on a positive note, a major reason the Tango parser is so fast is that there's no copying or translation of the underlying data. Attributes are passed to the user as-is via a slice of the input range. Most parsers in other languages simply don't work this way.

So in the benchmark neither white-space is collapsed, nor are entities like &amp; converted?
Sep 06 2011
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
On Sep 6, 2011, at 6:49 PM, Marco Leise wrote:

 Am 07.09.2011, 00:23 Uhr, schrieb Sean Kelly <sean invisibleduck.org>:

 On Sep 6, 2011, at 2:51 PM, Marco Leise wrote:

 Am 06.09.2011, 22:28 Uhr, schrieb Timon Gehr <timon.gehr gmx.ch>:

 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

That is about 4 times slower than the Tango XML parser: http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

 You are so right, Timon. How deep is the trench between Phobos and Tango devs? Tango's XML parser should really make it into Phobos.

 That will never happen. Though on a positive note, a major reason the Tango parser is so fast is that there's no copying or translation of the underlying data. Attributes are passed to the user as-is via a slice of the input range. Most parsers in other languages simply don't work this way.

 So in the benchmark neither white-space is collapsed, nor are entities like &amp; converted?

I don't believe so. That's expected to be done by the user if he cares about decoding the field. Compare this to the Xerces (Apache) XML parser that passes in all attributes as wide chars regardless of the input format and you can see why parsing XML in D can be so fast: passing values via array slicing and having Unicode as the native character format. If the input text is UTF-8 you use XmlParser!char, if it's UTF-16 you use XmlParser!wchar, etc. I'm actually surprised that more C/C++ parsers don't work this way.
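
To make the slicing point concrete, here is a rough sketch in D. It is not Tango's code: `attributeValue` is a made-up helper, and the matching is deliberately naive (double-quoted values only, attribute name preceded by a single space). The only thing it is meant to show is that the returned value is a window into the caller's buffer, with no copy and no entity decoding.

```d
/// Hypothetical helper (not part of Tango or Phobos): returns the raw value
/// of attribute `name` inside `tag` as a slice of `tag` itself.
/// Works for char, wchar and dchar input alike; returns null if not found.
const(Char)[] attributeValue(Char)(const(Char)[] tag, const(Char)[] name)
{
    // Look for:  <space>name="value"
    for (size_t i = 0; i + name.length + 2 <= tag.length; ++i)
    {
        // Require a preceding space so "id" does not match inside "grid".
        if (i == 0 || tag[i - 1] != ' ')
            continue;
        if (tag[i .. i + name.length] != name)
            continue;

        immutable j = i + name.length;
        if (tag[j] != '=' || tag[j + 1] != '"')
            continue;

        // Find the closing quote; everything in between is the value.
        immutable start = j + 2;
        size_t end = start;
        while (end < tag.length && tag[end] != '"')
            ++end;
        if (end == tag.length)
            return null; // unterminated value

        return tag[start .. end]; // a slice: same memory, nothing copied
    }
    return null;
}
```

Because the result aliases `tag`, an entity such as `&amp;` comes back undecoded, and the slice is only valid for as long as the input buffer is kept around. A UTF-16 caller would instantiate the same template as `attributeValue!wchar`.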
Sep 06 2011
prev sibling next sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh  
<WorksOnMyMachine gmail.com> wrote:

 In the COM based land for D3D, there is just a number tacked onto the  
 class name.  We are up to version 11 (e.g. ID3D11Device).  It works well
 and is definitely nicer once you are used to it, than calling everything  
 New or FunctionEx, and left wondering what to do when you rev the  
 interface again

In the case of D3D though, D3D itself has a version number. The next version of std.xml will not be parsing XMLv2.0. When a version 2.0 of the XML spec shows up, what do we do about std.xml2, which parses version 1.1? And what do we call the new one? Should std.xml3 parse XMLv2.0? -- Simen
Sep 08 2011
prev sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 08.09.2011, 18:52 Uhr, schrieb Simen Kjaeraas <simen.kjaras gmail.com>:

 On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh  
 <WorksOnMyMachine gmail.com> wrote:

 In the COM based land for D3D, there is just a number tacked onto the  
 class name.  We are up to version 11 (e.g. ID3D11Device).  It works
 well and is definitely nicer once you are used to it, than calling  
 everything New or FunctionEx, and left wondering what to do when you  
 rev the interface again

In the case of D3D though, D3D itself has a version number. The next version of std.xml will not be parsing XMLv2.0. When a version 2.0 of the XML spec shows up, what do we do about std.xml2, which parses version 1.1? And what do we call the new one? Should std.xml3 parse XMLv2.0?

That is late in the discussion, but a valid point.
Sep 08 2011
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/03/2011 09:54 PM, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

File is now a class. This will break a lot of code. What is happening to the refcounted File feature? It seems that the new way of file handling, using a file class, is more error prone than the old way? But it is really great to hear that the efficiency problems of std.stdio are being sorted out!
Sep 03 2011
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2011-09-03 19:54:05 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Hello,
 
 
 There are a number of issues related to D's current handling of 
 streams, including the existence of the imperfect etc.stream and the 
 over-specialization of std.stdio.
 
 Steve has worked on an extensive overhaul of std.stdio which would 
 obviate the need for etc.stream and would improve both the generality 
 and efficiency of std.stdio.
 
 Please chime in with feedback; he's away from the Usenet but allowed me 
 to post this on his behalf. I uploaded the docs to
 
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Looks good… Hum, inconsistent casing of enum members. And shouldn't there be a way to do non-blocking IO? ;-) I like that File is now a class because it's cleaner that way, but non-deterministic destruction is going to be a problem. That said, it was already a problem anyway if you stored a File struct in a class, so maybe we need a more general solution for reference-counted classes. Class names DInput and DOutput sound silly. If all classes implemented purely in D had a D prefix, it'd get redundant pretty fast (like KDE apps beginning in K). I'd suggest BufferedInput and BufferedOutput, or something else that actually describes what the class does, instead of DInput and DOutput. And I'd make them final, that way there won't be any virtual call overhead until the buffer needs to be replenished or flushed from the wrapped input or output stream. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
Sep 03 2011
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 17:55:12 -0400, Michel Fortin
<michel.fortin michelf.com> wrote:

 On 2011-09-03 19:54:05 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Hello,

 There are a number of issues related to D's current handling of
 streams, including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed
 me to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Looks good…

Well, at least someone thinks so ;)
 Hum, inconsistent casing of enum members.

Can be fixed.
 And shouldn't there be a way to do non-blocking IO? ;-)

Yes. I haven't gotten to that yet. This is a very early version, not ready for inclusion. It's mostly a proof-of-concept.
 I like that File is now a class because it's cleaner that way, but
 non-deterministic destruction is going to be a problem. That said, it
 was already a problem anyway if you stored a File struct in a class, so
 maybe we need a more general solution for reference-counted classes.

I agree, but I think I need to revisit that aspect. As broken as the reference counting mechanism is, much code is based on it, so we can't say you have to revisit all source code in order to be compatible. And as Andrei points out, it works in cases where you *don't* store the struct on the heap, why should that be disabled?
 Class names DInput and DOutput sound silly. If all classes implemented
 purely in D had a D prefix, it'd get redundant pretty fast (like KDE
 apps beginning in K).

Yes, it made sense when I was going through the different iterations of my interface ideas. But you are right. BTW, these started out as DBufferedInput and DBufferedOutput, and CStream was CBufferedStream.
 I'd suggest BufferedInput and BufferedOutput, or something else that
 actually describes what the class does, instead of DInput and DOutput.
 And I'd make them final, that way there won't be any virtual call
 overhead until the buffer needs to be replenished or flushed from the
 wrapped input or output stream.

They are final, ddoc just doesn't expose that... See my later post to the source. Things might be clearer. -Steve
Sep 03 2011
prev sibling next sibling parent dsimcha <dsimcha yahoo.com> writes:
Actually I'll generalize the comment I made before:  As much as I like more
efficiency, I despise the massive overhaul and code breakage and the complexity
of having a zillion tiny objects to do everything that File used to do.  I would
like to see the native I/O under the hood plus something more like the current
API for basic file I/O.  I'd vote against the current design just because of the
massive code breakage it would cause with no migration plan.
Sep 03 2011
prev sibling next sibling parent reply Walter Bright <newshound2 digitalmars.com> writes:
What happens if I write:

    printf("hello ");
    writeln("world");

?
Sep 03 2011
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 20:47:05 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 What happens if I write:

     printf("hello ");
     writeln("world");

useCStdio(); This makes all the standard handles C-based. And crap, I see I did not document it.... grr.... See here: https://github.com/schveiguy/phobos/blob/new-io/std/stdio.d#L3332 Sorry.... -Steve
Sep 03 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Sunday, September 04, 2011 02:49:40 Marco Leise wrote:
 Am 04.09.2011, 00:57 Uhr, schrieb Andrej Mitrovic
 
 <andrej.mitrovich gmail.com>:
 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

Wasn't this overhaul _meant_ to break existing code by offering a new API? Still that's a serious issue of course, but not too surprising. I'm ambivalent on the inheritance vs predicate debate. Interfaces are the way it is meant to be done and actually ensure correct types. Predicates work with structs as well. I don't know if this would be important.

Any overhaul of existing functionality needs to improve on existing functionality. Changes just to change aren't valuable. So, changes should generally avoid breaking backwards compatibility unless we gain something from it. So, as long as these changes are an overall improvement, then we'll just have to deal with the code breakage. However, if the code breakage doesn't actually gain us anything, then we should avoid it. So, complaints about code breakage are valid, but they aren't deal breaking. - Jonathan M Davis
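
For anyone following the inheritance-versus-predicate question quoted above, here is roughly what the two options look like side by side. This is a sketch only: `OutputStream`, `isOutputStreamLike`, `NullSink`, `CountingSink` and `send` are names invented for illustration, not anything from the proposed module, and the `enum name(T) = ...` shorthand assumes a reasonably recent compiler.

```d
/// Interface route: conformance is declared up front and checked by the
/// type system, but only classes can implement it.
interface OutputStream
{
    void put(const(ubyte)[] data);
    void close();
}

/// Predicate route: true for any type (struct or class) on which
/// put(bytes) and close() compile.
enum isOutputStreamLike(T) = is(typeof((T t) {
    const(ubyte)[] data;
    t.put(data);
    t.close();
}));

/// A struct that satisfies the predicate without inheriting from anything.
struct NullSink
{
    void put(const(ubyte)[] data) {}
    void close() {}
}

/// A class that satisfies both the interface and the predicate.
final class CountingSink : OutputStream
{
    size_t written;
    void put(const(ubyte)[] data) { written += data.length; }
    void close() {}
}

/// Generic code constrained by the predicate; works with both sinks.
void send(S)(ref S sink, const(ubyte)[] data)
    if (isOutputStreamLike!S)
{
    sink.put(data);
    sink.close();
}
```

The trade-off is the one described in the quoted text: the interface gives a nominal guarantee, while the predicate is structural and also admits structs.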
Sep 03 2011
prev sibling next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/3/11 3:54 PM, Andrei Alexandrescu wrote:
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Here are a few points following a pass through the dox:

* After thinking some more about it, I find the approach seek() plus enumerated Anchor undesirable. It's a bad case of logical coupling as one never calls seek() passing an anchor as a variable. It's really three functions - seekForward, seekBackward, and seekAbsolute. Heck, knowing what seek does, it should be just seekAbsolute. But then there are several possible designs; a logically coupled seek() is not a good turn in any case.

* Seekable should document that tell() is O(1) and seek() can be considered O(1) but with a large constant factor.

* Why is close() not part of Seekable, since Seekable seems to be the base of all streams?

* Class File is IMHO not going to cut the mustard. It needs to be a struct with a destructor. One should be able to _get_ an InputStream or an OutputStream interface out of a File object (i.e. a file is a factory of such interfaces), but the File itself must be a struct.

* I don't understand the difference between read() and readComplete().

* readUntil is a bit tenuous. I was hoping for a simpler interface to buffered streams, e.g. just expose the buffer as a ubyte[].

* readUntil(const(ubyte)[]) does not give a cheap means to figure whether the read ended because file ended or the terminator was met.

* There's several readUntil but only one appendUntil. Why?

* Document the difference between skip and seek. Also, skip should take a ulong.

* I see encoder and decoder() in DInput, should both be decoder?

* StreamWidth, TextXXX and friends are a bit sudden because they introduce a higher-level abstraction in a module so far only preoccupied to transferring bytes. I was thinking that kind of stuff would belong to a formatter/serializer module.

Overall, there are interesting elements in this proposal but I don't quite feel it hit the proverbial nail on the head.

Andrei
Sep 03 2011
parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 21:47:53 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 9/3/11 3:54 PM, Andrei Alexandrescu wrote:
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Here are a few points following a pass through the dox: * After thinking some more about it, I find the approach seek() plus enumerated Anchor undesirable. It's a bad case of logical coupling as one never calls seek() passing an anchor as a variable. It's really three functions - seekForward, seekBackward, and seekAbsolute. Heck, knowing what seek does, it should be just seekAbsolute. But then there are several possible designs; a logically coupled seek() is not a good turn in any case.

I think you need to support all three, but they could be individual functions. It just is easy to provide the same interface the OS handle provides.

Let's entertain changing to three separate functions. But I think we need to support seek from front, seek from end, and seek from current. I don't know about the three you mentioned. How would you seek to the end if you didn't have seekEnd? And seeking forward or backward I think is captured much better via a positive or negative integer. I can imagine having to write code like this:

    if(pos < cur)
        seekBackward(cur - pos);
    else
        seekForward(pos - cur);
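
A possible shape for the three-function version, sketched against a hypothetical in-memory `Cursor` (every name here is an assumption, not the actual proposal). The anchored `seek` stays as the single OS-style primitive and the named functions are thin wrappers over it; a signed offset covers both directions, so no forward/backward pair is needed.

```d
enum Anchor { Begin, Current, End }

/// Hypothetical stand-in for a seekable device; it only tracks a position.
struct Cursor
{
    ulong length; // total size of the underlying "file"
    ulong pos;    // current position

    /// OS-style primitive: one function, anchor passed as a value.
    ulong seek(long delta, Anchor whence)
    {
        long base;
        final switch (whence)
        {
            case Anchor.Begin:   base = 0;                 break;
            case Anchor.Current: base = cast(long) pos;    break;
            case Anchor.End:     base = cast(long) length; break;
        }
        immutable target = base + delta;
        // A real device would report an error; the sketch just asserts.
        assert(target >= 0, "seek before start of stream");
        pos = cast(ulong) target;
        return pos;
    }

    /// The three named entry points discussed above.
    ulong seekFromStart(ulong offset) { return seek(cast(long) offset, Anchor.Begin); }
    ulong seekFromCurrent(long delta) { return seek(delta, Anchor.Current); }
    ulong seekFromEnd(long delta)     { return seek(delta, Anchor.End); }
}
```

With this layout, jumping to an absolute position is `seekFromStart(pos)` and going to the end is `seekFromEnd(0)`, so the `if(pos < cur)` branch above is not needed.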
 * Seekable should document that tell() is O(1) and seek() can be  
 considered O(1) but with a large constant factor.

OK, docs need lots of TLC for sure.
 * Why is close() not part of Seekable, since Seekable seems to be the  
 base of all streams?

Hm... not really sure. I suppose it could be! But then, should the interface be called Seekable? What about just Stream?
 * Class File is IMHO not going to cut the mustard. It needs to be a  
 struct with a destructor. One should be able to _get_ an InputStream or  
 an OutputStream interface out of a File object (i.e. a file is a factory  
 of such interfaces), but the File itself must be a struct.

I'm seeing a large backlash on this decision. I'm going to revisit it. Note, however, that it was a poor choice of name for File on my part. File is *not* equivalent to the current stdio.File, in that it's not buffered, and is not text-based.
 * I don't understand the difference between read() and readComplete().

read() gets as much data as it can from the buffer and from the stream using at most one low-level read. readComplete() will continually read until either EOF is encountered, or the requested data is read. I started making read() do what readComplete does, but it surprisingly is a very difficult low-level thing to write. However, readComplete() is trivial to implement on top of read(), which is why I split the two functions. Please, come up with a better name, I hate readComplete :)
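
Roughly, the layering looks like the sketch below. `ByteSource` and `ChunkedSource` are invented for the example, and the real signatures in the branch may differ; the sketch assumes read() returns the number of bytes stored and 0 at end of stream.

```d
/// Assumed shape of the low-level primitive: performs at most one
/// underlying read, stores what it got in `buf`, and returns the byte
/// count. A return of 0 means end of stream.
interface ByteSource
{
    size_t read(ubyte[] buf);
}

/// Keeps calling read() until `buf` is full or the stream ends.
/// Returns the number of bytes actually stored.
size_t readComplete(ByteSource src, ubyte[] buf)
{
    size_t total = 0;
    while (total < buf.length)
    {
        immutable n = src.read(buf[total .. $]);
        if (n == 0)
            break; // EOF before the request was satisfied
        total += n;
    }
    return total;
}

/// Test double that hands out at most `chunk` bytes per call, mimicking a
/// pipe or socket that returns short reads.
final class ChunkedSource : ByteSource
{
    private const(ubyte)[] data;
    private size_t chunk;

    this(const(ubyte)[] data, size_t chunk)
    {
        this.data = data;
        this.chunk = chunk;
    }

    size_t read(ubyte[] buf)
    {
        size_t n = buf.length;
        if (n > chunk) n = chunk;
        if (n > data.length) n = data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}
```

A caller that wants "whatever is available" uses read(); a caller that needs exactly N bytes, or EOF, uses readComplete().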
 * readUntil is a bit tenuous. I was hoping for a simpler interface to  
 buffered streams, e.g. just expose the buffer as a ubyte[].

I think we need a const(ubyte)[] peek(size_t nbytes). Would this suffice?
 * readUntil(const(ubyte)[]) does not give a cheap means to figure  
 whether the read ended because file ended or the terminator was met.

You are right. I'll think about this.
 * There's several readUntil but only one appendUntil. Why?

Didn't get around to it yet. The overloads for readUntil are trivial, so can be copied easily enough to appendUntil.
 * Document the difference between skip and seek. Also, skip should take  
 a ulong.

skip is buffer-only. It will never trigger a low-level call. I will fill the docs more completely. Given this, I think size_t is the right type, as a buffer cannot be more than size_t bytes in length.
 * I see encoder and decoder() in DInput, should both be decoder?

Yes. encoder is for DOutput, copy-paste error.
 * StreamWidth, TextXXX and friends are a bit sudden because they  
 introduce a higher-level abstraction in a module so far only preoccupied  
 to transferring bytes. I was thinking that kind of stuff would belong to  
 a formatter/serializer module.

Could be moved. However, stdin stderr and stdout are traditionally text-based, and stdio contains them. I wanted to split out text-handling from the basic buffered stream, since it's very specific. For example, having to deal with an object that supports formatted text i/o for a network socket seems uncommon. I'm open to suggestions. Note, I must have had a brain-malfunction when I gave what I thought was a fairly completely-documented module. I missed some very important declarations and functions. I'll work on fixing the docs and giving you a new copy. Thanks again for hosting it. -Steve
Sep 03 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 15:54:05 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Hello,


 There are a number of issues related to D's current handling of streams,  
 including the existence of the imperfect etc.stream and the  
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would  
 obviate the need for etc.stream and would improve both the generality  
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me  
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Thank you Andrei for posting this.

Before I add some more details, let me first say, this is a very early version, but it does work (and spanks the pants off of the current stdio in the tests I've run).

I'll add several very important things:

1. At the moment, this is written for Linux *ONLY*. I have very good experience with Windows i/o, and I am 100% certain I can implement this library for it. However, it's not my main OS, so I wanted to first get something working with my main working environment.

2. This is *not* currently multithread aware. But it will be. However, I think one important aspect to consider is to make a *thread-local* aware i/o library to avoid unnecessary locking when an i/o connection is only used in one thread. But please leave that part alone for now, I'm working on how to make the code reusable as shared types. Actually, if anyone has good ideas on that, please share!

3. Although I am dead-set on getting *something* into Phobos, I am not attached at all to the symbol names, or even some major design choices. I have seen so far it's one of the major concerns, and I think we can find good names. The names I came up with are not exactly arbitrary, but they are somewhat based on earlier designs that I have since abandoned, so renaming is definitely in order.

4. You can get the full source here: https://github.com/schveiguy/phobos/tree/new-io

I used the 2.054 stock compiler, and a version of druntime that includes Lars' new std-process changes, also on my github account: https://github.com/schveiguy/druntime/tree/new-std-process

Please use those when trying out the code.

--------------------------

So let me tell you about the library design and why I did it the way I did it. Then, I'll respond to individual concerns already posted.

The major problem I think the current std.stdio has is that its buffered solution is based on C's FILE * implementation. Specifically, we have very little control and access to the buffer implementation. I think the key (or at least one of the keys) to uber-fast I/O is trying to copy as little as possible *needlessly*. Seamless and safe buffer access I think is the key to this.

In addition to that, C's FILE * has several limitations:

1. On Windows, it's based on DMC's runtime, which is limited to 60 simultaneously open files (Windows OS limit is 10,000 I think)

2. 64-bit support is not standard in all C implementations (namely Windows)

3. All FILE * objects are inherently shared, meaning lock-free I/O is very cumbersome, especially considering we have D's shared/unshared system.

4. C supports UTF-8, and it's supposed to support UTF-16 (but I can't get UTF-16 to work). I think D ought to support all forms of UTF, since UTF is an integral part of the language.

In addition to this, we have numerous D tools at our disposal -- delegates, closures, ranges, etc. In other words, limiting us to C's interfaces means either duct-taping on those features, or abandoning them. While a noble effort, and probably the best we could get, a prime example is the LockingFileReader range in std.stdio. Just reading it made me cringe. Have a look: https://github.com/D-Programming-Language/phobos/blob/master/std/stdio.d#L1282

I felt, we must be able to do something better. So I started creating what I thought would be a good i/o library. I did not start from the existing code, but just rewrote everything.

The basic concept is, we implement buffering once, and implement low-level devices that can be wrapped by the buffering implementation. Almost everything that would use I/O wants to use a buffered version of it, so make the low-level aggregate minimal, and put all the useful functionality into the buffer. I also wanted to make sure it is very easy to implement *efficient* ranges.

One design decision early on is that the device-level should be a class. There are a few good reasons for this:

1. An I/O device is a reference-type. Copying it does not open another handle. So even if we *wanted* structs, they would be pImpl structs.

2. One simple idea that works very well at the OS level is the file descriptor concept. The file descriptor provides an *interface* to user code for operating on a stream. And they are easily inter-changeable. This means a fd could be a network socket, a file, a pipe, a COM port, and the basic interface never changes. So we should use that same concept -- define a simple interface for a low-level device, and then you can implement the buffer around that interface. Since classes are the only types which support interfaces, I chose them.

Yes, I know classes suffer from the dreaded "I don't know when the GC is going to get around to closing this file" problem. I think though, we have ways to mitigate that (I'll post some responses to points about that elsewhere in the thread).

One other important design decision I made was that the standard handles *must* be changeable at runtime to C-based i/o. This was mainly to appease Walter, as he insists on having compatible I/O with C functions (such as printf). I think he has a good point, but I think limiting this to basically the standard handles is the right level of compatibility.

After going through many iterations (you can look at the github history if you are interested), I settled on this basic tree. Note that I'm very open to changing any parts of this, as long as the basic concept of a common buffer type surrounding a low-level device type is kept intact.

interface Seekable => an interface defining seek functions for a device.

interface InputStream : Seekable => an interface defining functions that can be called on an input device. This is non-buffered.

interface OutputStream : Seekable => an interface defining functions that can be called on an output device. Also non-buffered.

class File : InputStream, OutputStream => The implementation for the OS handle-based input output stream. This is akin to a file descriptor. (Note, I realize this is a poor name choice for this, it should probably be changed).

final class DInput => The buffered input stream. This implements the buffer which surrounds an InputStream.

final class DOutput => The buffered output stream. This implements the buffer which surrounds an OutputStream.

final class CStream => A Buffered Input and output stream based on C's FILE *. This is used if you want to be compatible with C input or output, and is used in TextInput and TextOutput when using the C standard handles.

struct TextInput => A text-based input stream. This implements UTF translation of all forms and handles formatted input. Main member function is readf.

struct TextOutput => A text-based output stream. This implements UTF translation of all forms and handles formatted output. Main member functions are the write* family.

It seems like a lot. But keep in mind that almost everyone will only ever use DInput, DOutput, TextInput and TextOutput. These replace the current std.stdio.File. The low level devices are for implementing low-level devices. They are not really for being used, except to wrap in a buffer.

I expect that convenience functions will exist to create the correct buffered stream when given the right parameters. The most obvious example is the function openFile (which is included). The nice thing is, due to the auto return feature and templates, this takes care of some of the mess of having 4 main types to deal with.

I want to reiterate, I have created something that works, not something that is perfect. I want everyone's input on how it should be changed -- including major design decisions. I'm open to changing just about everything. The *only* major concept I want to keep is the buffering surrounding a low-level device.

Thanks for taking the time to look at this. I hope it will become good enough to be included in Phobos. I plan to do everything I can to make it happen.

-Steve
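
To illustrate the layering described above, here is a stripped-down sketch. It is not the code from the branch: `MemoryDevice` and `BufferedInput` are placeholder names, the interfaces are trimmed to a couple of methods whose signatures are guesses, and output, text handling and error reporting are left out. It only shows buffering written once around an interface-typed device.

```d
/// Trimmed stand-ins for the interfaces in the tree above (signatures assumed).
interface Seekable
{
    ulong seekFromStart(ulong offset);
    ulong tell();
}

interface InputStream : Seekable
{
    /// One low-level read: returns bytes stored, 0 at end of stream.
    size_t read(ubyte[] buf);
}

/// Placeholder device that serves bytes from memory instead of an OS handle.
final class MemoryDevice : InputStream
{
    private const(ubyte)[] data;
    private size_t pos;
    size_t reads; // number of low-level read() calls, for demonstration

    this(const(ubyte)[] data) { this.data = data; }

    ulong seekFromStart(ulong offset)
    {
        pos = offset > data.length ? data.length : cast(size_t) offset;
        return pos;
    }

    ulong tell() { return pos; }

    size_t read(ubyte[] buf)
    {
        ++reads;
        size_t n = data.length - pos;
        if (n > buf.length) n = buf.length;
        buf[0 .. n] = data[pos .. pos + n];
        pos += n;
        return n;
    }
}

/// Buffering implemented once, around any InputStream. The class is final,
/// so only the refill goes through a virtual call.
final class BufferedInput
{
    private InputStream dev;
    private ubyte[] buf;
    private size_t lo, hi; // valid bytes are buf[lo .. hi]

    this(InputStream dev, size_t bufSize = 8192)
    {
        assert(bufSize > 0);
        this.dev = dev;
        buf = new ubyte[bufSize];
    }

    /// Returns the next byte, or -1 at end of stream.
    int getByte()
    {
        if (lo == hi)
        {
            hi = dev.read(buf); // refill: the only low-level call
            lo = 0;
            if (hi == 0)
                return -1;
        }
        return buf[lo++];
    }
}
```

Swapping `MemoryDevice` for a socket or pipe device would leave `BufferedInput` untouched, which is the property the design is after.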
Sep 03 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
Seems to me like virtually every module in Phobos gets a complete
rewrite sooner or later. Yikes! Afaik the upcoming ones are also
std.xml, std.variant, maybe std.json too? (can't recall). Was there
really so much bad code written in Phobos all along that they all
require a rewrite?
Sep 03 2011
prev sibling next sibling parent David Nadlinger <see klickverbot.at> writes:
I will come back with some more detailed feedback later on, but a few 
nits that caught my eye:

  - I don't think changing file from being a struct to a class is a good 
idea. First, it breaks an awful lot of D/Phobos programs already out 
there, both because of the struct->class change and because of the other 
API changes. Second, I feel we should really try to make use of RAII for 
things like file handles – I know we have »scope (exit) file.close()«, 
but forcing the user to remember to always type that needs a very good 
reason, imho. Couldn't File rather have some factory methods returning 
stream interface implementations?

  - CStream and DInput/Output? I don't care how it is implemented under 
the hood, give me something that works! ;) In this case, I guess CStream 
is somewhat appropriate, as C (FILE*) streams are widely known, but 
still I'm not too fond of the names.

  - bufsize -> bufSize?

  - Why on earth does DDoc render the enum default parameter as 
»(Anchor).Begin«? Is there a bug report for this?

  - I am sure there is a reason why the design uses decoder delegates, 
but without the source being available, I didn't find it immediately 
obvious where the advantages of using it over processing what is being 
read() from the stream are. Is this so data can be processed before 
going into the buffer? On a related note, what seems to be the decoder 
property getter is named »encoder()«.

David


On 9/3/11 9:54 PM, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

Sep 03 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 21:58:09 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

please read the previous comment, it includes a link to the source as well  
as further explanations...

Boy, I could have planned this better...

-Steve
Sep 03 2011
prev sibling next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, September 03, 2011 18:53:00 Walter Bright wrote:
 On 9/3/2011 5:58 PM, Jonathan M Davis wrote:
 However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So,
 complaints about code breakage are valid, but they aren't deal
 breaking.

The larger the amount of code that is broken, the more gain there must be to justify it. Breaking std.stdio, which is used everywhere, this thoroughly needs a very high bar of justification.

Agreed. - Jonathan M Davis
Sep 03 2011
parent bearophile <bearophileHUGS lycos.com> writes:
Jonathan M Davis:

 Breaking std.stdio, which is used everywhere, this thoroughly needs a very
 high bar of justification.

Agreed.

The purpose of the gofix tool in the Go language library is to lower this bar significantly :-) Bye, bearophile
Sep 05 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 22:27:49 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 9/3/11 9:53 PM, Walter Bright wrote:
 On 9/3/2011 5:58 PM, Jonathan M Davis wrote:
 However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So,
 complaints
 about code breakage are valid, but they aren't deal breaking.

The larger the amount of code that is broken, the more gain there must be to justify it. Breaking std.stdio, which is used everywhere, this thoroughly needs a very high bar of justification.

I agree. I'm hoping the new stuff could build on top of std.stdio.

It is my plan for the eventual result to break either no code, or as little code as possible. The current library is mostly a proof-of-concept, to see what people think, and to show what might be possible. I think the interfaces in this library make for a much easier-to-write xml library for instance.

It's by no means a proposal for immediate acceptance into Phobos, I'm sorry if it came across that way.

We have to break something in std.stdio, because it's fixated on FILE *. We need something that allows FILE * to play the game, but is focused on a D-based solution. Otherwise, we have no room for improvement.

That's what I'm striving for. And along the way, I'm trying to make it as efficient as possible.

-Steve
Sep 03 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sat, 03 Sep 2011 23:45:17 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 9/3/11 11:33 PM, Steven Schveighoffer wrote:
 We have to break something in std.stdio, because it's fixated on FILE *.
 We need something that allows FILE * to play the game, but is focused on
 a D-based solution. Otherwise, we have no room for improvement.

 I'm not 100% convinced of that. We can achieve a good deal of improvement by resorting to platform-specific code. Clearly that's not the best way to go but it's not difficult and it does have its merit.

 Overall I think the design of std.stdio should be followed:

 1. User opens a File (or whatever), which is a struct. The struct uses RAII.

OK, I think that's the offer on the table I keep getting :) I'm definitely going to use this, and its name will be File. I think it has to be in order to be compatible with all current code.
 2. Using the struct you can directly call primitives to read and write  
 stuff.

Buffered reads and writes? If so, don't you need to decide the items in point 3 before read/write? If not buffered, then I think I can work with this.
 3. You can also decide you want a polymorphic stream out of it, and you  
 get to decide the parameters of the stream (buffering, chunking,  
 synchronicity and whatnot). byChunk and byLine are good examples,  
 although they aren't polymorphic. Once you have such a stream you're in  
 polyland so you get to use all of its goodies (look ma no templates etc).

 4. Once all copies of the struct are destroyed, all streams derived from  
 it are automatically closed and will issue errors when used.

OK, I think I know how to do this. I'm assuming if you want to use exclusively the poly versions, you can do that. I.e. you don't have to keep an RAII File struct around.
 That's pretty much it! It's a simple design that does all we need.

I'll work on that. How should text vs. non-text i/o work? C currently conflates them at the same level, but I think they are two separate layers. What do you think? -Steve
Sep 03 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Saturday, September 03, 2011 23:49:52 Walter Bright wrote:
 I still use printf a lot. One reason is because it is lightweight - using
 writeln blows up the size of your .obj file, making it hard to track down a
 back end bug. This is a long standing gripe I have with writeln.

Well, while that may be a good reason to use printf, it really doesn't apply to very many D programmers. Your average D programmer really has no need to use printf. - Jonathan M Davis
Sep 03 2011
prev sibling next sibling parent reply "Paulo Pinto" <pjmlp progtools.org> writes:
Hi,

what is an "abstract interface" ?

--
Paulo

"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:j3u0l4$1atr$1 digitalmars.com...
 Hello,


 There are a number of issues related to D's current handling of streams, 
 including the existence of the imperfect etc.stream and the 
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would obviate 
 the need for etc.stream and would improve both the generality and 
 efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me to 
 post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei
 

Sep 04 2011
parent David Nadlinger <see klickverbot.at> writes:
On 9/4/11 4:30 PM, Andrej Mitrovic wrote:
 On 9/4/11, Paulo Pinto<pjmlp progtools.org>  wrote:
 Hi,

 what is an "abstract interface" ?

I'm wondering the same thing.

A bug in ddoc. ;) David
Sep 04 2011
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-03 21:54, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

I think that openFile, File.open and CStream.open shouldn't take a string as the mode; it should be an enum or similar. Andrei is making a big deal out of using enums instead of bools. A bool value can contain "true" or "false", a string can contain an infinite number of different values.

--
/Jacob Carlborg
Sep 04 2011
next sibling parent reply "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Sun, 04 Sep 2011 07:07:05 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-03 21:54, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

I think that openFile, File.open and CStream.open shouldn't take a string as the mode; it should be an enum or similar. Andrei is making a big deal out of using enums instead of bools. A bool value can contain "true" or "false", a string can contain an infinite number of different values.

openFile takes it as a template argument, and it will fail at compile time if the parameter is not correct (if not now, it will when the library is ready for inclusion).

I agree that enum is cleaner and easier to deal with from the library's point of view, but we have 2 things going for us by using strings:

1. The string formats are backwards compatible, and well defined. In fact, CStream.open just passes the mode string without modification to fopen.
2. The brevity of and ability to comprehend a string literal vs. multiple enums.

You can think of it like printf (or writef). There are infinitely many possible wrong format strings, which must be rejected at run time. But I'll take that any day over C++'s format modifiers which are type checked at compile-time.

Remember, typically, string formats are most frequently literals, and easy to read/write. While there is great potential for invalid parameters, the reality is this rarely happens, and if it does, the errors are seen immediately.

-Steve
Sep 06 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 12:50, Steven Schveighoffer wrote:
 On Sun, 04 Sep 2011 07:07:05 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-03 21:54, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

I think that openFile, File.open and CStream.open shouldn't take a string as the mode; it should be an enum or similar. Andrei is making a big deal out of using enums instead of bools. A bool value can contain "true" or "false", a string can contain an infinite number of different values.

openFile takes it as a template argument, and it will fail at compile time if the parameter is not correct (if not now, it will when the library is ready for inclusion).

If it validates the string at compile time then that's great.
 I agree that enum is cleaner and easier to deal with from the library's
 point of view, but we have 2 things going for us by using strings:

 1. The string formats are backwards compatible, and well defined. In
 fact, CStream.open just passes the mode string without modification to
 fopen.
 2. The brevity of and ability to comprehend a string literal vs.
 multiple enums.

 You can think of it like printf (or writef). The format string has
 infinitely wrong possible format strings, which must be rejected at run
 time. But I'll take that any day over C++'s format modifiers which are
 type checked at compile-time.

It's not very often I use the print format functions. Most of the time I use Tango and with Tango's format strings at least you don't have to specify the type.
 Remember, typically, string formats are most frequently literals, and
 easy to read/write. While there is great potential for invalid
 parameters, the reality is this rarely happens, and if it does, the
 errors are seen immediately.

 -Steve

I would not say that they are easy to read, or at least to understand/remember what a given mode means. I always have to double check the documentation when using these kinds of modes. I always have to check if a given mode creates a new file or not.

--
/Jacob Carlborg
Sep 06 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 15:02, Steven Schveighoffer wrote:
 Yeah, creating a new file is implied by a combination of modes.

 The one that's confusing I think is that "a" is for append, but "+" kind
 of tacks on appending to any other mode. It's not the most well-designed
 spec for file opening. Add to that you have the "b" which is a noop on
 most OSes.

 There is the possibility that we could accept an alternative open mode
 string, which we could design better. But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

Ok, I would prefer to use enums if they have sensible names. Something like this:

File.open(Mode.read | Mode.write); // for both read and write

--
/Jacob Carlborg
Sep 06 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 10:05 AM, Jacob Carlborg wrote:
 On 2011-09-06 15:02, Steven Schveighoffer wrote:
 Yeah, creating a new file is implied by a combination of modes.

 The one that's confusing I think is that "a" is for append, but "+" kind
 of tacks on appending to any other mode. It's not the most well-designed
 spec for file opening. Add to that you have the "b" which is a noop on
 most OSes.

 There is the possibility that we could accept an alternative open mode
 string, which we could design better. But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

Ok, I would prefer to use enums if they have sensible names. Something like this:

File.open(Mode.read | Mode.write); // for both read and write

Honest, C's openmode strings have been around for so long, they hardly confuse anyone anymore. I'd rather use "rw" and call it a day. Andrei
Sep 06 2011
next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/6/11 12:24 PM, Marco Leise wrote:
 Am 06.09.2011, 17:39 Uhr, schrieb Steven Schveighoffer
 <schveiguy yahoo.com>:

 On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu
 Honest, C's openmode strings have been around for so long, they
 hardly confuse anyone anymore. I'd rather use "rw" and call it a day.

That's not a valid fopen string ;)

Sorry, but I had to laugh. There could not have been a better counter example for using fopen strings. I can live with them, but it is one of the bad designs in C that could use an alternative.

Guess I'm destroyed. Andrei
Sep 06 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-06 17:11, Andrei Alexandrescu wrote:
 On 9/6/11 10:05 AM, Jacob Carlborg wrote:
 On 2011-09-06 15:02, Steven Schveighoffer wrote:
 Yeah, creating a new file is implied by a combination of modes.

 The one that's confusing I think is that "a" is for append, but "+" kind
 of tacks on appending to any other mode. It's not the most well-designed
 spec for file opening. Add to that you have the "b" which is a noop on
 most OSes.

 There is the possibility that we could accept an alternative open mode
 string, which we could design better. But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

Ok, I would prefer to use enums if they have sensible names. Something like this:

File.open(Mode.read | Mode.write); // for both read and write

Honest, C's openmode strings have been around for so long, they hardly confuse anyone anymore. I'd rather use "rw" and call it a day. Andrei

I disagree. -- /Jacob Carlborg
Sep 07 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-06 17:39, Steven Schveighoffer wrote:
 On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu
 Honest, C's openmode strings have been around for so long, they hardly
 confuse anyone anymore. I'd rather use "rw" and call it a day.

That's not a valid fopen string ;) The plus "+" is odd, especially with "a" meaning "append". And there's that really useless "b" :)

Exactly.
 But I think this does *not* invalidate the usage of strings to denote
 open mode, it just needs more design. The good thing about it is, we can
 augment the string flags and be binary and perfectly backwards compatible.

 -Steve

-- /Jacob Carlborg
Sep 07 2011
prev sibling next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

But "r+" is. And that's what I assume will be used when I see a file opening function taking a string "mode" parameter. But if you say that "rw" can/will be used instead then that's better.

--
/Jacob Carlborg
Sep 07 2011
parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-08 13:04, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:27:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

But "r+" is. And that's what I assume will be used when I see a file opening function taking a string "mode" parameter. But if you say that "rw" can/will be used instead then that's better.

Yes, I'll try to add "rw" and maybe some other letter combinations that make sense in my next version. But I think we still have to support "r+", even though it's esoteric, because too much existing code already does this, and to not support it would leave silently compiling bugs. -Steve

Didn't you just say that you would check the string at compile time? -- /Jacob Carlborg
Sep 08 2011
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/08/2011 03:05 PM, Jacob Carlborg wrote:
 On 2011-09-08 13:04, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:27:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

But "r+" is. And that's what I assume will be used when I see a file opening function taking a string "mode" parameter. But if you say that "rw" can/will be used instead then that's better.

Yes, I'll try to add "rw" and maybe some other letter combinations that make sense in my next version. But I think we still have to support "r+", even though it's esoteric, because too much existing code already does this, and to not support it would leave silently compiling bugs. -Steve

Didn't you just say that you would check the string at compile time?

That is not compatible with the auto f = File(name, mode); interface.
Sep 08 2011
prev sibling next sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/8/11 11:11 AM, Simen Kjaeraas wrote:
 On Thu, 08 Sep 2011 15:17:51 +0200, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

 I wonder if there's a way to give the option of using a template
 parameter or using a positional parameter without having two different
 symbol names. hm...

 openFile!(string modedefault = "r")(string filename, string mode =
 modedefault) if (isValidOpenMode(modedefault))
 {
 if(!isValidOpenMode(mode))
 throw new Exception("invalid file open mode: " ~ mode);
 ...
 }

 Would that work?

Neat! And yes, it certainly does work. I'm still unsure when someone will actually need to specify that at runtime, but maybe for scripting languages?

My opinion: we're spending way too much energy on this. File I/O poses much more difficult problems than choosing representation of open flags. Andrei
Sep 08 2011
prev sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-08 15:17, Steven Schveighoffer wrote:
 You can if you make it a template parameter. For example, my openFile
 function that I wrote does this (in fact, I needed a template mode
 string because the return type depends on it). The downside is you
 cannot pass a runtime-generated string. I cannot actually think of any
 use cases for that however.

 In any case, the existing API does not use a template parameter, and we
 have to try and break as little code as possible.

 I wonder if there's a way to give the option of using a template
 parameter or using a positional parameter without having two different
 symbol names. hm...

 openFile!(string modedefault = "r")(string filename, string mode =
 modedefault) if (isValidOpenMode(modedefault))
 {
 if(!isValidOpenMode(mode))
 throw new Exception("invalid file open mode: " ~ mode);
 ...
 }

 Would that work?

 -Steve

That looks nice if it works. -- /Jacob Carlborg
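
A compilable variant of the quoted idea might look like the sketch below. The names are assumptions made for illustration only: isValidOpenMode here accepts the common fopen-style strings plus the "rw" spelling discussed in this thread, and OpenedFile is a stand-in for whatever openFile really returns. The declaration drops the `!` (which only belongs at the call site) and adds a return type.

```d
import std.exception : assertThrown;

// Hypothetical validator: the usual fopen-style modes plus the proposed "rw".
bool isValidOpenMode(string mode)
{
    switch (mode)
    {
        case "r", "w", "a", "r+", "w+", "a+", "rw":
            return true;
        default:
            return false;
    }
}

// Stand-in for the real stream type; it only records what was requested.
struct OpenedFile
{
    string name;
    string mode;
}

// The template argument is checked at compile time by the constraint;
// a mode passed as a normal argument is checked at run time instead.
OpenedFile openFile(string modeDefault = "r")(string filename,
                                              string mode = modeDefault)
    if (isValidOpenMode(modeDefault))
{
    if (!isValidOpenMode(mode))
        throw new Exception("invalid file open mode: " ~ mode);
    return OpenedFile(filename, mode);
}

void main()
{
    // Default template argument, no run-time mode given.
    assert(openFile("a.txt").mode == "r");

    // Mode given as a template argument: validated during compilation.
    assert(openFile!"w+"("a.txt").mode == "w+");
    static assert(!__traits(compiles, openFile!"bogus"("a.txt")));

    // Mode given at run time: validated when the call executes.
    assert(openFile("a.txt", "rw").mode == "rw");
    assertThrown(openFile("a.txt", "bogus"));
}
```

Note that in the real openFile the return type depends on the mode, which is only possible for the template-argument form; this sketch sidesteps that by returning one type for every mode.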
Sep 08 2011
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

BTW, I think that using:

Mode.read | Mode.write

instead of "rw" is the same thing as naming variables with proper descriptive names instead of just "a" or "b".

--
/Jacob Carlborg
Sep 07 2011
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/07/2011 09:30 AM, Jacob Carlborg wrote:
 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

BTW, I think that using:

Mode.read | Mode.write

instead of "rw" is the same thing as naming variables with proper descriptive names instead of just "a" or "b".

I disagree: "rw" is quite obvious because you have context. It is not Mode.read | Mode.write vs "rw" but File("filename.txt", Mode.read | Mode.write); vs File("filename.txt", "rw");
Sep 07 2011
next sibling parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/07/2011 01:27 PM, Timon Gehr wrote:
 On 09/07/2011 09:30 AM, Jacob Carlborg wrote:
 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

BTW, I think that using:

Mode.read | Mode.write

instead of "rw" is the same thing as naming variables with proper descriptive names instead of just "a" or "b".

I disagree: "rw" is quite obvious because you have context. It is not Mode.read | Mode.write vs "rw" but File("filename.txt", Mode.read | Mode.write); vs File("filename.txt", "rw");

Oh, btw:

final switch(Mode.read|Mode.write){
    case Mode.read: writeln(1); break;
    case Mode.write: writeln(2); break;
}

=> 2

hm...
Sep 07 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/07/2011 01:42 PM, Jonathan M Davis wrote:
 On 09/07/2011 01:27 PM, Timon Gehr wrote:
 Oh, btw:

 final switch(Mode.read|Mode.write){
       case Mode.read: writeln(1); break;
       case Mode.write: writeln(2); break;
 }

 =>  2

 hm...


Actually, it will print nothing, not even an assertion failure; my enum definition was wrong.
 Personally, I don't think that &ing or |ing enums should result in an enum,
 and this case illustrates one reason why. But ultimately, the main issue IMHO
 is that &ing or |ing enums doesn't generally result in a valid enum value, so
 it just doesn't make sense.

Yes exactly. That is why I always use

alias int MODE;
enum:MODE{
    MODEread=1,
    MODEwrite=2,
}
Sep 07 2011
parent reply Timon Gehr <timon.gehr gmx.ch> writes:
On 09/07/2011 10:49 PM, Jonathan M Davis wrote:
 On Wednesday, September 07, 2011 14:16:55 Timon Gehr wrote:
 On 09/07/2011 01:42 PM, Jonathan M Davis wrote:
 On 09/07/2011 01:27 PM, Timon Gehr wrote:
 Oh, btw:

 final switch(Mode.read|Mode.write){

        case Mode.read: writeln(1); break;
        case Mode.write: writeln(2); break;

 }

 =>   2

 hm...


Actually, it will print nothing, not even an assertion failure; my enum definition was wrong.

Did you compile with -w? I don't remember if that affects final switch or not, but there's definitely a problem if you can get final switch to take a value that it doesn't handle without using a cast.

final switch works the same with or without warnings. Basically final switch is wrong in assuming that enumerations can only contain the declared values, because the bitwise operators work on enums.
 Personally, I don't think that &ing or |ing enums should result in an
 enum, and this case illustrates one reason why. But ultimately, the
 main issue IMHO is that &ing or |ing enums doesn't generally result in
 a valid enum value, so it just doesn't make sense.

Yes exactly. That is why I always use

alias int MODE;
enum:MODE{
    MODEread=1,
    MODEwrite=2,
}

And how is that any different from

alias int MODE;
enum MODEread = 1;
enum MODEwrite = 2;

It is not. But there is currently no nice way to express a set of orthogonal flags. Enumerations are misused for it sometimes, but as you said, that does not make sense. I sometimes have small bugs because the alias is weakly typed, though.
 They're manifest constants, not enum values. So, you're basically suggesting
 that flags be done with manifest constants as opposed to enums? That doesn't
 encapsulate as well IMHO, and I'd still object to a function having a MODE
 parameter, since that implies that a MODE is a single flag, whereas it's a
 group of flags -

I'd argue that (MODEread | MODEwrite) is a single mode resulting from the composition of the MODEread and MODEwrite modes.
 that and as far as Phobos goes, we don't generally use aliases
 like that (of course, we don't name types in all caps or start variable or
 enum value names with uppercase characters either, so what Phobos does
 obviously isn't necessarily what you stick to).

That is why imho Phobos should not use enums for file modes. They are just not a good match, because the language is so confused about what is valid on enums and what is not.
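
The confusion described here can be shown in a few lines. This is only an illustration with a hypothetical two-member Mode enum: OR-ing the members still type-checks as Mode, yet the resulting value is not one of the declared members, which is exactly the case a final switch over Mode does not list.

```d
import std.algorithm : canFind;
import std.traits : EnumMembers;

// Hypothetical flag-style enum, used only for this illustration.
enum Mode { read = 1, write = 2 }

void main()
{
    // The bitwise OR of two members is still typed as Mode...
    Mode combined = Mode.read | Mode.write;

    // ...but its value (3) is not any member the enum declares.
    assert(cast(int) combined == 3);
    assert(![EnumMembers!Mode].canFind(combined));

    // Testing individual bits works, which is how such a value has to be
    // inspected instead of switching over the declared members.
    assert((combined & Mode.read) != 0);
    assert((combined & Mode.write) != 0);
}
```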
Sep 07 2011
parent reply travert phare.normalesup.org (Christophe) writes:
 It is not. But there is currently no nice way to express a set of 
 orthogonal flags.

"r", "w", "rw" would be.

Another option is to use the power of typesafe variadic functions:

enum Mode :char { read, write }
File fOpen(string filename, Mode[]...);

auto file = fOpen("test.txt", Mode.read, Mode.write);

Isn't it much clearer than using (Mode.read | Mode.write)? Even using [Mode.read, Mode.write] explicitly sounds safer anyway. It uses more memory than bit operators do, but who cares about a few bytes when opening a whole file?

--
Christophe
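
A self-contained version of the variadic idea could look like this sketch. The names are assumptions: OpenRequest is a hypothetical stand-in for the File return type, the enum uses the default int base type and gains an append member for illustration, and the parameter is given a name (modes) so the body can use it.

```d
// Hypothetical mode enum; each value is always a single declared member.
enum Mode { read, write, append }

// Stand-in for a real file handle: it only records what was requested.
struct OpenRequest
{
    string filename;
    bool reading, writing, appending;
}

// Typesafe variadic function: callers may pass the modes one by one
// or as an explicit array.
OpenRequest fOpen(string filename, Mode[] modes...)
{
    auto req = OpenRequest(filename);
    foreach (m; modes)
    {
        // Every element is one declared member, so final switch is exhaustive.
        final switch (m)
        {
            case Mode.read:   req.reading = true;   break;
            case Mode.write:  req.writing = true;   break;
            case Mode.append: req.appending = true; break;
        }
    }
    return req;
}

void main()
{
    auto a = fOpen("test.txt", Mode.read, Mode.write);
    assert(a.reading && a.writing && !a.appending);

    // The explicit array form is accepted by the same function.
    auto b = fOpen("test.txt", [Mode.append]);
    assert(b.appending && !b.reading && !b.writing);
}
```

Because no two members are ever OR-ed together, the final switch problem raised earlier in the thread does not come up with this form.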
Sep 07 2011
next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/07/2011 11:59 PM, Christophe wrote:
 It is not. But there is currently no nice way to express a set of
 orthogonal flags.

"r", "w", "rw" would be.

At least it is short.
 Another option is to use the power of typesafe variadic functions:

 enum Mode :char { read, write }
 File fOpen(string filename, Mode[]...);

 auto file = fOpen("test.txt", Mode.read, Mode.write);

 Isn't it much clearer than using (Mode.read | Mode.write)? Even using
 [Mode.read, Mode.write] explicitly sounds safer anyway. It uses more
 memory than bit operators do, but who cares about a few bytes when
 opening a whole file?

Do you seriously prefer

auto f=fOpen("bah.txt",[Mode.read, Mode.write]);

over

auto f=fOpen("bah.txt","rw");

If that is the case, you could do

auto f=fOpen("bah.txt",encodeMode!([Mode.read, Mode.write]));

That even saves the few bytes.
Sep 07 2011
prev sibling parent Tobias Pankrath <tobias pankrath.net> writes:
Christophe wrote:

 It is not. But there is currently no nice way to express a set of
 orthogonal flags.


 "r", "w", "rw" would be.
 Another option is to use the power of typesafe variadic functions:
 
 enum Mode :char { read, write }
 File fOpen(string filename, Mode[]...);
 
 auto file = fOpen("test.txt", Mode.read, Mode.write);

I like the variadic version most. Another alternative would be to use an extra struct for flags that supports OR'ing them in a better way than plain enums do. Maybe we can get some inspiration from http://doc.qt.nokia.com/4.7/qflags.html

I'd like to add that if we ever get a good IDE for D, it won't be able to show me the possible values of the mode parameter if its type is just string. And files may not be the only part of Phobos that will need flags. In the end, there should be a solution that works even when the API and its possible values are not already known from C.
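
A rough idea of what such a flags struct could look like, loosely in the spirit of QFlags. Everything here is hypothetical (the Flags template, its has method, and the three-member Mode enum); it is a sketch of the approach, not an existing Phobos type.

```d
// Hypothetical flag enum: each member occupies its own bit.
enum Mode { read = 1, write = 2, append = 4 }

// Hypothetical wrapper: a *set* of flags gets its own type, so a function
// can take Flags!Mode and never sees an out-of-range Mode value.
struct Flags(E) if (is(E == enum))
{
    private int bits;

    // Build a set from any number of individual flags.
    this(E[] flags...)
    {
        foreach (f; flags)
            bits |= f;
    }

    // OR-ing in another flag yields a new set, not an E.
    Flags opBinary(string op : "|")(E flag) const
    {
        Flags result;
        result.bits = bits | flag;
        return result;
    }

    // True if the given flag is part of the set.
    bool has(E flag) const
    {
        return (bits & flag) == flag;
    }
}

void main()
{
    auto mode = Flags!Mode(Mode.read) | Mode.write;
    assert(mode.has(Mode.read));
    assert(mode.has(Mode.write));
    assert(!mode.has(Mode.append));

    // Several flags can also be passed to the constructor at once.
    auto all = Flags!Mode(Mode.read, Mode.write, Mode.append);
    assert(all.has(Mode.append));
}
```

Since the parameter type would then be Flags!Mode rather than string, a tool could in principle list the enum members as the possible values.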
Sep 08 2011
prev sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 07, 2011 13:32:46 Timon Gehr wrote:
 On 09/07/2011 01:27 PM, Timon Gehr wrote:
 On 09/07/2011 09:30 AM, Jacob Carlborg wrote:
 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in
 terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:
 
 fstream ifs("filename.txt", ios_base::in | ios_base::out);
 
 vs.
 
 File("filename.txt", "r+"); // or "rw"
 
 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.
 
 -Steve

BTW, I think that using: Mode.read | Mode.write Instead of "rw" is the same thing as one should name variables with a proper descriptive names instead of just "a" or "b".

I disagree: "rw" is quite obvious because you have context. It is not Mode.read | Mode.write vs "rw" but File("filename.txt", Mode.read | Mode.write); vs File("filename.txt", "rw");

Oh, btw: final switch(Mode.read|Mode.write){ case Mode.read: writeln(1); break; case Mode.write: writeln(2); break; } => 2 hm...

Personally, I don't think that &ing or |ing enums should result in an enum, and this case illustrates one reason why. But ultimately, the main issue IMHO is that &ing or |ing enums doesn't generally result in a valid enum value, so it just doesn't make sense. - Jonathan M Davis
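To make the point concrete, a small sketch. It assumes a two-member Mode enum with power-of-two values (the thread never spells the definition out), and isNamedMember is a hypothetical helper. It only shows the static type and the value of the OR-ed result, not what final switch does with it at run time.

```d
import std.traits : EnumMembers;

// Assumed definition; the thread does not give one.
enum Mode { read = 1, write = 2 }

// Hypothetical helper: true only if m equals one of the declared members.
bool isNamedMember(Mode m)
{
    foreach (member; EnumMembers!Mode)
        if (m == member)
            return true;
    return false;
}

void main()
{
    auto both = Mode.read | Mode.write;

    // OR-ing two values of the same enum type keeps the enum type...
    static assert(is(typeof(both) == Mode));

    // ...but the value 3 is not one of Mode's declared members.
    assert(cast(int) both == 3);
    assert(!isNamedMember(both));
    assert(isNamedMember(Mode.read));
}
```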
Sep 07 2011
prev sibling next sibling parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/08/2011 01:13 PM, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:30:17 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

BTW, I think that using: Mode.read | Mode.write Instead of "rw" is the same thing as one should name variables with a proper descriptive names instead of just "a" or "b".

It's not the same. "a" and "b" do not have any meaning, they are just variable names. "r" stands for read and "w" stands for write. It's pretty obvious that they do, especially in the context of opening a file. I'd equate it to using i, j, k for index variables -- they are not descriptive, but in context, everyone knows what they mean.

I totally agree.
 And in response to the discussion about enum flags not being & or |
 together, I emphatically think enums should be used for bitfields.
 Remember, enum is not just an enumeration, it's a manifest constant.

enum Enumeration{ field0, field1, } enum manifestConstant=0;
 I see no reason that we should not use the namespace-creation ability of
 enum to create such constants. I don't see the downside.

The downside is that e.g. final switch incorrectly assumes that enum values are not composable. It is IMHO a small inconsistency in the language's design.
Sep 08 2011
prev sibling next sibling parent Jacob Carlborg <doob me.com> writes:
On 2011-09-08 13:13, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:30:17 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

BTW, I think that using: Mode.read | Mode.write Instead of "rw" is the same thing as one should name variables with a proper descriptive names instead of just "a" or "b".

It's not the same. "a" and "b" do not have any meaning, they are just variable names. "r" stands for read and "w" stands for write. It's pretty obvious that they do, especially in the context of opening a file.

I guess it's a little clearer in the context of opening a file. "a" can be short for "apple" and "b" can be short for "beer". -- /Jacob Carlborg
Sep 08 2011
prev sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, September 08, 2011 07:13:48 Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:30:17 -0400, Jacob Carlborg <doob me.com> wrote:
 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in
 terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:
 
 fstream ifs("filename.txt", ios_base::in | ios_base::out);
 
 vs.
 
 File("filename.txt", "r+"); // or "rw"
 
 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.
 
 -Steve

BTW, I think that using: Mode.read | Mode.write Instead of "rw" is the same thing as one should name variables with a proper descriptive names instead of just "a" or "b".

It's not the same. "a" and "b" do not have any meaning, they are just variable names. "r" stands for read and "w" stands for write. It's pretty obvious that they do, especially in the context of opening a file. I'd equate it to using i, j, k for index variables -- they are not descriptive, but in context, everyone knows what they mean. And in response to the discussion about enum flags not being & or | together, I emphatically think enums should be used for bitfields. Remember, enum is not just an enumeration, it's a manifest constant. I see no reason that we should not use the namespace-creation ability of enum to create such constants. I don't see the downside.

I think that it makes perfect sense to use enums for flags. What I don't think makes sense is making the type of the variable which holds the flags to be that enum type unless _every_ possible combination of flags has its own flag so that &ing or |ing enums always results in a valid enum. I have no gripe with using enums for flags. It's using an enum to hold a value which is not a valid value for that enum which is the problem IMHO. - Jonathan M Davis
Sep 08 2011
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 9/8/11 2:02 PM, Jonathan M Davis wrote:
 I think that it makes perfect sense to use enums for flags. What I don't think
 makes sense is making the type of the variable which holds the flags to be that
 enum type unless _every_ possible combination of flags has its own flag so that
 &ing or |ing enums always results in a valid enum.

This ain't going to work because it would require the human user to write by hand a combinatorial number of symbols. A lightweight fixed-size set with named members is a worthy abstraction for the standard library. Andrei
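One way such a set could look, as a minimal sketch only. FlagSet and its members are hypothetical names, not an existing Phobos type, and the Flag enum is borrowed from the example used elsewhere in this thread.

```d
enum Flag { a = 1, b = 2, c = 4, d = 8 }

// Hypothetical wrapper: a fixed-size set of Flag members whose
// combinations get their own type instead of posing as a Flag.
struct FlagSet
{
    private int bits;

    this(Flag f) { bits = f; }

    // set | Flag.x yields a new set with that member added.
    FlagSet opBinary(string op)(Flag f) const if (op == "|")
    {
        FlagSet result;
        result.bits = bits | f;
        return result;
    }

    // Membership test for a single named flag.
    bool has(Flag f) const { return (bits & f) != 0; }
}

void main()
{
    auto set = FlagSet(Flag.a) | Flag.c;
    assert(set.has(Flag.a) && set.has(Flag.c));
    assert(!set.has(Flag.b));
}
```

A function that takes a FlagSet then accepts any combination, while a parameter of type Flag keeps meaning exactly one named member.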
Sep 08 2011
parent Timon Gehr <timon.gehr gmx.ch> writes:
On 09/08/2011 10:33 PM, Jonathan M Davis wrote:
 On Thursday, September 08, 2011 15:04:56 Andrei Alexandrescu wrote:
 On 9/8/11 2:02 PM, Jonathan M Davis wrote:
 I think that it makes perfect sense to use enums for flags. What I don't
 think makes sense is making the type of the variable which holds the
 flags to be that enum type unless _every_ possible combination of flags
 has its own flag so that &ing or |ing enums always results in a valid
 enum.

This ain't going to work because it would require the human user to write by hand a combinatorial number of symbols. A lightweight fixed-size set with named members is a worthy abstraction for the standard library.

I agree. I'm not arguing that the user _should_ create such a combination of flags. That would be horrible. I'm just arguing that having a set of flags with enums, e.g. enum Flag { a = 1, b = 2, c = 4, d = 8 }; and then having Flag.a | Flag.b or Flag.a & Flag.b result in a value of type Flag is not a good idea, because the result isn't a valid Flag. It should result in whatever the base type is (int in this case), and functions which take such flags &ed or |ed should take them using the base type, not the enum type. - Jonathan M Davis

+1.
Sep 08 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 08:49:22 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 12:50, Steven Schveighoffer wrote:
 On Sun, 04 Sep 2011 07:07:05 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-03 21:54, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of  
 streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed  
 me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

I think that openFile, File.open and CStream.open shouldn't take a string as the mode; it should be an enum or similar. Andrei is making a big deal out of using enums instead of bools. A bool value can contain "true" or "false", a string can contain an infinite number of different values.

openFile takes it as a template argument, and it will fail at compile time if the parameter is not correct (if not now, it will when the library is ready for inclusion).

If it validates the string at compile time then that's great.
 I agree that enum is cleaner and easier to deal with from the library's
 point of view, but we have 2 things going for us by using strings:

 1. The string formats are backwards compatible, and well defined. In
 fact, CStream.open just passes the mode string without modification to
 fopen.
 2. The brevity of and ability to comprehend a string literal vs.
 multiple enums.

 You can think of it like printf (or writef). The format string has
 infinitely wrong possible format strings, which must be rejected at run
 time. But I'll take that any day over C++'s format modifiers which are
 type checked at compile-time.

It's not very often I use the print format functions. Most of the time I use Tango and with Tango's format strings at least you don't have to specify the type.

writef is the same, %s is equivalent to calling toString(). But the format specifiers for Tango are also strings, and not compile-time verified. My point was simply, using a string to indicate flags or formatting instructions is pretty efficient, easy to write, and easy to read.
 Remember, typically, string formats are most frequently literals, and
 easy to read/write. While there is great potential for invalid
 parameters, the reality is this rarely happens, and if it does, the
 errors are seen immediately.

 -Steve

I would not say that they are easy to read, or at least understand/remember what a given mode means. I always have to double check the documentation when using these kind of modes. I always have to check if a given mode creates a new file or not.

Yeah, creating a new file is implied by a combination of modes. The one that's confusing I think is that "a" is for append, but "+" kind of tacks on appending to any other mode. It's not the most well-designed spec for file opening. Add to that you have the "b" which is a noop on most OSes. There is the possibility that we could accept an alternative open mode string, which we could design better. But we have to keep fopen's spec, it's already used everywhere. -Steve
Sep 06 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/6/11, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 There is the possibility that we could accept an alternative open mode
 string, which we could design better.  But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

Or an alternative enum instead of a string. I'm another one of those people who forgets what the various read/write modes are.
Sep 06 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 9/6/11 10:05 AM, Jacob Carlborg wrote:
 On 2011-09-06 15:02, Steven Schveighoffer wrote:
 Yeah, creating a new file is implied by a combination of modes.

 The one that's confusing I think is that "a" is for append, but "+"  
 kind
 of tacks on appending to any other mode. It's not the most  
 well-designed
 spec for file opening. Add to that you have the "b" which is a noop on
 most OSes.

 There is the possibility that we could accept an alternative open mode
 string, which we could design better. But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

Ok, I would prefer to use enums if they have sensible names. Something like this: File.open(Mode.read | Mode.write); // for both read and write

Honest, C's openmode strings have been around for so long, they hardly confuse anyone anymore. I'd rather use "rw" and call it a day.

That's not a valid fopen string ;) The plus "+" is odd, especially with "a" meaning "append". And there's that really useless "b" :) But I think this does *not* invalidate the usage of strings to denote open mode, it just needs more design. The good thing about it is, we can augment the string flags and be binary and perfectly backwards compatible. -Steve
Sep 06 2011
prev sibling next sibling parent "Marco Leise" <Marco.Leise gmx.de> writes:
Am 06.09.2011, 17:39 Uhr, schrieb Steven Schveighoffer  
<schveiguy yahoo.com>:

 On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu
 Honest, C's openmode strings have been around for so long, they hardly  
 confuse anyone anymore. I'd rather use "rw" and call it a day.

That's not a valid fopen string ;)

Sorry, but I had to laugh. There could not have been a better counter example for using fopen strings. I can live with them, but it is one of the bad designs in C that could use an alternative.

Enums         are used in: Unix, Windows, Delphi, Haskell, Lisp, C++, C#, OCaml, Go
fopen strings are used in: C, Ruby, PHP, Python, NodeJS
Java has reinvented mode strings: http://download.oracle.com/javase/1.4.2/docs/api/java/io/RandomAccessFile.html#RandomAccessFile(java.io.File, java.lang.String)
Other languages also distinguish only between a fixed set of cases, like read, write and append. I found Scala and Perl to do that.

In the end a string is just like an enum, with the enum being statically checked and the string being shorter. Every character corresponds to an 'ored' enum value. They can both be extended with flags that work with Windows and Posix, like 'create only if non-existent', or hints that may work on one system only, like exclusive access.
Sep 06 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Tue, 06 Sep 2011 13:24:34 -0400, Marco Leise <Marco.Leise gmx.de> wrote:

 Am 06.09.2011, 17:39 Uhr, schrieb Steven Schveighoffer  
 <schveiguy yahoo.com>:

 On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu
 Honest, C's openmode strings have been around for so long, they hardly  
 confuse anyone anymore. I'd rather use "rw" and call it a day.

That's not a valid fopen string ;)

Sorry, but I had to laugh. There could not have been a better counter example for using fopen strings. I can live with them, but it is one of the bad designs in C that could use an alternative.

I agree, but: 1. strings are statically checkable in D (see my openFile for an example), and 2. just because the flags were poorly chosen in C doesn't mean we must adhere to that spec. In other words, "rw" is not a valid fopen mode string, but it *could* be a valid std.stdio mode string.
 Enums         are used in: Unix, Windows, Delphi, Haskell, Lisp, C++, C#, OCaml, Go
 fopen strings are used in: C, Ruby, PHP, Python, NodeJS
 Java has reinvented mode strings:
 http://download.oracle.com/javase/1.4.2/docs/api/java/io/RandomAccessFile.html#RandomAccessFile(java.io.File, java.lang.String)
 Other languages also distinguish only between a fixed set of cases, like
 read, write and append. I found Scala and Perl to do that.

 In the end a string is just like an enum with the enum being statically
 checked and the string being shorter. Every character corresponds to an  
 'ored' enum value. They can both be extended with flags that work with  
 Windows and Posix, like 'create only if non-existent' or hints that may  
 work on one system only, like exclusive access.

I like enums in terms of writing code that processes them, but in terms of calling functions with them, I mean look at a sample fstream constructor in C++: fstream ifs("filename.txt", ios_base::in | ios_base::out); vs. File("filename.txt", "r+"); // or "rw" There's just no way you can think "rw" is less descriptive or understandable than ios_base::in | ios_base::out. -Steve
Sep 06 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 07 Sep 2011 03:27:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

But "r+" is. And that's what I assume will be used when I see a file opening function taking a string "mode" parameter. But if you say that "rw" can/will be used instead then that's better.

Yes, I'll try to add "rw" and maybe some other letter combinations that make sense in my next version. But I think we still have to support "r+", even though it's esoteric, because too much existing code already does this, and to not support it would leave silently compiling bugs. -Steve
Sep 08 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Wed, 07 Sep 2011 03:30:17 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

BTW, I think that using: Mode.read | Mode.write Instead of "rw" is the same thing as one should name variables with a proper descriptive names instead of just "a" or "b".

It's not the same. "a" and "b" do not have any meaning, they are just variable names. "r" stands for read and "w" stands for write. It's pretty obvious that they do, especially in the context of opening a file. I'd equate it to using i, j, k for index variables -- they are not descriptive, but in context, everyone knows what they mean. And in response to the discussion about enum flags not being & or | together, I emphatically think enums should be used for bitfields. Remember, enum is not just an enumeration, it's a manifest constant. I see no reason that we should not use the namespace-creation ability of enum to create such constants. I don't see the downside. -Steve
Sep 08 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 08 Sep 2011 08:20:46 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/08/2011 01:13 PM, Steven Schveighoffer wrote:

 And in response to the discussion about enum flags not being & or |
 together, I emphatically think enums should be used for bitfields.
 Remember, enum is not just an enumeration, it's a manifest constant.

enum Enumeration{ field0, field1, } enum manifestConstant=0;

There are other forms too:

enum MyConstants
{
    const1 = 5,
    const2 = 42,
}

enum flags
{
    flag1 = 0x01,
    flag2 = 0x02,
    flag3 = 0x04,
}

Those are clearly manifest constants with a namespace. The last one is a bitfield.
 I see no reason that we should not use the namespace-creation ability of
 enum to create such constants. I don't see the downside.

The downside is that e.g. final switch incorrectly assumes that enum values are not composable. It is IMHO a small inconsistency in the language's design.

So don't use final switch? Again, not all enums are enumerations, you have to judge whether final switch is applicable based on interpretation of that. -Steve
Sep 08 2011
prev sibling next sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 08 Sep 2011 09:05:35 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-08 13:04, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:27:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in  
 terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

But "r+" is. And that's what I assume will be used when I see a file opening function taking a string "mode" parameter. But if you say that "rw" can/will be used instead then that's better.

Yes, I'll try to add "rw" and maybe some other letter combinations that make sense in my next version. But I think we still have to support "r+", even though it's esoteric, because too much existing code already does this, and to not support it would leave silently compiling bugs. -Steve

Didn't you just say that you would check the string at compile time?

You can if you make it a template parameter. For example, my openFile function that I wrote does this (in fact, I needed a template mode string because the return type depends on it). The downside is you cannot pass a runtime-generated string. I cannot actually think of any use cases for that however.

In any case, the existing API does not use a template parameter, and we have to try and break as little code as possible.

I wonder if there's a way to give the option of using a template parameter or using a positional parameter without having two different symbol names. hm...

openFile!(string modedefault = "r")(string filename, string mode = modedefault) if (isValidOpenMode(modedefault))
{
    if(!isValidOpenMode(mode))
        throw new Exception("invalid file open mode: " ~ mode);
    ...
}

Would that work?

-Steve
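A self-contained sketch of that idea, under loud assumptions: isValidOpenMode here is a hypothetical validator that accepts only a handful of fopen-style modes plus "rw" (no "b" handling), and openFile returns a descriptive string as a stand-in for actually opening anything.

```d
import std.exception : assertThrown;

// Hypothetical validator: a few fopen-style modes plus the "rw" spelling
// discussed in this thread. Usable at compile time (CTFE) and at run time.
bool isValidOpenMode(string mode)
{
    switch (mode)
    {
        case "r", "w", "a", "r+", "w+", "a+", "rw":
            return true;
        default:
            return false;
    }
}

// Stand-in for the real function: returns a description, opens nothing.
string openFile(string modeDefault = "r")(string filename, string mode = modeDefault)
    if (isValidOpenMode(modeDefault))
{
    if (!isValidOpenMode(mode))
        throw new Exception("invalid file open mode: " ~ mode);
    return filename ~ " [" ~ mode ~ "]";
}

void main()
{
    assert(openFile("a.txt") == "a.txt [r]");          // default mode
    assert(openFile!"rw"("a.txt") == "a.txt [rw]");    // checked at compile time
    assert(openFile("a.txt", "w+") == "a.txt [w+]");   // checked at run time

    // A bad template argument fails the constraint, so the call does not compile.
    static assert(!__traits(compiles, openFile!"zz"("a.txt")));

    // A bad run-time argument throws instead.
    assertThrown(openFile("a.txt", "zz"));
}
```

The same symbol thus serves both spellings: literals can go through the template argument and get rejected by the compiler, while computed strings go through the positional parameter and get rejected by the exception.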
Sep 08 2011
prev sibling parent "Simen Kjaeraas" <simen.kjaras gmail.com> writes:
On Thu, 08 Sep 2011 15:17:51 +0200, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 I wonder if there's a way to give the option of using a template  
 parameter or using a positional parameter without having two different  
 symbol names.  hm...

 openFile!(string modedefault = "r")(string filename, string mode =  
 modedefault) if (isValidOpenMode(modedefault))
 {
     if(!isValidOpenMode(mode))
        throw new Exception("invalid file open mode: " ~ mode);
     ...
 }

 Would that work?

Neat! And yes, it certainly does work. I'm still unsure when someone will actually need to specify that at runtime, but maybe for scripting languages? -- Simen
Sep 08 2011
prev sibling next sibling parent Andrej Mitrovic <andrej.mitrovich gmail.com> writes:
On 9/4/11, Paulo Pinto <pjmlp progtools.org> wrote:
 Hi,

 what is an "abstract interface" ?

I'm wondering the same thing.
Sep 04 2011
prev sibling next sibling parent Kagamin <spam here.lot> writes:
Andrei Alexandrescu Wrote:

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Ddoc screwed the types, right?
Sep 05 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, September 06, 2011 19:29:13 Josh Simmons wrote:
 On Tue, Sep 6, 2011 at 7:09 PM, Jonathan M Davis <jmdavisProg gmx.com> 

 Other major languages (such as Java and C#) have large standard
 libraries and have done quite well with them. In fact, I believe that
 the large size of their standard libraries is generally seen as major
 advantage of those languages.
 
 No, we can't have everything in the standard library. No, an XML parser
 in the standard library likely won't meet everyone's needs. However,
 having a large standard library can be of great benefit to the users of
 the language even if it doesn't solve every problem that they could
 possibly have. The question isn't really whether we should add stuff
 like XML parsing to Phobos. The question is what is the best general
 implementation for a such a module and whether we can get an
 implementation of high enough quality to be able to go in the standard
 library. It's a question of time, man power, and quality.
 
 Obviously, Phobos is not going to explode in size overnight, but it _is_
 going to grow in size, and eventually it should be fairly large. We
 already have several useful additions in the review queue which will
 likely make it into Phobos in one form or another over the next few
 months.
 
 - Jonathan M Davis

Other languages like C# and Java have large enterprise outfits backing their massive standard libraries too. I just think the effort is better spent creating a solid language and encouraging third party libraries through better tools.

For the most part, the folks working on Phobos are not the same folks who work on dmd. There's some overlap, but they're definitely not the same people. So, the fact that people are working on the standard library does _nothing_ to slow the language down. If anything, it helps, because it provides a standard code base which uses (and therefore tests) the various features of the language. Third party libraries are great, but I don't see why you would ever want to discourage development of a language's standard library in favor of third party libraries. In some cases, modules in the standard library have originated in third party libraries anyway. No, Phobos is not likely to ever rival C# or Java for volume of code. But that doesn't mean that Phobos shouldn't try to be as large as it can be while still maintaining high quality. - Jonathan M Davis
Sep 06 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, September 07, 2011 14:16:55 Timon Gehr wrote:
 On 09/07/2011 01:42 PM, Jonathan M Davis wrote:
 On 09/07/2011 01:27 PM, Timon Gehr wrote:
 Oh, btw:
 
 final switch(Mode.read|Mode.write){
 
       case Mode.read: writeln(1); break;
       case Mode.write: writeln(2); break;
 
 }
 
 =>  2
 
 hm...


Actually, it will print nothing, not even an Assertion failure, my enum definition was wrong

Did you compile with -w? I don't remember if that affects final switch or not, but there's definitely a problem if you can get final switch to take a value that it doesn't handle without using a cast.
 Personally, I don't think that &ing or |ing enums should result in an
 enum, and this case illustrates one reason why. But ultimately, the
 main issue IMHO is that &ing or |ing enums doesn't generally result in
 a valid enum value, so it just doesn't make sense.

Yes exactly. That is why I always use alias int MODE; enum:MODE{ MODEread=1, MODEwrite=2, }

And how is that any different from

alias int MODE;
enum MODEread = 1;
enum MODEwrite = 2;

They're manifest constants, not enum values. So, you're basically suggesting that flags be done with manifest constants as opposed to enums? That doesn't encapsulate as well IMHO, and I'd still object to a function having a MODE parameter, since that implies that a MODE is a single flag, whereas it's a group of flags. Beyond that, as far as Phobos goes, we don't generally use aliases like that (of course, we don't name types in all caps or start variable or enum value names with uppercase characters either, so what Phobos does obviously isn't necessarily what you stick to).

- Jonathan M Davis
Sep 07 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, September 08, 2011 21:38:43 Jacob Carlborg wrote:
 On 2011-09-08 15:22, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:
 The Tango XML parser doesn't read from a file, it takes the input as a
 string. The parser isn't affected by I/O at all.

So you have to read the entire file before sending it to the parser? Isn't that a bit limited? What if I have a 50MB file, I have to read it into a continuous memory block first? -Steve

I'm just telling how Tango currently works, not how the XML module in Phobos should work. But I guess it might be somewhat limited. 50MB isn't that big to read into memory? I think it would be nice to be able to do both. If you read the whole file before sending it to the parser you would know it doesn't perform any I/O operations.

I expect that the new std.xml will work on ranges of dchar (certainly, if it doesn't it should) such that it could be used on a string that's the entire file or on a stream over the file. If it's tied to reading in the whole file first, it's a design flaw. But I don't know what the current state of the new std.xml is. I don't think that I've seen Tomek around here recently. - Jonathan M Davis
Sep 08 2011
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Thursday, September 08, 2011 15:04:56 Andrei Alexandrescu wrote:
 On 9/8/11 2:02 PM, Jonathan M Davis wrote:
 I think that it makes perfect sense to use enums for flags. What I don't
 think makes sense is making the type of the variable which holds the
 flags to be that enum type unless _every_ possible combination of flags
 has its own flag so that &ing or |ing enums always results in a valid
 enum.

This ain't going to work because it would require the human user to write by hand a combinatorial number of symbols. A lightweight fixed-size set with named members is a worthy abstraction for the standard library.

I agree. I'm not arguing that the user _should_ create such a combination of flags. That would be horrible. I'm just arguing that having a set of flags with enums, e.g. enum Flag { a = 1, b = 2, c = 4, d = 8 }; and then having Flag.a | Flag.b or Flag.a & Flag.b result in a value of type Flag is not a good idea, because the result isn't a valid Flag. It should result in whatever the base type is (int in this case), and functions which take such flags &ed or |ed should take them using the base type, not the enum type. - Jonathan M Davis
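A short sketch of the base-type approach described above. isSet is a hypothetical function name; the Flag enum is the one from the post. Note that the sketch shows how a caller and callee would use int by convention; it does not change what type the language gives to Flag.a | Flag.b.

```d
enum Flag { a = 1, b = 2, c = 4, d = 8 }

// Hypothetical function: takes the combined flags as the base type (int),
// so every OR-ed combination is a legitimate argument.
bool isSet(int flags, Flag f)
{
    return (flags & f) != 0;
}

void main()
{
    // The OR-ed value converts implicitly to the base type int.
    int flags = Flag.a | Flag.c;

    assert(flags == 5);
    assert(isSet(flags, Flag.a));
    assert(!isSet(flags, Flag.b));
}
```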
Sep 08 2011
prev sibling parent "Steven Schveighoffer" <schveiguy yahoo.com> writes:
On Thu, 08 Sep 2011 17:34:50 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/08/2011 10:33 PM, Jonathan M Davis wrote:
 On Thursday, September 08, 2011 15:04:56 Andrei Alexandrescu wrote:
 On 9/8/11 2:02 PM, Jonathan M Davis wrote:
 I think that it makes perfect sense to use enums for flags. What I  
 don't
 think makes sense is making the type of the variable which holds the
 flags to be that enum type unless _every_ possible combination of  
 flags
 has its own flag so that &ing or |ing enums always results in a valid
 enum.

This ain't going to work because it would require the human user to write by hand a combinatorial number of symbols. A lightweight fixed-size set with named members is a worthy abstraction for the standard library.

I agree. I'm not arguing that the user _should_ create such a combination of flags. That would be horrible. I'm just arguing that having a set of flags with enums, e.g. enum Flag { a = 1, b = 2, c = 4, d = 8 }; and then having Flag.a | Flag.b or Flag.a & Flag.b result in a value of type Flag is not a good idea, because the result isn't a valid Flag. It should result in whatever the base type is (int in this case), and functions which take such flags &ed or |ed should take them using the base type, not the enum type. - Jonathan M Davis

+1.

I could go either way on this. On one hand, it's nice to say "this is a bitfield, and the compiler will force you to use my enumeration constants to build it", and on the other hand, anyone who passes in integers (especially something non-hex or non-binary like 12345) is asking for code-review rejection ;) I did use an enumeration argument that included a single bit which could be or'd in the stdio overhaul. It was still verifying the enum was valid in the contract, so it just as easily could be uint (or maybe it was ubyte?). I don't suppose the type checking is all that critical. -Steve
Sep 08 2011