digitalmars.D - std.stdio overhaul by Steve Schveighoffer

Andrei Alexandrescu (12/12) Sep 03 2011 Hello,

Jose Armando Garcia (3/13) Sep 03 2011 Interesting. How does this work with RAII? Where is the source code?
dsimcha (8/20) Sep 03 2011 After a quick look, I have two concerns:

David Nadlinger (3/5) Sep 03 2011 This one could easily be solved by aliasing File.open to (static) opCall...

dsimcha (5/10) Sep 03 2011 Agreed, but in the big picture this overhaul still breaks way too much c...

Walter Bright (35/39) Sep 03 2011 [rant]

Steven Schveighoffer (43/88) Sep 03 2011 Please, leave all pitchforks and torches at rest for the moment :) I wa...

Walter Bright (21/45) Sep 03 2011 I know what I wrote was a bit brutal, but this needs to be settled befor...

Steven Schveighoffer (53/117) Sep 03 2011 I appreciate feedback, but I think there was a misunderstanding of what ...

Walter Bright (14/34) Sep 03 2011 I still use printf a lot. One reason is because it is lightweight - usin...

Jonathan M Davis (5/8) Sep 03 2011 Well, while that may be a good reason to use printf, it really doesn't a...

Michel Fortin (23/31) Sep 04 2011 That may be true, but the average D programmer will also, directly or

Andrei Alexandrescu (4/25) Sep 04 2011 No, things are more complex; the interference will be major unless

Michel Fortin (8/16) Sep 04 2011 That doesn't really help understand the issue, you're just making it

Steven Schveighoffer (3/23) Sep 04 2011 You are assuming each write flushes the buffer. That's not always the ca...

Michel Fortin (9/27) Sep 04 2011 Not exactly. I am assuming each write flushes the buffer __up to the

Andrei Alexandrescu (5/31) Sep 04 2011 It depends on the buffering mode of the stream, and also of the

Michel Fortin (13/17) Sep 04 2011 Actually my assumption wasn't too bad within its own boundaries. I was

Lars T. Kyllingstad (9/11) Sep 04 2011 What do you mean by "broken"? That it does not compile or work as

Walter Bright (16/26) Sep 05 2011 It prints out all the deprecation message. It means I'll have to go edit...

Andrej Mitrovic (8/10) Sep 05 2011 It would really help out if we had some sort of semi-automated script

Marco Leise (25/35) Sep 05 2011 It would help to have a lexical analyzer of the kind that allows for the...
bearophile (5/10) Sep 05 2011 You mean like the standard tool gofix:

dsimcha (13/17) Sep 05 2011 I agree that we've been overzealous lately in breaking code to fix small
Adam Ruppe (10/10) Sep 05 2011 Count me as another who is sick and tired of the gratuitous breaking

Daniel Murphy (9/19) Sep 06 2011 I understand this, and it's a pain to have to change code every release,...

Adam Ruppe (10/11) Sep 06 2011 Actually, ironically enough, removing it from Phobos would make

Andrei Alexandrescu (26/45) Sep 05 2011 I think it means it gives you time, on your own schedule with generous

Josh Simmons (7/57) Sep 05 2011 My question is why do you even need a standard API for XML and JSON.

bearophile (4/5) Sep 06 2011 It helps port your user code to other libs that use the same standard AP...

Josh Simmons (11/16) Sep 06 2011 This would be true if there were only implementation differences

bearophile (5/12) Sep 06 2011 Some people like wide libraries like Python (batteries included), others...

Jonathan M Davis (18/42) Sep 06 2011 Other major languages (such as Java and C#) have large standard librarie...

Jacob Carlborg (7/39) Sep 06 2011 Phobos could have a low level XML parsing module and on top of that
Marco Leise (7/12) Sep 06 2011 These languages are platforms with a complete abstraction from the

Josh Simmons (5/22) Sep 06 2011 Other languages like C# and Java have large enterprise outfits backing

Andrei Alexandrescu (10/14) Sep 06 2011 As always finding the right balance is key. Community-grown languages

dsimcha (5/7) Sep 05 2011 YES!!! I'm glad someone besides me finally realizes this. For example, ...
Walter Bright (11/22) Sep 05 2011 I agree that the XML and JSON libraries need to be scrapped and rewritte...

Jacob Carlborg (4/26) Sep 05 2011 So we have to live with these naming conventions from C forever?

Jonathan M Davis (21/48) Sep 05 2011 My take on it is that we need to figure out which pieces of Phobos need ...

Jacob Carlborg (4/52) Sep 06 2011 Yes, thank you, I agree.

Adam Ruppe (21/23) Sep 06 2011 The easiest way to do that is run the compiler. If an error occurs,

Andrei Alexandrescu (3/7) Sep 06 2011 Basic economics indicate that the difference is large.
Jacob Carlborg (4/27) Sep 06 2011 You can always keep your own local copy of a module.

Adam Ruppe (17/18) Sep 06 2011 Yeah, though that comes with its own set of pains.

Andrei Alexandrescu (14/46) Sep 06 2011 I'm not so sure. We're experiencing an unprecedented surge in

Jacob Carlborg (5/28) Sep 05 2011 We don't want to have a standard library like the one in PHP where there...

Walter Bright (2/4) Sep 06 2011 I don't think that is the reason PHP is such a bear to work with.

Jacob Carlborg (5/10) Sep 06 2011 I think that that is one reason, not the only one, not the biggest one,
Adam Ruppe (14/15) Sep 06 2011 It is one of the problems with PHP, but I'm not sure it applies
Andrei Alexandrescu (22/27) Sep 06 2011 Probably. At any rate, what I now think as a promising path is with new

Adam Ruppe (6/7) Sep 06 2011 That might be a good idea. If D modules were to get in the habit
Jacob Carlborg (5/33) Sep 06 2011 I prefer to use "old_". Depending on what XML functionality we want we

Adam Ruppe (5/6) Sep 06 2011 There's two big problems with that though:

Andrej Mitrovic (5/11) Sep 06 2011 Isn't it time we start eating our own dogfood and introduce version

Adam Ruppe (31/33) Sep 06 2011 The problem I have is old code isn't going to change itself

Andrej Mitrovic (7/11) Sep 06 2011 I mean hypothetically with a new version this could be a compile switch:

Jacob Carlborg (20/26) Sep 07 2011 What I don't like is that if there's a function/class/module that should...

Andrej Mitrovic (11/14) Sep 06 2011 I would say that's the right way to go. It's much easier to change an

Steven Schveighoffer (7/15) Sep 06 2011 I agree. I'd hate for the current std.xml to squat that name forever...
Adam Ruppe (6/8) Sep 06 2011 That's why the links on the left should always point to the newest

Andrei Alexandrescu (7/15) Sep 06 2011 Yah, I also think the documentation makes it easy to clarify which

Daniel Murphy (8/13) Sep 06 2011 I still can never remember if I'm supposed to be using std.regex or

Andrei Alexandrescu (6/21) Sep 06 2011 Yet another argument :o). I also don't quite remember right now whether

Sean Kelly (8/23) Sep 06 2011 module

Dmitry Olshansky (10/25) Sep 06 2011 Looking at the docs: std.regexp is scheduled for deprecation (in August

Jonathan M Davis (4/20) Sep 06 2011 std.regexp has been scheduled for deprecation for ages. It just hasn't h...

Jacob Carlborg (4/19) Sep 07 2011 I agree.

Sean Cavanaugh (14/36) Sep 08 2011 In the COM based land for D3D, there is just a number tacked onto the

Simen Kjaeraas (9/14) Sep 08 2011 In the case of D3D though, D3D itself has a version number. The next

Marco Leise (2/16) Sep 08 2011 That is late in the discussion, but a valid point.

Andrei Alexandrescu (3/21) Sep 08 2011 Waiting for a suggestion from the XML experts.

Alix Pexton (10/34) Sep 08 2011 I'm not really an XML expert, but I do recall that the XML Core Working

Jacob Carlborg (5/19) Sep 07 2011 Yeah, I hate that with Java interfaces, appending a number. Just because...

David Gileadi (6/8) Sep 07 2011 I've been happy to see less of this recently. Maybe it's just my

Jacob Carlborg (4/12) Sep 07 2011 Me too, that's why I'm working on a package manager.

Walter Bright (3/5) Sep 06 2011 std.xml2

Martin Nowak (7/13) Sep 06 2011 Speaking of xml2 I clearly like to see an attempt of buffered lookahead ...

Steven Schveighoffer (8/22) Sep 06 2011 This is exactly the reason for the overhaul. I'm working on it, and I

Andrei Alexandrescu (4/10) Sep 06 2011 Since the BDFL and the majority of his constituents are in favor of

Walter Bright (2/3) Sep 06 2011 Brain-Damaged Feckless Leader?

Timon Gehr (2/5) Sep 06 2011 Benevolent Dictator For Life ;)

notna (6/12) Sep 06 2011 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or

Timon Gehr (3/17) Sep 06 2011 That is about 4 times slower than the Tango XML parser:

Marco Leise (3/13) Sep 06 2011 You are so right, Timon. How deep is the trench between Phobos and Tango...

Jonathan M Davis (7/23) Sep 06 2011 A new std.xml is already in the works. It'll be range-based, unlike the ...

Steven Schveighoffer (14/40) Sep 08 2011 No, the issue is, and always will be, buffer access. C's FILE * just

Jacob Carlborg (5/25) Sep 08 2011 The Tango XML parser doesn't read from a file, it takes the input as a

Steven Schveighoffer (5/32) Sep 08 2011 So you have to read the entire file before sending it to the parser?

Jacob Carlborg (9/16) Sep 08 2011 I'm just telling how Tango currently works, not how the XML module in

Jonathan M Davis (7/26) Sep 08 2011 I expect that the new std.xml will work on ranges of dchar (certainl...
Steven Schveighoffer (16/33) Sep 08 2011 Um... yeah, it is :) I have 1 GB of memory, my system starts thrashing ...

Jacob Carlborg (9/43) Sep 09 2011 As far as I know it's because of two reasons: it doesn't allocate any

Sean Kelly (10/26) Sep 06 2011 http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs...

Marco Leise (3/26) Sep 06 2011 So in the benchmark neither white-space is collapsed, nor are entities

Sean Kelly (18/42) Sep 06 2011 http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs...

Jonathan M Davis (9/19) Sep 06 2011 Yeah. Thanks to array slicing, parsing is actually one of the areas that...

Marco Leise (8/13) Sep 06 2011 What about:

Steven Schveighoffer (4/19) Sep 06 2011 That only works/is worth it if std.xml2 is backwards compatible with

Brad Anderson (11/42) Sep 06 2011 Along these same lines I'm wondering why not simply call this new module

Steven Schveighoffer (15/39) Sep 06 2011 I think for something like std.xml which is somewhat of a standalone
Mafi (2/12) Sep 06 2011 I think this is a good idea. I think std.io sounds and feels much better...

Paul D. Anderson (3/18) Sep 06 2011 I think this is a terrific suggestion.

bearophile (5/6) Sep 06 2011 I have suggested std.io time ago, but someone doesn't like it:

Jonathan M Davis (6/11) Sep 06 2011 It's not enough of an improvement to rename std.stdio to std.io just to ...

Steven Schveighoffer (7/20) Sep 08 2011 When I get my re-revamped stdio working, it will likely involve std.io

Michel Fortin (35/38) Sep 06 2011 Apple has been deprecating things a lot in Mac OS X. Deprecated APIs

Walter Bright (3/5) Sep 06 2011 It doesn't work that well. dmd breaks with every new OS update.

Jacob Carlborg (6/11) Sep 07 2011 Maybe it would work better if you would use the proper API instead of

Walter Bright (3/14) Sep 07 2011 Actually, I did follow documented behavior of ld. Unfortunately, ld does...

Jacob Carlborg (13/30) Sep 07 2011 I don't know exactly what documentation you've read but this is what
Michel Fortin (10/26) Sep 07 2011 Indeed. Although nowhere in the documentation does it says what the

Jacob Carlborg (6/31) Sep 07 2011 From the ld man page, section "Layout":

Michel Fortin (9/10) Sep 07 2011 Indeed, I think you're right that they are better than Apple. But you

Walter Bright (5/10) Sep 07 2011 I used to know people who worked in Microsoft's "app compat" department....

dsimcha (4/19) Sep 07 2011 Yeh, the story of Raymond Chen working on a team that disassembled

Walter Bright (2/23) Sep 07 2011 I believe this was a large factor in the success of Microsoft Windows.

Michel Fortin (22/35) Sep 07 2011 Well, sometime Apple does support undocumented behaviour of previous

Andrej Mitrovic (2/2) Sep 03 2011 I dislike naming things with a leading "D" like "DInput". Shouldn't we

Steven Schveighoffer (7/9) Sep 03 2011 I think the names are not great. The names are somewhat based on the

Andrej Mitrovic (3/3) Sep 03 2011 Ah, reading your post I see this is just a start of the overhaul. I
Jacob Carlborg (4/12) Sep 04 2011 These names are a lot better.

Andrej Mitrovic (3/3) Sep 03 2011 Also, changing structs to classes is gonna *massively* break code

Marco Leise (7/10) Sep 03 2011 Wasn't this overhaul _meant_ to break existing code by offering a new AP...

Jonathan M Davis (9/21) Sep 03 2011 Any overhaul of existing functionality needs to improve on existing

Walter Bright (5/8) Sep 03 2011 The larger the amount of code that is broken, the more gain there must b...

Andrej Mitrovic (5/5) Sep 03 2011 Seems to me like virtually every module in Phobos gets a complete

dsimcha (9/14) Sep 03 2011 It's really amazing how much cruft 2-3 year old D code tends to have: W...

Walter Bright (3/5) Sep 03 2011 Those are the great kind of changes, and it's also nice in that it means...

Andrei Alexandrescu (20/25) Sep 03 2011 It's not that bad. First, it's understandable that now there are

dsimcha (11/26) Sep 03 2011 Yes, the quality standard has gone up massively. When I was prepping

Jonathan M Davis (9/38) Sep 03 2011 std.datetime is far better for having gone through multiple reviews as w...
Walter Bright (6/10) Sep 03 2011 I can vouch for Andrei's reviews appearing to be personal, but they are ...

Andrei Alexandrescu (5/18) Sep 04 2011 This is a bit of a surprise for me because I fancy (fancied...) to see

Andrei Alexandrescu (3/12) Sep 03 2011 I agree. I'm hoping the new stuff could build on top of std.stdio.

Steven Schveighoffer (15/28) Sep 03 2011 It is my plan for the eventual result to break either no code, or as

Andrei Alexandrescu (17/20) Sep 03 2011 I'm not 100% convinced of that. We can achieve a good deal of

Steven Schveighoffer (15/35) Sep 03 2011 OK, I think that's the offer on the table I keep getting :) I'm

Jonathan M Davis (3/14) Sep 03 2011 Agreed.

bearophile (4/8) Sep 05 2011 The purpose of the gofix tool in the Go language library is to lower thi...

Jonathan M Davis (4/9) Sep 03 2011 Most of it's older stuff which has been around since D1, I believe - eit...

dsimcha (3/11) Sep 03 2011 I mostly agree with what you said, except that this proposal breaks a fr...

Steven Schveighoffer (14/17) Sep 03 2011 Because it breaks runtime swapping of I/O.

Jacob Carlborg (11/28) Sep 04 2011 Tango has added a new method to Object, "dispose". The method is called

Andrei Alexandrescu (3/36) Sep 04 2011 What happens if f is aliased beyond the existence of foo()?

Jacob Carlborg (31/44) Sep 04 2011 I'm not sure if this is what you mean but:

Andrei Alexandrescu (8/41) Sep 04 2011 Well it's not bad but a bit underwhelming. Clearly it's better than the

Jacob Carlborg (6/36) Sep 04 2011 Yeah, a variable declared as "scope" shouldn't, preferably, exit its

Steven Schveighoffer (14/39) Sep 03 2011 As long as a class can contain a File as a member, this argument makes n...

Andrei Alexandrescu (5/33) Sep 03 2011 The meaning of the argument is that just because there is the

Timon Gehr (7/19) Sep 03 2011 File is now a class. This will break a lot of code.
Michel Fortin (20/35) Sep 03 2011 Looks good…

Steven Schveighoffer (28/55) Sep 03 2011 =

dsimcha (6/6) Sep 03 2011 Actually I'll generalize the comment I made before: As much as I like m...
Walter Bright (4/4) Sep 03 2011 What happens if I write:

Steven Schveighoffer (9/12) Sep 03 2011 useCStdio();

Andrei Alexandrescu (33/34) Sep 03 2011 Here are a few points following a pass through the dox:

Steven Schveighoffer (48/79) Sep 03 2011 I think you need to support all three, but they could be individual

Steven Schveighoffer (127/137) Sep 03 2011 Thank you Andrei for posting this. Before I add some more details, let ...

Steven Schveighoffer (6/6) Sep 03 2011 On Sat, 03 Sep 2011 21:58:09 -0400, Steven Schveighoffer

David Nadlinger (25/37) Sep 03 2011 I will come back with some more detailed feedback later on, but a few
Paulo Pinto (6/19) Sep 04 2011 Hi,

Andrej Mitrovic (2/4) Sep 04 2011 I'm wondering the same thing.

David Nadlinger (3/8) Sep 04 2011 A bug in ddoc. ;)

Jacob Carlborg (8/20) Sep 04 2011 I think that openFile, File.open and CStream.open shouldn't take

Steven Schveighoffer (20/47) Sep 06 2011 openFile takes it as a template argument, and it will fail at compile ti...

Jacob Carlborg (11/59) Sep 06 2011 It's not very often I use the print format functions. Most of the time I...

Steven Schveighoffer (15/79) Sep 06 2011 writef is the same, %s is equivalent to calling toString().

Andrej Mitrovic (3/7) Sep 06 2011 Or an alternative enum instead of a string. I'm another one of those
Jacob Carlborg (6/15) Sep 06 2011 Ok, I would prefer to use enums if they have sensible names. Something

Andrei Alexandrescu (4/20) Sep 06 2011 Honest, C's openmode strings have been around for so long, they hardly

Steven Schveighoffer (9/32) Sep 06 2011 That's not a valid fopen string ;)

Marco Leise (18/22) Sep 06 2011 Sorry, but I had to laugh. There could not have been a better counter

Andrei Alexandrescu (3/13) Sep 06 2011 Guess I'm destroyed.
Steven Schveighoffer (14/37) Sep 06 2011 I agree, but: 1. strings are statically checkable in D (see my openFile ...

Jacob Carlborg (6/15) Sep 07 2011 But "r+" is. And that's what I assume will be used when I see a file

Steven Schveighoffer (7/26) Sep 08 2011 Yes, I'll try to add "rw" and maybe some other letter combinations that ...

Jacob Carlborg (4/32) Sep 08 2011 Didn't you just say that you would check the string at compile time?

Steven Schveighoffer (20/55) Sep 08 2011 You can if you make it a template parameter. For example, my openFile

Simen Kjaeraas (7/18) Sep 08 2011 Neat! And yes, it certainly does work. I'm still unsure when someone

Andrei Alexandrescu (4/22) Sep 08 2011 My opinion: we're spending way too much energy on this. File I/O poses

Jacob Carlborg (4/23) Sep 08 2011 That looks nice if it works.

Timon Gehr (2/36) Sep 08 2011 That is not compatible with the auto f = File(name, mode); interface.

Jacob Carlborg (7/16) Sep 07 2011 BTW, I think that using:

Timon Gehr (7/26) Sep 07 2011 I disagree: "rw" is quite obvious because you have context. It is not

Timon Gehr (8/38) Sep 07 2011 Oh, btw:

Jonathan M Davis (6/54) Sep 07 2011 Personally, I don't think that &ing or |ing enums should result in an en...

Timon Gehr (9/25) Sep 07 2011 Actually, it will print nothing, not even an Assertion failure, my enum

Jonathan M Davis (17/46) Sep 07 2011 Did you compile with -w? I don't remember if that affects final switch o...

Timon Gehr (13/59) Sep 07 2011 final switch works the same with or without warnings. Basically final

travert phare.normalesup.org (Christophe) (12/14) Sep 07 2011 Well, you could use an array of flags ? Oh, wait, that is precisely what...

Timon Gehr (9/21) Sep 07 2011 do you seriously prefer
Tobias Pankrath (14/24) Sep 08 2011 I like the variadic version most. Another alternative

Steven Schveighoffer (12/31) Sep 08 2011 It's not the same. "a" and "b" do not have any meaning, they are just

Timon Gehr (10/43) Sep 08 2011 enum Enumeration{

Steven Schveighoffer (18/32) Sep 08 2011 There are other forms too:

Jacob Carlborg (5/31) Sep 08 2011 I guess it's a little clearer in the context of opening a file. "a" can
Jonathan M Davis (8/45) Sep 08 2011 I think that it makes perfect sense to use enums for flags. What I don't...

Andrei Alexandrescu (6/10) Sep 08 2011 This ain't going to work because it would require the human user to

Jonathan M Davis (11/23) Sep 08 2011 I agree. I'm not arguing that the user _should_ create such a combinatio...

Timon Gehr (2/25) Sep 08 2011 +1.

Steven Schveighoffer (11/49) Sep 08 2011 I could go either way on this. On one hand, it's nice to say "this is a...

Jacob Carlborg (4/14) Sep 07 2011 --

Jacob Carlborg (4/26) Sep 07 2011 I disagree.

Kagamin (2/3) Sep 05 2011 Ddoc screwed the types, right?
Jonathan M Davis (15/44) Sep 06 2011 For the most part, the folks working on Phobos are not the same folks wh...

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

Hello,


There are a number of issues related to D's current handling of streams, 
including the existence of the imperfect etc.stream and the 
over-specialization of std.stdio.

Steve has worked on an extensive overhaul of std.stdio which would 
obviate the need for etc.stream and would improve both the generality 
and efficiency of std.stdio.

Please chime in with feedback; he's away from the Usenet but allowed me 
to post this on his behalf. I uploaded the docs to

http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


Thanks,

Andrei

Sep 03 2011

Jose Armando Garcia <jsancio gmail.com> writes:

On Sat, Sep 3, 2011 at 12:54 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would obviate
 the need for etc.stream and would improve both the generality and efficiency
 of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me to
 post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Interesting. How does this work with RAII? Where is the source code?

Sep 03 2011

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 Hello,
 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.
 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.
 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html
 Thanks,
 Andrei

After a quick look, I have two concerns:

1.  File is a class, not a struct.  This precludes using reference counting as
the current std.stdio.File does, meaning you have to close all your Files
manually.  I loved the reference counting semantics, especially the last few
releases since most of the relevant compiler bugs have been fixed.

2.  File(someFileName, someMode) needs to work.  Not supporting this method of
instantiating a File object would break way too much code.
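
To make point 1 concrete, here is a minimal sketch of the deterministic cleanup that a struct gives and a garbage-collected class does not. `ScopedHandle` and `closedAfterScope` are hypothetical names for illustration only; this is not how std.stdio.File is implemented. Reference counting builds on the same struct destructor and adds a shared count.

```d
import std.stdio : writeln;

// Hypothetical stand-in for a file handle; not the std.stdio implementation.
struct ScopedHandle
{
    int* closeCount;        // incremented when the handle is "closed"

    @disable this(this);    // forbid copies so the close runs exactly once

    ~this()
    {
        if (closeCount !is null)
            ++*closeCount;
    }
}

// Opens a handle in an inner scope and reports how many closes happened.
int closedAfterScope()
{
    int closed = 0;
    {
        auto h = ScopedHandle(&closed);
        // ... use h ...
    }                       // the struct destructor runs here, deterministically
    return closed;
}

void main()
{
    assert(closedAfterScope() == 1);
    writeln("handle closed at scope exit");
}
```

With a class, the equivalent cleanup would wait for an explicit close() call or for the garbage collector, which is the concern raised above.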

Sep 03 2011

David Nadlinger <see klickverbot.at> writes:

On 9/3/11 11:20 PM, dsimcha wrote:
 2.  File(someFileName, someMode) needs to work.  Not supporting this method of
 instantiating a File object would break way too much code.

This one could easily be solved by aliasing File.open to (static) opCall().

David
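
A minimal sketch of that suggestion, assuming a class-based design. `SketchFile` is a hypothetical stand-in for the proposed File, and the static opCall forwards to open() rather than being a literal alias, so treat it as an illustration of the idea and not the proposed code.

```d
// Hypothetical class standing in for the proposed File; names are illustrative.
class SketchFile
{
    string name;
    string mode;

    private this(string name, string mode)
    {
        this.name = name;
        this.mode = mode;
    }

    static SketchFile open(string name, string mode = "r")
    {
        return new SketchFile(name, mode);
    }

    // Lets callers keep writing SketchFile(name, mode), as with the struct-based File.
    static SketchFile opCall(string name, string mode = "r")
    {
        return open(name, mode);
    }
}

void main()
{
    auto f = SketchFile("log.txt", "w");
    assert(f.name == "log.txt" && f.mode == "w");
}
```

This keeps the call syntax source-compatible, but it does not restore the value semantics or reference counting of a struct.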

Sep 03 2011

dsimcha <dsimcha yahoo.com> writes:

== Quote from David Nadlinger (see klickverbot.at)'s article
 On 9/3/11 11:20 PM, dsimcha wrote:
 2.  File(someFileName, someMode) needs to work.  Not supporting this method of
 instantiating a File object would break way too much code.

 This one could easily be solved by aliasing File.open to (static) opCall().
 David

Agreed, but in the big picture this overhaul still breaks way too much code
without either a clear migration path or a clear argument about why such
extensive breakage is necessary.  The part about File(someFileName, someMode)
is just the first thing I noticed.

Sep 03 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/3/2011 3:53 PM, dsimcha wrote:
 Agreed, but in the big picture this overhaul still breaks way too much code
 without either a clear migration path or a clear argument about why such
 extensive breakage is necessary.  The part about File(someFileName, someMode)
 is just the first thing I noticed.

[rant]

I agree. I agree that std.stream should be replaced, but I have a lot of 
misgivings about replacing std.stdio. I do not want to rewrite every darn D 
program I've ever written. I think it is a bad idea to break everyone else's D 
program.

Everything in dsource will break in non-trivial ways. I don't think we can
afford this. I do not know of any successful system or language that breaks
user code with such aplomb as D does. Not even C++ dares to break that Piece
Of S*** that everyone knows iostreams is. I can compile and run unix C code
from 30 years ago on Linux with no changes at all. Same with DOS code.

There needs to be huge improvement to justify such breakage.

[I also don't like it that all my code that uses std.path is now broken.]

I would prefer to see all the energy that is going into refactoring existing, 
working modules go into designing new, not existing, modules that there's a 
crying need for.

[/rant]

Enough ranting for now, as for the proposed std.stdio,

1. It does look fairly straightforward, but:

2. There is only one example. Have any commonly done programming tasks been 
tried out with it to see how they work?

3. There is no indication of how it interacts with C stdio. A primary goal of 
std.stdio was interoperability with C stdio.

4. There are no benchmarks. The current std.stdio was designed/written in
parallel with some benchmarks Andrei and others cooked up, as a primary goal
was performance.

5. flushCheck - flushing should be done based on the file type. tty's should be
\n flushed, files when the buffer is full. I question the performance of using
a delegate to check for flushing. How often will it be called?

6. There is no provision for multithreaded writing, i.e. what happens when two 
threads write to stdout. Ideally, there should be a way to 'lock' the stream to 
oneself, in order to appropriately interleave the output.

7. I see nothing for 'raw' character by character input.

8. I see nothing for determining if a char is available on the input. How would 
one implement "press any key to continue"?
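
For point 6, a short sketch of what "locking the stream to oneself" looks like with the existing std.stdio, using `File.lockingTextWriter`. `writeRecord` is a hypothetical helper, and nothing here says what the overhaul would offer.

```d
import std.stdio;

// Writes one "key=value" line while holding the stream lock, so output from
// other threads cannot be interleaved inside the line.
void writeRecord(File f, string key, string value)
{
    auto w = f.lockingTextWriter();   // lock is held until w goes out of scope
    w.put(key);
    w.put('=');
    w.put(value);
    w.put('\n');
}

void main()
{
    writeRecord(stdout, "status", "ok");
}
```

The lock is released when the writer is destroyed, which is the kind of auto-unlock a new design would presumably need to match.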

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 21:23:26 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 9/3/2011 3:53 PM, dsimcha wrote:
 Agreed, but in the big picture this overhaul still breaks way too much  
 code
 without either a clear migration path or a clear argument about why  
 such extensive
 breakage is necessary.  The part about File(someFileName, someMode) is  
 just the
 first thing I noticed.

 [rant]

 I agree. I agree that std.stream should be replaced, but I have a lot of  
 misgivings about replacing std.stdio. I do not want to rewrite every  
 darn D program I've ever written. I think it is a bad idea to break  
 everyone else's D program.

 Everything in dsource will break in non-trivial ways. I don't think we  
 can afford this. I do not know of any successful system or language that  
 breaks user code with such aplomb as D does. Not even C++ dares to break  
 that Piece Of S*** that everyone knows iostreams is. I can compile and  
 run unix C code from 30 years ago on Linux with no changes at all. Same  
 with DOS code.

 There needs to be huge improvement to justify such breakage.

 [I also don't like it that all my code that uses std.path is now broken.]

 I would prefer to see all the energy that is going into refactoring  
 existing, working modules go into designing new, not existing, modules  
 that there's a crying need for.

 [/rant]

Please, leave all pitchforks and torches at rest for the moment :)  I want  
to stress, this is *NOT* a proposal for inclusion or generating a pull  
request tomorrow.  It's a very very early version, almost a proof of  
concept, to show *why* we need to change things.  Most of the library is  
up for debate.  I agree it needs to be more compatible with current code.

In hindsight, I probably should have said no when Andrei asked to post  
this on the NG, and done it myself when I could stress the state of it.   
The two most important things are:

1. the interface additions, in particular the readUntil portion (which I  
think provides a very powerful interface for parsing systems).
2. the performance.  It's much better than current stdio.  Aren't people  
continuously complaining at how slow i/o is in Phobos compared to other  
libraries?

 Enough ranting for now, as for the proposed std.stdio,

 1. It does look fairly straightforward, but:

 2. There is only one example. Have any commonly done programming tasks  
 been tried out with it to see how they work?

My main testing has been for:

1. utf input/output correctness of all formats
2. implementing readf/writef
3. testing performance.

I have not written any "real world" tests.  Probably the most interesting  
tests I've written are reading a UTF-X file and writing the data to a  
UTF-Y file (where X and Y are one of UTF-8, UTF-16LE, UTF-16BE, UTF-32LE,  
UTF-32BE).
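
As an illustration of the kind of UTF-X to UTF-Y conversion described here, a small sketch that transcodes UTF-8 text to UTF-16BE bytes in memory. `toUtf16BE` is a hypothetical helper, not part of the proposed module; it relies on D's foreach transcoding and on `std.bitmanip.nativeToBigEndian`.

```d
import std.bitmanip : nativeToBigEndian;

/// Transcodes UTF-8 text into UTF-16BE bytes (no byte-order mark is written).
ubyte[] toUtf16BE(string utf8)
{
    ubyte[] result;
    // A wchar loop variable makes foreach decode UTF-8 and re-encode as UTF-16.
    foreach (wchar unit; utf8)
    {
        ubyte[2] bytes = nativeToBigEndian(cast(ushort) unit);
        result ~= bytes[];
    }
    return result;
}

void main()
{
    ubyte[] expected = [0x00, 0x41];
    assert(toUtf16BE("A") == expected);
}
```

A real test would stream through files rather than build the whole result in memory.
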
 3. There is no indication of how it interacts with C stdio. A primary  
 goal of std.stdio was interoperability with C stdio.

useCStdio();

 4. There are no benchmarks. The current std.stdio was designed/written  
 in parallel with some benchmarks Andrei and others cooked up, as a  
 primary goal was performance.

I can include these.

 5. flushCheck - flushing should be done based on the file type. tty's  
 should be \n flushed, files when the buffer is full. I question the  
 performance of using a delegate to check for flushing. How often will it  
 be called?

Once per write to the buffer.  Data is only checked once (the delegate is  
never given the same data to check again).  If you want, I can look at  
adding a means to avoid using a delegate when the trigger is a single  
character.
And TextInput/TextOutput auto-detect whether a device is a tty, and  
install the right flushCheck function if necessary.
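
A rough sketch of how a delegate-based flush check could behave as described: the delegate is handed only the freshly written data, once, and reports how much of it should trigger a flush. Every name here (`SketchBuffer`, `FlushCheck`, `throughLastNewline`) is hypothetical, and the real signature in the proposal may differ.

```d
// All names here are hypothetical; this is not the proposed std.stdio API.
alias FlushCheck = size_t delegate(const(char)[] fresh);

struct SketchBuffer
{
    char[] pending;      // buffered data that has not been flushed yet
    char[] device;       // stands in for the underlying file or tty
    FlushCheck check;    // null means "never flush automatically"

    void write(const(char)[] data)
    {
        // The delegate sees only the newly written data, exactly once.
        immutable size_t cut = (check is null) ? 0 : check(data);
        immutable size_t oldLength = pending.length;
        pending ~= data;
        if (cut > 0)
        {
            immutable size_t upTo = oldLength + cut;
            device ~= pending[0 .. upTo];
            pending = pending[upTo .. $].dup;
        }
    }
}

// Line-buffered policy: flush through the last newline in the fresh data.
size_t throughLastNewline(const(char)[] fresh)
{
    foreach_reverse (i, c; fresh)
        if (c == '\n')
            return i + 1;
    return 0;
}

void main()
{
    SketchBuffer buf;
    buf.check = delegate size_t(const(char)[] fresh) { return throughLastNewline(fresh); };
    buf.write("ab");
    assert(buf.device == "" && buf.pending == "ab");
    buf.write("c\nd");
    assert(buf.device == "abc\n" && buf.pending == "d");
}
```

In this sketch the cost is one delegate call per write, so many tiny writes mean many calls.
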

 6. There is no provision for multithreaded writing, i.e. what happens  
 when two threads write to stdout. Ideally, there should be a way to  
 'lock' the stream to oneself, in order to appropriately interleave the  
 output.

Again, I wish I had not told Andrei to post :(  Multithreaded is not  
supported, but will be.  When that is ready, a locking mechanism (and  
hopefully an auto-unlock mechanism) will be provided.

 7. I see nothing for 'raw' character by character input.

The interface is geared to read by processing the buffer, not one  
character at a time.  Given access to the buffer, you can process one  
character at a time if you want.

See InputRange in TextInput to see how raw character-by-character input  
can be done.

That being said, I think I need to add a peek function.

 8. I see nothing for determining if a char is available on the input.  
 How would one implement "press any key to continue"?

I need more information.  I would probably implement this as a  
read(ubyte[1]), so I don't see why it can't be that way.

-Steve

Sep 03 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/3/2011 7:33 PM, Steven Schveighoffer wrote:
 Please, leave all pitchforks and torches at rest for the moment :)

I know what I wrote was a bit brutal, but this needs to be settled before we've
gone so far down that path that turning away then would be horribly unfair to
you.

I think what you need is a marketing spiel to sell the concept of what you're 
trying to do. It should include:

1. The benefits over the current std.stdio
2. Why the new API is needed to achieve those benefits
3. A migration plan for existing std.stdio code

Just being more flexible isn't enough, it has to be more flexible in a way that
matters, i.e. a real example showing how kickass it is compared to the current
way.


 2. the performance. It's much better than current stdio. Aren't people
 continuously complaining at how slow i/o is in Phobos compared to other
 libraries?

Why is it faster? I.e. is a wholly new interface required to make it faster, or 
does it just need to be better under the hood?


 3. There is no indication of how it interacts with C stdio. A primary goal of
 std.stdio was interoperability with C stdio.

 useCStdio();

For some reason that just seems like a giant wart with a hair sticking out of 
it. Why not just use the C stdio buffers?


 5. flushCheck - flushing should be done based on the file type. tty's should
 be \n flushed, files when the buffer is full. I question the performance of
 using a delegate to check for flushing. How often will it be called?

 Once per write to the buffer. Data is only checked once (the delegate is never
 given the same data to check again). If you want, I can look at adding a means
 to avoid using a delegate when the trigger is a single character.
 And TextInput/TextOutput auto detect whether a device is a tty, and install the
 right flushcheck function if necessary.

Flushing once per write is wrong - consider the user who does a zillion putc's. 
I don't see a purpose to anything beyond the C stdio ones - per character, per 
\n, and per buffer.
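
As a rough illustration of those three C stdio modes, here is a minimal sketch using setvbuf. It assumes a POSIX system (fileno and fstat are not ISO C), and the helper name bytes_visible_before_flush is invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>

/* Writes `text` to a fresh temporary file using the given setvbuf mode
   (_IONBF = per character, _IOLBF = per \n, _IOFBF = per buffer) and
   reports how many bytes have reached the file before any explicit
   fflush. Returns -1 on error. */
long bytes_visible_before_flush(int mode, const char *text)
{
    struct stat st;
    long visible = -1;
    FILE *f = tmpfile();

    if (f == NULL)
        return -1;

    /* setvbuf must be called before any other operation on the stream;
       passing NULL lets the library allocate the buffer itself. */
    if (setvbuf(f, NULL, mode, BUFSIZ) == 0) {
        fputs(text, f);
        if (fstat(fileno(f), &st) == 0)
            visible = (long)st.st_size;
    }
    fclose(f);
    return visible;
}
```

With a short string such as "hi\n", one would expect nothing to be visible under _IOFBF, and the whole line to be visible under _IOLBF and _IONBF.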


 7. I see nothing for 'raw' character by character input.

 The interface is geared to read by processing the buffer, not one character at
 a time. Given access to the buffer, you can process one character at a time if
 you want.

 See InputRange in TextInput to see how raw character-by-character input can be
 done.

Raw mode is more than that - you have to set the OS to raw mode, otherwise it 
won't give you any characters until a \n is typed.


 8. I see nothing for determining if a char is available on the input. How
 would one implement "press any key to continue"?

 I need more information. I would probably implement this as a read(ubyte[1]),
 so I don't see why it can't be that way.

There's more to it than that. Try writing it in C and you'll see what I mean. 
(You have to set the io to "raw" mode, turn "echo" off, etc.)
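
To make that concrete, here is a minimal sketch of "press any key to continue" in C. It assumes a POSIX terminal with termios (Windows needs a different API), and the names make_raw_noecho and wait_for_key are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <termios.h>
#include <unistd.h>

/* Returns a copy of `cooked` with line buffering (canonical mode) and
   echo switched off, so read() returns after a single byte. */
struct termios make_raw_noecho(struct termios cooked)
{
    struct termios raw = cooked;

    raw.c_lflag &= ~(tcflag_t)(ICANON | ECHO);
    raw.c_cc[VMIN] = 1;    /* read() blocks until at least one byte */
    raw.c_cc[VTIME] = 0;   /* no inter-byte timeout */
    return raw;
}

/* Blocks until one key is pressed on terminal `fd`. Returns the byte
   read, or -1 if `fd` is not a terminal or a call failed. The original
   terminal settings are restored before returning. */
int wait_for_key(int fd)
{
    struct termios saved;
    unsigned char c;
    int result = -1;

    if (!isatty(fd) || tcgetattr(fd, &saved) != 0)
        return -1;

    struct termios raw = make_raw_noecho(saved);
    if (tcsetattr(fd, TCSANOW, &raw) != 0)
        return -1;

    if (read(fd, &c, 1) == 1)
        result = c;

    tcsetattr(fd, TCSANOW, &saved);   /* put the terminal back */
    return result;
}
```

A caller would print the prompt, flush stdout, and then call wait_for_key(STDIN_FILENO). Note that a plain one-byte read without the mode change would still wait for a newline.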

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 04 Sep 2011 00:30:33 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 9/3/2011 7:33 PM, Steven Schveighoffer wrote:
 Please, leave all pitchforks and torches at rest for the moment :)

 I know what I wrote was a bit brutal, but this needs to be settled  
 before we've gone so far down that path that turning away then would be  
 horribly unfair to you.

I appreciate feedback, but I think there was a misunderstanding of what  
this "review" was for.  I think people thought I was proposing this as a  
ready-to-pull replacement for std.stdio.  That is not the case.  It's very  
much up in the air and under development.  I just wanted to show people  
some progress and get feedback (which I've gotten a lot of!)

The next version of it will look drastically different based on what's  
been said here.  But it will still contain some of the basic designs.  In  
essence, I am very *early* in the path, and I *want* people to turn me in  
the right direction before I go too far the other way.  This is the first  
version that *actually works*, which is why I wanted to share it :)

Nobody has really commented on the new interfaces.  It's my fault, for  
letting Andrei post the documentation as the main subject, and also not  
fully documenting the module.  I have no excuses, so I'll just have to  
take this as an "OK, we'll try this again later".

But I did get some very good information, and know I have a lot of work to  
do.

 I think what you need is a marketing spiel to sell the concept of what  
 you're trying to do. It should include:

 1. The benefits over the current std.stdio
 2. Why the new API is needed to achieve those benefits
 3. A migration plan for existing std.stdio code

OK

 Just being more flexible isn't enough, it has to be more flexible in a  
 way that matters, i.e. a real example showing how kickass it is compared  
 to the current way.

I'll post some numbers.

 2. the performance. It's much better than current stdio. Aren't people
 continuously complaining at how slow i/o is in Phobos compared to other  
 libraries?

 Why is it faster? I.e. is a wholly new interface required to make it  
 faster, or does it just need to be better under the hood?

Yes, a new interface is required to make it faster.  You need direct  
buffer access, and the current stdio does not provide that.

That being said, I think this proposal goes nowhere unless it's a mostly  
drop-in replacement to the existing std.stdio.  So I have to find a way to  
make it fit.

 3. There is no indication of how it interacts with C stdio. A primary  
 goal of std.stdio was interoperability with C stdio.

 useCStdio();

 For some reason that just seems like a giant wart with a hair sticking  
 out of it. Why not just use the C stdio buffers?

1. Because most people don't care.  I never ever use printf, except when I  
was testing my new stdio stuff, and I needed something that worked :)  My  
opinion, if you are using this line, you are doing something weird, legacy  
related, or you are debugging something.
2. Because C does not provide enough access to the buffers.  With my  
library, you can read an entire xml file, for instance, and never copy any  
data out of the buffer.  C never gives direct access to the buffers, and  
while we can hack our way into it, its interface is still kludgy.  If I  
wanted to implement, for example, readUntil using C buffers, I'd have to  
reimplement almost all of FILE *'s functions so I could do it properly.   
And even then, I'd still have to sacrifice some things -- C is still going  
to want to use its way of doing things, and I'd have to respect that.
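
To show what "never copy any data out of the buffer" means in practice, here is a minimal sketch in C. It is not the proposed API: BufferWindow, Slice and this read_until are hypothetical names, and the sketch assumes the caller can see the stream's buffer directly.

```c
#include <stddef.h>
#include <string.h>

/* A view of a stream's internal buffer (hypothetical layout). */
typedef struct {
    const char *data;   /* start of the buffered bytes */
    size_t len;         /* number of bytes currently buffered */
    size_t pos;         /* number of bytes already consumed */
} BufferWindow;

/* A slice that points into the buffer; nothing is copied. */
typedef struct {
    const char *ptr;
    size_t len;
} Slice;

/* Returns the unconsumed bytes up to and including `delim` as a slice
   into the buffer and marks them consumed. If `delim` is not buffered
   yet, returns an empty slice (ptr == NULL) and consumes nothing; a
   real implementation would refill the buffer and try again. */
Slice read_until(BufferWindow *w, char delim)
{
    Slice s = { NULL, 0 };
    const char *start = w->data + w->pos;
    const char *hit = memchr(start, delim, w->len - w->pos);

    if (hit != NULL) {
        s.ptr = start;
        s.len = (size_t)(hit - start) + 1;
        w->pos += s.len;
    }
    return s;
}
```

With FILE * the same loop has to go through fgets or getc into a second, caller-owned buffer, which is the copy that direct buffer access avoids.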

If you read my response to the first post in this thread, you can see my  
rationale.

 5. flushCheck - flushing should be done based on the file type. tty's  
 should be \n flushed, files when the buffer is full. I question the  
 performance of using a delegate to check for flushing. How often will it  
 be called?

 Once per write to the buffer. Data is only checked once (the delegate  
 is never given the same data to check again). If you want, I can look at  
 adding a means to avoid using a delegate when the trigger is a single  
 character.
 And TextInput/TextOutput auto detect whether a device is a tty, and  
 install the right flushcheck function if necessary.

 Flushing once per write is wrong - consider the user who does a zillion  
 putc's. I don't see a purpose to anything beyond the C stdio ones - per  
 character, per \n, and per buffer.

A *check* to see if it should be flushed is done once per write.  Not a  
flush.  A flush is only done if the check says to (or the buffer is  
full).  I think C's FILE * checks once per write as well, no?

I also have thought of ways to optimize this so it's, say, once per call  
to writef.

 7. I see nothing for 'raw' character by character input.

 The interface is geared to read by processing the buffer, not one  
 character at a time. Given access to the buffer, you can process one  
 character at a time if you want.

 See InputRange in TextInput to see how raw character-by-character input  
 can be done.

 Raw mode is more than that - you have to set the OS to raw mode,  
 otherwise it won't give you any characters until a \n is typed.

That is not an OS issue, that is a terminal issue.

Note that the current std.stdio does not provide this functionality.  The  
only raw functions are rawRead and rawWrite, which set binary mode.  All  
binary mode does is enable or disable translation of \r\n to \n on  
Windows.  They will not do what you are asking.

 8. I see nothing for determining if a char is available on the input.  
 How would one implement "press any key to continue"?

 I need more information. I would probably implement this as a  
 read(ubyte[1]), so I don't see why it can't be that way.

 There's more to it than that. Try writing it in C and you'll see what I  
 mean. (You have to set the io to "raw" mode, turn "echo" off, etc.)

File provides access to the OS handle, which can be used to set terminal  
settings.

It might be good to add these settings as member functions of File.

-Steve

Sep 03 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/3/2011 10:09 PM, Steven Schveighoffer wrote:
 I appreciate feedback, but I think there was a misunderstanding of what this
 "review" was for. I think people thought I was proposing this as a
 ready-to-pull replacement for std.stdio. That is not the case. It's very much
 up in the air and under development. I just wanted to show people some
 progress and get feedback (which I've gotten a lot of!)

I'm glad it's early in the process.


 For some reason that just seems like a giant wart with a hair sticking out of
 it. Why not just use the C stdio buffers?

 1. Because most people don't care. I never ever use printf, except when I was
 testing my new stdio stuff, and I needed something that worked :) My opinion,
 if you are using this line, you are doing something weird, legacy related, or
 you are debugging something.

I still use printf a lot. One reason is because it is lightweight - using 
writeln blows up the size of your .obj file, making it hard to track down a 
back end bug. This is a long standing gripe I have with writeln.

D is supposed to work well with existing C code. To me, that includes working 
smoothly with C stdio.


 If you read my response to the first post in this thread, you can see my
 rationale.

I understand the desire to do away with C stdio compatibility, but it needs to 
deliver a *lot* to justify that. I also don't mind if std.stdio needs to peek 
under the hood of C stdio to get there - yes, it'll be custom for each C 
library, but the user won't see that.


 a *check* to see if it should be flushed is done once per write. Not a flush. A
 flush is only done if the check says to (or the buffer is full). I think C's
 FILE * checks once per write as well, no?

No. It checks once per char for \n, and once per buffer overflow otherwise.


 That is not an OS issue, that is a terminal issue.

It's an OS issue. The OS does the line buffering.

 Note that the current std.stdio does not provide this functionality. The only
 raw functions are rawRead and rawWrite, which set binary mode. All binary mode
 does is on windows enable or disable translation of \r\n to \n. They will not
 do what you are asking.

You're right, you have to dip under the hood to the OS protocol to do it.

Sep 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, September 03, 2011 23:49:52 Walter Bright wrote:
 I still use printf a lot. One reason is because it is lightweight - using
 writeln blows up the size of your .obj file, making it hard to track down a
 back end bug. This is a long standing gripe I have with writeln.

Well, while that may be a good reason to use printf, it really doesn't apply 
to very many D programmers. Your average D programmer really has no need to 
use printf.

- Jonathan M Davis

Sep 03 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-09-04 06:53:27 +0000, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Saturday, September 03, 2011 23:49:52 Walter Bright wrote:
 I still use printf a lot. One reason is because it is lightweight - using
 writeln blows up the size of your .obj file, making it hard to track down a
 back end bug. This is a long standing gripe I have with writeln.

 
 Well, while that may be a good reason to use printf, it really doesn't apply
 to very many D programmers. Your average D programmer really has no need to
 use printf.

That may be true, but the average D programmer will also, directly or 
indirectly, call C APIs which may use printf to write things to the 
console. I'm not sure it's much of a problem though. For one thing, C 
APIs generally don't print things on their own.

And also, I doubt using D IO by default will break printf that much: I 
mean if C IO is used to print lines, those lines will be flushed as 
they're emitted, with no possible weird interleaving unless the line is 
really too long. And if you use both D and C IO together, likely you're 
just logging things to the console line by line and not outputting 
things in a specific format where weird interleaving could cause major 
breakage. I'm making some assumptions here, so maybe I'm wrong, but I 
can't really see a use case where both IO systems would be used and 
where the fidelity of the output is that important… please correct me 
if I'm wrong.

So in my opinion the default should be to use D streams, as I don't 
expect the drawbacks to be a major inconvenience, and the performance 
gain of being able to access the buffer directly would certainly be 
welcome.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 04 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/4/11 7:10 AM, Michel Fortin wrote:
 On 2011-09-04 06:53:27 +0000, Jonathan M Davis <jmdavisProg gmx.com> said:

 On Saturday, September 03, 2011 23:49:52 Walter Bright wrote:
 I still use printf a lot. One reason is because it is lightweight -
 using
 writeln blows up the size of your .obj file, making it hard to track
 down a
 back end bug. This is a long standing gripe I have with writeln.

 Well, while that may be a good reason to use printf, it really doesn't
 apply
 to very many D programmers. Your average D programmer really has no
 need to
 use printf.

 That may be true, but the average D programmer will also, directly or
 indirectly, call C APIs which may use printf to write things to the
 console. I'm not sure it's much of a problem though. For one thing, C
 APIs generally don't print things on their own.

 And also, I doubt using D IO by default will break printf that much: I
 mean if C IO is used to print lines, those lines will be flushed as
 they're emitted, with no possible weird interleaving unless the line is
 really too long.

No, things are more complex; the interference will be major unless 
explicitly addressed.

Andrei

Sep 04 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-09-04 12:57:06 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 On 9/4/11 7:10 AM, Michel Fortin wrote:
 And also, I doubt using D IO by default will break printf that much: I
 mean if C IO is used to print lines, those lines will be flushed as
 they're emitted, with no possible weird interleaving unless the line is
 really too long.

 
 No, things are more complex; the interference will be major unless 
 explicitly addressed.

That doesn't really help understand the issue, you're just making it 
more obscure.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 04 2011

Steven Schveighoffer <schveiguy yahoo.com> writes:

Michel Fortin Wrote:

 On 2011-09-04 12:57:06 +0000, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> said:
 
 On 9/4/11 7:10 AM, Michel Fortin wrote:
 And also, I doubt using D IO by default will break printf that much: I
 mean if C IO is used to print lines, those lines will be flushed as
 they're emitted, with no possible weird interleaving unless the line is
 really too long.

 
 No, things are more complex; the interference will be major unless 
 explicitly addressed.

 
 That doesn't really help understand the issue, you're just making it 
 more obscure.
 
 -- 
 Michel Fortin
 michel.fortin michelf.com
 http://michelf.com/
 

You are assuming each write flushes the buffer. That's not always the case.

-Steve

Sep 04 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-09-04 16:08:47 +0000, Steven Schveighoffer <schveiguy yahoo.com> said:

 Michel Fortin Wrote:
 
 On 2011-09-04 12:57:06 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:
 
 On 9/4/11 7:10 AM, Michel Fortin wrote:
 And also, I doubt using D IO by default will break printf that much: I
 mean if C IO is used to print lines, those lines will be flushed as
 they're emitted, with no possible weird interleaving unless the line is
 really too long.

 
 No, things are more complex; the interference will be major unless
 explicitly addressed.

 
 That doesn't really help understand the issue, you're just making it
 more obscure.

 
 You are assuming each write flushes the buffer. That's not always the case.

Not exactly. I am assuming each write flushes the buffer __up to the 
last newline__, and that most writes end with \n in a use case where 
you'd be intermixing the IO systems. That's what I read somewhere else 
in this discussion, but maybe I read it wrong.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 04 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/4/11 3:23 PM, Michel Fortin wrote:
 On 2011-09-04 16:08:47 +0000, Steven Schveighoffer <schveiguy yahoo.com>
 said:

 Michel Fortin Wrote:

 On 2011-09-04 12:57:06 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 On 9/4/11 7:10 AM, Michel Fortin wrote:
 And also, I doubt using D IO by default will break printf that much: I
 mean if C IO is used to print lines, those lines will be flushed as
 they're emitted, with no possible weird interleaving unless the
 line is
 really too long.

 No, things are more complex; the interference will be major unless
 explicitly addressed.

 That doesn't really help understand the issue, you're just making it
 more obscure.

 You are assuming each write flushes the buffer. That's not always the
 case.

 Not exactly. I am assuming each write flushes the buffer __up to the
 last newline__, and that most writes end with \n in a use case where
 you'd be intermixing the IO systems. That's what I read somewhere else
 in this discussion, but maybe I read it wrong.

It depends on the buffering mode of the stream, and also of the 
buffering mode of whatever alternative abstraction is being used.

Sorry for being curt - I trusted Walter's earlier explanation would suffice.


Andrei

Sep 04 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-09-04 19:36:23 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 It depends on the buffering mode of the stream, and also of the 
 buffering mode of whatever alternative abstraction is being used.
 
 Sorry for being curt - I trusted Walter's earlier explanation would suffice.

Actually my assumption wasn't too bad within its own boundaries. I was 
only thinking about stdout and it being line-buffered by default. 
Looking into it a little more, I'm not sure what would happen to stdin, 
and stdout isn't always line-buffered by default anyway (if the output 
isn't a terminal, for instance).
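
The terminal-versus-file distinction can be sketched in a few lines of C. This is only an illustration of the usual convention (line-buffered on a terminal, fully buffered otherwise), assuming POSIX isatty and fileno; the function names are made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <unistd.h>

/* Picks the conventional buffering mode for a descriptor: flush on
   every \n when a person is watching a terminal, otherwise flush only
   when the buffer fills. */
int default_buffer_mode(int fd)
{
    return isatty(fd) ? _IOLBF : _IOFBF;
}

/* Applies that mode to a stream. Must be called before any other
   operation on the stream. Returns 0 on success, like setvbuf. */
int apply_default_buffering(FILE *f)
{
    return setvbuf(f, NULL, default_buffer_mode(fileno(f)), BUFSIZ);
}
```

This is why output that interleaves nicely on the console can come out in a different order once stdout is redirected to a file or a pipe.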

So I have to agree with you that mixing the two won't work well in most 
cases. Sorry for the distraction.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 04 2011

"Lars T. Kyllingstad" <public kyllingen.NOSPAMnet> writes:

On Sat, 03 Sep 2011 18:23:26 -0700, Walter Bright wrote:

 [I also don't like it that all my code that uses std.path is now
 broken.]

What do you mean by "broken"?  That it does not compile or work as 
expected, or that it spits out a bunch of annoying deprecation messages?

If it is any of the former, that was not supposed to happen.  The new 
std.path still contains all the functions of the old std.path and should 
therefore be backwards compatible.

If the new std.path breaks existing code, I need to fix it before it is 
released.  Please let me know what problems you are experiencing.

-Lars

Sep 04 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/4/2011 2:17 AM, Lars T. Kyllingstad wrote:
 On Sat, 03 Sep 2011 18:23:26 -0700, Walter Bright wrote:

 [I also don't like it that all my code that uses std.path is now
 broken.]

 What do you mean by "broken"?  That it does not compile or work as
 expected, or that it spits out a bunch of annoying deprecation messages?

 If it is any of the former, that was not supposed to happen.  The new
 std.path still contains all the functions of the old std.path and should
 therefore be backwards compatible.

 If the new std.path breaks existing code, I need to fix it before it is
 released.  Please let me know what problems you are experiencing.

It prints out all the deprecation messages. It means I'll have to go edit 
existing, working code to change the names.

I know that the majority wants the name changes. I know the deprecation system 
gives people plenty of time to edit their code.

But I think the cost of breaking existing code is much higher than many 
realize, and a lot of that cost will be hidden. It'll come in the form of 
people deciding not to use D because it is "not stable". It'll come in the 
form of invalidating existing libraries and modules unless someone is 
regularly maintaining them. It'll come in the form of invalidating the mass 
of books, articles, blog postings, and presentations about D, and those will 
never get updated. People will type in the code examples, they will fail to 
compile, and they'll get turned off about D.

I'll again note that I know of no successful operating system or programming 
language that goes around breaking existing code unless it is really, really 
urgent.

Camel-casing a name doesn't meet that standard. So, yes, I don't like it.

Sep 05 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 9/5/11, Walter Bright <newshound2 digitalmars.com> wrote:
 It prints out all the deprecation messages. It means I'll have to go edit
 existing, working code to change the names.

It would really help out if we had some sort of semi-automated script
that can do at least partial translation of code that uses old phobos
functions to new ones. Maybe this wouldn't work 100% but at least it
would help out. I'm thinking of something similar to what Python 2to3
does.

I know for sure I could use this, so far I've had to fix the
DWinSamples for every DMD/Phobos release.

Sep 05 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 06.09.2011, 00:05 Uhr, schrieb Andrej Mitrovic  
<andrej.mitrovich gmail.com>:

 On 9/5/11, Walter Bright <newshound2 digitalmars.com> wrote:
 It prints out all the deprecation messages. It means I'll have to go edit
 existing, working code to change the names.

 It would really help out if we had some sort of semi-automated script
 that can do at least partial translation of code that uses old phobos
 functions to new ones. Maybe this wouldn't work 100% but at least it
 would help out. I'm thinking of something similar to what Python 2to3
 does.

 I know for sure I could use this, so far I've had to fix the
 DWinSamples for every DMD/Phobos release.

It would help to have a lexical analyzer of the kind that allows for the  
refactorings in for example Eclipse for Java. Without clear identification  
of symbols it is impossible to write such a script for every new D  
release. And there were other changes in the past where this would have  
been handy, although I have not written any D code lately to feel the  
breakage. I think of globals (__gshared).
I'm on the extreme with my urge to rewrite things if they give me the  
slightest feeling that they could be more elegant or effective and I've  
thought of such a script as well that could be distributed with every new  
D version while there are still breaking changes to the language. Well,  
you cannot write a script without a solid foundation that can reliably  
identify and refactor symbols.
But this doesn't work well for code you copy from blogs. You would have to  
know what D version it was written with and run the matching chain of  
conversion scripts. Anyway this feels like some crazy idea that can't make  
it into existence.
Still I have that picture of downloading a new D release and running the  
obligatory dmdup script to replace deprecated functionality or names with  
the new versions. Sure at some point 'this is it' and features of D and  
Phobos become set in stone. "Hello world!" console output is one of those  
examples that many will try first. I would understand if it breaks between  
major versions of a language, but not from one revision to the next. YMMV  
:)

Sep 05 2011

bearophile <bearophileHUGS lycos.com> writes:

Andrej Mitrovic:

 It would really help out if we had some sort of semi-automated script
 that can do at least partial translation of code that uses old phobos
 functions to new ones. Maybe this wouldn't work 100% but at least it
 would help out. I'm thinking of something similar to what Python 2to3
 does.

You mean like the standard tool gofix:
http://blog.golang.org/2011/04/introducing-gofix.html

Bye,
bearophile

Sep 05 2011

dsimcha <dsimcha yahoo.com> writes:

== Quote from Walter Bright (newshound2 digitalmars.com)'s article
 On 9/4/2011 2:17 AM, Lars T. Kyllingstad wrote:
 I'll again note that I know of no successful operating system or programming
 language that goes around breaking existing code unless it is really, really
 urgent.
 Camel-casing a name doesn't meet that standard. So, yes, I don't like it.

I agree that we've been overzealous lately in breaking code to fix small
inconsistencies in style, etc.  I think in a lot of cases the answer is
permanent (or very long term, i.e. several years) soft deprecation, plus a
real soft-deprecated language feature.  This will lead to cruft accumulation
but in some cases this cruft is less bad than the cruft caused by inconsistent
naming conventions/style, etc.  To make the docs seem less crufty to people
browsing, we could even eventually remove the soft-deprecated functionality
from the DDoc documentation so that people reading it can't even see the
cruft, and move the code to the bottom of the source files so that people
don't see it unless they go looking for it.  We could also adopt a policy of
zero maintenance for features that have been soft-deprecated for long periods
of time, i.e. not even if they produce egregiously wrong results, security
holes, etc.

Sep 05 2011

Adam Ruppe <destructionator gmail.com> writes:

Count me as another who is sick and tired of the gratuitous breaking
changes every damned month.

The worst part is there's still some new stuff I actually want each
month, so I'm not doing my usual strategy of never, ever, ever updating
software.

It's just pain. Trivial changes are easy enough to fix, but are a
pain. More complex changes cost me time and money. (I'm still angry
about the removal of std.date. But soft deprecation is even worse -
I hate that so much the first thing I do when updating my dmd is to
edit the source to get that useless annoying shit out of there)

Sep 05 2011

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Adam Ruppe" <destructionator gmail.com> wrote in message 
news:j43nl0$2f85$1 digitalmars.com...
 Count me as another who is sick and tired of the gratuitous breaking
 changes every damned month.

I understand this, and it's a pain to have to change code every release, but 
I don't think phobos is _anywhere near_ ready to stop breaking.  The good 
news is that the pace of releases has slowed down so it's only every couple 
of months.

 The worst part is there's still some new stuff I actually want each
 month, so I'm not doing my usual strategy of never, ever, ever updating
 software.

 It's just pain. Trivial changes are easy enough to fix, but are a
 pain. More complex changes cost me time and money. (I'm still angry
 about the removal of std.date. But soft deprecation is even worse -
 I hate that so much the first thing I do when updating my dmd is to
 edit the source to get that useless annoying shit out of there)

How difficult is the process of moving std.date to your own code? (or any 
other phobos module)   How could this be made easier?  I don't think the 
answer is keeping these (broken) modules in phobos.

Sep 06 2011

Adam Ruppe <destructionator gmail.com> writes:

Daniel Murphy wrote:
 How could [moving a module to your own code] be made easier?

Actually, ironically enough, removing it from Phobos would make
it easier, since then the file can simply be copied into my own
tree without needing to rename it to avoid conflicts.

This wouldn't apply to a hypothetical std.xml2 though, if it was
still called std.xml. Then the old code would still need to find all
the imports and rename it.

(Renaming modules  will probably get more annoying as we go forward,
since function local imports might encourage more repetition of the
module name.)

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 09/05/2011 04:51 PM, Walter Bright wrote:
 If the new std.path breaks existing code, I need to fix it before it is
 released. Please let me know what problems you are experiencing.

 It prints out all the deprecation messages. It means I'll have to go edit
 existing, working code to change the names.

I think it means it gives you time, on your own schedule with generous 
deadlines, to make the changes to your code.

 I know that the majority wants the name changes. I know the deprecation
 system gives people plenty of time to edit their code.

 But I think the cost of breaking existing code is much higher than many
 realize, and a lot of that cost will be hidden. It'll come in the form
 of people deciding not to use D because it is "not stable". It'll come
 in the form of invalidating existing libraries and modules unless
 someone is regularly maintaining them. It'll come in the form of
 invalidating the mass of books, articles, blog postings, and
 presentations about D, and those will never get updated. People will
 type in the code examples, they will fail to compile, and they'll get
 turned off about D.

 I'll again note that I know of no successful operating system or
 programming language that goes around breaking existing code unless it
 is really, really urgent.

 Camel-casing a name doesn't meet that standard. So, yes, I don't like it.

I agree with all of the above. However, as is often the case, there's 
more than one side to the story.

Bad APIs have their costs too. We can't afford to have an XML library 
that offers few and badly packaged features and comes at the tail of all 
benchmarks. We also can't afford a JSON library that is poorly designed 
and badly written. Ironically, the costs mostly manifest the same way: 
people will decide not to use D because it "lacks good libraries" and 
"is quirky to use". In many ways a language's standard library is a 
showcase of the language, and to a newcomer an inconsistent and awkward 
standard library affects the perception of the language's quality.

Stressing that breaking code has a cost and implying that keeping it 
with flaws has no cost is as mistaken as worrying in chess about the 
flank at the expense of the center.

The reality we need to face is, we are experiencing growth pains. What 
we must do is NOT lament about breaking this or keeping that. We must:

a) devise good language features to cope with deprecation, of which 
deprecation with message is one that I think we need to embrace and 
extend (I have a few ideas I'll discuss separately);

b) supplement that with a good policy for deprecating APIs and 
introducing new ones - in particular decide where to draw the line when 
introducing a breaking change;

c) possibly create programs a la gofix that help migration.


Andrei

Sep 05 2011

Josh Simmons <simmons.44 gmail.com> writes:

On Tue, Sep 6, 2011 at 12:48 PM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 On 09/05/2011 04:51 PM, Walter Bright wrote:
 If the new std.path breaks existing code, I need to fix it before it is
 released. Please let me know what problems you are experiencing.

 It prints out all the deprecation messages. It means I'll have to go edit
 existing, working code to change the names.

 I think it means it gives you time, on your own schedule with generous
 deadlines, to make the changes to your code.

 I know that the majority wants the name changes. I know the deprecation
 system gives people plenty of time to edit their code.

 But I think the cost of breaking existing code is much higher than many
 realize, and a lot of that cost will be hidden. It'll come in the form
 of people deciding not to use D because it is "not stable". It'll come
 in the form of invalidating existing libraries and modules unless
 someone is regularly maintaining them. It'll come in the form of
 invalidating the mass of books, articles, blog postings, and
 presentations about D, and those will never get updated. People will
 type in the code examples, they will fail to compile, and they'll get
 turned off about D.

 I'll again note that I know of no successful operating system or
 programming language that goes around breaking existing code unless it
 is really, really urgent.

 Camel-casing a name doesn't meet that standard. So, yes, I don't like it.

 I agree with all of the above. However, as is often the case, there's more
 than one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library that
 offers few and badly packaged features and comes at the tail of all
 benchmarks. We also can't afford a JSON library that is poorly designed and
 badly written. Ironically, the costs mostly manifest the same way: people
 will decide not to use D because it "lacks good libraries" and "is quirky to
 use". In many ways a language's standard library is a showcase of the
 language, and to a newcomer an inconsistent and awkward standard library
 affects the perception of the language's quality.

 Stressing that breaking code has a cost and implying that keeping it with
 flaws has no cost is as mistaken as worrying in chess about the flank at the
 expense of the center.

 The reality we need to face is, we are experiencing growth pains. What we
 must do is NOT lament about breaking this or keeping that. We must:

 a) devise good language features to cope with deprecation, of which
 deprecation with message is one that I think we need to embrace and extend
 (I have a few ideas I'll discuss separately);

 b) supplement that with a good policy for deprecating APIs and introducing
 new ones - in particular decide where to draw the line when introducing a
 breaking change;

 c) possibly create programs a la gofix that help migration.


 Andrei

My question is why do you even need a standard API for XML and JSON.

Trying to support everything out of the box to a high degree of
quality and provide enough generality that it's useful for everybody
is just too much work and all you achieve is to discourage alternative
implementations better suited to specific needs.

Sep 05 2011

bearophile <bearophileHUGS lycos.com> writes:

Josh Simmons:

 My question is why do you even need a standard API for XML and JSON.

It helps port your user code to other libs that use the same standard API. This
is very useful. In D I'd like a basic standard API even for simple 2D graphics.

Bye,
bearophile

Sep 06 2011

Josh Simmons <simmons.44 gmail.com> writes:

On Tue, Sep 6, 2011 at 6:37 PM, bearophile <bearophileHUGS lycos.com> wrote:
 Josh Simmons:

 My question is why do you even need a standard API for XML and JSON.

 It helps port your user code to other libs that use the same standard API.
 This is very useful. In D I'd like a basic standard API even for simple 2D
 graphics.

 Bye,
 bearophile

This would be true if there were only implementation differences
between libraries doing roughly the same thing (in which case you'd
not need a new library anyway). Unfortunately this is not how things
work.

So simple 2d graphics ey? vector or raster based? immediate rendering
or scene graph representation? animation? fonts? textures?

XML ey? SAX, DOM, Pull, Data Binding? XPath? XSLT?

The problem with php isn't just its awesome naming, it's the fact
that anything that seemed like something somebody might use was added
as opposed to limiting itself to the must-haves.

Sep 06 2011

bearophile <bearophileHUGS lycos.com> writes:

Josh Simmons:

 So simple 2d graphics ey? vector or raster based? immediate rendering
 or scene graph representation? animation? fonts? textures?

Raster, immediate rendering, no need to specify animations, basic support for
fonts and textures. Leaving most things out is not a problem here. Some time ago
someone showed here a nice D module that works on Windows and Linux and is
short. This module doesn't replace other graphics libs or GUI modules, it's
something simple and small for small purposes.


 XML ey? SAX, DOM, Pull, Data Binding? XPath? XSLT?
 
 The problem with php isn't just its awesome naming, it's the fact
 that anything that seemed like something somebody might use was added
 as opposed to limiting itself to the must-haves.

Some people like wide libraries like Python (batteries included), others don't
like that. Both choices have their serious advantages and serious
disadvantages. There are no general rules to solve this problem, each case
needs to be discussed. I think a good JSON module needs to be in Phobos, while
for XML maybe it just needs a standard D API (this also comes from practical
size considerations: a JSON module is probably small. A good XML module will
probably become large).

Bye,
bearophile

Sep 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, September 06, 2011 18:54:45 Josh Simmons wrote:
 On Tue, Sep 6, 2011 at 6:37 PM, bearophile <bearophileHUGS lycos.com> wrote:
 Josh Simmons:
 My question is why do you even need a standard API for XML and JSON.

 
 It helps port your user code to other libs that use the same standard
 API. This is very useful. In D I'd like a basic standard API even for
 simple 2D graphics.
 
 Bye,
 bearophile

 
 This would be true if there were only implementation differences
 between libraries doing roughly the same thing (in which case you'd
 not need a new library anyway). Unfortunately this is not how things
 work.
 
 So simple 2d graphics ey? vector or raster based? immediate rendering
 or scene graph representation? animation? fonts? textures?
 
 XML ey? SAX, DOM, Pull, Data Binding? XPath? XSLT?
 
 The problem with php isn't just its awesome naming, it's the fact
 that anything that seemed like something somebody might use was added
 as opposed to limiting itself to the must-haves.


have done quite well with them. In fact, I believe that the large size of 
their standard libraries is generally seen as a major advantage of those 
languages.

No, we can't have everything in the standard library. No, an XML parser in the 
standard library likely won't meet everyone's needs. However, having a large 
standard library can be of great benefit to the users of the language even if 
it doesn't solve every problem that they could possibly have. The question 
isn't really whether we should add stuff like XML parsing to Phobos. The 
question is what is the best general implementation for such a module and 
whether we can get an implementation of high enough quality to be able to go 
in the standard library. It's a question of time, manpower, and quality.

Obviously, Phobos is not going to explode in size overnight, but it _is_ going 
to grow in size, and eventually it should be fairly large. We already have 
several useful additions in the review queue which will likely make it into 
Phobos in one form or another over the next few months.

- Jonathan M Davis

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 11:09, Jonathan M Davis wrote:
 On Tuesday, September 06, 2011 18:54:45 Josh Simmons wrote:
 This would be true if there were only implementation differences
 between libraries doing roughly the same thing (in which case you'd
 not need a new library anyway). Unfortunately this is not how things
 work.

 So simple 2d graphics ey? vector or raster based? immediate rendering
 or scene graph representation? animation? fonts? textures?

 XML ey? SAX, DOM, Pull, Data Binding? XPath? XSLT?

 The problem with php isn't just its awesome naming, it's the fact
 that anything that seemed like something somebody might use was added
 as opposed to limiting itself to the must-haves.


 have done quite well with them. In fact, I believe that the large size of
 their standard libraries is generally seen as a major advantage of those
 languages.

 No, we can't have everything in the standard library. No, an XML parser in the
 standard library likely won't meet everyone's needs. However, having a large
 standard library can be of great benefit to the users of the language even if
 it doesn't solve every problem that they could possibly have. The question
 isn't really whether we should add stuff like XML parsing to Phobos. The
 question is what is the best general implementation for such a module and
 whether we can get an implementation of high enough quality to be able to go
 in the standard library. It's a question of time, manpower, and quality.

Phobos could have a low-level XML parsing module and on top of that 
other XML APIs can be built, like SAX, DOM and so on. This is how the 
XML modules in Tango are built. Tango has a low-level XML pull parser. 
Built on top of that are a SAX API and a DOM document.
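
To make the layering concrete, here is a toy sketch of the idea. It is not Tango's API and not a proposal for Phobos; all names (TokenKind, Token, PullParser, parseSax) are made up for illustration, and the "XML" handled is only bare start tags, end tags and text, with no attributes, comments or entities.

```d
import std.string : indexOf;

// Hypothetical token kinds for a toy pull parser.
enum TokenKind { startTag, endTag, text }

struct Token
{
    TokenKind kind;
    string value;
}

// Low-level layer: an input range the caller pulls tokens from.
struct PullParser
{
    private string input;
    private Token current;
    private bool done;

    this(string input)
    {
        this.input = input;
        popFront();
    }

    @property bool empty() { return done; }
    @property Token front() { return current; }

    void popFront()
    {
        if (input.length == 0) { done = true; return; }
        if (input[0] == '<')
        {
            auto close = input.indexOf('>');
            if (close < 0)
            {
                // Unterminated tag: hand the rest back as text.
                current = Token(TokenKind.text, input);
                input = null;
                return;
            }
            auto name = input[1 .. close];
            if (name.length && name[0] == '/')
                current = Token(TokenKind.endTag, name[1 .. $]);
            else
                current = Token(TokenKind.startTag, name);
            input = input[close + 1 .. $];
        }
        else
        {
            auto open = input.indexOf('<');
            size_t end = open < 0 ? input.length : cast(size_t) open;
            current = Token(TokenKind.text, input[0 .. end]);
            input = input[end .. $];
        }
    }
}

// Higher-level layer: a SAX-style driver that pushes events to callbacks.
void parseSax(string input,
              void delegate(string) onStart,
              void delegate(string) onEnd,
              void delegate(string) onText)
{
    foreach (tok; PullParser(input))
    {
        final switch (tok.kind)
        {
            case TokenKind.startTag: onStart(tok.value); break;
            case TokenKind.endTag:   onEnd(tok.value);   break;
            case TokenKind.text:     onText(tok.value);  break;
        }
    }
}

void main()
{
    string[] events;
    parseSax("<a>hi</a>",
        (string n) { events ~= "start:" ~ n; },
        (string n) { events ~= "end:" ~ n; },
        (string t) { events ~= "text:" ~ t; });
    assert(events == ["start:a", "text:hi", "end:a"]);
}
```

A DOM builder would be a third consumer of the same range, keeping a stack of open elements instead of calling callbacks.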

 Obviously, Phobos is not going to explode in size overnight, but it _is_ going
 to grow in size, and eventually it should be fairly large. We already have
 several useful additions in the review queue which will likely make it into
 Phobos in one form or another over the next few months.

 - Jonathan M Davis


-- 
/Jacob Carlborg

Sep 06 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 06.09.2011, 11:09 Uhr, schrieb Jonathan M Davis <jmdavisProg gmx.com>:


 libraries and
 have done quite well with them. In fact, I believe that the large size of
 their standard libraries is generally seen as a major advantage of those
 languages.

These languages are platforms with a complete abstraction from the  
underlying OS and libraries. My JDK installation is 170 MB in size. I  
would prefer thin wrappers over good existing libraries like curl.  
Possibly even msxml4/libxml2. Their API differs in some points, but they  
both offer XPath and other goodies, maybe they can be made compatible with  
a wrapper.

Sep 06 2011

Josh Simmons <simmons.44 gmail.com> writes:

On Tue, Sep 6, 2011 at 7:09 PM, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 have done quite well with them. In fact, I believe that the large size of
 their standard libraries is generally seen as a major advantage of those
 languages.

 No, we can't have everything in the standard library. No, an XML parser in the
 standard library likely won't meet everyone's needs. However, having a large
 standard library can be of great benefit to the users of the language even if
 it doesn't solve every problem that they could possibly have. The question
 isn't really whether we should add stuff like XML parsing to Phobos. The
 question is what is the best general implementation for such a module and
 whether we can get an implementation of high enough quality to be able to go
 in the standard library. It's a question of time, manpower, and quality.

 Obviously, Phobos is not going to explode in size overnight, but it _is_ going
 to grow in size, and eventually it should be fairly large. We already have
 several useful additions in the review queue which will likely make it into
 Phobos in one form or another over the next few months.

 - Jonathan M Davis


their massive standard libraries too.

I just think the effort is better spent creating a solid language and
encouraging third party libraries through better tools.

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 4:29 AM, Josh Simmons wrote:

 their massive standard libraries too.

 I just think the effort is better spent creating a solid language and
 encouraging third party libraries through better tools.

As always finding the right balance is key. Community-grown languages 
such as PHP, Python, or Ruby also enjoy large libraries, so I don't 
think corporate support is a prerequisite. It would probably be a 
mistake to stop work on Phobos now.

For all I can tell I'm more annoyed by the energy spent on what I'd call 
"isometric churn" like changing names (even those used internally, sigh) 
and changing comments from /** */ to /++ +/. All that energy could go 
into adding value.


Andrei

Sep 06 2011

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 In many ways a language's standard library is a
 showcase of the language,

YES!!! I'm glad someone besides me finally realizes this.  For example, whenever
someone asks me about why D metaprogramming is so great, I just point them to a
few std lib modules that showcase this, e.g.:

Sep 05 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's more than
 one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library that
 offers few and badly packaged features and comes at the tail of all benchmarks.
 We also can't afford a JSON library that is poorly designed and badly written.
 Ironically, the costs mostly manifest the same way: people will decide not to
 use D because it "lacks good libraries" and "is quirky to use". In many ways a
 language's standard library is a showcase of the language, and to a newcomer an
 inconsistent and awkward standard library affects the perception of the
 language's quality.

I agree that the XML and JSON libraries need to be scrapped and rewritten. But 
simply changing the names of otherwise successful APIs is not worth while.


 c) possibly create programs a la gofix that help migration.

gofix cannot fix books, articles, blogs, and presentations.

Furthermore, in order to work successfully, gofix needs to be a complete D front 
end, capable of handling both the old and the new stuff. Doing a perl script 
would be a disaster. It's a substantial project, has a high risk of inadequacy, 
and I suspect our resources are better spent elsewhere.

Considering also the problems people have running dmd and getting it to find 
their imports and libraries, add in having to run 'gofix' over their source code 
first, then patch up what gofix goofed up, seems a stretch.

Sep 05 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 08:00, Walter Bright wrote:
 On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than
 one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library
 that
 offers few and badly packaged features and comes at the tail of all
 benchmarks.
 We also can't afford a JSON library that is poorly designed and badly
 written.
 Ironically, the costs mostly manifest the same way: people will decide
 not to
 use D because it "lacks good libraries" and "is quirky to use". In
 many ways a
 language's standard library is a showcase of the language, and to a
 newcomer an
 inconsistent and awkward standard library affects the perception of the
 language's quality.

 I agree that the XML and JSON libraries need to be scrapped and
 rewritten. But simply changing the names of otherwise successful APIs is
 not worth while.

So we have to live with these naming conventions from C forever?

-- 
/Jacob Carlborg

Sep 05 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, September 06, 2011 08:42:14 Jacob Carlborg wrote:
 On 2011-09-06 08:00, Walter Bright wrote:
 On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than
 one side to the story.
 
 Bad APIs have their costs too. We can't afford to have an XML library
 that
 offers few and badly packaged features and comes at the tail of all
 benchmarks.
 We also can't afford a JSON library that is poorly designed and badly
 written.
 Ironically, the costs mostly manifest the same way: people will decide
 not to
 use D because it "lacks good libraries" and "is quirky to use". In
 many ways a
 language's standard library is a showcase of the language, and to a
 newcomer an
 inconsistent and awkward standard library affects the perception of
 the
 language's quality.

 
 I agree that the XML and JSON libraries need to be scrapped and
 rewritten. But simply changing the names of otherwise successful APIs is
 not worth while.

 
 So we have to live with these naming conventions from C forever?

My take on it is that we need to figure out which pieces of Phobos need to be 
reworked or renamed and get it done as soon as possible. That way, everything 
follows the proper naming conventions (thus avoiding a mess like PHP) and is 
of an appropriately high level of quality. Then we can have an appropriately 
stable API which doesn't have to change often - if at all. I think that the 
current problem with Phobos is primarily a combination of three things:

1. Older APIs which aren't in line with how D2 and Phobos have evolved (e.g. 
they don't use ranges when they should).

2. Some older stuff didn't get a thorough enough peer review before making it 
into Phobos and is not at a high enough level of quality, so it needs to be 
revised or replaced.

3. Too much of what has been done in the past has been a hodgepodge of naming 
conventions, making it very inconsistent in some places.

Once those have been sorted out (some of which can be done without breaking 
any existing code and some of which requires breaking changes), then we can 
have a stable API for Phobos which doesn't change much except where we're 
adding new functionality which doesn't break existing code. So ultimately, we 
_will_ have a stable API, but some breaking changes are required in the short 
term to resolve issues with Phobos which would cause problems in the long run.

- Jonathan M Davis

Sep 05 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 08:56, Jonathan M Davis wrote:
 On Tuesday, September 06, 2011 08:42:14 Jacob Carlborg wrote:
 On 2011-09-06 08:00, Walter Bright wrote:
 On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than
 one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library
 that
 offers few and badly packaged features and comes at the tail of all
 benchmarks.
 We also can't afford a JSON library that is poorly designed and badly
 written.
 Ironically, the costs mostly manifest the same way: people will decide
 not to
 use D because it "lacks good libraries" and "is quirky to use". In
 many ways a
 language's standard library is a showcase of the language, and to a
 newcomer an
 inconsistent and awkward standard library affects the perception of
 the
 language's quality.

 I agree that the XML and JSON libraries need to be scrapped and
 rewritten. But simply changing the names of otherwise successful APIs is
 not worth while.

 So we have to live with these naming conventions from C forever?

 My take on it is that we need to figure out which pieces of Phobos need to be
 reworked or renamed and get it done as soon as possible. That way, everything
 follows the proper naming conventions (thus avoiding a mess like PHP) and is
 of an appropriately high level of quality. Then we can have an appropriately
 stable API which doesn't have to change often - if at all. I think that the
 current problem with Phobos is primarily a combination of three things:

 1. Older APIs which aren't in line with how D2 and Phobos have evolved (e.g.
 they don't use ranges when they should).

 2. Some older stuff didn't get a thorough enough peer review before making it
 into Phobos and is not at a high enough level of quality, so it needs to be
 revised or replaced.

 3. Too much of what has been done in the past has been a hodgepodge of naming
 conventions, making it very inconsistent in some places.

 Once those have been sorted out (some of which can be done without breaking
 any existing code and some of which requires breaking changes), then we can
 have a stable API for Phobos which doesn't change much except where we're
 adding new functionality which doesn't break existing code. So ultimately, we
 _will_ have a stable API, but some breaking changes are required in the short
 term to resolve issues with Phobos which would cause problems in the long run.

 - Jonathan M Davis

Yes, thank you, I agree.

-- 
/Jacob Carlborg

Sep 06 2011

Adam Ruppe <destructionator gmail.com> writes:

Walter Bright wrote:
 I agree that the XML and JSON libraries need to be scrapped and rewritten.

Ugh, I actually use the std.json.

 Furthermore, in order to work successfully, gofix [...]

The easiest way to do that is run the compiler. If an error occurs,
go to the given line of the problem and maybe automatically replace
with the spell checker's suggestion. Then in case that's wrong, ask
the user to confirm it.

... which is pretty much what my editor (vim) already does!


Changing names is annoying, but it's not a difficult task, with what
we have now. The compiler already does 90% of the work, and even
fairly simple editors will bring it to about 98%.

I'll bitch about it, but it isn't a big enough deal to bother with
a gofix. Trivial fixes are already trivial fixes. I'd prefer to
avoid them, but let's not forget that the compiler already does most
of the work.


The more annoying changes are where stuff changes wholesale, so
the code needs to be rethought, data needs to be changed, and
so on. These are just huge sinks of pain.

And, no, a long deprecation time doesn't change anything.
Whether I spend thousands of dollars today changing it or thousands
of dollars in six months changing it, the fact is I'm still out
thousands of dollars.

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 9:17 AM, Adam Ruppe wrote:
 And, no, a long deprecation time doesn't change anything.
 Whether I spend thousands of dollars today changing it or thousands
 of dollars in six months changing it, the fact is I'm still out
 thousands of dollars.

Basic economics indicates that the difference is large.

Andrei

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 16:17, Adam Ruppe wrote:
 Walter Bright wrote:
 I agree that the XML and JSON libraries need to be scrapped and rewritten.

 Ugh, I actually use the std.json.

 Furthermore, in order to work successfully, gofix [...]

 The easiest way to do that is run the compiler. If an error occurs,
 go to the given line of the problem and maybe automatically replace
 with the spell checker's suggestion. Then in case that's wrong, ask
 the user to confirm it.

 ... which is pretty much what my editor (vim) already does!


 Changing names is annoying, but it's not a difficult task, with what
 we have now. The compiler already does 90% of the work, and even
 fairly simple editors will bring it to about 98%.

 I'll bitch about it, but it isn't a big enough deal to bother with
 a gofix. Trivial fixes are already trivial fixes. I'd prefer to
 avoid them, but let's not forget that the compiler already does most
 the work.


 The more annoying changes are where stuff changes wholesale, so
 the code needs to be rethought, data needs to be changed, and
 so on. These are just huge sinks of pain.

 And, no, a long deprecation time doesn't change anything.
 Whether I spend thousands of dollars today changing it or thousands
 of dollars in six months changing it, the fact is I'm still out
 thousands of dollars.

You can always keep your own local copy of a module.

-- 
/Jacob Carlborg

Sep 06 2011

Adam Ruppe <destructionator gmail.com> writes:

Jacob Carlborg wrote:
 You can always keep your own local copy of a module.

Yeah, though that comes with its own set of pains.

But, let me ask you this. Which is better?

1) Ask an unknown number of people to change their code to keep up
with your changes and/or distribute the old module

or

2) Ask just one person - who is making changes anyway - to make one
more small change and distribute the old and new modules.



One counterpoint to this would be "why make people download old
modules they don't need in the zip?" There are two answers to that,
too:

a) That's a trivial cost. The average Phobos module is about 3 or 4
kilobytes once zipped up. When you're grabbing a > 10 MB zip, three
kilobytes is nothing to worry about.

b) If it is a big deal, changing to a download on demand system (like
the DIP) can avoid this... especially if the old version is still
around and easily accessible by name.

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 1:00 AM, Walter Bright wrote:
 On 9/5/2011 7:48 PM, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than
 one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library
 that
 offers few and badly packaged features and comes at the tail of all
 benchmarks.
 We also can't afford a JSON library that is poorly designed and badly
 written.
 Ironically, the costs mostly manifest the same way: people will decide
 not to
 use D because it "lacks good libraries" and "is quirky to use". In
 many ways a
 language's standard library is a showcase of the language, and to a
 newcomer an
 inconsistent and awkward standard library affects the perception of the
 language's quality.

 I agree that the XML and JSON libraries need to be scrapped and
 rewritten. But simply changing the names of otherwise successful APIs is
 not worth while.

I agree we should be increasingly hawkish about such changes.

 c) possibly create programs a la gofix that help migration.

 gofix cannot fix books, articles, blogs, and presentations.

 Furthermore, in order to work successfully, gofix needs to be a complete
 D front end, capable of handling both the old and the new stuff. Doing a
 perl script would be a disaster. It's a substantial project, has a high
 risk of inadequacy, and I suspect our resources are better spent elsewhere.

I'm not so sure. We're experiencing an unprecedented surge in 
participation in all aspects of the D programming language, and I 
believe you and I should start thinking differently. A certain change of 
phase happened to me a couple of months ago, when I commented about some 
fix that removed an undue limitation: "sounds good, but Walter has many 
other things on his plate more important than this". Within hours, the 
fix was available as a pull request.

In this case, if one or more persons is/are determined enough to create 
dfix, it will happen regardless of whether you or I believe it's the 
optimal resource allocation.

 Considering also the problems people have running dmd and getting it to
 find their imports and libraries, add in having to run 'gofix' over
 their source code first, then patch up what gofix goofed up, seems a
 stretch.

I do agree dfix must work at least as well as Apple hardware :o).


Andrei

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 04:48, Andrei Alexandrescu wrote:
 I agree with all of the above. However, as is often the case, there's
 more than one side to the story.

 Bad APIs have their costs too. We can't afford to have an XML library
 that offers few and badly packaged features and comes at the tail of all
 benchmarks. We also can't afford a JSON library that is poorly designed
 and badly written. Ironically, the costs mostly manifest the same way:
 people will decide not to use D because it "lacks good libraries" and
 "is quirky to use". In many ways a language's standard library is a
 showcase of the language, and to a newcomer an inconsistent and awkward
 standard library affects the perception of the language's quality.

 Stressing that breaking code has a cost and implying that keeping it
 with flaws has no cost is as mistaken as worrying in chess about the
 flank at the expense of the center.

 The reality we need to face is, we are experiencing growth pains. What
 we must do is NOT lament about breaking this or keeping that. We must:

 a) devise good language features to cope with deprecation, of which
 deprecation with message is one that I think we need to embrace and
 extend (I have a few ideas I'll discuss separately);

 b) supplement that with a good policy for deprecating APIs and
 introducing new ones - in particular decide where to draw the line when
 introducing a breaking change;

 c) possibly create programs a la gofix that help migration.


 Andrei

We don't want to have a standard library like the one in PHP where there 
seem to be no naming conventions at all.

-- 
/Jacob Carlborg

Sep 05 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/5/2011 11:39 PM, Jacob Carlborg wrote:
 We don't want to have a standard library like the one in PHP where there seems
 to be no naming conventions at all.

I don't think that is the reason PHP is such a bear to work with.

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 09:35, Walter Bright wrote:
 On 9/5/2011 11:39 PM, Jacob Carlborg wrote:
 We don't want to have a standard library like the one in PHP where
 there seems
 to be no naming conventions at all.

 I don't think that is the reason PHP is such a bear to work with.

I think that that is one reason, not the only one, not the biggest one, 
but one reason.

-- 
/Jacob Carlborg

Sep 06 2011

Adam Ruppe <destructionator gmail.com> writes:

Walter Bright wrote:
 I don't think that is the reason PHP is such a bear to work with.

It is one of the problems with PHP, but I'm not sure it applies
to D the same way.

Almost *every time* I write PHP, I either mess up a name or the
argument order. (Sometimes, PHP functions go src, dest, and sometimes
it's dest, src. Ugh! The worst part is it doesn't even catch this.
A name mismatch throws an error when it's run. Reversed arguments just
silently do the wrong thing.)

The random argument order is a huge huge pain.


Contrast with D, where I've almost never forgotten a name. Granted,
it might be due to my auto-complete function so I type once and
only once, but it just hasn't been a big deal.

Thanks to the UFCS with arrays, the argument order is almost always
the same to work with that, so the much bigger problem never occurs.
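
A small illustration of that point, using a few Phobos functions as they exist today (std.algorithm's startsWith and canFind, std.string's indexOf). Because the thing being operated on comes first, each call can be written as a method on it, so there is no src/dest order to remember:

```d
import std.algorithm : canFind, startsWith;
import std.string : indexOf;

void main()
{
    string s = "hello world";

    // Function-call form: the subject is always the first argument.
    assert(startsWith(s, "hello"));

    // UFCS form: the same calls, written as if they were methods.
    assert(s.startsWith("hello"));
    assert(s.indexOf("world") == 6);
    assert([1, 2, 3].canFind(2));
}
```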

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 2:35 AM, Walter Bright wrote:
 On 9/5/2011 11:39 PM, Jacob Carlborg wrote:
 We don't want to have a standard library like the one in PHP where
 there seems
 to be no naming conventions at all.

 I don't think that is the reason PHP is such a bear to work with.

Probably. At any rate, what I now think of as a promising path is with new 
module names. Let's leave the likes of std.xml and std.json in peace, 
then pick a naming convention for the new ones and create whole new 
modules replacing them. Then people who are ready for the migration change

import std.xml;

with

import std.some_naming_convention_involving_xml;

and fix whatever code breakages that entails. If they're pleased with 
std.xml, nobody's holding a gun to their head.
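
One way to keep that import change to a single line is D's renamed import, sketched here with std.json as it exists today. The replacement module name in the comment is made up, and the sketch assumes the new module would keep compatible function names, which a real redesign might not:

```d
// Renamed import: every call site goes through the local name `json`.
// Migrating would mean editing only this line, for example to
// `import json = std.some_new_json;` (a hypothetical module name).
import json = std.json;

void main()
{
    auto doc = json.parseJSON(`{"name": "phobos", "modules": 3}`);
    assert(doc["name"].str == "phobos");
    assert(doc["modules"].integer == 3);
}
```

Where the new API differs, the compiler errors at the json.* call sites show exactly what is left to fix.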

Months and years go by, and nobody uses std.xml because the new module 
and the migration path are copiously advertised in the documentation. At 
that point we can discuss excising std.xml altogether and replacing it 
with the new one. And so the new becomes old, just like in dialectics.

There's a successful precedent in C++ - stringstream vs. strstream. The 
only missing thing is that C++ did not choose a naming convention 
because they limited themselves to only one header.

So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at 
Facebook to good effect. Or should we, au contraire, use "old_" for the 
old module and advise people who want to stick with the old modules to 
change their imports?


Andrei

Sep 06 2011

Adam Ruppe <destructionator gmail.com> writes:

Andrei Alexandrescu wrote:
 What should we use? xml2?

That might be a good idea. If D modules were to get in the habit
of writing their major version numbers as part of the name, it'd solve
this as well as the dget automatic library downloading thingy in one go.

Going with new and old won't work if there should ever be a version 3
written.

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 16:51, Andrei Alexandrescu wrote:
 On 9/6/11 2:35 AM, Walter Bright wrote:
 On 9/5/2011 11:39 PM, Jacob Carlborg wrote:
 We don't want to have a standard library like the one in PHP where
 there seems
 to be no naming conventions at all.

 I don't think that is the reason PHP is such a bear to work with.

 Probably. At any rate, what I now see as a promising path is new
 module names. Let's leave the likes of std.xml and std.json in peace,
 then pick a naming convention for the new ones and create whole new
 modules replacing them. Then people who are ready for the migration replace

 import std.xml;

 with

 import std.some_naming_convention_involving_xml;

 and fix whatever code breakages that entails. If they're pleased with
 std.xml, nobody's holding a gun to their head.

 Months and years go by, and nobody uses std.xml because the new module
 and the migration path are copiously advertised in the documentation. At
 that point we can discuss excising std.xml altogether and replacing it
 with the new one. And so the new becomes old, just like in dialectics.

 There's a successful precedent in C++ - stringstream vs. strstream. The
 only missing thing is that C++ did not choose a naming convention
 because they limited themselves to only one header.

 So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at
 Facebook to good effect. Or should we, au contraire, use "old_" for the
 old module and advise people who want to stick with the old modules to
 change their imports?


 Andrei

I prefer to use "old_". Depending on what XML functionality we want, we
may want to have an xml package.

-- 
/Jacob Carlborg

Sep 06 2011

Adam Ruppe <destructionator gmail.com> writes:

Jacob Carlborg wrote:
 I prefer to use "old_".

There's two big problems with that though:

1) It still breaks the old code. It's an even easier fix, so this isn't
too bad, but it is still broken.

2) What if a third version of a module comes along?

Sep 06 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 9/6/11, Adam Ruppe <destructionator gmail.com> wrote:
 Jacob Carlborg wrote:
 I prefer to use "old_".

 There's two big problems with that though:

 1) It still breaks the old code. It's an even easier fix, so this isn't
 too bad, but it is still broken.

 2) What if a third version of a module comes along?

Isn't it time we start eating our own dogfood and introduce version
statements into Phobos to select deprecated functionality? Naming
modules "xml.old1", "xml.old2" reminds me of using zip files as a poor
man's version control system.
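
To make the version-statement idea concrete, here is a minimal sketch of how a build switch could select deprecated functionality. The identifier `UseOldXml` and both `parseDocument` functions are made up for illustration; nothing like them exists in Phobos.

```d
import std.stdio : writeln;

// Hypothetical switch: building with `dmd -version=UseOldXml app.d` keeps
// the deprecated implementation; without it the new one is selected.
version (UseOldXml)
{
    string parseDocument(string text)
    {
        return "old parser saw " ~ text;
    }
}
else
{
    string parseDocument(string text)
    {
        return "new parser saw " ~ text;
    }
}

void main()
{
    writeln(parseDocument("<a/>"));
}
```

Note that in this arrangement it is the old code that has to opt in, by adding the switch to its build.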

Sep 06 2011

Adam Ruppe <destructionator gmail.com> writes:

Andrej Mitrovic wrote:
 select deprecated functionality

The problem I have is old code isn't going to change itself
to select old functions.

New code, on the other hand, can decide to use new functions
since someone is actively writing it.

Therefore, it's less painful to opt in to using new code than
to select to use old code.

 Naming modules "xml.old1" [...]

I'd do "xml" and "xml2" rather than "xml.old" since the name is
already "xml", and in my view, that's immutable now.

This might be a poor man's version control system, but is that
bad? It's not uncommon for software (or books or movies...)
to have major versions (sequel numbers) in the name.

Thanks to D's module namespacing, that name is the only place it'd
be too; the code that uses it still looks natural. It's not like
they'd have to write parseXML2() all over the place, as is
somewhat common in C.



Would people find it weird that versions are in the name? Maybe,
but again, that's common in a lot of places. Just make sure the
Phobos docs point to the newest version by default in the left
nav panel so people don't have to hunt for what's newest.


Would this naming scheme be a hassle in the source control? I don't
think so.

1) If it's a rewrite, it's a different file anyway, even if you gave
it the same name; it's not like a patch could apply to both versions.

2) If it's a minor fork, you could surely just apply patches to
two branches in git, right? (I don't really know how it works, but
I can't imagine it'd be harder than any other branch which I hear
git makes easy.)

3) If it's backward compatible, no need to change the number.


My only regret is we didn't have the foresight to call it "std.xml1"
in the first place.

Sep 06 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 9/6/11, Adam Ruppe <destructionator gmail.com> wrote:
 Andrej Mitrovic wrote:
 select deprecated functionality

 The problem I have is old code isn't going to change itself
 to select old functions.

I mean hypothetically with a new version this could be a compile switch:




I'm not exactly sure if it could work like that.. but anyway if it's
just xml then we don't have to introduce a whole new way of using
deprecated code I guess.

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 17:37, Adam Ruppe wrote:
 Jacob Carlborg wrote:
 I prefer to use "old_".

 There's two big problems with that though:

 1) It still breaks the old code. It's an even easier fix, so this isn't
 too bad, but it is still broken.

 2) What if a third version of a module comes along?

What I don't like is that if there's a function/class/module that should
be deprecated but has a good, proper name, we can't use that name for a
new implementation.

Another problem I see is that DMD, D, druntime and Phobos are released 
in one piece. You should be able to take an arbitrary version of Phobos 
and use it with the compiler and the language, just as you can with 
other libraries. Then there could be a better version scheme and you can 
stay on a given version if you really have to.

1.0.0
Major.minor.build

Increment major when introducing API breaking changes, e.g. removing a
method. Increment minor when introducing non-breaking API changes, e.g.
adding a new method. Increment build when changing implementation
details, e.g. changing an internal data structure from an array to a
linked list.

With this version scheme you would know that as long as you stay on 
1.x.y your code won't break.
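
The rule can be sketched as a small compatibility check. This is only an illustration of the Major.minor.build scheme described above; `LibVersion` and `isCompatible` are hypothetical names, not part of any existing tool.

```d
import std.stdio : writeln;

/// A Major.minor.build triple.
struct LibVersion
{
    uint major;
    uint minor;
    uint build;
}

/// True when code written against `wanted` should keep working with
/// `installed` under the scheme above.
bool isCompatible(LibVersion wanted, LibVersion installed)
{
    if (installed.major != wanted.major)
        return false;                          // API-breaking change in between
    if (installed.minor != wanted.minor)
        return installed.minor > wanted.minor; // a newer minor only adds API
    return true;                               // build changes are internal only
}

void main()
{
    auto wanted = LibVersion(1, 2, 0);
    writeln(isCompatible(wanted, LibVersion(1, 4, 7))); // newer minor
    writeln(isCompatible(wanted, LibVersion(2, 0, 0))); // new major
}
```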

-- 
/Jacob Carlborg

Sep 07 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 9/6/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Or should we, au contraire, use "old_" for the
 old module and advise people who want to stick with the old modules to
 change their imports?

I would say that's the right way to go. It's much easier to change an
import than change code. Perhaps another alternative is to use version
statements. DFL uses it for deprecated features that are still in the
codebase and usable.

We don't want to punish people for using newer modules, we should
encourage it. If they're forced to import "std.xml_new", they'll
eventually have to change those imports to "std.xml" down the road
when the older std.xml gets replaced by the new one. I assume people
will just pick the first thing that they see, "std.xml" looks standard
so they would pick that over "std.xml2".

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 11:53:09 -0400, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 On 9/6/11, Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> wrote:
 Or should we, au contraire, use "old_" for the
 old module and advise people who want to stick with the old modules to
 change their imports?

 I would say that's the right way to go. It's much easier to change an
 import than change code. Perhaps another alternative is to use version
 statements. DFL uses it for deprecated features that are still in the
 codebase and usable.

I agree.  I'd hate for the current std.xml to squat that name forever...

In a related story, why did digitalmars.com change D to mean D2 instead of  
just using new_D? :)

Also new coke.

-Steve

Sep 06 2011

Adam Ruppe <destructionator gmail.com> writes:

Andrej Mitrovic:
 I assume people will just pick the first thing they see

That's why the links on the left should always point to the newest
version, and there might be notes in the docs pointing people to
newer and older versions.

 "std.xml" looks standard so they would pick that over "std.xml2"

Maybe, but if we're just consistent with a scheme, I think it'd be
easy enough to learn. It's not uncommon to see sequels named "name 2".

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 11:27 AM, Adam Ruppe wrote:
 Andrej Mitrovic:
 I assume people will just pick the first thing they see

 That's why the links on the left should always point to the newest
 version, and there might be notes in the docs pointing people to
 newer and older versions.

 "std.xml" looks standard so they would pick that over "std.xml2"

 Maybe, but if we're just consistent with a scheme, I think it'd be
 easy enough to learn. It's not uncommon to see sequels named "name 2".

Yah, I also think the documentation makes it easy to clarify which 
module is the preferred one.

I think there's a lot of merit to simply appending a '2' to the module 
name. The only place where the '2' occurs is in the name of the
module, and there aren't many modules we need to replace like that.


Andrei

Sep 06 2011

"Daniel Murphy" <yebblies nospamgmail.com> writes:

"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which module 
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module 
 name. The only place where the '2' occurs is in the name of the module,
 and there aren't many modules we need to replace like that.

I still can never remember if I'm supposed to be using std.regex or 
std.regexp.
When the new one is finished are we going to have 3?

It's definitely beneficial to avoid breaking code, but I really disagree
that phobos has reached that point yet.  The breaking changes need to stop, 
but stopping prematurely will leave phobos permanently disfigured.

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 12:05 PM, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which module
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the module,
 and there aren't many modules we need to replace like that.

 I still can never remember if I'm supposed to be using std.regex or
 std.regexp.

Yet another argument :o). I also don't quite remember right now whether 
strstream or stringstream is the new one (I think the latter).

 When the new one is finished are we going to have 3?

The new one will stay regex.

 It's definitely beneficial to avoid breaking code, but I really disagree
 that phobos has reached that point yet.  The breaking changes need to stop,
 but stopping prematurely will leave phobos permanently disfigured.

Agreed.


Andrei

Sep 06 2011

Sean Kelly <sean invisibleduck.org> writes:

On Sep 6, 2011, at 10:16 AM, Andrei Alexandrescu wrote:

 On 9/6/11 12:05 PM, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which module
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the module,
 and there aren't many modules we need to replace like that.

 I still can never remember if I'm supposed to be using std.regex or
 std.regexp.

 Yet another argument :o). I also don't quite remember right now whether
 strstream or stringstream is the new one (I think the latter).

The latter.  I never forget this one because the name we're supposed to
use is annoyingly long :-)

Sep 06 2011

Dmitry Olshansky <dmitry.olsh gmail.com> writes:

On 06.09.2011 21:05, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which module
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the module,
 and there aren't many modules we need to replace like that.

 I still can never remember if I'm supposed to be using std.regex or
 std.regexp.

Looking at the docs: std.regexp is scheduled for deprecation (in August?
hm... that was a bit harsh).

 When the new one is finished are we going to have 3?

To the best of my knowledge the new one is supposed to be std.regex, and
since the API is essentially the same, chances are most users won't notice
the change :)

Speaking of the whole idea, I like appending '2': it's clear that it's a
new and better version, and it keeps the old code from unnecessary strain.

 It's definitely beneficial to avoid breaking code, but I really disagree
 that phobos has reached that point yet.  The breaking changes need to stop,
 but stopping prematurely will leave phobos permanently disfigured.


-- 
Dmitry Olshansky

Sep 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, September 06, 2011 21:42:09 Dmitry Olshansky wrote:
 On 06.09.2011 21:05, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 
 Yah, I also think the documentation makes it easy to clarify which
 module is the preferred one.
 
 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the
 module, and there aren't many modules we need to replace like that.

 
 I still can never remember if I'm supposed to be using std.regex or
 std.regexp.

 
 Looking at the docs: std.regexp is scheduled for deprecation (in August?
 hm... that was a bit harsh).

std.regexp has been scheduled for deprecation for ages. It just hasn't had a 
date attached to it. It'll be deprecated in 2.055.

- Jonathan M Davis

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 19:05, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org>  wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which module
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the module,
 and there aren't many modules we need to replace like that.

 I still can never remember if I'm supposed to be using std.regex or
 std.regexp.
 When the new one is finished are we going to have 3?

 It's definitely beneficial to avoid breaking code, but I really disagree
 that phobos has reached that point yet.  The breaking changes need to stop,
 but stopping prematurely will leave phobos permanently disfigured.

I agree.

-- 
/Jacob Carlborg

Sep 07 2011

Sean Cavanaugh <WorksOnMyMachine gmail.com> writes:

On 9/7/2011 2:19 AM, Jacob Carlborg wrote:
 On 2011-09-06 19:05, Daniel Murphy wrote:
 "Andrei Alexandrescu"<SeeWebsiteForEmail erdani.org> wrote in message
 news:j45isu$2t3h$1 digitalmars.com...
 Yah, I also think the documentation makes it easy to clarify which
 module
 is the preferred one.

 I think there's a lot of merit to simply appending a '2' to the module
 name. The only place where the '2' occurs is in the name of the
 module,
 and there aren't many modules we need to replace like that.

 I still can never remember if I'm supposed to be using std.regex or
 std.regexp.
 When the new one is finished are we going to have 3?

 It's definitely beneficial to avoid breaking code, but I really disagree
 that phobos has reached that point yet. The breaking changes need to
 stop,
 but stopping prematurely will leave phobos permanently disfigured.

 I agree.


In the COM based land for D3D, there is just a number tacked onto the 
class name.  We are up to version 11 (e.g. ID3D11Device).  It works well
and is definitely nicer once you are used to it, than calling everything 
New or FunctionEx, and left wondering what to do when you rev the 
interface again.  Once you solve making 3 versions of an interface work
cleanly, it should be a good system.

Making all the modules versioned in some way would probably be ideal. 
The way linux shared libraries are linked could be used as a model, just 
make the 'friendly unversioned' module name an alias of some sort to the 
latest version of the library.  Any code needing the older version can 
specify it explicitly.  An approach like this would need to be done 
within D, as symbolic links are a problem for some platforms (though at
least it's possible on Windows these days).
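
A rough sketch of the "unversioned name is an alias to the latest version" idea follows. In a real layout the versions would be separate modules (say xml1.d and xml2.d, both hypothetical) and the unversioned xml.d would contain little more than `public import xml2;`. To keep the example in one file, the structs `Xml1` and `Xml2` stand in for those modules.

```d
import std.stdio : writeln;

// Stand-ins for two versions of the same library (hypothetical names).
struct Xml1
{
    static string describe() { return "xml version 1"; }
}

struct Xml2
{
    static string describe() { return "xml version 2"; }
}

// The friendly unversioned name always tracks the latest version.
alias Xml = Xml2;

void main()
{
    writeln(Xml.describe());  // resolves to Xml2
    writeln(Xml1.describe()); // older version requested explicitly
}
```

When a third version appears, only the alias (or the forwarding module) changes; code that named a version explicitly is unaffected.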

Sep 08 2011

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh  
<WorksOnMyMachine gmail.com> wrote:

 In the COM based land for D3D, there is just a number tacked onto the  
 class name.  We are up to version 11 (e.g. ID3D11Device).  It works well
 and is definitely nicer once you are used to it, than calling everything  
 New or FunctionEx, and left wondering what to do when you rev the  
 interface again

In the case of D3D though, D3D itself has a version number. The next
version of std.xml will not be parsing XMLv2.0. When a version 2.0 of the
XML spec shows up, what do we do about std.xml2, which parses version 1.1?
And what do we call the new one? Should std.xml3 parse XMLv2.0?


-- 
   Simen

Sep 08 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 08.09.2011, 18:52 Uhr, schrieb Simen Kjaeraas <simen.kjaras gmail.com>:

 On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh  
 <WorksOnMyMachine gmail.com> wrote:

 In the COM based land for D3D, there is just a number tacked onto the  
 class name.  We are up to version 11 (e.g. ID3D11Device).  It works
 well and is definitely nicer once you are used to it, than calling  
 everything New or FunctionEx, and left wondering what to do when you  
 rev the interface again

 In the case of D3D though, D3D itself has a version number. The next
 version of std.xml will not be parsing XMLv2.0. When a version 2.0 of the
 XML spec shows up, what do we do about std.xml2, which parses version 1.1?
 And what do we call the new one? Should std.xml3 parse XMLv2.0?

That is late in the discussion, but a valid point.

Sep 08 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/8/11 1:28 PM, Marco Leise wrote:
 Am 08.09.2011, 18:52 Uhr, schrieb Simen Kjaeraas <simen.kjaras gmail.com>:

 On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh
 <WorksOnMyMachine gmail.com> wrote:

 In the COM based land for D3D, there is just a number tacked onto the
 class name. We are up to version 11 (e.g. ID3D11Device). It works
 well and is definitely nicer once you are used to it, than calling
 everything New or FunctionEx, and left wondering what to do when you
 rev the interface again

 In the case of D3D though, D3D itself has a version number. The next
 version of std.xml will not be parsing XMLv2.0. When a version 2.0 of the
 XML spec shows up, what do we do about std.xml2, which parses version 1.1?
 And what do we call the new one? Should std.xml3 parse XMLv2.0?

 That is late in the discussion, but a valid point.

Waiting for a suggestion from the XML experts.

Andrei

Sep 08 2011

Alix Pexton <alix.DOT.pexton gmail.DOT.com> writes:

On 08/09/2011 21:02, Andrei Alexandrescu wrote:
 On 9/8/11 1:28 PM, Marco Leise wrote:
 Am 08.09.2011, 18:52 Uhr, schrieb Simen Kjaeraas
 <simen.kjaras gmail.com>:

 On Thu, 08 Sep 2011 11:40:01 +0200, Sean Cavanaugh
 <WorksOnMyMachine gmail.com> wrote:

 In the COM based land for D3D, there is just a number tacked onto the
 class name. We are up to version 11 (e.g. ID3D11Device). It works
 well and is definitely nicer once you are used to it, than calling
 everything New or FunctionEx, and left wondering what to do when you
 rev the interface again

 In the case of D3D though, D3D itself has a version number. The next
 version of std.xml will not be parsing XMLv2.0. When a version 2.0 of the
 XML spec shows up, what do we do about std.xml2, which parses version 1.1?
 And what do we call the new one? Should std.xml3 parse XMLv2.0?

 That is late in the discussion, but a valid point.

 Waiting for a suggestion from the XML experts.

 Andrei

I'm not really an XML expert, but I do recall that the XML Core Working 
Group shelved their plans to develop "XML2.0". All enhancements that are
in the pipeline are separate projects with their own acronyms.

IMHO, even if there were an XML2.0 spec, I don't think it would affect
the naming of the module in Phobos, because I doubt very much it would 
introduce anything that would require a complete rewrite. std.xml2 could 
just be extended to support the new features of the spec in the context 
of its existing architecture. But it is probably a moot point.

A...

Sep 08 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 17:53, Andrej Mitrovic wrote:
 On 9/6/11, Andrei Alexandrescu<SeeWebsiteForEmail erdani.org>  wrote:
 Or should we, au contraire, use "old_" for the
 old module and advise people who want to stick with the old modules to
 change their imports?

 I would say that's the right way to go. It's much easier to change an
 import than change code. Perhaps another alternative is to use version
 statements. DFL uses it for deprecated features that are still in the
 codebase and usable.

 We don't want to punish people for using newer modules, we should
 encourage it. If they're forced to import "std.xml_new", they'll
 eventually have to change those imports to "std.xml" down the road
 when the older std.xml gets replaced by the new one. I assume people
 will just pick the first thing that they see, "std.xml" looks standard
 so they would pick that over "std.xml2".

Yeah, I hate that with Java interfaces, appending a number. Just because 
the good proper name is already taken and they can't break existing code.

-- 
/Jacob Carlborg

Sep 07 2011

David Gileadi <gileadis NSPMgmail.com> writes:

On 9/7/11 12:09 AM, Jacob Carlborg wrote:
 Yeah, I hate that with Java interfaces, appending a number. Just because
 the good proper name is already taken and they can't break existing code.

I've been happy to see less of this recently.  Maybe it's just my 
imagination, but with build systems that have package management like 
Maven and Gradle becoming more widely adopted, developers seem to be more
content to let the package manager handle the versioning.  I hope this 
becomes the case with D too.

Sep 07 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-07 16:33, David Gileadi wrote:
 On 9/7/11 12:09 AM, Jacob Carlborg wrote:
 Yeah, I hate that with Java interfaces, appending a number. Just because
 the good proper name is already taken and they can't break existing code.

 I've been happy to see less of this recently. Maybe it's just my
 imagination, but with build systems that have package management like
 Maven and Gradle becoming more widely adopted, developers seem to be more
 content to let the package manager handle the versioning. I hope this
 becomes the case with D too.

Me too, that's why I'm working on a package manager.

-- 
/Jacob Carlborg

Sep 07 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules replacing them.

std.xml2

will do fine.

Sep 06 2011

"Martin Nowak" <dawg dawgfoto.de> writes:

On Tue, 06 Sep 2011 19:54:28 +0200, Walter Bright  
<newshound2 digitalmars.com> wrote:

 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules  
 replacing them.

 std.xml2

 will do fine.

Speaking of xml2, I'd clearly like to see an attempt at buffered lookahead
reading for a stream/stdio overhaul.
Writing range adapters with even only fixed lookahead on top of the  
current stream API is painful.

martin
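
For readers unfamiliar with the term, a lookahead adapter lets a parser inspect upcoming elements without consuming them. The following is a minimal sketch over a generic input range. It is not tied to the current stream API or to the proposed overhaul, and `Lookahead`, `peek` and `canPeek` are names invented for this example.

```d
import std.range.primitives;
import std.stdio : writeln;

/// Hypothetical adapter: wraps any input range and lets the caller peek a
/// few elements ahead without consuming them.
struct Lookahead(R) if (isInputRange!R)
{
    private R source;
    private ElementType!R[] pending; // pulled from source, not yet consumed

    this(R source)
    {
        this.source = source;
    }

    /// Buffer elements until index n is available or the source runs dry.
    private void fill(size_t n)
    {
        while (pending.length <= n && !source.empty)
        {
            pending ~= source.front;
            source.popFront();
        }
    }

    @property bool empty()
    {
        fill(0);
        return pending.length == 0;
    }

    @property ElementType!R front()
    {
        fill(0);
        return pending[0];
    }

    void popFront()
    {
        fill(0);
        pending = pending[1 .. $];
    }

    /// True if an element exists n positions past the current front.
    bool canPeek(size_t n)
    {
        fill(n);
        return pending.length > n;
    }

    /// The element n positions past the current front (peek(0) is front).
    ElementType!R peek(size_t n)
    {
        fill(n);
        return pending[n];
    }
}

void main()
{
    auto tokens = Lookahead!(int[])([10, 20, 30]);
    assert(tokens.peek(2) == 30); // look two ahead, nothing consumed
    tokens.popFront();
    assert(tokens.front == 20);
    assert(!tokens.canPeek(2));   // only 20 and 30 remain
    writeln("lookahead ok");
}
```

This version copies each element into its own buffer, which is exactly the overhead a stream with a directly accessible buffer could avoid.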

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 14:07:01 -0400, Martin Nowak <dawg dawgfoto.de> wrote:

 On Tue, 06 Sep 2011 19:54:28 +0200, Walter Bright  
 <newshound2 digitalmars.com> wrote:

 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules  
 replacing them.

 std.xml2

 will do fine.

 Speaking of xml2, I'd clearly like to see an attempt at buffered lookahead
 reading for a stream/stdio overhaul.
 Writing range adapters with even only fixed lookahead on top of the  
 current stream API is painful.

This is exactly the reason for the overhaul.  I'm working on it, and I  
think my next version will be much more backwards compatible.

See readUntil in the proposed documentation, and look at the byChunk
implementation to see how it's used.

The intention was to use the buffer as an expandable scratch space to do  
things like parsing xml files without copying.

-Steve

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 12:54 PM, Walter Bright wrote:
 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules
 replacing them.

 std.xml2

 will do fine.

Since the BDFL and the majority of his constituents are in favor of 
this, it looks like the winner.

Andrei

Sep 06 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/6/2011 12:03 PM, Andrei Alexandrescu wrote:
 Since the BDFL

Brain-Damaged Feckless Leader?

Sep 06 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/06/2011 09:09 PM, Walter Bright wrote:
 On 9/6/2011 12:03 PM, Andrei Alexandrescu wrote:
 Since the BDFL

 Brain-Damaged Feckless Leader?

Benevolent Dictator For Life ;)

Sep 06 2011

notna <notna.remove.this ist-einmalig.de> writes:

Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
mixing something...

How about a D binding for http://www.xmlsoft.org/ ?

In other words, taking the "curl or sqlite3 path", something like 
/etc/c/xml2

On 06.09.2011 19:54, Walter Bright wrote:
 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules
 replacing them.

 std.xml2

 will do fine.

Sep 06 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

That is about 4 times slower than the Tango XML parser:

http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/


 On 06.09.2011 19:54, Walter Bright wrote:
 On 9/6/2011 7:51 AM, Andrei Alexandrescu wrote:
 Let's leave the likes of std.xml and std.json in peace, then pick a
 naming convention for the new ones and create whole new modules
 replacing them.

 std.xml2

 will do fine.

Sep 06 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 06.09.2011, 22:28 Uhr, schrieb Timon Gehr <timon.gehr gmx.ch>:

 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

 That is about 4 times slower than the Tango XML parser:

 http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

You are so right, Timon. How deep is the trench between Phobos and Tango  
devs? Tango's XML parser should really make it into Phobos.

Sep 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, September 06, 2011 23:51:48 Marco Leise wrote:
 Am 06.09.2011, 22:28 Uhr, schrieb Timon Gehr <timon.gehr gmx.ch>:
 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...
 
 How about a D binding for http://www.xmlsoft.org/ ?
 
 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

 
 That is about 4 times slower than the Tango XML parser:
 
 http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

 You are so right, Timon. How deep is the trench between Phobos and Tango
 devs? Tango's XML parser should really make it into Phobos.

A new std.xml is already in the works. It'll be range-based, unlike the Tango 
parser. But there's no reason why Phobos shouldn't be able to have a 
similarly-fast XML parser. As I understand it, the primary reason that the 
current std.xml is slow is because it uses delegates quite a bit, but I 
haven't used it myself, so I don't know all of the details.

- Jonathan M Davis

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 17:59:44 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Tuesday, September 06, 2011 23:51:48 Marco Leise wrote:
 Am 06.09.2011, 22:28 Uhr, schrieb Timon Gehr <timon.gehr gmx.ch>:
 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

 That is about 4 times slower than the Tango XML parser:

  

 http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

 You are so right, Timon. How deep is the trench between Phobos and Tango
 devs? Tango's XML parser should really make it into Phobos.

 A new std.xml is already in the works. It'll be range-based, unlike the  
 Tango
 parser. But there's no reason why Phobos shouldn't be able to have a
 similarly-fast XML parser. As I understand it, the primary reason that  
 the
 current std.xml is slow is because it uses delegates quite a bit, but I
 haven't used it myself, so I don't know all of the details.

No, the issue is, and always will be, buffer access.  C's FILE * just  
doesn't provide anything decent.  It's the primary motivation for wanting  
to revamp it.  With slicing and copy avoidance (i.e. only read into a  
buffer, never copy out), we can achieve the same with Phobos, but I think  
we have to replace C's buffering system (at least for this usage).
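
As an illustration of what "slicing and copy avoidance" means in D, the sketch below returns a token as a slice of the buffer it was read into, so no bytes are copied out. The function `firstTagName` is a made-up example, not part of std.xml, std.stdio or the proposed overhaul.

```d
import std.stdio : writeln;
import std.string : indexOf;

/// Returns the text between the first '<' and the following '>' as a slice
/// of `buffer`. Nothing is copied; the result aliases the buffer's memory.
const(char)[] firstTagName(const(char)[] buffer)
{
    auto open = buffer.indexOf('<');
    if (open < 0)
        return null;
    auto close = buffer[open .. $].indexOf('>'); // index relative to `open`
    if (close < 0)
        return null;
    return buffer[open + 1 .. open + close];
}

void main()
{
    // Pretend this array was filled by a read from a file or socket.
    char[] buffer = "<root>text</root>".dup;
    auto name = firstTagName(buffer);
    writeln(name);
    // The slice points into the buffer itself, so no copy was made.
    assert(name.ptr == buffer.ptr + 1);
}
```

The catch is that the slice is only valid while the buffer's contents stay put, which is why the buffering layer has to expose and manage its buffer explicitly.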

Tango's I/O libraries use delegates and virtual functions galore.  I think  
too big a stigma is attached to those.  The difference between calling a  
virtual function/delegate and calling a normal function is insignificant;
the real saving from avoiding virtual functions is that it allows
inlining.

However, in this case, I/O is so diverse that you *need* polymorphism.

-Steve

Sep 08 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-08 13:25, Steven Schveighoffer wrote:
 On Tue, 06 Sep 2011 17:59:44 -0400, Jonathan M Davis
 A new std.xml is already in the works. It'll be range-based, unlike
 the Tango
 parser. But there's no reason why Phobos shouldn't be able to have a
 similarly-fast XML parser. As I understand it, the primary reason that
 the
 current std.xml is slow is because it uses delegates quite a bit, but I
 haven't used it myself, so I don't know all of the details.

 No, the issue is, and always will be, buffer access. C's FILE * just
 doesn't provide anything decent. It's the primary motivation for wanting
 to revamp it. With slicing and copy avoidance (i.e. only read into a
 buffer, never copy out), we can achieve the same with Phobos, but I
 think we have to replace C's buffering system (at least for this usage).

 Tango's I/O libraries use delegates and virtual functions galore. I
 think too big a stigma is attached to those. The difference between
 calling a virtual function/delegate and calling a normal function is
 very insignificant, the real savings for not using virtual functions is
 to allow inlining.

 However, in this case, I/O is so diverse that you *need* polymorphism.

 -Steve

The Tango XML parser doesn't read from a file, it takes the input as a 
string. The parser isn't affected by I/O at all.

-- 
/Jacob Carlborg

Sep 08 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-08 13:25, Steven Schveighoffer wrote:
 On Tue, 06 Sep 2011 17:59:44 -0400, Jonathan M Davis
 A new std.xml is already in the works. It'll be range-based, unlike
 the Tango
 parser. But there's no reason why Phobos shouldn't be able to have a
 similarly-fast XML parser. As I understand it, the primary reason that
 the
 current std.xml is slow is because it uses delegates quite a bit, but I
 haven't used it myself, so I don't know all of the details.

 No, the issue is, and always will be, buffer access. C's FILE * just
 doesn't provide anything decent. It's the primary motivation for wanting
 to revamp it. With slicing and copy avoidance (i.e. only read into a
 buffer, never copy out), we can achieve the same with Phobos, but I
 think we have to replace C's buffering system (at least for this usage).

 Tango's I/O libraries use delegates and virtual functions galore. I
 think too big a stigma is attached to those. The difference between
 calling a virtual function/delegate and calling a normal function is
 very insignificant, the real savings for not using virtual functions is
 to allow inlining.

 However, in this case, I/O is so diverse that you *need* polymorphism.

 -Steve

 The Tango XML parser doesn't read from a file, it takes the input as a  
 string. The parser isn't affected by I/O at all.

So you have to read the entire file before sending it to the parser?

Isn't that a bit limited?  What if I have a 50MB file, I have to read it  
into a continuous memory block first?

-Steve

Sep 08 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-08 15:22, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:
 The Tango XML parser doesn't read from a file, it takes the input as a
 string. The parser isn't affected by I/O at all.

 So you have to read the entire file before sending it to the parser?

 Isn't that a bit limited? What if I have a 50MB file, I have to read it
 into a continuous memory block first?

 -Steve

I'm just telling how Tango currently works, not how the XML module in 
Phobos should work. But I guess it might be somewhat limited. 50MB isn't 
that big to read into memory?

I think it would be nice to be able to do both. If you read the whole 
file before sending it to the parser you would know it doesn't perform 
any I/O operations.

-- 
/Jacob Carlborg

Sep 08 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, September 08, 2011 21:38:43 Jacob Carlborg wrote:
 On 2011-09-08 15:22, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:
 The Tango XML parser doesn't read from a file, it takes the input as a
 string. The parser isn't affected by I/O at all.

 
 So you have to read the entire file before sending it to the parser?
 
 Isn't that a bit limited? What if I have a 50MB file, I have to read it
 into a continuous memory block first?
 
 -Steve

 
 I'm just telling how Tango currently works, not how the XML module in
 Phobos should work. But I guess it might be somewhat limited. 50MB isn't
 that big to read into memory?
 
 I think it would be nice to be able to do both. If you read the whole
 file before sending it to the parser you would know it doesn't perform
 any I/O operations.

I expect that the new std.xml will work on ranges of dchar (certainly, if 
it doesn't it should) such that it could be used on a string that's the entire 
file or on a stream over the file. If it's tied to reading in the whole file 
first, it's a design flaw. But I don't know what the current state of the new 
std.xml is. I don't think that I've seen Tomek around here recently.
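
As a rough illustration of what "works on ranges of dchar" buys you, here is a sketch of a function that accepts any input range of characters, so the same code runs on a whole in-memory string or on data arriving piecewise. `countOpenTags` is a made-up helper, not anything from the new std.xml, and it is nowhere near a real XML parser.

```d
import std.algorithm : joiner;
import std.range : ElementType, empty, front, isInputRange, popFront;

/// Counts opening tags in any input range of characters.
/// Closing tags, processing instructions and comments/doctype are skipped.
size_t countOpenTags(R)(R input)
    if (isInputRange!R && is(ElementType!R : dchar))
{
    size_t count;
    bool afterAngle;   // true when the previous character was '<'
    for (; !input.empty; input.popFront())
    {
        immutable dchar c = input.front;
        if (afterAngle && c != '/' && c != '?' && c != '!')
            ++count;
        afterAngle = (c == '<');
    }
    return count;
}

void main()
{
    // The whole document as one in-memory string.
    assert(countOpenTags("<a><b>text</b></a>") == 2);

    // The same document arriving in chunks, flattened lazily by joiner.
    auto chunks = ["<a><b", ">text</", "b></a>"];
    assert(countOpenTags(chunks.joiner) == 2);
}
```

The chunked case stands in for a stream over a file; the function neither knows nor cares which one it got.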

- Jonathan M Davis

Sep 08 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 08 Sep 2011 15:38:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-08 15:22, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:
 The Tango XML parser doesn't read from a file, it takes the input as a
 string. The parser isn't affected by I/O at all.

 So you have to read the entire file before sending it to the parser?

 Isn't that a bit limited? What if I have a 50MB file, I have to read it
 into a continuous memory block first?

 -Steve

 I'm just telling how Tango currently works, not how the XML module in  
 Phobos should work. But I guess it might be somewhat limited. 50MB isn't  
 that big to read into memory?

Um... yeah, it is :)  I have 1 GB of memory, my system starts thrashing  
with an app that consumes 750MB.  So that's like 13 xml files read?   
Especially if I want to use DOM, I have to keep them around...

Not to mention that the GC has to allocate a contiguous space for it.  So  
even if I have 100MB of garbage space, maybe none of it is usable, I still  
have to allocate a new block.  I'm just surprised there isn't at least an  
option for a stream-based xml parser in Tango.

One thing this does though, I always assumed it was Tango's I/O that  
accounts for its xml superiority.  I wonder, does anyone count reading the  
file in any of the benchmarks?

I still think we can come close without having to pre-read an entire file.

 I think it would be nice to be able to do both. If you read the whole  
 file before sending it to the parser you would know it doesn't perform  
 any I/O operations.

I totally agree.  I think there's ways to abstract the functionality for  
both memory-based and device-based i/o into one interface (part of the  
reason for the revamp).
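
To show the kind of abstraction meant here, a sketch only: one small interface with a memory-based and a device-based implementation behind it. `ByteSource`, `MemorySource`, `FileSource` and `countBytes` are hypothetical names, not the API of the module being written, and the device side is backed by std.stdio.File purely to keep the example short.

```d
import std.algorithm : min;
import std.stdio : File;

/// Hypothetical common interface; not the API of any actual module.
interface ByteSource
{
    /// Reads up to dst.length bytes into dst; returns 0 at end of input.
    size_t read(ubyte[] dst);
}

/// Memory-based source: serves bytes from an array already in memory.
final class MemorySource : ByteSource
{
    private const(ubyte)[] data;
    this(const(ubyte)[] data) { this.data = data; }

    size_t read(ubyte[] dst)
    {
        immutable n = min(dst.length, data.length);
        dst[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

/// Device-based source: here simply wrapping a File (dst must be non-empty).
final class FileSource : ByteSource
{
    private File file;
    this(File file) { this.file = file; }

    size_t read(ubyte[] dst)
    {
        return file.rawRead(dst).length;
    }
}

/// A consumer that only sees the interface.
size_t countBytes(ByteSource src)
{
    ubyte[64] chunk;
    size_t total;
    for (size_t n; (n = src.read(chunk[])) != 0; )
        total += n;
    return total;
}

void main()
{
    auto mem = new MemorySource(cast(const(ubyte)[]) "<root/>");
    assert(countBytes(mem) == 7);
}
```

A parser written against `ByteSource` would work the same whether the document was pre-read or is still on disk; the cost is one virtual call per chunk, not per byte.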

-Steve

Sep 08 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-08 21:54, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 15:38:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-08 15:22, Steven Schveighoffer wrote:
 On Thu, 08 Sep 2011 09:16:40 -0400, Jacob Carlborg <doob me.com> wrote:
 The Tango XML parser doesn't read from a file, it takes the input as a
 string. The parser isn't affected by I/O at all.

 So you have to read the entire file before sending it to the parser?

 Isn't that a bit limited? What if I have a 50MB file, I have to read it
 into a continuous memory block first?

 -Steve

 I'm just telling how Tango currently works, not how the XML module in
 Phobos should work. But I guess it might be somewhat limited. 50MB
 isn't that big to read into memory?

 Um... yeah, it is :) I have 1 GB of memory, my system starts thrashing
 with an app that consumes 750MB. So that's like 13 xml files read?
 Especially if I want to use DOM, I have to keep them around...

50MB is far from 750MB :), but I see your point.

 Not to mention that the GC has to allocate a contiguous space for it. So
 even if I have 100MB of garbage space, maybe none of it is usable, I
 still have to allocate a new block. I'm just surprised there isn't at
 least an option for a stream-based xml parser in Tango.

 One thing this does though, I always assumed it was Tango's I/O that
 accounts for its xml superiority. I wonder, does anyone count reading
 the file in any of the benchmarks?

As far as I know it's because of two reasons: it doesn't allocate any 
memory (uses slices) and all methods are final.

I have no idea about the benchmarks.

 I still think we can come close without having to pre-read an entire file.

I hope so as well.

 I think it would be nice to be able to do both. If you read the whole
 file before sending it to the parser you would know it doesn't perform
 any I/O operations.

 I totally agree. I think there's ways to abstract the functionality for
 both memory-based and device-based i/o into one interface (part of the
 reason for the revamp).

 -Steve

A range-based API, as Jonathan and others have said.

-- 
/Jacob Carlborg

Sep 09 2011

Sean Kelly <sean invisibleduck.org> writes:

On Sep 6, 2011, at 2:51 PM, Marco Leise wrote:

 On 06.09.2011 at 22:28, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

 That is about 4 times slower than the Tango XML parser:

 http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

 You are so right, Timon. How deep is the trench between Phobos and
 Tango devs? Tango's XML parser should really make it into Phobos.

That will never happen.  Though on a positive note, a major reason the
Tango parser is so fast is that there's no copying or translation of the
underlying data.  Attributes are passed to the user as-is via a slice of
the input range.  Most parsers in other languages simply don't work this
way.
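
A small sketch of what "passed as-is via a slice" means in practice. `attributeValue` is a made-up helper, not Tango's API, and it assumes the simplest possible syntax: `name="value"` with double quotes, a single space before the name, and no whitespace around `=`.

```d
import std.algorithm.searching : startsWith;
import std.string : indexOf;

/// Returns the value of attribute `name` as a slice of `tag`, or null if absent.
const(char)[] attributeValue(const(char)[] tag, const(char)[] name)
{
    if (name.length == 0)
        return null;

    size_t pos = 0;
    while (pos < tag.length)
    {
        immutable found = tag[pos .. $].indexOf(name);
        if (found < 0)
            return null;
        immutable start = pos + found;            // index of the match in tag
        immutable after = start + name.length;
        // Require a space before the name so "id" does not match inside "valid".
        immutable boundary = start > 0 && tag[start - 1] == ' ';
        if (boundary && tag[after .. $].startsWith(`="`))
        {
            auto value = tag[after + 2 .. $];
            immutable close = value.indexOf('"');
            return close < 0 ? null : value[0 .. close];
        }
        pos = start + 1;
    }
    return null;
}

void main()
{
    string tag = `<item valid="1" id="42" name="a &amp; b">`;
    auto id = attributeValue(tag, "id");
    assert(id == "42");
    // The result aliases the input: same memory, nothing copied.
    assert(id.ptr >= tag.ptr && id.ptr < tag.ptr + tag.length);
    // Entities are handed over untouched.
    assert(attributeValue(tag, "name") == "a &amp; b");
    assert(attributeValue(tag, "missing") is null);
}
```

No allocation happens anywhere in the lookup, which is the property being credited for the speed.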

Sep 06 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

On 07.09.2011 at 00:23, Sean Kelly <sean invisibleduck.org> wrote:

 On Sep 6, 2011, at 2:51 PM, Marco Leise wrote:

 On 06.09.2011 at 22:28, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

 That is about 4 times slower than the Tango XML parser:

 http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

 You are so right, Timon. How deep is the trench between Phobos and  
 Tango devs? Tango's XML parser should really make it into Phobos.

 That will never happen.  Though on a positive note, a major reason the  
 Tango parser is so fast is that there's no copying or translation of the  
 underlying data.  Attributes are passed to the user as-is via a slice of  
 the input range.  Most parsers in other languages simply don't work this  
 way.

So in the benchmark neither white-space is collapsed, nor are entities  
like &amp; converted?

Sep 06 2011

Sean Kelly <sean invisibleduck.org> writes:

On Sep 6, 2011, at 6:49 PM, Marco Leise wrote:

 On 07.09.2011 at 00:23, Sean Kelly <sean invisibleduck.org> wrote:

 On Sep 6, 2011, at 2:51 PM, Marco Leise wrote:

 On 06.09.2011 at 22:28, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...

 How about a D binding for http://www.xmlsoft.org/ ?

 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

 That is about 4 times slower than the Tango XML parser:

 http://dotnot.org/blog/archives/2008/03/10/xml-benchmarks-updated-graphs-with-rapidxml/

 You are so right, Timon. How deep is the trench between Phobos and
 Tango devs? Tango's XML parser should really make it into Phobos.

 That will never happen.  Though on a positive note, a major reason
 the Tango parser is so fast is that there's no copying or translation of
 the underlying data.  Attributes are passed to the user as-is via a
 slice of the input range.  Most parsers in other languages simply don't
 work this way.

 So in the benchmark neither white-space is collapsed, nor are entities
 like &amp; converted?

I don't believe so.  That's expected to be done by the user if he cares
about decoding the field.  Compare this to the Xerces (Apache) XML
parser that passes in all attributes as wide chars regardless of the
input format and you can see why parsing XML in D can be so fast:
passing values via array slicing and having Unicode as the native
character format.  If the input text is UTF-8 you use XmlParser!char, if
it's UTF-16 you use XmlParser!wchar, etc.  I'm actually surprised that
more C/C++ parsers don't work this way.
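
For what "done by the user if he cares" could look like, a sketch assuming only the five predefined XML entities matter. `decodeBasicEntities` is a hypothetical helper, not part of Tango or Phobos; numeric character references are deliberately ignored.

```d
import std.array : replace;
import std.string : indexOf;

/// Decodes the five predefined XML entities, but only when there is
/// something to decode; otherwise the input slice is returned untouched.
string decodeBasicEntities(string raw)
{
    if (raw.indexOf('&') < 0)
        return raw;                     // common case: no allocation at all
    return raw.replace("&lt;", "<")
              .replace("&gt;", ">")
              .replace("&quot;", "\"")
              .replace("&apos;", "'")
              .replace("&amp;", "&");   // last, so "&amp;lt;" yields "&lt;"
}

void main()
{
    assert(decodeBasicEntities("a &amp; b") == "a & b");
    assert(decodeBasicEntities("&amp;lt;") == "&lt;");

    string plain = "no entities";
    // Same pointer and length: the slice came back as-is.
    assert(decodeBasicEntities(plain) is plain);
}
```

The parser stays allocation-free, and the cost of decoding is paid only for the fields a caller actually reads.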

Sep 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, September 06, 2011 22:28:05 Timon Gehr wrote:
 On 09/06/2011 09:36 PM, notna wrote:
 Sorry upfront, I didn't read this whole thread, so maybe I'm missing or
 mixing something...
 
 How about a D binding for http://www.xmlsoft.org/ ?
 
 In other words, taking the "curl or sqlite3 path", something like
 /etc/c/xml2

 
 That is about 4 times slower than the Tango XML parser:

Yeah. Thanks to array slicing, parsing is actually one of the areas that D 
libraries should be able to generally beat C/C++ libraries in terms of speed.

That being said, creating bindings and wrappers for existing libraries is a 
great way to increase Phobos' functionality without reinventing the wheel in 
many cases. But there are definitely cases, where redoing something in D would 
actually be much better. It all depends on what you're trying to do and what 
libraries already exist in C or C++.

- Jonathan M Davis

Sep 06 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

On 06.09.2011 at 16:51, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at  
 Facebook to good effect. Or should we, au contraire, use "old_" for the  
 old module and advise people who want to stick with the old modules to  
 change their imports?


 Andrei

What about:

            std.xml1
std.xml -> std.xml2

So std.xml is a symbolic link to std.xml2 in the next release or std.xml2  
public imports std.xml ?
This is what /bin/python is on my computer.

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 14:13:45 -0400, Marco Leise <Marco.Leise gmx.de> wrote:

 On 06.09.2011 at 16:51, Andrei Alexandrescu  
 <SeeWebsiteForEmail erdani.org> wrote:

 So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at  
 Facebook to good effect. Or should we, au contraire, use "old_" for the  
 old module and advise people who want to stick with the old modules to  
 change their imports?


 Andrei

 What about:

             std.xml1
 std.xml -> std.xml2

 So std.xml is a symbolic link to std.xml2 in the next release or  
 std.xml2 public imports std.xml ?
 This is what /bin/python is on my computer.

That only works/is worth it if std.xml2 is backwards compatible with  
std.xml1.

-Steve

Sep 06 2011

Brad Anderson <eco gnuk.net> writes:

On Tue, Sep 6, 2011 at 8:51 AM, Andrei Alexandrescu <
SeeWebsiteForEmail erdani.org> wrote:

 On 9/6/11 2:35 AM, Walter Bright wrote:

 On 9/5/2011 11:39 PM, Jacob Carlborg wrote:

 We don't want to have a standard library like the one in PHP where
 there seems
 to be no naming conventions at all.

 I don't think that is the reason PHP is such a bear to work with.

 Probably. At any rate, what I now think as a promising path is with new
 module names. Let's leave the likes of std.xml and std.json in peace, then
 pick a naming convention for the new ones and create whole new modules
 replacing them. Then people who are ready for the migration change

 import std.xml;

 with

 import std.some_naming_convention_involving_xml;

 and fix whatever code breakages that entails. If they're pleased with
 std.xml, nobody's holding a gun to their head.

 Months and years go by, and nobody uses std.xml because the new module and
 the migration path are copiously advertised in the documentation. At that
 point we can discuss excising std.xml altogether and replacing it with the
 new one. And so the new becomes old, just like in dialectics.

 There's a successful precedent in C++ - stringstream vs. strstream. The
 only missing thing is that C++ did not choose a naming convention because
 they limited themselves to only one header.

 So what should we use? xml2? new_xml? FWIW we use the prefix "new_" at
 Facebook to good effect. Or should we, au contraire, use "old_" for the old
 module and advise people who want to stick with the old modules to change
 their imports?


 Andrei

Along these same lines I'm wondering why not simply call this new module
std.io rather than use the existing name std.stdio?  It'd avoid the code
breaking issue and help reflect that this new module isn't based around C's
stdio FILE (at least that's what I gather).  Also, the code is written from
scratch so that's another reason for why I don't think it should have the
same name.  The only reason I can think of is if it provided significant
improvements over the existing std.stdio without causing massive breakage.

Regards,
Brad Anderson

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 14:21:52 -0400, Brad Anderson <eco gnuk.net> wrote:

 On Tue, Sep 6, 2011 at 8:51 AM, Andrei Alexandrescu <
 SeeWebsiteForEmail erdani.org> wrote:

 Probably. At any rate, what I now think as a promising path is with new
 module names. Let's leave the likes of std.xml and std.json in peace,  
 then
 pick a naming convention for the new ones and create whole new modules
 replacing them. Then people who are ready for the migration change

 import std.xml;

 with

 import std.some_naming_convention_involving_xml;

 Along these same lines I'm wondering why not simply call this new module
 std.io rather than use the existing name std.stdio?  It'd avoid the code
 breaking issue and help reflect that this new module isn't based around  
 C's
 stdio FILE (at least that's what I gather).  Also, the code is written  
 from
 scratch so that's another reason for why I don't think it should have the
 same name.  The only reason I can think of is if it provided significant
 improvements over the existing std.stdio without causing massive  
 breakage.

I think for something like std.xml which is somewhat of a standalone  
module, this is fine.

However, i/o is used *everywhere*.  It's the same situation with  
std.datetime.  We can't duplicate all functions which deal with i/o in  
order to cater to both the stdio and the std.io folks, I think it's a  
waste of time, and it also looks bad.

But I think I have come up with a plan (obviously not the one posted here)  
which keeps stdio's API compatible, yet can use the new stuff I've written  
if desired. i.e. provides improvements over the current std.stdio without  
causing massive breakage.

Coincidentally, std.io is the name I chose for the new module ;)

I'll post again when it's something that can be shared.  I want to get all  
my ducks in a row first (obviously more than I did this time...)

-Steve

Sep 06 2011

Mafi <mafi example.org> writes:

 Along these same lines I'm wondering why not simply call this new module
 std.io rather than use the existing name std.stdio?
   It'd avoid the code breaking issue and help reflect that this new
 module isn't based around C's stdio FILE (at least that's what I
 gather).  Also, the code is written from scratch so that's another
 reason for why I don't think it should have the same name.  The only
 reason I can think of is if it provided significant improvements over
 the existing std.stdio without causing massive breakage.

 Regards,
 Brad Anderson

I think this is a good idea. I think std.io sounds and feels much better.

Mafi

Sep 06 2011

Paul D. Anderson <paul.d.removethis.anderson comcast.andthis.net> writes:

Mafi Wrote:

 Along these same lines I'm wondering why not simply call this new module
 std.io rather than use the existing name std.stdio?
   It'd avoid the code breaking issue and help reflect that this new
 module isn't based around C's stdio FILE (at least that's what I
 gather).  Also, the code is written from scratch so that's another
 reason for why I don't think it should have the same name.  The only
 reason I can think of is if it provided significant improvements over
 the existing std.stdio without causing massive breakage.

 Regards,
 Brad Anderson

 
 I think this is a good idea. I think std.io sounds and feels much better.
 
 Mafi

I think this is a terrific suggestion.

Paul

Sep 06 2011

bearophile <bearophileHUGS lycos.com> writes:

Paul D. Anderson:

 I think this is a terrific suggestion.

I suggested std.io some time ago, but someone didn't like it:
http://d.puremagic.com/issues/show_bug.cgi?id=4718

Bye,
bearophile

Sep 06 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, September 06, 2011 18:48:24 bearophile wrote:
 Paul D. Anderson:
 I think this is a terrific suggestion.

 
 I suggested std.io some time ago, but someone didn't like it:
 http://d.puremagic.com/issues/show_bug.cgi?id=4718

It's not enough of an improvement to rename std.stdio to std.io just to rename 
it. However, if Steven's ultimate changes are different enough that a separate 
module is needed for a clean migration path, and those changes do get accepted 
into Phobos, then naming the new module std.io makes good sense.

- Jonathan M Davis

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 18:57:00 -0400, Jonathan M Davis <jmdavisProg gmx.com>  
wrote:

 On Tuesday, September 06, 2011 18:48:24 bearophile wrote:
 Paul D. Anderson:
 I think this is a terrific suggestion.

 I suggested std.io some time ago, but someone didn't like it:
 http://d.puremagic.com/issues/show_bug.cgi?id=4718

 It's not enough of an improvement to rename std.stdio to std.io just to  
 rename
 it. However, if Steven's ultimate changes are different enough that a  
 separate
 module is needed for a clean migration path, and those changes do get  
 accepted
 into Phobos, then naming the new module std.io makes good sense.

When I get my re-revamped stdio working, it will likely involve std.io  
(which I independently decided to use).  It will not be a replacement for  
stdio, but will be used by stdio.

So I'm glad others think this is a good idea.

-Steve

Sep 08 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-09-05 21:51:16 +0000, Walter Bright <newshound2 digitalmars.com> said:

 I'll again note that I know of no successful operating system or 
 programming language that goes around breaking existing code unless it 
 is really, really urgent.

Apple has been deprecating things a lot in Mac OS X. Deprecated APIs 
generally continue to work fine for a long time and only trigger 
warnings when you compile something that uses them, effectively making 
them inconvenient. Some deprecation messages that can't be compilation 
warnings are logged to the console when used instead (deprecated flags 
for instance), only once per process though.

Sometimes APIs are truly disabled, but they are not removed. For 
instance, the old API for accessing the screen's pixels has become 
non-functional in Mac OS X 10.7 Lion. Only the new API introduced in 10.6 
works now; the old one is still there, but you just get a null pointer.

Sometimes APIs disappear when moving to a new architecture. For 
instance, Mac OS X still supports the old Carbon APIs, but only in 
32-bit mode; those were never made available to 64-bit applications.

But what works well for an operating system is not necessarily what 
works well for a runtime and a standard library. What Apple does is 
meant to keep binary compatibility. Users are not expected to have the 
source code of their application at hand, nor the expertise to fix 
them. They deprecate things so the OS can move forward and introduce 
new features, and using deprecated APIs generally means that your app 
will have trouble using new features or moving to new architectures in 
the future.

The situation for the D standard library is a little different. If you 
compile D code, you do have the source code at hand. My take is that we 
should not remove deprecated APIs and thus break old programs unless 
keeping those APIs really cost too much or impede future improvements. 
Showing a deprecation message and marking them as deprecated in the 
documentation is important to incite people to use the non-deprecated 
APIs, but for simple things like name changes perhaps the deprecation 
message during compilation could be left out, as the improvement to 
annoyance ratio would be quite low.
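
For a plain rename, the "keep the old name, mark it deprecated" approach can be sketched in a few lines. `digitValue` and `toDigit` are invented names for illustration; whether the compiler prints the message, stays silent, or treats it as an error depends on the compiler version and flags in use.

```d
import std.ascii : isDigit;

/// New, preferred name.
int digitValue(char c)
{
    assert(isDigit(c));
    return c - '0';
}

/// Old name kept as a forwarding alias so existing callers still compile.
deprecated("use digitValue instead")
alias toDigit = digitValue;

void main()
{
    assert(digitValue('7') == 7);
    // Still works; the compiler merely reports a deprecation for this line.
    assert(toDigit('7') == 7);
}
```

Old programs keep building, and the alias costs nothing at run time since it resolves to the same function.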

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 06 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

It doesn't work that well. dmd breaks with every new OS update.

The winner with binary compatibility is, far and away, Microsoft.

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

 It doesn't work that well. dmd breaks with every new OS update.

 The winner with binary compatibility is, far and away, Microsoft.

Maybe it would work better if you would use the proper API instead of 
putting __name_beg and __name_end around sections in the binary, i.e. 
__minfo_beg and __minfo_end.

-- 
/Jacob Carlborg

Sep 07 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/7/2011 12:35 AM, Jacob Carlborg wrote:
 On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

 It doesn't work that well. dmd breaks with every new OS update.

 The winner with binary compatibility is, far and away, Microsoft.

 Maybe it would work better if you would use the proper API instead of putting
 __name_beg and __name_end around sections in the binary, i.e. __minfo_beg and
 __minfo_end.

Actually, I did follow documented behavior of ld. Unfortunately, ld does not 
follow the documented behavior.

Sep 07 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-07 11:35, Walter Bright wrote:
 On 9/7/2011 12:35 AM, Jacob Carlborg wrote:
 On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

 It doesn't work that well. dmd breaks with every new OS update.

 The winner with binary compatibility is, far and away, Microsoft.

 Maybe it would work better if you would use the proper API instead of
 putting
 __name_beg and __name_end around sections in the binary, i.e.
 __minfo_beg and
 __minfo_end.

 Actually, I did follow documented behavior of ld. Unfortunately, ld does
 not follow the documented behavior.

I don't know exactly what documentation you've read but this is what 
I've found:

http://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/MachORuntime/Reference/reference.html

http://developer.apple.com/library/mac/#documentation/DeveloperTools/Reference/MachOReference/Reference/reference.html

The second link contains documentation for "getsectbyname" and similar 
functions for getting information and data from sections and segments. 
By using these functions __minfo_beg __minfo_end become unnecessary.

I have a fork of druntime that uses these functions. But at the same 
time I'm trying to make it work with dynamic libraries and I can't get 
TLS to work with dynamic libraries.

-- 
/Jacob Carlborg

Sep 07 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-09-07 09:35:26 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 9/7/2011 12:35 AM, Jacob Carlborg wrote:
 On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

 
 It doesn't work that well. dmd breaks with every new OS update.
 
 The winner with binary compatibility is, far and away, Microsoft.

 
 Maybe it would work better if you would use the proper API instead of putting
 __name_beg and __name_end around sections in the binary, i.e. __minfo_beg and
 __minfo_end.

 
 Actually, I did follow documented behavior of ld. Unfortunately, ld 
 does not follow the documented behavior.

Indeed. Although nowhere in the documentation does it say what the 
linker does with empty sections, it is reasonable to assume they'd be 
treated like other sections (kept in the right order). But this has 
been proven unreliable and it turns out there are proper APIs to do 
what this hack was meant to do, so we should use them instead.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 07 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-07 14:23, Michel Fortin wrote:
 On 2011-09-07 09:35:26 +0000, Walter Bright <newshound2 digitalmars.com>
 said:

 On 9/7/2011 12:35 AM, Jacob Carlborg wrote:
 On 2011-09-06 20:00, Walter Bright wrote:
 On 9/6/2011 5:02 AM, Michel Fortin wrote:
 What Apple does is meant to keep binary
 compatibility.

 It doesn't work that well. dmd breaks with every new OS update.

 The winner with binary compatibility is, far and away, Microsoft.

 Maybe it would work better if you would use the proper API instead of
 putting
 __name_beg and __name_end around sections in the binary, i.e.
 __minfo_beg and
 __minfo_end.

 Actually, I did follow documented behavior of ld. Unfortunately, ld
 does not follow the documented behavior.

 Indeed. Although nowhere in the documentation does it say what the
 linker does with empty sections, it is reasonable to assume they'd be
 treated like other sections (kept in the right order). But this has been
 proven unreliable and it turns out there are proper APIs to do what this
 hack was meant to do, so we should use them instead.

From the ld man page, section "Layout":

"All zero fill sections will appear after all non-zero fill sections in 
their segments."

-- 
/Jacob Carlborg

Sep 07 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-09-06 18:00:36 +0000, Walter Bright <newshound2 digitalmars.com> said:

 The winner with binary compatibility is, far and away, Microsoft.

Indeed, I think you're right that they are better than Apple. But you 
have to keep in mind that DMD doesn't depend on Microsoft's linker, and 
doesn't depend on Microsoft's C runtime. I bet you'd see more breakages 
otherwise.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 07 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/7/2011 5:21 AM, Michel Fortin wrote:
 On 2011-09-06 18:00:36 +0000, Walter Bright <newshound2 digitalmars.com> said:

 The winner with binary compatibility is, far and away, Microsoft.

 Indeed, I think you're right that they are better than Apple. But you have to
 keep in mind that DMD doesn't depend on Microsoft's linker, and doesn't depend
 on Microsoft's C runtime. I bet you'd see more breakages otherwise.


I used to know people who worked in Microsoft's "app compat" department. The 
lengths they would go to to maintain support for older apps were amazing. It 
wasn't about just supporting documented behavior, it was supporting undocumented 
behavior and gross misuse of the APIs.

Sep 07 2011

dsimcha <dsimcha yahoo.com> writes:

On 9/7/2011 6:22 PM, Walter Bright wrote:
 On 9/7/2011 5:21 AM, Michel Fortin wrote:
 On 2011-09-06 18:00:36 +0000, Walter Bright
 <newshound2 digitalmars.com> said:

 The winner with binary compatibility is, far and away, Microsoft.

 Indeed, I think you're right that they are better than Apple. But you
 have to
 keep in mind that DMD doesn't depend on Microsoft's linker, and
 doesn't depend
 on Microsoft's C runtime. I bet you'd see more breakages otherwise.


 I used to know people who worked in Microsoft's "app compat" department.
 The lengths they would go to to maintain support for older apps was
 amazing. It wasn't about just supporting documented behavior, it was
 supporting undocumented behavior and gross misuse of the APIs.

Yeh, the story of Raymond Chen working on a team that disassembled 
SimCity and inserted extra code to make it work even though it used 
previously freed memory comes to mind.

Sep 07 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/7/2011 4:06 PM, dsimcha wrote:
 On 9/7/2011 6:22 PM, Walter Bright wrote:
 On 9/7/2011 5:21 AM, Michel Fortin wrote:
 On 2011-09-06 18:00:36 +0000, Walter Bright
 <newshound2 digitalmars.com> said:

 The winner with binary compatibility is, far and away, Microsoft.

 Indeed, I think you're right that they are better than Apple. But you
 have to
 keep in mind that DMD doesn't depend on Microsoft's linker, and
 doesn't depend
 on Microsoft's C runtime. I bet you'd see more breakages otherwise.


 I used to know people who worked in Microsoft's "app compat" department.
 The lengths they would go to to maintain support for older apps was
 amazing. It wasn't about just supporting documented behavior, it was
 supporting undocumented behavior and gross misuse of the APIs.

 Yeh, the story of Raymond Chen working on a team that disassembled SimCity and
 inserted extra code to make it work even though it used previously freed memory
 comes to mind.


I believe this was a large factor in the success of Microsoft Windows.

Sep 07 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-09-07 22:22:25 +0000, Walter Bright <newshound2 digitalmars.com> said:

 On 9/7/2011 5:21 AM, Michel Fortin wrote:
 On 2011-09-06 18:00:36 +0000, Walter Bright <newshound2 digitalmars.com> said:
 
 The winner with binary compatibility is, far and away, Microsoft.

 
 Indeed, I think you're right that they are better than Apple. But you have to
 keep in mind that DMD doesn't depend on Microsoft's linker, and doesn't depend
 on Microsoft's C runtime. I bet you'd see more breakages otherwise.

 
 I used to know people who worked in Microsoft's "app compat" 
 department. The lengths they would go to to maintain support for older 
 apps was amazing. It wasn't about just supporting documented behavior, 
 it was supporting undocumented behavior and gross misuse of the APIs.

Well, sometimes Apple does support undocumented behaviour of previous 
versions of their OS too. Take this prototype from time.h for instance:

	clock_t clock(void) __DARWIN_ALIAS(clock);

What this __DARWIN_ALIAS macro does is it forces the code to use 
"_clock$UNIX2003" as the symbol name for the clock() function instead 
of the standard "_clock" symbol name. That's because the older version 
of the function had some bug in it (it was not conformant to some UNIX 
standard) but they still wanted old binaries to continue using the old 
version (so they don't break). Code compiled with the newer header will 
link with the fixed "_clock$UNIX2003" function instead of the old buggy 
one.

But more generally, there's sometimes a long-term cost in supporting 
undocumented behaviour. If you let developers use undocumented things 
without consequence, you send the message that they can depend on them 
and they'll just depend on them more, and the more software that 
depends on undocumented behaviours the harder it becomes to tweak the 
API without breaking everything.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 07 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

I dislike naming things with a leading "D" like "DInput". Shouldn't we
keep code that relies on C to be put in etc.c or somewhere?

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 18:55:08 -0400, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 I dislike naming things with a leading "D" like "DInput". Shouldn't we
 keep code that relies on C to be put in etc.c or somewhere?

I think the names are not great.  The names are somewhat based on the  
metamorphosis of the entire interface structure.

What about BufferedInput and BufferedOutput?  Michel Fortin suggested  
those.

-Steve

Sep 03 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Ah, reading your post I see this is just a start of the overhaul. I
assumed this was already getting ready for a review. Names can be
fixed eventually. :)

Sep 03 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-04 04:35, Steven Schveighoffer wrote:
 On Sat, 03 Sep 2011 18:55:08 -0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 I dislike naming things with a leading "D" like "DInput". Shouldn't we
 keep code that relies on C to be put in etc.c or somewhere?

 I think the names are not great. The names are somewhat based on the
 metamorphosis of the entire interface structure.

 What about BufferedInput and BufferedOutput? Michel Fortin suggested those.

 -Steve

These names are a lot better.

-- 
/Jacob Carlborg

Sep 04 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Also, changing structs to classes is gonna *massively* break code
everywhere. Why inheritance instead of a predicate like isInputStream
= is(typeof(T t; t.put; t.close)), you know the drill..
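
A minimal sketch of the predicate approach, for readers who have not seen the drill. The names isInputStream, MemoryInput and drain are hypothetical, and the required members (read and close) are an assumption; the one-liner above is shorthand, not a real definition from any library.

```d
import std.algorithm.comparison : min;
import std.stdio : writeln;

// Hypothetical compile-time predicate: T qualifies if it has a
// read(ubyte[]) whose result converts to size_t, and a close().
enum bool isInputStream(T) = is(typeof((T t) {
    ubyte[] buf;
    size_t n = t.read(buf);
    t.close();
}));

// A plain struct qualifies without inheriting from anything.
struct MemoryInput
{
    const(ubyte)[] data;

    size_t read(ubyte[] buf)
    {
        immutable n = min(buf.length, data.length);
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }

    void close() { data = null; }
}

// Generic code constrained by the predicate; the stream type is
// fixed at compile time.
size_t drain(S)(ref S stream) if (isInputStream!S)
{
    ubyte[4] chunk;
    size_t total;
    for (;;)
    {
        immutable n = stream.read(chunk[]);
        if (n == 0) break;
        total += n;
    }
    stream.close();
    return total;
}

void main()
{
    static assert(isInputStream!MemoryInput);
    static assert(!isInputStream!int);

    immutable ubyte[] bytes = [1, 2, 3, 4, 5, 6];
    auto s = MemoryInput(bytes);
    writeln(drain(s)); // prints 6
}
```

The trade-off is that drain is instantiated once per stream type, so the concrete type has to be known when the code is compiled.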

Sep 03 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 04.09.2011, 00:57 Uhr, schrieb Andrej Mitrovic  
<andrej.mitrovich gmail.com>:

 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

Wasn't this overhaul _meant_ to break existing code by offering a new API?  
Still that's a serious issue of course, but not too surprising. I'm  
ambivalent on the inheritance vs predicate debate. Interfaces are the way  
it is meant to be done and actually ensure correct types. Predicates work  
with structs as well. I don't know if this would be important.

Sep 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, September 04, 2011 02:49:40 Marco Leise wrote:
 Am 04.09.2011, 00:57 Uhr, schrieb Andrej Mitrovic
 
 <andrej.mitrovich gmail.com>:
 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

 
 Wasn't this overhaul _meant_ to break existing code by offering a new API?
 Still that's a serious issue of course, but not too surprising. I'm
 ambivalent on the inheritance vs predicate debate. Interfaces are the way
 it is meant to be done and actually ensure correct types. Predicates work
 with structs as well. I don't know if this would be important.

Any overhaul of existing functionality needs to improve on existing 
functionality. Changes just to change aren't valuable. So, changes should 
generally avoid breaking backwards compatibility unless we gain something 
from it. So, as long as these changes are an overall improvement, then we'll 
just have to deal with the code breakage. However, if the code breakage 
doesn't actually gain us anything, then we should avoid it. So, complaints 
about code breakage are valid, but they aren't deal breaking.

- Jonathan M Davis

Sep 03 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/3/2011 5:58 PM, Jonathan M Davis wrote:
 However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So, complaints
 about code breakage are valid, but they aren't deal breaking.

The larger the amount of code that is broken, the more gain there must be to 
justify it.

Breaking std.stdio, which is used everywhere, this thoroughly needs a very high 
bar of justification.

Sep 03 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

Seems to me like virtually every module in Phobos gets a complete
rewrite sooner or later. Yikes! Afaik the upcoming ones are also
std.xml, std.variant, maybe std.json too? (can't recall). Was there
really so much bad code written in Phobos all along that they all
require a rewrite?

Sep 03 2011

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrej Mitrovic (andrej.mitrovich gmail.com)'s article
 Seems to me like virtually every module in Phobos gets a complete
 rewrite sooner or later. Yikes! Afaik the upcoming ones are also
 std.xml, std.variant, maybe std.json too? (can't recall). Was there
 really so much bad code written in Phobos all along that they all
 require a rewrite?

It's really amazing how much cruft 2-3 year old D code tends to have:
workarounds for compiler bugs, workarounds for previously missing features, a
generally lower standard for quality before we implemented a proper review
process, etc.  Heck, I've got a pull request in GitHub that rewrites a
substantial portion of std.parallelism to take advantage of better
implementations I've found for parallel foreach and amap, fix a couple bugs and
get rid of tons of cruft, and this module's only been in Phobos a few months.
These changes are purely under the hood, though, and there should be zero code
breakage.

Sep 03 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/3/2011 7:27 PM, dsimcha wrote:
 These changes are purely under
 the hood, though, and there should be zero code breakage.

Those are the great kind of changes, and it's also nice in that it means the 
API was done reasonably right.

Sep 03 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/3/11 10:02 PM, Andrej Mitrovic wrote:
 Seems to me like virtually every module in Phobos gets a complete
 rewrite sooner or later. Yikes! Afaik the upcoming ones are also
 std.xml, std.variant, maybe std.json too? (can't recall). Was there
 really so much bad code written in Phobos all along that they all
 require a rewrite?

It's not that bad. First, it's understandable that now there are 
considerably more contributors and it's a bit easier tinkering with 
existing stuff than coming up with all new stuff.

Second, historically we're at an all-time high of talent involved in D. 
I'm sure it will go up much more, but previously we've had a more 
accepting attitude to new functionality at the cost of scrutiny (e.g. 
std.xml and std.json, both written by episodic contributors). (I really 
regret having had that attitude, it hurt us.) So now that there are so 
many eyeballs focused on the code, and not just any eyeballs but 
eyeballs connected to good brains, there is pressure building up.

There are quite a few pieces in Phobos that are withstanding scrutiny 
quite well: getopt, algorithm, variant (which can be, I think, safely 
extended to new great functionality), range, conv, random, and more. 
There are, unfortunately, others that didn't start off on the right foot 
and right now are somewhat of an eyesore. I trust we will figure out what 
to do about each on a case-by-case basis, though I agree with Walter that we 
should balance the breakage cost with correspondingly high rewards in 
terms of functionality improvements.


Andrei

Sep 03 2011

dsimcha <dsimcha yahoo.com> writes:

== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 I'm sure it will go up much more, but previously we've had a more
 accepting attitude to new functionality at the cost of scrutiny (e.g.
 std.xml and std.json, both written by episodic contributors). (I really
 regret having had that attitude, it hurt us.) So now that there are so
 many eyeballs focused on the code, and not just any eyeballs but
 eyeballs connected to good brains, there is pressure building up.
 There are quite a few pieces in Phobos that are withstanding scrutiny
 quite well: getopt, algorithm, variant (which can be, I think, safely
 extended to new great functionality), range, conv, random, and more.
 There are, unfortunately, others that didn't start off the right foot
 and right now are somewhat of an eyesore. I trust we will figure what to
 do about each on a by-case basis, though I agree with Walter that we
 should balance the breakage cost with correspondingly high rewards in
 terms of functionality improvements.
 Andrei

Yes, the quality standard has gone up massively.  When I was prepping
std.parallelism for review a few months ago, I generally used the existing
Phobos documentation as a guideline for what std.parallelism's docs should
resemble.  Andrei, of course, ripped the documentation apart.  In hindsight it
led to massive improvements and was for the better.  It certainly set the tone
for clear, precise documentation in the future and the same high standards were
applied to std.path and std.curl.  However, at the time I actually thought he
just hated std.parallelism at a gut level and was looking for any excuse to keep
it out of Phobos.  (I apologize for having thought this and therefore taken a
much more adversarial view of the review process than I should have.)

Sep 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, September 04, 2011 03:22:21 dsimcha wrote:
 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 
 I'm sure it will go up much more, but previously we've had a more
 accepting attitude to new functionality at the cost of scrutiny (e.g.
 std.xml and std.json, both written by episodic contributors). (I really
 regret having had that attitude, it hurt us.) So now that there are so
 many eyeballs focused on the code, and not just any eyeballs but
 eyeballs connected to good brains, there is pressure building up.
 There are quite a few pieces in Phobos that are withstanding scrutiny
 quite well: getopt, algorithm, variant (which can be, I think, safely
 extended to new great functionality), range, conv, random, and more.
 There are, unfortunately, others that didn't start off the right foot
 and right now are somewhat of an eyesore. I trust we will figure what to
 do about each on a by-case basis, though I agree with Walter that we
 should balance the breakage cost with correspondingly high rewards in
 terms of functionality improvements.
 Andrei

 
 Yes, the quality standard has gone up massively.  When I was prepping
 std.parallelism for review a few months ago, I generally used the existing
 Phobos documentation as a guideline for what std.parallelism's docs should
 resemble. Andrei, of course, ripped the documentation apart.  In hindsight
 it led to massive improvements and was for the better.  It certainly set
 the tone for clear, precise documentation in the future and the same high
 standards were applied to std.path and the std.curl.  However, at the time
 I actually thought he just hated std.parallelism at a gut level and was
 looking for any excuse to keep it out of Phobos.  (I apologize for having
 thought this and therefore taken a much more adversarial view of the review
 process than I should have.)

std.datetime is far better for having gone through multiple reviews as well. 
The resulting code isn't perfect, and reviews don't always catch everything, 
but thorough reviews really help improve the quality of code. Even just having 
other contributors look over pull requests tends to find stuff that can and 
should be improved. So, while there will likely always be some issues with 
code that make it into Phobos, the overall code quality is definitely 
improving.

- Jonathan M Davis

Sep 03 2011

Walter Bright <newshound2 digitalmars.com> writes:

On 9/3/2011 8:22 PM, dsimcha wrote:
 However, at the time I actually thought he just hated
 std.parallelism at a gut level and was looking for any excuse to keep it out of
 Phobos.  (I apologize for having thought this and therefore taken a much more
 adversarial view of the review process than I should have.)

I can vouch for Andrei's reviews appearing to be personal, but they are not. 
He's mercilessly ripped up some of my stuff, but I had to agree he was right 
and the resulting improvement was well worth it.

I don't much care for blowing sunshine, flattery and false praise.

Andrei sets a high bar, I'm glad he does, and we'll all be better off for it.

Sep 03 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/4/11 12:11 AM, Walter Bright wrote:
 On 9/3/2011 8:22 PM, dsimcha wrote:
 However, at the time I actually thought he just hated
 std.parallelism at a gut level and was looking for any excuse to keep
 it out of
 Phobos. (I apologize for having thought this and therefore taken a
 much more
 adversarial view of the review process than I should have.)

 I can vouch for Andrei's reviews appearing to be personal, but they are
 not. He's mercilessly ripped up some of my stuff, but I had to agree he
 was right and the resulting improvement was well worth it.

 I don't much care for blowing sunshine, flattery and false praise.

 Andrei sets a high bar, I'm glad he does, and we'll all be better off
 for it.

This is a bit of a surprise for me because I fancy (fancied...) to see 
myself as this emotionless, rational reviewer. Thank you all for putting 
up with me.

Andrei

Sep 04 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/3/11 9:53 PM, Walter Bright wrote:
 On 9/3/2011 5:58 PM, Jonathan M Davis wrote:
 However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So,
 complaints
 about code breakage are valid, but they aren't deal breaking.

 The larger the amount of code that is broken, the more gain there must
 be to justify it.

 Breaking std.stdio, which is used everywhere, this thoroughly needs a
 very high bar of justification.

I agree. I'm hoping the new stuff could build on top of std.stdio.

Andrei

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 22:27:49 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 9/3/11 9:53 PM, Walter Bright wrote:
 On 9/3/2011 5:58 PM, Jonathan M Davis wrote:
 However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So,
 complaints
 about code breakage are valid, but they aren't deal breaking.

 The larger the amount of code that is broken, the more gain there must
 be to justify it.

 Breaking std.stdio, which is used everywhere, this thoroughly needs a
 very high bar of justification.

 I agree. I'm hoping the new stuff could build on top of std.stdio.

It is my plan for the eventual result to break either no code, or as  
little code as possible.  The current library is mostly a  
proof-of-concept, to see what people think, and to show what might be  
possible.  I think the interfaces in this library make for a much  
easier-to-write xml library for instance.

It's by no means a proposal for immediate acceptance into Phobos, I'm  
sorry if it came across that way.

We have to break something in std.stdio, because it's fixated on FILE *.   
We need something that allows FILE * to play the game, but is focused on a  
D-based solution.  Otherwise, we have no room for improvement.  That's  
what I'm striving for.  And along the way, I'm trying to make it as  
efficient as possible.

-Steve

Sep 03 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/3/11 11:33 PM, Steven Schveighoffer wrote:
 We have to break something in std.stdio, because it's fixated on FILE *.
 We need something that allows FILE * to play the game, but is focused on
 a D-based solution. Otherwise, we have no room for improvement.

I'm not 100% convinced of that. We can achieve a good deal of 
improvement by resorting to platform-specific code. Clearly that's not 
the best way to go but it's not difficult and it does have its merit.

Overall I think the design of std.stdio should be followed:

1. User opens a File (or whatever), which is a struct. The struct uses RAII.

2. Using the struct you can directly call primitives to read and write 
stuff.

3. You can also decide you want a polymorphic stream out of it, and you 
get to decide the parameters of the stream (buffering, chunking, 
synchronicity and whatnot). byChunk and byLine are good examples, 
although they aren't polymorphic. Once you have such a stream you're in 
polyland so you get to use all of its goodies (look ma no templates etc).

4. Once all copies of the struct are destroyed, all streams derived from 
it are automatically closed and will issue errors when used.
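
The four points above can be sketched roughly as follows. This is only an illustration under stated assumptions: DemoFile, SharedState, InputStream and DerivedStream are hypothetical names, a string stands in for the OS file, and the reference counting is hand-rolled and not thread-safe. It is not how std.stdio.File is implemented.

```d
import std.algorithm.comparison : min;
import std.exception : assertThrown, enforce;
import std.stdio : writeln;

// State shared by every copy of the handle and every derived stream.
final class SharedState
{
    string contents;    // stands in for an OS-level file
    size_t pos;
    int refs = 1;
    bool open = true;

    this(string contents) { this.contents = contents; }

    size_t read(char[] buf)
    {
        enforce(open, "used after the last handle was destroyed");
        immutable n = min(buf.length, contents.length - pos);
        buf[0 .. n] = contents[pos .. pos + n];
        pos += n;
        return n;
    }
}

// Point 3: the polymorphic view ("polyland").
interface InputStream
{
    size_t read(char[] buf);
}

final class DerivedStream : InputStream
{
    private SharedState state;
    this(SharedState state) { this.state = state; }
    size_t read(char[] buf) { return state.read(buf); }
}

// Points 1, 2 and 4: a reference-counted RAII struct with direct
// primitives, which closes the shared state when the last copy dies.
struct DemoFile
{
    private SharedState state;

    this(string contents) { state = new SharedState(contents); }
    this(this) { if (state) ++state.refs; }
    ~this() { if (state && --state.refs == 0) state.open = false; }

    size_t read(char[] buf)
    {
        enforce(state !is null, "file was never opened");
        return state.read(buf);
    }

    InputStream stream() { return new DerivedStream(state); }
}

void main()
{
    InputStream s;
    char[5] buf;
    {
        auto f = DemoFile("hello world");
        auto copy = f;              // second reference to the same state
        s = copy.stream();
        immutable n = s.read(buf[]);
        writeln(buf[0 .. n]);       // prints "hello"
    }                               // both copies destroyed: state is closed
    assertThrown(s.read(buf[]));    // the derived stream now reports an error
}
```

Note that the derived stream keeps the shared state alive through the GC but cannot keep it open, which is what makes point 4 an error at run time instead of a dangling handle.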

That's pretty much it! It's a simple design that does all we need.


Andrei

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 23:45:17 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 9/3/11 11:33 PM, Steven Schveighoffer wrote:
 We have to break something in std.stdio, because it's fixated on FILE *.
 We need something that allows FILE * to play the game, but is focused on
 a D-based solution. Otherwise, we have no room for improvement.

 I'm not 100% convinced of that. We can achieve a good deal of  
 improvement by resorting to platform-specific code. Clearly that's not  
 the best way to go but it's not difficult and it does have its merit.

 Overall I think the design of std.stdio should be followed:

 1. User opens a File (or whatever), which is a struct. The struct uses  
 RAII.

OK, I think that's the offer on the table I keep getting :)  I'm  
definitely going to use this, and its name will be File.  I think it has  
to be in order to be compatible with all current code.

 2. Using the struct you can directly call primitives to read and write  
 stuff.

Buffered reads and writes?  If so, don't you need to decide the items in  
point 3 before read/write?  If not buffered, then I think I can work with  
this.

 3. You can also decide you want a polymorphic stream out of it, and you  
 get to decide the parameters of the stream (buffering, chunking,  
 synchronicity and whatnot). byChunk and byLine are good examples,  
 although they aren't polymorphic. Once you have such a stream you're in  
 polyland so you get to use all of its goodies (look ma no templates etc).

 4. Once all copies of the struct is destroyed, all streams derived from  
 it are automatically closed and will issue errors when used.

OK, I think I know how to do this.

I'm assuming if you want to use exclusively the poly versions, you can do  
that.  I.e. you don't have to keep an RAII File struct around.

 That's pretty much it! It's a simple design that does all we need.

I'll work on that.

How should text vs. non-text i/o work?  C currently conflates them at the  
same level, but I think they are two separate layers.  What do you think?

-Steve

Sep 03 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Saturday, September 03, 2011 18:53:00 Walter Bright wrote:
 On 9/3/2011 5:58 PM, Jonathan M Davis wrote:
 However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So,
 complaints about code breakage are valid, but they aren't deal
 breaking.

 
 The larger the amount of code that is broken, the more gain there must be to
 justify it.
 
 Breaking std.stdio, which is used everywhere, this thoroughly needs a very
 high bar of justification.

Agreed.

- Jonathan M Davis

Sep 03 2011

bearophile <bearophileHUGS lycos.com> writes:

Jonathan M Davis:

 Breaking std.stdio, which is used everywhere, this thoroughly needs a very
 high bar of justification.

 
 Agreed.

The purpose of the gofix tool in the Go language library is to lower this bar
significantly :-)

Bye,
bearophile

Sep 05 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Sunday, September 04, 2011 04:02:17 Andrej Mitrovic wrote:
 Seems to me like virtually every module in Phobos gets a complete
 rewrite sooner or later. Yikes! Afaik the upcoming ones are also
 std.xml, std.variant, maybe std.json too? (can't recall). Was there
 really so much bad code written in Phobos all along that they all
 require a rewrite?

Most of it's older stuff which has been around since D1, I believe - either 
that or it came fairly early in D2.

- Jonathan M Davis

Sep 03 2011

dsimcha <dsimcha yahoo.com> writes:

== Quote from Jonathan M Davis (jmdavisProg gmx.com)'s article
 Any overhaul of existing functionality needs to improve on existing
 functionality. Changes just to change aren't valuable. So, changes should
 generally avoiding breaking backwards compatibility unless we gain something
 from it. So, as long as these changes are an overall improvement, then we'll
 just have to deal with the code breakage. However, if the code breakage
 doesn't actually gain us anything, then we should avoid it. So, complaints
 about code breakage are valid, but they aren't deal breaking.
 - Jonathan M Davis

I mostly agree with what you said, except that this proposal breaks a frequently
used standard library module severely and without a clear gradual migration path.

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 18:57:06 -0400, Andrej Mitrovic  
<andrej.mitrovich gmail.com> wrote:

 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

Because it breaks runtime swapping of I/O.

For example, if you wanted to change stdin to a network socket, it's  
simple, just assign another InputStream.

However, if stdin is a templated struct, you cannot do this at runtime,  
you have to decide at compile time what your stdin is.  Believe it or not,  
this is not dissimilar to FILE *, except we have more flexibility.
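
A small sketch of what runtime swapping means here, assuming a hypothetical InputStream interface with a single readLine method. ConsoleLike and SocketLike are made-up stand-ins that return fixed strings; they are not part of the proposed library.

```d
import std.stdio : writeln;

// Hypothetical interface; the real proposal's members may differ.
interface InputStream
{
    string readLine();
}

final class ConsoleLike : InputStream
{
    string readLine() { return "line from the console"; }
}

final class SocketLike : InputStream
{
    string readLine() { return "line from the network"; }
}

// A module-level (thread-local) variable whose dynamic type can change
// while the program runs, much like reassigning stdin.
InputStream input;

// Code written against the interface does not care what is behind it.
string nextLine() { return input.readLine(); }

void main()
{
    input = new ConsoleLike;
    writeln(nextLine());    // prints "line from the console"

    input = new SocketLike; // swapped at run time; nextLine() is unchanged
    writeln(nextLine());    // prints "line from the network"
}
```

With a templated struct, the type of input would be part of nextLine's signature, so the swap would have to happen at compile time.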

But I realize the implications now.  I think I have to revisit this  
decision.

We definitely need classes at the lower level, but I think we can wrap  
them with structs that are commonly used for RAII and for not breaking  
existing code.

-Steve

Sep 03 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-04 04:48, Steven Schveighoffer wrote:
 On Sat, 03 Sep 2011 18:57:06 -0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

 Because it breaks runtime swapping of I/O.

 For example, if you wanted to change stdin to a network socket, it's
 simple, just assign another InputStream.

 However, if stdin is a templated struct, you cannot do this at runtime,
 you have to decide at compile time what your stdin is. Believe it or
 not, this is not dissimilar to FILE *, except we have more flexibility.

 But I realize the implications now. I think I have to revisit this
 decision.

 We definitely need classes at the lower level, but I think we can wrap
 them with structs that are commonly used for RAII and for not breaking
 existing code.

 -Steve

Tango has added a new method to Object, "dispose". The method is called 
by the runtime when a scoped class exits a scope:

void foo ()
{
     scope f = new File;
}

When "foo" exits File.dispose will be called and it can close any file 
handles. I think it's quite clever.

-- 
/Jacob Carlborg

Sep 04 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/4/11 7:11 AM, Jacob Carlborg wrote:
 On 2011-09-04 04:48, Steven Schveighoffer wrote:
 On Sat, 03 Sep 2011 18:57:06 -0400, Andrej Mitrovic
 <andrej.mitrovich gmail.com> wrote:

 Also, changing structs to classes is gonna *massively* break code
 everywhere. Why inheritance instead of a predicate like isInputStream
 = is(typeof(T t; t.put; t.close)), you know the drill..

 Because it breaks runtime swapping of I/O.

 For example, if you wanted to change stdin to a network socket, it's
 simple, just assign another InputStream.

 However, if stdin is a templated struct, you cannot do this at runtime,
 you have to decide at compile time what your stdin is. Believe it or
 not, this is not dissimilar to FILE *, except we have more flexibility.

 But I realize the implications now. I think I have to revisit this
 decision.

 We definitely need classes at the lower level, but I think we can wrap
 them with structs that are commonly used for RAII and for not breaking
 existing code.

 -Steve

 Tango has added a new method to Object, "dispose". The method is called
 by the runtime when a scoped class exits a scope:

 void foo ()
 {
 scope f = new File;
 }

 When "foo" exits File.dispose will be called and it can close any file
 handles. I think it's quite clever.

What happens if f is aliased beyond the existence of foo()?

Andrei

Sep 04 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-04 14:59, Andrei Alexandrescu wrote:
 On 9/4/11 7:11 AM, Jacob Carlborg wrote:
 Tango has added a new method to Object, "dispose". The method is called
 by the runtime when a scoped class exits a scope:

 void foo ()
 {
 scope f = new File;
 }

 When "foo" exits File.dispose will be called and it can close any file
 handles. I think it's quite clever.

 What happens if f is aliased beyond the existence of foo()?

 Andrei

I'm not sure if this is what you mean but:

File file;

void foo ()
{
     scope f = new File;
     file = f;
}

void main ()
{
     foo;
     // file is disposed here
}

In the above example "dispose" will be called when "foo" exits. After 
the call to "foo" in the main function "file" will refer to an object 
that is disposed, i.e. an object where the "dispose" method has been called.

I don't know how bad this is or if it is bad at all. It would be the same 
as the following code:

File file;

void foo ()
{
     auto f = new File;
     f.close;
     file = f;
}

void main ()
{
     foo;
}

-- 
/Jacob Carlborg

Sep 04 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/4/11 2:58 PM, Jacob Carlborg wrote:
 On 2011-09-04 14:59, Andrei Alexandrescu wrote:
 On 9/4/11 7:11 AM, Jacob Carlborg wrote:
 Tango has added a new method to Object, "dispose". The method is called
 by the runtime when a scoped class exits a scope:

 void foo ()
 {
 scope f = new File;
 }

 When "foo" exits File.dispose will be called and it can close any file
 handles. I think it's quite clever.

 What happens if f is aliased beyond the existence of foo()?

 Andrei

 I'm not sure if this is what you mean but:

 File file;

 void foo ()
 {
 scope f = new File;
 file = f;
 }

 void main ()
 {
 foo;
 // file is disposed here
 }

 In the above example "dispose" will be called when "foo" exits. After
 the call to "foo" in the main function "file" will refer to an object
 that is disposed, i.e. an object where the "dispose" method has been
 called.

 I don't know how bad this is or if it is bad at all.

Well it's not bad but a bit underwhelming. Clearly it's better than the 
unsafe behavior of scope, but it's nothing to write home about. The 
grand save it makes is replacing "scope(exit) f.dispose();" with "scope" 
in front of the declaration. That does systematically save some typing, 
but it's a feature with only local, non-modular effect, and limited 
abstraction power.
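
For comparison, the explicit spelling looks roughly like this. Resource and its dispose method are hypothetical stand-ins for the Tango File class; only scope(exit) itself is a language feature.

```d
import std.stdio : writeln;

// Hypothetical class with a Tango-style dispose method.
final class Resource
{
    bool disposed;
    void dispose() { disposed = true; }
}

Resource leaked;    // outlives useResource()

void useResource()
{
    auto f = new Resource;
    scope (exit) f.dispose();   // runs on every exit path, exceptions included
    leaked = f;                 // the reference escapes the function
    assert(!f.disposed);        // still usable inside the scope
}

void main()
{
    useResource();
    writeln(leaked.disposed);   // prints "true": alive but already disposed
}
```

Either spelling leaves the escaped reference pointing at a disposed object, so the saving is purely in typing.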


Andrei

Sep 04 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-04 21:34, Andrei Alexandrescu wrote:
 On 9/4/11 2:58 PM, Jacob Carlborg wrote:
 I'm not sure if this is what you mean but:

 File file;

 void foo ()
 {
 scope f = new File;
 file = f;
 }

 void main ()
 {
 foo;
 // file is disposed here
 }

 In the above example "dispose" will be called when "foo" exits. After
 the call to "foo" in the main function "file" will refer to an object
 that is disposed, i.e. an object where the "dispose" method has been
 called.

 I don't know how bad this is or if it is bad at all.

 Well it's not bad but a bit underwhelming. Clearly it's better than the
 unsafe behavior of scope, but it's nothing to write home about. The
 grand save it makes is replacing "scope(exit) f.dispose();" with "scope"
 in front of the declaration. That does systematically save some typing,
 but it's a feature with only local, non-modular effect, and limited
 abstraction power.


 Andrei

Yeah, a variable declared as "scope" shouldn't, preferably, exit its 
scope. The compiler will at least complain if you try to return a scoped 
variable.

-- 
/Jacob Carlborg

Sep 04 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 17:20:53 -0400, dsimcha <dsimcha yahoo.com> wrote:

 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s  
 article
 Hello,
 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.
 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.
 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html
 Thanks,
 Andrei

 After a quick look, I have two concerns:

 1.  File is a class, not a struct.  This precludes using reference  
 counting as the
 current std.stdio.File does, meaning you have to close all your Files  
 manually.  I
 loved the reference counting semantics, especially the last few releases  
 since
 most of the relevant compiler bugs have been fixed.

As long as a class can contain a File as a member, this argument makes no  
sense to me.  In other words, it's impossible to remove the GC from the  
File destructor/refcounting system.

I think what may end up happening, in terms of File being a scoped entity  
is:

File becomes a struct.

File's sole member is a class that implements InputStream, OutputStream,  
and ref counting.  This would be roughly equivalent to today's File.   
Except it's not buffered.  I think the names need work, and you are very  
right to point out that we should make existing code work as much as  
possible.
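
A rough sketch of that struct-around-a-class idea, for discussion only. FileImpl and FileHandle are hypothetical names, the count is not thread-safe, and the real class would also implement InputStream and OutputStream.

```d
// Hypothetical device class; stands in for the class that would implement
// InputStream, OutputStream and the reference count.
final class FileImpl
{
    int refs = 1;   // starts owned by the handle that created it
    bool closed;

    void close()
    {
        closed = true; // a real implementation would close the OS handle
    }
}

// Hypothetical value-type handle whose sole member is the class reference.
struct FileHandle
{
    private FileImpl impl;

    static FileHandle create()
    {
        FileHandle h;
        h.impl = new FileImpl;
        return h;
    }

    // Postblit: a copy shares the device and bumps the count.
    this(this)
    {
        if (impl !is null)
            ++impl.refs;
    }

    // The last handle to go away closes the device deterministically.
    ~this()
    {
        if (impl !is null && --impl.refs == 0)
            impl.close();
    }
}

// Returns the device after every handle to it has gone out of scope.
FileImpl demo()
{
    FileImpl observed;
    {
        auto a = FileHandle.create();
        observed = a.impl;
        {
            auto b = a;                 // postblit runs, two handles now
            assert(observed.refs == 2);
        }
        assert(observed.refs == 1 && !observed.closed);
    }
    return observed;                    // closed by a's destructor
}
```

This only helps for handles that live on the stack or inside other structs. If a FileHandle is a member of a GC-allocated class, its destructor runs whenever the collector finalizes that object, and touching impl at that point is not safe, which is the limitation mentioned above.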

 2.  File(someFileName, someMode) needs to work.  Not supporting this  
 method of
 instantiating a File object would break way too much code.

I can change File.open to File.opCall, that will fix that.
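
Roughly what that could look like, as a sketch only. Device is a hypothetical class standing in for the proposed File, and the default mode is an assumption.

```d
// Hypothetical class standing in for the proposed File; not the real API.
class Device
{
    string name;
    string mode;

    private this(string name, string mode)
    {
        this.name = name;
        this.mode = mode;
    }

    // Named factory, as in the proposal's File.open.
    static Device open(string name, string mode = "r")
    {
        return new Device(name, mode);
    }

    // Lets callers keep writing Device(name, mode), the same call syntax
    // that existing code uses with the struct-based File.
    static Device opCall(string name, string mode = "r")
    {
        return open(name, mode);
    }
}
```

This keeps the call syntax compiling, but it does not bring back the struct's reference-counting semantics.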

-Steve

Sep 03 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/3/11 10:11 PM, Steven Schveighoffer wrote:
 On Sat, 03 Sep 2011 17:20:53 -0400, dsimcha <dsimcha yahoo.com> wrote:

 == Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s
 article
 Hello,
 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.
 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.
 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html
 Thanks,
 Andrei

 After a quick look, I have two concerns:

 1. File is a class, not a struct. This precludes using reference
 counting as the
 current std.stdio.File does, meaning you have to close all your Files
 manually. I
 loved the reference counting semantics, especially the last few
 releases since
 most of the relevant compiler bugs have been fixed.

 As long as a class can contain a File as a member, this argument makes
 no sense to me. In other words, it's impossible to remove the GC from
 the File destructor/refcounting system.

The meaning of the argument is that just because there is the 
possibility of a File leaking, we shouldn't increase the likelihood of 
such a leak.


Andrei

Sep 03 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/03/2011 09:54 PM, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

File is now a class. This will break a lot of code.
What is happening to the refcounted File feature? It seems that the new 
way of file handling, using a file class, is more error prone than the 
old way?

But it is really great to hear that the efficiency problems of std.stdio 
are being sorted out!

Sep 03 2011

Michel Fortin <michel.fortin michelf.com> writes:

On 2011-09-03 19:54:05 +0000, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Hello,
 
 
 There are a number of issues related to D's current handling of 
 streams, including the existence of the imperfect etc.stream and the 
 over-specialization of std.stdio.
 
 Steve has worked on an extensive overhaul of std.stdio which would 
 obviate the need for etc.stream and would improve both the generality 
 and efficiency of std.stdio.
 
 Please chime in with feedback; he's away from the Usenet but allowed me 
 to post this on his behalf. I uploaded the docs to
 
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Looks good…

Hum, inconsistent casing of enum members.

And shouldn't there be a way to do non-blocking IO? ;-)

I like that File is now a class because it's cleaner that way, but 
non-deterministic destruction is going to be a problem. That said, it 
was already a problem anyway if you stored a File struct in a class, so 
maybe we need a more general solution for reference-counted classes.

Class names DInput and DOutput sound silly. If all classes implemented 
purely in D had a D prefix, it'd get redundant pretty fast (like KDE 
apps beginning in K). I'd suggest BufferedInput and BufferedOutput, or 
something else that actually describes what the class does, instead of 
DInput and DOutput. And I'd make them final, that way there won't be 
any virtual call overhead until the buffer needs to be replenished or 
flushed from the wrapped input or output stream.


-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 17:55:12 -0400, Michel Fortin
<michel.fortin michelf.com> wrote:

 On 2011-09-03 19:54:05 +0000, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:

 Hello,
 There are a number of issues related to D's current handling of
 streams, including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.
 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.
 Please chime in with feedback; he's away from the Usenet but allowed
 me to post this on his behalf. I uploaded the docs to
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

 Looks good…

Well, at least someone thinks so ;)

 Hum, inconsistent casing of enum members.

Can be fixed.

 And shouldn't there be a way to do non-blocking IO? ;-)

Yes.  I haven't gotten to that yet.  This is a very early version, not
ready for inclusion.  It's mostly a proof-of-concept.

 I like that File is now a class because it's cleaner that way, but
 non-deterministic destruction is going to be a problem. That said, it
 was already a problem anyway if you stored a File struct in a class, so
 maybe we need a more general solution for reference-counted classes.

I agree, but I think I need to revisit that aspect.  As broken as the
reference counting mechanism is, much code is based on it, so we can't say
you have to revisit all source code in order to be compatible.

And as Andrei points out, it works in cases where you *don't* store the
struct on the heap, why should that be disabled?

 Class names DInput and DOutput sounds silly. If all classes implemented
 purely in D had a D prefix, it'd get redundant pretty fast (like KDE
 apps beginning in K).

Yes, it made sense when I was going through the different iterations of my
interface ideas.  But you are right.  BTW, these started out as
DBufferedInput and DBufferedOutput, and CStream was CBufferedStream.

 I'd suggest BufferedInput and BufferedOutput, or something else that
 actually describes what the class does, instead of DInput and DOutput.
 And I'd make them final, that way there won't be any virtual call
 overhead until the buffer needs to be replenished or flushed from the
 wrapped input or output stream.

They are final, ddoc just doesn't expose that...

See my later post to the source.  Things might be clearer.

-Steve

Sep 03 2011

dsimcha <dsimcha yahoo.com> writes:

Actually I'll generalize the comment I made before:  As much as I like more
efficiency, I despise the massive overhaul and code breakage and the complexity
of having a zillion tiny objects to do everything that File used to do.  I would
like to see the native I/O under the hood plus something more like the current
API for basic file I/O.  I'd vote against the current design just because of the
massive code breakage it would cause with no migration plan.

Sep 03 2011

Walter Bright <newshound2 digitalmars.com> writes:

What happens if I write:

    printf("hello ");
    writeln("world");

?

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 20:47:05 -0400, Walter Bright  
<newshound2 digitalmars.com> wrote:

 What happens if I write:


     printf("hello ");
     writeln("world");

useCStdio();

This makes all the standard handles C-based.

And crap, I see I did not document it.... grr....

See here:

https://github.com/schveiguy/phobos/blob/new-io/std/stdio.d#L3332

Sorry....

-Steve

Sep 03 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/3/11 3:54 PM, Andrei Alexandrescu wrote:
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Here are a few points following a pass through the dox:

* After thinking some more about it, I find the approach seek() plus 
enumerated Anchor undesirable. It's a bad case of logical coupling as 
one never calls seek() passing an anchor as a variable. It's really 
three functions - seekForward, seekBackward, and seekAbsolute. Heck, 
knowing what seek does, it should be just seekAbsolute. But then there 
are several possible designs; a logically coupled seek() is not a good 
turn in any case.

* Seekable should document that tell() is O(1) and seek() can be 
considered O(1) but with a large constant factor.

* Why is close() not part of Seekable, since Seekable seems to be the 
base of all streams?

* Class File is IMHO not going to cut the mustard. It needs to be a 
struct with a destructor. One should be able to _get_ an InputStream or 
an OutputStream interface out of a File object (i.e. a file is a factory 
of such interfaces), but the File itself must be a struct.

* I don't understand the difference between read() and readComplete().

* readUntil is a bit tenuous. I was hoping for a simpler interface to 
buffered streams, e.g. just expose the buffer as a ubyte[].

* readUntil(const(ubyte)[]) does not give a cheap means to figure 
whether the read ended because file ended or the terminator was met.

* There's several readUntil but only one appendUntil. Why?

* Document the difference between skip and seek. Also, skip should take 
a ulong.

* I see encoder and decoder() in DInput, should both be decoder?

* StreamWidth, TextXXX and friends are a bit sudden because they 
introduce a higher-level abstraction in a module so far only concerned 
with transferring bytes. I was thinking that kind of stuff would belong to 
a formatter/serializer module.

Overall, there are interesting elements in this proposal but I don't 
quite feel it hit the proverbial nail on the head.


Andrei

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 21:47:53 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 9/3/11 3:54 PM, Andrei Alexandrescu wrote:
 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

 Here are a few points following a pass through the dox:

 * After thinking some more about it, I find the approach seek() plus  
 enumerated Anchor undesirable. It's a bad case of logical coupling as  
 one never calls seek() passing an anchor as a variable. It's really  
 three functions - seekForward, seekBackward, and seekAbsolute. Heck,  
 knowing what seek does, it should be just seekAbsolute. But then there  
 are several possible designs; a logically coupled seek() is not a good  
 turn in any case.

I think you need to support all three, but they could be individual  
functions.  It just is easy to provide the same interface the OS handle  
provides.  Let's entertain changing to three separate functions.

But I think we need to support seek from front, seek from end, and seek  
 from current.  I don't know about the three you mentioned.  How would you  
seek to the end if you didn't have seekEnd?  And seeking forward or  
backward I think is captured much better via a positive or negative  
integer.  I can imagine having to write code like this:

if(pos < cur)
    seekBackward(cur - pos);
else
    seekForward(pos - cur);
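
For comparison, here is a sketch of three named functions layered over a single signed-offset primitive, so no caller has to branch on direction. MemoryDevice, Anchor and the function names are hypothetical, and clamping negative targets to zero is a simplification.

```d
enum Anchor { begin, current, end }

// Hypothetical in-memory device used only to illustrate the seek variants.
struct MemoryDevice
{
    ulong length;
    ulong pos;

    // Single primitive mirroring the OS-level call (lseek-style).
    ulong seek(long delta, Anchor whence)
    {
        long base;
        final switch (whence)
        {
            case Anchor.begin:   base = 0; break;
            case Anchor.current: base = cast(long) pos; break;
            case Anchor.end:     base = cast(long) length; break;
        }
        long target = base + delta;
        if (target < 0)
            target = 0;   // clamp; a real device would report an error
        pos = cast(ulong) target;
        return pos;
    }

    // Named wrappers: the anchor is fixed by the function name, and the
    // direction is carried by the sign of the offset.
    ulong seekFromStart(ulong offset) { return seek(cast(long) offset, Anchor.begin); }
    ulong seekFromCurrent(long delta) { return seek(delta, Anchor.current); }
    ulong seekFromEnd(long delta)     { return seek(delta, Anchor.end); }
}
```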

 * Seekable should document that tell() is O(1) and seek() can be  
 considered O(1) but with a large constant factor.

OK, docs need lots of TLC for sure.

 * Why is close() not part of Seekable, since Seekable seems to be the  
 base of all streams?

Hm... not really sure.  I suppose it could be!  But then, should the  
interface be called Seekable?  What about just Stream?

 * Class File is IMHO not going to cut the mustard. It needs to be a  
 struct with a destructor. One should be able to _get_ an InputStream or  
 an OutputStream interface out of a File object (i.e. a file is a factory  
 of such interfaces), but the File itself must be a struct.

I'm seeing a large backlash on this decision.  I'm going to revisit it.

Note, however, that it was a poor choice of name for File on my part.   
File is *not* equivalent to the current stdio.File, in that it's not  
buffered, and is not text-based.

 * I don't understand the difference between read() and readComplete().

read() gets as much data as it can from the buffer and from the stream  
using at most one low-level read.  readComplete() will continually read  
until either EOF is encountered, or the requested data is read.  I started  
making read() do what readComplete does, but it surprisingly is a very  
difficult low-level thing to write.  However, readComplete() is trivial to  
implement on top of read(), which is why I split the two functions.
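
To illustrate the split, a minimal sketch of the loop. ChunkedSource is a hypothetical source that hands back at most a few bytes per call, the way a single low-level read on a pipe or socket might; the free-function form of readComplete is also just for the example.

```d
// Hypothetical source whose read() may return fewer bytes than requested,
// like a single low-level read on a pipe or socket.
struct ChunkedSource
{
    const(ubyte)[] data;   // remaining bytes
    size_t maxChunk = 3;   // at most this many bytes per read() call

    // Returns the number of bytes stored in buf; 0 means end of stream.
    size_t read(ubyte[] buf)
    {
        size_t n = buf.length;
        if (n > maxChunk) n = maxChunk;
        if (n > data.length) n = data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Keeps calling read() until buf is full or the source reports EOF.
size_t readComplete(ref ChunkedSource src, ubyte[] buf)
{
    size_t total = 0;
    while (total < buf.length)
    {
        immutable n = src.read(buf[total .. $]);
        if (n == 0)
            break;          // EOF before the buffer was filled
        total += n;
    }
    return total;
}
```

A caller that can work with partial data uses read() and avoids extra low-level calls; a caller that needs exactly N bytes (or EOF) uses readComplete().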

Please, come up with a better name, I hate readComplete :)

 * readUntil is a bit tenuous. I was hoping for a simpler interface to  
 buffered streams, e.g. just expose the buffer as a ubyte[].

I think we need a const(ubyte)[] peek(size_t nbytes).  Would this suffice?

 * readUntil(const(ubyte)[]) does not give a cheap means to figure  
 whether the read ended because file ended or the terminator was met.

You are right.  I'll think about this.

 * There's several readUntil but only one appendUntil. Why?

Didn't get around to it yet.  The overloads for readUntil are trivial, so  
can be copied easily enough to appendUntil.

 * Document the difference between skip and seek. Also, skip should take  
 a ulong.

skip is buffer-only.  It will never trigger a low-level call.
I will fill the docs more completely.

Given this, I think size_t is the right type, as a buffer cannot be more  
than size_t bytes in length.

 * I see encoder and decoder() in DInput, should both be decoder?

Yes.  encoder is for DOutput, copy-paste error.

 * StreamWidth, TextXXX and friends are a bit sudden because they  
 introduce a higher-level abstraction in a module so far only preoccupied  
 to transferring bytes. I was thinking that kind of stuff would belong to  
 a formatter/serializer module.

Could be moved.  However, stdin stderr and stdout are traditionally  
text-based, and stdio contains them.  I wanted to split out text-handling  
 from the basic buffered stream, since it's very specific.  For example,  
having to deal with an object that supports formatted text i/o for a  
network socket seems uncommon.

I'm open to suggestions.

Note, I must have had a brain-malfunction when I gave what I thought was a  
fairly completely-documented module.  I missed some very important  
declarations and functions.  I'll work on fixing the docs and giving you a  
new copy.  Thanks again for hosting it.

-Steve

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 15:54:05 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 Hello,


 There are a number of issues related to D's current handling of streams,  
 including the existence of the imperfect etc.stream and the  
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would  
 obviate the need for etc.stream and would improve both the generality  
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me  
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Thank you Andrei for posting this.  Before I add some more details, let me  
first say, this is a very early version, but it does work (and spanks the  
pants off of the current stdio in the tests I've run).

I'll add several very important things:

1. At the moment, this is written for Linux *ONLY*.  I have very good  
experience with Windows i/o, and I am 100% certain I can implement this  
library for it.  However, it's not my main OS, so I wanted to first get  
something working with my main working environment.
2. This is *not* currently multithread aware.  But it will be.  However, I  
think one important aspect to consider is to make a *thread-local* aware  
i/o library to avoid unnecessary locking when an i/o connection is only  
used in one thread.  But please leave that part alone for now, I'm working  
on how to make the code reusable as shared types.  Actually, if anyone has  
good ideas on that, please share!
3. Although I am dead-set on getting *something* into Phobos, I am not  
attached at all to the symbol names, or even some major design choices.  I  
have seen so far it's one of the major concerns, and I think we can find  
good names.  The names I came up with are not exactly arbitrary, but they  
are somewhat based on earlier designs that I have since abandoned, so  
renaming is definitely in order.
4. You can get the full source here:  
https://github.com/schveiguy/phobos/tree/new-io  I used the 2.054 stock  
compiler, and a version of druntime that includes Lars' new std-process  
changes, also on my github account:  
https://github.com/schveiguy/druntime/tree/new-std-process  Please use  
those when trying out the code.

--------------------------

So let me tell you about the library design and why I did it the way I did  
it.  Then, I'll respond to individual concerns already posted.

The major problem I think the current std.stdio has is, its buffered  
solution is based on C's FILE * implementation.  Specifically, we have  
very little control and access to the buffer implementation.  I think the  
key (or at least one of the keys) to uber-fast I/O is trying to copy as  
little as possible *needlessly*.  Seamless and safe buffer access I think  
is the key to this.  In addition to that, C's FILE * has several  
limitations:

1. On Windows, it's based on DMC's runtime, which limits you to 60  
simultaneously open files (Windows OS limit is 10,000 I think)
2. 64-bit support is not standard in all C implementations (namely Windows)
3. All FILE * objects are inherently shared, meaning lock-free I/O is very  
cumbersome, especially considering we have D's shared/unshared system.
4. C supports UTF-8, and it's supposed to support UTF-16 (but I can't get  
UTF-16 to work).  I think D ought to support all forms of UTF, since UTF  
is an integral part of the language.

In addition to this, we have numerous D tools at our disposal --  
delegates, closures, ranges, etc.  In other words, limiting us to C's  
interfaces means either duct-taping on those features, or abandoning  
them.  While a noble effort, and probably the best we could get, a prime  
example is the LockingFileReader range in std.stdio.  Just reading it made  
me cringe.  Have a look:  
https://github.com/D-Programming-Language/phobos/blob/master/std/stdio.d#L1282

I felt, we must be able to do something better.

So I started creating what I thought would be a good i/o library.  I did  
not start from the existing code, but just rewrote everything.  The basic  
concept is, we implement buffering once, and implement low-level devices  
that can be wrapped by the buffering implementation.  Almost everything  
that would use I/O wants to use a buffered version of it, so make the  
low-level aggregate minimal, and put all the useful functionality into the  
buffer.  I also wanted to make sure it is very easy to implement  
*efficient* ranges.

One design decision early on is that the device-level should be a class.   
There are a few good reasons for this:

1. an I/O device is a reference-type.  Copying it does not open another  
handle.  So even if we *wanted* structs, they would be pImpl structs.
2. One simple idea that works very well at the OS level is the file  
descriptor concept.  The file descriptor provides an *interface* to user  
code for operating on a stream.  And they are easily inter-changeable.   
This means a fd could be a network socket, a file, a pipe, a COM port, and  
the basic interface never changes.  So we should use that same concept --  
define a simple interface for a low-level device, and then you can  
implement the buffer around that interface.  Since classes are the only  
types which support interfaces, I chose them.

Yes, I know classes suffer from the dreaded "I don't know when the GC is  
going to get around to closing this file" problem.  I think though, we  
have ways to mitigate that (I'll post some responses to points about that  
elsewhere in the thread).

One other important design decision I made was that the standard handles  
*must* be changeable at runtime to C-based i/o.  This was mainly to appease  
Walter, as he insists on having compatible I/O with C functions (such as  
printf).  I think he has a good point, but I think limiting this to  
basically the standard handles is the right level of compatibility.

After going through many iterations (you can look at the github history if  
you are interested), I settled on this basic tree.  Note that I'm very  
open to changing any parts of this, as long as the basic concept of a  
common buffer type surrounding a low-level device type is kept intact.

interface Seekable => an interface defining seek functions for a device.
interface InputStream : Seekable => an interface defining functions that  
can be called on an input device.  This is non-buffered.
interface OutputStream : Seekable => an interface defining functions that  
can be called on an output device.  Also non-buffered.

class File : InputStream, OutputStream => The implementation for the OS  
handle-based input output stream.  This is akin to a file descriptor.   
(Note, I realize this is a poor name choice for this, it should probably  
be changed).

final class DInput => The buffered input stream.  This implements the  
buffer which surrounds an InputStream.
final class DOutput => The buffered output stream.  This implements the  
buffer which surrounds an OutputStream.
final class CStream => A Buffered Input and output stream based on C's  
FILE *.  This is used if you want to be compatible with C input or output,  
and is used in TextInput and TextOutput when using the C standard handles.

struct TextInput => A text-based input stream.  This implements UTF  
translation of all forms and handles formatted input.  Main member  
function is readf.
struct TextOutput => A text-based output stream. This implements UTF  
translation of all forms and handles formatted output.  Main member  
functions are the write* family.
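
A heavily simplified sketch of the core idea in that tree: buffering written once around a device interface. The method names and signatures below are assumptions for illustration, not the API in the linked source; BufferedInput plays the role of DInput and MemoryInput stands in for a real device such as File.

```d
// Hypothetical, simplified device interface; the real one also has seek etc.
interface InputStream
{
    // One low-level read; returns bytes stored, 0 at end of stream.
    size_t read(ubyte[] buf);
}

// A trivial device that serves bytes from memory.
final class MemoryInput : InputStream
{
    private const(ubyte)[] data;
    size_t lowLevelReads;            // counts calls, to show what buffering saves

    this(const(ubyte)[] data) { this.data = data; }

    size_t read(ubyte[] buf)
    {
        ++lowLevelReads;
        immutable n = buf.length < data.length ? buf.length : data.length;
        buf[0 .. n] = data[0 .. n];
        data = data[n .. $];
        return n;
    }
}

// Buffering is implemented once and works with any InputStream.
final class BufferedInput
{
    private InputStream device;
    private ubyte[] buffer;
    private size_t start, end;       // valid bytes are buffer[start .. end]

    this(InputStream device, size_t bufSize = 16)
    {
        this.device = device;
        buffer = new ubyte[bufSize];
    }

    // Returns the next byte, refilling from the device only when the
    // buffer is empty. Returns -1 at end of stream.
    int readByte()
    {
        if (start == end)
        {
            start = 0;
            end = device.read(buffer);
            if (end == 0)
                return -1;
        }
        return buffer[start++];
    }
}
```

Because the buffer class is final and only calls through the interface when it runs dry, the virtual call cost is paid once per refill rather than once per byte.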

It seems like a lot.  But keep in mind that almost everyone will only ever  
use DInput, DOutput, TextInput and TextOutput.  These replace the current  
std.stdio.File.  The low-level types are for implementing devices; they  
are not really meant to be used directly, except to wrap in a buffer.   
I expect that convenience functions will exist to create the correct  
buffered stream when given the right parameters.  The most obvious example  
is the function openFile (which is included).  The nice thing is, due to  
the auto return feature and templates, this takes care of some of the mess  
of having 4 main types to deal with.

I want to reiterate, I have created something that works, not something  
that is perfect.  I want everyone's input on how it should be changed --  
including major design decisions.  I'm open to changing just about  
everything.  The *only* major concept I want to keep is the buffering  
surrounding a low-level device.

Thanks for taking the time to look at this.  I hope it will become good  
enough to be included in Phobos.  I plan to do everything I can to make it  
happen.

-Steve

Sep 03 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sat, 03 Sep 2011 21:58:09 -0400, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

please read the previous comment, it includes a link to the source as well  
as further explanations...

Boy, I could have planned this better...

-Steve

Sep 03 2011

David Nadlinger <see klickverbot.at> writes:

I will come back with some more detailed feedback later on, but a few 
nits that caught my eye:

  - I don't think changing file from being a struct to a class is a good 
idea. First, it breaks an awful lot of D/Phobos programs already out 
there, both because of the struct->class change and because of the other 
API changes. Second, I feel we should really try to make use of RAII for 
things like file handles – I know we have »scope (exit) file.close()«, 
but forcing the user to remember to always type that needs a very good 
reason, imho. Couldn't File rather have some factory methods returning 
stream interface implementations?

  - CStream and DInput/Output? I don't care how it is implemented under 
the hood, give me something that works! ;) In this case, I guess CStream 
is somewhat appropriate, as C (FILE*) streams are widely known, but 
still I'm not too fond of the names.

  - bufsize -> bufSize?

  - Why on earth does DDoc render the enum default parameter as 
»(Anchor).Begin«? Is there a bug report for this?

  - I am sure there is a reason why the design uses decoder delegates, 
but without the source being available, I didn't find it immediately 
obvious where the advantages of using it over processing what is being 
read() from the stream are. Is this so data can be processed before 
going into the buffer? On a related note, what seems to be the decoder 
property getter is named »encoder()«.

David


On 9/3/11 9:54 PM, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

Sep 03 2011

"Paulo Pinto" <pjmlp progtools.org> writes:

Hi,

what is an "abstract interface" ?

--
Paulo

"Andrei Alexandrescu" <SeeWebsiteForEmail erdani.org> wrote in message 
news:j3u0l4$1atr$1 digitalmars.com...
 Hello,


 There are a number of issues related to D's current handling of streams, 
 including the existence of the imperfect etc.stream and the 
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would obviate 
 the need for etc.stream and would improve both the generality and 
 efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me to 
 post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

Sep 04 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 9/4/11, Paulo Pinto <pjmlp progtools.org> wrote:
 Hi,

 what is an "abstract interface" ?

I'm wondering the same thing.

Sep 04 2011

David Nadlinger <see klickverbot.at> writes:

On 9/4/11 4:30 PM, Andrej Mitrovic wrote:
 On 9/4/11, Paulo Pinto<pjmlp progtools.org>  wrote:
 Hi,

 what is an "abstract interface" ?

 I'm wondering the same thing.

A bug in ddoc. ;)

David

Sep 04 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-03 21:54, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

I think that openFile, File.open and CStream.open shouldn't take 
a string as the mode, it should be an enum or similar. Andrei is making 
a big deal out of using enums instead of bools. A bool value can contain 
"true" or "false", a string can contain an infinite number of different 
values.

-- 
/Jacob Carlborg

Sep 04 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Sun, 04 Sep 2011 07:07:05 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-03 21:54, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

 I think that openFile, File.open and CStream.open shouldn't take  
 a string as the mode, it should be an enum or similar. Andrei is making  
 a big deal out of using enums instead of bools. A bool value can contain  
 "true" or "false", a string can contain an infinite number of different  
 values.

openFile takes it as a template argument, and it will fail at compile time  
if the parameter is not correct (if not now, it will when the library is  
ready for inclusion).
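
As a sketch of how that compile-time rejection can work: describeOpen is a hypothetical stand-in for openFile (it opens nothing), and the set of accepted mode strings is an assumption chosen for the example.

```d
// Returns true if mode is one of the fopen-style strings this sketch accepts.
// The accepted set is an assumption chosen for illustration.
bool isValidMode(string mode)
{
    switch (mode)
    {
        case "r", "w", "a", "r+", "w+", "a+":
        case "rb", "wb", "ab", "r+b", "w+b", "a+b":
            return true;
        default:
            return false;
    }
}

// Hypothetical stand-in for openFile: the mode is a template argument, so an
// invalid literal is rejected when the call is compiled, not when it runs.
string describeOpen(string mode)(string name)
{
    static assert(isValidMode(mode), "invalid file mode: " ~ mode);
    return name ~ " opened with mode " ~ mode;
}
```

With this shape, describeOpen!"r+"("log.txt") compiles, while describeOpen!"rw"("log.txt") stops the build with the static assert message.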

I agree that enum is cleaner and easier to deal with from the library's  
point of view, but we have 2 things going for us by using strings:

1. The string formats are backwards compatible, and well defined.  In  
fact, CStream.open just passes the mode string without modification to  
fopen.
2. The brevity of and ability to comprehend a string literal vs. multiple  
enums.

You can think of it like printf (or writef).  There are infinitely many  
wrong format strings, all of which must be rejected at run time.  But I'll  
take that any day over C++'s format modifiers, which are type-checked at  
compile time.

Remember, typically, string formats are most frequently literals, and easy  
to read/write.  While there is great potential for invalid parameters, the  
reality is this rarely happens, and if it does, the errors are seen  
immediately.

-Steve
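
A minimal sketch of how a mode string passed as a template argument can be rejected at compile time. The names isValidOpenMode and checkedMode are hypothetical and chosen only for illustration; this is not the code from the proposed std.stdio, and the accepted set is assumed to be the classic fopen modes.

```d
// Hypothetical helper (not the actual std.stdio code): accepts the classic
// fopen modes, i.e. 'r', 'w' or 'a' followed by an optional "b" and/or "+".
bool isValidOpenMode(string mode)
{
    foreach (valid; ["r", "w", "a", "rb", "wb", "ab", "r+", "w+", "a+",
                     "rb+", "wb+", "ab+", "r+b", "w+b", "a+b"])
    {
        if (mode == valid)
            return true;
    }
    return false;
}

// Hypothetical wrapper: the mode is a template argument, so an invalid
// literal is a compile error rather than a run-time failure.
string checkedMode(string mode)()
{
    static assert(isValidOpenMode(mode), "invalid file open mode: " ~ mode);
    return mode;
}

void main()
{
    static assert(isValidOpenMode("r+"));   // evaluated by CTFE
    static assert(!isValidOpenMode("rw"));  // "rw" is not an fopen mode
    assert(checkedMode!"wb"() == "wb");
    // checkedMode!"rw"();  // would not compile
}
```

Because the validator is an ordinary function, the same code can also be called at run time for mode strings that are not literals.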

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 12:50, Steven Schveighoffer wrote:
 On Sun, 04 Sep 2011 07:07:05 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-03 21:54, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

 I think that openFile, File.open and CStream.open shouldn't
 take a string as the mode, it should be an enum or similar. Andrei is
 making a big deal out of using enums instead of bools. A bool value
 can contain "true" or "false", a string can contain an infinite number
 of different values.

 openFile takes it as a template argument, and it will fail at compile
 time if the parameter is not correct (if not now, it will when the
 library is ready for inclusion).

If it validates the string at compile time then that's great.

 I agree that enum is cleaner and easier to deal with from the library's
 point of view, but we have 2 things going for us by using strings:

 1. The string formats are backwards compatible, and well defined. In
 fact, CStream.open just passes the mode string without modification to
 fopen.
 2. The brevity of and ability to comprehend a string literal vs.
 multiple enums.

 You can think of it like printf (or writef). The format string has
 infinitely wrong possible format strings, which must be rejected at run
 time. But I'll take that any day over C++'s format modifiers which are
 type checked at compile-time.

It's not very often I use the print format functions. Most of the time I 
use Tango and with Tango's format strings at least you don't have to 
specify the type.

 Remember, typically, string formats are most frequently literals, and
 easy to read/write. While there is great potential for invalid
 parameters, the reality is this rarely happens, and if it does, the
 errors are seen immediately.

 -Steve

I would not say that they are easy to read, or at least not easy to 
understand or remember what a given mode means. I always have to 
double-check the documentation when using this kind of mode, for example 
to see whether a given mode creates a new file or not.

-- 
/Jacob Carlborg

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 08:49:22 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 12:50, Steven Schveighoffer wrote:
 On Sun, 04 Sep 2011 07:07:05 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-03 21:54, Andrei Alexandrescu wrote:
 Hello,


 There are a number of issues related to D's current handling of  
 streams,
 including the existence of the imperfect etc.stream and the
 over-specialization of std.stdio.

 Steve has worked on an extensive overhaul of std.stdio which would
 obviate the need for etc.stream and would improve both the generality
 and efficiency of std.stdio.

 Please chime in with feedback; he's away from the Usenet but allowed  
 me
 to post this on his behalf. I uploaded the docs to

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html


 Thanks,

 Andrei

 I think that openFile, File.open and CStream.open shouldn't
 take a string as the mode, it should be an enum or similar. Andrei is
 making a big deal out of using enums instead of bools. A bool value
 can contain "true" or "false", a string can contain an infinite number
 of different values.

 openFile takes it as a template argument, and it will fail at compile
 time if the parameter is not correct (if not now, it will when the
 library is ready for inclusion).

 If it validates the string at compile time then that's great.

 I agree that enum is cleaner and easier to deal with from the library's
 point of view, but we have 2 things going for us by using strings:

 1. The string formats are backwards compatible, and well defined. In
 fact, CStream.open just passes the mode string without modification to
 fopen.
 2. The brevity of and ability to comprehend a string literal vs.
 multiple enums.

 You can think of it like printf (or writef). The format string has
 infinitely wrong possible format strings, which must be rejected at run
 time. But I'll take that any day over C++'s format modifiers which are
 type checked at compile-time.

 It's not very often I use the print format functions. Most of the time I  
 use Tango and with Tango's format strings at least you don't have to  
 specify the type.

writef is the same, %s is equivalent to calling toString().

But the format specifiers for Tango are also strings, and not compile-time  
verified.

My point was simply, using a string to indicate flags or formatting  
instructions is pretty efficient, easy to write, and easy to read.

 Remember, typically, string formats are most frequently literals, and
 easy to read/write. While there is great potential for invalid
 parameters, the reality is this rarely happens, and if it does, the
 errors are seen immediately.

 -Steve

 I would not say that they are easy to read, or at least  
 understand/remember what a given mode means. I always have to double  
 check the documentation when using these kind of modes. I always have to  
 check if a given mode creates a new file or not.

Yeah, creating a new file is implied by a combination of modes.

The one that's confusing I think is that "a" is for append, but "+" kind  
of tacks on appending to any other mode.  It's not the most well-designed  
spec for file opening.  Add to that you have the "b" which is a noop on  
most OSes.

There is the possibility that we could accept an alternative open mode  
string, which we could design better.  But we have to keep fopen's spec,  
it's already used everywhere.

-Steve
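
For reference, a small sketch that spells out what each fopen mode implies, following the C standard's description of "r", "w", "a" and the "+" update flag. ModeInfo and describeMode are hypothetical names used only for this illustration, and the validation is deliberately minimal.

```d
// Effect of a C fopen mode string, following the C standard's description.
// The "b" flag is ignored here; it only matters on some platforms.
struct ModeInfo
{
    bool read;      // reading allowed
    bool write;     // writing allowed
    bool append;    // every write goes to the end of the file
    bool create;    // file is created if it does not exist
    bool truncate;  // existing contents are discarded
}

// Hypothetical helper for illustration; validation is deliberately minimal.
ModeInfo describeMode(string mode)
{
    if (mode.length == 0)
        throw new Exception("empty file open mode");

    bool plus = false;
    foreach (char c; mode[1 .. $])
    {
        if (c == '+')
            plus = true;
    }

    switch (mode[0])
    {
        case 'r': return ModeInfo(true, plus, false, false, false);
        case 'w': return ModeInfo(plus, true, false, true, true);
        case 'a': return ModeInfo(plus, true, true, true, false);
        default:  throw new Exception("invalid file open mode: " ~ mode);
    }
}

void main()
{
    assert(describeMode("r")  == ModeInfo(true, false, false, false, false));
    assert(describeMode("r+") == ModeInfo(true, true, false, false, false));
    assert(describeMode("w")  == ModeInfo(false, true, false, true, true));
    assert(describeMode("a+") == ModeInfo(true, true, true, true, false));
}
```

Read this way, only "w" and "a" (with or without "+") create a missing file, and only "w" discards existing contents.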

Sep 06 2011

Andrej Mitrovic <andrej.mitrovich gmail.com> writes:

On 9/6/11, Steven Schveighoffer <schveiguy yahoo.com> wrote:
 There is the possibility that we could accept an alternative open mode
 string, which we could design better.  But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

Or an alternative enum instead of a string. I'm another one of those
people who forgets what the various read/write modes are.

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 15:02, Steven Schveighoffer wrote:
 Yeah, creating a new file is implied by a combination of modes.

 The one that's confusing I think is that "a" is for append, but "+" kind
 of tacks on appending to any other mode. It's not the most well-designed
 spec for file opening. Add to that you have the "b" which is a noop on
 most OSes.

 There is the possibility that we could accept an alternative open mode
 string, which we could design better. But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

Ok, I would prefer to use enums if they have sensible names. Something 
like this:

File.open(Mode.read | Mode.write); // for both read and write

-- 
/Jacob Carlborg

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 10:05 AM, Jacob Carlborg wrote:
 On 2011-09-06 15:02, Steven Schveighoffer wrote:
 Yeah, creating a new file is implied by a combination of modes.

 The one that's confusing I think is that "a" is for append, but "+" kind
 of tacks on appending to any other mode. It's not the most well-designed
 spec for file opening. Add to that you have the "b" which is a noop on
 most OSes.

 There is the possibility that we could accept an alternative open mode
 string, which we could design better. But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

 Ok, I would prefer to use enums if they have sensible names. Something
 like this:

 File.open(Mode.read | Mode.write); // for both read and write

Honest, C's openmode strings have been around for so long, they hardly 
confuse anyone anymore. I'd rather use "rw" and call it a day.

Andrei

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 On 9/6/11 10:05 AM, Jacob Carlborg wrote:
 On 2011-09-06 15:02, Steven Schveighoffer wrote:
 Yeah, creating a new file is implied by a combination of modes.

 The one that's confusing I think is that "a" is for append, but "+"  
 kind
 of tacks on appending to any other mode. It's not the most  
 well-designed
 spec for file opening. Add to that you have the "b" which is a noop on
 most OSes.

 There is the possibility that we could accept an alternative open mode
 string, which we could design better. But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

 Ok, I would prefer to use enums if they have sensible names. Something
 like this:

 File.open(Mode.read | Mode.write); // for both read and write

 Honest, C's openmode strings have been around for so long, they hardly  
 confuse anyone anymore. I'd rather use "rw" and call it a day.

That's not a valid fopen string ;)

The plus "+" is odd, especially with "a" meaning "append".

And there's that really useless "b" :)

But I think this does *not* invalidate the usage of strings to denote open  
mode, it just needs more design.  The good thing about it is, we can  
augment the string flags and be binary and perfectly backwards compatible.

-Steve

Sep 06 2011

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 06.09.2011, 17:39 Uhr, schrieb Steven Schveighoffer  
<schveiguy yahoo.com>:

 On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu
 Honest, C's openmode strings have been around for so long, they hardly  
 confuse anyone anymore. I'd rather use "rw" and call it a day.

 That's not a valid fopen string ;)

Sorry, but I had to laugh. There could not have been a better counter  
example for using fopen strings. I can live with them, but it is one of  
the bad designs in C that could use an alternative.

Enums are used in: Unix, Windows, Delphi, Haskell, Lisp, C++, OCaml, Go
fopen strings are used in: C, Ruby, PHP, Python, NodeJS
Java has reinvented mode strings:
http://download.oracle.com/javase/1.4.2/docs/api/java/io/RandomAccessFile.html#RandomAccessFile(java.io.File, java.lang.String)
Other languages also distinguish only between a fixed set of cases, like
read, write and append. I found Scala and Perl to do that.

In the end a string is just like an enum, with the enum being statically
checked and the string being shorter. Every character corresponds to an
'ored' enum value. They can both be extended with flags that work with
Windows and Posix, like 'create only if non-existent', or hints that may
work on one system only, like exclusive access.

Sep 06 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/6/11 12:24 PM, Marco Leise wrote:
 Am 06.09.2011, 17:39 Uhr, schrieb Steven Schveighoffer
 <schveiguy yahoo.com>:

 On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu
 Honest, C's openmode strings have been around for so long, they
 hardly confuse anyone anymore. I'd rather use "rw" and call it a day.

 That's not a valid fopen string ;)

 Sorry, but I had to laugh. There could not have been a better counter
 example for using fopen strings. I can live with them, but it is one of
 the bad designs in C that could use an alternative.

Guess I'm destroyed.

Andrei

Sep 06 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Tue, 06 Sep 2011 13:24:34 -0400, Marco Leise <Marco.Leise gmx.de> wrote:

 Am 06.09.2011, 17:39 Uhr, schrieb Steven Schveighoffer
 <schveiguy yahoo.com>:

 On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu
 Honest, C's openmode strings have been around for so long, they hardly
 confuse anyone anymore. I'd rather use "rw" and call it a day.

 That's not a valid fopen string ;)

 Sorry, but I had to laugh. There could not have been a better counter
 example for using fopen strings. I can live with them, but it is one of
 the bad designs in C that could use an alternative.

I agree, but: 1. strings are statically checkable in D (see my openFile
for an example), and 2. just because the flags were poorly chosen in C
doesn't mean we must adhere to that spec. In other words, "rw" is not a
valid fopen mode string, but it *could* be a valid std.stdio mode string.

 Enums are used in: Unix, Windows, Delphi, Haskell, Lisp, C++, OCaml, Go
 fopen strings are used in: C, Ruby, PHP, Python, NodeJS
 Java has reinvented mode strings:
 http://download.oracle.com/javase/1.4.2/docs/api/java/io/RandomAccessFile.html#RandomAccessFile(java.io.File, java.lang.String)
 Other languages also distinguish only between a fixed set of cases, like
 read, write and append. I found Scala and Perl to do that.

 In the end a string is just like an enum, with the enum being statically
 checked and the string being shorter. Every character corresponds to an
 'ored' enum value. They can both be extended with flags that work with
 Windows and Posix, like 'create only if non-existent', or hints that may
 work on one system only, like exclusive access.

I like enums in terms of writing code that processes them, but in terms of
calling functions with them, I mean look at a sample fstream constructor
in C++:

fstream ifs("filename.txt", ios_base::in | ios_base::out);

vs.

File("filename.txt", "r+"); // or "rw"

There's just no way you can think "rw" is less descriptive or
understandable than ios_base::in | ios_base::out.

-Steve

Sep 06 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

But "r+" is. And that's what I assume will be used when I see a file 
opening function taking a string "mode" parameter. But if you say that 
"rw" can/will be used instead, then that's better.

-- 
/Jacob Carlborg

Sep 07 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 07 Sep 2011 03:27:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

 But "r+" is. And that's what I assume will be used when I see a file  
 opening function taking a string "mode" parameter. But if you say that  
 "rw" can/will be used instead, then that's better.

Yes, I'll try to add "rw" and maybe some other letter combinations that  
make sense in my next version.

But I think we still have to support "r+", even though it's esoteric,  
because too much existing code already does this, and to not support it  
would leave silently compiling bugs.

-Steve

Sep 08 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-08 13:04, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:27:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

 But "r+" is. And that's what I assume will be used when I see a file
 opening function taking a string "mode" parameter. But if you say that
 "rw" can/will be used instead, then that's better.

 Yes, I'll try to add "rw" and maybe some other letter combinations that
 make sense in my next version.

 But I think we still have to support "r+", even though it's esoteric,
 because too much existing code already does this, and to not support it
 would leave silently compiling bugs.

 -Steve

Didn't you just say that you would check the string at compile time?

-- 
/Jacob Carlborg

Sep 08 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 08 Sep 2011 09:05:35 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-08 13:04, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:27:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in  
 terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

 But "r+" is. And that's what I assume will be used when I see a file
 opening function taking a string "mode" parameter. But if you say that
 "rw" can/will be used instead, then that's better.

 Yes, I'll try to add "rw" and maybe some other letter combinations that
 make sense in my next version.

 But I think we still have to support "r+", even though it's esoteric,
 because too much existing code already does this, and to not support it
 would leave silently compiling bugs.

 -Steve

 Didn't you just say that you would check the string at compile time?

You can if you make it a template parameter.  For example, my openFile  
function that I wrote does this (in fact, I needed a template mode string  
because the return type depends on it).  The downside is you cannot pass a  
runtime-generated string.  I cannot actually think of any use cases for  
that however.

In any case, the existing API does not use a template parameter, and we  
have to try and break as little code as possible.

I wonder if there's a way to give the option of using a template parameter  
or using a positional parameter without having two different symbol  
names.  hm...

openFile!(string modedefault = "r")(string filename, string mode =  
modedefault) if (isValidOpenMode(modedefault))
{
    if(!isValidOpenMode(mode))
       throw new Exception("invalid file open mode: " ~ mode);
    ...
}

Would that work?

-Steve
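
A runnable sketch of the idea above, under some assumptions: it adds a return type (File) that the snippet leaves out, and isValidOpenMode is a hypothetical validator that accepts only the classic fopen modes. The file name used in main is likewise only an example.

```d
import std.exception : assertThrown;
import std.file : remove, tempDir;
import std.path : buildPath;
import std.stdio : File;

// Hypothetical validator; usable both at compile time (CTFE) and at run time.
bool isValidOpenMode(string mode)
{
    foreach (valid; ["r", "w", "a", "rb", "wb", "ab", "r+", "w+", "a+",
                     "rb+", "wb+", "ab+", "r+b", "w+b", "a+b"])
    {
        if (mode == valid)
            return true;
    }
    return false;
}

// One symbol, two ways to pass the mode: as a template argument (checked by
// the constraint at compile time) or as a run-time argument (checked below).
File openFile(string modeDefault = "r")(string filename,
                                        string mode = modeDefault)
    if (isValidOpenMode(modeDefault))
{
    if (!isValidOpenMode(mode))
        throw new Exception("invalid file open mode: " ~ mode);
    return File(filename, mode);
}

void main()
{
    auto path = buildPath(tempDir(), "openfile_sketch.txt");

    auto f = openFile!"w"(path);         // mode checked at compile time
    f.writeln("hello");
    f.close();

    auto g = openFile(path, "r");        // mode checked at run time
    g.close();

    assertThrown(openFile(path, "rw"));  // "rw" is not an fopen mode
    remove(path);
    // openFile!"rw"(path);              // rejected by the constraint
}
```

One consequence of this design is that the return type can no longer depend on the mode, since the run-time argument may differ from the template default.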

Sep 08 2011

"Simen Kjaeraas" <simen.kjaras gmail.com> writes:

On Thu, 08 Sep 2011 15:17:51 +0200, Steven Schveighoffer  
<schveiguy yahoo.com> wrote:

 I wonder if there's a way to give the option of using a template  
 parameter or using a positional parameter without having two different  
 symbol names.  hm...

 openFile!(string modedefault = "r")(string filename, string mode =  
 modedefault) if (isValidOpenMode(modedefault))
 {
     if(!isValidOpenMode(mode))
        throw new Exception("invalid file open mode: " ~ mode);
     ...
 }

 Would that work?

Neat! And yes, it certainly does work. I'm still unsure when someone
will actually need to specify that at runtime, but maybe for scripting
languages?

-- 
   Simen

Sep 08 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/8/11 11:11 AM, Simen Kjaeraas wrote:
 On Thu, 08 Sep 2011 15:17:51 +0200, Steven Schveighoffer
 <schveiguy yahoo.com> wrote:

 I wonder if there's a way to give the option of using a template
 parameter or using a positional parameter without having two different
 symbol names. hm...

 openFile!(string modedefault = "r")(string filename, string mode =
 modedefault) if (isValidOpenMode(modedefault))
 {
 if(!isValidOpenMode(mode))
 throw new Exception("invalid file open mode: " ~ mode);
 ...
 }

 Would that work?

 Neat! And yes, it certainly does work. I'm still unsure when someone
 will actually need to specify that at runtime, but maybe for scripting
 languages?

My opinion: we're spending way too much energy on this. File I/O poses 
much more difficult problems than choosing representation of open flags.

Andrei

Sep 08 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-08 15:17, Steven Schveighoffer wrote:
 You can if you make it a template parameter. For example, my openFile
 function that I wrote does this (in fact, I needed a template mode
 string because the return type depends on it). The downside is you
 cannot pass a runtime-generated string. I cannot actually think of any
 use cases for that however.

 In any case, the existing API does not use a template parameter, and we
 have to try and break as little code as possible.

 I wonder if there's a way to give the option of using a template
 parameter or using a positional parameter without having two different
 symbol names. hm...

 openFile!(string modedefault = "r")(string filename, string mode =
 modedefault) if (isValidOpenMode(modedefault))
 {
 if(!isValidOpenMode(mode))
 throw new Exception("invalid file open mode: " ~ mode);
 ...
 }

 Would that work?

 -Steve

That looks nice if it works.

-- 
/Jacob Carlborg

Sep 08 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/08/2011 03:05 PM, Jacob Carlborg wrote:
 On 2011-09-08 13:04, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:27:43 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

 But "r+" is. And that's what I assume will be used when I see a file
 opening function taking a string "mode" parameter. But if you say that
 "rw" can/will be used instead, then that's better.

 Yes, I'll try to add "rw" and maybe some other letter combinations that
 make sense in my next version.

 But I think we still have to support "r+", even though it's esoteric,
 because too much existing code already does this, and to not support it
 would leave silently compiling bugs.

 -Steve

 Didn't you just say that you would check the string at compile time?

That is not compatible with the auto f = File(name, mode); interface.

Sep 08 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

BTW, I think that using:

Mode.read | Mode.write

instead of "rw" follows the same principle as giving variables properly 
descriptive names instead of just "a" or "b".

-- 
/Jacob Carlborg

Sep 07 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/07/2011 09:30 AM, Jacob Carlborg wrote:
 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

 BTW, I think that using:

 Mode.read | Mode.write

 Instead of "rw" is the same thing as one should name variables with a
 proper descriptive names instead of just "a" or "b".

I disagree: "rw" is quite obvious because you have context. It is not

Mode.read | Mode.write vs "rw"

but

File("filename.txt", Mode.read | Mode.write);

vs

File("filename.txt", "rw");

Sep 07 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/07/2011 01:27 PM, Timon Gehr wrote:
 On 09/07/2011 09:30 AM, Jacob Carlborg wrote:
 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

 BTW, I think that using:

 Mode.read | Mode.write

 Instead of "rw" is the same thing as one should name variables with a
 proper descriptive names instead of just "a" or "b".

 I disagree: "rw" is quite obvious because you have context. It is not

 Mode.read | Mode.write vs "rw"

 but

 File("filename.txt", Mode.read | Mode.write);

 vs

 File("filename.txt", "rw");

Oh, btw:

final switch(Mode.read|Mode.write){
     case Mode.read: writeln(1); break;
     case Mode.write: writeln(2); break;
}

=> 2

hm...

Sep 07 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, September 07, 2011 13:32:46 Timon Gehr wrote:
 On 09/07/2011 01:27 PM, Timon Gehr wrote:
 On 09/07/2011 09:30 AM, Jacob Carlborg wrote:
 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in
 terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:
 
 fstream ifs("filename.txt", ios_base::in | ios_base::out);
 
 vs.
 
 File("filename.txt", "r+"); // or "rw"
 
 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.
 
 -Steve

 
 BTW, I think that using:
 
 Mode.read | Mode.write
 
 Instead of "rw" is the same thing as one should name variables with a
 proper descriptive names instead of just "a" or "b".

 
 I disagree: "rw" is quite obvious because you have context. It is not
 
 Mode.read | Mode.write vs "rw"
 
 but
 
 File("filename.txt", Mode.read | Mode.write);
 
 vs
 
 File("filename.txt", "rw");

 
 Oh, btw:
 
 final switch(Mode.read|Mode.write){
      case Mode.read: writeln(1); break;
      case Mode.write: writeln(2); break;
 }
 
 => 2
 
 hm...

Personally, I don't think that &ing or |ing enums should result in an enum, 
and this case illustrates one reason why. But ultimately, the main issue IMHO 
is that &ing or |ing enums doesn't generally result in a valid enum value, so 
it just doesn't make sense.

- Jonathan M Davis

Sep 07 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/07/2011 01:42 PM, Jonathan M Davis wrote:
 On 09/07/2011 01:27 PM, Timon Gehr wrote:
 Oh, btw:

 final switch(Mode.read|Mode.write){
       case Mode.read: writeln(1); break;
       case Mode.write: writeln(2); break;
 }

 =>  2

 hm...


Actually, it will print nothing, not even an assertion failure; my enum 
definition was wrong.

 Personally, I don't think that &ing or |ing enums should result in an enum,
 and this case illustrates one reason why. But ultimately, the main issue IMHO
 is that &ing or |ing enums doesn't generally result in a valid enum value, so
 it just doesn't make sense.

Yes exactly. That is why I always use

alias int MODE;
enum:MODE{
     MODEread=1,
     MODEwrite=2,
}

Sep 07 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Wednesday, September 07, 2011 14:16:55 Timon Gehr wrote:
 On 09/07/2011 01:42 PM, Jonathan M Davis wrote:
 On 09/07/2011 01:27 PM, Timon Gehr wrote:
 Oh, btw:
 
 final switch(Mode.read|Mode.write){
 
       case Mode.read: writeln(1); break;
       case Mode.write: writeln(2); break;
 
 }
 
 =>  2
 
 hm...


 
 Actually, it will print nothing, not even an Assertion failure, my enum
 definition was wrong

Did you compile with -w? I don't remember if that affects final switch or not, 
but there's definitely a problem if you can get final switch to take a value 
that it doesn't handle without using a cast.

 Personally, I don't think that &ing or |ing enums should result in an
 enum, and this case illustrates one reason why. But ultimately, the
 main issue IMHO is that &ing or |ing enums doesn't generally result in
 a valid enum value, so it just doesn't make sense.

 
 Yes exactly. That is why I always use
 
 alias int MODE;
 enum:MODE{
      MODEread=1,
      MODEwrite=2,
 }

And how is that any different from

alias int MODE;
enum MODEread = 1;
enum MODEwrite = 2;

They're manifest constants, not enum values. So, you're basically suggesting 
that flags be done with manifest constants as opposed to enums? That doesn't 
encapsulate as well IMHO, and I'd still object to a function having a MODE 
parameter, since that implies that a MODE is a single flag, whereas it's a 
group of flags - that and as far as Phobos goes, we don't generally use aliases 
like that (of course, we don't name types in all caps or start variable or 
enum value names with uppercase characters either, so what Phobos does 
obviously isn't necessarily what you stick to).

- Jonathan M Davis

Sep 07 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/07/2011 10:49 PM, Jonathan M Davis wrote:
 On Wednesday, September 07, 2011 14:16:55 Timon Gehr wrote:
 On 09/07/2011 01:42 PM, Jonathan M Davis wrote:
 On 09/07/2011 01:27 PM, Timon Gehr wrote:
 Oh, btw:

 final switch(Mode.read|Mode.write){

        case Mode.read: writeln(1); break;
        case Mode.write: writeln(2); break;

 }

 =>   2

 hm...


 Actually, it will print nothing, not even an Assertion failure, my enum
 definition was wrong

 Did you compile with -w? I don't remember if that affects final switch or not,
 but there's definitely a problem if you can get final switch to take a value
 that it doesn't handle without using a cast.

final switch works the same with or without warnings. Basically final 
switch is wrong in assuming that enumerations can only contain the 
declared values, because the bitwise operators work on enums.

 Personally, I don't think that &ing or |ing enums should result in an
 enum, and this case illustrates one reason why. But ultimately, the
 main issue IMHO is that &ing or |ing enums doesn't generally result in
 a valid enum value, so it just doesn't make sense.

 Yes exactly. That is why I always use

 alias int MODE;
 enum:MODE{
       MODEread=1,
       MODEwrite=2,
 }

 And how is that any different from

 alias int MODE;
 enum MODEread = 1;
 enum MODEwrite = 2;

It is not. But there is currently no nice way to express a set of 
orthogonal flags. Enumerations are sometimes misused for it, but as you 
said, that does not make sense. I sometimes have small bugs because the 
alias is weakly typed, though.

 They're manifest constants, not enum values. So, you're basically suggesting
 that flags be done with manifest constants as opposed to enums? That doesn't
 encapsulate as well IMHO, and I'd still object to a function having a MODE
 parameter, since that implies that a MODE is a single flag, whereas it's a
 group of flags -

I'd argue that (MODEread | MODEwrite) is a single mode resulting from 
the composition of the MODEread and MODEwrite modes.

 that and as far as Phobos goes, we don't generally use aliases
 like that (of course, we don't name types in all caps or start variable or
 enum value names with uppercase characters either, so what Phobos does
 obviously isn't necessarily what you stick to).

That is why imho Phobos should not use enums for file modes. They are 
just not a good match, because the language is so confused about what is 
valid on enums and what is not.

Sep 07 2011

travert phare.normalesup.org (Christophe) writes:

 It is not. But there is currently no nice way to express a set of 
 orthogonal flags.

Well, you could use an array of flags ? Oh, wait, that is precisely what 
"r", "w", "rw" would be.
Another option is to use the power of typesafe variadic functions:

enum Mode :char { read, write }
File fOpen(string filename, Mode[]...);

auto file = fOpen("test.txt", Mode.read, Mode.write);

Isn't it much clearer than using (Mode.read | Mode.write)? Even using 
an explicit [Mode.read, Mode.write] sounds safer anyway. It uses more 
memory than using bit operators, but who cares about a few bytes when 
opening a whole file?
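
To make the variadic idea concrete, here is a rough sketch of how such a parameter list could be mapped back onto a C-style mode string. The helper toFopenMode is hypothetical, and the choice of "r+" for read plus write is an assumption (it requires the file to exist); a real design would have to decide that.

```d
enum Mode { read, write }

// Hypothetical helper: turns a typesafe variadic list of modes into an
// fopen-style mode string.
string toFopenMode(Mode[] modes...)
{
    assert(modes.length > 0, "at least one mode is required");

    bool r, w;
    foreach (m; modes)
    {
        final switch (m)
        {
            case Mode.read:  r = true; break;
            case Mode.write: w = true; break;
        }
    }

    if (r && w) return "r+"; // assumption: update an existing file
    if (w)      return "w";
    return "r";
}

unittest
{
    assert(toFopenMode(Mode.read, Mode.write) == "r+");
    assert(toFopenMode(Mode.write) == "w");
}
```

Here final switch is safe, because each element of the array is a single declared member rather than an OR'ed combination.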

-- 
Christophe

Sep 07 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/07/2011 11:59 PM, Christophe wrote:
 It is not. But there is currently no nice way to express a set of
 orthogonal flags.

 Well, you could use an array of flags ? Oh, wait, that is precisely what
 "r", "w", "rw" would be.

At least it is short.

 Another option is to use the power of typesafe variadic functions:

 enum Mode :char { read, write }
 File fOpen(string filename, Mode[]...);

 auto file = fOpen("test.txt", Mode.read, Mode.write);

 Isn't it much clearer than using (Mode.read | Mode.write)? Even using
 an explicit [Mode.read, Mode.write] sounds safer anyway. It uses more
 memory than using bit operators, but who cares about a few bytes when
 opening a whole file?

do you seriously prefer

auto f=fOpen("bah.txt",[Mode.read, Mode.write]);

over

auto f=fOpen("bah.txt","rw");


if that is the case, you could do

auto f=fOpen("bah.txt",encodeMode!([Mode.read, Mode.write]));


that even saves the few bytes.

Sep 07 2011

Tobias Pankrath <tobias pankrath.net> writes:

Christophe wrote:

 It is not. But there is currently no nice way to express a set of
 orthogonal flags.

 Well, you could use an array of flags ? Oh, wait, that is precisely what
 "r", "w", "rw" would be.
 Another option is to use the power of typesafe variadic functions:
 
 enum Mode :char { read, write }
 File fOpen(string filename, Mode[]...);
 
 auto file = fOpen("test.txt", Mode.read, Mode.write);

I like the variadic version most. Another alternative
would be to use an extra struct for flags that
supports OR'ing them in a better way than plain
enums do. Maybe we can get some inspiration from
http://doc.qt.nokia.com/4.7/qflags.html

I'd like to add that once we get a good IDE for D,
it won't be able to show me possible values of the mode
parameter if its type is just string.

And files may be not the only part of phobos that will
need flags. In the end, there should be a solution that
works even though the API and possible values are not
known from C.

Sep 08 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Wed, 07 Sep 2011 03:30:17 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

 BTW, I think that using:

 Mode.read | Mode.write

 Instead of "rw" is the same thing as one should name variables with a  
 proper descriptive names instead of just "a" or "b".

It's not the same.  "a" and "b" do not have any meaning, they are just  
variable names.  "r" stands for read and "w" stands for write.  It's  
pretty obvious that they do, especially in the context of opening a file.

I'd equate it to using i, j, k for index variables -- they are not  
descriptive, but in context, everyone knows what they mean.

And in response to the discussion about enum flags not being &ed or |ed
together, I emphatically think enums should be used for bitfields.
Remember, enum is not just an enumeration, it's a manifest constant.  I  
see no reason that we should not use the namespace-creation ability of  
enum to create such constants.  I don't see the downside.

-Steve

Sep 08 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/08/2011 01:13 PM, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:30:17 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

 BTW, I think that using:

 Mode.read | Mode.write

 Instead of "rw" is the same thing as one should name variables with a
 proper descriptive names instead of just "a" or "b".

 It's not the same. "a" and "b" do not have any meaning, they are just
 variable names. "r" stands for read and "w" stands for write. It's
 pretty obvious that they do, especially in the context of opening a file.

 I'd equate it to using i, j, k for index variables -- they are not
 descriptive, but in context, everyone knows what they mean.

I totally agree.

 And in response to the discussion about enum flags not being & or |
 together, I emphatically think enums should be used for bitfields.
 Remember, enum is not just an enumeration, it's a manifest constant.

enum Enumeration{
     field0,
     field1,
}

enum manifestConstant=0;

 I see no reason that we should not use the namespace-creation ability of
 enum to create such constants. I don't see the downside.

The downside is that e.g. final switch incorrectly assumes that enum 
values are not composable. It is imho a small inconsistency in the 
language's design.

Sep 08 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 08 Sep 2011 08:20:46 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/08/2011 01:13 PM, Steven Schveighoffer wrote:

 And in response to the discussion about enum flags not being & or |
 together, I emphatically think enums should be used for bitfields.
 Remember, enum is not just an enumeration, it's a manifest constant.

 enum Enumeration{
      field0,
      field1,
 }

 enum manifestConstant=0;

There are other forms too:

enum MyConstants
{
    const1 = 5,
    const2 = 42,
}

enum flags {
    flag1 = 0x01,
    flag2 = 0x02,
    flag3 = 0x04,
}

Those are clearly manifest constants with a namespace.  The last one is a  
bitfield.

 I see no reason that we should not use the namespace-creation ability of
 enum to create such constants. I don't see the downside.

 The downside is that e.g. final switch incorrectly assumes that enum
 values are not composable. It is imho a small inconsistency in the
 language's design.

So don't use final switch?  Again, not all enums are enumerations, you  
have to judge whether final switch is applicable based on interpretation  
of that.

-Steve

Sep 08 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-08 13:13, Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:30:17 -0400, Jacob Carlborg <doob me.com> wrote:

 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:

 fstream ifs("filename.txt", ios_base::in | ios_base::out);

 vs.

 File("filename.txt", "r+"); // or "rw"

 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.

 -Steve

 BTW, I think that using:

 Mode.read | Mode.write

 Instead of "rw" is the same thing as one should name variables with a
 proper descriptive names instead of just "a" or "b".

 It's not the same. "a" and "b" do not have any meaning, they are just
 variable names. "r" stands for read and "w" stands for write. It's
 pretty obvious that they do, especially in the context of opening a file.

I guess it's a little clearer in the context of opening a file. "a" can 
be short for "apple" and "b" can be short for "beer".

-- 
/Jacob Carlborg

Sep 08 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, September 08, 2011 07:13:48 Steven Schveighoffer wrote:
 On Wed, 07 Sep 2011 03:30:17 -0400, Jacob Carlborg <doob me.com> wrote:
 On 2011-09-06 19:39, Steven Schveighoffer wrote:
 I like enums in terms of writing code that processes them, but in terms
 of calling functions with them, I mean look at a sample fstream
 constructor in C++:
 
 fstream ifs("filename.txt", ios_base::in | ios_base::out);
 
 vs.
 
 File("filename.txt", "r+"); // or "rw"
 
 There's just no way you can think "rw" is less descriptive or
 understandable than ios_base::in | ios_base::out.
 
 -Steve

 
 BTW, I think that using:
 
 Mode.read | Mode.write
 
 Instead of "rw" is the same thing as one should name variables with a
 proper descriptive names instead of just "a" or "b".

 
 It's not the same.  "a" and "b" do not have any meaning, they are just
 variable names.  "r" stands for read and "w" stands for write.  It's
 pretty obvious that they do, especially in the context of opening a file.
 
 I'd equate it to using i, j, k for index variables -- they are not
 descriptive, but in context, everyone knows what they mean.
 
 And in response to the discussion about enum flags not being & or |
 together, I emphatically think enums should be used for bitfields.
 Remember, enum is not just an enumeration, it's a manifest constant.  I
 see no reason that we should not use the namespace-creation ability of
 enum to create such constants.  I don't see the downside.

I think that it makes perfect sense to use enums for flags. What I don't think 
makes sense is making the type of the variable which holds the flags to be that 
enum type unless _every_ possible combination of flags has its own flag so that 
&ing or |ing enums always results in a valid enum. I have no gripe with using 
enums for flags. It's using an enum to hold a value which is not a valid value 
for that enum which is the problem IMHO.

- Jonathan M Davis

Sep 08 2011

Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:

On 9/8/11 2:02 PM, Jonathan M Davis wrote:
 I think that it makes perfect sense to use enums for flags. What I don't think
 makes sense is making the type of the variable which holds the flags to be that
 enum type unless _every_ possible combination of flags has its own flag so that
 &ing or |ing enums always results in a valid enum.

This ain't going to work because it would require the human user to 
write by hand a combinatorial number of symbols.

A lightweight fixed-sized set with named members is a worthy abstraction 
for the standard library.
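
One possible shape for such a set, as a sketch only: the names ModeSet, of and has are invented here and the type is hard-wired to a Mode enum, whereas a library version would presumably be generic over the enum.

```d
enum Mode : uint { read = 1, write = 2, append = 4 }

// Hypothetical flag set: a type distinct from Mode, so a combination of
// flags is never mistaken for a single enum member.
struct ModeSet
{
    private uint bits;

    // Builds a set from any number of individual flags.
    static ModeSet of(Mode[] modes...)
    {
        ModeSet s;
        foreach (m; modes)
            s.bits |= m;
        return s;
    }

    // True if the given flag is in the set.
    bool has(Mode m) const { return (bits & m) != 0; }

    // Union of two sets.
    ModeSet opBinary(string op : "|")(ModeSet rhs) const
    {
        ModeSet s;
        s.bits = bits | rhs.bits;
        return s;
    }
}

unittest
{
    auto rw = ModeSet.of(Mode.read) | ModeSet.of(Mode.write);
    assert(rw.has(Mode.read) && rw.has(Mode.write));
    assert(!rw.has(Mode.append));

    // A ModeSet does not convert to Mode, so it cannot reach a final switch
    // over Mode by accident.
    static assert(!is(ModeSet : Mode));
}
```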


Andrei

Sep 08 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Thursday, September 08, 2011 15:04:56 Andrei Alexandrescu wrote:
 On 9/8/11 2:02 PM, Jonathan M Davis wrote:
 I think that it makes perfect sense to use enums for flags. What I don't
 think makes sense is making the type of the variable which holds the
 flags to be that enum type unless _every_ possible combination of flags
 has its own flag so that &ing or |ing enums always results in a valid
 enum.

 
 This ain't going to work because it would require the human user to
 write by hand a combinatorial number of symbols.
 
 A lightweight fixed-sized set with named members is a worthy abstraction
 for the standard library.

I agree. I'm not arguing that the user _should_ create such a combination of 
flags. That would be horrible. I'm just arguing that having a set of flags with 
enums, e.g.

enum Flag { a = 1, b = 2, c = 4, d = 8 };

and then having Flag.a | Flag.b or Flag.a & Flag.b result in a value of type 
Flag is not a good idea, because the result isn't a valid Flag. It should 
result in whatever the base type is (int in this case), and functions which 
take such flags &ed or |ed should take them using the base type, not the enum 
type.
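
Roughly what I mean, as a sketch: the names allFlags and accepts are made up for illustration, and the validity check is just one way a function taking the base type could guard itself.

```d
enum Flag { a = 1, b = 2, c = 4, d = 8 }

// Every bit that corresponds to a declared Flag member.
enum int allFlags = Flag.a | Flag.b | Flag.c | Flag.d;

// Hypothetical consumer: takes the combined flags as int, not as Flag.
bool accepts(int flags)
{
    // Reject any bit that no Flag member defines.
    return (flags & ~allFlags) == 0;
}

unittest
{
    // Flag converts implicitly to its base type, so the caller still
    // builds the value from the named constants.
    int combined = Flag.a | Flag.b;
    assert(accepts(combined));
    assert(!accepts(16));
}
```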

- Jonathan M Davis

Sep 08 2011

Timon Gehr <timon.gehr gmx.ch> writes:

On 09/08/2011 10:33 PM, Jonathan M Davis wrote:
 On Thursday, September 08, 2011 15:04:56 Andrei Alexandrescu wrote:
 On 9/8/11 2:02 PM, Jonathan M Davis wrote:
 I think that it makes perfect sense to use enums for flags. What I don't
 think makes sense is making the type of the variable which holds the
 flags to be that enum type unless _every_ possible combination of flags
 has its own flag so that &ing or |ing enums always results in a valid
 enum.

 This ain't going to work because it would require the human user to
 write by hand a combinatorial number of symbols.

 A lightweight fixed-sized set with named members is a worthy abstraction
 for the standard library.

 I agree. I'm not arguing that the user _should_ create such a combination of
 flags. That would be horrible. I'm just arguing that having a set of flags with
 enums, e.g.

 enum Flag { a = 1, b = 2, c = 4, d = 8 };

 and then having Flag.a | Flag.b or Flag.a & Flag.b result in a value of type
 Flag is not a good idea, because the result isn't a valid Flag. It should
 result in whatever the base type is (int in this case), and functions which
 take such flags &ed or |ed should take them using the base type, not the enum
 type.

 - Jonathan M Davis

+1.

Sep 08 2011

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

On Thu, 08 Sep 2011 17:34:50 -0400, Timon Gehr <timon.gehr gmx.ch> wrote:

 On 09/08/2011 10:33 PM, Jonathan M Davis wrote:
 On Thursday, September 08, 2011 15:04:56 Andrei Alexandrescu wrote:
 On 9/8/11 2:02 PM, Jonathan M Davis wrote:
 I think that it makes perfect sense to use enums for flags. What I don't
 think makes sense is making the type of the variable which holds the
 flags to be that enum type unless _every_ possible combination of flags
 has its own flag so that &ing or |ing enums always results in a valid
 enum.

 This ain't going to work because it would require the human user to
 write by hand a combinatorial number of symbols.

 A lightweight fixed-sized set with named members is a worthy abstraction
 for the standard library.

 I agree. I'm not arguing that the user _should_ create such a combination of
 flags. That would be horrible. I'm just arguing that having a set of flags with
 enums, e.g.

 enum Flag { a = 1, b = 2, c = 4, d = 8 };

 and then having Flag.a | Flag.b or Flag.a & Flag.b result in a value of type
 Flag is not a good idea, because the result isn't a valid Flag. It should
 result in whatever the base type is (int in this case), and functions which
 take such flags &ed or |ed should take them using the base type, not the enum
 type.

 - Jonathan M Davis

 +1.

I could go either way on this.  On one hand, it's nice to say "this is a  
bitfield, and the compiler will force you to use my enumeration constants  
to build it", and on the other hand, anyone who passes in integers  
(especially something non-hex or non-binary like 12345) is asking for  
code-review rejection ;)

I did use an enumeration argument that included a single bit which could  
be or'd in the stdio overhaul.  It was still verifying the enum was valid  
in the contract, so it just as easily could be uint (or maybe it was  
ubyte?).  I don't suppose the type checking is all that critical.

-Steve

Sep 08 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 17:39, Steven Schveighoffer wrote:
 On Tue, 06 Sep 2011 11:11:27 -0400, Andrei Alexandrescu
 Honest, C's openmode strings have been around for so long, they hardly
 confuse anyone anymore. I'd rather use "rw" and call it a day.

 That's not a valid fopen string ;)

 The plus "+" is odd, especially with "a" meaning "append".

 And there's that really useless "b" :)

Exactly.

 But I think this does *not* invalidate the usage of strings to denote
 open mode, it just needs more design. The good thing about it is, we can
 augment the string flags and be binary and perfectly backwards compatible.

 -Steve


-- 
/Jacob Carlborg

Sep 07 2011

Jacob Carlborg <doob me.com> writes:

On 2011-09-06 17:11, Andrei Alexandrescu wrote:
 On 9/6/11 10:05 AM, Jacob Carlborg wrote:
 On 2011-09-06 15:02, Steven Schveighoffer wrote:
 Yeah, creating a new file is implied by a combination of modes.

 The one that's confusing I think is that "a" is for append, but "+" kind
 of tacks on appending to any other mode. It's not the most well-designed
 spec for file opening. Add to that you have the "b" which is a noop on
 most OSes.

 There is the possibility that we could accept an alternative open mode
 string, which we could design better. But we have to keep fopen's spec,
 it's already used everywhere.

 -Steve

 Ok, I would prefer to use enums if they have sensible names. Something
 like this:

 File.open(Mode.read | Mode.write); // for both read and write

 Honest, C's openmode strings have been around for so long, they hardly
 confuse anyone anymore. I'd rather use "rw" and call it a day.

 Andrei

I disagree.

-- 
/Jacob Carlborg

Sep 07 2011

Kagamin <spam here.lot> writes:

Andrei Alexandrescu Wrote:

 http://erdani.com/d/new-stdio/phobos-prerelease/std_stdio.html

Ddoc screwed the types, right?

Sep 05 2011

Jonathan M Davis <jmdavisProg gmx.com> writes:

On Tuesday, September 06, 2011 19:29:13 Josh Simmons wrote:
 On Tue, Sep 6, 2011 at 7:09 PM, Jonathan M Davis <jmdavisProg gmx.com> wrote:

 libraries and have done quite well with them. In fact, I believe that
 the large size of their standard libraries is generally seen as major
 advantage of those languages.
 
 No, we can't have everything in the standard library. No, an XML parser
 in the standard library likely won't meet everyone's needs. However,
 having a large standard library can be of great benefit to the users of
 the language even if it doesn't solve every problem that they could
 possibly have. The question isn't really whether we should add stuff
 like XML parsing to Phobos. The question is what is the best general
 implementation for such a module and whether we can get an
 implementation of high enough quality to be able to go in the standard
 library. It's a question of time, manpower, and quality.
 
 Obviously, Phobos is not going to explode in size overnight, but it _is_
 going to grow in size, and eventually it should be fairly large. We
 already have several useful additions in the review queue which will
 likely make it into Phobos in one form or another over the next few
 months.
 
 - Jonathan M Davis

 

 their massive standard libraries too.
 
 I just think the effort is better spent creating a solid language and
 encouraging third party libraries through better tools.

For the most part, the folks working on Phobos are not the same folks who work 
on dmd. There's some overlap, but they're definitely not the same people. So, 
the fact that people are working on the standard library does _nothing_ to 
slow the language down. If anything, it helps, because it provides a standard 
code base which uses (and therefore tests) the various features of the 
language. Third party libraries are great, but I don't see why you would ever 
want to discourage development of a language's standard library in favor of 
third party libraries. In some cases, modules in the standard library have 
originated in third party libraries anyway.


doesn't mean that Phobos shouldn't try and be as large as it can be while 
still maintaining high quality.

- Jonathan M Davis

Sep 06 2011
