digitalmars.D - Finalizing D2

reply Jason House <jason.james.house gmail.com> writes:
Andrei has indicated that the current plan is to finalize D2 when his book
comes out.

Given this, I'm interested in what _community_ activity should be done as part
of this. 

Should there be a formal review and polishing of the D spec? More than just
criticizing faults, people should submit patches or open a discussion of what
something means. Unimplemented features should be clearly marked or removed. 

Should the final freezing of D2 be delayed until major D1 libraries port to D2?
I'm mostly thinking of Tango, but I bet there are others. It may even be good
if major libraries could use a Phobos-compatible license and become part of the
releases by digital mars. 

Can we generate a bugfix most wanted list? The formal list could inspire
patches by motivated community members. There should be a quality requirement
and a review process for submissions.


To do this, we only need coordinators and a willingness from Walter to promptly
handle all the patch submissions. (I don't care if Walter delegates, but it's
tough to get motivated to do work if there's no promise for using the output of
one's hard work.) Walter should also be able to use a red pen on the most-wanted
list before the tasks are given out.

Thoughts?
May 22 2009
next sibling parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from Jason House (jason.james.house gmail.com)'s article
 Andrei has indicated that the current plan is to finalize D2 when his book
 comes out.

 Given this, I'm interested in what _community_ activity should be done as part
 of this.

 Should there be a formal review and polishing of the D spec? More than just
 criticizing faults, people should submit patches or open a discussion of what
 something means. Unimplemented features should be clearly marked or removed.

Given that it's a fairly daunting task to review every dark corner of the spec, I think D2 will initially need to be declared somewhere between alpha and stable, i.e. beta, at least for a while. This means no changes that break code in non-trivial ways, i.e. ways that require significant portions to be redesigned, but things like bug fixes that break a few corner cases that rely on the bug are ok.

 Should the final freezing of D2 be delayed until major D1 libraries port to D2?
 I'm mostly thinking of Tango, but I bet there are others. It may even be good
 if major libraries could use a Phobos-compatible license and become part of the
 releases by digital mars.

Good question, it's kind of a chicken and egg problem. My gut feeling is that D2 must be frozen so that it's not a moving target and those ports can happen. On the other hand, if the Tango people need one or two small changes to the core language to simplify their port, it could be worth implementing. Then again, the Tango people have been pretty clear about what they need, which is mostly a stable spec and one or two small enhancements that have already been filed in Bugzilla.

Also, I think it would be worth it to eventually nominate a few generally useful modules from various community-developed libs for inclusion in Phobos, but non-breaking additions to Phobos don't really need to be completed before D2 is finalized.

 Can we generate a bugfix most wanted list? The formal list could inspire
 patches by motivated community members. There should be a quality requirement
 and a review process for submissions.

This is pretty much already done via the voting feature in Bugzilla.
May 22 2009
parent Jason House <jason.james.house gmail.com> writes:
dsimcha wrote:

 == Quote from Jason House (jason.james.house gmail.com)'s article
 Andrei has indicated that the current plan is to finalize D2 when his
 book comes out.

 Given this, I'm interested in what _community_ activity should be done as
 part of this.

 Should there be a formal review and polishing of the D spec? More than
 just criticizing faults, people should submit patches or open a discussion
 of what something means. Unimplemented features should be clearly marked
 or removed.

 Given that it's a fairly daunting task to review every dark corner of the
 spec, I think D2 will initially need to be declared somewhere between
 alpha and stable, i.e. beta, at least for a while. This means no changes
 that break code in non-trivial ways, i.e. ways that require significant
 portions to be redesigned, but things like bug fixes that break a few
 corner cases that rely on the bug are ok.

Given how Andrei's book is at least half written, I'm going to guess that *all* major features for D2 have been decided. Maybe the TLS change was the last breaking change. Having a 100% stable dmd compiler isn't really required. All that is needed is for everyone to understand what is planned. I think that's pretty easy to do. Even if there are a few small surprises along the way, it should be possible to update one small part of the spec.

Obviously, it really doesn't make a lot of sense for the community to go forward with stuff like this without at least a confirmation that such efforts won't be in vain. (A recurring element in all of these ideas is that at least a small amount of effort/communication is needed by D's core contributors, but that the bulk of the work is placed on the community as a whole instead of on them.)

It'd be nice if we could start setting a date for when D2 will put a freeze on major features (maybe call it a beta release, a release candidate, whatever).

 Should the final freezing of D2 be delayed until major D1 libraries port
 to D2? I'm mostly thinking of Tango, but I bet there are others. It may
 even be good if major libraries could use a Phobos-compatible license and
 become part of the releases by digital mars.

 Good question, it's kind of a chicken and egg problem. My gut feeling is
 that D2 must be frozen so that it's not a moving target and those ports
 can happen. On the other hand, if the Tango people need one or two small
 changes to the core language to simplify their port, it could be worth
 implementing. On the other hand, the Tango people have been pretty clear
 about what they need, which is mostly a stable spec and one or two small
 enhancements that have already been filed in Bugzilla.

It is definitely a chicken and egg problem, but I think it's relatively easy to work through. All we need is a period of relative stability where changes are driven mostly by deviations from a D2 spec. Actually that makes me realize that this should probably be a sequential process. Maybe we finalize the spec first before pushing for any final fixes. It's probably possible to even have other features going into the compiler while the spec is developed, but I think it can't be something drastic like a replacement const system ;)

 Also, I think it would be worth it to eventually nominate a few generally
 useful modules from various community-developed libs for inclusion in
 Phobos, but non-breaking additions to Phobos don't really need to be
 completed before D2 is finalized.

I agree. When the D2 spec, compiler, and libs are in shape for finalization, there should be a decision on whether the libraries for D2 should freeze along with the compiler specification. I'm guessing Walter will vote yes, but I bet some will want something more than the D2 equivalent of "D1 Phobos is a dead library. Let's all use Tango that's better maintained."

 Can we generate a bugfix most wanted list? The formal list could inspire
 patches by motivated community members. There should be a quality
 requirement and a review process for submissions.

 This is pretty much already done via the voting feature in Bugzilla.

That may be, but there are a few other important nuances:

1. Only bugs relative to the D2 spec should matter. Feature enhancements (and bugs whose fix would essentially require an enhancement) should not be considered.
2. Once a set of bugs is selected, someone should start soliciting patch submissions from the community.
3. Not everyone votes on bugs, so if that's the way this is done, it should be made official. Once it's official, I can guarantee there will be more votes in bugzilla!
May 22 2009
prev sibling next sibling parent reply BCS <none anon.com> writes:
Hello Jason,

 Should the final freezing of D2 be delayed until major D1 libraries
 port to D2? I'm mostly thinking of Tango, but I bet there are others.
 It may even be good if major libraries could use a Phobos-compatible
 license and become part of the releases by digital mars.

Maybe it should be declared "done" as in it's got everything that Walter, Andrei, Bartosz and friends want in it, but it might be changed if the lib writers ask for some tweaks. Sort of a "feature" freeze.
May 22 2009
parent reply Jason House <jason.james.house gmail.com> writes:
BCS wrote:

 Hello Jason,

 Should the final freezing of D2 be delayed until major D1 libraries
 port to D2? I'm mostly thinking of Tango, but I bet there are others.
 It may even be good if major libraries could use a Phobos-compatible
 license and become part of the releases by digital mars.

 Maybe it should be declared "done" as in it's got everything that
 Walter, Andrei, Bartosz and friends want in it, but it might be changed
 if the lib writers ask for some tweaks. Sort of a "feature" freeze.

Yes! "Walter, Andrei, Bartosz, and friends": If you're reading this, can you shed some light on what's happening before D2 is declared stable? And when?
May 22 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 BCS wrote:

 Hello Jason,

 Should the final freezing of D2 be delayed until major D1 libraries
 port to D2? I'm mostly thinking of Tango, but I bet there are others.
 It may even be good if major libraries could use a Phobos-compatible
 license and become part of the releases by digital mars.

 Maybe it should be declared "done" as in it's got everything that
 Walter, Andrei, Bartosz and friends want in it, but it might be changed
 if the lib writers ask for some tweaks. Sort of a "feature" freeze.

 Yes! "Walter, Andrei, Bartosz, and friends": If you're reading this, can
 you shed some light on what's happening before D2 is declared stable? And
 when?

I've submitted the first three chapters to Rough Cuts. I will make progress towards writing up until the end of August. The last chapter concerns concurrency and is the fuzziest one.

Thank you for your initiative to enlist help from the community. There's a lot of very visible help already happening: there's been a sharp increase in bug reports and patches recently. Walter and I are still scratching our head over that (it's not like dmd got much crappier overnight). I can only infer that more people have started using more of D.

I'd be thrilled to add more stuff to Phobos. Stuff can be done with ranges that's almost indistinguishable from poetry. But ranges aren't everything, Georg :o). I think Shin's BlackHole and WhiteHole slammed open a door to a world of amazing possibilities. Things like compile-time reflection, run-time reflection, and dynamic loading are very hot and the possibilities are huge. Among other things, Variant can with relative ease implement a function var.call("fun", arg1, arg2) that forwards everything dynamically to a member function of the embedded object.

So, there's no need to worry about not being listened to. If you do great things, they will be noticed.

Andrei
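
As a rough illustration of the var.call("fun", arg1, arg2) idea mentioned in this post, the sketch below forwards calls by run-time name through a table of delegates. It is not how std.variant is implemented, and it does not use reflection. DynamicObject, register, wrap and Counter are hypothetical names made up for the example; only Variant itself comes from Phobos.

```d
import std.variant : Variant;

/// Hypothetical wrapper: maps method names to delegates at run time.
struct DynamicObject
{
    private Variant delegate(Variant[])[string] methods;

    /// Associates a name with a delegate taking boxed arguments.
    void register(string name, Variant delegate(Variant[]) impl)
    {
        methods[name] = impl;
    }

    /// Boxes the arguments and forwards them to the named method.
    Variant call(Args...)(string name, Args args)
    {
        auto impl = name in methods;
        if (impl is null)
            throw new Exception("no such method: " ~ name);
        Variant[] packed;
        foreach (arg; args)
            packed ~= Variant(arg);
        return (*impl)(packed);
    }
}

/// Hypothetical class whose method is exposed by name.
class Counter
{
    int total;
    int add(int a, int b) { total += a + b; return total; }
}

/// Registers Counter.add by hand; a real design would generate this.
DynamicObject wrap(Counter c)
{
    DynamicObject d;
    d.register("add",
        (Variant[] a) => Variant(c.add(a[0].get!int, a[1].get!int)));
    return d;
}
```

With these definitions, wrap(new Counter).call("add", 1, 2) returns a Variant holding an int. Generating the registration automatically from the wrapped type is the part that would need the compile-time or run-time reflection discussed in the post.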
May 22 2009
next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 there's been a sharp 
 increase in bug reports and patches recently. Walter and I are still 
 scratching our head over that (it's not like dmd got much crappier 
 overnight). I can only infer that more people have started using more of D.

I think it's mostly a complex consequence of showing DMD source code. I have "predicted" this outcome in one post more than one year ago.

Bye,
bearophile
May 22 2009
parent reply Don <nospam nospam.com> writes:
bearophile wrote:
 Andrei Alexandrescu:
 there's been a sharp
 increase in bug reports and patches recently. Walter and I are still
 scratching our head over that (it's not like dmd got much crappier
 overnight). I can only infer that more people have started using more of D.

 I think it's mostly a complex consequence of showing DMD source code. I
 have "predicted" this outcome in one post more than one year ago.

Yes. It's the simple fact that you can compile DMD out-of-the-box. In fact, everyone who has downloaded DMD is "forced" to have a working copy of the source code! It's interesting to compare this with GDC, which, with the GNU license, is a purer form of "free software". Yet, it's amazingly difficult to get it to compile (I tried once, and failed). It's not just about having source code "available".
May 22 2009
next sibling parent Don <nospam nospam.com> writes:
Brad Roberts wrote:
 Don wrote:
 bearophile wrote:
 Andrei Alexandrescu:
 there's been a sharp increase in bug reports and patches recently.
 Walter and I are still scratching our head over that (it's not like
 dmd got much crappier overnight). I can only infer that more people
 have started using more of D.

 I think it's mostly a complex consequence of showing DMD source code.
 I have "predicted" this outcome in one post more than one year ago.

 Yes. It's the simple fact that you can compile DMD out-of-the-box. In
 fact, everyone who has downloaded DMD is "forced" to have a working copy
 of the source code! It's interesting to compare this with GDC, which,
 with the GNU license, is a purer form of "free software". Yet, it's
 amazingly difficult to get it to compile (I tried once, and failed).
 It's not just about having source code "available".

 I don't believe that to be the case. That would explain why more _fixes_
 are being provided (primarily thanks to your contributions), but not why
 there's been an increase in bug _filing_.

 http://d.puremagic.com/issues/reports.cgi?product=D&datasets=UNCONFIRMED%3A&datasets=NEW%3A&datasets=ASSIGNED%3A&datasets=REOPENED%3A&datasets=VERIFIED%3A&datasets=FIXED%3A

 Regardless.. it's all good. More reports == more chances of more things
 being fixed.

 Later,
 Brad

A lot of the new bugs seem to be D2, which I presume is related to Andrei's new Phobos2 -- now it's far more appealing to use D2. D2 really hasn't been stress tested very much so far. I tried to do a D1-only bug graph, but couldn't get it to work. If the D1 bug reports are increasing as well, that'd be very hard to explain.
May 23 2009
prev sibling next sibling parent Frits van Bommel <fvbommel REMwOVExCAPSs.nl> writes:
Brad Roberts wrote:
 Don wrote:
 bearophile wrote:
 Andrei Alexandrescu:
 there's been a sharp increase in bug reports and patches recently.
 Walter and I are still scratching our head over that (it's not like
 dmd got much crappier overnight). I can only infer that more people
 have started using more of D.

 I think it's mostly a complex consequence of showing DMD source code.
 I have "predicted" this outcome in one post more than one year ago.

 Yes. It's the simple fact that you can compile DMD out-of-the-box. In
 fact, everyone who has downloaded DMD is "forced" to have a working copy
 of the source code! It's interesting to compare this with GDC, which,
 with the GNU license, is a purer form of "free software". Yet, it's
 amazingly difficult to get it to compile (I tried once, and failed).
 It's not just about having source code "available".

 I don't believe that to be the case. That would explain why more _fixes_
 are being provided (primarily thanks to your contributions), but not why
 there's been an increase in bug _filing_.

Trying to compile and run code with DMD is not the only way to find bugs in it. Some bugs can be found by reading the source and seeing something that "doesn't look right".

Another contributing factor may be that LDC users have been running into some bugs caused by weird/buggy behavior of the frontend, prompting the LDC developers to patch it. From there, it's often a small step to port that patch to DMD, compile it, verify that it works, and submit it to bugzilla. (That's for D1 issues only though, since LDC doesn't work with D2.)
May 23 2009
prev sibling parent reply Christopher Wright <dhasenan gmail.com> writes:
Brad Roberts wrote:
 Don wrote:
 bearophile wrote:
 Andrei Alexandrescu:
 there's been a sharp increase in bug reports and patches recently.
 Walter and I are still scratching our head over that (it's not like
 dmd got much crappier overnight). I can only infer that more people
 have started using more of D.

 I think it's mostly a complex consequence of showing DMD source code.
 I have "predicted" this outcome in one post more than one year ago.

 Yes. It's the simple fact that you can compile DMD out-of-the-box. In
 fact, everyone who has downloaded DMD is "forced" to have a working copy
 of the source code! It's interesting to compare this with GDC, which,
 with the GNU license, is a purer form of "free software". Yet, it's
 amazingly difficult to get it to compile (I tried once, and failed).
 It's not just about having source code "available".

 I don't believe that to be the case. That would explain why more _fixes_
 are being provided (primarily thanks to your contributions), but not why
 there's been an increase in bug _filing_.

Walter would say that the number of bug reports for a compiler is an indication of its popularity.
May 23 2009
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Christopher Wright (dhasenan gmail.com)'s article
 Brad Roberts wrote:
 Don wrote:
 bearophile wrote:
 Andrei Alexandrescu:
 there's been a sharp increase in bug reports and patches recently.
 Walter and I are still scratching our head over that (it's not like
 dmd got much crappier overnight). I can only infer that more people
 have started using more of D.

 I think it's mostly a complex consequence of showing DMD source code.
 I have "predicted" this outcome in one post more than one year ago.

 Yes. It's the simple fact that you can compile DMD out-of-the-box. In
 fact, everyone who has downloaded DMD is "forced" to have a working copy
 of the source code! It's interesting to compare this with GDC, which,
 with the GNU license, is a purer form of "free software". Yet, it's
 amazingly difficult to get it to compile (I tried once, and failed).
 It's not just about having source code "available".

 I don't believe that to be the case. That would explain why more _fixes_
 are being provided (primarily thanks to your contributions), but not why
 there's been an increase in bug _filing_.

 Walter would say that the number of bug reports for a compiler is an
 indication of its popularity.

The other thing is that, seeing more bug fixes with each release (thanks Don), people are more motivated to file bugs they notice. Also, with D2 getting close to final, people are probably more motivated to make sure whatever's been bothering them gets into Bugzilla now than before.
May 23 2009
prev sibling parent reply Walter Bright <newshound1 digitalmars.com> writes:
Christopher Wright wrote:
 Walter would say that the number of bug reports for a compiler is an 
 indication of its popularity.

Yes, I would say that (!)
May 23 2009
parent Christopher Wright <dhasenan gmail.com> writes:
Walter Bright wrote:
 Christopher Wright wrote:
 Walter would say that the number of bug reports for a compiler is an 
 indication of its popularity.

Yes, I would say that (!)

In fact, you did say that!
May 23 2009
prev sibling next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu Wrote:

 Jason House wrote:
 BCS wrote:

 Hello Jason,

 Should the final freezing of D2 be delayed until major D1 libraries
 port to D2? I'm mostly thinking of Tango, but I bet there are others.
 It may even be good if major libraries could use a Phobos-compatible
 license and become part of the releases by digital mars.

 Maybe it should be declared "done" as in it's got everything that
 Walter, Andrei, Bartosz and friends want in it, but it might be changed
 if the lib writers ask for some tweaks. Sort of a "feature" freeze.

 Yes! "Walter, Andrei, Bartosz, and friends": If you're reading this, can
 you shed some light on what's happening before D2 is declared stable? And
 when?

 I've submitted the first three chapters to Rough Cuts. I will make
 progress towards writing up until the end of August. The last chapter
 concerns concurrency and is the fuzziest one.

Ok, so pen down in three months?

 Thank you for your initiative to enlist help from the community. There's 
 a lot of very visible help already happening: there's been a sharp 
 increase in bug reports and patches recently. Walter and I are still 
 scratching our head over that (it's not like dmd got much crappier 
 overnight). I can only infer that more people have started using more of D.

The increase is interesting. Out of curiosity, is the increase predominantly for the backend? I wonder if having a sense of D2 stabilizing is increasing usage of D2 overall.

 I'd be thrilled to add more stuff to Phobos. Stuff can be done with 
 ranges that's almost indistinguishable from poetry. But ranges aren't 
 everything, Georg :o). I think Shin's BlackHole and WhiteHole slammed 
 open a door to a world of amazing possibilities. Things like 
 compile-time reflection, run-time reflection, and dynamic loading are 
 very hot and the possibilities are huge. Among other things, Variant can 
 with relative ease implement a function var.call("fun", arg1, arg2) that 
 forwards everything dynamically to a member function of the embedded object.

What do you / others consider the weakest / missing parts of Phobos?
May 22 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 Andrei Alexandrescu Wrote:
 
 Jason House wrote:
 BCS wrote:

 Hello Jason,

 Should the final freezing of D2 be delayed until major D1 libraries
 port to D2? I'm mostly thinking of Tango, but I bet there are others.
 It may even be good if major libraries could use a Phobos-compatible
 license and become part of the releases by digital mars.

 Maybe it should be declared "done" as in it's got everything that
 Walter, Andrei, Bartosz and friends want in it, but it might be changed
 if the lib writers ask for some tweaks. Sort of a "feature" freeze.

 Yes! "Walter, Andrei, Bartosz, and friends": If you're reading this, can
 you shed some light on what's happening before D2 is declared stable? And
 when?

 I've submitted the first three chapters to Rough Cuts. I will make
 progress towards writing up until the end of August. The last chapter
 concerns concurrency and is the fuzziest one.

 Ok, so pen down in three months?

Yah.

 Thank you for your initiative to enlist help from the community. There's
 a lot of very visible help already happening: there's been a sharp
 increase in bug reports and patches recently. Walter and I are still
 scratching our head over that (it's not like dmd got much crappier
 overnight). I can only infer that more people have started using more of D.

 The increase is interesting. Out of curiosity, is the increase
 predominantly for the backend? I wonder if having a sense of D2
 stabilizing is increasing usage of D2 overall.

Walter has no specific statistics.

 I'd be thrilled to add more stuff to Phobos. Stuff can be done with 
 ranges that's almost indistinguishable from poetry. But ranges aren't 
 everything, Georg :o). I think Shin's BlackHole and WhiteHole slammed 
 open a door to a world of amazing possibilities. Things like 
 compile-time reflection, run-time reflection, and dynamic loading are 
 very hot and the possibilities are huge. Among other things, Variant can 
 with relative ease implement a function var.call("fun", arg1, arg2) that 
 forwards everything dynamically to a member function of the embedded object.

 What do you / others consider the weakest / missing parts of Phobos?

Wow. Where should I start. Let me go down the list of modules and share a few thoughts.

* std.array: we need to make a decision about differentiating arrays from slices.
* std.base64: doesn't deserve a separate module
* std.bind: eliminate?
* std.bitmanip: define a range for BitArray and eliminate opApply. Add opSlice.
* std.vendor: should this go in core?
* std.complex: IMPLEMENT. Eliminate any trace of built-in complex.
* std.conv: define operations to stream data out and in in binary and text formats.
* std.cover: another little module that should be merged somewhere
* std.date: unnecessarily clunky and low-level. Also, somehow Walter thinks that std.dateparse has absolutely nothing to do with date.
* std.demangle: another small module. Should be merged with e.g. other compiler-specific stuff.
* std.encoding, std.utf: we need a massive overhaul of all encoding-specific stuff. Massive. Epic. The current pile of... functionality makes the simplest stuff look like rocket surgery.
* std.md5: we should add more such encryption devices.
* std.metastrings: I hate the name. Merge into std.string using ctfe
* std.mmfile: integrate with the garbage collector. It should be there.
* std.outbuffer: I think this shouldn't be a class and shouldn't have that name.
* std.outofmemory: why???
* std.process: add pipe() for Windows. Actually that should be in stdio.
* std.regex, std.regexp: merge and finalize.
* std.signals: I don't know much. A review wouldn't hurt.
* std.socket, std.socketstream: We need a real networking library.
* std.stdio: implement readf and various I/O specific ranges
* std.cstream, std.stream: eliminate.
* std.string: arrange so there's no overlapping/conflict with std.algorithm. Implement bidir range for reading strings correctly (already done that).
* std.system: merge somewhere
* std.thread: replace
* std.variant: add dynamic method invocation capabilities
* std.xml: replace with something that moves faster than molasses.
* std.zip: rewrite

Well there's much other stuff I'm sure but I just dumped what came to mind when taking a look.

Andrei
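
To make the std.bitmanip item above a little more concrete, here is a minimal sketch of what "a range instead of opApply" could look like. It does not touch the real BitArray type. BitRange and countSetBits are hypothetical names, and the bit order (least significant bit of each byte first) is an assumption chosen for the example.

```d
import std.range.primitives : isInputRange;

/// Hypothetical input range over the bits of a byte buffer,
/// least significant bit of each byte first.
struct BitRange
{
    private const(ubyte)[] bytes;
    private size_t bitIndex;   // index of the current bit

    this(const(ubyte)[] bytes) { this.bytes = bytes; }

    /// True once every bit has been consumed.
    @property bool empty() const { return bitIndex >= bytes.length * 8; }

    /// Value of the current bit.
    @property bool front() const
    {
        return ((bytes[bitIndex / 8] >> (bitIndex % 8)) & 1) != 0;
    }

    /// Advances to the next bit.
    void popFront() { ++bitIndex; }
}

// The three primitives above are all that an input range needs.
static assert(isInputRange!BitRange);

/// Counts the set bits by walking the range with foreach.
size_t countSetBits(const(ubyte)[] bytes)
{
    size_t n;
    foreach (bit; BitRange(bytes))
        if (bit) ++n;
    return n;
}
```

Because the type satisfies isInputRange, it works with foreach and with generic range algorithms without a hand-written opApply.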
May 22 2009
next sibling parent Daniel Keep <daniel.keep.lists gmail.com> writes:
Andrei Alexandrescu wrote:
 ...
 
 * std.base64: doesn't deserve a separate module

The joys of a flat module hierarchy: it has to go *somewhere*. :P
 ...
 
 * std.conv: define operations to stream data out and in in binary and
 text formats.

What do you mean by "stream data"? Are we talking serialisation?
 ...
 
 * std.date: unnecessarily clunky and low-level. Also, somehow Walter
 thinks that std.dateparse has absolutely nothing to do with date.

<tangent> My PhD involves writing a simulator, and it's used in some cases to model historical events. As in, before 1980. I needed a little Python script to take output in one format and dump it into a database. That's when I found out that Python's date API is absolutely pants-on-head retarded; it can't cope with dates before the UNIX epoch; it just dies. Every single Python library I could find for working with dates was broken in the same way. It's like everyone seemed to think the universe began with the invention of UNIX. I ended up porting Phobos' date parsing and formatting code to Python. </tangent>
 ...
 
 * std.metastrings: I hate the name. Merge into std.string using ctfe

How do you plan to do that? The problem with CTFE at the moment is that code which works in CTFE is usually VERY suboptimal, while optimal code doesn't run in CTFE. So you end up with two functions for everything. I usually either end up stuffing the string functions in separate modules or appending "_ctfe" to all CTFE-compatible functions.
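
For readers wondering what the "two functions for everything" problem looks like in code, the sketch below keeps both variants inside one function by branching on the built-in __ctfe variable that current D compilers provide. repeatString is a hypothetical example function, not a Phobos API, and the run-time branch is only a plausible stand-in for "the optimal version".

```d
import std.exception : assumeUnique;

/// Repeats `s` `n` times; usable at compile time and at run time.
string repeatString(string s, size_t n)
{
    if (__ctfe)
    {
        // Simple path that the compile-time interpreter can evaluate.
        string result;
        foreach (i; 0 .. n)
            result ~= s;
        return result;
    }
    else
    {
        // Run-time path: allocate once, then copy slices into place.
        auto buf = new char[](s.length * n);
        foreach (i; 0 .. n)
            buf[i * s.length .. (i + 1) * s.length] = s[];
        return assumeUnique(buf);
    }
}

// Forces compile-time evaluation of the first branch.
enum compileTimeValue = repeatString("ab", 3);
static assert(compileTimeValue == "ababab");
```

The caller sees a single name, so the "_ctfe" suffix convention described above is not needed in this sketch, although the two code paths still have to be written and kept in agreement.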
 ...
 
 * std.socket, std.socketstream: We need a real networking library.

TcpRange(T)? :P
 ...
 
 * std.variant: add dynamic method invocation capabilities

Any idea what you'll do here? Will (TypeInfo|ClassInfo).getMembers be implemented, or will you be generating shims on instantiation?
 * std.xml: replace with something that moves faster than molasses.

I'd say to just steal from Tango. Their parsers seem to more or less utterly destroy everything else in terms of speed. http://dotnot.org/blog/archives/category/software/d-programming-language/ In one test, Tango's PullParser is almost 100 times faster than std.xml! Hell, you could refactor PullParser to have a range interface if you wanted to. :D
 * std.zip: rewrite

Good luck with that. The Zip format sucks like a battery of Dysons hooked up in series to form some sort of ultimate sucking machine. APPNOTE doesn't help, either. It's always nice to have a format spec which specifies that you can have multiple redundant copies of the same information which can DIFFER, and then doesn't define which one is canonical or if it's even allowed.

That and I swear Tango's Zip module is cursed. I'm trying to close some tickets on it, and I'm getting segfaults in places where it should be impossible to get segfaults, plus when I try to debug it, the debugger crashes. *urgh*
May 23 2009
prev sibling next sibling parent reply grauzone <none example.net> writes:
 * std.bind: eliminate?

Unneeded, because D2 has real closures. (That module still makes a lot of sense in D1, but now it's only a collection of awkward template hacks.)
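
A small sketch of the point about closures: in D2 a delegate can capture a local variable and outlive the function that created it, which covers the common "bind the first argument" use case without a library. bindFirst and sub are hypothetical names used only for this illustration.

```d
/// Hypothetical helper: fixes the first argument of a two-argument function.
int delegate(int) bindFirst(int function(int, int) fn, int first)
{
    // `fn` and `first` are captured by the delegate; D2 keeps them alive
    // on the heap after bindFirst returns.
    return (int second) => fn(first, second);
}

/// Plain function used to show the argument order.
int sub(int a, int b) { return a - b; }
```

For example, bindFirst(&sub, 10) yields a delegate that computes 10 - x for any x passed to it.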
 * std.metastrings: I hate the name. Merge into std.string using ctfe

Sounds like fun. I hope you'll provide Walter with suggestions how to improve CTFE while fighting with it.
 * std.mmfile: integrate with the garbage collector. It should be there.

Why should the GC know about it?
 * std.outbuffer: I think this shouldn't be a class and shouldn't have 
 that name.

I found this class to be absolutely useless. And there isn't even std.inbuffer! One of the crappier parts of Phobos.
 * std.signals: I don't know much. A review wouldn't hurt.

Crap. Who uses that?
 * std.cstream, std.stream: eliminate.

Of course not without replacement?
 * std.variant: add dynamic method invocation capabilities

Sounds hot.
 * std.xml: replace with something that moves faster than molasses.
 * std.zip: rewrite
 * std.socket, std.socketstream: We need a real networking library.
 * std.md5: we should add more such encryption devices.
 * std.base64: doesn't deserve a separate module
 * std.conv: define operations to stream data out and in in binary and
 text formats.

How about giving these up to Tango? The only problem is, it has not been ported to D2 yet.

PS: Does anyone know how to make Thunderbird not insert spaces before a '>' at the start of a line?
May 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 * std.bind: eliminate?

 Unneeded, because D2 has real closures. (That module still makes a lot of
 sense in D1, but now it's only a collection of awkward template hacks.)

 * std.metastrings: I hate the name. Merge into std.string using ctfe

 Sounds like fun. I hope you'll provide Walter with suggestions how to
 improve CTFE while fighting with it.

Me? I thought I'm saying what could be done, not what *I* should be doing :o).

 * std.mmfile: integrate with the garbage collector. It should be there.

 Why should the GC know about it?

Because the only way to make memory-mapped files safe is to have the GC handle them.

 * std.outbuffer: I think this shouldn't be a class and shouldn't have
 that name.

 I found this class to be absolutely useless. And there isn't even
 std.inbuffer! One of the crappier parts of Phobos.

It's used in regular expressions.

 * std.signals: I don't know much. A review wouldn't hurt.

 Crap. Who uses that?

 * std.cstream, std.stream: eliminate.

 Of course not without replacement?

 * std.variant: add dynamic method invocation capabilities

 Sounds hot.

 * std.xml: replace with something that moves faster than molasses.
 * std.zip: rewrite
 * std.md5: we should add more such encryption devices.
 * std.base64: doesn't deserve a separate module
 * std.conv: define operations to stream data out and in in binary and
 text formats.

 How about giving these up to Tango? The only problem is, it has not been
 ported to D2 yet.

That's not an option.

Andrei
May 23 2009
parent reply grauzone <none example.net> writes:
 * std.mmfile: integrate with the garbage collector. It should be there.

 Why should the GC know about it?

To add: in all sane situations, the mmapped region won't contain any pointers, and the GC doesn't have to scan it. Allocating address space is already done by the OS. Freeing the mmapped region is not the GC's responsibility, but can be left to finalizers/destructors.

 Because the only way to make memory-mapped files safe is to have the GC
 handle them.

Care to explain?

 * std.outbuffer: I think this shouldn't be a class and shouldn't have
 that name.

 I found this class to be absolutely useless. And there isn't even
 std.inbuffer! One of the crappier parts of Phobos.

 It's used in regular expressions.

Not saying the concept is useless, but the implementation. But maybe you planned fixing this anyway.

 How about giving these up to Tango? The only problem is, it has not
 been ported to D2 yet.

 That's not an option.

Licensing reasons? Not-Invented-Here-Syndrome? You love reinventing the wheel?
May 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
grauzone wrote:
 * std.mmfile: integrate with the garbage collector. It should be there.

Why should the GC know about it?


To add: in all sane situations, the mmaped region won't contain any pointers, and the GC doesn't have to scan it. Allocating address space is already done by the OS. Freeing the mmaped region is not the GC's responsibility, but can be left to finalizers/destructors.
 Because the only way to make memory-mapped files safe is to have the 
 GC handle them.

Care to explain?

    mmhandle h = mapFile("test.txt");
    char[] x = cast(char[]) h.ptr;
    h.unmapFile;

Any attempt to use x will crash the program. So it's the gc who needs to unmap files when they are no longer referenced.
 * std.outbuffer: I think this shouldn't be a class and shouldn't 
 have that name.

I found this class to be absolutely useless. And there isn't even std.inbuffer! One of the crappier parts of Phobos.

It's used in regular expressions.

Not saying the concept is useless, but the implementation. But maybe you planned fixing this anyway.
 How about giving these up to Tango? The only problem is, it has not 
 been ported to D2 yet.

That's not an option.

Licensing reasons? Not-Invented-Here-Syndrome? You love reinventing the wheel?

Licensing and the love for reading snickering remarks. Andrei
May 23 2009
next sibling parent reply Jason House <jason.james.house gmail.com> writes:
Andrei Alexandrescu wrote:

 grauzone wrote:
 How about giving these up to Tango? The only problem is, it has not
 been ported to D2 yet.

That's not an option.

Licensing reasons? Not-Invented-Here-Syndrome? You love reinventing the wheel?

Licensing and the love for reading snickering remarks.

Two questions:

1. Do these libraries need to be part of D2 Phobos, or could they be dropped, with users simply pointed to Tango?
2. Has anyone _really_ tried to request moving small pieces of Tango into Phobos? I know Sean and Don have moved some of their code over from Tango to Phobos, but I think those happened under different circumstances.
May 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Jason House wrote:
 Andrei Alexandrescu wrote:
 
 grauzone wrote:
 How about giving these up to Tango? The only problem is, it has not
 been ported to D2 yet.




Two questions: 1. Do these libraries need to be part of D2 Phobos, or could they be dropped and simply point users to use Tango?

Some of them may be dropped. Andrei
May 23 2009
parent reply bearophile <bearophileHUGS lycos.com> writes:
Andrei Alexandrescu:
 Some of them may be dropped.

My suggestion, for D2, is to assume that all minimally serious D2 programmers will have both Phobos2 and Tango2 installed. So Phobos can contain the core functionality and Tango the utilities, higher-level things, more data structures, etc., reducing the overlap to the low levels. Someone recently made a comparison: they may become like the STL and Boost of C++, usually both used/installed.

Bye,
bearophile
May 23 2009
parent Leandro Lucarella <llucax gmail.com> writes:
bearophile, on May 23 at 14:12 you wrote to me:
 Andrei Alexandrescu:
 Some of them may be dropped.

My suggestion, for D2, is to assume that all minimally serious D2 programmers will have both Phobos2 and Tango2 installed. So Phobos can contain the core functionality and Tango the utilities, higher-level things, more data structures, etc., reducing the overlap to the low levels. Someone recently made a comparison: they may become like the STL and Boost of C++, usually both used/installed.

I don't know exactly what the point of this is. In C++ it makes sense because the standardization process is really annoying. If you don't have a "parallel pseudo standard library", C++ is close to useless. And there is no "official" open-source C++ STL; every compiler is supposed to implement its own.

Most modern languages with relaxed, community-driven specifications that can evolve easily don't have such a duality. They try to include common enough functionality in the standard library, because anyone who wants to implement a new compiler can use the "official" open-source standard library.

I agree that having a huuuge standard library is not good either, because it is kind of problematic when porting to small devices and such. But to address this I think it would be better to define a core standard library and an extended one, and let both be standard. Then a compiler can provide only the core standard library if minimalism is needed, or the complete one (something like "full java" vs. "java me").

That said, I don't see it as a problem that other libraries exist. It's just that Tango seems to be a very base-level library, which makes sense to be standard, when you expect most people will always use phobos+tango.

-- 
Leandro Lucarella (luca) | Blog colectivo: http://www.mazziblog.com.ar/blog/
May 23 2009
prev sibling next sibling parent BCS <none anon.com> writes:
Hello Andrei,

 grauzone wrote:
 
 Because the only way to make memory-mapped files safe is to have the
 GC handle them.
 


    mmhandle h = mapFile("test.txt");
    char[] x = cast(char[]) h.ptr;
    h.unmapFile;

Any attempt to use x will crash the program. So it's the gc who needs to unmap files when they are no longer referenced.

Maybe normally, but you need a way to backdoor this.
May 23 2009
prev sibling parent Lionello Lunesu <lio lunesu.remove.com> writes:
Andrei Alexandrescu wrote:
 grauzone wrote:
 * std.mmfile: integrate with the garbage collector. It should be 
 there.

Why should the GC know about it?


To add: in all sane situations, the mmaped region won't contain any pointers, and the GC doesn't have to scan it. Allocating address space is already done by the OS. Freeing the mmaped region is not the GC's responsibility, but can be left to finalizers/destructors.
 Because the only way to make memory-mapped files safe is to have the 
 GC handle them.

Care to explain?

    mmhandle h = mapFile("test.txt");
    char[] x = cast(char[]) h.ptr;
    h.unmapFile;

Any attempt to use x will crash the program. So it's the gc who needs to unmap files when they are no longer referenced.

Memory mapped files are unlike memory in that they keep the actual mapped file locked. There must be a deterministic way to unlock those files. L.
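
For illustration, deterministic unmapping can be sketched with std.mmfile.MmFile as it exists in current Phobos (an assumption: the 2009 module may differ). The helper name readViaMapping is made up for this example. It copies the data out before the mapping goes away, so no slice outlives the mapped file:

```d
import std.mmfile : MmFile;

/// Hypothetical helper (not part of Phobos): read a file through a
/// memory mapping and return a GC-owned copy of its contents.
string readViaMapping(string path)
{
    auto mm = new MmFile(path);   // read-only mapping of the whole file
    scope (exit) destroy(mm);     // run the destructor now: unmap and close

    // mm[] is a void[] view of the mapped region; it is only valid
    // until the mapping is destroyed.
    auto view = cast(const(char)[]) mm[];

    // The copy is made before scope(exit) fires, so the result never dangles.
    return view.idup;
}
```

The file is unlocked as soon as the function returns, without waiting for a collection cycle. The price is that safety depends on the caller not letting a slice of the mapping escape, which is exactly the hazard under discussion.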
May 25 2009
prev sibling next sibling parent Jason House <jason.james.house gmail.com> writes:
Would it be a good idea to transcribe this list onto a "Phobos Help Wanted" 
page?

I'm thinking they should be categorized into 4 basic categories.  
Theoretically, as time goes on higher numbered items should be convertible 
to lower numbered items.

1. Pure library work
   -> Should include basic status info such as:
      "Nobody is working on it"
      "As of XXXX, Mr. Z. has started working on a patch"
      "Andrei has said D2 Phobos needs this"
2. Blocked or partially blocked by bugzilla issues
   -> Should list a bugzilla link for the issues limiting the implementation
   -> Each issue should have a basic status info such as:
      "Nobody is working on it"
      "As of XXXX, Mr. Z. has started working on a patch"
      "Andrei confirmed with Walter this fix is worthwhile for D2"
      etc...
3. Mostly requires discussion / agreement within the community
   -> Links to relevant threads on the D newsgroup (with two lines of recap)
4. Language design work
   -> Links to relevant threads on the D newsgroup
   -> Short paragraphs with design ideas (may need to be on a separate page)


I'd imagine items 3 and 4 would inspire discussions on the newsgroup for 
ironing out the details.  Below, I reordered your list with an initial cut 
at categories.  There are a few category 1 items that are in there simply 
because there's so much rework to be done that I doubt anyone would complain 
about any attempt to clean it up.  For smaller tweaks (such as where to move 
something), I put it into category 3 since little stuff is more likely to 
generate opinions.


Andrei Alexandrescu wrote:

 Jason House wrote:
 What do you / others consider the weakest / missing parts of Phobos?

Wow. Where should I start. Let me go down the list of modules and share a few thoughts.

category 1 (pure library work)
---------------------------------
 * std.base64: doesn't deserve a separate module
 * std.bitmanip: define a range for BitArray and eliminate opApply. Add
 opSlice.
 * std.complex: IMPLEMENT. Eliminate any trace of built-in complex.
 * std.conv: define operations to stream data out and in in binary and
 text formats.
 * std.encoding, std.utf: we need a massive overhaul of all
 encoding-specific stuff. Massive. Epic. The current pile of...
 functionality makes the simplest stuff look like rocket surgery.
 * std.md5: we should add more such encryption devices.
 * std.metastrings: I hate the name. Merge into std.string using ctfe
 * std.mmfile: integrate with the garbage collector. It should be there.
 * std.process: add pipe() for Windows. Actually that should be in stdio.
 * std.regex, std.regexp: merge and finalize.
 * std.socket, std.socketstream: We need a real networking library.
 * std.stdio: implement readf and various I/O specific ranges
 * std.thread: replace
 * std.variant: add dynamic method invocation capabilities
 * std.xml: replace with something that moves faster than molasses.
 * std.zip: rewrite

category 2 (Blocked by bugzilla issues)
---------------------------------

category 3 (requires community discussion)
---------------------------------
 * std.bind: eliminate?
 * std.vendor: should this go in core?
 * std.cover: another little module that should be merged somewhere
 * std.date: unnecessarily clunky and low-level. Also, somehow Walter
 thinks that std.dateparse has absolutely nothing to do with date.
 * std.demangle: another small module. Should be merged with e.g. other
 compiler-specific stuff.
 * std.outbuffer: I think this shouldn't be a class and shouldn't have
 that name.
 * std.outofmemory: why???
 * std.signals: I don't know much. A review wouldn't hurt.
 * std.cstream, std.stream: eliminate.
 * std.string: arrange so there's no overlapping/conflict with
 std.algorithm. Implement bidir range for reading strings correctly
 (already done that).
 * std.system: merge somewhere

category 4 (language design work)
---------------------------------
 * std.array: we need to make a decision about differentiating arrays
 from slices.

May 23 2009
prev sibling next sibling parent reply BCS <none anon.com> writes:
Hello Andrei,

 * std.date: unnecessarily clunky and low-level. Also, somehow Walter
 thinks that std.dateparse has absolutely nothing to do with date.

My company has this little project that I wrote in C#: http://precisionsoftware.blogspot.com/2009/03/natural-language-net-date-parser.html Would anyone be interested in it being ported to D? Right now we are trying to sell the C# version (no takers yet), so I'd have to talk to them about it.
 * std.mmfile: integrate with the garbage collector. It should be
 there.

If you are talking about putting it in std.gc or whatever it's called now, that is one of the last places I'd look for this. If you are *only* talking about hooks into the GC to un-map stuff, I'm fine with that.
 * std.socket, std.socketstream: We need a real networking library.

what would it do on top of what that does?
May 23 2009
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
BCS wrote:
 Hello Andrei,
 
 * std.date: unnecessarily clunky and low-level. Also, somehow Walter
 thinks that std.dateparse has absolutely nothing to do with date.

My company has this little project that I wrote in c#: http://precisionsoftware.blogspot.com/2009/03/natural-language-net-date-parser.html Would anyone be interested in it being ported to D? Right now we are trying to sell the c# version (no takers yet) so I'd have to talk to them about it.

Looks interesting, but probably a long shot since a D port would cannibalize your employer's business. It should be noted that D already has a solid date parser in std.dateparse, written by Walter himself (and he does know how to write a parser).
 * std.mmfile: integrate with the garbage collector. It should be
 there.

If you are talking putting it in std.gc or whatever its called now, that is one of the last places I'd look for this. If you are *only* talking about just hooks into the GC to un-map stuff, I'm fine with that.
 * std.socket, std.socketstream: We need a real networking library.

what would it do on top of what that does?

I haven't studied it, but Walter said he doesn't like it and I trust him. Anyhow, we'd need to create at least full range integration and support for protocols such as http, ftp, ssh, and imap. Today's languages load a webpage in one line, and that's great so we need to do that. It's even better to be able to process the webpage while it's loading (concurrency!), so we want to do that as well. Andrei
May 23 2009
next sibling parent BCS <none anon.com> writes:
Hello Andrei,

 BCS wrote:
 
 Hello Andrei,
 
 * std.date: unnecessarily clunky and low-level. Also, somehow Walter
 thinks that std.dateparse has absolutely nothing to do with date.
 

My company has this little project that I wrote in c#: http://precisionsoftware.blogspot.com/2009/03/natural-language-net-date-parser.html Would anyone be interested in it being ported to D? Right now we are trying to sell the c# version (no takers yet) so I'd have to talk to them about it.

Looks interesting, but probably a long shot since a D port would cannibalize your employer's business.

We built it as part of a larger project and then chose to try and sell it. It's not our main product by any stretch.
 It should be noted that D already has a solid date parser in
 std.dateparse, written by Walter himself (and he does know how to
 write a parser).

I'd almost bet that it doesn't cover nearly as many cases as ours does, recurring dates for example. The primary IP in it is how it handles dates and the grammar, not the parser.
 * std.socket, std.socketstream: We need a real networking library.
 


I haven't studied it, but Walter said he doesn't like it and I trust him. Anyhow, we'd need to create at least full range integration and support for protocols such as http, ftp, ssh, and imap. Today's languages load a webpage in one line, and that's great so we need to do that. It's even better to be able to process the webpage while it's loading (concurrency!), so we want to do that as well.

So additions, not replacements. OK
May 23 2009
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Denis Koroskin wrote:
 On Sat, 23 May 2009 22:20:14 +0400, Andrei Alexandrescu 
 <SeeWebsiteForEmail erdani.org> wrote:
 
 * std.socket, std.socketstream: We need a real networking library.


I haven't studied it, but Walter said he doesn't like it and I trust him. Anyhow, we'd need to create at least full range integration and support for protocols such as http, ftp, ssh, and imap. Today's languages load a webpage in one line, and that's great so we need to do that. It's even better to be able to process the webpage while it's loading (concurrency!), so we want to do that as well. Andrei

We wrote a networking library with a unique modern flexible design. It was initially written in C++, but I'm slowly porting it to D2 (it's already usable and I wrote a few applications with it by now). If anyone is interested, I may contribute it to Phobos. Its design overview (very short one) is attached for those who are interested.

Sounds great! The doc comes off as binary, so I'm pasting it below for others' convenience.

Andrei

There are two main concepts in "net" (which is the name of the library):

- Link
    A Link is an established connection between two computers. It is a very simple interface that has two important methods: void send(const(void)[] data) and void disconnect().

- Driver
    A Driver is something that creates Links and transfers data between them.

Each driver has a set of orthogonal properties (guarantees that it provides):

- Reliable Boundary
    Indicates that the packet boundary is guaranteed by the driver. This means that if a packet of size N is sent and received on the other side, it is received as one packet of size N, neither split into several portions nor merged with other packets. Stream protocols like TCP often don't support reliable packet boundaries; datagram protocols like UDP often do support this feature.

- Reliable Content
    Indicates that packet consistency is guaranteed by the driver. This means that sent data is received without corruption of content. All corrupted data will be filtered out by a driver that supports this feature.

- Reliable Order
    Indicates that packet order is guaranteed by the driver. This means that packets will never change their order if the driver supports this feature.

- Reliable Delivery
    Indicates that packet delivery is guaranteed by the driver. This means that, while the connection is still alive, any data sent is received on the other side (maybe after some time, but it will be).

etc.

These are called driver capabilities. If a driver doesn't have some property which is important for your application (for example, content reliability or packet order), you can create a proxy driver that will externally add the missing feature. This is one of the main ideas behind Drivers: they should be easily "decorable" (composable). Another example: if your driver doesn't compress data automatically, you may easily wrap it with some driver that supports data compression.

The networking library provides a set of cross-platform proxies that provide any of the required features. Here is an incomplete list of implemented Driver Proxies (in addition to the Proxies that fulfil the requirements above: consistency, order, etc.):

- FastCompression Driver
    Compresses traffic before sending it over the network.

- Local Driver
    A kind of "loop back" driver. "Sends" data within the address space of a single application (no data copying ever occurs).

- Signature Driver
    "Signs" every outgoing packet and filters out packets with a wrong signature.

- Statistics Driver
    Gathers statistics on transferred data (number of lost packets, out-of-order packets, damaged packets, bytes sent/received, etc.).

- Timeout Driver
    Automatically disconnects when a specified timeout is reached.

Future work:

- Encryption Driver

Why was it important for us? We develop games for embedded devices (think of game consoles, pocket PCs, phones, etc.). Some of them have very primitive hardware and software. For example, some of them don't implement BSD Sockets, or have no TCP or UDP driver (a *very* common case). This is why our networking library doesn't rely on any of these features, although they are used when available. All that is needed is a simple ability to transfer data in *any* way. Everything else is configurable externally by our library. For example, our library provides a cross-platform implementation of TCP over UDP.

You decide what features you create a driver with, depending on your needs. For example, when developing a turn-based strategy game, it is not very important to have ultra-low traffic, and ease of development is of more importance. In this case you may request all of the features and simplify your code dramatically.

Sometimes you need to connect over some specific protocol, such as TCP or UDP (for example, to access a web page over HTTP). In this case, you request some concrete driver implementation.

Drivers are created using factory methods like the following:

    Driver createDriver(uint requiredDriverCaps, ...);
    Driver wrapDriver(Driver hostDriver, uint requiredDriverCaps, ...);
    Driver createTcpServerDriver(bool async, ushort listeningPort, ...);
    Driver createTcpClientDriver(bool async, ushort defaultDestinationPort, ...);
    Driver createUdpDriver(bool async, ushort listeningPort, ushort defaultDestinationPort, ...);

etc.

Packet Processing

Whenever a driver receives a new packet, it unwinds it (some proxy drivers may add additional data to packets, such as a checksum or packet index, or completely modify them, as with encryption or compression) and passes it to the corresponding link. If a programmer wants to handle packets that come from links, he subscribes to them:

    // Using a listener
    link.addIncomingPacketProcessor(this);
    // Using a delegate
    link.addIncomingPacketProcessor(&someMethod);

These callbacks may be invoked either in the main thread (synchronous, during an implicit driver.processIncomingPackets() call) or in another thread (asynchronous); the behavior is specified during Driver initialization.

Creating a new link is as simple as:

    auto link = driver.createLinkTo(host, port);

A new valid link is always returned even though the connection may not be established immediately (it is a non-blocking operation). You can start using it (sending data, etc.) without waiting until the connection is fully established. A notification callback will be invoked if the connection fails.

Some drivers may emit new Links. Whenever a new connection is received, a link is created and passed to Listeners. You subscribe to Driver events the same way you do with Links: driver.addIncomingLinksProcessor(...);

All the operations are inherently asynchronous (non-blocking). For example, there is no method link.receive() that waits until a link receives a packet (although it's often very handy, mostly for prototyping). Operations like this are implemented using helpers.

This is the core functionality, the one that "drives" the development. Everything else (HTTP/FTP/SSH connections, etc.) needs to be built on top of the core functionality in a cross-platform manner.

Another cool feature that is implemented using our library is remote procedure call (which is very helpful, not only for debugging). In short, you may remotely invoke almost any method on any object with any arguments and get the result back. D's compile-time reflection capabilities will significantly simplify porting this.
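
The "decorable" idea in the overview can be sketched in a few lines of D. This is only an illustration under stated assumptions: the overview describes proxies at the Driver level, while the sketch wraps a Link directly to stay short, and LocalLink and StatisticsLink are invented names, not the library's actual classes. Only the two Link methods, send and disconnect, come from the overview.

```d
/// The two methods named in the design overview.
interface Link
{
    void send(const(void)[] data);
    void disconnect();
}

/// Hypothetical in-process link: hands each packet to a delegate,
/// loosely modelled on the "Local Driver" idea.
final class LocalLink : Link
{
    private void delegate(const(void)[]) receiver;
    private bool connected = true;

    this(void delegate(const(void)[]) receiver)
    {
        this.receiver = receiver;
    }

    void send(const(void)[] data)
    {
        // Packets sent after disconnect() are silently dropped.
        if (connected)
            receiver(data);
    }

    void disconnect()
    {
        connected = false;
    }
}

/// Hypothetical proxy: adds send-side counters to any other Link
/// without the wrapped Link knowing about it.
final class StatisticsLink : Link
{
    private Link inner;
    size_t packetsSent;
    size_t bytesSent;

    this(Link inner)
    {
        this.inner = inner;
    }

    void send(const(void)[] data)
    {
        ++packetsSent;
        bytesSent += data.length;   // length of a void[] is in bytes
        inner.send(data);
    }

    void disconnect()
    {
        inner.disconnect();
    }
}
```

Because both classes implement the same interface, proxies can be stacked in any order, e.g. new StatisticsLink(new LocalLink(&handler)), which is the composition property the overview asks for.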
May 23 2009
prev sibling next sibling parent BLS <windevguy hotmail.de> writes:
std.dtl

std.pattern

(hoped that at least the singleton made it to the moon)

Björn
May 23 2009
prev sibling next sibling parent "Denis Koroskin" <2korden gmail.com> writes:

On Sat, 23 May 2009 22:20:14 +0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 * std.socket, std.socketstream: We need a real networking library.


I haven't studied it, but Walter said he doesn't like it and I trust him. Anyhow, we'd need to create at least full range integration and support for protocols such as http, ftp, ssh, and imap. Today's languages load a webpage in one line, and that's great so we need to do that. It's even better to be able to process the webpage while it's loading (concurrency!), so we want to do that as well. Andrei

We wrote a networking library with a unique modern flexible design. It was initially written in C++, but I'm slowly porting it to D2 (it's already usable and I wrote a few applications with it by now). If anyone is interested, I may contribute it to Phobos. Its design overview (very short one) is attached for those who are interested.
May 23 2009
prev sibling next sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-05-23 01:25:49 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 * std.xml: replace with something that moves faster than molasses.

I started to write an XML parser using D1 and a pseudo-range implementation a little while ago, but never finished it. (I was undecided about the API, and that somewhat killed my interest.) Perhaps I should finish it and contribute to Phobos. The irking thing about the API was that if I expose a range for parsing and returning tokens, I then need a switch statement to do the right thing about each kind of these tokens (like instantiating the proper node type) whereas with a callback API you don't need to bother saving and then switching on a flag value telling you which kind of node you've read (and callbacks can be aliases in templates). They are two different compromises between speed and flexibility and I guess both should be supported. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 23 2009
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Michel Fortin wrote:
 On 2009-05-23 01:25:49 -0400, Andrei Alexandrescu
 <SeeWebsiteForEmail erdani.org> said:
 
 * std.xml: replace with something that moves faster than molasses.

I started to write an XML parser using D1 and a pseudo-range implementation a little while ago, but never finished it. (I was undecided about the API, and that somewhat killed my interest.) Perhaps I should finish it and contribute to Phobos. The irking thing about the API was that if I expose a range for parsing and returning tokens, I then need a switch statement to do the right thing about each kind of these tokens (like instantiating the proper node type) whereas with a callback API you don't need to bother saving and then switching on a flag value telling you which kind of node you've read (and callbacks can be aliases in templates). They are two different compromises between speed and flexibility and I guess both should be supported.

Callbacks are "easier" to set up, but are incredibly complicated for any sort of structured parsing. The problem is that you can't easily change the behaviour of the parser once it's started.

I had to write a SAX parser for a structured data format a few years ago. I swear that 90% of the code (and it's a monstrously huge module) was just boilerplate to work around the bloody callback system. I've come to the conclusion that the SAX api is about the worst POSSIBLE way of parsing anything more complex than a flat file that shouldn't have been XML in the first place.

Something like Tango's PullParser is the superior API because although it's more verbose up-front, that's as bad as it gets. Plus, you can actually do stuff like call subroutines.
May 24 2009
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-05-24 03:22:47 -0400, Daniel Keep <daniel.keep.lists gmail.com> said:

 Callbacks are "easier" to set up, but are incredibly complicated for any
 sort of structured parsing.  The problem is that you can't easily change
 the behaviour of the parser once it's started.
 
 I had to write a SAX parser for a structured data format a few years
 ago.  I swear that 90% of the code (and it's a monstrously huge module)
 was just boilerplate to work around the bloody callback system.  I've
 come to the conclusion that the SAX api is about the worst POSSIBLE way
 of parsing anything more complex than a flat file that shouldn't have
 been XML in the first place.

A callback API isn't necessarily SAX. A callback API doesn't necessarily have to parse everything until completion, it could parse only the next token and call the appropriate callback. If I can construct a range class/struct over my callback API I'll be happy. And if I can recursively call the parser API inside a callback handler so I can reuse the call stack while parsing then I'll be very happy.
 Something like Tango's PullParser is the superior API because although
 it's more verbose up-front, that's as bad as it gets.  Plus, you can
 actually do stuff like call subroutines.

All that is needed really is a callback system that parses only one token. Then the callback can update the PullParser state, or the token-range state, run in a loop to produce a SAX-like API, or directly do what you want to do, which may include parsing more tokens using different callbacks until you reach a closing tag. -- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 24 2009
parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Michel Fortin wrote:
 ...
 
 A callback API isn't necessarily SAX. A callback API doesn't necessarily
 have to parse everything until completion, it could parse only the next
 token and call the appropriate callback.

When I talk "callback api," I mean something fundamentally like SAX. The reason is that if your callback api only does a single callback, all you've really done is move the switch statement inside the function call at the cost of having to define a crapload of functions outside of it.
 If I can construct a range class/struct over my callback API I'll be
 happy. And if I can recursively call the parser API inside a callback
 handler so I can reuse the call stack while parsing then I'll be very
 happy.

I don't see how constructing a range over a callback api will work. Callback apis are inversion of control, ranges aren't. As for using a callback api recursively, that just seems like a lot of work to replicate the way a pull api works in the first place.
 Something like Tango's PullParser is the superior API because although
 it's more verbose up-front, that's as bad as it gets.  Plus, you can
 actually do stuff like call subroutines.

All that is needed really is a callback system that parses only one token. Then the callback can update the PullParser state, or the token-range state, run in a loop to produce a SAX-like API, or directly do what you want to do, which may include parsing more tokens using different callbacks until you reach a closing tag.

Like I said, this seems like a lot of work to bolt a callback interface onto something a pull api is designed for. At best, you'll end up rewriting this:
 foreach( tt ; pp )
 {
     switch( tt )
     {
         case XmlTokenType.StartElement: blah(pp.name); break;
         ...
     }
 }

to this:
 pp.parse
 (
     XmlToken(Type.StartElement, {blah(pp.name);}),
     ...
 );

Except of course that you now can't easily control the loop, nor can you do fall-through on the cases.
May 24 2009
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2009-05-24 12:51:43 -0400, Daniel Keep <daniel.keep.lists gmail.com> said:

 Michel Fortin wrote:
 ...
 
 A callback API isn't necessarily SAX. A callback API doesn't necessarily
 have to parse everything until completion, it could parse only the next
 token and call the appropriate callback.

When I talk "callback api," I mean something fundamentally like SAX.

SAX is definitely a popular callback API for XML, but to me a callback API just implies that some callback gets called.
 The reason is that if your callback api only does a single callback, all
 you've really done is move the switch statement inside the function call
 at the cost of having to define a crapload of functions outside of it.

The thing is that inside the parser code there is already a separate code path for dealing with each type of token. Various callbacks can be called from these separate code paths. When you return after parsing one token, the code path isn't different anymore, so you need to add an extra switch statement that wouldn't be there with a callback called from the right code path.
 If I can construct a range class/struct over my callback API I'll be
 happy. And if I can recursively call the parser API inside a callback
 handler so I can reuse the call stack while parsing then I'll be very
 happy.

I don't see how constructing a range over a callback api will work. Callback apis are inversion of control, ranges aren't.

Your definition of a callback API is about inversion of control. My definition is just that it parses one token and calls a function for that token. If you read what I wrote using your definition, it obviously can't work.
 ...
 
 Like I said, this seems like a lot of work to bolt a callback interface
 onto something a pull api is designed for.
 
 At best, you'll end up rewriting this:
 
 foreach( tt ; pp )
 {
 switch( tt )
 {
 case XmlTokenType.StartElement: blah(pp.name); break;
 ...
 }
 }

to this:
 pp.parse
 (
 XmlToken(Type.StartElement, {blah(pp.name);}),
 ...
 );

Except of course that you now can't easily control the loop, nor can you do fall-through on the cases.

Again, my definition of a callback API doesn't include an implicit loop, just a callback. And I intend the callback to be a template argument so it can be dispatched using function overloading and/or function templates. So you'll have this instead:

    bool more = true; // "continue" is a reserved word in D
    do
        more = pp.readNext!(callback)();
    while (more);

    void callback(OpenElementToken t) { blah(t.name); }
    void callback(CloseElementToken t) { ... }
    void callback(CharacterDataToken t) { ... }
    ...

No switch statement and no inversion of control.

And here's my current prototype for a range:

    alias Algebraic!(
        CharDataToken, CommentToken, PIToken, CDataSectionToken, AttrToken,
        XMLDeclToken, OpenElementToken, CloseElementToken, EmptyElementToken
    ) XMLToken;

    struct XMLForwardRange(Parser)
    {
        bool empty;
        XMLToken front;
        Parser parser;

        this(Parser parser)
        {
            this.parser = parser;
            popFront(); // parse first token
        }

        void popFront()
        {
            empty = !parser.readNext!(callback)();
        }

        private void callback(T)(T token)
        {
            front = token;
        }
    }

Constructing a pull parser using the same pattern should be pretty easy if you wanted to.

-- Michel Fortin michel.fortin michelf.com http://michelf.com/
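The one-token-per-call pattern described in this post can be tried out with a small self-contained sketch. Only `readNext`, the alias callback parameter and the token type names come from the post; `ToyParser`, its tokenizing rules (it understands nothing but `<tag>`, `</tag>` and text), the token fields and the `events` log are assumptions made purely for illustration. The overloads live at module scope.

```d
import std.stdio : writeln;
import std.string : indexOf;

// Token types: names follow the post, fields are assumed.
struct OpenElementToken  { string name; }
struct CloseElementToken { string name; }
struct CharDataToken     { string text; }

// Hypothetical toy parser, not a real XML parser.
struct ToyParser
{
    string input;

    // Parses exactly one token, hands it to `callback`, and returns
    // false once the input is exhausted (no token was read).
    bool readNext(alias callback)()
    {
        if (input.length == 0)
            return false;

        if (input[0] == '<')
        {
            auto close = input.indexOf('>');
            assert(close > 0, "unterminated tag");
            string tag = input[1 .. close];
            input = input[close + 1 .. $];
            if (tag.length > 0 && tag[0] == '/')
                callback(CloseElementToken(tag[1 .. $]));
            else
                callback(OpenElementToken(tag));
        }
        else
        {
            auto next = input.indexOf('<');
            if (next < 0)
                next = cast(ptrdiff_t) input.length;
            callback(CharDataToken(input[0 .. next]));
            input = input[next .. $];
        }
        return true;
    }
}

// Log filled by the callbacks (assumed, for demonstration only).
string[] events;

// One overload per token type: dispatch happens at compile time,
// so no switch statement is needed.
void callback(OpenElementToken t)  { events ~= "open " ~ t.name; }
void callback(CloseElementToken t) { events ~= "close " ~ t.name; }
void callback(CharDataToken t)     { events ~= "text " ~ t.text; }

string[] parseAll(string xml)
{
    events = null;
    auto pp = ToyParser(xml);
    bool more = true;
    do
    {
        more = pp.readNext!(callback)(); // the caller owns the loop
    }
    while (more);
    return events;
}

void main()
{
    writeln(parseAll("<a>hi</a>"));
}
```

Because the caller drives the loop, it can stop after any token, which is the property the post relies on when building a range on top of `readNext`.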
May 24 2009
next sibling parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-05-24 14:13:31 -0400, Michel Fortin <michel.fortin michelf.com> said:

 The reason is that if your callback api only does a single callback, all
 you've really done is move the switch statement inside the function call
 at the cost of having to define a crapload of functions outside of it.

The thing is that inside the parser code there is already a separate code path for dealing with each type of token. Various callbacks can be called from these separate code paths. When you return after parsing one token, the code path isn't different anymore, so you need to add an extra switch statement that wouldn't be there with a callback called from the right code path.

I suddenly noticed that I misunderstood what you meant in the paragraph above, so I don't expect my answer above to fit your question. Nevertheless, I suppose the examples at the end of my previous post will clarify things: basically the callback isn't a function pointer, it's an alias template argument which can dispatch to overloaded functions or template functions, so you don't need a switch statement. Sorry for any confusion.

-- Michel Fortin michel.fortin michelf.com http://michelf.com/
May 24 2009
prev sibling parent reply Daniel Keep <daniel.keep.lists gmail.com> writes:
Michel Fortin wrote:
 On 2009-05-24 12:51:43 -0400, Daniel Keep <daniel.keep.lists gmail.com>
 said:

(Cutting us mostly going back-and-forth on what a callback api would look like.)
 ...

 Like I said, this seems like a lot of work to bolt a callback interface
 onto something a pull api is designed for.

 ...

 Except of course that you now can't easily control the loop, nor can
 you do fall-through on the cases.

Again, my definition of a callback API doesn't include an implicit loop, just a callback. And I intend the callback to be a template argument so it can be dispatched using function overloading and/or function templates. So you'll have this instead:

    bool more = true; // "continue" is a reserved word in D
    do
        more = pp.readNext!(callback)();
    while (more);

    void callback(OpenElementToken t) { blah(t.name); }
    void callback(CloseElementToken t) { ... }
    void callback(CharacterDataToken t) { ... }
    ...

No switch statement and no inversion of control.

Except that you can't define overloads of a function inside a function.

Which means you have to stuff all of your code in a set of increasingly obtusely-named globals or private members. Like elemAStart, elemAData, elemAAttr, elemAClose, elemBStart, elemBData, elemBAttr, ...

One problem I see here is that you're going to spaghettify the code and state.

For example, let's say I'm writing code to handle a particular element. I can't put the code and state for this into a single function, I have to break it out over several. One function for each event. This means I need to have all state variables visible from each function. So I have to start shoving the state into the owning object instead of on the stack.

Whoops, I can't recurse now, can I? Sucks if I'm using any sort of hierarchical structure. I can't use the call stack, so I have to invent my own. I don't want to make every state variable a stack, so I put each component of the parser into a separate object which I can instantiate and kick off.

And at that point, I've just reinvented SAX. Well, almost. I have control over the loop. I still can't simply break out of it; I've got to mess around with flags to get that done.

Meanwhile, if I write that code with a PullParser, it's just a collection of normal functions, one per element type with all the related code together in one place. Or, if I don't want them all bundled together, I can dispatch to smaller functions.

I have a feeling you're going to head down this path irrespective, so I'll just hope you can figure out a way to make the api not suck.
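The argument about keeping per-element code and state on the call stack can be illustrated with a small sketch. This is not Tango's PullParser API: `PullParser`, `Node`, `parseElement` and `parseDocument` are hypothetical names, and the tokenizer only understands `<tag>`, `</tag>` and text. The point is only that a pull-style `next()` lets an ordinary recursive function handle nested elements.

```d
import std.string : indexOf;

// Hypothetical minimal pull parser (assumed API, for illustration).
struct PullParser
{
    enum Kind { open, close, text }

    string input;
    Kind kind;     // kind of the current token
    string value;  // tag name or character data of the current token

    // Advances to the next token; returns false at end of input.
    bool next()
    {
        if (input.length == 0)
            return false;

        if (input[0] == '<')
        {
            auto end = input.indexOf('>');
            assert(end > 0, "unterminated tag");
            string tag = input[1 .. end];
            input = input[end + 1 .. $];
            if (tag.length > 0 && tag[0] == '/')
            {
                kind = Kind.close;
                value = tag[1 .. $];
            }
            else
            {
                kind = Kind.open;
                value = tag;
            }
        }
        else
        {
            auto end = input.indexOf('<');
            if (end < 0)
                end = cast(ptrdiff_t) input.length;
            kind = Kind.text;
            value = input[0 .. end];
            input = input[end .. $];
        }
        return true;
    }
}

struct Node
{
    string name;
    string text;
    Node[] children;
}

// Precondition: pp is positioned on an open token.
// All state for one element lives in this function's locals.
Node parseElement(ref PullParser pp)
{
    Node node;
    node.name = pp.value;
    while (pp.next())
    {
        final switch (pp.kind)
        {
            case PullParser.Kind.open:
                node.children ~= parseElement(pp); // recursion reuses the call stack
                break;
            case PullParser.Kind.text:
                node.text ~= pp.value;
                break;
            case PullParser.Kind.close:
                return node; // leaving the loop is a plain return, no flags
        }
    }
    return node; // input ended before the closing tag
}

Node parseDocument(string xml)
{
    auto pp = PullParser(xml);
    immutable ok = pp.next();
    assert(ok && pp.kind == PullParser.Kind.open, "expected a root element");
    return parseElement(pp);
}
```

Calling `parseDocument("<a><b>x</b>y</a>")` yields a root node named `a` with one child `b`; no explicit stack or state object is needed.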
May 24 2009
parent Michel Fortin <michel.fortin michelf.com> writes:
On 2009-05-24 20:31:05 -0400, Daniel Keep <daniel.keep.lists gmail.com> said:

 Michel Fortin wrote:
 On 2009-05-24 12:51:43 -0400, Daniel Keep <daniel.keep.lists gmail.com>
 said:

(Cutting us mostly going back-and-forth on what a callback api would look like.)
 ...
 
 Like I said, this seems like a lot of work to bolt a callback interface
 onto something a pull api is designed for.
 
 ...
 
 Except of course that you now can't easily control the loop, nor can
 you do fall-through on the cases.

Again, my definition of a callback API doesn't include an implicit loop, just a callback. And I intend the callback to be a template argument so it can be dispatched using function overloading and/or function templates. So you'll have this instead:

    bool more = true; // "continue" is a reserved word in D
    do
        more = pp.readNext!(callback)();
    while (more);

    void callback(OpenElementToken t) { blah(t.name); }
    void callback(CloseElementToken t) { ... }
    void callback(CharacterDataToken t) { ... }
    ...

No switch statement and no inversion of control.

Except that you can't define overloads of a function inside a function.

I didn't know that. Interesting point. Perhaps that's just a bug in the compiler that we could get fixed though. Any clue on that? I notice it also happens if you want to specialize a nested template function.
 Which means you have to stuff all of your code in a set of increasingly
 obtusely-named globals or private members.  Like elemAStart, elemAData,
 elemAAttr, elemAClose, elemBStart, elemBData, elemBAttr, ...

But when inside a function you can still dispatch using a nested function template:

    void callback(T)(T t)
    {
        static if (is(T : OpenElementToken)) { blah(t.name); }
        static if (is(T : CloseElementToken)) { ... }
    }

It sure is a little less elegant, but you still skip a switch.
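A hedged sketch of this compile-time dispatch from inside a function, with the state kept in a local variable. Two things here are assumptions rather than anything from the thread: the free function `readNext` with its toy tokenizer (only `<tag>`, `</tag>` and text), and the use of an untyped function literal in place of a named nested template. The literal is itself a template, instantiated once per token type, and is a form that alias parameters of module-level templates accept when it captures locals.

```d
import std.string : indexOf;

// Token types: names follow the thread, fields are assumed.
struct OpenElementToken  { string name; }
struct CloseElementToken { string name; }
struct CharDataToken     { string text; }

// Hypothetical free-function tokenizer: reads one token from the front
// of `input`, passes it to `callback`, returns false when input is empty.
bool readNext(alias callback)(ref string input)
{
    if (input.length == 0)
        return false;

    if (input[0] == '<')
    {
        auto close = input.indexOf('>');
        assert(close > 0, "unterminated tag");
        string tag = input[1 .. close];
        input = input[close + 1 .. $];
        if (tag.length > 0 && tag[0] == '/')
            callback(CloseElementToken(tag[1 .. $]));
        else
            callback(OpenElementToken(tag));
    }
    else
    {
        auto next = input.indexOf('<');
        if (next < 0)
            next = cast(ptrdiff_t) input.length;
        callback(CharDataToken(input[0 .. next]));
        input = input[next .. $];
    }
    return true;
}

string[] elementTrace(string xml)
{
    string[] trace; // local state, captured by the literal below

    // Untyped parameter: each token type gets its own instantiation,
    // and `static if` selects the branch at compile time.
    alias callback = (t)
    {
        alias T = typeof(t);
        static if (is(T : OpenElementToken))
            trace ~= "open " ~ t.name;
        else static if (is(T : CloseElementToken))
            trace ~= "close " ~ t.name;
        // character data is ignored in this sketch
    };

    while (readNext!callback(xml)) {}
    return trace;
}
```

The handlers for all token kinds sit in one place and share `trace` through the enclosing function's stack frame, without a runtime switch.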
 ...
 And at that point, I've just reinvented SAX.  Well, almost.  I have
 control over the loop.  I still can't simply break out of it; I've got
 to mess around with flags to get that done.
 
 Meanwhile, if I write that code with a PullParser, it's just a
 collection of normal functions, one per element type with all the
 related code together in one place.  Or, if I don't want them all
 bundled together, I can dispatch to smaller functions.

There's no way I'm not including a pull API, most likely implemented as a range.
 I have a feeling you're going to head down this path irrespective, so
 I'll just hope you can figure out a way to make the api not suck.

I want to offer at least two API options (so you can choose the most appropriate parser API for what you do), and I want all of them to share the same underlying parser (so I don't write two or three parsers) with no compromise on speed.

I'm now realizing that an inversion of control can increase the performance of the parser by not having to rebranch on the current state each time you ask for a new token. I don't want to force inversion of control on anyone, but surely an API with inversion of control should be possible at full speed, and it can't be built on top of a pull parser.

So basically, the way I see it, you'd have two APIs: the inversion of control callback parser (for which you can specify a stop criterion so that it saves its state and releases control) and the range parser. The range is built on top of the inversion of control parser with a stop criterion making it stop and save its state after each token. With inlining, both APIs should run at optimal speed.

Perhaps you'll say that it's complicated, but if you have a better idea capable of extracting a maximum of performance for both parser APIs, then I'd like to know.

-- Michel Fortin michel.fortin michelf.com http://michelf.com/
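The layering proposed here (an inversion-of-control parser with a stop criterion, and a range built on top of it) might look roughly like the sketch below. Everything in it is assumed for illustration: the `Token`, `Parser` and `TokenRange` types, the toy tokenizer, and the choice to express the stop criterion as the handler's `bool` return value. It also uses a delegate for simplicity, whereas the post has an alias template argument in mind so the handler can be inlined.

```d
import std.string : indexOf;

struct Token
{
    enum Kind { open, close, text }
    Kind kind;
    string value;
}

// Hypothetical inversion-of-control parser: it owns the loop.
struct Parser
{
    string input; // the remaining input doubles as the saved state

    // Feeds tokens to `handler` until the input is exhausted or the
    // handler returns false (the stop criterion). A later call resumes
    // where this one stopped.
    void run(scope bool delegate(Token) handler)
    {
        while (input.length > 0)
        {
            Token t;
            if (input[0] == '<')
            {
                auto end = input.indexOf('>');
                assert(end > 0, "unterminated tag");
                string tag = input[1 .. end];
                input = input[end + 1 .. $];
                if (tag.length > 0 && tag[0] == '/')
                    t = Token(Token.Kind.close, tag[1 .. $]);
                else
                    t = Token(Token.Kind.open, tag);
            }
            else
            {
                auto end = input.indexOf('<');
                if (end < 0)
                    end = cast(ptrdiff_t) input.length;
                t = Token(Token.Kind.text, input[0 .. end]);
                input = input[end .. $];
            }
            if (!handler(t))
                return; // release control; state is already saved
        }
    }
}

// Range built on top of the parser: stop after every single token.
struct TokenRange
{
    Parser parser;
    Token front;
    bool empty;

    this(Parser p)
    {
        parser = p;
        popFront(); // parse the first token
    }

    void popFront()
    {
        bool got = false;
        parser.run((Token t) { front = t; got = true; return false; });
        empty = !got;
    }
}

// Pull-style use through the range.
string[] valuesViaRange(string xml)
{
    string[] values;
    foreach (t; TokenRange(Parser(xml)))
        values ~= t.value;
    return values;
}

// Callback-style use: run until the first closing tag, then stop.
string[] valuesUntilFirstClose(string xml)
{
    string[] values;
    auto p = Parser(xml);
    p.run((Token t) {
        values ~= t.value;
        return t.kind != Token.Kind.close;
    });
    return values;
}
```

Both entry points share the single tokenizer inside `Parser.run`, which is the property the post is after; whether the range layer costs anything at run time depends on how well the compiler inlines the handler.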
May 26 2009
prev sibling parent dolive <dolive89 sina.com> writes:
Andrei Alexandrescu wrote:

 Jason House wrote:
 Andrei Alexandrescu Wrote:
 
 Jason House wrote:
 BCS wrote:

 Hello Jason,

 Should the final freezing of D2 be delayed until major D1 libraries
 port to D2? I'm mostly thinking of Tango, but I bet there are others.
 It may even be good if major libraries could use a Phobos-compatible
 license and become part of the releases by digital mars.

Maybe it should be declared "done" as in it's got everything that Walter, Andrei, Bartosz and friends want in it, but it might be changed if the Lib writers ask for some tweaks. Sort of a "feature" freeze.

"Walter, Andrei, Bartosz, and friends": If you're reading this, can you shed some light on what's happening before D2 is declared stable? And when?

progress towards writing up until the end of August. The last chapter concerns concurrency and is the fuzziest one.

Ok, so pen down in three months?

Yah.
 Thank you for your initiative to enlist help from the community. There's 
 a lot of very visible help already happening: there's been a sharp 
 increase in bug reports and patches recently. Walter and I are still 
 scratching our head over that (it's not like dmd got much crappier 
 overnight). I can only infer that more people have started using more of D.

The increase is interesting. Out of curiosity, is the increase dominantly for the backend? I wonder if having a sense of D2 stabilizing is increasing usage of D2 overall.

Walter has no specific statistics.
 I'd be thrilled to add more stuff to Phobos. Stuff can be done with 
 ranges that's almost indistinguishable from poetry. But ranges aren't 
 everything, Georg :o). I think Shin's BlackHole and WhiteHole slammed 
 open a door to a world of amazing possibilities. Things like 
 compile-time reflection, run-time reflection, and dynamic loading are 
 very hot and the possibilities are huge. Among other things, Variant can 
 with relative ease implement a function var.call("fun", arg1, arg2) that 
 forwards everything dynamically to a member function of the embedded object.

What do you / others consider the weakest / missing parts of Phobos?

Wow. Where should I start. Let me go down the list of modules and share a few thoughts.

* std.array: we need to make a decision about differentiating arrays from slices.
* std.base64: doesn't deserve a separate module
* std.bind: eliminate?
* std.bitmanip: define a range for BitArray and eliminate opApply. Add opSlice.
* std.vendor: should this go in core?
* std.complex: IMPLEMENT. Eliminate any trace of built-in complex.
* std.conv: define operations to stream data out and in in binary and text formats.
* std.cover: another little module that should be merged somewhere
* std.date: unnecessarily clunky and low-level. Also, somehow Walter thinks that std.dateparse has absolutely nothing to do with date.
* std.demangle: another small module. Should be merged with e.g. other compiler-specific stuff.
* std.encoding, std.utf: we need a massive overhaul of all encoding-specific stuff. Massive. Epic. The current pile of... functionality makes the simplest stuff look like rocket surgery.
* std.md5: we should add more such encryption devices.
* std.metastrings: I hate the name. Merge into std.string using ctfe
* std.mmfile: integrate with the garbage collector. It should be there.
* std.outbuffer: I think this shouldn't be a class and shouldn't have that name.
* std.outofmemory: why???
* std.process: add pipe() for Windows. Actually that should be in stdio.
* std.regex, std.regexp: merge and finalize.
* std.signals: I don't know much. A review wouldn't hurt.
* std.socket, std.socketstream: We need a real networking library.
* std.stdio: implement readf and various I/O specific ranges
* std.cstream, std.stream: eliminate.
* std.string: arrange so there's no overlapping/conflict with std.algorithm. Implement bidir range for reading strings correctly (already done that).
* std.system: merge somewhere
* std.thread: replace
* std.variant: add dynamic method invocation capabilities
* std.xml: replace with something that moves faster than molasses.
* std.zip: rewrite

Well there's much other stuff I'm sure but I just dumped what came to mind when taking a look.

Andrei

Phobos should have a foundational database-related module, for example: std.data

dolive
Jun 04 2009
prev sibling parent Georg Wrede <georg.wrede iki.fi> writes:
Andrei Alexandrescu wrote:
 But ranges aren't everything, Georg :o).

:-)
May 23 2009
prev sibling parent Brad Roberts <braddr puremagic.com> writes:
Don wrote:
 bearophile wrote:
 Andrei Alexandrescu:
 there's been a sharp increase in bug reports and patches recently.
 Walter and I are still scratching our head over that (it's not like
 dmd got much crappier overnight). I can only infer that more people
 have started using more of D.

I think it's mostly a complex consequence of showing DMD source code. I have "predicted" this outcome in one post more than one year ago.

Yes. It's the simple fact that you can compile DMD out-of-the-box. In fact, everyone who has downloaded DMD is "forced" to have a working copy of the source code! It's interesting to compare this with GDC, which, with the GNU license, is a purer form of "free software". Yet, it's amazingly difficult to get it to compile (I tried once, and failed). It's not just about having source code "available".

I don't believe that to be the case. That would explain why more _fixes_ are being provided (primarily thanks to your contributions), but not why there's been an increase in bug _filing_. http://d.puremagic.com/issues/reports.cgi?product=D&datasets=UNCONFIRMED%3A&datasets=NEW%3A&datasets=ASSIGNED%3A&datasets=REOPENED%3A&datasets=VERIFIED%3A&datasets=FIXED%3A Regardless.. it's all good. More reports == more chances of more things being fixed. Later, Brad
May 22 2009
prev sibling next sibling parent reply Robert Clipsham <robert octarineparrot.com> writes:
Jason House wrote:
 Andrei has indicated that the current plan is to finalize D2 when his
 book comes out.
 
 Given this, I'm interested in what _community_ activity should be
 done as part of this.
 
 Should there be a formal review and polishing of the D spec? More
 than just criticizing faults, people should submit patches or open a
 discussion of what something means. Unimplemented features should be
 clearly marked or removed.

I personally think this is a must before D2 is declared as stable. I don't see how it is possible to call a language complete/stable without a complete specification. We don't want D2 to end up in the state D1 is, where it's 'stable', but the spec is incomplete so there are breaking issues which won't be fixed as the language is 'stable'.
 Should the final freezing of D2 be delayed until major D1 libraries
 port to D2? I'm mostly thinking of Tango, but I bet there are others.
 It may even be good if major libraries could use a Phobos-compatible
 license and become part of the releases by digital mars.

I think not. As long as dmd2 goes through a beta/release candidate phase I don't think that this will be an issue.
 Can we generate a bugfix most wanted list? The formal list could
 inspire patches by motivated community members. There should be a
 quality requirement and a review process for submissions.

This is what the voting system in bugzilla is for!
 To do this, we only need coordinators and a willingness from Walter
 to promptly handle all the patch submissions. (I don't care if Walter
 delegates, but it's tough to get motivated to do work if there's no
 promise for using the output of one's hard work. Walter should also
 be able to use a red pen on the most-wanted list before the tasks are
 given out.

I don't care how it happens, as long as it does. I think as long as there are no blocker bugs eg #340 at the time of the first stable release, or any other bugs that will cause breaking changes to fix, it should work out alright.
 Thoughts?

My main thought is that this is a bit early to be thinking about this. D2 is still in alpha, with lots of feature and bug changes in each release. Until its feature set begins to settle I don't think it is too important to think about how to manage a release. When we get to that stage, I think as long as there is a point of feature freeze, where:

* All remaining major bugs are worked out (possibly across a few releases)
* The spec is cleaned up, updated and completed
* We have this discussion, and make sure the members of the D community are happy with the language, happy their libraries/apps will port well etc

Then D2 will become a major success!
May 22 2009
parent Jason House <jason.james.house gmail.com> writes:
Robert Clipsham wrote:

 My main thought is that this is a bit early to be thinking about this.
 D2 is still in alpha, with lots of feature and bug changes in each
 release. Until its feature set begins to settle I don't think it is too
 important to think about how to manage a release.

I get the impression from some of Andrei's posts that D2 may be declared done in 3 months. (I made up that number, but that's the general vibe that I get) If that's true, it really is time to consider this stuff. There's not a lot of time for revisions to the book before it goes to print. I also expect a solid finalization process to take a few months. D shouldn't take anywhere near as long as C++0x to standardize. If we really do focus on supplying patches, that will take a considerable amount of time.
May 22 2009
prev sibling next sibling parent Tomas Lindquist Olsen <tomas.l.olsen gmail.com> writes:
On Sat, May 23, 2009 at 7:25 AM, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:
 Jason House wrote:
 Andrei Alexandrescu Wrote:

 Jason House wrote:
 BCS wrote:

 Hello Jason,

 Should the final freezing of D2 be delayed until major D1 libraries
 port to D2? I'm mostly thinking of Tango, but I bet there are others.
 It may even be good if major libraries could use a Phobos-compatible
 license and become part of the releases by digital mars.

Maybe it should be declared "done" as in it's got everything that Walter, Andrei, Bartosz and friends want in it, but it might be changed if the Lib writers ask for some tweaks. Sort of a "feature" freeze.

Yes! "Walter, Andrei, Bartosz, and friends": If you're reading this, can you shed some light on what's happening before D2 is declared stable? And when?

progress towards writing up until the end of August. The last chapter concerns concurrency and is the fuzziest one.

Ok, so pen down in three months?

Yah.
 Thank you for your initiative to enlist help from the community. There's
 a lot of very visible help already happening: there's been a sharp increase
 in bug reports and patches recently. Walter and I are still scratching our
 head over that (it's not like dmd got much crappier overnight). I can only
 infer that more people have started using more of D.

The increase is interesting. Out of curiosity, is the increase dominantly for the backend? I wonder if having a sense of D2 stabilizing is increasing usage of D2 overall.

Walter has no specific statistics.
 I'd be thrilled to add more stuff to Phobos. Stuff can be done with
 ranges that's almost indistinguishable from poetry. But ranges aren't
 everything, Georg :o). I think Shin's BlackHole and WhiteHole slammed open a
 door to a world of amazing possibilities. Things like compile-time
 reflection, run-time reflection, and dynamic loading are very hot and the
 possibilities are huge. Among other things, Variant can with relative ease
 implement a function var.call("fun", arg1, arg2) that forwards everything
 dynamically to a member function of the embedded object.

What do you / others consider the weakest / missing parts of Phobos?

Wow. Where should I start. Let me go down the list of modules and share a few thoughts.

* std.array: we need to make a decision about differentiating arrays from slices.
* std.base64: doesn't deserve a separate module
* std.bind: eliminate?
* std.bitmanip: define a range for BitArray and eliminate opApply. Add opSlice.
* std.vendor: should this go in core?
* std.complex: IMPLEMENT. Eliminate any trace of built-in complex.

How do you plan to handle ABI compatibility with C if complex becomes a library type? Drop it?
 * std.conv: define operations to stream data out and in in binary and text
 formats.
 * std.cover: another little module that should be merged somewhere
 * std.date: unnecessarily clunky and low-level. Also, somehow Walter thinks
 that std.dateparse has absolutely nothing to do with date.
 * std.demangle: another small module. Should be merged with e.g. other
 compiler-specific stuff.
 * std.encoding, std.utf: we need a massive overhaul of all encoding-specific
 stuff. Massive. Epic. The current pile of... functionality makes the
 simplest stuff look like rocket surgery.
 * std.md5: we should add more such encryption devices.
 * std.metastrings: I hate the name. Merge into std.string using ctfe
 * std.mmfile: integrate with the garbage collector. It should be there.
 * std.outbuffer: I think this shouldn't be a class and shouldn't have that
 name.
 * std.outofmemory: why???
 * std.process: add pipe() for Windows. Actually that should be in stdio.
 * std.regex, std.regexp: merge and finalize.
 * std.signals: I don't know much. A review wouldn't hurt.
 * std.socket, std.socketstream: We need a real networking library.
 * std.stdio: implement readf and various I/O specific ranges
 * std.cstream, std.stream: eliminate.
 * std.string: arrange so there's no overlapping/conflict with std.algorithm.
 Implement bidir range for reading strings correctly (already done that).
 * std.system: merge somewhere
 * std.thread: replace
 * std.variant: add dynamic method invocation capabilities
 * std.xml: replace with something that moves faster than molasses.
 * std.zip: rewrite

 Well there's much other stuff I'm sure but I just dumped what came to mind
 when taking a look.

 Andrei

May 23 2009
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Sat, 23 May 2009 17:33:10 +0400, Andrei Alexandrescu
<SeeWebsiteForEmail erdani.org> wrote:

 * std.xml: replace with something that moves faster than molasses.
 * std.zip: rewrite

 * std.md5: we should add more such encryption devices.
 * std.base64: doesn't deserve a separate module
 * std.conv: define operations to stream data out and in in binary and
 text formats.

How about giving these up to Tango? The only problem is, it has not been ported to D2 yet.

That's not an option. Andrei

That's an *awesome* option! These are big complex tasks. A lot of internal redesign, breaking changes etc will follow alongside bugfixes. Phobos can't afford something like this. Besides, they are not a crucial part of the language, and I believe they should be done as a third-party library. Most importantly, Tango has already implemented all of the above. It is an important task not only to allow Tango and Phobos to coexist, but to make them fit together.
May 23 2009