digitalmars.D - Phobos Proposal: replace std.xml with kxml.

reply Bernard Helyer <b.helyer gmail.com> writes:
When I first started using D, one of the things I needed quite early on 
was a way of writing and reading XML. Naturally, when I saw std.xml in 
Phobos, I was quite pleased.

That was, of course, until I started to use it.

http://d.puremagic.com/issues/show_bug.cgi?id=3088
http://d.puremagic.com/issues/show_bug.cgi?id=4069
http://d.puremagic.com/issues/show_bug.cgi?id=3201

I vented my frustration on IRC, where opticron mentioned he had an XML 
library of his own. I find it superior to std.xml, especially 
considering how it actually works, and is maintained.

http://opticron.no-ip.org/svn/branches/kxml/

It is already under the Boost License, and opticron has said

"<opticron> ... if they really want to snag mine and clean it up for use 
in phobos, that's fine
<opticron> I'd even relicense the code if I have to"

I'm going to keep on using kxml regardless, but I thought it would be 
nice if Phobos had a working XML library. What say you?



-Bernard.
May 03 2010
next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tue, May 04, 2010 at 09:18:46AM +1200, Bernard Helyer wrote:
 I'm going to keep on using kxml regardless, but I thought it would be 
 nice if Phobos had a working XML library. What say you?
I've got something almost ready to be thrown into the ring too, my DOM lib:

http://arsdnet.net/dcode/dom.d

My initial goal here was to mimic JavaScript in the browser, but it has
since grown to have a bunch of extensions too.

An important aspect of mimicking JS is that I don't really care about the
XML standard; it tries to figure out whatever ugly crap you throw its way
and makes a few assumptions for HTML. But it can be used for generic XML
stuff too.

-- 
Adam D. Ruppe
http://arsdnet.net
May 03 2010
parent BCS <none anon.com> writes:
Hello Adam,

 An important aspect of mimicking js is I don't really care about the
 xml
 standard; it tries to figure out whatever ugly crap you throw its way
 and makes a few assumptions for html. But, it can be used for generic
 xml
 stuff too.
That can be handy but can also lead to problems.

-- 
... <IXOYE><
May 04 2010
prev sibling parent reply Graham Fawcett <fawcett uwindsor.ca> writes:
On Tue, 04 May 2010 09:18:46 +1200, Bernard Helyer wrote:

 I'm going to keep on using kxml regardless, but I thought it would be
 nice if Phobos had a working XML library. What say you?
I haven't looked at kxml -- but why not just wrap libxml2? It's widely
regarded as a fast, stable, portable and *correct* XML library.

I wrote a partial libxml2 wrapper (mostly the tree.h stuff, and some
libxslt) in under an hour as a learning exercise; someone with real D
chops could turn out a polished interface in short time.

The fact that libxml2/libxslt support not only XML parsing and DOM
building, but also XSLT, XPath, XPointer, XInclude, RelaxNG, etc.,
means that any homegrown library will be hard-pressed to cover the same
range of tools and features.

There are too many half-baked XML libraries in the world. No disrespect
intended to opticron or anyone else; it just doesn't make a lot of
sense to reinvent such a complex wheel (and believing that XML
processing isn't complex is a sure sign that your homegrown library's
design is incomplete!).

Graham
May 03 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
I think what we need for the standard library is to take a solid XML
library licensed generously and adapt it to work with arbitrary ranges.

Andrei
May 03 2010
next sibling parent reply Bernard Helyer <b.helyer gmail.com> writes:
On 04/05/10 11:01, Andrei Alexandrescu wrote:
 I think what we need for the standard library is to take a solid XML
 library licensed generously and adapt it to work with arbitrary ranges.

 Andrei
Care to give an example?
May 03 2010
parent reply Ellery Newcomer <ellery-newcomer utulsa.edu> writes:
On 05/03/2010 06:24 PM, Bernard Helyer wrote:
 On 04/05/10 11:01, Andrei Alexandrescu wrote:
 I think what we need for the standard library is to take a solid XML
 library licensed generously and adapt it to work with arbitrary ranges.

 Andrei
 Care to give an example?

I was curious about this too. When I looked around, I saw TinyXML (zlib
license) and POCO XML (Boost license), but I've never used either and
couldn't say whether either is solid.
May 03 2010
parent Richard Webb <webby beardmouse.co.uk> writes:
RapidXML also uses the Boost license (it's included as part of the Boost
PropertyTree library).
I haven't used it though, so I can't say how it compares to the others.
May 04 2010
prev sibling parent reply Graham Fawcett <fawcett uwindsor.ca> writes:
On Mon, 03 May 2010 16:01:30 -0700, Andrei Alexandrescu wrote:

 I think what we need for the standard library is to take a solid XML
 library licensed generously and adapt it to work with arbitrary ranges.

By "adapt" do you mean writing a wrapper for an existing library, or
translating the source code of the library into D?

What constitutes a "generous license" in this context? (For what it's
worth, libxml2 is under the MIT License.)

Graham
May 04 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Graham Fawcett wrote:
 By "adapt" do you mean writing a wrapper for an existing library, or
 translating the source code of the library into D?
 What constitutes a "generous license" in this context? (For what it's
 worth, libxml2 is under the MIT License.)

 Graham

We'd need to modify the code. I haven't looked into available XML
libraries so I don't know which would be eligible.

Andrei
May 04 2010
next sibling parent reply Graham Fawcett <fawcett uwindsor.ca> writes:
On Tue, 04 May 2010 09:09:29 -0700, Andrei Alexandrescu wrote:

 We'd need to modify the code. I haven't looked into available xml
 libraries so I don't know which would be eligible.

I think I understand your motivations: this is the standard library, and
so you want to minimize dependencies. But from a maintenance perspective,
it seems a bad idea to translate a complex library into D code that few
people will actively maintain, whereas writing a wrapper (and
introducing a library dependency) would keep the codebase small, let you
share maintenance costs with the third-party library's developers, and
(arguably) increase the stability and quality of the stdlib.

I am not pushing for libxml2 as The Answer. I'm just questioning the
motivation to translate other people's code to D, when the D platform
excels at library integration. (Although I agree with your suggestion to
borrow inspiration/code from Boost for datetime and other features;
that's different, since Boost cannot feasibly be wrapped.)

Best,
Graham
May 04 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
My concern is purely technical - a library we just link to would force a
number of choices, such as input representation (e.g. arrays of char).
Ideally we should be able to change the library to accept any compatible
range of any compatible characters.

As a simple example, consider std.algorithm.levenshteinDistance. There
are plenty of good implementations and initially I just wrote one almost
identical to the Web lore. However, later I needed to compute
Levenshtein distances between strings stored in lists (tries, actually).
Well that doesn't work because the implementation at that time used
random access s[i] and t[i] all over the place. But it wasn't difficult
to change the algorithm to work with forward ranges.

So now we have one of the few Levenshtein distance implementations that
work with inputs other than arrays. In particular, we work correctly
with UTF inputs without needing to copy the input, something that I
haven't seen anywhere else. If you google for ``levenshtein utf'' Google
will even think the query has a typo. Search results include an OCaml
implementation that copies the input
(http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#OCaml)
and a Ruby implementation that also copies the input
(http://rubyforge.org/frs/?group_id=2080&release_id=7389).

By using the range abstraction, we get to support UTF Levenshtein
without significant additional implementation effort - the code is very
similar to the one using indices throughout.

Andrei
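
For readers who want to try the behaviour described above, here is a minimal sketch. It assumes a current Phobos, where the function is importable from std.algorithm.comparison (that module layout is newer than this thread), and the sample words are arbitrary.

```d
import std.algorithm.comparison : levenshteinDistance;
import std.algorithm.iteration : filter;

void main()
{
    // Plain strings (arrays of char).
    assert(levenshteinDistance("kitten", "sitting") == 3);

    // UTF-8 input is compared code point by code point, without copying:
    // U+00E9 takes two code units but counts as a single substitution.
    assert(levenshteinDistance("caf\u00E9", "cafe") == 1);

    // Forward ranges that are not arrays work too. filter!"true" is used
    // here only to hide the array behind a range with no random access.
    assert(levenshteinDistance("parks".filter!"true",
                               "spark".filter!"true") == 2);
}
```

The third call is the interesting one: the inputs offer no s[i], so an index-based implementation could not accept them at all.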
May 04 2010
parent Graham Fawcett <fawcett uwindsor.ca> writes:
On Tue, 04 May 2010 11:56:31 -0700, Andrei Alexandrescu wrote:

 [...] By using the range abstraction, we get to support UTF Levenshtein
 without significant additional implementation effort - the code is very
 similar to the one using indices throughout.

That's a strong argument -- thank you for taking the time to respond.

Regards,
Graham
May 04 2010
prev sibling parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-04 12:09:29 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Graham Fawcett wrote:
 By "adapt" do you mean writing a wrapper for an existing library, or 
 translating the source code of the library into D?
 What constitutes a "generous license" in this context? (For what it's 
 worth, libxml2 is under the MIT License.)
 
 Graham
 We'd need to modify the code. I haven't looked into available xml
 libraries so I don't know which would be eligible.

I think if you wanted to port an XML library to make use of ranges, the
only viable option is probably to find one based on C++ iterators.
Otherwise it'll look more like a rewrite than a port, and at this point
why not write one from scratch?

Anyway, just in case, would you be interested in an XML tokenizer and
simple DOM following this model?

    http://michelf.com/docs/d/mfr/xmltok.html
    http://michelf.com/docs/d/mfr/xml.html

At the base is a pull parser and an event parser mixed in the same
function template: "tokenize", allowing you to alternate between
event-based and pull-parsing at will. I'm using it, but its development
is on hold at this time; I'm just maintaining it so it compiles on the
newest versions of DMD.

The only thing it doesn't parse at this time is inline DTDs inside the
doctype.

Also, it currently only works with strings, for simplicity and
performance. There is one issue about non-string parsing: when parsing a
string, it's easy to just slice the string and move it around, but if
you're parsing from a generic input range, you basically have to copy
characters one by one, which is much less efficient. So ideally the
algorithm should use slices whenever it can (when the input is a string).

I'm not sure yet how to attack this problem, but I'm thinking that
perhaps parsing primitives should be "part of" the range interface. I
say this in the sense that a range should provide a specialized
implementation of a primitive when it can implement it more efficiently
(like by slicing). You wrote a while ago about designing parsing
primitives; is this part of Phobos now?

Anyway, the problem above is probably the one reason we might want to
write the parser from scratch: it needs to bind to specializable
higher-level parsing functions to take advantage of the performance
characteristics of certain ranges, such as those you can slice.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 04 2010
parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
Michel Fortin wrote:
 I think if you wanted to port an XML library to make use of ranges, the
 only viable option is probably to find one based on C++ iterators.
 Otherwise it'll look more like a rewrite than a port, and at this point
 why not write one from scratch?

Design is also a considerable time expense, though I agree that use of
ranges may actually improve the design too.

 Anyway, just in case, would you be interested in an XML tokenizer and 
 simple DOM following this model?
 
     http://michelf.com/docs/d/mfr/xmltok.html
     http://michelf.com/docs/d/mfr/xml.html
 
 At the base is a pull parser and an event parser mixed in the same 
 function template: "tokenize", allowing you to alternate between 
 event-based and pull-parsing at will. I'm using it, but its development 
 is on hold at this time, I'm just maintaining it so it compiles on the 
 newest versions of DMD.
Sounds great, but I need to defer XML expertise to others.
 The only thing it doesn't parse at this time is inline DTDs inside the 
 doctype.
 
 Also, it currently only works with strings, for simplicity and 
 performance. There is one issue about non-string parsing: when parsing a 
 string, it's easy to just slice the string and move it around, but if 
 you're parsing from a generic input range, you basically have to copy 
 characters one by one, which is much less efficient. So ideally the 
 algorithm should use slices whenever it can (when the input is a string).
 
 I'm not sure yet how to attack this problem, but I'm thinking that 
 perhaps parsing primitives should be "part of" the range interface. I 
 say this in the sense that a range should provide specialized 
 implementation of primitive when it can implement them more efficiently 
 (like by slicing). You wrote a while ago about designing parsing 
 primitives, is this part of Phobos now?
 
 Anyway, the problem above is probably the one reason we might want to 
 write the parser from scratch: it needs to bind to specializable 
 higher-level parsing functions to take advantage of the performance 
 characteristics of certain ranges, such as those you can slice.
There are a number of issues. One is that you should allow wchar and
dchar in addition to char as basic character types (probably ubyte too
for exotic encodings). In essence the char type should be a template
parameter.

The other is that perhaps you might be able to use zero-based slices,
i.e. s[0 .. i] as opposed to arbitrary slices s[i .. j]. A zero-based
slice can be supported better than an arbitrary one.

Andrei
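
The two suggestions above can be sketched together as follows. This is only an illustration under stated assumptions: takeText is a hypothetical helper, not part of Phobos or of any library mentioned in this thread, and it handles only the simplest case of reading character data up to the next '<'.

```d
import std.traits : isSomeChar;

// Hypothetical helper for illustration only.
// The character type is a template parameter, so the same code serves
// char, wchar and dchar input. The result is the zero-based slice
// s[0 .. i]; nothing is copied.
const(Char)[] takeText(Char)(ref const(Char)[] s)
if (isSomeChar!Char)
{
    size_t i = 0;
    // '<' is a single code unit in UTF-8, UTF-16 and UTF-32 and never
    // occurs inside a multi-unit sequence, so no decoding is needed.
    while (i < s.length && s[i] != '<')
        ++i;
    auto head = s[0 .. i]; // zero-based slice of the input
    s = s[i .. $];         // advance the input past the consumed text
    return head;
}

void main()
{
    const(char)[] a = "caf\u00E9<b>";
    assert(takeText(a) == "caf\u00E9" && a == "<b>");

    const(wchar)[] w = "caf\u00E9<b>"w;
    assert(takeText(w) == "caf\u00E9"w && w == "<b>"w);

    const(dchar)[] d = "caf\u00E9<b>"d;
    assert(takeText(d) == "caf\u00E9"d && d == "<b>"d);
}
```

Because the helper always slices from index 0 and then drops the consumed prefix, it never needs an arbitrary s[i .. j].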
May 04 2010
parent reply Michel Fortin <michel.fortin michelf.com> writes:
On 2010-05-04 19:41:39 -0400, Andrei Alexandrescu 
<SeeWebsiteForEmail erdani.org> said:

 Anyway, just in case, would you be interested in an XML tokenizer and 
 simple DOM following this model?
 
     http://michelf.com/docs/d/mfr/xmltok.html
     http://michelf.com/docs/d/mfr/xml.html
 
 At the base is a pull parser and an event parser mixed in the same 
 function template: "tokenize", allowing you to alternate between 
 event-based and pull-parsing at will. I'm using it, but its development 
 is on hold at this time, I'm just maintaining it so it compiles on the 
 newest versions of DMD.
 Sounds great, but I need to defer XML expertise to others.

If someone else wants to use it, I offer it. Otherwise I'll surely
continue working on it eventually.

 Anyway, the problem above is probably the one reason we might want to 
 write the parser from scratch: it needs to bind to specializable 
 higher-level parsing functions to take advantage of the performance 
 characteristics of certain ranges, such as those you can slice.
 There are a number of issues. One is that you should allow wchar and
 dchar in addition to char as basic character types (probably ubyte too
 for exotic encodings). In essence the char type should be a template
 parameter.

I totally agree about wchar and dchar... and you also need ubyte with an
encoding detection system (checking the encoding in the XML prolog) to
correctly parse XML files in any encoding. I think I have a ubyte
parser, but it currently just accepts UTF-8 and then branches to the
"string" version. (UTF-16 would be needed to implement the XML spec
correctly.)
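
As a rough idea of what the first step of such detection could look like, here is a sketch that only inspects the leading bytes. All names are hypothetical, and it deliberately ignores UTF-32 and the encoding="..." declaration in the prolog, both of which a complete implementation would have to handle.

```d
// Hypothetical names; a simplified first pass, not a complete detector.
enum Encoding { utf8, utf16le, utf16be }

Encoding guessEncoding(const(ubyte)[] data)
{
    if (data.length >= 2)
    {
        // Byte order marks.
        if (data[0] == 0xFF && data[1] == 0xFE) return Encoding.utf16le;
        if (data[0] == 0xFE && data[1] == 0xFF) return Encoding.utf16be;
        // No BOM: see how the leading '<' (0x3C) is laid out in memory.
        if (data[0] == 0x3C && data[1] == 0x00) return Encoding.utf16le;
        if (data[0] == 0x00 && data[1] == 0x3C) return Encoding.utf16be;
    }
    // Anything else is treated as UTF-8 in this sketch.
    return Encoding.utf8;
}

void main()
{
    immutable ubyte[] bomLE  = [0xFF, 0xFE, 0x3C, 0x00];
    immutable ubyte[] bareBE = [0x00, 0x3C, 0x00, 0x3F];
    assert(guessEncoding(bomLE) == Encoding.utf16le);
    assert(guessEncoding(bareBE) == Encoding.utf16be);
    assert(guessEncoding(cast(const(ubyte)[]) "<?xml version=\"1.0\"?>")
           == Encoding.utf8);
}
```

Once the encoding is known, the ubyte input can be handed to the char or wchar instantiation of the parser.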
 The other is that perhaps you could be able to use zero-based slices, 
 i.e. s[0 .. i] as opposed to arbitrary slices s[i .. j]. A zero-based 
 slice can be supported better than an arbitrary one.
Yeah, well, it doesn't really work so well. You have to parse the input
before slicing. XML doesn't tell you in advance how many characters or
code units a string will take. What you need is more like this:

	// Example XML parser
	bool isAtEndOfAttributeContent(dchar c) {
		if (c == '"') return true;
		if (c == '<') throw new ParseError();
		return false;
	}

	void parseXML(T)(T input) if (isInputRange!T) {
		[...]
		case '"':
			input.popFront(); // remove leading quote
			auto content = readUntil!(isAtEndOfAttributeContent)(input);
			assert(input.front == '"');
			input.popFront(); // remove trailing quote
		[...]
	}

	// Example parsing primitive

	// String version: can slice
	string readUntil(alias isAtEndPredicate)(ref string input) {
		string savedInput = input; // remember where the content starts
		while (!input.empty && !isAtEndPredicate(input.front)) {
			input.popFront();
		}
		// everything consumed so far, as a slice of the original string
		return savedInput[0..$-input.length];
	}

	// Generic input range version: can't slice, must copy
	immutable(ElementType!T)[] readUntil(alias isAtEndPredicate, T)(ref T input)
	if (isInputRange!T && !is(T == string)) {
		immutable(ElementType!T)[] copy; // should use appender here
		while (!input.empty) {
			auto frontChar = input.front;
			if (isAtEndPredicate(frontChar)) break;
			copy ~= frontChar;
			input.popFront();
		}
		return copy;
	}

It's easy to appreciate the difference in performance between the string
version and the generic version of readUntil just by looking at the code.

-- 
Michel Fortin
michel.fortin michelf.com
http://michelf.com/
May 04 2010
parent "Robert Jacques" <sandford jhu.edu> writes:
On Tue, 04 May 2010 21:55:53 -0400, Michel Fortin  
<michel.fortin michelf.com> wrote:
 	// String version: can slice
 	string readUntil(alias isAtEndPredicate)(ref string input) {
 		string savedInput = input;
 		while (!input.empty && !isAtEndPredicate(input.front)) {
 			input.popFront();
 		}
 		return savedInput[0..$-input.length];
 	}
What about using forward ranges and take?

	// Forward range version: no copy, returns a lazy view
	Take!T readUntil(alias isAtEndPredicate, T)(ref T input)
	if (isForwardRange!T) {
		auto savedInput = input.save; // remember the starting position
		size_t n = 0;
		while (!input.empty && !isAtEndPredicate(input.front)) {
			input.popFront();
			n++;
		}
		return take(savedInput, n); // first n elements of the saved range
	}
May 04 2010