
digitalmars.D - XmlTokenizer review: Features and API

I have written an XML tokenizer because I wanted to learn D2 on a sizeable
project. It is templated to support any range whose element type is a character
(but using a range other than string, wstring or dstring is at your own risk
for now). It also uses template parameters to select features at compile time.
The features control the degree of XML conformance, namespace support, the type
of entity decoding, the default entity collection and support for XML fragment
parsing.
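
To give a rough idea of what "features set at compile time" means, here is a
minimal sketch. It is not the tokenizer's real API: the names Feature,
SketchTokenizer and acceptsStart are made up for illustration, and the actual
names and options are in the documentation linked below. It only shows the
general pattern of a struct templated on a character range, with a feature
flag checked through static if.

```d
import std.range : ElementType, isInputRange, empty, front;
import std.traits : isSomeChar;

// Hypothetical feature flags, for illustration only.
enum Feature : uint
{
    none       = 0,
    namespaces = 1 << 0,
    fragments  = 1 << 1,
}

// Hypothetical skeleton: accepts any input range of characters and
// takes its feature set as a compile-time template parameter.
struct SketchTokenizer(R, uint features = Feature.none)
    if (isInputRange!R && isSomeChar!(ElementType!R))
{
    R input;

    // Evaluated at compile time, so only one branch below is compiled in.
    enum bool allowFragments = (features & Feature.fragments) != 0;

    // Toy check: a full document must start with '<',
    // while a fragment may also start with plain text.
    bool acceptsStart()
    {
        static if (allowFragments)
            return true;
        else
            return !input.empty && input.front == '<';
    }
}

unittest
{
    auto doc  = SketchTokenizer!string("text<a/>");
    auto frag = SketchTokenizer!(string, Feature.fragments)("text<a/>");
    auto wide = SketchTokenizer!wstring("<a/>"w);

    assert(!doc.acceptsStart());  // document mode rejects leading text
    assert(frag.acceptsStart());  // fragment mode accepts it
    assert(wide.acceptsStart());  // wstring works through the same template
}
```

The point of the pattern is that a disabled feature costs nothing at run time,
at the price of one template instantiation per feature combination.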

I liked the experience of programming in D2 and would like to contribute to
making the language and its library better. If you find the XML tokenizer
useful and well written, I would eventually like to replace the current
std.xml implementation with it.

There is probably still a lot of work to be done on the tokenizer before it
can be included in Phobos, but the good news is that I have the time to improve
it (around 5 hours each Saturday, which is about the time I spent writing it
over the last 5 months). So I would like to do a step-by-step review of the
code to find out what needs to be changed for it to be included in Phobos.

I think we should simply start by reviewing the features and API. The basic
questions:
-	Does it make the easy things easy and the difficult ones possible?
-	Are the different features useful? Are there too many of them? Is it
missing some?
-	Are the function/field/enum names easily understandable? Do they follow
the Phobos naming convention?

I will try to ask for more reviews each week, depending on the changes
requested. Other areas that will need to be reviewed later: helper code that
can be merged into Phobos, conformance, performance, code structure, testing of
arbitrary character support, testing of UTF support for non-English languages,
etc.

You can read the documentation: http://www.ericdesbiens.com/d/xmltokenizer.html
You can access the code on github: http://github.com/olace/experimental

Hoping to become a regular contributor
Eric Desbiens
Nov 28 2010