
digitalmars.D - std.xml and Adam D Ruppe's dom module

reply Alvaro <alvaroDotSegura gmail.com> writes:
The current std.xml needs a replacement (I think it has already been 
decided so), something with a familiar DOM API with facilities to 
manipulate XML documents.

I needed to do a quick XML file transform and std.xml was of little use. 
Then I found this Adam D. Ruppe's library:

https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff

Its "dom" module has that sort of JavaScript-like DOM manipulation code. 
It has getElementsByTagName(), getElementById(), firstChild, nodeValue, 
innerText (read/write), toString, etc. Easy to use and performant. The 
XML processing I needed to do took minutes with it!
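
To give a feel for that style, here is a minimal sketch (assumptions: that the module is importable as arsd.dom and that a Document can be constructed directly from an XML string; the method names themselves are the ones listed above):

```d
// Sketch only - the import path and Document constructor are assumptions.
import arsd.dom;

void main() {
    auto document = new Document("<root><item id=\"x\">hi</item></root>");

    auto item = document.getElementById("x"); // lookup by id, like in a browser
    item.innerText = "hello";                 // innerText is read/write

    foreach (e; document.getElementsByTagName("item")) {
        // each e is an Element; firstChild, nodeValue, etc. hang off it
    }

    auto xml = document.toString();           // serialize the whole tree back out
}
```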

I thus suggest, if licensing allows (?) and no better candidate exists, 
basing a newer std.xml module on that code as a starting point. Well, 
after cleaning up and fixing anything necessary. For instance, that 
module was designed for web server programming and mainly targets the 
HTML DOM. It has code to deal with CSS styles and HTML-specific elements 
and their special handling (table, a, form, audio, input, ...). A lot of 
that can be left out for XML.
Feb 06 2012
next sibling parent reply Jonathan M Davis <jmdavisProg gmx.com> writes:
On Tuesday, February 07, 2012 00:15:37 Alvaro wrote:
 The current std.xml needs a replacement (I think it has already been
 decided so), something with a familiar DOM API with facilities to
 manipulate XML documents.

 I needed to do a quick XML file transform and std.xml was of little use.
 Then I found this Adam D. Ruppe's library:

 https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff

 Its "dom" module has that sort of JavaScript-like DOM manipulation code.
 It has getElementsByTagName(), getElementById(), firstChild, nodeValue,
 innerText (read/write), toString, etc. Easy to use and performant. The
 XML processing I needed to do took minutes with it!

 I thus suggest, if licensing allows (?) and no better candidate exists,
 basing a newer std.xml module on that code as a starting point. Well,
 after cleaning up and fixing anything necessary. For instance, that
 module was designed for web server programming and mainly targets the
 HTML DOM. It has code to deal with CSS styles and HTML-specific elements
 and their special handling (table, a, form, audio, input, ...). A lot of
 that can be left out for XML.
Tomek Sowiński was working on a replacement which was intended to become the new std.xml, but he hasn't posted since June, so I have no idea where that stands. And I believe that xmlp is intended as a possible candidate as well.

The main issue is that we need someone to design it, implement it, and push it through the review process. Just suggesting it is not enough. Someone needs to champion the new module. And no one has gone all the way with that yet.

Also, two of the major requirements for an improved std.xml are that it needs to have a range-based API, and it needs to be fast. I don't know how Adam's stacks up against that. Tango's XML parser has pretty much set the bar on speed ( http://dotnot.org/blog/archives/2008/03/12/why-is-dtango-so-fast-at-parsing-xml/ ), and while we may not reach that (especially with the initial implementation), we want a design which is going to be fast and has the potential of reaching Tango-like speeds (whether that's currently possible or not with a range-based interface is probably highly dependent on dmd's current optimization capabilities - inlining in particular).

So, if Adam wants to work on getting his XML module into Phobos (which I question, since if he really wanted to, I would have expected him to do it already), or if someone else wants to work on it (and the license allows it), then it may be possible to get it into Phobos. But someone needs to put all of that time and effort in.

- Jonathan M Davis
Feb 06 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-02-08 02:44, Jonathan M Davis wrote:
 On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
 On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis

 wrote:
 Also, two of the major requirements for an improved std.xml are
 that it needs to have a range-based API, and it needs to be
 fast.

What does range based API mean in this context? I do offer a couple of ranges over the tree, but it really isn't the main thing there. Check out Element.tree() for the main one.

But, if you mean taking a range for input, no, it doesn't do that. I've been thinking about rewriting the parse function (if you look at it, you'll probably hate it too!). But what I have works and is tested on a variety of input, including garbage that was a pain to get working right, so I'm in no rush to change it.
 Tango's XML parser has pretty much set the bar on speed

Yeah, I'm pretty sure Tango whips me hard on speed. I spent some time in the profiler a month or two ago and got a significant speedup on the datasets I use (HTML files), but I'm sure there's a whole lot more that could be done. The biggest thing is that I don't think you could use my parse function as a stream.

Ideally, std.xml would operate on ranges of dchar (but obviously be optimized for strings, since there are lots of optimizations that can be done with string processing - at least as far as Unicode goes), and it would return a range of some kind. The result would probably be a document type of some kind which provided a range of its top-level nodes (or maybe just the root node), each of which then provided ranges over its sub-nodes, etc. At least, that's the kind of thing that I would expect.

Other calls on the document and nodes may not be range-based at all (e.g. XPath should probably be supported, and that doesn't necessarily involve ranges). The best way to handle it all would probably depend on the implementation. I haven't implemented a full-blown XML parser, so I don't know what the best way to go about it would be, but ideally, you'd be able to process the nodes as a range.

- Jonathan M Davis

I think there should be a pull or SAX parser at the lowest level, and then an XML document module on top of that parser.

--
/Jacob Carlborg
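
That layering can be sketched in a few lines: a lazy range of tokens at the bottom, which a DOM builder or SAX callbacks could then consume. Everything below is illustrative only (no attributes, entities, or error handling), not code from any actual proposal:

```d
// Toy pull layer: a lazy input range of tokens over the raw input.
import std.string : indexOf;

enum TokenKind { open, close, text }

struct Token { TokenKind kind; string data; }

struct XmlTokens {
    private string input;
    private Token front_;
    private bool empty_;

    this(string s) { input = s; popFront(); }

    @property bool empty() const { return empty_; }
    @property Token front() const { return front_; }

    void popFront() {
        if (input.length == 0) { empty_ = true; return; }
        if (input[0] == '<') {
            auto end = input.indexOf('>');      // toy: assumes well-formed input
            auto inside = input[1 .. end];
            input = input[end + 1 .. $];
            front_ = (inside.length && inside[0] == '/')
                ? Token(TokenKind.close, inside[1 .. $])
                : Token(TokenKind.open, inside);
        } else {
            auto end = input.indexOf('<');
            if (end == -1) end = cast(ptrdiff_t) input.length;
            front_ = Token(TokenKind.text, input[0 .. end]);
            input = input[end .. $];
        }
    }
}

void main() {
    import std.algorithm.comparison : equal;

    // A DOM layer would consume this token range to build its tree.
    auto toks = XmlTokens("<a>hi</a>");
    assert(equal(toks, [
        Token(TokenKind.open, "a"),
        Token(TokenKind.text, "hi"),
        Token(TokenKind.close, "a"),
    ]));
}
```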
Feb 07 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis 
wrote:
 Also, two of the major requirements for an improved std.xml are 
 that it needs to have a range-based API, and it needs to be 
 fast.

What does range based API mean in this context? I do offer a couple ranges over the tree, but it really isn't the main thing there. Check out Element.tree() for the main one. But, if you mean taking a range for input, no, doesn't do that. I've been thinking about rewriting the parse function (if you look at it, you'll probably hate it too!). But, what I have works and is tested on a variety of input, including garbage that was a pain to get working right, so I'm in no rush to change it.
 Tango's XML parser has pretty much set the bar on speed

Yeah, I'm pretty sure Tango whips me hard on speed. I spent some time in the profiler a month or two ago and got a significant speedup over the datasets I use (html files), but I'm sure there's a whole lot more that could be done. The biggest thing is I don't think you could use my parse function as a stream.
Feb 06 2012
prev sibling next sibling parent reply "Adam D. Ruppe" <destructionator gmail.com> writes:
On Monday, 6 February 2012 at 23:15:50 UTC, Alvaro wrote:
 Its "dom" module has that sort of Javascript-like DOM 
 manipulation code

I'm biased (obviously), but I've never seen a more convenient way to work with xml. I like my convenience functions a lot.
 I thus suggest, if licensing allows (?

It is Boost licensed, so you can use it (I put the license at the bottom of my files). I don't know if it is phobos material, but if there's demand, maybe I can make that happen.
 A lot of that can be left out in XML.

Right, though you can ignore it too. I sometimes use it to work with other kinds of XML (web APIs, sending and receiving), RSS, and others. It works well enough for me.
Feb 06 2012
parent reply bls <bizprac orange.fr> writes:
As you know, documentation is a weak point of the web stuff. It looks 
very interesting, so a real-world sample app would be nice.

I would like to see a sample RIA, MVC-wise, using (say) dojo dijit as 
the view layer in conjunction with the D web stuff (model/controller).

I think at the moment your library is not made for the masses, but it 
would nevertheless be interesting to see how someone can glue the 
backend (web stuff) and the frontend (dojo/dijit) together.
Thanks for reading.
Feb 07 2012
next sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-02-08 02:33, James Miller wrote:
 As somebody that frequently laments the lack of documentation in
 general (I use Silverstripe at work, in which the documentation is
 patchy at best) I work hard on my documentation.

 Adam's stuff is very good, I plan to take a look at it and "borrow"
 some code for a web framework I'm working on. But I am also
 open-sourcing modules that I am using as I go along. It would be cool
 if we ended up with a set of independent modules that worked well
 together but had few dependencies.

 I guess the point is whether Adam is ok with the community extending
 his work separately, since
 "misc-stuff-including-D-programming-language-web-stuff" isn't exactly
 a catchy name :P.

 It would be unfortunate if Adam's work wasn't built upon, and that
 work was needlessly replicated, then tightly integrated into some
 other project, rather than being something that everybody could use.

Maybe Adam's code can be used as a base for implementing a library like Rack in D.

http://rack.rubyforge.org/

--
/Jacob Carlborg
Feb 07 2012
parent reply Jacob Carlborg <doob me.com> writes:
On 2012-02-08 15:51, Adam D. Ruppe wrote:
 On Wednesday, 8 February 2012 at 07:37:23 UTC, Jacob Carlborg wrote:
 Maybe Adam's code can be used as a base of implementing a library like
 Rack in D.

 http://rack.rubyforge.org/

That looks like it does the same job as cgi.d. cgi.d actually offers a uniform interface across various web servers and integration methods. If you always talk through the Cgi class, and use the GenericMain mixin, you can run the same program with:

1) CGI, tested on Apache and IIS (including implementations for methods that don't work natively on one or the other)
2) FastCGI (using the C library)
3) HTTP itself (something I expanded this last weekend and still want to make better)

Sometimes I think I should rename it to reflect this, but meh, misc-stuff-including blah blah shows how good I am at names!
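
As a sketch of what that uniform interface looks like in practice (the handler shape follows cgi.d's GenericMain convention as described here; treat the details as approximate):

```d
// Sketch: one handler, deployable as plain CGI, FastCGI, or an embedded
// HTTP server - the mode is chosen at build time, not in this code.
import arsd.cgi;

void handler(Cgi cgi) {
    // The Cgi class abstracts over server specifics; members like
    // requestUri are normalized instead of raw environment variables.
    cgi.write("Hello from the same code under CGI, FastCGI, or HTTP!");
}

mixin GenericMain!handler;
```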

It seems Rack supports additional interfaces besides CGI. But I think we could take this one step further. I'm not entirely sure what APIs Rack provides, but Rails has a couple of methods to normalize the environment variables. For example, ENV["REQUEST_URI"] returns different values on different servers. Rails provides a method, "request_uri", on the request object that returns the same value on all servers.

I don't know if cgi.d already has support for something similar.

--
/Jacob Carlborg
Feb 09 2012
next sibling parent Jacob Carlborg <doob me.com> writes:
On 2012-02-09 15:56, Adam D. Ruppe wrote:
 On Thursday, 9 February 2012 at 08:26:25 UTC, Jacob Carlborg wrote:
 For example, ENV["REQUEST_URI"] returns differently on different
 servers. Rails provides a method, "request_uri" on the request object
 that will return the same value on all different servers.

 I don't know if CGI already has support for something similar.

Yeah, in cgi.d, you use Cgi.requestUri, which is an immutable string, instead of using the environment variable directly.

    requestUri = getenv("REQUEST_URI");

    // Because IIS doesn't pass requestUri, we simulate it here if it's empty.
    if(requestUri.length == 0) {
        // IIS sometimes includes the script name as part of the path info -
        // we don't want that
        if(pathInfo.length >= scriptName.length &&
           (pathInfo[0 .. scriptName.length] == scriptName))
            pathInfo = pathInfo[scriptName.length .. $];

        requestUri = scriptName ~ pathInfo ~
            (queryString.length ? ("?" ~ queryString) : "");

        // FIXME: this works for apache and iis... but what about others?
    }

That's in the cgi constructor. Somewhat ugly code, but I figure it's better to have ugly code in the library than incompatibilities in the user program! The http constructor creates these variables from the raw headers.

Here's the ddoc: http://arsdnet.net/web.d/cgi.html

If you search for "requestHeaders", you'll see all the stuff following. If you use those class members instead of direct environment variables, you'll get max compatibility.

Cool, you already thought of all of this it seems. -- /Jacob Carlborg
Feb 09 2012
prev sibling parent Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
On 2/9/12 6:56 AM, Adam D. Ruppe wrote:
 Here's the ddoc:
 http://arsdnet.net/web.d/cgi.html

Cue the choir: "Please submit to Phobos". Andrei
Feb 09 2012
prev sibling parent reply Jacob Carlborg <doob me.com> writes:
On 2012-02-08 03:29, Adam D. Ruppe wrote:
 Here's more ddocs.

 http://arsdnet.net/web.d/web.html
 http://arsdnet.net/web.d/dom.html


 Not terribly useful, I'll admit. The Javascript
 discussion at the bottom of the first link might be
 good to read though.

 The dom.html there is mostly just (incomplete) method
 listing. I didn't write most of it up at all.

 When I started doing dom.d, I was going to be strictly
 a clone of the browser implementation, so some of the
 comments still say things like "extension", but I went
 my own direction a long time ago.

I think a much better API than the one browsers provide can be created for operating on the DOM, especially in D.

--
/Jacob Carlborg
Feb 07 2012
parent Jacob Carlborg <doob me.com> writes:
On 2012-02-08 13:11, Jose Armando Garcia wrote:
 On Wed, Feb 8, 2012 at 5:41 AM, Jacob Carlborg<doob me.com>  wrote:
 On 2012-02-08 03:29, Adam D. Ruppe wrote:
 Here's more ddocs.

 http://arsdnet.net/web.d/web.html
 http://arsdnet.net/web.d/dom.html


 Not terribly useful, I'll admit. The Javascript
 discussion at the bottom of the first link might be
 good to read though.

 The dom.html there is mostly just (incomplete) method
 listing. I didn't write most of it up at all.

 When I started doing dom.d, I was going to be strictly
 a clone of the browser implementation, so some of the
 comments still say things like "extension", but I went
 my own direction a long time ago.

I think a much better API than the one browsers provide can be created for operating on the DOM, especially in D.

I know very little about HTML programming, but Dart did just that. It is my understanding that they abandoned JS's DOM and created their own API: http://api.dartlang.org/html.html

It seems so. I haven't looked over the docs but it's good that someone is at least trying. -- /Jacob Carlborg
Feb 08 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 7 February 2012 at 19:27:46 UTC, bls wrote:
 You know it, web stuff documentation is a weak point.

Yeah, I know.
 looks very interesting ...  so a real world sample app would be 
 nice..

The closest I have is a little blog-like thing that I started and haven't worked on since.

http://arsdnet.net/blog/my-source
 I would like to see a sample RIA -  M- V-C  wise, using (say 
 using dojo dijit as View layer ) in conjunction with the D web 
 stuff . (Model- Controler)

I haven't used one of those toolkits, but plain javascript is easy to do like this. I'm writing a little browser game on the weekends. When I finish it, I might be able to show it to you. (I'll have to replace the copyrighted images in there right now though.)

Basically you just call the server functions in your event handlers. Want to replace a div with content from the other side?

D
===
import arsd.web;

class Server : ApiProvider {
    string hello(string name) { return "hello " ~ name; }
}

mixin FancyMain!Server;
===

HTML/Javascript
===
<script src="app/functions.js"></script>
<div id="holder"></div>
<button onclick="Server.hello(prompt('Name?')).useToReplace('holder');">Click me</button>
===

And when you click that button, the string D returns will be put inside as text. That's the basics of it.

I'm taking this to an extreme with this: http://arsdnet.net:8080/

(that's my custom D web server, so if it doesn't respond, I probably closed the program. Will leave it open for a while though.)

Click on the link and the body. You should get some text added. Take a look at the javascript though: http://arsdnet.net:8080/sse.js

===
document.body.addEventListener("click", function(event) {
    var response = Test.cancelableEvent(
        event.type,
        event.target.getAttribute("data-node-index")
    ).getSync();
    [snip]
===

It uses synchronous ajax on the click handler to forward the event to the server, which then passes it through the server side dom.

D:
===
/* the library implements EventResponse cancelableEvent(...),
   which dispatches the event on the server, so you can write: */

// body is a keyword in D, hence mainBody instead.
document.mainBody.addEventListener("click", (Event ev) {
    ev.target.appendText("got click!");
    ev.preventDefault();
});
===

sync ajax requests suck because they block the whole web app waiting for a response. But writing server side event handlers is a fun toy.

I do write real applications with web.d, but they are almost all proprietary, so little toys are all I really have to show right now.
Feb 07 2012
prev sibling next sibling parent James Miller <james aatch.net> writes:
 On Tuesday, 7 February 2012 at 19:27:46 UTC, bls wrote:
  You know it, web stuff documentation is a weak point.

 Yeah, I know.

 [snip]

 I do write real applications with web.d, but they are almost
 all proprietary, so little toys is all I really have to show
 right now.

As somebody that frequently laments the lack of documentation in general (I use SilverStripe at work, in which the documentation is patchy at best), I work hard on my documentation.

Adam's stuff is very good. I plan to take a look at it and "borrow" some code for a web framework I'm working on. But I am also open-sourcing modules that I am using as I go along. It would be cool if we ended up with a set of independent modules that worked well together but had few dependencies.

I guess the point is whether Adam is ok with the community extending his work separately, since "misc-stuff-including-D-programming-language-web-stuff" isn't exactly a catchy name :P.

It would be unfortunate if Adam's work wasn't built upon, and that work was needlessly replicated, then tightly integrated into some other project, rather than being something that everybody could use.
Feb 07 2012
prev sibling next sibling parent "Jonathan M Davis" <jmdavisProg gmx.com> writes:
On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
 On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis
 
 wrote:
 Also, two of the major requirements for an improved std.xml are
 that it needs to have a range-based API, and it needs to be
 fast.

What does range based API mean in this context? I do offer a couple of ranges over the tree, but it really isn't the main thing there. Check out Element.tree() for the main one.

But, if you mean taking a range for input, no, it doesn't do that. I've been thinking about rewriting the parse function (if you look at it, you'll probably hate it too!). But what I have works and is tested on a variety of input, including garbage that was a pain to get working right, so I'm in no rush to change it.
 Tango's XML parser has pretty much set the bar on speed

Yeah, I'm pretty sure Tango whips me hard on speed. I spent some time in the profiler a month or two ago and got a significant speedup on the datasets I use (HTML files), but I'm sure there's a whole lot more that could be done. The biggest thing is that I don't think you could use my parse function as a stream.

Ideally, std.xml would operate on ranges of dchar (but obviously be optimized for strings, since there are lots of optimizations that can be done with string processing - at least as far as Unicode goes), and it would return a range of some kind. The result would probably be a document type of some kind which provided a range of its top-level nodes (or maybe just the root node), each of which then provided ranges over its sub-nodes, etc. At least, that's the kind of thing that I would expect.

Other calls on the document and nodes may not be range-based at all (e.g. XPath should probably be supported, and that doesn't necessarily involve ranges). The best way to handle it all would probably depend on the implementation. I haven't implemented a full-blown XML parser, so I don't know what the best way to go about it would be, but ideally, you'd be able to process the nodes as a range.

- Jonathan M Davis
Feb 07 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 8 February 2012 at 01:33:51 UTC, James Miller wrote:
 As somebody that frequently laments the lack of documentation

I'm a bit of a hypocrite here; I'll complain until the cows come home about /other/ people's crappy documentation... then turn around and do a bad job at it myself.

I tried to do cgi.d:

overview: http://arsdnet.net/web.d/cgi.d.html
ddoc: http://arsdnet.net/web.d/cgi.html

But I still have a lot to do on web.d documentation. The best source there is probably random newsgroup posts (better than the comments in the code!), and I've changed things more than once too.
 Adam's stuff is very good

Thanks!
 It would be cool if we ended up with a set of independent
 modules that worked well together but had few dependencies.

Yeah, I think that would be great. If you want to modify my files, you can always fork it, but the best way is probably to do a pull request on GitHub. I try to reply to those quickly, and we can try to keep one copy there to avoid duplication of effort.

Then, add-on modules outside the scope of the code can be done by anyone. I'd put links to stuff in the documentation, so it is easy to find extended functionality.
Feb 07 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
Here's more ddocs.

http://arsdnet.net/web.d/web.html
http://arsdnet.net/web.d/dom.html


Not terribly useful, I'll admit. The Javascript
discussion at the bottom of the first link might be
good to read though.

The dom.html there is mostly just (incomplete) method
listing. I didn't write most of it up at all.

When I started doing dom.d, I was going to be strictly
a clone of the browser implementation, so some of the
comments still say things like "extension", but I went
my own direction a long time ago.

I don't know... I think most of the dom is self-explanatory
if you're already experienced with JavaScript anyway, though.


BTW, here's something I wrote up on another site as an
example of how awesome D is:




examples:
[code="D ROX"]

// don't need a document to create elements
auto div = Element.make("div", "Hello <there>"); // convenience params do innerText
assert(div.innerText == "Hello <there>"); // text set
assert(div.innerHTML == "Hello &lt;there&gt;"); // html properly encoded mindlessly

// getting and setting attributes works w/ property syntax
// of course they are always properly encoded for charset and html stuffs
div.id = "whoa";
assert(div.id == "whoa");
div.customAttribute = div.id ~ "works";
assert(div.customAttribute == "whoaworks");

// i also support the dataset sugar over attributes

div.dataset.myData = "cool";
assert(div.getAttribute("data-my-data") == "cool");
assert(div.dataset.myData == "cool");

// as well as with the style..

div.style = "top: 10px;"; // works with strings just like any other attribute

div.style.top = "15px"; // or with property syntax like in javascript

assert(div.style.top == "15px"); // reading works no matter how you set it
assert(div.style == "top: 15px;"); // or you can read it as a string


// i have convenience methods too

div.innerText = "sweet"; // worx, w/ proper encoding

// calls Element.make and appendChild in one go.
// can easily set text and/or a tag specific attribute
auto a = div.addChild("a", "Click me!", "link.html"); // tagName, value, value2, dependent on tag

a.addClass("cool").addClass("beans"); // convenience methods (chainable btw) for the class attribute
assert(a.hasClass("cool") && a.hasClass("beans") && !a.hasClass("coolbeans"));
assert(a.className == "cool beans");

// subclasses rock too, especially with automatic checked casts
Link link = div.requireSelector!Link("a"); // css selector syntax supported
// alternatively:
link = cast(Link) a; // but if a isn't actually a Link, this can be null

// easy working with the link url
link.setValue("cool", "param");
assert(link.href == "link.html?cool=param");

// arsd.html also includes functions to convert links into POST forms btw

// the Form subclass rox too

auto form = cast(Form) Element.make("form");
form.method = "POST";
// convenience functions from the browsers are here but who needs them when the dom rox?
form.innerHTML = "<select name=\"cool\"><option>Whoa</option><option>WTF</option></select>";

// similarly to the Link class, you can set values with ease
form.setValue("cool", "WTF"); // even works on non-text form elements
form.setValue("sweet", "whoa"); // can also implicitly create a hidden element to carry a value (can lead to mistakes but so useful)


// and the Table class is pretty sweet

auto table = cast(Table) Element.make("table");
table.caption = "Cool"; // sets the caption element, not an attribute

// variadic templates rok
table.appendRow("sweet", "whoa", 10, Element.make("a")); // adds to the <tbody> a <tr> with appropriate children
[/code]


some people luv jquery and think it is the best thing ever

those people have never used my dom library


Speaking of jquery, what about collections of elements?

Well D has this thing called foreach which does that. But, just
to prove I could, I wrote a couple dozen lines of D to do this:

==
document["p a"].addClass("mutated").appendText("all links in 
paragraphs get this text and that class!");
==



Operator overloading template craziness!

But i'm pretty meh on that since foreach is better anyway.



foreach rox

d rox
Feb 07 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Tue, 07 Feb 2012 20:44:08 -0500
schrieb "Jonathan M Davis" <jmdavisProg gmx.com>:

 On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
 On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis
 
 wrote:
 Also, two of the major requirements for an improved std.xml are
 that it needs to have a range-based API, and it needs to be
 fast.

What does range based API mean in this context? I do offer a couple of ranges over the tree, but it really isn't the main thing there. Check out Element.tree() for the main one.

But, if you mean taking a range for input, no, it doesn't do that. I've been thinking about rewriting the parse function (if you look at it, you'll probably hate it too!). But what I have works and is tested on a variety of input, including garbage that was a pain to get working right, so I'm in no rush to change it.
 Tango's XML parser has pretty much set the bar on speed

Yeah, I'm pretty sure Tango whips me hard on speed. I spent some time in the profiler a month or two ago and got a significant speedup on the datasets I use (HTML files), but I'm sure there's a whole lot more that could be done. The biggest thing is that I don't think you could use my parse function as a stream.

Ideally, std.xml would operate on ranges of dchar (but obviously be optimized for strings, since there are lots of optimizations that can be done with string processing - at least as far as Unicode goes), and it would return a range of some kind. The result would probably be a document type of some kind which provided a range of its top-level nodes (or maybe just the root node), each of which then provided ranges over its sub-nodes, etc. At least, that's the kind of thing that I would expect.

Other calls on the document and nodes may not be range-based at all (e.g. XPath should probably be supported, and that doesn't necessarily involve ranges). The best way to handle it all would probably depend on the implementation. I haven't implemented a full-blown XML parser, so I don't know what the best way to go about it would be, but ideally, you'd be able to process the nodes as a range.

- Jonathan M Davis

Using ranges of dchar directly can be horribly inefficient in some cases; you'll need at least some kind of buffered dchar range. Some std.json replacement code tried to use only dchar ranges and had to reassemble strings character by character using Appender. That sucks especially if you're only interested in a small part of the data and don't care about the rest.

So for pull/SAX parsers: use buffering, and return strings (better: w/d/char[]) as slices into that buffer. If the user needs to keep a string, he can still copy it. (String decoding should also be done on-demand only.)
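
A tiny illustration of that slicing idea (a hypothetical helper, not code from any actual parser): the returned name is a view into the caller's buffer, so nothing is copied or decoded until the user asks for it.

```d
// Toy example: return a slice into the buffer rather than building a
// new string with Appender. The caller copies (.idup) only if needed.
const(char)[] readName(const(char)[] buf, ref size_t pos) {
    auto start = pos;
    while (pos < buf.length && buf[pos] != ' ' && buf[pos] != '>')
        pos++;
    return buf[start .. pos]; // zero-copy view into buf
}

void main() {
    string buf = "<root attr=\"1\">";
    size_t pos = 1;                  // just past '<'
    auto name = readName(buf, pos);
    assert(name == "root");
    assert(name.ptr is buf.ptr + 1); // same memory, no allocation
}
```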
Feb 08 2012
prev sibling next sibling parent Jonathan M Davis <jmdavisProg gmx.com> writes:
On Wednesday, February 08, 2012 09:12:57 Johannes Pfau wrote:
 Using ranges of dchar directly can be horribly inefficient in some
 cases, you'll need at least some kind off buffered dchar range. Some
 std.json replacement code tried to use only dchar ranges and had to
 reassemble strings character by character using Appender. That sucks
 especially if you're only interested in a small part of the data and
 don't care about the rest.
 So for pull/sax parsers: Use buffering, return strings(better:
 w/d/char[]) as slices to that buffer. If the user needs to keep a
 string, he can still copy it. (String decoding should also be done
 on-demand only).

That's why you accept ranges of dchar but specialize the code for strings. Then you can use any dchar range with it that you want but can get the extra efficiency of using strings if you want to do that. - Jonathan M Davis
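The specialize-for-strings approach can be sketched with a `static if` on the input type; `skipWs` here is a hypothetical stand-in for real parser internals:

```d
import std.range.primitives : empty, front, popFront;
import std.traits : isSomeString;

// Sketch of "accept any range of dchar, specialize for strings".
R skipWs(R)(R input)
{
    static if (isSomeString!R)
    {
        // string fast path: index by code unit, no UTF decoding needed
        size_t i = 0;
        while (i < input.length && (input[i] == ' ' || input[i] == '\t'))
            ++i;
        return input[i .. $];
    }
    else
    {
        // generic path: works with any input range of dchar
        while (!input.empty && (input.front == ' ' || input.front == '\t'))
            input.popFront();
        return input;
    }
}

unittest
{
    assert(skipWs("  <root/>") == "<root/>");
    assert(skipWs("\t x") == "x");
}
```

The caller sees one function; strings silently take the slicing branch while every other dchar range takes the generic one.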
Feb 08 2012
prev sibling next sibling parent Jose Armando Garcia <jsancio gmail.com> writes:
On Wed, Feb 8, 2012 at 5:41 AM, Jacob Carlborg <doob me.com> wrote:
 On 2012-02-08 03:29, Adam D. Ruppe wrote:
 Here's more ddocs.

 http://arsdnet.net/web.d/web.html
 http://arsdnet.net/web.d/dom.html


 Not terribly useful, I'll admit. The Javascript
 discussion at the bottom of the first link might be
 good to read though.

 The dom.html there is mostly just (incomplete) method
 listing. I didn't write most of it up at all.

 When I started doing dom.d, I was going to be strictly
 a clone of the browser implementation, so some of the
 comments still say things like "extension", but I went
 my own direction a long time ago.

I think a much better API, than the one browsers provide, can be created for operating on the DOM, especially in D.

I know very little about html programming but dart did just that. It is my understanding that they abandoned JS's DOM and created their own API: http://api.dartlang.org/html.html
 --
 /Jacob Carlborg

Feb 08 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 8 February 2012 at 07:37:23 UTC, Jacob Carlborg 
wrote:
 Maybe Adam's code can be used as a base of implementing a 
 library like Rack in D.

 http://rack.rubyforge.org/

That looks like it does the same job as cgi.d. cgi.d actually offers a uniform interface across various web servers and integration methods. If you always talk through the Cgi class, and use the GenericMain mixin, you can run the same program with:

1) cgi, tested on Apache and IIS (including implementations for methods that don't work on one or the other natively)
2) fast cgi (using the C library)
3) HTTP itself (something I expanded this last weekend and still want to make better)

Sometimes I think I should rename it, to reflect this, but meh, misc-stuff-including blah blah shows how good I am at names!
Feb 08 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 8 February 2012 at 07:41:54 UTC, Jacob Carlborg 
wrote:
 I think a much better API, than the one browsers provide, can 
 be created for operating on the DOM, especially in D.

I'd say I've proven that! dom.d is very, very nice IMO.
Feb 08 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 8 February 2012 at 12:11:40 UTC, Jose Armando 
Garcia wrote:
 is my understanding that they abandon JS's DOM and created 
 their own API: http://api.dartlang.org/html.html

That actually looks very similar to what the browsers do now, which is a good thing to me - you don't have to learn new stuff to sit down and get started. But, looking through the Element interface: http://api.dartlang.org/html/Element.html you can see it is very familiar to the regular browser API. It offers some of the IE extensions (which rock btw; I implemented them too. outerHTML, innerText, etc.) It doesn't actually go as far as I do in expanding the api though.
Feb 08 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Wednesday, 8 February 2012 at 08:12:57 UTC, Johannes Pfau 
wrote:
 Use buffering, return strings(better:
 w/d/char[]) as slices to that buffer. If the user needs to keep 
 a string, he can still copy it. (String decoding should also be 
 done on-demand only).

The way Document.parse works now in my code is with slices. I think the best way to speed mine up is to untangle the mess of recursive nested functions.

Last time I attacked dom.d with the profiler, I found a lot of time was spent on string decoding, which looked like this:

foreach(c; str) {
    if(isEntity)
        value ~= decoded(value);
    else
        value ~= c;
}

basically. This reallocation was slow... but I got a huge speedup, not by skipping decoding, but by scanning it first:

bool decode = false;
foreach(c; str) {
    if(c == '&') {
        decode = true;
        break;
    }
}

if(!decode)
    return str;

// still uses the old decoder, which is the fastest I could find;
// ~= actually did better than appender in my tests!

But, quickly scanning the string and skipping the decode loop if there are no entities about IIRC tripled the parse speed. Right now, if I comment the decode call out entirely, there's very little difference in speed on the data I've tried, so I think decoding like this works well.
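A self-contained version of that scan-first trick might look like the sketch below; it decodes only &amp;amp; to keep things short (the real decoder handles the full entity table), and the function name is made up for illustration:

```d
import std.string : indexOf;

// Scan-first entity decoding: if the text contains no '&', return the
// input slice unchanged and skip the allocating decode loop entirely.
string decodeHtmlText(string str)
{
    // fast path: no '&' means no entities, nothing to allocate
    if (str.indexOf('&') == -1)
        return str;

    // slow path: rebuild the string, expanding entities
    string value;
    for (size_t i = 0; i < str.length; ++i)
    {
        if (str[i] == '&' && str.length - i >= 5 && str[i .. i + 5] == "&amp;")
        {
            value ~= '&';
            i += 4; // skip past "amp;"
        }
        else
            value ~= str[i];
    }
    return value;
}

unittest
{
    assert(decodeHtmlText("no entities here") == "no entities here");
    assert(decodeHtmlText("a &amp; b") == "a & b");
}
```

Since most text nodes in typical documents contain no entities, the fast path dominates, which matches the reported speedup.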
Feb 08 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Wed, 08 Feb 2012 00:29:55 -0800
schrieb Jonathan M Davis <jmdavisProg gmx.com>:

 On Wednesday, February 08, 2012 09:12:57 Johannes Pfau wrote:
 Using ranges of dchar directly can be horribly inefficient in some
 cases, you'll need at least some kind off buffered dchar range. Some
 std.json replacement code tried to use only dchar ranges and had to
 reassemble strings character by character using Appender. That sucks
 especially if you're only interested in a small part of the data and
 don't care about the rest.
 So for pull/sax parsers: Use buffering, return strings(better:
 w/d/char[]) as slices to that buffer. If the user needs to keep a
 string, he can still copy it. (String decoding should also be done
 on-demand only).

That's why you accept ranges of dchar but specialize the code for strings. Then you can use any dchar range with it that you want but can get the extra efficiency of using strings if you want to do that. - Jonathan M Davis

But specializing for strings is not enough: you could stream XML over the network and want to parse it on the fly (think of XMPP/Jabber), or you could read huge XML files which you do not want to load completely into RAM. Data is read into buffers anyway, so the parser should be able to deal with that. (Although a buffer of w/d/chars could be considered to be a string, the parser would then need to handle incomplete input.)
Feb 08 2012
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Wed, 08 Feb 2012 02:12:57 -0600, Johannes Pfau <nospam example.com> wrote:
 Am Tue, 07 Feb 2012 20:44:08 -0500
 schrieb "Jonathan M Davis" <jmdavisProg gmx.com>:
 On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
 On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis



 Using ranges of dchar directly can be horribly inefficient in some
 cases, you'll need at least some kind off buffered dchar range. Some
 std.json replacement code tried to use only dchar ranges and had to
 reassemble strings character by character using Appender. That sucks
 especially if you're only interested in a small part of the data and
 don't care about the rest.
 So for pull/sax parsers: Use buffering, return strings(better:
 w/d/char[]) as slices to that buffer. If the user needs to keep a
 string, he can still copy it. (String decoding should also be done
 on-demand only).

Speaking as the one proposing said Json replacement, I'd like to point out that JSON strings != UTF strings: manual conversion is required some of the time. And I use appender as a dynamic buffer in exactly the manner you suggest. There's even an option to use a string cache to minimize total memory usage. (Hmm... that functionality should probably be re-factored out and made into its own utility) That said, I do end up doing a bunch of useless encodes and decodes, so I'm going to special case those away and add slicing support for strings. wstrings and dstring will still need to be converted as currently Json values only accept strings and therefore also Json tokens only support strings. As a potential user of the sax/pull interface would you prefer the extra clutter of special side channels for zero-copy wstrings and dstrings?
Feb 08 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Wed, 08 Feb 2012 20:49:48 -0600
schrieb "Robert Jacques" <sandford jhu.edu>:

 On Wed, 08 Feb 2012 02:12:57 -0600, Johannes Pfau
 <nospam example.com> wrote:
 Am Tue, 07 Feb 2012 20:44:08 -0500
 schrieb "Jonathan M Davis" <jmdavisProg gmx.com>:
 On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
 On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis



 Using ranges of dchar directly can be horribly inefficient in some
 cases, you'll need at least some kind off buffered dchar range. Some
 std.json replacement code tried to use only dchar ranges and had to
 reassemble strings character by character using Appender. That sucks
 especially if you're only interested in a small part of the data and
 don't care about the rest.
 So for pull/sax parsers: Use buffering, return strings(better:
 w/d/char[]) as slices to that buffer. If the user needs to keep a
 string, he can still copy it. (String decoding should also be done
 on-demand only).

Speaking as the one proposing said Json replacement, I'd like to point out that JSON strings != UTF strings: manual conversion is required some of the time. And I use appender as a dynamic buffer in exactly the manner you suggest. There's even an option to use a string cache to minimize total memory usage. (Hmm... that functionality should probably be re-factored out and made into its own utility) That said, I do end up doing a bunch of useless encodes and decodes, so I'm going to special case those away and add slicing support for strings. wstrings and dstring will still need to be converted as currently Json values only accept strings and therefore also Json tokens only support strings. As a potential user of the sax/pull interface would you prefer the extra clutter of special side channels for zero-copy wstrings and dstrings?

Regarding wstrings and dstrings: well, JSON seems to be UTF-8 in almost all cases, so it's not that important. But I think it should be possible to use templates to implement identical parsers for d/w/strings.

Regarding the use of Appender: long text ahead ;-)

I think pull parsers should really be as fast as possible and low-level. For easy to use highlevel stuff there's always DOM, and a safe, high-level serialization API should be implemented based on the PullParser as well. The serialization API would read only the requested data, skipping the rest:

----------------
struct Data
{
    string link;
}
auto data = unserialize!Data(json);
----------------

So in the PullParser we should avoid memory allocation whenever possible, I think we can even avoid it completely:

I think dchar ranges are just the wrong input type for parsers; parsers should use buffered ranges or streams (which would be basically the same). We could use a generic BufferedRange with real dchar-ranges then. This BufferedRange could use a static buffer, so there's no need to allocate anything.

The pull parser should return slices to the original string (if the input is a string) or slices to the Range/Stream's buffer. Of course, such a slice is only valid till the pull parser is called again. The slice also wouldn't be decoded yet. And a sliced string could only be as long as the buffer, but I don't think this is an issue; a 512KB buffer can already store 524288 characters.

If the user wants to keep a string, he should really do decodeJSONString(data).idup. There's a little more opportunity for optimization: as long as a decoded JSON string is always smaller than the encoded one (I don't know if it is), we could have a decodeJSONString function which overwrites the original buffer --> no memory allocation. If that's not the case, decodeJSONString has to allocate iff the decoded string is different.

So we need a function which always returns the decoded string as a safe-to-keep copy, and a function which returns the decoded string as a slice if the decoded string is the same as the original.

An example:

string json =
{
    "link":"http://www.google.com",
    "useless_data":"lorem ipsum",
    "more":{
        "not interested":"yes"
    }
}

now I'm only interested in the link. It should be possible to parse that with zero memory allocations:

auto parser = Parser(json);
parser.popFront();
while(!parser.empty)
{
    if(parser.front.type == KEY
        && tempDecodeJSON(parser.front.value) == "link")
    {
        parser.popFront();
        assert(!parser.empty && parser.front.type == VALUE);
        return decodeJSON(parser.front.value); //Should return a slice
    }
    //Skip everything else;
    parser.popFront();
}

tempDecodeJSON returns a decoded string, which (usually) isn't safe to store (it can/should be a slice to the internal buffer; here it's a slice to the original string, so it could be stored, but there's no guarantee). In this case, the call to tempDecodeJSON could even be left out, as we only search for "link" which doesn't need encoding.
Feb 09 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Wed, 08 Feb 2012 20:49:48 -0600
schrieb "Robert Jacques" <sandford jhu.edu>:
 
 Speaking as the one proposing said Json replacement, I'd like to
 point out that JSON strings != UTF strings: manual conversion is
 required some of the time. And I use appender as a dynamic buffer in
 exactly the manner you suggest. There's even an option to use a
 string cache to minimize total memory usage. (Hmm... that
 functionality should probably be re-factored out and made into its
 own utility) That said, I do end up doing a bunch of useless encodes
 and decodes, so I'm going to special case those away and add slicing
 support for strings. wstrings and dstring will still need to be
 converted as currently Json values only accept strings and therefore
 also Json tokens only support strings. As a potential user of the
 sax/pull interface would you prefer the extra clutter of special side
 channels for zero-copy wstrings and dstrings?

BTW: Do you know DYAML? https://github.com/kiith-sa/D-YAML I think it has a pretty nice DOM implementation which doesn't require any changes to phobos. As YAML is a superset of JSON, adapting it for std.json shouldn't be too hard. The code is boost licensed and well documented. I think std.json would have better chances of being merged into phobos if it didn't rely on changes to std.variant.
Feb 09 2012
prev sibling next sibling parent "Robert Jacques" <sandford jhu.edu> writes:
On Thu, 09 Feb 2012 05:13:52 -0600, Johannes Pfau <nospam example.com> wrote:
 Am Wed, 08 Feb 2012 20:49:48 -0600
 schrieb "Robert Jacques" <sandford jhu.edu>:
 Speaking as the one proposing said Json replacement, I'd like to
 point out that JSON strings != UTF strings: manual conversion is
 required some of the time. And I use appender as a dynamic buffer in
 exactly the manner you suggest. There's even an option to use a
 string cache to minimize total memory usage. (Hmm... that
 functionality should probably be re-factored out and made into its
 own utility) That said, I do end up doing a bunch of useless encodes
 and decodes, so I'm going to special case those away and add slicing
 support for strings. wstrings and dstring will still need to be
 converted as currently Json values only accept strings and therefore
 also Json tokens only support strings. As a potential user of the
 sax/pull interface would you prefer the extra clutter of special side
 channels for zero-copy wstrings and dstrings?

BTW: Do you know DYAML? https://github.com/kiith-sa/D-YAML I think it has a pretty nice DOM implementation which doesn't require any changes to phobos. As YAML is a superset of JSON, adapting it for std.json shouldn't be too hard. The code is boost licensed and well documented. I think std.json would have better chances of being merged into phobos if it didn't rely on changes to std.variant.

I know about D-YAML, but haven't taken a deep look at it; it was developed long after I wrote my own JSON library. I did look into YAML before deciding to use JSON for my application; I just didn't need the extra features and implementing them would've taken extra dev time. As for reliance on changes to std.variant, this was a change *suggested* by Andrei. And while it is the slower route to go, I believe it is the correct software engineering choice; prior to the change I was implementing my own typed union (i.e. I poorly reinvented std.variant) Actually, most of my initial work on Variant was to make its API just as good as my home-rolled JSON type. Furthermore, a quick check of the YAML code-base seems to indicate that underneath the hood, Variant is being used. I'm actually a little curious about what prevented YAML from being expressed using std.variant directly and if those limitations can be removed. * The other thing slowing both std.variant and std.json down is my thesis writing :)
Feb 09 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 9 February 2012 at 08:26:25 UTC, Jacob Carlborg 
wrote:
 For example, ENV["REQUEST_URI"] returns differently on 
 different servers. Rails provides a method, "request_uri" on 
 the request object that will return the same value on all 
 different servers.

 I don't know if CGI already has support for something similar.

Yeah, in cgi.d, you use Cgi.requestUri, which is an immutable string, instead of using the environment variable directly.

requestUri = getenv("REQUEST_URI");

// Because IIS doesn't pass requestUri, we simulate it here if it's empty.
if(requestUri.length == 0) {
    // IIS sometimes includes the script name as part of the path info - we don't want that
    if(pathInfo.length >= scriptName.length && (pathInfo[0 .. scriptName.length] == scriptName))
        pathInfo = pathInfo[scriptName.length .. $];

    requestUri = scriptName ~ pathInfo ~ (queryString.length ? ("?" ~ queryString) : "");
    // FIXME: this works for apache and iis... but what about others?
}

That's in the cgi constructor. Somewhat ugly code, but I figure better to have ugly code in the library than incompatibilities in the user program! The http constructor creates these variables from the raw headers.

Here's the ddoc: http://arsdnet.net/web.d/cgi.html

If you search for "requestHeaders", you'll see all the stuff following. If you use those class members instead of direct environment variables, you'll get max compatibility.
Feb 09 2012
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
For XML, template the parser on char type so transcoding is unnecessary. Since JSON is UTF-8 I'd use char there, and at least for the event parser don't proactively decode strings--let the user do this. In fact, don't proactively decode anything. Give me the option of getting a number via its string representation directly from the input buffer. Roughly, JSON events should be:

Enter object
Object key
Int value (as string)
Float value (as string)
Null
True
False
Etc.
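That event set could be sketched as a D enum plus a token carrying an undecoded slice; all names here are illustrative, not from any actual library:

```d
// Hypothetical event set for a non-decoding JSON pull parser: values
// arrive as raw slices of the input buffer, decoded only on request.
enum JsonEvent
{
    objectStart,
    objectEnd,
    arrayStart,
    arrayEnd,
    key,
    intValue,    // as string, e.g. "42"
    floatValue,  // as string, e.g. "3.14"
    stringValue, // still escaped, undecoded
    nullValue,
    trueValue,
    falseValue,
}

struct JsonToken
{
    JsonEvent event;
    const(char)[] text; // raw slice; convert/decode only if needed
}

unittest
{
    import std.conv : to;

    // the number 42 arrives as its string representation;
    // converting to an int is the caller's choice
    auto tok = JsonToken(JsonEvent.intValue, "42");
    assert(tok.text.to!int == 42);
}
```

A caller that only wants one field can skip every other token without ever paying for decoding or conversion.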

On Feb 8, 2012, at 6:49 PM, "Robert Jacques" <sandford jhu.edu> wrote:

On Feb 8, 2012, at 6:49 PM, "Robert Jacques" <sandford jhu.edu> wrote:

 On Wed, 08 Feb 2012 02:12:57 -0600, Johannes Pfau <nospam example.com> wrote:

 Am Tue, 07 Feb 2012 20:44:08 -0500
 schrieb "Jonathan M Davis" <jmdavisProg gmx.com>:
 On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
 On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis



 Using ranges of dchar directly can be horribly inefficient in some
 cases, you'll need at least some kind off buffered dchar range. Some
 std.json replacement code tried to use only dchar ranges and had to
 reassemble strings character by character using Appender. That sucks
 especially if you're only interested in a small part of the data and
 don't care about the rest.
 So for pull/sax parsers: Use buffering, return strings(better:
 w/d/char[]) as slices to that buffer. If the user needs to keep a
 string, he can still copy it. (String decoding should also be done
 on-demand only).

 Speaking as the one proposing said Json replacement, I'd like to point out that JSON strings != UTF strings: manual conversion is required some of the time. And I use appender as a dynamic buffer in exactly the manner you suggest. There's even an option to use a string cache to minimize total memory usage. (Hmm... that functionality should probably be re-factored out and made into its own utility) That said, I do end up doing a bunch of useless encodes and decodes, so I'm going to special case those away and add slicing support for strings. wstrings and dstring will still need to be converted as currently Json values only accept strings and therefore also Json tokens only support strings. As a potential user of the sax/pull interface would you prefer the extra clutter of special side channels for zero-copy wstrings and dstrings?
Feb 09 2012
prev sibling next sibling parent Sean Kelly <sean invisibleduck.org> writes:
This. And decoded JSON strings are always smaller than encoded strings--JSON uses escaping to encode non UTF-8 stuff, so in the case where someone sends a surrogate pair (legal in JSON) it's encoded as \u0000\u0000. In short, it's absolutely possible to create a pull parser that never allocates, even for decoding. As proof, I've done it before. :-p
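The claim holds because every JSON escape form is at least as long as the UTF-8 bytes it decodes to, which is what makes an in-place, zero-allocation decode safe. A quick spot-check in D:

```d
// Spot-check: each escape's source length >= its decoded UTF-8 length,
// so in-place decoding can never overrun the input buffer.
unittest
{
    assert("\\n".length >= "\n".length);                    // 2 >= 1
    assert("\\u00e9".length >= "\u00e9".length);            // 6 >= 2 in UTF-8
    assert("\\ud83d\\ude00".length >= "\U0001F600".length); // 12 >= 4 in UTF-8
}
```

The surrogate-pair case is the tightest squeeze for short escapes, and even there the encoded form is three times longer than the decoded bytes.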

On Feb 9, 2012, at 3:07 AM, Johannes Pfau <nospam example.com> wrote:

 Am Wed, 08 Feb 2012 20:49:48 -0600
 schrieb "Robert Jacques" <sandford jhu.edu>:
 On Wed, 08 Feb 2012 02:12:57 -0600, Johannes Pfau
 <nospam example.com> wrote:
 Am Tue, 07 Feb 2012 20:44:08 -0500
 schrieb "Jonathan M Davis" <jmdavisProg gmx.com>:
 On Tuesday, February 07, 2012 00:56:40 Adam D. Ruppe wrote:
 On Monday, 6 February 2012 at 23:47:08 UTC, Jonathan M Davis



 Using ranges of dchar directly can be horribly inefficient in some
 cases, you'll need at least some kind off buffered dchar range. Some
 std.json replacement code tried to use only dchar ranges and had to
 reassemble strings character by character using Appender. That sucks
 especially if you're only interested in a small part of the data and
 don't care about the rest.
 So for pull/sax parsers: Use buffering, return strings(better:
 w/d/char[]) as slices to that buffer. If the user needs to keep a
 string, he can still copy it. (String decoding should also be done
 on-demand only).

Speaking as the one proposing said Json replacement, I'd like to point out that JSON strings !=3D UTF strings: manual conversion is required some of the time. And I use appender as a dynamic buffer in exactly the manner you suggest. There's even an option to use a string cache to minimize total memory usage. (Hmm... that functionality should probably be re-factored out and made into its own utility) That said, I do end up doing a bunch of useless encodes and decodes, so I'm going to special case those away and add slicing support for strings. wstrings and dstring will still need to be converted as currently Json values only accept strings and therefore also Json tokens only support strings. As a potential user of the sax/pull interface would you prefer the extra clutter of special side channels for zero-copy wstrings and dstrings?

 Regarding wstrings and dstrings: well, JSON seems to be UTF-8 in almost
 all cases, so it's not that important. But I think it should be possible
 to use templates to implement identical parsers for d/w/strings.

 Regarding the use of Appender: long text ahead ;-)

 I think pull parsers should really be as fast as possible and
 low-level. For easy to use highlevel stuff there's always DOM, and a
 safe, high-level serialization API should be implemented based on the
 PullParser as well. The serialization API would read only the requested
 data, skipping the rest:
 ----------------
 struct Data
 {
     string link;
 }
 auto data = unserialize!Data(json);
 ----------------

 So in the PullParser we should avoid memory allocation whenever
 possible, I think we can even avoid it completely:

 I think dchar ranges are just the wrong input type for parsers; parsers
 should use buffered ranges or streams (which would be basically the
 same). We could use a generic BufferedRange with real dchar-ranges
 then. This BufferedRange could use a static buffer, so there's no need
 to allocate anything.

 The pull parser should return slices to the original string (if the
 input is a string) or slices to the Range/Stream's buffer. Of course,
 such a slice is only valid till the pull parser is called again. The
 slice also wouldn't be decoded yet. And a sliced string could only be
 as long as the buffer, but I don't think this is an issue; a 512KB
 buffer can already store 524288 characters.

 If the user wants to keep a string, he should really do
 decodeJSONString(data).idup. There's a little more opportunity for
 optimization: as long as a decoded JSON string is always smaller than
 the encoded one (I don't know if it is), we could have a
 decodeJSONString function which overwrites the original buffer --> no
 memory allocation. If that's not the case, decodeJSONString has to
 allocate iff the decoded string is different.

 So we need a function which always returns the decoded string as a
 safe-to-keep copy, and a function which returns the decoded string as a
 slice if the decoded string is the same as the original.

 An example:

 string json =
 {
     "link":"http://www.google.com",
     "useless_data":"lorem ipsum",
     "more":{
         "not interested":"yes"
     }
 }

 now I'm only interested in the link. It should be possible to parse
 that with zero memory allocations:

 auto parser = Parser(json);
 parser.popFront();
 while(!parser.empty)
 {
     if(parser.front.type == KEY
         && tempDecodeJSON(parser.front.value) == "link")
     {
         parser.popFront();
         assert(!parser.empty && parser.front.type == VALUE);
         return decodeJSON(parser.front.value); //Should return a slice
     }
     //Skip everything else;
     parser.popFront();
 }

 tempDecodeJSON returns a decoded string, which (usually) isn't safe to
 store (it can/should be a slice to the internal buffer; here it's a
 slice to the original string, so it could be stored, but there's no
 guarantee). In this case, the call to tempDecodeJSON could even be left
 out, as we only search for "link" which doesn't need encoding.

Feb 09 2012
prev sibling next sibling parent Johannes Pfau <nospam example.com> writes:
Am Thu, 09 Feb 2012 08:18:15 -0600
schrieb "Robert Jacques" <sandford jhu.edu>:

 On Thu, 09 Feb 2012 05:13:52 -0600, Johannes Pfau
 <nospam example.com> wrote:
 Am Wed, 08 Feb 2012 20:49:48 -0600
 schrieb "Robert Jacques" <sandford jhu.edu>:
 Speaking as the one proposing said Json replacement, I'd like to
 point out that JSON strings != UTF strings: manual conversion is
 required some of the time. And I use appender as a dynamic buffer
 in exactly the manner you suggest. There's even an option to use a
 string cache to minimize total memory usage. (Hmm... that
 functionality should probably be re-factored out and made into its
 own utility) That said, I do end up doing a bunch of useless
 encodes and decodes, so I'm going to special case those away and
 add slicing support for strings. wstrings and dstring will still
 need to be converted as currently Json values only accept strings
 and therefore also Json tokens only support strings. As a
 potential user of the sax/pull interface would you prefer the
 extra clutter of special side channels for zero-copy wstrings and
 dstrings?

BTW: Do you know DYAML? https://github.com/kiith-sa/D-YAML I think it has a pretty nice DOM implementation which doesn't require any changes to phobos. As YAML is a superset of JSON, adapting it for std.json shouldn't be too hard. The code is boost licensed and well documented. I think std.json would have better chances of being merged into phobos if it didn't rely on changes to std.variant.

I know about D-YAML, but haven't taken a deep look at it; it was developed long after I wrote my own JSON library.

I know, I didn't mean to criticize. I just thought DYAML could give some useful inspiration for the DOM api.
 I did look into
 YAML before deciding to use JSON for my application; I just didn't
 need the extra features and implementing them would've taken extra
 dev time.

Sure, I was only referring to DYAML cause the DOM is very similar. Just remove some features and it would suit JSON very well. One problem is that DYAML uses some older YAML version which isn't 100% compatible with JSON, so it can't be used as a JSON parser. There's also no way to tell it to generate only JSON compatible output (and AFAIK that's a design decision and not simply a missing feature)
 
 As for reliance on changes to std.variant, this was a change
 *suggested* by Andrei.

didn't like some of those changes.
 And while it is the slower route to go, I
 believe it is the correct software engineering choice; prior to the
 change I was implementing my own typed union (i.e. I poorly
 reinvented std.variant) Actually, most of my initial work on Variant
 was to make its API just as good as my home-rolled JSON type.
 Furthermore, a quick check of the YAML code-base seems to indicate
 that underneath the hood, Variant is being used. I'm actually a
 little curious about what prevented YAML from being expressed using
 std.variant directly and if those limitations can be removed.

I guess the custom Node type was only added to support additional methods(isScalar, isSequence, isMapping, add, remove, removeAt) and I'm not sure if those are supported on Variant (length, foreach, opIndex, opIndexAssign), but IIRC those are supported in your new std.variant.
 
 * The other thing slowing both std.variant and std.json down is my
 thesis writing :)

Feb 09 2012
prev sibling next sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Thursday, 9 February 2012 at 17:36:01 UTC, Andrei Alexandrescu 
wrote:
 Cue the choir: "Please submit to Phobos".

Perhaps when I finish the URL struct in there. (It takes a url and breaks it down into parts you can edit, and can do rebasing. Currently, the handling of the Location: header is technically wrong - the http spec says it is supposed to be an absolute url, but I don't enforce that. Now, in cgi mode, it doesn't matter, since the web server fixes it up for us. But, in http mode... well, it still doesn't matter since the browsers can all figure it out, but I'd like to do the right thing anyway.). I might change the http constructor and/or add one that takes a std.socket socket cuz that would be cool. But I just don't want to submit it when I still might be making some big changes in the near future. BTW, I spent a little time reorganizing and documenting dom.d a bit more. http://arsdnet.net/web.d/dom.html Still not great docs, but if you come from javascript, I think it is pretty self-explanatory anyway.
Feb 09 2012
prev sibling parent "Adam D. Ruppe" <destructionator gmail.com> writes:
On Tuesday, 7 February 2012 at 20:00:26 UTC, Adam D. Ruppe wrote:
 I'm taking this to an extreme with this:

 http://arsdnet.net:8080/

hehehe, I played with this a little bit more tonight.

http://arsdnet.net/dcode/sse/

needs the bleeding edge dom.d from my github. https://github.com/adamdruppe/misc-stuff-including-D-programming-language-web-stuff

Here's the code, not very long. http://arsdnet.net/dcode/sse/test.d

The best part is this:

document.mainBody.addEventListener("click",
    (Element thislol, Event event) {
        event.target.style.color = "red";
        event.target.appendText(" clicked! ");
        event.preventDefault();
    });

An html onclick handler written in D!

Now, like I said before, probably not usable for real work. What this does is for each user session, it creates a server side DOM object. Using observers on the DOM, it listens for changes and forwards them to Javascript. You use the D api to change your document, and it sends them down. I've only implemented a couple mutation events, but they go a long way - appendChild and setAttribute - as they are the building blocks for many of the functions.

On the client side, the javascript listens for events and forwards them to D. To sync the elements on both sides, I added a special feature to dom.d to put an attribute there that is usable on both sides. The Makefile in there shows the -version needed to enable it.

Since it is a server side document btw, you can refresh the browser and keep the same document. It could quite plausibly gracefully degrade!

But, yeah, lots of fun. D rox.
Feb 09 2012