digitalmars.D - (X)HTML/XML in DDoc

"Borden" <2013 bordenrhodes.com> writes:
Good evening, all,

I imagine that this is a sensitive topic, so I'll do my best not 
to flame bait and I hope that readers will assume that I mean the 
best if I don't word things very diplomatically:

I think that the DDoc spec does an excellent job of creating a 
common-sense, low-burden way to document source code in-line. 
I think its rules are largely clear and unambiguous. 
Unfortunately, I'm finding that the clarity breaks down once 
macros are introduced.

What sticks out for me is the spec's aversion to XHTML, clearly 
stated in "D's goals for embedded documentation" (number 4) and 
under the bit for Embedded HTML, which discourages HTML in favour 
of macros.

I want to understand why DDoc prefers macros to some XML-based 
markup. I've observed the following things about working with 
DDoc:
1) The DLangSpec is written almost entirely in macros. Most of 
the macros defined for the DLangSpec simply rewrite HTML tags 
into a macro. It seems that the macros copy the functionality of 
the HTML tags without adding any usefulness or clarity to them.
2) In fact, I'm finding that macros make the documentation less 
flexible to work with. For example, in trying to create a .ddoc 
to parse the DLangSpec files into HTML 5, I'm noticing that the 
DLangSpec sources define and use a lot of SECTION# macros, which 
simply turn the SECTION# heading into an <h#> and raw-dump the 
second argument. Because of the way it's been set up, I have to 
find and redefine each of these SECTION# macros and, even when I 
do, I can't redefine the macros to use HTML5's nested <section> 
tag system.
3) I do find the macros extremely helpful in mass-producing 
repetitive lines of HTML where only one or two attributes change 
in each line. Of course, since macros don't have any logic to 
them (such as looping, conditions, etc.), they can only shorten 
my workload so much.
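
The fixed-pattern behaviour described in points 2 and 3 can be sketched in a few lines of Python. This is only a toy model, not the real DDoc expander: it ignores nested macros and parentheses, and the two SECTION2 definitions are hypothetical ones written to match the description above. The placeholders $0, $1 and $+ follow DDoc's convention (all arguments, the first argument, everything after the first comma).

```python
import re


def expand(template: str, argument_text: str) -> str:
    """Toy DDoc-style expansion: paste the arguments into a fixed pattern."""
    first, _, rest = argument_text.partition(",")
    values = {
        "$0": argument_text.strip(),  # the whole argument text
        "$1": first.strip(),          # text before the first comma
        "$+": rest.strip(),           # everything after the first comma
    }
    # Single pass over the template; there is no looping or branching
    # available to the macro author, only substitution.
    return re.sub(r"\$[01+]", lambda m: values[m.group(0)], template)


# Hypothetical definitions, written to match the behaviour described above.
SECTION2_HTML4 = "<h2>$1</h2>\n$+"
SECTION2_HTML5 = "<section><h2>$1</h2>\n$+</section>"

call = "Lexical, Body text, commas included."
print(expand(SECTION2_HTML4, call))
print(expand(SECTION2_HTML5, call))
```

Redefining the macro changes only the literal text wrapped around the arguments; the body is still dumped as-is, which is why every SECTION# macro has to be found and redefined separately.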

Maybe the problem is that the DLangSpec simply stretches the 
capabilities of DDoc far too much. I know that XML is verbose, 
redundant and often a nightmare to work with, but lots of tools 
are available to manage the tedium of XML and convert files to 
and from any format that you like - which seems to be what DDoc 
was designed to do but can only accomplish after it's parsed into 
HTML!

So these are my thoughts, and I invite feedback on them. I'm not 
really proposing anything - except for, maybe, changing the 
DLangSpec source into a more wieldy format...
May 15 2013
"Sergei Nosov" <sergei.nosov gmail.com> writes:
On Thursday, 16 May 2013 at 06:46:55 UTC, Borden wrote:
 I want to understand why DDoc prefers macros to some XML-based 
 markup.

As kind of a follow-up, I want to ask why DDoc prefers macros to some lightweight markup language like Markdown. For me, it's really uncomfortable to read the documentation in the source itself because of all the $(MACROS ) stuff. I don't believe inline documentation needs to do anything fancy enough to require "dead-proof" features like those macros.

Marking bold like *so* is far more user-friendly than $(B so). And it's not that limiting either: tools such as pandoc do a great job converting lightweight markup to pretty neat HTML, LaTeX, PDF and whatnot.
May 16 2013
"Borden" <2013 bordenrhodes.com> writes:
I don't want to turn this thread into a DDoc-bashing rag, but 
another observation I've made is that, ironically, DDoc macros are 
not self documenting. If one types $(SOME_MACRO this, that, the 
other) it's not immediately obvious to what 'this,' 'that,' 'the 
other' refer without interpreting the macro definitions. Perhaps 
this was a design feature in order to transfer the formatting 
burden onto the .ddoc file, but I'm finding the exact opposite as 
I look through the DLangSpec.

In contrast, <some-tag attr1="this" attr2="that">the 
other</some-tag> is far more obvious in terms of what 
relationship the variables have to one another. Again, I'm not 
suggesting for a second that XML is the documenter's cure-all, 
but illustrating some of the things which make DDoc macros 
awkward to use.

What are the general opinions on La(TeX) in terms of code 
documentation? I only have a superficial understanding of it but 
it seems to follow many of the principles that DDoc incorporates. 
It also has the advantage that it's not quite as obtrusive as XML 
when writing from scratch. Then again, I read on Wikipedia that 
Knuth's next incarnation of TeX is going to be XML-based. XML 
seems to be the unavoidable trend in file design.
May 16 2013
"Adam D. Ruppe" <destructionator gmail.com> writes:
I think if you're using a lot of macros, ddoc starts to lose 
value because then it is harder to read in the code. The plainer 
the text, the better, IMO.

I use ddoc macros only on the individual ($important word) here 
and there and sometimes little ($lists, of, simple, stuff).
May 16 2013
"Borden" <2013 bordenrhodes.com> writes:
On Thursday, 16 May 2013 at 20:55:18 UTC, Adam D. Ruppe wrote:
 I think if you're using a lot of macros, ddoc starts to lose 
 value because then it is harder to read in the code. The 
 plainer the text, the better, IMO.

 I use ddoc macros only on the individual ($important word) here 
 and there and sometimes little ($lists, of, simple, stuff).

I think you're right, Adam. A standard 80-character line of code gets swallowed up pretty quickly in tags. Macros are certainly more concise but I've noticed that their main (ab)use is as HTML shorthand. I'm guilty of much the same: using macros to cut down on the amount of copy-and-pasting I have to do when I want to make a bunch of similar XML tags real quick.

A corollary to this question is how much opposition I'd face if I were to rewrite the DLangSpec source files into a different markup. A corollary to the corollary is which standard would be easiest to write, read, understand and maintain?
May 16 2013
"Craig Dillabaugh" <cdillaba cg.scs.carleton.ca> writes:
On Thursday, 16 May 2013 at 20:46:01 UTC, Borden wrote:
clip ...
 What are the general opinions on La(TeX) in terms of code 
 documentation? I only have a superficial understanding of it 
 but it seems to follow many of the principles that DDoc 
 incorporates. It also has the advantage that it's not quite as 
 obtrusive as XML when writing from scratch. Then again, I read 
 on Wikipedia that Knuth's next incarnation of TeX is going to 
 be XML-based. XML seems to be the unavoidable trend in file 
 design.

Can you provide a reference for the claim that the next incarnation of TeX is going to be XML-based? I did a quick search and couldn't turn up anything. Perhaps it was some sort of April Fools' joke :o)

I can think of a lot of ways that TeX/LaTeX could be improved; adding XML to the mix is certainly not one of them!

With regard to your question, would La(TeX) be good for documentation? The markup it uses is pretty easy to work with for the most part, I think. It is easy to do things like enumerated/itemized lists. Tables are not fun, but I don't know if that is a markup issue as much as just the way La(TeX) handles tables.

Cheers,
Craig
May 16 2013
"Borden" <2013 bordenrhodes.com> writes:
On Thursday, 16 May 2013 at 21:36:03 UTC, Craig Dillabaugh wrote:
 Can you provide a reference for the claim that the next
 incarnation of TeX is going to be XML based?  I did a quick 
 search and
 couldn't turn up anything.  Perhaps it was some sort of April 
 fool's joke
 :o)

You're right. I fell for it. I think I was referring to his announcement of iTeX, which any idiot should have noticed was a joke. I'm no idiot, I'm apparently far more gullible :-P
May 16 2013
Nick Sabalausky <SeeWebsiteToContactMe semitwist.com> writes:
On Fri, 17 May 2013 03:01:35 +0200
"Borden" <2013 bordenrhodes.com> wrote:

 On Thursday, 16 May 2013 at 21:36:03 UTC, Craig Dillabaugh wrote:
 Can you provide a reference for the claim that the next
 incarnation of TeX is going to be XML based?  I did a quick 
 search and
 couldn't turn up anything.  Perhaps it was some sort of April 
 fool's joke
 :o)

 You're right. I fell for it. I think I was referring to his 
 announcement of iTeX, which any idiot should have noticed was a 
 joke. I'm no idiot, I'm apparently far more gullible :-P

I think it's quite easy to believe joke claims of adopting XML, considering how much XML already gets used where it doesn't belong. This one's my favorite (a joke, obviously): http://www.charlespetzold.com/etc/CSAML.html
May 16 2013
"Idan Arye" <GenericNPC gmail.com> writes:
On Friday, 17 May 2013 at 01:01:36 UTC, Borden wrote:
 On Thursday, 16 May 2013 at 21:36:03 UTC, Craig Dillabaugh 
 wrote:
 Can you provide a reference for the claim that the next
 incarnation of TeX is going to be XML based?  I did a quick 
 search and
 couldn't turn up anything.  Perhaps it was some sort of April 
 fool's joke
 :o)

 You're right. I fell for it. I think I was referring to his 
 announcement of iTeX, which any idiot should have noticed was a 
 joke. I'm no idiot, I'm apparently far more gullible :-P

If you think about it, HTML can be considered the XML-based successor of TeX (some flexibility of thought required).

Anyway, the main problem I see with DDoc macros is that they don't have any computational power behind them: they put arguments inside fixed patterns and that's it. This might be OK for simple usage (like formatting and generating links), but they are often used for more complex things, where they miss the purpose of one-source-format-compilable-to-many-target-formats.

Tables were already given as an example. In the dlang.org source code (which is also used by the Phobos documentation) they are implemented by replicating the HTML table syntax, so you need to write:

    $(TABLE caption
        $(TR $(TH header1) $(TH header2))
        $(TR $(TD data11) $(TD data12))
        $(TR $(TD data21) $(TD data22))
        $(TR $(TD data31) $(TD data32))
    )

This works for HTML tables, but it does not work for LaTeX tables, since the cells are separated (by &), not delimited, so there is no way to define the TD macro to produce correct results for all cells. It also won't work for ASCII tables, since you can't pad the cells and you can't determine the width of a column using DDoc macros.

My suggestion is to leave the macros for simple notations, and for the more complex ones use custom sections + scripting. For example, the `std.algorithm` documentation has a Cheat Sheet table, where each function has an example of its usage. This documentation is written as a big table at the head of the source file. This is bad, since a function's documentation in the source code should be concentrated near its declaration. But if we had custom sections + scripting, we could add a custom section named `CheatSheet` to the DDoc comment of each function, and a script would collect all the data from these custom sections and make a single table out of it.
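
The table limitation described in this post can be sketched in Python. This is an illustration under assumptions, not anything DDoc or dlang.org provides: the function names are made up, and the per-cell function stands in for a TD macro that sees one cell at a time. A row-level script, by contrast, sees the whole row (or table) and can place separators and padding correctly.

```python
def latex_row_per_cell(row):
    """What a per-cell pattern can do: append a separator to every cell."""
    # Each cell is formatted in isolation, so the last cell also gets
    # a separator, leaving a stray '&' before the row terminator.
    return "".join(f"{cell} & " for cell in row) + r"\\"


def latex_row_joined(row):
    """What a script can do: put separators only between cells."""
    return " & ".join(row) + r" \\"


def ascii_table(rows):
    """Pad every cell to its column's width, which needs the whole table."""
    widths = [max(len(cell) for cell in column) for column in zip(*rows)]
    return "\n".join(
        " | ".join(cell.ljust(width) for cell, width in zip(row, widths))
        for row in rows
    )


rows = [["header1", "header2"], ["data11", "data12"], ["data21", "data22"]]
print(latex_row_per_cell(rows[1]))
print(latex_row_joined(rows[1]))
print(ascii_table(rows))
```

The two LaTeX functions differ only in knowing where the row ends, and the ASCII version needs every cell of a column before it can format any of them. Neither kind of information is available to a macro that substitutes one argument into a fixed pattern.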
May 16 2013
"Jesse Phillips" <Jesse.K.Phillips+D gmail.com> writes:
On Friday, 17 May 2013 at 01:01:36 UTC, Borden wrote:
 On Thursday, 16 May 2013 at 21:36:03 UTC, Craig Dillabaugh 
 wrote:
 Can you provide a reference for the claim that the next
 incarnation of TeX is going to be XML based?  I did a quick 
 search and
 couldn't turn up anything.  Perhaps it was some sort of April 
 fool's joke
 :o)

 You're right. I fell for it. I think I was referring to his 
 announcement of iTeX, which any idiot should have noticed was a 
 joke. I'm no idiot, I'm apparently far more gullible :-P

Pretty good talk though: http://river-valley.tv/media/conferences/tug-2010/Don-Knuth/ Thanks for the news.
May 16 2013