digitalmars.D.bugs - Fix for endless loop with HTML files

Anders F Björklund <afb algonet.se> writes:

The problem is when it encounters a '<'
character, that is *not* followed by a
valid tag start, according to istagstart()

Such as: < CODE >, and similar constructs.

Then it fails to advance the pointer (!),
and keeps on scanning the '<' character
forever. Inserting a "else p++;" works...

You still need <code> and </code> in order
for D to actually parse the code, but at
least it comes back again with this patch.
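
To illustrate the failure mode described above, here is a simplified sketch in C. It is not the actual D compiler source: `count_tags`, `is_tag_start`, the `apply_fix` switch, and the `max_steps` guard are hypothetical names made up for this example, and `is_tag_start` is only a rough stand-in for istagstart(). The guard exists so that the unpatched variant reports the hang instead of really looping forever.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for istagstart(): after '<', a tag name
 * (optionally preceded by '/') must begin with a letter. */
static int is_tag_start(const char *s)
{
    if (*s == '/')
        s++;
    return isalpha((unsigned char)*s) != 0;
}

/*
 * Counts opening and closing tags in a NUL-terminated buffer.
 * apply_fix == 0 reproduces the bug: a '<' that does not start a tag
 * is never skipped. max_steps bounds the loop so that the buggy
 * variant returns -1 instead of hanging.
 */
static int count_tags(const char *src, int apply_fix, size_t max_steps)
{
    const char *p = src;
    int tags = 0;
    size_t steps = 0;

    assert(src != NULL);
    while (*p != '\0') {
        if (++steps > max_steps)
            return -1;              /* stuck: the pointer stopped advancing */
        if (*p != '<') {
            p++;                    /* ordinary text */
        } else if (is_tag_start(p + 1)) {
            tags++;
            while (*p != '\0' && *p != '>')
                p++;                /* skip to the end of the tag */
            if (*p == '>')
                p++;
        } else if (apply_fix) {
            p++;                    /* the "else p++;" fix: skip the stray '<' */
        }
        /* without the fix, p still points at the same '<' here */
    }
    return tags;
}
```

In this sketch, `count_tags("< CODE >x", 0, 1000)` gives up with -1 because the scanner never gets past the first character, while the same call with `apply_fix` set to 1 runs to the end of the buffer and finds no tags.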

--anders
Nov 09 2004
David Friedman <d3rdclsmail_a_ _t_earthlink_d_._t_net> writes:
Anders F Björklund wrote:
 The problem is when it encounters a '<'
 character, that is *not* followed by a
 valid tag start, according to istagstart()
 
 Such as: < CODE >, and similar constructs.
 
 Then it fails to advance the pointer (!),
 and keeps on scanning the '<' character
 forever. Inserting a "else p++;" works...
 
 You still need <code> and </code> in order
 for D to actually parse the code, but at
 least it comes back again with this patch.
 
 --anders
 

Thanks! I don't think "< CODE >" is valid HTML, so this patch should be enough.

David
Nov 13 2004
"Thomas Kuehne" <thomas-dloop kuehne.cn> writes:
David Friedman wrote:
 You still need <code> and </code> in order
 for D to actually parse the code, but at
 least it comes back again with this patch.

 Thanks! I don't think "< CODE >" is valid HTML, so this patch should be enough.

I've run the attached file (with "< code>" and "</ code>") through the validator at http://validator.w3.org/check and got no warnings or errors.

Thomas
Nov 13 2004
Anders F Björklund <afb algonet.se> writes:
Thomas Kuehne wrote:

 I've run the attached file (with "< code>" and "</ code>")through the
 validator at http://validator.w3.org/check
 
 An got no warnings or errors.

If you validate as XHTML, instead of the old HTML 4.0, you get the following validation errors (from the same W3C validator):

1.
 Line 9, column 2: character "<" is the first character of a delimiter
 but occurred as data
 
 < code >
 
 If you wish to include the "<" character in your output, you should
 escape it as "&lt;". Another possibility is that you forgot to close
 quotes in a previous tag.

2.
 Line 9, column 2: character data is not allowed here
 
 < code >
 
 You have used character data somewhere it is not permitted to appear.
 
 Mistakes that can cause this error include putting text directly in the
 body of the document without wrapping it in a container element (such as
 a <p>aragraph</p>) or forgetting to quote an attribute value (where
 characters such as "%" and "/" are common, but cannot appear without
 surrounding quotes).

In general, XHTML and UTF-8 are now recommended instead of the old HTML 4.01 and ISO-8859-1...

Anyway, nobody writes < code > in any real stuff. It's just that it's nice if D doesn't HANG on it. :-)

--anders
Nov 14 2004