digitalmars.D.bugs - [DOC] HTML docs are not valid utf-8 files

Derek Parnell <derek psych.ward> writes:
Two of the HTML doc files, phobos.html and std_format.html, are not valid
UTF-8 files.

Specifically, this D code fails with "Error: 4invalid UTF-8 sequence"  ...


  import std.file;  // read()
  import std.utf;   // toUTF32()

  dchar[] lText;

  lText = std.utf.toUTF32(cast(char[])read("phobos.html"));
  lText = std.utf.toUTF32(cast(char[])read("std_format.html"));

All the other document files load correctly.
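
To pin down where a file goes wrong, rather than only learning that it fails, something like the following may help. It is a minimal sketch, assuming a current D compiler and Phobos (std.utf.decode and UTFException, which are not the 2005-era APIs used in the snippet above). The helper name firstInvalidUtf8 is hypothetical. It walks the data one code point at a time and reports the byte offset of the first sequence that does not decode.

```d
import std.file : read;
import std.stdio : writefln;
import std.utf : decode, UTFException;

/// Hypothetical helper: returns the byte offset of the first invalid
/// UTF-8 sequence in `text`, or -1 if the whole buffer decodes cleanly.
ptrdiff_t firstInvalidUtf8(const(char)[] text)
{
    size_t i = 0;
    while (i < text.length)
    {
        immutable size_t start = i;
        try
        {
            decode(text, i); // advances i past one code point
        }
        catch (UTFException e)
        {
            return start;
        }
    }
    return -1;
}

void main()
{
    // File names taken from the report above.
    foreach (name; ["phobos.html", "std_format.html"])
    {
        auto data = cast(const(char)[]) read(name);
        immutable pos = firstInvalidUtf8(data);
        if (pos < 0)
            writefln("%s: valid UTF-8", name);
        else
            writefln("%s: invalid UTF-8 at byte offset %d (byte 0x%02X)",
                     name, pos, cast(ubyte) data[pos]);
    }
}
```

A single high byte followed by plain ASCII at the reported offset would suggest the file was saved in a one-byte encoding such as ISO-8859-1, but that is a guess until the offending bytes are inspected.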


(I'm working on a utility that generates cross-reference index HTML
file(s) based on whatever HTML files it's given. I'm tired of trying to
find anything in the official docs, even though I know it's there
somewhere.)

-- 
Derek
Melbourne, Australia
29/04/2005 4:33:43 PM
Apr 28 2005
Anders F Björklund <afb algonet.se> writes:
Derek Parnell wrote:

 Two of the HTML doc files, phobos.html and std_format.html,
 are not valid UTF-8 files.

They are not valid HTML either... "tidy" is useful to clean them.

 (I'm working on a utility that generates cross-reference index HTML
 file(s) based on whatever HTML files it's given. I'm tired of trying to
 find anything in the official docs, even though I know it's there
 somewhere.)

I made an XHTML/UTF-8 version, but as it is non-distributable...

--anders
Apr 29 2005