digitalmars.D - Std.xml is twice as slow on windows vs Linux. std.xml2 is pushing
- Michael Rynn <michaelrynn optusnet.com.au> Feb 12 2011
Std.xml may have a druntime or compile issue on Windows. The performance results below are CSV formatted; paste them into a spreadsheet for better viewing.

"platform+compile","input","parse","slice parse","parse+dom","slice+dom","std.xml"
"Ubuntu32+vbox+gdc2",0.012,0.072,0.052,0.1,0.069,0.092
"Ubuntu32+vbox+dmd2",0.008,0.058,0.041,0.085,0.057,0.078
"Windows7+vbox+dmd2",0.007,0.056,0.039,0.08,0.05,0.195
,,,,,,
,"Percentage to 0.1",,,,,
"Ubuntu32+vbox+gdc2",12,72,52,100,69,92
"Ubuntu32+vbox+dmd2",8,58,41,85,57,78
"Windows7+vbox+dmd2",7,56,39,80,50,195

The figure in the last column shows std.xml to be twice as slow on Windows. All programs were compiled in release mode. Each row gives the execution times from a single running process, with each test run hundreds of times one after the other and averaged. (sxml.d)

Input - The document goes through the buffer and filter, so that every Unicode character in it is examined as a dchar.

Parse - Core parser throughput, returning a copy of each XML node's content, tag name, and attribute value pairs. No data structures are created from the parse items returned. (std.xmlp.coreparse)

Slice-Parse - Parser throughput using string aliases of the in-memory document, with far fewer reallocations. No data structures are created from the parse items returned. (std.xmlp.sliceparse)

Parse+DOM - Creates a DOM using immutable string duplicates of the document content. Tag names and attribute names are aliased. (std.xmlp.domparse)

Slice+DOM - Creates a DOM using the same aliased strings of the in-memory document. (std.xmlp.slicedoc)

Anomaly for std.xml

Std.xml is actually pretty nifty on both Ubuntu compiles (dmd and gdc). It is about twice as slow with Windows dmd, whereas the other implementation tests are quite similar between Ubuntu and Windows. On code inspection, std.xml does seem to slice the original in-memory string. Building its array DOM model might be inefficient compared to the linked DOM model of the other DOM tests. I am not sure what the slowdown on Windows is due to. Any ideas, please?
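To put a number on the anomaly, the std.xml column can be compared across platforms. The following is a minimal Python sketch that uses only the figures from the results table above; the names `TIMINGS`, `percent_of_baseline` and `ratio` are made up for this illustration and are not part of the benchmark program.

```python
# Execution times copied from the results table in the post
# (the post does not state the time unit).
COLUMNS = ["input", "parse", "slice parse", "parse+dom", "slice+dom", "std.xml"]
TIMINGS = {
    "Ubuntu32+vbox+gdc2": [0.012, 0.072, 0.052, 0.1, 0.069, 0.092],
    "Ubuntu32+vbox+dmd2": [0.008, 0.058, 0.041, 0.085, 0.057, 0.078],
    "Windows7+vbox+dmd2": [0.007, 0.056, 0.039, 0.08, 0.05, 0.195],
}


def percent_of_baseline(value, baseline=0.1):
    """Express a timing as a whole-number percentage of the 0.1 baseline."""
    return round(value / baseline * 100)


def ratio(platform_a, platform_b, column):
    """Return how many times longer platform_a takes than platform_b for one test."""
    i = COLUMNS.index(column)
    return TIMINGS[platform_a][i] / TIMINGS[platform_b][i]


if __name__ == "__main__":
    # Reproduce the "Percentage to 0.1" rows.
    for platform, row in TIMINGS.items():
        print(platform, [percent_of_baseline(t) for t in row])
    # Compare the two dmd2 builds column by column.
    for column in COLUMNS:
        r = ratio("Windows7+vbox+dmd2", "Ubuntu32+vbox+dmd2", column)
        print(f"{column}: Windows/Ubuntu (dmd2) = {r:.2f}")
```

By these figures, std.xml on Windows takes 2.5 times as long as the Ubuntu dmd2 build and roughly 2.1 times as long as the gdc2 build, while every other column comes out slightly faster on Windows than on Ubuntu with dmd2.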
As for being ready to submit these varied replacements for std.xml, I can say that I hope it is getting close. The actual XML parsers and DOM are complete as far as I know, in that I am only making minor changes as I work on XPath 1.0 expression compilation and execution and some XSLT, which are not yet test-suite ready. The validating parser is 100% XML test suite compliant. I do not know what the submission process is.

The issue with a parser that references the original in-memory image is that things get complicated with different source encodings, compliant source line-end substitutions, entity replacements, and character and standard entity replacements in content and attributes. These tend to replace much more of the original source, so the speed advantage diminishes. Also, the entire document stays in memory until the last reference is garbage collectable.

It's all on dsource.org/xml/trunk/

Michael Rynn.
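The trade-off described above, where a slicing parser can only keep referring to the source while the text needs no substitution, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not code from std.xmlp: the `Span` type and the `text_content` function are hypothetical, and only line-end normalisation and the five predefined XML entities are handled.

```python
import re
from typing import NamedTuple, Union


class Span(NamedTuple):
    """Hypothetical stand-in for a slice: offsets into the source, no copy made."""
    start: int
    end: int


_ENTITY = re.compile(r"&(lt|gt|amp|apos|quot);")
_REPLACEMENT = {"lt": "<", "gt": ">", "amp": "&", "apos": "'", "quot": '"'}


def text_content(source: str, start: int, end: int) -> Union[Span, str]:
    """Return a Span if source[start:end] can be used unchanged, else a new string."""
    needs_entities = source.find("&", start, end) != -1
    needs_line_ends = source.find("\r", start, end) != -1
    if not (needs_entities or needs_line_ends):
        # Nothing to substitute: keep referring to the original document.
        return Span(start, end)
    # Substitution required: a fresh string has to be built.
    text = source[start:end]
    # XML line-end handling: "\r\n" and a lone "\r" both become "\n".
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Replace the predefined entities in a single pass.
    return _ENTITY.sub(lambda m: _REPLACEMENT[m.group(1)], text)


if __name__ == "__main__":
    doc = "<a>plain</a><b>1 &lt; 2</b>"
    print(text_content(doc, 3, 8))  # text of <a>: no substitution needed
    print(text_content(doc, doc.index("1 "), doc.index("</b>")))  # text of <b>
```

The more text nodes and attribute values need a substitution, the more results are freshly built strings, and the slicing parser ends up doing much of the same allocation work as the copying one.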