Subject: Re: Text editing [Was: Re: #line decoder]
From: bearophile <bearophileHUGS@lycos.com>
Date: Thu, 25 Sep 2008 12:36:17 -0400
Newsgroups: digitalmars.D.announce

Sergey Gromov:
> you probably actually want "new string[][first.length]", otherwise you
> simply get three huge strings.

Yes, I had not noticed the missing commas in the test run output (this tells me to always avoid writeln and always use my putr, which puts "" around strings when they are printed inside containers).
The timings of the loader3/loader4 versions now show that the ArrayBuilder is quite useful.
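
To make the bug Sergey points out concrete, here is a minimal sketch. It is written for current D (D2 with std.array.split) rather than the D1 Phobos used by the loaders, and the two-line input is made up for illustration. With "new string[n]" each ~= concatenates characters onto one string per column; with "new string[][n]" each ~= appends one field per column:

```d
import std.array : split;

void main() {
    // Hypothetical input: two rows of three whitespace-separated fields.
    string[] lines = ["a b c", "d e f"];

    auto flat   = new string[3];    // 3 strings: ~= concatenates characters
    auto nested = new string[][3];  // 3 arrays of strings: ~= appends a field

    foreach (line; lines)
        foreach (col, el; line.split()) {
            flat[col]   ~= el;  // the buggy layout: fields run together
            nested[col] ~= el;  // the intended layout: one entry per row
        }

    assert(flat == ["ad", "be", "cf"]);
    assert(nested == [["a", "d"], ["b", "e"], ["c", "f"]]);
}
```

With writeln, the flat result prints as one unbroken string per column, which is exactly the kind of output where quotes around each string make the mistake easier to spot.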

Version 4 has the same bug too. Here are the corrected versions:

// loader3
import std.stream;
import std.string: split;
void main() {
    auto fin = new BufferedFile("data2.txt");
    // The first line fixes the number of columns.
    string[] first = fin.readLine().split();
    auto cols = new string[][first.length]; // one array of strings per column
    foreach (col, el; first)
        cols[col] ~= el.dup;
    // Append every field of every remaining line to its column.
    foreach (string line; fin)
        foreach (col, el; line.split())
            cols[col] ~= el.dup;
}


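// loader4: same as loader3, but with the GC disabled during the main loop
import std.stream, std.gc;
import std.string: split;
void main() {
    auto fin = new BufferedFile("data2.txt");
    string[] first = fin.readLine().split();
    auto cols = new string[][first.length]; // one array of strings per column
    foreach (col, el; first)
        cols[col] ~= el.dup;
    disable(); // std.gc: no collections while the columns grow
    foreach (string line; fin)
        foreach (col, el; line.split())
            cols[col] ~= el.dup;
    enable(); // std.gc: turn collections back on
}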

And just to be sure, and to have a more realistic test, I have added a benchmark derived from loader10 that also includes the creation of the final arrays (that final step is very quick, less than 0.01 s):

// loader10b
import std.stream;
import std.string: split;
import d.string: xsplit;
import std.gc;
import d.builders: ArrayBuilder;
void main() {
    auto fin = new BufferedFile("data2.txt");
    string[] first = fin.readLine().split();
    auto cols = new ArrayBuilder!(string)[first.length];
    foreach (col, el; first)
        cols[col] ~= el.dup;
    disable();
    foreach (string line; fin)
        foreach (col, el; line.xsplit())
            cols[col] ~= el.dup;
    enable();
    string[][] truecols;
    foreach (col; cols)
        truecols ~= col.toarray;
}


Updated timings, data2.txt, warm timings, best of 3:
  loader1:   23.05 s
  loader2:    3.00 s
  loader3:   44.79 s
  loader4:   39.28 s
  loader5:   21.31 s
  loader6:    7.20 s
  loader7:    7.51 s
  loader8:    8.45 s
  loader9:    5.46 s
  loader10:   3.73 s
  loader10b:  3.88 s
  loader11:  82.54 s
  loader12:  38.87 s

Bye and thank you,
bearophile
