digitalmars.D.learn - Function to print a diamond shape

=?UTF-8?B?QWxpIMOHZWhyZWxp?= (16/16) Mar 20 2014 This is a somewhat common little exercise: Write a function that takes

Justin Whear (2/22) Mar 20 2014 What's the appropriate output for an even number?

=?UTF-8?B?QWxpIMOHZWhyZWxp?= (5/6) Mar 20 2014 Great question! :) Size must be odd. I have this in my function:

Chris Williams (6/8) Mar 20 2014 Well one of the more convoluted methods that I can think of would

=?UTF-8?B?QWxpIMOHZWhyZWxp?= (9/17) Mar 20 2014 A slightly convoluted solution that I've come up with considers the

Brad Anderson (16/33) Mar 20 2014 I'm not entirely happy with it but:

=?UTF-8?B?QWxpIMOHZWhyZWxp?= (18/33) Mar 20 2014 Ooh. I like that. That would have never occurred to me. :)

Brad Anderson (9/37) Mar 20 2014 It felt kind of clumsy when I ended up with it. I don't think it

monarch_dodra (10/12) Mar 20 2014 I'd be interested in seeing a solution using "iota", and the
Timon Gehr (5/21) Mar 20 2014 import std.stdio, std.range, std.algorithm, std.math;

=?UTF-8?B?QWxpIMOHZWhyZWxp?= (4/29) Mar 20 2014 Sweet! :)

bearophile (135/150) Mar 20 2014 Some of my solutions (using each() in the last two is easy):

Jay Norwood (20/23) Mar 20 2014 I like that replicate but easier for me to keep track of the

Jay Norwood (27/27) Mar 21 2014 This one calculates, then outputs subranges of the ba and sa char

=?UTF-8?B?QWxpIMOHZWhyZWxp?= (34/36) Mar 20 2014 I have learned a lot, especially the following two:
Sergei Nosov (9/26) Mar 21 2014 Probably, the most boring way is

Andrea Fontana (2/10) Mar 21 2014 A single foreach(i; 0..N*N) is more boring!
Vladimir Panteleev (2/32) Mar 21 2014

Sergei Nosov (3/38) Mar 21 2014 Beat me. Yours is even more boring. =)

Jay Norwood (12/12) Mar 22 2014 The computation times of different methods can differ a lot.

Jay Norwood (16/16) Mar 22 2014 I decided to redirect stdout to nul and print the stopwatch

=?UTF-8?B?QWxpIMOHZWhyZWxp?= (4/5) Mar 22 2014 Cool. stderr should work too:

Jay Norwood (6/6) Mar 23 2014 Hmmm, looks like stderr.writefln requires format specs, else it

Jay Norwood (64/64) Mar 23 2014 I converted the solution examples to functions, wrote a test to

Jay Norwood (13/13) Mar 23 2014 A problem with the previous brad measurement is that his solution

bearophile (6/12) Mar 23 2014 The task didn't ask for a computationally efficient solution :-)

Jay Norwood (5/10) Mar 23 2014 Yes, this is just for my own education. My builds are using the

Jay Norwood (30/30) Mar 23 2014 These were the times on ubuntu 64 bit dmd. I added diamondShape,
monarch_dodra (23/35) Mar 24 2014 So it's about speed now? Then I submit this:

Jay Norwood (12/12) Mar 24 2014 Very nice example. I'll test on ubuntu later.

=?UTF-8?B?Ikx1w61z?= Marques" (26/32) Mar 23 2014 I used this to benchmark H. S. Teoh's calendar formatter:

bearophile (32/33) Mar 24 2014 if you like similar puzzles, here is another:

Jay Norwood (40/40) Mar 24 2014 not through yet with the diamond. This one is a little faster.

Jay Norwood (37/37) Mar 24 2014 These were times on ubuntu. I may have printed debug build times
monarch_dodra (41/81) Mar 25 2014 Interesting. I'd have thought the "extra copy" would be an

Jay Norwood (2/42) Mar 25 2014 ok. I'll try it. I was happy the appender was pretty fast.

Jay Norwood (14/14) Mar 25 2014 These are times on ubuntu. printDiamond3 was slower than

monarch_dodra (9/11) Mar 25 2014 Hum... Too bad :/

Jay Norwood (7/9) Mar 25 2014 Yes, I'm pretty happy to see the appender works well. The

Jay Norwood (26/26) Mar 25 2014 This is a first attempt at using parallel, but no improvement in

Jay Norwood (2/3) Mar 25 2014 oops. scratch that one. I tested a pointer to the wrong function.

Jay Norwood (23/23) Mar 25 2014 This corrects the parallel example range in the second foreach.

Jay Norwood (62/64) Apr 20 2014 I installed ubuntu 14.04 64 bit, and measured some of these

monarch_dodra (32/55) Apr 21 2014 With this slightly tweaked solution, I can get times of roughly

Jay Norwood (49/56) Apr 21 2014 Yes your solution is the fastest yet. Also, its times are

Jay Norwood (52/52) Apr 22 2014 Wow, joiner is much slower than join. Such a small choice can

monarch_dodra (11/15) Apr 22 2014 Yeah, that's because join actually works on "RoR, R", rather than

Jay Norwood (7/18) Apr 23 2014 Ok, thanks. I re-tried joiner with both parameters being ranges,

monarch_dodra (5/23) Apr 22 2014 I'm not sure what you mean? "data" returns the managed array, but

bearophile (4/9) Mar 28 2014 It used to work, but with the latest changes I think I have

=?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:

This is a somewhat common little exercise: Write a function that takes 
the size of a diamond and produces a diamond of that size.

When printed, here is the output for size 11:

      *
     ***
    *****
   *******
  *********
***********
  *********
   *******
    *****
     ***
      *

What interesting, boring, efficient, slow, etc. ways are there?

Ali

Mar 20 2014

Justin Whear <justin economicmodeling.com> writes:

On Thu, 20 Mar 2014 14:25:02 -0700, Ali Çehreli wrote:

 This is a somewhat common little exercise: Write a function that takes
 the size of a diamond and produces a diamond of that size.
 
 When printed, here is the output for size 11:
 
       *
      ***
     *****
    *******
   *********
 ***********
   *********
    *******
     *****
      ***
       *
 
 What interesting, boring, efficient, slow, etc. ways are there?
 
 Ali


What's the appropriate output for an even number?

Mar 20 2014

=?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:

On 03/20/2014 02:30 PM, Justin Whear wrote:

 What's the appropriate output for an even number?

Great question! :) Size must be odd. I have this in my function:

     enforce(size % 2,
             format("Size cannot be an even number. (%s)", size));

Ali

Mar 20 2014

"Chris Williams" <yoreanon-chrisw yahoo.co.jp> writes:

On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
 What interesting, boring, efficient, slow, etc. ways are there?

 Ali

Well one of the more convoluted methods that I can think of would 
be to define a square as a set of four vectors, rotate 45 
degrees, and then create a rasterizer that checks for the 
presence of the rect at sequential points, and plots those to the 
console.

Mar 20 2014

=?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:

On 03/20/2014 02:52 PM, Chris Williams wrote:
 On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
 What interesting, boring, efficient, slow, etc. ways are there?

 Ali

 Well one of the more convoluted methods that I can think of would be to
 define a square as a set of four vectors, rotate 45 degrees, and then
 create a rasterizer that checks for the presence of the rect at
 sequential points, and plots those to the console.

A slightly convoluted solution that I've come up with considers the 
diamond as three pieces:

1) Top triangle

2) The widest line

3) The bottom triangle, which happens to be the .retro of the first part

     auto bottomHalf = topHalf.retro;

     auto diamond = chain(topHalf, widestLine, bottomHalf).joiner("\n");

Ali

Mar 20 2014

"Brad Anderson" <eco gnuk.net> writes:

On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
 This is a somewhat common little exercise: Write a function 
 that takes the size of a diamond and produces a diamond of that 
 size.

 When printed, here is the output for size 11:

      *
     ***
    *****
   *******
  *********
 ***********
  *********
   *******
    *****
     ***
      *

 What interesting, boring, efficient, slow, etc. ways are there?

 Ali

I'm not entirely happy with it but:

   void main()
   {
     import std.algorithm, std.range, std.stdio, std.conv;

     enum length = 5;
     auto rng =
        chain(iota(length), iota(length, -1, -1))
       .map!((a => " ".repeat(length-a)),

       .map!(a => chain(a[0].joiner, a[1].joiner, "\n"))
       .joiner;

     writeln(rng);
   }

Had some trouble with the result coming out as integers instead 
of something string-like.

Mar 20 2014

=?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:

On 03/20/2014 03:03 PM, Brad Anderson wrote:

 I'm not entirely happy with it but:

I am not happy with my attempt either. :)

    void main()
    {
      import std.algorithm, std.range, std.stdio, std.conv;

      enum length = 5;
      auto rng =
         chain(iota(length), iota(length, -1, -1))

Ooh. I like that. That would have never occurred to me. :)

        .map!((a => " ".repeat(length-a)),

        .map!(a => chain(a[0].joiner, a[1].joiner, "\n"))
        .joiner;

      writeln(rng);
    }

Does that compile for you? Failed for me with v2.066-devel-d0f461a:

./phobos/std/typetuple.d(550): Error: template instance F!(__lambda1) 
cannot use local '__lambda1' as parameter to non-global template 
AppliedReturnType(alias f)
./phobos/std/typetuple.d(556): Error: template instance 
deneme.main.staticMap!(AppliedReturnType, __lambda1) error instantiating
./phobos/std/algorithm.d(404):        instantiated from here: 
staticMap!(AppliedReturnType, __lambda1, __lambda2)
deneme.d(161788):        instantiated from here: map!(Result)

That is pointing at this line:

         .map!((a => " ".repeat(length-a)),

A regression?

 Had some trouble with the result coming out as integers instead of
 something string-like.

I had the same problem at one point. I will try to understand when that 
happens.

Ali

Mar 20 2014

"Brad Anderson" <eco gnuk.net> writes:

On Thursday, 20 March 2014 at 22:46:53 UTC, Ali Çehreli wrote:
 On 03/20/2014 03:03 PM, Brad Anderson wrote:

 I'm not entirely happy with it but:

 I am not happy with my attempt either. :)

    void main()
    {
      import std.algorithm, std.range, std.stdio, std.conv;

      enum length = 5;
      auto rng =
         chain(iota(length), iota(length, -1, -1))

 Ooh. I like that. That would have never occurred to me. :)

It felt kind of clumsy when I ended up with it. I don't think it 
shows my intent very well (repeat the range in reverse). I wish 
Phobos had something like a mirror() range (i.e. chain(rng, 
rng.retro())).

        .map!((a => " ".repeat(length-a)),

        .map!(a => chain(a[0].joiner, a[1].joiner, "\n"))
        .joiner;

      writeln(rng);
    }

 Does that compile for you? Failed for me with 
 v2.066-devel-d0f461a:
 [snip]
 A regression?


I did it on dpaste which is using 2.065 so I suspect regression.

http://dpaste.dzfl.pl/71c331960cb0


 Had some trouble with the result coming out as integers

 instead of
 something string-like.

 I had the same problem at one point. I will try to understand 
 when that happens.

 Ali

I was getting the integers when I was using character literals 
with repeat() rather than string literals.

Mar 20 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
 What interesting, boring, efficient, slow, etc. ways are there?

 Ali

I'd be interested in seeing a solution using "iota", and the 
currently proposed "each" or "tee". A quick protype to draw a 
triangle would be:

iota(0, n).each!(a=>q{%(*%)}.writefln(a.iota()))();
or
iota(0, n).each!(a=>'*'.repeat(a).writeln())();

Adapting that to do a diamond should be straight forward? It 
would be a good benchmark of functional vs imperative code, and 
the usability of "each".

Mar 20 2014

Timon Gehr <timon.gehr gmx.ch> writes:

On 03/20/2014 10:25 PM, Ali Çehreli wrote:
 This is a somewhat common little exercise: Write a function that takes
 the size of a diamond and produces a diamond of that size.

 When printed, here is the output for size 11:

       *
      ***
     *****
    *******
   *********
  ***********
   *********
    *******
     *****
      ***
       *

 What interesting, boring, efficient, slow, etc. ways are there?

 Ali

import std.stdio, std.range, std.algorithm, std.math;

enum s=11;
writef("%(%s\n%)", (i=>i.map!(a=>i.map!(b=>"* "[a+b>s/2])))
                    (iota(-s/2,s/2+1).map!abs));

Mar 20 2014

=?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:

On 03/20/2014 03:48 PM, Timon Gehr wrote:
 On 03/20/2014 10:25 PM, Ali Çehreli wrote:
 This is a somewhat common little exercise: Write a function that takes
 the size of a diamond and produces a diamond of that size.

 When printed, here is the output for size 11:

       *
      ***
     *****
    *******
   *********
  ***********
   *********
    *******
     *****
      ***
       *

 What interesting, boring, efficient, slow, etc. ways are there?

 Ali

 import std.stdio, std.range, std.algorithm, std.math;

 enum s=11;
 writef("%(%s\n%)", (i=>i.map!(a=>i.map!(b=>"* "[a+b>s/2])))
                     (iota(-s/2,s/2+1).map!abs));

Sweet! :)

     "* "[a+b>s/2]    // loving it

Ali

Mar 20 2014

"bearophile" <bearophileHUGS lycos.com> writes:

Ali Çehreli:

 This is a somewhat common little exercise: Write a function 
 that takes the size of a diamond and produces a diamond of that 
 size.

 When printed, here is the output for size 11:

      *
     ***
    *****
   *******
  *********
 ***********
  *********
   *******
    *****
     ***
      *

Some of my solutions (using each() in the last two is easy):

import std.stdio, std.array, std.string, std.range, 
std.algorithm, std.math;

void printDiamond1(in uint n) {
     immutable k = (n % 2 == 1) ? 1 : 2;

     for (int i = k; i <= n; i += 2)
         writeln("*".replicate(i).center(n));

     for (int i = n - 2; i >= k; i -= 2)
         writeln("*".replicate(i).center(n));
}

void printDiamond2(in int n) {
     iota(!(n % 2), n)
     .map!(i => "*"
                .replicate((n % 2) + ((n / 2) - abs(i - (n / 2))) 
* 2)
                .center(n))
     .join("\n")
     .writeln;
}

void printDiamond3(in int n) {
     writefln("%-(%s\n%)",
              iota(!(n % 2), n)
              .map!(i => "*"
                         .replicate((n % 2) + ((n / 2) - abs(i - 
(n / 2))) * 2)
                         .center(n)));
}

void main() {
     foreach (immutable i; 0 .. 15) {
         printDiamond3(i);
         writeln;
     }
}


Output:


*

**

  *
***
  *

  **
****
  **

   *
  ***
*****
  ***
   *

   **
  ****
******
  ****
   **

    *
   ***
  *****
*******
  *****
   ***
    *

    **
   ****
  ******
********
  ******
   ****
    **

     *
    ***
   *****
  *******
*********
  *******
   *****
    ***
     *

     **
    ****
   ******
  ********
**********
  ********
   ******
    ****
     **

      *
     ***
    *****
   *******
  *********
***********
  *********
   *******
    *****
     ***
      *

      **
     ****
    ******
   ********
  **********
************
  **********
   ********
    ******
     ****
      **

       *
      ***
     *****
    *******
   *********
  ***********
*************
  ***********
   *********
    *******
     *****
      ***
       *

       **
      ****
     ******
    ********
   **********
  ************
**************
  ************
   **********
    ********
     ******
      ****
       **


Bye,
bearophile

Mar 20 2014

"Jay Norwood" <jayn prismnet.com> writes:

On Friday, 21 March 2014 at 00:31:58 UTC, bearophile wrote:
 This is a somewhat common little exercise: Write a function

 Bye,
 bearophile

I like that replicate but easier for me to keep track of the 
counts if I work from the center.

int blanks[];
blanks.length = n;
int stars[];
stars.length = n;

int c = n/2; // center of diamond
int cp1 = c+1;
blanks[c]=0;
stars[c]=n;

// calculate stars and blanks in each row
for(int i=1; i<cp1; i++){
     blanks[c-i] = blanks[c+i] = i;
     stars[c-i] = stars[c+i] = n - (i*2);
}

for (int i=0; i<n; i++){
     write(" ".replicate(blanks[i]));
     writeln("*".replicate(stars[i]));
}

Mar 20 2014

"Jay Norwood" <jayn prismnet.com> writes:

This one calculates, then outputs subranges of the ba and sa char 
arrays.

int n = 11;
int blanks[];
blanks.length = n;
int stars[];
stars.length = n;
char ba[];
ba.length = n;
ba[] = ' '; // fill full ba array
char sa[];
sa.length = n;
sa[] = '*'; // fill full sa array

int c = n/2; // center of diamond
int cp1 = c+1;
blanks[c]=0;
stars[c]=n;

// calculate stars and blanks in each row
for(int i=1; i<cp1; i++){
     blanks[c-i] = blanks[c+i] = i;
     stars[c-i] = stars[c+i] = n - (i*2);
}

// output subranges of the ba and sa char arrays
for (int i=0; i<n; i++){
     write(ba[$-blanks[i]..$]);
     writeln(sa[$-stars[i]..$]);
}

Mar 21 2014

=?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:

On 03/20/2014 02:25 PM, Ali Çehreli wrote:

 Write a function that takes
 the size of a diamond and produces a diamond of that size.

I have learned a lot, especially the following two:

1) chain'ing iotas is an effective way of producing non-monotonic number 
intervals (and more).

2) There is std.string.center. :)

Also considering readability, here is my favorite so far:

auto diamondShape(size_t N, dchar fillChar = '*')
{
     import std.range : chain, iota, repeat;
     import std.algorithm : map;
     import std.conv : text;
     import std.string : center, format;
     import std.exception : enforce;

     enforce(N % 2, format("Size must be an odd number. (%s)", N));

     return
         chain(iota(1, N, 2),
               iota(N, 0, -2))
         .map!(i => fillChar.repeat(i))
         .map!(s => s.text)
         .map!(s => s.center(N));
}

unittest
{
     import std.exception : assertThrown;
     import std.algorithm : equal;

     assertThrown(diamondShape(4));
     assert(diamondShape(3, 'o').equal([ " o ", "ooo", " o " ]));
}

void main()
{
     import std.stdio : writefln;

     writefln("%-(%s\n%)", diamondShape(11));
}

Ali

Mar 20 2014

"Sergei Nosov" <sergei.nosov gmail.com> writes:

On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
 This is a somewhat common little exercise: Write a function 
 that takes the size of a diamond and produces a diamond of that 
 size.

 When printed, here is the output for size 11:

      *
     ***
    *****
   *******
  *********
 ***********
  *********
   *******
    *****
     ***
      *

 What interesting, boring, efficient, slow, etc. ways are there?

 Ali

Probably, the most boring way is

foreach(i; 0..N)
{
     foreach(j; 0..N)
         write(" *"[i + j >= N/2 && i + j < 3*N/2 && i - j <= N/2 
&& j - i <= N/2]);
     writeln;
}

Mar 21 2014

"Andrea Fontana" <nospam example.com> writes:

On Friday, 21 March 2014 at 12:32:58 UTC, Sergei Nosov wrote:
 Probably, the most boring way is

 foreach(i; 0..N)
 {
     foreach(j; 0..N)
         write(" *"[i + j >= N/2 && i + j < 3*N/2 && i - j <= 
 N/2 && j - i <= N/2]);
     writeln;
 }

A single foreach(i; 0..N*N) is more boring!

Mar 21 2014

"Vladimir Panteleev" <vladimir thecybershadow.net> writes:

On Friday, 21 March 2014 at 12:32:58 UTC, Sergei Nosov wrote:
 On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
 This is a somewhat common little exercise: Write a function 
 that takes the size of a diamond and produces a diamond of 
 that size.

 When printed, here is the output for size 11:

     *
    ***
   *****
  *******
 *********
 ***********
 *********
  *******
   *****
    ***
     *

 What interesting, boring, efficient, slow, etc. ways are there?

 Ali

 Probably, the most boring way is

 foreach(i; 0..N)
 {
     foreach(j; 0..N)
         write(" *"[i + j >= N/2 && i + j < 3*N/2 && i - j <= 
 N/2 && j - i <= N/2]);

write(" *"[abs(i-N/2) + abs(j-N/2) <= N/2]);

     writeln;
 }

Mar 21 2014

"Sergei Nosov" <sergei.nosov gmail.com> writes:

On Friday, 21 March 2014 at 13:59:27 UTC, Vladimir Panteleev 
wrote:
 On Friday, 21 March 2014 at 12:32:58 UTC, Sergei Nosov wrote:
 On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
 This is a somewhat common little exercise: Write a function 
 that takes the size of a diamond and produces a diamond of 
 that size.

 When printed, here is the output for size 11:

    *
   ***
  *****
 *******
 *********
 ***********
 *********
 *******
  *****
   ***
    *

 What interesting, boring, efficient, slow, etc. ways are 
 there?

 Ali

 Probably, the most boring way is

 foreach(i; 0..N)
 {
    foreach(j; 0..N)
        write(" *"[i + j >= N/2 && i + j < 3*N/2 && i - j <= 
 N/2 && j - i <= N/2]);

 write(" *"[abs(i-N/2) + abs(j-N/2) <= N/2]);

    writeln;
 }


Beat me. Yours is even more boring. =)

Mar 21 2014

"Jay Norwood" <jayn prismnet.com> writes:

The computation times of different methods can differ a lot.   
How do you suggest to measure this effectively without the 
overhead of the write and writeln output?   Would a count of 
100001 and stubs like below be reasonable, or would there be 
something else that would  prevent the optimizer from getting too 
aggressive?

void writelnx(T...)(T args)
{
}
void writex(T...)(T args)
{
}

Mar 22 2014

"Jay Norwood" <jayn prismnet.com> writes:

I decided to redirect stdout to nul and print the stopwatch 
messages to stderr.
So, basically like this.

import std.stdio;
import std.datetime;
import std.cstream;

StopWatch sw;
sw.start();

measured code

sw.stop();
derr.writefln("time: ", sw.peek().msecs, "[ms]");

Then, windows results comparing two versions, this for n=2001, 
shows one form is about 3x faster when you redirect stdout to nul.

D:\diamond\diamond\diamond\Release>diamond 1>nul
time: 15[ms]
time: 42[ms]

Mar 22 2014

=?UTF-8?B?QWxpIMOHZWhyZWxp?= <acehreli yahoo.com> writes:

On 03/22/2014 06:03 PM, Jay Norwood wrote:

 derr.writefln("time: ", sw.peek().msecs, "[ms]");

Cool. stderr should work too:

     stderr.writefln(/* ... */);

Ali

Mar 22 2014

"Jay Norwood" <jayn prismnet.com> writes:

Hmmm, looks like stderr.writefln requires format specs, else it 
omits the additional parameters. (not so on derr.writefln)

stderr.writefln("time: %s%s",sw.peek().msecs, "[ms]");

D:\diamond\diamond\diamond\Release>diamond 1>nul
time: 16[ms]
time: 44[ms]

Mar 23 2014

"Jay Norwood" <jayn prismnet.com> writes:

I converted the solution examples to functions, wrote a test to 
measure each 100 times with a diamond of size 1001.  These are 
release build times.  timon's crashed so I took it out.  Maybe I 
made a mistake copying ... have to go back and look.


D:\diamond\diamond\diamond\Release>diamond 1>nul
brad: time: 78128[ms]
printDiamond1: time: 1166[ms]
printDiamond2: time: 1659[ms]
printDiamond3: time: 631[ms]
jay1: time: 466[ms]
sergei: time: 11944[ms]
jay2: time: 414[ms]


These are the the measurement functions


void measure( void function(in int a) func, int times, int 
diamondsz, string name ){
   StopWatch sw;
   sw.start();
   for (int i=0; i<times; i++){
     func(diamondsz);
   }
   sw.stop;
   stderr.writeln(name, ": time: ", sw.peek().msecs, "[ms]");
}

void measureu( void function(in uint a) func, int times, uint 
diamondsz, string name ){
   StopWatch sw;
   sw.start();
   for (int i=0; i<times; i++){
     func(diamondsz);
   }
   sw.stop;
   stderr.writeln(name, ": time: ", sw.peek().msecs, "[ms]");
}

int main(string[] argv)
{
	int times = 100;
	int dsz = 1001;
	uint dszu = 1001;
	measure (&brad,times,dsz,"brad");
	//measure (&timon,times,dsz,"timon");
	measureu (&printDiamond1,times,dszu,"printDiamond1");
	measure (&printDiamond2,times,dsz,"printDiamond2");
	measure (&printDiamond3,times,dsz,"printDiamond3");
	measure (&jay1,times,dsz,"jay1");
	measure (&sergei,times,dsz,"sergei");
	measure (&jay2,times,dsz,"jay2");

	return 0;

}

All the functions are like this:
void brad(in int length){
   import std.algorithm, std.range, std.stdio, std.conv;

   auto rng =
     chain(iota(length), iota(length, -1, -1))
     .map!((a => " ".repeat(length-a)),

     .map!(a => chain(a[0].joiner, a[1].joiner, "\n"))
     .joiner;

   writeln(rng);
}

void timon(in int s){
   import std.stdio, std.range, std.algorithm, std.math;

   writef("%(%s\n%)", (i=>i.map!(a=>i.map!(b=>"* "[a+b>s/2])))
     (iota(-s/2,s/2+1).map!abs));
}

Mar 23 2014

"Jay Norwood" <jayn prismnet.com> writes:

A problem with the previous brad measurement is that his solution 
creates a diamond of size 2n+1 for an input of n.  Correcting the 
size input for brad's function call, and re-running, I get this.  
So the various solutions can have overhead computation time of 
40x difference, depending on the implementation.

D:\diamond\diamond\diamond\Release>diamond 1>nul
brad: time: 19554[ms]
printDiamond1: time: 1154[ms]
printDiamond2: time: 1637[ms]
printDiamond3: time: 622[ms]
jay1: time: 475[ms]
sergei: time: 11939[ms]
jay2: time: 413[ms]

Mar 23 2014

"bearophile" <bearophileHUGS lycos.com> writes:

Jay Norwood:

 A problem with the previous brad measurement is that his 
 solution creates a diamond of size 2n+1 for an input of n.  
 Correcting the size input for brad's function call, and 
 re-running, I get this.  So the various solutions can have 
 overhead computation time of 40x difference, depending on the 
 implementation.

The task didn't ask for a computationally efficient solution :-) 
So you are measuring something that was not optimized for. So 
there's lot of variance.

Bye,
bearophile

Mar 23 2014

"Jay Norwood" <jayn prismnet.com> writes:

On Sunday, 23 March 2014 at 17:30:20 UTC, bearophile wrote:

 The task didn't ask for a computationally efficient solution 
 :-) So you are measuring something that was not optimized for. 
 So there's lot of variance.

 Bye,
 bearophile

Yes, this is just for my own education.   My builds are using the 
dmd compiler on windows, and some  posts indicate I should expect 
better optimization currently with the ldc compiler... so maybe 
I'll get on a linux box and retest with ldc.

Mar 23 2014

"Jay Norwood" <jayn prismnet.com> writes:

These were the times on ubuntu 64 bit dmd.  I added diamondShape, 
which is slightly modified to be consistent with the others .. 
just removing the second parameter and doing the writeln calls 
within the function, as the others have been done.  This is still 
with dmd.  I've downloaded ldc.

Also,  I posted the test code on dpaste.com/hold/1753517


brad: time: 20837[ms]
printDiamond1: time: 482[ms]
printDiamond2: time: 944[ms]
printDiamond3: time: 490[ms]
jay1: time: 62[ms]
sergei: time: 4154[ms]
jay2: time: 30[ms]
diamondShape: time: 3384[ms]

void diamondShape(in int N)
{
     import std.range : chain, iota, repeat;
     import std.algorithm : map;
     import std.conv : text;
     import std.string : center, format;
     import std.exception : enforce;
     dchar fillChar = '*';
     enforce(N % 2, format("Size must be an odd number. (%s)", N));

     foreach(ln;
			chain(iota(1, N, 2),
				  iota(N, 0, -2))
			.map!(i => fillChar.repeat(i))
			.map!(s => s.text)
			.map!(s => s.center(N))) writeln(ln);
}

Mar 23 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Sunday, 23 March 2014 at 18:28:18 UTC, Jay Norwood wrote:
 On Sunday, 23 March 2014 at 17:30:20 UTC, bearophile wrote:

 The task didn't ask for a computationally efficient solution 
 :-) So you are measuring something that was not optimized for. 
 So there's lot of variance.

 Bye,
 bearophile

 Yes, this is just for my own education.   My builds are using 
 the dmd compiler on windows, and some  posts indicate I should 
 expect better optimization currently with the ldc compiler... 
 so maybe I'll get on a linux box and retest with ldc.

So it's about speed now? Then I submit this:

//----
void printDiamond(size_t N)
{
     char[32] rawSpace = void;
     char[64] rawStars = void;
     char* pSpace = rawSpace.ptr;
     char* pStars = rawStars.ptr;
     if (N > 64)
     {
         pSpace = new char[](N/2).ptr;
         pStars = new char[](N).ptr;
     }
     pSpace[0 .. N/2] = ' ';
     pStars[0 ..   N] = '*';

     N/=2;
     foreach         (n ; 0 .. N + 1)
         writeln(pSpace[0 .. N - n], pStars[0 .. 2*n+1]);
     foreach_reverse (n ; 0 .. N)
         writeln(pSpace[0 .. N - n], pStars[0 .. 2*n+1]);
}
//----

Mar 24 2014

"Jay Norwood" <jayn prismnet.com> writes:

Very nice example.   I'll test on ubuntu later.

On windows ...

D:\diamond\diamond\diamond\Release>diamond 1> nul
brad: time: 19544[ms]
printDiamond1: time: 1139[ms]
printDiamond2: time: 1656[ms]
printDiamond3: time: 663[ms]
jay1: time: 455[ms]
sergei: time: 11673[ms]
jay2: time: 411[ms]
diamondShape: time: 4399[ms]
printDiamond: time: 185[ms]

Mar 24 2014

=?UTF-8?B?Ikx1w61z?= Marques" <luis luismarques.eu> writes:

On Saturday, 22 March 2014 at 14:41:48 UTC, Jay Norwood wrote:
 The computation times of different methods can differ a lot.   
 How do you suggest to measure this effectively without the 
 overhead of the write and writeln output?   Would a count of 
 100001 and stubs like below be reasonable, or would there be 
 something else that would  prevent the optimizer from getting 
 too aggressive?

I used this to benchmark H. S. Teoh's calendar formatter:

version(benchmark)
{
     int main(string[] args)
     {
         enum MonthsPerRow = 3;
         auto t = benchmark!(function() {
                 foreach(formattedYear; iota(1800, 2000).map!(year 
=> formatYear(year, MonthsPerRow)))
                 {
                     foreach(_; formattedYear){};
                 }
             })(30);
         writeln(t[0].msecs * 0.001);
         return 0;
     }
}

While the optimizer could probably remove all of that, it 
doesn't. I also tested it against other options like walkLength, 
this ended up begin the better choice.

(BTW, using joiner instead of join I was able to more than double 
the performance: 
https://github.com/luismarques/dcal/tree/benchmark . Once the 
pipeline is made lazy end to end that will probably have even 
more impact.)

Mar 23 2014

"bearophile" <bearophile HUGS lycos.com> writes:

On Thursday, 20 March 2014 at 21:25:03 UTC, Ali Çehreli wrote:
 This is a somewhat common little exercise:

if you like similar puzzles, here is another:

Write a program that expects a 10-by-10 matrix from standard 
input. The program should compute sum of each row and each column 
and print the highest of these numbers to standard output.

     An example input:

     01 34 46 31 55 21 16 88 87 87
     32 40 82 40 43 96 08 82 41 86
     30 16 24 18 04 54 65 96 38 48
     32 00 99 90 24 75 89 41 04 01
     11 80 31 83 08 93 37 96 27 64
     09 81 28 41 48 23 68 55 86 72
     64 61 14 55 33 39 40 18 57 59
     49 34 50 81 85 12 22 54 80 76
     18 45 50 26 81 95 25 14 46 75
     22 52 37 50 37 40 16 71 52 17

     Expected output:

     615

The purpose is to write a "golfing" program, that is the shortest.

My current D solution is about 170 bytes (UNIX newlines):

void main(){
import std.stdio,std.range,std.algorithm,std.conv;
auto m=10.iota.map!(_=>readln.split.to!(int[]));
m.map!sum.chain(m.transposed.map!sum).reduce!max.write;
}


I am now trying to use std.file.slurp, but its documentation is 
insufficient.

A cryptic Python solution (not mine), 73 characters:

m=[map(int,_().split())for _ in[raw_input]*10]
_(max(map(sum,m+zip(*m))))

Bye,
bearophile

Mar 24 2014

"Jay Norwood" <jayn prismnet.com> writes:

not through yet with the diamond.  This one is a little faster.  
Appending the newline to the stars and calculating the slice 
backward from the end would save a w.put for the newlines ... 
probably faster.  I keep looking for a way to create a dynamic 
array of a specific size, filled with the init value I provide. 
Does it exist?

D:\diamond\diamond\diamond\Release>diamond 1>nul
brad: time: 19370[ms]
printDiamond1: time: 1140[ms]
printDiamond2: time: 1631[ms]
printDiamond3: time: 633[ms]
jay1: time: 459[ms]
sergei: time: 11886[ms]
jay2: time: 415[ms]
diamondShape: time: 4553[ms]
printDiamond: time: 187[ms]
printDiamonde2a: time: 139[ms]


void printDiamonde2a(in uint N)
{
     size_t N2 = N/2;
     char pSpace[] = uninitializedArray!(char[])(N2);
     pSpace[] = ' ';

     char pStars[] = uninitializedArray!(char[])(N);
     pStars[] = '*';

     char pNewLine[]=uninitializedArray!(char[])(2);
     pNewLine[] = '\n';

     auto w = appender!(char[])();
     w.reserve(N*4);

     foreach (n ; 0 .. N2 + 1){
         w.put(pSpace[0 .. N2 - n]);
         w.put(pStars[0 .. 2*n+1]);
         w.put(pNewLine[1]);
     }

     foreach_reverse (n ; 0 .. N2){
         w.put(pSpace[0 .. N2 - n]);
         w.put(pStars[0 .. 2*n+1]);
         w.put(pNewLine[1]);
     }
     write(w.data);
}

Mar 24 2014

"Jay Norwood" <jayn prismnet.com> writes:

These were times on ubuntu. I may have printed debug build times 
previously, but these are dmd release build.  I gave up trying to 
figure out how to build ldc on ubuntu.  The dmd one click 
installer is much appreciated.

brad: time: 12425[ms]
printDiamond1: time: 380[ms]
printDiamond2: time: 728[ms]
printDiamond3: time: 378[ms]
jay1: time: 62[ms]
sergei: time: 3965[ms]
jay2: time: 27[ms]
diamondShape: time: 2778[ms]
printDiamond: time: 19[ms]
printDiamonde: time: 19[ms]
printDiamonde2b: time: 16[ms]


This was using the appended newlines to get rid of the extra wput 
in the loops.

void printDiamonde2b(in uint N)
{
     uint N2 = N/2;
     char pSpace[] = uninitializedArray!(char[])(N2);
     pSpace[] = ' ';

     char pStars[] = uninitializedArray!(char[])(N+1);
     pStars[] = '*';

     pStars[$-1] = '\n';

     auto w = appender!(char[])();
     w.reserve(N*3);

     foreach (n ; 0 .. N2 + 1){
         w.put(pSpace[0 .. N2 - n]);
	w.put(pStars[$-2*n-2 .. $]);
	}

     foreach_reverse (n ; 0 .. N2){
         w.put(pSpace[0 .. N2 - n]);
	w.put(pStars[$-2*n-2 .. $]);
	}

     write(w.data);
}

Mar 24 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 25 March 2014 at 02:25:57 UTC, Jay Norwood wrote:
 not through yet with the diamond.  This one is a little faster.
  Appending the newline to the stars and calculating the slice 
 backward from the end would save a w.put for the newlines ... 
 probably faster.  I keep looking for a way to create a dynamic 
 array of a specific size, filled with the init value I provide. 
 Does it exist?

 D:\diamond\diamond\diamond\Release>diamond 1>nul
 brad: time: 19370[ms]
 printDiamond1: time: 1140[ms]
 printDiamond2: time: 1631[ms]
 printDiamond3: time: 633[ms]
 jay1: time: 459[ms]
 sergei: time: 11886[ms]
 jay2: time: 415[ms]
 diamondShape: time: 4553[ms]
 printDiamond: time: 187[ms]
 printDiamonde2a: time: 139[ms]


 void printDiamonde2a(in uint N)
 {
     size_t N2 = N/2;
     char pSpace[] = uninitializedArray!(char[])(N2);
     pSpace[] = ' ';

     char pStars[] = uninitializedArray!(char[])(N);
     pStars[] = '*';

     char pNewLine[]=uninitializedArray!(char[])(2);
     pNewLine[] = '\n';

     auto w = appender!(char[])();
     w.reserve(N*4);

     foreach (n ; 0 .. N2 + 1){
         w.put(pSpace[0 .. N2 - n]);
         w.put(pStars[0 .. 2*n+1]);
         w.put(pNewLine[1]);
     }

     foreach_reverse (n ; 0 .. N2){
         w.put(pSpace[0 .. N2 - n]);
         w.put(pStars[0 .. 2*n+1]);
         w.put(pNewLine[1]);
     }
     write(w.data);
 }

Interesting. I'd have thought the "extra copy" would be an 
overall slowdown, but I guess that's not the case.

I also tried your strategy of adding '\n' to the buffer, but I 
was getting some bad output on windows. I'm not sure why "\n\n" 
works though. On *nix, I'd have also expected a double line feed. 
Did you check the actual output?

Appender is better than "~=", but it's not actually that good 
either. Try this:

//----
void printDiamond3(size_t N)
{
     import core.memory;
     char* p = cast(char*)GC.malloc(N*N+16);
     p[0..N*N+16]='*';

     auto pp = p;
     N/=2;
     enum code = q{
         pp[0 .. N - n] = ' ';
         pp+=(1+N+n);
         version(Windows)
         {
             pp[0 .. 2] = "\r\n";
             pp+=2;
         }
         else
         {
             pp[0] = '\n';
             ++pp;
         }
     };
     foreach        (n; 0 .. N + 1) {mixin(code);}
     foreach_reverse(n; 0 .. N    ) {mixin(code);}
     write(p[0 .. pp-p]);
}
//----

This makes just 1 allocation of roughly the right size. It also 
eagerly fills the entire array with '*', since I *figure* that's 
faster than a lot of different writes.

I could be mistaken about that though, but I imagine the 
pre-allocation and not using Appender is definitely a boost.

Mar 25 2014

"Jay Norwood" <jayn prismnet.com> writes:

 Interesting. I'd have thought the "extra copy" would be an 
 overall slowdown, but I guess that's not the case.


 I also tried your strategy of adding '\n' to the buffer, but I 
 was getting some bad output on windows. I'm not sure why "\n\n" 
 works though. On *nix, I'd have also expected a double line 
 feed. Did you check the actual output?

I checked the output.  The range selected is for one newline.
 Appender is better than "~=", but it's not actually that good 
 either. Try this:

 //----
 void printDiamond3(size_t N)
 {
     import core.memory;
     char* p = cast(char*)GC.malloc(N*N+16);
     p[0..N*N+16]='*';

     auto pp = p;
     N/=2;
     enum code = q{
         pp[0 .. N - n] = ' ';
         pp+=(1+N+n);
         version(Windows)
         {
             pp[0 .. 2] = "\r\n";
             pp+=2;
         }
         else
         {
             pp[0] = '\n';
             ++pp;
         }
     };
     foreach        (n; 0 .. N + 1) {mixin(code);}
     foreach_reverse(n; 0 .. N    ) {mixin(code);}
     write(p[0 .. pp-p]);
 }
 //----

 This makes just 1 allocation of roughly the right size. It also 
 eagerly fills the entire array with '*', since I *figure* 
 that's faster than a lot of different writes.

 I could be mistaken about that though, but I imagine the 
 pre-allocation and not using Appender is definitely a boost.

ok. I'll try it.  I was happy the appender was pretty fast.

Mar 25 2014

"Jay Norwood" <jayn prismnet.com> writes:

These are times on ubuntu. printDiamond3 was slower than 
printDiamond.


brad: time: 12387[ms]
printDiamond1: time: 373[ms]
printDiamond2: time: 722[ms]
printDiamond3: time: 384[ms]
jay1: time: 62[ms]
sergei: time: 3918[ms]
jay2: time: 28[ms]
diamondShape: time: 2725[ms]
printDiamond: time: 19[ms]
printDiamonde2a: time: 18[ms]
printDiamonde2b: time: 14[ms]
printDiamond3: time: 26[ms]

Mar 25 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 25 March 2014 at 12:30:37 UTC, Jay Norwood wrote:
 These are times on ubuntu. printDiamond3 was slower than 
 printDiamond.

Hum... Too bad :/

I was able to improve my first "printDiamon" by having a single 
slice that contains spaces then stars, and make writeln's of that.

It gave (on my windows) speeds comparable to your printDiamond3. 
But not any speed differences that warrants posting new code.

Thanks for the benches. This was fun :)

I love how D can achieve *great* performance, while still looking 
readable and maintainable.

Mar 25 2014

"Jay Norwood" <jayn prismnet.com> writes:

On Tuesday, 25 March 2014 at 15:31:12 UTC, monarch_dodra wrote:
 I love how D can achieve *great* performance, while still 
 looking readable and maintainable.

Yes,  I'm pretty happy to see the appender works well.  The 
parallel library also seems to work very well in my few 
experiences with it.

Maybe it would be useful to see how to use the parallel api to 
implement this, and if it can make a scalable impact on the 
execution time.

Mar 25 2014

"Jay Norwood" <jayn prismnet.com> writes:

This is a first attempt at using parallel, but no improvement in 
speed on a corei7.  It is about 3x slower than the prior 
versions.  Probably the join was not a good idea.  Also, no 
foreach_reverse for the parallel, so it requires extra 
calculations for the reverse index.


void printDiamonde2cpa(in uint N)
{
     size_t N2 = N/2;
     char p[] = uninitializedArray!(char[])(N2+N);
     p[0..N2] = ' ';
     p[N2..$] = '*';
     char nl[] = uninitializedArray!(char[])(1);
     nl[] = '\n';

     char[][] wc = minimallyInitializedArray!(char[][])(N);

     auto w = appender!(char[])();

     foreach(n, ref elem; taskPool.parallel(wc[0..N2+1],100)){
         elem = p[n .. N2+2*n+1];
     }

     foreach (rn, ref elem ; taskPool.parallel(wc[0..N2],100)){
         int n = N2 - rn - 1;
         elem = p[n .. N2+2*n+1];
     }
     auto wj = join(wc,nl);
     w.put(wj);

     writeln(w.data);
}

Mar 25 2014

"Jay Norwood" <jayn prismnet.com> writes:

On Wednesday, 26 March 2014 at 04:47:48 UTC, Jay Norwood wrote:
 This is a first attempt at using parallel, but no improvement

oops.  scratch that one. I tested a pointer to the wrong function.

Mar 25 2014

"Jay Norwood" <jayn prismnet.com> writes:

This corrects the parallel example range in the second foreach.  
Still slow.

void printDiamonde2cpa(in uint N)
{
     size_t N2 = N/2;
     char p[] = uninitializedArray!(char[])(N2+N);
     p[0..N2] = ' ';
     p[N2..$] = '*';
     char nl[] = uninitializedArray!(char[])(1);
     nl[] = '\n';

     char[][] wc = minimallyInitializedArray!(char[][])(N);

     auto w = appender!(char[])();

     foreach(n, ref elem; taskPool.parallel(wc[0..N2+1],100)){
         elem = p[n .. N2+2*n+1];
     }

     foreach (rn, ref elem ; taskPool.parallel(wc[N2+1..N],100)){
         int n = N2 - rn - 1;
         elem = p[n .. N2+2*n+1];
     }
     auto wj = join(wc,nl);
     w.put(wj);

     writeln(w.data);
}

Mar 25 2014

"Jay Norwood" <jayn prismnet.com> writes:

On Tuesday, 25 March 2014 at 08:42:30 UTC, monarch_dodra wrote:
 Interesting. I'd have thought the "extra copy" would be an 
 overall slowdown, but I guess that's not the case.

I installed ubuntu 14.04 64 bit, and measured some of these 
examples using gdc, ldc and dmd on a corei3 box.  The examples 
that wouldn't build had something to do with use of 
array.replicate and range.replicate conflicting in the libraries 
for gdc and ldc builds, which were based on 2.064.2.


This is the ldc2 (0.13.0 alpha)(2.064.2) result:
jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 
1>/dev/null
brad: time: 2107[ms]
sergei: time: 2441[ms]
jay2: time: 26[ms]
diamondShape: time: 679[ms]
printDiamond: time: 19[ms]
printDiamonde2a: time: 9[ms]
printDiamonde2b: time: 8[ms]
printDiamond3: time: 14[ms]

This is the gdc(2.064.2) result:
jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./a.out 
1>/dev/null
brad: time: 3216[ms]
sergei: time: 2828[ms]
jay2: time: 26[ms]
diamondShape: time: 776[ms]
printDiamond: time: 19[ms]
printDiamonde2a: time: 13[ms]
printDiamonde2b: time: 13[ms]
printDiamond3: time: 51[ms]

This is the dmd(2.065) result:
jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 
1>/dev/null
brad: time: 10830[ms]
sergei: time: 3480[ms]
jay2: time: 29[ms]
diamondShape: time: 2462[ms]
printDiamond: time: 23[ms]
printDiamonde2a: time: 13[ms]
printDiamonde2b: time: 10[ms]
printDiamond3: time: 23[ms]


So this printDiamonde2b example had the fastest time of the 
solutions, and had similar times on all three builds. The ldc2 
compiler build is performing best in most examples on ubuntu.

void printDiamonde2b(in uint N)
{
     uint N2 = N/2;
     char pSpace[] = uninitializedArray!(char[])(N2);
     pSpace[] = ' ';

     char pStars[] = uninitializedArray!(char[])(N+1);
     pStars[] = '*';

     pStars[$-1] = '\n';

     auto w = appender!(char[])();
     w.reserve(N*3);

     foreach (n ; 0 .. N2 + 1){
         w.put(pSpace[0 .. N2 - n]);
	w.put(pStars[$-2*n-2 .. $]);
	}

     foreach_reverse (n ; 0 .. N2){
         w.put(pSpace[0 .. N2 - n]);
	w.put(pStars[$-2*n-2 .. $]);
	}

     write(w.data);
}

Apr 20 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Monday, 21 April 2014 at 00:11:14 UTC, Jay Norwood wrote:
 So this printDiamonde2b example had the fastest time of the 
 solutions, and had similar times on all three builds. The ldc2 
 compiler build is performing best in most examples on ubuntu.

 void printDiamonde2b(in uint N)
 {
     uint N2 = N/2;
     char pSpace[] = uninitializedArray!(char[])(N2);
     pSpace[] = ' ';

     char pStars[] = uninitializedArray!(char[])(N+1);
     pStars[] = '*';

     pStars[$-1] = '\n';

     auto w = appender!(char[])();
     w.reserve(N*3);

     foreach (n ; 0 .. N2 + 1){
         w.put(pSpace[0 .. N2 - n]);
 	w.put(pStars[$-2*n-2 .. $]);
 	}

     foreach_reverse (n ; 0 .. N2){
         w.put(pSpace[0 .. N2 - n]);
 	w.put(pStars[$-2*n-2 .. $]);
 	}

     write(w.data);
 }

With this slightly tweaked solution, I can get times of roughly 
50% to 100% faster, on my dmd-linux box:

//----
void printDiamonde2monarch(in uint N)
{
     uint N2 = N/2;

     char[] pBuf = uninitializedArray!(char[])(N + N2);
     pBuf[ 0 .. N2] = ' ';
     pBuf[N2 ..  $] = '*';

     auto slice = uninitializedArray!(char[])(3*N2*N2 + 4*N);

     size_t i;
     foreach (n ; 0 .. N2 + 1){
         auto w = 1 + N2 + n;
         slice[i .. i + w] = pBuf[n .. w + n];
         slice[(i+=w)++]='\n';
     }

     foreach_reverse (n ; 0 .. N2){
         auto w = 1 + N2 + n;
         slice[i .. i + w] = pBuf[n .. w + n];
         slice[(i+=w)++]='\n';
     }

     write(slice[0 .. i]);
}
//----

The two "key" points here, first, is to avoid using appender. 
Second, instead of having two buffer: "    " and "******\n", and 
two do two "slice copies", to only have 1 buffer "    *****", and 
to do 1 slice copy, and a single '\n' write. At this point, I'm 
not sure how we could be going any faster, short of using 
alloca...

How does this hold up on your environment?

Apr 21 2014

"Jay Norwood" <jayn prismnet.com> writes:

On Monday, 21 April 2014 at 08:26:49 UTC, monarch_dodra wrote:
 The two "key" points here, first, is to avoid using appender. 
 Second, instead of having two buffer: "    " and "******\n", 
 and two do two "slice copies", to only have 1 buffer "    
 *****", and to do 1 slice copy, and a single '\n' write. At 
 this point, I'm not sure how we could be going any faster, 
 short of using alloca...

 How does this hold up on your environment?

Yes your solution is the fastest yet.  Also, its times are 
similar for all three compilers.   The range of execution times 
varied for different solutions from over 108 seconds down to 64 
msec.

I see that RefAppender's data() returns the managed array.  Can 
write() handle that?  It seems that would be more efficient than 
duplicating the  character buffer ... or perhaps writing directly 
to an OutBuffer, and then sending that to write() would avoid the 
duplication?

jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ gdc -O2 main.d
jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./a.out 
1>/dev/null
brad: time: 31865[ms]
sergei: time: 28596[ms]
jay2: time: 258[ms]
diamondShape: time: 7512[ms]
printDiamond: time: 200[ms]
printDiamonde2a: time: 140[ms]
printDiamonde2b: time: 137[ms]
printDiamond3: time: 503[ms]
printDiamonde2monarch: time: 86[ms]
jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ dmd -release 
main.d
jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 
1>/dev/null
brad: time: 108111[ms]
sergei: time: 33949[ms]
jay2: time: 282[ms]
diamondShape: time: 24567[ms]
printDiamond: time: 230[ms]
printDiamonde2a: time: 132[ms]
printDiamonde2b: time: 106[ms]
printDiamond3: time: 222[ms]
printDiamonde2monarch: time: 66[ms]
jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ~/ldc/bin/ldc2 
-O2 main.d
jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 
1>/dev/null
brad: time: 20996[ms]
sergei: time: 24841[ms]
jay2: time: 259[ms]
diamondShape: time: 6797[ms]
printDiamond: time: 194[ms]
printDiamonde2a: time: 91[ms]
printDiamonde2b: time: 87[ms]
printDiamond3: time: 145[ms]
printDiamonde2monarch: time: 64[ms]
jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$

Apr 21 2014

"Jay Norwood" <jayn prismnet.com> writes:

Wow,  joiner is much slower than join.  Such a small choice can 
make this big of a difference.  Not at all expected, since the 
lazy calls, I thought, were considered to be more efficient.  
This is with ldc2 -O2.

jay jay-ubuntu:~/ec_ddt/workspace/diamond/source$ ./main 
1>/dev/null
brad: time: 21958[ms]
sergei: time: 24629[ms]
jay2: time: 259[ms]
diamondShape: time: 6701[ms]
printDiamond: time: 194[ms]
printDiamonde2a: time: 95[ms]
printDiamonde2b: time: 92[ms]
printDiamond3: time: 144[ms]
printDiamonde2monarch: time: 67[ms]
printDiamonde2cJoin: time: 96[ms]
printDiamonde2cJoiner: time: 16115[ms]


void printDiamonde2cJoin(in uint N)
{
	int n,l;
     size_t N2 = N/2;
     size_t NM1 = N-1;
     char p[] = uninitializedArray!(char[])(N2+N);
     p[0..N2] = ' ';
     p[N2..$] = '*';
     char nl[] = uninitializedArray!(char[])(1);
     nl[] = '\n';

     char wc[][] = minimallyInitializedArray!(char[][])(N);

     for(n=0,l=0; n<N2; n++,l+=2){
     	wc[n] = wc[NM1-n] = p[n .. N2+l+1];
     }

     wc[N2] = p[N2..$];
     auto wj = join(wc,nl);
     write(wj);
     write('\n');
}

void printDiamonde2cJoiner(in uint N)
{
	int n,l;
     size_t N2 = N/2;
     size_t NM1 = N-1;
     char p[] = uninitializedArray!(char[])(N2+N);
     p[0..N2] = ' ';
     p[N2..$] = '*';

     char wc[][] = minimallyInitializedArray!(char[][])(N);

     for(n=0,l=0; n<N2; n++,l+=2){
     	wc[n] = wc[NM1-n] = p[n .. N2+l+1];
     }

     wc[N2] = p[N2..$];
     write(joiner(wc,"\n"));
     write('\n');
}

Apr 22 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 22 April 2014 at 11:41:41 UTC, Jay Norwood wrote:
 Wow,  joiner is much slower than join.  Such a small choice can 
 make this big of a difference.  Not at all expected, since the 
 lazy calls, I thought, were considered to be more efficient.  
 This is with ldc2 -O2.

Yeah, that's because join actually works on "RoR, R", rather than 
"R, E". This means if you feed it a "string[], string", then it 
will actually iterate over individual *characters*. Not only 
that, but since you are using char[], it will decode them too.

"join" is faster for 2 reasons:
1) It detects you want to joins arrays, so it doesn't have to 
iterate over them: It just glues them "slice at once"
2) No UTF decoding.

I kind of wish we had a faster joiner, but I think it would have 
made the call ambiguous.

Apr 22 2014

"Jay Norwood" <jayn prismnet.com> writes:

On Tuesday, 22 April 2014 at 15:25:04 UTC, monarch_dodra wrote:
 Yeah, that's because join actually works on "RoR, R", rather 
 than "R, E". This means if you feed it a "string[], string", 
 then it will actually iterate over individual *characters*. Not 
 only that, but since you are using char[], it will decode them 
 too.

 "join" is faster for 2 reasons:
 1) It detects you want to joins arrays, so it doesn't have to 
 iterate over them: It just glues them "slice at once"
 2) No UTF decoding.

 I kind of wish we had a faster joiner, but I think it would 
 have made the call ambiguous.

Ok, thanks.  I re-tried joiner with both parameters being ranges, 
but there was no improvement in execution speed.  I thought 
perhaps from your comments that it might work.

     char nl[] = uninitializedArray!(char[])(1);
     nl[] = '\n';

     write(joiner(wc,nl));

Apr 23 2014

"monarch_dodra" <monarchdodra gmail.com> writes:

On Tuesday, 22 April 2014 at 05:05:30 UTC, Jay Norwood wrote:
 On Monday, 21 April 2014 at 08:26:49 UTC, monarch_dodra wrote:
 The two "key" points here, first, is to avoid using appender. 
 Second, instead of having two buffer: "    " and "******\n", 
 and two do two "slice copies", to only have 1 buffer "    
 *****", and to do 1 slice copy, and a single '\n' write. At 
 this point, I'm not sure how we could be going any faster, 
 short of using alloca...

 How does this hold up on your environment?

 Yes your solution is the fastest yet.  Also, its times are 
 similar for all three compilers.   The range of execution times 
 varied for different solutions from over 108 seconds down to 64 
 msec.

 I see that RefAppender's data() returns the managed array.  Can 
 write() handle that?  It seems that would be more efficient 
 than duplicating the  character buffer ...

I'm not sure what you mean? "data" returns the managed array, but 
no duplication ever actually happens. It's allocated on the GC. 
the only thing that is copied is the slice itself.

 or perhaps writing directly to an OutBuffer, and then sending 
 that to write() would avoid the duplication?

appender *is* the outbuffer :)

Apr 22 2014

"bearophile" <bearophileHUGS lycos.com> writes:

 void main(){
 import std.stdio,std.range,std.algorithm,std.conv;
 auto m=10.iota.map!(_=>readln.split.to!(int[]));
 m.map!sum.chain(m.transposed.map!sum).reduce!max.write;
 }

It used to work, but with the latest changes I think I have 
broken it.

Bye,
bearophile

Mar 28 2014

D Programming

C/C++ Programming

Other

digitalmars.D.learn - Function to print a diamond shape