digitalmars.D - Java memory efficiency and column-oriented data

bearophile (69/69) Feb 02 2012 Through Reddit I've found this good and long slides pack, it's about usi...

Marco Leise (3/7) Feb 02 2012 From the back of my head I remember thinking of this. I'm not sure if I...

bearophile (5/7) Feb 02 2012 Splitting up a struct Foo is possible, but then when you have to pass it...

Marco Leise (9/25) Feb 02 2012 Can you post code, that proves the claim and some figures? Pictures say ...

bcs (6/74) Feb 02 2012 Ideally this shouldn't require the property. The "natural" or auto type
Timon Gehr (11/79) Feb 03 2012 How would you implement such a thing? Introspection is not (yet) that

bearophile (15/25) Feb 03 2012 Phobos is meant to be first of all _useful_ :-)

bearophile <bearophileHUGS lycos.com> writes:

Through Reddit I've found this good and long slides pack, it's about using Java
data structures to increase memory efficiency of programs:

http://domino.research.ibm.com/comm/research_people.nsf/pages/sevitsky.pubs.html/$FILE/oopsla08%20memory-efficient%20java%20slides.pdf


good to have weak and soft references in Phobos, and to have better memory
analysis tools outside Phobos.

The slides have reminded me my desire of a column-oriented "struct array" in
Phobos (some time ago someone has written a minimal version for D1).

The usage is simple:


import std.stdio, std.conv;

struct Foo { // an example struct
    int x;
    float y;
    string s;

    this(int xx, float yy) {
        x = xx;
        y = yy;
        s = text(x);
    }

    float sum() {
        return x + y;
    }
}

void main() {
    auto a1 = new Foo[1000]; // normal not parallel array
    foreach (ref Foo f; a1)
        writeln(f.s, " ", f.sum());

    // default usage example of ParallelArray
    // 3 Foo fields stored as 3 separated arrays inside a2
    ParallelArray!Foo a2; // valid
    static assert(a2[0].sizeof == size_t.sizeof * 4); // 3 pointers + 1 length
    a2.length = 1000;
    foreach (ref Foo f; a2) // A f Foo is built on the fly
        writeln(f, " ", f.sum());
    a2[10] = Foo(1, 2, "1");
    foreach (x; a2.x_array) // x_array is a property slice
        writeln(x);
    foreach (y; a2.y_array)
        writeln(y);
    foreach (s; a2.s_array)
        writeln(s);

    // specialized usage example of ParallelArray
    // x,y fields stored as an array, s field as another array

    static assert(a3[0].sizeof == size_t.sizeof * 3); // 2 pointers + 1 length
    a3.length = 1000;
    foreach (ref Foo f; a3) // A f Foo is built on the fly
        writeln(f, " ", f.sum());
    a3[10] = Foo(1, 2, "1");
    foreach (xy; a3.x_y_array)
        writeln(xy.x, " ", xy.y);
    foreach (s; a3.s_array)
        writeln(s);

    // float z0 = a3.x_y_array[10].sum(); // invalid code



    // so if you give a string with the field names, you need to
    // list them all, and only once each. Other designs are possible
    // but this is the simplest to use and implement.

    float z1 = a3[10].sum(); // a3[10] returns a Foo
    
    // a3(10) doesn't create a Foo, it just fetches what
    // .sum() needs, so it's faster if you have to call .sum()
    // on many records.
    // so the calls to sum() are implemented at compile-time
    float z2 = a3(10).sum();

    // To keep design simple. ParallelArray can't create 2D arrays
}


Do you like?
I have several usages of such struct in my code.

Bye,
bearophile

Feb 02 2012

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 03.02.2012, 01:21 Uhr, schrieb bearophile <bearophileHUGS lycos.com>:

 Do you like?
 I have several usages of such struct in my code.

 Bye,
 bearophile

 From the back of my head I remember thinking of this. I'm not sure if I  
really needed it, or just split up the struct in two parts.

Feb 02 2012

bearophile <bearophileHUGS lycos.com> writes:

Marco Leise:

  From the back of my head I remember thinking of this. I'm not sure if I  
 really needed it, or just split up the struct in two parts.

Splitting up a struct Foo is possible, but then when you have to pass it around
you need to remember to pass all its parts. And things gets more complex if Foo
has various methods that use mixed subsets of the fields.

Column-wide arrays allow you to not change the original Foo struct, to use Foo
unchanged where you don't need a column-oriented data structure, and allow you
most common usages. They allow you to split your Foo in different ways in
different parts of your program, using different ParallelArray. They are
supposed to give higher scan performance if in a part of a program you need
only a subset of the fields. Iterating only on a subset of the fields of an
array of records is a common enough task in my D code.

Bye,
bearophile

Feb 02 2012

"Marco Leise" <Marco.Leise gmx.de> writes:

Am 03.02.2012, 03:38 Uhr, schrieb bearophile <bearophileHUGS lycos.com>:

 Marco Leise:

  From the back of my head I remember thinking of this. I'm not sure if I
 really needed it, or just split up the struct in two parts.

 Splitting up a struct Foo is possible, but then when you have to pass it  
 around you need to remember to pass all its parts. And things gets more  
 complex if Foo has various methods that use mixed subsets of the fields.

 Column-wide arrays allow you to not change the original Foo struct, to  
 use Foo unchanged where you don't need a column-oriented data structure,  
 and allow you most common usages. They allow you to split your Foo in  
 different ways in different parts of your program, using different  
 ParallelArray. They are supposed to give higher scan performance if in a  
 part of a program you need only a subset of the fields. Iterating only  
 on a subset of the fields of an array of records is a common enough task  
 in my D code.

 Bye,
 bearophile

Can you post code, that proves the claim and some figures? Pictures say  
more than thousand words. As long as I think scanning performance will  
hardly be influenced this just complicates things like

	arr[i].x = 1;  // no return by ref possible, so doesn't do what you want

or

	arr = minimallyInitializedArray!(Foo[])(1_000_000);

:)

-- Marco

Feb 02 2012

bcs <bcs example.com> writes:

On 02/02/2012 04:21 PM, bearophile wrote:
 Through Reddit I've found this good and long slides pack, it's about using
Java data structures to increase memory efficiency of programs:

 http://domino.research.ibm.com/comm/research_people.nsf/pages/sevitsky.pubs.html/$FILE/oopsla08%20memory-efficient%20java%20slides.pdf


good to have weak and soft references in Phobos, and to have better memory
analysis tools outside Phobos.

 The slides have reminded me my desire of a column-oriented "struct array" in
Phobos (some time ago someone has written a minimal version for D1).

 The usage is simple:


 import std.stdio, std.conv;

 struct Foo { // an example struct
      int x;
      float y;
      string s;

      this(int xx, float yy) {
          x = xx;
          y = yy;
          s = text(x);
      }

      float sum() {
          return x + y;
      }
 }

 void main() {
      auto a1 = new Foo[1000]; // normal not parallel array
      foreach (ref Foo f; a1)
          writeln(f.s, " ", f.sum());

      // default usage example of ParallelArray
      // 3 Foo fields stored as 3 separated arrays inside a2
      ParallelArray!Foo a2; // valid
      static assert(a2[0].sizeof == size_t.sizeof * 4); // 3 pointers + 1 length
      a2.length = 1000;
      foreach (ref Foo f; a2) // A f Foo is built on the fly
          writeln(f, " ", f.sum());
      a2[10] = Foo(1, 2, "1");
      foreach (x; a2.x_array) // x_array is a property slice

Ideally this shouldn't require the property. The "natural" or auto type 
for iterating a ParallelArray should be a proxy value that defines 
properties for all the members and looks them up on demand. It would 
just need two words, a pointer to the parent ParallelArray and an index 
into it.

          writeln(x);
      foreach (y; a2.y_array)
          writeln(y);
      foreach (s; a2.s_array)
          writeln(s);

      // specialized usage example of ParallelArray
      // x,y fields stored as an array, s field as another array

      static assert(a3[0].sizeof == size_t.sizeof * 3); // 2 pointers + 1 length
      a3.length = 1000;
      foreach (ref Foo f; a3) // A f Foo is built on the fly
          writeln(f, " ", f.sum());
      a3[10] = Foo(1, 2, "1");
      foreach (xy; a3.x_y_array)
          writeln(xy.x, " ", xy.y);
      foreach (s; a3.s_array)
          writeln(s);

      // float z0 = a3.x_y_array[10].sum(); // invalid code



      // so if you give a string with the field names, you need to
      // list them all, and only once each. Other designs are possible
      // but this is the simplest to use and implement.

      float z1 = a3[10].sum(); // a3[10] returns a Foo

      // a3(10) doesn't create a Foo, it just fetches what
      // .sum() needs, so it's faster if you have to call .sum()
      // on many records.
      // so the calls to sum() are implemented at compile-time
      float z2 = a3(10).sum();

      // To keep design simple. ParallelArray can't create 2D arrays
 }


 Do you like?
 I have several usages of such struct in my code.

 Bye,
 bearophile

Feb 02 2012

Timon Gehr <timon.gehr gmx.ch> writes:

On 02/03/2012 01:21 AM, bearophile wrote:
 Through Reddit I've found this good and long slides pack, it's about using
Java data structures to increase memory efficiency of programs:

 http://domino.research.ibm.com/comm/research_people.nsf/pages/sevitsky.pubs.html/$FILE/oopsla08%20memory-efficient%20java%20slides.pdf


good to have weak and soft references in Phobos, and to have better memory
analysis tools outside Phobos.

 The slides have reminded me my desire of a column-oriented "struct array" in
Phobos (some time ago someone has written a minimal version for D1).

 The usage is simple:


 import std.stdio, std.conv;

 struct Foo { // an example struct
      int x;
      float y;
      string s;

      this(int xx, float yy) {
          x = xx;
          y = yy;
          s = text(x);
      }

      float sum() {
          return x + y;
      }
 }

 void main() {
      auto a1 = new Foo[1000]; // normal not parallel array
      foreach (ref Foo f; a1)
          writeln(f.s, " ", f.sum());

      // default usage example of ParallelArray
      // 3 Foo fields stored as 3 separated arrays inside a2
      ParallelArray!Foo a2; // valid
      static assert(a2[0].sizeof == size_t.sizeof * 4); // 3 pointers + 1 length
      a2.length = 1000;
      foreach (ref Foo f; a2) // A f Foo is built on the fly
          writeln(f, " ", f.sum());
      a2[10] = Foo(1, 2, "1");
      foreach (x; a2.x_array) // x_array is a property slice
          writeln(x);
      foreach (y; a2.y_array)
          writeln(y);
      foreach (s; a2.s_array)
          writeln(s);

      // specialized usage example of ParallelArray
      // x,y fields stored as an array, s field as another array

      static assert(a3[0].sizeof == size_t.sizeof * 3); // 2 pointers + 1 length
      a3.length = 1000;
      foreach (ref Foo f; a3) // A f Foo is built on the fly
          writeln(f, " ", f.sum());
      a3[10] = Foo(1, 2, "1");
      foreach (xy; a3.x_y_array)
          writeln(xy.x, " ", xy.y);
      foreach (s; a3.s_array)
          writeln(s);

      // float z0 = a3.x_y_array[10].sum(); // invalid code



      // so if you give a string with the field names, you need to
      // list them all, and only once each. Other designs are possible
      // but this is the simplest to use and implement.

      float z1 = a3[10].sum(); // a3[10] returns a Foo

      // a3(10) doesn't create a Foo, it just fetches what
      // .sum() needs, so it's faster if you have to call .sum()
      // on many records.
      // so the calls to sum() are implemented at compile-time
      float z2 = a3(10).sum();

How would you implement such a thing? Introspection is not (yet) that 
powerful.

      // To keep design simple. ParallelArray can't create 2D arrays
 }


 Do you like?
 I have several usages of such struct in my code.

 Bye,
 bearophile

I like it and I think it would be a valuable addition to Phobos, as well 
as a nice showcase for Ds templates and string mixins. You didn't 
specify what the type of eg. a3.x_y_array is, I think it should be 
Tuple!(int, "x", float, "y")[]. Furthermore, I'd rather have it named 
a3.x_y. There should also be a possibility to manually name the parts, 
as well as a way to specify 'anything that is left':



static assert(is(typeof(pa.pos) == Tuple!(int, "x", float, "y")[]));

Feb 03 2012

bearophile <bearophileHUGS lycos.com> writes:

Timon Gehr:

 I like it and I think it would be a valuable addition to Phobos,

Marco Leise has asked for some benchmarks first, and I think he's right.


 as well as a nice showcase for Ds templates and string mixins.

Phobos is meant to be first of all _useful_ :-)


 You didn't 
 specify what the type of eg. a3.x_y_array is, I think it should be 
 Tuple!(int, "x", float, "y")[].

OK.


 Furthermore, I'd rather have it named a3.x_y.

The problem is when you ask for a single field then the name becomes something
just like "x" that I think is misleading because it's not the original x, it's
a tuple with a single item. I think a name that helps you tell apart the
original 'x' field from this new field is better. That's why I have added a
suffix.


 There should also be a possibility to manually name the parts,

I think this makes the splitting string a bit too much complex... This data
structure is meant to be as much as possible plain, like a POD array. I'd like
to not give the option to change the synthetic field names both to keep
ParallelArray simpler, and to help its users to find it _easy_ to tell exactly
what it spits out.


 as well as a way to specify 'anything that is left':
 


This was my first design for ParallelArray, but I think it's not a good idea,
and not I prefer the simpler design I've shown. Forcing a stricter management
of the names in the optional string allows for a stronger static typing: if you
don't do that if you later change the original Foo struct, adding a field, it
goes in the "rest" part, and you don't know what's in "rest" any more.

But ParallelArray is meant to be a transparent data structure, so it asks you
to list all the original fields once and only once (so if you add ore remove a
field in Foo, the compiler asks you to modify all ParallelArray of your program
defined on Foo that have a string too.  If you don't give a string,  like  
ParallelArray!Foo  then it compiles even if you change Foo).


Anyway, if users later think it's really needed, then this ParallelArray
feature is addable later with a syntax like:

And the syntax for naming new fields is simple as you say:

But I suggest to not put this in a first implementation of ParallelArray and
keep things simpler first.

Thank you for your comments,
bearophile

Feb 03 2012

D Programming

C/C++ Programming

Other

digitalmars.D - Java memory efficiency and column-oriented data