digitalmars.D - Bit subscriptions, RAM columnwise layouts

bearophile (43/44) Dec 11 2008 This is the home page of the Wirbel language:

BCS (4/9) Dec 11 2008 Darn you, now I'm going to need to wright a column struct array template...

bearophile (12/13) Dec 11 2008 If you like my help, I may want to help you :-)

BCS (7/46) Dec 11 2008 I've got to run off now but my current sticking point is getting from St...

BCS (25/37) Dec 13 2008 Done:

Bill Baxter (9/17) Dec 11 2008 That's kind of interesting too. There are definitely times I've

Nick Sabalausky (11/40) Dec 11 2008 I've always preferred array-of-structs because that way it's impossible ...

Steven Schveighoffer (4/27) Dec 11 2008 I like it. a.bits[1..5] = 0 looks a hell of a lot better than a &= ~(((...

Jarrett Billingsley (6/8) Dec 11 2008 I like it too, and I think it would be an extremely welcome addition

Saaa (1/8) Dec 11 2008

bearophile <bearophileHUGS lycos.com> writes:

This is the home page of the Wirbel language:
http://mathias-kettner.de/wirbel.html

This page shows something interesting:
http://mathias-kettner.de/wirbel_int_bitwise.html

From the page:
<<
Subscription operators
Wirbel lets you use an integer a little bit like a list of boolean values with
the fixed length of 32 or 64. You can used the square brackets to directly
address the bits:

a = 0
a[0] = true
a[2] = false
print(a)

This will output 5 since this first bit (value 1) and the third bit (value 4)
have been switched on. You can also query arbitrary bits:

if a[4]:
    print("Bit 4 is on -> a contains a 16")

Splices [slices] won't currently work but may follow in a later release of
Wirbel. 


For D the following is an alternative syntax seems a little more explicit:
int a = 156;
a.bits[0] = true;
a.bits[0..5] = 0;

Instead of:
int a = 156;
a[0] = true;
a[0..5] = 0;


Wirbel also has built-in bit sets, a bit as Pascal, but they are just 64-bit
long:
http://mathias-kettner.de/wirbel_bitsets.html
They support the usual set operators (almost like the Python sets).

--------------------

DataDraw is a very interesting library for C, it can be used as a RAM-database:
http://datadraw.sourceforge.net/

This is a PDF that explains lot of things:
http://datadraw.sourceforge.net/manual.pdf

(Beside the things I talk about here it has few other features, like optional
persistence on HD and undo).

The authors show that it can often have performance higher than C/C++ code.
Such performance comes mostly from two things (it's explained at pages 25-27 of
the PDF):
- On 64-bit CPUs/OSes when possible it uses 32-bit indexes instead of 64 bit
pointers (when data is a lot, I think it automatically uses 64 bit indexes),
this usually reduce the total amount of memory used, and this interacts well
with caches.
- By default it uses parallel arrays, that is if you create an array of
struct-like things, it doesn't actually keep them as an array of structs, but
as many arrays of single items. So the programmer doesn't specify how the
system lays and uses memory, and usually lets the system do by itself. There is
a also syntax to say that some or all fields of such records have to actually
be stored closely, this is useful for the less common situations when you need
to process more than a field at a time). This often leads to better usage of
the cache, and in the end speeds up several programs.

So today the caches have changed a little how optimal code has to be written in
C. The authors say that a downside of C is that it forces the programmer to do
such memory layout manually, and today this may lead to less performance.

In this regard I think it's like the difference from classes and structs in D:
while in D structs are PODs, where the fields are where you want them to be, D
classes order their fields as they like.
So I think it can be conceived some way in D to allow the compiler to perform
such memory layouts as DataDraw does (as I've said DataDraw also allows a more
manual memory layout, see the cache_together�directive of page 25-26 of the
PDF).

Note that today such shifting from record-wise layouts to column-wise ones is
happening in true databases too, especially ones that are used for data mining,
for performance reasons, because they interact better with caches, even the CPU
caches too. They have even invented hybrid things, like PAX, look especially
the figure4 at page 4 here:
http://www.cs.wisc.edu/multifacet/papers/vldb01_pax.pdf

Bye,
bearophile

Dec 11 2008

BCS <ao pathlink.com> writes:

Reply to bearophile,

 So today the caches have changed a little how optimal code has to be
 written in C. The authors say that a downside of C is that it forces
 the programmer to do such memory layout manually, and today this may
 lead to less performance.
 

Darn you, now I'm going to need to wright a column struct array template 
that takes a struct and stripes the columns.

OTOH this is going to be fun...

Dec 11 2008

bearophile <bearophileHUGS lycos.com> writes:

BCS:
I'm going to need to wright a column struct array template that takes a struct
and stripes the columns.<

If you like my help, I may want to help you :-)

Two things I'd like to see in such struct array template:
1) It has to define one different foreach for each field. So you can iterate
just one field. (or a small group of fields, see the memory layout C at the
bottom of this post). I don't know yet how this can be done, but I thin it's
doable. Do you have ideas?
2) A template boolean constant (that is given at compile time, false by
default) that when is true activates the collection of some usage statistics
(temporal patterns) inside the data structure. So after such test runs the
programmer is essentially able to ask this data structure what's the faster
memory layout according to the actual usage of a single data structure inside a
program, so he/she/shi can ask the data structure to use it.

There are four main possible memory layouts it can use:
A) Normal array of structs.
B) Separated arrays for each field.
C) A mix of the two, that is separated arrays that contain single fields or
group few fields that are usually accessed closely. I think this may give to
better performance in such situations.
D) There's also the PAX hybrid I have shown in the original post, that is a
single array of "blocks", where each block contains several records arranged as
separated columns. I am not sure, but there can be situations where this layout
may be good.

Bye,
bearophile

Dec 11 2008

BCS <ao pathlink.com> writes:

Reply to bearophile,

 BCS:
 
 I'm going to need to wright a column struct array template that takes
 a struct and stripes the columns.<
 

 If you like my help, I may want to help you :-)
 

I've got to run off now but my current sticking point is getting from
Struct.tupleof 
to a tuple of members names and then passing that tuple on to other templates.

My first few tries kept going odd.

Want to try? go ahead.

 Two things I'd like to see in such struct array template:
 
 1) It has to define one different foreach for each field. So you can
 iterate just one field. (or a small group of fields, see the memory
 layout C at the bottom of this post). I don't know yet how this can be
 done, but I thin it's doable. Do you have ideas?
 

yup foreach will take a delegate of the correct form for the RHS or just 
return the array its self.

 2) A template boolean constant (that is given at compile time, false
 by default) that when is true activates the collection of some usage
 statistics (temporal patterns) inside the data structure. So after
 such test runs the programmer is essentially able to ask this data
 structure what's the faster memory layout according to the actual
 usage of a single data structure inside a program, so he/she/shi can
 ask the data structure to use it.
 
 There are four main possible memory layouts it can use:
 
 A) Normal array of structs.
 
 B) Separated arrays for each field.
 
 C) A mix of the two, that is separated arrays that contain single
 fields or group few fields that are usually accessed closely. I think
 this may give to better performance in such situations.
 
 D) There's also the PAX hybrid I have shown in the original post, that
 is a single array of "blocks", where each block contains several
 records arranged as separated columns. I am not sure, but there can be
 situations where this layout may be good.
 
 Bye,
 bearophile

Dec 11 2008

BCS <ao pathlink.com> writes:

Reply to Benjamin,

 Reply to bearophile,
 
 So today the caches have changed a little how optimal code has to be
 written in C. The authors say that a downside of C is that it forces
 the programmer to do such memory layout manually, and today this may
 lead to less performance.
 

 Darn you, now I'm going to need to wright a column struct array
 template that takes a struct and stripes the columns.
 
 OTOH this is going to be fun...
 

Done:

http://www.dsource.org/projects/scrapple/browser/trunk/columns/columns.d

not to fancy but supports code like this:

struct Foo
{
	float k;
	int i;
	uint j;
	void print(){writef("%s, %s, %s\n", k, i, j);}
}

Foo foo;
Columns!(Foo) FooCol;

FooCol.length = 10;
for(int i = 0; i < 10; i++)
{
	foo.k = 3.14 * i;
	foo.i = -3 - i;
	FooCol[i] = foo;
	FooCol[i].j = 5 + i;
}
foreach(float f; FooCol.k) writef("%s ", f); writef(\n);
foreach(int   i; FooCol.i) writef("%s ", i); writef(\n);
foreach(uint  u; FooCol.j) writef("%s ", u); writef(\n);
FooCol[3].FullItem.print();

Dec 13 2008

"Bill Baxter" <wbaxter gmail.com> writes:

On Thu, Dec 11, 2008 at 11:21 PM, bearophile <bearophileHUGS lycos.com> wrote:
 Wirbel lets you use an integer a little bit like a list of boolean values with
the fixed length of 32 or 64. You can used the square brackets to directly
address the bits:

 a = 0
 a[0] = true
 a[2] = false
 print(a)

Nice idea, that makes a lot of sense.

 DataDraw is a very interesting library for C, it can be used as a RAM-database:
 http://datadraw.sourceforge.net/

 - By default it uses parallel arrays, that is if you create an array of
struct-like things, it doesn't actually keep them as an array of structs, but
as many arrays of single items. So the programmer doesn't specify how the
system lays and uses memory, and usually lets the system do by itself. There is
a also syntax to say that some or all fields of such records have to actually
be stored closely, this is useful for the less common situations when you need
to process more than a field at a time). This often leads to better usage of
the cache, and in the end speeds up several programs.

That's kind of interesting too.  There are definitely times I've
flip-flopped back and forth over whether some particular thing should
be an array-of-structures or a structure-of-arrays.  Definitely seems
do-able in D.  You could make it so a Record!(Struct,20) stores
Structs column-wise, or just flip a bit to store them the other way:
Record!(Struct,20,RecordOrientation.Row).

--bb

Dec 11 2008

"Nick Sabalausky" <a a.a> writes:

"Bill Baxter" <wbaxter gmail.com> wrote in message 
news:mailman.150.1229027026.22690.digitalmars-d puremagic.com...
 On Thu, Dec 11, 2008 at 11:21 PM, bearophile <bearophileHUGS lycos.com> 
 wrote:
 Wirbel lets you use an integer a little bit like a list of boolean values 
 with the fixed length of 32 or 64. You can used the square brackets to 
 directly address the bits:

 a = 0
 a[0] = true
 a[2] = false
 print(a)

 Nice idea, that makes a lot of sense.

 DataDraw is a very interesting library for C, it can be used as a 
 RAM-database:
 http://datadraw.sourceforge.net/

 - By default it uses parallel arrays, that is if you create an array of 
 struct-like things, it doesn't actually keep them as an array of structs, 
 but as many arrays of single items. So the programmer doesn't specify how 
 the system lays and uses memory, and usually lets the system do by 
 itself. There is a also syntax to say that some or all fields of such 
 records have to actually be stored closely, this is useful for the less 
 common situations when you need to process more than a field at a time). 
 This often leads to better usage of the cache, and in the end speeds up 
 several programs.

 That's kind of interesting too.  There are definitely times I've
 flip-flopped back and forth over whether some particular thing should
 be an array-of-structures or a structure-of-arrays.  Definitely seems
 do-able in D.  You could make it so a Record!(Struct,20) stores
 Structs column-wise, or just flip a bit to store them the other way:
 Record!(Struct,20,RecordOrientation.Row).

I've always preferred array-of-structs because that way it's impossible for 
the lengths or indicies to end up out-of-sync. However, it does seem like 
accessing the same field from multiple "rows" could in many cases be more 
common than accessing lots of fields from one "row". So given a nice generic 
class or template or something that could ensure proper atomicity (is that 
the correct noun form of "atomic operation"?) for struct-of-arrays 
operations, I can see a lot of cases where it would be much more 
cache-friendly than my usual array-of-structs approach. I'm glad this was 
brought up, I had never thought about it that way before.

Dec 11 2008

"Steven Schveighoffer" <schveiguy yahoo.com> writes:

"bearophile" wrote
 Subscription operators
 Wirbel lets you use an integer a little bit like a list of boolean values 
 with the fixed length of 32 or 64. You can used the square brackets to 
 directly address the bits:

 a = 0
 a[0] = true
 a[2] = false
 print(a)

 This will output 5 since this first bit (value 1) and the third bit (value 
 4) have been switched on. You can also query arbitrary bits:

 if a[4]:
    print("Bit 4 is on -> a contains a 16")

 Splices [slices] won't currently work but may follow in a later release of 
 Wirbel.


 For D the following is an alternative syntax seems a little more explicit:
 int a = 156;
 a.bits[0] = true;
 a.bits[0..5] = 0;

 Instead of:
 int a = 156;
 a[0] = true;
 a[0..5] = 0;

I like it.  a.bits[1..5] = 0 looks a hell of a lot better than a &= ~(((1 << 
4) - 1) << 1)

-Steve

Dec 11 2008

"Jarrett Billingsley" <jarrett.billingsley gmail.com> writes:

On Thu, Dec 11, 2008 at 4:20 PM, Steven Schveighoffer
<schveiguy yahoo.com> wrote:
 I like it.  a.bits[1..5] = 0 looks a hell of a lot better than a &= ~(((1 <<
 4) - 1) << 1)

I like it too, and I think it would be an extremely welcome addition
to a systems-level programming language.

It also makes me remember the old bit[], which I always thought should
have be castable to and from numeric types.  Oh, bit.

Dec 11 2008

"Saaa" <empty needmail.com> writes:

I agree!

 I like it.  a.bits[1..5] = 0 looks a hell of a lot better than a &= ~(((1 
 <<
 4) - 1) << 1)

 I like it too, and I think it would be an extremely welcome addition
 to a systems-level programming language.

 It also makes me remember the old bit[], which I always thought should
 have be castable to and from numeric types.  Oh, bit.

Dec 11 2008

D Programming

C/C++ Programming

Other

digitalmars.D - Bit subscriptions, RAM columnwise layouts