www.digitalmars.com         C & C++   DMDScript  

digitalmars.D - Arbitrary Size Integer Arrays

reply dsimcha <dsimcha yahoo.com> writes:
I'm thinking it might be useful to have a library type that allows for
arbitrary size integers <= 64 bits (128 if ucent is ever implemented) to be
packed into an array.  This might be nice when you want a huge array of them,
the size you actually need is somewhere in between two native sizes, and you
care more about space/cache efficiency than raw CPU cycles.  For example,
let's say you wanted an array of N 19-bit integers.  You'd make an array of
uints (32-bit) with ceil(N * 19 / 32) elements, and when it was indexed, you'd
do a bunch of modulus and bit shifting to get the right 19 bits out of that
block of memory, and return a uint.  It would be slow as molasses, but really
space efficient, which would in certain cases be a reasonable tradeoff.

Has anyone implemented, or considered implementing this before?  If not, if I
implemented it well and submitted it as an enhancement, would it be a
reasonable thing to include in std.array?
Sep 21 2009
next sibling parent reply Andrei Alexandrescu <SeeWebsiteForEmail erdani.org> writes:
dsimcha wrote:
 I'm thinking it might be useful to have a library type that allows for
 arbitrary size integers <= 64 bits (128 if ucent is ever implemented) to be
 packed into an array.  This might be nice when you want a huge array of them,
 the size you actually need is somewhere in between two native sizes, and you
 care more about space/cache efficiency than raw CPU cycles.  For example,
 let's say you wanted an array of N 19-bit integers.  You'd make an array of
 uints (32-bit) with ceil(N * 19 / 32) elements, and when it was indexed, you'd
 do a bunch of modulus and bit shifting to get the right 19 bits out of that
 block of memory, and return a uint.  It would be slow as molasses, but really
 space efficient, which would in certain cases be a reasonable tradeoff.
 
 Has anyone implemented, or considered implementing this before?  If not, if I
 implemented it well and submitted it as an enhancement, would it be a
 reasonable thing to include in std.array?

I think that would be a nice addition. We haven't established some general rules for containers. Values or references? In the latter case: Classes or structs? If classes: hierarchy or largely independent classes? Andrei
Sep 21 2009
next sibling parent dsimcha <dsimcha yahoo.com> writes:
== Quote from Andrei Alexandrescu (SeeWebsiteForEmail erdani.org)'s article
 dsimcha wrote:
 I'm thinking it might be useful to have a library type that allows for
 arbitrary size integers <= 64 bits (128 if ucent is ever implemented) to be
 packed into an array.  This might be nice when you want a huge array of them,
 the size you actually need is somewhere in between two native sizes, and you
 care more about space/cache efficiency than raw CPU cycles.  For example,
 let's say you wanted an array of N 19-bit integers.  You'd make an array of
 uints (32-bit) with ceil(N * 19 / 32) elements, and when it was indexed, you'd
 do a bunch of modulus and bit shifting to get the right 19 bits out of that
 block of memory, and return a uint.  It would be slow as molasses, but really
 space efficient, which would in certain cases be a reasonable tradeoff.

 Has anyone implemented, or considered implementing this before?  If not, if I
 implemented it well and submitted it as an enhancement, would it be a
 reasonable thing to include in std.array?

general rules for containers. Values or references? In the latter case: Classes or structs? If classes: hierarchy or largely independent classes? Andrei

I'd go with struct value type to make it as similar feeling to a regular array as possible. On the other hand, regular arrays are going to be overhauled anyhow, so that makes me not so sure.
Sep 21 2009
prev sibling parent "Denis Koroskin" <2korden gmail.com> writes:
On Tue, 22 Sep 2009 00:03:28 +0400, Andrei Alexandrescu  
<SeeWebsiteForEmail erdani.org> wrote:

 dsimcha wrote:
 I'm thinking it might be useful to have a library type that allows for
 arbitrary size integers <= 64 bits (128 if ucent is ever implemented)  
 to be
 packed into an array.  This might be nice when you want a huge array of  
 them,
 the size you actually need is somewhere in between two native sizes,  
 and you
 care more about space/cache efficiency than raw CPU cycles.  For  
 example,
 let's say you wanted an array of N 19-bit integers.  You'd make an  
 array of
 uints (32-bit) with ceil(N * 19 / 32) elements, and when it was  
 indexed, you'd
 do a bunch of modulus and bit shifting to get the right 19 bits out of  
 that
 block of memory, and return a uint.  It would be slow as molasses, but  
 really
 space efficient, which would in certain cases be a reasonable tradeoff.
  Has anyone implemented, or considered implementing this before?  If  
 not, if I
 implemented it well and submitted it as an enhancement, would it be a
 reasonable thing to include in std.array?

I think that would be a nice addition. We haven't established some general rules for containers. Values or references? In the latter case: Classes or structs? If classes: hierarchy or largely independent classes? Andrei

You can't implement containers using structs unless default ctors for them are allowed. Is it ever going to be fixed?
Sep 23 2009
prev sibling next sibling parent reply bearophile <bearophileHUGS lycos.com> writes:
dsimcha:

 I'm thinking it might be useful to have a library type that allows for
 arbitrary size integers <= 64 bits (128 if ucent is ever implemented) to be
 packed into an array.

I think LLVM is already able to represent and operate on such numbers (but I don't know how/if they can be packed, maybe not). Eventually I hope to see LDC start using some of the many nice features of that nice back-end. Bye, bearophile
Sep 21 2009
parent reply dsimcha <dsimcha yahoo.com> writes:
== Quote from bearophile (bearophileHUGS lycos.com)'s article
 dsimcha:
 I'm thinking it might be useful to have a library type that allows for
 arbitrary size integers <= 64 bits (128 if ucent is ever implemented) to be
 packed into an array.


 Eventually I hope to see LDC start using some of the many nice features of that

 Bye,
 bearophile

Yes, but: 1. Packing them is the whole point. 2. This should be a library type. It's waaaay too niche to be a builtin, and if it were a compiler intrinsic that was specific to a backend, it wouldn't be portable so noone would use it.
Sep 21 2009
parent bearophile <bearophileHUGS lycos.com> writes:
dsimcha:

1.  Packing them is the whole point.<

It's your whole point, but LLVM is designed for many purposes, and such numbers are useful when you design hardware circuits starting from a C-like description.
 and if
 it were a compiler intrinsic that was specific to a backend, it wouldn't be
 portable so noone would use it.

Uhm... let's see. D currently doesn't have computed gotos because Walter said they are lot of work to implement and because they are mostly useful when you want to implement an interpreter. Yet I have found an usage of such computed gotos to speed up "parsing" of biological information. D can be compiled on GCC and LLVM backends too, and they both already support computed gotos. I don't know if Intel and Microsoft C++ compilers support computed gotos. Standard C doesn't allow for computed gotos, but GCC has them as not standard extension, and I guess some people use them. Limiting what D can do, refusing to use a nice feature of GCC/LLVM, because DMC doesn't currently have such feature doesn't look nice to me. So I may like to see this feature added to D specs (and implemented LDC/GDC) even if DMC can't support it. Bye, bearophile
Sep 21 2009
prev sibling next sibling parent reply Don <nospam nospam.com> writes:
dsimcha wrote:
 I'm thinking it might be useful to have a library type that allows for
 arbitrary size integers <= 64 bits (128 if ucent is ever implemented) to be
 packed into an array.  This might be nice when you want a huge array of them,
 the size you actually need is somewhere in between two native sizes, and you
 care more about space/cache efficiency than raw CPU cycles.  For example,
 let's say you wanted an array of N 19-bit integers.  You'd make an array of
 uints (32-bit) with ceil(N * 19 / 32) elements, and when it was indexed, you'd
 do a bunch of modulus and bit shifting to get the right 19 bits out of that
 block of memory, and return a uint.  It would be slow as molasses, but really
 space efficient, which would in certain cases be a reasonable tradeoff.
 
 Has anyone implemented, or considered implementing this before?  If not, if I
 implemented it well and submitted it as an enhancement, would it be a
 reasonable thing to include in std.array?

A couple of comments: (1) Many of the uses I can imagine for such a structure would have locality of reference. If you provide slice access (eg, give me elements [a..b] as an array of ints) then you can have reasonable performance. Unpacking consecutive elements can be done quite efficiently (it's an interesting optimisation problem, though!). (2) A floating-point version of this was discussed at some point. The number of exponent and significand bits would be specified. Again, you'd extract an array of such beasts into an array of floats or doubles.
Sep 22 2009
parent reply Walter Bright <newshound1 digitalmars.com> writes:
Don wrote:
 If you provide slice access (eg, give me elements [a..b] as an array of 
 ints) then you can have reasonable performance. Unpacking consecutive 
 elements can be done quite efficiently (it's an interesting optimisation 
 problem, though!).

It wouldn't be slicable because a slice would have to start at the beginning of a byte. The D bit type died in that quagmire.
Sep 23 2009
next sibling parent Don <nospam nospam.com> writes:
Walter Bright wrote:
 Don wrote:
 If you provide slice access (eg, give me elements [a..b] as an array 
 of ints) then you can have reasonable performance. Unpacking 
 consecutive elements can be done quite efficiently (it's an 
 interesting optimisation problem, though!).

It wouldn't be slicable because a slice would have to start at the beginning of a byte. The D bit type died in that quagmire.

I didn't mean an actual D slice -- I meant value semantics, not reference semantics. Give me a copy of X[a..b], putting it into an int[] array. Something like: struct ArbitrarySizedIntArray { int[] extractSlice(size_t a, size_t b, int [] dest=null) { } }
Sep 24 2009
prev sibling parent BCS <none anon.com> writes:
Hello Walter,

 Don wrote:
 
 If you provide slice access (eg, give me elements [a..b] as an array
 of ints) then you can have reasonable performance. Unpacking
 consecutive elements can be done quite efficiently (it's an
 interesting optimisation problem, though!).
 

beginning of a byte. The D bit type died in that quagmire.

It could be sliced if you are willing to play games like "element zero is N places after this memory address." At a minimum, Every 8th element will be byte aligned.
Sep 24 2009
prev sibling parent BCS <none anon.com> writes:
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: quoted-printable

Hello=20dsimcha,

Here=20is=20a=20quick=20an=20dirty=20implementation=2e=20It=20has=20a=20few=20warts=20(it=20only=20has=20set-length,=20index=20and=20index-assign=20for=20an=20interface)=2e

I'm=20posting=20it=20in=20the=20public=20domain=20so=20do=20whatever=20you=20want=20with=20it=2e
Sep 22 2009